ArrayList源码分析-jdk11 (18.9）

1.概述
2.源码分析
3.其他细节
- 3.1 快速失败机制
- 3.2 关于遍历时删除

1.概述

ArrayList 是一种变长的集合类，基于定长数组实现。ArrayList 允许空值和重复元素，当往 ArrayList 中添加的元素数量大于其底层数组容量时，其会通过扩容机制重新生成一个更大的数组。另外，由于 ArrayList 底层基于数组实现，所以其可以保证在 O(1) 复杂度下完成随机查找操作。其他方面，ArrayList 是非线程安全类，并发环境下，多个线程同时操作 ArrayList，会引发不可预知的错误。

ArrayList 是大家最为常用的集合类，作为一个变长集合类，其核心是扩容机制。所以只要知道它是怎么扩容的，以及基本的操作是怎样实现就够了。本文后续内容将围绕jdk11 (18.9)中ArrayList的源码展开叙述。

2.源码分析

2.1参数

1、ArrayList默认容量为10DEFAULT_CAPACITY注释

2、ArrayList并不是在初始化的时候就创建了 DEFAULT_CAPACITY=10 的数组。而是在往里边 add 第一个数据的时候会扩容到 10 elementData注释

  private static final long serialVersionUID = 8683452581122892189L;

    /**
     * 默认初始化容量
     */
    private static final int DEFAULT_CAPACITY = 10;

    /**
     * 用于空实例的共享空数组实例。
     */
    private static final Object[] EMPTY_ELEMENTDATA = {};

    /**
     * Shared empty array instance used for default sized empty instances. We
     * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
     * first element is added.
    * 用于默认大小的空实例的共享空数组实例。我们将其与EMPTY_ELEMENTDATA分开来，以了解添加第一个元素时要膨胀多少。
     */
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    /**
     * The array buffer into which the elements of the ArrayList are stored.
     * The capacity of the ArrayList is the length of this array buffer. Any
     * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
     * will be expanded to DEFAULT_CAPACITY when the first element is added.
     存储ArrayList元素的数组缓冲区，ArrayList的容量是这个数组缓冲区的长度。当第一个元素被添加的时候，elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA 将被扩展成 DEFAULT_CAPACITY
     */
    transient Object[] elementData; // non-private to simplify nested class access

    /**
     * The size of the ArrayList (the number of elements it contains).
     * 数组大小
     * @serial
     */
    private int size;

2.2 构造方法

ArrayList 有三个构造方法，无参构造方法、构造空的具有特定初始容量值方法、构造一个包含指定集合元素的列表,按照集合的迭代器返回它们的顺序。

2.2.1 无参构造方法

注意下图中的注释Constructs an empty list with an initial capacity of ten 调用无参构造方法，默认构造一个容量为10的空list.

  /**
     * Constructs an empty list with an initial capacity of ten.
     */
    public ArrayList() {
        this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
    }

2.2.2 构造空的具有特定初始容量值方法

1、在知道将会向 ArrayList 插入多少元素的情况下 2、在有大量数据写入时；一定要初始化指定长度。

/**
     * Constructs an empty list with the specified initial capacity.
     *
     * @param  initialCapacity  the initial capacity of the list
     * @throws IllegalArgumentException if the specified initial capacity
     *         is negative
     */
    public ArrayList(int initialCapacity) {
        if (initialCapacity > 0) {
            this.elementData = new Object[initialCapacity];
        } else if (initialCapacity == 0) {
            this.elementData = EMPTY_ELEMENTDATA;
        } else {
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        }
    }

2.2.3构造一个包含指定集合元素的列表,按照集合的迭代器返回它们的顺序

 /**
     * Constructs a list containing the elements of the specified
     * collection, in the order they are returned by the collection's
     * iterator.
     构造一个包含指定集合元素的列表,按照集合的迭代器返回它们的顺序
     * @param c the collection whose elements are to be placed into this list
     * @throws NullPointerException if the specified collection is null
     */
    public ArrayList(Collection<? extends E> c) {
        elementData = c.toArray();
        if ((size = elementData.length) != 0) {
            // defend against c.toArray (incorrectly) not returning Object[]
            //防御c.toArray（错误地）不返回Object[]
            // (see e.g. https://bugs.openjdk.java.net/browse/JDK-6260652)
            // jdk bug（Arrays内部实现的ArrayList的toArray()方法的行为与规范不一致） 15年修复；
            if (elementData.getClass() != Object[].class)
                elementData = Arrays.copyOf(elementData, size, Object[].class);
        } else {
            // replace with empty array.
            this.elementData = EMPTY_ELEMENTDATA;
        }
    }

JDK-6260652 `(see e.g. https://bugs.openjdk.java.net/browse/JDK-6260652)`

简单来说就是，就是下图代码会产生的情况。

Object[] objects = new String[]{"string"};
objects[0] = 1;
/**
Exception in thread "main" java.lang.ArrayStoreException: java.lang.Integer
*/

产生原因

  public Object[] toArray() {
            return a.clone();
    }

Arrays内部实现的ArrayList的toArray()方法的行为与规范不一致。根据JLS规范String[]的clone方法返回的也是String[]类型，所以toArray()方法返回的真实类型是String[]，所以个toArray()[0]赋值时可能会导致类型不匹配的错误

jdk11中的Arrays内部实现的ArrayList的toArray()方法。所以调用copyOf()返回值类型为Object[]

  public Object[] toArray() {
            return Arrays.copyOf(a, a.length, Object[].class);
        }

2.3常见方法

2.3.1插入

对于数组（线性表）结构，插入操作分为两种情况。一种是在元素序列尾部插入，另一种是在元素序列其他位置插入。ArrayList 的源码里也体现了这两种插入情况，如下：

    /**
     * This helper method split out from add(E) to keep method
     * bytecode size under 35 (the -XX:MaxInlineSize default value),
     * which helps when add(E) is called in a C1-compiled loop.
     (这个辅助方法是从add(E)方法分离而来的,为了保持方法字节码低于35,这将有助于add(E)方法调用C1编译循环)
     */
    private void add(E e, Object[] elementData, int s) {
        if (s == elementData.length)
            elementData = grow();
        /**
         private Object[] grow() {
        return grow(size + 1);
        }**/
        //将新元素插入序列尾部
        elementData[s] = e;
        size = s + 1;
    }
/**
     * Inserts the specified element at the specified position in this
     * list. Shifts the element currently at that position (if any) and
     * any subsequent elements to the right (adds one to their indices).
     在此列表的指定位置插入指定的元素。将当前位于该位置的元素（如果有）和任何后续元素向右移动（在其索引中添加一个元素）
     * @param index index at which the specified element is to be inserted
     * @param element element to be inserted
     * @throws IndexOutOfBoundsException {@inheritDoc}
     */
    public void add(int index, E element) {
        rangeCheckForAdd(index);
        modCount++;
        final int s;
        Object[] elementData;
        if ((s = size) == (elementData = this.elementData).length)
            elementData = grow();
        // 2. 将 index 及其之后的所有元素都向后移一位
        System.arraycopy(elementData, index,
                         elementData, index + 1,
                         s - index);
            // 3. 将新元素插入至 index 处
        elementData[index] = element;
        size = s + 1;
    }

2.3.1.1元素序列尾部插入

检测数组是否有足够的空间插入
将新元素插入至序列尾部

如下图：
ArrayList源码分析-jdk11 (18.9）-LMLPHP

2.3.1.2元素序列指定位置（假设该位置合理）插入

检测数组是否有足够的空间
将 index 及其之后的所有元素向后移一位
将新元素插入至 index 处

如下图：
ArrayList源码分析-jdk11 (18.9）-LMLPHP
从上图可以看出，将新元素插入至序列指定位置，需要先将该位置及其之后的元素都向后移动一位，为新元素腾出位置。这个操作的时间复杂度为O(N)，频繁移动元素可能会导致效率问题，特别是集合中元素数量较多时。在日常开发中，若非所需，我们应当尽量避免在大集合中调用第二个插入方法。

2.3.2 ArrayList 的扩容机制

对于变长数据结构，当结构中没有空余空间可供使用时，就需要进行扩容。在 ArrayList 中，当空间用完，其会按照原数组空间的1.5倍进行扩容。相关源码如下：

/** 扩容的入口方法 */
    /**
     * Increases the capacity of this {@code ArrayList} instance, if
     * necessary, to ensure that it can hold at least the number of elements
     * specified by the minimum capacity argument.
     *增加{@code ArrayList}实例的容量，如果必需的，以确保它至少可以容纳minimum capacity参数指定的元素数
     * @param minCapacity the desired minimum capacity
     */
    public void ensureCapacity(int minCapacity) {
        if (minCapacity > elementData.length
            && !(elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
                 && minCapacity <= DEFAULT_CAPACITY)) {
            modCount++;
            grow(minCapacity);
        }
    }

/** 扩容的核心方法 */
/**
     * Returns a capacity at least as large as the given minimum capacity.
     * Returns the current capacity increased by 50% if that suffices.
     * Will not return a capacity greater than MAX_ARRAY_SIZE unless
     * the given minimum capacity is greater than MAX_ARRAY_SIZE.
   返回至少等于给定最小值的容量容量。返回当前容量增加50%，如果够了。不会返回大于MAX_ARRAY_SIZE的容量，除非给定的最小容量大于MAX_ARRAY_SIZE
     * @param minCapacity the desired minimum capacity
     * @throws OutOfMemoryError if minCapacity is less than zero
     */
    private int newCapacity(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1); //旧容量大小+在旧容量基础上增加50%（左移1位相当于除以2）
        if (newCapacity - minCapacity <= 0) {
            if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA)
                return Math.max(DEFAULT_CAPACITY, minCapacity);
            if (minCapacity < 0) // overflow
                throw new OutOfMemoryError();
            return minCapacity;
        }
        return (newCapacity - MAX_ARRAY_SIZE <= 0)
            ? newCapacity
            : hugeCapacity(minCapacity);
    }

private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    // 如果最小容量超过 MAX_ARRAY_SIZE，则将数组容量扩容至 Integer.MAX_VALUE
    return (minCapacity > MAX_ARRAY_SIZE) ?
        Integer.MAX_VALUE :
        MAX_ARRAY_SIZE;
}

2.3.3删除

不同于插入操作，ArrayList 没有无参删除方法。所以其只能删除指定位置的元素或删除指定元素，这样就无法避免移动元素（除非从元素序列的尾部删除）。相关代码如下：

/** 删除指定位置的元素 */
    /**
     * Removes the element at the specified position in this list.
     * Shifts any subsequent elements to the left (subtracts one from their
     * indices).
     *
     * @param index the index of the element to be removed
     * @return the element that was removed from the list
     * @throws IndexOutOfBoundsException {@inheritDoc}
     */
    public E remove(int index) {
        Objects.checkIndex(index, size);
        final Object[] es = elementData;
            // 返回被删除的元素值
        @SuppressWarnings("unchecked")
        E oldValue = (E) es[index];
        fastRemove(es, index);
        return oldValue;
    }

/** 从列表中删除第一个出现的指定元素（如果存在）。
如果列表不包含元素，则不变。更准确地说，删除索引最低的元素 */
 /**
     * Removes the first occurrence of the specified element from this list,
     * if it is present.  If the list does not contain the element, it is
     * unchanged.  More formally, removes the element with the lowest index
     * {@code i} such that
     * {@code Objects.equals(o, get(i))}
     * (if such an element exists).  Returns {@code true} if this list
     * contained the specified element (or equivalently, if this list
     * changed as a result of the call).
     *
     * @param o element to be removed from this list, if present
     * @return {@code true} if this list contained the specified element
     */
    public boolean remove(Object o) {
        final Object[] es = elementData;
        final int size = this.size;
        int i = 0;
          // 遍历数组，查找要删除元素的位置
        found: {
            if (o == null) {
                for (; i < size; i++)
                    if (es[i] == null)
                        break found;
            } else {
                for (; i < size; i++)
                    if (o.equals(es[i]))
                        break found;
            }
            return false;
        }
        fastRemove(es, i);
        return true;
    }

  /**
     * Private remove method that skips bounds checking and does not
     * return the value removed.
     */
   private void fastRemove(Object[] es, int i) {
        modCount++;
        final int newSize;
        if ((newSize = size - 1) > i)
         // 将 index + 1 及之后的元素向前移动一位，覆盖被删除值
            System.arraycopy(es, i + 1, es, i, newSize - i);
             // 将最后一个元素置空，并将 size 值减1
        es[size = newSize] = null;
    }

上面的删除方法并不复杂，这里以第一个删除方法为例，删除一个元素步骤如下：

获取指定位置 index 处的元素值
将 index + 1 及之后的元素向前移动一位
将最后一个元素置空，并将 size 值减 1
返回被删除值，完成删除操作

如下图：
ArrayList源码分析-jdk11 (18.9）-LMLPHP
现在，考虑这样一种情况。我们往 ArrayList 插入大量元素后，又删除很多元素，此时底层数组会空闲处大量的空间。因为 ArrayList 没有自动缩容机制，导致底层数组大量的空闲空间不能被释放，造成浪费。对于这种情况，ArrayList 也提供了相应的处理方法，如下：

/** 将数组容量缩小至元素数量 */
  /**
     * Trims the capacity of this {@code ArrayList} instance to be the
     * list's current size.  An application can use this operation to minimize
     * the storage of an {@code ArrayList} instance.
     */
    public void trimToSize() {
        modCount++;
        if (size < elementData.length) {
            elementData = (size == 0)
              ? EMPTY_ELEMENTDATA
              : Arrays.copyOf(elementData, size);
        }
    }

ArrayList源码分析-jdk11 (18.9）-LMLPHP

我们可以使用trimToSize()手动触发 ArrayList 的缩容机制,释放多余的空间。

2.3.4 遍历

ArrayList 实现了 RandomAccess 接口（该接口是个标志性接口），表明它具有快速随机访问的能力。ArrayList 底层基于数组实现，所以它可在常数阶的时间内完成随机访问，效率很高。对 ArrayList 进行遍历时，一般情况下，我们喜欢使用 foreach 循环遍历，但这并不是推荐的遍历方式。ArrayList 具有随机访问的能力，如果在一些效率要求比较高的场景下，更推荐下面这种方式：

for (int i = 0; i < list.size(); i++) {
    list.get(i);
}

官网还特意说明了，如果是实现了这个接口的 List，那么使用for循环的方式获取数据会优于用迭代器获取数据

3.其他细节

3.1 快速失败机制

在 Java 集合框架中，很多类都实现了快速失败机制。该机制被触发时，会抛出并发修改异常ConcurrentModificationException，这个异常大家在平时开发中多多少少应该都碰到过。关于快速失败机制，ArrayList 的注释里对此做了解释，这里引用一下：

上面注释大致意思是，ArrayList 迭代器中的方法都是均具有快速失败的特性，当遇到并发修改的情况时，迭代器会快速失败，以避免程序在将来不确定的时间里出现不确定的行为。

以上就是 Java 集合框架中引入快速失败机制的原因，并不难理解，这里不多说了。

3.2 关于遍历时删除

遍历时删除是一个不正确的操作，即使有时候代码不出现异常，但执行逻辑也会出现问题。关于这个问题，阿里巴巴 Java 开发手册里也有所提及。这里引用一下：

雅核