集合深度学习09—HashMap源码解析

一、HashMap的原理简单介绍

HashMap的对象：存储的是双列数据，键值对 key - value

在这里插入图片描述

键相同，则哈希值相同，则直接替换值，返回原值
键不同，如果计算后哈希值也相同了。则链表法。
在遇到哈希冲突时，使用链表法：7上8下

JDK7及之前是插在头部
JDK8及之后是插在尾部

二、HashMap的重要属性

  		static final int DEFAULT_INITIAL_CAPACITY = 16;//哈希表主数组的默认长度
       
       //定义了一个float类型的变量，以后作为：默认的负载因子
       			 //负载因子是表示Hsah表中元素的填满的程度
      				 //太大容易引起哈西冲突，太小容易浪费  0.75是经过大量运算后得到的最好值
       //这个值其实可以自己改，但是不建议改，因为这个0.75是大量运算得到的
       static final float DEFAULT_LOAD_FACTOR = 0.75f;
       
       transient Entry<K,V>[] table;//主数组,每个元素为Entry类型
       
       transient int size;
       
       int threshold;//数组扩容的界限值,门槛值   16*0.75=12 
       
       final float loadFactor;//用来接收装填因子的变量，上面默认是常量，不可直接修改
       
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16

三、HashMap的构造器

JDK1.7

	   public HashMap() {
	        this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
	    }
        //本类中带参数构造器：--》作用给一些数值进行初始化的！
        public HashMap(int initialCapacity, float loadFactor) {
        //给capacity赋值，capacity的值一定是 大于你传进来的initialCapacity 的 最小的 2的幂次
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;
        //给loadFactor赋值，将装填因子0.75赋值给loadFactor
        this.loadFactor = loadFactor;
        //数组扩容的界限值,门槛值       16 * 0.75 = 12       1<<30 + 1
        threshold = (int)Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
        //给table数组赋值，初始化数组长度为16
        table = new Entry[capacity];            
    }
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16

JDK1.8

底层：数组+链表 ——>数组+红黑树
什么时候变为红黑树？

红黑树：（非平衡二叉树，相对平衡，有序的）

每个节点是黑的或者红的
根结点是黑的
如果一个节点是红的，则它的两个儿子节点是黑的
对于每个叶子节点，则从该节点到根结点的所有路径上包含相同数目的黑节点。

链表长度>8并且数组长度>64 ，两个条件都要满足
碰撞后，七上八下，七头插法。八尾插法。单向链表会变成双向链表，为树化做准备。

3.1 Map map = new HashMap();

先来看HashMap的结构

public class HashMap<K,V> extends AbstractMap<K,V> implements Map<K,V>, Cloneable, Serializable {
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16 初始容量
    static final int MAXIMUM_CAPACITY = 1 << 30; //最大容量
    static final float DEFAULT_LOAD_FACTOR = 0.75f; //默认负载因子，有效元素 超过 负载因子 * 容量 则扩容
    static final int TREEIFY_THRESHOLD = 8; //扩容阈值，等于  负载因子 * 容量
    static final int MIN_TREEIFY_CAPACITY = 64; //数组长度超过64 且 链表长度>=8 就树化
    transient Node<K,V>[] table; // HashMap的数组
    transient Set<Map.Entry<K,V>> entrySet; //链表中存放的结点数据类型
    transient int size;//有效元素个数
    int threshold;//扩容阈值变量
    final float loadFactor;//负载因子

    // 数组中的元素是Node类型，存   [哈希值，key，value，下一个元素地址]
    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;
        .....
	}
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21

看看这行代码会发生什么事

把默认的负载因子赋值给负载因子变量

    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
    }
1
2
3

三、HashMap的 put( ) 方法

JDK1.8

放元素
存放哈希值，key，value …

    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }
    
   static final int hash(Object key) {
        int h;
        //hashmap可以存键为null，哈希值为0,，如果不是null，会进行二次散列 + 扰动算法，减少哈希碰撞
        //hashtable不可以存
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
  }
  
	 final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {         
        Node<K,V>[] tab;
        Node<K,V> p; 
        int n;//数组长度
        int i;
        //如果是空数组
        if ((tab = table) == null || (n = tab.length) == 0)
        //就调用 resize()方法，并获取它的长度
            n = (tab = resize()).length;
        //计算元素在数组中存放的位置，效果等效 i=hash%length; p=tab[i]
        //如果这个位置没有元素
        if ((p = tab[i = (n - 1) & hash]) == null)
        //就在这个位置创建一个节点
            tab[i] = newNode(hash, key, value, null);
        else {
        //如果这个位置有元素
            Node<K,V> e; K k;
            //判断当前传来的元素和存放的元素的哈希值是否相等 并且 （key 是否是同一个 或 如果key不为空，key的值是否相等）
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                //如果都满足，则是同一个元素，e指向当前位置的元素
                e = p;
            //如果结点是 TreeNode 类型    
            else if (p instanceof TreeNode)
            //添加树结点
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
            //获取到当前链表（桶）的尾元素，如果链表长度>=8，则树化，否则在尾部添加元素。（如果是JDK1.7.则在头部添加元素。这里是JDK1.8）
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                    //添加尾结点
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1)
                        //树化
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            //如果存在这个键（哈希值相同并且key的值也相同，则是同一个键），则替换旧值，返回新值
            if (e != null) { 
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        //如果size到达阈值，则扩容
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

//扩容
  final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        //如果数组为null，则旧容量赋值为0，否则赋值为当前数组容量
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        //旧的扩容阈值
        int oldThr = threshold;
        //新容量和新的扩容阈值
        int newCap, newThr = 0;
        //如果旧容量>0
        if (oldCap > 0) {
        //如果旧容量>=最大容量，则赋值为int最大值，返回旧容量
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            //新容量=旧容量*2，扩容两倍，如果小于最大容量 并且 旧容量>=默认初始化容量
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                //则新阈值也扩大2倍
                newThr = oldThr << 1; 
        }
        //如果旧阈值0>0
        else if (oldThr > 0) 
        //则新容量=旧阈值，能走到这层说明旧容量为0，旧阈值>0。 所以 数组 在 初始化时大小为旧阈值，即为 threshold 
        //而此时threshold 还是0，还无法进入这层
            newCap = oldThr;
        else {            
        //当阈值和容量都为0
        //新容量为默认初始容量  16
            newCap = DEFAULT_INITIAL_CAPACITY;
            //新阈值为 负载因子 （数组使用比例） * 默认初始容量  0.75 * 16 = 12
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        // 如果新阈值为0
        if (newThr == 0) {
        // 阈值= 新容量 * 负载因子
            float ft = (float)newCap * loadFactor;
            // 新阈值 = ft 或 int 最大值         MAXIMUM_CAPACITY 1 <<< 30 
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        //阈值容量初始化为新阈值
        threshold = newThr;
        // 创建 HashMap的数组，为Node类型，初始容量为 newCap   16
        @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        // newTab 赋值给 table
        table = newTab;
        //如果旧表不为空
        if (oldTab != null) {
        //遍历旧表
            for (int j = 0; j < oldCap; ++j) {
                //获取旧表每个元素
                Node<K,V> e;
                //如果不为空
                //赋值给e
                if ((e = oldTab[j]) != null) {
                    //旧表该位置设为null
                    oldTab[j] = null;
                    //如果有下一个元素，新表的 计算后的位置 设为 原旧表元素
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                     //如果的TreeNode类型   
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else {
                    //loHead，下标不变情况下的链表头
                    //loTail，下标不变情况下的链表尾
                    
                    //hiHead，下标改变情况下的链表头
                    //hiTail，下标改变情况下的链表尾
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                        //设置next指针指向当前元素的下一个
                            next = e.next;
                            //如果其新位置是0
                            if ((e.hash & oldCap) == 0) {
                                //如果没有元素，就头结点赋值为e
                                //否则尾结点的下一个赋值为e
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                               //尾结点后移     
                                loTail = e;
                            }
                            else {
                            //如果不是0位置
                               //如果没有元素，就头结点赋值为e                  
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    //否则尾结点的下一个赋值为e
                                    hiTail.next = e;
                               //尾结点后移 
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        //如果有尾结点
                        if (loTail != null) {
                        //尾结点置空，新表当前位置设置Wie头结点
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        //如果尾结点不空
                        if (hiTail != null) {
                        //尾结点下一个置空
                            hiTail.next = null;
                            //新表 j+旧容量位置 设为头结点
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        /返回新表
        return newTab;
    }
// 设置容量为 大于当前容量的最小2的幂次    
 static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

//为树化做准备
    final void treeifyBin(Node<K,V>[] tab, int hash) {
        int n, index; Node<K,V> e;
        //容量 < 最小树化容量阈值 则不树化，只扩容
        if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
            resize();
        else if ((e = tab[index = (n - 1) & hash]) != null) {
            TreeNode<K,V> hd = null, tl = null;
            do {
                TreeNode<K,V> p = replacementTreeNode(e, null);
                if (tl == null)
                    hd = p;
                else {
                    p.prev = tl;
                    tl.next = p;
                }
                tl = p;
            } while ((e = e.next) != null);
            if ((tab[index] = hd) != null)
                hd.treeify(tab);
        }
    }    
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226

JDK1.7

      public V put(K key, V value) {
                //【1】对空值的判断
        if (key == null)
            return putForNullKey(value);
                //【2】调用hash方法，获取哈希码
        int hash = hash(key);
                //【3】得到key对应在数组中的位置
        int i = indexFor(hash, table.length);
                //【4】如果你放入的元素，在主数组那个位置上没有值，e==null  那么下面这个循环不走
                //当在同一个位置上放入元素的时候
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
             //哈希值一样  并且  equals相比一样   
             //(k = e.key) == key  如果是一个对象就不用比较equals了
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            // 旧值替换新值
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        modCount++;
       //【4】走addEntry添加这个节点的方法：
        addEntry(hash, key, value, i);
        return null;
    }
        
        //【2.1】hash方法返回这个key对应的哈希值，内部进行二次散列，为了尽量保证不同的key得到不同的哈希码！
    final int hash(Object k) {
        int h = 0;
        if (useAltHashing) {
            if (k instanceof String) {
                return sun.misc.Hashing.stringHash32((String) k);
            }
            h = hashSeed;
        }
                //k.hashCode()函数调用的是key键值类型自带的哈希函数，
                //由于不同的对象其hashCode()有可能相同，所以需对hashCode()再次哈希，以降低相同率。
        h ^= k.hashCode();
                /*
                接下来的一串与运算和异或运算，称之为“扰动函数”，
                扰动的核心思想在于使计算出来的值在保留原有相关特性的基础上，
                增加其值的不确定性，从而降低冲突的概率。
                不同的版本实现的方式不一样，但其根本思想是一致的。
                往右移动的目的，就是为了将h的高位利用起来，减少哈西冲突
                */
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }
        //【3.1】返回int类型数组的坐标
        static int indexFor(int h, int length) {
                //其实这个算法就是取模运算：h%length，取模效率不如位运算
        return h & (length-1);
    }
    //【4.1】调用addEntry
    void addEntry(int hash, K key, V value, int bucketIndex) {
    //满足条件则扩容
        //size的大小  大于 16*0.75=12的时候，比如你放入的是第13个，这第13个你打算放在没有元素的位置上的时候
        if ((size >= threshold) && (null != table[bucketIndex])) {
           //主数组扩容为2倍
            resize(2 * table.length);
            //重新调整当前元素的hash码
            hash = (null != key) ? hash(key) : 0;
            //重新计算元素位置
            bucketIndex = indexFor(hash, table.length);
        }
      //将hash,key,value,bucketIndex位置  封装为一个Entry对象：
        createEntry(hash, key, value, bucketIndex);
    }

        void createEntry(int hash, K key, V value, int bucketIndex) {
         //获取bucketIndex位置上的元素给e
        Entry<K,V> e = table[bucketIndex];
         //然后将hash, key, value封装为一个对象，然后将下一个元素的指向为e 
         //（链表的头插法）
          //将新的Entry放在table[bucketIndex]的位置上
        table[bucketIndex] = new Entry<>(hash, key, value, e);
           //集合中加入一个元素 size+1
        size++;
    }
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81

五、HashMap底层数组的扩容

每次扩容为之前的1倍，比如原16，扩容后为32

JDK1.7

  void resize(int newCapacity) {
  //存储旧表
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
        //如果无法再扩容1倍，就再扩容 0.25 （之前是0.75倍） 
            threshold = Integer.MAX_VALUE;
            return;
        }
         //创建长度为newCapacity的数组
        Entry[] newTable = new Entry[newCapacity];
        boolean oldAltHashing = useAltHashing;
        useAltHashing |= sun.misc.VM.isBooted() &&
                (newCapacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
        boolean rehash = oldAltHashing ^ useAltHashing;
                //转让方法：将老数组中的东西都重新放入新数组中
        transfer(newTable, rehash);
                //老数组替换为新数组
        table = newTable;
         //重新计算 扩容阈值
        threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }
        //老数组里的东西加到新数组里
  void transfer(Entry[] newTable, boolean rehash) {
        int newCapacity = newTable.length;
        for (Entry<K,V> e : table) {
            while(null != e) {
                Entry<K,V> next = e.next;
                if (rehash) {
                    e.hash = null == e.key ? 0 : hash(e.key);
                }
                                //将哈希值，和新的数组容量传进去，重新计算key在新数组中的位置
                int i = indexFor(e.hash, newCapacity);
                                //头插法
                e.next = newTable[i];//获取链表上元素给e.next
                newTable[i] = e;//然后将e放在i位置 
                e = next;//e再指向下一个节点继续遍历
            }
        }
    }
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40

六、HashMap的两个经典面试题

1. 负载因子（装填因子、加载因子）为什么是 0.75？

装填因子设置为1：空间利用率得到了很大的满足，但很容易碰撞，产生链表-> 查询效率低，时间长
装填因子设置为0.5：发生碰撞的概率低，扩容，产生链表的几率也低，查询效率高，但空间利用率低
在时间和空间的开销中，取中间值
在这里插入图片描述

2. 主数组的长度为什么必须为2的n次幂？

h & ( length - 1) 等效 h % length 操作，等效的前提就是：length必须是2的整数倍
为了防止哈希冲突，位置冲突，不然很容易很容易算出相同的哈希值。

七、HashMap1.8的底层原理

底层原理：数组 + 链表 = 哈希表

相关阅读:
Neo4j深度学习
 如何创建加载项（1）
【SpringCloud】认识微服务
 swift指针&内存管理-引用
 总结一下JavaScript 中的 for 循环
 移动魔百盒UNT401A、UNT403A、UNT413A_晶晨S905L3A/B芯片_红外蓝牙语音_免拆卡刷固件
 前后端分离项目-基于springboot+vue的图书馆管理系统的设计与实现（内含代码+文档+报告）
Docker三剑客之docker-swarm
批量更新 MongoDB 中的文档
 海外推广必备|如何制定领英LinkedIn营销战略？
原文地址：https://blog.csdn.net/niTaoTaoa/article/details/126559033