Java数据结构——第十二节 - Map和Set

在这里插入图片描述
视频讲解

Java中的两大神器Map和Set（47分钟分析）

Map和Set

1. 搜索
- 1.1 概念及场景
- 1.2 模型
2. Map 的使用
3. Set 的说明
- 3.1 常见方法说明
- 3.2 HashSet的使用案例
4 oj题练习
5. 搜索树
6. 哈希表

1. 搜索

1.1 概念及场景

Map和set是一种专门用来进行搜索的容器或者数据结构，其搜索的效率与其具体的实例化子类有关。 以前常见的搜索方式有：

直接遍历，时间复杂度为O(N)，元素如果比较多效率会非常慢
二分查找，时间复杂度为 ,但搜索前必须要求序列是有序的

上述排序比较适合静态类型的查找，即一般不会对区间进行插入和删除操作了，而现实中的查找比如：

根据姓名查询考试成绩
通讯录，即根据姓名查询联系方式
不重复集合，即需要先搜索关键字是否已经在集合中
可能在查找时进行一些插入和删除的操作，即动态查找，那上述两种方式就不太适合了，本节介绍的Map和Set是一种适合动态查找的集合容器。

1.2 模型

一般把搜索的数据称为关键字（Key），和关键字对应的称为值（Value），将其称之为Key-value的键值对，所以模型会有两种：

纯 key 模型，比如：
有一个英文词典，快速查找一个单词是否在词典中
快速查找某个名字在不在通讯录中
Key-Value 模型,比如：
统计文件中每个单词出现的次数，统计结果是每个单词都有与其对应的次数：<单词，单词出现的次数>
梁山好汉的江湖绰号：每个好汉都有自己的江湖绰号
而Map中存储的就是key-value的键值对，Set中只存储了Key。

2. Map 的使用

Map的官方文档在这里插入图片描述

2.1 关于Map的说明

Map是一个接口类，该类没有继承自Collection，该类中存储的是结构的键值对，并且K一定是唯一的，不能重复。

2.2 Map.Entry（表示Map中的一个节点，用来遍历Map）

Map.Entry 是Map内部实现的用来存放键值对映射关系的内部类，该内部类中主要提供了的获取，value的设置以及Key的比较方式。
注意：Map.Entry并没有提供设置Key的方法

Set<Map.Entry<String,Integer>> entrySet = map.entrySet();
for(Map.Entry<String,Integer> entry : entrySet) {
    System.out.println("key: "+entry.getKey()+" val:" + entry.getValue());
}
1
2
3
4

2.3 Map 的常用方法说明

在这里插入图片描述
注意：

Map是一个接口，不能直接实例化对象，如果要实例化对象只能实例化其实现类TreeMap或者HashMap

Map<String,Integer> map = new HashMap<String,Integer>();
Map<String,Integer> map2 = new TreeMap<>();
1
2

Map中存放键值对的Key是唯一的，value是可以重复的

        map.put("zx",0);
        System.out.println(map);
        if(map.containsKey("zx")){
            int val = map.get("zx");
            map.put("zx",val+1);
        }
1
2
3
4
5
6

Map中的Key可以全部分离出来，存储到Set中来进行访问(因为Key不能重复)。
Map中的value可以全部分离出来，存储在Collection的任何一个子集合中(value可能有重复)。
Map中键值对的Key不能直接修改，value可以修改，如果要修改key，只能先将该key删除掉，然后再来进行重新插入。
TreeMap和HashMap的区别

2.4 ***HashMap的使用案例

    import java.util.TreeMap;
    import java.util.Map;
    public static void TestMap(){
        Map<String, String> m = new HashMap<>();

        // put(key, value):插入key-value的键值对
        // 如果key不存在，会将key-value的键值对插入到map中,返回null
        m.put("林冲", "豹子头");
        m.put("鲁智深", "花和尚");
        m.put("武松", "行者");
        m.put("宋江", "及时雨");
        String str = m.put("李逵", "黑旋风");
        System.out.println(m.size());
        System.out.println(m);

        // put(key,value): 注意key不能为空，但是value可以为空
        // key如果为空，会抛出空指针异常
        //m.put(null, "花名");
        str = m.put("无名", null);
        System.out.println(m.size());

        // put(key, value):
        // 如果key存在，会使用value替换原来key所对应的value，返回旧value
        str = m.put("李逵", "铁牛");

        // get(key): 返回key所对应的value
        // 如果key存在，返回key所对应的value
        // 如果key不存在，返回null
        System.out.println(m.get("鲁智深"));
        System.out.println(m.get("史进"));

        //GetOrDefault(): 如果key存在，返回与key所对应的value，如果key不存在，返回一个默认值
        System.out.println(m.getOrDefault("李逵", "铁牛"));
        System.out.println(m.getOrDefault("史进", "九纹龙"));
        System.out.println(m.size());

        //containKey(key)：检测key是否包含在Map中，时间复杂度：O(logN)
        // 按照红黑树的性质来进行查找
        // 找到返回true，否则返回false
        System.out.println(m.containsKey("林冲"));
        System.out.println(m.containsKey("史进"));

        // containValue(value): 检测value是否包含在Map中，时间复杂度: O(N)
        // 找到返回true，否则返回false
        System.out.println(m.containsValue("豹子头"));
        System.out.println(m.containsValue("九纹龙"));

        // 打印所有的key
        // keySet是将map中的key防止在Set中返回的
        for(String s : m.keySet())
            System.out.print(s + " ");
        System.out.println();

        // 打印所有的value
        // values()是将map中的value放在collect的一个集合中返回的
        	Collection<Integer> list = map.values();
        System.out.println(list);
        
        for(String s : m.values()){
            System.out.print(s + " ");
        }
        System.out.println();

        // 打印所有的键值对
        // entrySet(): 将Map中的键值对放在Set中返回了
        for(Map.Entry<String, String> entry : m.entrySet()){
            System.out.println(entry.getKey() + "--->" + entry.getValue());
        }
        System.out.println();
    }

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71

同学们可使用TreeMap来实例化Map，看看TreeMap和HashMap的不同

3. Set 的说明

Set的官方说明

Set与Map主要的不同有两点：Set是继承自Collection的接口类，Set中只存储了Key。

3.1 常见方法说明

在这里插入图片描述
注意：

Set是继承自Collection的一个接口类
Set中只存储了key，并且要求key一定要唯一
Set的底层是使用Map来实现的，其使用key与Object的一个默认对象作为键值对插入到Map中的
Set最大的功能就是对集合中的元素进行去重
实现Set接口的常用类有TreeSet和HashSet，还有一个LinkedHashSet，LinkedHashSet是在HashSet的基础
上维护了一个双向链表来记录元素的插入次序。
Set中的Key不能修改，如果要修改，先将原来的删除掉，然后再重新插入
Set中不能插入null的key。
TreeSet和HashSet的区别

3.2 HashSet的使用案例

import java.util.TreeSet;
import java.util.Iterator;
import java.util.Set;
public static void TestSet(){
    Set<String> s = new HashSet<>();

    // add(key): 如果key不存在，则插入，返回ture
    // 如果key存在，返回false
    boolean isIn = s.add("apple");
    s.add("orange");
    s.add("peach");
    s.add("banana");
    System.out.println(s.size());
    System.out.println(s);

    isIn = s.add("apple");

    // add(key): key如果是空，抛出空指针异常
    //s.add(null);
    // contains(key): 如果key存在，返回true，否则返回false
    System.out.println(s.contains("apple"));
    System.out.println(s.contains("watermelen"));

    // remove(key): key存在，删除成功返回true
    // key不存在，删除失败返回false
    // key为空，抛出空指针异常
    s.remove("apple");
    System.out.println(s);

    s.remove("watermelen");
    System.out.println(s);
    
//Set的遍历！！！
    Iterator<String> it = s.iterator();
    while(it.hasNext()){
        System.out.print(it.next() + " ");
    }
    System.out.println();
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39

4 oj题练习

136. 只出现一次的数字

应用：set.contains(key)

    public int singleNumber(int[] nums) {
        int ret = 0;
        for(int i = 0; i< nums.length;i++) {
            ret ^= nums[i];
        }
        return ret;
    }

    public int singleNumber2(int[] nums) {

        Set<Integer> set = new HashSet<>();
        for(int i = 0;i < nums.length;i++) {
            int key = nums[i];
            if(!set.contains(key)) {
                set.add(nums[i]);
            }else {
                set.remove(key);
            }
        }
        //走到这里  集合当中  只剩下一个元素了  问题：如何找到这个元素
        for(int i = 0;i < nums.length;i++) {
            int key = nums[i];
            if(set.contains(key)) {
                return nums[i];
            }
        }
        return -1;
    }
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28

138. ***复制带随机指针的链表

map存放的是两个属性

Map<Node,Node> map = new HashMap<>();
1

public Node copyRandomList(Node head) {
        Map<Node,Node> map = new HashMap<>();
        //1、第一次遍历链表
        Node cur = head;
        while(cur != null) {
            Node node = new Node(cur.val);
            map.put(cur,node);
            cur = cur.next;
        }
        //2、第二次遍历链表
        cur = head;
        while(cur != null) {
            map.get(cur).next = map.get(cur.next);
            map.get(cur).random = map.get(cur.random);
            cur = cur.next;
        }
        return map.get(head);
    }
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18

771. 宝石与石头

1. String用char遍历char ch : string.toCharArray()

public int numJewelsInStones(String jewels, String stones) {
    Set<Character> set = new HashSet<>();
    //把宝石存储在集合当中
    for(char ch : jewels.toCharArray()) {
        set.add(ch);
    }

    int count = 0;
    //遍历石头 查看石头当中 有多少个宝石
    for(char ch : stones.toCharArray()) {
        if(set.contains(ch)) {
            count++;
        }
    }
    return count;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16

[编程题]旧键盘 (20)

import java.util.*;

/**
 * 旧键盘
 * 题目描述
 * 旧键盘上坏了几个键，于是在敲一段文字的时候，对应的字符就不会出现。现在给出应该输入的
 * 一段文字、以及实际被输入的文字，请你列出肯定坏掉的那些键。
 * 输入描述:
 * 输入在2行中分别给出应该输入的文字、以及实际被输入的文字。每段文字是不超过80个字符的串，
 * 由字母A-Z（包括大、小写）、数字0-9、以及下划线“_”（代表空格）组成。题目保证2个字符串均非空。
 * 输出描述:
 * 按照发现顺序，在一行中输出坏掉的键。其中英文字母只输出大写，每个坏键只输出一次。题目保证至少有1个坏键。
 * 输入例子:
 * 7_This_is_a_test
 * _hs_s_a_es
 * 输出例子:
 * 7TI
 *
 * @author shijiacheng
 * @date 2018/2/1
 */
public class B1019OldKeyboard {
    public static void main(String[] args) {
    //一个sc 可以输入 两次
        Scanner sc = new Scanner(System.in);
        String old = sc.next();
        String result = sc.next();

        Map<Character, Integer> mapResult = new HashMap<>();
        Map<Character, Integer> mapOld = new HashMap<>();

//把String转成大写，再转成数组
        char[] charsResult = result.toUpperCase().toCharArray();
        char[] charsOld = old.toUpperCase().toCharArray();

        for (int i = 0; i < charsResult.length; i++) {
            if (!mapResult.containsKey(charsResult[i])) {
                mapResult.put(charsResult[i], i);
            }
        }
        List<Character> list = new ArrayList<>();
        for (int i = 0; i < charsOld.length; i++) {
            if (!mapResult.containsKey(charsOld[i])) {
                if (!mapOld.containsKey(charsOld[i])) {
                    mapOld.put(charsOld[i], i);
                    list.add(charsOld[i]);
                }
            }
        }
        for (int i = 0; i < list.size(); i++) {
            System.out.print(list.get(i));
        }
    }
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55

692. 前K个高频单词

一个比较器的两种比较方法

    //2.建立大小为k的    小根堆
//1. 传入比较器
//2. 比较器有两种比较方式针对于不同情况
    PriorityQueue<Map.Entry<String,Integer> > minQ =
            new PriorityQueue<>(k,new Comparator<Map.Entry<String, Integer>>() {
        @Override
        public int compare(Map.Entry<String, Integer> o1, Map.Entry<String, Integer> o2) {
//频率相同的时候  把这个变成以key 为准的大根堆
            if(o1.getValue().compareTo(o2.getValue()) == 0) {
                return o2.getKey().compareTo(o1.getKey());
            }
            return o1.getValue().compareTo(o2.getValue());//根据频率
        }
    });
1
2
3
4
5
6
7
8
9
10
11
12
13
14

遍历map for(Map.Entry entry : map.entrySet())

public static List<String> topKFrequent(String[] words, int k) {
    //1. 统计单词出现的次数
    Map<String,Integer> map = new HashMap<>();
    for(String key : words) {
        if(map.get(key) == null) {
            map.put(key,1);
        }else {
            int val = map.get(key);
            map.put(key,val+1);
        }
    }

    //2.建立大小为k的    小根堆
//1. 传入比较器
//2. 比较器有两种比较方式针对于不同情况
    PriorityQueue<Map.Entry<String,Integer> > minQ =
            new PriorityQueue<>(k,new Comparator<Map.Entry<String, Integer>>() {
        @Override
        public int compare(Map.Entry<String, Integer> o1, Map.Entry<String, Integer> o2) {
//频率相同的时候  把这个变成以key 为准的大根堆
            if(o1.getValue().compareTo(o2.getValue()) == 0) {
                return o2.getKey().compareTo(o1.getKey());
            }
            return o1.getValue().compareTo(o2.getValue());//根据频率
        }
    });

    //3. 遍历map，先放满K个，然后从第K+1个开始 和堆顶元素比较
    for(Map.Entry<String,Integer> entry : map.entrySet()) {
        //3.1 如果小根堆没有放满k个 那么直接放进去
        if(minQ.size() < k) {
            minQ.offer(entry);
        }else {
            //3.2 如果已经方面了小根堆，那么 我们需要让当前元素 和堆顶元素比较
            Map.Entry<String,Integer> top = minQ.peek();
            if(top != null) {
                if (entry.getValue() > top.getValue()) {
                    minQ.poll();
                    minQ.offer(entry);
                } else {
                    //<  ||  ==  3.3 此时考虑频率相同，key小的入队
                    if (top.getValue().compareTo(entry.getValue()) == 0) {
                        if (top.getKey().compareTo(entry.getKey()) > 0) {
                            minQ.poll();
                            minQ.offer(entry);
                        }
                    }
                }
            }
        }
    }
    List<String> ret = new ArrayList<>();
    for (int i = 0; i < k; i++) {
        Map.Entry<String,Integer> top = minQ.poll();
        if(top != null) {
            ret.add(top.getKey());
        }
    }
    Collections.reverse(ret);
    return ret;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61

6. 删除重复数据 | 找到第一个重复数据 | 统计出现的次数

在这里插入图片描述

//
public static void func1(int[] array) {
    Set<Integer> set = new HashSet<>();
    for (int i = 0; i < array.length; i++) {
        set.add(array[i]);
    }
    System.out.println(set);
}

public static int func2(int[] array) {
    Set<Integer> set = new HashSet<>();
    for (int i = 0; i < array.length; i++) {
        if(!set.contains(array[i])) {
            set.add(array[i]);
        }else {
            return array[i];
        }
    }
    return -1;
}

public static void func3(int[] array) {
    //key:关键字   val:次数
    Map<Integer,Integer> map = new HashMap<>();
    for (int i = 0; i < array.length; i++) {
        int key = array[i];
        if(map.get(key) == null) {
            map.put(key,1);
        }else {
            int val = map.get(key);//1
            map.put(key,val+1);
        }
    }
    System.out.println(map);
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35

5. 搜索树

5.1 概念

二叉搜索树又称二叉排序树，它或者是一棵空树，或者是具有以下性质的二叉树:

若它的左子树不为空，则左子树上所有节点的值都小于根节点的值
若它的右子树不为空，则右子树上所有节点的值都大于根节点的值
它的左右子树也分别为二叉搜索树

5.2 操作-查找

在这里插入图片描述

5.3 操作-插入

如果树为空树，即根 == null，直接插入
如果树不是空树，按照查找逻辑确定插入位置，插入新结点

5.4 操作-删除（难点）

设待删除结点为 cur, 待删除结点的双亲结点为 parent

cur.left == null

cur 是 root，则 root = cur.right
cur 不是 root，cur 是 parent.left，则 parent.left = cur.right
cur 不是 root，cur 是 parent.right，则 parent.right = cur.right

cur.right == null

cur 是 root，则 root = cur.left
cur 不是 root，cur 是 parent.left，则 parent.left = cur.left
cur 不是 root，cur 是 parent.right，则 parent.right = cur.left

cur.left != null && cur.right != null

需要使用替换法进行删除，即在它的右子树中寻找中序下的第一个结点(关键码最小)，用它的值填补到被删除节点中，再来处理该结点的删除问题

5.5 实现

public class BinarySearchTree {

    static class TreeNode {
        public int val;
        public TreeNode left;
        public TreeNode right;

        public TreeNode(int val) {
            this.val =val;
        }
    }

    public TreeNode root;

    /**
     * 查找key是否存在于二叉搜索树当中
     * @param key
     * @return 当前节点找到返回地址  找不到返回null
     */
    public TreeNode search(int key) {
        TreeNode cur = root;
        while (cur != null) {
            if(cur.val < key) {
                cur = cur.right;
            }else if(cur.val > key) {
                cur = cur.left;
            }else {
                return cur;
            }
        }
        return null;
    }

    /**
     * 插入节点
     * @param key
     */
    public boolean insert(int key) {
        TreeNode node = new TreeNode(key);
        if(root == null) {
            root = node;
            return true;
        }
        TreeNode cur = root;
        TreeNode parent = null;
        while (cur != null) {
            if(cur.val < key) {
                parent = cur;
                cur = cur.right;
            }else if(cur.val > key) {
                parent = cur;
                cur = cur.left;
            }else {
                //存在相同的元素 不能插入成功
                return false;
            }
        }
        if(parent.val <  key) {
            parent.right = node;
        }else {
            parent.left = node;
        }
        return true;
    }

    /***
     * 删除关键字为key的节点
     * @param key
     */
    public void remove(int key) {
        TreeNode cur = root;
        TreeNode parent = null;
        while (cur != null) {
            if(cur.val < key) {
                parent = cur;
                cur = cur.right;
            }else if(cur.val > key) {
                parent = cur;
                cur = cur.left;
            }else {
                //找到了 就要开始删除了
                removeNode(cur,parent);
                return;
            }
        }
    }

    /**
     * 进行删除
     * @param cur 删除的节点
     * @param parent 删除节点的父节点
     */
    private void removeNode(TreeNode cur, TreeNode parent) {
        if(cur.left == null) {
            if(cur == root) {
                root = root.right;
            }else if(cur == parent.left) {
                parent.left = cur.right;
            }else {
                parent.right = cur.right;
            }
        }else if(cur.right == null) {
            if(cur == root) {
                root = root.left;
            }else if(cur == parent.left){
                parent.left = cur.left;
            }else {
                parent.right = cur.left;
            }
        }else {
            TreeNode targetParent = cur;
            TreeNode target = cur.right;
            while (target.left != null) {
                targetParent = target;
                target = target.left;
            }
            cur.val = target.val;
            //分两种情况
            if(targetParent.left == target) {
                targetParent.left = target.right;
            }else {
                targetParent.right = target.right;
            }
        }
    }

    public void inorder(TreeNode root) {
        if(root == null) {
            return;
        }
        inorder(root.left);
        System.out.print(root.val+" ");
        inorder(root.right);
    }

}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136

5.6 性能分析

插入和删除操作都必须先查找，查找效率代表了二叉搜索树中各个操作的性能。
对有n个结点的二叉搜索树，若每个元素查找的概率相等，则二叉搜索树平均查找长度是结点在二叉搜索树的深度的函数，即结点越深，则比较次数越多。
但对于同一个关键码集合，如果各关键码插入的次序不同，可能得到不同结构的二叉搜索树：
最优情况下，二叉搜索树为完全二叉树，其平均比较次数为
最差情况下，二叉搜索树退化为单支树，其平均比较次数为 N/2
问题：如果退化成单支树，二叉搜索树的性能就失去了。那能否进行改进，不论按照什么次序插入关键码，都可以是二叉搜索树的性能最佳？

5.7 和 java 类集的关系

TreeMap 和 TreeSet 即 java 中利用搜索树实现的 Map 和 Set；实际上用的是红黑树，而红黑树是一棵近似平衡的二叉搜索树，即在二叉搜索树的基础之上 + 颜色以及红黑树性质验证，关于红黑树的内容后序再进行讲解。

6. 哈希表

6.1 概念

顺序结构以及平衡树中，元素关键码与其存储位置之间没有对应的关系，因此在查找一个元素时，必须要经过关键码的多次比较。顺序查找时间复杂度为O(N)，平衡树中为树的高度，即O(log2N )，搜索的效率取决于搜索过程中元素的比较次数

理想的搜索方法：可以不经过任何比较，一次直接从表中得到要搜索的元素。如果构造一种存储结构，通过某种函数(hashFunc)使元素的存储位置与它的关键码之间能够建立一一映射的关系，那么在查找时通过该函数可以很快找到该元素。

当向该结构中：

插入元素

根据待插入元素的关键码，以此函数计算出该元素的存储位置并按此位置进行存放

搜索元素

对元素的关键码进行同样的计算，把求得的函数值当做元素的存储位置，在结构中按此位置取元素比较，若关键码相等，则搜索成功

该方式即为哈希(散列)方法，哈希方法中使用的转换函数称为哈希(散列)函数，构造出来的结构称为哈希表(Hash
Table)(或者称散列表)
用该方法进行搜索不必进行多次关键码的比较，因此搜索的速度比较快问题：按照上述哈希方式，向集合中插入元素44，会出现什么问题？

6.2 冲突-概念

对于两个数据元素的关键字Ki 和 Kj(i != j)，有Ki != Kj，但有：Hash(Ki ) == Hash(Kj )，即：不同关键字通过相同哈希哈数计算出相同的哈希地址，该种现象称为哈希冲突或哈希碰撞。
把具有不同关键码而具有相同哈希地址的数据元素称为“同义词”。

6.3 冲突-避免

首先，我们需要明确一点，由于我们哈希表底层数组的容量往往是小于实际要存储的关键字的数量的，这就导致一个问题，冲突的发生是必然的，但我们能做的应该是尽量的降低冲突率。

6.4 冲突-避免-哈希函数设计

引起哈希冲突的一个原因可能是：哈希函数设计不够合理。哈希函数设计原则：

哈希函数的定义域必须包括需要存储的全部关键码，而如果散列表允许有m个地址时，其值域必须在0到m-1之间
哈希函数计算出来的地址能均匀分布在整个空间中
哈希函数应该比较简单

常见哈希函数

直接定制法–(常用)
取关键字的某个线性函数为散列地址：Hash（Key）= A*Key + B 优点：简单、均匀缺点：需要事先知道关键字的分布情况使用场景：适合查找比较小且连续的情况面试题：字符串中第一个只出现一次字符
除留余数法–(常用)
设散列表中允许的地址数为m，取一个不大于m，但最接近或者等于m的质数p作为除数，按照哈希函数：Hash(key) = key% p(p<=m),将关键码转换成哈希地址
平方取中法–(了解)
假设关键字为1234，对它平方就是1522756，抽取中间的3位227作为哈希地址；再比如关键字为4321，对它平方就是18671041，抽取中间的3位671(或710)作为哈希地址平方取中法比较适合：不知道关键字的分布，而位数又不是很大的情况
折叠法–(了解)
折叠法是将关键字从左到右分割成位数相等的几部分(最后一部分位数可以短些)，然后将这几部分叠加求和，并按散列表表长，取后几位作为散列地址。
折叠法适合事先不需要知道关键字的分布，适合关键字位数比较多的情况
随机数法–(了解)
选择一个随机函数，取关键字的随机函数值为它的哈希地址，即H(key) = random(key),其中random为随机数函数。
通常应用于关键字长度不等时采用此法
数学分析法–(了解)
设有n个d位数，每一位可能有r种不同的符号，这r种不同的符号在各位上出现的频率不一定相同，可能在某些位上分布比较均匀，每种符号出现的机会均等，在某些位上分布不均匀只有某几种符号经常出现。可根据散列表的大小，选择其中各种符号分布均匀的若干位作为散列地址。例如
假设要存储某家公司员工登记表，如果用手机号作为关键字，那么极有可能前7位都是相同的，那么我们可以选择后面的四位作为散列地址，如果这样的抽取工作还容易出现冲突，还可以对抽取出来的数字进行反转(如1234改成4321)、右环位移(如1234改成4123)、左环移位、前两数与后两数叠加(如1234改成12+34=46)等方法。
数字分析法通常适合处理关键字位数比较大的情况，如果事先知道关键字的分布且关键字的若干位分布较均匀的情况
注意：哈希函数设计的越精妙，产生哈希冲突的可能性就越低，但是无法避免哈希冲突

6.5 冲突-避免-负载因子调节（重点掌握）

在这里插入图片描述
负载因子和冲突率的关系粗略演示

所以当冲突率达到一个无法忍受的程度时，我们需要通过降低负载因子来变相的降低冲突率
已知哈希表中已有的关键字个数是不可变的，那我们能调整的就只有哈希表中的数组的大小

6.6 冲突-解决

解决哈希冲突两种常见的方法是：闭散列和开散列

6.7 冲突-解决-闭散列

闭散列：也叫开放定址法，当发生哈希冲突时，如果哈希表未被装满，说明在哈希表中必然还有空位置，那么可以把key存放到冲突位置中的“下一个” 空位置中去。那如何寻找下一个空位置呢？

线性探测
比如上面的场景，现在需要插入元素44，先通过哈希函数计算哈希地址，下标为4，因此44理论上应该插在该位置，但是该位置已经放了值为4的元素，即发生哈希冲突。
线性探测：从发生冲突的位置开始，依次向后探测，直到寻找到下一个空位置为止。
>插入
二次探测
研究表明：当表的长度为质数且表装载因子a不超过0.5时，新的表项一定能够插入，而且任何一个位置都不会被探查两次。因此只要表中有一半的空位置，就不会存在表满的问题。在搜索时可以不考虑表装满的情况，但在插入时必须确保表的装载因子a不超过0.5，如果超出必须考虑增容。
因此：比散列最大的缺陷就是空间利用率比较低，这也是哈希的缺陷。

6.8 冲突-解决-开散列/哈希桶（重点掌握）

开散列法又叫链地址法(开链法)，首先对关键码集合用散列函数计算散列地址，具有相同地址的关键码归于同一子集合，每一个子集合称为一个桶，各个桶中的元素通过一个单链表链接起来，各链表的头结点存储在哈希表中。
从上图可以看出，开散列中每个桶中放的都是发生哈希冲突的元素。
开散列，可以认为是把一个在大集合中的搜索问题转化为在小集合中做搜索了

6.9 冲突严重时的解决办法

刚才我们提到了，哈希桶其实可以看作将大集合的搜索问题转化为小集合的搜索问题了，那如果冲突严重，就意味着小集合的搜索性能其实也时不佳的，这个时候我们就可以将这个所谓的小集合搜索问题继续进行转化，例如：

每个桶的背后是另一个哈希表
每个桶的背后是一棵搜索树

6.10 实现

public class HashBuck {

    static class Node {
        public int key;
        public int val;
        public Node next;

        public Node(int key, int val) {
            this.key = key;
            this.val = val;
        }
    }

    public Node[] array ;
    public int usedSize;//记录 当前哈希桶当中 有效数据的个数
    //默认的负载因子
    public static final float DEFAULT_LOAD_FACTOR = 0.75F;

    public HashBuck() {
        this.array = new Node[10];
        this.usedSize = 0;
    }

    /**
     * 存储key val
     * @param key
     * @param val
     */
    public void put(int key,int val) {
        Node node = new Node(key,val);
        int index = key % array.length;

        Node cur = array[index];
        while (cur != null) {
            if(cur.key == key) {
                cur.val = val;
                return;
            }
            cur = cur.next;
        }
        //头插法
        node.next = array[index];
        array[index] = node;
        usedSize++;

        //检查负载因子
        if(loadFactor() >= DEFAULT_LOAD_FACTOR) {
            //进行 扩容
            grow();
        }

    }

    private float loadFactor() {
        return usedSize*1.0f / array.length;
    }

    private void grow() {
        Node[] newArray = new Node[2* array.length];
        //重新的哈希
        /**
         * 1. 遍历数组的每个元素的链表
         * 2. 每遍历到一个节点，就重新哈希  key % len
         * 3. 进行头插法
         */
        for (int i = 0; i < array.length; i++) {
            Node cur = array[i];
            while (cur != null) {
                Node curNext = cur.next;
                //找到新的位置
                int index = cur.key % newArray.length;
                cur.next = newArray[index];
                newArray[index] = cur;
                //这样写对吗？
                //cur = cur.next;
                cur = curNext;
            }
        }
        this.array = newArray;
    }

    /**
     * 通过key值 获取val 值
     * @param key
     * @return
     */
    public int get(int key) {
        int index = key % array.length;
        Node cur = array[index];
        while (cur != null) {
            if(cur.key == key) {
                return cur.val;
            }
            cur = cur.next;
        }
        return -1;
    }
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98

public class HashBuck2 <K,V>{
    static class Node<K,V> {
        public K key;
        public V val;
        public Node<K,V> next;

        public Node(K key,V val) {
            this.key = key;
            this.val = val;
        }
    }

    public Node<K,V> [] array = (Node<K,V> [])new Node[10];
    //public Node [] array = new Node[10];
    public int usedSize;

    public static final float DEFAULT_LOAD_FACTOR = 0.75F;

    public void put(K key,V val) {
        Node<K,V> node = new Node<>(key,val);
        int hash = key.hashCode();
        int index = hash % array.length;

        Node<K,V> cur = array[index];
        while (cur != null) {
            if(cur.key.equals(key)) {
                cur.val = val;
            }
            cur = cur.next;
        }

        node.next = array[index];
        array[index] = node;
        usedSize++;

        this.usedSize++;
        //扩容自己完善
    }
    public V get(K key) {
        int hash = key.hashCode();
        int index = hash % array.length;
        Node<K,V> cur = array[index];
        while (cur != null) {
            if(cur.key.equals(key)) {
                return cur.val;
            }
            cur = cur.next;
        }
        return null;
    }
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51

6.11 性能分析

虽然哈希表一直在和冲突做斗争，但在实际使用过程中，我们认为哈希表的冲突率是不高的，冲突个数是可控的，也就是每个桶中的链表的长度是一个常数，所以，通常意义下，我们认为哈希表的插入/删除/查找时间复杂度是O(1)

6.12 和 java 类集的关系

HashMap 和 HashSet 即 java 中利用哈希表实现的 Map 和 Set
java 中使用的是哈希桶方式解决冲突的
java 会在冲突链表长度大于一定阈值后，将链表转变为搜索树（红黑树）
java 中计算哈希值实际上是调用的类的 hashCode 方法，进行 key 的相等性比较是调用 key 的 equals 方法。所以如果要用自定义类作为 HashMap 的 key 或者 HashSet 的值，必须覆写 hashCode 和 equals 方法，而且要做到 equals 相等的对象，hashCode 一定是一致的。

6.13 面试题

在这里插入图片描述

相关阅读:
Linux shell编程学习笔记28：脚本调试 set命令
 分布式任务调度平台——XXL-JOB
分布式事务之BASE理论
 area.js 文件下载
 代码随想录算法训练营day56 | 300.最长递增子序列、674. 最长连续递增序列、718. 最长重复子数组
 项目：TCP在线云词典
 泰语同声翻译一天多少钱呢
 react 入门
 【C】字符串函数与字符函数
 网格窗帘布能符合NFP92-503法国M2级别要求吗？
原文地址：https://blog.csdn.net/m0_63571404/article/details/126194489