[C++数据结构](22)哈希表与unordered_set，unordered_map实现

[C++数据结构](22)哈希表与unordered_set，unordered_map实现
文章目录
unordered 系列容器

C++ 中，哈希表实现的容器为 unordered_map 和 unordered_set

和 set、map 的区别：
- 无序
- 只有单向迭代器
- 效率高，查找的时间复杂度为 $O (1)$
之所以 set 和 map 的查找速度没有这么快，是因为它们的底层是红黑树，查找元素时需要从根结点开始比较，每比较一次可以排除一半的元素，时间复杂度是 $O(\log n)$ ，当然这已经比遍历查找 $O (n)$ 快很多了。那么，有没有一种方法，可以一下子找到要查找的元素呢？

当然是有的，最简单的就是数组的下标查找，比如开辟一个大小 26 的数组，让 26 个英文字母对应 0~25 的下标，这样我们就可以对一段英文序列统计每个字母的出现次数。最后通过下标查找，我们就可以快速获得一个字母的出现次数。

像这样，字母(key)，存储位置(下标)，数组里存的数值(vlaue)，这三者之间就存在了一种一一对应（映射）关系。

最主要的是字母和存储位置之间建立的映射关系，在这里很合适，因为字母有 ASCII 码，并且是连续的

而在下列情景下可能就不行了：
1. 数据范围分布很广，不集中，那么你的数组就要开得很大，浪费的空间也很多。
2. key 的类型不是整数，而是字符串，自定义类型等等。
哈希表的概念

为了解决这些问题，我们需要某种函数（hashFunc）使 key 和存储位置建立映射关系，那么在查找时通过该函数可以很快找到该元素。

当向结构中：
- 插入元素
  
  根据待插入元素的 key ，以哈希函数计算该元素的存储位置并存放。
- 搜索元素
  
  对元素的 key 进行同样的计算，把求得的函数值当做元素的存储位置，在结构中按此位置取元素比较，若 key 相等，则搜索成功。
该方式即为哈希(散列)方法，哈希方法中使用的转换函数称为哈希(散列)函数，构造出来的结构称为哈希表(Hash Table)(或者称散列表)

哈希函数

哈希的关键就是 key 与地址的映射

哈希函数设计原则
- 哈希函数的定义域必须包括需要存储的全部关键码，而如果散列表允许有m个地址时，其值域必须在0到m-1之间
- 哈希函数计算出来的地址能均匀分布在整个空间中
- 哈希函数应该比较简单
常用哈希函数
1. 直接定址法
  
  取关键字的某个线性函数为散列地址：$\rm Hash(Key)= A\times key + B $
2. 除留余数法
  
  设散列表中允许的地址数为 m，取一个不大于 m，但最接近或者等于 m 的质数 p 作为除数，按照哈希函数： $\rm Hash(key) = key\ \%\ p(p<=m)$ ，将关键码转换成哈希地址。使用质数可以减少哈希冲突，库里面打了一个质数表来实现。
哈希冲突

 概念

不同 key 通过相同哈希函数计算出相同的哈希地址，该种现象称为哈希冲突或哈希碰撞

具有不同 key 而具有相同哈希地址的数据元素称为 同义词

例：

设置哈希函数为：hash(key) = key % capacity，capacity 即为结构的容量。

假设哈希表容量为 10 ，我们将 5 8 99999 20 100 这些数存进去：

$\begin{array}{} \begin{array}{} \ 0\ &1&2&3&4&5&6&7&8&\ \ \ \ 9\ \ \ \ \end{array}$
\\
$\begin{array}{cccccccccc} 20 & 5 & 8 & 99999 \end{array}$
\end{array} 0 12345678 9 20 5 899999
hash(20) = 0，hash(100) = 0，这两个元素要存的位置一样，而我们总不可能一个位置存放两个元素，那这个问题怎么解决呢？

解决方式

哈希冲突总归是要影响效率的，我们只能尽可能减小这个影响。

闭散列

闭散列又叫开放定址法，当发生哈希冲突时，如果哈希表未被装满，那么就把 key 存放到冲突位置的下一个空位置中去。

那么如何寻找下一个空位置呢？
1. 线性探测
还是上面那个数组，如果还要插入10 30 50，虽然它们的哈希地址都是 0，但是我们可以直接在后面找空位置存放

即新元素存在 $\rm{(hash(key)+i)\;\%\;N,(i=1,2,3,\cdots,N为结构容量})$ 的空位置

$\begin{array}{} \begin{array}{} \ 0\ &\ 1\ &\ 2\ &\ 3\ &4&5&6&7&8&\ \ \ \ 9\ \ \ \ \end{array}$
\\
$\begin{array}{cccccccccc} 20 & 10 & 30 & 50 & 5 & 8 & 99999 \end{array}$
\end{array} 0 1 2 3 45678 9 20103050 5 899999
缺点：拥堵
1. 二次探测
发生哈希冲突时，新元素存在 $\rm{(hash(key)+i^2)\;\%\;N,(i=1,2,3,\cdots,N为结构容量})$ 的空位置

10 就存在 $0+1^2=1$ 的位置，30 存在 $0+2^2=4$ 的位置，由于 $0+3^2=9$ 已经被占用了，50 只能存在 $0+4^2)\;\%\;10=6$ 的位置。

$\begin{array}{} \begin{array}{} \ 0\ &\ 1\ &2&3&\ 4\ &5&\ 6\ &7&8&\ \ \ \ 9\ \ \ \ \end{array}$
\\
$\begin{array}{cccccccccc} 20 & 10 & 30 & 5 & 50 & 8 & 99999 \end{array}$
\end{array} 0 1 23 4 5 6 78 9 2010 30550 899999
这是对线性探测的优化，相对来说没那么拥堵，但是没有改变占别人位置的本质。

开散列

开散列又叫链地址法/开链法/哈希桶

哈希表中存指针，指针指向一个链表，把哈希冲突的数据存在同一串链表中。

$\begin{array}{} \begin{array}{} 0\\1\\2\\3\\4\\5\\6\\7\\8\\9 \end{array}$
&
$\begin{matrix} N U L L \\ - - > \\ N U L L \\ N U L L \\ - - > \\ - - > \\ - - > \\ - - > \\ N U L L \\ - - > \end{matrix}$
&
$\begin{array}{l} 1 - - > N U L L \\ 44 - - > 4 - - > N U L L \\ 5 - - > N U L L \\ 6 - - > N U L L \\ 7 - - > N U L L \\ 9 - - > N U L L \end{array}$
\end{array} 0123456789NULL−−>NULLNULL−−>−−>−−>−−>NULL−−>1−−>NULL44−−>4−−>NULL5−−>NULL6−−>NULL7−−>NULL9−−>NULL

实现：除留余数法，闭散列

 框架

用顺序表实现一个简易的哈希表，这里我们直接用 vector 容器
```
enum State
{
	EMPTY,
	EXITS,
	DELETE
};

template<class K, class V>
struct HashData
{
	pair<K, V> _kv;
	State _state = EMPTY;
};

template<class K, class V>
class HashTable
{
    typedef HashData<K, V> Data;
public:
    
private:
	vector<Data> _tables;
    size_t _n = 0; // 有效数据的个数
};
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
```
$\begin{array}{} \begin{array}{} \ 0\ &\ 1\ &\ 2\ &\ 3\ &4&5&6&7&8&\ \ \ \ 9\ \ \ \ \end{array}$
\\
$\begin{array}{cccccccccc} 20 & 10 & 30 & 50 & 5 & 8 & 99999 \end{array}$
\end{array} 0 1 2 3 45678 9 20103050 5 899999

如何查找一个元素？
- 一个办法：先算出元素的哈希地址，然后依次往后找，直到找到或者遇到空位置为止
  - 如要查找 50，先算出它的哈希地址为0，然后往后找，直到找到下标 3 位置的 50
  - 如要查找 60，先算出它的哈希地址为0，然后往后找，直到空位置都没有找到，所以不存在60
这个方法可行吗？
- 如果先删除10，再找50，那么找到下标1位置就结束了，最后没找到，显然不行。
所以我们给了一个 HashData 类模板，用来表示哈希表里的数据类型，其中包含了一个 pair 和一个 State， State 是枚举类型，用来记录位置的状态。删除元素时把这个位置设为 DELETE 状态，查找时遇到 DELETE 状态就继续找，这个问题就解决了。

查找
```
Data* Find(const K& key)
{
    if (_tables.size() == 0)
    {
        return nullptr;
    }

    size_t starti = key;
    starti %= _tables.size();
    size_t hashi = starti;
    // 线性探测/二次探测
    size_t i = 1;
    while (_tables[hashi]._state != EMPTY)
    {
        if (_tables[hashi]._state != DELETE && _tables[hashi]._kv.first == key)
        {
            return &_tables[hashi];
        }
        hashi = (starti + i) % _tables.size();
        ++i;
    }
    return nullptr;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
```
细节：
- 表为空，应该直接 return nullptr 否则下面会出现除数为0的错误
- 如果只比较 key ，而不判断状态是否是 DELETE，那么用户很可能会把已经删除的元素查找出来。
插入与扩容

插入之前想一想，哈希表什么情况下进行扩容，如何扩容？
- 载荷因子： $\alpha=填入表中的元素个数/哈希表长度$
- $\alpha$ 表示哈希表的装满程度， $\alpha$ 越大，产生冲突的可能性越大。
- 对于闭散列，载荷因子是特别重要的因素，应严格限制在 0.7~0.8 以下，超过 0.8，查表时的 CPU 缓存不命中按照指数曲线上升。
我们设定负载因子到 0.7 以上就扩容
```
bool Insert(const pair<K, V>& kv)
{
    if (Find(kv.first))
    {
        return false;
    }
    // 载荷因子到0.7以上，扩容
    if (_tables.size() == 0 || _n * 10 / _tables.size() >= 7)
    {
        size_t newSize = _tables.size() == 0 ? 10 : 2 * _tables.size();
        // 映射关系变了，需要重新映射
        // 创建新表，开好新空间，将旧表的元素插入，最后swap
        HashTable<K, V> newHT;
        newHT._tables.resize(newSize);

        for (auto& e : _tables)
        {
            if (e._state == EXITS)
            {
                newHT.Insert(e._kv);
            }
        }
        newHT._tables.swap(_tables);
    }
    size_t starti = kv.first;
    starti %= _tables.size();
    size_t hashi = starti;
    // 线性探测/二次探测
    size_t i = 1;
    while (_tables[hashi]._state == EXITS)
    {
        hashi = (starti + i) % _tables.size();
        ++i;
    }
    _tables[hashi]._kv = kv;
    _tables[hashi]._state = EXITS;
    ++_n;

    return true;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
```
扩容后，映射关系变了，需要重新映射，这里采用了类似深拷贝现代写法的写法，创建新的哈希表，重新插入一遍，然后旧表与新表交换。

删除

找到要删除的元素，状态该成 DELETE
```
bool Erase(const K& key)
{
    Data* ret = Find(key);
    if (ret)
    {
        ret->_state = DELETE;
        --_n;
        return true;
    }
    else
    {
        return false;
    }
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
```
仿函数

现在解决了数据范围过大，不集中的问题。还有一个问题，非整数类型，如何映射哈希地址？

为此，我们需要一个仿函数 HashFunc，添加一个模板参数 HashFunc，在需要将 key 转成整数的地方使用仿函数。

以下是仿函数的实现：
- 对于能直接转换为整型的类型，调用 DefaultHash 仿函数，
- 对于 string 类型进行模板特化
要求 string 类型的哈希值，如果只是简单的把每个字符的 ASCII 码相加，那么造成的哈希冲突较多，比如相同字符组合，不同排列的字符串算出的哈希值是相同的。

为了解决这个问题，这里使用一个字符串哈希算法——BKDR Hash：
- 在求和的基础上，每次加一个字符前先乘 131，也可以乘1313,13131,131313…
```
template<class K>
struct DefaultHash
{
	size_t operator()(const K& key)
	{
		return key;
	}
};

template<>
struct DefaultHash<string>
{
	size_t operator()(const string& key)
	{
		size_t hash = 0;
		for (auto ch : key)
		{
			hash = hash * 131 + ch;
		}
		return hash;
	}
};
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
```
如果是自定义类型，那就需要我们自己写额外的仿函数哈希算法了，比如日期类可以把年月日加起来作为哈希值。

完整代码
```
#pragma once
#include 
#include 
using namespace std;

enum State
{
	EMPTY,
	EXITS,
	DELETE
};

template<class K, class V>
struct HashData
{
	pair<K, V> _kv;
	State _state = EMPTY;
};

template<class K>
struct DefaultHash
{
	size_t operator()(const K& key)
	{
		return key;
	}
};

template<>
struct DefaultHash<string>
{
	size_t operator()(const string& key)
	{
		size_t hash = 0;
		for (auto ch : key)
		{
			hash = hash * 131 + ch;
		}
		return hash;
	}
};

template<class K, class V, class HashFunc = DefaultHash<K>>
class HashTable
{
	typedef HashData<K, V> Data;
public:
	bool Insert(const pair<K, V>& kv)
	{
		if (Find(kv.first))
		{
			return false;
		}
		// 载荷因子到0.7以上，扩容
		if (_tables.size() == 0 || _n * 10 / _tables.size() >= 7)
		{
			size_t newSize = _tables.size() == 0 ? 10 : 2 * _tables.size();
			// 映射关系变了，需要重新映射
			// 创建新表，开好新空间，将旧表的元素插入，最后swap
			HashTable<K, V, HashFunc> newHT;
			newHT._tables.resize(newSize);

			for (auto& e : _tables)
			{
				if (e._state == EXITS)
				{
					newHT.Insert(e._kv);
				}
			}
			newHT._tables.swap(_tables);
		}

		HashFunc hf;
		size_t starti = hf(kv.first);
		starti %= _tables.size();
		size_t hashi = starti;
		// 线性探测/二次探测
		size_t i = 1;
		while (_tables[hashi]._state == EXITS)
		{
			hashi = (starti + i) % _tables.size();
			++i;
		}
		_tables[hashi]._kv = kv;
		_tables[hashi]._state = EXITS;
		++_n;

		return true;
	}
	
	Data* Find(const K& key)
	{
		if (_tables.size() == 0)
		{
			return nullptr;
		}

		HashFunc hf;
		size_t starti = hf(key);
		starti %= _tables.size();
		size_t hashi = starti;
		// 线性探测/二次探测
		size_t i = 1;
		while (_tables[hashi]._state != EMPTY)
		{
			if (_tables[hashi]._state != DELETE && _tables[hashi]._kv.first == key)
			{
				return &_tables[hashi];
			}
			hashi = (starti + i) % _tables.size();
			++i;
		}
		return nullptr;
	}

	bool Erase(const K& key)
	{
		Data* ret = Find(key);
		if (ret)
		{
			ret->_state = DELETE;
			--_n;
			return true;
		}
		else
		{
			return false;
		}
	}

private:
	vector<Data> _tables;
	size_t _n = 0; // 有效数据的个数
};
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
```
实现：除留余数法、开散列

 框架
```
template<class K, class V>
struct HashNode
{
	pair<K, V> _kv;
	HashNode<K, V>* _next;

	HashNode(const pair<K, V>& kv)
		: _kv(kv)
		, _next(nullptr)
	{}
};

template<class K, class V, class HashFunc = DefaultHash<K>>
class HashTable
{
	typedef HashNode<K, V> Node;
public:

private:
	// 指针数组
	vector<Node*> _tables;
	size_t _n = 0;
};
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
```
插入与扩容

开散列对载荷因子的要求没有那么高，我们这里设为载荷因子达到 1 以上时扩容，扩容的方式和闭散列的差不多（将旧表所以元素遍历出来插入开好空间的新表，最后新表旧表互换）。
```
bool Insert(const pair<K, V>& kv)
{
    if (Find(kv.first))
    {
        return false;
    }
    HashFunc hf;
    // 载荷因子 == 1，扩容，重新映射
    if (_tables.size() == _n)
    {
        size_t newSize = _tables.size() == 0 ? 10 : 2 * _tables.size();
        HashTable<K, V> newHT;
        newHT._tables.resize(newSize);
        for (size_t i = 0; i < _tables.size(); ++i)
        {
            Node* cur = _tables[i];
            while (cur)
            {
                newHT.Insert(cur->_kv);
                cur = cur->_next;
            }
        }
        newHT._tables.swap(_tables);
    }

    size_t hashi = hf(kv.first);
    hashi %= _tables.size();
    //头插，然后把头结点指针塞进桶里面
    Node* newNode = new Node(kv);
    newNode->_next = _tables[hashi];
    _tables[hashi] = newNode;

    ++_n;
    return true;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
```
其实上面这种扩容方式不是很好，使用 insert 把数据插入到新表，就需要创建新的结点，完了还需要释放旧表，效率不高。那么，我们能不能直接把结点放到新表呢？

思路：创建新的 vector，而不是 HashTable，将结点头插进去。
```
    // 载荷因子 == 1，扩容，重新映射
    if (_tables.size() == _n)
    {
        size_t newSize = _tables.size() == 0 ? 10 : 2 * _tables.size();
        vector<Node*> newTable;
        newTable.resize(newSize);
        for (size_t i = 0; i < _tables.size(); ++i)
        {
            Node* cur = _tables[i];
            while (cur)
            {
                Node* next = cur->_next;

                size_t hashi = hf(cur->_kv.first) % newSize;
                cur->_next = newTable[hashi];
                newTable[hashi] = cur;

                cur = next;
            }
            _tables[i] = nullptr;
        }
        newTable.swap(_tables);
    }
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
```
析构

链表需要我们手动写析构函数来释放
```
~HashTable()
{
    for (size_t i = 0; i < _tables.size(); ++i)
    {
        Node* cur = _tables[i];
        while (cur)
        {
            Node* next = cur->_next;
            delete cur;
            cur = next;
        }
        _tables[i] = nullptr;
    }
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
```
查找

找到对应的桶，然后直接遍历里面的结点即可
```
Node* Find(const K& key)
{
    if (_tables.size() == 0)
    {
        return nullptr;
    }
    
	HashFunc hf;
    size_t hashi = hf(key);
    hashi %= _tables.size();
    Node* cur = _tables[hashi];
    while (cur)
    {
        if (cur->_kv.first == key)
        {
            return cur;
        }
        cur = cur->_next;
    }
    return nullptr;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
```
删除
```
bool Erase(const K& key)
{
    if (_tables.size() == 0)
    {
        return false;
    }
    
	HashFunc hf;
    size_t hashi = hf(key);
    hashi %= _tables.size();
    Node* prev = nullptr;
    Node* cur = _tables[hashi];
    while (cur)
    {
        if (cur->_kv.first == key)
        {
            if (prev == nullptr)
                _tables[hashi] = cur->_next;
            else
                prev->_next = cur->_next;
            delete cur;
            --_n;
            return true;
        }
        prev = cur;
        cur = cur->_next;
    }
    return false;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
```
另一种删除写法，不使用临时指针 prev，采用替换法

替换法：要删除的结点cur，将下一个结点的值覆盖自己的值，然后删除下一个结点。如果cur是最后一个结点，那么就用头结点来覆盖，然后删除头结点。
```
bool Erase(const K& key)
{
    if (_tables.size() == 0)
    {
        return false;
    }
    
	HashFunc hf;
    size_t hashi = hf(key);
    hashi %= _tables.size();
    Node* cur = _tables[hashi];
    while (cur)
    {
        if (cur->_kv.first == key)
        {
            if (cur->_next == nullptr)
            {
                cur->_kv = _tables[hashi]->_kv;
                Node* first = _tables[hashi];
                _tables[hashi] = first->_next;
                delete first;
            }
            else
            {
                cur->_kv = cur->_next->_kv;
                Node* next = cur->_next;
                cur->_next = next->_next;
                delete next;
            }
            --_n;
            return true;
        }
        cur = cur->_next;
    }
    return false;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
```
拷贝构造与赋值重载

默认构造则什么都不用做，但是要显式写出来。

拷贝构造的一种写法是调用 Insert：
```
HashTable()
{}

HashTable(const HashTable<K, V>& ht)
{
    _tables.resize(ht._tables.size());
    for (size_t i = 0; i < ht._tables.size(); ++i)
    {
        Node* cur = ht._tables[i];
        while (cur)
        {
            Insert(cur->_kv);
            cur = cur->_next;
        }
    }
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
```
另一种写法则是原样复制照搬：
```
HashTable(const HashTable<K, V>& ht)
    :_n(ht._n)
{
    _tables.resize(ht._tables.size());
    for (size_t i = 0; i < ht._tables.size(); ++i)
    {
        Node* cur = ht._tables[i];
        Node* tail = nullptr;
        while (cur)
        {
            Node* newNode = new Node(cur->_kv);

            if (tail == nullptr)
                _tables[i] = newNode;
            else
                tail->_next = newNode;
            tail = newNode;

            cur = cur->_next;
        }
    }
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
```
赋值重载
```
void swap(HashTable<K, V>& ht)
{
    _tables.swap(ht._tables);
    std::swap(_n, ht._n);
}

HashTable<K, V>& operator=(HashTable<K, V> ht)
{
    swap(ht);
    return *this;
}
1
2
3
4
5
6
7
8
9
10
11
```
完整代码
```
#pragma once
#include 
#include 
using namespace std;

template<class K, class V>
struct HashNode
{
	pair<K, V> _kv;
	HashNode<K, V>* _next;

	HashNode(const pair<K, V>& kv)
		: _kv(kv)
		, _next(nullptr)
	{}
};

template<class K>
struct DefaultHash
{
	size_t operator()(const K& key)
	{
		return key;
	}
};

template<>
struct DefaultHash<string>
{
	size_t operator()(const string& key)
	{
		size_t hash = 0;
		for (auto ch : key)
		{
			hash = hash * 131 + ch;
		}
		return hash;
	}
};

template<class K, class V, class HashFunc = DefaultHash<K>>
class HashTable
{
	typedef HashNode<K, V> Node;
public:

	HashTable()
	{}

	//HashTable(const HashTable& ht)
	//{
	//	_tables.resize(ht._tables.size());
	//	for (size_t i = 0; i < ht._tables.size(); ++i)
	//	{
	//		Node* cur = ht._tables[i];
	//		while (cur)
	//		{
	//			Insert(cur->_kv);
	//			cur = cur->_next;
	//		}
	//	}
	//}

	HashTable(const HashTable<K, V>& ht)
		:_n(ht._n)
	{
		_tables.resize(ht._tables.size());
		for (size_t i = 0; i < ht._tables.size(); ++i)
		{
			Node* cur = ht._tables[i];
			Node* tail = nullptr;
			while (cur)
			{
				Node* newNode = new Node(cur->_kv);

				if (tail == nullptr)
					_tables[i] = newNode;
				else
					tail->_next = newNode;
				tail = newNode;

				cur = cur->_next;
			}
		}
	}

	HashTable<K, V>& operator=(HashTable<K, V> ht)
	{
		swap(ht);
		return *this;
	}

	~HashTable()
	{
		for (size_t i = 0; i < _tables.size(); ++i)
		{
			Node* cur = _tables[i];
			while (cur)
			{
				Node* next = cur->_next;
				delete cur;
				cur = next;
			}
			_tables[i] = nullptr;
		}
	}

	void swap(HashTable<K, V>& ht)
	{
		_tables.swap(ht._tables);
		std::swap(_n, ht._n);
	}

	bool Insert(const pair<K, V>& kv)
	{
		if (Find(kv.first))
		{
			return false;
		}

		HashFunc hf;
		// 载荷因子 == 1，扩容，重新映射
		// 有缺陷的写法
		//if (_tables.size() == _n)
		//{
		//	size_t newSize = _tables.size() == 0 ? 10 : 2 * _tables.size();
		//	HashTable newHT;
		//	newHT._tables.resize(newSize);
		//	for (size_t i = 0; i < _tables.size(); ++i)
		//	{
		//		Node* cur = _tables[i];
		//		while (cur)
		//		{
		//			newHT.Insert(cur->_kv);
		//			cur = cur->_next;
		//		}
		//	}
		//	newHT._tables.swap(_tables);
		//}

		// 较高效的写法
		if (_tables.size() == _n)
		{
			size_t newSize = _tables.size() == 0 ? 10 : 2 * _tables.size();
			vector<Node*> newTable;
			newTable.resize(newSize);
			for (size_t i = 0; i < _tables.size(); ++i)
			{
				Node* cur = _tables[i];
				while (cur)
				{
					Node* next = cur->_next;

					size_t hashi = hf(cur->_kv.first) % newSize;
					cur->_next = newTable[hashi];
					newTable[hashi] = cur;

					cur = next;
				}
				_tables[i] = nullptr;
			}
			newTable.swap(_tables);
		}

		size_t hashi = hf(kv.first);
		hashi %= _tables.size();
		//头插，然后把头结点指针塞进桶里面
		Node* newNode = new Node(kv);
		newNode->_next = _tables[hashi];
		_tables[hashi] = newNode;
		
		++_n;
		return true;
	}

	Node* Find(const K& key)
	{
		if (_tables.size() == 0)
		{
			return nullptr;
		}

		HashFunc hf;
		size_t hashi = hf(key);
		hashi %= _tables.size();
		Node* cur = _tables[hashi];
		while (cur)
		{
			if (cur->_kv.first == key)
			{
				return cur;
			}
			cur = cur->_next;
		}
		return nullptr;
	}

	bool Erase(const K& key)
	{
		if (_tables.size() == 0)
		{
			return false;
		}

		HashFunc hf;
		size_t hashi = hf(key);
		hashi %= _tables.size();
		Node* prev = nullptr;
		Node* cur = _tables[hashi];
		while (cur)
		{
			if (cur->_kv.first == key)
			{
				if (prev == nullptr)
					_tables[hashi] = cur->_next;
				else
					prev->_next = cur->_next;
				delete cur;
				--_n;
				return true;
			}
			prev = cur;
			cur = cur->_next;
		}
		return false;
	}

	//bool Erase(const K& key)
	//{
	//	if (_tables.size() == 0)
	//	{
	//		return false;
	//	}

	//	HashFunc hf;
	//	size_t hashi = hf(key);
	//	hashi %= _tables.size();
	//	Node* cur = _tables[hashi];
	//	while (cur)
	//	{
	//		if (cur->_kv.first == key)
	//		{
	//			if (cur->_next == nullptr)
	//			{
	//				cur->_kv = _tables[hashi]->_kv;
	//				Node* first = _tables[hashi];
	//				_tables[hashi] = first->_next;
	//				delete first;
	//			}
	//			else
	//			{
	//				cur->_kv = cur->_next->_kv;
	//				Node* next = cur->_next;
	//				cur->_next = next->_next;
	//				delete next;
	//			}
	//			--_n;
	//			return true;
	//		}
	//		cur = cur->_next;
	//	}
	//	return false;
	//}

private:
	// 指针数组
	vector<Node*> _tables;
	size_t _n = 0;
};
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
```
封装成 unordered_map、unordered_set

改造哈希表

参考 set 和 map 的封装，这里的类型不能写死，要改成泛型 T，使它既可以是单值类型也可以是键值对类型。
```
template<class T>
struct HashNode
{
	T _data;
	HashNode<T>* _next;

	HashNode(const T& data)
		: _data(data)
		, _next(nullptr)
	{}
};
1
2
3
4
5
6
7
8
9
10
11
```
下面的 HashTable 也要做出相应的修改：

哈希函数 HashFunc 改为从上层传入，为了方便用户传入自己实现的哈希函数，这里原本的缺省模板参数可以去掉
```
template<class K, class T, class HashFunc>
class HashTable
{
	typedef HashNode<T> Node;
public:
    //略。。。
1
2
3
4
5
6
```
set 和 map 取 key 的方式不同，需要再传一个模板参数 KeyOfT，表示仿函数类型。

继续对 HashTable 进行改造，主要是 Insert 函数，因为它的参数是 T 类型的data。
```
template<class K, class T, class KeyOfT, class HashFunc>
class HashTable
{
    //...
    bool Insert(const T& data)
	{
		HashFunc hf;
		KeyOfT kot;	//取 key 的函数对象

		if (Find(kot(data)))	//改动
		{
			return false;
		}

		// 载荷因子 == 1，扩容，重新映射
		if (_tables.size() == _n)
		{
			size_t newSize = _tables.size() == 0 ? 10 : 2 * _tables.size();
			vector<Node*> newTable;
			newTable.resize(newSize);
			for (size_t i = 0; i < _tables.size(); ++i)
			{
				Node* cur = _tables[i];
				while (cur)
				{
					Node* next = cur->_next;

					size_t hashi = hf(kot(cur->_data)) % newSize;	//改动
					cur->_next = newTable[hashi];
					newTable[hashi] = cur;

					cur = next;
				}
				_tables[i] = nullptr;
			}
			newTable.swap(_tables);
		}

		size_t hashi = hf(kot(data));	//改动
		hashi %= _tables.size();
		//头插，然后把头结点指针塞进桶里面
		Node* newNode = new Node(kv);
		newNode->_next = _tables[hashi];
		_tables[hashi] = newNode;
		
		++_n;
		return true;
	}
    //...
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
```
另外 Find Erase 函数需要将传入的 key 与表中 data 的 key 进行比较，也需要用到仿函数，这里不多做演示。

迭代器

哈希表的迭代器不同于 list，它的 ++ 实现离不开对整个表的控制，所以一个迭代器包含了两个指针。

因为 __HTIterator 和 HashTable 两个类之间有交集，所以写在上面的类的前面必须声明另一个类。
```
//HashTable.h
template<class K, class T, class KeyOfT, class HashFunc>
class HashTable;

template<class K, class T, class KeyOfT, class HashFunc>
class __HTIterator
{
	typedef HashNode<T> Node;
	typedef __HTIterator<K, T, KeyOfT, HashFunc> Self;
public:
	Node* _node;
	HashTable<K, T, KeyOfT, HashFunc>* _pht;

    __HTIterator()
	{}
    
	__HTIterator(Node* node, HashTable<K, T, KeyOfT, HashFunc>* pht)
		: _node(node)
		, _pht(pht)
	{}

};
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
```
++：
- 如果当前结点的下一个不为空，就将指针移到下一个结点
- 如果当前结点的下一个为空，则要找下一个非空桶，指针指向下一个非空桶的第一个结点。如果没有下一个非空桶，则将指针设为空。
```
Self& operator++()
{
    if (_node->_next)
    {
        _node = _node->_next;
    }
    else // 找下一个桶
    {
        KeyOfT kot;
        HashFunc hf;
        size_t hashi = hf(kot(_node->_data)) % _pht->_tables.size();
        for (++hashi; hashi < _pht->_tables.size(); ++hashi)
        {
            if (_pht->_tables[hashi])
            {
                _node = _pht->_tables[hashi];
                break;
            }
        }
        // 未找到
        if (hashi == _pht->_tables.size())
        {
            _node = nullptr;
        }
    }
    return *this;
}

Self operator++(int)
{
    Self tmp(_node, _pht);
    ++(*this);
    return tmp;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
```
其他迭代器操作符的重载：
```
T& operator*()
{
    return _node->_data;
}

T* operator->()
{
    return &_node->_data;
}

bool operator==(const Self& s) const
{
    return _node == s._node;
}

bool operator!=(const Self& s) const
{
    return _node != s._node;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
```
在 HashTable 里面实现 begin end
```
template<class K, class T, class KeyOfT, class HashFunc>
class HashTable
{
    template<class K, class T, class KeyOfT, class HashFunc>
	friend class __HTIterator;
    
	typedef HashNode<T> Node;
	typedef HashTable<K, T, KeyOfT, HashFunc> Self;
public:
	typedef __HTIterator<K, T, KeyOfT, HashFunc> iterator;

	iterator begin()
	{
		for (size_t i = 0; i < _tables.size(); ++i)
		{
			Node* cur = _tables[i];
			if (cur)
			{
				return iterator(cur, this);
			}
		}
		return end();
	}

	iterator end()
	{
		return iterator(nullptr, this);
	}
    
    //...
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
```
另外需要将find的返回类型改成迭代器
```
iterator Find(const K& key)
{
    if (_tables.size() == 0)
    {
        return iterator(nullptr, this);
    }

    HashFunc hf;
    KeyOfT kot;

    size_t hashi = hf(key);
    hashi %= _tables.size();
    Node* cur = _tables[hashi];
    while (cur)
    {
        if (kot(cur->_data) == key)
        {
            return iterator(cur, this);
        }
        cur = cur->_next;
    }
    return iterator(nullptr, this);
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
```
封装

为了避免与标准库命名冲突，我们用一个命名空间包起来。
```
namespace my
{
	template<class K, class HashFunc = DefaultHash<K>>
	class unordered_set
	{
		struct SetKeyOfT
		{
			const K& operator()(const K& key)
			{
				return key;
			}
		};
	public:
		typedef typename HashTable<K, K, SetKeyOfT, HashFunc>::iterator iterator;

		iterator begin()
		{
			return _ht.begin();
		}

		iterator end()
		{
			return _ht.end();
		}

		pair<iterator, bool> insert(const K& key)
		{
			return _ht.Insert(key);
		}

		iterator find(const K& key)
		{
			return _ht.Find(key);
		}

		bool erase(const K& key)
		{
			return _ht.Erase(key);
		}

	private:
		HashTable<K, K, SetKeyOfT, HashFunc> _ht;
	};
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
```
```
namespace my
{
	template<class K, class V, class HashFunc = DefaultHash<K>>
	class unordered_map
	{
		struct MapKeyOfT
		{
			const K& operator()(const pair<K, V>& kv)
			{
				return kv.first;
			}
		};
	public:
		typedef typename HashTable<K, pair<K, V>, MapKeyOfT, HashFunc>::iterator iterator;

		iterator begin()
		{
			return _ht.begin();
		}

		iterator end()
		{
			return _ht.end();
		}

		pair<iterator, bool> insert(const pair<K, V>& kv)
		{
			return _ht.Insert(kv);
		}

		iterator find(const K& key)
		{
			return _ht.Find(key);
		}

		bool erase(const K& key)
		{
			return _ht.Erase(key);
		}

	private:
		HashTable<K, pair<K, V>, MapKeyOfT, HashFunc> _ht;
	};
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
```
map 还需要封装 []，这里就需要对Insert再次进行改造，类似的改造在 set和map的模拟实现_世真的博客-CSDN博客和 STL - set 和 map 的使用_世真的博客-CSDN博客中有详细讲解。
- 把返回类型改成 pair，然后把两个 return 部分改好。
```
pair<iterator, bool> Insert(const T& data)
{
    HashFunc hf;
    KeyOfT kot;

    iterator pos = Find(kot(data));
    if (pos != end())
    {
        return make_pair(pos, false);
    }

    // 载荷因子 == 1，扩容，重新映射
    if (_tables.size() == _n)
    {
        size_t newSize = _tables.size() == 0 ? 10 : 2 * _tables.size();
        vector<Node*> newTable;
        newTable.resize(newSize);
        for (size_t i = 0; i < _tables.size(); ++i)
        {
            Node* cur = _tables[i];
            while (cur)
            {
                Node* next = cur->_next;

                size_t hashi = hf(kot(cur->_data)) % newSize;
                cur->_next = newTable[hashi];
                newTable[hashi] = cur;

                cur = next;
            }
            _tables[i] = nullptr;
        }
        newTable.swap(_tables);
    }

    size_t hashi = hf(kot(data));
    hashi %= _tables.size();
    //头插，然后把头结点指针塞进桶里面
    Node* newNode = new Node(data);
    newNode->_next = _tables[hashi];
    _tables[hashi] = newNode;

    ++_n;
    return make_pair(iterator(newNode, this), true);
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
```
operator[]:
```
V& operator[](const K& key)
{
    pair<iterator, bool> ret = _ht.Insert(make_pair(key, V()));
    return ret.first->second;
}
1
2
3
4
5
```
关于除留余数法的补充

除留余数法中，除数应该是质数。我们在上面的实现中，除数就是容量，扩容是2倍扩，显然这种方式不能保证除数为质数。质数本身也没有具体的公式去直接计算得到

为了实现真正的除留余数法，我们可以打一个质数表：

这些质数相邻两个之间大约成2倍关系
```
size_t GetNextPrime(size_t prime)
{
    const int PRIMECOUNT = 28;
    static const size_t primeList[PRIMECOUNT] =
    {
        53ul, 97ul, 193ul, 389ul, 769ul,
        1543ul, 3079ul, 6151ul, 12289ul, 24593ul,
        49157ul, 98317ul, 196613ul, 393241ul, 786433ul,
        1572869ul, 3145739ul, 6291469ul, 12582917ul,25165843ul,
        50331653ul, 100663319ul, 201326611ul, 402653189ul,805306457ul,
        1610612741ul, 3221225473ul, 4294967291ul
    };
    size_t i = 0;
    for (; i < PRIMECOUNT; ++i)
    {
        if (primeList[i] > prime)
            return primeList[i];
    }
    return primeList[i];
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
```
使用该函数，传入一个值可以找到下一个大约是这个值的2倍的质数。

对插入函数的扩容部分进行修改：
```
//...
// 载荷因子 == 1，扩容，重新映射
if (_tables.size() == _n)
{
    //size_t newSize = _tables.size() == 0 ? 10 : 2 * _tables.size();
    size_t newSize = GetNextPrime(_tables.size());

//...
1
2
3
4
5
6
7
8
```
完整代码

 HashTable.h
```
#pragma once
#include 
#include 
using namespace std;

template<class T>
struct HashNode
{
	T _data;
	HashNode<T>* _next;

	HashNode(const T& data)
		: _data(data)
		, _next(nullptr)
	{}
};

template<class K, class T, class KeyOfT, class HashFunc>
class HashTable;

template<class K, class T, class KeyOfT, class HashFunc>
class __HTIterator
{
	typedef HashNode<T> Node;
	typedef __HTIterator<K, T, KeyOfT, HashFunc> Self;
public:
	Node* _node;
	HashTable<K, T, KeyOfT, HashFunc>* _pht;

	__HTIterator()
	{}
	
	__HTIterator(Node* node, HashTable<K, T, KeyOfT, HashFunc>* pht)
		: _node(node)
		, _pht(pht)
	{}

	Self& operator++()
	{
		if (_node->_next)
		{
			_node = _node->_next;
		}
		else // 找下一个桶
		{
			KeyOfT kot;
			HashFunc hf;
			size_t hashi = hf(kot(_node->_data)) % _pht->_tables.size();
			for (++hashi; hashi < _pht->_tables.size(); ++hashi)
			{
				if (_pht->_tables[hashi])
				{
					_node = _pht->_tables[hashi];
					break;
				}
			}
			// 未找到
			if (hashi == _pht->_tables.size())
			{
				_node = nullptr;
			}
		}
		return *this;
	}

	Self operator++(int)
	{
		Self tmp(_node, _pht);
		++(*this);
		return tmp;
	}

	T& operator*()
	{
		return _node->_data;
	}

	T* operator->()
	{
		return &_node->_data;
	}

	bool operator==(const Self& s) const
	{
		return _node == s._node;
	}

	bool operator!=(const Self& s) const
	{
		return _node != s._node;
	}

};

template<class K>
struct DefaultHash
{
	size_t operator()(const K& key)
	{
		return key;
	}
};

template<>
struct DefaultHash<string>
{
	size_t operator()(const string& key)
	{
		size_t hash = 0;
		for (auto ch : key)
		{
			hash = hash * 131 + ch;
		}
		return hash;
	}
};

template<class K, class T, class KeyOfT, class HashFunc>
class HashTable
{
	template<class K, class T, class KeyOfT, class HashFunc>
	friend class __HTIterator;

	typedef HashNode<T> Node;
	typedef HashTable<K, T, KeyOfT, HashFunc> Self;
public:
	typedef __HTIterator<K, T, KeyOfT, HashFunc> iterator;

	iterator begin()
	{
		for (size_t i = 0; i < _tables.size(); ++i)
		{
			Node* cur = _tables[i];
			if (cur)
			{
				return iterator(cur, this);
			}
		}
		return end();
	}

	iterator end()
	{
		return iterator(nullptr, this);
	}

	HashTable()
	{}

	HashTable(const Self& ht)
		:_n(ht._n)
	{
		_tables.resize(ht._tables.size());
		for (size_t i = 0; i < ht._tables.size(); ++i)
		{
			Node* cur = ht._tables[i];
			Node* tail = nullptr;
			while (cur)
			{
				Node* newNode = new Node(cur->_kv);

				if (tail == nullptr)
					_tables[i] = newNode;
				else
					tail->_next = newNode;
				tail = newNode;

				cur = cur->_next;
			}
		}
	}

	Self& operator=(Self ht)
	{
		swap(ht);
		return *this;
	}

	~HashTable()
	{
		for (size_t i = 0; i < _tables.size(); ++i)
		{
			Node* cur = _tables[i];
			while (cur)
			{
				Node* next = cur->_next;
				delete cur;
				cur = next;
			}
			_tables[i] = nullptr;
		}
	}

	void swap(Self& ht)
	{
		_tables.swap(ht._tables);
		std::swap(_n, ht._n);
	}

	size_t GetNextPrime(size_t prime)
	{
		const int PRIMECOUNT = 28;
		static const size_t primeList[PRIMECOUNT] =
		{
			53ul, 97ul, 193ul, 389ul, 769ul,
			1543ul, 3079ul, 6151ul, 12289ul, 24593ul,
			49157ul, 98317ul, 196613ul, 393241ul, 786433ul,
			1572869ul, 3145739ul, 6291469ul, 12582917ul,25165843ul,
			50331653ul, 100663319ul, 201326611ul, 402653189ul,805306457ul,
			1610612741ul, 3221225473ul, 4294967291ul
		};
		size_t i = 0;
		for (; i < PRIMECOUNT; ++i)
		{
			if (primeList[i] > prime)
				return primeList[i];
		}
		return primeList[i];
	}

	pair<iterator, bool> Insert(const T& data)
	{
		HashFunc hf;
		KeyOfT kot;

		iterator pos = Find(kot(data));
		if (pos != end())
		{
			return make_pair(pos, false);
		}

		// 载荷因子 == 1，扩容，重新映射
		if (_tables.size() == _n)
		{
			//size_t newSize = _tables.size() == 0 ? 10 : 2 * _tables.size();
			size_t newSize = GetNextPrime(_tables.size());
			vector<Node*> newTable;
			newTable.resize(newSize);
			for (size_t i = 0; i < _tables.size(); ++i)
			{
				Node* cur = _tables[i];
				while (cur)
				{
					Node* next = cur->_next;

					size_t hashi = hf(kot(cur->_data)) % newSize;
					cur->_next = newTable[hashi];
					newTable[hashi] = cur;

					cur = next;
				}
				_tables[i] = nullptr;
			}
			newTable.swap(_tables);
		}

		size_t hashi = hf(kot(data));
		hashi %= _tables.size();
		//头插，然后把头结点指针塞进桶里面
		Node* newNode = new Node(data);
		newNode->_next = _tables[hashi];
		_tables[hashi] = newNode;
		
		++_n;
		return make_pair(iterator(newNode, this), true);
	}

	iterator Find(const K& key)
	{
		if (_tables.size() == 0)
		{
			return iterator(nullptr, this);
		}

		HashFunc hf;
		KeyOfT kot;

		size_t hashi = hf(key);
		hashi %= _tables.size();
		Node* cur = _tables[hashi];
		while (cur)
		{
			if (kot(cur->_data) == key)
			{
				return iterator(cur, this);
			}
			cur = cur->_next;
		}
		return iterator(nullptr, this);
	}

	bool Erase(const K& key)
	{
		if (_tables.size() == 0)
		{
			return false;
		}

		HashFunc hf;
		KeyOfT kot;

		size_t hashi = hf(key);
		hashi %= _tables.size();
		Node* prev = nullptr;
		Node* cur = _tables[hashi];
		while (cur)
		{
			if (kot(cur->_data) == key)
			{
				if (prev == nullptr)
					_tables[hashi] = cur->_next;
				else
					prev->_next = cur->_next;
				delete cur;
				--_n;
				return true;
			}
			prev = cur;
			cur = cur->_next;
		}
		return false;
	}

private:
	// 指针数组
	vector<Node*> _tables;
	size_t _n = 0;
};
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
```
UnorderedSet.h
```
#pragma once
#include "HashTable.h"

namespace my
{
	template<class K, class HashFunc = DefaultHash<K>>
	class unordered_set
	{
		struct SetKeyOfT
		{
			const K& operator()(const K& key)
			{
				return key;
			}
		};
	public:
		typedef typename HashTable<K, K, SetKeyOfT, HashFunc>::iterator iterator;

		iterator begin()
		{
			return _ht.begin();
		}

		iterator end()
		{
			return _ht.end();
		}

		pair<iterator, bool> insert(const K& key)
		{
			return _ht.Insert(key);
		}

		iterator find(const K& key)
		{
			return _ht.Find(key);
		}

		bool erase(const K& key)
		{
			return _ht.Erase(key);
		}

	private:
		HashTable<K, K, SetKeyOfT, HashFunc> _ht;
	};
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
```
UnorderedMap.h
```
#pragma once
#include "HashTable.h"

namespace my
{
	template<class K, class V, class HashFunc = DefaultHash<K>>
	class unordered_map
	{
		struct MapKeyOfT
		{
			const K& operator()(const pair<K, V>& kv)
			{
				return kv.first;
			}
		};
	public:
		typedef typename HashTable<K, pair<K, V>, MapKeyOfT, HashFunc>::iterator iterator;

		iterator begin()
		{
			return _ht.begin();
		}

		iterator end()
		{
			return _ht.end();
		}

		pair<iterator, bool> insert(const pair<K, V>& kv)
		{
			return _ht.Insert(kv);
		}

		iterator find(const K& key)
		{
			return _ht.Find(key);
		}

		bool erase(const K& key)
		{
			return _ht.Erase(key);
		}

		V& operator[](const K& key)
		{
			pair<iterator, bool> ret = _ht.Insert(make_pair(key, V()));
			return ret.first->second;
		}

	private:
		HashTable<K, pair<K, V>, MapKeyOfT, HashFunc> _ht;
	};
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
```
unordered_map其他底层接口

Buckets
- bucket_count：返回桶的数量
- max_bucket_count：返回最大桶数
- bucket_size：返回桶的大小
- bucket：返回具有键k的元素所在的桶号。
Hash policy
- load_factor：返回载荷因子
- max_bucket_count：返回最大载荷因子
- rehash：设置桶数
这些用得不多，可以大致了解一下。
相关阅读:
新的centos7.9安装jenkins—（一）
vscode调试container（进行rocksdb调试）+vscode比较git项目不同分支和fork的哪个分支
 微服务面试题
 新作文杂志新作文杂志社新作文编辑部2022年第8期目录
 Altium Designer 相同模块的布局布线复用-AD
信息系统项目管理师-范围管理论文提纲
 51单片机学习：LCD1602液晶显示实验
 SSM 框架搭建整合（IDEA）保姆级
 基于Java+SpringBoot+Vue前后端分离家政服务管理系统的设计与实现【Java毕业设计·文档报告·代码讲解·安装调试】
【深入理解Kotlin协程】lifecycleScope源码追踪扒皮
原文地址：https://blog.csdn.net/CegghnnoR/article/details/126893677

文章目录

unordered 系列容器

哈希表的概念

哈希函数

哈希函数设计原则

常用哈希函数

哈希冲突

概念

解决方式

闭散列

开散列

实现：除留余数法，闭散列

框架

查找

插入与扩容

删除

仿函数

完整代码

实现：除留余数法、开散列

框架

插入与扩容

析构

查找

删除

拷贝构造与赋值重载

完整代码

封装成 unordered_map、unordered_set

改造哈希表

迭代器

封装

关于除留余数法的补充

完整代码

HashTable.h

UnorderedSet.h

UnorderedMap.h

unordered_map其他底层接口