• .NET深入了解哈希表和Dictionary


    引子

    问题:给定一串数字{1,2,5,7,15,24,33,52},如何在时间复杂度为O(1)下,对数据进行CURD?

    数组:我创建一个Length为53的数组,将元素插入相同下标处,是不是就可以实现查找复杂度O(1)了?但是添加修改元素时间复杂度为O(n)了。

    链表:添加删除复杂度为O(1),但是查找时间复杂度为O(n)了。

    身为.NETer肯定熟练使用Dictionary和HashSet,这两个容器的底层就是HashTable,所以带着对技术浓重的兴趣,就从头到尾梳理一下!

    理论

    链地址法(拉链法)

    回到问题本身,我们用数组可以实现查找复杂度为O(1),链表实现添加删除复杂度为O(1),如果我们将两个合起来,不就可以实现增删查都为O(1)了么?如何结合呢?

    我们先定义一个数组,长度为7(敲黑板,思考下为什么选7?),将所有元素对7取余,这样所有元素都可以放在数组上了,如下图所示:

    如上图,如果我们将数组中每个下标位置都放成一个链条,这样,复杂度不久降下去了么?

    有问题么?没问题。真没问题么?有问题......

    注意

    1. 插入元素是{0,7,14,21,28}怎么办?这样都落在下标为0的链条里,时间复杂度不又上去了?针对这种情况,隔壁Java将链表优化成了红黑书,我们.NET呢?往下看。

    2. 如果我的数组长度不是7,是2怎么办?所有数对2取余,不是1就是0,时间复杂度不又上去了?所以我们对数组长度应该取素数。

    3. 如果元素超级多或者特别少,我们的数组长度要固定么?就要动态长度

    上边这种方法学名就叫拉链法!

    开放地址法

    上边我们聊过拉链法(为什么老想着裤子拉链......),拉链法是向下开辟新的空间,如果我们横向开辟空间呢?还是刚才的例子,我们这样搞一下试试。

    线性探测法

    我们插完7以后,在插24时,发现下标为2的地方有元素了,于是向后移动一位,发现有空位,于是就插进去了。

    上边这种方法就是线性探测法!

    二次聚集(堆积)

    聪明的老鸟们,肯定疑惑啦,如果我们继续添加元素{x%11=4},{y%11=5},此时x,y元素都要往下标6插数据。这样就导致了原始哈希地址不同的元素要插入同一个地址。即添加同义词的冲突过程中又添加了非同义词的冲突。这就是二次聚集

    二次探测法

    如果在线性探测法中,我们不依次寻找下一个呢?我们针对"下一个"采取{1 ^ 2,-1 ^ 2,2 ^ 2,-2 ^ 2....}(垃圾编辑器,次方样式乱了)这样的步长呢?真聪明!你已经知道二次探测法了!

    这......这还能用么?不都乱了么?下标和元素对不上了呀!怎么去查找元素呢?

    别急呀,家人们呐,我们按照这个思路查询就好了:

    查找算法步骤

    1. 给定待查找的关键字key,获取原始应该插入的下标index
    2. 如果原始下标index处,元素为空,则所查找的元素不存在
    3. 如果index处的元素等于key,则查找成功
    4. 否则重复下述解决冲突的过程
      * 按照处理冲突的方法,计算下一个地址nextIndex
      * 若nextIndex为空,则查找元素不存在
      * 若nextIndex等于关键词key,则查找成功

    还有要注意的点么?必须有!

    注意(敲重点啦)

    1. 数组长度必须大于给定元素的长度!
    2. 当数组元素快装满时,时间复杂度也是O(n)!
    3. 如果都装满了,就会一直循环找空位,我们应该进行扩容!

    理论小结

    接口设计

    干活啦,干活啦,领导嫌查询效率太低,让设计一种CURD时间复杂度都为O(n)的数据结构。给了接口。接口如下:

    internal interface IDictionary : IEnumerable>
    {
        TV this[TK key] { get; set; }
    
        int Count { get; }
        /// <summary>
        /// 根据key判断元素是否存在
        /// summary>
        /// <param name="key">param>
        /// <returns>returns>
        bool ContainsKey(TK key);
        /// <summary>
        /// 添加元素
        /// summary>
        /// <param name="key">param>
        /// <param name="value">param>
        void Add(TK key, TV value);
        /// <summary>
        /// 根据key移除元素
        /// summary>
        /// <param name="key">param>
        void Remove(TK key);
        /// <summary>
        /// 清除
        /// summary>
        void Clear();
    }

    .NET实现线性探测法

    实现过程

    1. 先来个对象,存储key和value

    对象:KeyValuePair
    internal class DictionaryKeyValuePair<TK, TV>
    {
        internal TK Key;
        internal TV Value;
    
        internal DictionaryKeyValuePair(TK key, TV value)
        {
            Key = key;
            Value = value;
        }
    }

    2. 来个类OpenAddressDictionary,继承IDictionary接口,就是我们的实现类

    实现类:OpenAddressDictionary
    /// 
    /// 使用线性探测法实现哈希表
    /// 
    /// 
    /// 
    public class OpenAddressDictionary<TK, TV> : IDictionary<TK, TV>
    {
        //创建一个数组,用来存储元素
        private DictionaryKeyValuePair[] hashArray;
    
        //记录已插入元素的数量
        public int Count { get; private set; }
    
        public OpenAddressDictionary(int capacity)
        {
            if (capacity < 0)
                throw new ArgumentOutOfRangeException("初始值容量不能小于0");
            hashArray = new DictionaryKeyValuePair[capacity];
        }
        public TV this[TK key] {
            get => throw new System.NotImplementedException();
            set => throw new System.NotImplementedException();
        }
    
        public void Add(TK key, TV value)
        {
            throw new System.NotImplementedException();
        }
    
        public void Clear()
        {
            throw new System.NotImplementedException();
        }
    
        public System.Boolean ContainsKey(TK key)
        {
            throw new System.NotImplementedException();
        }
    
        public IEnumerator> GetEnumerator()
        {
            throw new System.NotImplementedException();
        }
    
        public void Remove(TK key)
        {
            throw new System.NotImplementedException();
        }
    
        IEnumerator IEnumerable.GetEnumerator()
        {
            throw new System.NotImplementedException();
        }
    }

    3.如何实现查找?跟着上文查找步骤就行

    线性探测:查找
    /// 
    /// 查找,按照上文线性探测查找步骤
    /// 
    /// 
    /// 
    public bool ContainsKey(TK key)
    {
        //1.给定待查找的关键字key,获取原始应该插入的下标index
        var hashCode = GetHash(key);
        var index = hashCode % hashArray.Length;
    
        //2.如果原始下标index处,元素为空,则所查找的元素不存在
        if (hashArray[index] == null) return false;
    
        var current = hashArray[index];//当前元素
    
        /*这个点用来判断是否走了一整圈*/
        var hitKey = current.Key;
    
        //4.否则重复下述解决冲突的过程           
        while (current != null)
        {
            //3.如果index处的元素等于key,则查找成功
            if (current.Key.Equals(key)) return true;
    
            /*这个地方来修改获取下一个元素位置*/
            index++;
    
            /*到尾了,但是没有走完一圈*/
            if (index == hashArray.Length)
                index = 0;
    
            current = hashArray[index];
    
            /*走完一圈了,没找到*/
            if (current != null && current.Key.Equals(hitKey)) break;
        }
    
        return false;
    }

    4. 添加

    线性探测:添加
    /// 
    /// 添加元素
    /// 
    /// 
    /// 
    /// 
    public void Add(TK key, TV value)
    {
        Grow();
        //1.获取原始插入位置
        var hashCode = GetHash(key);
        var index = hashCode % hashArray.Length;
    
        //2.此位置为空,直接插入
        if (hashArray[index] == null)
        {
            hashArray[index] = new DictionaryKeyValuePair(key, value);
        }
        //3.坑被占了,去看看下一个
        else
        {
            var current = hashArray[index];
            /*这个点用来判断是否走了一整圈*/
            var hitKey = current.Key;
            while (current != null)
            {
                if (current.Key.Equals(key)) throw new Exception("重复key");
    
                /*这个地方来修改获取下一个元素位置*/
                index++;
    
                /*到尾了,但是没有走完一圈*/
                if (index == hashArray.Length)
                    index = 0;
    
                current = hashArray[index];
    
                /*走完一圈了,没找到空位*/
                if (current != null && current.Key.Equals(hitKey)) throw new Exception("容器满了");
            }
            hashArray[index] = new DictionaryKeyValuePair(key, value);
        }
        Count++;
    }
    /// 
    /// 扩容
    /// 
    private void Grow()
    {
        /*这个地方判断使用多少扩容*/
        if (hashArray.Length * 0.7 <= Count)
        {
            var orghashArray = hashArray.Length;
            var currentArray = hashArray;
    
            /*这个地方改变扩容大小的规则*/
            hashArray = new DictionaryKeyValuePair[hashArray.Length * 2];
            for (var i = 0; i < orghashArray; i++)
            {
                var current = currentArray[i];
    
                /*旧数组中存在元素,添加到新数组中,Add方法会对Count++,所以加入后要Count--*/
                if (current != null)
                {
                    Add(current.Key, current.Value);
                    Count--;
                }
            }
            currentArray = null;
        }
    }

    5. 删除

    线性探测:删除
    /// 
    /// 删除元素key
    /// 
    /// 
    /// 
    public void Remove(TK key)
    {
        //1.获取原始插入位置
        var hashCode = GetHash(key);
        var curIndex = hashCode % hashArray.Length;
    
        //2.此位置为空,无法删除
        if (hashArray[curIndex] == null) throw new Exception("未找到元素key");
    
        var current = hashArray[curIndex];
    
        /*这个点用来判断是否走了一整圈*/
        var hitKey = current.Key;
    
        #region 找到待删除元素
        DictionaryKeyValuePair target = null;
    
        while (current != null)
        {
            if (current.Key.Equals(key))
            {
                target = current;
                break;
            }
    
            /*这个地方来修改获取下一个元素位置*/
            curIndex++;
    
            /*到尾了,但是没有走完一圈*/
            if (curIndex == hashArray.Length)
                curIndex = 0;
    
            current = hashArray[curIndex];
    
            /*走完一圈了,没找到空位*/
            if (current != null && current.Key.Equals(hitKey)) throw new Exception("No such item for given key");
        }
    
        if (target == null)
        {
            throw new Exception("未找到元素key");
        }
        #endregion  
    
    
        //删除,将当前位置置空
        hashArray[curIndex] = null;
    
        #region 之前讲过删除,造成元素丢失,所以在此处处理
    
        curIndex++;
    
        /*到尾了,但是没有走完一圈*/
        if (curIndex == hashArray.Length)
            curIndex = 0;
    
        current = hashArray[curIndex];
    
        //直到下一个为空的点,到空说明后边的还没有被线性探测插入污染
        while (current != null)
        {
            //先删除
            hashArray[curIndex] = null;
    
            //重新插入
            Add(current.Key, current.Value);
            Count--;
    
            curIndex++;
    
            /*到尾了,但是没有走完一圈*/
            if (curIndex == hashArray.Length)
                curIndex = 0;
    
            current = hashArray[curIndex];
        }
        #endregion
    
    
        Count--;
    
        Shrink();
    }
    
    /// 
    /// 减容
    /// 
    private void Shrink()
    {
        /*这个地方判断元素在什么程度算少*/
        if (Count <= hashArray.Length * 0.3 && hashArray.Length / 2 > 0)
        {
            var orghashArray = hashArray.Length;
            var currentArray = hashArray;
    
            /*这个地方改变扩容大小的规则*/
            hashArray = new DictionaryKeyValuePair[hashArray.Length / 2];
    
            for (var i = 0; i < orghashArray; i++)
            {
                var current = currentArray[i];
    
                /*旧数组中存在元素,添加到新数组中,Add方法会对Count++,所以加入后要Count--*/
                if (current != null)
                {
                    Add(current.Key, current.Value);
                    Count--;
                }
            }
    
            currentArray = null;
        }
    }

    最终代码

    线性探测:最终代码
    /// 
    /// 使用线性探测法实现哈希表
    /// 
    /// 
    /// 
    public class OpenAddressDictionary<TK, TV> : IDictionary<TK, TV>
    {
        //创建一个数组,用来存储元素
        private DictionaryKeyValuePair[] hashArray;
        //记录已插入元素的数量
        public int Count { get; private set; }
    
        public TV this[TK key]
        {
            get => GetValue(key);
            set => SetValue(key, value);
        }
    
        public OpenAddressDictionary(int capacity)
        {
            if (capacity < 0)
                throw new ArgumentOutOfRangeException("初始值容量不能小于0");
            hashArray = new DictionaryKeyValuePair[capacity];
        }
        /// 
        /// 清除最简单
        /// 
        public void Clear()
        {
            if (Count > 0)
                Array.Clear(hashArray, 0, hashArray.Length);
        }
    
        /// 
        /// 查找,按照上文线性探测查找步骤
        /// 
        /// 
        /// 
        public bool ContainsKey(TK key)
        {
            //1.给定待查找的关键字key,获取原始应该插入的下标index
            var hashCode = GetHash(key);
            var index = hashCode % hashArray.Length;
    
            //2.如果原始下标index处,元素为空,则所查找的元素不存在
            if (hashArray[index] == null) return false;
    
            var current = hashArray[index];//当前元素
    
            /*这个点用来判断是否走了一整圈*/
            var hitKey = current.Key;
    
            //4.否则重复下述解决冲突的过程           
            while (current != null)
            {
                //3.如果index处的元素等于key,则查找成功
                if (current.Key.Equals(key)) return true;
    
                /*这个地方来修改获取下一个元素位置*/
                index++;
    
                /*到尾了,但是没有走完一圈*/
                if (index == hashArray.Length)
                    index = 0;
    
                current = hashArray[index];
    
                /*走完一圈了,没找到*/
                if (current != null && current.Key.Equals(hitKey)) break;
            }
    
            return false;
        }
    
        /// 
        /// 添加元素
        /// 
        /// 
        /// 
        /// 
        public void Add(TK key, TV value)
        {
            Grow();
            //1.获取原始插入位置
            var hashCode = GetHash(key);
            var index = hashCode % hashArray.Length;
    
            //2.此位置为空,直接插入
            if (hashArray[index] == null)
            {
                hashArray[index] = new DictionaryKeyValuePair(key, value);
            }
            //3.坑被占了,去看看下一个
            else
            {
                var current = hashArray[index];
                /*这个点用来判断是否走了一整圈*/
                var hitKey = current.Key;
                while (current != null)
                {
                    if (current.Key.Equals(key)) throw new Exception("重复key");
    
                    /*这个地方来修改获取下一个元素位置*/
                    index++;
    
                    /*到尾了,但是没有走完一圈*/
                    if (index == hashArray.Length)
                        index = 0;
    
                    current = hashArray[index];
    
                    /*走完一圈了,没找到空位*/
                    if (current != null && current.Key.Equals(hitKey)) throw new Exception("容器满了");
                }
                hashArray[index] = new DictionaryKeyValuePair(key, value);
            }
            Count++;
        }
        /// 
        /// 删除元素key
        /// 
        /// 
        /// 
        public void Remove(TK key)
        {
            //1.获取原始插入位置
            var hashCode = GetHash(key);
            var curIndex = hashCode % hashArray.Length;
    
            //2.此位置为空,无法删除
            if (hashArray[curIndex] == null) throw new Exception("未找到元素key");
    
            var current = hashArray[curIndex];
    
            /*这个点用来判断是否走了一整圈*/
            var hitKey = current.Key;
    
            #region 找到待删除元素
            DictionaryKeyValuePair target = null;
    
            while (current != null)
            {
                if (current.Key.Equals(key))
                {
                    target = current;
                    break;
                }
    
                /*这个地方来修改获取下一个元素位置*/
                curIndex++;
    
                /*到尾了,但是没有走完一圈*/
                if (curIndex == hashArray.Length)
                    curIndex = 0;
    
                current = hashArray[curIndex];
    
                /*走完一圈了,没找到空位*/
                if (current != null && current.Key.Equals(hitKey)) throw new Exception("No such item for given key");
            }
    
            if (target == null)
            {
                throw new Exception("未找到元素key");
            }
            #endregion  
    
    
            //删除,将当前位置置空
            hashArray[curIndex] = null;
    
            #region 之前讲过删除,造成元素丢失,所以在此处处理
    
            curIndex++;
    
            /*到尾了,但是没有走完一圈*/
            if (curIndex == hashArray.Length)
                curIndex = 0;
    
            current = hashArray[curIndex];
    
            //直到下一个为空的点,到空说明后边的还没有被线性探测插入污染
            while (current != null)
            {
                //先删除
                hashArray[curIndex] = null;
    
                //重新插入
                Add(current.Key, current.Value);
                Count--;
    
                curIndex++;
    
                /*到尾了,但是没有走完一圈*/
                if (curIndex == hashArray.Length)
                    curIndex = 0;
    
                current = hashArray[curIndex];
            }
            #endregion
    
    
            Count--;
    
            Shrink();
        }
    
        /// 
        /// 扩容
        /// 
        private void Grow()
        {
            /*这个地方判断使用多少扩容*/
            if (hashArray.Length * 0.7 <= Count)
            {
                var orghashArray = hashArray.Length;
                var currentArray = hashArray;
    
                /*这个地方改变扩容大小的规则*/
                hashArray = new DictionaryKeyValuePair[hashArray.Length * 2];
                for (var i = 0; i < orghashArray; i++)
                {
                    var current = currentArray[i];
    
                    /*旧数组中存在元素,添加到新数组中,Add方法会对Count++,所以加入后要Count--*/
                    if (current != null)
                    {
                        Add(current.Key, current.Value);
                        Count--;
                    }
                }
                currentArray = null;
            }
        }
        /// 
        /// 减容
        /// 
        private void Shrink()
        {
            /*这个地方判断元素在什么程度算少*/
            if (Count <= hashArray.Length * 0.3 && hashArray.Length / 2 > 0)
            {
                var orghashArray = hashArray.Length;
                var currentArray = hashArray;
    
                /*这个地方改变扩容大小的规则*/
                hashArray = new DictionaryKeyValuePair[hashArray.Length / 2];
    
                for (var i = 0; i < orghashArray; i++)
                {
                    var current = currentArray[i];
    
                    /*旧数组中存在元素,添加到新数组中,Add方法会对Count++,所以加入后要Count--*/
                    if (current != null)
                    {
                        Add(current.Key, current.Value);
                        Count--;
                    }
                }
    
                currentArray = null;
            }
        }
    
        private void SetValue(TK key, TV value)
        {
            var index = GetHash(key) % hashArray.Length;
    
            if (hashArray[index] == null)
            {
                Add(key, value);
            }
            else
            {
                var current = hashArray[index];
                var hitKey = current.Key;
    
                while (current != null)
                {
                    if (current.Key.Equals(key))
                    {
                        Remove(key);
                        Add(key, value);
                        return;
                    }
    
                    index++;
    
                    //wrap around
                    if (index == hashArray.Length)
                        index = 0;
    
                    current = hashArray[index];
    
                    //reached original hit again
                    if (current != null && current.Key.Equals(hitKey)) throw new Exception("Item not found");
                }
            }
    
            throw new Exception("Item not found");
        }
    
        private TV GetValue(TK key)
        {
            var index = GetHash(key) % hashArray.Length;
    
            if (hashArray[index] == null) throw new Exception("Item not found");
    
            var current = hashArray[index];
            var hitKey = current.Key;
    
            while (current != null)
            {
                if (current.Key.Equals(key)) return current.Value;
    
                index++;
    
                //wrap around
                if (index == hashArray.Length)
                    index = 0;
    
                current = hashArray[index];
    
                //reached original hit again
                if (current != null && current.Key.Equals(hitKey)) throw new Exception("Item not found");
            }
    
            throw new Exception("Item not found");
        }
    
        private int GetHash(TK key)
        {
            return Math.Abs(key.GetHashCode());
        }
        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
        //迭代器就不写了,想了解看我博客容器栏目
        public IEnumerator> GetEnumerator()
        {
            throw new System.NotImplementedException();
        }
    }
    internal class DictionaryKeyValuePair<TK, TV>
    {
        internal TK Key;
        internal TV Value;
    
        internal DictionaryKeyValuePair(TK key, TV value)
        {
            Key = key;
            Value = value;
        }
    }

    .NET实现拉链法

    实现过程

    回想一下,上边的拉链法,每个下标位置放置的是一个链条,所以我们先实现一个双向链表

    1. 实现一个双向链表

    拉链法:构建双向链表
    internal class DLinkedNode<T>
    {
        public T Data;
        public DLinkedNode Next;
        public DLinkedNode Previous;
    
        public DLinkedNode(T data)
        {
            Data = data;
        }
    }

    2. 创建一个拉链法实体类

    拉链法:实现类
    /// 
    /// 拉链法:实现类
    /// 
    /// 
    /// 
    internal class SeparateChainingDictionary<TK, TV>:IDictionary<TK, TV>
    {
        //构建一个数组,数组每个节点都是链表
        private DLinkedNode>[] hashArray;
        //已使用数组下标个数
        private int filledBuckets;
    
        public SeparateChainingDictionary(int capacity) {
            if (capacity < 0)
                throw new ArgumentOutOfRangeException("初始值容量不能小于0");
            hashArray = new DLinkedNode>[capacity];
        }
        public TV this[TK key] { 
            get => throw new NotImplementedException(); 
            set => throw new NotImplementedException(); 
        }
    
        public int Count => throw new NotImplementedException();
    
        public void Add(TK key, TV value)
        {
            throw new NotImplementedException();
        }
    
        public void Clear()
        {
            throw new NotImplementedException();
        }
    
        public bool ContainsKey(TK key)
        {
            throw new NotImplementedException();
        }
    
        public void Remove(TK key)
        {
            throw new NotImplementedException();
        }
    
        public IEnumerator> GetEnumerator()
        {
            throw new NotImplementedException();
        }
    
        IEnumerator IEnumerable.GetEnumerator()
        {
            throw new NotImplementedException();
        }
    }

    3. 拉链法:查找

    拉链法:查找
    /// 
    /// 查找
    /// 
    /// 
    /// 
    public bool ContainsKey(TK key)
    {
        /*1.获取原始下标*/
        var index = Math.Abs(key.GetHashCode()) % hashArray.Length;
    
        /*2.为空即无*/
        if (hashArray[index] == null) return false;
    
        var current = hashArray[index];
    
        /*3.遍历链表*/
        while (current != null)
        {               
            if (current.Data.Key.Equals(key)) return true;
    
            current = current.Next;
        }
    
        return false;
    }

    4. 拉链法:添加

    拉链法:添加
    /// 
    /// 添加
    /// 
    /// 
    /// 
    /// 
    public void Add(TK key, TV value)
    {
        Grow();
    
        var index = Math.Abs(key.GetHashCode()) % hashArray.Length;
    
        if (hashArray[index] == null)
        {
            hashArray[index] = new DLinkedNode>(new KeyValuePair(key, value));
            filledBuckets++;
        }
        else
        {
            var current = hashArray[index];
    
            while (current != null && current.Next != null)
            {
                /*此处可以判断是重复修改,还是抛异常*/
                if (current.Data.Key.Equals(key)) throw new Exception("重复key");
    
                current = current.Next;
            }
            if (current.Data.Key.Equals(key)) throw new Exception("重复key");
            current.Next = new DLinkedNode>(new KeyValuePair(key, value));
        }
    
        Count++;
    }
    
    /// 
    /// 扩容
    /// 
    private void Grow()
    {
        if (filledBuckets >= hashArray.Length * 0.7)
        {
            filledBuckets = 0;
    
            var newBucketSize = hashArray.Length * 2;
    
            var biggerArray = new DLinkedNode>[newBucketSize];
    
            for (var i = 0; i < hashArray.Length; i++)
            {
                var item = hashArray[i];
    
                if (item != null)
                {
                    var current = item;
    
                    while (current != null)
                    {
                        var next = current.Next;
    
                        var newIndex = Math.Abs(current.Data.Key.GetHashCode()) % newBucketSize;
    
                        if (biggerArray[newIndex] == null)
                        {
                            filledBuckets++;
                            biggerArray[newIndex] = current;
                        }
    
                        var bItem = biggerArray[newIndex];
                        while(bItem.Next != null)
                            bItem = bItem.Next;
                        bItem.Next = current;
    
                        current = next;
                    }
                }
            }
    
            hashArray = biggerArray;
        }
    }

    5. 拉链法:删除

    拉链法:删除
    public void Remove(TK key)
    {
        var index = Math.Abs(key.GetHashCode()) % hashArray.Length;
    
        if (hashArray[index] == null) throw new Exception("未找到key");
    
        var current = hashArray[index];
    
        /*查找待删除元素*/
        DLinkedNode> item = null;
        while (current != null)
        {
            if (current.Data.Key.Equals(key))
            {
                item = current;
                break;
            }
            current = current.Next;
        }
    
        if (item == null)
        {
            throw new Exception("未找到key");
        }
    
        /*删除*/
        if (item.Next == null)
            item = null;
        else
        {
            item.Previous = item.Next; 
            item.Next.Previous =item.Previous ;
            item = null;
        }
    
    
        if (hashArray[index] == null)
        {
            filledBuckets--;
        }
    
        Count--;
    
        Shrink();
    }
    private void Shrink()
    {
        /*是否减容*/
        if (Math.Abs(filledBuckets - hashArray.Length * 0.3) < 0.1 && hashArray.Length / 2 > 0)
        {
            filledBuckets = 0;
            var newBucketSize = hashArray.Length / 2;
    
            var smallerArray = new DLinkedNode>[newBucketSize];
    
            for (var i = 0; i < hashArray.Length; i++)
            {
                var item = hashArray[i];
    
                if (item != null)
                {
                    var current = item;
    
                    /*找到新的存储点*/
                    while (current != null)
                    {
                        var next = current.Next;
    
                        var newIndex = Math.Abs(current.Data.Key.GetHashCode()) % newBucketSize;
    
                        if (smallerArray[newIndex] == null)
                        {
                            filledBuckets++;
                            smallerArray[newIndex] = current;
                        }
    
                        var newItem = smallerArray[newIndex];
                        while(newItem.Next != null)
                            newItem = newItem.Next;
                        newItem.Next = current;
    
                        current = next;
                    }
                }
            }
    
            hashArray = smallerArray;
        }
    }

    最终代码

    拉链法:最终代码
    
    internal class DLinkedNode<T>
    {
      public T Data;
      public DLinkedNode Next;
      public DLinkedNode Previous;
    
      public DLinkedNode(T data)
      {
          Data = data;
      }
    }
    internal class SeparateChainingDictionary<TK, TV> : IDictionary<TK, TV>
    {
    //构建一个数组,数组每个节点都是链表
    private DLinkedNode>[] hashArray;
    //已使用数组下标个数
    private int filledBuckets;
    
    public SeparateChainingDictionary(int capacity)
    {
        if (capacity < 0)
            throw new ArgumentOutOfRangeException("初始值容量不能小于0");
        hashArray = new DLinkedNode>[capacity];
    }
    public TV this[TK key]
    {
        get => throw new NotImplementedException();
        set => throw new NotImplementedException();
    }
    
    public int Count { get; private set; }
    
    /// 
    /// 添加
    /// 
    /// 
    /// 
    /// 
    public void Add(TK key, TV value)
    {
        Grow();
    
        var index = Math.Abs(key.GetHashCode()) % hashArray.Length;
    
        if (hashArray[index] == null)
        {
            hashArray[index] = new DLinkedNode>(new KeyValuePair(key, value));
            filledBuckets++;
        }
        else
        {
            var current = hashArray[index];
    
            while (current != null && current.Next != null)
            {
                /*此处可以判断是重复修改,还是抛异常*/
                if (current.Data.Key.Equals(key)) throw new Exception("重复key");
    
                current = current.Next;
            }
            if (current.Data.Key.Equals(key)) throw new Exception("重复key");
            current.Next = new DLinkedNode>(new KeyValuePair(key, value));
        }
    
        Count++;
    }
    
    /// 
    /// 扩容
    /// 
    private void Grow()
    {
        if (filledBuckets >= hashArray.Length * 0.7)
        {
            filledBuckets = 0;
    
            var newBucketSize = hashArray.Length * 2;
    
            var biggerArray = new DLinkedNode>[newBucketSize];
    
            for (var i = 0; i < hashArray.Length; i++)
            {
                var item = hashArray[i];
    
                if (item != null)
                {
                    var current = item;
    
                    while (current != null)
                    {
                        var next = current.Next;
    
                        var newIndex = Math.Abs(current.Data.Key.GetHashCode()) % newBucketSize;
    
                        if (biggerArray[newIndex] == null)
                        {
                            filledBuckets++;
                            biggerArray[newIndex] = current;
                        }
    
                        var bItem = biggerArray[newIndex];
                        while(bItem.Next != null)
                            bItem = bItem.Next;
                        bItem.Next = current;
    
                        current = next;
                    }
                }
            }
    
            hashArray = biggerArray;
        }
    }
    
    
    public void Clear()
    {
        throw new NotImplementedException();
    }
    
    /// 
    /// 查找
    /// 
    /// 
    /// 
    public bool ContainsKey(TK key)
    {
        /*1.获取原始下标*/
        var index = Math.Abs(key.GetHashCode()) % hashArray.Length;
    
        /*2.为空即无*/
        if (hashArray[index] == null) return false;
    
        var current = hashArray[index];
    
        /*3.遍历链表*/
        while (current != null)
        {
            if (current.Data.Key.Equals(key)) return true;
    
            current = current.Next;
        }
    
        return false;
    }
    
    public void Remove(TK key)
    {
        var index = Math.Abs(key.GetHashCode()) % hashArray.Length;
    
        if (hashArray[index] == null) throw new Exception("未找到key");
    
        var current = hashArray[index];
    
        /*查找待删除元素*/
        DLinkedNode> item = null;
        while (current != null)
        {
            if (current.Data.Key.Equals(key))
            {
                item = current;
                break;
            }
            current = current.Next;
        }
    
        if (item == null)
        {
            throw new Exception("未找到key");
        }
    
        /*删除*/
        if (item.Next == null)
            item = null;
        else
        {
            item.Previous = item.Next; 
            item.Next.Previous =item.Previous ;
            item = null;
        }
    
    
        if (hashArray[index] == null)
        {
            filledBuckets--;
        }
    
        Count--;
    
        Shrink();
    }
    private void Shrink()
    {
        /*是否减容*/
        if (Math.Abs(filledBuckets - hashArray.Length * 0.3) < 0.1 && hashArray.Length / 2 > 0)
        {
            filledBuckets = 0;
            var newBucketSize = hashArray.Length / 2;
    
            var smallerArray = new DLinkedNode>[newBucketSize];
    
            for (var i = 0; i < hashArray.Length; i++)
            {
                var item = hashArray[i];
    
                if (item != null)
                {
                    var current = item;
    
                    /*找到新的存储点*/
                    while (current != null)
                    {
                        var next = current.Next;
    
                        var newIndex = Math.Abs(current.Data.Key.GetHashCode()) % newBucketSize;
    
                        if (smallerArray[newIndex] == null)
                        {
                            filledBuckets++;
                            smallerArray[newIndex] = current;
                        }
    
                        var newItem = smallerArray[newIndex];
                        while(newItem.Next != null)
                            newItem = newItem.Next;
                        newItem.Next = current;
    
                        current = next;
                    }
                }
            }
    
            hashArray = smallerArray;
        }
    }
    public IEnumerator> GetEnumerator()
    {
        throw new NotImplementedException();
    }
    
    IEnumerator IEnumerable.GetEnumerator()
    {
        throw new NotImplementedException();
    }
    }

    Dictionary源码分析

    模拟实现:一个Dictionary,存储数据{1,'a'},{'4','b'},{5,'c'}

    1. 创建一个单链表,用来存储K-V

    private struct Entry
    {
        public uint hashCode;
        //值为-1,表示是该链条最后一个节点
        //值小于-1,表示已经被删除的自由节点
        public int next;
        public TKey key;     // Key of entry
        public TValue value; // Value of entry
    }

    2. 创建一个数组当桶,还有一个链表数组(核心就这两个数组)

    private int[]? _buckets;
    private Entry[]? _entries;

    3. 模拟实现插入{1,'a'},{'4','b'},{5,'c'}

    初始化

    第一次插入{1,'a'}

    第二次插入{'4','b'}

    第三次插入{5,'c'}

    仔细看一下这三个数据的插入,及数据的变化,应该可以理解_buckets和_entries的关系

    4.删除

    上边再讲哈希表,包括我们自己实现的代码中,删除一个节点后,都要重新计算后边的位置。如何解决这个问题呢?我们可以使用Entry的next,来表示是否已经删除,小于0就表示是自由节点。

    关于删除就这样几个变量:

    private int _freeList;//最后一个删除的Entry下标
    private int _freeCount;//当前已删除,但是还未重新使用的节点数量
    private const int StartOfFreeList = -3;//帮助寻找自由节点的一个常量

    看一下StartOfFreeList和_freeList和Entry.next如何寻找自由节点

    • 删除时:Entry[i].next=上一层中的StartOfFreeList-_freeList
    • 添加&&_freeCount>0:_freeList=StartOfFreeList - entries[_freeList].next

    请看图理解:

    源码:简化版(debug理解)

    源码:简化版可直接运行
    public static void Main(string[] args)
    {
        Dictionary<int, char> dic = new Dictionary<int, char>();
        dic.TryInsert(1, 'a');
        dic.TryInsert(4, 'b');
        dic.TryInsert(5, 'c');
        dic.Remove(4);
        dic.Remove(5);
        dic.TryInsert(0, 'd');
        dic.TryInsert(1, 'e');
    }
    public class Dictionary<TKey, TValue>
    {
    private int[]? _buckets;
    private Entry[]? _entries;
    private int _count;
    private int _freeList;
    private int _freeCount;
    private int _version;
    private const int StartOfFreeList = -3;
    public Dictionary()
    {
        /*初始值为素数,这里就不动态了,获取素数可以使用埃及筛选法*/
        Initialize(7);
    }
    private int Initialize(int capacity)
    {
        int size = capacity;
        int[] buckets = new int[size];
        Entry[] entries = new Entry[size];
        _freeList = -1;
        _buckets = buckets;
        _entries = entries;
        return size;
    }
    
    public bool TryInsert(TKey key, TValue value)
    {
        Entry[]? entries = _entries;
        uint hashCode = (uint)key.GetHashCode();
    
        uint collisionCount = 0;
        ref int bucket = ref GetBucket(hashCode);
        int i = bucket - 1; // Value in _buckets is 1-based
        if (typeof(TKey).IsValueType)
        {
            while (true)
            {
                if ((uint)i >= (uint)entries.Length)
                {
                    break;
                }
    
                if (entries[i].hashCode == hashCode && EqualityComparer.Default.Equals(entries[i].key, key))
                {
                    entries[i].value = value;
                    return true;
                }
    
                i = entries[i].next;
    
                collisionCount++;
                if (collisionCount > (uint)entries.Length)
                {
                    throw new Exception("");
                }
            }
        }
        int index;
        if (_freeCount > 0)
        {
            index = _freeList;
            // Debug.Assert((StartOfFreeList - entries[_freeList].next) >= -1, "shouldn't overflow because `next` cannot underflow");
            _freeList = StartOfFreeList - entries[_freeList].next;
            _freeCount--;
        }
        else
        {
            int count = _count;
            if (count == entries.Length)
            {
                //Resize();
                bucket = ref GetBucket(hashCode);
            }
            index = count;
            _count = count + 1;
            entries = _entries;
        }
    
        ref Entry entry = ref entries![index];
        entry.hashCode = hashCode;
        entry.next = bucket - 1; // Value in _buckets is 1-based
        entry.key = key;
        entry.value = value; // Value in _buckets is 1-based
        bucket = index + 1;
        _version++;
        return true;
    }
    public bool Remove(TKey key)
    {
        if (key == null) return false;
    
        if (_buckets != null)
        {
            uint collisionCount = 0;
            uint hashCode = (uint)key.GetHashCode();
            ref int bucket = ref GetBucket(hashCode);
            Entry[]? entries = _entries;
            int last = -1;
            int i = bucket - 1; // Value in buckets is 1-based
            while (i >= 0)
            {
                ref Entry entry = ref entries[i];
    
                if (entry.hashCode == hashCode && EqualityComparer.Default.Equals(entry.key, key))
                {
                    if (last < 0)
                    {
                        bucket = entry.next + 1; 
                    }
                    else
                    {
                        entries[last].next = entry.next;
                    }
    
                    entry.next = StartOfFreeList - _freeList;
    
                    entry.key = default!;
    
                    entry.value = default!;
    
                    _freeList = i;
                    _freeCount++;
                    return true;
                }
                last = i;
                i = entry.next;
                collisionCount++;
                if (collisionCount > (uint)entries.Length)
                {
    
                }
            }
        }
        return false;
    }
    private ref int GetBucket(uint hashCode)
    {
        int[] buckets = _buckets!;
        return ref buckets[hashCode % (uint)buckets.Length];
    }
    private struct Entry
    {
        public uint hashCode;
        //值为-1,表示是该链条最后一个节点
        public int next;
        public TKey key;     // Key of entry
        public TValue value; // Value of entry
    }

    __EOF__

  • 本文作者: Broder
  • 本文链接: https://www.cnblogs.com/Z7TS/p/16894154.html
  • 关于博主: 评论和私信会在第一时间回复。或者直接私信我。
  • 版权声明: 本博客所有文章除特别声明外,均采用 BY-NC-SA 许可协议。转载请注明出处!
  • 声援博主: 如果您觉得文章对您有帮助,可以点击文章右下角推荐一下。
  • 相关阅读:
    【JAVA基础】String类概述
    javaScript 内存管理
    flutter生态一统甜夏 @Android @ios @windowse @macos @linux @Web
    Node.js基础---使用Express写接口
    2023新疆褐牛产业集群高质量发展论坛伊犁召开
    STM32cubemx对FreeRTOS的适配(工程模板配置)
    Java TreeMap如何按照key或value降序排列呢?
    图解来啦!机器学习工业部署最佳实践!10分钟上手机器学习部署与大规模扩展 ⛵
    单线程、多线程Reactor模型
    红黑树及其应用介绍(万字长文)
  • 原文地址:https://www.cnblogs.com/Z7TS/p/16894154.html