• NativeBuferring,一种零分配的数据类型[下篇]


    上文说到Unmanaged、BufferedBinary和BufferedString是NativeBuffering支持的三个基本数据类型,其实我们也可以说NativeBuffering只支持UnmanagedIReadOnlyBufferedObject两种类型,BufferedString、NativeBuffering和通过Source Generator生成的BufferedMessage类型,以及下面介绍的几种集合和字典类型,都实现了IReadOnlyBufferedObject接口。

    一、IReadOnlyBufferedObject
    二、集合
    三、字典
    四、为什么不直接返回接口?

    一、IReadOnlyBufferedObject

    顾名思义,IReadOnlyBufferedObject表示一个针对缓冲字节序列创建的只读数据类型。如下面的代码片段所示,该接口只定义了一个名为Parse静态方法,意味着对于任何一个实现了该接口的类型,对应的实例都可以利用一个代表缓冲字节序列的NativeBuffer的对象进行创建。

    复制代码
    public interface IReadOnlyBufferedObject where T: IReadOnlyBufferedObject
    {
        static abstract T Parse(NativeBuffer buffer);
    }
    
    public unsafe readonly struct NativeBuffer
    {
        public byte[] Bytes { get; }
        public void* Start { get; }
    
        public NativeBuffer(byte[] bytes, void* start)
        {
            Bytes = bytes ?? throw new ArgumentNullException(nameof(bytes));
            Start = start;
        }
    
        public NativeBuffer(byte[] bytes, int index = 0)
        {
            Bytes = bytes ?? throw new ArgumentNullException(nameof(bytes));
            Start = Unsafe.AsPointer(ref bytes[index]);
        }
    }
    复制代码

    由于IReadOnlyBufferedObject是NativeBuffering支持的基础类型,而生成的BufferedMessage类型也实现了这个接口。通过这种“无限嵌套”的形式,我们可以定义一个具有任意结构的数据类型。比如我们具有如下这个表示联系人的Contact类型,我们需要利用它作为“源类型”生成对应BufferedMessage类型。

    复制代码
    [BufferedMessageSource]
    public partial class Contact
    {
        public Contact(string id, string name, Address address)
        {
            Id = id;
            Name = name;
            ShipAddress = address;
        }
    
        public string Id { get; }
        public string Name { get; }
        public Address ShipAddress { get; }
    }
    
    [BufferedMessageSource]
    public partial class Address
    {
        public string Province { get; }
        public string City { get; }
        public string District { get; }
        public string Street { get; }
        public Address(string province, string city, string district, string street)
        {
            Province = province ?? throw new ArgumentNullException(nameof(province));
            City = city ?? throw new ArgumentNullException(nameof(city));
            District = district ?? throw new ArgumentNullException(nameof(district));
            Street = street ?? throw new ArgumentNullException(nameof(street));
        }
    }
    复制代码

    Contact具有Id、Name和ShipAddress 三个数据成员,ShipAddress 对应的Address又是一个复合类型,具有四个表示省、市、区和介绍的字符串类型成员。现在我们为Contact和Address这两个类型生成对应的ContactBufferedMessage和AddressBufferedMessage。

    复制代码
    public unsafe readonly struct ContactBufferedMessage : IReadOnlyBufferedObject
    {
        public NativeBuffer Buffer { get; }
        public ContactBufferedMessage(NativeBuffer buffer) => Buffer = buffer;
        public static ContactBufferedMessage Parse(NativeBuffer buffer) => new ContactBufferedMessage(buffer);
        public BufferedString Id => Buffer.ReadBufferedObjectField(0);
        public BufferedString Name => Buffer.ReadBufferedObjectField(1);
        public AddressBufferedMessage ShipAddress => Buffer.ReadBufferedObjectField(2);
    }
    
    public unsafe readonly struct AddressBufferedMessage : IReadOnlyBufferedObject
    {
        public NativeBuffer Buffer { get; }
        public AddressBufferedMessage(NativeBuffer buffer) => Buffer = buffer;
        public static AddressBufferedMessage Parse(NativeBuffer buffer) => new AddressBufferedMessage(buffer);
        public BufferedString Province => Buffer.ReadBufferedObjectField(0);
        public BufferedString City => Buffer.ReadBufferedObjectField(1);
        public BufferedString District => Buffer.ReadBufferedObjectField(2);
        public BufferedString Street => Buffer.ReadBufferedObjectField(3);
    }
    复制代码

    如下的程序演示了如何将一个Contact对象转换成字节数组,然后利用这这段字节序列生成一个ContactBufferedMessage对象。给出的调试断言验证了Contact和ContactBufferedMessage对象承载了一样的数据,fixed关键字是为了将字节数组“固定住”。(源代码从这里下载)

    复制代码
    using NativeBuffering;
    using System.Diagnostics;
    
    var address = new Address("Jiangsu", "Suzhou", "Industory Park", "#328, Xinghu St");
    var contact = new Contact("123456789", "John Doe", address);
    var size = contact.CalculateSize();
    var bytes = new byte[size];
    var context = new BufferedObjectWriteContext(bytes);
    contact.Write(context);
    
    unsafe
    {
        fixed (byte* _ = bytes)
        {
            var contactMessage = ContactBufferedMessage.Parse(new NativeBuffer(bytes));
            Debug.Assert(contactMessage.Id == "123456789");
            Debug.Assert(contactMessage.Name == "John Doe");
            Debug.Assert(contactMessage.ShipAddress.Province == "Jiangsu");
            Debug.Assert(contactMessage.ShipAddress.City == "Suzhou");
            Debug.Assert(contactMessage.ShipAddress.District == "Industory Park");
            Debug.Assert(contactMessage.ShipAddress.Street == "#328, Xinghu St");
        }
    }
    复制代码

    二、集合

    NativeBuffering同样支持集合。由于UnmanagedIReadOnlyBufferedObject是两种基本的数据类型,它们的根据区别在于:前者的长度有类型本身决定,是固定长度类型,后者则是可变长度类型。元素类型为UnmanagedIReadOnlyBufferedObject的集合分别通过ReadOnlyFixedLengthTypedList和ReadOnlyVaraibleLengthTypedList类型(结构体)表示,它们同样实现了IReadOnlyBufferedObject接口。ReadOnlyFixedLengthTypedList采用如下的字节布局:集合元素数量(4字节整数)+所有元素的字节内容(下图-上)。对于ReadOnlyVaraibleLengthTypedList类型,我们会在前面为每个元素添加一个索引(4字节的整数),该索引指向目标元素在整个缓冲区的偏移量(下图-下)。

    image

    以如下所示的Entity为例,它具有两个数组类型的属性成员Collection1和Collection2,数组元素类型分别为Foobar和double,它们分别代表了上述的两种集合类型。

    复制代码
    [BufferedMessageSource]
    public partial class Entity
    {
        public Foobar[] Collection1 { get; }
        public double[] Collection2 { get; }
        public Entity(Foobar[] collection1, double[] collection2)
        {
            Collection1 = collection1;
            Collection2 = collection2;
        }
    }
    
    [BufferedMessageSource]
    public partial class Foobar
    {
        public int Foo { get; }
        public string Bar { get; }
        public Foobar(int foo, string bar)
        {
            Foo = foo;
            Bar = bar;
        }
    }
    复制代码

    NativeBuffering.Generator会将作为“源类型”的Entity和Foobar类型的生成对应的BufferedMessage类型(EntityBufferredMessage和FoobarBufferedMessage)。从EntityBufferredMessage类型的定义可以看出,两个集合属性的分别是ReadOnlyVariableLengthTypeList和ReadOnlyFixedLengthTypedList

    复制代码
    public unsafe readonly struct EntityBufferedMessage : IReadOnlyBufferedObject
    {
        public NativeBuffer Buffer { get; }
        public EntityBufferedMessage(NativeBuffer buffer) => Buffer = buffer;
        public static EntityBufferedMessage Parse(NativeBuffer buffer) => new EntityBufferedMessage(buffer);
        public ReadOnlyVariableLengthTypeList Collection1 => Buffer.ReadBufferedObjectCollectionField(0);
        public ReadOnlyFixedLengthTypedList Collection2 => Buffer.ReadUnmanagedCollectionField(1);
    }
    
    public unsafe readonly struct FoobarBufferedMessage : IReadOnlyBufferedObject
    {
        public NativeBuffer Buffer { get; }
        public FoobarBufferedMessage(NativeBuffer buffer) => Buffer = buffer;
        public static FoobarBufferedMessage Parse(NativeBuffer buffer) => new FoobarBufferedMessage(buffer);
        public System.Int32 Foo => Buffer.ReadUnmanagedField(0);
        public BufferedString Bar => Buffer.ReadBufferedObjectField(1);
    }
    复制代码

    两个集合类型都实现了IEnumerable接口,还提供了索引。下面的代码演示了以索引的形式提取集合元素(源代码从这里下载)。

    复制代码
    using NativeBuffering;
    using System.Diagnostics;
    
    var entity = new Entity(
        collection1: new Foobar[] { new Foobar(1, "foo"), new Foobar(2, "bar") },
        collection2: new double[] { 1.1, 2.2 });
    var bytes = new byte[entity.CalculateSize()];
    var context = new BufferedObjectWriteContext(bytes);
    entity.Write(context);
    
    unsafe
    {
        fixed (byte* p = bytes)
        {
            var entityMessage = EntityBufferedMessage.Parse(new NativeBuffer(bytes));
            var foobar = entityMessage.Collection1[0];
            Debug.Assert(foobar.Foo == 1);
            Debug.Assert(foobar.Bar == "foo");
    
            foobar = entityMessage.Collection1[1];
            Debug.Assert(foobar.Foo == 2);
            Debug.Assert(foobar.Bar == "bar");
    
            Debug.Assert(entityMessage.Collection2[0] == 1.1);
            Debug.Assert(entityMessage.Collection2[1] == 2.2);
        }
    }
    复制代码

    三、字典

    从数据的存储来看,字典就是键值对的集合,所以我们采用与集合一致的存储形式。NativeBuffering对集合的Key作了限制,要求其类型只能是Unmanaged字符串(String/BufferredString)。按照Key和Value的类型组合,我们一共定义了四种类型的字典类型,它们分别是:

    • ReadOnlyUnmanagedUnmanagedDictionary:Key=Unmanaged; Value = Unmanaged
    • ReadOnlyUnmanagedBufferedObjectDictionary:Key=Unmanaged; Value = IReadOnlyBufferedObject
    • ReadOnlyStringUnmanagedDictionary:Key=String/BufferredString; Value = Unmanaged
    • ReadOnlyStringBufferedObjectDictionary:Key=String/BufferredString; Value = IReadOnlyBufferedObject

    如果Key和Value的类型都是Unmanaged,键值对就是定长类型,所以我们会采用类似于ReadOnlyFixedLengthTypedList的字节布局方式(下图-上),至于其他三种字典类型,则采用类似于ReadOnlyVaraibleLengthTypedList的字节布局形式(下图-下)。

    image

    但是这仅仅解决了字段数据存储的问题,字典基于哈希检索定位的功能是没有办法实现的。这里我们不得不作出妥协,四种字典的索引均不能提供时间复杂度O(1)的哈希检索方式。为了在现有的数据结构上使针对Key的查找尽可能高效,在生成字节内容之前,我们会按照Key对键值对进行排序,这样我们至少可以采用二分法的形式进行检索,所以四种类型的字典的索引在根据指定的Key查找对应Value,对应的时间复杂度为Log(N)。如果字典包含的元素比较多,这样的查找方式不能满足我们的需求,我们可以I将它们转换成普通的Dictionary类型,但是这就没法避免内存分配了。

    我们照例编写一个简答的程序来演示针对字典的使用。我们定义了如下这个Entity作为“源类型”,它的四个属性对应的字典类型刚好对应上述四种键值对的组合。从生成的EntityBufferedMessage类型可以看出,四个成员的类型正好对应上述的四种字典类型。

    复制代码
    [BufferedMessageSource]
    public partial class Entity
    {
        public Dictionary<int, long> Dictionary1 { get; set; }
        public Dictionary<int, string> Dictionary2 { get; set; }
        public Dictionary<string, long> Dictionary3 { get; set; }
        public Dictionary<string, string> Dictionary4 { get; set; }
    }
    
    public unsafe readonly struct EntityBufferedMessage : IReadOnlyBufferedObject
    {
        public NativeBuffer Buffer { get; }
        public EntityBufferedMessage(NativeBuffer buffer) => Buffer = buffer;
        public static EntityBufferedMessage Parse(NativeBuffer buffer) => new EntityBufferedMessage(buffer);
        public ReadOnlyUnmanagedUnmanagedDictionary Dictionary1 => Buffer.ReadUnmanagedUnmanagedDictionaryField(0);
        public ReadOnlyUnmanagedBufferedObjectDictionary Dictionary2 => Buffer.ReadUnmanagedBufferedObjectDictionaryField(1);
        public ReadOnlyStringUnmanagedDictionary Dictionary3 => Buffer.ReadStringUnmanagedDictionaryField(2);
        public ReadOnlyStringBufferedObjectDictionary Dictionary4 => Buffer.ReadStringBufferedObjectDictionaryField(3);
    }
    复制代码

    如下的代码演示了基于四种字典类型基于“索引”的检索方式(源代码从这里下载)。

    复制代码
    using NativeBuffering;
    using System.Diagnostics;
    
    var entity = new Entity
    {
        Dictionary1 = new Dictionary<int, long> { { 1, 1 }, { 2, 2 }, { 3, 3 } },
        Dictionary2 = new Dictionary<int, string> { { 1, "foo" }, { 2, "bar" }, { 3, "baz" } },
        Dictionary3 = new Dictionary<string, long> { { "foo", 1 }, { "bar", 2 }, { "baz", 3 } },
        Dictionary4 = new Dictionary<string, string> { { "a", "foo" }, { "b", "bar" }, { "c", "baz" } }
    };
    
    var bytes = new byte[entity.CalculateSize()];
    var context = new BufferedObjectWriteContext(bytes);
    entity.Write(context);
    unsafe
    {
        fixed (void* _ = bytes)
        {
            var bufferedMessage = EntityBufferedMessage.Parse(new NativeBuffer(bytes));
    
            ref var value1 = ref bufferedMessage.Dictionary1.AsRef(1);
            Debug.Assert(value1 == 1);
            ref var value2 = ref bufferedMessage.Dictionary3.AsRef("baz");
            Debug.Assert(value2 == 3);
    
            var dictionary1 = bufferedMessage.Dictionary1;
            Debug.Assert(dictionary1[1] == 1);
            Debug.Assert(dictionary1[2] == 2);
            Debug.Assert(dictionary1[3] == 3);
    
            var dictionary2 = bufferedMessage.Dictionary2;
            Debug.Assert(dictionary2[1] == "foo");
            Debug.Assert(dictionary2[2] == "bar");
            Debug.Assert(dictionary2[3] == "baz");
    
            var dictionary3 = bufferedMessage.Dictionary3;
            Debug.Assert(dictionary3["foo"] == 1);
            Debug.Assert(dictionary3["bar"] == 2);
            Debug.Assert(dictionary3["baz"] == 3);
    
            var dictionary4 = bufferedMessage.Dictionary4;
            Debug.Assert(dictionary4["a"] == "foo");
            Debug.Assert(dictionary4["b"] == "bar");
            Debug.Assert(dictionary4["c"] == "baz");
        }
    }
    复制代码

    四、为什么不直接返回接口

    针对集合,NativeBuffering提供了两种类型;针对字典,更是定义了四种类型,为什么不直接返回IList/IDictionary(或者IReadOnlyList/IReadOnlyDictionary)接口呢?这主要有两个原因,第一:为了尽可能地减少内存占用,我们将四种字典类型都定义成了结构体,如果使用接口的话会导致装箱;第二,四种字典类型的提供的API是有差异的,比如ReadOnlyFixedLengthTypedList 和ReadOnlyUnmanagedUnmanagedDictionary都提供了一个额外的AsRef方法,它直接返回值的引用(只读)。如果这个值被定义成一个成员较多的结构体,传引用的方式可以避免较多的拷贝。

    复制代码
    public readonly unsafe struct ReadOnlyFixedLengthTypedList : IReadOnlyList, IReadOnlyBufferedObject>
        where T: unmanaged
    {
        public readonly ref T AsRef(int index);
        ...
    }
    
    public unsafe readonly struct ReadOnlyUnmanagedUnmanagedDictionary : IReadOnlyDictionary, IReadOnlyBufferedObject>
        where TKey : unmanaged, IComparable
        where TValue : unmanaged
    {
        public readonly ref TValue AsRef(TKey index) ;
        ...
    }
    复制代码
  • 相关阅读:
    数据持久化技术——MP
    汽车UDS诊断之读取DTC(19h 02h)子功能深度剖析
    程序设计实验第二周
    Java 请求华为云认证接口API
    期末复习【微机原理】
    linux中busybox与文件系统的关系
    git基础命令(二)
    【力扣】2578. 最小和分割
    NX/UG二次开发—UF_MODL_ask_bounding_box_exact浅析
    YOLOv8-Seg改进:渐近特征金字塔网络(AFPN)
  • 原文地址:https://www.cnblogs.com/artech/p/17587660.html