• How string concatenation works in C#



    I. The phenomenon: string does not overload operator +, yet strings can be added

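    [Screenshot: sample code and its output]
    In the result of the code above, adding a string and an int produces a normal result, yet the string.cs source contains no overload of operator + for string. So what actually happens? Opening the assembly with ildasm reveals the key IL:
    [Screenshot: IL code]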

    The IL first calls Int32's ToString method and then string's Concat method. This shows that the compiler rewrites the + expression at compile time. Here is the Concat method:

    //code from string.cs
    public static String Concat(Object arg0, Object arg1) {
        Contract.Ensures(Contract.Result<String>() != null);
        Contract.EndContractBlock();
    
        if (arg0 == null)
        {
            arg0 = String.Empty;
        }
    
        if (arg1==null) {
            arg1 = String.Empty;
        }
        return Concat(arg0.ToString(), arg1.ToString());
    }
    [System.Security.SecuritySafeCritical]  // auto-generated
    public static String Concat(String str0, String str1) {
        Contract.Ensures(Contract.Result<String>() != null);
        Contract.Ensures(Contract.Result<String>().Length ==
            (str0 == null ? 0 : str0.Length) +
            (str1 == null ? 0 : str1.Length));
        Contract.EndContractBlock();
    
        if (IsNullOrEmpty(str0)) {
            if (IsNullOrEmpty(str1)) {
                return String.Empty;
            }
            return str1;
        }
    
        if (IsNullOrEmpty(str1)) {
            return str0;
        }
    
        int str0Length = str0.Length;
        
        String result = FastAllocateString(str0Length + str1.Length);
        
        FillStringChecked(result, 0,        str0);
        FillStringChecked(result, str0Length, str1);
        
        return result;
    }
    

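    As shown, Concat(Object, Object) takes two Object arguments, calls ToString on each, and passes the results to the two-string overload below it. That overload calls FastAllocateString to create a new String and then copies the contents of both source strings into it. What about adding three strings? Let's try:
    [Screenshots: three-string concatenation test and its IL]
    Both cases add three strings, but the second one calls the two-argument string.Concat three times and therefore creates three strings. Chaining the additions in a single expression results in just one call to the three-argument string.Concat and a single allocation. Strings can be concatenated in other ways too, so let's look at those next.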
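
    The difference described above can be sketched as follows. This is only an illustration: ConcatDemo and its method names are made up for this example, and the exact IL depends on the compiler version.

    ```csharp
    using System;

    public static class ConcatDemo
    {
        // One expression: the compiler can emit a single string.Concat(a, b, c) call.
        public static string Chained(string a, string b, string c)
        {
            return a + b + c;
        }

        // Separate statements: each += becomes its own two-argument Concat call,
        // and the intermediate results are thrown away.
        public static string Stepwise(string a, string b, string c)
        {
            string s = "";
            s += a;
            s += b;
            s += c;
            return s;
        }

        public static void Main()
        {
            Console.WriteLine(Chained("x", "y", "z"));
            Console.WriteLine(Stepwise("x", "y", "z"));
            Console.WriteLine(string.Concat("x", "y", "z"));
        }
    }
    ```

    All three calls produce the same text; only the number of Concat calls and temporary strings differs.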

    II. string.Format

    The source first:

    public static String Format(String format, Object arg0) {
         if (format == null)
             throw new ArgumentNullException("format");
         Contract.Ensures(Contract.Result<String>() != null);
         Contract.EndContractBlock();
         return Format(null, format, new Object[] {arg0});
     }
    
     public static String Format(String format, Object arg0, Object arg1) {
         if (format == null)
             throw new ArgumentNullException("format");
         Contract.Ensures(Contract.Result<String>() != null);
         Contract.EndContractBlock();
         return Format(null, format, new Object[] {arg0, arg1});
     }
    
     public static String Format(String format, Object arg0, Object arg1, Object arg2) {
         if (format == null)
             throw new ArgumentNullException("format");
         Contract.Ensures(Contract.Result<String>() != null);
         Contract.EndContractBlock();
    
         return Format(null, format, new Object[] {arg0, arg1, arg2});
     }
    
    
     public static String Format(String format, params Object[] args) {
         if (format == null || args == null)
             throw new ArgumentNullException((format == null) ? "format" : "args");
         Contract.Ensures(Contract.Result<String>() != null);
         Contract.EndContractBlock();
    
         return Format(null, format, args);
     }
    
     public static String Format( IFormatProvider provider, String format, params Object[] args) {
         if (format == null || args == null)
             throw new ArgumentNullException((format == null) ? "format" : "args");
         Contract.Ensures(Contract.Result<String>() != null);
         Contract.EndContractBlock();
    
         StringBuilder sb = StringBuilderCache.Acquire(format.Length + args.Length * 8);
         sb.AppendFormat(provider,format,args);
         return StringBuilderCache.GetStringAndRelease(sb);
     }
    

    The code above shows that every String.Format overload ends up calling:
    String Format( IFormatProvider provider, String format, params Object[] args)
    This method first calls StringBuilderCache.Acquire, then appends the formatted arguments with AppendFormat, and finally calls StringBuilderCache.GetStringAndRelease(sb).
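
    As a rough illustration of that flow, the sketch below builds the same text once with string.Format and once with a plain StringBuilder and AppendFormat. FormatDemo and its method names are invented for this example, and the builder here is not cached, unlike the one the framework uses internally.

    ```csharp
    using System;
    using System.Text;

    public static class FormatDemo
    {
        public static string ViaFormat(string name, int count)
        {
            return string.Format("{0} has {1} items", name, count);
        }

        // Roughly what Format does internally, minus the cached builder.
        public static string ViaBuilder(string name, int count)
        {
            var sb = new StringBuilder();
            sb.AppendFormat("{0} has {1} items", name, count);
            return sb.ToString();
        }

        public static void Main()
        {
            Console.WriteLine(ViaFormat("cart", 3));
            Console.WriteLine(ViaBuilder("cart", 3));
        }
    }
    ```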

    Now let's look at these two StringBuilderCache methods:

    //stringbuildercache.cs
    namespace System.Text
    {
        internal static class StringBuilderCache
        {
            // The value 360 was chosen in discussion with performance experts as a compromise between using
            // as litle memory (per thread) as possible and still covering a large part of short-lived
            // StringBuilder creations on the startup path of VS designers.
            private const int MAX_BUILDER_SIZE = 360;
    
            [ThreadStatic]
            private static StringBuilder CachedInstance;
    
            public static StringBuilder Acquire(int capacity = StringBuilder.DefaultCapacity)
            {
                if(capacity <= MAX_BUILDER_SIZE)
                {
                    StringBuilder sb = StringBuilderCache.CachedInstance;
                    if (sb != null)
                    {
                        // Avoid stringbuilder block fragmentation by getting a new StringBuilder
                        // when the requested size is larger than the current capacity
                        if(capacity <= sb.Capacity)
                        {
                            StringBuilderCache.CachedInstance = null;
                            sb.Clear();
                            return sb;
                        }
                    }
                }
                return new StringBuilder(capacity);
            }
    
            public static void Release(StringBuilder sb)
            {
                if (sb.Capacity <= MAX_BUILDER_SIZE)
                {
                    StringBuilderCache.CachedInstance = sb;
                }
            }
    
            public static string GetStringAndRelease(StringBuilder sb)
            {
                string result = sb.ToString();
                Release(sb);
                return result;
            }
        }
    }
    

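    The code above shows that StringBuilderCache keeps one cached StringBuilder instance per thread for builders whose capacity is 360 or less. Note that the class is internal, so only framework code such as String.Format and String.Join benefits from it; a StringBuilder you create yourself with new does not go through this cache. For one-off formatting through those APIs there is therefore no need to cache a builder yourself. How StringBuilder itself works is covered in the next section.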
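
    Because the real class is internal, the pattern can only be tried out through a copy. The following is a minimal sketch of the same idea; MyBuilderCache and CacheDemo are hypothetical names, and the default capacity of 16 is an assumption standing in for StringBuilder.DefaultCapacity.

    ```csharp
    using System;
    using System.Text;

    // Hypothetical user-land copy of the pattern; not the framework class.
    public static class MyBuilderCache
    {
        private const int MaxBuilderSize = 360;

        [ThreadStatic]
        private static StringBuilder cached;

        public static StringBuilder Acquire(int capacity = 16)
        {
            if (capacity <= MaxBuilderSize)
            {
                StringBuilder sb = cached;
                if (sb != null && capacity <= sb.Capacity)
                {
                    cached = null;   // hand the instance out exclusively
                    sb.Clear();
                    return sb;
                }
            }
            return new StringBuilder(capacity);
        }

        public static string GetStringAndRelease(StringBuilder sb)
        {
            string result = sb.ToString();
            if (sb.Capacity <= MaxBuilderSize)
            {
                cached = sb;         // keep small builders for the next caller on this thread
            }
            return result;
        }
    }

    public static class CacheDemo
    {
        public static void Main()
        {
            StringBuilder first = MyBuilderCache.Acquire();
            first.Append("hello");
            Console.WriteLine(MyBuilderCache.GetStringAndRelease(first));

            StringBuilder second = MyBuilderCache.Acquire();
            Console.WriteLine(ReferenceEquals(first, second)); // the same instance is reused
        }
    }
    ```

    The [ThreadStatic] field is what makes this safe without locks: each thread sees its own cached instance.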

    III. StringBuilder

    As described above, String.Format calls StringBuilder's AppendFormat. Its source is rather long, so here is only a brief outline:

    internal char[] m_ChunkChars;                // The characters in this block
    internal int m_ChunkLength;                  // The index in m_ChunkChars that represent the end of the block
    // Only the declaration is shown here; the body is omitted
    public StringBuilder AppendFormat(IFormatProvider provider, String format, params Object[] args)
    
    // Appends a character at the end of this string builder. The capacity is adjusted as needed.
    public StringBuilder Append(char value, int repeatCount) {
        if (repeatCount<0) {
            throw new ArgumentOutOfRangeException("repeatCount", Environment.GetResourceString("ArgumentOutOfRange_NegativeCount"));
        }
        Contract.Ensures(Contract.Result<StringBuilder>() != null);
        Contract.EndContractBlock();
    
        if (repeatCount==0) {
            return this;
        }
        int idx = m_ChunkLength;
        while (repeatCount > 0)
        {
            if (idx < m_ChunkChars.Length)
            {
                m_ChunkChars[idx++] = value;
                --repeatCount;
            }
            else
            {
                m_ChunkLength = idx;
                ExpandByABlock(repeatCount);
                Contract.Assert(m_ChunkLength == 0, "Expand should create a new block");
                idx = 0;
            }
        }
        m_ChunkLength = idx;
        VerifyClassInvariant();
        return this;
    }
    

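    The two fields above hold the characters stored in the current StringBuilder chunk. AppendFormat parses the {n} placeholders in format and writes the corresponding args into m_ChunkChars. When m_ChunkChars has no room left (m_ChunkLength has reached m_ChunkChars.Length), ExpandByABlock is called: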

    /// 
    /// Assumes that 'this' is the last chunk in the list and that it is full.  Upon return the 'this'
    /// block is updated so that it is a new block that has at least 'minBlockCharCount' characters.
    /// that can be used to copy characters into it.   
    /// 
    private void ExpandByABlock(int minBlockCharCount)
    {
        Contract.Requires(Capacity == Length, "Expand expect to be called only when there is no space left");        // We are currently full
        Contract.Requires(minBlockCharCount > 0, "Expansion request must be positive");
    
        VerifyClassInvariant();
    
        if ((minBlockCharCount + Length) > m_MaxCapacity)
            throw new ArgumentOutOfRangeException("requiredLength", Environment.GetResourceString("ArgumentOutOfRange_SmallCapacity"));
    
        // Compute the length of the new block we need 
        // We make the new chunk at least big enough for the current need (minBlockCharCount)
        // But also as big as the current length (thus doubling capacity), up to a maximum
        // (so we stay in the small object heap, and never allocate really big chunks even if
        // the string gets really big. 
        int newBlockLength = Math.Max(minBlockCharCount, Math.Min(Length, MaxChunkSize));
    
        // Copy the current block to the new block, and initialize this to point at the new buffer. 
        m_ChunkPrevious = new StringBuilder(this);
        m_ChunkOffset += m_ChunkLength;
        m_ChunkLength = 0;
    
        // Check for integer overflow (logical buffer size > int.MaxInt)
        if (m_ChunkOffset + newBlockLength < newBlockLength)
        {
            m_ChunkChars = null;
            throw new OutOfMemoryException();
        }
        m_ChunkChars = new char[newBlockLength];
    
        VerifyClassInvariant();
    }
    

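    After computing newBlockLength, the method allocates a fresh buffer with m_ChunkChars = new char[newBlockLength]; the previous chunk stays reachable through m_ChunkPrevious.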
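
    To make the sizing rule concrete, here is the same formula as a standalone function with a few worked values. ChunkMath is a made-up name, and the value 8000 for MaxChunkSize is an assumption about the .NET Framework reference source, not something shown in the excerpt above.

    ```csharp
    using System;

    public static class ChunkMath
    {
        // Assumed value of StringBuilder.MaxChunkSize in the reference source.
        public const int MaxChunkSize = 8000;

        // Same expression as in ExpandByABlock: at least what is needed now,
        // otherwise as big as the current length, capped at MaxChunkSize.
        public static int NewBlockLength(int minBlockCharCount, int currentLength)
        {
            return Math.Max(minBlockCharCount, Math.Min(currentLength, MaxChunkSize));
        }

        public static void Main()
        {
            Console.WriteLine(NewBlockLength(1, 16));      // 16: capacity doubles
            Console.WriteLine(NewBlockLength(1, 20000));   // 8000: growth is capped
            Console.WriteLine(NewBlockLength(10000, 16));  // 10000: the request wins
        }
    }
    ```
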
    Finally, ToString() is usually called:

    [System.Security.SecuritySafeCritical]  // auto-generated
      public override String ToString() {
          Contract.Ensures(Contract.Result<String>() != null);
    
          VerifyClassInvariant();
          
          if (Length == 0)
              return String.Empty;
    
          string ret = string.FastAllocateString(Length);
          StringBuilder chunk = this;
          unsafe {
              fixed (char* destinationPtr = ret)
              {
                  do
                  {
                      if (chunk.m_ChunkLength > 0)
                      {
                          // Copy these into local variables so that they are stable even in the presence of ----s (hackers might do this)
                          char[] sourceArray = chunk.m_ChunkChars;
                          int chunkOffset = chunk.m_ChunkOffset;
                          int chunkLength = chunk.m_ChunkLength;
    
                          // Check that we will not overrun our boundaries. 
                          if ((uint)(chunkLength + chunkOffset) <= ret.Length && (uint)chunkLength <= (uint)sourceArray.Length)
                          {
                              fixed (char* sourcePtr = sourceArray)
                                  string.wstrcpy(destinationPtr + chunkOffset, sourcePtr, chunkLength);
                          }
                          else
                          {
                              throw new ArgumentOutOfRangeException("chunkLength", Environment.GetResourceString("ArgumentOutOfRange_Index"));
                          }
                      }
                      chunk = chunk.m_ChunkPrevious;
                  } while (chunk != null);
              }
          }
          return ret;
      }
    

    Here too a new string object is created, and then string.wstrcpy, inside an unsafe block, copies the data of each chunk's m_ChunkChars into the new string.

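    So when using StringBuilder, it is best to work out the required capacity up front; otherwise frequent expansion allocates extra chunks and adds GC pressure. Each call to ToString also allocates a new string.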
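
    One way to observe this is to compare the capacity of a default builder with a pre-sized one after the same number of appends. This is a small sketch; CapacityDemo and FinalCapacity are names invented here, and the exact capacity a default builder ends up with depends on the runtime.

    ```csharp
    using System;
    using System.Text;

    public static class CapacityDemo
    {
        // Appends `count` characters one at a time and returns the final capacity.
        public static int FinalCapacity(StringBuilder sb, int count)
        {
            for (int i = 0; i < count; i++)
            {
                sb.Append('x');
            }
            return sb.Capacity;
        }

        public static void Main()
        {
            // Default capacity: the builder has to grow several times on the way.
            Console.WriteLine(FinalCapacity(new StringBuilder(), 1000));

            // Pre-sized: the initial buffer is enough, so no expansion happens.
            Console.WriteLine(FinalCapacity(new StringBuilder(1000), 1000));
        }
    }
    ```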

    IV. String.Join

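    First, a usage scenario:
    [Screenshot: String.Join usage example]
    As shown, String.Join conveniently combines the elements of a collection into one string, with a separator between them.
    Now let's look at how it is implemented: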

    // Joins an array of strings together as one string with a separator between each original string.
    public static String Join(String separator, params String[] value) {
        if (value==null)
            throw new ArgumentNullException("value");
        Contract.EndContractBlock();
        return Join(separator, value, 0, value.Length);
    }
    
    [System.Security.SecuritySafeCritical]  // auto-generated
    public unsafe static String Join(String separator, String[] value, int startIndex, int count) {
        // Excerpt: the code that computes jointLength (and endIndex) is omitted
        string jointString = FastAllocateString( jointLength );
        fixed (char * pointerToJointString = &jointString.m_firstChar) {
            UnSafeCharBuffer charBuffer = new UnSafeCharBuffer( pointerToJointString, jointLength);                
            
            // Append the first string first and then append each following string prefixed by the separator.
            charBuffer.AppendString( value[startIndex] );
            for (int stringToJoinIndex = startIndex + 1; stringToJoinIndex <= endIndex; stringToJoinIndex++) {
                charBuffer.AppendString( separator );
                charBuffer.AppendString( value[stringToJoinIndex] );
            }
            Contract.Assert(*(pointerToJointString + charBuffer.Length) == '\0', "String must be null-terminated!");
        }
        return jointString;
    }
    

    As shown, joining a String array uses unsafe code: it works on an UnSafeCharBuffer and uses pointer arithmetic to append each string into the pre-allocated jointString.

    [ComVisible(false)]
    public static String Join(String separator, params Object[] values) {
        if (values==null)
            throw new ArgumentNullException("values");
        Contract.EndContractBlock();
    
        if (values.Length == 0 || values[0] == null)
            return String.Empty;
    
        if (separator == null)
            separator = String.Empty;
    
        StringBuilder result = StringBuilderCache.Acquire();
    
        String value = values[0].ToString();           
        if (value != null)
            result.Append(value);
    
        for (int i = 1; i < values.Length; i++) {
            result.Append(separator);
            if (values[i] != null) {
                // handle the case where their ToString() override is broken
                value = values[i].ToString();
                if (value != null)
                    result.Append(value);
            }
        }
        return StringBuilderCache.GetStringAndRelease(result);
    }
    
    [ComVisible(false)]
    public static String Join<T>(String separator, IEnumerable<T> values) {
        if (values == null)
            throw new ArgumentNullException("values");
        Contract.Ensures(Contract.Result<String>() != null);
        Contract.EndContractBlock();
    
        if (separator == null)
            separator = String.Empty;
    
        using(IEnumerator<T> en = values.GetEnumerator()) {
            if (!en.MoveNext())
                return String.Empty;
    
            StringBuilder result = StringBuilderCache.Acquire();
            if (en.Current != null) {
                // handle the case that the enumeration has null entries
                // and the case where their ToString() override is broken
                string value = en.Current.ToString();
                if (value != null)
                    result.Append(value);
            }
    
            while (en.MoveNext()) {
                result.Append(separator);
                if (en.Current != null) {
                    // handle the case that the enumeration has null entries
                    // and the case where their ToString() override is broken
                    string value = en.Current.ToString();
                    if (value != null)
                        result.Append(value);
                }
            }            
            return StringBuilderCache.GetStringAndRelease(result);
        }
    }
    
    [ComVisible(false)]
    public static String Join(String separator, IEnumerable<String> values) {
        if (values == null)
            throw new ArgumentNullException("values");
        Contract.Ensures(Contract.Result<String>() != null);
        Contract.EndContractBlock();
    
        if (separator == null)
            separator = String.Empty;
    
    
        using(IEnumerator<String> en = values.GetEnumerator()) {
            if (!en.MoveNext())
                return String.Empty;
    
            StringBuilder result = StringBuilderCache.Acquire();
            if (en.Current != null) {
                result.Append(en.Current);
            }
    
            while (en.MoveNext()) {
                result.Append(separator);
                if (en.Current != null) {
                    result.Append(en.Current);
                }
            }            
            return StringBuilderCache.GetStringAndRelease(result);
        }           
    }
    

    These three methods:
    public static String Join(String separator, params Object[] values)
    public static String Join<T>(String separator, IEnumerable<T> values)
    public static String Join(String separator, IEnumerable<String> values)

    all combine the strings through a StringBuilder.
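
    For reference, here is a short usage sketch that exercises the string array overload and two of the StringBuilder-based overloads. JoinDemo and its method names are made up for this example; the sample values are arbitrary.

    ```csharp
    using System;
    using System.Collections.Generic;

    public static class JoinDemo
    {
        // String[] overload: the unsafe, pre-allocated path.
        public static string FromStrings()
        {
            string[] words = { "a", "b", "c" };
            return string.Join(", ", words);
        }

        // Object[] overload: each element is converted with ToString().
        public static string FromObjects()
        {
            object[] mixed = { 1, 'b', "c" };
            return string.Join("-", mixed);
        }

        // IEnumerable<T> overload.
        public static string FromList()
        {
            var numbers = new List<int> { 1, 2, 3 };
            return string.Join("|", numbers);
        }

        public static void Main()
        {
            Console.WriteLine(FromStrings()); // a, b, c
            Console.WriteLine(FromObjects()); // 1-b-c
            Console.WriteLine(FromList());    // 1|2|3
        }
    }
    ```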

    V. String interpolation

    string userName = "";
    string date = DateTime.Today.ToShortDateString();
    
    // Use string interpolation to concatenate strings.
    string str = $"Hello {userName}. Today is {date}.";
    System.Console.WriteLine(str);
    
    str = $"{str} How are you today?";
    System.Console.WriteLine(str);
    

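    Starting with C# 10, string interpolation can be used to initialize a constant string when all the expressions used for placeholders are also constant strings. In some expressions, interpolation makes concatenation simpler. So what does interpolation call at the IL level?
    [Screenshot: IL of the interpolated string]
    As shown, for this example, where every placeholder is a plain string, interpolation compiles to a call to string.Concat. When placeholders hold non-string values or format specifiers, the compiler generally emits string.Format instead (or, with C# 10 on .NET 6 and later, an interpolated string handler).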
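
    The equivalence can be checked at the result level with a small sketch like the one below. InterpolationDemo and its method names are invented for this example; the code compares outputs only and does not prove which call the compiler emits.

    ```csharp
    using System;

    public static class InterpolationDemo
    {
        // Only string placeholders: the case the IL above corresponds to.
        public static string Greeting(string name, string date)
        {
            return $"Hello {name}. Today is {date}.";
        }

        // Hand-written Concat that produces the same text.
        public static string GreetingConcat(string name, string date)
        {
            return string.Concat("Hello ", name, ". Today is ", date, ".");
        }

        // A format specifier needs formatting logic, not just concatenation.
        public static string Padded(int n)
        {
            return $"{n:D3}";
        }

        public static void Main()
        {
            Console.WriteLine(Greeting("Ann", "today"));
            Console.WriteLine(GreetingConcat("Ann", "today"));
            Console.WriteLine(Padded(7));                  // 007
            Console.WriteLine(string.Format("{0:D3}", 7)); // 007
        }
    }
    ```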

    Summary

    This article covered five ways to concatenate strings and how each one works, so the right choice depends on the scenario. Recommendations:

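    1. Concat, the + operator, and string interpolation (with plain string placeholders) all end up calling string.Concat. Each call creates a new string and copies the contents, so these approaches are a poor fit for loops or large numbers of concatenations.
    2. StringBuilder and string.Format both rely on StringBuilder underneath. Watch out for expansion, and note that the framework keeps a cached instance for capacities of 360 or less.
    3. For arrays, lists, and other collections that need a separator, use string.Join.
    4. The Console.WriteLine overloads that take a format string and arguments also end up calling String.Format.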
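
    As a closing illustration of the first two recommendations, the sketch below builds the same text in a loop with += and with a StringBuilder. LoopDemo and its method names are hypothetical, and the n * 4 capacity is only a rough guess for short numbers, not a rule.

    ```csharp
    using System;
    using System.Text;

    public static class LoopDemo
    {
        // Every iteration creates a new, longer string and copies the old one into it.
        public static string WithPlus(int n)
        {
            string s = "";
            for (int i = 0; i < n; i++)
            {
                s += i + ",";
            }
            return s;
        }

        // One builder, sized up front, and one final string from ToString().
        public static string WithBuilder(int n)
        {
            var sb = new StringBuilder(n * 4);
            for (int i = 0; i < n; i++)
            {
                sb.Append(i).Append(',');
            }
            return sb.ToString();
        }

        public static void Main()
        {
            Console.WriteLine(WithPlus(5));
            Console.WriteLine(WithBuilder(5));
        }
    }
    ```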

    References

    Source download: Download .NET Framework 4.5.1

  • Original article: https://blog.csdn.net/qq_17347313/article/details/126843672