• JMH: Analyzing the Concurrency Packages


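    Source: Java 高并发编程详解:深入理解并发核心库 (Java High-Concurrency Programming in Depth: Understanding the Core Concurrency Library), by 汪文君 (Wang Wenjun)

    Goal: reproduce the book's experiments that use JMH, a micro-benchmarking tool, to benchmark common concurrency utilities and collection classes, and get a first look at concurrency performance issues

    Contents:

    ① Getting to know JMH: benchmarking ArrayList against LinkedList

    ② Benchmarks of common concurrency utilities

    1. JMH

    JMH (Java Microbenchmark Harness) is a toolkit built specifically for micro-benchmarking code

    1.1 Benchmarking ArrayList and LinkedList with JMH

    ① Create a Maven project

    ② Add the JMH dependencies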

    <dependency>
        <groupId>org.openjdk.jmh</groupId>
        <artifactId>jmh-core</artifactId>
        <version>1.19</version>
    </dependency>
    <dependency>
        <groupId>org.openjdk.jmh</groupId>
        <artifactId>jmh-generator-annprocess</artifactId>
        <version>1.19</version>
        <scope>provided</scope>
    </dependency>
    

    ③ Test code

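    import java.util.ArrayList;
    import java.util.LinkedList;
    import java.util.List;
    import java.util.concurrent.TimeUnit;

    import org.openjdk.jmh.annotations.*;
    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.RunnerException;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;

    @BenchmarkMode(Mode.AverageTime)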
    @OutputTimeUnit(TimeUnit.MICROSECONDS)
    @State(Scope.Thread)
    public class JMHExample01 {
    
        private final static String DATA = "DUMMY DATA";
    
        private List<String> arrayList;
    
        private List<String> linkedList;
    
        @Setup(Level.Iteration)
        public void setUp(){
            this.arrayList = new ArrayList<>();
            this.linkedList = new LinkedList<>();
        }
        @Benchmark
        public List<String> arrayListAdd(){
            this.arrayList.add(DATA);
            return arrayList;
        }
        @Benchmark
        public List<String> linkedListAdd(){
            this.linkedList.add(DATA);
            return linkedList;
        }
        public static void main(String[] args) throws RunnerException {
           final Options opts = new OptionsBuilder().include(JMHExample01.class.getSimpleName())
               .forks(1)
               .measurementIterations(10)
               .warmupIterations(10)
               .build();
           new Runner(opts).run();
        }
    }
    

    ④ Test results

    ...
    # Run complete. Total time: 00:01:06
    
    Benchmark                              Mode  Cnt  Score   Error  Units
    MemoryTest.JMHExample01.arrayListAdd   avgt   10  0.019 ± 0.012  us/op
    MemoryTest.JMHExample01.linkedListAdd  avgt   10  0.130 ± 0.092  us/op
    

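    Conclusions:

    ArrayList.add() averages 0.019 us per call, with an error of ±0.012 us

    LinkedList.add() averages 0.130 us per call, with an error of ±0.092 us

    1.2 Parameters explained

    1.2.1 @Benchmark

    JMH uses @Benchmark to mark benchmark methods. Unlike ordinary methods, the marked methods are the ones JMH actually benchmarks

    @Benchmark is to JMH what @Test is to JUnit 4.x

    In the code above, @Benchmark is applied to the two benchmark methods arrayListAdd() and linkedListAdd()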

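    1.2.2 Warmup and Measurement

    Both Warmup and Measurement run the benchmark method in batches (iterations)

    Warmup: before the real measurement starts, the benchmark code is warmed up so that it has already gone through early class optimizations, JVM runtime compilation, and JIT optimization, and runs in its final state

    Measurement: the measured runs; all data collected in each measurement iteration goes into the statistics

    In this example, the number of Warmup and Measurement iterations is set when building the Options

    final Options opts = new OptionsBuilder().include(JMHExample01.class.getSimpleName())
               .forks(1)
               .measurementIterations(10) // every call made during these 10 measurement iterations is included in the statistics
               .warmupIterations(10) // run 10 warmup iterations before the real measurement starts
               .build();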
    

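    1.2.3 forks()

    Without forking, the benchmark methods run inside the same JVM process as the class that launches them, so they may interfere with one another

    With forks set to 1, each benchmark runs in a brand-new JVM process, so the benchmarks no longer interfere with each other

    1.2.4 @BenchmarkMode(…)

    JMH uses the @BenchmarkMode annotation to declare which mode a benchmark runs in

    The four modes:

    • AverageTime (average response time): reports the average time taken by one call of the benchmark method
    • Throughput (method throughput): reports how many times the method can be called per unit of time
    • SampleTime (time sampling): gathers the performance results of the benchmark method by sampling
    • SingleShotTime (cold test): in each Warmup and Measurement iteration the benchmark method is executed only once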
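
    AverageTime and Throughput describe the same measurement from two sides: for a single thread, one is the reciprocal of the other once the units are aligned. A minimal sketch of that arithmetic, where the class and method names are illustrative and not part of JMH:

    ```java
    final class ModeArithmetic {
        /** Converts an average time in microseconds per operation to operations per second. */
        static double opsPerSecond(double microsPerOp) {
            return 1_000_000.0 / microsPerOp;
        }

        /** Converts a throughput in operations per second to microseconds per operation. */
        static double microsPerOp(double opsPerSecond) {
            return 1_000_000.0 / opsPerSecond;
        }

        public static void main(String[] args) {
            // 0.5 us/op corresponds to 2,000,000 ops/s for a single thread.
            System.out.println(opsPerSecond(0.5));
        }
    }
    ```

    With several threads the relation is looser, because the two modes aggregate per-thread results differently, so this conversion should be read as a rough sanity check only.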

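    1.2.5 @OutputTimeUnit

    @OutputTimeUnit sets the time unit used when reporting results

    @OutputTimeUnit(TimeUnit.MICROSECONDS)

    It can be placed on the class or on an individual benchmark method

    1.2.6 @State(…)

    @State takes one of the three Scope enum values

    • Thread: state private to a thread; every thread running the benchmark method holds its own instance
    • Benchmark: state shared across threads; all threads share a single instance
    • Group: state shared within a thread group; the threads of the group share a single instance, and several benchmark methods of the group may run concurrently

    1.2.7 @Setup(…)

    @Setup: called before the benchmark method runs, typically to initialize resources

    @TearDown: called after the benchmark method has run, typically to release and clean up resources

    These two annotations are JMH's fixture methods, similar to @Before, @After, @BeforeClass, and @AfterClass in JUnit

    In @Setup(Level.Iteration), the Level controls when the two fixture methods run

    • Level.Trial: before and after all iterations of a benchmark method
    • Level.Iteration: before and after every iteration of a benchmark method
    • Level.Invocation: before and after every single invocation of a benchmark method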
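
    The three levels differ mainly in how often the fixture fires. The following plain-Java simulation (not JMH code) counts the calls for one thread in one fork. It assumes a fixed, made-up number of invocations per iteration, whereas real JMH iterations are time-bounded:

    ```java
    final class LevelSimulation {
        int trialSetups;
        int iterationSetups;
        int invocationSetups;

        /** Simulates one single-threaded run and counts how often a setup at each level would fire. */
        void run(int iterations, int invocationsPerIteration) {
            trialSetups++;                          // Level.Trial: once, before all iterations
            for (int i = 0; i < iterations; i++) {
                iterationSetups++;                  // Level.Iteration: once before each iteration
                for (int j = 0; j < invocationsPerIteration; j++) {
                    invocationSetups++;             // Level.Invocation: once before each call
                    // the benchmark method would be invoked here
                }
            }
        }

        public static void main(String[] args) {
            LevelSimulation sim = new LevelSimulation();
            sim.run(20, 1000); // e.g. 10 warmup + 10 measurement iterations
            System.out.println(sim.trialSetups + " " + sim.iterationSetups + " " + sim.invocationSetups);
        }
    }
    ```

    The counts make it clear why Level.Invocation fixtures can distort very short benchmarks: they run as often as the measured method itself.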

    2. Benchmarking the concurrency packages

    2.1 AtomicInteger performance

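    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReentrantLock;

    import org.openjdk.jmh.annotations.*;
    import org.openjdk.jmh.profile.StackProfiler;
    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.RunnerException;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;
    import org.openjdk.jmh.runner.options.TimeValue;

    @Measurement(iterations = 10)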
    @Warmup(iterations = 10)
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.MICROSECONDS)
    public class Synchronized_Lock_AtomicInteger {
    
        @State(Scope.Group)
        public static class IntMonitor{
            private int x;
            private final Lock lock = new ReentrantLock();
            // synchronize access to the shared resource with an explicit Lock
            public void lockInc(){
                lock.lock();
                try{
                    x++;
                }finally {
                    lock.unlock();
                }
            }
            // synchronize access to the shared resource with the synchronized keyword
            public void synInc(){
                synchronized (this){
                    x++;
                }
            }
        }
        @State(Scope.Group)
        public static class AtomicIntegerMonitor{
            private AtomicInteger x = new AtomicInteger();
            public void inc(){
                x.incrementAndGet();
            }
        }
    
        @GroupThreads(10)
        @Group("sync")
        @Benchmark
        public void syncInc(IntMonitor monitor){
            monitor.synInc();
        }
    
        @GroupThreads(10)
        @Group("lock")
        @Benchmark
        public void lockInc(IntMonitor monitor){
            monitor.lockInc();
        }
    
        @GroupThreads(10)
        @Group("atomic")
        @Benchmark
        public void atomicIntegerInc(AtomicIntegerMonitor monitor){
            monitor.inc();
        }
    
        public static void main(String[] args) throws RunnerException {
            Options opts = new OptionsBuilder()
                .include(Synchronized_Lock_AtomicInteger.class.getSimpleName())
                .forks(1)
                .timeout(TimeValue.seconds(10))
                .addProfiler(StackProfiler.class)
                .build();
            new Runner(opts).run();
        }
    }
    

    Output:

    Benchmark                                                 Mode  Cnt  Score   Error  Units
    MemoryTest.Synchronized_Lock_AtomicInteger.atomic         avgt   10  0.190 ± 0.002  us/op
    MemoryTest.Synchronized_Lock_AtomicInteger.lock           avgt   10  0.218 ± 0.003  us/op
    MemoryTest.Synchronized_Lock_AtomicInteger.sync           avgt   10  0.281 ± 0.034  us/op
    

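    That is, in terms of performance:
    AtomicInteger > ReentrantLock > synchronized

    The example above adds .addProfiler(StackProfiler.class)

    StackProfiler samples thread stacks and reports statistics on thread states during the run, such as the percentage of time spent in the RUNNABLE and WAITING states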

    ## AtomicInteger
    ....[Thread state distributions].............
     98.2%         RUNNABLE
      1.8%         WAITING
    ## ReentrantLock
    ....[Thread state distributions].............
     78.0%         WAITING
     22.0%         RUNNABLE
    ## Synchronized
    ....[Thread state distributions].............
     69.7%         BLOCKED
     29.5%         RUNNABLE
      0.8%         WAITING
    

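    AtomicInteger spends as much as 98.2% of the time in the RUNNABLE state and has no BLOCKED state at all,

    whereas the synchronized keyword is the opposite, with 69.7% of the time spent in the BLOCKED state

    Under the hood, AtomicInteger is built on the compare-and-swap operations of the Unsafe class

    The native methods of Unsafe are implemented in C++ and contain a large amount of assembly-level CPU instructions

    TODO: add a separate write-up on the Unsafe class here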
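
    The compare-and-swap pattern can be sketched with the public AtomicInteger.compareAndSet() method. This is an illustration of the retry loop only; the class and method names are made up, and the JDK's own incrementAndGet() may be compiled to a single atomic add instruction rather than a loop like this:

    ```java
    import java.util.concurrent.atomic.AtomicInteger;

    final class CasIncrement {
        /** Increments the counter with an explicit compare-and-set retry loop and returns the new value. */
        static int casIncrement(AtomicInteger counter) {
            while (true) {
                int current = counter.get();
                int next = current + 1;
                if (counter.compareAndSet(current, next)) {
                    return next;
                }
                // another thread changed the value in between: read again and retry
            }
        }

        /** Runs the given number of threads, each incrementing a shared counter, and returns the final value. */
        static int runThreads(int threads, int incrementsPerThread) {
            AtomicInteger counter = new AtomicInteger();
            Thread[] workers = new Thread[threads];
            for (int i = 0; i < threads; i++) {
                workers[i] = new Thread(() -> {
                    for (int j = 0; j < incrementsPerThread; j++) {
                        casIncrement(counter);
                    }
                });
                workers[i].start();
            }
            try {
                for (Thread worker : workers) {
                    worker.join();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return counter.get();
        }

        public static void main(String[] args) {
            System.out.println(runThreads(10, 10_000)); // no increment is lost
        }
    }
    ```

    No thread ever parks while waiting here; a failed attempt simply retries, which is consistent with the near-100% RUNNABLE share reported by StackProfiler.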

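    2.2 ReentrantLock vs. synchronized

    2.2.1 Using Blackhole

    (Book, p. 30) Sometimes a benchmark method produces two computed results, and both would have to be returned

    JMH provides a class called Blackhole that prevents dead-code elimination without returning anything

    2.2.2 Single-threaded read performance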

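    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReentrantLock;

    import org.openjdk.jmh.annotations.*;
    import org.openjdk.jmh.infra.Blackhole;
    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.RunnerException;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;

    @Measurement(iterations = 10)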
    @Warmup(iterations = 10)
    @BenchmarkMode(Mode.AverageTime)
    @Threads(1)
    @OutputTimeUnit(TimeUnit.MICROSECONDS)
    @State(Scope.Thread)
    public class ReentrantLockExample4 {
        public static class Test{
            private int x = 10;
            private final Lock lock = new ReentrantLock();
            public int baseMethod(){
                return x;
            }
    
            public int lockMethod(){
                lock.lock();
                try{
                    return x;
                }finally {
                    lock.unlock();
                }
            }
    
            public int syncMethod(){
                synchronized (this){
                    return x;
                }
            }
        }
        private Test test;
        @Setup(Level.Iteration)
        public void setUp(){
            this.test = new Test();
        }
        @Benchmark
        public void base(Blackhole hole){
            hole.consume(test.baseMethod());
        }
        @Benchmark
        public void testLockMethod(Blackhole hole){
            hole.consume(test.lockMethod());
        }
        @Benchmark
        public void testSyncMethod(Blackhole hole){
            hole.consume(test.syncMethod());
        }
    
        public static void main(String[] args) throws RunnerException {
            Options opts = new OptionsBuilder()
                .include(ReentrantLockExample4.class.getSimpleName())
                .forks(1)
                .build();
            new Runner(opts).run();
        }
    }
    

    Results

    Benchmark                                        Mode  Cnt  Score    Error  Units
    MemoryTest.ReentrantLockExample4.base            avgt   10  0.002 ±  0.001  us/op
    MemoryTest.ReentrantLockExample4.testLockMethod  avgt   10  0.019 ±  0.001  us/op
    MemoryTest.ReentrantLockExample4.testSyncMethod  avgt   10  0.004 ±  0.001  us/op
    

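    Conclusion: with single-threaded access, the synchronized keyword performs far better than the explicit Lock (0.004 us/op versus 0.019 us/op)

    2.2.3 Multi-threaded read performance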

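    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReentrantLock;

    import org.openjdk.jmh.annotations.*;
    import org.openjdk.jmh.infra.Blackhole;
    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.RunnerException;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;

    @Measurement(iterations = 10)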
    @Warmup(iterations = 10)
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.MICROSECONDS)
    
    public class ReentrantLockExample5 {
        @State(Scope.Group)
        public static class Test{
            private int x = 10;
            private final Lock lock = new ReentrantLock();
            public int baseMethod(){
                return x;
            }
    
            public int lockMethod(){
                lock.lock();
                try{
                    return x;
                }finally {
                    lock.unlock();
                }
            }
    
            public int syncMethod(){
                synchronized (this){
                    return x;
                }
            }
        }
        @GroupThreads(10)
        @Group("base")
        @Benchmark
        public void base(Test test, Blackhole hole){
            hole.consume(test.baseMethod());
        }
        @GroupThreads(10)
        @Group("lock")
        @Benchmark
        public void testLockMethod(Test test, Blackhole hole){
            hole.consume(test.lockMethod());
        }
        @GroupThreads(10)
        @Group("sync")
        @Benchmark
        public void testSyncMethod(Test test, Blackhole hole){
            hole.consume(test.syncMethod());
        }
    
        public static void main(String[] args) throws RunnerException {
            Options opts = new OptionsBuilder()
                .include(ReentrantLockExample5.class.getSimpleName())
                .forks(1)
                .build();
            new Runner(opts).run();
        }
    }
    

    Results:

    Benchmark                              Mode  Cnt  Score    Error  Units
    MemoryTest.ReentrantLockExample5.base  avgt   10  0.005 ±  0.001  us/op
    MemoryTest.ReentrantLockExample5.lock  avgt   10  0.241 ±  0.003  us/op
    MemoryTest.ReentrantLockExample5.sync  avgt   10  0.277 ±  0.004  us/op
    

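    Conclusion: with multi-threaded reads, the explicit Lock outperforms the synchronized keyword

    2.2.4 Multi-threaded read/write performance

    Ten threads are used: 5 modify the shared resource concurrently and 5 read it concurrently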

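    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReentrantLock;

    import org.openjdk.jmh.annotations.*;
    import org.openjdk.jmh.infra.Blackhole;
    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.RunnerException;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;

    @Measurement(iterations = 10)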
    @Warmup(iterations = 10)
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.MICROSECONDS)
    public class ReentrantLockExample6 {
        @State(Scope.Group)
        public static class Test{
            private int x = 10;
            private final Lock lock = new ReentrantLock();
            public void lockInc(){
                lock.lock();
                try{
                    x++;
                }finally {
                    lock.unlock();
                }
            }
            public int lockGet(){
                lock.lock();
                try{
                    return x;
                }finally {
                    lock.unlock();
                }
            }
            public void syncInc(){
                synchronized (this){
                    x++;
                }
            }
            public int syncGet(){
                synchronized (this){
                    return x;
                }
            }
        }
        @GroupThreads(5)
        @Group("lock")
        @Benchmark
        public void lockInc(Test test){
            test.lockInc();
        }
        @GroupThreads(5)
        @Group("lock")
        @Benchmark
        public void lockGet(Test test, Blackhole blackhole){
            blackhole.consume(test.lockGet());
        }
    
        @GroupThreads(5)
        @Group("sync")
        @Benchmark
        public void syncInc(Test test){
            test.syncInc();
        }
        @GroupThreads(5)
        @Group("sync")
        @Benchmark
        public void syncGet(Test test, Blackhole blackhole){
            blackhole.consume(test.syncGet());
        }
    
        public static void main(String[] args) throws RunnerException {
            Options opts = new OptionsBuilder()
                .include(ReentrantLockExample6.class.getSimpleName())
                .forks(1)
                .build();
            new Runner(opts).run();
        }
    }
    

    Results

    Benchmark                                      Mode  Cnt  Score   Error  Units
    MemoryTest.ReentrantLockExample6.lock          avgt   10  0.224 ± 0.003  us/op
    MemoryTest.ReentrantLockExample6.lock:lockGet  avgt   10  0.242 ± 0.020  us/op
    MemoryTest.ReentrantLockExample6.lock:lockInc  avgt   10  0.207 ± 0.016  us/op
    MemoryTest.ReentrantLockExample6.sync          avgt   10  0.268 ± 0.001  us/op
    MemoryTest.ReentrantLockExample6.sync:syncGet  avgt   10  0.278 ± 0.005  us/op
    MemoryTest.ReentrantLockExample6.sync:syncInc  avgt   10  0.257 ± 0.003  us/op
    

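    An open question at this point: where do the bare lock and sync rows come from?

    They appear to be the average response time across the 10 threads of the corresponding group (5 reader threads and 5 writer threads)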
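
    The numbers in the table are consistent with that reading: the group row equals the mean of its two method rows. A small sketch of the arithmetic follows; the helper name is made up, and weighting by thread count is an assumption about how groups with unequal thread counts would combine:

    ```java
    final class GroupScore {
        /** Thread-weighted mean of the per-method average times inside one @Group. */
        static double groupAverage(double[] scores, int[] threads) {
            double weighted = 0;
            int total = 0;
            for (int i = 0; i < scores.length; i++) {
                weighted += scores[i] * threads[i];
                total += threads[i];
            }
            return weighted / total;
        }

        public static void main(String[] args) {
            // lock:lockGet = 0.242 us/op (5 threads), lock:lockInc = 0.207 us/op (5 threads)
            System.out.println(groupAverage(new double[]{0.242, 0.207}, new int[]{5, 5}));
            // sync:syncGet = 0.278 us/op (5 threads), sync:syncInc = 0.257 us/op (5 threads)
            System.out.println(groupAverage(new double[]{0.278, 0.257}, new int[]{5, 5}));
        }
    }
    ```

    The two means come out at roughly 0.2245 and 0.2675, which agree with the reported group scores of 0.224 and 0.268 us/op up to rounding of the displayed values.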

    2.3 ReentrantReadWriteLock

    2.3.1 Multi-threaded read-only performance

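    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReadWriteLock;
    import java.util.concurrent.locks.ReentrantLock;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    import org.openjdk.jmh.annotations.*;
    import org.openjdk.jmh.infra.Blackhole;
    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.RunnerException;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;

    @Measurement(iterations = 10)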
    @Warmup(iterations = 10)
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.MICROSECONDS)
    public class ReentrantReadWriteLockExample2 {
    
        @State(Scope.Group)
        public static class Test{
            private int x = 10;
            private final Lock lock = new ReentrantLock();
            public int baseMethod(){
                return x;
            }
            public int lockMethod(){
                lock.lock();
                try{
                    return x;
                }finally {
                    lock.unlock();
                }
            }
            public int syncMethod(){
                synchronized (this){
                    return x;
                }
            }
    
            private final ReadWriteLock readWriteLock = new ReentrantReadWriteLock();
            private final Lock readLock = readWriteLock.readLock();
            public int readLockMethod(){
                readLock.lock();
                try{
                    return x;
                }finally {
                    readLock.unlock();
                }
            }
        }
        @GroupThreads(10)
        @Group("base")
        @Benchmark
        public void base(Test test, Blackhole hole){
            hole.consume(test.baseMethod());
        }
        @GroupThreads(10)
        @Group("lock")
        @Benchmark
        public void testLockMethod(Test test, Blackhole hole){
            hole.consume(test.lockMethod());
        }
        @GroupThreads(10)
        @Group("sync")
        @Benchmark
        public void testSyncMethod(Test test, Blackhole hole){
            hole.consume(test.syncMethod());
        }
    
        @GroupThreads(10)
        @Group("readLocks")
        @Benchmark
        public void testReadLockMethod(Test test, Blackhole hole){
            hole.consume(test.readLockMethod());
        }
    
        public static void main(String[] args) throws RunnerException {
            Options opts = new OptionsBuilder()
                .include(ReentrantReadWriteLockExample2.class.getSimpleName())
                .forks(1)
                .build();
            new Runner(opts).run();
        }
    }
    

    Results:

    Benchmark                                            Mode  Cnt  Score   Error  Units
    MemoryTest.ReentrantReadWriteLockExample2.base       avgt   10  0.007 ± 0.001  us/op
    MemoryTest.ReentrantReadWriteLockExample2.lock       avgt   10  0.241 ± 0.006  us/op
    MemoryTest.ReentrantReadWriteLockExample2.readLocks  avgt   10  1.809 ± 0.019  us/op
    MemoryTest.ReentrantReadWriteLockExample2.sync       avgt   10  0.316 ± 0.016  us/op
    

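    Conclusion:

    With 10 threads reading concurrently and no writers, performance from best to worst is:

    ReentrantLock > synchronized > ReentrantReadWriteLock

    In other words, with no writes at all, the read lock is actually the slowest option, which is why StampedLock was introduced in JDK 1.8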
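
    For reference, this is the behavior the read lock provides in exchange for that cost: several threads may hold it at once, while the write lock stays exclusive. A minimal sketch with illustrative names, in which a second thread probes both locks while the calling thread holds the read lock:

    ```java
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    final class ReadWriteLockDemo {
        /** Returns {readAcquired, writeAcquired} as seen by a second thread while the caller holds the read lock. */
        static boolean[] probeWhileReadLocked() {
            ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
            ExecutorService other = Executors.newSingleThreadExecutor();
            rw.readLock().lock();
            try {
                Future<boolean[]> result = other.submit(() -> {
                    boolean read = rw.readLock().tryLock();   // shared: succeeds alongside another reader
                    if (read) {
                        rw.readLock().unlock();
                    }
                    boolean write = rw.writeLock().tryLock(); // exclusive: fails while any reader holds the lock
                    if (write) {
                        rw.writeLock().unlock();
                    }
                    return new boolean[]{read, write};
                });
                return result.get();
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            } finally {
                rw.readLock().unlock();
                other.shutdown();
            }
        }

        public static void main(String[] args) {
            boolean[] probe = probeWhileReadLocked();
            System.out.println("read=" + probe[0] + " write=" + probe[1]);
        }
    }
    ```

    Even in the read-only case every acquisition and release still has to update the lock's shared state, which is one plausible reason the read lock does not come for free in the benchmark above.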

    2.3.2 Multi-threaded read/write performance

    5 reader threads and 5 writer threads

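    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReadWriteLock;
    import java.util.concurrent.locks.ReentrantLock;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    import org.openjdk.jmh.annotations.*;
    import org.openjdk.jmh.infra.Blackhole;
    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.RunnerException;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;

    @Measurement(iterations = 10)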
    @Warmup(iterations = 10)
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.MICROSECONDS)
    public class ReentrantReadWriteLockExample3 {
    
        @State(Scope.Group)
        public static class Test{
            private int x = 10;
            private final Lock lock = new ReentrantLock();
            private final ReadWriteLock readWriteLock = new ReentrantReadWriteLock();
            private final Lock readLock = readWriteLock.readLock();
            private final Lock writeLock = readWriteLock.writeLock();
            public void lockInc(){
                lock.lock();
                try{
                    x++;
                }finally {
                    lock.unlock();
                }
            }
            public int lockGet(){
                lock.lock();
                try{
                    return x;
                }finally {
                    lock.unlock();
                }
            }
            public void syncInc(){
                synchronized (this){
                    x++;
                }
            }
            public int syncGet(){
                synchronized (this){
                    return x;
                }
            }
            public void writeLockInc(){
                writeLock.lock();
                try{
                    x++;
                }finally {
                    writeLock.unlock();
                }
            }
            public int readLockGet(){
                readLock.lock();
                try{
                    return x;
                }finally {
                    readLock.unlock();
                }
            }
        }
        @GroupThreads(5)
        @Group("lock")
        @Benchmark
        public void lockInc(Test test){
            test.lockInc();
        }
        @GroupThreads(5)
        @Group("lock")
        @Benchmark
        public void lockGet(Test test, Blackhole blackhole){
            blackhole.consume(test.lockGet());
        }
    
        @GroupThreads(5)
        @Group("sync")
        @Benchmark
        public void syncInc(Test test){
            test.syncInc();
        }
        @GroupThreads(5)
        @Group("sync")
        @Benchmark
        public void syncGet(Test test, Blackhole blackhole){
            blackhole.consume(test.syncGet());
        }
    
        @GroupThreads(5)
        @Group("rwlock")
        @Benchmark
        public void writeLockInc(Test test){
            test.writeLockInc();
        }
        @GroupThreads(5)
        @Group("rwlock")
        @Benchmark
        public void readLockGet(Test test, Blackhole blackhole){
            blackhole.consume(test.readLockGet());
        }
    
        public static void main(String[] args) throws RunnerException {
            Options opts = new OptionsBuilder()
                .include(ReentrantReadWriteLockExample3.class.getSimpleName())
                .forks(1)
                .build();
            new Runner(opts).run();
        }
    }
    

    Results:

    Benchmark                                                      Mode  Cnt  Score   Error  Units
    MemoryTest.ReentrantReadWriteLockExample3.lock                 avgt   10  0.229 ± 0.005  us/op
    MemoryTest.ReentrantReadWriteLockExample3.lock:lockGet         avgt   10  0.252 ± 0.014  us/op
    MemoryTest.ReentrantReadWriteLockExample3.lock:lockInc         avgt   10  0.207 ± 0.007  us/op
    MemoryTest.ReentrantReadWriteLockExample3.rwlock               avgt   10  0.542 ± 0.033  us/op
    MemoryTest.ReentrantReadWriteLockExample3.rwlock:readLockGet   avgt   10  0.842 ± 0.062  us/op
    MemoryTest.ReentrantReadWriteLockExample3.rwlock:writeLockInc  avgt   10  0.242 ± 0.006  us/op
    MemoryTest.ReentrantReadWriteLockExample3.sync                 avgt   10  0.277 ± 0.011  us/op
    MemoryTest.ReentrantReadWriteLockExample3.sync:syncGet         avgt   10  0.297 ± 0.012  us/op
    MemoryTest.ReentrantReadWriteLockExample3.sync:syncInc         avgt   10  0.256 ± 0.010  us/op
    

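    Under concurrency the read lock of ReentrantReadWriteLock does not perform well here (0.842 us/op), while the write lock actually does better (0.242 us/op)

    Why this happens remains an open question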

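    2.4 StampedLock

    StampedLock in brief:

    1. Its writeLock() can be used in place of ReentrantLock
    2. It offers a read-lock mode and a write-lock mode, so it can replace ReentrantReadWriteLock
    3. Its optimistic read mode, tryOptimisticRead(), obtains a non-exclusive stamp and never blocks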
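
    The optimistic read mode follows a fixed order: take a stamp, read the fields into locals, validate the stamp, and fall back to a real read lock only if validation fails. A minimal sketch of that idiom, with an illustrative class name:

    ```java
    import java.util.concurrent.locks.StampedLock;

    final class OptimisticCounter {
        private final StampedLock lock = new StampedLock();
        private int x = 10;

        void increment() {
            long stamp = lock.writeLock();
            try {
                x++;
            } finally {
                lock.unlockWrite(stamp);
            }
        }

        int get() {
            long stamp = lock.tryOptimisticRead(); // never blocks; the stamp is 0 if a writer holds the lock
            int value = x;                         // speculative read with no lock held
            if (!lock.validate(stamp)) {           // a write may have happened since the stamp was issued
                stamp = lock.readLock();           // fall back to a pessimistic read lock
                try {
                    value = x;
                } finally {
                    lock.unlockRead(stamp);
                }
            }
            return value;
        }

        public static void main(String[] args) {
            OptimisticCounter counter = new OptimisticCounter();
            counter.increment();
            System.out.println(counter.get());
        }
    }
    ```

    The read of x sits between tryOptimisticRead() and validate(); validating before reading would leave the read itself unprotected.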

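    2.4.1 ReentrantLock, ReentrantReadWriteLock, and StampedLock compared

    This benchmark compares the performance of ReentrantLock, ReentrantReadWriteLock, and StampedLock
    It measures throughput: the number of method calls per second, where a larger number means higher throughput

    1. Read/write separation scenario
    2. Optimistic-lock scenario (with different numbers of reader and writer threads)
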
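    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReadWriteLock;
    import java.util.concurrent.locks.ReentrantLock;
    import java.util.concurrent.locks.ReentrantReadWriteLock;
    import java.util.concurrent.locks.StampedLock;

    import org.openjdk.jmh.annotations.*;
    import org.openjdk.jmh.infra.Blackhole;
    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.RunnerException;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;

    @Measurement(iterations = 20)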
    @Warmup(iterations = 20)
    @BenchmarkMode(Mode.Throughput)
    @OutputTimeUnit(TimeUnit.SECONDS)
    public class StampedLockExampled4 {
        @State(Scope.Group)
        public static class Test{
            private int x = 10;
            private final Lock lock = new ReentrantLock();
            private final ReadWriteLock readWriteLock = new ReentrantReadWriteLock();
            private final Lock readLock = readWriteLock.readLock();
            private final Lock writeLock = readWriteLock.writeLock();
            private final StampedLock stampedLock = new StampedLock();
            public void stampedLockInc(){
                long stamped = stampedLock.writeLock();
                try{
                    x++;
                }finally {
                    stampedLock.unlockWrite(stamped);
                }
            }
    
            public int stampedReadLockGet(){
                long stamped = stampedLock.readLock();
                try{
                    return x;
                }finally {
                    stampedLock.unlockRead(stamped);
                }
            }
    
            public int stampedOptimisticReadLockGet(){
                long stamped = stampedLock.tryOptimisticRead();
                if(!stampedLock.validate(stamped)){
                    stamped = stampedLock.readLock();
                    try{
                        return x;
                    }finally {
                        stampedLock.unlockRead(stamped);
                    }
                }
                return x;
            }
    
            public void lockInc(){
                lock.lock();
                try{
                    x++;
                }finally {
                    lock.unlock();
                }
            }
    
            public int lockGet(){
                lock.lock();
                try{
                    return x;
                }finally {
                    lock.unlock();
                }
            }
    
            public void writeLockInc(){
                writeLock.lock();
                try{
                    x++;
                }finally {
                    writeLock.unlock();
                }
            }
    
            public int readLockGet(){
                readLock.lock();
                try{
                    return x;
                }finally {
                    readLock.unlock();
                }
            }
        }
    
        @GroupThreads(5)
        @Group("lock")
        @Benchmark
        public void lockInc(Test test){
            test.lockInc();
        }
    
        @GroupThreads(5)
        @Group("lock")
        @Benchmark
        public void lockGet(Test test, Blackhole blackhole){
            blackhole.consume(test.lockGet());
        }
    
        @GroupThreads(5)
        @Group("rwlock")
        @Benchmark
        public void writeLockInc(Test test){
            test.writeLockInc();
        }
    
        @GroupThreads(5)
        @Group("rwlock")
        @Benchmark
        public void readLockGet(Test test, Blackhole blackhole){
            blackhole.consume(test.readLockGet());
        }
    
        @GroupThreads(5)
        @Group("stampedLock")
        @Benchmark
        public void writeStampedLockInc(Test test){
            test.stampedLockInc();
        }
    
        @GroupThreads(5)
        @Group("stampedLock")
        @Benchmark
        public void readStampedLockGet(Test test, Blackhole blackhole){
            blackhole.consume(test.stampedReadLockGet());
        }
    
        @GroupThreads(5)
        @Group("stampedLockOptimistic")
        @Benchmark
        public void writeStampedLockInc2(Test test){
            test.stampedLockInc();
        }
    
        @GroupThreads(5)
        @Group("stampedLockOptimistic")
        @Benchmark
        public void readStampedLockGet2(Test test, Blackhole blackhole){
            blackhole.consume(test.stampedOptimisticReadLockGet());
        }
    
        public static void main(String[] args) throws RunnerException {
            Options opts = new OptionsBuilder()
                .include(StampedLockExampled4.class.getSimpleName())
                .forks(1)
                .build();
            new Runner(opts).run();
        }
    }
    

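    Results:
    ① 5 reader threads and 5 writer threads, overall throughput (ops/s)

    ReentrantLock   ReentrantRWLock   StampedLock    Optimistic
    34410694.197    22397554.511      23228158.443   84757158.553

    Ranking: Optimistic > ReentrantLock > StampedLock > ReentrantRWLock

    ② 5 reader threads and 5 writer threads, throughput (ops/s); presumably the read methods, since the values in ② and ③ add up to those in ①

    ReentrantLock   ReentrantRWLock   StampedLock    Optimistic
    16541110.958    4797570.368       64686.325      59849633.846

    Ranking: Optimistic > ReentrantLock > ReentrantRWLock > StampedLock

    ③ 5 reader threads and 5 writer threads, throughput (ops/s); presumably the write methods

    ReentrantLock   ReentrantRWLock   StampedLock    Optimistic
    17869583.239    17599984.143      23163472.118   24907524.707

    Ranking: Optimistic > StampedLock > ReentrantLock > ReentrantRWLock

    Note that the write benchmark method of the Optimistic group is actually identical to the one in the StampedLock group; the results here differ from those in the book

    The two groups differ only in whether the readers call readLock() or tryOptimisticRead() while the writers call writeLock()

    Further tests:

    1. 10 reader threads and 10 writer threads: @GroupThreads(10)
    2. 16 reader threads and 4 writer threads: @GroupThreads(16) on the read methods and @GroupThreads(4) on the write methods

    StampedLock summary:

    1. It offers optimistic reads
    2. It addresses the writer-starvation problem of read-write locks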

    2.5 Concurrent queues

    2.5.1 The seven blocking queues

    ArrayBlockingQueue:

    Blocking write                                   Non-blocking write   Blocking read                         Non-blocking read
    void put(E e)                                    boolean add(E e)     E take()                              E poll()
    boolean offer(E e, long timeout, TimeUnit unit)  boolean offer(E e)   E poll(long timeout, TimeUnit unit)   E peek()
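
    The difference between the columns shows up on a full or empty queue. A minimal sketch using a queue of capacity 1 (the class name and the logged strings are illustrative):

    ```java
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.TimeUnit;

    final class ArrayBlockingQueueDemo {
        /** Exercises the non-blocking and timed methods on a full, then empty, queue of capacity 1. */
        static String demo() {
            ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
            StringBuilder log = new StringBuilder();
            try {
                log.append(queue.offer("a"));                                        // there is room
                log.append(',').append(queue.offer("b"));                            // full: returns at once
                log.append(',').append(queue.offer("b", 10, TimeUnit.MILLISECONDS)); // full: gives up after 10 ms
                try {
                    queue.add("b");                                                  // full: throws instead of blocking
                } catch (IllegalStateException e) {
                    log.append(",ISE");
                }
                log.append(',').append(queue.peek());                                // reads the head without removing it
                log.append(',').append(queue.take());                                // removes the head; would block if empty
                log.append(',').append(queue.poll());                                // empty: returns at once
                log.append(',').append(queue.poll(10, TimeUnit.MILLISECONDS));       // empty: gives up after 10 ms
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return log.toString();
        }

        public static void main(String[] args) {
            System.out.println(demo()); // prints true,false,false,ISE,a,a,null,null
        }
    }
    ```

    Note that add(E e) is non-blocking only in the sense that it fails fast: on a full queue it throws IllegalStateException, whereas offer(E e) returns false.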

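    PriorityBlockingQueue: an unbounded blocking queue that orders the inserted elements according to some rule

    Main characteristic: elements are taken from the head, and every insertion and removal re-adjusts the internal ordering

    Blocking write   Non-blocking write   Blocking read                         Non-blocking read
    -                offer(E e)           E take()                              E poll()
    -                -                    E poll(long timeout, TimeUnit unit)   E peek()

    LinkedBlockingQueue: an optionally bounded FIFO queue backed by a linked list

    Optionally bounded: for a bounded queue, pass the capacity to the constructor; an unbounded queue defaults to a capacity of Integer.MAX_VALUE

    DelayQueue: an unbounded blocking queue whose elements can only be consumed after their delay has elapsed

    The elements stored in it must implement the Delayed interface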
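
    A minimal sketch of a Delayed element and of how DelayQueue treats it. The Task class and the 200 ms delay are hypothetical choices for illustration:

    ```java
    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    final class DelayQueueDemo {
        /** Hypothetical element type that becomes available delayMillis after it is created. */
        static final class Task implements Delayed {
            final String name;
            final long readyAtNanos;

            Task(String name, long delayMillis) {
                this.name = name;
                this.readyAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMillis);
            }

            @Override
            public long getDelay(TimeUnit unit) {
                // remaining delay; zero or negative means the element has expired
                return unit.convert(readyAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
            }

            @Override
            public int compareTo(Delayed other) {
                return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
            }
        }

        /** Returns the result of an immediate poll() followed by the name of the element obtained by take(). */
        static String demo() {
            DelayQueue<Task> queue = new DelayQueue<>();
            queue.put(new Task("job", 200));
            Task early = queue.poll();        // the delay has not expired yet, so nothing is returned
            try {
                Task late = queue.take();     // blocks until the delay has expired
                return early + "," + late.name;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }

        public static void main(String[] args) {
            System.out.println(demo());
        }
    }
    ```

    The element is in the queue the whole time; it is simply invisible to consumers until getDelay() reports that no delay remains.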

    SynchronousQueue: has no notion of capacity; every write must wait (block) until another thread performs the matching removal
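
    A minimal sketch of that hand-off behavior, with illustrative names: an offer() with no waiting consumer stores nothing, while put() completes once a consumer thread takes the element.

    ```java
    import java.util.concurrent.SynchronousQueue;

    final class SynchronousQueueDemo {
        /** Returns the result of an unmatched offer() followed by what the consumer thread received. */
        static String handOff() {
            SynchronousQueue<String> queue = new SynchronousQueue<>();
            boolean offered = queue.offer("lost");    // no consumer is waiting, so nothing is stored
            String[] received = new String[1];
            Thread consumer = new Thread(() -> {
                try {
                    received[0] = queue.take();       // waits for a producer
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            consumer.start();
            try {
                queue.put("hello");                   // blocks until the consumer takes the element
                consumer.join();                      // makes the consumer's write to received[0] visible
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return offered + "," + received[0];
        }

        public static void main(String[] args) {
            System.out.println(handOff()); // prints false,hello
        }
    }
    ```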

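    LinkedBlockingDeque: a double-ended blocking queue backed by a linked list; elements can be written, read, and removed at the tail as well as at the head

    LinkedTransferQueue: an implementation of the TransferQueue interface; an unbounded queue with FIFO ordering

    2.5.2 Performance of concurrent queues

    The following program uses 10 threads reading and writing at the same time (5 threads append to the tail of the queue and 5 threads take from the head)

    It compares ConcurrentLinkedQueue with SynchronizedLinkedList, a hand-written concurrent queue that wraps a LinkedList with synchronized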

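    import java.util.LinkedList;
    import java.util.concurrent.ConcurrentLinkedQueue;
    import java.util.concurrent.TimeUnit;

    import org.openjdk.jmh.annotations.*;
    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.RunnerException;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;

    @Warmup(iterations = 10)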
    @Measurement(iterations = 10)
    @Fork(1)
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.MICROSECONDS)
    @State(Scope.Group)
    public class ConcurrentLinkedQueueVsSynchronizedList {
    
        private SynchronizedLinkedList synchronizedList;
        private ConcurrentLinkedQueue<String> concurrentLinkedQueue;
        private final static String DATA = "TEST";
        private final static Object LOCK = new Object();
    
        private static class SynchronizedLinkedList{
            private LinkedList<String> list = new LinkedList<>();
    
            void addLast(String element){
                synchronized (LOCK){
                    list.addLast(element);
                }
            }
            String removeFirst(){
                synchronized (LOCK){
                    if(list.isEmpty()){
                        return null;
                    }
                    return list.removeFirst();
                }
            }
        }
    
        @Setup(Level.Iteration)
        public void setUp(){
            synchronizedList = new SynchronizedLinkedList();
            concurrentLinkedQueue = new ConcurrentLinkedQueue<>();
        }
    
        @Group("sync")
        @Benchmark
        @GroupThreads(5)
        public void synchronizedListAdd(){
            synchronizedList.addLast(DATA);
        }
    
        @Group("sync")
        @Benchmark
        @GroupThreads(5)
        public String synchronizedListGet(){
            return synchronizedList.removeFirst();
        }
    
        @Group("concurrent")
        @Benchmark
        @GroupThreads(5)
        public void concurrentLinkedQueueAdd(){
            concurrentLinkedQueue.offer(DATA);
        }
    
        @Group("concurrent")
        @Benchmark
        @GroupThreads(5)
        public String concurrentLinkedQueueGet(){
            return concurrentLinkedQueue.poll();
        }
    
        public static void main(String[] args) throws RunnerException {
            final Options opt = new OptionsBuilder()
                .include(ConcurrentLinkedQueueVsSynchronizedList.class.getSimpleName())
                .build();
            new Runner(opt).run();
        }
    }
    

    Results:

    Benchmark                                                                               Mode  Cnt  Score   Error  Units
    MemoryTest.ConcurrentLinkedQueueVsSynchronizedList.concurrent                           avgt   10  0.534 ± 0.051  us/op
    MemoryTest.ConcurrentLinkedQueueVsSynchronizedList.concurrent:concurrentLinkedQueueAdd  avgt   10  0.643 ± 0.055  us/op
    MemoryTest.ConcurrentLinkedQueueVsSynchronizedList.concurrent:concurrentLinkedQueueGet  avgt   10  0.425 ± 0.060  us/op
    MemoryTest.ConcurrentLinkedQueueVsSynchronizedList.sync                                 avgt   10  0.773 ± 0.029  us/op
    MemoryTest.ConcurrentLinkedQueueVsSynchronizedList.sync:synchronizedListAdd             avgt   10  0.508 ± 0.018  us/op
    MemoryTest.ConcurrentLinkedQueueVsSynchronizedList.sync:synchronizedListGet             avgt   10  1.038 ± 0.043  us/op
    

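    Conclusion: ConcurrentLinkedQueue is implemented with a lock-free algorithm, so under concurrency its combined read/write performance (0.534 us/op for the group) is better than that of a LinkedList synchronized with synchronized (0.773 us/op)

    Lock-free algorithm:

    For example, this excerpt from its offer() method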

    if (p.casNext(null, newNode)) {
        // Successful CAS is the linearization point
        // for e to become an element of this queue,
        // and for newNode to become "live".
        if (p != t) // hop two nodes at a time
            casTail(t, newNode);  // Failure is OK.
        return true;
    }
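
    The same read, CAS, retry pattern can be shown on a simpler structure. The sketch below is a Treiber stack built on AtomicReference. It is not the algorithm ConcurrentLinkedQueue uses, only a smaller example of the lock-free style:

    ```java
    import java.util.concurrent.atomic.AtomicReference;

    final class TreiberStack<T> {
        private static final class Node<T> {
            final T value;
            final Node<T> next;

            Node(T value, Node<T> next) {
                this.value = value;
                this.next = next;
            }
        }

        private final AtomicReference<Node<T>> head = new AtomicReference<>();

        /** Pushes a value; retries until the CAS on head succeeds. */
        public void push(T value) {
            while (true) {
                Node<T> current = head.get();
                Node<T> node = new Node<>(value, current);
                if (head.compareAndSet(current, node)) {
                    return; // successful CAS: the new node is now visible to other threads
                }
                // failed CAS: another thread moved head, so read it again and retry
            }
        }

        /** Pops the most recently pushed value, or returns null if the stack is empty. */
        public T pop() {
            while (true) {
                Node<T> current = head.get();
                if (current == null) {
                    return null;
                }
                if (head.compareAndSet(current, current.next)) {
                    return current.value;
                }
            }
        }

        public static void main(String[] args) {
            TreiberStack<String> stack = new TreiberStack<>();
            stack.push("a");
            stack.push("b");
            System.out.println(stack.pop()); // the last element pushed comes out first
        }
    }
    ```

    As in the offer() excerpt, a failed CAS is not an error: it only means another thread got there first, and the loop starts over with fresh state.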
    
  • Original article: https://blog.csdn.net/qq_43156556/article/details/126922882