Source: 《Java 高并发编程详解~深入理解并发核心库》
Author: 汪文君
Purpose: reproduce the book's use of JMH, the code micro-benchmarking tool, to benchmark common concurrency utilities and collection classes, and get an initial feel for concurrency performance issues
Contents:
① Getting to know JMH: benchmarking ArrayList vs. LinkedList
② Benchmarks of common concurrency utilities
JMH (Java Microbenchmark Harness): a toolkit dedicated to code micro-benchmarking
① Create a Maven project
② Add the JMH dependencies
<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-core</artifactId>
    <version>1.19</version>
</dependency>
<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-generator-annprocess</artifactId>
    <version>1.19</version>
    <scope>provided</scope>
</dependency>
③ Benchmark code
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Thread)
public class JMHExample01 {
private final static String DATA = "DUMMY DATA";
private List<String> arrayList;
private List<String> linkedList;
@Setup(Level.Iteration)
public void setUp(){
this.arrayList = new ArrayList<>();
this.linkedList = new LinkedList<>();
}
@Benchmark
public List<String> arrayListAdd(){
this.arrayList.add(DATA);
return arrayList;
}
@Benchmark
public List<String> linkedListAdd(){
this.linkedList.add(DATA);
return linkedList;
}
public static void main(String[] args) throws RunnerException {
final Options opts = new OptionsBuilder().include(JMHExample01.class.getSimpleName())
.forks(1)
.measurementIterations(10)
.warmupIterations(10)
.build();
new Runner(opts).run();
}
}
④ Results
...
# Run complete. Total time: 00:01:06
Benchmark Mode Cnt Score Error Units
MemoryTest.JMHExample01.arrayListAdd avgt 10 0.019 ± 0.012 us/op
MemoryTest.JMHExample01.linkedListAdd avgt 10 0.130 ± 0.092 us/op
Conclusions:
The average time per call to ArrayList.add() is 0.019 us, with an error of ±0.012 us
The average time per call to LinkedList.add() is 0.130 us, with an error of ±0.092 us
JMH marks benchmark methods with @Benchmark; unlike ordinary methods, only annotated methods are executed as benchmarks
@Benchmark is to JMH what @Test is to JUnit 4.x
Returning to the code above, @Benchmark is applied to the two benchmark methods arrayListAdd() and linkedListAdd()
Role of Warmup and Measurement: the benchmark method is executed in batches (iterations)
Warmup: before the benchmark code is formally measured, it is warmed up, so that by measurement time its execution has already gone through early class optimization, JVM runtime compilation and JIT optimization and reached its final, steady state
Measurement: the actual measurement; in every measurement iteration, all measured data are included in the statistics
In this example, the numbers of Warmup and Measurement iterations are set when constructing the Options:
final Options opts = new OptionsBuilder().include(JMHExample01.class.getSimpleName())
        .forks(1)
        .measurementIterations(10) // 10 measurement iterations; every execution of the benchmark method within them is included in the statistics
        .warmupIterations(10)      // before the real measurement, the code is warmed up for 10 iterations
        .build();
Benchmark methods otherwise share the JVM process of their test class, so they can interfere with one another
Setting forks to 1 makes every benchmark run start in a brand-new JVM process, so the individual benchmarks do not interfere with each other
JMH uses the @BenchmarkMode annotation to declare which mode a benchmark runs in
The four modes are: Throughput (operations per unit of time), AverageTime (average time per operation), SampleTime (sampled distribution of operation times) and SingleShotTime (time of a single invocation, useful for cold-start measurements)
@OutputTimeUnit sets the time unit used when reporting the results, e.g.
@OutputTimeUnit(TimeUnit.MICROSECONDS)
It can be set on the class or on an individual benchmark method, as the following sketch shows
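A minimal sketch (class and method names are illustrative, not from the book) of class-level annotations with a method-level override:

import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;

@BenchmarkMode(Mode.AverageTime)           // class-level default: average time per operation
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class ModeExample {
    @Benchmark
    public int classLevelDefaults() {       // inherits AverageTime / microseconds
        return 1 + 1;
    }

    @Benchmark
    @BenchmarkMode(Mode.Throughput)         // method-level override: operations per second
    @OutputTimeUnit(TimeUnit.SECONDS)
    public int methodLevelOverride() {
        return 1 + 1;
    }
}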
@State takes one of the three Scope enum values: Thread, Benchmark and Group
@Setup is invoked before the benchmark method executes and is typically used to initialize resources
@TearDown is invoked after the benchmark method has executed and is typically used to release and clean up resources
These two annotations are JMH's test fixtures, similar to @Before, @After, @BeforeClass and @AfterClass in JUnit
@Setup(Level.Iteration)
The Level argument (Trial, Iteration or Invocation) controls how often the two fixture methods run; a sketch of the scopes and fixtures follows
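A minimal sketch (all names are illustrative) of the three Scope values together with @Setup/@TearDown and Level:

import org.openjdk.jmh.annotations.*;

public class StateScopeExample {

    @State(Scope.Thread)            // each benchmark thread gets its own instance
    public static class PerThread {
        StringBuilder sb;

        @Setup(Level.Iteration)     // runs before every warmup/measurement iteration
        public void setUp() { sb = new StringBuilder(); }

        @TearDown(Level.Iteration)  // runs after every iteration, e.g. to release resources
        public void tearDown() { sb = null; }
    }

    @State(Scope.Benchmark)         // one instance shared by all threads of the benchmark
    public static class SharedState { int value; }

    @State(Scope.Group)             // one instance shared by the threads of one @Group
    public static class GroupState { int value; }

    @Benchmark
    public StringBuilder append(PerThread state) {
        return state.sb.append('x');
    }
}

The next benchmark from the book compares synchronized, ReentrantLock and AtomicInteger with 10 threads per group: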
@Measurement(iterations = 10)
@Warmup(iterations = 10)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class Synchronized_Lock_AtomicInteger {
@State(Scope.Group)
public static class IntMonitor{
private int x;
private final Lock lock = new ReentrantLock();
// synchronize access to the shared resource with an explicit Lock
public void lockInc(){
lock.lock();
try{
x++;
}finally {
lock.unlock();
}
}
// synchronize access to the shared resource with the synchronized keyword
public void synInc(){
synchronized (this){
x++;
}
}
}
@State(Scope.Group)
public static class AtomicIntegerMonitor{
private AtomicInteger x = new AtomicInteger();
public void inc(){
x.incrementAndGet();
}
}
@GroupThreads(10)
@Group("sync")
@Benchmark
public void syncInc(IntMonitor monitor){
monitor.synInc();
}
@GroupThreads(10)
@Group("lock")
@Benchmark
public void lockInc(IntMonitor monitor){
monitor.lockInc();
}
@GroupThreads(10)
@Group("atomic")
@Benchmark
public void atomicIntegerInc(AtomicIntegerMonitor monitor){
monitor.inc();
}
public static void main(String[] args) throws RunnerException {
Options opts = new OptionsBuilder()
.include(Synchronized_Lock_AtomicInteger.class.getSimpleName())
.forks(1)
.timeout(TimeValue.seconds(10))
.addProfiler(StackProfiler.class)
.build();
new Runner(opts).run();
}
}
Output:
Benchmark Mode Cnt Score Error Units
MemoryTest.Synchronized_Lock_AtomicInteger.atomic avgt 10 0.190 ± 0.002 us/op
MemoryTest.Synchronized_Lock_AtomicInteger.lock avgt 10 0.218 ± 0.003 us/op
MemoryTest.Synchronized_Lock_AtomicInteger.sync avgt 10 0.281 ± 0.034 us/op
That is, performance from best to worst:
AtomicInteger > ReentrantLock > synchronized
The example above adds .addProfiler(StackProfiler.class) to the options
StackProfiler prints thread stack information and collects statistics on the thread states observed during the run, e.g. the percentage of time spent in the RUNNABLE and WAITING states
## AtomicInteger
....[Thread state distributions].............
98.2% RUNNABLE
1.8% WAITING
## ReentrantLock
....[Thread state distributions].............
78.0% WAITING
22.0% RUNNABLE
## Synchronized
....[Thread state distributions].............
69.7% BLOCKED
29.5% RUNNABLE
0.8% WAITING
That is, AtomicInteger spends as much as 98.2% of the time in the RUNNABLE state and is never BLOCKED,
whereas the synchronized keyword shows the opposite picture, with the BLOCKED state accounting for as much as 69.7%
Under the hood, AtomicInteger is implemented on top of the Unsafe class's compareAndSwap operations
Unsafe is implemented in C++ and contains a large amount of low-level/assembly CPU instruction code
A separate write-up on the Unsafe class should be added here; the CAS retry idea is sketched below
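The real implementation lives in Unsafe and native code, but the retry-on-CAS-failure idea can be illustrated on top of the public AtomicInteger API (a sketch, not the JDK source):

import java.util.concurrent.atomic.AtomicInteger;

public class CasIncrement {
    private final AtomicInteger value = new AtomicInteger();

    // Lock-free increment: read the current value, then try to CAS it to value + 1;
    // if another thread changed it in between, the CAS fails and we simply retry.
    public int increment() {
        for (;;) {
            int current = value.get();
            int next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next;
            }
            // CAS failed: another thread won the race, loop and try again
        }
    }
}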
p. 30: when a benchmark method needs to produce two computed results,
JMH provides a class called Blackhole that lets the benchmark avoid dead-code elimination without returning anything
@Measurement(iterations = 10)
@Warmup(iterations = 10)
@BenchmarkMode(Mode.AverageTime)
@Threads(1)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Thread)
public class ReentrantLockExample4 {
public static class Test{
private int x = 10;
private final Lock lock = new ReentrantLock();
public int baseMethod(){
return x;
}
public int lockMethod(){
lock.lock();
try{
return x;
}finally {
lock.unlock();
}
}
public int syncMethod(){
synchronized (this){
return x;
}
}
}
private Test test;
@Setup(Level.Iteration)
public void setUp(){
this.test = new Test();
}
@Benchmark
public void base(Blackhole hole){
hole.consume(test.baseMethod());
}
@Benchmark
public void testLockMethod(Blackhole hole){
hole.consume(test.lockMethod());
}
@Benchmark
public void testSyncMethod(Blackhole hole){
hole.consume(test.syncMethod());
}
public static void main(String[] args) throws RunnerException {
Options opts = new OptionsBuilder()
.include(ReentrantLockExample4.class.getSimpleName())
.forks(1)
.build();
new Runner(opts).run();
}
}
Results
Benchmark Mode Cnt Score Error Units
MemoryTest.ReentrantLockExample4.base avgt 10 0.002 ± 0.001 us/op
MemoryTest.ReentrantLockExample4.testLockMethod avgt 10 0.019 ± 0.001 us/op
MemoryTest.ReentrantLockExample4.testSyncMethod avgt 10 0.004 ± 0.001 us/op
Conclusion: under single-threaded access, the synchronized keyword performs much better than the explicit Lock
@Measurement(iterations = 10)
@Warmup(iterations = 10)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class ReentrantLockExample5 {
@State(Scope.Group)
public static class Test{
private int x = 10;
private final Lock lock = new ReentrantLock();
public int baseMethod(){
return x;
}
public int lockMethod(){
lock.lock();
try{
return x;
}finally {
lock.unlock();
}
}
public int syncMethod(){
synchronized (this){
return x;
}
}
}
@GroupThreads(10)
@Group("base")
@Benchmark
public void base(Test test, Blackhole hole){
hole.consume(test.baseMethod());
}
@GroupThreads(10)
@Group("lock")
@Benchmark
public void testLockMethod(Test test, Blackhole hole){
hole.consume(test.lockMethod());
}
@GroupThreads(10)
@Group("sync")
@Benchmark
public void testSyncMethod(Test test, Blackhole hole){
hole.consume(test.syncMethod());
}
public static void main(String[] args) throws RunnerException {
Options opts = new OptionsBuilder()
.include(ReentrantLockExample5.class.getSimpleName())
.forks(1)
.build();
new Runner(opts).run();
}
}
Results:
Benchmark Mode Cnt Score Error Units
MemoryTest.ReentrantLockExample5.base avgt 10 0.005 ± 0.001 us/op
MemoryTest.ReentrantLockExample5.lock avgt 10 0.241 ± 0.003 us/op
MemoryTest.ReentrantLockExample5.sync avgt 10 0.277 ± 0.004 us/op
Conclusion: under multi-threaded read-only access, the explicit Lock performs better than the synchronized keyword
Next, set up 10 threads: 5 threads concurrently modify the shared resource and 5 threads concurrently read it
@Measurement(iterations = 10)
@Warmup(iterations = 10)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class ReentrantLockExample6 {
@State(Scope.Group)
public static class Test{
private int x = 10;
private final Lock lock = new ReentrantLock();
public void lockInc(){
lock.lock();
try{
x++;
}finally {
lock.unlock();
}
}
public int lockGet(){
lock.lock();
try{
return x;
}finally {
lock.unlock();
}
}
public void syncInc(){
synchronized (this){
x++;
}
}
public int syncGet(){
synchronized (this){
return x;
}
}
}
@GroupThreads(5)
@Group("lock")
@Benchmark
public void lockInc(Test test){
test.lockInc();
}
@GroupThreads(5)
@Group("lock")
@Benchmark
public void lockGet(Test test, Blackhole blackhole){
blackhole.consume(test.lockGet());
}
@GroupThreads(5)
@Group("sync")
@Benchmark
public void syncInc(Test test){
test.syncInc();
}
@GroupThreads(5)
@Group("sync")
@Benchmark
public void syncGet(Test test, Blackhole blackhole){
blackhole.consume(test.syncGet());
}
public static void main(String[] args) throws RunnerException {
Options opts = new OptionsBuilder()
.include(ReentrantLockExample6.class.getSimpleName())
.forks(1)
.build();
new Runner(opts).run();
}
}
Results
Benchmark Mode Cnt Score Error Units
MemoryTest.ReentrantLockExample6.lock avgt 10 0.224 ± 0.003 us/op
MemoryTest.ReentrantLockExample6.lock:lockGet avgt 10 0.242 ± 0.020 us/op
MemoryTest.ReentrantLockExample6.lock:lockInc avgt 10 0.207 ± 0.016 us/op
MemoryTest.ReentrantLockExample6.sync avgt 10 0.268 ± 0.001 us/op
MemoryTest.ReentrantLockExample6.sync:syncGet avgt 10 0.278 ± 0.005 us/op
MemoryTest.ReentrantLockExample6.sync:syncInc avgt 10 0.257 ± 0.003 us/op
One question remains: where do the lock and sync rows come from?
They are the group-level aggregates: the lock row is the average response time across the whole 10-thread lock group (5 reader threads and 5 writer threads), broken down per method in the lock:lockGet and lock:lockInc rows; the same applies to sync
@Measurement(iterations = 10)
@Warmup(iterations = 10)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class ReentrantReadWriteLockExample2 {
@State(Scope.Group)
public static class Test{
private int x = 10;
private final Lock lock = new ReentrantLock();
public int baseMethod(){
return x;
}
public int lockMethod(){
lock.lock();
try{
return x;
}finally {
lock.unlock();
}
}
public int syncMethod(){
synchronized (this){
return x;
}
}
private final ReadWriteLock readWriteLock = new ReentrantReadWriteLock();
private final Lock readLock = readWriteLock.readLock();
public int readLockMethod(){
readLock.lock();
try{
return x;
}finally {
readLock.unlock();
}
}
}
@GroupThreads(10)
@Group("base")
@Benchmark
public void base(Test test, Blackhole hole){
hole.consume(test.baseMethod());
}
@GroupThreads(10)
@Group("lock")
@Benchmark
public void testLockMethod(Test test, Blackhole hole){
hole.consume(test.lockMethod());
}
@GroupThreads(10)
@Group("sync")
@Benchmark
public void testSyncMethod(Test test, Blackhole hole){
hole.consume(test.syncMethod());
}
@GroupThreads(10)
@Group("readLocks")
@Benchmark
public void testReadLockMethod(Test test, Blackhole hole){
hole.consume(test.readLockMethod());
}
public static void main(String[] args) throws RunnerException {
Options opts = new OptionsBuilder()
.include(ReentrantReadWriteLockExample2.class.getSimpleName())
.forks(1)
.build();
new Runner(opts).run();
}
}
Results:
Benchmark Mode Cnt Score Error Units
MemoryTest.ReentrantReadWriteLockExample2.base avgt 10 0.007 ± 0.001 us/op
MemoryTest.ReentrantReadWriteLockExample2.lock avgt 10 0.241 ± 0.006 us/op
MemoryTest.ReentrantReadWriteLockExample2.readLocks avgt 10 1.809 ± 0.019 us/op
MemoryTest.ReentrantReadWriteLockExample2.sync avgt 10 0.316 ± 0.016 us/op
Conclusions:
With 10 threads doing concurrent read-only access, performance from best to worst is:
ReentrantLock > synchronized > ReentrantReadWriteLock
That is, with no write operations at all, the read lock is actually the worst performer, which is one reason StampedLock was introduced in JDK 1.8
Now set up 5 reader threads and 5 writer threads
@Measurement(iterations = 10)
@Warmup(iterations = 10)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class ReentrantReadWriteLockExample3 {
@State(Scope.Group)
public static class Test{
private int x = 10;
private final Lock lock = new ReentrantLock();
private final ReadWriteLock readWriteLock = new ReentrantReadWriteLock();
private final Lock readLock = readWriteLock.readLock();
private final Lock writeLock = readWriteLock.writeLock();
public void lockInc(){
lock.lock();
try{
x++;
}finally {
lock.unlock();
}
}
public int lockGet(){
lock.lock();
try{
return x;
}finally {
lock.unlock();
}
}
public void syncInc(){
synchronized (this){
x++;
}
}
public int syncGet(){
synchronized (this){
return x;
}
}
public void writeLockInc(){
writeLock.lock();
try{
x++;
}finally {
writeLock.unlock();
}
}
public int readLockGet(){
readLock.lock();
try{
return x;
}finally {
readLock.unlock();
}
}
}
@GroupThreads(5)
@Group("lock")
@Benchmark
public void lockInc(Test test){
test.lockInc();
}
@GroupThreads(5)
@Group("lock")
@Benchmark
public void lockGet(Test test, Blackhole blackhole){
blackhole.consume(test.lockGet());
}
@GroupThreads(5)
@Group("sync")
@Benchmark
public void syncInc(Test test){
test.syncInc();
}
@GroupThreads(5)
@Group("sync")
@Benchmark
public void syncGet(Test test, Blackhole blackhole){
blackhole.consume(test.syncGet());
}
@GroupThreads(5)
@Group("rwlock")
@Benchmark
public void writeLockInc(Test test){
test.writeLockInc();
}
@GroupThreads(5)
@Group("rwlock")
@Benchmark
public void readLockGet(Test test, Blackhole blackhole){
blackhole.consume(test.readLockGet());
}
public static void main(String[] args) throws RunnerException {
Options opts = new OptionsBuilder()
.include(ReentrantReadWriteLockExample3.class.getSimpleName())
.forks(1)
.build();
new Runner(opts).run();
}
}
Results:
Benchmark Mode Cnt Score Error Units
MemoryTest.ReentrantReadWriteLockExample3.lock avgt 10 0.229 ± 0.005 us/op
MemoryTest.ReentrantReadWriteLockExample3.lock:lockGet avgt 10 0.252 ± 0.014 us/op
MemoryTest.ReentrantReadWriteLockExample3.lock:lockInc avgt 10 0.207 ± 0.007 us/op
MemoryTest.ReentrantReadWriteLockExample3.rwlock avgt 10 0.542 ± 0.033 us/op
MemoryTest.ReentrantReadWriteLockExample3.rwlock:readLockGet avgt 10 0.842 ± 0.062 us/op
MemoryTest.ReentrantReadWriteLockExample3.rwlock:writeLockInc avgt 10 0.242 ± 0.006 us/op
MemoryTest.ReentrantReadWriteLockExample3.sync avgt 10 0.277 ± 0.011 us/op
MemoryTest.ReentrantReadWriteLockExample3.sync:syncGet avgt 10 0.297 ± 0.012 us/op
MemoryTest.ReentrantReadWriteLockExample3.sync:syncInc avgt 10 0.256 ± 0.010 us/op
The results show that ReentrantReadWriteLock's read lock does not perform well under this concurrent read/write mix; surprisingly, its write lock fares somewhat better
What is the reason?
A brief look at StampedLock:
Compare the performance of ReentrantLock, ReentrantReadWriteLock and StampedLock
This time the benchmark measures throughput: the number of method invocations per second; the larger the score, the higher the throughput. A usage sketch of StampedLock follows before the benchmark
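A minimal sketch of the canonical StampedLock pattern (field and class names are illustrative): the optimistic read copies the field first, then validates the stamp, and only falls back to a pessimistic read lock when validation fails.

import java.util.concurrent.locks.StampedLock;

public class StampedPoint {
    private final StampedLock sl = new StampedLock();
    private int x;

    public void inc() {
        long stamp = sl.writeLock();         // exclusive write lock
        try {
            x++;
        } finally {
            sl.unlockWrite(stamp);
        }
    }

    public int optimisticGet() {
        long stamp = sl.tryOptimisticRead(); // non-blocking: just obtain a stamp
        int current = x;                     // read the field first
        if (!sl.validate(stamp)) {           // did a writer intervene?
            stamp = sl.readLock();           // fall back to a pessimistic read lock
            try {
                current = x;
            } finally {
                sl.unlockRead(stamp);
            }
        }
        return current;
    }
}

Note that the benchmark code below calls validate() right after tryOptimisticRead() and only then reads x, which differs slightly from the canonical read-then-validate order above.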
@Measurement(iterations = 20)
@Warmup(iterations = 20)
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.SECONDS)
public class StampedLockExampled4 {
@State(Scope.Group)
public static class Test{
private int x = 10;
private final Lock lock = new ReentrantLock();
private final ReadWriteLock readWriteLock = new ReentrantReadWriteLock();
private final Lock readLock = readWriteLock.readLock();
private final Lock writeLock = readWriteLock.writeLock();
private final StampedLock stampedLock = new StampedLock();
public void stampedLockInc(){
long stamped = stampedLock.writeLock();
try{
x++;
}finally {
stampedLock.unlockWrite(stamped);
}
}
public int stampedReadLockGet(){
long stamped = stampedLock.readLock();
try{
return x;
}finally {
stampedLock.unlockRead(stamped);
}
}
public int stampedOptimisticReadLockGet(){
long stamped = stampedLock.tryOptimisticRead();
if(!stampedLock.validate(stamped)){
stamped = stampedLock.readLock();
try{
return x;
}finally {
stampedLock.unlockRead(stamped);
}
}
return x;
}
public void lockInc(){
lock.lock();
try{
x++;
}finally {
lock.unlock();
}
}
public int lockGet(){
lock.lock();
try{
return x;
}finally {
lock.unlock();
}
}
public void writeLockInc(){
writeLock.lock();
try{
x++;
}finally {
writeLock.unlock();
}
}
public int readLockGet(){
readLock.lock();
try{
return x;
}finally {
readLock.unlock();
}
}
}
@GroupThreads(5)
@Group("lock")
@Benchmark
public void lockInc(Test test){
test.lockInc();
}
@GroupThreads(5)
@Group("lock")
@Benchmark
public void lockGet(Test test, Blackhole blackhole){
blackhole.consume(test.lockGet());
}
@GroupThreads(5)
@Group("rwlock")
@Benchmark
public void writeLockInc(Test test){
test.writeLockInc();
}
@GroupThreads(5)
@Group("rwlock")
@Benchmark
public void readLockGet(Test test, Blackhole blackhole){
blackhole.consume(test.readLockGet());
}
@GroupThreads(5)
@Group("stampedLock")
@Benchmark
public void writeStampedLockInc(Test test){
test.stampedLockInc();
}
@GroupThreads(5)
@Group("stampedLock")
@Benchmark
public void readStampedLockGet(Test test, Blackhole blackhole){
blackhole.consume(test.stampedReadLockGet());
}
@GroupThreads(5)
@Group("stampedLockOptimistic")
@Benchmark
public void writeStampedLockInc2(Test test){
test.stampedLockInc();
}
@GroupThreads(5)
@Group("stampedLockOptimistic")
@Benchmark
public void readStampedLockGet2(Test test, Blackhole blackhole){
blackhole.consume(test.stampedOptimisticReadLockGet());
}
public static void main(String[] args) throws RunnerException {
Options opts = new OptionsBuilder()
.include(StampedLockExampled4.class.getSimpleName())
.forks(1)
.build();
new Runner(opts).run();
}
}
Results:
① Overall performance with 5 reader and 5 writer threads (throughput, ops/s):
ReentrantLock | ReentrantRWLock | StampedLock | Optimistic |
---|---|---|---|
34410694.197 | 22397554.511 | 23228158.443 | 84757158.553 |
Ranking: Optimistic > ReentrantLock > StampedLock > ReentrantRWLock
② Read performance with 5 reader and 5 writer threads (throughput, ops/s):
ReentrantLock | ReentrantRWLock | StampedLock | Optimistic |
---|---|---|---|
16541110.958 | 4797570.368 | 64686.325 | 59849633.846 |
Ranking: Optimistic > ReentrantLock > ReentrantRWLock > StampedLock
③ Write performance with 5 reader and 5 writer threads (throughput, ops/s):
ReentrantLock | ReentrantRWLock | StampedLock | Optimistic |
---|---|---|---|
17869583.239 | 17599984.143 | 23163472.118 | 24907524.707 |
Ranking: Optimistic > StampedLock > ReentrantLock > ReentrantRWLock
Note that the write benchmark method of the Optimistic group is in fact identical to that of the StampedLock group, and these results differ from those in the book;
the only difference between the two groups is on the read side: the StampedLock group calls readLock() directly, while the Optimistic group calls tryOptimisticRead() and only falls back to readLock() when validation fails
Extended tests:
@GroupThreads(10)
@GroupThreads(16)
with the write benchmark methods set to @GroupThreads(4)
StampedLock summary:
ArrayBlockingQueue:
Blocking write | Non-blocking write | Blocking read | Non-blocking read |
---|---|---|---|
void put(E e) | boolean add(E e) | E take() | E poll() |
boolean offer(E e, long timeout, TimeUnit unit) | boolean offer(E e) | E poll(long timeout, TimeUnit unit) | E peek() |
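A small usage sketch (queue name and capacity are arbitrary) contrasting the blocking and non-blocking variants:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

public class ArrayBlockingQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(2);         // bounded, capacity 2

        queue.put("a");                                                         // blocking write: waits while full
        boolean accepted = queue.offer("b");                                    // non-blocking write: false if full
        boolean acceptedTimed = queue.offer("c", 100, TimeUnit.MILLISECONDS);   // waits up to 100 ms, then gives up

        String first = queue.take();                                            // blocking read: waits while empty
        String second = queue.poll();                                           // non-blocking read: null if empty
        String third = queue.poll(100, TimeUnit.MILLISECONDS);                  // timed blocking read
        System.out.println(accepted + " " + acceptedTimed + " " + first + " " + second + " " + third);
    }
}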
PriorityBlockingQueue: an unbounded blocking queue that orders the elements inserted into it according to some rule (natural ordering or a supplied Comparator)
Key point: elements are always taken from the head (the highest-priority element), and the underlying heap is re-adjusted on every insert and removal; a small sketch follows the table
Blocking write | Non-blocking write | Blocking read | Non-blocking read |
---|---|---|---|
(unbounded, writes never block) | boolean offer(E e) | E take() | E poll() |
 | | E poll(long timeout, TimeUnit unit) | E peek() |
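A short sketch showing the priority ordering (the comparator and values are illustrative):

import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

public class PriorityBlockingQueueDemo {
    public static void main(String[] args) {
        // unbounded; elements are ordered by the supplied comparator (here: descending)
        PriorityBlockingQueue<Integer> queue =
                new PriorityBlockingQueue<>(11, Comparator.reverseOrder());

        queue.offer(3);   // offer() never blocks because the queue is unbounded
        queue.offer(1);
        queue.offer(2);

        System.out.println(queue.poll()); // 3 - the head is always the highest-priority element
        System.out.println(queue.poll()); // 2
        System.out.println(queue.poll()); // 1
    }
}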
LinkedBlockingQueue: an optionally bounded FIFO blocking queue backed by a linked list
Optionally bounded: a bounded queue is created by passing a capacity to the constructor;
without a capacity it is effectively unbounded, defaulting to Integer.MAX_VALUE (sketched below)
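Sketch of the two constructor forms referred to above:

import java.util.concurrent.LinkedBlockingQueue;

public class LinkedBlockingQueueDemo {
    public static void main(String[] args) {
        // bounded: the capacity is supplied explicitly
        LinkedBlockingQueue<String> bounded = new LinkedBlockingQueue<>(1024);

        // "unbounded": no capacity supplied, defaults to Integer.MAX_VALUE
        LinkedBlockingQueue<String> unbounded = new LinkedBlockingQueue<>();

        System.out.println(bounded.remainingCapacity());   // 1024
        System.out.println(unbounded.remainingCapacity()); // 2147483647
    }
}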
DelayQueue: an unbounded blocking queue whose elements become available for consumption only after their delay has expired
Elements stored in it must implement the Delayed interface (see the sketch below)
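Because elements must implement Delayed, here is a minimal illustrative element type and usage (names are not from the book):

import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class DelayQueueDemo {
    // An element becomes available only after its deadline has passed.
    static class DelayedTask implements Delayed {
        final String name;
        final long deadline; // absolute deadline in nanoseconds

        DelayedTask(String name, long delayMillis) {
            this.name = name;
            this.deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMillis);
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(deadline - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        DelayQueue<DelayedTask> queue = new DelayQueue<>();
        queue.offer(new DelayedTask("later", 500));
        queue.offer(new DelayedTask("sooner", 100));
        // take() blocks until the element with the smallest remaining delay has expired
        System.out.println(queue.take().name); // prints "sooner" after roughly 100 ms
        System.out.println(queue.take().name); // prints "later" after roughly 500 ms
    }
}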
SynchronousQueue: a queue with no notion of capacity; every write (put) must wait (block) until another thread performs the corresponding removal, as sketched below
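A small sketch (thread and value names are illustrative): put() blocks until another thread performs the matching take():

import java.util.concurrent.SynchronousQueue;

public class SynchronousQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        SynchronousQueue<String> queue = new SynchronousQueue<>();

        Thread consumer = new Thread(() -> {
            try {
                // take() blocks until a producer hands over an element
                System.out.println("got: " + queue.take());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // put() blocks until the consumer above is ready to take the element
        queue.put("handoff");
        consumer.join();
    }
}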
LinkedBlockingDeque: a double-ended blocking queue backed by a linked list; elements can be inserted, read and removed at both the tail and the head (see the sketch below)
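Sketch of the double-ended operations mentioned above:

import java.util.concurrent.LinkedBlockingDeque;

public class LinkedBlockingDequeDemo {
    public static void main(String[] args) {
        LinkedBlockingDeque<String> deque = new LinkedBlockingDeque<>();

        deque.offerFirst("head");              // insert at the head
        deque.offerLast("tail");               // insert at the tail

        System.out.println(deque.pollFirst()); // "head" - remove from the head
        System.out.println(deque.pollLast());  // "tail" - remove from the tail
    }
}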
LinkedTransferQueue: an implementation of the TransferQueue interface; an unbounded queue with FIFO ordering (sketched below)
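Sketch contrasting transfer(), which waits for a consumer, with offer(), which returns immediately (names are illustrative):

import java.util.concurrent.LinkedTransferQueue;

public class LinkedTransferQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        LinkedTransferQueue<String> queue = new LinkedTransferQueue<>();

        Thread consumer = new Thread(() -> {
            try {
                System.out.println("got: " + queue.take());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        queue.transfer("direct");        // blocks until a consumer has received the element
        consumer.join();

        queue.offer("buffered");         // enqueues immediately, FIFO, no waiting
        System.out.println(queue.poll());
    }
}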
The following program sets up 10 threads reading and writing concurrently (5 threads insert at the tail of the queue, 5 threads read and remove from the head)
It compares ConcurrentLinkedQueue with SynchronizedLinkedList, a hand-rolled concurrent queue that wraps a LinkedList with synchronized
@Warmup(iterations = 10)
@Measurement(iterations = 10)
@Fork(1)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Group)
public class ConcurrentLinkedQueueVsSynchronizedList {
private SynchronizedLinkedList synchronizedList;
private ConcurrentLinkedQueue<String> concurrentLinkedQueue;
private final static String DATA = "TEST";
private final static Object LOCK = new Object();
private static class SynchronizedLinkedList{
private LinkedList<String> list = new LinkedList<>();
void addLast(String element){
synchronized (LOCK){
list.addLast(element);
}
}
String removeFirst(){
synchronized (LOCK){
if(list.isEmpty()){
return null;
}
return list.removeFirst();
}
}
}
@Setup(Level.Iteration)
public void setUp(){
synchronizedList = new SynchronizedLinkedList();
concurrentLinkedQueue = new ConcurrentLinkedQueue<>();
}
@Group("sync")
@Benchmark
@GroupThreads(5)
public void synchronizedListAdd(){
synchronizedList.addLast(DATA);
}
@Group("sync")
@Benchmark
@GroupThreads(5)
public String synchronizedListGet(){
return synchronizedList.removeFirst();
}
@Group("concurrent")
@Benchmark
@GroupThreads(5)
public void concurrentLinkedQueueAdd(){
concurrentLinkedQueue.offer(DATA);
}
@Group("concurrent")
@Benchmark
@GroupThreads(5)
public String concurrentLinkedQueueGet(){
return concurrentLinkedQueue.poll();
}
public static void main(String[] args) throws RunnerException {
final Options opt = new OptionsBuilder()
.include(ConcurrentLinkedQueueVsSynchronizedList.class.getSimpleName())
.build();
new Runner(opt).run();
}
}
Results:
Benchmark Mode Cnt Score Error Units
MemoryTest.ConcurrentLinkedQueueVsSynchronizedList.concurrent avgt 10 0.534 ± 0.051 us/op
MemoryTest.ConcurrentLinkedQueueVsSynchronizedList.concurrent:concurrentLinkedQueueAdd avgt 10 0.643 ± 0.055 us/op
MemoryTest.ConcurrentLinkedQueueVsSynchronizedList.concurrent:concurrentLinkedQueueGet avgt 10 0.425 ± 0.060 us/op
MemoryTest.ConcurrentLinkedQueueVsSynchronizedList.sync avgt 10 0.773 ± 0.029 us/op
MemoryTest.ConcurrentLinkedQueueVsSynchronizedList.sync:synchronizedListAdd avgt 10 0.508 ± 0.018 us/op
MemoryTest.ConcurrentLinkedQueueVsSynchronizedList.sync:synchronizedListGet avgt 10 1.038 ± 0.043 us/op
Conclusion: the concurrent queue is implemented with a lock-free (non-blocking) algorithm, so under concurrency its read and write performance is better than a LinkedList synchronized with the synchronized keyword
The lock-free algorithm:
For example, the core of its offer() method (from the JDK source):
if (p.casNext(null, newNode)) {
// Successful CAS is the linearization point
// for e to become an element of this queue,
// and for newNode to become "live".
if (p != t) // hop two nodes at a time
casTail(t, newNode); // Failure is OK.
return true;
}