• Verifying HBase Cluster Replication


    0. Prerequisites
    Suppose two HBase pseudo-distributed clusters have both been started as follows.

    Relevant parameters in hbase-site.xml:

    parameter                               source    destination
    hbase.zookeeper.quorum                  macos     ubuntu
    hbase.zookeeper.property.clientPort     2181      2181
    zookeeper.znode.parent                  /hbase    /hbase
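
    A quick way to double-check these values on each host is to grep them out of the config file (a hypothetical check, assuming hbase-site.xml sits under conf/ in the HBase install directory):

    $ cd $HOME_HBASE
    $ grep -A1 -E 'hbase.zookeeper.quorum|hbase.zookeeper.property.clientPort|zookeeper.znode.parent' conf/hbase-site.xml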

    1. Create table for replication
    1) Start the hbase shell on the source cluster and create a table

    $ cd $HOME_HBASE
    $ bin/hbase shell
    > create 'peTable', {NAME => 'info0', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536', METADATA => {'IN_MEMORY_COMPACTION' => 'NONE'}}

    2) Create exactly the same table on the destination cluster (a quick way to compare the two schemas is shown below)
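
    An easy way to compare the two schemas is to run describe in each cluster's hbase shell and diff the output (an optional sanity check; edits cannot be applied on the peer if the column family names differ):

    > describe 'peTable'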

    2. Add the destination cluster as a peer in the source cluster's hbase shell

    > add_peer 'ubt_pe', CLUSTER_KEY => "ubuntu:2181:/hbase", TABLE_CFS => { "peTable" => []}
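
    After adding the peer, list_peers in the source shell confirms it was registered and enabled; disable_peer / enable_peer can pause and resume shipping to it later if needed (standard hbase shell replication commands; the output columns vary between HBase versions):

    > list_peers
    > disable_peer 'ubt_pe'
    > enable_peer 'ubt_pe'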

    3. Enable the table for replication in the source cluster's hbase shell

    > enable_table_replication 'peTable'
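
    enable_table_replication sets REPLICATION_SCOPE to '1' on the table's column families, which is what actually marks the edits for shipping; describe confirms the change:

    > describe 'peTable'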

    4. Put data using the HBase PerformanceEvaluation tool

    $ cd $HOME_HBASE
    $ bin/hbase pe --table=peTable --nomapred --valueSize=100 randomWrite 1
    2023-09-08 19:57:55,256 INFO [main] hbase.PerformanceEvaluation: RandomWriteTest test run options={"cmdName":"randomWrite","nomapred":true,"filterAll":false,"startRow":0,"size":0.0,"perClientRunRows":1048576,"numClientThreads":1,"totalRows":1048576,"measureAfter":0,"sampleRate":1.0,"traceRate":0.0,"tableName":"peTable","flushCommits":true,"writeToWAL":true,"autoFlush":false,"oneCon":false,"connCount":-1,"useTags":false,"noOfTags":1,"reportLatency":false,"multiGet":0,"multiPut":0,"randomSleep":0,"inMemoryCF":false,"presplitRegions":0,"replicas":1,"compression":"NONE","bloomType":"ROW","blockSize":65536,"blockEncoding":"NONE","valueRandom":false,"valueZipf":false,"valueSize":100,"period":104857,"cycles":1,"columns":1,"families":1,"caching":30,"latencyThreshold":0,"addColumns":true,"inMemoryCompaction":"NONE","asyncPrefetch":false,"cacheBlocks":true,"scanReadType":"DEFAULT","bufferSize":"2097152"}
    ...
    2023-09-08 19:57:58,476 INFO [TestClient-0] hbase.PerformanceEvaluation: row [start=0, current=104857, last=1048576], latency [mean=19.87, min=0.00, max=328487.00, stdDev=1355.87, 95th=1.00, 99th=8.00]
    2023-09-08 19:57:59,679 INFO [TestClient-0] hbase.PerformanceEvaluation: row [start=0, current=209714, last=1048576], latency [mean=15.34, min=0.00, max=328487.00, stdDev=1026.36, 95th=1.00, 99th=4.00]
    ...
    2023-09-08 19:58:10,520 INFO [TestClient-0] hbase.PerformanceEvaluation: row [start=0, current=1048570, last=1048576], latency [mean=13.17, min=0.00, max=328487.00, stdDev=780.16, 95th=0.00, 99th=1.00]
    2023-09-08 19:58:10,569 INFO [TestClient-0] hbase.PerformanceEvaluation: Test : RandomWriteTest, Thread : TestClient-0
    2023-09-08 19:58:10,577 INFO [TestClient-0] hbase.PerformanceEvaluation: Latency (us) : mean=13.17, min=0.00, max=328487.00, stdDev=780.16, 50th=0.00, 75th=0.00, 95th=0.00, 99th=1.00, 99.9th=19.00, 99.99th=28853.39, 99.999th=58579.15
    2023-09-08 19:58:10,577 INFO [TestClient-0] hbase.PerformanceEvaluation: Num measures (latency) : 1048575
    2023-09-08 19:58:10,584 INFO [TestClient-0] hbase.PerformanceEvaluation: Mean = 13.17
    Min = 0.00
    Max = 328487.00
    StdDev = 780.16
    50th = 0.00
    75th = 0.00
    95th = 0.00
    99th = 1.00
    99.9th = 19.00
    99.99th = 28853.39
    99.999th = 58579.15
    2023-09-08 19:58:10,584 INFO [TestClient-0] hbase.PerformanceEvaluation: No valueSize statistics available
    2023-09-08 19:58:10,586 INFO [TestClient-0] hbase.PerformanceEvaluation: Finished class org.apache.hadoop.hbase.PerformanceEvaluation$RandomWriteTest in 14286ms at offset 0 for 1048576 rows (9.24 MB/s)
    2023-09-08 19:58:10,586 INFO [TestClient-0] hbase.PerformanceEvaluation: Finished TestClient-0 in 14286ms over 1048576 rows
    2023-09-08 19:58:10,586 INFO [main] hbase.PerformanceEvaluation: [RandomWriteTest] Summary of timings (ms): [14286]
    2023-09-08 19:58:10,595 INFO [main] hbase.PerformanceEvaluation: [RandomWriteTest duration ] Min: 14286ms Max: 14286ms Avg: 14286ms
    2023-09-08 19:58:10,595 INFO [main] hbase.PerformanceEvaluation: [ Avg latency (us)] 13
    2023-09-08 19:58:10,596 INFO [main] hbase.PerformanceEvaluation: [ Avg TPS/QPS] 73399 row per second
    2023-09-08 19:58:10,596 INFO [main] client.AsyncConnectionImpl: Connection has been closed by main.

    Note: the usage/help text of PerformanceEvaluation can be shown with:

    $ bin/hbase pe
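
    Before comparing row counts it is worth making sure the peer has caught up; status 'replication' in the source cluster's shell reports the replication queues and lag per regionserver (the exact fields differ between HBase versions):

    > status 'replication'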

    5. Count rows on source and peer
    1) In the source cluster's hbase shell

    > count 'peTable'
    Current count: 1000, row: 00000000000000000000001563
    Current count: 2000, row: 00000000000000000000003160
    ...
    Current count: 663000, row: 00000000000000000001048457
    663073 row(s)
    Took 12.9970 seconds

    2) In the peer cluster's hbase shell

    > count 'peTable'
    Current count: 1000, row: 00000000000000000000001563
    Current count: 2000, row: 00000000000000000000003160
    ...
    Current count: 663000, row: 00000000000000000001048457
    663073 row(s)
    Took 7.1883 seconds
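
    Both clusters report the same 663073 rows. The count is lower than the 1048576 writes because randomWrite picks row keys at random within the key range, so colliding keys simply overwrite each other. For larger tables the shell count gets slow; the RowCounter MapReduce job is the usual alternative and can be run against each cluster in turn:

    $ bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter peTable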

    6. Verify replication by running the VerifyReplication MapReduce job on the source cluster

    $ cd $HOME_HBASE
    $ bin/hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication 'ubt_pe' 'peTable'
    2023-09-08 20:14:37,199 INFO [main] zookeeper.RecoverableZooKeeper: Process identifier=VerifyReplication connecting to ZooKeeper ensemble=localhost:2181
    ...
    2023-09-08 20:14:44,393 INFO [main] mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1694172104063_0001/
    2023-09-08 20:14:44,394 INFO [main] mapreduce.Job: Running job: job_1694172104063_0001
    2023-09-08 20:14:54,521 INFO [main] mapreduce.Job: Job job_1694172104063_0001 running in uber mode : false
    2023-09-08 20:14:54,524 INFO [main] mapreduce.Job: map 0% reduce 0%
    2023-09-08 20:20:18,907 INFO [main] mapreduce.Job: map 100% reduce 0%
    2023-09-08 20:20:19,924 INFO [main] mapreduce.Job: Job job_1694172104063_0001 completed successfully
    2023-09-08 20:20:20,040 INFO [main] mapreduce.Job: Counters:
        ...
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=321487
        Total vcore-milliseconds taken by all map tasks=321487
        Total megabyte-milliseconds taken by all map tasks=329202688
        Map-Reduce Framework
            Map input records=663073
            Map output records=0
            Input split bytes=105
            Spilled Records=0
            Failed Shuffles=0
            Merged Map outputs=0
            GC time elapsed (ms)=707
            CPU time spent (ms)=0
            Physical memory (bytes) snapshot=0
            Virtual memory (bytes) snapshot=0
            Total committed heap usage (bytes)=114819072
        HBaseCounters
            BYTES_IN_REMOTE_RESULTS=103439388
            BYTES_IN_RESULTS=103439388
            MILLIS_BETWEEN_NEXTS=313921
            NOT_SERVING_REGION_EXCEPTION=0
            REGIONS_SCANNED=1
            REMOTE_RPC_CALLS=60
            REMOTE_RPC_RETRIES=0
            ROWS_FILTERED=17
            ROWS_SCANNED=663073
            RPC_CALLS=60
            RPC_RETRIES=0
        org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier$Counters
            GOODROWS=663073
        File Input Format Counters
            Bytes Read=0
        File Output Format Counters
            Bytes Written=0
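
    GOODROWS equals ROWS_SCANNED, and no BADROWS, ONLY_IN_SOURCE_TABLE_ROWS, ONLY_IN_PEER_TABLE_ROWS or CONTENT_DIFFERENT_ROWS counters show up, so every row matched between the two clusters. If a mismatch is ever suspected for a recent window only, the comparison can be narrowed by timestamp (flag names as reported by the tool's own help; the timestamps below are placeholders):

    $ bin/hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication --starttime=1694168000000 --endtime=1694175000000 'ubt_pe' 'peTable'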

    Note: the usage/help text of VerifyReplication can be shown with:

    $ bin/hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication --help

  • Original article: https://blog.csdn.net/sun_xo/article/details/132766392