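ClickHouse is a column-oriented database management system (DBMS) for online analytical processing (OLAP). Its tables fall into two broad kinds: distributed tables, which store no data themselves and forward reads and writes to the shards, and local tables, which hold the actual data on each node.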
Of course, ClickHouse has other kinds of tables as well, such as
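In ClickHouse, the replication mechanism is also known as "replicated tables", because it works at the table level rather than at the cluster level (as in HDFS). Within a single server node, therefore, each table chooses its engine independently, and each table's shards and replicas are independent of every other table's. At present, the engines that support replication are those in the ReplicatedMergeTree family.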

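Now for writing data.
Should data be written directly to the local tables or to the distributed table? The articles I have read all recommend writing directly to the local tables, for example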
Sharding key in Distributed table is used only at INSERT.
For SELECTs, sharding key does not make sense and Distributed tables always query all shards.
Insertion to local tables is more efficient and more flexible than insertion to Distributed table.
It is more efficient because it avoids excessive copying of temporary data.
It is more flexible because you can use any sophisticated sharding schemas, not only simple sharding by modulo of division.
Insertion to local tables require more logic on your client application and can be more difficult to use. But also it is conceptually more simple.
If your queries rely on some assumptions on data distribution, like queries that use IN or JOIN (joining co-located data) instead of GLOBAL IN, GLOBAL JOIN, then you have to maintain correctness by yourself.
As the advice above says:
Of course, when the volume of inserted data is small, writing to either the local tables or the distributed table is fine.
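
To make the "more logic on your client application" in the quote concrete, here is a minimal sketch of client-side sharding in Python. It only groups rows into one batch per shard and does not talk to ClickHouse. The host names, the `user_id` sharding column, and the helper names `pick_shard` and `group_rows_by_shard` are all hypothetical, chosen for illustration. A stable CRC32 hash stands in for whatever sharding scheme you actually need.

```python
import zlib
from collections import defaultdict

# Hypothetical shard hosts; each is assumed to hold a local table
# with the same schema.
SHARD_HOSTS = ["ch-shard-0", "ch-shard-1", "ch-shard-2"]


def pick_shard(sharding_value, num_shards):
    """Map a sharding value to a shard index using a stable hash.

    zlib.crc32 is used instead of the built-in hash(), whose result
    for strings changes between interpreter runs.
    """
    return zlib.crc32(str(sharding_value).encode("utf-8")) % num_shards


def group_rows_by_shard(rows, key, hosts=SHARD_HOSTS):
    """Group rows into one batch per shard host, based on row[key]."""
    batches = defaultdict(list)
    for row in rows:
        host = hosts[pick_shard(row[key], len(hosts))]
        batches[host].append(row)
    return dict(batches)


rows = [
    {"user_id": 1, "event": "view"},
    {"user_id": 2, "event": "click"},
    {"user_id": 1, "event": "click"},
]
batches = group_rows_by_shard(rows, key="user_id")
for host, batch in sorted(batches.items()):
    # A real client would send one batched INSERT into the local table
    # on `host` here; this sketch only reports the batch sizes.
    print(host, len(batch))
```

In this sketch, rows that share a `user_id` always land on the same shard. That is the kind of data co-location the quote says you must maintain yourself if your queries use IN or JOIN rather than GLOBAL IN or GLOBAL JOIN. Note that with a plain modulo, changing the number of shards changes where existing keys map.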