背景:
ClickHouse是一个用于在线分析处理(OLAP)的列式数据库管理系统,它以其高速查询性能和灵活的数据分析能力而闻名。然而,作为一款高性能数据库系统,ClickHouse对数据插入的管理同样严格。最近,我们在日常监控中遇到了一个关于max_partitions_per_insert_block的错误,这可能会对数据插入操作产生影响,影响整体数据库性能。
问题描述:
近期clickhouse日志监控告警,查看错误信息如下:
2024.04.26 09:34:13.309417 [ 18490 ] {} <Error> xxx.xxxxx.xxxx: Code: 252. DB::Exception:
Received from xxx.xxx.xxx.xxx:9000. DB::Exception: Too many partitions for single INSERT
block (more than 100).The limit is controlled by 'max_partitions_per_insert_block' setting.
Large number of partitions is a common misconception. It will lead to severe negative
performance impact, including slow server startup, slow INSERT queries and slow SELECT queries.
Recommended total number of partitions for a table is under 1000..10000. Please note, that
partitioning is not intended to speed up SELECT queries (ORDER BY key is sufficient to make
range queries fast). Partitions are intended for data manipulation (DROP PARTITION, etc)..
这表明插入操作涉及的分区数量超过了系统允许的最大值。ClickHouse默认每次插入操作可以涉及的最大分区数为100,这是为了避免由于分区数量过多带来的性能负担。
解决步骤:
确认当前设置:
首先,通过ClickHouse的SQL接口确认当前max_partitions_per_insert_block的设置值:
show settings like 'max_partitions_per_insert_block';
┌─name────────────────────────────┬─type───┬─value─┐
│ max_partitions_per_insert_block │ UInt64 │ 100 │
└─────────────────────────────────┴────────┴───────┘
修改配置文件:
如果默认的分区限制值过低,不满足业务需求,可以通过修改ClickHouse的配置文件users.xml来调整这一设置。增加max_partitions_per_insert_block的值可以容许每次插入操作涉及更多的分区:
在users.xml中找到profiles->default->max_partitions_per_insert_block
<profiles>
<!-- Default settings. -->
<default>
<!-- How to choose between replicas during distributed query processing.
random - choose random replica from set of replicas with minimum number of errors
nearest_hostname - from set of replicas with minimum number of errors, choose replica
with minimum number of different symbols between replica's hostname and local hostname
(Hamming distance).
in_order - first live replica is chosen in specified order.
first_or_random - if first replica one has higher number of errors, pick a random one from replicas with minimum number of errors.
-->
<load_balancing>random</load_balancing>
<!-- 添加如下配置 -->
<max_partitions_per_insert_block>500</max_partitions_per_insert_block>
<!-- 添加结束 -->
</default>
<!-- Profile that allows only read queries. -->
<readonly>
<readonly>1</readonly>
</readonly>
</profiles>
修改后,保存并关闭文件。此设置更改无需重启ClickHouse服务即可生效。
验证修改:
更改设置后,重新检查以确认新的设置已经生效:
show settings like 'max_partitions_per_insert_block';
┌─name────────────────────────────┬─type───┬─value─┐
│ max_partitions_per_insert_block │ UInt64 │ 500 │
└─────────────────────────────────┴────────┴───────┘
总结:
通过适当调整max_partitions_per_insert_block的设置,我们可以灵活管理数据插入过程中涉及的分区数,以优化性能和避免可能的错误。此类配置的调整应根据实际的业务需求和系统表现进行,过多的分区虽然可以提高数据组织的灵活性,但也可能对系统性能产生负面影响。因此,管理者应该在调整此类参数时保持审慎,确保改动能够真正符合业务的长远发展和系统的稳定运行。
「欢迎关注我们的公众号,获取更多技术分享与经验交流。」




