ESM v0.5.0 has been released. In tests against a 3-node cluster, online Elasticsearch data import/export reached about 10 million documents per minute.
Features:
Added buffer_count to control memory usage and avoid OOM caused by heavy concurrency
Added a compression option that GZIP-compresses HTTP request traffic
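How a buffer count can cap memory is easiest to see with a bounded queue between the reader (scroll) and the writers (bulk). The sketch below is in Python for brevity and is not ESM's implementation (ESM itself is written in Go); `run_pipeline` and `consume` are hypothetical names. When the queue is full the producer blocks, so at most `buffer_count` documents wait in memory no matter how many workers run.

```python
import queue
import threading


def run_pipeline(docs, buffer_count, workers):
    """Move docs through a bounded queue; return (delivered, peak queue depth)."""
    buf = queue.Queue(maxsize=buffer_count)  # put() blocks once the queue is full
    delivered = []
    lock = threading.Lock()
    peak = 0
    done = object()  # sentinel that tells a worker to stop

    def consume():
        while True:
            item = buf.get()
            if item is done:
                return
            with lock:
                delivered.append(item)

    threads = [threading.Thread(target=consume) for _ in range(workers)]
    for t in threads:
        t.start()

    for doc in docs:
        buf.put(doc)  # back-pressure: waits while the buffer is full
        peak = max(peak, buf.qsize())
    for _ in threads:
        buf.put(done)  # one sentinel per worker
    for t in threads:
        t.join()
    return delivered, peak


if __name__ == "__main__":
    delivered, peak = run_pipeline(range(100_000), buffer_count=1000, workers=20)
    print(len(delivered), peak)
```

A larger buffer lets the reader run further ahead of slow writers at the cost of memory; a smaller one trades throughput for a tighter memory bound.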
Improvements:
Performance optimizations: buffers are reused and FasthttpClient is now used, which increases throughput
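Buffer reuse means serializing each batch into a buffer taken from a pool rather than allocating a new one per request. The following is a minimal Python sketch of that pattern only; `BufferPool` and `encode_batches` are hypothetical names, and the post does not say how ESM implements reuse internally (in Go the pattern is commonly expressed with `sync.Pool`).

```python
import io


class BufferPool:
    """Hand out BytesIO buffers and take them back for reuse."""

    def __init__(self):
        self._free = []
        self.allocations = 0  # how many buffers were ever created

    def get(self):
        if self._free:
            return self._free.pop()
        self.allocations += 1
        return io.BytesIO()

    def put(self, buf):
        buf.seek(0)
        buf.truncate()  # discard old content before the next use
        self._free.append(buf)


def encode_batches(batches, pool):
    """Serialize each batch of lines into a pooled buffer; return the payloads."""
    payloads = []
    for batch in batches:
        buf = pool.get()
        for line in batch:
            buf.write(line.encode("utf-8"))
            buf.write(b"\n")
        payloads.append(buf.getvalue())  # copy out before returning the buffer
        pool.put(buf)
    return payloads
```

With a single sequential caller, every batch after the first reuses the same buffer, so `allocations` stays at 1.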
Download:
https://github.com/medcl/esm/releases/tag/v0.5.0
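Performance data:
On a 3-node cluster (3 × c5d.4xlarge: 16 cores, 32 GB, 10 Gbps), using ESM to export and re-import 10 million NGINX access-log documents took only 55 seconds. When exporting to a separate, independent cluster, throughput may be higher.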
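The per-minute figure follows from the document count reported by the run (10,064,570 in the Scroll progress line) and the 55-second elapsed time. A quick check in Python, where `throughput` is just an illustrative helper name:

```python
def throughput(docs, seconds):
    """Return (docs per second, docs per minute)."""
    per_second = docs / seconds
    return per_second, per_second * 60


per_sec, per_min = throughput(10_064_570, 55)
print(f"{per_sec:,.0f} docs/s, {per_min / 1e6:.2f} million docs/min")
# prints: 182,992 docs/s, 10.98 million docs/min
```

So the 55-second run works out to slightly more than the 10 million documents per minute quoted in the summary.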
    ./esm -s https://localhost:8000 -d https://localhost:8000 -x kibana_sample_data_logs -y logs-test -m elastic:medcl123 -n elastic:medcl123 --regenerate_id --repeat_times=5
    ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs-test -y logs-test1 -m elastic:medcl123 -n elastic:medcl123 --regenerate_id -w 20 --sliced_scroll_size=20 -b 20 --repeat_times=500
    ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs-test1 -y logs-test -m elastic:medcl123 -n elastic:medcl123 --regenerate_id -w 20 --sliced_scroll_size=20 -b 20 --repeat_times=500
    [12-19 06:29:40] [INF] [main.go:537,main] data migration finished.

    root@ip-172-31-13-181:/tmp# ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 40 --sliced_scroll_size=60 -b 5 --buffer_count=2000000 --regenerate_id
    [12-19 06:31:20] [INF] [main.go:506,main] start data migration..
    Scroll 10064570 / 10064570 [============================================] 100.00% 55s
    Bulk 10062602 / 10064570 [=============================================] 99.98% 55s
    [12-19 06:32:15] [INF] [main.go:537,main] data migration finished.
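An introduction to using ESM was given at an event last weekend.
Event video: (embedded in the original post)
The complete demo script follows: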
    # Generate test data
    ./esm -s https://localhost:8000 -d https://localhost:8000 -x kibana_sample_data_logs -y logs-test -m elastic:medcl123 -n elastic:medcl123 --regenerate_id --repeat_times=5
    ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs-test -y logs-test1 -m elastic:medcl123 -n elastic:medcl123 --regenerate_id -w 20 --sliced_scroll_size=20 -b 20 --repeat_times=500
    ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs-test1 -y logs-test -m elastic:medcl123 -n elastic:medcl123 --regenerate_id -w 20 --sliced_scroll_size=20 -b 20 --repeat_times=500

    # Migrate data
    ./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123

    # Set the number of workers
    ./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 -b 5

    # Tune the buffer
    ./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 -b 5 --buffer_count=1000000

    # Increase the slice count
    ./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000

    # Optimize the index
    DELETE _template/logs
    GET _template/logs
    PUT _template/logs
    {
      "order": 0,
      "index_patterns": ["logs*"],
      "settings": {
        "codec": "default",
        "index": {
          "number_of_shards": "12",
          "number_of_replicas": "0",
          "refresh_interval": "-1",
          "translog.sync_interval": "30s",
          "translog.durability": "async",
          "translog.flush_threshold_size": "10g"
        }
      },
      "mappings": {
        "dynamic_templates": [
          {
            "strings": {
              "mapping": {"ignore_above": 256, "type": "keyword"},
              "match_mapping_type": "string"
            }
          }
        ]
      },
      "aliases": {}
    }
    DELETE logs122
    PUT logs122
    GET logs122
    POST logs122/_refresh
    GET logs122/_search
    GET _cat/indices

    # Enable compression
    ./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000

    # Auto-generate IDs
    ./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000 --compress true --regenerate_id

    # Go through the gateway
    ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000 --regenerate_id

    root@ip-172-31-13-181:/tmp# ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000 --regenerate_id
    [12-19 06:10:53] [INF] [main.go:506,main] start data migration..
    Scroll 10064570 / 10064570 [============================================] 100.00% 1m5s
    Bulk 10062580 / 10064570 [=============================================] 99.98% 1m5s
    [12-19 06:11:58] [INF] [main.go:537,main] data migration finished.

    root@ip-172-31-13-181:/tmp# ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000 --regenerate_id
    [12-19 06:14:29] [INF] [main.go:506,main] start data migration..
    Scroll 10064570 / 10064570 [============================================] 100.00% 1m4s
    Bulk 10062586 / 10064570 [=============================================] 99.98% 1m4s
    [12-19 06:15:33] [INF] [main.go:537,main] data migration finished.

    PUT _template/logs
    {
      "order": 0,
      "index_patterns": ["logs*"],
      "settings": {
        "codec": "default",
        "index": {
          "number_of_shards": "12",
          "number_of_replicas": "0",
          "refresh_interval": "-1",
          "translog.sync_interval": "90s",
          "translog.durability": "async",
          "translog.flush_threshold_size": "10g"
        }
      },
      "mappings": {
        "dynamic_templates": [
          {
            "strings": {
              "mapping": {"ignore_above": 256, "type": "keyword"},
              "match_mapping_type": "string"
            }
          }
        ]
      },
      "aliases": {}
    }

    root@ip-172-31-13-181:/tmp# ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 40 --sliced_scroll_size=60 -b 5 --buffer_count=2000000 --regenerate_id
    Scroll 1367104 / 1 [============================================] 136710400.00% 2562047h47m16s
    [12-19 06:17:46] [INF] [main.go:506,main] start data migration..
    Scroll 10064570 / 10064570 [============================================] 100.00% 56s
    Bulk 10062603 / 10064570 [=============================================] 99.98% 56s
    [12-19 06:18:42] [INF] [main.go:537,main] data migration finished.

    PUT _template/logs
    {
      "order": 0,
      "index_patterns": ["logs*"],
      "settings": {
        "codec": "default",
        "index": {
          "number_of_shards": "24",
          "number_of_replicas": "0",
          "refresh_interval": "-1",
          "translog.sync_interval": "90s",
          "translog.durability": "async",
          "translog.flush_threshold_size": "10g"
        }
      },
      "mappings": {
        "dynamic_templates": [
          {
            "strings": {
              "mapping": {"ignore_above": 256, "type": "keyword"},
              "match_mapping_type": "string"
            }
          }
        ]
      },
      "aliases": {}
    }

    [12-19 06:29:40] [INF] [main.go:537,main] data migration finished.
    root@ip-172-31-13-181:/tmp# ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 40 --sliced_scroll_size=60 -b 5 --buffer_count=2000000 --regenerate_id
    Scroll 1367104 / 1 [============================================] 136710400.00% 2562047h47m16s
    logs1kw
    [12-19 06:31:20] [INF] [main.go:506,main] start data migration..
    Scroll 10064570 / 10064570 [============================================] 100.00% 55s
    Bulk 10062602 / 10064570 [=============================================] 99.98% 55s
    [12-19 06:32:15] [INF] [main.go:537,main] data migration finished.
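The compression step in the script GZIP-compresses HTTP request traffic. To give a feel for why that helps with log data, here is a small Python sketch that builds a `_bulk`-style NDJSON body and compresses it. It is not ESM code: `build_bulk_body` and `compress_body` are hypothetical helpers, and the sample document is made up for illustration.

```python
import gzip
import json


def build_bulk_body(index, docs):
    """Build an NDJSON body in the Elasticsearch _bulk format."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))  # source line
    return ("\n".join(lines) + "\n").encode("utf-8")


def compress_body(body):
    """GZIP-compress a request body and return it with the matching header."""
    return gzip.compress(body), {"Content-Encoding": "gzip"}


# Made-up access-log document, repeated to mimic a bulk batch.
docs = [{"clientip": "10.0.0.1", "request": "/index.html", "response": 200}] * 1000
raw = build_bulk_body("logs122", docs)
packed, headers = compress_body(raw)
print(len(raw), len(packed))
```

Access logs are highly repetitive, so the compressed body is typically much smaller than the raw one. That trades some CPU on both ends for less network traffic, which matters most when source and destination are separate clusters.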
This article is reposted from 弹性搜索.




