
DCC and DCF Component Logs During CM Startup

openGauss 2025-01-07

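Overview of DCC and DCF

Introduction

DCC (Distributed Configuration Center) is a distributed configuration center that stores data as key-value pairs and manages configuration information across the cluster. openGauss CM relies on DCC for distributed storage and retrieval of configuration data, which gives cluster configuration management its high availability.

DCF (Distributed Consensus Framework) is a distributed consensus framework that replicates logs across multiple replicas using the Paxos consensus protocol. It supports either DN self-election and self-arbitration, or arbitration by CM. The DCF node roles are as follows: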

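| Replica type | DCF node role | Description | Access |
|---|---|---|---|
| Primary replica | Leader | Primary node; serves reads and writes | Read, write |
| Full-function replica | Follower | 1. Votes; 2. Can become Leader; 3. Nodes in the majority are equivalent to synchronous standbys | Read-only |
| Read-only replica | Passive | 1. Does not vote; 2. Can become a Follower replica; 3. Equivalent to an asynchronous standby | Read-only |
| Log replica | Logger | 1. Votes, provides log replication, and can help other replicas recover; 2. Cannot change type; 3. Holds logs only, with no data tables | Not allowed |
| Cascade replica | Cascade Follower | 1. Does not vote; 2. Can become a Follower replica, and fetches logs only from a standby; 3. All logs it holds have reached a majority in the cluster | Read-only |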
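
The role table can also be written down as a small lookup structure, which makes it easier to reason about which nodes count toward a majority. The sketch below is illustrative only: `DcfRole`, `ROLES`, and `majority` are names invented for this example and are not part of the DCF API. The flags are transcribed from the table, except that the Leader is assumed to be a voting member, which the table does not state explicitly.

```python
from typing import NamedTuple


class DcfRole(NamedTuple):
    votes: bool        # takes part in elections
    promotes_to: str   # role it can convert to, "" if the table lists none
    access: str        # access mode from the table


# Transcribed from the role table; Leader voting is an assumption.
ROLES = {
    "Leader": DcfRole(True, "", "read/write"),
    "Follower": DcfRole(True, "Leader", "read-only"),
    "Passive": DcfRole(False, "Follower", "read-only"),
    "Logger": DcfRole(True, "", "not allowed"),
    "Cascade Follower": DcfRole(False, "Follower", "read-only"),
}


def majority(node_roles):
    """Return how many votes form a majority among the voting nodes."""
    voters = sum(1 for role in node_roles if ROLES[role].votes)
    return voters // 2 + 1


cluster = ["Leader", "Follower", "Follower"]
print(majority(cluster))  # three voters, so two votes are needed
```

For the three-node cluster used in this article, two votes are enough, which is consistent with the startup log further down where the election succeeds at vote_count=2.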


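In a traditional openGauss primary/standby deployment, DCF is started by the DCC component and handles CM arbitration. Its logs are also written under the DCC log directory, so when you troubleshoot a CM primary-election problem, check the DCC logs first.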

Commands

To operate on DCC through CM, use the cm_ctl ddb command.

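| Parameter | Description |
|---|---|
| --put [key] [value] | Inserts a key-value pair into DCC or share disk. If the key already exists, its value is updated. |
| --get [key] | Queries the value stored for key in DCC or share disk. |
| --delete [key] | Deletes the specified key-value pair from DCC or share disk. |
| --prefix | Added after get or delete to query or delete by fuzzy (prefix) match. |
| --cluster_info | Returns database instance information in DCC deployment mode. Not supported in share disk deployment mode. |
| --leader_info | Returns leader node information in DCC deployment mode. Not supported in share disk deployment mode. |
| --help, -h | Shows help for the DDB commands. |
| --version, -v | Shows the DCC version in DCC deployment mode. Not supported in share disk deployment mode. |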


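Note: traditional openGauss primary/standby clusters are all deployed in DCC mode.

For example:

  • cm_ctl ddb --cluster_info
    Queries cluster information.

  • cm_ctl ddb --leader_info
    Queries information about the leader node.

  • cm_ctl ddb --put [key] [value]
    Stores a key-value pair.

  • cm_ctl ddb --get [key]
    Returns the value for a key.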
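
When these commands are scripted, it helps to build the argument list in one place. The following is a minimal sketch, assuming cm_ctl is on the PATH of the cluster user. The helper name `ddb_args` is hypothetical, and only options from the parameter table are used. Placing --prefix directly after the key is an assumption based on the table's wording, so confirm it with cm_ctl ddb --help on your version.

```python
import shlex


def ddb_args(action, key=None, value=None, prefix=False):
    """Build the argv list for a `cm_ctl ddb` call (nothing is executed)."""
    args = ["cm_ctl", "ddb", f"--{action}"]
    if action in ("put", "get", "delete"):
        if key is None:
            raise ValueError(f"--{action} needs a key")
        args.append(key)
    if action == "put":
        if value is None:
            raise ValueError("--put needs a value")
        args.append(value)
    if prefix:
        if action not in ("get", "delete"):
            raise ValueError("--prefix only applies to --get and --delete")
        args.append("--prefix")
    return args


# The key and value below are made-up examples.
print(shlex.join(ddb_args("put", "/demo/key", "demo-value")))
print(shlex.join(ddb_args("get", "/demo", prefix=True)))
print(shlex.join(ddb_args("leader_info")))
```

To actually run a command, pass the returned list to `subprocess.run(args, check=True)` on a node where CM is installed.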

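Analysis of DCC Logs During CM Startup

During CM startup, the cm_server (cms) log contains a fragment like the following, which shows cms starting the DCC component and using DCC to store configuration information. The log path is $GAUSSLOG/cm/cm_server.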

    2024-12-02 00:04:20.444 tid=25703 MAIN LOG: successfully to get GAUSSLOG(/data/dcc/openGauss/install/log/dcc) from env.
    2024-12-02 00:04:20.444 tid=25703 MAIN LOG: cfg is [{"stream_id":1,"node_id":1,"ip":"192.168.0.229","port":16301,"role":"LEADER", "weight":1},{"stream_id":1,"node_id":2,"ip":"192.168.0.219","port":16301,"role":"FOLLOWER", "weight":1},{"stream_id":1,"node_id":3,"ip":"192.168.0.114","port":16301,"role":"FOLLOWER", "weight":1}], curIdx is 1, datapath is data/dcc/openGauss/install/cm, logPath is data/dcc/openGauss/install/log/dcc/cm/dcc.
    2024-12-02 00:04:29.679 tid=26137 LOG: [DccNotifyStatus] g_dbRole is 1, roleType is 1.
    2024-12-02 00:04:30.662 tid=25703 MAIN LOG: success to start dcc.
    2024-12-02 00:04:30.662 tid=25703 MAIN LOG: dccStr is 192.168.0.229:16301:zhangyaozhong:1:1:az1; 192.168.0.219:16301:ecs-6ac8:2:2:az1; 192.168.0.114:16301:wangchao:3:3:az1; .
    2024-12-02 00:04:30.662 tid=25703 MAIN LOG: get curidx(0) from server.
    2024-12-02 00:04:30.662 tid=26254 DCC_MONITOR LOG: Starting DCC monitor thread.
    2024-12-02 00:04:30.662 tid=26255 DCC_SET LOG: Starting DCC SET priority thread.
    2024-12-02 00:04:30.662 tid=26255 DCC_SET LOG: will set ELECTION_PRIORITY, and value is 100, ret is 0.
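
The `cfg is [...]` line is a quick way to confirm which nodes cm_server believes are in the DCC cluster. As a rough sketch, the JSON array can be pulled out of the log line and inspected. The function name `parse_dcc_cfg` and the regular expression are illustrative and are not part of CM.

```python
import json
import re

# Matches the JSON array printed after "cfg is" in the cm_server log.
CFG_RE = re.compile(r"cfg is (\[.*?\]), curIdx is (\d+)")


def parse_dcc_cfg(line):
    """Return (nodes, cur_idx) from a 'cfg is [...]' log line, or None."""
    match = CFG_RE.search(line)
    if match is None:
        return None
    return json.loads(match.group(1)), int(match.group(2))


# Shortened copy of the log line above (two of the three nodes, tail cut).
sample = (
    '2024-12-02 00:04:20.444 tid=25703 MAIN LOG: cfg is '
    '[{"stream_id":1,"node_id":1,"ip":"192.168.0.229","port":16301,'
    '"role":"LEADER", "weight":1},'
    '{"stream_id":1,"node_id":2,"ip":"192.168.0.219","port":16301,'
    '"role":"FOLLOWER", "weight":1}], curIdx is 1'
)

nodes, cur_idx = parse_dcc_cfg(sample)
for node in nodes:
    print(node["node_id"], node["ip"], node["port"], node["role"])
print("curIdx:", cur_idx)
```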

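The DCC run log, which also records the DCF startup process, is written to $GAUSSLOG/cm/dcc/run. Most lines carry a bracketed tag naming the module that produced them.

DCC consists of the following modules:

| Module | Description |
|---|---|
| API | Externally exposed interfaces |
| LOG | Log file management |
| EXC | Command execution |
| PARAM | Configuration settings |

DCF consists of the following modules:

| Module | Description |
|---|---|
| META | Metadata: manages cluster configuration and log information |
| MEC | Communication: provides data communication between nodes (TCP/SSL) |
| STG | Storage: persists log data and configuration data |
| ELC | Election: handles leader election, heartbeat maintenance, and role status notification |
| REP | Replication: replicates, commits, and applies logs |

A normal DCC startup log, including the DCF startup sequence, looks like this:
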
      |DCC|25703|INFO>[API] dcc init logger succeed. 
      |DCC|25703|INFO>[LOG] file '/data/dcc/openGauss/install/log/dcc/cm/dcc/run/dcc.rlog' is added
      |DCC|25735|INFO>[LOG] file '/data/dcc/openGauss/install/log/dcc/cm/dcc/profile/dcc.plog' is added
      |DCC|25703|INFO>[API] dcc db_startup succeed.
      |DCC|25703|INFO>[EXC]init watch group start
      |DCC|25703|INFO>[EXC]init watch group end
      |DCC|26139|INFO>[EXC LEASE] lease_expire thread started, tid:139935284332288, close:0
      |DCC|25703|INFO>[EXC] Set the local applied index:0 for starting DCC.
      |DCF|25703|INFO>[DCF]Logger init succeed
      |DCC|25703|INFO>[LOG] file '/data/dcc/openGauss/install/log/dcc/cm/dcc/oper/oper.log' is added
      |DCC|25703|INFO>[LOG] file '/data/dcc/openGauss/install/log/dcc/cm/dcc/debug/dcc.dlog' is added
      |DCF|25703|INFO>[META]Md init succeed, checksum:716019734
      |DCF|26141|INFO>[MEC]reactor thread started
      |DCF|26142|INFO>[MEC]reactor thread started
      |DCF|25703|INFO>[MEC]mec_init_ssl: ssl is enabled.
      |DCF|25703|INFO>[MEC]high msg_pool_extent=8, low msg_pool_extent=64
      |DCF|25703|INFO>[MEC]high msg_pool_extent=8, low msg_pool_extent=64
      |DCF|26144|INFO>[MEC]agent thread started, tid:139935278020352, close:0
      |DCF|25703|INFO>[MEC]connect to instance 2 channel id 0.
      |DCF|26145|INFO>[MEC]agent thread started, tid:139935276967680, close:0
      |DCF|26146|INFO>[MEC]agent thread started, tid:139935275915008, close:0
      |DCF|25703|INFO>[MEC]connect to instance 3 channel id 0.
      |DCF|26147|INFO>[MEC]agent thread started, tid:139935274862336, close:0
      |DCF|25703|INFO>[MEC]Mec init succeed
      |DCF|26144|INFO>[MEC]after cs_tcp_connect to host 192.168.0.219 port 16301.
      |DCF|26145|INFO>[MEC]after cs_tcp_connect to host 192.168.0.219 port 16301.
      |DCF|26144|INFO>[MEC]cs_open_tcp_link success.
      |DCF|26143|INFO>[MEC]mec_accept: received req, start accept...
      |DCF|26143|INFO>[MEC]mec_accept: start cs_ssl_accept...
      |DCF|26144|INFO>[MEC]after cs_connect to instance 2 channel id 0, priv 0.
      |DCF|26144|INFO>[MEC]connect to instance 2 channel id 0, priv 0 success.
      |DCF|26145|INFO>[MEC]cs_open_tcp_link success.
      |DCF|26143|INFO>[MEC]mec_accept: channel id 512 priv 0 receive ok.
      |DCF|26143|INFO>[MEC]mec_accept: received req, start accept...
      |DCF|26143|INFO>[MEC]mec_accept: start cs_ssl_accept...
      |DCF|26145|INFO>[MEC]after cs_connect to instance 2 channel id 0, priv 1.
      |DCF|26145|INFO>[MEC]connect to instance 2 channel id 0, priv 1 success.
      |DCF|26143|INFO>[MEC]mec_accept: channel id 512 priv 1 receive ok.
      |DCF|25703|INFO>[STG]Stg init succeed
      |DCF|25703|INFO>[ELC]stream 1 init, cur_node_id 1, vote_for 0, last_hb_time 0
      |DCF|25703|INFO>[ELC]elc_status_check_init ok.
      |DCF|25703|INFO>[ELC]Elc init succeed
      |DCF|26172|WARN>[ELC]heartbeat timeout, begin voting, stream_id=1, node_id=1
      |DCF|25703|INFO>rep_leader_init: flow_ctrl_type=1.
      |DCF|25703|INFO>[monitor]monitor init start.
      |DCF|25703|INFO>[monitor]dcf log path: /data/dcc/openGauss/install/cm/dcf_data
      |DCF|25703|INFO>[monitor]monitor init end.
      |DCF|26177|INFO>leader monitor thread start.
      |DCF|26172|INFO>[ELC]get vote from stream_id=1, node_id=1, term=1, vote_count=1
      |DCF|26180|INFO>[MEC]work thread started, tid:139935010559744, close:0
      |DCF|26172|INFO>[ELC]elc vote req broadcast, stream_id=1 candidate_id=1 candidate_term=1 last_log.term=0 last_log.index=0 vote_flag=0x1 work_mode=0
      |DCF|25703|INFO>rep_leader_init finished
      |DCF|25703|INFO>[REP]rep_init succeed
      |DCF|25703|INFO>init exception report
      |DCF|25703|INFO>[DCF]Tool init succeed
      |DCF|25703|INFO>dcf start succeed.
      |DCF|26184|INFO>[MEC]work thread started, tid:139935006349056, close:0
      |DCF|26184|INFO>[ELC]receive ack from node_id=2, stream_id=1, current_node=1, current_term=1, role=5, work_mode=0 peer_term=1, vote_granted=1 work_mode=0
      |DCF|26184|INFO>[ELC]get vote from stream_id=1, node_id=2, term=1, vote_count=2
      |DCF|26184|INFO>[ELC]pre-voting succeeded, stream_id=1, node_id=1, current_term=1
      |DCF|26184|INFO>[ELC]get vote from stream_id=1, node_id=1, term=2, vote_count=1
      |DCF|26184|INFO>[ELC]elc vote req broadcast, stream_id=1 candidate_id=1 candidate_term=2 last_log.term=0 last_log.index=0 vote_flag=0x0 work_mode=0
      |DCF|26184|INFO>[ELC]receive ack from node_id=2, stream_id=1, current_node=1, current_term=2, role=6, work_mode=0 peer_term=2, vote_granted=1 work_mode=0
      |DCF|26184|INFO>[ELC]get vote from stream_id=1, node_id=2, term=2, vote_count=2
      |DCF|26184|INFO>[ELC]election is successful, stream_id=1, node_id=1, current_term=2
      |DCF|26173|INFO>[ELC]best_prio:cur_node=1,my_group=0,my_prio=0,leader_group=0,best_group=0,best_prio=0,rcv_best_priority_node=0
      |DCF|26173|INFO>[ELC]max_prio_leader=0 force_vote=0 role=1
      |DCF|26173|INFO>rep_leader_reset finished
      |DCF|26143|INFO>[MEC]mec_accept: received req, start accept...
      |DCF|26143|INFO>[MEC]mec_accept: start cs_ssl_accept...
      |DCF|26143|INFO>[MEC]mec_accept: channel id 768 priv 0 receive ok.
      |DCF|26143|INFO>[MEC]mec_accept: received req, start accept...
      |DCF|26143|INFO>[MEC]mec_accept: start cs_ssl_accept...
      |DCF|26143|INFO>[MEC]mec_accept: channel id 768 priv 1 receive ok.
      |DCF|26147|INFO>[MEC]after cs_tcp_connect to host 192.168.0.114 port 16301.
      |DCF|26146|INFO>[MEC]after cs_tcp_connect to host 192.168.0.114 port 16301.
      |DCF|26147|INFO>[MEC]cs_open_tcp_link success.
      |DCF|26147|INFO>[MEC]after cs_connect to instance 3 channel id 0, priv 1.
      |DCF|26147|INFO>[MEC]connect to instance 3 channel id 0, priv 1 success.
      |DCF|26146|INFO>[MEC]cs_open_tcp_link success.
      |DCF|26146|INFO>[MEC]after cs_connect to instance 3 channel id 0, priv 0.
      |DCF|26146|INFO>[MEC]connect to instance 3 channel id 0, priv 0 success.
      |DCF|26199|INFO>[MEC]work thread started, tid:139934262359808, close:0
      |DCF|26200|INFO>[MEC]work thread started, tid:139934261307136, close:0
      |DCF|26212|INFO>[MEC]work thread started, tid:139934260254464, close:0
      |DCF|26175|INFO>[META]md_consensus_notify. node=0, src=1, key=0x0.
      |DCC|26137|INFO>[EXC LEASE] exc lease promote begin
      |DCC|26137|INFO>[EXC LEASE] exc lease promote end
      |DCC|25703|INFO>[EXC LEASE] exc lease promote begin
      |DCC|25703|INFO>[EXC LEASE] exc lease promote end
      |DCC|25703|INFO>[API] dcc init executor succeed.
      |DCC|25703|INFO>[API] dcc srv init session pool succeed.
      |DCC|25703|INFO>srv init sess apply mgr succeed.
      |DCC|25703|INFO>[API] dcc create srv instance and init sess apply mgr succeed.
      |DCC|25703|INFO>[API] dcc srv instance init succeed.
      |DCC|25703|INFO>[API] dcc srv start succeed.
      |DCC|26253|INFO>srv sess apply thread started, tid:139934254868224, close:0
      |DCC|25703|INFO>[PARAM] set dcf param LOG_LEVEL value RUN_ERR|RUN_WAR|DEBUG_ERR|OPER|RUN_INF|PROFILE success
      |DCC|25703|INFO>[PARAM] set dcf param LOG_BACKUP_FILE_COUNT value 10 success
      |DCC|25703|INFO>[PARAM] set dcf param MAX_LOG_FILE_SIZE value 10 success
      |DCC|25703|INFO>[PARAM] set dcc param LOG_SUPPRESS_ENABLE value 1 success
      |DCC|25703|INFO>[PARAM] set dcf param ELECTION_TIMEOUT value 3 success 
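
Because each run-log line carries the component, thread ID, level, and usually a bracketed module tag, the log can be filtered mechanically, for example to follow only the [ELC] election messages. The sketch below assumes the line layout seen in the excerpt above. `parse_run_line` and `election_summary` are hypothetical helper names, and real log files may carry extra leading fields, such as timestamps, that the excerpt omits.

```python
import re

# |COMPONENT|TID|LEVEL>[MODULE]message ; the module tag is optional.
LINE_RE = re.compile(
    r"\|(?P<comp>DCC|DCF)\|(?P<tid>\d+)\|(?P<level>[A-Z]+)>"
    r"(?:\[(?P<module>[^\]]+)\])?\s*(?P<msg>.*)"
)


def parse_run_line(line):
    """Split one run-log line into its fields, or return None."""
    match = LINE_RE.search(line)
    return match.groupdict() if match else None


def election_summary(lines):
    """Collect the ELC milestones (timeout, votes, result) in log order."""
    events = []
    for line in lines:
        rec = parse_run_line(line)
        if rec is None or rec["module"] != "ELC":
            continue
        msg = rec["msg"]
        if "begin voting" in msg:
            events.append("timeout")
        elif msg.startswith("get vote"):
            events.append("vote " + msg.split("vote_count=")[1])
        elif "pre-voting succeeded" in msg:
            events.append("pre-vote ok")
        elif "election is successful" in msg:
            events.append("elected")
    return events


# Lines copied from the excerpt above.
sample = [
    "|DCF|26172|WARN>[ELC]heartbeat timeout, begin voting, stream_id=1, node_id=1",
    "|DCF|26172|INFO>[ELC]get vote from stream_id=1, node_id=1, term=1, vote_count=1",
    "|DCF|26184|INFO>[ELC]get vote from stream_id=1, node_id=2, term=1, vote_count=2",
    "|DCF|26184|INFO>[ELC]pre-voting succeeded, stream_id=1, node_id=1, current_term=1",
    "|DCF|26184|INFO>[ELC]election is successful, stream_id=1, node_id=1, current_term=2",
    "|DCF|25703|INFO>[MEC]Mec init succeed",
]
print(election_summary(sample))
```

Read this way, the excerpt appears to show node 1 timing out on heartbeats, winning a pre-vote at term 1 with two votes, and then winning the formal election at term 2. When a CM primary-election problem occurs, a missing "election is successful" line or repeated "begin voting" warnings are natural places to start looking.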

