Golang source code analysis: etcd (1)

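        etcd (https://github.com/etcd-io/etcd) has already reached v3 and is a core component of many middleware systems, such as k8s. In this series of articles we walk through its source code and design. Parts of the content are translated from the official documentation at https://etcd.io/docs/v3.5/install/.

        First, try installing from source: enter the source directory and build.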

    % cd etcd
    % ./scripts/build.sh
    (cd etcdctl && env GO_BUILD_FLAGS= CGO_ENABLED=0 GO_BUILD_FLAGS= GOOS=darwin GOARCH=amd64 go build -trimpath -installsuffix=cgo -ldflags=-X=go.etcd.io/etcd/api/v3/version.GitSHA=8da2a5b -o=../bin/etcdctl .)
    SUCCESS: etcd_build (GOARCH=amd64)

    After the build completes, check the version:

      % ./bin/etcd --version
      etcd Version: 3.6.0-alpha.0
      Git SHA: 8da2a5b
      Go Version: go1.19
      Go OS/Arch: darwin/amd64

      Add the bin directory to PATH:

        % export PATH="$PATH:`pwd`/bin"

        Then start the server:

          % etcd
          {"level":"warn","ts":"2023-06-07T09:17:23.681691+0800","caller":"embed/config.go:708","msg":"Running http and grpc server on single port. This is not recommended for production."}

          The smallest unit etcd operates on is the key-value pair, which we can manipulate with etcdctl:

            % etcdctl put greeting "Hello, etcd"
            OK
              % etcdctl get greeting
              greeting
              Hello, etcd

                      etcd's services fall into two main categories:

              • Key-value related. Services important for dealing with etcd's key space include:

              1. KV - Creates, updates, fetches, and deletes key-value pairs.

              2. Watch - Monitors changes to keys.

              3. Lease - Primitives for consuming client keep-alive messages.

              • Cluster related. Services which manage the cluster itself include:

              1. Auth - Role based authentication mechanism for authenticating users.

              2. Cluster - Provides membership information and configuration facilities.

              3. Maintenance - Takes recovery snapshots, defragments the store, and returns per-member status information.

              For example, the core interface for KV queries is:

                service KV {
                Range(RangeRequest) returns (RangeResponse)
                ...
                }

                All responses from the etcd API have an attached response header which includes cluster metadata for the response. It is defined as follows:

                  message ResponseHeader {
                  uint64 cluster_id = 1;
                  uint64 member_id = 2;
                  int64 revision = 3;
                  uint64 raft_term = 4;
                  }

                  The key-value pair is the smallest unit the API can manipulate. It is defined as follows:

                    message KeyValue {
                    bytes key = 1;
                    int64 create_revision = 2;
                    int64 mod_revision = 3;
                    int64 version = 4;
                    bytes value = 5;
                    int64 lease = 6;
                    }

                            Distributed locks built on etcd use the creation revision (create_revision) to decide who owns the lock. The modification revision (mod_revision) is used in MVCC scenarios to detect conflicting versions and so implement compare-and-swap (CAS) logic. etcd maintains a 64-bit cluster-wide counter, the store revision, that is incremented each time the key space is modified. The revision serves as a global logical clock, sequentially ordering all updates to the store. The change represented by a new revision is incremental; the data associated with a revision is the data that changed the store. Internally, a new revision means writing the changes to the backend's B+tree, keyed by the incremented revision.
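
To make the per-key counters concrete, here is a minimal sketch in Go of how the store revision, create_revision, mod_revision, and version could move together. The Store type and its methods are hypothetical names invented for this illustration. The toy keeps everything in a map and starts its counter at 0, and it says nothing about how etcd's MVCC backend is actually implemented.

```go
package main

import "fmt"

// KeyValue mirrors the revision-related fields of the KeyValue message.
type KeyValue struct {
	Key            string
	Value          string
	CreateRevision int64
	ModRevision    int64
	Version        int64
}

// Store is a hypothetical in-memory model, not etcd's real storage engine.
type Store struct {
	revision int64 // store-wide counter, bumped on every modification
	kvs      map[string]*KeyValue
}

func NewStore() *Store {
	return &Store{kvs: make(map[string]*KeyValue)}
}

// Put bumps the store revision and updates the per-key bookkeeping.
func (s *Store) Put(key, value string) KeyValue {
	s.revision++
	kv, ok := s.kvs[key]
	if !ok {
		// First write of this key: remember the revision that created it.
		kv = &KeyValue{Key: key, CreateRevision: s.revision}
		s.kvs[key] = kv
	}
	kv.Value = value
	kv.ModRevision = s.revision
	kv.Version++
	return *kv
}

// Delete bumps the revision and removes the key, so a later Put starts
// again at version 1 with a fresh create revision.
func (s *Store) Delete(key string) bool {
	if _, ok := s.kvs[key]; !ok {
		return false
	}
	s.revision++
	delete(s.kvs, key)
	return true
}

func main() {
	s := NewStore()
	fmt.Printf("%+v\n", s.Put("a", "1"))
	fmt.Printf("%+v\n", s.Put("b", "1"))
	fmt.Printf("%+v\n", s.Put("a", "2"))
	s.Delete("a")
	fmt.Printf("%+v\n", s.Put("a", "3"))
}
```

In this model the version counts writes to a single key, while both revision fields are snapshots of the store-wide counter. That is why a lock implementation can order contenders by create_revision.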

                            etcd's data model indexes all keys over a flat binary key space. The request and response for range queries are defined as follows:

                      message RangeRequest {
                      enum SortOrder {
                      NONE = 0; // default, no sorting
                      ASCEND = 1; // lowest target value first
                      DESCEND = 2; // highest target value first
                      }
                      enum SortTarget {
                      KEY = 0;
                      VERSION = 1;
                      CREATE = 2;
                      MOD = 3;
                      VALUE = 4;
                      }
                      bytes key = 1;
                      bytes range_end = 2;
                      int64 limit = 3;
                      int64 revision = 4;
                      SortOrder sort_order = 5;
                      SortTarget sort_target = 6;
                      bool serializable = 7;
                      bool keys_only = 8;
                      bool count_only = 9;
                      int64 min_mod_revision = 10;
                      int64 max_mod_revision = 11;
                      int64 min_create_revision = 12;
                      int64 max_create_revision = 13;
                      }
                        message RangeResponse {
                        ResponseHeader header = 1;
                        repeated mvccpb.KeyValue kvs = 2;
                        bool more = 3;
                        int64 count = 4;
                        }
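
The key, range_end, limit, more, and count fields are easier to follow with a small model. The sketch below, in Go, selects keys from a sorted list under the usual reading of these fields: an empty range_end means a single key, "\0" means every key greater than or equal to key, and anything else is the half-open interval [key, range_end). The functions rangeKeys and prefixEnd are hypothetical helpers written for this article, not etcd or client library APIs, and reporting more as "count exceeds what was returned" is an assumption of the model.

```go
package main

import (
	"fmt"
	"sort"
)

// rangeKeys models key selection for a Range call. limit == 0 means no limit.
// It returns the selected keys, whether more keys exist in the range than
// were returned, and the total number of keys in the range.
func rangeKeys(all []string, key, rangeEnd string, limit int) (keys []string, more bool, count int) {
	sorted := append([]string(nil), all...)
	sort.Strings(sorted) // byte-wise order, like a flat binary key space
	for _, k := range sorted {
		var inRange bool
		switch rangeEnd {
		case "":
			inRange = k == key // single-key lookup
		case "\x00":
			inRange = k >= key // everything from key onward
		default:
			inRange = k >= key && k < rangeEnd // half-open interval
		}
		if !inRange {
			continue
		}
		count++
		if limit == 0 || len(keys) < limit {
			keys = append(keys, k)
		}
	}
	return keys, count > len(keys), count
}

// prefixEnd computes a range_end that covers every key starting with prefix:
// increment the last byte that is not 0xff and drop whatever follows it.
func prefixEnd(prefix string) string {
	end := []byte(prefix)
	for i := len(end) - 1; i >= 0; i-- {
		if end[i] < 0xff {
			end[i]++
			return string(end[:i+1])
		}
	}
	return "\x00" // prefix is empty or all 0xff: no upper bound
}

func main() {
	all := []string{"foo/b", "bar", "foo/a", "fop", "foo/c"}
	keys, more, count := rangeKeys(all, "foo/", prefixEnd("foo/"), 2)
	fmt.Println(keys, more, count) // prints: [foo/a foo/b] true 3
}
```

Because the key space is flat, a "directory listing" is just a range query whose range_end is derived from the prefix.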

                                The put request is defined in a similar way, and so is the delete request:

                          message PutRequest {
                          bytes key = 1;
                          bytes value = 2;
                          int64 lease = 3;
                          bool prev_kv = 4;
                          bool ignore_value = 5;
                          bool ignore_lease = 6;
                          }
                            message PutResponse {
                            ResponseHeader header = 1;
                            mvccpb.KeyValue prev_kv = 2;
                            }

                                    etcd models a transaction as an atomic If/Then/Else construct over the key-value store. Transactions can be used for protecting keys from unintended concurrent updates, building compare-and-swap operations, and developing higher-level concurrency control. All comparisons are applied atomically; if all comparisons are true, the transaction is said to succeed and etcd applies the transaction's then / success request block, otherwise it is said to fail and applies the else / failure request block.

                                    The model above corresponds to three parts (If, Then, Else), which are expressed with the following messages:

                              message Compare {
                              enum CompareResult {
                              EQUAL = 0;
                              GREATER = 1;
                              LESS = 2;
                              NOT_EQUAL = 3;
                              }
                              enum CompareTarget {
                              VERSION = 0;
                              CREATE = 1;
                              MOD = 2;
                              VALUE = 3;
                              }
                              CompareResult result = 1;
                              // target is the key-value field to inspect for the comparison.
                              CompareTarget target = 2;
                              // key is the subject key for the comparison operation.
                              bytes key = 3;
                              oneof target_union {
                              int64 version = 4;
                              int64 create_revision = 5;
                              int64 mod_revision = 6;
                              bytes value = 7;
                              }
                              }
                                message RequestOp {
                                // request is a union of request types accepted by a transaction.
                                oneof request {
                                RangeRequest request_range = 1;
                                PutRequest request_put = 2;
                                DeleteRangeRequest request_delete_range = 3;
                                }
                                }

                                All together, a transaction is issued with a Txn API call, which takes a TxnRequest:

                                  message TxnRequest {
                                  repeated Compare compare = 1;
                                  repeated RequestOp success = 2;
                                  repeated RequestOp failure = 3;
                                  }

                                          The result of a transaction is defined as follows:

                                    message TxnResponse {
                                    ResponseHeader header = 1;
                                    bool succeeded = 2;
                                    repeated ResponseOp responses = 3;
                                    }
                                      message ResponseOp {
                                      oneof response {
                                      RangeResponse response_range = 1;
                                      PutResponse response_put = 2;
                                      DeleteRangeResponse response_delete_range = 3;
                                      }
                                      }
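
The If/Then/Else shape can be sketched with a small in-memory model in Go. Everything here is hypothetical and simplified for illustration: the only comparison supported is "mod_revision EQUAL", the only operation is a put, a missing key is treated as having mod revision 0, and a mutex stands in for whatever makes the real transaction atomic.

```go
package main

import (
	"fmt"
	"sync"
)

type entry struct {
	value       string
	modRevision int64
}

// store is a toy model, not etcd's transaction engine.
type store struct {
	mu       sync.Mutex
	revision int64
	data     map[string]entry
}

// compare models a Compare with target MOD and result EQUAL.
type compare struct {
	key         string
	modRevision int64
}

// op models a RequestOp restricted to puts.
type op struct {
	key, value string
}

func newStore() *store { return &store{data: make(map[string]entry)} }

// txn evaluates every comparison, then applies either the success block or
// the failure block. All writes of one transaction share one new revision.
func (s *store) txn(cmps []compare, success, failure []op) bool {
	s.mu.Lock()
	defer s.mu.Unlock()

	succeeded := true
	for _, c := range cmps {
		if s.data[c.key].modRevision != c.modRevision {
			succeeded = false
			break
		}
	}
	ops := failure
	if succeeded {
		ops = success
	}
	if len(ops) > 0 {
		s.revision++
		for _, o := range ops {
			s.data[o.key] = entry{value: o.value, modRevision: s.revision}
		}
	}
	return succeeded
}

func main() {
	s := newStore()
	// Both clients try to create "lock" only if it does not exist yet.
	fmt.Println(s.txn([]compare{{"lock", 0}}, []op{{"lock", "client-a"}}, nil))
	fmt.Println(s.txn([]compare{{"lock", 0}}, []op{{"lock", "client-b"}}, nil))
	fmt.Println(s.data["lock"].value)
}
```

The second call fails its comparison because the first one already changed the key, which is the compare-and-swap behaviour the text describes.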






                                      Changes to keys are reported through the Watch API as Event messages:

                                      message Event {
                                      enum EventType {
                                      PUT = 0;
                                      DELETE = 1;
                                      }
                                      EventType type = 1;
                                      KeyValue kv = 2;
                                      KeyValue prev_kv = 3;
                                      }

                                      Watches are long-running requests and use gRPC streams to stream event data. A single watch stream can multiplex many distinct watches by tagging events with per-watch identifiers.

                                              Watches make three guarantees about events: they are ordered, reliable, and atomic.

                                      1. Ordered - events are ordered by revision; an event will never appear on a watch if it precedes an event in time that has already been posted.

                                      2. Reliable - a sequence of events will never drop any subsequence of events; if there are events ordered in time as a < b < c, then if the watch receives events a and c, it is guaranteed to receive b.

                                      3. Atomic - a list of events is guaranteed to encompass complete revisions; updates in the same revision over multiple keys will not be split over several lists of events.

                                        message WatchCreateRequest {
                                        bytes key = 1;
                                        bytes range_end = 2;
                                        int64 start_revision = 3;
                                        bool progress_notify = 4;
                                        enum FilterType {
                                        NOPUT = 0;
                                        NODELETE = 1;
                                        }
                                        repeated FilterType filters = 5;
                                        bool prev_kv = 6;
                                        }
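
The three guarantees and the fields of WatchCreateRequest can be illustrated by replaying a recorded history. The Go sketch below is a toy under stated assumptions: the history slice is already sorted by revision, watchFrom is a hypothetical helper rather than anything in etcd, and each returned batch stands for one list of events delivered to the watcher.

```go
package main

import "fmt"

type EventType int

const (
	PUT EventType = iota
	DELETE
)

type Event struct {
	Type     EventType
	Key      string
	Revision int64 // revision at which the change happened
}

// inWatchRange applies the key/range_end convention: an empty rangeEnd
// watches a single key, otherwise the half-open interval [key, rangeEnd).
func inWatchRange(k, key, rangeEnd string) bool {
	if rangeEnd == "" {
		return k == key
	}
	return k >= key && k < rangeEnd
}

// watchFrom replays history for one watcher. Events come out in revision
// order, none inside the range is skipped unless filtered, and events that
// share a revision always land in the same batch.
func watchFrom(history []Event, key, rangeEnd string, startRevision int64, noPut, noDelete bool) [][]Event {
	var batches [][]Event
	for _, ev := range history {
		if ev.Revision < startRevision || !inWatchRange(ev.Key, key, rangeEnd) {
			continue
		}
		if (ev.Type == PUT && noPut) || (ev.Type == DELETE && noDelete) {
			continue // NOPUT / NODELETE filters
		}
		n := len(batches)
		if n > 0 && batches[n-1][0].Revision == ev.Revision {
			batches[n-1] = append(batches[n-1], ev) // same revision, same batch
		} else {
			batches = append(batches, []Event{ev})
		}
	}
	return batches
}

func main() {
	history := []Event{
		{PUT, "a", 2},
		{PUT, "a", 3}, {PUT, "b", 3}, // one transaction touching two keys
		{DELETE, "a", 4},
		{PUT, "c", 5},
	}
	for _, batch := range watchFrom(history, "a", "c", 3, false, false) {
		fmt.Println(batch)
	}
}
```

Starting at revision 3 over the range [a, c), the watcher sees the two puts of revision 3 together and then the delete at revision 4. The put to "c" lies outside the range.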

                                                Leases are a mechanism for detecting client liveness: when the cluster stops receiving heartbeats, it considers the client dead. The cluster grants leases with a time-to-live. A lease expires if the etcd cluster does not receive a keepAlive within a given TTL period.

                                          message LeaseGrantRequest {
                                          int64 TTL = 1;
                                          int64 ID = 2;
                                          }
                                            message LeaseRevokeRequest {
                                            int64 ID = 1;
                                            }

                                            Leases are refreshed using a bi-directional stream created with the LeaseKeepAlive API call.
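
The grant, keepAlive, and expiry cycle can be sketched with a logical clock in Go. The lessor type below is a hypothetical model for illustration only: time is an integer number of seconds passed in by the caller, there is no streaming, and nothing here reflects how etcd tracks leases internally.

```go
package main

import "fmt"

type lease struct {
	ttl      int64 // seconds, as in LeaseGrantRequest.TTL
	deadline int64 // logical time at which the lease expires
}

// lessor is a toy lease table driven by a caller-supplied clock.
type lessor struct {
	leases map[int64]*lease
}

func newLessor() *lessor { return &lessor{leases: make(map[int64]*lease)} }

// grant creates a lease that expires ttl seconds after now.
func (l *lessor) grant(id, ttl, now int64) {
	l.leases[id] = &lease{ttl: ttl, deadline: now + ttl}
}

// alive reports whether the lease exists and has not reached its deadline.
func (l *lessor) alive(id, now int64) bool {
	ls, ok := l.leases[id]
	return ok && now < ls.deadline
}

// keepAlive pushes the deadline out by one TTL. A heartbeat that arrives
// after the deadline is too late: the lease is dropped and false is returned.
func (l *lessor) keepAlive(id, now int64) bool {
	if !l.alive(id, now) {
		delete(l.leases, id)
		return false
	}
	ls := l.leases[id]
	ls.deadline = now + ls.ttl
	return true
}

// revoke removes the lease immediately, like a LeaseRevokeRequest.
func (l *lessor) revoke(id int64) { delete(l.leases, id) }

func main() {
	l := newLessor()
	l.grant(1, 10, 0)              // TTL 10s, granted at t=0
	fmt.Println(l.keepAlive(1, 9)) // heartbeat in time, deadline moves to t=19
	fmt.Println(l.alive(1, 15))
	fmt.Println(l.alive(1, 19))     // deadline reached
	fmt.Println(l.keepAlive(1, 20)) // too late, lease is gone
}
```

Passing the clock in as a parameter keeps the example deterministic. A real client would instead send heartbeats over the LeaseKeepAlive stream at some fraction of the TTL.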

                                                    

                                            Reposted from golang算法架构leetcode技术php.
