
Common Operations on etcd Cluster Nodes

乔克的好奇心 2021-09-27


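When deploying an etcd cluster, use an odd number of etcd instances. A cluster of N instances keeps serving requests as long as a majority of them is up, so it can tolerate up to (N-1)/2 failed instances. If more than (N-1)/2 instances fail, the cluster loses quorum and has to be recovered from backed-up etcd data.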
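The arithmetic behind this rule can be checked with a short shell sketch. The function names `quorum` and `fault_tolerance` are illustrative helpers, not part of etcd.

```shell
#!/bin/sh
# Majority (quorum) needed by an N-member cluster: floor(N/2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Members that may fail while quorum is kept: floor((N-1)/2).
fault_tolerance() {
    echo $(( ($1 - 1) / 2 ))
}

# Compare a few cluster sizes.
for n in 3 4 5; do
    echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

A 4-member cluster tolerates one failure, the same as a 3-member cluster, which is why an even member count adds cost without adding fault tolerance.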


Backup

Backup policy:

  • Back up etcd every two hours with the command below

  • Keep backup data for 2 days

    ETCDCTL_API=3 /opt/etcd/bin/etcdctl snapshot save /root/backup/etcd_$(date "+%Y%m%d%H%M%S").db
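The policy above (a snapshot every two hours, kept for 2 days) could be automated roughly as follows. This is a minimal sketch: the script path in the crontab comment, the function names, and the `run` argument are hypothetical, and `find -maxdepth`/`-mmin` assume GNU or BSD find. If your cluster needs endpoint or TLS flags for `snapshot save`, add them to the command.

```shell
#!/bin/sh
BACKUP_DIR="${BACKUP_DIR:-/root/backup}"
ETCDCTL="${ETCDCTL:-/opt/etcd/bin/etcdctl}"
KEEP_MINUTES=2880   # 2 days

# Take one snapshot, named with the current timestamp as in the article.
take_snapshot() {
    mkdir -p "$BACKUP_DIR"
    ETCDCTL_API=3 "$ETCDCTL" snapshot save "$BACKUP_DIR/etcd_$(date "+%Y%m%d%H%M%S").db"
}

# Delete snapshots older than $2 minutes from directory $1.
prune_snapshots() {
    find "$1" -maxdepth 1 -type f -name 'etcd_*.db' -mmin "+$2" -exec rm -f {} +
}

# Hypothetical crontab entry (every two hours):
#   0 */2 * * * /root/backup/etcd_backup.sh run
if [ "${1:-}" = "run" ]; then
    # Prune only after a successful snapshot, so a failing backup
    # never removes the last good copies.
    take_snapshot && prune_snapshots "$BACKUP_DIR" "$KEEP_MINUTES"
fi
```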


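Restore

(1) Stop kube-apiserver so that no more data is written

    systemctl stop kube-apiserver

(2) Stop etcd on all nodes

    systemctl stop etcd

(3) Back up the existing data directory on each etcd node (the actual directory can be found in the etcd configuration file)

    mv /var/lib/etcd/default.etcd{,.bak}

(4) Copy the backup file to all etcd nodes

    scp etcd_20200106152240.db 10.1.10.129:/root/backup/
    scp etcd_20200106152240.db 10.1.10.130:/root/backup/

(5) Run the restore command on each node. Only --name and --initial-advertise-peer-urls differ between nodes.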

    # On etcd-1
    ETCDCTL_API=3 /opt/etcd/bin/etcdctl snapshot \
      --endpoints="https://10.1.10.128:2379,https://10.1.10.129:2379,https://10.1.10.130:2379" \
      --cacert=/opt/etcd/ssl/etcd-ca.pem --cert=/opt/etcd/ssl/etcd-server.pem --key=/opt/etcd/ssl/etcd-server-key.pem \
      restore ~/backup/etcd_20200106152240.db \
      --name=etcd-1 --data-dir=/var/lib/etcd/default.etcd \
      --initial-cluster="etcd-1=https://10.1.10.128:2380,etcd-2=https://10.1.10.129:2380,etcd-3=https://10.1.10.130:2380" \
      --initial-cluster-token="etcd-cluster" \
      --initial-advertise-peer-urls=https://10.1.10.128:2380

    # On etcd-2
    ETCDCTL_API=3 /opt/etcd/bin/etcdctl snapshot \
      --endpoints="https://10.1.10.128:2379,https://10.1.10.129:2379,https://10.1.10.130:2379" \
      --cacert=/opt/etcd/ssl/etcd-ca.pem --cert=/opt/etcd/ssl/etcd-server.pem --key=/opt/etcd/ssl/etcd-server-key.pem \
      restore ~/backup/etcd_20200106152240.db \
      --name=etcd-2 --data-dir=/var/lib/etcd/default.etcd \
      --initial-cluster="etcd-1=https://10.1.10.128:2380,etcd-2=https://10.1.10.129:2380,etcd-3=https://10.1.10.130:2380" \
      --initial-cluster-token="etcd-cluster" \
      --initial-advertise-peer-urls=https://10.1.10.129:2380

    # On etcd-3
    ETCDCTL_API=3 /opt/etcd/bin/etcdctl snapshot \
      --endpoints="https://10.1.10.128:2379,https://10.1.10.129:2379,https://10.1.10.130:2379" \
      --cacert=/opt/etcd/ssl/etcd-ca.pem --cert=/opt/etcd/ssl/etcd-server.pem --key=/opt/etcd/ssl/etcd-server-key.pem \
      restore ~/backup/etcd_20200106152240.db \
      --name=etcd-3 --data-dir=/var/lib/etcd/default.etcd \
      --initial-cluster="etcd-1=https://10.1.10.128:2380,etcd-2=https://10.1.10.129:2380,etcd-3=https://10.1.10.130:2380" \
      --initial-cluster-token="etcd-cluster" \
      --initial-advertise-peer-urls=https://10.1.10.130:2380
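Because the three restore commands differ only in the member name and peer URL, they can be generated from one template. The sketch below only prints the commands and does not run them. The helper name `restore_cmd` and the `SNAPSHOT` variable are illustrative. It also leaves out the --endpoints and TLS flags, on the assumption that `snapshot restore` reads only the local snapshot file; put them back if your etcdctl version behaves differently.

```shell
#!/bin/sh
SNAPSHOT="${SNAPSHOT:-/root/backup/etcd_20200106152240.db}"
INITIAL_CLUSTER="etcd-1=https://10.1.10.128:2380,etcd-2=https://10.1.10.129:2380,etcd-3=https://10.1.10.130:2380"

# Print the restore command for one member.
# $1 = member name, $2 = member IP address.
restore_cmd() {
    printf 'ETCDCTL_API=3 /opt/etcd/bin/etcdctl snapshot restore %s' "$SNAPSHOT"
    printf ' --name=%s --data-dir=/var/lib/etcd/default.etcd' "$1"
    printf ' --initial-cluster="%s"' "$INITIAL_CLUSTER"
    printf ' --initial-cluster-token="etcd-cluster"'
    printf ' --initial-advertise-peer-urls=https://%s:2380\n' "$2"
}

# One command per member; run each on the matching node.
restore_cmd etcd-1 10.1.10.128
restore_cmd etcd-2 10.1.10.129
restore_cmd etcd-3 10.1.10.130
```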

(6) Start the etcd service

    systemctl start etcd

(7) Start kube-apiserver

    systemctl start kube-apiserver

(8) Check the cluster status

    # /opt/etcd/bin/etcdctl \
    > --ca-file=/opt/etcd/ssl/etcd-ca.pem --cert-file=/opt/etcd/ssl/etcd-server.pem --key-file=/opt/etcd/ssl/etcd-server-key.pem \
    > --endpoints="https://10.1.10.128:2379,https://10.1.10.129:2379,https://10.1.10.130:2379" \
    > cluster-health
    member a2dba8836695bcf6 is healthy: got healthy result from https://10.1.10.129:2379
    member d1272b0b3cb41282 is healthy: got healthy result from https://10.1.10.128:2379
    member e4a3a9c93ef84f2d is healthy: got healthy result from https://10.1.10.130:2379
    cluster is healthy

    # kubectl get node
    NAME         STATUS   ROLES    AGE    VERSION
    master-k8s   Ready    <none>   4h9m   v1.16.4
    node01-k8s   Ready    <none>   40h    v1.16.4
    node02-k8s   Ready    <none>   39h    v1.16.4

    # kubectl get pod -n kube-system
    NAME                                          READY   STATUS    RESTARTS   AGE
    coredns-9d5b6bdb6-mpwht                       1/1     Running   0          38h
    kube-flannel-ds-amd64-2qkcb                   1/1     Running   0          38h
    kube-flannel-ds-amd64-7nzj5                   1/1     Running   0          38h
    kube-flannel-ds-amd64-hlfdf                   1/1     Running   0          4h9m
    metrics-server-v0.3.6-6c57d48cb4-tzjc7        2/2     Running   0          3h53m
    traefik-ingress-controller-7758594f89-lwf2t   1/1     Running   0          16h
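When this check is scripted, the `cluster-health` output can be reduced to an exit status. The sketch below assumes the output format shown above ("member ... is healthy", and a final "cluster is healthy" line). The function name `cluster_is_healthy` is illustrative, and the "unhealthy" and "unreachable" patterns are an assumption about how failing members are reported.

```shell
#!/bin/sh
# Read `etcdctl cluster-health` output on stdin.
# Exit 0 only if the "cluster is healthy" verdict is present and
# no member line reports "is unhealthy" or "is unreachable".
cluster_is_healthy() {
    awk '
        /is unhealthy|is unreachable/ { bad = 1 }
        /cluster is healthy/          { ok = 1 }
        END { if (ok && !bad) exit 0; exit 1 }
    '
}
```

A possible use is to pipe the `cluster-health` command from this step into `cluster_is_healthy` and continue with the next maintenance step only when it succeeds.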


Removing a member

To remove a node from the cluster, follow the steps below.

(1) List the members of the current cluster

    # /opt/etcd/bin/etcdctl --ca-file=/opt/etcd/ssl/etcd-ca.pem --cert-file=/opt/etcd/ssl/etcd-server.pem --key-file=/opt/etcd/ssl/etcd-server-key.pem member list
    a2dba8836695bcf6: name=etcd-2 peerURLs=https://10.1.10.129:2380 clientURLs=https://10.1.10.129:2379 isLeader=false
    d1272b0b3cb41282: name=etcd-1 peerURLs=https://10.1.10.128:2380 clientURLs=https://10.1.10.128:2379 isLeader=true
    e4a3a9c93ef84f2d: name=etcd-3 peerURLs=https://10.1.10.130:2380 clientURLs=https://10.1.10.130:2379 isLeader=false

(2) Remove the instance using its member ID from the list

    # /opt/etcd/bin/etcdctl --ca-file=/opt/etcd/ssl/etcd-ca.pem --cert-file=/opt/etcd/ssl/etcd-server.pem --key-file=/opt/etcd/ssl/etcd-server-key.pem member remove e4a3a9c93ef84f2d
    Removed member e4a3a9c93ef84f2d from cluster
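Looking up the member ID by hand is error-prone, so the lookup can be scripted. This sketch parses the `member list` output format shown above. The function name `member_id_by_name` is illustrative.

```shell
#!/bin/sh
# Read `etcdctl member list` output on stdin and print the ID of the
# member whose name is $1. Prints nothing if the name is not found.
member_id_by_name() {
    awk -v name="$1" '
        {
            for (i = 2; i <= NF; i++) {
                if ($i == "name=" name) {
                    sub(/:$/, "", $1)   # strip the trailing colon from the ID
                    print $1
                    exit
                }
            }
        }
    '
}
```

The printed ID can then be passed to `member remove`. Check that it is non-empty first, so that a mistyped name does not produce a malformed command.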

(3) Stop etcd on the removed node

    systemctl stop etcd

(4) Edit the configuration file on the remaining etcd nodes and drop the removed member from ETCD_INITIAL_CLUSTER

    ...
    ETCD_INITIAL_CLUSTER="etcd-1=https://10.1.10.128:2380,etcd-2=https://10.1.10.129:2380"
    ...
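The new ETCD_INITIAL_CLUSTER value can be derived from the old one instead of being retyped. This is a minimal sketch; the function name `remove_from_initial_cluster` is illustrative, and it only prints the new value without touching any configuration file.

```shell
#!/bin/sh
# Print the cluster string $1 without the entry for member name $2.
# Entries have the form name=peer-url and are separated by commas.
remove_from_initial_cluster() {
    printf '%s\n' "$1" | awk -F, -v name="$2" '
        {
            out = ""
            for (i = 1; i <= NF; i++) {
                split($i, kv, "=")          # kv[1] is the member name
                if (kv[1] != name)
                    out = out (out == "" ? "" : ",") $i
            }
            print out
        }
    '
}

OLD="etcd-1=https://10.1.10.128:2380,etcd-2=https://10.1.10.129:2380,etcd-3=https://10.1.10.130:2380"
echo "ETCD_INITIAL_CLUSTER=\"$(remove_from_initial_cluster "$OLD" etcd-3)\""
```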

(5) Restart etcd on the remaining nodes

    systemctl restart etcd

(6) Check the cluster status

    # /opt/etcd/bin/etcdctl --ca-file=/opt/etcd/ssl/etcd-ca.pem --cert-file=/opt/etcd/ssl/etcd-server.pem --key-file=/opt/etcd/ssl/etcd-server-key.pem member list
    a2dba8836695bcf6: name=etcd-2 peerURLs=https://10.1.10.129:2380 clientURLs=https://10.1.10.129:2379 isLeader=false
    d1272b0b3cb41282: name=etcd-1 peerURLs=https://10.1.10.128:2380 clientURLs=https://10.1.10.128:2379 isLeader=true

    # /opt/etcd/bin/etcdctl --ca-file=/opt/etcd/ssl/etcd-ca.pem --cert-file=/opt/etcd/ssl/etcd-server.pem --key-file=/opt/etcd/ssl/etcd-server-key.pem --endpoints="https://10.1.10.128:2379,https://10.1.10.129:2379" cluster-health
    member a2dba8836695bcf6 is healthy: got healthy result from https://10.1.10.129:2379
    member d1272b0b3cb41282 is healthy: got healthy result from https://10.1.10.128:2379
    cluster is healthy


Adding a member

To add a new etcd node, follow the steps below.

(1) Register the new member with the following command

    # /opt/etcd/bin/etcdctl --ca-file=/opt/etcd/ssl/etcd-ca.pem --cert-file=/opt/etcd/ssl/etcd-server.pem --key-file=/opt/etcd/ssl/etcd-server-key.pem member add etcd-3 https://10.1.10.130:2380
    Added member named etcd-3 with ID 16ebaad9c9a3a3e3 to cluster

    ETCD_NAME="etcd-3"
    ETCD_INITIAL_CLUSTER="etcd-3=https://10.1.10.130:2380,etcd-2=https://10.1.10.129:2380,etcd-1=https://10.1.10.128:2380"
    ETCD_INITIAL_CLUSTER_STATE="existing"

Note: member add takes two arguments, etcd_name and etcd_node_address:

  • etcd_name: the value of ETCD_NAME in the etcd.conf configuration file

  • etcd_node_address: the value of ETCD_LISTEN_PEER_URLS in the etcd.conf configuration file

(2) Delete the old data directory on the new member and start its etcd service. To join the cluster, the new member's configuration file must have the initial cluster state changed from new to existing, as follows

    #[Member]
    ETCD_NAME="etcd-3"
    ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
    ETCD_LISTEN_PEER_URLS="https://10.1.10.130:2380"
    ETCD_LISTEN_CLIENT_URLS="https://10.1.10.130:2379"

    #[Clustering]
    ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.1.10.130:2380"
    ETCD_ADVERTISE_CLIENT_URLS="https://10.1.10.130:2379"
    ETCD_INITIAL_CLUSTER="etcd-1=https://10.1.10.128:2380,etcd-2=https://10.1.10.129:2380,etcd-3=https://10.1.10.130:2380"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
    ETCD_INITIAL_CLUSTER_STATE="existing"
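Switching the state from new to existing can be done with a single in-place edit. This is a minimal sketch: the function name `set_initial_cluster_state` is illustrative, the configuration file path is passed by the caller because the article does not name it, and `sed -i.bak` leaves a backup copy next to the edited file.

```shell
#!/bin/sh
# Set ETCD_INITIAL_CLUSTER_STATE in config file $1 to $2.
# Only "new" and "existing" are accepted.
set_initial_cluster_state() {
    case "$2" in
        new|existing) ;;
        *) echo "state must be 'new' or 'existing'" >&2; return 1 ;;
    esac
    # Rewrite the whole line; the original file is kept as "$1.bak".
    sed -i.bak "s|^ETCD_INITIAL_CLUSTER_STATE=.*|ETCD_INITIAL_CLUSTER_STATE=\"$2\"|" "$1"
}
```

For example, `set_initial_cluster_state /path/to/etcd.conf existing`, where the path is a placeholder for wherever your etcd.conf lives.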

(3) The corresponding parameter in the systemd unit file must be changed as well:

    ......
    --initial-cluster-state=existing \
    ......

(4) Update the configuration file on the existing etcd nodes

    ETCD_INITIAL_CLUSTER="etcd-1=https://10.1.10.128:2380,etcd-2=https://10.1.10.129:2380,etcd-3=https://10.1.10.130:2380"

(5) Restart etcd

    systemctl restart etcd

(6) Check the status

    # /opt/etcd/bin/etcdctl --ca-file=/opt/etcd/ssl/etcd-ca.pem --cert-file=/opt/etcd/ssl/etcd-server.pem --key-file=/opt/etcd/ssl/etcd-server-key.pem member list
    7d38a6cc82b63e33: name=etcd-3 peerURLs=https://10.1.10.130:2380 clientURLs=https://10.1.10.130:2379 isLeader=false
    a2dba8836695bcf6: name=etcd-2 peerURLs=https://10.1.10.129:2380 clientURLs=https://10.1.10.129:2379 isLeader=true
    d1272b0b3cb41282: name=etcd-1 peerURLs=https://10.1.10.128:2380 clientURLs=https://10.1.10.128:2379 isLeader=false

    # /opt/etcd/bin/etcdctl \
    > --ca-file=/opt/etcd/ssl/etcd-ca.pem --cert-file=/opt/etcd/ssl/etcd-server.pem --key-file=/opt/etcd/ssl/etcd-server-key.pem \
    > --endpoints="https://10.1.10.128:2379,https://10.1.10.129:2379,https://10.1.10.130:2379" \
    > cluster-health
    member 7d38a6cc82b63e33 is healthy: got healthy result from https://10.1.10.130:2379
    member a2dba8836695bcf6 is healthy: got healthy result from https://10.1.10.129:2379
    member d1272b0b3cb41282 is healthy: got healthy result from https://10.1.10.128:2379


