pg 架构之Patroni
元数据记录etcd,真正的DCS系统,避免脑裂
使用 etcd+ Patroni+haproxy+keepalived 搭建PG集群
环境准备
192.168.0.211 node01
192.168.0.212 node02
192.168.0.213 node03
PG主从
略
etcd安装配置
在node01,node02,node03上执行
yum install -y gcc python-devel epel-release
yum clean all
yum makecache
yum -y install etcd
etcd配置
修改对应的ETCD_NAME, 对应的IP
cp /etc/etcd/etcd.conf /etc/etcd/etcd.conf.bak202404
cat > /etc/etcd/etcd.conf<<here
ETCD_NAME="node01"
ETCD_DATA_DIR="/var/lib/etcd/node01.etcd"
ETCD_LISTEN_CLIENT_URLS="http://127.0.0.1:2379,http://192.168.0.211:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://127.0.0.1:2379,http://192.168.0.211:2379"
ETCD_LISTEN_PEER_URLS="http://192.168.0.211:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.0.211:2380"
ETCD_INITIAL_CLUSTER="node01=http://192.168.0.211:2380,node02=http://192.168.0.212:2380,node03=http://192.168.0.213:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
here
etcd自动启动
systemctl enable etcd
systemctl start etcd
查看etcd状态
etcdctl member list
Partoni安装配置
python安装
#安装python3.8
yum clean all
yum makecache
yum install -y libffi-devel libuuid libuuid-devel
wget https://www.python.org/ftp/python/3.8.19/Python-3.8.19.tgz
tar -xzvf Python-3.8.19.tgz
cd Python-3.8.19
./configure
make -j 7
make -j 7 install
rm -rf /usr/bin/python
ln -s /usr/local/bin/python3.8 /usr/bin/python
rm -rf /usr/bin/python
ln -s /usr/local/bin/python3 /usr/bin/pyhon
#修正yum指向python2
vi /usr/bin/yum
修改#!/usr/bin/python 为#!/usr/bin/python2
vi /usr/libexec/urlgrabber-ext-down
修改#!/usr/bin/python 为#!/usr/bin/python2
Partoni安装
yum install -y libffi-devel
yum install -y gcc epel-release
yum install -y python-pip python-psycopg2 python-devel
pip3 install --upgrade pip
pip3 install --upgrade setuptools
pip3 install psycopg2-binary
pip3 install patroni[etcd]
pip3 install urllib3==1.26.15
pip3 list
Partoni配置
在node01,node02,node03上执行,替换为对应的ip,name
mkdir /etc/patroni
#用户密码延用repmgr
cat > /etc/patroni/patroni.yml <<here
scope: etcd_patroni_pg
namespace: /pgcluster1/
name: node01
restapi:
listen: 192.168.0.211:8008
connect_address: 192.168.0.211:8008
etcd:
host: 192.168.0.211:2379
bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
master_start_timeout: 300
synchronous_mode: false
postgresql:
use_pg_rewind: true
use_slots: true
parameters:
listen_addresses: "0.0.0.0"
port: 5432
wal_level: logical
hot_standby: "on"
wal_keep_segments: 1000
max_wal_senders: 10
max_replication_slots: 10
wal_log_hints: "on"
# archive_mode: "on"
# archive_timeout: 1800s
# archive_command: gzip < %p > /data/backup/pgwalarchive/%f.gz
# recovery_conf:
# restore_command: gunzip < /data/backup/pgwalarchive/%f.gz > %p
postgresql:
listen: 0.0.0.0:5432
connect_address: 192.168.0.211:5432
data_dir: /data/pgdata
bin_dir: /usr/local/pgsql-15.6/bin
authentication:
replication:
username: repmgr
password: repmgr
superuser:
username: postgres
password: postgres
tags:
nofailover: false
noloadbalance: false
clonefrom: false
nosync: false
here
chown -R postgres.postgres -R /etc/patroni/
patroni.service服务设置
vi /etc/systemd/system/patroni.service
[Unit]
Description=patroni - a high-availability PostgreSQL
Documentation=https://patroni.readthedocs.io/en/latest/index.html
After=syslog.target network.target etcd.target
Wants=network-online.target
[Service]
Type=simple
User=postgres
Group=postgres
PermissionsStartOnly=true
ExecStart=/usr/local/bin/patroni /etc/patroni/patroni.yml
ExecReload=/bin/kill -HUP $MAINPID
LimitNOFILE=65536
KillMode=process
KillSignal=SIGINT
Restart=on-abnormal
RestartSec=30s
TimeoutSec=0
[Install]
WantedBy=multi-user.target
开机自动启动
systemctl enable patroni.service
systemctl start patroni.service
检查集群状态
查看集群
patronictl -c /etc/patroni/patroni.yml list

查看主从
select client_addr,sync_state from pg_stat_replication;
client_addr | sync_state
---------------+------------
192.168.0.213 | async
192.168.0.212 | async
SELECT * FROM pg_stat_replication;
集群切换
手工在线切换
patronictl -c /etc/patroni/patroni.yml switchover




自动启动原主库,跟随时间线启动为从库


手动停库
pg_ctl stop -D /data/pgdata
patroni自动拉起PG,并保持leader角色(crashed -> running)
Apr 22 21:21:08 node01 patroni: 2024-04-22 21:21:08,699 INFO: no action. I am (node01), the leader with the lock
Apr 22 21:21:12 node01 patroni: 2024-04-22 21:21:12,653 WARNING: Postgresql is not running.
……
Apr 22 21:21:12 node01 patroni: 2024-04-22 21:21:12,791 INFO: starting primary after failure
Apr 22 21:21:14 node01 patroni: 2024-04-22 21:21:14,049 INFO: postmaster pid=1752
Apr 22 21:21:14 node01 patroni: 2024-04-22 21:21:14.164 CST [1752] LOG: redirecting log output to logging collector process
Apr 22 21:21:14 node01 patroni: 2024-04-22 21:21:14.164 CST [1752] HINT: Future log output will appear in directory "log".
Apr 22 21:21:15 node01 patroni: localhost:5432 - accepting connections
Apr 22 21:21:17 node01 patroni: localhost:5432 - accepting connections
Apr 22 21:21:17 node01 patroni: 2024-04-22 21:21:17,673 INFO: Lock owner: node01; I am node01
Apr 22 21:21:17 node01 patroni: 2024-04-22 21:21:17,673 INFO: establishing a new patroni heartbeat connection to postgres
Apr 22 21:21:19 node01 patroni: 2024-04-22 21:21:19,760 INFO: no action. I am (node01), the leader with the lock
crashed -> running




reboot操作
reboot 重启机器,自动切主
#reboot master


主机重启后,集群发现replica stopped-> 切换时间线TL 选新主


tail -f /var/log/mesasges
Apr 22 21:26:49 node02 patroni: conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
Apr 22 21:26:49 node02 patroni: psycopg2.OperationalError: connection to server at "192.168.0.211", port 5432 failed: FATAL: the database system is shutting down
Apr 22 21:26:49 node02 patroni: 2024-04-22 21:26:49,760 INFO: no action. I am (node02), a secondary, and following a leader (node01)
Apr 22 21:27:28 node02 patroni: 2024-04-22 21:27:26,924 INFO: Lock owner: node03; I am node02
Apr 22 21:27:28 node02 patroni: 2024-04-22 21:27:28,644 INFO: Local timeline=6 lsn=0/180001F0
Apr 22 21:27:28 node02 patroni: 2024-04-22 21:27:28,696 ERROR: Exception when working with leader
Apr 22 21:27:28 node02 patroni: Traceback (most recent call last):
Apr 22 21:27:28 node02 patroni: psycopg2.OperationalError: connection to server at "192.168.0.213", port 5432 failed: FATAL: no pg_hba.conf entry for host "192.168.0.212", user "postgres", database "postgres", no encryption
Apr 22 21:27:28 node02 patroni: 2024-04-22 21:27:28,751 INFO: no action. I am (node02), a secondary, and following a leader (node03)
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




