数据库高可用--MySQL--高可用之keepalive+MySQL
1. 高可用方案说明
MySQL高可用方案主要目标是保证MySQL主库异常宕机后从库能够快速承接业务,从而保证业务的连续性。MySQL的高可用都是基于MySQL主从实现的,然后组合其他的中间件形成适合公司业务的高可用架构方案。高可用方案要考虑架构的稳定性,时效性,一致性,这三者做很好的平衡。
目前主流的MySQL高可用架构有keepalived+一主一从,vip/dns+一主多从+mha,vip/dns+一主多从+ Orchestrator。还可以加入代理中间做负载均衡,分库分表等,类似的中间件有haproxy,proxysql
,mycat,dble,sharingsphere等。
组件分类
Keepalive,dns,consul,zookeeper等这些主要是实现应用程序的统一的访问入口。
Mha,Orchestrator主要是实现主从关系的管理和自动切换
haproxy,proxysql,mycat,dble, sharingsphere等主要是实现负载均衡,访问流量管理,分库分表等。
目前苍穹数据库都是一主一从架构,这样搭配keepalived,采用keepalived+一主一从的高可用方案是最合适的。这种方案最大的好处就是简单可控。本文档不会详细讲解高可用架构的安装部署,主要是测试MySQL异步和半同步的性能对比,高可用各种故障情况的测试。
测试服务器信息
IP地址 | 主从 | 版本 |
192.168.63.65 | 主 | 8.0.26 |
192.168.63.63 | 从 | 8.0.26 |
2. MySQL异步和半同步复制
异步复制:
应用发起数据更新(含insert、update、delete操作)请求,master在执行完更新操作后立即向应用程序返回响应,然后master在向slave同步数据。
数据更新过程中master不需要等待slave的响应,因此异步复制的数据库实例通常具有较高的性能,且slave不可以用并不影响master对外提供服务。但因数据并不是实时同步到slave,而master在slave有延迟的情况下发生故障则有较小的概率会引起数据不一致。
半同步复制:
应用发起数据更新(含insert、update、delete操作)请求,master在执行完更新操作后立即向slave复制数据,slave接收到数据并写到relaylog中(不需要执行)后才向master返回成功的信息,master必须在接受到slave的成功信息后再向应用程序返回响应。
仅在数据复制发生异常(slave节点不可以用或者数据复制所用的网络发生异常)的情况下,master会暂停(MySQL默认10s左右)对应用的响应,将复制方式降为异步复制。当数据复制恢复正常,将恢复为半同步复制。
2.1MySQL安装和主从搭建
1.创建用户
/usr/sbin/groupadd mysql
/usr/sbin/useradd -g mysql mysql
2.解压文件
tar -xvJf mysql-8.0.26-linux-glibc2.12-x86_64.tar.xz -C /usr/local
cd /usr/local
ln -s mysql-8.0.26-linux-glibc2.12-x86_64 mysql80
3.创建目录结构
mkdir -pv /data/dbdata/mysqldata
mkdir -pv /data/dbdata/mysqllog
mkdir -pv /data/dbdata/mysqllog/binlog
mkdir -pv /data/dbdata/mysqllog/relay-log
4.设置配置文件
拷贝模板my.cnf 到/etc/my.cnf
修改目录权限
chown -R mysql.mysql /data/dbdata
chown -R mysql.mysql /usr/local/mysql80
5.初始化数据库
/usr/local/mysql80/bin/mysqld --defaults-file=/home/mysql3308/etc/my.cnf --initialize --user=mysql
建立主从关系,在从库执行
CHANGE MASTER TO MASTER_HOST='192.168.63.65',MASTER_PORT=3307,MASTER_USER='repl',MASTER_PASSWORD='replPAS',MASTER_LOG_FILE='binlog.000105',MASTER_LOG_POS=156;
2.2半同步安装部署
主库查看是否安装半同步复制插件:rpl_semi_sync_master
show plugins;
使用二进制或者源码编译安装,用find查找模块是否存在
find / -name "semisync*"
主和所有从节点均需安装并开启插件
主库安装
install plugin rpl_semi_sync_master soname 'semisync_master.so';
从库安装
install plugin rpl_semi_sync_slave soname 'semisync_slave.so';
show variables like '%semi%';
+-------------------------------------------+------------+
| Variable_name | Value |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled | OFF |
| rpl_semi_sync_master_timeout | 10000 |
| rpl_semi_sync_master_trace_level | 32 |
| rpl_semi_sync_master_wait_for_slave_count | 1 |
| rpl_semi_sync_master_wait_no_slave | ON |
| rpl_semi_sync_master_wait_point | AFTER_SYNC |
+-------------------------------------------+------------+
主相关【系统变量】
rpl_semi_sync_master_enabled:主库是否启用了半同步复制,默认为OFF。
rpl_semi_sync_master_timeout:等待从库的ACK回复的超时时间,默认为10秒。
rpl_semi_sync_master_trace_level:半同步复制时主库的调试级别。
rpl_semi_sync_master_wait_for_slave_count:主库在超时时间内需要收到多少个ACK回复才认为此次提交成功,否则就降级为异步复制。该变量在MySQL 5.7.3才提供,在此之前的版本都默认为收到1个ACK则确认成功,且不可更改。MySQL 5.7.3之后该变量的默认值也是1。
rpl_semi_sync_master_wait_no_slave:默认值为ON,当状态变量Rpl_semi_sync_master_clients中的值小于rpl_semi_sync_master_wait_for_slave_count时,Rpl_semi_sync_master_status依旧为ON,只有当事务提交后等待rpl_semi_sync_master_timeout超时后,Rpl_semi_sync_master_status才会变为OFF,即降级为异步复制;为OFF时,当状态变量Rpl_semi_sync_master_clients中的值小于rpl_semi_sync_master_wait_for_slave_count时,Rpl_semi_sync_master_status立即显示为OFF,即立即降级为异步复制。
rpl_semi_sync_master_wait_point:控制主库上commit、接收ack、返回消息给客户端的时间点。值为 AFTER_SYNC 或 AFTER_COMMIT。该选项是MySQL5.7.2后引入的,默认值为 AFTER_SYNC。此版本之前,等价于使用了 AFTER_COMMIT 模式。
从库
show variables like '%semi%';
+---------------------------------+-------+
| Variable_name | Value |
+---------------------------------+-------+
| rpl_semi_sync_slave_enabled | OFF |
| rpl_semi_sync_slave_trace_level | 32 |
+---------------------------------+-------+
从相关系统变量:
rpl_semi_sync_slave_enabled:从库是否开启半同步复制。
rpl_semi_sync_slave_trace_level:从库的调试级别。
启用半同步复制
-- 主库
set global rpl_semi_sync_master_enabled = 1;
-- 从库
set global rpl_semi_sync_slave_enabled = 1;
长期生效修改my.conf配置文件
#主库
plugin-load="rpl_semi_sync_master=semisync_master.so"
rpl_semi_sync_master_enabled=1
#主库
plugin-load="rpl_semi_sync_slave=semisync_slave.so"
rpl_semi_sync_slave_enabled=1
重启从库上的I/O线程
stop slave io_thread;
start slave io_thread;
查看半同步是否在运行
#主库
mysql> show status like 'Rpl_semi_sync_master_status';
+-----------------------------+-------+
| Variable_name | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_status | ON |
+-----------------------------+-------+
1 row in set (0.01 sec)
#从库
mysql> show status like 'Rpl_semi_sync_slave_status';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Rpl_semi_sync_slave_status | ON |
+----------------------------+-------+
1 row in set (0.00 sec)
这两个变量常用来监控主从是否运行在半同步复制模式下。至此,MySQL半同步复制搭建完毕。
从库上只有一个半同步复制相关的状态变量Rpl_semi_sync_slave_status,为ON时表示从库使用半同步复制,OFF表示从库使用异步复制。
主库有14个半同步复制相关的【状态变量】:
Rpl_semi_sync_master_clients:当前连接了多少个半同步从库。
Rpl_semi_sync_master_net_avg_wait_time:主库等待从库回复的平均时间,以微秒为单位。此变量始终为0,不推荐使用,并且将在以后的版本中删除。
Rpl_semi_sync_master_net_wait_time:主库等待从库回复的总时间,以微秒为单位。此变量始终为0,不推荐使用,并且将在以后的版本中删除。
Rpl_semi_sync_master_net_waits:主库等待从库回复的总次数。
Rpl_semi_sync_master_no_times:主库关闭半同步复制的次数。
Rpl_semi_sync_master_no_tx:从库未成功确认的事务数。
Rpl_semi_sync_master_status:为ON时表示主库使用半同步复制,OFF表示主库使用异步复制。
Rpl_semi_sync_master_timefunc_failures:调用gettimeofday等时间函数时主库失败的次数。
Rpl_semi_sync_master_tx_avg_wait_time:主库等待一个事务的平均时间,以微秒为单位。
Rpl_semi_sync_master_tx_wait_time:主等待事务的总时间,以微秒为单位。
Rpl_semi_sync_master_tx_waits:主库等待事务的总次数。
Rpl_semi_sync_master_wait_pos_backtraverse:主库等待事件的二进制坐标低于之前等待事件的总次数。当事务开始等待回复的顺序与其二进制日志事件的写入顺序不同时,就会发生这种情况。
Rpl_semi_sync_master_wait_sessions:当前等待从库回复的会话数。
Rpl_semi_sync_master_yes_tx:从库成功确认的事务数。
2.3异步和半同步性能测试对比
1.导入47G dump数据文件到数据库
Mysqldump -uadmin –h192.168.63.65 -P3307 –pxxxx < pt_bizbl_big_wfs.sql
异步:耗时(s): 811s
半同步:耗时(s): 1021s
2.使用sysbench测试
sysbench /usr/local/share/sysbench/tests/include/oltp_legacy/oltp.lua --mysql-host=192.168.63.65 --mysql-port=3307 --mysql-user=admin --mysql-password='kd$2020' --mysql-db=testdb --db-driver=mysql --oltp-test-mode=complex --oltp-tables-count=32 --oltp-table-size=1000000 --threads=64 --time=120 --report-interval=10 prepare
sysbench /usr/local/share/sysbench/tests/include/oltp_legacy/oltp.lua --mysql-host=192.168.63.65 --mysql-port=3307 --mysql-user=admin --mysql-password='kd$2020' --mysql-db=testdb --db-driver=mysql --oltp-test-mode=complex --oltp-tables-count=32 --oltp-table-size=1000000 --threads=64 --time=120 --report-interval=10 run
异步QPS:3271.60 per sec
半同步QPS:2934.64 per sec
测试方式 | 异步复制 | 半同步复制 | 性能损耗 |
导入数据测试[单位s] | 811s | 1021s | 25.89% |
Sysben测试[单位qps] | 3271.60 per sec | 2934.64 per sec | 11.48% |
测试结论:半同步复制比异步复制性能要下降10-25%之间
3. keepalived高可用测试
目前苍穹采用这种高可用方案,优先保证数据一致性,只有在严格的情况下才会触发VIP的漂移,并且切换到从库后需要手动关闭只读属性。在mysql发生oom,carsh 不会切换,mysql异常关闭后如果能够重新启动也不会切换,数据库压力高,超过最大连接数等不会切换。
相对于目前私有云和公有云keepalived+一主一从方案,本方案有以下几点改进
• keepalived无主备概念,无优先级等级,不抢占模式,谁先启动谁就是master
• 整个运行过程中不会关闭keepalived,除非需要长时间停掉mysql要手动停止keepalive,减少运维复杂度
• Mysql挂掉后,会判断主从延迟情况,如果同步不一致不会做切换,如果同步一致后会做切换
• mysql发生oom,carsh 不会切换,mysql挂掉自动重启一次
• 如果高可用要降级,可以停掉从库的keepalived,全程手动切换,不会导致数据一致性问题
• Keepalived日志独立,不和系统日志打印在一起,方便查看问题
• 主从同步检测不依赖主库,当主库宕机无法ssh登录时不会检测失败
MySQL高可用严重依赖主从同步,如果主从中断,延迟了,主机挂掉的时候从库就不能及时接管业务,这样就等于没有高可用。所以MySQL主从同步一定要做好监控,这个告警等级要和MySQL服务down掉一样高。
3.1. Keepalived部署常见问题
keepalived配置文件
主备采用相同的配置
Cat /etc/keepalived.conf
global_defs {
router_id mysql_65_63
script_user root
enable_script_security
}
vrrp_script chk_mysql {
script "/etc/keepalived/check_mysql.sh"
interval 30
fall 3
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 69
mcast_src_ip 192.168.63.65
priority 100
nopreempt
advert_int 3
authentication {
auth_type PASS
auth_pass kingdee2020
}
track_script {
chk_mysql
}
virtual_ipaddress {
192.168.63.69
}
#notify_master /etc/keepalived/readonly_off.sh
notify_backup /etc/keepalived/readonly_on.sh
notify_stop /etc/keepalived/readonly_on.sh
}
常见报错信息
报错Configuration file '/etc/keepalived/keepalived.conf' is not a regular non-executable file
修改配置文件权限
chmod 664 /etc/keepalived/keepalived.conf
日志中输出如下
WARNING - default user 'keepalived_script' for script execution does not exist - please create
添加脚本运行任务
script_user root
enable_script_security
如果报这种错误,说明脚本执行时间超过interval,没有执行完成就要重新开始执行
Track script chk_mysql is being timed out, expect idle - skipping run
修改参数interval 调大间隔时间
MySQL检测脚本
script "/etc/keepalived/check_mysql.sh"脚本
脚本中的变量要根据实际情况替换当前环境的值
#变量需要根据实际情况修改
my_cnf="/etc/my.cnf"
mysql_bin="/usr/local/mysql80/bin"
mysql_slave_ip="192.168.63.63"
mysql_slave_port="3307"
mysql_slave_user="admin"
mysql_slave_passwd='kd$2020'
mysql_service="mysqld_8.0_3307" ##初始安装需要修改
日志输出配置
日志默认是输出到/var/log/message,如果需要改变keepalive日志输出的位置
配置的是KEEPALIVED_OPTIONS="-D -d -S 0";而-S是syslog的facility,0表示放在local0,在/etc/rsyslog.conf 中配置local0.* /var/log/keepalived.log
systemctl restart rsyslog
直接修改/usr/lib/systemd/system/keepalived.service
ExecStart=/usr/local/keepalived/sbin/keepalived -D -d -S 0
systemctl daemon-reload
systemctl restart keepalived
主库keepalive成功启动日志如下
Jun 23 10:30:03 centos7-65 Keepalived[47088]: Starting Keepalived v1.4.0 (12/29,2017)
Jun 23 10:30:03 centos7-65 Keepalived[47088]: Running on Linux 3.10.0-123.el7.x86_64 #1 SMP Mon Jun 30 12:09:22 UTC 2014 (built for Linux 3.10.0)
Jun 23 10:30:03 centos7-65 Keepalived[47088]: Opening file '/etc/keepalived/keepalived.conf'.
Jun 23 10:30:03 centos7-65 systemd: PID file /var/run/keepalived.pid not readable (yet?) after start.
Jun 23 10:30:03 centos7-65 Keepalived[47089]: Starting Healthcheck child process, pid=47090
Jun 23 10:30:03 centos7-65 systemd: Started Keepalived.
Jun 23 10:30:03 centos7-65 Keepalived[47089]: Starting VRRP child process, pid=47091
Jun 23 10:30:03 centos7-65 Keepalived_healthcheckers[47090]: Opening file '/etc/keepalived/keepalived.conf'.
Jun 23 10:30:03 centos7-65 Keepalived_vrrp[47091]: Registering Kernel netlink reflector
Jun 23 10:30:03 centos7-65 Keepalived_vrrp[47091]: Registering Kernel netlink command channel
Jun 23 10:30:03 centos7-65 Keepalived_vrrp[47091]: Registering gratuitous ARP shared channel
Jun 23 10:30:03 centos7-65 Keepalived_vrrp[47091]: Opening file '/etc/keepalived/keepalived.conf'.
Jun 23 10:30:03 centos7-65 Keepalived_vrrp[47091]: Truncating auth_pass to 8 characters
Jun 23 10:30:03 centos7-65 Keepalived_vrrp[47091]: VRRP_Instance(VI_1) removing protocol VIPs.
Jun 23 10:30:03 centos7-65 Keepalived_vrrp[47091]: Using LinkWatch kernel netlink reflector...
Jun 23 10:30:03 centos7-65 Keepalived_vrrp[47091]: VRRP_Instance(VI_1) Entering BACKUP STATE
Jun 23 10:30:03 centos7-65 Keepalived_vrrp[47091]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Jun 23 10:30:03 centos7-65 Keepalived_vrrp[47091]: VRRP_Script(chk_mysql) succeeded ###执行脚本
Jun 23 10:30:12 centos7-65 Keepalived_vrrp[47091]: VRRP_Instance(VI_1) Transition to MASTER STATE
Jun 23 10:30:15 centos7-65 Keepalived_vrrp[47091]: VRRP_Instance(VI_1) Entering MASTER STATE ###进入MASTER状态
Jun 23 10:30:15 centos7-65 Keepalived_vrrp[47091]: VRRP_Instance(VI_1) setting protocol VIPs.
Jun 23 10:30:15 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 10:30:15 centos7-65 Keepalived_vrrp[47091]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth0 for 192.168.63.69
Jun 23 10:30:15 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 10:30:15 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 10:30:15 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 10:30:15 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 10:30:15 centos7-65 avahi-daemon[2226]: Registering new address record for 192.168.63.69 on eth0.IPv4. ###注册虚拟ip地址
Jun 23 10:30:20 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 10:30:20 centos7-65 Keepalived_vrrp[47091]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth0 for 192.168.63.69
Jun 23 10:30:20 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 10:30:20 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 10:30:20 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 10:30:20 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
备库keepalive启动日志
Jun 23 11:17:42 tikv3 systemd: Starting Keepalived...
Jun 23 11:17:42 tikv3 Keepalived[88839]: Starting Keepalived v1.4.0 (12/29,2017)
Jun 23 11:17:42 tikv3 Keepalived[88839]: Running on Linux 3.10.0-123.el7.x86_64 #1 SMP Mon Jun 30 12:09:22 UTC 2014 (built for Linux 3.10.0)
Jun 23 11:17:42 tikv3 Keepalived[88839]: Opening file '/etc/keepalived/keepalived.conf'.
Jun 23 11:17:42 tikv3 Keepalived[88840]: Starting Healthcheck child process, pid=88841
Jun 23 11:17:42 tikv3 Keepalived[88840]: Starting VRRP child process, pid=88842
Jun 23 11:17:42 tikv3 Keepalived_healthcheckers[88841]: Opening file '/etc/keepalived/keepalived.conf'.
Jun 23 11:17:42 tikv3 Keepalived_vrrp[88842]: Registering Kernel netlink reflector
Jun 23 11:17:42 tikv3 Keepalived_vrrp[88842]: Registering Kernel netlink command channel
Jun 23 11:17:42 tikv3 Keepalived_vrrp[88842]: Registering gratuitous ARP shared channel
Jun 23 11:17:42 tikv3 Keepalived_vrrp[88842]: Opening file '/etc/keepalived/keepalived.conf'.
Jun 23 11:17:42 tikv3 Keepalived_vrrp[88842]: Truncating auth_pass to 8 characters
Jun 23 11:17:42 tikv3 Keepalived_vrrp[88842]: VRRP_Instance(VI_1) removing protocol VIPs.
Jun 23 11:17:42 tikv3 Keepalived_vrrp[88842]: Using LinkWatch kernel netlink reflector...
Jun 23 11:17:42 tikv3 Keepalived_vrrp[88842]: VRRP_Instance(VI_1) Entering BACKUP STATE ###进入BACKUP状态
Jun 23 11:17:42 tikv3 Keepalived_vrrp[88842]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
3.2. 高可用模拟故障测试
机器+keepalived+脚本检测,三种状态等同,一般情况keepalive不会挂掉,一般只是人为关闭或者和机器同时挂掉。
故障编号 | 故障说明 | 解释 |
RT_MySQL_01 | 只启动单个keepalived | 作用就是启用一个VIP,出现故障手动启停keepaliveed来做vip转移,目前有部分客户采用次方案 |
RT_MySQL_02 | 主库MySQL挂掉,主从同步正常 | Oom,crash recovery不会切换,sleep 10 等待自己启动,mysql进程不在还会自动重启一下,这个可以看情况也可以去掉。如果确实需要停掉mysql主库,又不想vip漂移,可以先停掉从的keepalived然后停掉主的keepalived |
RT_MySQL_03 | 主库MySQL挂掉,主从同步不一致 | 当主从同步不一致不能切换,会导致数据一致,如果同步跟上一致了会发生切换 |
RT_MySQL_04 | 主库keepalive挂掉 | 一般情况不会出现,除非是人为停掉或者机器挂掉,这种不会做mysql的任何检测包括主从,直接把vip漂移过去,从库只读关闭要人工确认 |
RT_MySQL_05 | 主库机器挂掉 | 同上 |
RT_MySQL_06 | 主库和从库网络之间不通 | 这种情况会存在多个vip的情况出现脑裂情况,需要手工停掉一个keepalived |
RT_MySQL_07 | 主库负载高 | 检测mysql进程是否存在,出现负载高,数据库hang住,超过连接数等都要手工处理,不会做切换 |
RT_MySQL_08 | 从库挂掉 | 无影响,从库检测也会执行,也会执行check_mysql.sh逻辑,会把从库重启一次 |
RT_MySQL_09 | 从库keepalived挂掉 | 无影响 |
RT_MySQL_10 | 主库挂掉,从库挂掉,keepalived运行正常 | VIP消失 |
RT_MySQL_01
只启动单个keepalive,提供VIP的服务,这种情况下vip不会漂移,有故障发生手动启动从库的keepalive。这样方案的好处就是不会有数据不一致的问题。
关闭mysql服务,VIP会down掉,keepalived输出信息如下
Jun 23 10:59:03 centos7-65 Keepalived_vrrp[47091]: VRRP_Script(chk_mysql) failed (exited with status 1)
Jun 23 10:59:05 centos7-65 Keepalived_vrrp[47091]: VRRP_Instance(VI_1) Entering FAULT STATE
Jun 23 10:59:05 centos7-65 Keepalived_vrrp[47091]: VRRP_Instance(VI_1) removing protocol VIPs.
Jun 23 10:59:05 centos7-65 Keepalived_vrrp[47091]: VRRP_Instance(VI_1) Now in FAULT state
Jun 23 10:59:05 centos7-65 avahi-daemon[2226]: Withdrawing address record for 192.168.63.69 on eth0.
重新把mysql拉起来,keepalive日志,vip重新注册成功
Jun 23 11:04:06 centos7-65 Keepalived_vrrp[47091]: VRRP_Script(chk_mysql) succeeded
Jun 23 11:04:12 centos7-65 Keepalived_vrrp[47091]: Kernel is reporting: interface eth0 UP
Jun 23 11:04:12 centos7-65 Keepalived_vrrp[47091]: VRRP_Instance(VI_1): Entering BACKUP STATE
Jun 23 11:04:22 centos7-65 Keepalived_vrrp[47091]: VRRP_Instance(VI_1) Transition to MASTER STATE
Jun 23 11:04:25 centos7-65 Keepalived_vrrp[47091]: VRRP_Instance(VI_1) Entering MASTER STATE
Jun 23 11:04:25 centos7-65 Keepalived_vrrp[47091]: VRRP_Instance(VI_1) setting protocol VIPs.
Jun 23 11:04:25 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 11:04:25 centos7-65 Keepalived_vrrp[47091]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth0 for 192.168.63.69
Jun 23 11:04:25 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 11:04:25 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 11:04:25 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 11:04:25 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 11:04:25 centos7-65 avahi-daemon[2226]: Registering new address record for 192.168.63.69 on eth0.IPv4.
Jun 23 11:04:30 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 11:04:30 centos7-65 Keepalived_vrrp[47091]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth0 for 192.168.63.69
Jun 23 11:04:30 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 11:04:30 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 11:04:30 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jun 23 11:04:30 centos7-65 Keepalived_vrrp[47091]: Sending gratuitous ARP on eth0 for 192.168.63.69
测试结果【mysql挂掉VIP会down掉】
RT_MySQL_02
主库MySQL挂掉,主从同步正常,其中会排查掉oom,crash挂掉的情况。
第一种测试kill -9 `pidof mysqld` 模拟carsh或者oom情况
mysql日志输出
2022-07-13T13:43:55.176671+08:00 0 [Warning] [MY-010097] [Server] Insecure configuration for --secure-file-priv: Current value does not restrict location of generated files. Consider setting it to a valid, non-empty path.
2022-07-13T13:43:55.176787+08:00 0 [System] [MY-010116] [Server] /usr/local/mysql80/bin/mysqld (mysqld 8.0.26) starting as process 97886
2022-07-13T13:43:55.188148+08:00 0 [Warning] [MY-000081] [Server] option 'innodb-ft-total-cache-size': unsigned value 6400000000 adjusted to 1600000000.
2022-07-13T13:43:55.188371+08:00 0 [Warning] [MY-000081] [Server] option 'innodb-ft-result-cache-limit': unsigned value 40000000000 adjusted to 4294967295.
2022-07-13T13:43:55.192936+08:00 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2022-07-13T13:44:03.815637+08:00 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2022-07-13T13:44:04.551760+08:00 0 [Warning] [MY-010736] [Server] Couldn't load plugin named 'rpl_semi_sync_master' with soname 'semisync_master.so'.
2022-07-13T13:44:04.558550+08:00 0 [Warning] [MY-010918] [Repl] 'rpl_semi_sync_master' is deprecated and will be removed in a future release. Please use rpl_semi_sync_source instead.
2022-07-13T13:44:04.597628+08:00 0 [System] [MY-010229] [Server] Starting XA crash recovery...
2022-07-13T13:44:04.610862+08:00 0 [System] [MY-010232] [Server] XA crash recovery finished.
keepalive无日志
测试结果【不会切换】
第二种情况关闭mysql,能够重启成功,这种下也不会切换
mysql日志输出
2022-07-13T14:12:33.068072+08:00 0 [System] [MY-013172] [Server] Received SHUTDOWN from user <via user signal>. Shutting down mysqld (Version: 8.0.26).
2022-07-13T14:12:43.160249+08:00 0 [System] [MY-010910] [Server] /usr/local/mysql80/bin/mysqld: Shutdown complete (mysqld 8.0.26) MySQL Community Server - GPL.
2022-07-13T14:12:55.838827+08:00 0 [System] [MY-010116] [Server] /usr/local/mysql80/bin/mysqld (mysqld 8.0.26) starting as process 113285
2022-07-13T14:12:55.849266+08:00 0 [Warning] [MY-000081] [Server] option 'innodb-ft-total-cache-size': unsigned value 6400000000 adjusted to 1600000000.
2022-07-13T14:12:55.849551+08:00 0 [Warning] [MY-000081] [Server] option 'innodb-ft-result-cache-limit': unsigned value 40000000000 adjusted to 4294967295.
2022-07-13T14:12:55.853794+08:00 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2022-07-13T14:13:04.331102+08:00 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
keepalive无日志
测试结果【不会切换】
第三种情况关闭mysql,并且重启不成功,这种情况需要发生切换
模拟重启不成功,修改mysql启动文件
把启动mysql的命令注释掉
#$basedir/bin/mysqld_safe --defaults-file=$my_cnf >/dev/null 2>&1 &
#wait_for_pid created "$!" "$mysqld_pid_file_path"; return_value=$?
mysql日志输出
2022-07-13T14:50:38.970201+08:00 0 [System] [MY-013172] [Server] Received SHUTDOWN from user <via user signal>. Shutting down mysqld (Version: 8.0.26).
2022-07-13T14:50:48.682089+08:00 0 [System] [MY-010910] [Server] /usr/local/mysql80/bin/mysqld: Shutdown complete (mysqld 8.0.26) MySQL Community Server - GPL.
keepalive主日志
Jul 13 14:52:24 centos7-65 Keepalived_vrrp[111579]: VRRP_Script(chk_mysql) failed (exited with status 1)
Jul 13 14:52:26 centos7-65 Keepalived_vrrp[111579]: VRRP_Instance(VI_1) Entering FAULT STATE
Jul 13 14:52:26 centos7-65 Keepalived_vrrp[111579]: VRRP_Instance(VI_1) removing protocol VIPs.
Jul 13 14:52:26 centos7-65 Keepalived_vrrp[111579]: VRRP_Instance(VI_1) Now in FAULT state
keepalive从日志
Jul 13 14:52:27 tikv3 Keepalived_vrrp[61230]: VRRP_Instance(VI_1) Transition to MASTER STATE
Jul 13 14:52:30 tikv3 Keepalived_vrrp[61230]: VRRP_Instance(VI_1) Entering MASTER STATE
Jul 13 14:52:30 tikv3 Keepalived_vrrp[61230]: VRRP_Instance(VI_1) setting protocol VIPs.
Jul 13 14:52:30 tikv3 Keepalived_vrrp[61230]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jul 13 14:52:30 tikv3 Keepalived_vrrp[61230]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth0 for 192.168.63.69
Jul 13 14:52:30 tikv3 Keepalived_vrrp[61230]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jul 13 14:52:30 tikv3 Keepalived_vrrp[61230]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jul 13 14:52:30 tikv3 Keepalived_vrrp[61230]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jul 13 14:52:30 tikv3 Keepalived_vrrp[61230]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jul 13 14:52:35 tikv3 Keepalived_vrrp[61230]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jul 13 14:52:35 tikv3 Keepalived_vrrp[61230]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth0 for 192.168.63.69
Jul 13 14:52:35 tikv3 Keepalived_vrrp[61230]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jul 13 14:52:35 tikv3 Keepalived_vrrp[61230]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jul 13 14:52:35 tikv3 Keepalived_vrrp[61230]: Sending gratuitous ARP on eth0 for 192.168.63.69
Jul 13 14:52:35 tikv3 Keepalived_vrrp[61230]: Sending gratuitous ARP on eth0 for 192.168.63.69
测试结果【vip会切换到从库】
RT_MySQL_03
主库MySQL挂掉,主从同步不一致(可能是延迟,也能是中断了),这种情况不能切换
判断的条件如下,包含了延迟和中断的情况,主从跟上了会切换
[ -z "$master_log_file" ] || [ -z "$relay_master_log_file" ] || [ -z "$read_master_log_pos" ] || [ -z "$exec_master_log_pos" ]
比较主从同步一致情况
[ ${master_log_file} == ${relay_master_log_file} ] && [ ${read_master_log_pos} -eq ${exec_master_log_pos} ]
用sysbench 造数据,从库先stop slave掉,主库造数据完成后start slave,这样模拟主从延迟现象
主从中断情况不用模拟,文件名称都不相等,不存在主从同步一致的情况
mysql关闭
2022-07-13T16:33:22.616794+08:00 0 [System] [MY-013172] [Server] Received SHUTDOWN from user <via user signal>. Shutting down mysqld (Version: 8.0.26).
2022-07-13T16:33:24.465640+08:00 0 [System] [MY-010910] [Server] /usr/local/mysql80/bin/mysqld: Shutdown complete (mysqld 8.0.26) MySQL Community Server - GPL.
主从检测不同步,keepalive无日志输出
主库关掉,从库不能连接主库,同步不一致
如果io线程已经把binlog日志全部同步过来,但是sql线程没有同步完成,等sql线程同步完成会触发切换
主从检测同步,keepalive输出
2022-07-13T17:18:47.134046+08:00 0 [System] [MY-013172] [Server] Received SHUTDOWN from user <via user signal>. Shutting down mysqld (Version: 8.0.26).
2022-07-13T17:18:59.018091+08:00 0 [System] [MY-010910] [Server] /usr/local/mysql80/bin/mysqld: Shutdown complete (mysqld 8.0.26) MySQL Community Server - GPL.
从库状态
************************* 1. row ***************************
Slave_IO_State: Reconnecting after a failed source event read
Master_Host: 192.168.63.65
Master_User: repl
Master_Port: 3307
Connect_Retry: 60
Master_Log_File: binlog.000123
Read_Master_Log_Pos: 876060889
Relay_Log_File: mysql_relay_bin6507.000039
Relay_Log_Pos: 876061098
Relay_Master_Log_File: binlog.000123
Slave_IO_Running: Connecting
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table: mysql.%,performance_schema.%
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 876060889
Relay_Log_Space: 876061355
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 2003
Last_IO_Error: error reconnecting to master 'repl@192.168.63.65:3307' - retry-time: 60 retries: 3 message: Can't connect to MySQL server on '192.168.63.65:3307' (111)
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 6507
Master_UUID: c29e1834-c155-11ec-b3c0-861550fd19a2
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Replica has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp: 220713 17:20:55
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 0
Network_Namespace:
1 row in set, 1 warning (0.00 sec)
当主从同步完成时会发生切换
主keepalived切换日志
Jul 13 17:22:24 centos7-65 Keepalived_vrrp[111579]: VRRP_Script(chk_mysql) failed (exited with status 1)
Jul 13 17:22:25 centos7-65 Keepalived_vrrp[111579]: VRRP_Instance(VI_1) Entering FAULT STATE
Jul 13 17:22:25 centos7-65 Keepalived_vrrp[111579]: VRRP_Instance(VI_1) removing protocol VIPs.
Jul 13 17:22:25 centos7-65 Keepalived_vrrp[111579]: VRRP_Instance(VI_1) Now in FAULT state
从库keepalived 切换日志
Jul 13 17:22:26 tikv3 Keepalived_vrrp[73661]: VRRP_Instance(VI_1) Transition to MASTER STATE
Jul 13 17:22:29 tikv3 Keepal
数据库高可用--MySQL--高可用之keepalive+MySQL
本文2024-09-23 01:12:11发表“云苍穹知识”栏目。
本文链接:https://wenku.my7c.com/article/kingdee-cangqiong-144428.html
相关文章
- 鼎捷EAI整合規範文件V3.1.07 (集團).pdf
- 鼎捷OpenAPI應用場景說明_基礎資料.pdf
- 鼎捷OpenAPI應用場景說明_財務管理.pdf
- 鼎捷T100 API設計器使用手冊T100 APIDesigner(V1.0).docx
- 鼎新e-GoB2雲端ERP B2 線上課程E6-2應付票據整批郵寄 領取.pdf
- 鼎新e-GoB2雲端ERP B2 線上課程A4使用者建立權限設定.pdf
- 鼎新e-GoB2雲端ERP B2 線上課程C3會計開帳與會計傳票.pdf
- 鼎新e-GoB2雲端ERP B2 線上課程E6-1應付票據.pdf
- 鼎新e-GoB2雲端ERP B2 線上課程A5-1進銷存參數設定(初階篇).pdf
- 鼎新e-GoB2雲端ERP B2 線上課程D2帳款開帳與票據開帳.pdf