DBA专题
DBA授课
DBA公开课
DBA训练营三天
01.Mysql基础入门-数据库简介
02.Mysql基础入门-部署与管理体系
03.MySQL主流版本版本特性与部署安装
04.Mysql-基础入门-用户与权限
05 MySQL-SQL基础2
06 SQL高级开发-函数
07 MySQL-SQL高级处理
08 SQL练习 作业
09 数据库高级开发2
10 Mysql基础入门-索引
11 Mysql之InnoDB引擎架构与体系结构
12 Mysql之InnoDB存储引擎
13 Mysql之日志管理
14 Mysql备份,恢复与迁移
15 主从复制的作用及重要性
16 Mysql Binlog Event详解
17 Mysql 主从复制
18 MySQL主从复制延时优化及监控故障处理
19 MySQL主从复制企业级场景解析
20 MySql主从复制搭建
21 MySQL高可用-技术方案选型
22 MySQL高可用-MHA(原理篇)
23 MySQL MHA实验
24 MySQL MGR
25 部署MySQL InnoDB Cluster
26 MySQL Cluster(MGR)
27 MySQL ProxySQL中间件
相信可能就有无限可能
-
+
首页
23 MySQL MHA实验
# MHA 实验 MHA(Master High Availability)是⼀套相对成熟的MySQL⾼可⽤⽅案,能做到在0~30s内⾃动完成数据库的故 障切换操作,在master服务器不宕机的情况下,基本能保证数据的⼀致性。 它由两部分组成:MHA Manager(管理节点)和MHA Node(数据节点)。其中,MHA Manager可以单独部署 在⼀台独⽴的机器上管理多个master-slave集群,也可以部署在⼀台slave上。MHA Node则运⾏在每个mysql节点 上,MHA Manager会定时探测集群中的master节点,当master出现故障时,它⾃动将最新数据的slave提升为 master,然后将其它所有的slave指向新的master。 在MHA⾃动故障切换过程中,MHA试图保存master的⼆进制⽇志,从⽽最⼤程度地保证数据不丢失,当这并不总 是可⾏的,譬如,主服务器硬件故障或⽆法通过ssh访问,MHA就没法保存⼆进制⽇志,这样就只进⾏了故障转移 但丢失了最新数据。可结合MySQL 5.5中推出的半同步复制来降低数据丢失的⻛险。 ------ MHA for MySQL:掌握 MySQL 高可用性管理器和工具 MHA 的主要目标是在短时间(通常 10-30 秒)停机时间内自动执行主故障转移和从升级,而不会遇到复制一致性问题,无需花钱购买大量新服务器,不会造成性能损失,也不会产生复杂性(易于操作)安装),并且无需更改现有部署。 MHA还提供了一种预定的在线master切换的方法:在停机的几秒(0.5-2秒)内安全地将当前正在运行的master更改为新的master(仅阻止写入)。 MHA 提供以下功能,并且在需要高可用性、数据完整性、几乎不间断的主维护等要求的许多部署中非常有用。 * 自动主控监控和故障转移 MHA 具有在现有复制环境中监控 MySQL 主控、检测主控故障并自动执行主控故障转移的功能。即使某些从站尚未收到最新的中继日志事件,MHA 也会自动识别来自最新从站的差异中继日志事件,并将差异事件应用于其他从站。所以所有的slave都可以保持一致。 MHA 通常可以在几秒钟内完成故障转移(9-12 秒检测主站故障,可选地 7-10 秒关闭主站电源以避免脑裂,几秒钟将差异中继日志应用到新主站,因此总停机时间为通常为 10-30 秒)。此外,您可以在配置文件中将特定从站定义为候选主站(设置优先级)。由于 MHA 修复了从站之间的一致性,因此您可以将任何从站提升为新的主站,并且不会发生一致性问题(这可能会导致突然的复制失败)。 * 交互式(手动)主站故障转移 您还可以使用 MHA 仅进行故障转移,而不是用于监控主站。您可以交互地使用 MHA 进行主故障转移。 * 非交互式主机故障转移 还支持非交互式主机故障转移(不监视主机,但自动执行故障转移)。这个功能非常有用,特别是当你已经使用过监控MySQL master的软件时。例如,您可以使用[Pacemaker(Heartbeat)](http://www.linux-ha.org/wiki/Pacemaker)来检测主站故障和虚拟IP地址接管,并使用MHA来进行主站故障转移和从站升级。 * 在线切换master到不同的主机 在许多情况下,有必要将现有的master迁移到不同的机器(即当前的master在RAID控制器或RAM上有H/W问题,您想更换为更快的机器等) 。这不是主站崩溃,但需要定期进行主站维护才能做到这一点。计划的主维护会导致停机(至少你不能写主),所以应该尽快完成。另一方面,您应该非常小心地阻止/终止当前正在运行的会话,因为不同 master 之间可能会发生一致性问题(即“更新 master1、更新 master 2、提交 master1、提交 master 2 时出错”将导致数据不一致)。快速主切换和优雅的阻塞写入都是必需的。 MHA 提供了一种方法来做到这一点。您可以在 writer block 的 0.5-2 秒内优雅地切换 master。在许多情况下,0.5-2 秒的写入器停机时间是可以接受的,即使不分配计划的维护窗口,您也可以切换主服务器。这意味着您可以更轻松地采取行动,例如升级到更高版本、更快的机器等。 ## 1 集群拓扑 | 角色 | IP | | ---------------- | ------------ | | Master | 172.31.7.110 | | Candicate master | 172.31.7.111 | | Slave | 172.31.7.112 | | Monitor节点 | 172.31.7.113 | | VIP | 172.31.7.166 | ## 2 准备工作 ```bash # systemctl stop firewalld # systemctl disable firewalld # iptables -F # setenforce 0 # vim /etc/selinux/config SELINUX=disabled ``` ## 3 安装MHA 官网地址:https://github.com/yoshinorim **安装相关依赖** - centos 7 或 Rocky Linux 9 ```bash [root@db01 ~]# sudo dnf install epel-release [root@db01 ~]# yum -y install perl-DBD-MySQL ncftp perl-CPAN ``` - Ubuntu 20.04 ```sql apt-get update apt-get install libdbd-mysql-perl apt-get install libconfig-tiny-perl apt-get install liblog-dispatch-perl apt-get install libparallel-forkmanager-perl apt-get install libdbd-mysql-perl libconfig-tiny-perl liblog-dispatch-perl libparallel-forkmanager-perl -y ``` ### 3.1 在所有节点上部署 MHA Node - centos 7 ```bash [root@db01 ~]# curl -LO https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58-0.el7.centos.noarch.rpm [root@db01 ~]# rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm Verifying... ################################# [100%] Preparing... ################################# [100%] Updating / installing... 1:mha4mysql-node-0.58-0.el7.centos ################################# [100%] # 快速命令 curl -LO https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58-0.el7.centos.noarch.rpm yum install epel-release yum -y install perl-DBD-MySQL ncftp perl-CPAN rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm ``` - Ubuntu 20.04 ```bash # wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node_0.58-0_all.deb # dpkg -i mha4mysql-node_0.58-0_all.deb (Reading database ... 111270 files and directories currently installed.) Preparing to unpack mha4mysql-node_0.58-0_all.deb ... Unpacking mha4mysql-node (0.58-0) over (0.58-0) ... Setting up mha4mysql-node (0.58-0) ... Processing triggers for man-db (2.10.2-1) ... ``` - 编译安装centos ```sql # wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58.tar.gz # tar xvf mha4mysql-node-0.58.tar.gz # cd mha4mysql-node-0.58 # yum install cpan # yum install perl-Mail-Sender # perl -MCPAN -e "install inc::Module::Install" # perl -MCPAN -e "install DBI" # perl -MCPAN -e "install DBD::mysql" # perl Makefile.PL # make # make install ... Installing /usr/local/bin/apply_diff_relay_logs Installing /usr/local/bin/filter_mysqlbinlog Installing /usr/local/bin/purge_relay_logs Installing /usr/local/bin/save_binary_logs ... ``` 各个⼯具的作⽤: - save_binary_logs:保存或复制 master 的⼆进制⽇志 - apply_diff_relay_logs:识别差异的 relay log 并将差异的⽇志应⽤到其它 slave 中 - filter_mysqlbinlog:去除不必要的ROLLBACK事件(MHA已不再使⽤这个⼯具) - purge_relay_logs:删除 relay log。 ### 3.2 在 Monitor 节点上部署 MHA Manager - centos 7 ```bash [root@monitor ~]# yum install epel-release [root@monitor ~]# yum makecache [root@monitor ~]# yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y [root@monitor ~]# curl -LO https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58-0.el7.centos.noarch.rpm [root@monitor ~]# rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm ``` - Ubuntu 20.04 ```bash # wget https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager_0.58-0_all.deb # dpkg -i mha4mysql-manager_0.58-0_all.deb Selecting previously unselected package mha4mysql-manager. (Reading database ... 111233 files and directories currently installed.) Preparing to unpack mha4mysql-manager_0.58-0_all.deb ... Unpacking mha4mysql-manager (0.58-0) ... Setting up mha4mysql-manager (0.58-0) ... Processing triggers for man-db (2.10.2-1) ... ``` - 编译安装centos ```sql # cd /usr/src/ # wget https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58.tar.gz # tar xvf mha4mysql-manager-0.58.tar.gz # cd mha4mysql-manager-0.58 # yum install cpan # perl -MCPAN -e "install Config::Tiny" # perl -MCPAN -e "install Log::Dispatch" # perl -MCPAN -e "install Parallel::ForkManager" # perl Makefile.PL # make # make install Installing /usr/local/bin/masterha_check_repl Installing /usr/local/bin/masterha_check_ssh Installing /usr/local/bin/masterha_check_status Installing /usr/local/bin/masterha_conf_host Installing /usr/local/bin/masterha_manager Installing /usr/local/bin/masterha_master_monitor Installing /usr/local/bin/masterha_master_switch Installing /usr/local/bin/masterha_secondary_check Installing /usr/local/bin/masterha_stop ``` 各个⼯具的作⽤: - masterha_check_ssh:检查 MHA 的 SSH 配置状况 - masterha_check_repl:检查 MySQL 的复制状况 - masterha_manager:启动 MHA - masterha_check_status:检测当前 MHA 运⾏状态 - masterha_master_monitor:检测 master 是否宕机 - masterha_master_switch:控制故障转移(⾃动或⼿动) - masterha_conf_host:添加或删除配置的server信息 - masterha_stop:关闭MHA ## 4 配置 SSH 免密码登陆 1.在 manager(172.31.7.113)上配置到所有 Node 节点的⽆密码验证 ```sql # ssh-keygen ⼀路按“Enter” # ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.31.7.110 # ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.31.7.111 # ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.31.7.112 ``` 2.在 Master(172.31.7.110)上配置 ```sql # ssh-keygen # ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.31.7.111 # ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.31.7.112 ``` 3.在 Candicate master(172.31.7.111)上配置 ```sql # ssh-keygen # ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.31.7.110 # ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.31.7.112 ``` 4.在 Slave(172.31.7.112)上配置 ```sql # ssh-keygen # ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.31.7.110 # ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.31.7.111 ``` ## 5 搭建主从复制环境 - MYSQL配置文件 由于之后slave节点会提升为master节点,所以主从配置要做一些修改 Master配置(参考) ```bash log_bin = master-bin # 启动二进制日志 log_bin_index = master-bin.index relay-log = slave-relay-bin relay-log-purge = 0 # 禁用或启用不再需要中继日志时是否自动清空它们 ``` Slave配置(参考) ```bash log_bin = master-bin relay-log = slave-relay-bin relay-log-index = slave-relay-bin.index relay-log-purge = 0 read_only = 1 ``` - 配置文件 ```bash root@db01:~# cat /etc/my.cnf [client] port = 3306 socket = /data/mysql3306/tmp/mysql.sock #default_character_set = utf8mb4 [mysqld] user= mysql port = 3306 server_id = 1103306 #character-set-server = utf8mb4 datadir = /data/mysql3306/data socket = /data/mysql3306/tmp/mysql.sock log_error = /data/mysql3306/log/mysql.err pid_file = /data/mysql3306/data/mysql.pid slow-query-log slow_query_log_file = /data/mysql3306/log/slowquery.log tmpdir = /data/mysql3306/tmp/ log_bin = /data/mysql3306/log/mysql-bin #replication binlog_format = row binlog_row_image = full expire_logs_days = 10 relay-log = /data/mysql3306/log/relay-log slave_net_timeout = 30 #skip-slave-start slave-parallel-workers = 0 relay_log_info_repository = TABLE master_info_repository = TABLE sync_binlog = 1 gtid-mode=on enforce-gtid-consistency=true log-slave-updates=1 root@db02:~# cat /etc/my.cnf [client] port = 3306 socket = /data/mysql3306/tmp/mysql.sock [mysqld] user= mysql port = 3306 server_id = 1113306 datadir = /data/mysql3306/data socket = /data/mysql3306/tmp/mysql.sock log_error = /data/mysql3306/log/mysql.err pid_file = /data/mysql3306/data/mysql.pid slow-query-log slow_query_log_file = /data/mysql3306/log/slowquery.log tmpdir = /data/mysql3306/tmp/ log_bin = /data/mysql3306/log/mysql-bin #replication binlog_format = row binlog_row_image = full expire_logs_days = 10 relay-log = /data/mysql3306/log/relay-log slave_net_timeout = 30 #skip-slave-start slave-parallel-workers = 0 relay_log_info_repository = TABLE master_info_repository = TABLE sync_binlog = 1 #gtid gtid-mode=on enforce-gtid-consistency=true log-slave-updates=1 root@db03:~# cat /etc/my.cnf [client] port = 3306 socket = /data/mysql3306/tmp/mysql.sock [mysqld] user= mysql port = 3306 server_id = 1123306 datadir = /data/mysql3306/data socket = /data/mysql3306/tmp/mysql.sock log_error = /data/mysql3306/log/mysql.err pid_file = /data/mysql3306/data/mysql.pid slow-query-log slow_query_log_file = /data/mysql3306/log/slowquery.log tmpdir = /data/mysql3306/tmp/ log_bin = /data/mysql3306/log/mysql-bin #replication binlog_format = row binlog_row_image = full expire_logs_days = 10 relay-log = /data/mysql3306/log/relay-log slave_net_timeout = 30 #skip-slave-start slave-parallel-workers = 0 relay_log_info_repository = TABLE master_info_repository = TABLE sync_binlog = 1 #gtid gtid-mode=on enforce-gtid-consistency=true log-slave-updates=1 ``` ### 5.1 在Master上执⾏备份 ```sql # mysqldump --master-data=2 --single-transaction -R --triggers -A > all.sql ``` 其中,-R是备份存储过程,--triggers是备份触发器 -A代表全库 ### 5.2 在Master上创建复制⽤户 ```sql create user 'repl'@'%' identified with mysql_native_password by 'repl123456'; grant replication slave on *.* to 'repl'@'%'; ``` ### 5.3 查看备份⽂件 all.sql 中的 CHANGE MASTER 语句(位置点参考) ```sql # head -n 30 all.sql | grep "CHANGE MASTER TO" # -- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=157; root@db01:/data# head -n 30 all.sql | grep "CHANGE MASTER TO" -- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=157; ``` ### 5.4 将备份⽂件复制到 Candicate master 和 Slave 上 ```sql # scp all.sql 172.31.7.111:/root/ # scp all.sql 172.31.7.112:/root/ ``` ### 5.5 在 Candicate master 上搭建从库 ```bash CHANGE MASTER TO MASTER_HOST='172.31.7.110',master_port=3306, MASTER_USER='repl', MASTER_PASSWORD='repl123456', master_auto_position=1; start slave; show slave status\G ``` (位置点参考)位置点与binlog文件大小的关系 ```bash root@db01:/data/mysql3306/log# ll mysql-bin.000016 -rw-r----- 1 mysql mysql 197 Apr 26 09:16 mysql-bin.000016 root@db01:/data/mysql3306/log# mysql -e 'show master status;' +------------------+----------+--------------+------------------+--------------------------------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +------------------+----------+--------------+------------------+--------------------------------------------+ | mysql-bin.000016 | 197 | | | 32049408-02a5-11ef-8b62-000c29f23176:1-361 | +------------------+----------+--------------+------------------+--------------------------------------------+ ``` ```sql # mysql < all.sql CHANGE MASTER TO MASTER_HOST='172.31.7.110', MASTER_USER='repl', MASTER_PASSWORD='repl123456', MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=157; start slave; show slave status\G # 报错 root@db03:~# mysql < all.sql ERROR 1418 (HY000) at line 2623: This function has none of DETERMINISTIC, NO SQL, or READS SQL DATA in its declaration and binary logging is enabled (you *might* want to use the less safe log_bin_trust_function_creators variable) 临时解决: SET GLOBAL log_bin_trust_function_creators = 1; ``` ### 5.6 在 Slave 上搭建从库 ```sql CHANGE MASTER TO MASTER_HOST='172.31.7.110',master_port=3306, MASTER_USER='repl', MASTER_PASSWORD='repl123456', master_auto_position=1; start slave; show slave status\G ``` ### 5.7 从库服务器设置为 super_read_only ```sql set global super_read_only=1; ``` ### 5.8 在 Master 中创建监控⽤户 ```sql create user 'mha_monitor'@'%' identified with mysql_native_password by 'monitor123'; grant reload, process, super, replication client on *.* to 'mha_monitor'@'%'; grant all privileges on mysql.apply_diff_relay_logs_test to 'mha_monitor'@'%'; grant select on mysql.slave_relay_log_info to 'mha_monitor'@'%'; grant select on mysql.user to 'mha_monitor'@'%'; ``` ## 6 配置 MHA ### 6.1 创建配置⽂件 在 Monitor 主机(172.31.7.113)上创建MHA⼯作⽬录,并且创建相关配置⽂件 ```bash # mkdir -p /etc/masterha # vim /etc/masterha/app1.cnf [server default] manager_log=/masterha/app1/manager.log manager_workdir=/masterha/app1 master_binlog_dir=/data/mysql3306/log master_ip_failover_script= /usr/local/bin/master_ip_failover master_ip_online_change_script= /usr/local/bin/master_ip_online_change user=mha_monitor password=monitor123 ping_interval=1 remote_workdir=/tmp repl_user=repl repl_password=repl123456 report_script=/usr/local/bin/send_report secondary_check_script= /usr/local/bin/masterha_secondary_check -s 172.31.7.111 -s 172.31.7.112 --user=root --master_host=172.31.7.110 --master_ip=172.31.7.110 --master_port=3306 shutdown_script="" ssh_user=root #client_bindir=/usr/local/mysql/bin client_bindir=/opt/mysql-8.0.36/bin [server1] hostname=172.31.7.110 port=3306 [server2] hostname=172.31.7.111 port=3306 candidate_master=1 check_repl_delay=0 [server3] hostname=172.31.7.112 port=3306 check_repl_delay=0 # 用gtid的模式需要添加下面参数,如果传统复制可以不加,建议开启半同步 [binlog1] no_master=1 hostname=172.31.7.110 master_binlog_dir=/data/mysql3306/log ``` 各参数的具体含义可参考:https://www.cnblogs.com/ivictor/p/5686275.html ```bash [server default] manager_workdir=/var/log/masterha/app1 ##工作目录 目录需要自己创建 manager_log=/var/log/masterha/app1/manager.log ##日志文件 目录需要自己创建 master_ip_failover_script=/etc/masterha/master_ip_failover ##vip切换脚本 user=mha_monitor ##操作mysql的账户 三台主机都必须一样 password=mha_monitor ##操作mysql的密码 三台主机都必须一样 ssh_user=root ##ssh面密登录账户 repl_user=repl ##mysql的复制账户 repl_password=repl ##mysql的复制密码 ping_interval=1 ##每隔ping_interval秒检测主库心跳,最多四次机会,如果都没有心跳,主库宕机 [server1] hostname=172.16.100.101 ##mysql master主机ip; port=3306 ##端口; master_binlog_dir=/var/lib/mysql/data/mha-server ##二进制日志目录; candidate_master=1 #设置为候选master,如果设置该参数以后,发生主从切换以后将会将此从库提升为主库,即使这个主库不是集群中事件最新的slave。 [server2] hostname=172.16.100.102 ##从机ip; port=3306 ##从机端口; master_binlog_dir=/var/lib/mysql/data/mha-server ##二进制日志目录; candidate_master=1 ##是否候选备用master主机; check_repl_delay=0 ### 默认情况下,如果一个slave落后master 100M的relay logs的话,MHA将不会选择该slave作为一个新的master,因为对于这个slave的恢复需要花费很长时间。如果设置 check_repl_delay=0,MHA在触发切换选择一个新的master的时候将会忽略复制延时,这个参数对于设置了candidate_master=1的主机非常有用,可以保证候选主在切换后一定是新master。 [server3] hostname=192.168.88.131 ##从机ip; port=3306; master_binlog_dir=/var/lib/mysql/data/mha-server; no_master=1 ##不适合做主库的从机。 ``` ### 6.2 设置 relay log 清除⽅式 ```sql mysql> set global relay_log_purge=0; ``` ### 6.3 设置 relay log 定期清理脚本 ```bash # vim /etc/cron.d/purge_relay_logs 0 1 * * * /usr/local/bin/purge_relay_logs --host=127.0.0.1 --port=3306 --user=mha_monitor --password=monitor123 -disable_relay_log_purge --workdir=/tmp/ >>/tmp/purge_relay_logs.log 2>&1 ``` ## 7 创建脚本 参考:https://github.com/yoshinorim/mha4mysql-manager/tree/master/samples/scripts - /usr/local/bin/master_ip_failover ```perl #!/usr/bin/env perl # Copyright (C) 2011 DeNA Co.,Ltd. # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ## Note: This is a sample script and is not complete. Modify the script based on your environment. use strict; use warnings FATAL => 'all'; use Getopt::Long; use MHA::DBHelper; my ( $command, $ssh_user, $orig_master_host, $orig_master_ip, $orig_master_port, $new_master_host, $new_master_ip, $new_master_port, $new_master_user, $new_master_password ); my $vip = '172.31.7.166'; my $nic_name = 'ens33'; my $key = "2"; my $ssh_start_vip = "/sbin/ifconfig $nic_name:$key $vip/24"; my $ssh_stop_vip = "/sbin/ifconfig $nic_name:$key down"; my $ssh_send_garp = "/sbin/arping -U $vip -I $nic_name -c 1"; GetOptions( 'command=s' => \$command, 'ssh_user=s' => \$ssh_user, 'orig_master_host=s' => \$orig_master_host, 'orig_master_ip=s' => \$orig_master_ip, 'orig_master_port=i' => \$orig_master_port, 'new_master_host=s' => \$new_master_host, 'new_master_ip=s' => \$new_master_ip, 'new_master_port=i' => \$new_master_port, 'new_master_user=s' => \$new_master_user, 'new_master_password=s' => \$new_master_password, ); exit &main(); sub main { if ( $command eq "stop" || $command eq "stopssh" ) { # $orig_master_host, $orig_master_ip, $orig_master_port are passed. # If you manage master ip address at global catalog database, # invalidate orig_master_ip here. my $exit_code = 1; eval { print "Disabling the VIP an old master: $orig_master_host \n"; &stop_vip(); $exit_code = 0; }; if ($@) { warn "Got Error: $@\n"; exit $exit_code; } exit $exit_code; } elsif ( $command eq "start" ) { # all arguments are passed. # If you manage master ip address at global catalog database, # activate new_master_ip here. # You can also grant write access (create user, set read_only=0, etc) here. my $exit_code = 10; eval { my $new_master_handler = new MHA::DBHelper(); # args: hostname, port, user, password, raise_error_or_not $new_master_handler->connect( $new_master_ip, $new_master_port, $new_master_user, $new_master_password, 1 ); ## Set read_only=0 on the new master $new_master_handler->disable_log_bin_local(); print "Set read_only=0 on the new master.\n"; $new_master_handler->disable_read_only(); ## Creating an app user on the new master # print "Creating app user on the new master..\n"; # FIXME_xxx_create_user( $new_master_handler->{dbh} ); $new_master_handler->enable_log_bin_local(); $new_master_handler->disconnect(); print "Enabling the VIP $vip on the new master: $new_master_host \n"; &start_vip(); $exit_code = 0; }; if ($@) { warn $@; # If you want to continue failover, exit 10. exit $exit_code; } exit $exit_code; } elsif ( $command eq "status" ) { # do nothing exit 0; } else { &usage(); exit 1; } } sub start_vip(){ `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`; `ssh $ssh_user\@$new_master_host \" $ssh_send_garp \"`; } sub stop_vip(){ return 0 unless ($ssh_user); `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`; } sub usage { print "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n"; } ``` - /usr/local/bin/master_ip_online_change ```perl #!/usr/bin/env perl # Copyright (C) 2011 DeNA Co.,Ltd. # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ## Note: This is a sample script and is not complete. Modify the script based on your environment. use strict; use warnings FATAL => 'all'; use Getopt::Long; use MHA::DBHelper; use MHA::NodeUtil; use Time::HiRes qw( sleep gettimeofday tv_interval ); use Data::Dumper; my $_tstart; my $_running_interval = 0.1; my $vip = '172.31.7.166'; my $nic_name = 'ens33'; my $key = "2"; my $ssh_start_vip = "/sbin/ifconfig $nic_name:$key $vip/24"; my $ssh_stop_vip = "/sbin/ifconfig $nic_name:$key down"; my $ssh_send_garp = "/sbin/arping -U $vip -I $nic_name -c 1"; my ( $command, $orig_master_is_new_slave, $orig_master_host, $orig_master_ip, $orig_master_port, $orig_master_user, $orig_master_password, $orig_master_ssh_user, $new_master_host, $new_master_ip, $new_master_port, $new_master_user, $new_master_password, $new_master_ssh_user, ); GetOptions( 'command=s' => \$command, 'orig_master_is_new_slave' => \$orig_master_is_new_slave, 'orig_master_host=s' => \$orig_master_host, 'orig_master_ip=s' => \$orig_master_ip, 'orig_master_port=i' => \$orig_master_port, 'orig_master_user=s' => \$orig_master_user, 'orig_master_password=s' => \$orig_master_password, 'orig_master_ssh_user=s' => \$orig_master_ssh_user, 'new_master_host=s' => \$new_master_host, 'new_master_ip=s' => \$new_master_ip, 'new_master_port=i' => \$new_master_port, 'new_master_user=s' => \$new_master_user, 'new_master_password=s' => \$new_master_password, 'new_master_ssh_user=s' => \$new_master_ssh_user, ); exit &main(); sub start_vip(){ `ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`; `ssh $new_master_ssh_user\@$new_master_host \" $ssh_send_garp \"`; } sub stop_vip(){ `ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`; } sub current_time_us { my ( $sec, $microsec ) = gettimeofday(); my $curdate = localtime($sec); return $curdate . " " . sprintf( "%06d", $microsec ); } sub sleep_until { my $elapsed = tv_interval($_tstart); if ( $_running_interval > $elapsed ) { sleep( $_running_interval - $elapsed ); } } sub get_threads_util { my $dbh = shift; my $my_connection_id = shift; my $running_time_threshold = shift; my $type = shift; $running_time_threshold = 0 unless ($running_time_threshold); $type = 0 unless ($type); my @threads; my $sth = $dbh->prepare("SHOW PROCESSLIST"); $sth->execute(); while ( my $ref = $sth->fetchrow_hashref() ) { my $id = $ref->{Id}; my $user = $ref->{User}; my $host = $ref->{Host}; my $command = $ref->{Command}; my $state = $ref->{State}; my $query_time = $ref->{Time}; my $info = $ref->{Info}; $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info); next if ( $my_connection_id == $id ); next if ( defined($query_time) && $query_time < $running_time_threshold ); next if ( defined($command) && $command eq "Binlog Dump" ); next if ( defined($user) && $user eq "system user" ); next if ( defined($command) && $command eq "Sleep" && defined($query_time) && $query_time >= 1 ); if ( $type >= 1 ) { next if ( defined($command) && $command eq "Sleep" ); next if ( defined($command) && $command eq "Connect" ); } if ( $type >= 2 ) { next if ( defined($info) && $info =~ m/^select/i ); next if ( defined($info) && $info =~ m/^show/i ); } push @threads, $ref; } return @threads; } sub main { if ( $command eq "stop" ) { ## Gracefully killing connections on the current master # 1. Set read_only= 1 on the new master # 2. DROP USER so that no app user can establish new connections # 3. Set read_only= 1 on the current master # 4. Kill current queries # * Any database access failure will result in script die. my $exit_code = 1; eval { ## Setting read_only=1 on the new master (to avoid accident) my $new_master_handler = new MHA::DBHelper(); # args: hostname, port, user, password, raise_error(die_on_error)_or_not $new_master_handler->connect( $new_master_ip, $new_master_port, $new_master_user, $new_master_password, 1 ); print current_time_us() . " Set read_only on the new master.. "; $new_master_handler->enable_read_only(); if ( $new_master_handler->is_read_only() ) { print "ok.\n"; } else { die "Failed!\n"; } $new_master_handler->disconnect(); # Connecting to the orig master, die if any database error happens my $orig_master_handler = new MHA::DBHelper(); $orig_master_handler->connect( $orig_master_ip, $orig_master_port, $orig_master_user, $orig_master_password, 1 ); ## Drop application user so that nobody can connect. Disabling per-session binlog beforehand $orig_master_handler->disable_log_bin_local(); # print current_time_us() . " Drpping app user on the orig master..\n"; #drop_app_user($orig_master_handler); ## Waiting for N * 100 milliseconds so that current connections can exit my $time_until_read_only = 15; $_tstart = [gettimeofday]; my @threads = get_threads_util( $orig_master_handler->{dbh}, $orig_master_handler->{connection_id} ); while ( $time_until_read_only > 0 && $#threads >= 0 ) { if ( $time_until_read_only % 5 == 0 ) { printf "%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n", current_time_us(), $#threads + 1, $time_until_read_only * 100; if ( $#threads < 5 ) { print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n" foreach (@threads); } } sleep_until(); $_tstart = [gettimeofday]; $time_until_read_only--; @threads = get_threads_util( $orig_master_handler->{dbh}, $orig_master_handler->{connection_id} ); } ## Setting read_only=1 on the current master so that nobody(except SUPER) can write print current_time_us() . " Set read_only=1 on the orig master.. "; $orig_master_handler->enable_read_only(); if ( $orig_master_handler->is_read_only() ) { print "ok.\n"; } else { die "Failed!\n"; } ## Waiting for M * 100 milliseconds so that current update queries can complete my $time_until_kill_threads = 5; @threads = get_threads_util( $orig_master_handler->{dbh}, $orig_master_handler->{connection_id} ); while ( $time_until_kill_threads > 0 && $#threads >= 0 ) { if ( $time_until_kill_threads % 5 == 0 ) { printf "%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n", current_time_us(), $#threads + 1, $time_until_kill_threads * 100; if ( $#threads < 5 ) { print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n" foreach (@threads); } } sleep_until(); $_tstart = [gettimeofday]; $time_until_kill_threads--; @threads = get_threads_util( $orig_master_handler->{dbh}, $orig_master_handler->{connection_id} ); } ## Terminating all threads print current_time_us() . " Killing all application threads..\n"; $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 ); print current_time_us() . " done.\n"; $orig_master_handler->enable_log_bin_local(); $orig_master_handler->disconnect(); ## Droping the VIP print "Disabling the VIP an old master: $orig_master_host \n"; &stop_vip(); ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK $exit_code = 0; }; if ($@) { warn "Got Error: $@\n"; exit $exit_code; } exit $exit_code; } elsif ( $command eq "start" ) { ## Activating master ip on the new master # 1. Create app user with write privileges # 2. Moving backup script if needed # 3. Register new master's ip to the catalog database # We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery. # If exit code is 0 or 10, MHA does not abort my $exit_code = 10; eval { my $new_master_handler = new MHA::DBHelper(); # args: hostname, port, user, password, raise_error_or_not $new_master_handler->connect( $new_master_ip, $new_master_port, $new_master_user, $new_master_password, 1 ); ## Set read_only=0 on the new master $new_master_handler->disable_log_bin_local(); print current_time_us() . " Set read_only=0 on the new master.\n"; $new_master_handler->disable_read_only(); ## Creating an app user on the new master #print current_time_us() . " Creating app user on the new master..\n"; # create_app_user($new_master_handler); print "Enabling the VIP $vip on the new master: $new_master_host \n"; &start_vip(); $new_master_handler->enable_log_bin_local(); $new_master_handler->disconnect(); ## Update master ip on the catalog database, etc $exit_code = 0; }; if ($@) { warn "Got Error: $@\n"; exit $exit_code; } exit $exit_code; } elsif ( $command eq "status" ) { # do nothing exit 0; } else { &usage(); exit 1; } } sub usage { print "Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n"; die; } ``` - /usr/local/bin/send_report ```perl #!/usr/bin/perl # Copyright (C) 2011 DeNA Co.,Ltd. # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ## Note: This is a sample script and is not complete. Modify the script based on your environment. use strict; use warnings FATAL => 'all'; use Mail::Sender; use Getopt::Long; #new_master_host and new_slave_hosts are set only when recovering master succeeded my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body ); my $smtp='smtp.126.com'; my $mail_from='slowtech@126.com'; my $mail_user='slowtech@126.com'; my $mail_pass='xxxxx'; my $mail_to=['slowtech@126.com']; GetOptions( 'orig_master_host=s' => \$dead_master_host, 'new_master_host=s' => \$new_master_host, 'new_slave_hosts=s' => \$new_slave_hosts, 'subject=s' => \$subject, 'body=s' => \$body, ); mailToContacts($smtp,$mail_from,$mail_user,$mail_pass,$mail_to,$subject,$body); sub mailToContacts { my ( $smtp, $mail_from, $user, $passwd, $mail_to, $subject, $msg ) = @_; open my $DEBUG, "> /tmp/monitormail.log" or die "Can't open the debug file:$!\n"; my $sender = new Mail::Sender { ctype => 'text/plain; charset=utf-8', encoding => 'utf-8', smtp => $smtp, from => $mail_from, auth => 'LOGIN', TLS_allowed => '0', authid => $user, authpwd => $passwd, to => $mail_to, subject => $subject, debug => $DEBUG }; $sender->MailMsg( { msg => $msg, debug => $DEBUG } ) or print $Mail::Sender::Error; return 1; } # Do whatever you want here exit 0; ``` ```python #!/usr/bin/env python3 # -*- coding: utf-8 -*- import os import yagmail import argparse # 创建 ArgumentParser 对象并定义命令⾏参数 parser = argparse.ArgumentParser(description='send_report for MHA') parser.add_argument('--orig_master_host', type=str, help='original master host') parser.add_argument('--new_master_host', type=str, help='new master host') parser.add_argument('--new_slave_hosts', type=str, help='new slave hosts') parser.add_argument('--conf', type=str, help='configuration file') parser.add_argument('--subject', type=str, help='email subject') parser.add_argument('--body', type=str, help='email body') # 解析命令⾏参数 args = parser.parse_args() # 从环境变量中读取⽤户名和密码 email_user = os.environ.get('EMAIL_USER') email_password = os.environ.get('EMAIL_PASSWORD') # 创建 yagmail.SMTP 对象 yag = yagmail.SMTP(user=email_user, password=email_password, host='smtp.126.com') # 发送邮件 yag.send( to=email_user, subject=args.subject, contents=args.body ) ``` - 采用python发送邮件 ```python root@monitor:~# cat /usr/local/bin/send_report #!/usr/bin/env python3 # -*- coding: utf-8 -*- import os import yagmail import argparse # 创建 ArgumentParser 对象并定义命令⾏参数 parser = argparse.ArgumentParser(description='send_report for MHA') parser.add_argument('--orig_master_host', type=str, help='original master host') parser.add_argument('--new_master_host', type=str, help='new master host') parser.add_argument('--new_slave_hosts', type=str, help='new slave hosts') parser.add_argument('--conf', type=str, help='configuration file') parser.add_argument('--subject', type=str, help='email subject') parser.add_argument('--body', type=str, help='email body') # 解析命令⾏参数 args = parser.parse_args() # 从环境变量中读取⽤户名和密码 email_user = os.environ.get('EMAIL_USER') email_password = os.environ.get('EMAIL_PASSWORD') # 创建 yagmail.SMTP 对象 yag = yagmail.SMTP(user='1210353303@qq.com', password='wyfzqfjkyhfjigai', host='smtp.qq.com') # 发送邮件 yag.send( to='yanzhao.li@qq.com', subject=args.subject, contents=args.body ) ``` 给脚本添加可执⾏权限 ```bash # chmod +x /usr/local/bin/master_ip_failover # chmod +x /usr/local/bin/master_ip_online_change # chmod +x /usr/local/bin/send_report ``` ## 8 检查 SSH 的配置 在 Monitor 主机上执⾏ ```bash # masterha_check_ssh --conf=/etc/masterha/app1.cnf root@monitor:~# masterha_check_ssh --conf=/etc/masterha/app1.cnf Thu Apr 25 13:07:53 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Thu Apr 25 13:07:53 2024 - [info] Reading application default configuration from /etc/masterha/app1.cnf.. Thu Apr 25 13:07:53 2024 - [info] Reading server configuration from /etc/masterha/app1.cnf.. Thu Apr 25 13:07:53 2024 - [info] Starting SSH connection tests.. Thu Apr 25 13:07:54 2024 - [debug] Thu Apr 25 13:07:53 2024 - [debug] Connecting via SSH from root@172.31.7.110(172.31.7.110:22) to root@172.31.7.111(172.31.7.111:22).. Thu Apr 25 13:07:53 2024 - [debug] ok. Thu Apr 25 13:07:53 2024 - [debug] Connecting via SSH from root@172.31.7.110(172.31.7.110:22) to root@172.31.7.112(172.31.7.112:22).. Thu Apr 25 13:07:54 2024 - [debug] ok. Thu Apr 25 13:07:54 2024 - [debug] Thu Apr 25 13:07:53 2024 - [debug] Connecting via SSH from root@172.31.7.111(172.31.7.111:22) to root@172.31.7.110(172.31.7.110:22).. Thu Apr 25 13:07:54 2024 - [debug] ok. Thu Apr 25 13:07:54 2024 - [debug] Connecting via SSH from root@172.31.7.111(172.31.7.111:22) to root@172.31.7.112(172.31.7.112:22).. Thu Apr 25 13:07:54 2024 - [debug] ok. Thu Apr 25 13:07:55 2024 - [debug] Thu Apr 25 13:07:54 2024 - [debug] Connecting via SSH from root@172.31.7.112(172.31.7.112:22) to root@172.31.7.110(172.31.7.110:22).. Thu Apr 25 13:07:54 2024 - [debug] ok. Thu Apr 25 13:07:54 2024 - [debug] Connecting via SSH from root@172.31.7.112(172.31.7.112:22) to root@172.31.7.111(172.31.7.111:22).. Thu Apr 25 13:07:55 2024 - [debug] ok. Thu Apr 25 13:07:55 2024 - [info] All SSH connection tests passed successfully. Use of uninitialized value in exit at /usr/bin/masterha_check_ssh line 44. ``` ## 9 查看整个集群的状态 在 Monitor 主机上执⾏ ```bash # masterha_check_repl --conf=/etc/masterha/app1.cnf root@monitor:~# masterha_check_repl --conf=/etc/masterha/app1.cnf Thu Apr 25 13:19:21 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Thu Apr 25 13:19:21 2024 - [info] Reading application default configuration from /etc/masterha/app1.cnf.. Thu Apr 25 13:19:21 2024 - [info] Reading server configuration from /etc/masterha/app1.cnf.. Thu Apr 25 13:19:21 2024 - [info] MHA::MasterMonitor version 0.58. Thu Apr 25 13:19:22 2024 - [info] GTID failover mode = 1 Thu Apr 25 13:19:22 2024 - [info] Dead Servers: Thu Apr 25 13:19:22 2024 - [info] Alive Servers: Thu Apr 25 13:19:22 2024 - [info] 172.31.7.110(172.31.7.110:3306) Thu Apr 25 13:19:22 2024 - [info] 172.31.7.111(172.31.7.111:3306) Thu Apr 25 13:19:22 2024 - [info] 172.31.7.112(172.31.7.112:3306) Thu Apr 25 13:19:22 2024 - [info] Alive Slaves: Thu Apr 25 13:19:22 2024 - [info] 172.31.7.111(172.31.7.111:3306) Version=8.0.36 (oldest major version between slaves) log-bin:enabled Thu Apr 25 13:19:22 2024 - [info] GTID ON Thu Apr 25 13:19:22 2024 - [info] Replicating from 172.31.7.110(172.31.7.110:3306) Thu Apr 25 13:19:22 2024 - [info] Primary candidate for the new Master (candidate_master is set) Thu Apr 25 13:19:22 2024 - [info] 172.31.7.112(172.31.7.112:3306) Version=8.0.36 (oldest major version between slaves) log-bin:enabled Thu Apr 25 13:19:22 2024 - [info] GTID ON Thu Apr 25 13:19:22 2024 - [info] Replicating from 172.31.7.110(172.31.7.110:3306) Thu Apr 25 13:19:22 2024 - [info] Current Alive Master: 172.31.7.110(172.31.7.110:3306) Thu Apr 25 13:19:22 2024 - [info] Checking slave configurations.. Thu Apr 25 13:19:22 2024 - [info] Checking replication filtering settings.. Thu Apr 25 13:19:22 2024 - [info] binlog_do_db= , binlog_ignore_db= Thu Apr 25 13:19:22 2024 - [info] Replication filtering check ok. Thu Apr 25 13:19:22 2024 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking. Thu Apr 25 13:19:22 2024 - [info] Checking SSH publickey authentication settings on the current master.. Thu Apr 25 13:19:23 2024 - [info] HealthCheck: SSH to 172.31.7.110 is reachable. Thu Apr 25 13:19:23 2024 - [info] 172.31.7.110(172.31.7.110:3306) (current master) +--172.31.7.111(172.31.7.111:3306) +--172.31.7.112(172.31.7.112:3306) Thu Apr 25 13:19:23 2024 - [info] Checking replication health on 172.31.7.111.. Thu Apr 25 13:19:23 2024 - [info] ok. Thu Apr 25 13:19:23 2024 - [info] Checking replication health on 172.31.7.112.. Thu Apr 25 13:19:23 2024 - [info] ok. Thu Apr 25 13:19:23 2024 - [info] Checking master_ip_failover_script status: Thu Apr 25 13:19:23 2024 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=172.31.7.110 --orig_master_ip=172.31.7.110 --orig_master_port=3306 Thu Apr 25 13:19:23 2024 - [info] OK. Thu Apr 25 13:19:23 2024 - [warning] shutdown_script is not defined. Thu Apr 25 13:19:23 2024 - [info] Got exit code 0 (Not master dead). MySQL Replication Health is OK. ``` ### 9.1 常⻅报错 1.Redundant argument in sprintf at /usr/local/share/perl5/5.32/MHA/NodeUtil.pm line 202 ```bash [error][/usr/local/share/perl5/5.32/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. Redundant argument in sprintf at /usr/local/share/perl5/5.32/MHA/NodeUtil.pm line 202. ``` 解决⽅法: ```bash $str =~ s/\.[^.]+$//; my $result = sprintf( '%03d%03d', $str =~ m/(\d+)/g ); ``` 2.Can't exec "mysqlbinlog": No such file or directory at /usr/local/share/perl5/5.32/MHA/BinlogManager.pm line 106. ```bash Can't exec "mysqlbinlog": No such file or directory at /usr/local/share/perl5/5.32/MHA/BinlogManager.pm line 106. mysqlbinlog version command failed with rc 1:0, please verify PATH, LD_LIBRARY_PATH, and client options at /usr/local/bin/apply_diff_relay_logs line 532. ``` 解决⽅法:配置⽂件中指定 client_bindir。 ```bash client_bindir=/usr/local/mysql/bin ``` 3.Ubuntu 20.04 检查集群状态报错 ```bash root@monitor:~# masterha_check_repl --conf=/etc/masterha/app1.cnf Thu Apr 25 13:17:47 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Thu Apr 25 13:17:47 2024 - [info] Reading application default configuration from /etc/masterha/app1.cnf.. Thu Apr 25 13:17:47 2024 - [info] Reading server configuration from /etc/masterha/app1.cnf.. Thu Apr 25 13:17:47 2024 - [info] MHA::MasterMonitor version 0.58. Thu Apr 25 13:17:48 2024 - [error][/usr/share/perl5/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. Redundant argument in sprintf at /usr/share/perl5/MHA/NodeUtil.pm line 201. Thu Apr 25 13:17:48 2024 - [error][/usr/share/perl5/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers. Thu Apr 25 13:17:48 2024 - [info] Got exit code 1 (Not master dead). MySQL Replication Health is NOT OK! #解决方案: vim /usr/share/perl5/MHA/NodeUtil.pm +195 #sub parse_mysql_version($) { # my $str = shift; # my $result = sprintf( '%03d%03d%03d', $str =~ m/(\d+)/g ); # return $result; #} sub parse_mysql_version($) { my $str = shift; ($str) = $str =~ m/^[^-]*/g; my $result = sprintf( '%03d%03d%03d', $str =~ m/(\d+)/g ); return $result; } #sub parse_mysql_major_version($) { # my $str = shift; # my $result = sprintf( '%03d%03d', $str =~ m/(\d+)/g ); # return $result; #} sub parse_mysql_major_version($) { my $str = shift; $str =~ /(\d+)\.(\d+)/; my $strmajor = "$1.$2"; my $result = sprintf( '%03d%03d', $strmajor =~ m/(\d+)/g ); return $result; } # 参考: # https://www.cnblogs.com/windman0/p/16058777.html # https://github.com/yoshinorim/mha4mysql-manager/issues/116 ``` ## 10 检查 MHA Manager 的状态 ```bash # masterha_check_status --conf=/etc/masterha/app1.cnf app1 is stopped(2:NOT_RUNNING). ``` 如果正常,会显示 PING_OK ,否则会显示 NOT_RUNNING ,代表 MHA 监控还没有开启。 ## 11 开启 MHA Manager 监控 ```bash root@monitor:~# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /masterha/app1/manager.log 2>&1 & # 测试去除--remove_dead_master_conf 参数 方便测试 nohup masterha_manager --conf=/etc/masterha/app1.cnf --ignore_last_failover < /dev/null > /masterha/app1/manager.log 2>&1 & root@monitor:~# nohup masterha_manager --conf=/etc/masterha/app1.cnf --ignore_last_failover < /dev/null > /masterha/app1/manager.log 2>&1 & [1] 22278 # tailf /masterha/app1/manager.log # masterha_check_status --conf=/etc/masterha/app1.cnf root@monitor:~# masterha_check_status --conf=/etc/masterha/app1.cnf app1 (pid:5058) is running(0:PING_OK), master:172.31.7.110 ``` 关闭 ```bash root@monitor:~# masterha_stop --conf=/etc/masterha/app1.cnf ``` ## 12 Master 上挂载 VIP ```bash # /sbin/ifconfig ens160:2 172.31.7.166/24 root@db01:~# apt install arping -y root@db01:~# apt install net-tools -y root@db01:~# ifconfig ens33:2 172.31.7.166/24 # 验证 root@db03:~# mysql -h 172.31.7.166 -u repl -prepl123456 mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 48 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> select @@server_id; +-------------+ | @@server_id | +-------------+ | 1103306 | +-------------+ 1 row in set (0.00 sec) ``` ## 13 模拟主库故障 - 关闭主库 ```bash root@db01:~# systemctl stop mysql ``` - 查看日志 ```bash root@monitor:~# tail -f /masterha/app1/manager.log +--172.31.7.112(172.31.7.112:3306) Thu Apr 25 17:43:58 2024 - [info] Checking master_ip_failover_script status: Thu Apr 25 17:43:58 2024 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=172.31.7.110 --orig_master_ip=172.31.7.110 --orig_master_port=3306 Thu Apr 25 17:43:59 2024 - [info] OK. Thu Apr 25 17:43:59 2024 - [warning] shutdown_script is not defined. Thu Apr 25 17:43:59 2024 - [info] Set master ping interval 1 seconds. Thu Apr 25 17:43:59 2024 - [info] Set secondary check script: /usr/bin/masterha_secondary_check -s 172.31.7.111 -s 172.31.7.112 --user=root --master_host=172.31.7.110 --master_ip=172.31.7.110 --master_port=3306 Thu Apr 25 17:43:59 2024 - [info] Starting ping health check on 172.31.7.110(172.31.7.110:3306).. Thu Apr 25 17:43:59 2024 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond.. Thu Apr 25 17:44:17 2024 - [warning] Got error on MySQL select ping: 1053 (Server shutdown in progress) Thu Apr 25 17:44:17 2024 - [info] Executing secondary network check script: /usr/bin/masterha_secondary_check -s 172.31.7.111 -s 172.31.7.112 --user=root --master_host=172.31.7.110 --master_ip=172.31.7.110 --master_port=3306 --user=root --master_host=172.31.7.110 --master_ip=172.31.7.110 --master_port=3306 --master_user=mha_monitor --master_password=monitor123 --ping_type=SELECT Thu Apr 25 17:44:17 2024 - [info] Executing SSH check script: exit 0 Thu Apr 25 17:44:17 2024 - [info] HealthCheck: SSH to 172.31.7.110 is reachable. perl: warning: Setting locale failed. perl: warning: Please check that your locale settings: LANGUAGE = (unset), LC_ALL = (unset), LC_TIME = "en_DK.UTF-8", LANG = "en_US.UTF-8" are supported and installed on your system. perl: warning: Falling back to a fallback locale ("en_US.UTF-8"). Monitoring server 172.31.7.111 is reachable, Master is not reachable from 172.31.7.111. OK. Thu Apr 25 17:44:18 2024 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.31.7.110:3306' (111)) Thu Apr 25 17:44:18 2024 - [warning] Connection failed 2 time(s).. perl: warning: Setting locale failed. perl: warning: Please check that your locale settings: LANGUAGE = (unset), LC_ALL = (unset), LC_TIME = "en_DK.UTF-8", LANG = "en_US.UTF-8" are supported and installed on your system. perl: warning: Falling back to a fallback locale ("en_US.UTF-8"). Monitoring server 172.31.7.112 is reachable, Master is not reachable from 172.31.7.112. OK. Thu Apr 25 17:44:18 2024 - [info] Master is not reachable from all other monitoring servers. Failover should start. Thu Apr 25 17:44:19 2024 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.31.7.110:3306' (111)) Thu Apr 25 17:44:19 2024 - [warning] Connection failed 3 time(s).. Thu Apr 25 17:44:20 2024 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.31.7.110:3306' (111)) Thu Apr 25 17:44:20 2024 - [warning] Connection failed 4 time(s).. Thu Apr 25 17:44:20 2024 - [warning] Master is not reachable from health checker! Thu Apr 25 17:44:20 2024 - [warning] Master 172.31.7.110(172.31.7.110:3306) is not reachable! Thu Apr 25 17:44:20 2024 - [warning] SSH is reachable. Thu Apr 25 17:44:20 2024 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status.. Thu Apr 25 17:44:20 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Thu Apr 25 17:44:20 2024 - [info] Reading application default configuration from /etc/masterha/app1.cnf.. Thu Apr 25 17:44:20 2024 - [info] Reading server configuration from /etc/masterha/app1.cnf.. Thu Apr 25 17:44:21 2024 - [info] GTID failover mode = 1 Thu Apr 25 17:44:21 2024 - [info] Dead Servers: Thu Apr 25 17:44:21 2024 - [info] 172.31.7.110(172.31.7.110:3306) Thu Apr 25 17:44:21 2024 - [info] Alive Servers: Thu Apr 25 17:44:21 2024 - [info] 172.31.7.111(172.31.7.111:3306) Thu Apr 25 17:44:21 2024 - [info] 172.31.7.112(172.31.7.112:3306) Thu Apr 25 17:44:21 2024 - [info] Alive Slaves: Thu Apr 25 17:44:21 2024 - [info] 172.31.7.111(172.31.7.111:3306) Version=8.0.36 (oldest major version between slaves) log-bin:enabled Thu Apr 25 17:44:21 2024 - [info] GTID ON Thu Apr 25 17:44:21 2024 - [info] Replicating from 172.31.7.110(172.31.7.110:3306) Thu Apr 25 17:44:21 2024 - [info] Primary candidate for the new Master (candidate_master is set) Thu Apr 25 17:44:21 2024 - [info] 172.31.7.112(172.31.7.112:3306) Version=8.0.36 (oldest major version between slaves) log-bin:enabled Thu Apr 25 17:44:21 2024 - [info] GTID ON Thu Apr 25 17:44:21 2024 - [info] Replicating from 172.31.7.110(172.31.7.110:3306) Thu Apr 25 17:44:21 2024 - [info] Checking slave configurations.. Thu Apr 25 17:44:21 2024 - [info] Checking replication filtering settings.. Thu Apr 25 17:44:21 2024 - [info] Replication filtering check ok. Thu Apr 25 17:44:21 2024 - [info] Master is down! Thu Apr 25 17:44:21 2024 - [info] Terminating monitoring script. Thu Apr 25 17:44:21 2024 - [info] Got exit code 20 (Master dead). Thu Apr 25 17:44:21 2024 - [info] MHA::MasterFailover version 0.58. Thu Apr 25 17:44:21 2024 - [info] Starting master failover. Thu Apr 25 17:44:21 2024 - [info] Thu Apr 25 17:44:21 2024 - [info] * Phase 1: Configuration Check Phase.. Thu Apr 25 17:44:21 2024 - [info] Thu Apr 25 17:44:22 2024 - [info] GTID failover mode = 1 Thu Apr 25 17:44:22 2024 - [info] Dead Servers: Thu Apr 25 17:44:22 2024 - [info] 172.31.7.110(172.31.7.110:3306) Thu Apr 25 17:44:22 2024 - [info] Checking master reachability via MySQL(double check)... Thu Apr 25 17:44:22 2024 - [info] ok. Thu Apr 25 17:44:22 2024 - [info] Alive Servers: Thu Apr 25 17:44:22 2024 - [info] 172.31.7.111(172.31.7.111:3306) Thu Apr 25 17:44:22 2024 - [info] 172.31.7.112(172.31.7.112:3306) Thu Apr 25 17:44:22 2024 - [info] Alive Slaves: Thu Apr 25 17:44:22 2024 - [info] 172.31.7.111(172.31.7.111:3306) Version=8.0.36 (oldest major version between slaves) log-bin:enabled Thu Apr 25 17:44:22 2024 - [info] GTID ON Thu Apr 25 17:44:22 2024 - [info] Replicating from 172.31.7.110(172.31.7.110:3306) Thu Apr 25 17:44:22 2024 - [info] Primary candidate for the new Master (candidate_master is set) Thu Apr 25 17:44:22 2024 - [info] 172.31.7.112(172.31.7.112:3306) Version=8.0.36 (oldest major version between slaves) log-bin:enabled Thu Apr 25 17:44:22 2024 - [info] GTID ON Thu Apr 25 17:44:22 2024 - [info] Replicating from 172.31.7.110(172.31.7.110:3306) Thu Apr 25 17:44:22 2024 - [info] Starting GTID based failover. Thu Apr 25 17:44:22 2024 - [info] Thu Apr 25 17:44:22 2024 - [info] ** Phase 1: Configuration Check Phase completed. Thu Apr 25 17:44:22 2024 - [info] Thu Apr 25 17:44:22 2024 - [info] * Phase 2: Dead Master Shutdown Phase.. Thu Apr 25 17:44:22 2024 - [info] Thu Apr 25 17:44:22 2024 - [info] Forcing shutdown so that applications never connect to the current master.. Thu Apr 25 17:44:22 2024 - [info] Executing master IP deactivation script: Thu Apr 25 17:44:22 2024 - [info] /usr/local/bin/master_ip_failover --orig_master_host=172.31.7.110 --orig_master_ip=172.31.7.110 --orig_master_port=3306 --command=stopssh --ssh_user=root Disabling the VIP an old master: 172.31.7.110 Thu Apr 25 17:44:22 2024 - [info] done. Thu Apr 25 17:44:22 2024 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master. Thu Apr 25 17:44:22 2024 - [info] * Phase 2: Dead Master Shutdown Phase completed. Thu Apr 25 17:44:22 2024 - [info] Thu Apr 25 17:44:22 2024 - [info] * Phase 3: Master Recovery Phase.. Thu Apr 25 17:44:22 2024 - [info] Thu Apr 25 17:44:22 2024 - [info] * Phase 3.1: Getting Latest Slaves Phase.. Thu Apr 25 17:44:22 2024 - [info] Thu Apr 25 17:44:22 2024 - [info] The latest binary log file/position on all slaves is mysql-bin.000013:197 Thu Apr 25 17:44:22 2024 - [info] Retrieved Gtid Set: 32049408-02a5-11ef-8b62-000c29f23176:1-126 Thu Apr 25 17:44:22 2024 - [info] Latest slaves (Slaves that received relay log files to the latest): Thu Apr 25 17:44:22 2024 - [info] 172.31.7.111(172.31.7.111:3306) Version=8.0.36 (oldest major version between slaves) log-bin:enabled Thu Apr 25 17:44:22 2024 - [info] GTID ON Thu Apr 25 17:44:22 2024 - [info] Replicating from 172.31.7.110(172.31.7.110:3306) Thu Apr 25 17:44:22 2024 - [info] Primary candidate for the new Master (candidate_master is set) Thu Apr 25 17:44:22 2024 - [info] 172.31.7.112(172.31.7.112:3306) Version=8.0.36 (oldest major version between slaves) log-bin:enabled Thu Apr 25 17:44:22 2024 - [info] GTID ON Thu Apr 25 17:44:22 2024 - [info] Replicating from 172.31.7.110(172.31.7.110:3306) Thu Apr 25 17:44:22 2024 - [info] The oldest binary log file/position on all slaves is mysql-bin.000013:197 Thu Apr 25 17:44:22 2024 - [info] Retrieved Gtid Set: 32049408-02a5-11ef-8b62-000c29f23176:1-126 Thu Apr 25 17:44:22 2024 - [info] Oldest slaves: Thu Apr 25 17:44:22 2024 - [info] 172.31.7.111(172.31.7.111:3306) Version=8.0.36 (oldest major version between slaves) log-bin:enabled Thu Apr 25 17:44:22 2024 - [info] GTID ON Thu Apr 25 17:44:22 2024 - [info] Replicating from 172.31.7.110(172.31.7.110:3306) Thu Apr 25 17:44:22 2024 - [info] Primary candidate for the new Master (candidate_master is set) Thu Apr 25 17:44:22 2024 - [info] 172.31.7.112(172.31.7.112:3306) Version=8.0.36 (oldest major version between slaves) log-bin:enabled Thu Apr 25 17:44:22 2024 - [info] GTID ON Thu Apr 25 17:44:22 2024 - [info] Replicating from 172.31.7.110(172.31.7.110:3306) Thu Apr 25 17:44:22 2024 - [info] Thu Apr 25 17:44:22 2024 - [info] * Phase 3.3: Determining New Master Phase.. Thu Apr 25 17:44:22 2024 - [info] Thu Apr 25 17:44:22 2024 - [info] Searching new master from slaves.. Thu Apr 25 17:44:22 2024 - [info] Candidate masters from the configuration file: Thu Apr 25 17:44:22 2024 - [info] 172.31.7.111(172.31.7.111:3306) Version=8.0.36 (oldest major version between slaves) log-bin:enabled Thu Apr 25 17:44:22 2024 - [info] GTID ON Thu Apr 25 17:44:22 2024 - [info] Replicating from 172.31.7.110(172.31.7.110:3306) Thu Apr 25 17:44:22 2024 - [info] Primary candidate for the new Master (candidate_master is set) Thu Apr 25 17:44:22 2024 - [info] Non-candidate masters: Thu Apr 25 17:44:22 2024 - [info] Searching from candidate_master slaves which have received the latest relay log events.. Thu Apr 25 17:44:22 2024 - [info] New master is 172.31.7.111(172.31.7.111:3306) Thu Apr 25 17:44:22 2024 - [info] Starting master failover.. Thu Apr 25 17:44:22 2024 - [info] From: 172.31.7.110(172.31.7.110:3306) (current master) +--172.31.7.111(172.31.7.111:3306) +--172.31.7.112(172.31.7.112:3306) To: 172.31.7.111(172.31.7.111:3306) (new master) +--172.31.7.112(172.31.7.112:3306) Thu Apr 25 17:44:22 2024 - [info] Thu Apr 25 17:44:22 2024 - [info] * Phase 3.3: New Master Recovery Phase.. Thu Apr 25 17:44:22 2024 - [info] Thu Apr 25 17:44:22 2024 - [info] Waiting all logs to be applied.. Thu Apr 25 17:44:22 2024 - [info] done. Thu Apr 25 17:44:22 2024 - [info] Getting new master's binlog name and position.. Thu Apr 25 17:44:22 2024 - [info] mysql-bin.000001:2520343 Thu Apr 25 17:44:22 2024 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.31.7.111', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx'; Thu Apr 25 17:44:22 2024 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000001, 2520343, 32049408-02a5-11ef-8b62-000c29f23176:1-126 Thu Apr 25 17:44:22 2024 - [info] Executing master IP activate script: Thu Apr 25 17:44:22 2024 - [info] /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=172.31.7.110 --orig_master_ip=172.31.7.110 --orig_master_port=3306 --new_master_host=172.31.7.111 --new_master_ip=172.31.7.111 --new_master_port=3306 --new_master_user='mha_monitor' --new_master_password=xxx Set read_only=0 on the new master. Enabling the VIP 172.31.7.166 on the new master: 172.31.7.111 Thu Apr 25 17:44:24 2024 - [info] OK. Thu Apr 25 17:44:24 2024 - [info] ** Finished master recovery successfully. Thu Apr 25 17:44:24 2024 - [info] * Phase 3: Master Recovery Phase completed. Thu Apr 25 17:44:24 2024 - [info] Thu Apr 25 17:44:24 2024 - [info] * Phase 4: Slaves Recovery Phase.. Thu Apr 25 17:44:24 2024 - [info] Thu Apr 25 17:44:24 2024 - [info] Thu Apr 25 17:44:24 2024 - [info] * Phase 4.1: Starting Slaves in parallel.. Thu Apr 25 17:44:24 2024 - [info] Thu Apr 25 17:44:24 2024 - [info] -- Slave recovery on host 172.31.7.112(172.31.7.112:3306) started, pid: 22336. Check tmp log /masterha/app1/172.31.7.112_3306_20240425174421.log if it takes time.. Thu Apr 25 17:44:26 2024 - [info] Thu Apr 25 17:44:26 2024 - [info] Log messages from 172.31.7.112 ... Thu Apr 25 17:44:26 2024 - [info] Thu Apr 25 17:44:24 2024 - [info] Resetting slave 172.31.7.112(172.31.7.112:3306) and starting replication from the new master 172.31.7.111(172.31.7.111:3306).. Thu Apr 25 17:44:24 2024 - [info] Executed CHANGE MASTER. Thu Apr 25 17:44:25 2024 - [info] Slave started. Thu Apr 25 17:44:25 2024 - [info] gtid_wait(32049408-02a5-11ef-8b62-000c29f23176:1-126) completed on 172.31.7.112(172.31.7.112:3306). Executed 0 events. Thu Apr 25 17:44:26 2024 - [info] End of log messages from 172.31.7.112. Thu Apr 25 17:44:26 2024 - [info] -- Slave on host 172.31.7.112(172.31.7.112:3306) started. Thu Apr 25 17:44:26 2024 - [info] All new slave servers recovered successfully. Thu Apr 25 17:44:26 2024 - [info] Thu Apr 25 17:44:26 2024 - [info] * Phase 5: New master cleanup phase.. Thu Apr 25 17:44:26 2024 - [info] Thu Apr 25 17:44:26 2024 - [info] Resetting slave info on the new master.. Thu Apr 25 17:44:26 2024 - [info] 172.31.7.111: Resetting slave info succeeded. Thu Apr 25 17:44:26 2024 - [info] Master failover to 172.31.7.111(172.31.7.111:3306) completed successfully. Thu Apr 25 17:44:26 2024 - [info] ----- Failover Report ----- app1: MySQL Master failover 172.31.7.110(172.31.7.110:3306) to 172.31.7.111(172.31.7.111:3306) succeeded Master 172.31.7.110(172.31.7.110:3306) is down! Check MHA Manager logs at monitor.hosts.com:/masterha/app1/manager.log for details. Started automated(non-interactive) failover. Invalidated master IP address on 172.31.7.110(172.31.7.110:3306) Selected 172.31.7.111(172.31.7.111:3306) as a new master. 172.31.7.111(172.31.7.111:3306): OK: Applying all logs succeeded. 172.31.7.111(172.31.7.111:3306): OK: Activated master IP address. 172.31.7.112(172.31.7.112:3306): OK: Slave started, replicating from 172.31.7.111(172.31.7.111:3306) 172.31.7.111(172.31.7.111:3306): Resetting slave info succeeded. Master failover to 172.31.7.111(172.31.7.111:3306) completed successfully. Thu Apr 25 17:44:26 2024 - [info] Sending mail.. ``` - 查看vip 切换 ```bash root@db02:~# ifconfig ens33:2 ens33:2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet 172.31.7.166 netmask 255.255.255.0 broadcast 172.31.7.255 ether 00:0c:29:73:08:a4 txqueuelen 1000 (Ethernet) ``` - 查看主从切换情况 ```sql # DB03查看: mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for source to send event Master_Host: 172.31.7.111 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000001 Read_Master_Log_Pos: 2520343 Relay_Log_File: relay-log.000002 Relay_Log_Pos: 420 Relay_Master_Log_File: mysql-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes ``` - 查看邮件 ![image-20240425180132265](https://img.sunrisenan.com/img/2024/04/25/180136052.png) ## 15 角色还原 前提条件: - 原主库要以slave身份加入集群 ```sql # 在db01数据库上 mysql> CHANGE MASTER TO MASTER_HOST='172.31.7.111',master_port=3306, MASTER_USER='repl', MASTER_PASSWORD='repl123456', master_auto_position=1; Query OK, 0 rows affected, 8 warnings (0.03 sec) mysql> start slave; Query OK, 0 rows affected, 1 warning (0.00 sec) mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for source to send event Master_Host: 172.31.7.111 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000001 Read_Master_Log_Pos: 2520343 Relay_Log_File: relay-log.000002 Relay_Log_Pos: 420 Relay_Master_Log_File: mysql-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Executed_Gtid_Set: 32049408-02a5-11ef-8b62-000c29f23176:1-126 Auto_Position: 1 Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version: Master_public_key_path: Get_master_public_key: 0 Network_Namespace: 1 row in set, 1 warning (0.00 sec) ``` **下面将新备库(原master节点)还原为主库,新主库(节点db02)还原为备库**。 在MHA manager节点执行如下命令,停止MHA监控: ```bash masterha_stop --conf=/etc/masterha/app1.cnf ``` 然后执行如下命令,进行角色还原: ```bash masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=172.31.7.110 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000 ``` 执行完成后,可以看到恢复到初始的1个master(节点db01)和两个slave(节点db02和节点db03)状态。 ```bash root@monitor:~# masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=172.31.7.110 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000 Thu Apr 25 19:02:45 2024 - [info] MHA::MasterRotate version 0.58. Thu Apr 25 19:02:45 2024 - [info] Starting online master switch.. Thu Apr 25 19:02:45 2024 - [info] Thu Apr 25 19:02:45 2024 - [info] * Phase 1: Configuration Check Phase.. Thu Apr 25 19:02:45 2024 - [info] Thu Apr 25 19:02:45 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Thu Apr 25 19:02:45 2024 - [info] Reading application default configuration from /etc/masterha/app1.cnf.. Thu Apr 25 19:02:45 2024 - [info] Reading server configuration from /etc/masterha/app1.cnf.. Thu Apr 25 19:02:46 2024 - [info] GTID failover mode = 1 Thu Apr 25 19:02:46 2024 - [info] Current Alive Master: 172.31.7.111(172.31.7.111:3306) Thu Apr 25 19:02:46 2024 - [info] Alive Slaves: Thu Apr 25 19:02:46 2024 - [info] 172.31.7.110(172.31.7.110:3306) Version=8.0.36 (oldest major version between slaves) log-bin:enabled Thu Apr 25 19:02:46 2024 - [info] GTID ON Thu Apr 25 19:02:46 2024 - [info] Replicating from 172.31.7.111(172.31.7.111:3306) Thu Apr 25 19:02:46 2024 - [info] 172.31.7.112(172.31.7.112:3306) Version=8.0.36 (oldest major version between slaves) log-bin:enabled Thu Apr 25 19:02:46 2024 - [info] GTID ON Thu Apr 25 19:02:46 2024 - [info] Replicating from 172.31.7.111(172.31.7.111:3306) It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 172.31.7.111(172.31.7.111:3306)? (YES/no): yes Thu Apr 25 19:02:52 2024 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time.. Thu Apr 25 19:02:52 2024 - [info] ok. Thu Apr 25 19:02:52 2024 - [info] Checking MHA is not monitoring or doing failover.. Thu Apr 25 19:02:52 2024 - [info] Checking replication health on 172.31.7.110.. Thu Apr 25 19:02:52 2024 - [info] ok. Thu Apr 25 19:02:52 2024 - [info] Checking replication health on 172.31.7.112.. Thu Apr 25 19:02:52 2024 - [info] ok. Thu Apr 25 19:02:52 2024 - [info] 172.31.7.110 can be new master. Thu Apr 25 19:02:52 2024 - [info] From: 172.31.7.111(172.31.7.111:3306) (current master) +--172.31.7.110(172.31.7.110:3306) +--172.31.7.112(172.31.7.112:3306) To: 172.31.7.110(172.31.7.110:3306) (new master) +--172.31.7.112(172.31.7.112:3306) +--172.31.7.111(172.31.7.111:3306) Starting master switch from 172.31.7.111(172.31.7.111:3306) to 172.31.7.110(172.31.7.110:3306)? (yes/NO): yes Thu Apr 25 19:03:11 2024 - [info] Checking whether 172.31.7.110(172.31.7.110:3306) is ok for the new master.. Thu Apr 25 19:03:11 2024 - [info] ok. Thu Apr 25 19:03:11 2024 - [info] 172.31.7.111(172.31.7.111:3306): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host. Thu Apr 25 19:03:11 2024 - [info] 172.31.7.111(172.31.7.111:3306): Resetting slave pointing to the dummy host. Thu Apr 25 19:03:11 2024 - [info] ** Phase 1: Configuration Check Phase completed. Thu Apr 25 19:03:11 2024 - [info] Thu Apr 25 19:03:11 2024 - [info] * Phase 2: Rejecting updates Phase.. Thu Apr 25 19:03:11 2024 - [info] Thu Apr 25 19:03:11 2024 - [info] Executing master ip online change script to disable write on the current master: Thu Apr 25 19:03:11 2024 - [info] /usr/local/bin/master_ip_online_change --command=stop --orig_master_host=172.31.7.111 --orig_master_ip=172.31.7.111 --orig_master_port=3306 --orig_master_user='mha_monitor' --new_master_host=172.31.7.110 --new_master_ip=172.31.7.110 --new_master_port=3306 --new_master_user='mha_monitor' --orig_master_ssh_user=root --new_master_ssh_user=root --orig_master_is_new_slave --orig_master_password=xxx --new_master_password=xxx Thu Apr 25 19:03:12 2024 210750 Set read_only on the new master.. ok. Thu Apr 25 19:03:12 2024 219255 Waiting all running 3 threads are disconnected.. (max 1500 milliseconds) {'db' => undef,'Host' => 'localhost','Time' => 32494,'State' => 'Waiting on empty queue','Id' => 5,'Command' => 'Daemon','User' => 'event_scheduler','Info' => undef} {'State' => 'Source has sent all binlog to replica; waiting for more updates','db' => undef,'Host' => '172.31.7.112:35742','Time' => 4728,'User' => 'repl','Info' => undef,'Id' => 97,'Command' => 'Binlog Dump GTID'} {'Id' => 108,'Command' => 'Binlog Dump GTID','Info' => undef,'User' => 'repl','Host' => '172.31.7.110:38192','Time' => 58,'db' => undef,'State' => 'Source has sent all binlog to replica; waiting for more updates'} Thu Apr 25 19:03:12 2024 721301 Waiting all running 3 threads are disconnected.. (max 1000 milliseconds) {'User' => 'event_scheduler','Info' => undef,'Id' => 5,'Command' => 'Daemon','State' => 'Waiting on empty queue','db' => undef,'Host' => 'localhost','Time' => 32494} {'User' => 'repl','Info' => undef,'Command' => 'Binlog Dump GTID','Id' => 97,'State' => 'Source has sent all binlog to replica; waiting for more updates','db' => undef,'Time' => 4728,'Host' => '172.31.7.112:35742'} {'Host' => '172.31.7.110:38192','Time' => 58,'db' => undef,'State' => 'Source has sent all binlog to replica; waiting for more updates','Id' => 108,'Command' => 'Binlog Dump GTID','Info' => undef,'User' => 'repl'} Thu Apr 25 19:03:13 2024 223617 Waiting all running 3 threads are disconnected.. (max 500 milliseconds) {'Info' => undef,'User' => 'event_scheduler','Command' => 'Daemon','Id' => 5,'State' => 'Waiting on empty queue','Time' => 32495,'Host' => 'localhost','db' => undef} {'db' => undef,'Time' => 4729,'Host' => '172.31.7.112:35742','State' => 'Source has sent all binlog to replica; waiting for more updates','Command' => 'Binlog Dump GTID','Id' => 97,'User' => 'repl','Info' => undef} {'User' => 'repl','Info' => undef,'Command' => 'Binlog Dump GTID','Id' => 108,'State' => 'Source has sent all binlog to replica; waiting for more updates','db' => undef,'Time' => 59,'Host' => '172.31.7.110:38192'} Thu Apr 25 19:03:13 2024 725993 Set read_only=1 on the orig master.. ok. Thu Apr 25 19:03:13 2024 730496 Waiting all running 3 queries are disconnected.. (max 500 milliseconds) {'Command' => 'Daemon','Id' => 5,'User' => 'event_scheduler','Info' => undef,'db' => undef,'Time' => 32495,'Host' => 'localhost','State' => 'Waiting on empty queue'} {'Info' => undef,'User' => 'repl','Command' => 'Binlog Dump GTID','Id' => 97,'State' => 'Source has sent all binlog to replica; waiting for more updates','Time' => 4729,'Host' => '172.31.7.112:35742','db' => undef} {'Command' => 'Binlog Dump GTID','Id' => 108,'Info' => undef,'User' => 'repl','Time' => 59,'Host' => '172.31.7.110:38192','db' => undef,'State' => 'Source has sent all binlog to replica; waiting for more updates'} Thu Apr 25 19:03:14 2024 226671 Killing all application threads.. Thu Apr 25 19:03:14 2024 230799 done. Disabling the VIP an old master: 172.31.7.111 Thu Apr 25 19:03:15 2024 - [info] ok. Thu Apr 25 19:03:15 2024 - [info] Locking all tables on the orig master to reject updates from everybody (including root): Thu Apr 25 19:03:15 2024 - [info] Executing FLUSH TABLES WITH READ LOCK.. Thu Apr 25 19:03:15 2024 - [info] ok. Thu Apr 25 19:03:15 2024 - [info] Orig master binlog:pos is mysql-bin.000001:2520343. Thu Apr 25 19:03:15 2024 - [info] Waiting to execute all relay logs on 172.31.7.110(172.31.7.110:3306).. Thu Apr 25 19:03:15 2024 - [info] master_pos_wait(mysql-bin.000001:2520343) completed on 172.31.7.110(172.31.7.110:3306). Executed 0 events. Thu Apr 25 19:03:15 2024 - [info] done. Thu Apr 25 19:03:15 2024 - [info] Getting new master's binlog name and position.. Thu Apr 25 19:03:15 2024 - [info] mysql-bin.000014:197 Thu Apr 25 19:03:15 2024 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.31.7.110', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx'; Thu Apr 25 19:03:15 2024 - [info] Executing master ip online change script to allow write on the new master: Thu Apr 25 19:03:15 2024 - [info] /usr/local/bin/master_ip_online_change --command=start --orig_master_host=172.31.7.111 --orig_master_ip=172.31.7.111 --orig_master_port=3306 --orig_master_user='mha_monitor' --new_master_host=172.31.7.110 --new_master_ip=172.31.7.110 --new_master_port=3306 --new_master_user='mha_monitor' --orig_master_ssh_user=root --new_master_ssh_user=root --orig_master_is_new_slave --orig_master_password=xxx --new_master_password=xxx Thu Apr 25 19:03:15 2024 392468 Set read_only=0 on the new master. Enabling the VIP 172.31.7.166 on the new master: 172.31.7.110 Thu Apr 25 19:03:17 2024 - [info] ok. Thu Apr 25 19:03:17 2024 - [info] Thu Apr 25 19:03:17 2024 - [info] * Switching slaves in parallel.. Thu Apr 25 19:03:17 2024 - [info] Thu Apr 25 19:03:17 2024 - [info] -- Slave switch on host 172.31.7.112(172.31.7.112:3306) started, pid: 22573 Thu Apr 25 19:03:17 2024 - [info] Thu Apr 25 19:03:19 2024 - [info] Log messages from 172.31.7.112 ... Thu Apr 25 19:03:19 2024 - [info] Thu Apr 25 19:03:17 2024 - [info] Waiting to execute all relay logs on 172.31.7.112(172.31.7.112:3306).. Thu Apr 25 19:03:17 2024 - [info] master_pos_wait(mysql-bin.000001:2520343) completed on 172.31.7.112(172.31.7.112:3306). Executed 0 events. Thu Apr 25 19:03:17 2024 - [info] done. Thu Apr 25 19:03:17 2024 - [info] Resetting slave 172.31.7.112(172.31.7.112:3306) and starting replication from the new master 172.31.7.110(172.31.7.110:3306).. Thu Apr 25 19:03:17 2024 - [info] Executed CHANGE MASTER. Thu Apr 25 19:03:18 2024 - [info] Slave started. Thu Apr 25 19:03:19 2024 - [info] End of log messages from 172.31.7.112 ... Thu Apr 25 19:03:19 2024 - [info] Thu Apr 25 19:03:19 2024 - [info] -- Slave switch on host 172.31.7.112(172.31.7.112:3306) succeeded. Thu Apr 25 19:03:19 2024 - [info] Unlocking all tables on the orig master: Thu Apr 25 19:03:19 2024 - [info] Executing UNLOCK TABLES.. Thu Apr 25 19:03:19 2024 - [info] ok. Thu Apr 25 19:03:19 2024 - [info] Starting orig master as a new slave.. Thu Apr 25 19:03:19 2024 - [info] Resetting slave 172.31.7.111(172.31.7.111:3306) and starting replication from the new master 172.31.7.110(172.31.7.110:3306).. Thu Apr 25 19:03:19 2024 - [info] Executed CHANGE MASTER. Thu Apr 25 19:03:20 2024 - [info] Slave started. Thu Apr 25 19:03:20 2024 - [info] All new slave servers switched successfully. Thu Apr 25 19:03:20 2024 - [info] Thu Apr 25 19:03:20 2024 - [info] * Phase 5: New master cleanup phase.. Thu Apr 25 19:03:20 2024 - [info] Thu Apr 25 19:03:20 2024 - [info] 172.31.7.110: Resetting slave info succeeded. Thu Apr 25 19:03:20 2024 - [info] Switching master to 172.31.7.110(172.31.7.110:3306) completed successfully. ``` ## 16 结论性总结 MHA切换一次就自动退出了。 MHA位置点和GTID的流程不一样。 ### 16.1 MHA 的切换流程(位置点) ⾸先,MHA Manager检测到master不可⽤,则会通过另外两个slave检查master的可⽤性。⼀共检测4次。 同时判断MHA Manager到master的ssh可⽤性。 ssh可⽤性的判断结果影响后后续切换中的“Phase 3.2: Saving Dead Master's Binlog Phase” Phase 1: Configuration Check Phase.. 检查了MHA的配置信息,并再次判断了master的可⽤性。 Phase 2: Dead Master Shutdown Phase.. 宕机的master处理阶段,包括摘除VIP,执⾏shutdown_script中定义的脚本。 Phase 3: Master Recovery Phase.. Phase 3.1: Getting Latest Slaves Phase.. 判断哪个slave拥有最新的relay log(通过⽐较show slave status中的Master_Log_File, Read_Master_Log_Pos位 置), 哪个slave拥有最旧的relay log Phase 3.2: Saving Dead Master's Binlog Phase.. 根据上⾯得到的slave的最新位置信息,将差异的⼆进制⽇志保存到MHA Manager的指定⽬录下。 如果在第⼀步骤中,判断了MHA Manager到master的ssh不可⽤,则会跳过这个阶段。 Phase 3.3: Determining New Master Phase.. 选择新的master Phase 3.4: New Master Diff Log Generation Phase.. 将差异的relay log和master差异⽇志scp到新的master上。 Phase 3.4: Master Log Apply Phase. 1. 通过apply_diff_relay_logs应⽤差异的relay log,及差异的binlog。 2. 2. 应⽤完毕后,得到新的master binlog的⽂件和位置,其它slave可根据该⽂件和位置来建⽴主从复制关系。 3. 执⾏master_ip_failover脚本,执⾏如下操作 1. 将新主的read_only设置为0 2. 启动vip Phase 4: Slaves Recovery Phase Phase Phase 4.1: Starting Parallel Slave Diff Log Generation Phase 为slave获取差异的relay log 因为172.31.7.112拥有最新的relay log,所以也没必要获取差异的relay log Phase 4.2: Starting Parallel Slave Log Apply Phase 开始slave的恢复阶段 1. 应⽤差异⽇志。 2. 2. 清除原来的复制关系,并再次执⾏change master命令建⽴新的主从同步。 如果有多个slave,则该恢复过程是并⾏的。 Phase 5: New master cleanup phase 1. 在新主上执⾏ reset slave all 2. 从MHA的配置⽂件中剔除server1的配置信息 在默认情况下,如果 MHA 检测到两次切换的时间⼩于8⼩时,则不允许进⾏Failover,这个时候,需指定 -- ignore_last_failover 参数 MHA在线切换的原理 ### 16.2 MHA在线切换的原理 1、检查当前的配置信息及主从服务器的信息 包括读取MHA的配置⽂件/etc/masterha/app1.cnf及检查当前slave的健康状态 2、阻⽌对当前master的更新 主要通过如下步骤: ```bash 1.等待1.5s($time_until_kill_threads*100ms),等待当前连接断开。 2.执⾏ read_only=1,阻⽌新的DML操作 3.等待0.5s,等待当前DML操作完成。 4.kill掉所有连接。 5.FLUSH NO_WRITE_TO_BINLOG TABLES 6.FLUSH TABLES WITH READ LOCK ``` 3、等待新master执⾏完所有的relay log 4、将新master的read_only设置为off,并添加VIP 5、slave切换到新master上。 ```bash 1. 等待slave应⽤完原主从复制产⽣的relay log,然后执⾏change master操作切换到新master上。 2. 释放原master上加的锁。 3. 因masterha_master_switch命令⾏中带有--orig_master_is_new_slave参数,故原master也切换为新 master的从。 ```` 6、清理新master的相关信息。 主要是执⾏了reset slave all操作,清除之前的复制信息。 ### 16.3 MHA在线切换需满⾜的条件 ```bash # /usr/local/bin/masterha_master_switch --conf=/etc/masterha/app1.cnf -- master_state=alive --new_master_host=172.31.7.110 --new_master_port=3306 -- orig_master_is_new_slave --running_updates_limit=10000 --seconds_behind_master=100 ``` MHA在执⾏在线切换之前,会判断当前的主从复制信息,只有满⾜了以下条件,才能执⾏切换动作: 1. 所有SLAVE的IO线程和SQL线程都在运⾏。 2. 所有slave的Seconds_Behind_Master⼩于seconds_behind_master的值,该参数如果没有显示指定的话, 则默认为30s 3. 在master上,通过show processlist输出,没有⼀个DML操作的时间⼤于running_updates_limit的值。 ## 17 MHA常见问题 ### 17.1MHA 如何适配 MySQL 8.0 - MySQL 5.6 ```bash mysql> show full processlist; +----------+-----------------+------------------+-------------+---------+---------+-----------------------------------------------------------------------------+-----------------------+ | Id | User | Host | db | Command | Time | State | Info | +----------+-----------------+------------------+-------------+---------+---------+-----------------------------------------------------------------------------+-----------------------+ | 3 | event_scheduler | localhost | NULL | Daemon | 2061231 | Waiting for next activation | NULL | | 26771886 | system user | | NULL | Connect | 1415901 | Waiting for master to send event | NULL | | 26771887 | system user | | NULL | Connect | 1 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL | | 28198869 | reportuser | 10.0.0.111:34054 | ebsig_mcake | Sleep | 563 | | NULL | | 28202859 | reportuser | 10.0.0.111:36826 | ebsig_mcake | Sleep | 532 | | NULL | | 28202860 | reportuser | 10.0.0.111:36828 | ebsig_mcake | Sleep | 563 | | NULL | | 28202861 | reportuser | 10.0.0.111:36830 | ebsig_mcake | Sleep | 563 | | NULL | | 28202862 | reportuser | 10.0.0.111:36832 | ebsig_mcake | Sleep | 560 | | NULL | | 28202863 | reportuser | 10.0.0.111:36834 | ebsig_mcake | Sleep | 563 | | NULL | | 28203415 | root | localhost | NULL | Query | 0 | init | show full processlist | +----------+-----------------+------------------+-------------+---------+---------+-----------------------------------------------------------------------------+-----------------------+ 10 rows in set (0.00 sec) ``` - MySQL 8.0.36 ```bash mysql> show full processlist; +----+-----------------+-----------------+------+---------+-------+----------------------------------------------------------+-----------------------+ | Id | User | Host | db | Command | Time | State | Info | +----+-----------------+-----------------+------+---------+-------+----------------------------------------------------------+-----------------------+ | 5 | system user | | NULL | Query | 38522 | Replica has read all relay log; waiting for more updates | NULL | | 6 | system user | connecting host | NULL | Connect | 38523 | Waiting for source to send event | NULL | | 7 | event_scheduler | localhost | NULL | Daemon | 38523 | Waiting on empty queue | NULL | | 11 | root | localhost | NULL | Query | 0 | init | show full processlist | +----+-----------------+-----------------+------+---------+-------+----------------------------------------------------------+-----------------------+ 4 rows in set, 1 warning (0.00 sec) ``` - 出于上面不同的原因 ```bash root@monitor:~# vim /usr/share/perl5/MHA/DBHelper.pm +707 707 if ( $state =~ m/^Has read all relay log/ 708 || $state =~ m/^Slave has read all relay log/ ) 替换为 707 if ( $state =~ m/^Has read all relay log/ || $state =~ m/^Replica has readall relay log/ 708 || $state =~ m/^Slave has read all relay log/ ) ``` ### 17.2 GTID 的注意事项 ```bash [server default] client_bindir=/usr/local/mysql/bin manager_log=/masterha/app2/manager.log manager_workdir=/masterha/app2 master_binlog_dir=/data/mysql/3307/data master_ip_failover_script=/usr/local/bin/master_ip_failover master_ip_online_change_script=/usr/local/bin/master_ip_online_change password=monitor123 ping_interval=1 remote_workdir=/tmp repl_password=repl123456 repl_user=repl report_script=/usr/local/bin/send_report secondary_check_script=/usr/bin/masterha_secondary_check -s 172.31.7.111 --user=root --master_host=172.31.7.110 --master_ip=172.31.7.110 --master_port=3306 shutdown_script="" ssh_user=root user=mha_monitor [server1] hostname=172.31.7.110 port=3307 [server2] candidate_master=1 check_repl_delay=0 hostname=172.31.7.111 port=3307 [binlog1] no_master=1 hostname=172.31.7.110 master_binlog_dir=/data/mysql/3307/data ``` ### 17.3 脑裂 - 半同步 - shutdown_script ### 17.4 定期检测集群状态的健康 ```bash # masterha_check_repl --conf=/etc/masterha/app1.cnf ``` ### 17.5 定期检测主从数据⼀致性 pt-table-checksum [技术分享 | 数据校验工具 pt-table-checksum - 知乎 (zhihu.com)](https://zhuanlan.zhihu.com/p/340411904) ### 17.6 MHA 的单点问题 使⽤ daemontools、systemd 之类的进程管理⼯具 ### 17.7 跳过主从延迟检测 ```bash # 配置⽂件中设置 check_repl_delay =0 ``` ### 17.8 跳过过滤规则检测 ```bash # 配置⽂件中设置 check_repl_filter = 0 ``` ------------ ------------ ------------ 测试: ```bash mysql> create user sunrise@'%' identified by '123456'; Query OK, 0 rows affected (0.05 sec) mysql> grant all privileges on *.* to 'sunrise'@'%' WITH GRANT OPTION; Query OK, 0 rows affected (0.00 sec) mysql> create database school; Query OK, 1 row affected (0.00 sec) ``` 造数据 ```python import random import string import pymysql import time # MySQL数据库配置 # db_config = { # "host": "172.31.7.110", # "port": 3306, # "user": "sunrise", # "password": "3edfud", # "database": "school", # } db_config = { "host": "172.31.7.166", "port": 3306, "user": "sunrise", "password": "123456", "database": "school", } # 生成随机字符串 def generate_random_string(length): letters = string.ascii_letters return ''.join(random.choice(letters) for i in range(length)) # 生成随机数据 def generate_random_data(record_count): students = [ { "学号": f"NSD{random.randint(100000, 999999)}", "姓名": generate_random_string(5), "性别": random.choice(["男", "女"]), "手机号": "1" + ''.join(random.choice(string.digits) for _ in range(10)), # 生成11位数字字符串 "通信地址": generate_random_string(10), } for _ in range(record_count) ] return students # 将随机数据插入MySQL数据库中的新表 def create_and_insert_random_data(data): try: # 连接到MySQL数据库 conn = pymysql.connect(**db_config) cursor = conn.cursor() # 为每次迭代创建一个新的表 table_name = f"student_{int(time.time())}" create_table_query = f""" CREATE TABLE {table_name} ( 学号 VARCHAR(20), 姓名 VARCHAR(20), 性别 VARCHAR(10), 手机号 VARCHAR(20), 通信地址 VARCHAR(50) ) """ cursor.execute(create_table_query) # 插入数据到新表中 insert_query = f"INSERT INTO {table_name} (学号, 姓名, 性别, 手机号, 通信地址) VALUES (%s, %s, %s, %s, %s)" for student in data: values = ( student["学号"], student["姓名"], student["性别"], student["手机号"], student["通信地址"], ) cursor.execute(insert_query, values) # 提交更改并关闭连接 conn.commit() cursor.close() conn.close() print(f"成功插入 {len(data)} 条记录到表 {table_name}。") except pymysql.Error as error: print(f"错误:{error}") if conn and conn.open: cursor.close() conn.close() if __name__ == "__main__": try: while True: # 生成随机数据(这里插入1000条记录,可以根据需要调整) random_data = generate_random_data(10) # 将随机数据插入新表中 create_and_insert_random_data(random_data) # 等待一段时间,例如5秒 time.sleep(5) except KeyboardInterrupt: print("手动停止数据插入。") ``` ### 参考 - https://www.jianshu.com/p/41f0774226df - https://www.cnblogs.com/ivictor/p/5686275.html - https://mp.weixin.qq.com/s/WkTk7-nDUNuYmkf8w5gW4Q
李延召
2024年4月28日 09:06
转发文档
收藏文档
上一篇
下一篇
手机扫码
复制链接
手机扫一扫转发分享
复制链接
Markdown文件
分享
链接
类型
密码
更新密码