Building a Highly Available Load-Balancing Cluster on Linux (Ansible + LVS + Keepalived)
Installing Ansible is straightforward: on CentOS 6, first install the epel-release repository, then install Ansible with yum.
$ sudo yum install epel-release
$ sudo yum install ansible
We will log in to and manage the LVS load-balancing cluster with SSH keys, so first create an ssh-key.
$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Passphrases do not match.  Try again.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
3b:91:ee:8d:2c:fb:f8:d7:a2:1e:e4:d5:fa:68:bf:8a root@centos6.8-AutoOPS
The key's randomart image is:
+--[ RSA 2048]----+
|                 |
|                 |
|                 |
|     .  .        |
|      S .  .     |
|       + +  .    |
|        *  ..    |
|       .+ *+o.   |
|       o*E++++.  |
+-----------------+
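When scripting this step, ssh-keygen can also run non-interactively; a minimal sketch, assuming a throwaway output path (/tmp/demo_key is only an illustration):

```shell
# Keep the example re-runnable: remove any previous demo key first
rm -f /tmp/demo_key /tmp/demo_key.pub

# Generate a 2048-bit RSA key pair without prompts:
# -N '' sets an empty passphrase, -f the output path, -q suppresses chatter.
ssh-keygen -t rsa -b 2048 -N '' -f /tmp/demo_key -q

# The private and public key files now exist side by side
ls /tmp/demo_key /tmp/demo_key.pub
```

An empty passphrase avoids the ssh-agent step described below at the cost of an unprotected private key; pick whichever trade-off fits your environment.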
After setting up key-based SSH logins to the target servers, you may still want to log in from server1 to server2 when no authorized_keys has been configured between them (or it was deliberately left out for security). In that case, ssh-agent forwarding on the jump host gives you passwordless logins across multiple servers.
# Start ssh-agent in the current shell
$ eval $(ssh-agent)
Agent pid 1473
# (alternatively, spawn a subshell that inherits the agent)
$ ssh-agent bash

# Add the private key; the path can be omitted when the default ssh-key path is used
$ ssh-add ~/.ssh/id_rsa

# Enable ssh-agent forwarding in the ssh client config on every server
# System-wide:
$ echo "ForwardAgent yes" >> /etc/ssh/ssh_config
# Per-user:
$ touch ~/.ssh/config
$ vim ~/.ssh/config
Host *
    ForwardAgent yes
Note: if your SSH key has a passphrase, it must be added to ssh-agent; otherwise batch distribution will prompt for the key passphrase and the task will fail.
Add the IP addresses of all servers in the cluster to the Ansible hosts file.
$ cat >> /etc/ansible/hosts <<EOF
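The body of this heredoc is missing from the original. Based on the host groups used throughout the rest of the article, it would look roughly like this (a sketch, not the author's exact file):

```ini
[lvs-dr]
10.1.1.11
10.1.1.12

[lvs-rs]
10.1.1.13
10.1.1.14
```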
On Linux, to copy your SSH public key to a target server you can use the ssh-copy-id command, or upload the authorized_keys file directly with scp.
In Ansible, the public key can be distributed with a file-transfer module (copy, file), pushing the authorized_keys file to the target servers.
Before distributing the SSH public key, configure the SSH login credentials. If all cluster servers share the same password, add -k to the command to be prompted for it.
$ ansible lvs-rs -m authorized_key -a "user=root key='{{lookup('file','/root/.ssh/id_rsa.pub')}}'" -k
If the passwords differ, define them per host in the inventory (/etc/ansible/hosts). On the first SSH/Ansible connection to a target server you also need to disable host_key_checking (/etc/ansible/ansible.cfg), or the run will fail; once the keys have been distributed, re-enable it and remove the stored connection credentials.
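Disabling host key checking is a one-line change in ansible.cfg; a minimal fragment:

```ini
# /etc/ansible/ansible.cfg
[defaults]
host_key_checking = False
```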
The connection details in the inventory (/etc/ansible/hosts) look like this:
[lvs-dr]
10.1.1.11 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_pass=rootpass1
10.1.1.12 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_pass=rootpass2

[lvs-rs]
10.1.1.13 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_pass=rootpass3
10.1.1.14 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_pass=rootpass4
If host_key_checking (/etc/ansible/ansible.cfg) has not been disabled, the error looks like this:
$ ansible lvs-dr -m authorized_key -a "user=root key='{{lookup('file','/root/.ssh/id_rsa.pub')}}'"
10.1.1.14 | FAILED! => {
    "failed": true,
    "msg": "Using a SSH password instead of a key is not possible because Host Key checking is enabled and sshpass does not support this.  Please add this host's fingerprint to your known_hosts file to manage this host."
}
Distribute the public key with the authorized_key module. In the command below, lvs-dr is the server group, -m selects the module, and lookup reads the contents of the given file.
$ ansible lvs-dr -m authorized_key -a "user=root key='{{lookup('file','/root/.ssh/id_rsa.pub')}}'"
10.1.1.11 | SUCCESS => {
    "changed": true,
    "exclusive": false,
    "key": "ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAW5GGfs2x+Vibizoiq/WNrDYxyyBISgWTisijGJijQA/wWsxj+hoh5fEw== root@centos6.8-AutoOPS",
    "key_options": null,
    "keyfile": "/root/.ssh/authorized_keys",
    "manage_dir": true,
    "path": null,
    "state": "present",
    "unique": false,
    "user": "root",
    "validate_certs": true
}
10.1.1.14 | SUCCESS => {
    "changed": true,
    "exclusive": false,
    "key": "ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA+Vibizoiq/WNrDYxyyBISgWTisijGJijQA/wWsxj+hoh5fEw== root@centos6.8-AutoOPS",
    "key_options": null,
    "keyfile": "/root/.ssh/authorized_keys",
    "manage_dir": true,
    "path": null,
    "state": "present",
    "unique": false,
    "user": "root",
    "validate_certs": true
}
Once the SSH public key has been distributed, re-enable host_key_checking (/etc/ansible/ansible.cfg) and remove the connection credentials from the inventory in /etc/ansible/hosts.
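Stale SSH connection info can be cleared per host with ssh-keygen -R; a small sketch against a throwaway known_hosts file (the /tmp paths are illustrative):

```shell
# Build a throwaway known_hosts entry from a freshly generated key
rm -f /tmp/demo_hostkey /tmp/demo_hostkey.pub /tmp/demo_known_hosts*
ssh-keygen -t rsa -N '' -f /tmp/demo_hostkey -q
printf '10.1.1.11 %s\n' "$(cut -d' ' -f1-2 /tmp/demo_hostkey.pub)" > /tmp/demo_known_hosts

# Remove every entry for that host from the given known_hosts file
ssh-keygen -R 10.1.1.11 -f /tmp/demo_known_hosts
```

On the control machine the default file is ~/.ssh/known_hosts, so `ssh-keygen -R 10.1.1.11` alone is usually enough.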
In LVS DR mode, each Real Server must bind the LVS DR virtual IP on lo and suppress ARP responses for it; only then can it receive and answer user requests. The script below does this:
$ vim /data/ansible/lvs-group/lvs_dr_realserver.sh
#!/bin/bash
#
# Date: 2017/04/09
# Author: Eric.Wang
#
LB_VIP=10.1.1.9
. /etc/rc.d/init.d/functions
case "$1" in
start)
    ifconfig lo:0 $LB_VIP netmask 255.255.255.255 broadcast $LB_VIP up
    /sbin/route add -host $LB_VIP dev lo:0
    echo "1" > /proc/sys/net/ipv4/conf/lo/arp_ignore
    echo "2" > /proc/sys/net/ipv4/conf/lo/arp_announce
    echo "1" > /proc/sys/net/ipv4/conf/all/arp_ignore
    echo "2" > /proc/sys/net/ipv4/conf/all/arp_announce
    sysctl -p > /dev/null 2>&1
    echo "RealServer Start OK."
    ;;
stop)
    ifconfig lo:0 down
    /sbin/route del $LB_VIP > /dev/null 2>&1
    echo "0" > /proc/sys/net/ipv4/conf/lo/arp_ignore
    echo "0" > /proc/sys/net/ipv4/conf/lo/arp_announce
    echo "0" > /proc/sys/net/ipv4/conf/all/arp_ignore
    echo "0" > /proc/sys/net/ipv4/conf/all/arp_announce
    sysctl -p > /dev/null 2>&1
    echo "RealServer Stopped."
    ;;
*)
    echo "Usage: $0 {start|stop}"
    exit 1
esac
exit 0
Distribute this script to all Real Servers and run it.
$ ansible lvs-rs -m copy -a "src=/data/ansible/lvs-group/lvs_dr_realserver.sh dest=/opt/lvs_dr_realserver.sh owner=root group=root mode=0750"
Start the LVS Real Server service:
$ ansible lvs-rs -m command -a '/opt/lvs_dr_realserver.sh start'
10.1.1.13 | SUCCESS | rc=0 >>
RealServer Start OK.

10.1.1.12 | SUCCESS | rc=0 >>
RealServer Start OK.
Install the Nginx web server on the Real Servers and start the service:
$ ansible lvs-rs -m yum -a 'name=nginx state=installed'
$ ansible lvs-rs -m command -a '/etc/init.d/nginx restart'
Keepalived extends LVS: it manages the LVS Director Server automatically and, through Real Server health checks, detects failed RS nodes and removes them from the load-balancing cluster without manual intervention. All of this is driven by keepalived.conf, and the Real Server health checks support HTTP, SSL, TCP, and other protocols.
On a Director Server running Keepalived, LVS itself needs no special configuration; the cluster mode, virtual IP, and Real Servers can all be managed through Keepalived.
Install Keepalived and ipvsadm on the LVS Director Servers with Ansible's yum module.
$ ansible lvs-dr -m yum -a 'name=keepalived,ipvsadm state=installed'
Edit the Keepalived configuration file to define the cluster mode, virtual IP, Real Servers, cluster service, and health checks.
$ cat > /data/ansible/lvs-group/keepalived.conf <<EOF
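The body of this heredoc is missing from the original. As a hedged reconstruction, a minimal keepalived.conf for this topology (VIP 10.1.1.9, DR mode, the wlc scheduler, and the Real Servers visible in the later ipvsadm output) might look roughly like this; virtual_router_id, advert_int, auth_pass, and the TCP_CHECK timeouts are illustrative values, not the author's:

```
! Configuration File for keepalived

vrrp_instance VI_1 {
    state MASTER            # BACKUP on the standby Director
    interface eth0
    virtual_router_id 51    # illustrative; must match on both Directors
    priority 100            # lower value (e.g. 99) on the BACKUP
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111      # illustrative shared secret
    }
    virtual_ipaddress {
        10.1.1.9
    }
}

virtual_server 10.1.1.9 80 {
    delay_loop 6            # health-check interval in seconds
    lb_algo wlc             # scheduler seen in the ipvsadm output
    lb_kind DR              # LVS DR mode
    protocol TCP

    real_server 10.1.1.12 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            connect_port 80
        }
    }
    real_server 10.1.1.13 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            connect_port 80
        }
    }
}
```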
Distribute the Keepalived configuration file to the LVS Director Servers. Before distributing files, disable SELinux, otherwise operations on certain directories may be denied.
$ ansible lvs-dr -m copy -a 'src=/data/ansible/lvs-group/keepalived.conf dest=/etc/keepalived/keepalived.conf owner=root group=root backup=yes'
On the Keepalived backup server, edit keepalived.conf and change state and priority so they do not clash with the Keepalived Master.
$ ansible 10.1.1.14 -m command -a 'sed -i "s/state\ MASTER/state\ BACKUP/g" /etc/keepalived/keepalived.conf'
10.1.1.14 | SUCCESS | rc=0 >>

$ ansible 10.1.1.14 -m command -a 'sed -i "s/priority\ 100/priority\ 99/g" /etc/keepalived/keepalived.conf'
10.1.1.14 | SUCCESS | rc=0 >>
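The same sed substitutions can be checked locally before pushing them through Ansible; a small sketch against a throwaway copy (the /tmp path is illustrative):

```shell
# Write a two-line stand-in for the relevant keepalived.conf settings
printf 'state MASTER\npriority 100\n' > /tmp/demo_keepalived.conf

# Apply the same substitutions the Ansible tasks run on the backup node
sed -i 's/state MASTER/state BACKUP/g' /tmp/demo_keepalived.conf
sed -i 's/priority 100/priority 99/g' /tmp/demo_keepalived.conf

cat /tmp/demo_keepalived.conf
```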
Stop nginx on Real Server 10.1.1.13, then check the LVS cluster state:
$ ansible lvs-dr -m command -a '/sbin/ipvsadm -L -n'
10.1.1.11 | SUCCESS | rc=0 >>
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.1.1.9:80 wlc
  -> 10.1.1.12:80                 Route   1      1          0

10.1.1.14 | SUCCESS | rc=0 >>
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.1.1.9:80 wlc
  -> 10.1.1.12:80                 Route   1      0          0
Stop the Keepalived Master; once the Backup detects that the Master is down, it takes over the VIP automatically.
$ ansible 10.1.1.11 -m command -a '/etc/init.d/keepalived stop'
10.1.1.11 | SUCCESS | rc=0 >>
Stopping keepalived: [  OK  ]

$ ansible 10.1.1.11 -m command -a '/etc/init.d/keepalived status'
10.1.1.11 | FAILED | rc=3 >>
keepalived is stopped

$ ansible lvs-dr -m command -a '/sbin/ipvsadm -L -n'
10.1.1.11 | SUCCESS | rc=0 >>
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn

10.1.1.14 | SUCCESS | rc=0 >>
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.1.1.9:80 wlc
  -> 10.1.1.12:80                 Route   1      0          0
  -> 10.1.1.13:80                 Route   1      1          0

$ ansible lvs-dr -m command -a 'ip addr show dev eth0'
10.1.1.11 | SUCCESS | rc=0 >>
2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:15:5d:00:3c:06 brd ff:ff:ff:ff:ff:ff
    inet 10.1.1.11/24 brd 10.1.1.255 scope global eth0
    inet6 fe80::215:5dff:fe00:3c06/64 scope link
       valid_lft forever preferred_lft forever

10.1.1.14 | SUCCESS | rc=0 >>
2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:15:5d:00:3c:15 brd ff:ff:ff:ff:ff:ff
    inet 10.1.1.14/24 brd 10.1.1.255 scope global eth0
    inet 10.1.1.9/32 scope global eth0
    inet6 fe80::215:5dff:fe00:3c15/64 scope link
       valid_lft forever preferred_lft forever
When the original Keepalived Master comes back online, it reclaims the VIP.
$ ansible 10.1.1.11 -m command -a '/etc/init.d/keepalived start'
10.1.1.11 | SUCCESS | rc=0 >>
Starting keepalived: [  OK  ]

$ ansible lvs-dr -m command -a 'ip addr show dev eth0'
10.1.1.11 | SUCCESS | rc=0 >>
2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:15:5d:00:3c:06 brd ff:ff:ff:ff:ff:ff
    inet 10.1.1.11/24 brd 10.1.1.255 scope global eth0
    inet 10.1.1.9/32 scope global eth0
    inet6 fe80::215:5dff:fe00:3c06/64 scope link
       valid_lft forever preferred_lft forever

10.1.1.14 | SUCCESS | rc=0 >>
2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:15:5d:00:3c:15 brd ff:ff:ff:ff:ff:ff
    inet 10.1.1.14/24 brd 10.1.1.255 scope global eth0
    inet6 fe80::215:5dff:fe00:3c15/64 scope link
       valid_lft forever preferred_lft forever
The cluster's running state can be monitored with Zabbix or Nagios via custom shell scripts.
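As an illustration of what such a check might do, the sketch below counts the active Real Servers in `ipvsadm -L -n` output; the function name and the sample text are invented for the example, and a real Zabbix/Nagios item would pipe in live /sbin/ipvsadm output instead:

```shell
# Count Real Server entries ("-> ip:port ... Route ...") read from stdin.
# A monitoring item could alert when the count drops below the expected size.
count_real_servers() {
    grep -c -- '-> [0-9]'
}

# Sample output, as captured earlier in this article
sample='IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.1.1.9:80 wlc
  -> 10.1.1.12:80                 Route   1      0          0
  -> 10.1.1.13:80                 Route   1      1          0'

printf '%s\n' "$sample" | count_real_servers   # prints 2
```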