Building a High-Availability Load-Balancing Cluster on Linux

Keywords: linux, high availability, cluster, Ansible, LVS, Keepalived
Published: 2020-11-12 10:28:28.0

1. System Architecture Design

"Building a High-Availability Load-Balancing Cluster on Linux (Ansible+LVS+Keepalived)"

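2. Installing and Configuring Ansible

2.1 Installing Ansible

Installing Ansible is straightforward: first install the epel-release repository on CentOS 6, then install Ansible directly with yum.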

sudo yum install epel-release
sudo yum install ansible

2.2 Creating an SSH Key

We log in to and manage the LVS load-balancing cluster with SSH keys, so the first step is to create a key pair.

$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Passphrases do not match.  Try again.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
3b:91:ee:8d:2c:fb:f8:d7:a2:1e:e4:d5:fa:68:bf:8a root@centos6.8-AutoOPS
The key's randomart image is:
+--[ RSA 2048]----+
|                 |
|                 |
|                 |
|         . .     |
|        S . .    |
|       + + .     |
|        * ..     |
|      .+ *+o.    |
|      o*E++++.   |
+-----------------+

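After logging in to a target server with an SSH key, you sometimes need to log in from server1 to server2, but authorized_keys has not been configured between them (or was deliberately left unconfigured for security reasons). In that case, ssh-agent forwarding from the jump host lets you move between servers without entering a password.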

# Start ssh-agent in the current shell
$ eval "$(ssh-agent)"
Agent pid 1473

# Alternatively, start a new shell that runs under its own agent
$ ssh-agent bash
# Add the private key; the path can be omitted if the default key path is used
$ ssh-add ~/.ssh/id_rsa
# Allow ssh-agent forwarding
# Edit the ssh client configuration on every server so that each one forwards the agent
# System-wide:
$ echo "ForwardAgent yes" >> /etc/ssh/ssh_config
# Per user:
$ touch ~/.ssh/config
$ vim ~/.ssh/config
Host *
  ForwardAgent yes

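Note: if your SSH key is protected by a passphrase, you need to add the key to ssh-agent. Otherwise the batch distribution will prompt for the key passphrase and the task will fail.

2.3 Adding the LVS Cluster Servers to Ansible

Add the IP addresses of all servers in the cluster to the Ansible hosts (inventory) file.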

$ cat >> /etc/ansible/hosts <
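
The heredoc above is cut off in the source. As a minimal sketch, assuming the group membership implied by the command output later in this article (directors 10.1.1.11 and 10.1.1.14, Real Servers 10.1.1.12 and 10.1.1.13), the inventory could be written as follows. The INVENTORY variable is a hypothetical override that defaults to a scratch file, so the snippet can be tried before touching /etc/ansible/hosts.

```shell
#!/bin/bash
# Sketch only: group membership is inferred from later command output,
# not taken from the truncated heredoc in the original.
# INVENTORY is a hypothetical override; it defaults to a scratch file.
INVENTORY="${INVENTORY:-$(mktemp)}"

cat >> "$INVENTORY" <<'EOF'
[lvs-dr]
10.1.1.11
10.1.1.14

[lvs-rs]
10.1.1.12
10.1.1.13
EOF

echo "inventory written to $INVENTORY"
```

Running it with INVENTORY=/etc/ansible/hosts appends to the real inventory; `ansible all --list-hosts` should then list the four addresses.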

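2.4 Distributing the SSH Login Public Key to All Servers

On Linux, to copy an SSH login public key to a target server you can use the ssh-copy-id command, or upload an authorized_keys file to the target server with scp.

In Ansible, the public key can be distributed by transferring the authorized_keys file to the target servers with a file-transfer module such as copy. This article uses the dedicated authorized_key module instead, which adds the key to the target user's authorized_keys file.

Before distributing the public key, you first need to configure SSH login credentials. If all cluster servers share the same password, pass -k when running the command to be prompted for it.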

$ ansible lvs-rs -m authorized_key -a "user=root key='{{lookup('file','/root/.ssh/id_rsa.pub')}}'" -k

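If the passwords differ, define them in the inventory (/etc/ansible/hosts). If this is the first time SSH/Ansible connects to the target servers, host_key_checking must be disabled in /etc/ansible/ansible.cfg, otherwise the command fails. After distribution finishes, re-enable it and remove the SSH connection information from the inventory.

The SSH connection information looks like this (/etc/ansible/hosts):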

[lvs-dr]
10.1.1.11 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_pass=rootpass1
10.1.1.14 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_pass=rootpass4

[lvs-rs]
10.1.1.12 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_pass=rootpass2
10.1.1.13 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_pass=rootpass3

If host_key_checking (/etc/ansible/ansible.cfg) is left enabled, the error looks like this:

$ ansible lvs-dr -m authorized_key -a "user=root key='{{lookup('file','/root/.ssh/id_rsa.pub')}}'"
10.1.1.14 | FAILED! => {
    "failed": true, 
    "msg": "Using a SSH password instead of a key is not possible because Host Key checking is enabled and sshpass does not support this.  Please add this host's fingerprint to your known_hosts file to manage this host."
}
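
As an alternative to editing ansible.cfg and reverting it afterwards, host key checking can be switched off for a single run through the ANSIBLE_HOST_KEY_CHECKING environment variable. The sketch below wraps the article's distribution command that way; distribute_key and the ANSIBLE_BIN override are hypothetical names introduced here for illustration.

```shell
#!/bin/bash
# Sketch: disable host key checking only for the key-distribution run,
# instead of editing /etc/ansible/ansible.cfg and reverting it later.
# ANSIBLE_BIN is a hypothetical override (e.g. ANSIBLE_BIN=echo for a dry run).
ANSIBLE_BIN="${ANSIBLE_BIN:-ansible}"

# distribute_key GROUP: push root's public key to every host in GROUP.
distribute_key() {
    local group="$1"
    ANSIBLE_HOST_KEY_CHECKING=False "$ANSIBLE_BIN" "$group" -m authorized_key \
        -a "user=root key='{{lookup('file','/root/.ssh/id_rsa.pub')}}'"
}
```

For example, `distribute_key lvs-dr` would run the same authorized_key command shown in this section, with the check disabled only for that invocation.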

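Distribute the public key with the authorized_key module using the command below, where lvs-dr is the server group, -m specifies the module to use, and lookup reads the contents of the given file.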

$ ansible lvs-dr -m authorized_key -a "user=root key='{{lookup('file','/root/.ssh/id_rsa.pub')}}'"
10.1.1.11 | SUCCESS => {
    "changed": true, 
    "exclusive": false, 
    "key": "ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAW5GGfs2x+Vibizoiq/WNrDYxyyBISgWTisijGJijQA/wWsxj+hoh5fEw== root@centos6.8-AutoOPS", 
    "key_options": null, 
    "keyfile": "/root/.ssh/authorized_keys", 
    "manage_dir": true, 
    "path": null, 
    "state": "present", 
    "unique": false, 
    "user": "root", 
    "validate_certs": true
}
10.1.1.14 | SUCCESS => {
    "changed": true, 
    "exclusive": false, 
    "key": "ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA+Vibizoiq/WNrDYxyyBISgWTisijGJijQA/wWsxj+hoh5fEw== root@centos6.8-AutoOPS", 
    "key_options": null, 
    "keyfile": "/root/.ssh/authorized_keys", 
    "manage_dir": true, 
    "path": null, 
    "state": "present", 
    "unique": false, 
    "user": "root", 
    "validate_certs": true
}

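Once the SSH login public key has been distributed, re-enable host_key_checking (/etc/ansible/ansible.cfg) and remove the connection information defined in the inventory at /etc/ansible/hosts.

3. LVS Real Server (Application Server) Configuration

In LVS DR mode, each Real Server must bind the LVS DR virtual IP on lo and suppress ARP responses for it. Only then can it receive and answer user requests correctly. The script is as follows: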

$ vim /data/ansible/lvs-group/lvs_dr_realserver.sh
#!/bin/bash
#
# Date: 2017/04/09
# Author:Eric.Wang
#
# Bind the LVS VIP on lo:0 and suppress ARP replies for it (LVS DR mode).
LB_VIP=10.1.1.9
. /etc/rc.d/init.d/functions

case "$1" in
    start)
        ifconfig lo:0 "$LB_VIP" netmask 255.255.255.255 broadcast "$LB_VIP" up
        /sbin/route add -host "$LB_VIP" dev lo:0
        # arp_ignore=1: answer ARP only for addresses on the receiving interface
        # arp_announce=2: use the best local source address in ARP announcements
        echo "1" > /proc/sys/net/ipv4/conf/lo/arp_ignore
        echo "2" > /proc/sys/net/ipv4/conf/lo/arp_announce
        echo "1" > /proc/sys/net/ipv4/conf/all/arp_ignore
        echo "2" > /proc/sys/net/ipv4/conf/all/arp_announce

        sysctl -p > /dev/null 2>&1
        echo "RealServer Start OK."
        ;;
    stop)
        ifconfig lo:0 down
        /sbin/route del "$LB_VIP" > /dev/null 2>&1
        # Restore the kernel defaults for ARP handling
        echo "0" > /proc/sys/net/ipv4/conf/lo/arp_ignore
        echo "0" > /proc/sys/net/ipv4/conf/lo/arp_announce
        echo "0" > /proc/sys/net/ipv4/conf/all/arp_ignore
        echo "0" > /proc/sys/net/ipv4/conf/all/arp_announce
        sysctl -p > /dev/null 2>&1
        echo "RealServer Stopped."
        ;;
    *)
        echo "Usage: $0 {start|stop}"
        exit 1
        ;;
esac
exit 0

Distribute this script to all Real Servers and run it.

$ ansible lvs-rs -m copy -a "src=/data/ansible/lvs-group/lvs_dr_realserver.sh dest=/opt/lvs_dr_realserver.sh owner=root group=root mode=0750"

Start the LVS RealServer service

$ ansible lvs-rs -m command -a '/opt/lvs_dr_realserver.sh start'
10.1.1.13 | SUCCESS | rc=0 >>
RealServer Start OK.

10.1.1.12 | SUCCESS | rc=0 >>
RealServer Start OK.
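
The script writes the ARP settings directly into /proc, so they last only until the next reboot. A minimal sketch of persisting the same four values in a sysctl configuration file follows. persist_arp_settings and the SYSCTL_FILE override are hypothetical names; on a Real Server the target would typically be /etc/sysctl.conf, while the default here is a scratch file.

```shell
#!/bin/bash
# Sketch: persist the ARP-suppression settings so they survive a reboot.
# SYSCTL_FILE is a hypothetical override; it defaults to a scratch file.
SYSCTL_FILE="${SYSCTL_FILE:-$(mktemp)}"

# persist_arp_settings FILE: append the four settings used by the
# Real Server script, skipping any line that is already present.
persist_arp_settings() {
    local file="$1" line
    for line in \
        "net.ipv4.conf.lo.arp_ignore = 1" \
        "net.ipv4.conf.lo.arp_announce = 2" \
        "net.ipv4.conf.all.arp_ignore = 1" \
        "net.ipv4.conf.all.arp_announce = 2"
    do
        grep -qxF "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
    done
}

persist_arp_settings "$SYSCTL_FILE"
```

Because existing lines are skipped, the function can be run repeatedly without duplicating entries. With SYSCTL_FILE=/etc/sysctl.conf, a following `sysctl -p` would load the values.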

Install the Nginx web server on the Real Servers and start the service

$ ansible lvs-rs -m yum -a 'name=nginx state=installed'
$ ansible lvs-rs -m command -a '/etc/init.d/nginx restart'

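4. Keepalived/LVS Director Server Installation and Configuration

Keepalived, as a companion to LVS, manages the LVS Director Server automatically. Through Real Server health checks it promptly detects failed RS nodes and removes them from the load-balancing cluster. All of this is automatic and only needs to be configured in keepalived.conf. Real Server health checks support protocols such as HTTP, SSL, and TCP.

4.1 LVS Director Server

On an LVS Director Server with Keepalived installed, LVS itself needs no special configuration. The LVS cluster type, virtual IP, and Real Servers are all managed through Keepalived.

4.2 Keepalived Installation and Configuration

4.2.1 Installing Keepalived and ipvsadm

On the LVS Director Servers, install Keepalived and ipvsadm with Ansible's yum module.

$ ansible lvs-dr -m yum -a 'name=keepalived,ipvsadm state=installed'

4.2.2 Editing the Keepalived Configuration File

Edit the Keepalived configuration file to specify the cluster type, virtual IP, Real Servers, cluster service, and health checks.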

$ cat > /data/ansible/lvs-group/keepalived.conf <
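
The heredoc above is cut off in the source, so the author's actual configuration is unknown. The following is only a sketch of what such a file might contain, built from values that appear elsewhere in this article: VIP 10.1.1.9 on port 80, wlc scheduling, direct routing, Real Servers 10.1.1.12 and 10.1.1.13, state MASTER, priority 100, and interface eth0. The router_id, virtual_router_id, auth_pass, timer values, and the choice of TCP_CHECK are assumptions, as is the KEEPALIVED_CONF override, which defaults to a scratch file.

```shell
#!/bin/bash
# Sketch only: NOT the author's original keepalived.conf (truncated in the source).
# KEEPALIVED_CONF is a hypothetical override; it defaults to a scratch file.
KEEPALIVED_CONF="${KEEPALIVED_CONF:-$(mktemp)}"

cat > "$KEEPALIVED_CONF" <<'EOF'
! Configuration File for keepalived
global_defs {
    router_id LVS_DR_01            # assumed identifier
}

vrrp_instance VI_1 {
    state MASTER                   # changed to BACKUP on 10.1.1.14
    interface eth0
    virtual_router_id 51           # assumed; must match on both directors
    priority 100                   # changed to 99 on the backup
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111             # assumed placeholder
    }
    virtual_ipaddress {
        10.1.1.9
    }
}

virtual_server 10.1.1.9 80 {
    delay_loop 6
    lb_algo wlc
    lb_kind DR
    protocol TCP

    real_server 10.1.1.12 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            connect_port 80
        }
    }
    real_server 10.1.1.13 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            connect_port 80
        }
    }
}
EOF
```

The `state MASTER` and `priority 100` lines are written exactly as the sed commands in section 4.2.3 expect, so the same file can be distributed to both directors and then adjusted on the backup.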

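4.2.3 Distributing keepalived.conf

Distribute the Keepalived configuration file to the LVS Director Servers. SELinux needs to be disabled before distributing files, otherwise operations on certain directories may be denied.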

$ ansible lvs-dr -m copy -a 'src=/data/ansible/lvs-group/keepalived.conf dest=/etc/keepalived/keepalived.conf owner=root group=root backup=yes'

On the Keepalived backup server, change state and priority in keepalived.conf so that they do not conflict with the Keepalived Master.

$ ansible 10.1.1.14 -m command -a 'sed -i "s/state\ MASTER/state\ BACKUP/g" /etc/keepalived/keepalived.conf'
10.1.1.14 | SUCCESS | rc=0 >>


$ ansible 10.1.1.14 -m command -a 'sed -i "s/priority\ 100/priority\ 99/g" /etc/keepalived/keepalived.conf'
10.1.1.14 | SUCCESS | rc=0 >>

4.2.4 Starting the Master and Backup Keepalived Services

4.3 Failure Simulation

Stop nginx on Real Server 10.1.1.13, then check the LVS cluster status
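
The source gives no command for this step. A minimal sketch, assuming the same init-script invocation used in the failure simulation later in this article, starts the master (10.1.1.11) first and then the backup (10.1.1.14). start_keepalived and the ANSIBLE_BIN override are hypothetical names introduced for illustration.

```shell
#!/bin/bash
# Sketch: start Keepalived on the master first, then on the backup.
# ANSIBLE_BIN is a hypothetical override (e.g. ANSIBLE_BIN=echo to preview).
ANSIBLE_BIN="${ANSIBLE_BIN:-ansible}"

start_keepalived() {
    local host
    for host in 10.1.1.11 10.1.1.14; do
        # Stop at the first host on which the start command fails.
        "$ANSIBLE_BIN" "$host" -m command -a '/etc/init.d/keepalived start' || return 1
    done
}
```

Calling `start_keepalived` runs the two commands in order; `ANSIBLE_BIN=echo start_keepalived` only prints the arguments that would be passed to ansible.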

$ ansible lvs-dr -m command -a '/sbin/ipvsadm -L -n'
10.1.1.11 | SUCCESS | rc=0 >>
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.1.1.9:80 wlc
  -> 10.1.1.12:80                 Route   1      1          0         

10.1.1.14 | SUCCESS | rc=0 >>
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.1.1.9:80 wlc
  -> 10.1.1.12:80                 Route   1      0          0 

Stop the Keepalived Master. Once the Keepalived Backup detects that the Master is down, it takes over the VIP automatically.

$ ansible 10.1.1.11 -m command -a '/etc/init.d/keepalived stop'
10.1.1.11 | SUCCESS | rc=0 >>
Stopping keepalived: [  OK  ]

$ ansible 10.1.1.11 -m command -a '/etc/init.d/keepalived status'
10.1.1.11 | FAILED | rc=3 >>
keepalived is stopped

$ ansible lvs-dr -m command -a '/sbin/ipvsadm -L -n'
10.1.1.11 | SUCCESS | rc=0 >>
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn

10.1.1.14 | SUCCESS | rc=0 >>
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.1.1.9:80 wlc
  -> 10.1.1.12:80                 Route   1      0          0         
  -> 10.1.1.13:80                 Route   1      1          0  

$ ansible lvs-dr -m command -a 'ip addr show dev eth0'
10.1.1.11 | SUCCESS | rc=0 >>
2: eth0:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:15:5d:00:3c:06 brd ff:ff:ff:ff:ff:ff
    inet 10.1.1.11/24 brd 10.1.1.255 scope global eth0
    inet6 fe80::215:5dff:fe00:3c06/64 scope link 
       valid_lft forever preferred_lft forever

10.1.1.14 | SUCCESS | rc=0 >>
2: eth0:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:15:5d:00:3c:15 brd ff:ff:ff:ff:ff:ff
    inet 10.1.1.14/24 brd 10.1.1.255 scope global eth0
    inet 10.1.1.9/32 scope global eth0
    inet6 fe80::215:5dff:fe00:3c15/64 scope link 
       valid_lft forever preferred_lft forever  
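
Reading the VIP out of this output by eye works for two directors. For scripted checks, a small filter over the `ip addr show` output can answer the same question. The sketch below assumes the VIP 10.1.1.9 used in this article; holds_vip and the VIP override are hypothetical names.

```shell
#!/bin/bash
# Sketch: decide from `ip addr show` output whether a host holds the VIP.
VIP="${VIP:-10.1.1.9}"

# Reads `ip addr show` output on stdin; succeeds if an IPv4 address
# line for the VIP is present, fails otherwise.
holds_vip() {
    grep -qF "inet ${VIP}/"
}
```

On a director, `ip addr show dev eth0 | holds_vip && echo "this director holds the VIP"` would print the message only on the node that currently owns 10.1.1.9.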

When the original Keepalived Master comes back online, it takes the VIP back.

$ ansible 10.1.1.11 -m command -a '/etc/init.d/keepalived start'
10.1.1.11 | SUCCESS | rc=0 >>
Starting keepalived: [  OK  ]

$ ansible lvs-dr -m command -a 'ip addr show dev eth0'
10.1.1.11 | SUCCESS | rc=0 >>
2: eth0:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:15:5d:00:3c:06 brd ff:ff:ff:ff:ff:ff
    inet 10.1.1.11/24 brd 10.1.1.255 scope global eth0
    inet 10.1.1.9/32 scope global eth0
    inet6 fe80::215:5dff:fe00:3c06/64 scope link 
       valid_lft forever preferred_lft forever

10.1.1.14 | SUCCESS | rc=0 >>
2: eth0:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:15:5d:00:3c:15 brd ff:ff:ff:ff:ff:ff
    inet 10.1.1.14/24 brd 10.1.1.255 scope global eth0
    inet6 fe80::215:5dff:fe00:3c15/64 scope link 
       valid_lft forever preferred_lft forever  

5. Monitoring LVS Cluster Status

You can use Zabbix or Nagios with custom shell scripts to monitor the running state of the cluster.
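
As a minimal sketch of such a custom script, the functions below count the Real Servers listed in `ipvsadm -L -n` output and report the result using the Nagios plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL). The function names and the thresholds are assumptions chosen for illustration, not part of the original setup.

```shell
#!/bin/bash
# Sketch: Nagios-style check of the LVS table, fed with `ipvsadm -L -n` output.

# Count "-> IP:port" lines on stdin; the "-> RemoteAddress:Port" header is skipped
# because its second field does not start with a digit.
count_real_servers() {
    awk '$1 == "->" && $2 ~ /^[0-9]/ { n++ } END { print n + 0 }'
}

# check_lvs EXPECTED: compare the number of active Real Servers with EXPECTED.
check_lvs() {
    local expected="$1" active
    active=$(count_real_servers)
    if [ "$active" -eq 0 ]; then
        echo "CRITICAL: no Real Servers in the LVS table"
        return 2
    elif [ "$active" -lt "$expected" ]; then
        echo "WARNING: $active of $expected Real Servers active"
        return 1
    fi
    echo "OK: $active of $expected Real Servers active"
    return 0
}
```

On a director this could be invoked as `ipvsadm -L -n | check_lvs 2`. With one Real Server removed by the health check, as in the failure simulation above, it would report a warning.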