hello云胜

Technology and life

Deploying a redis cluster in k8s

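In the previous post we deployed a single-node redis in k8s. This time we raise the difficulty and deploy a redis cluster.

A redis cluster differs a lot from a single redis node, and deploying one in k8s is more troublesome than deploying one on plain servers.

The reason is that a server's IP is fixed, whereas in a container environment a pod's IP changes every time the pod restarts.

redis cluster, however, relies on known node IPs to hold the cluster together, so we have to handle pod restarts ourselves.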

YAML files

ConfigMap file

apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-config
  namespace: redis-test
data:
  fix-ip.sh: |
    #!/bin/sh
    CLUSTER_CONFIG="/data/data/nodes.conf"
    if [ -f ${CLUSTER_CONFIG} ]; then
      if [ -z "${POD_IP}" ]; then
        echo "miss Pod IP address!"
        exit 1
      fi
      echo "Updating my IP to ${POD_IP} in ${CLUSTER_CONFIG}"
      sed -i.bak -e '/myself/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/'${POD_IP}'/' ${CLUSTER_CONFIG}
    fi
    exec "$@"
  redis.conf: |
    dir /data/data
    requirepass 123456
    masterauth 123456
    logfile /data/data/redis.log
    cluster-enabled yes
    cluster-config-file /data/data/nodes.conf
    protected-mode no
    daemonize no
    pidfile /var/run/redis.pid
    port 6379
    bind 0.0.0.0
    timeout 3600
    tcp-keepalive 1
    loglevel verbose
    databases 16
    save 900 1
    save 300 10
    save 60 10000
    rdbcompression yes
    rdbchecksum yes
    dbfilename dump.rdb
    appendonly yes
    appendfilename "appendonly.aof"
    appendfsync everysec
(Many settings are omitted for brevity.)

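Note the fix-ip.sh script: when a pod restarts, it rewrites the IP on the myself line of the nodes.conf file that the pod generated earlier, replacing it with the pod's new IP.

The other pods that are still alive pick up the new IP automatically; only the restarted pod's own record needs fixing.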
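
To make the effect of the sed expression concrete, here is a minimal sketch that runs the same substitution against a made-up nodes.conf in a temporary directory. The node IDs, slot ranges and the fix_ip function name are invented for illustration; only the sed expression comes from fix-ip.sh, and the two IPs are borrowed from the restart test later in this post.

```shell
#!/bin/sh
# Illustration only: a fabricated two-node nodes.conf, not real cluster state.
workdir=$(mktemp -d)
conf="$workdir/nodes.conf"
cat > "$conf" <<'EOF'
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 100.105.152.12:6379@16379 myself,master - 0 0 2 connected 5461-10922
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb 100.66.220.252:6379@16379 master - 0 0 1 connected 0-5460
EOF

# fix_ip FILE NEW_IP: hypothetical wrapper around the sed call from fix-ip.sh.
# Only the line containing "myself" is touched, and only its first IPv4 address.
fix_ip() {
  sed -i.bak -e '/myself/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/'"$2"'/' "$1"
}

fix_ip "$conf" "100.105.152.25"
cat "$conf"
```

After the call, the myself line carries the new address while the other node's line is unchanged, and the previous content is kept in nodes.conf.bak.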

StatefulSet file

apiVersion: v1
kind: Service
metadata:
  namespace: redis-test
  name: redis-cluster
spec:
  clusterIP: None
  ports:
    - port: 6379
      targetPort: 6379
      name: client
    - port: 16379
      targetPort: 16379
      name: gossport
  selector:
    app: redis-cluster
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: redis-test
  name: redis-cluster
spec:
  serviceName: redis-cluster
  replicas: 6
  selector:
    matchLabels:
      app: redis-cluster
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      containers:
        - name: redis
          image: redis:5.0.5-alpine
          ports:
            - containerPort: 6379
              name: client
            - containerPort: 16379
              name: gossport
          command: ["/data/conf/fix-ip.sh", "redis-server", "/data/conf/redis.conf"]
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          volumeMounts:
            - name: config
              mountPath: /data/conf
              readOnly: false
            - name: data
              mountPath: /data/data
              readOnly: false
      volumes:
        - name: config
          configMap:
            name: redis-config
            defaultMode: 0755
  volumeClaimTemplates:
    - metadata:
        name: data
        annotations:
          volume.beta.kubernetes.io/storage-class: "nfs-client"
      spec:
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: 2Gi

We use the statefulset's volumeClaimTemplates to generate a dedicated pvc and pv for each pod.
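
Kubernetes names each generated claim after the template name, the statefulset name and the pod ordinal, which is what lets a restarted pod find its old nodes.conf again. As a small sketch, the loop below prints the claim names to expect for the manifest above; pvc_names is a hypothetical helper, not a kubectl feature.

```shell
#!/bin/sh
# pvc_names TEMPLATE STATEFULSET REPLICAS
# Prints the PVC names a statefulset creates: <template>-<statefulset>-<ordinal>.
pvc_names() {
  i=0
  while [ "$i" -lt "$3" ]; do
    echo "$1-$2-$i"
    i=$((i + 1))
  done
}

pvc_names data redis-cluster 6
```

The list can be compared with what kubectl -n redis-test get pvc reports once the pods are running.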

service

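# Note: this Service has the same name as the headless Service above.
# Two Services in one namespace cannot share a name, so one of them
# needs a different name before both can be applied.
apiVersion: v1
kind: Service
metadata:
  namespace: redis-test
  name: redis-cluster
spec:
  type: ClusterIP
  ports:
    - port: 6379
      targetPort: 6379
      name: client
  selector:
    app: redis-cluster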

That's it.

Apply each file in turn and everything starts up.

[root@paas-m-k8s-master-1 ~]# kc -n redis-test get pod,sts,svc
NAME READY STATUS RESTARTS AGE
pod/redis-cluster-0 1/1 Running 0 3h3m
pod/redis-cluster-1 1/1 Running 0 14m
pod/redis-cluster-2 1/1 Running 0 3h3m
pod/redis-cluster-3 1/1 Running 0 3h3m
pod/redis-cluster-4 1/1 Running 0 3h3m
pod/redis-cluster-5 1/1 Running 0 3h2m

NAME READY AGE
statefulset.apps/redis-cluster 6/6 3h3m

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/redis-cluster ClusterIP None <none> 6379/TCP,16379/TCP 33m

Assembling the cluster

On plain servers we would build the cluster with the following command:

redis-cli -a 123456 --cluster create  ip:port ip2:port ip3:port ip4:port ip5:port ip6:port  --cluster-replicas 1
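
The --cluster-replicas 1 flag asks for one replica per master, so six nodes become three masters and three replicas, and redis-cli refuses to create a cluster with fewer than three masters. The arithmetic can be sketched as follows; cluster_layout is a hypothetical helper written for this post, not part of redis-cli.

```shell
#!/bin/sh
# cluster_layout NODES REPLICAS_PER_MASTER
# Prints how many masters and replicas a node count yields,
# or fails when fewer than 3 masters would result.
cluster_layout() {
  nodes=$1
  per_master=$2
  masters=$(( nodes / (per_master + 1) ))
  if [ "$masters" -lt 3 ]; then
    echo "need at least 3 masters, got $masters" >&2
    return 1
  fi
  echo "masters=$masters replicas=$(( nodes - masters ))"
}

cluster_layout 6 1
```

This is why the statefulset above uses replicas: 6, the smallest size that gives every master one replica.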

We can also get the IPs of all the pods with the following command:

kubectl -n redis-test get pods -l app=redis-cluster -o jsonpath='{range.items[*]}{.status.podIP}:6379 '

Putting the two together gives:

kubectl -n redis-test exec -it redis-cluster-0 -- redis-cli -a 123456 --cluster create $(kubectl -n redis-test get pods -l app=redis-cluster -o jsonpath='{range.items[*]}{.status.podIP}:6379 ') --cluster-replicas 1

[image: image-20230616135854306]

Done, the cluster is built.

Connect from inside a pod to check:

[image: image-20230616140018243]

Connect through the service to check:

[root@paas-m-k8s-master-1 redis]# kc -n yys exec -it myredis-sample-0 -- redis-cli -c -h redis-cluster.redis-test -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
redis-cluster.redis-test:6379> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
...

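Note that this redis cluster cannot be reached from outside the k8s cluster, not even if the service is switched to NodePort, because the node IPs that redis records are k8s-internal IPs, which mean nothing to a client outside.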
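
The underlying mechanism is the cluster redirect: when a key lives on another node, redis answers with a MOVED reply containing that node's address, and a cluster-aware client then connects to it directly. The sketch below only parses such a reply to show that the address is a pod IP; the sample values are taken from the restart test later in this post, and parse_moved is a hypothetical helper.

```shell
#!/bin/sh
# parse_moved REPLY
# Splits a reply of the form "MOVED <slot> <ip>:<port>" into its parts.
parse_moved() {
  read -r kind slot addr <<EOF
$1
EOF
  [ "$kind" = "MOVED" ] || return 1
  echo "slot=$slot host=${addr%:*} port=${addr##*:}"
}

parse_moved "MOVED 10439 100.105.152.12:6379"
```

A client outside k8s that entered through a NodePort would be told to reconnect to 100.105.152.12, an address it cannot route to.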

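Our current policy is that a production redis should only be used from inside the production cluster and should not be reachable from outside in the first place.

This does make troubleshooting more awkward, so we built a separate web page that authorized users can use to run queries.

Another workaround is to deploy predixy as a proxy in front of redis.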

Testing a pod restart

First write a key aaa; it ends up on the pod with IP 100.105.152.12.

[root@paas-m-k8s-master-1 ~]# kc -n yys exec -it myredis-sample-0 -- redis-cli -c -h redis-cluster.redis-test -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
redis-cluster.redis-test:6379> set aaa 2023
-> Redirected to slot [10439] located at 100.105.152.12:6379
OK
100.105.152.12:6379>
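
The slot number in the redirect is not arbitrary: redis cluster maps a key to CRC16(key) mod 16384, using the XMODEM variant of CRC16. A rough sketch of that calculation in plain sh is shown below; key_slot is a hypothetical helper, and it ignores hash tags (the {...} part of a key), which real redis treats specially.

```shell
#!/bin/sh
# key_slot KEY
# Computes CRC16-XMODEM (polynomial 0x1021, initial value 0) over the key bytes
# and reduces it modulo 16384, the number of slots in a redis cluster.
key_slot() {
  crc=0
  for byte in $(printf '%s' "$1" | od -An -v -tu1); do
    crc=$(( crc ^ (byte << 8) ))
    i=0
    while [ "$i" -lt 8 ]; do
      if [ $(( crc & 0x8000 )) -ne 0 ]; then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
      i=$((i + 1))
    done
  done
  echo $(( crc % 16384 ))
}

key_slot aaa   # prints 10439, the slot reported in the session above
```

Inside a pod the same answer is available from redis itself with the cluster keyslot command.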

That pod is redis-cluster-1.

[root@paas-m-k8s-master-1 ~]# kc -n redis-test get pod -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
redis-cluster-0 1/1 Running 0 168m 100.66.220.252 paas-m-k8s-node-1 <none> <none>
redis-cluster-1 1/1 Running 0 168m 100.105.152.12 paas-m-k8s-node-4 <none> <none>
redis-cluster-2 1/1 Running 0 168m 100.115.23.183 paas-m-k8s-node-5 <none> <none>
redis-cluster-3 1/1 Running 0 168m 100.66.54.124 paas-m-k8s-node-3 <none> <none>
redis-cluster-4 1/1 Running 0 168m 100.108.161.168 paas-m-k8s-node-2 <none> <none>
redis-cluster-5 1/1 Running 0 167m 100.111.149.223 paas-m-k8s-node-6 <none> <none>

Delete redis-cluster-1. The replacement pod comes up with the IP 100.105.152.25.

[root@paas-m-k8s-master-1 ~]# kc -n redis-test delete pod redis-cluster-1
pod "redis-cluster-1" deleted
[root@paas-m-k8s-master-1 ~]# kc -n redis-test get pod -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
redis-cluster-0 1/1 Running 0 169m 100.66.220.252 paas-m-k8s-node-1 <none> <none>
redis-cluster-1 1/1 Running 0 4s 100.105.152.25 paas-m-k8s-node-4 <none> <none>
redis-cluster-2 1/1 Running 0 168m 100.115.23.183 paas-m-k8s-node-5 <none> <none>
redis-cluster-3 1/1 Running 0 168m 100.66.54.124 paas-m-k8s-node-3 <none> <none>
redis-cluster-4 1/1 Running 0 168m 100.108.161.168 paas-m-k8s-node-2 <none> <none>
redis-cluster-5 1/1 Running 0 168m 100.111.149.223 paas-m-k8s-node-6 <none> <none>

Check the data in the cluster: it is still there.

[root@paas-m-k8s-master-1 ~]# kc -n myhome exec -it myredis-sample-0 -- redis-cli -c -h redis-cluster.redis-test -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
redis-cluster.redis-test:6379> get aaa
-> Redirected to slot [10439] located at 100.105.152.25:6379
"2023"

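This shows that both persistence and fix-ip work.

Why use a clumsy approach like fix-ip?

Originally I tested with podName.svcName.nameSpace, the addressing form specific to statefulsets. For a statefulset behind a headless service, the podName.svcName.nameSpace name never changes, so building the redis cluster on these names looked like a perfect fit.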

kubectl -n redis-test exec -it redis-cluster-0 -- redis-cli -a 123456 --cluster create redis-cluster-0.redis-cluster.redis-test:6379 redis-cluster-1.redis-cluster.redis-test:6379 redis-cluster-2.redis-cluster.redis-test:6379 redis-cluster-3.redis-cluster.redis-test:6379 redis-cluster-4.redis-cluster.redis-test:6379 redis-cluster-5.redis-cluster.redis-test:6379 --cluster-replicas 1

Unfortunately redis validates the node addresses, and these names are rejected because they are not valid IPs.

[image: image-20230619113325746]

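The fix-ip approach also carries a risk: if several redis instances go down and restart at the same time, so that the cluster has already entered the fail state, the cluster cannot heal itself and needs manual intervention.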
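
One way to notice that situation early is to watch the cluster_state field of cluster info. The following is only a sketch of such a check: cluster_is_ok is a hypothetical helper that reads cluster info output on stdin, and the sample inputs are hand-written rather than captured from a failed cluster.

```shell
#!/bin/sh
# cluster_is_ok: reads `cluster info` output on stdin and succeeds
# only when it contains cluster_state:ok.
# Redis terminates these lines with CRLF, so carriage returns are stripped first.
cluster_is_ok() {
  state=$(tr -d '\r' | sed -n 's/^cluster_state://p')
  [ "$state" = "ok" ]
}

printf 'cluster_state:ok\r\ncluster_slots_assigned:16384\r\n' | cluster_is_ok \
  && echo "cluster healthy"
printf 'cluster_state:fail\r\n' | cluster_is_ok \
  || echo "cluster needs manual repair"
```

In this setup the input could come from kubectl -n redis-test exec redis-cluster-0 -- redis-cli -a 123456 cluster info, run periodically or from an alerting job.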