
Deploying Rook and Creating a Ceph Cluster

Prepare the packages

wget https://github.com/rook/rook/archive/v1.4.4.tar.gz
tar -zxvf v1.4.4.tar.gz
cd rook-1.4.4/cluster/examples/kubernetes/ceph

Modify operator.yaml

Rook also needs to run on the master hosts, but masters carry a NoSchedule taint by default, so tolerations have to be added in operator.yaml:

[root@d-paas-k8s-master-0 ceph-image]# vi operator.yaml
CSI_PLUGIN_TOLERATIONS: |
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
    operator: Exists
  - effect: NoExecute
    key: node-role.kubernetes.io/etcd
    operator: Exists

Load the ceph-csi images

When the Ceph cluster is deployed, ceph-csi is deployed along with it, and all of the CSI Docker images are hosted in the quay.io registry. In my deployment I found that our network could not pull images from quay.io even with a Docker registry mirror configured, so the image pulls failed and the CSI pods never started.

So I packaged the images ceph-csi needs into tgz files ahead of time and put them on the jump host under /home/api/ceph-image.

Before deploying Rook, scp all of the following images to every host in the cluster:

[api@kfxqtyglpt ceph-image]$ pwd
/home/api/ceph-image
[api@kfxqtyglpt ceph-image]$ ll
total 1230580
-rwxr-xr-x. 1 api api 1049964544 Nov 11 13:48 cephcsi.tgz
-rwxr-xr-x. 1 api api 47385088 Nov 11 13:49 csi-attacher.tgz
-rwxr-xr-x. 1 api api 18313728 Nov 11 13:49 csi-node-driver-register.tgz
-rwxr-xr-x. 1 api api 49535488 Nov 11 13:49 csi-provisioner.tgz
-rwxr-xr-x. 1 api api 47319040 Nov 11 13:49 csi-resizer.tgz
-rwxr-xr-x. 1 api api 47581696 Nov 11 13:49 csi-snapshotter.tgz
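The copy-and-load step can be sketched as a small loop run from the jump host. This is only a sketch: the node names in NODES are hypothetical and must be replaced with your real hostnames, and with DRY_RUN=1 (the default) the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch: push each image tarball from the jump host to every node and load
# it into Docker there. NODES is a hypothetical host list; replace it with
# your real node names. DRY_RUN=1 (default) only prints the commands.
SRC=/home/api/ceph-image
NODES="d-paas-k8s-0-node-0 d-paas-k8s-0-node-1 d-paas-k8s-0-node-2"
IMAGES="cephcsi csi-attacher csi-node-driver-register csi-provisioner csi-resizer csi-snapshotter"
DRY_RUN=${DRY_RUN:-1}

for node in $NODES; do
  for img in $IMAGES; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "scp $SRC/$img.tgz api@$node:/tmp/"
      echo "ssh api@$node docker load -i /tmp/$img.tgz"
    else
      scp "$SRC/$img.tgz" "api@$node:/tmp/" &&
        ssh "api@$node" "docker load -i /tmp/$img.tgz"
    fi
  done
done
```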

Run the following on every host:

[root@d-paas-k8s-master-0 ~]# for i in `ls *.tgz`;do docker load -i $i;done
[root@d-paas-k8s-master-0 ~]# yum install lvm2.x86_64 -y
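The `for i in \`ls *.tgz\`` pattern works here, but it breaks on unusual filenames and runs `docker load` on the literal pattern `*.tgz` when nothing matches. A slightly more defensive variant (a sketch; the docker binary is parameterized purely so it can be dry-run without a Docker daemon):

```shell
# load_images DIR: load every .tgz under DIR into Docker.
# DOCKER defaults to the real docker binary; set DOCKER=echo for a dry run.
DOCKER=${DOCKER:-docker}
load_images() {
  for tgz in "$1"/*.tgz; do
    [ -e "$tgz" ] || continue   # no match: the glob stays literal, skip it
    "$DOCKER" load -i "$tgz"
  done
}
```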

Deploy the Rook operator

[root@d-paas-k8s-master-0 ~]# cd /root/rook-1.4.4/cluster/examples/kubernetes/ceph
[root@d-paas-k8s-master-0 ceph]# kubectl create -f common.yaml
[root@d-paas-k8s-master-0 ceph]# kubectl create -f operator.yaml
#wait until all pods in the rook-ceph namespace are Running
[root@d-paas-k8s-master-0 ceph]# kubectl get pods -n rook-ceph

Deploy the Ceph cluster

[root@d-paas-k8s-master-0 ceph]# kubectl create -f cluster.yaml

#wait until all pods in rook-ceph are Running or Completed
[root@d-paas-k8s-master-0 ceph]# kubectl get pods -n rook-ceph
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-4j7tt 3/3 Running 0 19h
csi-cephfsplugin-l99bm 3/3 Running 0 19h
csi-cephfsplugin-n5xvw 3/3 Running 0 19h
csi-cephfsplugin-provisioner-598854d87f-n5ltz 6/6 Running 0 19h
csi-cephfsplugin-provisioner-598854d87f-vnjbq 6/6 Running 0 19h
csi-cephfsplugin-psmfq 3/3 Running 0 19h
csi-rbdplugin-2zt76 3/3 Running 0 19h
csi-rbdplugin-9jdwx 3/3 Running 0 19h
csi-rbdplugin-ddzpk 3/3 Running 0 19h
csi-rbdplugin-provisioner-dbc67ffdc-8xsbd 6/6 Running 0 19h
csi-rbdplugin-provisioner-dbc67ffdc-jspd9 6/6 Running 0 19h
csi-rbdplugin-wmqvm 3/3 Running 0 19h
rook-ceph-crashcollector-d-paas-k8s-0-node-0-7b696f9f8d-zqb49 1/1 Running 0 19h
rook-ceph-crashcollector-d-paas-k8s-0-node-1-645b49b659-gcrjz 1/1 Running 0 19h
rook-ceph-crashcollector-d-paas-k8s-0-node-2-dbb5978b6-6pwhv 1/1 Running 0 19h
rook-ceph-mgr-a-5977cf7cd7-dlmnj 1/1 Running 0 19h
rook-ceph-mon-a-6cfc9f64cc-k8vdp 1/1 Running 0 19h
rook-ceph-mon-b-574d74f4c9-lgl76 1/1 Running 0 19h
rook-ceph-mon-c-fd6fcb588-rtzfz 1/1 Running 0 19h
rook-ceph-operator-667756ddb6-rjr9v 1/1 Running 0 19h
rook-ceph-osd-0-95dd775b6-757w5 1/1 Running 0 19h
rook-ceph-osd-1-69c45949b5-8fphv 1/1 Running 0 19h
rook-ceph-osd-2-847cb97d55-n87d4 1/1 Running 0 19h
rook-ceph-osd-3-78b76c9475-2qsgw 1/1 Running 0 19h
rook-ceph-osd-4-55c4cb85d8-zqvlq 1/1 Running 0 19h
rook-ceph-osd-5-576db964d8-bwml7 1/1 Running 0 19h
rook-ceph-osd-prepare-d-paas-k8s-0-node-0-ndr7g 0/1 Completed 0 21m
rook-ceph-osd-prepare-d-paas-k8s-0-node-1-nfmjm 0/1 Completed 0 21m
rook-ceph-osd-prepare-d-paas-k8s-0-node-2-xt9hf 0/1 Completed 0 21m
rook-ceph-tools-7cc7fd5755-c44p9 1/1 Running 0 19h
rook-discover-ktjcr 1/1 Running 0 19h
rook-discover-mhcv7 1/1 Running 0 19h
rook-discover-xppd6 1/1 Running 0 19h


#check the Ceph status: deploy the toolbox
[root@d-paas-k8s-master-0 ceph]# kubectl create -f toolbox.yaml

#exec into the toolbox pod
[root@d-paas-k8s-master-0 ceph]# kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- bash

#check the Ceph cluster status, OSD count, and OSD state
[root@rook-ceph-tools-7cc7fd5755-c44p9 /]# ceph status
cluster:
id: b56328d7-2256-4faa-a6f0-f8a684e1ab70
health: HEALTH_OK

services:
mon: 3 daemons, quorum a,b,c (age 19h)
mgr: a(active, since 24m)
osd: 6 osds: 6 up (since 19h), 6 in (since 19h) #there should be one OSD per raw disk in the cluster; here each of the three nodes has two raw disks, 6 in total, so there are 6 OSDs

data:
pools: 2 pools, 33 pgs
objects: 1.66k objects, 4.8 GiB
usage: 20 GiB used, 6.5 TiB / 6.5 TiB avail
pgs: 33 active+clean

io:
client: 1.3 MiB/s wr, 0 op/s rd, 83 op/s wr

#exit the rook-ceph-tools container
[root@rook-ceph-tools-7cc7fd5755-c44p9 /]# exit
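A script that gates on the OSD health shown above can pull the counts out of the `osd:` line of `ceph status`. A minimal sketch (the sample line is hard-coded to mirror the output above; in a real script it would come from `ceph status | grep 'osd:'` inside the toolbox):

```shell
#!/bin/sh
# Sketch: extract total/up OSD counts from the "osd:" line of `ceph status`
# and compare them. The line below is copied from the output shown above.
osd_line="osd: 6 osds: 6 up (since 19h), 6 in (since 19h)"
total=$(printf '%s\n' "$osd_line" | sed -n 's/^osd: \([0-9][0-9]*\) osds.*/\1/p')
up=$(printf '%s\n' "$osd_line" | sed -n 's/.* \([0-9][0-9]*\) up .*/\1/p')
if [ "$total" = "$up" ]; then
  echo "all $total OSDs are up"
else
  echo "only $up of $total OSDs are up" >&2
fi
```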

Create a StorageClass

[root@d-paas-k8s-master-0 rbd]# kubectl apply -f /root/rook-1.4.4/cluster/examples/kubernetes/ceph/csi/rbd/storageclass.yaml
#make it the default StorageClass
[root@d-paas-k8s-master-0 rbd]# kubectl patch storageclass rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

The StorageClass used here is rbd, i.e. block storage. Ceph also provides a file-storage StorageClass: /root/rook-1.4.4/cluster/examples/kubernetes/ceph/csi/cephfs/storageclass.yaml

rbd performs better than cephfs, but it does not support multi-mount: each PVC can be mounted by only one pod.

cephfs performs well when reading and writing large files and poorly on small ones, but it does support multi-mount, so when several pods need to share storage, choose cephfs as the StorageClass.
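The multi-mount difference surfaces in the PVC's accessModes: an rbd volume is requested as ReadWriteOnce, while a shared cephfs volume is requested as ReadWriteMany. A sketch for comparison (the PVC names and sizes are illustrative; the StorageClass names match the Rook 1.4 examples):

```yaml
# rbd: mountable by one pod at a time
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-pvc            # illustrative name
spec:
  accessModes:
    - ReadWriteOnce          # rbd: single-node mount only
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-ceph-block
---
# cephfs: shareable across pods
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-pvc           # illustrative name
spec:
  accessModes:
    - ReadWriteMany          # cephfs: many pods may mount it
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-cephfs
```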

Create a PVC to verify CSI

[root@d-paas-k8s-master-0 rbd]# kubectl apply -f /root/rook-1.4.4/cluster/examples/kubernetes/ceph/csi/rbd/pvc.yaml
persistentvolumeclaim/rbd-pvc created
# check that a PVC named rbd-pvc was created in the default namespace with status Bound
[root@d-paas-k8s-master-0 rbd]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
rbd-pvc Bound pvc-5f1b5ce4-9d5b-40fb-b08d-5d37de576ea2 1Gi RWO rook-ceph-block 4s

# verification done, delete rbd-pvc
[root@d-paas-k8s-master-0 rbd]# kubectl delete pvc rbd-pvc

Pitfalls encountered

After deploying Rook and the Ceph cluster, I tried to create a PVC and found it stuck in Pending; `kubectl describe pvc` showed it was waiting for CSI to create the PV. Checking the pods under rook-ceph, no CSI pods (csi-xxxx) had been created at all. The CSI pods are created by the operator when the Ceph cluster is created, so check the rook-operator log:

[root@d-paas-k8s-master-0 ~]# kubectl logs rook-ceph-operator-667756ddb6-rjr9v -nrook-ceph
##the log contains errors like the following
E | ceph-csi: invalid csi version. failed to run CmdReporter rook-ceph-csi-detect-version successfully. failed to delete existing results ConfigMap rook-ceph-csi-detect-version. failed to delete ConfigMap rook-ceph-csi-detect-version. etcdserver: request timed out
failed to complete ceph CSI version job

The cause: when deploying CSI, the operator first starts a rook-ceph-csi-detect-version job that interacts with etcd (what exactly is exchanged is still unclear to me), and CSI is only created after this job completes. The job uses one of the quay.io images mentioned above, so the image could not be pulled, the job could not finish for a long time and eventually timed out, and the operator never created CSI.

The fix is the approach described at the start of this section: load all the images CSI needs onto every machine in advance.

Since rook-ceph-csi-detect-version interacts with etcd, a problem on either side prevents CSI from being created. So if etcd becomes unstable during this window, e.g. a leader re-election makes it temporarily unable to serve requests, CSI creation is affected as well. See https://github.com/rook/rook/issues/6291

If a node has no raw disk attached, the following pods fail, which blocks the later StorageClass creation:

NAME                                                   READY   STATUS             RESTARTS   AGE
csi-cephfsplugin-6865x 3/3 Running 0 6m53s
csi-cephfsplugin-7hdg7 3/3 Running 0 6m53s
csi-cephfsplugin-pcmnh 3/3 Running 0 6m53s
csi-cephfsplugin-provisioner-598854d87f-5h79c 6/6 Running 0 6m53s
csi-cephfsplugin-provisioner-598854d87f-9r6z2 6/6 Running 0 6m53s
csi-rbdplugin-9fkkp 3/3 Running 0 6m54s
csi-rbdplugin-fjc86 3/3 Running 0 6m54s
csi-rbdplugin-provisioner-dbc67ffdc-qsvs5 6/6 Running 0 6m54s
csi-rbdplugin-provisioner-dbc67ffdc-zrthq 6/6 Running 0 6m54s
csi-rbdplugin-x4gl4 3/3 Running 0 6m54s
rook-ceph-crashcollector-t-docker02-659696b779-sjjtf 1/1 Running 0 5m55s
rook-ceph-crashcollector-t-docker03-5856b9458-4rrm2 1/1 Running 0 5m5s
rook-ceph-crashcollector-t-docker04-ff475547f-6mrvr 1/1 Running 0 4m27s
rook-ceph-mgr-a-74d7d89b9-2bz72 1/1 Running 0 4m27s
rook-ceph-mon-a-9d56b548-zgfqv 1/1 Running 0 5m55s
rook-ceph-mon-b-d5f999ffb-64bs5 1/1 Running 0 5m49s
rook-ceph-mon-c-56856c4cff-knb8g 1/1 Running 0 5m5s
rook-ceph-operator-667756ddb6-nhbkk 1/1 Running 0 12m
rook-ceph-osd-prepare-t-docker02-5rpfp 0/1 CrashLoopBackOff 5 4m26s
rook-ceph-osd-prepare-t-docker03-kd5jt 0/1 CrashLoopBackOff 5 4m26s
rook-ceph-osd-prepare-t-docker04-qg9kx 0/1 CrashLoopBackOff 5 4m25s
rook-discover-kwwgq 1/1 Running 0 12m
rook-discover-m4fb2 1/1 Running 0 12m
rook-discover-rnpc5 1/1 Running 0 12m

If OSD creation fails, consider reinstalling Rook or re-attaching the raw disks; for the cleanup procedure see: https://rook.io/docs/rook/v1.4/ceph-teardown.html
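The per-host cleanup in the linked teardown doc boils down to removing Rook's data directory and wiping the OSD disks. A destructive sketch (DISK is a hypothetical device name; with DRY_RUN=1, the default, the commands are only printed so nothing is destroyed by accident):

```shell
#!/bin/sh
# Sketch of the per-host cleanup from the Rook teardown doc linked above.
# DISK is a hypothetical device -- double-check the name before running.
# These commands destroy data, so DRY_RUN=1 (default) only prints them.
DISK=${DISK:-/dev/sdb}
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$@"; else "$@"; fi
}
run rm -rf /var/lib/rook                          # Rook's on-host state
run sgdisk --zap-all "$DISK"                      # wipe partition tables
run dd if=/dev/zero of="$DISK" bs=1M count=100    # clear leftover Ceph metadata
```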