0174.K Installing the Kuboard Monitoring Suite

rundba 2022-06-01

 

Once a K8S cluster is managed with Kuboard, the cluster can be monitored by configuring the monitoring suite in Kuboard.

 

 

0. ENV

0.1 Software versions

Kubernetes 1.23.5

Kuboard 3.5.0.1

Resource-layer monitoring suite system-monitor.addons.kuboard.cn v3.1.7


0.2 Storage summary

NFS is used to store the Prometheus time-series data, so the StorageClass, PVs, and PVCs that Kuboard depends on must be created. In brief:


 

 

 

Type           Name                                  NFS path
StorageClass   kuboard-kube-prometheus               N/A
PV             kuboard-kube-prometheus0              /kuboard_pv/prometheus-k8s-db-prometheus-k8s-0
PV             kuboard-kube-prometheus1              /kuboard_pv/prometheus-k8s-db-prometheus-k8s-1
PVC            prometheus-k8s-db-prometheus-k8s-0    N/A
PVC            prometheus-k8s-db-prometheus-k8s-1    N/A


 

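1. Create the StorageClass [required]

1) Define the StorageClass, PVs, and PVCs that Kuboard depends on

The StorageClass created here is used later.

The StorageClass and the PVs can have any name, but the two PVCs must be named prometheus-k8s-db-prometheus-k8s-0 and prometheus-k8s-db-prometheus-k8s-1.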

    vim kuboard-kube-prometheusV3.yaml     # edit the manifest; contents below

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: kuboard-kube-prometheus   # StorageClass name, used later
    provisioner: kubernetes.io/no-provisioner
    reclaimPolicy: Retain
    volumeBindingMode: Immediate
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: kuboard-kube-prometheus0
    spec:
      accessModes:
        - ReadWriteMany
      capacity:
        storage: 40Gi
      nfs:
        path: /kuboard_pv/prometheus-k8s-db-prometheus-k8s-0   # site-specific; create this directory on the NFS server beforehand
        server: 192.18.80.159   # NFS server address
      persistentVolumeReclaimPolicy: Retain
      storageClassName: kuboard-kube-prometheus
      volumeMode: Filesystem
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: kuboard-kube-prometheus1
    spec:
      accessModes:
        - ReadWriteMany
      capacity:
        storage: 40Gi
      nfs:
        path: /kuboard_pv/prometheus-k8s-db-prometheus-k8s-1   # site-specific; create this directory on the NFS server beforehand
        server: 192.18.80.159   # NFS server address
      persistentVolumeReclaimPolicy: Retain
      storageClassName: kuboard-kube-prometheus
      volumeMode: Filesystem
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      labels:
        name: prometheus-k8s-db-prometheus-k8s-0
      name: prometheus-k8s-db-prometheus-k8s-0
      namespace: kuboard
    spec:
      accessModes:   # access mode
        - ReadWriteMany
      resources:     # requested capacity
        requests:
          storage: 40Gi
      storageClassName: kuboard-kube-prometheus
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      labels:
        name: prometheus-k8s-db-prometheus-k8s-1
      name: prometheus-k8s-db-prometheus-k8s-1
      namespace: kuboard
    spec:
      accessModes:   # access mode
        - ReadWriteMany
      resources:     # requested capacity
        requests:
          storage: 40Gi
      storageClassName: kuboard-kube-prometheus

The file contains five documents: 1 StorageClass, 2 PVs, and 2 PVCs.
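The two PersistentVolume documents differ only in their index, so they could be generated rather than typed twice. The following is a minimal sketch, not part of Kuboard: the function name render_pv and the variables NFS_SERVER and NFS_BASE are illustrative assumptions, with defaults taken from the manifest above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the PersistentVolume manifest for replica index $1.
# NFS_SERVER and NFS_BASE are assumptions; adjust them to your environment.
NFS_SERVER="${NFS_SERVER:-192.18.80.159}"
NFS_BASE="${NFS_BASE:-/kuboard_pv}"

render_pv() {
  local idx="$1"
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: kuboard-kube-prometheus${idx}
spec:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 40Gi
  nfs:
    path: ${NFS_BASE}/prometheus-k8s-db-prometheus-k8s-${idx}
    server: ${NFS_SERVER}
  persistentVolumeReclaimPolicy: Retain
  storageClassName: kuboard-kube-prometheus
  volumeMode: Filesystem
EOF
}

# Example: print both PV documents separated by '---'.
# { render_pv 0; echo '---'; render_pv 1; }
```

The generated text can be checked without changing the cluster by piping it to kubectl apply --dry-run=client -f -.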


2) Prepare the directories

Create the directories on the NFS server:

    mkdir -p /kuboard_pv/prometheus-k8s-db-prometheus-k8s-0
    mkdir -p /kuboard_pv/prometheus-k8s-db-prometheus-k8s-1

Change the permissions:

    chmod -R 777 /kuboard_pv/prometheus-k8s-db-prometheus-k8s-0
    chmod -R 777 /kuboard_pv/prometheus-k8s-db-prometheus-k8s-1

If the directories are owned by root, make sure their permissions are 777, even though this is relatively insecure. Otherwise the pods created later fail with: msg="Error opening query log file" file=/prometheus/queries.active err="open /prometheus/queries.active: permission denied" panic: Unable to create mmap-ed active query log. By default, NFS directories are owned by nobody.

Confirm that the permissions are 777:

    [root@rh-master01 ~]# ll /kuboard_pv/
    total 8
    drwxrwxrwx 3 root root 4096 May 26 18:20 prometheus-k8s-db-prometheus-k8s-0
    drwxrwxrwx 3 root root 4096 May 26 18:20 prometheus-k8s-db-prometheus-k8s-1
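The directory preparation can be wrapped in a small function so that both directories are handled the same way. This is a sketch only; the name prepare_dirs is hypothetical, and the base directory is passed as an argument so the function can be tried in a scratch location first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the two Prometheus data directories under the
# base directory given as $1 and open their permissions to mode 777.
prepare_dirs() {
  local base="$1" i dir
  for i in 0 1; do
    dir="${base}/prometheus-k8s-db-prometheus-k8s-${i}"
    mkdir -p "$dir" || return 1
    chmod -R 777 "$dir" || return 1   # same mode as in the article; not very secure
  done
}

# Example (run on the NFS server):
# prepare_dirs /kuboard_pv
```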



3) Create the StorageClass, PVs, and PVCs

    [root@rh-master01 kuboard]# kubectl apply -f kuboard-kube-prometheusV3.yaml
    storageclass.storage.k8s.io/kuboard-kube-prometheus created
    persistentvolume/kuboard-kube-prometheus0 created
    persistentvolume/kuboard-kube-prometheus1 created
    persistentvolumeclaim/prometheus-k8s-db-prometheus-k8s-0 created
    persistentvolumeclaim/prometheus-k8s-db-prometheus-k8s-1 created


4) Check the created storage resources

Check the StorageClass:

    [root@rh-master01 ~]# kubectl get sc | grep prometheus
    dev-em-dtbase-test-prometheus kubernetes.io/no-provisioner Delete Immediate false 12d
    kuboard-kube-prometheus kubernetes.io/no-provisioner Retain Immediate false 68m

Check the PVs:

    [root@rh-master01 ~]# kubectl get pv | grep prometheus
    kuboard-kube-prometheus0 40Gi RWX Retain Bound kuboard/prometheus-k8s-db-prometheus-k8s-0 kuboard-kube-prometheus 69m
    kuboard-kube-prometheus1 40Gi RWX Retain Bound kuboard/prometheus-k8s-db-prometheus-k8s-1 kuboard-kube-prometheus 69m
    nfs-pv-kuboard-prometheus 40Gi RWX Retain Terminating kube-system/nfs-pvc-kuboard-prometheus nfs-storageclass-provisioner 45h
    test-em-dtbase-test-prometheus0 10Gi RWO Retain Bound em/prometheus-data-dtbase-prometheus-master-0 dev-em-dtbase-test-prometheus 12d

Check the PVCs; STATUS should be Bound:

    [root@rh-master01 ~]# kubectl get pvc -n kuboard
    NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
    prometheus-k8s-db-prometheus-k8s-0 Bound kuboard-kube-prometheus0 40Gi RWX kuboard-kube-prometheus 70m
    prometheus-k8s-db-prometheus-k8s-1 Bound kuboard-kube-prometheus1 40Gi RWX kuboard-kube-prometheus 70m
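Instead of reading the STATUS column by eye, the check can be scripted. The sketch below is an illustration under stated assumptions: the function name all_pvcs_bound is hypothetical, and it expects the output of kubectl get pvc --no-headers on standard input, where the second column is STATUS.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get pvc --no-headers` output on stdin.
# Succeeds only if at least one PVC is listed and every STATUS (column 2) is Bound.
all_pvcs_bound() {
  awk 'NF { n++ }
       NF && $2 != "Bound" { bad = 1 }
       END { exit (n == 0 || bad) }'
}

# Example:
# kubectl get pvc -n kuboard --no-headers | all_pvcs_bound && echo "all PVCs are Bound"
```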


                   

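2. Install the suite from YAML

Cluster Import > Suites > Suite Repository > find the suite

Click View.

Click "Offline install", wait a moment, then click to copy the displayed content.

Cluster Import > Suites > Create from YAML

Paste the content just copied and click OK.

Follow the prompt and click Apply.

Click OK.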


                   

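3. Monitoring suite installation (① Suite parameters)

Cluster Import > Suites > Installed Suites > Details

The status is now LOADED. The wizard has four steps; start with the first one, suite parameter configuration.

On the suite parameters page, enter the StorageClass created earlier, kuboard-kube-prometheus, in the PROMETHEUS_STORAGE_CLASS field.

The check items below are all normal except the curl access check, which can be ignored. It can also be fixed as described below.

To test access with curl:

    curl -ik https://<master-node-ip>:10257
    curl -ik https://<master-node-ip>:10259

Test ports 10257 and 10259 on every master. Taking master01 as an example, access currently fails:

    [root@rh-master01 kuboard]# curl -ik https://192.18.80.135:10257
    curl: (7) Failed connect to 192.18.80.135:10257; Connection refused
    [root@rh-master01 kuboard]# curl -ik https://192.18.80.135:10259
    curl: (7) Failed connect to 192.18.80.135:10259; Connection refused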


Check the kube-controller-manager.yaml configuration:

    grep bind-address=127.0.0.1 /etc/kubernetes/manifests/kube-controller-manager.yaml

Change bind-address=127.0.0.1 to bind-address=0.0.0.0:

    sed -i 's/bind-address=127.0.0.1/bind-address=0.0.0.0/g' /etc/kubernetes/manifests/kube-controller-manager.yaml
    vim /etc/kubernetes/manifests/kube-controller-manager.yaml     # confirm the value is now 0.0.0.0

Check the kube-scheduler.yaml configuration:

    grep bind-address=127.0.0.1 /etc/kubernetes/manifests/kube-scheduler.yaml

Change bind-address=127.0.0.1 to bind-address=0.0.0.0:

    sed -i 's/bind-address=127.0.0.1/bind-address=0.0.0.0/g' /etc/kubernetes/manifests/kube-scheduler.yaml
    vim /etc/kubernetes/manifests/kube-scheduler.yaml     # confirm the value is now 0.0.0.0
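Since the same substitution is applied to two static pod manifests on every master, it may be worth keeping a backup of each file first. The sketch below shows one way to do that. The function name open_bind_address and the backup location are assumptions, and the backup is deliberately written outside the manifests directory because the kubelet may treat any file in that directory as a pod manifest.

```shell
#!/usr/bin/env bash
# Hypothetical helper: switch bind-address from 127.0.0.1 to 0.0.0.0 in one
# static pod manifest ($1), saving a backup copy in a separate directory ($2) first.
open_bind_address() {
  local manifest="$1" backup_dir="$2"
  mkdir -p "$backup_dir" || return 1
  cp -p "$manifest" "$backup_dir/$(basename "$manifest").bak" || return 1
  sed -i 's/bind-address=127\.0\.0\.1/bind-address=0.0.0.0/' "$manifest"
}

# Example (the backup directory is an arbitrary choice):
# for name in kube-controller-manager kube-scheduler; do
#   open_bind_address "/etc/kubernetes/manifests/${name}.yaml" /root/manifest-backup
# done
```

After the manifest changes, the kubelet recreates the corresponding static pod on its own, so no manual restart should be needed.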



Reference output when port 10257 is working (a 403 Forbidden response for the anonymous user means the port is reachable):

    [root@rh-master01 ~]# curl -ik https://192.18.80.135:10257
    HTTP/1.1 403 Forbidden
    Cache-Control: no-cache, private
    Content-Type: application/json
    X-Content-Type-Options: nosniff
    Date: Wed, 25 May 2022 00:53:09 GMT
    Content-Length: 217

    {
      "kind": "Status",
      "apiVersion": "v1",
      "metadata": {},
      "status": "Failure",
      "message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
      "reason": "Forbidden",
      "details": {},
      "code": 403
    }
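Checking both ports on several masters by hand is repetitive, so the probe can be scripted. This is a rough sketch: the function names port_state and probe are hypothetical, and it relies on curl printing 000 as the HTTP code when no HTTP response was received, for example when the connection is refused.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify the HTTP status code that curl reports.
# Any real HTTP status (including 403) means the port answered.
port_state() {
  case "$1" in
    000|"") echo "unreachable" ;;
    *)      echo "reachable" ;;
  esac
}

# Probe https://$1:$2 and print "<host>:<port> <state>".
probe() {
  local code
  code="$(curl -sk -o /dev/null -m 5 -w '%{http_code}' "https://$1:$2" || true)"
  echo "$1:$2 $(port_state "$code")"
}

# Example, using the master01 address from above:
# for port in 10257 10259; do probe 192.18.80.135 "$port"; done
```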


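The URL can also be opened in a browser.

Once everything is ready, tick "Confirm completed" and click Save.

Click Apply.

Click OK to proceed to step ②.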



                                 

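4. Monitoring suite installation (② Installation script)

4.1 Pre-installation

Click "Pre-install".

Then click "Next" as many times as needed.

Click OK.

Once validation has passed, click Apply.

Click OK.

When the pre-check is complete, click OK to move on to the installation.

4.2 Installation

Click "Install".

Click "Next" on each of the pages that follow.

Click OK. To adjust the replica count, you can use "Reset replica count to 1".

Click Apply.

Click OK.

When the installation is complete, click OK to proceed to step ③.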

                                 


5. Monitoring suite installation (③ Initialization)

After a few minutes, once all the Kuboard pods are ready, click OK.
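Pod readiness can also be checked from the command line while waiting. The following sketch rests on assumptions: the function name list_unready_pods is hypothetical, the input is the output of kubectl get pods --no-headers (column 2 is READY as ready/total, column 3 is STATUS), and the namespace kuboard is inferred from the namespace of the PVCs created earlier.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin,
# print the names of pods that are not fully ready, and fail if any exist.
# Pods in the Completed state are skipped.
list_unready_pods() {
  awk 'NF {
         split($2, r, "/")
         if ($3 != "Completed" && (r[1] != r[2] || $3 != "Running")) {
           print $1
           bad = 1
         }
       }
       END { exit bad + 0 }'
}

# Example:
# kubectl get pods -n kuboard --no-headers | list_unready_pods || echo "some pods are not ready yet"
```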



The suite status is now INSTALLED. Click "Run initialization".

Click OK to proceed to step ④.


                                 


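6. Monitoring suite installation (④ Extensions)

When the message "When you see this page, you have activated this suite..." appears, the monitoring suite has been installed.

Under Cluster Import > Suites, the resource-layer monitoring suite now shows "Installed".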


                                 


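7. View the monitoring

Cluster Management > Resource-Layer Monitoring Suite > click any of the 7 extensions to view it.

Clicking resource monitoring opens the corresponding monitoring view.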


                                 

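8. Uninstall the monitoring suite

To uninstall, click "Delete suite". Alternatively, click "Disable suite" to temporarily disable all suites.

Click OK.

Click Uninstall.

Then follow the prompts to complete the uninstallation.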


                                 


9. Enable the suite

After the suite has been disabled, click "Enable suite" in the same place to re-enable all monitoring suites.


                                 


10. Follow-up work: alert configuration (omitted)

Alerting can be configured afterwards. It is skipped here; to be continued.



                                 

11. References

    https://kuboard.cn/learning/k8s-advanced/ts/application.html#debugging-pods
    https://kuboard.cn/guide/addon/#%E6%A6%82%E8%BF%B0
    https://kuboard.cn/learning/k8s-advanced/observe/monitor.html#%E5%89%8D%E6%8F%90
    https://kuboard.cn/learning/k8s-advanced/observe/alert.html






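-- End --

More to come; stay tuned.

Corrections for any shortcomings are welcome.

Author: Wang Kun, WeChat official account: rundba. Reposting is welcome; please credit the source.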
