
0189.S Deploying and Managing CN on Kubernetes with StarRocks Operator

rundba 2022-11-11

 

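Starting with StarRocks V2.4.0, separation of compute and storage is available as an optional feature. It introduces a new component, the CN (Compute Node), which can take over part of the SQL computation. A CN is a stateless compute service that supports containerized deployment on Kubernetes and elastic scaling, which suits analytical workloads that consume large amounts of compute, such as data lake analytics.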

 


 

0. ENV



 

CentOS 7.6;
golang 1.19.3;
docker 20.10.12;
GNU Make 3.82;
gcc version 4.8.5;
jdk-8u144-linux-x64.tar.gz;
StarRocks-2.4.0-test-1104.tar.gz;
starrocks-kubernetes-operator-computenodegroup.zip.



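Notes:

This article uses StarRocks Operator to deploy CN on Kubernetes and scale it elastically;

CN is a new component that works together with the existing FE and BE, so FE and BE must be deployed in advance. Highly available FE + highly available BE + CN on K8s is a fairly well-optimized setup;

The CN architecture was first used internally at Tencent and is currently in public beta in StarRocks (SR);

All steps below are performed as the root user.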


0.1 Install git

    [CentOS]
    yum -y install git

    [Ubuntu]
    sudo apt-get install git

    git website:
    https://git-scm.com


0.2 Install golang

[CentOS]

① Download

    wget https://dl.google.com/go/go1.19.3.linux-amd64.tar.gz

② Extract

    rm -rf /usr/local/go && tar -C /usr/local -xzf go1.19.3.linux-amd64.tar.gz

③ Set the golang environment variables

    cat >> /etc/profile <<-'EOF'
    ### go env add ###
    export GOPROXY=https://goproxy.cn
    export GOROOT=/usr/local/go
    export PATH=$PATH:$GOROOT/bin
    export GOPATH=/opt/go
    export PATH=$PATH:$GOPATH/bin
    ### go env end ###
    EOF

Also create the GOPATH directory:

    mkdir -p /opt/go

Notes:

GOPROXY points to a proxy inside China; without it, module downloads can fail.

GOROOT is the golang installation directory.

GOPATH is the golang workspace directory.

PATH must include $GOROOT/bin and $GOPATH/bin.

④ Apply the environment variables

    source /etc/profile

⑤ Check the version

    [root@rundba ~]# go version
    go version go1.19.3 linux/amd64

[Ubuntu]

    sudo apt-get install golang

golang website:

    https://golang.google.cn/dl/
    https://golang.google.cn/doc/install
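If `go` or `controller-gen` is later reported as "not found", the usual cause is that one of the two bin directories never made it into PATH. The following is a minimal sketch for checking that, not part of the original procedure; the helper name `path_has_dir` is made up for illustration, and the two directories are the GOROOT and GOPATH values used in this section.

```shell
#!/bin/sh
# path_has_dir DIR [PATHVALUE]: succeed if DIR is one of the entries in the
# colon-separated PATHVALUE (defaults to the current $PATH).
# Hypothetical helper, not part of Go or StarRocks.
path_has_dir() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the two directories added to /etc/profile in step ③.
for dir in /usr/local/go/bin /opt/go/bin; do
    if path_has_dir "$dir"; then
        echo "ok: $dir is on PATH"
    else
        echo "missing: $dir is not on PATH"
    fi
done
```

If either directory is reported missing, re-run `source /etc/profile` in the current shell or log in again.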


0.3 Install docker

[CentOS]

    # Use the Aliyun docker mirror
    wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo

    # Install the latest version
    yum -y install docker-ce

    # Start the docker service and enable it at boot
    systemctl restart docker
    systemctl enable docker

[Ubuntu]

    sudo apt install docker.io


0.4 Install system packages

    yum -y install gcc-c++ libstdc++-static ant cmake byacc flex automake libtool binutils-devel bison ncurses-devel make unzip wget vim
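Before moving on to the image builds, it can help to confirm that the tools from section 0 are actually installed. The sketch below is an illustration only; the helper name `missing_tools` is an assumption, and the tool list is simply the commands this article calls later.

```shell
#!/bin/sh
# missing_tools NAME...: print every NAME that is not an executable command
# on PATH, one per line. Prints nothing when everything is installed.
# Hypothetical helper for this article.
missing_tools() {
    for tool in "$@"; do
        # command -v is the POSIX way to test whether a command exists.
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Commands used in the following sections.
missing_tools git go docker make gcc unzip wget
```

Any name printed by the last line still needs to be installed before running `make docker`.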


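0.5 Deploy StarRocks

StarRocks can be deployed in any of the following ways:

Manually, from the binary package;

By compiling from source;

From a Docker image (test environments);

With StarRocks Manager;

With StarGo.

For installation details, see the official documentation:

https://docs.starrocks.io/zh-cn/latest/quick_start/Deploy

For a quick test installation, see the section "8. Pull directly (reference)" of the earlier article "0188.S Deploying a StarRocks V2.4.0 Test Environment with Docker", which brings up FE and BE with two commands.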



                           

1. Build the StarRocks Operator image

1.1 Download the StarRocks Operator code and save it to a directory

Create the download directory:

    mkdir -p /soft/SR && cd /soft/SR

In a browser, open the computenodegroup branch of starrocks-kubernetes-operator:

    https://github.com/StarRocks/starrocks-kubernetes-operator/tree/computenodegroup

Choose Code -> Download ZIP to download the archive starrocks-kubernetes-operator-computenodegroup.zip.

Or download it directly with wget:

    wget -O starrocks-kubernetes-operator-computenodegroup.zip https://github.com/StarRocks/starrocks-kubernetes-operator/archive/refs/heads/computenodegroup.zip
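The wget URL above follows GitHub's branch-archive pattern, so the same command works for another branch by changing one path segment. A small sketch of that pattern, assuming only the URL layout visible in the command above; the function name `branch_zip_url` is made up for illustration.

```shell
#!/bin/sh
# branch_zip_url OWNER REPO BRANCH: print the ZIP archive URL of a GitHub
# branch, using the same layout as the wget command above.
# Hypothetical helper for illustration.
branch_zip_url() {
    printf 'https://github.com/%s/%s/archive/refs/heads/%s.zip\n' "$1" "$2" "$3"
}

# Prints the URL used in this section.
branch_zip_url StarRocks starrocks-kubernetes-operator computenodegroup
```

The result can be passed straight to wget, for example `wget -O operator.zip "$(branch_zip_url StarRocks starrocks-kubernetes-operator computenodegroup)"`.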


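Note:

The official instructions use git clone directly:

    git clone https://github.com/StarRocks/starrocks-kubernetes-operator

That checks out the main branch, which produced many errors in the later steps, so this article uses the computenodegroup branch above, provided by the StarRocks team.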


1.2 Enter the StarRocks Operator code directory

    unzip starrocks-kubernetes-operator-computenodegroup.zip
    cd starrocks-kubernetes-operator-computenodegroup


1.3 Build the StarRocks Operator image

To distinguish this image from the one built from the branch the StarRocks team first provided, the image name is changed in the steps below. Original name:

    make docker IMG="starrocks-kubernetes-operator:v1.0"

New name:

    [root@rundba starrocks-kubernetes-operator-computenodegroup]# make docker IMG="starrocks-kubernetes-operator-computenodegroup:v1.0"
    /opt/go/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
    go fmt ./...
    api/v1alpha1/computenodegroup_types.go
    api/v1alpha1/groupversion_info.go
    common/common.go
    components/offline_job/main.go
    components/register_container/main.go
    controllers/state.go
    controllers/predicates/predicates.go
    controllers/spec/cronjob.go
    controllers/spec/deployment.go
    controllers/spec/hpa.go
    controllers/spec/rbac.go
    controllers/utils/reconcile_ctrl.go
    internal/fe/register.go
    go vet ./...
    go: downloading github.com/DATA-DOG/go-sqlmock v1.5.0
    go: downloading github.com/agiledragon/gomonkey/v2 v2.8.0
    go: downloading github.com/onsi/ginkgo v1.16.5
    go: downloading github.com/onsi/gomega v1.17.0
    go: downloading github.com/nxadm/tail v1.4.8
    go: downloading gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7
    GOOS=linux GOARCH=amd64 go build -o bin/manager main.go
    docker build --rm --no-cache -f Dockerfile -t "starrocks-kubernetes-operator-computenodegroup:v1.0"  .
    Sending build context to Docker daemon  48.54MB
    Step 1/3 : FROM centos:7
    7: Pulling from library/centos
    Digest: sha256:c73f515d06b0fa07bb18d8202035e739a494ce760aa73129f60f4bf2bd22b407
    Status: Downloaded newer image for centos:7
     ---> eeb6ee3f44bd
    Step 2/3 : ADD bin/manager manager
     ---> 28a113f64b0c
    Step 3/3 : CMD ["/manager"]
     ---> Running in caea6f723fe2
    Removing intermediate container caea6f723fe2
     ---> 86ea1e20551b
    Successfully built 86ea1e20551b
    Successfully tagged starrocks-kubernetes-operator-computenodegroup:v1.0


If you see an error like this:

    go: sigs.k8s.io/controller-tools/cmd/controller-gen@v0.8.0: Get "https://proxy.golang.org/sigs.k8s.io/controller-tools/cmd/controller-gen/@v/v0.8.0.info": dial tcp 172.217.160.113:443: i/o timeout

Fix: point go at the China-based proxy:

    go env -w GOPROXY=https://goproxy.cn

Then run the make docker command again.

If you see an error like this:

    bash: /bin/controller-gen: No such file or directory

Fix:

    go install sigs.k8s.io/controller-tools/cmd/controller-gen@v0.3.0

    vim Makefile      # edit line 102: replace $(shell pwd) with $(GOPATH), otherwise the controller-gen command cannot be found

    102 CONTROLLER_GEN = $(GOPATH)/bin/controller-gen        #default value $(shell pwd)/bin/controller-gen


1.4 Tag and upload the image

The image can be uploaded to Docker Hub or to a private registry.

1.4.1 Check the image name

    [root@rundba starrocks-kubernetes-operator]# docker images | grep operator
    REPOSITORY                      TAG       IMAGE ID       CREATED              SIZE
    starrocks-kubernetes-operator   v1.0      86ea1e20551b   About a minute ago   252MB
    httpd                           latest    b9bd7e513e0f   7 months ago         144MB
    centos                          7         eeb6ee3f44bd   13 months ago        204MB
    apache/kudu                     latest    71f1401ee27a   16 months ago        209MB

1.4.2 Upload to Docker Hub

1) docker login

Run docker login and enter your account and password as prompted to log in to the remote registry, Docker Hub.

    [root@rundba starrocks-kubernetes-operator]# cat ./hub_password.txt | docker login --username landnow --password-stdin
    WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
    Configure a credential helper to remove this warning. See
    https://docs.docker.com/engine/reference/commandline/login/#credentials-store

    Login Succeeded

Note: you need to register a Docker Hub account and create a Docker repository beforehand.

2) Tag the StarRocks Operator image

docker tag $operator_image_id $account/repo:tag, for example:

    docker tag 86ea1e20551b landnow/starrocks-kubernetes-operator-computenodegroup:v1.0

After tagging, list the image tags on the current host:

    [root@rundba starrocks-kubernetes-operator]# docker images
    REPOSITORY                              TAG       IMAGE ID       CREATED         SIZE
    landnow/starrocks-kubernetes-operator   v1.0      86ea1e20551b   2 minutes ago   252MB
    starrocks-kubernetes-operator           v1.0      86ea1e20551b   2 minutes ago   252MB
    httpd                                   latest    b9bd7e513e0f   7 months ago    144MB
    centos                                  7         eeb6ee3f44bd   13 months ago   204MB
    apache/kudu                             latest    71f1401ee27a   16 months ago   209MB

3) Push to the remote Docker repository

docker push $account/repo:tag, for example:

    root@ubd-2004:~/starrocks-kubernetes-operator# docker push landnow/starrocks-kubernetes-operator-computenodegroup:v1.0
    The push refers to repository [docker.io/landnow/starrocks-kubernetes-operator]
    28d768b4546a: Pushed
    174f56854903: Mounted from library/centos
    v1.0: digest: sha256:dcc4662d9db99105be07e10675667977d9e50caf7335c73299563d77150a8c87 size: 741

Notes:

$operator_image_id: the StarRocks Operator image ID; run docker images to look it up.

$account/repo:tag: the StarRocks Operator image tag, for example starrocks/sr-cn-test:operator. Here account is the account registered on Docker Hub, repo is the Docker image repository, and tag is a custom label for the StarRocks Operator image.
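The naming rule described above can be written down as a small function. This is a sketch under the assumption that a registry host, when one is used, is simply prefixed to account/repo:tag; the function name `image_ref` is hypothetical and is not a docker command.

```shell
#!/bin/sh
# image_ref ACCOUNT REPO TAG [REGISTRY]: print a full image reference.
# Without REGISTRY the reference has the Docker Hub form account/repo:tag;
# with REGISTRY the registry host is prefixed.
# Hypothetical helper for illustration.
image_ref() {
    if [ -n "${4:-}" ]; then
        printf '%s/%s/%s:%s\n' "$4" "$1" "$2" "$3"
    else
        printf '%s/%s:%s\n' "$1" "$2" "$3"
    fi
}

# Docker Hub form, as used in step 2) above.
image_ref landnow starrocks-kubernetes-operator-computenodegroup v1.0
# Private-registry form, with the registry host as the fourth argument.
image_ref starrocks starrocks-kubernetes-operator-computenodegroup v1.0 reg.rundba.com
```

The printed reference is what goes after the image ID in `docker tag` and as the sole argument of `docker push`.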


1.4.3 Upload to a private registry

1) docker login

    [root@rundba ~]# docker login https://reg.rundba.com -uwk
    Password:
    WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
    Configure a credential helper to remove this warning. See
    https://docs.docker.com/engine/reference/commandline/login/#credentials-store

    Login Succeeded

If login fails with an error like this:

    Error response from daemon: Get "https://reg.rundba.com/v2/": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0

Fix: in /etc/docker/daemon.json, add the remote registry to the list of insecure registries:

    "insecure-registries": ["https://reg.rundba.com"]

Complete configuration for reference:

    [root@reg docker]# cat daemon.json
    {
      "registry-mirrors": ["https://ul2pzi84.mirror.aliyuncs.com"],
      "insecure-registries": ["https://reg.rundba.com"]
    }

Reload the configuration and restart docker:

    systemctl daemon-reload && systemctl restart docker

2) Tag the StarRocks Operator image

    [root@rundba ~]# docker tag 86ea1e20551b reg.rundba.com/starrocks/starrocks-kubernetes-operator-computenodegroup:v1.0

3) Push to the private registry

    [root@rundba ~]# docker push reg.rundba.com/starrocks/starrocks-kubernetes-operator-computenodegroup:v1.0
    The push refers to repository [reg.rundba.com/starrocks/starrocks-kubernetes-operator-computenodegroup]
    1685408996db: Pushed
    174f56854903: Layer already exists
    v1.0: digest: sha256:8ccb576b3d96d988b7583c21ca7d7c3daeedcc1f54b5a413e360cc1fd04655cb size: 741



2. Build the CN image

The test build first provided by the StarRocks team could not complete the deployment. After discussion with the StarRocks team, a newer build was provided, so the CN image has to be rebuilt from it. As of 2022-11-07, the official documentation has not yet been updated for this build.

2.1 Download StarRocks 2.4

Download StarRocks-2.4.0-test-1104.tar.gz in a browser:

    http://cdn-release.starrocks.com/StarRocks-2.4.0-test-1104.tar.gz?Expires=1668769223&OSSAccessKeyId=LTAI4GFYjbX9e7QmFnAAvkt8&Signature=AWNyYuJR7w9fSpbjsO3KVe9ad10%3D

Or download it directly with wget:

    wget -SO /soft/SR/StarRocks-2.4.0-test-1104.tar.gz "http://cdn-release.starrocks.com/StarRocks-2.4.0-test-1104.tar.gz?Expires=1668996477&OSSAccessKeyId=LTAI4GFYjbX9e7QmFnAAvkt8&Signature=zUNVpBk3oqb4MZTcTnflyiI5XKo%3D"
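The download links above are signed URLs that carry an Expires parameter, so they stop working after a while. The sketch below checks a link before downloading. It assumes that Expires is a Unix timestamp in seconds, which is the usual convention for signed object-storage URLs but is not stated in the source; the helper names `url_expires` and `url_is_expired` are made up for illustration.

```shell
#!/bin/sh
# url_expires URL: print the value of the Expires query parameter,
# or nothing if the URL has none. Hypothetical helper.
url_expires() {
    printf '%s\n' "$1" | sed -n 's/.*[?&]Expires=\([0-9][0-9]*\).*/\1/p'
}

# url_is_expired URL [NOW]: succeed if Expires lies before NOW.
# NOW defaults to the current Unix time. Assumes Expires is epoch seconds.
url_is_expired() {
    expires=$(url_expires "$1")
    now=${2:-$(date +%s)}
    [ -n "$expires" ] && [ "$now" -gt "$expires" ]
}

url='http://cdn-release.starrocks.com/StarRocks-2.4.0-test-1104.tar.gz?Expires=1668996477&OSSAccessKeyId=LTAI4GFYjbX9e7QmFnAAvkt8&Signature=zUNVpBk3oqb4MZTcTnflyiI5XKo%3D'
if url_is_expired "$url"; then
    echo "link expired, request a fresh one"
else
    echo "link still valid"
fi
```

When the link has expired, a new signed URL has to be obtained from the StarRocks team; changing the Expires value by hand does not work because the Signature parameter would no longer match.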


2.2 Extract

    tar zxvf StarRocks-2.4.0-test-1104.tar.gz
    cd StarRocks-2.4.0-test-1104


2.3 Create the Dockerfile

    cat >> Dockerfile <<-'EOF'
    FROM centos:7
    ENV LANG='en_US.UTF-8' LANGUAGE='en_US:en' LC_ALL='en_US.UTF-8'
    RUN yum install -y tzdata openssl curl vim ca-certificates fontconfig gzip tar mysql java-11-openjdk
    COPY be /data/starrocks/be/
    ENV JAVA_HOME=/usr/lib/jvm/jre-11-openjdk/ \
        PATH="/usr/local/bin/jdk-11.0.16/bin:$PATH"
    ENV STARROCKS_ROOT=/data/starrocks
    USER root
    Workdir /data/starrocks
    EOF


2.4 Rebuild the CN image

The command below tags the image starrocks-cn:v1.0. Running docker build -f Dockerfile -t starrocks . instead would tag it starrocks:latest.

    [root@rundba StarRocks-2.4.0-test-1104]# docker build -f Dockerfile -t starrocks-cn:v1.0 .
    Sending build context to Docker daemon  2.214GB
    Step 1/8 : FROM centos:7
     ---> eeb6ee3f44bd
    Step 2/8 : ENV LANG='en_US.UTF-8' LANGUAGE='en_US:en' LC_ALL='en_US.UTF-8'
     ---> Using cache
     ---> c2c7a3e78234
    Step 3/8 : RUN yum install -y tzdata openssl curl vim ca-certificates fontconfig gzip tar mysql java-11-openjdk
     ---> Using cache
     ---> 19f7fd19811b
    Step 4/8 : COPY be /data/starrocks/be/
     ---> ca697895b0f2
    Removing intermediate container 44205762a913
     ---> 675b50c34047
    Step 4/8 : COPY be /data/starrocks/be/
     ---> b1e90a276b30
    Step 5/8 : ENV JAVA_HOME=/usr/lib/jvm/jre-11-openjdk/     PATH="/usr/local/bin/jdk-11.0.16/bin:$PATH"
     ---> Running in 22d120f921fb
    Removing intermediate container 22d120f921fb
     ---> ac5748e20fcd
    Step 6/8 : ENV STARROCKS_ROOT=/data/starrocks
     ---> Running in f5ca69c24b42
    Removing intermediate container f5ca69c24b42
     ---> b9539554d1b6
    Step 7/8 : USER root
     ---> Running in 5e994fd5e87e
    Removing intermediate container 5e994fd5e87e
     ---> f58f1ad1a8ee
    Step 8/8 : Workdir /data/starrocks
     ---> Running in 7b04a240bac7
    Removing intermediate container 7b04a240bac7
     ---> b8b24293c71c
    Successfully built b8b24293c71c
    Successfully tagged starrocks-cn:v1.0


2.5 Check the image

    [root@rundba StarRocks-2.4.0-test-1104]# docker images | grep cn
    starrocks-cn                                                              v1.0      b8b24293c71c   45 seconds ago   1.87GB

2.6 Tag the CN image and push it to a remote repository

2.6.1 Push to Docker Hub

    docker tag b8b24293c71c landnow/starrocks-cn:v1.0
    docker push landnow/starrocks-cn:v1.0

2.6.2 Push to a private registry

docker tag $cn_Image_Id $account/repo:tag, for example:

    docker tag b8b24293c71c reg.rundba.com/starrocks/starrocks-cn:v1.0

docker push $account/repo:tag, for example:

    [root@rundba ~]# docker push reg.rundba.com/starrocks/starrocks-cn:v1.0
    The push refers to repository [reg.rundba.com/starrocks/starrocks-cn]
    431711a2e451: Pushed
    2ddf107f20c9: Pushed
    174f56854903: Layer already exists
    v1.0: digest: sha256:6468bc8b186c0b39aa41081016c95098054827cf096cd454ad5d253ce4e7778b size: 955


2.6.3 Troubleshooting upload errors

If working with Docker Hub produces an error like this:

    error pulling image configuration: Get "https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/af/afc243f32f5908516972688a489d357f342e32880e6347c6dddece9cfbeca44f/data?verify=1667462637-xn87AuAb%2BmwKjR8uTxSHFPFwl5Y%3D": dial tcp 104.18.124.25:443: i/o timeout

Fix: add more registry mirrors:

    [root@rundba docker]# cat /etc/docker/daemon.json
    {
      "registry-mirrors":["https://hub-mirror.c.163.com","https://registry.aliyuncs.com","https://registry.docker-cn.com","https://docker.mirrors.ustc.edu.cn","https://ul2pzi84.mirror.aliyuncs.com"],
      "insecure-registries": ["https://reg.rundba.com"]
    }

Restart the docker service:

    [root@rundba docker]# systemctl restart docker

Then run the build again.


                                                                                           

3. Build the CN helper service image

The CN helper services are register and offline, located under the components directory. They register a CN with the FE or remove it from the FE.

3.1 Enter the starrocks-kubernetes-operator/components directory

cd $your_code_path/starrocks-kubernetes-operator/components, for example:

    [root@rundba SR]# cd starrocks-kubernetes-operator-computenodegroup/components

3.2 Build the image

Build the CN helper service image:

    [root@rundba components]# make docker IMG="starrocks-computenodegroup:v1.0"
    GOOS=linux GOARCH=amd64 go build -o ./bin/offline ./offline_job
    GOOS=linux GOARCH=amd64 go build -o ./bin/register ./register_container
    docker build --rm --no-cache -f Dockerfile -t "starrocks-computenodegroup:v1.0"  .
    Sending build context to Docker daemon  52.91MB
    Step 1/3 : FROM centos:7
     ---> eeb6ee3f44bd
    Step 2/3 : ADD bin/offline offline
     ---> de24d884fe97
    Step 3/3 : ADD bin/register register
     ---> dbffaff3f140
    Successfully built dbffaff3f140
    Successfully tagged starrocks-computenodegroup:v1.0

Check the CN helper service image:

    [root@rundba components]# docker images | grep computenodegroup
    starrocks-computenodegroup                                                v1.0      dbffaff3f140   14 seconds ago      257MB
    landnow/starrocks-kubernetes-operator-computenodegroup                    v1.0      86ea1e20551b   About an hour ago   252MB
    starrocks-kubernetes-operator-computenodegroup                            v1.0      86ea1e20551b   About an hour ago   252MB
    reg.rundba.com/starrocks/starrocks-kubernetes-operator-computenodegroup   v1.0      86ea1e20551b   About an hour ago   252MB


                                                                                                3.3 标记CN辅助镜像,并推送至远端仓库

                                                                                                3.3.1 推送至Hub docker

                                                                                                  docker tag dbffaff3f140 landnow/starrocks-computenodegroup:v1.0docker push landnow/starrocks-computenodegroup:v1.0


3.3.2 Push to a private registry

Tag the image:

    docker tag dbffaff3f140 reg.rundba.com/starrocks/starrocks-computenodegroup:v1.0

Push the image to the private registry:

                                                                                                    [root@rundba components]# docker push reg.rundba.com/starrocks/starrocks-computenodegroup:v1.0
                                                                                                    The push refers to repository [reg.rundba.com/starrocks/starrocks-computenodegroup]
                                                                                                    f88b96ff6cef: Pushed 
                                                                                                    40f1ee89598b: Pushed 
                                                                                                    174f56854903: Layer already exists 
                                                                                                    v1.0: digest: sha256:0e4e66d852a83b45dbbf2e02fef8906d147ce89068fa81de618beb794bb689b1 size: 952
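The tag and push steps in 3.3.1 and 3.3.2 differ only in the target repository, so they can be wrapped in a small helper. This is a minimal sketch rather than part of the original procedure: the function name tag_and_push and the DOCKER override variable are made up for illustration, while the image ID and repository names are the ones used above.

```shell
# Hypothetical helper: tag a local image for a target repository, then push it.
# Set DOCKER=echo to preview the commands without running docker.
DOCKER="${DOCKER:-docker}"

tag_and_push() {
  image_id="$1"   # local image ID or name, e.g. dbffaff3f140
  target="$2"     # full target reference, e.g. repo/name:tag
  "$DOCKER" tag "$image_id" "$target" && "$DOCKER" push "$target"
}
```

For example, `tag_and_push dbffaff3f140 landnow/starrocks-computenodegroup:v1.0` covers the Docker Hub case, and the same call with the reg.rundba.com reference covers the private registry.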



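4. Deploy the StarRocks Operator

4.1 Change to the starrocks-kubernetes-operator/deploy directory

Run cd $your_code_path/starrocks-kubernetes-operator/deploy, for example:

    cd soft/SR/starrocks-kubernetes-operator/deploy

If the image-build host is not part of the K8s cluster, copy the deploy directory to the cluster (run this from the parent directory of deploy):

    scp -r deploy/ k8s3-master:/root/sr/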


4.2 In manager.yaml, set image to the tag of the StarRocks Operator image you built

For example, starrocks/sr-cn-test:operator. In my environment:

            args:
            - --leader-elect
            image: reg.rundba.com/starrocks/starrocks-kubernetes-operator-computenodegroup:v1.0      # controller:latest
            name: manager


4.3 Deploy the StarRocks Operator

    cd /root/sr/deploy
    kubectl apply -f starrocks.com_computenodegroups.yaml
    kubectl apply -f namespace.yaml
    kubectl apply -f leader_election_role.yaml
    kubectl apply -f role.yaml
    kubectl apply -f role_binding.yaml
    kubectl apply -f leader_election_role_binding.yaml
    kubectl apply -f service_account.yaml
    kubectl apply -f manager.yaml
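The manifests above can also be applied in a loop that stops at the first failure and then waits for the operator to come up. This is a sketch under assumptions: the function name deploy_operator and the KUBECTL override are hypothetical, and the Deployment name cn-controller-manager is inferred from the pod names shown in this article rather than read from manager.yaml.

```shell
# Set KUBECTL=echo to preview the commands without touching a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# Apply the operator manifests in the order used above, then wait for rollout.
deploy_operator() {
  for f in starrocks.com_computenodegroups.yaml namespace.yaml \
           leader_election_role.yaml role.yaml role_binding.yaml \
           leader_election_role_binding.yaml service_account.yaml manager.yaml; do
    "$KUBECTL" apply -f "$f" || return 1
  done
  # Assumption: the operator Deployment is named cn-controller-manager.
  "$KUBECTL" rollout status deployment/cn-controller-manager -n starrocks --timeout=120s
}
```

Run it from the deploy directory. If the image cannot be pulled, rollout status gives up after the timeout instead of reporting success.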


If the image cannot be pulled, the pod events report errors like these:

                                                                                                              Normal   Scheduled  6m4s                  default-scheduler  Successfully assigned starrocks/cn-controller-manager-86c46ffb95-vl87q to k8s3-node2
                                                                                                              Warning  Failed     5m21s (x3 over 6m2s)  kubelet            Failed to pull image "reg.rundba.com/starrocks/starrocks-kubernetes-operator:v1.0": rpc error: code = Unknown desc = Error response from daemon: Get "https://reg.rundba.com/v2/": http: server gave HTTP response to HTTPS client
                                                                                                              Normal   Pulling    4m31s (x4 over 6m3s)  kubelet            Pulling image "reg.rundba.com/starrocks/starrocks-kubernetes-operator:v1.0"
                                                                                                              Warning  Failed     4m31s (x4 over 6m2s)  kubelet            Error: ErrImagePull
                                                                                                              Warning  Failed     4m31s                 kubelet            Failed to pull image "reg.rundba.com/starrocks/starrocks-kubernetes-operator:v1.0": rpc error: code = Unknown desc = Error response from daemon: Get "https://reg.rundba.com/v2/": x509: certificate relies on legacy Common Name field, use SANs instead
                                                                                                              Warning  Failed     4m20s (x6 over 6m1s)  kubelet            Error: ImagePullBackOff
                                                                                                              Normal   BackOff    60s (x21 over 6m1s)   kubelet            Back-off pulling image "reg.rundba.com/starrocks/starrocks-kubernetes-operator:v1.0"

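Fixes:

Make sure the private registry hostname resolves. reg.rundba.com is my own registry, so either add it to /etc/hosts on every K8s node or set up DNS resolution for it.

Make sure every K8s node can access reg.rundba.com:

    [root@k8s3-node2 ~]# cat /etc/docker/daemon.json   # make sure the insecure-registries line below is present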
                                                                                                              {
                                                                                                              ...
                                                                                                                 "insecure-registries": ["https://reg.rundba.com"]
                                                                                                              }


After configuring each node, restart docker:

    systemctl restart docker

Make sure the nodes that run SR can log in to the registry, otherwise the image pull still fails:

    [root@k8s3-node2 ~]# docker login reg.rundba.com -uwk
    Password:
    WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
                                                                                                                  Configure a credential helper to remove this warning. See
                                                                                                                  https://docs.docker.com/engine/reference/commandline/login/#credentials-store
                                                                                                                  
                                                                                                                  Login Succeeded
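Because the daemon.json setting has to be present on every node, a quick scripted check can save a round of ImagePullBackOff debugging. The function below is a hypothetical sketch, not something from the operator or docker tooling; it greps the file as plain text instead of parsing JSON, so a match is only a hint.

```shell
# Hypothetical check: does a docker daemon.json mention the registry host
# and contain an "insecure-registries" setting? Plain-text grep, not a JSON parser.
registry_listed_insecure() {
  config_file="$1"   # e.g. /etc/docker/daemon.json
  registry="$2"      # e.g. reg.rundba.com
  [ -r "$config_file" ] || return 2
  grep -q '"insecure-registries"' "$config_file" &&
    grep -qF "$registry" "$config_file"
}
```

On each node, `registry_listed_insecure /etc/docker/daemon.json reg.rundba.com && echo ok` prints ok only when both strings are found.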


4.4 Run kubectl get pod -n starrocks to check the pod status. If STATUS shows Running, the pod is up.

    [root@k8s3-node1 ~]# kubectl get pod -n starrocks
    NAME                                     READY   STATUS    RESTARTS   AGE
    cn-controller-manager-759d78bc54-n4t4v   1/1     Running   0          30s



5. Deploy CN on K8s

5.1 Change to the starrocks-kubernetes-operator/examples/cn directory

Run cd $your_code_path/starrocks-kubernetes-operator/examples/cn, for example:

    [root@rundba SR]# cd starrocks-kubernetes-operator-computenodegroup/examples/cn

If the image-build host is not part of the K8s cluster, copy the examples directory to the cluster:

    scp -r ../../examples k8s3-master:/root/sr/


5.2 Edit cn.yaml

The leading numbers in the snippets below are line numbers in cn.yaml.

1) Set cnImage to the tag of the CN image pushed to the remote registry, for example starrocks/sr-cn-test:v3.

    17     cnImage: reg.rundba.com/starrocks/starrocks-cn:v1.0                 # CN image

2) Set componentsImage to the tag of the CN Group image pushed to the remote registry, for example starrocks/computenodegroup:v1.0.

    18     componentsImage: reg.rundba.com/starrocks/starrocks-computenodegroup:v1.0           # auxiliary components image

3) Replace <fe_ip>:<fe_query_port> with the IP address and query_port (9030 by default) of any FE node.

    12   feInfo:
    13     accountSecret: test-secret # secret to configure FE account
    14     addresses: # FE addresses
    15       - 192.16.80.125:9030     # <ip>:<port>


4) Add a command entry whose value is the relative path of start_cn.sh inside the CN Group image, that is, add lines 30 and 31.

    22   podPolicy:
    23     resources: # resources requirement of CN pod
    24       limits:
    25         cpu: 4000m
    26         memory: 8Gi
    27       requests:
    28         cpu: 4000m
    29         memory: 8Gi
    30     command:
    31       - be/bin/start_cn.sh   #/root/StarRocks-2.4.0/be/bin/start_cn.sh, same with Dockerfile -> Workdir data/starrocks -> data/starrocks/be/bin
    32   autoScalingPolicy: # auto-scaling policy of CN cluster
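An easy mistake in step 3 is to leave the <fe_ip>:<fe_query_port> placeholder in place or to drop the port. The check below is a hypothetical sketch for catching that before applying cn.yaml. It assumes addresses are written as IPv4:port, as in this article, and validates the shape only, not whether the FE is reachable.

```shell
# Hypothetical sanity check for an feInfo.addresses entry: accepts only
# <ipv4>:<port> and rejects unreplaced placeholders such as <fe_ip>:<fe_query_port>.
# It checks the shape (digits and dots), not that each octet is <= 255.
valid_fe_address() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}:[0-9]{1,5}$'
}
```

For example, `valid_fe_address 192.16.80.125:9030 || echo "check feInfo.addresses"` stays silent for the address used above.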


5.3 Deploy CN

                                                                                                                                cd /root/sr/examples/cn
                                                                                                                                kubectl apply -f fe-account.yaml
                                                                                                                                kubectl apply -f cn-config.yaml
                                                                                                                                kubectl apply -f cn.yaml


5.4 Check that CN is running

                                                                                                                                [root@k8s3-master ~]# kubectl -n starrocks get pod
                                                                                                                                NAME                                       READY   STATUS      RESTARTS   AGE
                                                                                                                                cn-controller-manager-759d78bc54-n4t4v     1/1     Running     0          20m
                                                                                                                                computenodegroup-sample-27796875-vgvnx     0/1     Completed   0          8s
                                                                                                                                computenodegroup-sample-5759565bd8-ckhgt   2/2     Running     0          23s
                                                                                                                                computenodegroup-sample-5759565bd8-hb8sc   2/2     Running     0          23s
                                                                                                                                computenodegroup-sample-5759565bd8-xgs7j   2/2     Running     0          23s
                                                                                                                                computenodegroup-sample-5759565bd8-xtrnw   2/2     Running     0          23s

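Once CN is deployed, the StarRocks Operator uses the FE IP address and query port configured in cn.yaml to add the CNs to the StarRocks cluster automatically.

6. Configure a horizontal auto-scaling policy

6.1 About HPA

To configure auto-scaling for CN, edit the cn.yaml configuration file:

    $your_code_path/starrocks-kubernetes-operator/examples/cn/cn.yaml

For example, to scale CN elastically based on the memory and CPU utilization of the CN pods in K8s, configure average memory and CPU utilization as the resource metrics and set the thresholds that trigger scaling. The upper and lower scaling bounds are the maximum and minimum number of pod replicas, that is, of CNs.

Kubernetes also supports behavior, which lets you tailor scaling to your workload: scale out quickly, scale in slowly, disable scale-in, and so on.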


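6.2 HPA configuration for the CN component

    autoScalingPolicy: # auto-scaling policy of CN cluster
      maxReplicas: 10 # upper bound on the number of CNs
      minReplicas: 1 # lower bound on the number of CNs
      hpaPolicy:
        metrics: # resource metrics
          - type: Resource
            resource:
              name: memory # memory as the resource metric
              target:
                averageUtilization: 30 # scaling threshold of 30%: when CN memory utilization in the K8s cluster exceeds 30%, CNs are added; when it falls below 30%, CNs are removed
                type: Utilization
          - type: Resource
            resource:
              name: cpu # CPU as the resource metric
              target:
                averageUtilization: 60 # scaling threshold of 60%: when CN CPU utilization in the K8s cluster exceeds 60%, CNs are added; when it falls below 60%, CNs are removed
                type: Utilization
        behavior: # tailor scaling to the workload: scale out quickly, scale in slowly, disable scale-in, and so on
          scaleUp:
            policies:
              - type: Pods
                value: 1
                periodSeconds: 10
          scaleDown:
            selectPolicy: Disabled # scale-in is disabled in this example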


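Some of the settings explained:

For more auto-scaling options, see the Kubernetes documentation on Horizontal Pod Autoscaling.

Upper and lower bounds on the number of CNs during horizontal scaling:

    # upper bound on the number of CNs: 10
    maxReplicas: 10
    # lower bound on the number of CNs: 1
    minReplicas: 1

Threshold that triggers horizontal scaling:

    # Scaling threshold; here the resource metric is CN CPU utilization in the K8s cluster. When CPU utilization exceeds 60%, CNs are added; when it falls below 60%, CNs are removed.
    - type: Resource
      resource:
        name: cpu
        target:
          averageUtilization: 60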

6.3 Apply the auto-scaling policy

                                                                                                                                      kubectl apply -f cn/cn.yaml


6.4 View the applied policy

                                                                                                                                        [root@k8s3-master ~]# kubectl -n starrocks get hpa
                                                                                                                                        NAME                      REFERENCE                                  TARGETS                        MINPODS   MAXPODS   REPLICAS   AGE
                                                                                                                                        computenodegroup-sample   ComputeNodeGroup/computenodegroup-sample   <unknown>/30%, <unknown>/60%   1         10        4          29m
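In the output above, both TARGETS entries read <unknown>, which means the HPA controller has no current value for those metrics yet. Typical causes described in the Kubernetes documentation are an unavailable resource metrics API (for example, metrics-server not installed) or a container in the pod without a resource request. I have not confirmed which one applies here. The filter below is a hypothetical helper that lists HPAs still in that state.

```shell
# Hypothetical filter: read `kubectl get hpa` output on stdin and print the
# names of HPAs whose row still contains <unknown>.
hpa_with_unknown_targets() {
  awk 'NR > 1 && /<unknown>/ { print $1 }'
}
```

Use it as `kubectl -n starrocks get hpa | hpa_with_unknown_targets`. Running `kubectl -n starrocks top pod` is a quick way to see whether the metrics API answers at all.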


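6.5 Troubleshooting CN deployment

If the CN deployment misbehaves, check the CN logs first, then the FE logs for more detail.

To view the FE logs:

Log in to the FE host and read the logs under the log directory;

If FE runs in docker, enter the FE container and read the logs under /data/deploy/StarRocks-2.4.0-test-1104/fe/log.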
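On the K8s side, the CN logs and pod events can be collected with kubectl. The helper below is a sketch: the function name cn_diagnose and the KUBECTL override are hypothetical, and the namespace defaults to starrocks as used throughout this article.

```shell
# Set KUBECTL=echo to preview the commands without touching a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: show events and recent logs for one CN pod.
cn_diagnose() {
  pod="$1"               # e.g. computenodegroup-sample-5759565bd8-ckhgt
  ns="${2:-starrocks}"   # namespace used in this walkthrough
  "$KUBECTL" describe pod "$pod" -n "$ns"
  "$KUBECTL" logs "$pod" -n "$ns" --all-containers --tail=100
}
```

The describe output includes the Events section, where image pull and scheduling problems show up; --all-containers is used because each CN pod runs two containers.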


                                                                                                                                         

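7. Log in and check the StarRocks service status

7.1 Start a temporary MySQL client on the Kubernetes cluster

This walkthrough creates a temporary MySQL client from the mysql docker image with kubectl run; you can also log in with the mysql client on the FE host.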

                                                                                                                                          [root@k8s3-node1 ~]# kubectl run mysql-client --image=mysql --restart=Never --rm -it -- bash -il
                                                                                                                                          If you don't see a command prompt, try pressing enter.
                                                                                                                                          root@mysql-client:/#


7.2 Connect to StarRocks with mysql-client

                                                                                                                                            root@mysql-client:/# mysql -h 192.16.80.123 -P9030    # mysql -h 192.16.80.122 -P9030 -uroot
                                                                                                                                            Welcome to the MySQL monitor.  Commands end with ; or \g.
                                                                                                                                            Your MySQL connection id is 90
                                                                                                                                            Server version: 5.1.0 StarRocks version 2.4.0-TEST-1104
                                                                                                                                            
                                                                                                                                            Copyright (c) 2000, 2021, Oracle and/or its affiliates.
                                                                                                                                            
                                                                                                                                            Oracle is a registered trademark of Oracle Corporation and/or its
                                                                                                                                            affiliates. Other names may be trademarks of their respective
                                                                                                                                            owners.
                                                                                                                                            
                                                                                                                                            Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
                                                                                                                                            
                                                                                                                                            mysql>
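For scripted checks, the interactive login above can be replaced with a one-shot query from a throwaway pod. This is a sketch under assumptions: the function name sr_query and the KUBECTL override are hypothetical, and it assumes the root account has an empty password, as in the session shown above.

```shell
# Set KUBECTL=echo to preview the command without touching a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: run one SQL statement against an FE from a temporary pod.
sr_query() {
  fe_host="$1"   # FE IP address, e.g. 192.16.80.123
  sql="$2"       # statement to run, e.g. SHOW PROC '/frontends';
  "$KUBECTL" run mysql-client --image=mysql --restart=Never --rm -i -- \
    mysql -h "$fe_host" -P9030 -uroot -e "$sql"
}
```

For example, `sr_query 192.16.80.123 "SHOW PROC '/frontends';"` runs the statement and removes the pod when the client exits.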


7.3 Check the status of each component

1) Check the FE cluster status

                                                                                                                                              mysql> SHOW PROC '/frontends';
                                                                                                                                              +-------------------------------+------------+-------------+----------+-----------+---------+--------+------------+------+-------+-------------------+---------------------+----------+--------+---------------------+-------------------------+
                                                                                                                                              | Name                          | IP         | EditLogPort | HttpPort | QueryPort | RpcPort | Role   | ClusterId  | Join | Alive | ReplayedJournalId | LastHeartbeat       | IsHelper | ErrMsg | StartTime           | Version                 |
                                                                                                                                              +-------------------------------+------------+-------------+----------+-----------+---------+--------+------------+------+-------+-------------------+---------------------+----------+--------+---------------------+-------------------------+
                                                                                                                                              | 172.17.0.2_9010_1667810020442 | 172.17.0.2 | 9010        | 8030     | 9030      | 9020    | LEADER | 1081409265 | true | true  | 813               | 2022-11-07 09:19:30 | true     |        | 2022-11-07 08:33:50 | 2.4.0-TEST-1104-4eaf688 |
                                                                                                                                              +-------------------------------+------------+-------------+----------+-----------+---------+--------+------------+------+-------+-------------------+---------------------+----------+--------+---------------------+-------------------------+
                                                                                                                                              1 row in set (0.04 sec)


2) Check the BE status

                                                                                                                                                mysql> SHOW PROC '/backends';
                                                                                                                                                +-----------+------------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------+-------------------------+--------------------------------------------------------+-------------------+-------------+----------+
                                                                                                                                                | BackendId | IP         | HeartbeatPort | BePort | HttpPort | BrpcPort | LastStartTime       | LastHeartbeat       | Alive | SystemDecommissioned | ClusterDecommissioned | TabletNum | DataUsedCapacity | AvailCapacity | TotalCapacity | UsedPct | MaxDiskUsedPct | ErrMsg | Version                 | Status                                                 | DataTotalCapacity | DataUsedPct | CpuCores |
                                                                                                                                                +-----------+------------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------+-------------------------+--------------------------------------------------------+-------------------+-------------+----------+
                                                                                                                                                | 10002     | 172.17.0.2 | 9050          | 9060   | 8040     | 8060     | 2022-11-07 08:34:13 | 2022-11-07 09:19:50 | true  | false                | false                 | 30        | 0.000            | 276.321 GB    | 291.365 GB    | 5.16 %  | 5.16 %         |        | 2.4.0-TEST-1104-4eaf688 | {"lastSuccessReportTabletsTime":"2022-11-07 09:19:13"} | 276.321 GB        | 0.00 %      | 16       |
                                                                                                                                                +-----------+------------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------+-------------------------+--------------------------------------------------------+-------------------+-------------+----------+
                                                                                                                                                1 row in set (0.01 sec)


3) Check the compute nodes

mysql> SHOW COMPUTE NODES;    # SHOW PROC '/compute_nodes' returns the same information
                                                                                                                                                  +---------------+----------------+---------------+--------+----------+----------+---------------+---------------+-------+----------------------+-----------------------+----------------------------------------------------+---------+
                                                                                                                                                  | ComputeNodeId | IP             | HeartbeatPort | BePort | HttpPort | BrpcPort | LastStartTime | LastHeartbeat | Alive | SystemDecommissioned | ClusterDecommissioned | ErrMsg                                             | Version |
                                                                                                                                                  +---------------+----------------+---------------+--------+----------+----------+---------------+---------------+-------+----------------------+-----------------------+----------------------------------------------------+---------+
                                                                                                                                                  | 10074         | 10.244.203.205 | 9050          | -1     | -1       | -1       | NULL          | NULL          | false | false                | false                 | java.net.SocketTimeoutException: connect timed out |         |
                                                                                                                                                  | 10075         | 10.244.219.134 | 9050          | -1     | -1       | -1       | NULL          | NULL          | false | false                | false                 | java.net.SocketTimeoutException: connect timed out |         |
                                                                                                                                                  | 10076         | 10.244.41.12   | 9050          | -1     | -1       | -1       | NULL          | NULL          | false | false                | false                 | java.net.SocketTimeoutException: connect timed out |         |
                                                                                                                                                  | 10077         | 10.244.41.13   | 9050          | -1     | -1       | -1       | NULL          | NULL          | false | false                | false                 | java.net.SocketTimeoutException: connect timed out |         |
                                                                                                                                                  +---------------+----------------+---------------+--------+----------+----------+---------------+---------------+-------+----------------------+-----------------------+----------------------------------------------------+---------+
                                                                                                                                                  4 rows in set (0.00 sec)
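In the sample output above, all four compute nodes report Alive = false with a "connect timed out" error, which means the FE heartbeat could not reach port 9050 on those pod IPs. To spot such nodes without reading the wide table, the check can be scripted. The following is a minimal sketch, not part of the original procedure: the function name list_dead_cns is hypothetical, and FE_HOST and the root user in the usage comment are assumed placeholders. It relies on the column order shown above, where Alive is the 9th column and ErrMsg is the 12th.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads tab-separated SHOW COMPUTE NODES rows from stdin
# (the format produced by the mysql client with -B -N) and prints the
# ComputeNodeId, IP and ErrMsg of every node whose Alive column is not "true".
list_dead_cns() {
  awk -F'\t' '$9 != "true" { print $1, $2, $12 }'
}

# Usage (FE_HOST and the root user are assumptions; 9030 is the FE QueryPort
# shown in the SHOW PROC '/frontends' output):
#   mysql -h FE_HOST -P 9030 -u root -B -N -e 'SHOW COMPUTE NODES' | list_dead_cns
```

An empty result means every registered CN currently passes the FE heartbeat check.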


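7.4 View FE, BE, and CN information in a browser

To view FE information, open:

http://HOST_IP:8030/

HOST_IP is the IP address of the Docker host.


To view BE information, open:

http://HOST_IP:8040/

HOST_IP is again the IP address of the Docker host.


To view CN information:

On the FE home page, click "system" -> compute_nodes. The page shows the same information as SHOW COMPUTE NODES.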
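On a host without a browser, the same two pages can be probed from the command line. This is a minimal sketch under stated assumptions: check_web_ui is a hypothetical helper name, the 5-second timeout is an arbitrary choice, and the only thing it checks is that the FE (8030) and BE (8040) web ports answer HTTP requests.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints "<component> <url> <http status>" for the FE and
# BE web pages on the given host. curl normally reports status 000 when no
# HTTP response was received (for example, connection refused or timeout).
check_web_ui() {
  local host="$1" pair name port code
  for pair in FE:8030 BE:8040; do
    name="${pair%%:*}"
    port="${pair##*:}"
    # -s: silent, -o /dev/null: discard the body, -m 5: give up after 5 seconds
    code="$(curl -s -o /dev/null -m 5 -w '%{http_code}' "http://${host}:${port}/" || true)"
    echo "${name} http://${host}:${port}/ ${code}"
  done
}

# Usage (replace HOST_IP with the IP address of the Docker host):
#   check_web_ui HOST_IP
```

A 200 status for both lines suggests the pages are reachable. The CN list itself is still read from the FE "system" page or from SHOW COMPUTE NODES.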


                                                                                                                                                   

8. Create test data and verify



                                                                                                                                                  CREATE DATABASE TEST;
                                                                                                                                                  USE TEST;
                                                                                                                                                  
                                                                                                                                                  CREATE TABLE `sr_on_mac` (
                                                                                                                                                   `c0` int(11) NULL COMMENT "",
                                                                                                                                                   `c1` date NULL COMMENT "",
                                                                                                                                                   `c2` datetime NULL COMMENT "",
                                                                                                                                                   `c3` varchar(65533) NULL COMMENT ""
                                                                                                                                                  ) ENGINE=OLAP 
                                                                                                                                                  DUPLICATE KEY(`c0`)
                                                                                                                                                  PARTITION BY RANGE (c1) (
                                                                                                                                                    START ("2022-02-01") END ("2022-02-10") EVERY (INTERVAL 1 DAY)
                                                                                                                                                  )
                                                                                                                                                  DISTRIBUTED BY HASH(`c0`) BUCKETS 1 
                                                                                                                                                  PROPERTIES (
                                                                                                                                                  "replication_num" = "1",
                                                                                                                                                  "in_memory" = "false",
                                                                                                                                                  "storage_format" = "DEFAULT"
                                                                                                                                                  );
                                                                                                                                                  
                                                                                                                                                  insert into sr_on_mac values (1, '2022-02-01', '2022-02-01 10:47:57', '111');
                                                                                                                                                  insert into sr_on_mac values (2, '2022-02-02', '2022-02-02 10:47:57', '222');
                                                                                                                                                  insert into sr_on_mac values (3, '2022-02-03', '2022-02-03 10:47:57', '333');
                                                                                                                                                  
                                                                                                                                                  MySQL [TEST]> select * from sr_on_mac where c1 >= '2022-02-02';
                                                                                                                                                  +------+------------+---------------------+------+
                                                                                                                                                  | c0   | c1         | c2                  | c3   |
                                                                                                                                                  +------+------------+---------------------+------+
                                                                                                                                                  |    2 | 2022-02-02 | 2022-02-02 10:47:57 | 222  |
                                                                                                                                                  |    3 | 2022-02-03 | 2022-02-03 10:47:57 | 333  |
                                                                                                                                                  +------+------------+---------------------+------+
                                                                                                                                                  2 rows in set (0.14 sec)

If the statements complete without errors, CN has been deployed in the K8s environment and is serving queries together with FE and BE.
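The manual check above can also be run as a script, for example after redeploying the cluster. The following is a rough sketch rather than part of the original procedure: verify_test_rows is a hypothetical name, and the FE_HOST, FE_QUERY_PORT and FE_USER environment variables with their defaults (127.0.0.1, 9030, root) are assumptions to adapt to your environment. It reruns the same filter as the SELECT above and compares the row count.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the validation query returns the
# expected number of rows. Connection settings are assumptions and can be
# overridden through the FE_HOST, FE_QUERY_PORT and FE_USER variables.
verify_test_rows() {
  local expected="$1" actual
  actual="$(mysql -h "${FE_HOST:-127.0.0.1}" -P "${FE_QUERY_PORT:-9030}" \
    -u "${FE_USER:-root}" -B -N \
    -e "SELECT COUNT(*) FROM TEST.sr_on_mac WHERE c1 >= '2022-02-02'")" || return 1
  [ "$actual" = "$expected" ]
}

# Usage: the three INSERT statements above leave 2 rows on or after 2022-02-02.
#   verify_test_rows 2 && echo "test data OK"
```

The function returns a non-zero status both when the query fails and when the count differs, so it can be used directly in an if statement.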


                                                                                                                                                   

9. References


                                                                                                                                                   

https://docs.starrocks.io/zh-cn/latest/administration/k8s_operator_cn
https://github.com/StarRocks/starrocks-kubernetes-operator/tree/computenodegroup
https://git-scm.com/downloads

Last modified: 2022-11-17 11:49:51
This article is reposted from rundba.
