17 - MLSQL on k8s (4) - Sharing Configuration with JuiceFS

MLSQL之道 2021-09-24

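Deploying MLSQL instances across multiple K8s nodes raises a practical problem: the configuration files and JARs have to be placed on every machine. You could script that, but is there a better tool? There is. In this post I use JuiceFS to share the data and mount it into the pods through static provisioning. The official JuiceFS examples are here: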
    https://github.com/juicedata/juicefs/blob/main/docs/en/how_to_use_on_kubernetes.md
    https://github.com/juicedata/juicefs-csi-driver/tree/master/examples/static-provisioning
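Continuing with the k8s environment from an earlier post (13 - MLSQL on k8s (1) - Installing k8s) and the JuiceFS volume from the previous post (16 - Integrating MLSQL with JuiceFS), let's configure JuiceFS static mounting.
      # The volume from the previous post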
      $ ./juicefs format --storage hdfs --bucket  172.16.2.119:8020 redis://:123456@172.16.2.120:16379/2 mhdfs
      $ hadoop fs -ls jfs://test


      # I forgot to mention in the previous post that 172.16.2.119, 172.16.2.120 and 172.16.2.121 are my test Hadoop cluster.
      # 172.16.2.119:8020 is the active HDFS NameNode.
      # For an HA setup, use --bucket=namenode1:port,namenode2:port.
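To make the HA form of `--bucket` concrete, here is a minimal sketch that joins several NameNode addresses with commas and prints the resulting format command. The helper name `join_namenodes` is hypothetical, and using 172.16.2.120:8020 as the standby NameNode is an assumption for illustration only; substitute your own addresses.

```shell
# Join NameNode addresses into the comma-separated value used by --bucket.
# join_namenodes is a hypothetical helper, not part of JuiceFS.
join_namenodes() {
  local IFS=,
  printf '%s\n' "$*"
}

# Assumption: 172.16.2.120:8020 is the standby NameNode.
bucket="$(join_namenodes 172.16.2.119:8020 172.16.2.120:8020)"

# Print the command instead of running it, so nothing is formatted by accident.
echo "./juicefs format --storage hdfs --bucket ${bucket} redis://:123456@172.16.2.120:16379/2 mhdfs"
```

The command is only printed, so you can review the bucket value before formatting a volume.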

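First, install the JuiceFS CSI driver for K8s. Pay attention to Docker's registry mirror configuration: the JuiceFS driver image is tagged latest, so an Aliyun mirror may lag behind and serve a stale image.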

        $ cat /etc/docker/daemon.json
        {
        "registry-mirrors": ["https://b9pmyelo.mirror.aliyuncs.com"],
        "insecure-registries":["172.16.2.66:5000"]
        }
So remove registry-mirrors to make sure the image pulled is the current docker.io/juicedata/juicefs-csi-driver:latest, then restart Docker: systemctl restart docker.
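Before installing the driver, it can help to check on each node whether the mirror entry is really gone. The following is a rough sketch: `has_registry_mirrors` and the `DOCKER_DAEMON_JSON` override are hypothetical names introduced here, and the check is a plain text match rather than a real JSON parse.

```shell
# Succeed (exit 0) if the given Docker daemon config still mentions registry-mirrors.
# has_registry_mirrors is a hypothetical helper; it greps text, it does not parse JSON.
has_registry_mirrors() {
  grep -q '"registry-mirrors"' "$1"
}

# DOCKER_DAEMON_JSON is an assumed override, handy for testing against a copy.
config="${DOCKER_DAEMON_JSON:-/etc/docker/daemon.json}"

if [ -f "$config" ] && has_registry_mirrors "$config"; then
  echo "registry-mirrors is still set in $config; remove it and restart Docker"
else
  echo "no registry-mirrors entry found in $config"
fi
```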
          # Install the driver
          $ kubectl apply -f https://raw.githubusercontent.com/juicedata/juicefs-csi-driver/master/deploy/k8s.yaml


          serviceaccount/juicefs-csi-controller-sa created
          clusterrole.rbac.authorization.k8s.io/juicefs-external-provisioner-role created
          clusterrolebinding.rbac.authorization.k8s.io/juicefs-csi-provisioner-binding created
          statefulset.apps/juicefs-csi-controller created
          daemonset.apps/juicefs-csi-node created
          csidriver.storage.k8s.io/csi.juicefs.com created
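After the apply, it is worth confirming that the driver pods actually came up. The sketch below assumes the manifest installs into the kube-system namespace and labels its pods app.kubernetes.io/name=juicefs-csi-driver; check your copy of k8s.yaml if the selector returns nothing. The function names and the `KUBECTL` override are hypothetical.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# List the CSI driver pods.
# Assumption: namespace kube-system and label app.kubernetes.io/name=juicefs-csi-driver.
list_csi_pods() {
  "$KUBECTL" get pods -n kube-system -l app.kubernetes.io/name=juicefs-csi-driver
}

# Confirm that the CSIDriver object from the apply output is registered.
show_csi_driver() {
  "$KUBECTL" get csidriver csi.juicefs.com
}

# Example invocation:
#   list_csi_pods
#   show_csi_driver
```

Since the manifest creates a StatefulSet (juicefs-csi-controller) and a DaemonSet (juicefs-csi-node), you would expect a controller pod plus one node pod per schedulable node.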
Configure JuiceFS static mounting:
            $ mkdir static-provisioning
            $ cd static-provisioning


            $ cat > secrets.env << EOF
            name=mhdfs
            metaurl=redis://:123456@172.16.2.120:16379/2
            access-key=hdfs
            storage=hdfs
            bucket=172.16.2.119:8020
            EOF


            $ cat > kustomizeconfig.yaml << EOF
            nameReference:
            - kind: Secret
              fieldSpecs:
              - path: spec/csi/nodePublishSecretRef/name
                kind: PersistentVolume
            namespace:
            - path: spec/csi/nodePublishSecretRef/namespace
              kind: PersistentVolume
            EOF


            $ cat > resources.yaml << EOF
            ---
            apiVersion: v1
            kind: PersistentVolume
            metadata:
              name: juicefs-hdfs-1
            spec:
              capacity:
                storage: 1Gi
              volumeMode: Filesystem
              accessModes:
                - ReadWriteMany
              persistentVolumeReclaimPolicy: Retain
              csi:
                driver: csi.juicefs.com
                volumeHandle: hdfs-1
                fsType: juicefs
                nodePublishSecretRef:
                  name: juicefs-hdfs-1
                  namespace: default
            ---
            apiVersion: v1
            kind: PersistentVolumeClaim
            metadata:
              name: one-gb-fs
              namespace: default
            spec:
              accessModes:
                - ReadWriteMany
              volumeMode: Filesystem
              storageClassName: ""
              resources:
                requests:
                  storage: 1Gi
            EOF


            $ cat > kustomization.yaml << EOF
            apiVersion: kustomize.config.k8s.io/v1beta1
            kind: Kustomization
            namespace: default
            configurations:
            - kustomizeconfig.yaml
            resources:
            - resources.yaml
            secretGenerator:
            - name: juicefs-hdfs-1
              env: secrets.env
            EOF


            # Create the PersistentVolumeClaim named one-gb-fs
            $ kubectl apply -k .
            secret/juicefs-hdfs-1-mg2575m8hf created
            persistentvolume/juicefs-hdfs-1 created
            persistentvolumeclaim/one-gb-fs created
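Note that the Secret was created as juicefs-hdfs-1-mg2575m8hf: secretGenerator appends a content hash to the name, and the nameReference entry in kustomizeconfig.yaml is what lets kustomize rewrite spec/csi/nodePublishSecretRef/name in the PersistentVolume to match. A small sketch for inspecting the result follows; the function names and the `KUBECTL` override are hypothetical.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Render the kustomization without applying it, to see the generated Secret name
# and the rewritten nodePublishSecretRef.
preview_manifests() {
  "$KUBECTL" kustomize .
}

# Print the phase of the PersistentVolume.
pv_phase() {
  "$KUBECTL" get pv juicefs-hdfs-1 -o jsonpath='{.status.phase}'
}

# Print the phase of the PersistentVolumeClaim in the default namespace.
pvc_phase() {
  "$KUBECTL" get pvc one-gb-fs -n default -o jsonpath='{.status.phase}'
}

# Example invocation:
#   pv_phase; echo
#   pvc_phase; echo
```

Both phases would normally be expected to read Bound once the claim has matched the volume. If the claim stays Pending, compare the capacity, access modes and storageClassName of the two objects.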
Place the MLSQL configuration from 15 - MLSQL on k8s (3) - MLSQL on k8s into the JuiceFS file system using the following layout:
              hadoop fs -ls jfs://test/mlsql
              drwxr-xr-x - hdfs hdfs 4096 2021-03-06 18:47 jfs://test/mlsql/mlsqljar
              drwxr-xr-x - hdfs hdfs 4096 2021-03-06 23:43 jfs://test/mlsql/script
              drwxr-xr-x - hdfs hdfs 4096 2021-03-06 18:43 jfs://test/mlsql/sparkconf


              hadoop fs -ls jfs://test/mlsql/mlsqljar
              -rw-r--r-- 1 hdfs hdfs 14822782 2021-03-06 18:44 jfs://test/mlsql/mlsqljar/juicefs-hadoop-0.11.0.jar
              -rw-r--r-- 1 hdfs hdfs 119974685 2021-03-06 18:44 jfs://test/mlsql/mlsqljar/streamingpro-mlsql-spark_3.0_2.12-2.1.0-SNAPSHOT.jar


              hadoop fs -ls jfs://test/mlsql/sparkconf
              -rw-r--r-- 1 hdfs hdfs 2544 2021-03-06 18:43 jfs://test/mlsql/sparkconf/log4j.properties


              hadoop fs -ls jfs://test/mlsql/script
              -rw-r--r-- 1 hdfs hdfs 2471 2021-03-06 19:07 jfs://test/mlsql/script/mlsql-start.sh
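The listings above show only the end state. One way to get there, sketched under the assumption that the files sit in your current local directory, is to create the three directories and upload with `hadoop fs -put`. The function names, the `HADOOP` override and the local file paths are hypothetical.

```shell
# HADOOP can be overridden (for example HADOOP=echo) for a dry run.
HADOOP="${HADOOP:-hadoop}"
JFS_ROOT="jfs://test/mlsql"

# Create the three directories used in the layout.
make_layout() {
  "$HADOOP" fs -mkdir -p "$JFS_ROOT/mlsqljar" "$JFS_ROOT/script" "$JFS_ROOT/sparkconf"
}

# Upload one local file into a subdirectory of the layout, overwriting if present.
# Usage: upload <local-file> <mlsqljar|script|sparkconf>
upload() {
  "$HADOOP" fs -put -f "$1" "$JFS_ROOT/$2/"
}

# Example invocation (local paths are assumptions):
#   make_layout
#   upload ./log4j.properties sparkconf
#   upload ./mlsql-start.sh script
```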
Next, write the Deployment file:
                $ cat > testp.yaml << EOF
                apiVersion: apps/v1
                kind: Deployment
                metadata:
                  name: spark-hello
                  namespace: default
                spec:
                  selector:
                    matchLabels:
                      app: spark-hello
                  strategy:
                    rollingUpdate:
                      maxUnavailable: 0
                    type: RollingUpdate
                  template:
                    metadata:
                      labels:
                        app: spark-hello
                    spec:
                      serviceAccountName: spark
                      containers:
                        - name: spark-hello
                          args: [ "while true; do sleep 10000; done;" ]
                          command:
                            - /bin/sh
                            - '-c'
                          image: '172.16.2.66:5000/mlsql:3.0-j14-mlsql'
                          imagePullPolicy: Always
                          securityContext:
                            runAsUser: 0
                          volumeMounts:
                            - name: juicefs-pv
                              mountPath: /opt/mlsql/jar
                              subPath: mlsql/mlsqljar
                            - name: juicefs-pv
                              mountPath: /opt/spark/conf
                              subPath: mlsql/sparkconf
                            - name: juicefs-pv
                              mountPath: /opt/mlsql/script
                              subPath: mlsql/script
                      volumes:
                        - name: juicefs-pv
                          persistentVolumeClaim:
                            claimName: one-gb-fs
                EOF
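The manifest still has to be applied before a pod exists. A minimal sketch of that step, plus a way to run commands inside the pod through kubectl, is shown here. The function names, the `KUBECTL` override and the 120-second timeout are assumptions.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Create or update the Deployment from testp.yaml.
apply_deployment() {
  "$KUBECTL" apply -f testp.yaml
}

# Block until the rollout finishes or the (assumed) 120s timeout expires.
wait_for_rollout() {
  "$KUBECTL" rollout status deployment/spark-hello -n default --timeout=120s
}

# Run a command inside one pod of the Deployment.
# Usage: exec_in_pod <command> [args...]
exec_in_pod() {
  "$KUBECTL" exec -n default deploy/spark-hello -- "$@"
}

# Example invocation:
#   apply_deployment && wait_for_rollout
#   exec_in_pod ls /opt/mlsql/jar /opt/mlsql/script /opt/spark/conf
```

Going through `kubectl exec` avoids having to find which node the pod landed on before inspecting the three mount points.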
Verify that the configuration was mounted into the container correctly:
                  $ docker ps | grep hello
                  132a130ce60b 172.16.2.66:5000/mlsql "/bin/sh -c 'while t…" 2 minutes ago Up 2 minutes k8s_spark-hello_spark-hello-56d64fff75-78bhq_default_26427172-85f4-47ce-84c6-a8e3f53f817e_0
                  $ docker exec -it 132a130ce60b /bin/sh


                  sh-5.0# ls opt/mlsql/jar/
                  juicefs-hadoop-0.11.0.jar streamingpro-mlsql-spark_3.0_2.12-2.1.0-SNAPSHOT.jar
                  sh-5.0# ls opt/mlsql/script/
                  mlsql-start-d.sh  mlsql-start.sh
                  sh-5.0# ls opt/spark/conf/
                  log4j.properties


                  # Start the MLSQL service to verify
                  $ cd opt/mlsql
                  nohup sh script/mlsql-start.sh &
Once MLSQL has finished starting (a little over a minute), verify the service:
                    curl -XPOST hello.mlsql.com/run/script -d 'sql=select 1 a as t;'
                    [{"a":1}]
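Rather than waiting a fixed minute, the same request can be retried until it succeeds. This is a rough sketch: `retry` and `query_mlsql` are hypothetical helpers, and the attempt count and delay in the example are arbitrary.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while :; do
    "$@" && return 0
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Same request as above; -f makes curl fail on HTTP errors, -s hides progress output.
query_mlsql() {
  curl -sf -XPOST hello.mlsql.com/run/script -d 'sql=select 1 a as t;'
}

# Example invocation: up to 30 attempts, 5 seconds apart.
#   retry 30 5 query_mlsql
```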


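Earlier posts in this series:
1 - Introduction to MLSQL
2 - A deep dive into how MLSQL loads JDBC data sources
3 - MLSQL DSL: are you ready to build your own DSL?
4 - How to implement column-level access control for Hive
5 - How to implement column-level access control for JDBC
7 - How to read the MySQL binlog
11 - Thoughts on supporting logic processing in MLSQL
13 - MLSQL on k8s (1) - Installing k8s
14 - MLSQL on k8s (2) - Spark on k8s
15 - MLSQL on k8s (3) - MLSQL on k8s
16 - Integrating MLSQL with JuiceFS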


Source code:

                    https://github.com/latincross/mlsqlwechat

