
UpdateRecord in Practice

NIFI实战 2020-11-11

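Over the past couple of days, a follower of this public account asked me a question:

How can the UpdateRecord processor convert a number written in scientific notation into a date on output?

Step 1: I used GenerateFlowFile to construct a value in scientific notation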

    {"a":1.60502E+12}

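1.60502E+12 is the millisecond timestamp 1605019513000 written with six significant digits (read literally, it equals 1605020000000). 1605019513000 corresponds to the date 2020/11/10 22:45:13.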
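To double-check the timestamp arithmetic outside NiFi, here is a minimal Python sketch. It assumes the dates in this post are in UTC+8 (China Standard Time); the helper name `millis_to_text` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the dates quoted in the article are in UTC+8.
CST = timezone(timedelta(hours=8))

def millis_to_text(millis: int) -> str:
    """Format epoch milliseconds as yyyy/MM/dd HH:mm:ss in UTC+8."""
    moment = datetime.fromtimestamp(millis / 1000, tz=CST)
    return moment.strftime("%Y/%m/%d %H:%M:%S")

print(millis_to_text(1605019513000))  # 2020/11/10 22:45:13
```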

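Step 2: I used UpdateRecord to convert the data

The writer's schema is:

    {
      "type": "record",
      "name": "at",
      "fields": [
        {"name": "a", "type": "string"}
      ]
    }

The processing result was:

Note: pay close attention to this schema. Because a starts out as a number, the reader generates a built-in (inferred) schema. If the writer does not define its own schema, it uses that inferred schema by default. Since field a later becomes a string, this causes an error, so the only way out is to define the writer's schema yourself and rebuild the data structure.

The NiFi Expression Language expression I wrote is:

    ${field.value:toDecimal():format("yyyy/MM/dd HH:mm:ss"):toString()}

Open issue: although this just about achieves the conversion from a scientific-notation number to a date, the two times differ by about 8 minutes. I will look into that later. The rest of this post records some things I learned about UpdateRecord.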
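One plausible explanation for the gap of about 8 minutes is precision loss: six significant digits cannot hold a 13-digit millisecond timestamp. The Python sketch below only does the arithmetic and does not touch NiFi. It assumes the original timestamp was 1605019513000, the value given in Step 1.

```python
from decimal import Decimal

original_ms = 1605019513000                 # timestamp from Step 1
as_scientific = f"{original_ms:.5E}"         # six significant digits
rounded_ms = int(Decimal("1.60502E+12"))     # what the literal really means

diff_seconds = (rounded_ms - original_ms) // 1000
minutes, seconds = divmod(diff_seconds, 60)

print(as_scientific)     # 1.60502E+12
print(rounded_ms)        # 1605020000000
print(minutes, seconds)  # 8 7
```

Under that assumption the rounded value is 487 seconds later than the original, which matches the observed difference. Keeping the full integer in the source data would avoid it.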

Official description

        Updates the contents of a FlowFile that contains Record-oriented data (i.e., data that can be read via a RecordReader and written by a RecordWriter). This Processor requires that at least one user-defined Property be added. The name of the Property should indicate a RecordPath that determines the field that should be updated. The value of the Property is either a replacement value (optionally making use of the Expression Language) or is itself a RecordPath that extracts a value from the Record. Whether the Property value is determined to be a RecordPath or a literal value depends on the configuration of the <Replacement Value Strategy> Property.

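My interpretation

UpdateRecord updates the content of a FlowFile. The updates are driven dynamically by one or more user-defined properties.

The key of each dynamic property is a RecordPath expression (not XPath), and the value may use the NiFi Expression Language.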

Configuration details

Record Reader

A controller service that implements RecordReaderFactory, used to read the FlowFile's content:

• GrokReader
• AvroReader (see Appendix 1)
• SyslogReader
• ScriptedReader
• Syslog5424Reader
• XMLReader
• ParquetReader
• JsonTreeReader (see Appendix 3)
• JsonPathReader
• CSVReader

Record Writer

A controller service that implements RecordSetWriterFactory, used to write the FlowFile's content:

• AvroRecordSetWriter (see Appendix 2)
• ScriptedRecordSetWriter
• CSVRecordSetWriter
• FreeFormTextRecordSetWriter
• JsonRecordSetWriter (see Appendix 4)
• ParquetRecordSetWriter
• XMLRecordSetWriter
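Replacement Value Strategy

This property selects how the replacement value is interpreted:

• Literal Value: the property value is a literal replacement, optionally using the Expression Language
• Record Path Value: the property value is itself a RecordPath that extracts a value from the record

Original data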

    [{
      "id": 17,
      "name": "John",
      "child": {
        "id": "1"
      },
      "siblingIds": [4, 8],
      "siblings": [
        { "name": "Jeremy", "id": 4 },
        { "name": "Julia", "id": 8 }
      ]
    },
    {
      "id": 98,
      "name": "Jane",
      "child": {
        "id": 2
      },
      "gender": "F",
      "siblingIds": [],
      "siblings": []
    }]

Example 1: Replace with Literal

| Property Name | Property Value |
| --- | --- |
| Replacement Value Strategy | Literal Value |
| /name | Jeremy |
| /gender | M |

Result:

    [{
      "id": 17,
      "name": "Jeremy",
      "child": {
        "id": "1"
      },
      "gender": "M",
      "siblingIds": [4, 8],
      "siblings": [
        { "name": "Jeremy", "id": 4 },
        { "name": "Julia", "id": 8 }
      ]
    },
    {
      "id": 98,
      "name": "Jeremy",
      "child": {
        "id": 2
      },
      "gender": "M",
      "siblingIds": [],
      "siblings": []
    }]

Note: the first record has no gender field, so one is added.
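The literal strategy can be mimicked in a few lines of plain Python. This is only an emulation on a trimmed copy of the sample data, not NiFi code; `replace_with_literal` is a hypothetical helper and it handles single-segment paths such as /name only.

```python
import copy

def replace_with_literal(records, updates):
    """Return a copy of records with each top-level field set to a literal."""
    result = copy.deepcopy(records)
    for record in result:
        for path, value in updates.items():
            field = path.lstrip("/")  # "/name" -> "name"
            record[field] = value     # creates the field when it is missing
    return result

records = [
    {"id": 17, "name": "John"},
    {"id": 98, "name": "Jane", "gender": "F"},
]
updated = replace_with_literal(records, {"/name": "Jeremy", "/gender": "M"})
print(updated)
```

As in the result above, the record without gender gains the field, and every record ends up with the same literal values.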

Example 2: Replace with RecordPath

| Property Name | Property Value |
| --- | --- |
| Replacement Value Strategy | Record Path Value |
| /name | /siblings[0]/name |

Result:

    [{
      "id": 17,
      "name": "Jeremy",
      "child": {
        "id": "1"
      },
      "siblingIds": [4, 8],
      "siblings": [
        { "name": "Jeremy", "id": 4 },
        { "name": "Julia", "id": 8 }
      ]
    },
    {
      "id": 98,
      "name": null,
      "child": {
        "id": 2
      },
      "gender": "F",
      "siblingIds": [],
      "siblings": []
    }]
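The second record ends up with a null name because /siblings[0]/name selects nothing when the siblings array is empty. A rough Python emulation of that behaviour, on a trimmed copy of the sample data and with a hypothetical helper `first_sibling_name`:

```python
def first_sibling_name(record):
    """Emulate /siblings[0]/name: None when there is no first sibling."""
    siblings = record.get("siblings") or []
    return siblings[0]["name"] if siblings else None

records = [
    {"id": 17, "name": "John", "siblings": [{"name": "Jeremy", "id": 4}]},
    {"id": 98, "name": "Jane", "siblings": []},
]
for record in records:
    record["name"] = first_sibling_name(record)

print([record["name"] for record in records])  # ['Jeremy', None]
```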

Example 3: Replace with Relative RecordPath

| Property Name | Property Value |
| --- | --- |
| Replacement Value Strategy | Record Path Value |
| /siblings[*]/name | ../id |

Result:

    [{
      "id": 17,
      "name": "John",
      "child": {
        "id": "1"
      },
      "siblingIds": [4, 8],
      "siblings": [
        { "name": "4", "id": 4 },
        { "name": "8", "id": 8 }
      ]
    },
    {
      "id": 98,
      "name": "Jane",
      "child": {
        "id": 2
      },
      "gender": "F",
      "siblingIds": [],
      "siblings": []
    }]

Note: .. refers to the parent of the current node, so ../id is the id that sits next to each matched name.
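In plain Python terms, the relative path means "for each matched name, go up to the enclosing sibling object and read its id". The sketch below is an emulation only; the conversion to text mirrors the "4" and "8" strings in the result above, and `copy_parent_id_to_name` is a made-up name.

```python
def copy_parent_id_to_name(record):
    """Emulate /siblings[*]/name = ../id for one record, in place."""
    for sibling in record.get("siblings", []):
        # ".." is the sibling object that contains the matched name field.
        sibling["name"] = str(sibling["id"])

record = {
    "id": 17,
    "name": "John",
    "siblings": [{"name": "Jeremy", "id": 4}, {"name": "Julia", "id": 8}],
}
copy_parent_id_to_name(record)
print(record["siblings"])
```

The top-level name is untouched because the path only matches name fields inside siblings.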

Example 4: Replace Multiple Values

| Property Name | Property Value |
| --- | --- |
| Replacement Value Strategy | Literal Value |
| //id | ${replacement.id} |

Note: the value is taken from a FlowFile attribute. Suppose the FlowFile's replacement.id attribute is 91.

Result:

    [{
      "id": 91,
      "name": "John",
      "child": {
        "id": "91"
      },
      "siblingIds": [4, 8],
      "siblings": [
        { "name": "Jeremy", "id": 91 },
        { "name": "Julia", "id": 91 }
      ]
    },
    {
      "id": 91,
      "name": "Jane",
      "child": {
        "id": 91
      },
      "gender": "F",
      "siblingIds": [],
      "siblings": []
    }]
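Two details in this result are easy to miss: //id reaches id fields at every depth but leaves siblingIds alone, and each field keeps its own type (the string "1" becomes "91", the numbers become 91). The recursive Python sketch below emulates that on a trimmed copy of the data. It is not how NiFi implements it, and `replace_descendants` is a hypothetical helper.

```python
def replace_descendants(node, field_name, text):
    """Set every scalar field called field_name, at any depth, to text."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == field_name and isinstance(value, (int, str)):
                node[key] = type(value)(text)  # keep the field's existing type
            else:
                replace_descendants(value, field_name, text)
    elif isinstance(node, list):
        for item in node:
            replace_descendants(item, field_name, text)

records = [{
    "id": 17,
    "child": {"id": "1"},
    "siblingIds": [4, 8],
    "siblings": [{"name": "Jeremy", "id": 4}],
}]
replace_descendants(records, "id", "91")  # "91" stands in for ${replacement.id}
print(records)
```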

Example 5: Use Expression Language to Modify Value

| Property Name | Property Value |
| --- | --- |
| Replacement Value Strategy | Literal Value |
| //name | ${field.value:toUpper()} |

Result:

    [{
      "id": 17,
      "name": "JOHN",
      "child": {
        "id": "1"
      },
      "siblingIds": [4, 8],
      "siblings": [
        { "name": "JEREMY", "id": 4 },
        { "name": "JULIA", "id": 8 }
      ]
    },
    {
      "id": 98,
      "name": "JANE",
      "child": {
        "id": 2
      },
      "gender": "F",
      "siblingIds": [],
      "siblings": []
    }]

Note: // matches the field at every level of the record, and ${field.value} is the value of the field currently being evaluated.
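The combination of a descendant path and field.value behaves like "visit every name field, then compute the new value from the old one". A small Python emulation, where the generator `descendant_fields` is a hypothetical helper and `str.upper` plays the role of toUpper():

```python
def descendant_fields(node, name):
    """Yield (container, key) for every field called name at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == name:
                yield node, key
            yield from descendant_fields(value, name)
    elif isinstance(node, list):
        for item in node:
            yield from descendant_fields(item, name)

records = [{
    "id": 17,
    "name": "John",
    "siblings": [{"name": "Jeremy", "id": 4}, {"name": "Julia", "id": 8}],
}]
# Collect the matches first, then rewrite each one from its current value.
for container, key in list(descendant_fields(records, "name")):
    container[key] = container[key].upper()

print(records)
```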

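This article is reposted from NIFI实战. If you believe it infringes your rights, please email contact@modb.pro with supporting evidence; once verified, 墨天轮 will remove the content immediately.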
