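In the last couple of days, a follower of my official account asked me a question:
How can the UpdateRecord processor convert a number in scientific notation into a date on output?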

Step 1: I used GenerateFlowFile to construct a value in scientific notation
{"a":1.60502E+12}
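1.60502E+12 is the scientific-notation form, rounded to six significant digits, of the millisecond timestamp 1605019513000, and 1605019513000 corresponds to the date 2020/11/10 22:45:13 (UTC+8).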

Step 2: I used UpdateRecord to do the data conversion

The writer's schema is:
{"type": "record","name":"at","fields" : [{"name": "a", "type": "string"}]}
The processing result was:

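Note: pay special attention to this schema. Field a starts out as a number, so the reader infers a built-in schema in which a is numeric. If the writer does not define its own schema, it falls back to that inferred schema by default. Since a has become a string by then, the write fails. The only way around this is to define the schema on the writer yourself so that it describes the new data structure.
The NiFi Expression Language expression I wrote is: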
${field.value:toDecimal():format("yyyy/MM/dd HH:mm:ss"):toString()}
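Open issue: this does manage to convert the scientific-notation value into a date, but the result is off by about 8 minutes; I will look into it later. The rest of this post records some notes from studying UpdateRecord.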
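One plausible explanation for the 8-minute gap, offered as a quick check rather than a confirmed diagnosis: 1.60502E+12 keeps only six significant digits, so it denotes 1605020000000 ms rather than 1605019513000 ms. The Python sketch below is not NiFi code; the UTC+8 zone is an assumption inferred from the 22:45:13 figure quoted earlier, and `fmt` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the date quoted in the post is expressed in UTC+8.
TZ = timezone(timedelta(hours=8))

original_ms = 1605019513000               # the exact millisecond timestamp
rounded_ms = int(float("1.60502E+12"))    # what the scientific-notation literal denotes


def fmt(ms: int) -> str:
    """Format epoch milliseconds like the yyyy/MM/dd HH:mm:ss pattern."""
    return datetime.fromtimestamp(ms / 1000, tz=TZ).strftime("%Y/%m/%d %H:%M:%S")


gap_seconds = (rounded_ms - original_ms) // 1000

print(fmt(original_ms))   # 2020/11/10 22:45:13
print(fmt(rounded_ms))    # 2020/11/10 22:53:20
print(gap_seconds)        # 487, i.e. 8 minutes 7 seconds
```

If this is indeed the cause, the lost digits cannot be recovered inside UpdateRecord; the exact value would have to be preserved upstream.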
Official description
Updates the contents of a FlowFile that contains Record-oriented data (i.e., data that can be read via a RecordReader and written by a RecordWriter). This Processor requires that at least one user-defined Property be added. The name of the Property should indicate a RecordPath that determines the field that should be updated. The value of the Property is either a replacement value (optionally making use of the Expression Language) or is itself a RecordPath that extracts a value from the Record. Whether the Property value is determined to be a RecordPath or a literal value depends on the configuration of the <Replacement Value Strategy> Property.
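My interpretation
Updates the content of a FlowFile; the updates are driven dynamically by one or more user-defined properties.
The key of each dynamic property is a RecordPath expression (an XPath-like syntax), and the value may use the NiFi Expression Language.
Configuration details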
| Record Reader | A controller service implementing RecordReaderFactory, used to read the FlowFile content. Options: GrokReader, AvroReader (see Appendix 1), SyslogReader, ScriptedReader, Syslog5424Reader, XMLReader, ParquetReader, JsonTreeReader (see Appendix 3), JsonPathReader, CSVReader |
| Record Writer | A controller service implementing RecordSetWriterFactory, used to write the FlowFile content. Options: AvroRecordSetWriter (see Appendix 2), ScriptedRecordSetWriter, CSVRecordSetWriter, FreeFormTextRecordSetWriter, JsonRecordSetWriter (see Appendix 4), ParquetRecordSetWriter, XMLRecordSetWriter |
| Replacement Value Strategy | The strategy for interpreting the replacement value: Literal Value or Record Path Value |
Original data
[{"id": 17,"name": "John","child": {"id": "1"},"siblingIds": [4, 8],"siblings": [{ "name": "Jeremy", "id": 4 },{ "name": "Julia", "id": 8 }]},{"id": 98,"name": "Jane","child": {"id": 2},"gender": "F","siblingIds": [],"siblings": []}]
Example 1: Replace with Literal
| Property Name | Property Value |
| Replacement Value Strategy | Literal Value |
| /name | Jeremy |
| /gender | M |
[{"id": 17,"name": "Jeremy","child": {"id": "1"},"gender": "M","siblingIds": [4, 8],"siblings": [{ "name": "Jeremy", "id": 4 },{ "name": "Julia", "id": 8 }]},{"id": 98,"name": "Jeremy","child": {"id": 2},"gender": "M","siblingIds": [],"siblings": []}]
Note: the first record has no gender field, so one is added.
Example 2: Replace with RecordPath
| Property Name | Property Value |
| Replacement Value Strategy | Record Path Value |
| /name | /siblings[0]/name |
[{"id": 17,"name": "Jeremy","child": {"id": "1"},"siblingIds": [4, 8],"siblings": [{ "name": "Jeremy", "id": 4 },{ "name": "Julia", "id": 8 }]},{"id": 98,"name": null,"child": {"id": 2},"gender": "F","siblingIds": [],"siblings": []}]
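To make the null in the second record easier to see, here is a small Python emulation of what setting /name from /siblings[0]/name does. It is not NiFi code: `first_sibling_name` is a hypothetical helper, and the records are trimmed copies of the original data.

```python
from typing import Optional

records = [
    {"id": 17, "name": "John",
     "siblings": [{"name": "Jeremy", "id": 4}, {"name": "Julia", "id": 8}]},
    {"id": 98, "name": "Jane", "siblings": []},
]


def first_sibling_name(record: dict) -> Optional[str]:
    """Emulate /siblings[0]/name: return None when the path selects nothing."""
    siblings = record.get("siblings") or []
    return siblings[0].get("name") if siblings else None


for record in records:
    # Emulates the /name property under the "Record Path Value" strategy.
    record["name"] = first_sibling_name(record)

print([r["name"] for r in records])  # ['Jeremy', None]
```

The second record has an empty siblings array, so the path selects nothing and name ends up null.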
Example 3: Replace with Relative RecordPath
| Property Name | Property Value |
| Replacement Value Strategy | Record Path Value |
| /siblings[*]/name | ../id |
[{"id": 17,"name": "John","child": {"id": "1"},"siblingIds": [4, 8],"siblings": [{ "name": "4", "id": 4 },{ "name": "8", "id": 8 }]},{"id": 98,"name": "Jane","child": {"id": 2},"gender": "F","siblingIds": [],"siblings": []}]
Note: .. refers to the parent node of the current level.
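The relative path can be pictured with a hypothetical Python emulation (not NiFi code): for every field matched by /siblings[*]/name, .. steps up to the enclosing sibling record, so ../id is that sibling's own id. The str() conversion mirrors the string values in the output above and is an assumption about how the schema types name.

```python
records = [
    {"id": 17, "name": "John",
     "siblings": [{"name": "Jeremy", "id": 4}, {"name": "Julia", "id": 8}]},
    {"id": 98, "name": "Jane", "siblings": []},
]

for record in records:
    for sibling in record["siblings"]:
        # "name" is the matched field; its parent (..) is the sibling itself,
        # so ../id resolves to sibling["id"], not to the top-level record id.
        sibling["name"] = str(sibling["id"])

print(records[0]["siblings"])
```

The top-level name fields stay untouched because the property path only matches names inside siblings.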
Example 4: Replace Multiple Values
| Property Name | Property Value |
| Replacement Value Strategy | Literal Value |
| //id | ${replacement.id} |
Note: the value is taken from a FlowFile attribute. Suppose the FlowFile's replacement.id attribute is 91.
[{"id": 91,"name": "John","child": {"id": "91"},"siblingIds": [4, 8],"siblings": [{ "name": "Jeremy", "id": 91 },{ "name": "Julia", "id": 91 }]},{"id": 91,"name": "Jane","child": {"id": 91},"gender": "F","siblingIds": [],"siblings": []}]
Example 5: Use Expression Language to Modify Value
| Property Name | Property Value |
| Replacement Value Strategy | Literal Value |
| //name | ${field.value:toUpper()} |
[{"id": 17,"name": "JOHN","child": {"id": "1"},"siblingIds": [4, 8],"siblings": [{ "name": "JEREMY", "id": 4 },{ "name": "JULIA", "id": 8 }]},{"id": 98,"name": "JANE","child": {"id": 2},"gender": "F","siblingIds": [],"siblings": []}]
Note: // matches the named field at every level of the record, and ${field.value} stands for the value at the currently matched path.
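The // matching used in Examples 4 and 5 can be emulated roughly as a recursive walk. This is a Python sketch under stated assumptions, not NiFi's implementation: `update_descendants` is a hypothetical helper, the record is a trimmed copy of the original data, and the type coercion that the writer schema performs (for example 91 becoming "91" in child) is not modeled.

```python
from typing import Any, Callable


def update_descendants(node: Any, field: str, fn: Callable[[Any], Any]) -> None:
    """Apply fn to every scalar field called `field` at any depth (emulates //field)."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == field and not isinstance(value, (dict, list)):
                # fn receives the current value, playing the role of ${field.value}.
                node[key] = fn(value)
            else:
                update_descendants(value, field, fn)
    elif isinstance(node, list):
        for item in node:
            update_descendants(item, field, fn)


record = {
    "id": 17, "name": "John", "child": {"id": "1"},
    "siblings": [{"name": "Jeremy", "id": 4}, {"name": "Julia", "id": 8}],
}

# Example 5: //name with ${field.value:toUpper()}
update_descendants(record, "name", str.upper)

# Example 4: //id with a fixed replacement (the attribute value 91)
update_descendants(record, "id", lambda _current: 91)

print(record)
```

Passing a function instead of a constant keeps both cases in one helper: Example 4 ignores the current value, while Example 5 transforms it.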




