Official description
The Jolt utilities that process JSON are not stream based, so transforming a large JSON document may consume a large amount of memory. Currently, UTF-8 FlowFile content and Jolt specifications are supported. A specification can be defined using Expression Language, where attributes can be referenced on either the left-hand or right-hand side within the specification syntax. Custom Jolt transformations (that implement the Transform interface) are supported. Modules containing custom libraries that do not exist on the current class path can be included via the Custom Module Directory property. Note: when configuring the processor, if the user selects the Default transformation yet provides a Chain specification, the system does not warn that the specification is invalid and will produce failed FlowFiles. This is a known issue identified within the Jolt library.
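The known issue in the note comes down to the top-level shape of the spec: a Chain specification is a JSON array of entries that each name an `operation`, while a single transformation such as Default takes a JSON object. A minimal pre-flight check could look like the sketch below. The helper `spec_matches_dsl` and its DSL labels are hypothetical and are not part of NiFi or Jolt; the rule that every non-Chain DSL expects an object is a simplifying assumption.

```python
import json


def spec_matches_dsl(dsl: str, spec_text: str) -> bool:
    """Return True if the spec's top-level JSON shape fits the selected DSL.

    Hypothetical helper: "Chain" expects a JSON array of objects that each
    carry an "operation" key; every other DSL is assumed to expect an object.
    """
    spec = json.loads(spec_text)
    if dsl == "Chain":
        return isinstance(spec, list) and all(
            isinstance(step, dict) and "operation" in step for step in spec
        )
    return isinstance(spec, dict)


chain_spec = '[{"operation": "default", "spec": {"status": "unknown"}}]'
default_spec = '{"status": "unknown"}'

print(spec_matches_dsl("Chain", chain_spec))      # True
print(spec_matches_dsl("Default", default_spec))  # True
print(spec_matches_dsl("Default", chain_spec))    # False: the mismatch from the note
```

Running a check like this before deploying a flow would surface the mismatch that the processor itself does not report.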
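My interpretation
JoltTransformJSON is a highly abstracted, powerful processor for transforming and cleaning data.
Beyond its built-in capabilities, its biggest strength is support for external extensions, which makes it well suited to custom development.
Jolt does not process JSON as a stream, so large JSON objects consume a considerable amount of JVM memory.
The Jolt Specification supports NiFi Expression Language, so values in the spec can be read from FlowFile attributes.
A custom Jolt transformation must implement the Transform interface.
Custom Module Directory is the path to the JAR files produced by that custom development.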
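To make the Expression Language point concrete, here is a rough sketch of how a spec template could be resolved against FlowFile attributes before it is parsed. It only handles plain `${name}` references, which is a tiny subset of what NiFi Expression Language offers, and the function `resolve_spec` and the attribute names are made up for illustration.

```python
import json
import re

# Matches simple ${attribute.name} references only (no EL functions).
_ATTR_REF = re.compile(r"\$\{([^}]+)\}")


def resolve_spec(spec_template: str, attributes: dict) -> str:
    """Replace each ${name} with the matching attribute value (simplified)."""
    return _ATTR_REF.sub(lambda m: attributes[m.group(1)], spec_template)


# Hypothetical spec: references appear on both the left and right side.
spec_template = '{"${target.key}": "${default.value}"}'
attrs = {"target.key": "region", "default.value": "eu-west"}

resolved = resolve_spec(spec_template, attrs)
print(resolved)              # {"region": "eu-west"}
print(json.loads(resolved))  # parses as ordinary JSON once resolved
```

Because the resolved text depends on the attributes, two FlowFiles with different attribute values can end up with two different specs.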
Configuration details
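| Property | Meaning |
| --- | --- |
| Jolt Transformation DSL | Type of transformation to apply. Cardinality: changes the cardinality of the data; Default: adds default values; Remove: removes unwanted fields; Shift: maps fields to new locations; Chain: runs a list of transformations; Modify - Default: writes when the key is missing or its value is null; Modify - Define: writes only when the key does not exist; Modify - Overwrite: always overwrites; Sort: sorts keys; Custom: user-defined transformation. |
| Custom Transformation Class Name | Name of the custom implementation class (much like specifying a MySQL driver class). |
| Custom Module Directory | Location of the JAR files containing the custom implementation. |
| Jolt Specification | The transformation (mapping) rules. |
| Transform Cache Size | Compiling a transformation spec is expensive, especially when the Jolt Specification depends on FlowFile attributes; recompiling it for every FlowFile would be a serious waste. This property sets how many compiled Jolt transformations are kept in the cache. |
| Pretty Print | When true, pretty-prints the JSON output. |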
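The three Modify variants are easy to mix up, so here is a small sketch of their write rules as described in the table. It works on flat dictionaries only and is not Jolt code; the function `modify` and its mode labels are invented for this illustration, and nested specs and Jolt functions are out of scope.

```python
def modify(doc: dict, spec: dict, mode: str) -> dict:
    """Apply a flat Modify-style spec to a copy of doc.

    mode is "overwrite", "default", or "define" (labels chosen for this sketch).
    """
    out = dict(doc)
    for key, value in spec.items():
        missing = key not in out
        is_null = (not missing) and out[key] is None
        if mode == "overwrite":
            write = True                # always overwrite
        elif mode == "default":
            write = missing or is_null  # write when missing or null
        elif mode == "define":
            write = missing             # write only when the key is absent
        else:
            raise ValueError(f"unknown mode: {mode}")
        if write:
            out[key] = value
    return out


doc = {"a": 1, "b": None}
spec = {"a": 9, "b": 9, "c": 9}

print(modify(doc, spec, "overwrite"))  # {'a': 9, 'b': 9, 'c': 9}
print(modify(doc, spec, "default"))    # {'a': 1, 'b': 9, 'c': 9}
print(modify(doc, spec, "define"))     # {'a': 1, 'b': None, 'c': 9}
```

The only input where Default and Define differ is a key that exists with a null value, which is `b` in this example.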

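The reasoning behind Transform Cache Size can be sketched with a bounded cache keyed by the resolved spec text. This is an illustration of the idea, not NiFi's implementation: `compile_spec` is a stand-in for the expensive compile step, the toy transform only fills in missing top-level keys, and `maxsize=2` plays the role of the cache size property.

```python
import json
from functools import lru_cache


@lru_cache(maxsize=2)  # stand-in for Transform Cache Size
def compile_spec(resolved_spec: str):
    """Pretend-compile a spec; called only on a cache miss."""
    defaults = json.loads(resolved_spec)

    def transform(doc: dict) -> dict:
        # Toy transform: add defaults for keys the document lacks.
        return {**defaults, **doc}

    return transform


# Four FlowFiles, but only two distinct resolved specs.
specs = ['{"env": "prod"}', '{"env": "prod"}', '{"env": "test"}', '{"env": "prod"}']
results = [compile_spec(s)({"id": 1}) for s in specs]

info = compile_spec.cache_info()
print(info.misses, info.hits)  # 2 2
print(results[0])              # {'env': 'prod', 'id': 1}
```

If the number of distinct resolved specs exceeds the cache size, entries are evicted and recompiled, so the setting is best sized to the number of spec variants a flow actually produces.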



