Problem description
A CSV file is brought into the NiFi workflow using a GetFile processor. One of its columns is an "id", and each id stands for a certain string. There are about 3 ids. For example, if my CSV consists of
name,age,id
John,10,Y
Jake,55,N
Finn,23,C
I am aware that Y means York, N means Old and C means Cat. I want a new column with a header named "nick" and have the corresponding nick for each id.
name,age,id,nick
John,10,Y,York
Jake,55,N,Old
Finn,23,C,Cat
Finally, I want a CSV with the extra column and the appropriate data for each record. How can this be done using Apache NiFi? Please advise me on the processors that must be used and the configuration that must be changed to accomplish this task.
Flow:
- add a new nick column
- copy over the id to the nick column
- look at each line and match the id with its corresponding value
- set this value into current line in the nick column
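The four steps above can be made concrete outside NiFi. The following is a minimal Python sketch of the intended transformation, not a NiFi configuration; the function name `add_nick_column` and the `NICKS` dictionary are hypothetical names introduced for illustration, and the mapping values are the ones given in the question.

```python
import csv
import io

# Mapping taken from the question: Y -> York, N -> Old, C -> Cat.
NICKS = {"Y": "York", "N": "Old", "C": "Cat"}


def add_nick_column(csv_text: str) -> str:
    """Return csv_text with a 'nick' column derived from the 'id' column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    # Step 1: add the new column to the header.
    fieldnames = list(reader.fieldnames or []) + ["nick"]
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Step 2: copy the id into the nick column.
        row["nick"] = row["id"]
        # Steps 3 and 4: look up the id and store the mapped value;
        # an id that is not in the mapping is left as it is.
        row["nick"] = NICKS.get(row["nick"], row["nick"])
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = "name,age,id\nJohn,10,Y\nJake,55,N\nFinn,23,C\n"
    print(add_nick_column(sample), end="")
```

Leaving unknown ids unchanged is a choice made for this sketch; the question does not say what should happen to an id outside the three listed.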
You can achieve this using either ReplaceText or ReplaceTextWithMapping. I do it with ReplaceText:
UpdateRecord will parse the csv file, add the new column and copy the id value:
Create a CSVReader and keep the default properties. Create a CSVRecordSetWriter and set Schema Access Strategy to Schema Text. Set the Schema Text property to
{
  "type": "record",
  "name": "foobar",
  "namespace": "my.example",
  "fields": [
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "age",
      "type": "int"
    },
    {
      "name": "id",
      "type": "string"
    },
    {
      "name": "nick",
      "type": "string"
    }
  ]
}
Notice that it has the new column. Finally replace the original values with the mapping:
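The ReplaceText settings for this last step are not reproduced here. As a rough illustration of what a search-and-replace of this kind has to do once the nick column holds a copy of the id, the Python sketch below rewrites only the last field of each line with a regular expression. The pattern, the `replace_nick` function and the `NICKS` table are assumptions made for this example, not the configuration from the answer.

```python
import re

# Mapping taken from the question: Y -> York, N -> Old, C -> Cat.
NICKS = {"Y": "York", "N": "Old", "C": "Cat"}

# Match the last field of a line: a comma followed by a known id at end of line.
# The header line is not touched because it ends in ",nick", not in a bare id.
LAST_FIELD = re.compile(
    r",(" + "|".join(re.escape(key) for key in NICKS) + r")$",
    flags=re.MULTILINE,
)


def replace_nick(csv_text: str) -> str:
    """Replace the copied id in the last column with its mapped value."""
    return LAST_FIELD.sub(lambda m: "," + NICKS[m.group(1)], csv_text)


if __name__ == "__main__":
    copied = "name,age,id,nick\nJohn,10,Y,Y\nJake,55,N,N\nFinn,23,C,C\n"
    print(replace_nick(copied), end="")
```

Anchoring the pattern at the end of the line keeps the original id column intact, which matters here because the id and its copy hold the same text.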
PS: I noticed you are new to SO, welcome! You have not accepted a single answer in any of your previous questions. Accept them, if they solve your problem, as it will help others to find solutions.