Problem description
I want to read data row by row, and wherever I find an opening double quote I want to replace newline characters with a space until the closing double quote is encountered, like:
090033ec82b13639,CPDM Initiated,Logistical,"There corrected.",Gul Y Serbest,Urology
090033ec82ae0c07,Initiated,NA,"To local testing
Rohit 3 to 4.",Julienne B Orr,Oncology
090033ec82b35fd0,Externally Initiated,NA,regulatory agency requests,Kenneth A Lord,Oncology
In the data above, the second row contains an opening double quote whose closing double quote is on the third line, so those two lines need to be merged with a single space, as below:
090033ec82b13639,CPDM Initiated,Logistical,"There corrected.",Gul Y Serbest,Urology
090033ec82ae0c07,Initiated,NA,"To local testing Rohit 3 to 4.",Julienne B Orr,Oncology
090033ec82b35fd0,Externally Initiated,NA,regulatory agency requests,Kenneth A Lord,Oncology
Solution
You can use this GNU awk one-liner (it relies on gawk's RT variable):
awk -v RS='"[^"]*"' -v ORS= '{gsub(/\n/, " ", RT); print $0 RT}' file
090033ec82b13639,CPDM Initiated,Logistical,"There corrected.",Gul Y Serbest,Urology
090033ec82ae0c07,Initiated,NA,"To local testing Rohit 3 to 4.",Julienne B Orr,Oncology
090033ec82b35fd0,Externally Initiated,NA,regulatory agency requests,Kenneth A Lord,Oncology
RS='"[^"]*"'
- Input Record Separator is set to the regex "[^"]*", so each double-quoted span acts as a separator
-v ORS=
- Output Record Separator is set to the empty string
gsub(/\n/, " ", RT)
- Replace newlines with a space in the text matched by the Input Record Separator (gawk stores that text in RT)
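For readers without gawk, the same idea can be sketched in Python: match each double-quoted span with the regex used for RS above and replace newlines only inside the match. This is a minimal sketch, not part of the original answer; the function name join_quoted_newlines and the sample variable are made up for illustration, and it assumes the whole input fits in memory and that quotes are balanced.

```python
import re


def join_quoted_newlines(text: str) -> str:
    """Replace every newline inside a double-quoted span with one space."""
    # "[^"]*" matches one quoted span, newlines included. It is the same
    # pattern the awk command uses as its record separator.
    return re.sub(r'"[^"]*"', lambda m: m.group(0).replace("\n", " "), text)


# Sample data copied from the question (hypothetical variable name).
sample = (
    '090033ec82b13639,CPDM Initiated,Logistical,"There corrected.",Gul Y Serbest,Urology\n'
    '090033ec82ae0c07,Initiated,NA,"To local testing\n'
    'Rohit 3 to 4.",Julienne B Orr,Oncology\n'
    '090033ec82b35fd0,Externally Initiated,NA,regulatory agency requests,Kenneth A Lord,Oncology\n'
)

print(join_quoted_newlines(sample), end="")
```

Text outside the quotes is left untouched, so the newlines that end each record survive.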
And here is a Perl one-liner:
perl -0pe 's/"[^\n"]*"(*SKIP)(*F)|("[^"\n]*)\n([^"]*")/$1 $2/g' file
090033ec82b13639,CPDM Initiated,Logistical,"There corrected.",Gul Y Serbest,Urology
090033ec82ae0c07,Initiated,NA,"To local testing Rohit 3 to 4.",Julienne B Orr,Oncology
090033ec82b35fd0,Externally Initiated,NA,regulatory agency requests,Kenneth A Lord,Oncology
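One caveat, as far as I can tell from the pattern: the Perl substitution replaces only the first newline inside each quoted field, because the remaining newlines are consumed by the second capture group. If a field can span more than two lines, a CSV parser may be a safer route. The following is a sketch using Python's standard csv module, which is a different technique from the two one-liners above. The name flatten_csv_newlines and the sample variable are hypothetical, and the writer uses its default minimal quoting, so quotes that are not strictly needed (such as those around "There corrected.") are dropped in the output.

```python
import csv
import io


def flatten_csv_newlines(text: str) -> str:
    """Re-emit CSV text with newlines inside fields replaced by spaces."""
    # newline="" keeps the original line endings so the csv module can
    # decide which newlines belong to a quoted field.
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # splitlines() breaks a field at embedded newlines; joining with a
        # space merges the pieces back into one line.
        writer.writerow([" ".join(field.splitlines()) for field in row])
    return out.getvalue()


# Sample data copied from the question (hypothetical variable name).
sample = (
    '090033ec82b13639,CPDM Initiated,Logistical,"There corrected.",Gul Y Serbest,Urology\n'
    '090033ec82ae0c07,Initiated,NA,"To local testing\n'
    'Rohit 3 to 4.",Julienne B Orr,Oncology\n'
    '090033ec82b35fd0,Externally Initiated,NA,regulatory agency requests,Kenneth A Lord,Oncology\n'
)

print(flatten_csv_newlines(sample), end="")
```

If the original quoting must be preserved exactly, the regex-based approaches are the better fit, since they never re-serialize the fields.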