This article describes how to replace newline characters that fall between double quotes with a space.

Problem description

I want to read data row by row, and wherever I find a double quote, I want to replace newline characters with a space until the second (closing) double quote is encountered, like:

090033ec82b13639,CPDM Initiated,Logistical,"There corrected.",Gul Y Serbest,Urology
090033ec82ae0c07,Initiated,NA,"To   local testing
Rohit  3 to 4.",Julienne B Orr,Oncology
090033ec82b35fd0,Externally Initiated,NA,regulatory agency requests,Kenneth A Lord,Oncology

In the data above, the second row opens a double quote and the closing double quote appears on the third line, so these lines need to be merged with a single space, as below:

090033ec82b13639,CPDM Initiated,Logistical,"There corrected.",Gul Y Serbest,Urology
090033ec82ae0c07,Initiated,NA,"To   local testing Rohit  3 to 4.",Julienne B Orr,Oncology
090033ec82b35fd0,Externally Initiated,NA,regulatory agency requests,Kenneth A Lord,Oncology
Solution

You can use this gnu-awk one-liner:

awk -v RS='"[^"]*"' -v ORS= '{gsub(/\n/, " ", RT); print $0  RT}' file
090033ec82b13639,CPDM Initiated,Logistical,"There corrected.",Gul Y Serbest,Urology
090033ec82ae0c07,Initiated,NA,"To   local testing Rohit  3 to 4.",Julienne B Orr,Oncology
090033ec82b35fd0,Externally Initiated,NA,regulatory agency requests,Kenneth A Lord,Oncology

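  • RS='"[^"]*"' - the input record separator is set to the regex "[^"]*", which matches a complete double-quoted field, embedded newlines included
  • -v ORS= - the output record separator is set to the empty string, so print adds nothing of its own
  • gsub(/\n/, " ", RT) - replaces newlines with spaces in RT, the gawk variable that holds the text matched by the input record separator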
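For readers without gawk, the same idea can be sketched in Python: match each double-quoted span with the regex "[^"]*" and replace the newlines inside it. This is only an illustrative equivalent, not part of the original answer; the function name join_quoted_newlines and the sample variable are hypothetical, and the sketch assumes quotes are balanced and lines end in plain \n.

```python
import re


def join_quoted_newlines(text: str) -> str:
    """Replace every newline inside a double-quoted span with a single space."""
    # "[^"]*" matches one quoted span, newlines included, the same pattern
    # the awk one-liner uses as its record separator.
    return re.sub(r'"[^"]*"', lambda m: m.group(0).replace("\n", " "), text)


# Sample input copied from the question (hypothetical variable name).
sample = (
    '090033ec82b13639,CPDM Initiated,Logistical,"There corrected.",Gul Y Serbest,Urology\n'
    '090033ec82ae0c07,Initiated,NA,"To   local testing\n'
    'Rohit  3 to 4.",Julienne B Orr,Oncology\n'
    '090033ec82b35fd0,Externally Initiated,NA,regulatory agency requests,Kenneth A Lord,Oncology\n'
)

print(join_quoted_newlines(sample), end="")
```

Text outside quoted spans is left untouched, so the newlines that end each record survive.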

And here is a perl one-liner:

perl -0pe 's/"[^\n"]*"(*SKIP)(*F)|("[^"\n]*)\n([^"]*")/$1 $2/g' file
090033ec82b13639,CPDM Initiated,Logistical,"There corrected.",Gul Y Serbest,Urology
090033ec82ae0c07,Initiated,NA,"To   local testing Rohit  3 to 4.",Julienne B Orr,Oncology
090033ec82b35fd0,Externally Initiated,NA,regulatory agency requests,Kenneth A Lord,Oncology
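
If the file is well-formed CSV, another option is to let a CSV parser find the field boundaries instead of a regex. The sketch below uses Python's standard csv module and is an assumption-laden alternative, not the original answer's method: the function name flatten_csv_newlines is hypothetical, and because csv.writer only quotes fields that need it, the output is equivalent CSV but the quoting may differ from the input.

```python
import csv
import io


def flatten_csv_newlines(text: str) -> str:
    """Parse CSV text and replace newlines embedded in fields with spaces."""
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # Handle both \r\n and \n inside a quoted field.
        writer.writerow(
            [field.replace("\r\n", " ").replace("\n", " ") for field in row]
        )
    return out.getvalue()


# Sample input copied from the question (hypothetical variable name).
sample = (
    '090033ec82b13639,CPDM Initiated,Logistical,"There corrected.",Gul Y Serbest,Urology\n'
    '090033ec82ae0c07,Initiated,NA,"To   local testing\n'
    'Rohit  3 to 4.",Julienne B Orr,Oncology\n'
    '090033ec82b35fd0,Externally Initiated,NA,regulatory agency requests,Kenneth A Lord,Oncology\n'
)

print(flatten_csv_newlines(sample), end="")
```

This version also copes with fields that span more than two lines and with doubled quotes ("") inside a quoted field, since the parser rather than a pattern decides where each field ends.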


10-29 14:14