How do I specify a text qualifier using awk or another Linux program?
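My data (sample below) is actually tab-delimited, but some fields contain tabs themselves. Fields are qualified with double quotes.
How can I specify that fields are delimited not only by tabs but also by quotes?
Here is my current script: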

 awk '{OF=OFS="\t"}{print $1,$7,$8,$10,$11,$21}' cyme.txt | grep -i pilates

Also, for practical purposes, here is an exact plain-text copy of a sample of the data:
"723721093013"  "AFL"   "1" ""  "15"    "ALT ROCK...."  "Hai!........................"  "Creatures, The.............."  2   "N" 4   7.48    2004.02.17  0.0000  .  .    .  .    2
"723721093112"  "AFL"   "1" ""  "5" "ELECTRONIC.."  "Crash And Burn.............."  "Foxx, John/Gordon, Louis...."  1   "W" 4   11.98   2004.02.17  0.0000  .  .    .  .    73
"819162013137"  "AHY"   "1" ""  "101"   "PUNK........"  "Truth, Love and Liberty....."  "FM359......................."  2   "H" 1   4.48    2014.01.14  0.0000  .  .    .  .    39
"879198005148"  "AHY"   "1" ""  "14"    "PUNK........"  "Re-Volts S/T................"  "Re-Volts, The..............."  1   "J" 4   5.48    2007.12.11  0.0000  .  .    .  .    10
"879198004288"  "AHY"   "1" ""  "24"    "PUNK........"  "Read Between The Lines......"  "Smalltown..................."  1   "N" 4   7.48    2009.12.01  0.0000  .  .    .  .    17

Let me know if anything needs clarification.
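I realize that awk, wonderful as it is, may not be the right tool for this job. If that is the case, I'd be glad to learn which other commands are suited to processing text files with field qualifiers.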

Best answer

If gawk is available, use a regex as the field separator:

> gawk '{for (i=1;i<=NF;i++){if ($i){printf("FN: %d Content: %s",i,$i)}}print "\n"}' FS='([\t]*?\"| +)' infile
FN: 2 Content: 723721093013FN: 5 Content: AFLFN: 8 Content: 1FN: 14 Content: 15FN: 17 Content: ALTFN: 18 Content: ROCK....FN: 21 Content: Hai!........................FN: 24 Content: Creatures,FN: 25 Content: The..............FN: 27 Content: 2FN: 29 Content: NFN: 31 Content: 4FN: 32 Content: 7.48FN: 33 Content: 2004.02.17FN: 34 Content: 0.0000FN: 35 Content: .FN: 36 Content: .FN: 37 Content: .FN: 38 Content: .FN: 39 Content: 2

FN: 2 Content: 723721093112FN: 5 Content: AFLFN: 8 Content: 1FN: 14 Content: 5FN: 17 Content: ELECTRONIC..FN: 20 Content: CrashFN: 21 Content: AndFN: 22 Content: Burn..............FN: 25 Content: Foxx,FN: 26 Content: John/Gordon,FN: 27 Content: Louis....FN: 29 Content: 1FN: 31 Content: WFN: 33 Content: 4FN: 34 Content: 11.98FN: 35 Content: 2004.02.17FN: 36 Content: 0.0000FN: 37 Content: .FN: 38 Content: .FN: 39 Content: .FN: 40 Content: .FN: 41 Content: 73

FN: 2 Content: 819162013137FN: 5 Content: AHYFN: 8 Content: 1FN: 14 Content: 101FN: 17 Content: PUNK........FN: 20 Content: Truth,FN: 21 Content: LoveFN: 22 Content: andFN: 23 Content: Liberty.....FN: 26 Content: FM359.......................FN: 28 Content: 2FN: 30 Content: HFN: 32 Content: 1FN: 33 Content: 4.48FN: 34 Content: 2014.01.14FN: 35 Content: 0.0000FN: 36 Content: .FN: 37 Content: .FN: 38 Content: .FN: 39 Content: .FN: 40 Content: 39

FN: 2 Content: 879198005148FN: 5 Content: AHYFN: 8 Content: 1FN: 14 Content: 14FN: 17 Content: PUNK........FN: 20 Content: Re-VoltsFN: 21 Content: S/T................FN: 24 Content: Re-Volts,FN: 25 Content: The...............FN: 27 Content: 1FN: 29 Content: JFN: 31 Content: 4FN: 32 Content: 5.48FN: 33 Content: 2007.12.11FN: 34 Content: 0.0000FN: 35 Content: .FN: 36 Content: .FN: 37 Content: .FN: 38 Content: .FN: 39 Content: 10

FN: 2 Content: 879198004288FN: 5 Content: AHYFN: 8 Content: 1FN: 14 Content: 24FN: 17 Content: PUNK........FN: 20 Content: ReadFN: 21 Content: BetweenFN: 22 Content: TheFN: 23 Content: Lines......FN: 26 Content: Smalltown...................FN: 28 Content: 1FN: 30 Content: NFN: 32 Content: 4FN: 33 Content: 7.48FN: 34 Content: 2009.12.01FN: 35 Content: 0.0000FN: 36 Content: .FN: 37 Content: .FN: 38 Content: .FN: 39 Content: .FN: 40 Content: 17
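
Note that the regex above also splits on spaces, so a value such as "Crash And Burn" ends up in several fields. As a rough alternative sketch (not part of the original answer), the input can be split on tabs only, and any piece that opens a double quote without closing it can be glued back onto the following piece. The helper name extract_fields, the choice of columns 1, 7 and 8, and the shortened sample row with an embedded tab are all made up for illustration. The sketch assumes quotes are never escaped or doubled inside a field. gawk 4.0 and later also provide the FPAT variable for describing field contents by regex; the version below avoids it so that it should run in any POSIX awk.

```shell
# Hypothetical helper (not from the original answer): prints columns 1, 7 and 8
# of tab-separated input whose double-quoted fields may contain tabs.
# Assumption: quotes are never escaped or doubled inside a field.
extract_fields() {
  awk '
    BEGIN { FS = OFS = "\t" }
    {
      n = 0
      split("", col)            # clear leftovers from the previous record
      for (i = 1; i <= NF; i++) {
        f = $i
        # An opening quote with no closing quote means the tab was inside
        # the field, so glue the next tab-split piece back on.
        while (f ~ /^"[^"]*$/ && i < NF) {
          f = f "\t" $(++i)
        }
        # Drop the surrounding qualifier quotes, if present.
        if (f ~ /^".*"$/) {
          f = substr(f, 2, length(f) - 2)
        }
        col[++n] = f
      }
      print col[1], col[7], col[8]
    }
  ' "$@"
}

# Illustrative row only: field 6 contains a tab inside its quotes.
printf '"723721093013"\t"AFL"\t"1"\t""\t"15"\t"ALT\tROCK"\t"Hai!"\t"Creatures, The"\t2\n' | extract_fields
```

With the real file, the call would look like extract_fields cyme.txt | grep -i pilates, after changing the print line to the columns actually needed.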

Regarding "linux - Specifying text qualifier and delimiter in Linux", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/22461282/
