So I have a file that looks like this:

                         /translation="MDGVTQQNAALVQEATTAAASLEEQARNLTAAVAAFDLGDKQTV
                         LITPRAAVPALKRPALKASLPASSSHGNWETF"
                         /product="Methyl-accepting chemotaxis protein I (serine
                         chemoreceptor protein)"
         CDS             complement(471..590)
                         /db_xref="SEED:fig|1240086.14.peg.2"
                         /translation="MHQYQSAILAKICRYGGIEKPEITPASVYKLDSHWRYVI"
                         /product="hypothetical protein"
         CDS             717..2354
                         /db_xref="SEED:fig|1240086.14.peg.3"
                         /translation="MGFFVVLWGGASGFSLYSLKQVTTLLHDNSTQGRTYTYLVYGND
                         QYFRSVTRMARVMDYSQFSDAAIASLEEQAQQLTKAVEVFHLGSEYQTAAS
                         RTRPAGNMALKRPALSGMAPALPPARTASDEGSWEKF"
                         /product="Methyl-accepting chemotaxis protein I (serine
                         chemoreceptor protein)"
                         /product="macromolecule metabolism; macromolecule
                         degradation; degradation of proteins, peptides,
                         glycopeptides"

I need to extract the text between the quotes after each "/product=", so I need this:

    Methyl-accepting chemotaxis protein I (serine chemoreceptor protein)
    hypothetical protein
    Methyl-accepting chemotaxis protein I (serine chemoreceptor protein)
    macromolecule metabolism; macromolecule degradation; degradation of proteins, peptides, glycopeptides

I have to use awk, so I wrote this:

    awk '/\/product/ {split($0, a, "\""); printf a[2] "\n"}'

But this only picks up the text on the same line as "/product", and sometimes the value spans two or three lines. I'm out of ideas as to how to get the entire text between the quotes. Can anyone help?

Solution

awk to the rescue!
This needs multi-character RS support (gawk):

    $ awk -v RS='/| CDS' -F'"' '/^product/{gsub("\n +"," "); print $2}' file
    Methyl-accepting chemotaxis protein I (serine chemoreceptor protein)
    hypothetical protein
    Methyl-accepting chemotaxis protein I (serine chemoreceptor protein)
    macromolecule metabolism; macromolecule degradation; degradation of proteins, peptides, glycopeptides

Explanation

Set the record separator so that a new record starts at every "/" or " CDS", select the records that start with "product", collapse each line break plus the indentation that follows it into a single space, and print the text between the two quotes (the second field, since the field separator is set to the double quote).
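If gawk is not available, the same result can probably be reached without a multi-character RS by reading line by line and collecting a value until its closing quote shows up. The sketch below is a different technique from the one-liner above, not a rewrite of it. The sample file, the `sample` variable and the `extract_products` function name are all made up for illustration, and it assumes that a value always ends with a double quote at the very end of a line (no trailing spaces or carriage returns, no doubled quotes inside values). One side benefit is that a "/" inside a product name does not cut the value short, which it would when "/" is part of RS.

```shell
#!/bin/sh
# Hypothetical sample file that mimics the layout shown in the question.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
                     /translation="MHQYQSAILAKICRYGGIEKPEITPASVYKLDSHWRYVI"
                     /product="hypothetical protein"
     CDS             717..2354
                     /db_xref="SEED:fig|1240086.14.peg.3"
                     /product="Methyl-accepting chemotaxis protein I (serine
                     chemoreceptor protein)"
                     /product="macromolecule metabolism; macromolecule
                     degradation; degradation of proteins, peptides,
                     glycopeptides"
EOF

# extract_products FILE: print every /product value on one line each.
extract_products() {
    awk '
        # A /product= qualifier opens a new value: drop everything up to
        # and including the opening quote and start collecting.
        /^[ \t]*\/product="/ {
            sub(/^[ \t]*\/product="/, "")
            value = ""
            collecting = 1
        }
        # While collecting, strip the indentation of the current line and
        # append it to the value, joined with a single space.
        collecting {
            sub(/^[ \t]+/, "")
            value = (value == "" ? $0 : value " " $0)
            # A quote at the end of the line closes the value.
            if (value ~ /"$/) {
                sub(/"$/, "", value)
                print value
                collecting = 0
            }
        }
    ' "$1"
}

extract_products "$sample"
```

Since only single-character matching and plain string concatenation are used, this should run under any POSIX awk, not just gawk.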