本文介绍了如何在特定列中添加双引号的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
如何在gene_id中添加双引号?
How to add double quotes to the gene_id?
我的原始文件:
##gtf-version 3
Bany_Scaf1 maker gene 201136 207903 . + . Alias "maker-Bany_Scaf1-snap-gene-2.23"; Dbxref "InterPro:IPR019774" "Pfam:PF00351"; ID Bany_03723; Name Bany_03723; Ontology_term "GO:0016714" "GO:0055114"; gene_id Bany_03723
Bany_Scaf1 maker transcript 201136 207903 . + . Alias "maker-Bany_Scaf1-snap-gene-2.23-mRNA-1"; Dbxref "InterPro:IPR019774" "Pfam:PF00351"; ID "Bany_03723-RA"; Name "Bany_03723-RA"; Ontology_term "GO:0016714" "GO:0055114"; Parent Bany_03723; _AED "0.06"; _QI "45|1|1|1|1|1|7|425|530"; _eAED "0.06"; gene_id Bany_03723; original_biotype mrna; transcript_id "Bany_03723-RA"
Bany_Scaf1 maker exon 201136 201304 . + . ID "Bany_03723-RA:1"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA"
Bany_Scaf1 maker exon 202687 202770 . + . ID "Bany_03723-RA:2"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA"
Bany_Scaf1 maker exon 202886 202921 . + . ID "Bany_03723-RA:3"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA"
Bany_Scaf1 maker exon 203004 203820 . + . ID "Bany_03723-RA:4"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA"
Bany_Scaf1 maker exon 206097 206223 . + . ID "Bany_03723-RA:5"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA"
Bany_Scaf1 maker exon 206649 206878 . + . ID "Bany_03723-RA:6"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA"
Bany_Scaf1 maker exon 207304 207903 . + . ID "Bany_03723-RA:7"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA"
我希望将所有 gene_id Bany_xxxxxx
更改为 gene_id"Bany_xxxxxx"
.
我尝试过:
sed -E 's#(Parent|gene_id|ID) ([0-9A-Za-z.]+)#\1 \"\2\"#g'
但是双引号添加在错误的位置,例如:
But the double quotes were added in the wrong place, like:
gene_id "Bany"_03723
我该怎么办...
推荐答案
与 sed
$ sed -E 's/gene_id ([^;]+)/gene_id "\1"/' file
找到下一个以;
分隔的单词,并输入gene_id.假设它们之间有空间.如果选项卡更改为 \ t
find the next word delimited with ;
to gene_id and quote it. Assumes space between them. If tab change to \t
这篇关于如何在特定列中添加双引号的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!