Replacing whitespace with single tab unless in double quotes

Question

Assume a multi-line file in which strings are separated by one or more whitespace characters. Assume further that groups of strings can be enclosed in double quotes.

> cat file
foo bar "foobar baz qux"
foo "bar foobar baz" qux
"foo   bar foobar" baz   qux   # multiple whitespaces in this line

If I try to replace every run of whitespace outside the double quotes with a single tab character, using the awk command listed below, I get the following:

awk '{OFS="\t"; FPAT="([^, ]+)|(\"[^\"]+\")"; $1=$1; print}' file
# foo   bar "foobar baz qux" # In this line, strings inside the quote are separated by tabs
# foo   "bar foobar baz"    qux
# "foo  bar foobar" baz qux

The problem seems to be limited to the line that ends with a double quote.

EDIT 1: To better visualize the issue at hand:

awk '{OFS="\t"; FPAT="([^, ]+)|(\"[^\"]+\")"; $1=$1; print}' file | cat -A
# foo^Ibar^I"foobar^Ibaz^Iqux"$
# foo^I"bar foobar baz"^Iqux$
# "foo   bar foobar"^Ibaz^Iqux$

EDIT 2: It appears that both commands suggested in the answer section work fine unless a certain number or combination of non-letter characters is present in the input. Here is an example:

> cat file
foo_bar_baz foo foo_bar . Name=foo;product="bar baz qux"
foo_bar_baz foo foo_bar . Name=foo;product="bar baz qux"
foo_bar_baz foo foo_bar . Name=foo;product="bar baz qux"

> awk -v FPAT='"[^"]*"|[^[:blank:]]+' -v OFS='\t' '{$1=$1} 1' file | cat -A
foo_bar_baz^Ifoo^Ifoo_bar^I.^IName=foo;product="bar^Ibaz^Iqux"$
foo_bar_baz^Ifoo^Ifoo_bar^I.^IName=foo;product="bar^Ibaz^Iqux"$
foo_bar_baz^Ifoo^Ifoo_bar^I.^IName=foo;product="bar^Ibaz^Iqux"$

> awk '{$1=$1}1' OFS='\t' FPAT='"[^"]+"|[^ ]+' file | cat -A
foo_bar_baz^Ifoo^Ifoo_bar^I.^IName=foo;product="bar^Ibaz^Iqux"$
foo_bar_baz^Ifoo^Ifoo_bar^I.^IName=foo;product="bar^Ibaz^Iqux"$
foo_bar_baz^Ifoo^Ifoo_bar^I.^IName=foo;product="bar^Ibaz^Iqux"$

EDIT 3: The issue raised in EDIT 2 is discussed further here: Replacing whitespace with single tab unless in double quotes - Part II
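
In the EDIT 2 input, the opening quote sits in the middle of a token (product="bar ...), so neither FPAT alternative can start a match at the quote and the quoted text is split on its blanks. As a minimal sketch, and not necessarily what the Part II thread recommends, the line can instead be scanned character by character while tracking whether the scan is inside quotes. The function name tabify_portable is hypothetical; the code assumes balanced double quotes, no escaped quotes, and that only spaces and tabs count as separators.

```shell
# Hypothetical helper: collapse runs of blanks outside double quotes into one tab.
# Uses only POSIX awk features, so it does not depend on gawk's FPAT.
tabify_portable() {
    awk '{
        out = ""; inq = 0; pending = 0
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == "\"") inq = !inq            # toggle the inside-quotes state
            if (!inq && (c == " " || c == "\t")) {
                pending = 1                      # remember the separator, emit it lazily
                continue
            }
            if (pending && out != "") out = out "\t"
            pending = 0
            out = out c
        }
        print out                                # leading and trailing blanks are dropped
    }' "$@"
}
```

Called as tabify_portable file | cat -A, this should keep Name=foo;product="bar baz qux" together as a single field.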

Recommended answer

Using GNU awk (gawk), you can do this easily:

awk -v FPAT='"[^"]*"|[^[:blank:]]+' -v OFS='\t' '{$1=$1} 1' file
foo bar "foobar baz qux"
foo "bar foobar baz"    qux
"foo   bar foobar"  baz qux

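A likely reason the command in the question mishandles only its first line is timing rather than the trailing quote: FPAT is assigned inside the main rule, and gawk applies a change to its field-splitting variables starting with the next record, so line 1 is still split on plain whitespace. Passing FPAT with -v, as above, sets it before any input is read. The sketch below shows the equivalent BEGIN-block form; the wrapper name tabify is hypothetical, and gawk is called explicitly because FPAT is a GNU extension.

```shell
# Hypothetical wrapper: set FPAT and OFS before the first record is read.
tabify() {
    gawk 'BEGIN {
              # a field is either a double-quoted string or a run of non-blanks
              FPAT = "\"[^\"]*\"|[^[:blank:]]+"
              OFS  = "\t"
          }
          { $1 = $1; print }   # assigning a field rebuilds $0 using OFS
    ' "$@"
}
```

Running tabify file | cat -A makes the tabs visible, as in EDIT 1 of the question.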