How do I delete all CJK text that immediately follows a specific symbol?

Problem description

I have some text like this:
This is some text Z书. This is Zsome more text Z计算机.
This is yet some more Z电脑 text Z.
I need to delete every match of the pattern Z+(CJK), that is, a Z immediately followed by (CJK), where (CJK) is any run of one or more consecutive CJK characters. The file above would become:
This is some text . This is Zsome more text .
This is yet some more text Z.
How can I delete all CJK text matching this pattern?

Solution

You can use GNU sed to check the byte codes of the non-ASCII characters:
sed -n l0 file.txt
Results:
This is some text Z\344\271\246. This is Zsome more text Z\350\256\241\347\256\227\346\234\272.$
This is yet some more Z\347\224\265\350\204\221 text Z.$
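To reproduce this dump on a small input, a minimal sketch follows. It assumes GNU sed (the 0 argument to the l command, which turns off line wrapping, is a GNU extension) and a script saved as UTF-8. The variable name sample is illustrative only and does not come from the original answer.

```shell
# Write one sample line from the question to a temporary file.
# "sample" is an illustrative name, not something from the original answer.
sample=$(mktemp)
printf '%s\n' 'This is some text Z书.' > "$sample"

# `l` prints the pattern space unambiguously: non-ASCII bytes appear as
# three-digit octal escapes and the end of the line is marked with `$`.
# The `0` (GNU sed only) turns off line wrapping.
LC_ALL=C sed -n l0 "$sample"
```

The three escapes printed after Z (\344\271\246) are the UTF-8 bytes of 书, and each of them falls in the octal range 200-377.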
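Then you can use GNU sed to do the replacement you want. In my testing I had to set my locale to POSIX, so that sed treats each byte as a separate character and the octal range \o200-\o377 matches the individual bytes of the multibyte characters: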
LC_ALL="POSIX" sed -r 's/Z[\o200-\o377]+//g' file.txt
Results:
This is some text . This is Zsome more text .
This is yet some more  text Z.
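The whole procedure can be tried end to end with the sketch below. It assumes GNU sed (the \o escape and the -r flag are GNU features) and a script saved as UTF-8. The function name strip_z_cjk and the variable input are hypothetical helpers introduced for illustration.

```shell
# strip_z_cjk is a hypothetical helper name used only for this sketch.
# LC_ALL=C makes sed work byte by byte, so the octal range \o200-\o377
# (a GNU sed escape) matches every byte of a UTF-8 multibyte character.
strip_z_cjk() {
    LC_ALL=C sed -r 's/Z[\o200-\o377]+//g' "$@"
}

# Recreate the sample file from the question in a temporary location.
input=$(mktemp)
printf '%s\n' \
    'This is some text Z书. This is Zsome more text Z计算机.' \
    'This is yet some more Z电脑 text Z.' > "$input"

# Print the file with every "Z + non-ASCII run" removed.
strip_z_cjk "$input"
```

Two caveats apply. The bracket range covers every non-ASCII byte, so this also deletes non-CJK characters (accented Latin letters, for example) that directly follow a Z. The spaces around a deleted run are left alone, so the second line keeps two spaces between "more" and "text".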
