Splitting a string based on a regular expression

Problem description

I have the output of a command in tabular form. I'm parsing this output from a result file and storing it in a string. Each element in one row is separated by one or more whitespace characters, thus I'm using regular expressions to match 1 or more spaces and split it. However, a space is being inserted between every element:

>>> import re
>>> str1 = "a    b     c      d"  # spaces are irregular
>>> str1
'a    b     c      d'
>>> str2 = re.split("( )+", str1)
>>> str2
['a', ' ', 'b', ' ', 'c', ' ', 'd'] # 1 space element between!!!

Is there a better way to do this?

After each split, str2 is appended to a list.

Solution

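By using ( and ), you are creating a capturing group, and re.split includes the text matched by a capturing group in the result list. If you simply remove the parentheses, you will not have this problem.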

>>> str1 = "a    b     c      d"
>>> re.split(" +", str1)
['a', 'b', 'c', 'd']
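
To make the effect of the parentheses concrete, here is a small sketch that compares a capturing group, a non-capturing group, and no group at all. The variable names are only illustrative and are not part of the original question.

```python
import re

str1 = "a    b     c      d"

# Capturing group: re.split also returns the text captured by the group,
# which here is the last single space of each separator run.
with_capture = re.split("( )+", str1)

# Non-capturing group (?:...): groups the pattern without adding captures.
without_capture = re.split("(?: )+", str1)

# No group at all: the simplest form.
no_group = re.split(" +", str1)

print(with_capture)     # ['a', ' ', 'b', ' ', 'c', ' ', 'd']
print(without_capture)  # ['a', 'b', 'c', 'd']
print(no_group)         # ['a', 'b', 'c', 'd']
```

If grouping is needed for some other reason (for example, alternation), the non-capturing form keeps the separators out of the result.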

However, there is no need for a regex here: str.split with no delimiter specified splits on runs of whitespace for you. This would be the best way in this case.

>>> str1.split()
['a', 'b', 'c', 'd']
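
One practical difference is worth noting when the line has leading or trailing whitespace, which is common in tabular command output. The sketch below uses a made-up padded line to show it.

```python
import re

# Hypothetical line with padding on both ends.
line = "  a    b     c      d  "

# str.split() with no arguments ignores leading and trailing whitespace.
plain = line.split()

# re.split produces empty strings at the ends when the separator
# matches at the start or end of the string.
regex = re.split(r"\s+", line)

# Stripping first makes the regex version behave like str.split().
regex_stripped = re.split(r"\s+", line.strip())

print(plain)           # ['a', 'b', 'c', 'd']
print(regex)           # ['', 'a', 'b', 'c', 'd', '']
print(regex_stripped)  # ['a', 'b', 'c', 'd']
```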

If you really want a regex, you can use this ('\s' matches any whitespace character, and it's clearer):

>>> re.split(r"\s+", str1)
['a', 'b', 'c', 'd']
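
The difference between a literal space and \s shows up as soon as the input contains tabs. The following sketch assumes a hypothetical input that mixes spaces and tabs.

```python
import re

# Hypothetical input mixing spaces and tab characters.
mixed = "a \tb\t\tc   d"

# " +" only treats space characters as separators, so tabs stay in the fields.
spaces_only = re.split(" +", mixed)

# r"\s+" treats any run of whitespace (spaces, tabs, newlines) as one separator.
any_whitespace = re.split(r"\s+", mixed)

print(spaces_only)     # ['a', '\tb\t\tc', 'd']
print(any_whitespace)  # ['a', 'b', 'c', 'd']
```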

Or you can find all runs of non-whitespace characters:

>>> re.findall(r'\S+', str1)
['a', 'b', 'c', 'd']
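
Putting this together for the original use case (splitting each row of tabular output and appending it to a list), a minimal sketch might look like the following. The function name parse_table and the sample text are assumptions made for illustration; the real command output is not shown in the question.

```python
def parse_table(text):
    """Split each non-empty line of whitespace-separated text into fields."""
    rows = []
    for line in text.splitlines():
        fields = line.split()  # splits on any run of whitespace
        if fields:             # skip blank lines
            rows.append(fields)
    return rows


# Hypothetical sample standing in for the command output.
sample_output = """\
NAME    SIZE   STATE
disk0   100G   online

disk1    50G   offline
"""

for row in parse_table(sample_output):
    print(row)
```

This only works when the fields themselves contain no whitespace; a column with embedded spaces would need a different approach, such as fixed-width slicing.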

08-11 18:00