Problem description
I know that for parsing I should ideally strip out all spaces and line breaks, but I was just doing this as a quick fix for something I was trying, and I can't figure out why it's not working. I have wrapped different areas of text in my document with markers like "####1" and am trying to parse based on these, but it's just not working no matter what I try. I think I am using multiline correctly. Any advice is appreciated.
This returns no results at all:
string='
####1
ttteest
####1
ttttteeeestt
####2
ttest
####2'
import re
pattern = '.*?####(.*?)####'
returnmatch = re.compile(pattern, re.MULTILINE).findall(string)
return returnmatch
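Try re.findall(r"####(.*?)\s(.*?)\s####", string, re.DOTALL) (works with re.compile too, of course). re.MULTILINE only changes where ^ and $ match; it is re.DOTALL that lets . match newlines, which is what a pattern spanning several lines needs.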
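This regexp returns a list of tuples, each containing the section number and the section content.
For your example, this will return [('1', 'ttteest'), ('2', ' \n\nttest')]. The leading whitespace in the second item comes from blank lines after the ####2 marker in the input; without them the content is just 'ttest'.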
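As a runnable sketch of the suggestion above: the sample text is retyped here with triple quotes and without any blank lines, and the names text, section_re and sections are my own, not from the question. Under those assumptions the second tuple comes out as plain 'ttest'.

```python
import re

# Sample text from the question, written with triple quotes so it parses.
# Assumption: no blank lines or trailing spaces between the markers.
text = """
####1
ttteest
####1
ttttteeeestt
####2
ttest
####2"""

# re.DOTALL lets "." cross line breaks; each match yields (number, content).
section_re = re.compile(r"####(.*?)\s(.*?)\s####", re.DOTALL)
sections = section_re.findall(text)

print(sections)  # [('1', 'ttteest'), ('2', 'ttest')]
```

Note that the text between the closing ####1 and the opening ####2 (ttttteeeestt) is skipped, because findall resumes scanning right after each closing marker.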
(BTW: your example won't run as written; for multiline strings, use ''' or """.)
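To see why the original attempt came back empty, here is a small comparison that runs the question's own pattern under both flags. It assumes the sample text is retyped with triple quotes and no blank lines; the variable names are only for illustration.

```python
import re

# Same sample text as in the question, retyped with triple quotes.
text = """
####1
ttteest
####1
ttttteeeestt
####2
ttest
####2"""

# The pattern from the question, unchanged.
pattern = ".*?####(.*?)####"

# re.MULTILINE only affects ^ and $, so "." still stops at every newline
# and no single line contains two #### markers.
with_multiline = re.compile(pattern, re.MULTILINE).findall(text)

# re.DOTALL makes "." match newlines, so a match can span several lines.
with_dotall = re.compile(pattern, re.DOTALL).findall(text)

print(with_multiline)  # []
print(with_dotall)     # ['1\nttteest\n', '2\nttest\n']
```

With re.DOTALL the original pattern does match, but the number and the content land in one group together with the newlines, which is why the suggested pattern splits them with \s.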