Problem description
I need some help declaring a regex. My input looks like the following:
this is a paragraph with<[1> in between</[1> and then there are cases ... where the<[99> number ranges from 1-100</[99>.
and there are many other lines in the txt files
with<[3> such tags </[3>
The required output is:
this is a paragraph with in between and then there are cases ... where the number ranges from 1-100.
and there are many other lines in the txt files
with such tags
I've tried this:
#!/usr/bin/python
import os, sys, re, glob

for infile in glob.glob(os.path.join(os.getcwd(), '*.txt')):
    with open(infile) as reader:
        for line in reader:
            # Strips only the tags numbered 1, with and without a trailing space
            line2 = line.replace('<[1> ', '')
            line = line2.replace('</[1> ', '')
            line2 = line.replace('<[1>', '')
            line = line2.replace('</[1>', '')
            print(line)
I've also tried this (but it seems like I'm using the wrong regex syntax):
line2 = line.replace('<[*> ', '')
line = line2.replace('</[*> ', '')
line2 = line.replace('<[*>', '')
line = line2.replace('</[*>', '')
I don't want to hard-code the replace calls for every number from 1 to 99...
Recommended answer
This tested snippet should do it:
import re
line = re.sub(r"</?\[\d+>", "", line)
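As a minimal sketch of how that one-liner might be applied to the sample input from the question (the names `TAG_RE` and `strip_tags` are illustrative and not part of the original answer):

```python
import re

# Illustrative names: TAG_RE and strip_tags do not appear in the original answer.
# The pattern is precompiled so it can be reused for every line of a file.
TAG_RE = re.compile(r"</?\[\d+>")

def strip_tags(text):
    """Remove every <[N> and </[N> marker, where N is one or more digits."""
    return TAG_RE.sub("", text)

sample = ("this is a paragraph with<[1> in between</[1> and then there are "
          "cases ... where the<[99> number ranges from 1-100</[99>.")
print(strip_tags(sample))
```

Only the markers themselves are removed, so any whitespace around a tag stays in the line exactly as it was.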
Here's a commented version of the regex explaining how it works:
line = re.sub(r"""(?x)  # Use free-spacing mode.
    <    # Match a literal '<'
    /?   # Optionally match a '/'
    \[   # Match a literal '['
    \d+  # Match one or more digits
    >    # Match a literal '>'
    """, "", line)
Regexes are fun! But I would strongly recommend spending an hour or two studying the basics. For starters, you need to learn which characters are special "metacharacters" that need to be escaped (i.e. with a backslash placed in front), and that the rules differ inside and outside character classes. There is an excellent online tutorial at: www.regular-expressions.info. The time you spend there will pay for itself many times over. Happy regexing!
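To make the point about metacharacters concrete, here is a small sketch based on the `'<[*>'` attempt from the question. It assumes Python 3's `re` module, and the variable names are purely illustrative:

```python
import re

line = "with<[3> such tags </[3>"

# str.replace does no pattern matching: '<[*>' is searched for as literal
# text, is never found, and the line comes back unchanged.
unchanged = line.replace("<[*>", "")

# As a regex, the unescaped '[' opens a character class that is never
# closed, so the pattern does not even compile.
try:
    re.compile(r"<[*>")
    compiled_ok = True
except re.error:
    compiled_ok = False

# Outside a character class, '[' must be escaped to match literally.
escaped = re.sub(r"</?\[\d+>", "", line)

# Inside a character class, '*' and '+' lose their special meaning
# and match themselves.
in_class = re.sub(r"[*+]", "", "a*b+c")

print(unchanged == line, compiled_ok, repr(escaped), in_class)
```

The same character can therefore need a backslash in one position and none in another, which is why the rules are worth learning before writing patterns by trial and error.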