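I have a seemingly simple problem that I can't seem to solve. Given a string containing a DOI, I need to remove the last character if it is a punctuation mark, and keep doing so until the last character is a letter or a digit.
For example, if the string is: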
sampleDoi = "10.1097/JHM-D-18-00044.',"
I need the following output:
"10.1097/JHM-D-18-00044"
that is, with the trailing .', removed.
To do this, I wrote the following script:
import string

invalidChars = set(string.punctuation.replace("_", ""))
a = "10.1097/JHM-D-18-00044.',"
i = -1
for each in reversed(a):
    if any(char in invalidChars for char in each):
        a = a[:i]
        i = i - 1
    else:
        print(a)
        break
However, this prints 10.1097/JHM-D-18-00, but I want it to print 10.1097/JHM-D-18-00044. Why is the 44 removed from the end?

Best answer
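The problem is the line i = i - 1. Each pass already shortens a, so the slice index has to stay at -1 to remove exactly one character. Because i keeps decreasing, the successive slices a[:-1], a[:-2] and a[:-3] remove one, two and three characters: the first pass drops the comma, the second drops both the quote and the period, and the third, triggered by the period that is already gone, cuts three digits off the DOI.
Corrected code: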
import string

invalidChars = set(string.punctuation.replace("_", ""))
a = "10.1097/JHM-D-18-00044.',"
i = -1  # stays fixed: each pass removes exactly one trailing character
for each in reversed(a):
    if any(char in invalidChars for char in each):
        a = a[:i]
        # no "i = i - 1" here: a is already one character shorter
    else:
        print(a)
        break
This gives the desired output while keeping the original code largely unchanged.
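If keeping the loop is not a requirement, the same result can probably be had with str.rstrip, which removes any trailing characters found in the string it is given. The following is only a sketch: the function name strip_trailing_punctuation and the constant INVALID_CHARS are made up for illustration, and it assumes the same character set as the question (ASCII punctuation without the underscore).

```python
import string

# Assumption: same set as in the question, i.e. all ASCII punctuation except "_".
INVALID_CHARS = string.punctuation.replace("_", "")


def strip_trailing_punctuation(doi: str) -> str:
    """Return doi with every trailing character from INVALID_CHARS removed."""
    # rstrip treats its argument as a set of characters, not as a suffix.
    return doi.rstrip(INVALID_CHARS)


print(strip_trailing_punctuation("10.1097/JHM-D-18-00044.',"))
```

Unlike the loop, this also returns a value when the string consists only of punctuation (an empty string), so no separate print branch is needed.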