Problem description
I couldn't figure out how to perform line.startswith("substring") for a set of substrings, so I tried a few variations of the code at the bottom. I have the luxury of knowing that every header prefix is 4 characters long, but I'm pretty sure I've got the syntax wrong, because this doesn't reject any lines.
(Context: my aim is to throw out header lines when reading in a file. Header lines start with a limited set of strings, but I can't just check for the substring anywhere, because a valid content line may include a keyword later in the string.)
cleanLines = []
line = "sample input here"
if not line[0:3] in ["node", "path", "Path"]: #skip standard headers
    cleanLines.append(line)
Recommended answer
Your problem stems from the fact that string slicing is exclusive of the stop index:
In [7]: line = '0123456789'
In [8]: line[0:3]
Out[8]: '012'
In [9]: line[0:4]
Out[9]: '0123'
In [10]: line[:3]
Out[10]: '012'
In [11]: line[:4]
Out[11]: '0123'
Slicing a string between i and j returns the substring starting at i and ending at (but not including) j.
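To tie this back to the code in the question, here is a small sketch of why the original check never rejects anything. The sample line "node 1 2 3" is made up for illustration; the header list is the one from the question.

```python
# Header prefixes from the question; the sample line is a hypothetical header line.
headers = ["node", "path", "Path"]
line = "node 1 2 3"

three_chars = line[0:3]  # stop index 3 is excluded, so only 3 characters come back
four_chars = line[0:4]   # stop index 4 yields the full 4-character prefix

# A 3-character string can never equal a 4-character header,
# so the original test is False for every line and nothing gets skipped.
original_check_rejects = three_chars in headers
fixed_check_rejects = four_chars in headers

print(repr(three_chars), original_check_rejects)
print(repr(four_chars), fixed_check_rejects)
```

So changing line[0:3] to line[0:4] (or line[:4]) is enough to make the original approach work, as long as every header prefix really is 4 characters long.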
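As a minor optimization, you can test membership against a set instead of a list. With only three prefixes the difference is negligible, and note that the fixed-length slice works only because every header prefix is exactly 4 characters long: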
cleanLines = []
line = "sample input here"
blacklist = {"node", "path", "Path"}
if line[:4] not in blacklist: #skip standard headers
    cleanLines.append(line)
What that code is really doing, though, is a startswith check, and startswith is not tied to any fixed prefix length:
In [12]: line = '0123456789'
In [13]: line.startswith('0')
Out[13]: True
In [14]: line.startswith('0123')
Out[14]: True
In [15]: line.startswith('03')
Out[15]: False
So you could do this to exclude headers:
cleanLines = []
line = "sample input here"
headers = ["node", "path", "Path"]
if not any(line.startswith(header) for header in headers): #skip standard headers
    cleanLines.append(line)
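As a shorter alternative, str.startswith also accepts a tuple of prefixes (it must be a tuple, not a list), so the any(...) generator is not strictly needed. Below is a minimal sketch; the helper name drop_header_lines and the sample lines are made up for illustration and are not part of the question.

```python
# Prefixes from the question, stored as a tuple because
# str.startswith accepts a str or a tuple of str, but not a list.
HEADER_PREFIXES = ("node", "path", "Path")


def drop_header_lines(lines):
    """Return only the lines that do not begin with a header prefix."""
    return [line for line in lines if not line.startswith(HEADER_PREFIXES)]


# Hypothetical input: three header lines and two content lines,
# one of which mentions a keyword later in the string.
sample_lines = [
    "node id x y",
    "Path length 12",
    "17 0.5 0.25 uses path 3",
    "path summary",
    "42 1.0 2.0",
]

clean_lines = drop_header_lines(sample_lines)
print(clean_lines)
```

Because the check is anchored at the start of the line, the content line that contains "path" in the middle is kept, which is the behaviour the question asks for. This version also works if the prefixes have different lengths.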