Counting occurrences of a string with a case-insensitive search

Problem description

I am trying to read a file and count the occurrences of a string irrespective of case (upper/lower), but my code is not giving the desired result. Why is that? Also, how can I make my search case-insensitive?

The code is:

    import os, re

    fileName_path = input("Please input the file name with location: ")
    directory = os.path.dirname(fileName_path)
    os.chdir(directory)
    fileName = os.path.basename(fileName_path)
    openFile = open(fileName, "r")
    cnt = 0
    with openFile as readFile:
        for searchpattern in readFile:
            if 'tempCharSearch' in searchpattern:
                cnt += 1
    openFile.close()
    print(cnt)

In the text file there are 14 tempCharSearch, but the result shows only 3. Why is that?

The text file is attached here:

    Lorem Ipsum is simply dummy text of tempCharSearch:='100-111-875' printing and typesetting industry. Lorem Ipsum has been tempCharSearch:='100-111-875' industry's standard dummy text ever since tempCharSearch:='100-111-875' 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.
    It has survived not only five centuries, but also tempCharSearch:='100-111-875' leap into electronic typesetting, remaining essentially unchanged. It was popularised in tempCharSearch:='100-111-875' 1960s with tempCharSearch:='100-111-875' release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
    tempCharSearch:='100-111-875're are many variations of passages of Lorem Ipsum available, but tempCharSearch:='100-111-875' majority have suffered alteration in some form, by injected humour, or randomised words which don't look even slightly believable. If you are going to use a passage of Lorem Ipsum, you need to be sure tempCharSearch:='100-111-875're isn't anything embarrassing hidden in tempCharSearch:='100-111-875' middle of text. All tempCharSearch:='100-111-875' Lorem Ipsum generators on tempCharSearch:='100-111-875' Internet tend to repeat predefined chunks as necessary, making this tempCharSearch:='100-111-875' first true generator on tempCharSearch:='100-111-875' Internet. It uses a dictionary of over 200 Latin words, combined with a handful of model sentence structures, to generate Lorem Ipsum which looks reasonable. tempCharSearch:='100-111-875' generated Lorem Ipsum is tempCharSearch:='100-111-875'refore always free from repetition, injected humour, or non-characteristic words etc.

Solution

Your code is not counting the number of occurrences of 'tempCharSearch' in the file, but the number of lines in which the pattern occurs. The test 'tempCharSearch' in searchpattern is either true or false for a line, so each line adds at most 1 to cnt no matter how many matches it contains. As your input file appears to have just three lines, each one containing multiple occurrences, your result is 3.

Use Python's built-in string count method to count all occurrences in a line:

    cnt += searchpattern.count('tempCharSearch')

If you want the comparison to be case-insensitive, convert both the line and your search pattern to lower case before running the count, for example:

    for line in readFile:
        cnt += line.lower().count('tempcharsearch')
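Putting the answer's two suggestions together, a minimal sketch might look like the following. The function names count_occurrences and count_in_file, the ignore_case flag, the UTF-8 encoding, and the three-line sample string are all assumptions made for illustration; they do not come from the original question or answer.

```python
def count_occurrences(text: str, needle: str, ignore_case: bool = True) -> int:
    """Count non-overlapping occurrences of needle in text."""
    if ignore_case:
        # Lower-case both sides so 'TEMPCHARSEARCH' and 'tempCharSearch' match.
        text, needle = text.lower(), needle.lower()
    return text.count(needle)


def count_in_file(path: str, needle: str, ignore_case: bool = True) -> int:
    """Sum the per-line counts over a whole file (UTF-8 is an assumption)."""
    total = 0
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            total += count_occurrences(line, needle, ignore_case)
    return total


# Hypothetical sample data: three lines, two of which contain matches.
sample = (
    "tempCharSearch:='1' and TEMPCHARSEARCH:='2'\n"
    "no match here\n"
    "tempcharsearch tempCharSearch\n"
)

# What the question's code measures: lines that contain the pattern.
lines_with_match = sum("tempCharSearch" in line for line in sample.splitlines())

print(lines_with_match)                             # lines containing a match
print(count_occurrences(sample, "tempCharSearch"))  # total matches, any case
```

Note that str.count counts non-overlapping matches, which is fine for a pattern like tempCharSearch that cannot overlap with itself. The with statement closes the file on its own, so no separate close() call is needed.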
10-31 00:31