Fast/efficient counting of a list of space-delimited strings in Python

Problem description

Given the input:

x = ['foo bar', 'bar blah', 'black sheep']

I could do this to get the count of each word in the list of space-delimited strings:

from itertools import chain
from collections import Counter
c = Counter(chain(*map(str.split, x)))

Or I could simply iterate through and get:

c = Counter()
for sent in x:
    for word in sent.split():
        c[word] += 1

[out]:

Counter({'bar': 2, 'sheep': 1, 'blah': 1, 'foo': 1, 'black': 1})

The question is: which is more efficient if the input list of strings is extremely large? Are there other ways to build the same Counter object?

Imagine it's a text file object that has billions of lines with 10-20 words each.
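For input of that size, one further variant is worth considering. The sketch below assumes the data arrives as any iterable of lines (an open file works) and uses chain.from_iterable, which reads lines lazily, whereas chain(*...) has to unpack every line into arguments before counting starts. The count_words name and the io.StringIO stand-in for a real file are illustrative assumptions, not part of the original question.

```python
from collections import Counter
from itertools import chain
import io


def count_words(lines):
    """Count whitespace-separated words from any iterable of strings."""
    # chain.from_iterable pulls one line at a time from `lines`,
    # so the whole input never has to sit in memory at once.
    return Counter(chain.from_iterable(line.split() for line in lines))


# io.StringIO stands in for a real (hypothetical) text file here.
fake_file = io.StringIO("foo bar\nbar blah\nblack sheep\n")
counts = count_words(fake_file)
print(counts["bar"])  # 2
```

The same function accepts the list x from above unchanged, since a list is also an iterable of strings.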

Recommended answer

The answer to your question is profiling.

Here are some profiling tools:

print time.time() in strategic places (or use the Unix time command)
cProfile
line_profiler
heapy tracks all objects inside Python's memory (good for memory leaks)
For long-running systems, use dowser: allows live object introspection (web browser interface)
memory_profiler for RAM usage
examine Python bytecode with dis
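
As a minimal sketch of what such a measurement could look like, the snippet below times the two approaches from the question with the standard library's timeit module, used here in place of hand-placed time.time() calls. The function names with_chain and with_loop and the synthetic input are assumptions made for illustration, and the timings will vary by machine and data, so no result is claimed here.

```python
import timeit
from collections import Counter
from itertools import chain

# Synthetic data: the sample list from the question, repeated.
x = ['foo bar', 'bar blah', 'black sheep'] * 10000


def with_chain(lines):
    """Counter built from a single chained stream of words."""
    return Counter(chain.from_iterable(map(str.split, lines)))


def with_loop(lines):
    """Counter built with explicit nested loops."""
    c = Counter()
    for sent in lines:
        for word in sent.split():
            c[word] += 1
    return c


# Both approaches must agree before their speed is compared.
assert with_chain(x) == with_loop(x)

for func in (with_chain, with_loop):
    seconds = timeit.timeit(lambda: func(x), number=5)
    print(f"{func.__name__}: {seconds:.3f}s for 5 runs")
```

Running the same comparison on a representative slice of the real data is more informative than on this toy input, since word distribution and line length affect the outcome.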
