Problem description
I have a list of tuples that contain strings. For instance:
[('this', 'is', 'a', 'foo', 'bar', 'sentences'),
 ('is', 'a', 'foo', 'bar', 'sentences', 'and'),
 ('a', 'foo', 'bar', 'sentences', 'and', 'i'),
 ('foo', 'bar', 'sentences', 'and', 'i', 'want'),
 ('bar', 'sentences', 'and', 'i', 'want', 'to'),
 ('sentences', 'and', 'i', 'want', 'to', 'ngramize'),
 ('and', 'i', 'want', 'to', 'ngramize', 'it')]
Now I want to join the strings in each tuple to create a list of space-separated strings. I used the following method:
NewData = []
for grams in sixgrams:
    NewData.append((''.join([w + ' ' for w in grams])).strip())
This works fine. However, my list has over a million tuples, so my question is whether this method is efficient enough or whether there is a better way to do it. Thanks.
Recommended answer
For a lot of data, you should consider whether you need to keep it all in a list. If you are processing the strings one at a time, you can create a generator that yields each joined string without keeping them all in memory:
new_data = (' '.join(w) for w in sixgrams)
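As a minimal sketch of the difference, assuming sixgrams is a list of tuples like the one in the question (the two-item sample below is made up for illustration): ' '.join(grams) joins the words directly, so there is no need to append a space to each word and strip the result. A list comprehension builds every string up front, while the generator expression produces them one at a time as they are consumed.

```python
# Hypothetical sample data standing in for the real list of six-grams.
sixgrams = [
    ('this', 'is', 'a', 'foo', 'bar', 'sentences'),
    ('is', 'a', 'foo', 'bar', 'sentences', 'and'),
]

# Eager: builds the whole list of joined strings in memory.
joined_list = [' '.join(grams) for grams in sixgrams]

# Lazy: yields one joined string at a time as it is consumed.
joined_gen = (' '.join(grams) for grams in sixgrams)

for line in joined_gen:
    print(line)
```

Note that a generator can only be iterated once; if the joined strings are needed again later, the list form is the one to keep.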
If you can also get the original tuples from a generator, then you can avoid having the sixgrams list in memory as well.
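One way this could look, as a sketch only: the question does not say how sixgrams is built, so the ngrams helper below is a hypothetical function that assumes the six-grams come from a sliding window over a stream of tokens. It keeps only the current window in memory and feeds the joining generator directly.

```python
from collections import deque
from itertools import islice


def ngrams(tokens, n=6):
    """Yield successive n-token tuples from any iterable of tokens.

    Hypothetical helper: assumes the n-grams are a sliding window
    over the token stream, as in the question's sample data.
    """
    it = iter(tokens)
    # Fill the first window; deque(maxlen=n) drops the oldest token on append.
    window = deque(islice(it, n), maxlen=n)
    if len(window) == n:
        yield tuple(window)
    for tok in it:
        window.append(tok)
        yield tuple(window)


text = 'this is a foo bar sentences and i want to ngramize it'

# Chain the two generators: no list of tuples or of strings is ever built.
joined = (' '.join(gram) for gram in ngrams(text.split()))

first = next(joined)
print(first)
```

Here text.split() still creates a list of tokens for brevity; with a large input the tokens could themselves come from a generator, for example one that reads a file line by line.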