I have a list of sentences:
a = [['i am a testing'],['we are working on project']]
I am trying to build a dictionary of words from all the sentences in the list. I tried:
vectorizer = CountVectorizer()
vectorizer.fit_transform(a)
coffee_dict2 = vectorizer.vocabulary_
and I get this error:
AttributeError: 'list' object has no attribute 'lower'
The result I expect is a dictionary:
{'i': 1, 'am': 1, 'testing': 2}
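Best answer
CountVectorizer expects an iterable of strings, but each element of your list is itself a list, which is why calling lower() on it fails. You need to flatten the nested list: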
from sklearn.feature_extraction.text import CountVectorizer
coffee_reviews_test = [['i am a testing'],['we are working on project']]
from itertools import chain
vectorizer = CountVectorizer()
vectorizer.fit_transform(chain.from_iterable(coffee_reviews_test))
Another solution, using a list comprehension:
vectorizer.fit_transform([x for y in coffee_reviews_test for x in y])
coffee_dict2 = vectorizer.vocabulary_
print(coffee_dict2)
{'am': 0, 'testing': 4, 'we': 5, 'are': 1, 'working': 6, 'on': 2, 'project': 3}
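Note that the values in vocabulary_ are column indices in the document-term matrix, not word counts, and that CountVectorizer's default token pattern only keeps tokens of two or more characters, which is why 'i' and 'a' are missing above. If what you actually want is a count per word, here is a minimal sketch using collections.Counter from the standard library instead of CountVectorizer. It assumes that splitting on whitespace is good enough for your sentences; the variable names sentences and word_counts are only illustrative.

```python
from collections import Counter
from itertools import chain

coffee_reviews_test = [['i am a testing'], ['we are working on project']]

# Flatten the nested list into a flat sequence of sentence strings.
sentences = chain.from_iterable(coffee_reviews_test)

# Lowercase each sentence, split it on whitespace, and count every word.
# Unlike CountVectorizer's default tokenizer, this keeps one-letter words.
word_counts = Counter(
    word for sentence in sentences for word in sentence.lower().split()
)

print(dict(word_counts))
```

Each word in this sample occurs once, so every count is 1. With real reviews, word_counts.most_common(10) would list the ten most frequent words.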
Regarding python - creating a word dictionary for sentences in a list, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/57973566/