Problem description
I have the following code:
import nltk, os, json, csv, string, cPickle
from scipy.stats import scoreatpercentile
lmtzr = nltk.stem.wordnet.WordNetLemmatizer()
def sanitize(wordList):
    answer = [word.translate(None, string.punctuation) for word in wordList]
    answer = [lmtzr.lemmatize(word.lower()) for word in answer]
    return answer
words = []
for filename in json_list:
    words.extend([sanitize(nltk.word_tokenize(' '.join([tweet['text']
        for tweet in json.load(open(filename,READ))])))])
I tested lines 2-4 in a separate testing.py file, where I wrote:
import nltk, os, json, csv, string, cPickle
from scipy.stats import scoreatpercentile
wordList= ['\'the', 'the', '"the']
print wordList
wordList2 = [word.translate(None, string.punctuation) for word in wordList]
print wordList2
answer = [lmtzr.lemmatize(word.lower()) for word in wordList2]
print answer
freq = nltk.FreqDist(wordList2)
print freq
The command prompt prints ['the', 'the', 'the'], which is what I wanted (the punctuation is removed).
However, when I put the exact same code in a different file, Python raises a TypeError:
  File "foo.py", line 8, in <module>
    for tweet in json.load(open(filename, READ))])))])
  File "foo.py", line 2, in sanitize
    answer = [word.translate(None, string.punctuation) for word in wordList]
TypeError: translate() takes exactly one argument (2 given)
json_list is a list of all the file paths (I printed it and checked that it is valid). I'm confused by this TypeError because everything works fine when I test the same code in a different file.
Recommended answer
If all you want is to do in Python 3 what you were doing in Python 2, here is what I was doing in Python 2 to throw away punctuation and numbers:
text = text.translate(None, string.punctuation)
text = text.translate(None, '1234567890')
Here is my Python 3 equivalent:
text = text.translate(str.maketrans('','',string.punctuation))
text = text.translate(str.maketrans('','','1234567890'))
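Basically it says 'translate nothing to nothing' (the first two parameters) and translate any punctuation or digits to None (i.e. remove them). In Python 3, str.translate accepts only a translation table, so the characters to delete are passed as the third argument to str.maketrans instead of as a second argument to translate.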
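To connect this back to the question, here is a minimal Python 3 sketch of the punctuation-stripping step. The names `PUNCT_TABLE`, `DIGIT_TABLE`, `strip_punctuation` and `strip_digits` are made up for illustration and are not from the original post. The lemmatizing step is left out so the example runs without NLTK.

```python
import string

# Build each deletion table once. With three arguments, str.maketrans maps
# every character in the third argument to None, which translate() removes.
PUNCT_TABLE = str.maketrans('', '', string.punctuation)
DIGIT_TABLE = str.maketrans('', '', string.digits)


def strip_punctuation(word_list):
    """Return a new list with ASCII punctuation removed from each word."""
    return [word.translate(PUNCT_TABLE) for word in word_list]


def strip_digits(text):
    """Return text with the digits 0-9 removed."""
    return text.translate(DIGIT_TABLE)


if __name__ == "__main__":
    print(strip_punctuation(["'the", "the", '"the']))  # ['the', 'the', 'the']
    print(strip_digits("abc123"))  # abc
```

Building the table once at module level avoids recreating it for every word. As a side note, and only as a possible explanation that the original post does not confirm: Python 2 `unicode` objects, which `json.load` returns for strings, also take a single table argument in `translate`, which could be why plain string literals worked in the test file while the tweet text did not.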