Problem description
I have about 100,000 unique city names, and many of them have spelling mistakes (bad scanning, bad OCR, many European names with special characters, etc.). Can I write a loop in Python to check the cities one by one with Google Maps to see if the spelling is correct? E.g., if I send "nev york", I want to receive something like "Did you mean: New York". I've already tried things such as matching against a list and then calculating the Levenshtein distance.
Recommended answer
I just found out about difflib.
It's pretty cool stuff.
It works almost like a spell checker:
>>> import difflib
>>> x = 'smoke'
>>> y = ['choke','poke','loc','joke','mediocre', 'folk']
>>>
>>> difflib.get_close_matches(x,y)
['poke', 'joke', 'choke']
>>> x = 'nev york'
>>> y = ['New York', 'Compton', ' Phoenix']
>>> difflib.get_close_matches(x,y)
['New York']
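Note that get_close_matches is case-sensitive and only returns candidates whose similarity ratio is at least cutoff (0.6 by default). Here is a minimal sketch of a case-insensitive lookup built on top of it. The helper name closest_city and the sample list are made up for illustration and are not part of difflib:

```python
import difflib

def closest_city(name, known_cities, cutoff=0.6):
    """Return the best match for name from known_cities, or None.

    closest_city is a hypothetical helper, not a difflib function.
    """
    # Map a lowercased, stripped key back to the original spelling.
    by_key = {city.strip().lower(): city.strip() for city in known_cities}
    matches = difflib.get_close_matches(
        name.strip().lower(), list(by_key), n=1, cutoff=cutoff
    )
    return by_key[matches[0]] if matches else None

known = ['New York', 'Compton', ' Phoenix']   # sample data only
print(closest_city('NEV YORK', known))
print(closest_city('zzzz', known))
```

Raising cutoff makes the match stricter, which may help if short names start matching the wrong city.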
The only other part is getting all your correctly spelled cities into a list, or finding someone who has a "correctly spelled city" word file.
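Once you have such a list, the loop over the misspelled names could look roughly like this. It is only a sketch under a few assumptions: the reference list fits in memory, accents can be ignored when comparing (which may not suit every European name), and the names normalize, correct_all, reference and dirty are all invented for the example:

```python
import difflib
import unicodedata

def normalize(name):
    """Lowercase and strip accents so 'Köln' and 'koln' compare equal."""
    decomposed = unicodedata.normalize('NFKD', name.strip().lower())
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))

def correct_all(dirty_names, reference_cities, cutoff=0.8):
    """Map each dirty name to its best reference spelling, or None."""
    # Normalized key -> correctly spelled city.
    lookup = {normalize(city): city for city in reference_cities}
    keys = list(lookup)
    corrections = {}
    for name in dirty_names:
        match = difflib.get_close_matches(normalize(name), keys, n=1, cutoff=cutoff)
        corrections[name] = lookup[match[0]] if match else None
    return corrections

reference = ['New York', 'Köln', 'Zürich']      # hypothetical reference list
dirty = ['nev york', 'Koln', 'Zurlch', 'qqqq']  # hypothetical scanned names
for name, fixed in correct_all(dirty, reference).items():
    print(name, '->', fixed)
```

Names that come back as None had no candidate above the cutoff and probably need a manual look. Each name is compared against every reference entry, so with 100,000 names and a large reference list this can take a while.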