Problem description
I'm using langdetect to determine the language of a set of strings that I know are either English or French.
Sometimes langdetect reports Romanian for a string that I know is French.
How can I make langdetect choose between English and French only, rather than among all the other languages?
Thanks!
Recommended answer
Option 1
One option is to use the langid package instead. It lets you restrict the candidate languages with a single method call:
import langid
langid.set_languages(['fr', 'en']) # ISO 639-1 codes
lang, score = langid.classify('This is a french or english text')
print(lang) # en
Option 2
If you really want to use the langdetect package, you can copy the package folder (if you're not sure where it is, try python -m site --user-site) and remove the profiles you don't need from the langdetect\profiles folder.
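The copy-and-prune step could be scripted roughly as follows. This is only a sketch: copy_with_profiles is a hypothetical helper name, both paths are supplied by the caller, and it assumes each profile is a file inside a profiles subfolder named after its language code (en, fr, ro, ...).

```python
import shutil
from pathlib import Path


def copy_with_profiles(src_pkg, dst_pkg, keep=("en", "fr")):
    """Copy a package folder, then delete every profile file not listed in `keep`.

    Assumes profiles live in <package>/profiles and are named by language code.
    Returns the sorted names of the profiles that were removed from the copy.
    """
    src_pkg, dst_pkg = Path(src_pkg), Path(dst_pkg)
    shutil.copytree(src_pkg, dst_pkg)  # fails if dst_pkg already exists

    removed = []
    for profile in (dst_pkg / "profiles").iterdir():
        if profile.is_file() and profile.name not in keep:
            profile.unlink()
            removed.append(profile.name)
    return sorted(removed)
```

The original installation is left untouched; only the copy loses profiles, so the step can be repeated with a different keep list if the set of languages changes.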
This is not a very dynamic solution, though.
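A more dynamic variant, which is not part of the answer above, would be to leave langdetect unmodified and filter its ranked candidates instead. The sketch below assumes langdetect's detect_langs function, whose results expose .lang and .prob; pick_allowed and detect_en_or_fr are hypothetical helper names. As far as I know, detect_langs can omit low-probability candidates, so the helper returns a default (None here) when neither allowed language shows up.

```python
def pick_allowed(candidates, allowed=("en", "fr"), default=None):
    """Return the most probable language among `allowed`.

    `candidates` is an iterable of (language_code, probability) pairs.
    Returns `default` if none of the allowed languages is present.
    """
    best = None
    for lang, prob in candidates:
        if lang in allowed and (best is None or prob > best[1]):
            best = (lang, prob)
    return best[0] if best else default


def detect_en_or_fr(text):
    """Detect with langdetect, then keep only English or French."""
    # Imported here so pick_allowed stays usable without langdetect installed.
    from langdetect import detect_langs

    return pick_allowed((c.lang, c.prob) for c in detect_langs(text))
```

With this approach a string ranked as Romanian first and French second would come back as fr, and the list of allowed languages can be changed per call rather than by editing the installed package.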