Question
How can I disable dictionary corrections when running Tesseract for English?
I'm currently running tesseract as a child process.
Recommended answer
Try setting these variables to false (put them in a config file):
load_system_dawg
load_freq_dawg
load_punc_dawg
load_number_dawg
load_unambig_dawg
load_bigram_dawg
load_fixed_length_dawgs
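Since the question mentions running tesseract as a child process, here is a minimal Python sketch of how the list above might be turned into a config file and passed to the command line. The helper names (build_config_text, build_command, run_without_dictionaries) and the config file name no_dict are hypothetical, chosen for illustration. The sketch assumes that your Tesseract build accepts "F" as a false value and accepts a path to a config file as the trailing argument; check both against your installed version.

```python
import subprocess
import tempfile
from pathlib import Path

# Variable names are taken from the answer above.
DAWG_VARIABLES = [
    "load_system_dawg",
    "load_freq_dawg",
    "load_punc_dawg",
    "load_number_dawg",
    "load_unambig_dawg",
    "load_bigram_dawg",
    "load_fixed_length_dawgs",
]


def build_config_text(variables=DAWG_VARIABLES):
    """Return config-file text with one 'name F' line per variable."""
    return "".join(f"{name} F\n" for name in variables)


def build_command(image_path, output_base, config_path, lang="eng"):
    """Return the argv list for the tesseract CLI; the config file goes last."""
    return ["tesseract", str(image_path), str(output_base),
            "-l", lang, str(config_path)]


def run_without_dictionaries(image_path, output_base):
    """Write a temporary config file and run tesseract as a child process."""
    with tempfile.TemporaryDirectory() as tmp:
        config_path = Path(tmp) / "no_dict"  # hypothetical file name
        config_path.write_text(build_config_text())
        # check=True raises CalledProcessError if tesseract exits non-zero.
        subprocess.run(build_command(image_path, output_base, config_path),
                       check=True)
```

Calling run_without_dictionaries("page.png", "page") would then write the recognized text to page.txt, provided tesseract is on the PATH.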
Also read "How to increase the trust in/strength of the dictionary?" in the FAQ. From it:
For tesseract-ocr >= 3.01, try increasing the variables language_model_penalty_non_freq_dict_word and language_model_penalty_non_dict_word in a config file. By default they are 0.1 and 0.15, respectively.
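Note that the FAQ passage strengthens the dictionary instead of disabling it. As a rough sketch, config lines for these two variables could be generated as follows. The defaults are the ones quoted above; the helper name penalty_config_text and the scale factor are hypothetical, and a suitable value would have to be found by experiment.

```python
# Defaults as quoted from the FAQ above.
DEFAULT_PENALTIES = {
    "language_model_penalty_non_freq_dict_word": 0.1,
    "language_model_penalty_non_dict_word": 0.15,
}


def penalty_config_text(scale):
    """Return config lines with each default penalty multiplied by scale."""
    return "".join(
        f"{name} {value * scale:g}\n"
        for name, value in DEFAULT_PENALTIES.items()
    )
```

The returned text can be written to a config file and passed to tesseract in the same way as any other config file.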