Tesseract running error

Question

I have a problem running the tesseract-ocr engine on Linux. I downloaded the RUS language data and put it in the tessdata directory (/usr/local/share/tessdata). When I run tesseract with the command tesseract blob.jpg out -l rus, it displays an error:

Error opening data file /usr/local/share/tessdata/eng.traineddata

Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.

Failed loading language eng
Tesseract couldn't load any languages!

Could not initialize tesseract.

Following the compiling guide, I ran export TESSDATA_PREFIX='/usr/local/share/' so that it points to the parent of my tessdata directory. Should I edit a config file as well? Tesseract tries to load the 'eng' data file instead of 'rus'.

Screenshot: http://i.stack.imgur.com/I0Guc.png

Recommended answer

You can grab eng.traineddata from GitHub:

wget https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata

See https://github.com/tesseract-ocr/tessdata for the full list of trained language data.

Once you have the file(s), move them to the /usr/local/share/tessdata folder. Warning: some Linux distributions (such as openSUSE and Ubuntu) may expect them in /usr/share/tessdata instead.

# If you got the data from Google, unzip it first!
gunzip eng.traineddata.gz 
# Move the data
sudo mv -v eng.traineddata /usr/local/share/tessdata/
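
To confirm the files ended up where Tesseract looks for them, something like the following may help. It is only a sketch: find_traineddata is a hypothetical helper (not part of Tesseract), and it assumes the two directories mentioned above are the only candidates and that the command needs both the eng and rus data files.

```shell
#!/bin/sh
# Sketch: report which candidate tessdata directories contain <lang>.traineddata.
# find_traineddata is a hypothetical helper, not something Tesseract ships.
find_traineddata() {
    lang=$1
    shift
    found=1
    for dir in "$@"; do
        if [ -f "$dir/$lang.traineddata" ]; then
            # Print the full path of every match.
            echo "$dir/$lang.traineddata"
            found=0
        fi
    done
    # Exit status 0 if at least one match was found, 1 otherwise.
    return "$found"
}

# Assumption: these are the two locations discussed in the answer.
for lang in eng rus; do
    find_traineddata "$lang" /usr/local/share/tessdata /usr/share/tessdata \
        || echo "missing: $lang.traineddata" >&2
done
```

If a file shows up only under /usr/share/tessdata, pointing TESSDATA_PREFIX at the parent of that directory may be an alternative to moving it. On Tesseract versions that provide the option, tesseract --list-langs also prints the languages the engine can actually see.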
