Problem description
For the past 3 months I've been trying to train Tesseract
to identify a collection of images I have. Due to a real lack
of proper documentation and a very high level of complexity, I'm starting to
give up on Tesseract as a solution.
I'm looking for an alternative that would be relatively pain-free
to train. I'm not looking to reinvent the wheel here.
If there isn't anything free, I guess a paid solution would
have to do (nothing above $200).
Recommended answer
Based on your comment, all you need is to scan a relatively small number of documents with almost 100% accuracy, and your budget is about $200.
Well, the answer is simple then. You don't need a programming solution. Just buy a quality commercial OCR product, e.g. ABBYY FineReader (disclaimer: I work for ABBYY). Its price differs by region, but I guess it is roughly within your budget.
A commercial desktop OCR product will give you almost 100% accuracy out of the box on typical languages. Such products also have convenient manual verification tools for fixing the remaining errors. They typically support a wide variety of modern fonts, and if your font is unusual, they have a font training utility for that.
I do think that is the optimal solution for you.
UPDATE: Linux platform. Unfortunately, there is almost no choice of high-quality OCR products for Linux, sorry. The only one I know of is from ABBYY: http://ocr4linux.com/en:start but it does not have a UI, verification, or font training. At least you can give it a try and see whether it gives you good enough accuracy as it is, which may well be the case.