Question
I need to write a simple program that extracts text from an image using the OneNote Interop. Could anyone point me to suitable documentation for this?
Recommended answer
Text recognized by OneNote's OCR is stored in the one:OCRText element of the page's XML, for example:
<one:Page ...>
...
<one:Image ...>
...
<one:OCRData lang="en-US">
<one:OCRText><![CDATA[This is some sampletext]]></one:OCRText>
</one:OCRData>
</one:Image>
</one:Page>
You can inspect this XML with a program called OMSpy, which shows the XML behind OneNote pages: http://blogs.msdn.com/b/johnguin/archive/2011/07/28/onenote-spy-omspy-for-onenote-2010.aspx
To extract the text, use the OneNote COM interop (as you pointed out), for example:
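// Requires a reference to the Microsoft.Office.Interop.OneNote COM library
using System.Linq;
using System.Xml.Linq;
using Microsoft.Office.Interop.OneNote;

// Instantiate OneNote (if interop types are embedded, use "new Application()" instead)
ApplicationClass onApp = new ApplicationClass();
// Get the XML of the page with the given ID
string xml;
onApp.GetPageContent("put the page id here", out xml);
// Load it into an XDocument (from System.Xml.Linq)
XDocument xDoc = XDocument.Parse(xml);
// OneNote's XML namespace (this URI is for OneNote 2010)
XNamespace one = "http://schemas.microsoft.com/office/onenote/2010/onenote";
// Collect the text of every OCRText element on the page
string[] ocrText = xDoc.Descendants(one + "OCRText").Select(x => x.Value).ToArray();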
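The namespace URI hard-coded above is specific to OneNote 2010, and other OneNote versions use a different URI (OneNote 2013, for instance, uses a 2013 variant). As a minimal sketch, assuming the root one:Page element always sits in the OneNote namespace, the parsing step could read the namespace from the document itself. The class and method names here (OcrTextExtractor, ExtractOcrText) are hypothetical helpers, not part of the OneNote API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of the OneNote interop library.
public static class OcrTextExtractor
{
    // Returns the text of every non-empty OCRText element in a OneNote page XML string.
    public static List<string> ExtractOcrText(string pageXml)
    {
        XDocument doc = XDocument.Parse(pageXml);

        // Assumption: the root element (one:Page) is in the OneNote namespace,
        // so its namespace can be reused instead of hard-coding a version-specific URI.
        XNamespace one = doc.Root.Name.Namespace;

        return doc.Descendants(one + "OCRText")
                  .Select(e => e.Value.Trim())
                  .Where(text => text.Length > 0)
                  .ToList();
    }
}
```

This helper only parses XML, so it can be tried on a page's XML copied out of OMSpy without OneNote running. In the interop code it would take the string returned by GetPageContent.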
See the "Application Interface" docs on MSDN for more information: http://msdn.microsoft.com/en-us/library/gg649853.aspx