This article covers converting a Word document to HTML with Apache POI's WordToHtmlConverter.

Question

I am trying to use the WordToHtmlConverter class to convert a Word document to HTML, but the documentation is not clear.

The WordToHtmlConverter constructor takes an org.w3c.dom.Document, but I don't think that is the Word document.

Does anyone have a sample program showing how to load a Word document and convert it to HTML?

Recommended answer

Your best bet for now is probably to look at the unit tests, e.g. TestWordToHtmlConverter. That will show you how to do it.

In general, though, you pass in the XML document to be populated, have WordToHtmlConverter generate the HTML into it from the Word document, then transform the XML document into appropriate output (indentation, new lines, etc.).

Your code would want to look something like this:

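    import java.io.FileInputStream;
    import java.io.StringWriter;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;
    import org.apache.poi.hwpf.HWPFDocument;
    import org.apache.poi.hwpf.converter.WordToHtmlConverter;
    import org.w3c.dom.Document;

    // The statements below belong inside a method declared with "throws Exception".
    // Load the binary .doc file ("input.doc" is a placeholder path, not from the original answer).
    HWPFDocument hwpfDocument = new HWPFDocument( new FileInputStream( "input.doc" ) );

    // Empty DOM document that the converter will populate with HTML.
    Document newDocument = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().newDocument();
    WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(
            newDocument );
    wordToHtmlConverter.processDocument( hwpfDocument );

    // Serialize the populated DOM as indented HTML.
    StringWriter stringWriter = new StringWriter();
    Transformer transformer = TransformerFactory.newInstance()
            .newTransformer();
    transformer.setOutputProperty( OutputKeys.INDENT, "yes" );
    transformer.setOutputProperty( OutputKeys.ENCODING, "utf-8" );
    transformer.setOutputProperty( OutputKeys.METHOD, "html" );
    transformer.transform(
            new DOMSource( wordToHtmlConverter.getDocument() ),
            new StreamResult( stringWriter ) );

    String html = stringWriter.toString();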
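
To see the serialization step in isolation, without Apache POI on the classpath, the sketch below builds a small DOM tree by hand and runs it through a Transformer with the same output properties. The class name DomToHtmlSketch and the helpers buildSampleDocument and toHtml are hypothetical names made up for this illustration; the hand-built tree only stands in for the document that WordToHtmlConverter would populate.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration class; not part of Apache POI.
final class DomToHtmlSketch {

    // Builds a tiny DOM tree by hand. It is a stand-in (assumption) for the
    // document that WordToHtmlConverter would normally fill in.
    static Document buildSampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            Element body = doc.createElement("body");
            Element paragraph = doc.createElement("p");
            paragraph.setTextContent("Hello from a DOM tree");
            body.appendChild(paragraph);
            html.appendChild(body);
            doc.appendChild(html);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not build sample DOM", e);
        }
    }

    // Serializes a DOM document using the same output properties as the answer.
    static String toHtml(Document doc) {
        try {
            StringWriter writer = new StringWriter();
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
            transformer.setOutputProperty(OutputKeys.METHOD, "html");
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize DOM", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml(buildSampleDocument()));
    }
}
```

In the real conversion, the argument to toHtml would be wordToHtmlConverter.getDocument() rather than the hand-built sample.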


06-27 17:09