Converting HTML to XHTML with the JTidy Java API


Problem description

I am using JTidy to convert HTML to XHTML, but the resulting XHTML file contains the &nbsp; entity. Can I prevent this?

This is my code:

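    // Convert HTML to XHTML.
    // Requires: java.io.*, org.w3c.dom.Document, org.w3c.tidy.Tidy
    Tidy tidy = new Tidy();
    tidy.setShowWarnings(false);
    tidy.setXmlTags(false);
    tidy.setInputEncoding("UTF-8");
    tidy.setOutputEncoding("UTF-8");
    tidy.setXHTML(true);
    tidy.setMakeClean(true);

    // try-with-resources closes both streams; parsing is skipped if the input cannot be opened.
    try (InputStream fis = new FileInputStream(htmlFileName);
         OutputStream out = new FileOutputStream("c.xhtml"))
    {
        Document xmlDoc = tidy.parseDOM(fis, null);
        tidy.pprint(xmlDoc, out);
    }
    catch (java.io.FileNotFoundException e)
    {
        System.out.println("File not found: " + htmlFileName);
    }
    catch (java.io.IOException e)
    {
        // Report I/O failures instead of silently swallowing them.
        e.printStackTrace();
    }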

Recommended answer

I created a function that parses the XHTML code, removes the unwanted &nbsp; entities, and adds a link to the CSS file "tableStyle.css":

    // Requires: java.io.*, java.nio.charset.StandardCharsets
    public static String xhtmlparser() {
        StringBuilder cleanLines = new StringBuilder();

        // Read the XHTML file written by JTidy as UTF-8 text, line by line.
        try (BufferedReader br = new BufferedReader(new InputStreamReader(
                new FileInputStream("c.xhtml"), StandardCharsets.UTF_8))) {
            String strLine;
            int linesCounter = 0;
            while ((strLine = br.readLine()) != null) {
                // Drop every &nbsp; entity from the current line.
                String m = strLine.replace("&nbsp;", "");
                linesCounter++;
                // After the fifth line, add the stylesheet link.
                if (linesCounter == 5) {
                    m = m + "\n" + "<link rel=\"stylesheet\" type=\"text/css\" href=\"tableStyle.css\"/>";
                }
                cleanLines.append(m).append("\n");
            }
        } catch (IOException e) {
            // Report the failure instead of silently returning partial output.
            e.printStackTrace();
        }

        return cleanLines.toString();
    }

But is this a good approach in terms of performance?

By the way, it works.
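
On the performance question: the file is read once and each line is scanned once, so the cost should grow roughly linearly with file size as long as the output is accumulated in a StringBuilder rather than by repeated String concatenation. A more fragile part is the hard-coded line 5, which depends on how JTidy happens to lay out the output. The following is only a sketch of an alternative that keys on the opening <head> tag instead. It assumes the whole document fits in memory as a String, and the class name XhtmlCleaner and the method clean are hypothetical names chosen for illustration, not part of JTidy.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; not part of JTidy.
final class XhtmlCleaner {
    // Matches the opening <head> tag, with or without attributes (but not <header>).
    private static final Pattern HEAD_OPEN =
            Pattern.compile("<head(\\s[^>]*)?>", Pattern.CASE_INSENSITIVE);

    private static final String CSS_LINK =
            "<link rel=\"stylesheet\" type=\"text/css\" href=\"tableStyle.css\"/>";

    private XhtmlCleaner() {
    }

    /** Removes every &nbsp; entity and inserts the stylesheet link right after <head>. */
    static String clean(String xhtml) {
        String withoutNbsp = xhtml.replace("&nbsp;", "");
        Matcher m = HEAD_OPEN.matcher(withoutNbsp);
        if (!m.find()) {
            // No <head> tag found: return the text with only the entities removed.
            return withoutNbsp;
        }
        StringBuilder out =
                new StringBuilder(withoutNbsp.length() + CSS_LINK.length() + 1);
        out.append(withoutNbsp, 0, m.end());      // everything up to and including <head>
        out.append('\n').append(CSS_LINK);        // the link on its own line
        out.append(withoutNbsp, m.end(), withoutNbsp.length()); // the rest of the document
        return out.toString();
    }
}
```

The method takes and returns a String, so it can be applied to the contents of c.xhtml after reading the file, or to the text JTidy produced before writing it out. Like the function in the answer, it deletes the entity outright; replacing it with an ordinary space is a one-line change if words would otherwise run together.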
