Problem description
Info:
I have a program that generates XML sitemaps for Google Webmaster Tools (among other things).
GWT is giving me errors for some sitemaps because the URLs contain character sequences like ã¾, ã‹, ã€, etc. **
GWT says:
The special characters are escaped in the XML files (with HTML entities).
XML file snippet:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://domain/folder/listing-&#227;&#129;.shtml</loc>
    ...
Are my URLs UTF-8 encoded?
If not, how do I do this in Java?
The following is the line in my program where I add the URL to the sitemap:
siteMap.addUrl(StringEscapeUtils.escapeXml(countryName+"/"+twoCharFile.getRelativeFileName().toLowerCase()));
** = I'm not sure which ones are causing the error, probably the first two examples.
I apologize for all the editing.
Solution
Try using
URLEncoder.encode(stringToBeEncoded, "UTF-8")
to encode the URL.
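
One caveat when applying this to a full path: URLEncoder.encode performs form encoding, so it also escapes "/" and writes a space as "+". A minimal sketch of one way around that is to encode each path segment separately. The class name SitemapUrlEncoder, the method encodePath, and the sample file name containing the character U+307E are all hypothetical and only serve as illustration; the real file names from the question are not known.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class SitemapUrlEncoder {

    // Hypothetical helper: percent-encodes every path segment as UTF-8
    // and keeps the "/" separators between segments untouched.
    static String encodePath(String path) {
        StringBuilder out = new StringBuilder();
        String[] segments = path.split("/", -1);
        try {
            for (int i = 0; i < segments.length; i++) {
                if (i > 0) {
                    out.append('/');
                }
                // URLEncoder writes a space as "+" (form encoding);
                // a URL path expects "%20", so swap it afterwards.
                // A literal "+" in the input is already "%2B" at this point.
                out.append(URLEncoder.encode(segments[i], "UTF-8").replace("+", "%20"));
            }
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Sample input is an assumption: a file name with one hiragana character.
        System.out.println(encodePath("folder/listing-\u307e.shtml"));
        // prints folder/listing-%E3%81%BE.shtml
    }
}
```

In the program from the question, the encoded path would presumably still be passed through StringEscapeUtils.escapeXml before siteMap.addUrl, since characters such as "&" must remain escaped in the XML. After percent-encoding, the URL contains only ASCII, so no numeric character entities should be needed for the non-ASCII characters.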