java - Selenium WebDriver getPageSource() misplaces attributes and values that contain escaped quotes

While using Selenium, I ran into an error when parsing the output of the getPageSource() method.
The actual meta tag in the page source as shown by Firefox =

  <meta name="news_keywords" content="devo max,independence vote,no campaign,referendum,scotland \"no\" vote,scotland independence,scotland powers,scotland referendum,scotland vote,scottish referendum" />

Result of the getPageSource() method using the Firefox driver with Selenium =

<meta referendum"="" vote,scottish="" referendum,scotland="" powers,scotland="" independence,scotland="" vote,scotland="" no\"="" content="devo max,independence vote,no campaign,referendum,scotland \" name="news_keywords" />

This is absurd and causes problems when I process the HTML output further.
Any suggestions, help, or workarounds?

Best answer

From the documentation:

getPageSource

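java.lang.String getPageSource()

Get the source of the last loaded page. If the page has been modified
after loading (for example, by Javascript) there is no guarantee that
the returned text is that of the modified page. Please consult the
documentation of the particular driver being used to determine whether
the returned text reflects the current state of the page or the text
last sent by the web server. The page source returned is a
representation of the underlying DOM: do not expect it to be formatted
or escaped in the same way as the response sent from the web server.
Think of it as an artist's impression.

Returns:
The source of the current page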

http://selenium.googlecode.com/git/docs/api/java/org/openqa/selenium/WebDriver.html#getPageSource%28%29
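
To make the quoted documentation concrete: in HTML a backslash does not escape a double quote, so the browser ends the content value at `scotland \` and treats the remaining tokens as separate attributes with empty values. getPageSource() then serializes that DOM, which is the output shown in the question. If you need the value the page author intended, one possible workaround is to run a tolerant pattern over the raw markup as sent by the server (fetched by some other means) rather than over getPageSource(). The sketch below is only an illustration under that assumption; the class name MetaContentExtractor and the method extractContent are hypothetical, and the regex is not a general HTML parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Selenium: pulls the content attribute out of
// raw markup in which quotes were (incorrectly) escaped as \" inside the value.
final class MetaContentExtractor {

    // content="..." where the value is a run of either a backslash followed by
    // any character, or any character that is neither a quote nor a backslash.
    private static final Pattern CONTENT =
            Pattern.compile("content=\"((?:\\\\.|[^\"\\\\])*)\"");

    static Optional<String> extractContent(String rawTag) {
        Matcher m = CONTENT.matcher(rawTag);
        if (!m.find()) {
            return Optional.empty();
        }
        // Turn the non-standard \" sequences back into plain quotes.
        return Optional.of(m.group(1).replace("\\\"", "\""));
    }

    public static void main(String[] args) {
        // Shortened version of the meta tag from the question.
        String raw = "<meta name=\"news_keywords\" "
                + "content=\"referendum,scotland \\\"no\\\" vote,scotland independence\" />";
        System.out.println(extractContent(raw).orElse("(no content attribute found)"));
    }
}
```

This only helps when the raw response is available. A value read from the DOM, whether through getPageSource() or an element's attribute, has already been cut at the first unescaped quote.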