While trying to parse HTML using the XPath functionality provided by Yahoo Query Language (YQL), I ran into the problem of not being able to extract "text()" or attribute values.
For example,
select * from html where url="http://stackoverflow.com"
and xpath='//div/h3/a'
gives a list of anchors as XML:
<results>
<a class="question-hyperlink" href="/questions/661184/filling-the-text-area-with-the-text-when-a-button-is-clicked" title="In ASP.net, I need the code to fill the text area (in the form) when a button is clicked. Can you help me through by showing a simple .aspx code containing the script tag? ">Filling the text area with the text when a button is clicked</a>...
</results>
Now when I try to extract the node value using
select * from html where url="http://stackoverflow.com"
and xpath='//div/h3/a/text()'
I get the results concatenated rather than as a node list, e.g.
<results>Xcode: attaching to a remote process for debuggingWhy is b
…… </results>
How do I separate it into a node list, and how do I select attribute values?
A query like this
select * from html where url="http://stackoverflow.com"
and xpath='//div/h3/a[@href]'
gave me the same results as querying //div/h3/a.
YQL requires the XPath expression to evaluate to an itemPath rather than to node text. But once you have an itemPath, you can project various values from the tree.
In other words, an itemPath should point to a node in the resulting HTML rather than to text content or attributes. YQL returns all matching nodes and their children when you select * from the data.
Example:
select * from html where url="http://stackoverflow.com" and xpath='//div/h3/a'
This returns all the a elements matching the XPath. To project out the text content of each one, use
select content from html where url="http://stackoverflow.com" and xpath='//div/h3/a'
"content" returns the text content held within the node.
To project out attributes, name them relative to the node that the XPath expression selects. In this case you need href, which is an attribute of a:
select href from html where url="http://stackoverflow.com" and xpath='//div/h3/a'
This returns:
<results> <a href="/questions/663973/putting-a-background-pictures-with-leds"/> <a href="/questions/663013/advantages-and-disadvantages-of-popular-high-level-languages"/>....</results>
If you need both the href attribute and the text content, you can execute the following YQL query:
select href, content from html where url="http://stackoverflow.com" and xpath='//div/h3/a'
returns:
<results> <a href="/questions/663950/double-pointer-const-issue-issue">double pointer const issue issue</a>... </results>
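To make the "select nodes first, then project values" idea concrete outside of YQL, here is a minimal Python sketch using only the standard library's xml.etree.ElementTree. It is not YQL and says nothing about how YQL works internally; SAMPLE and project_anchors are hypothetical names made up for illustration, and the sample markup is a stand-in for the fetched page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup standing in for the fetched page.
SAMPLE = """
<html><body>
  <div><h3><a href="/questions/1/first">First question</a></h3></div>
  <div><h3><a href="/questions/2/second">Second question</a></h3></div>
</body></html>
""".strip()


def project_anchors(markup, fields=("href", "content")):
    """Select the //div/h3/a nodes, then project the requested fields.

    "content" mirrors the name YQL uses for a node's text; any other
    field name is looked up as an attribute on the selected node.
    """
    root = ET.fromstring(markup)
    rows = []
    # The path selects element nodes only, never text() or @href directly.
    for anchor in root.findall(".//div/h3/a"):
        row = {}
        for field in fields:
            if field == "content":
                row[field] = anchor.text or ""
            else:
                row[field] = anchor.get(field)
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in project_anchors(SAMPLE):
        print(row)
```

Because the path stops at the a element, each match stays a separate row, which is the same reason the projected YQL queries above give one result per anchor instead of one concatenated string.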
Hope that helps. Let me know if you have more questions on YQL.