Problem description
I'm using XPath in Google Docs to get the text inside a <div>. I want to save all of the text inside <div id="job_description"> in a single cell of a Google Docs spreadsheet, but it shows each <div> in a separate cell.
<div id="job_description">
  <div>
    <strong>Basic Purpose:</strong>
    <br></br>
  </div>
  <div>
    Work closely with developers, product owners and Q…
    <br></br>
  </div>
  <div>
    The Test Analyst is accountable for the developmen…
    <br></br>
  </div>
  <div>
    <strong>Duties and Responsibilities:</strong>
  </div>
  <ul>
    <li></li>
    <li></li>
  </ul>
  <div>
    <strong>Requirements:</strong>
    <br></br>
  </div>
  <ul>
    <li></li>
    <li></li>
  </ul>
</div>
Image: http://i.stack.imgur.com/K0mAY.png
This is the formula I wrote:
=IMPORTXML(E4,"//div[@id='job_description']")
Could you help me put all of the text inside <div id="job_description"> (including the text of the nested <div>, <ul>, ... elements) into a single cell?
Using JOIN is a good start, but you can make it a single operation.
You did not show the URL of the page you're importing, so I can only give you an example with another page. For instance, if you are importing www.w3.org and looking for a div where @class='event closed expand_block', use:
=JOIN(CHAR(10),IMPORTXML("http://www.w3.org/","//div[@class='event closed expand_block']//text()"))
Notice that I also modified the XPath expression: //text() makes sure only descendant text nodes are retrieved, that is, all the text.
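Applied to the formula from the question, the same pattern would presumably read =JOIN(CHAR(10),IMPORTXML(E4,"//div[@id='job_description']//text()")), although I could not test it against the actual page. To illustrate what //text() followed by JOIN does, here is a rough Python sketch of the same idea outside of Sheets. It is only an analogy, not how IMPORTXML is implemented: the SAMPLE markup is a made-up, shortened version of the HTML in the question, join_descendant_text is a hypothetical helper name, and dropping whitespace-only text nodes is my own choice.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened copy of the markup from the question
# (the list items are invented so that there is some text to collect).
SAMPLE = """<html><body>
<div id="job_description">
  <div><strong>Basic Purpose:</strong><br></br></div>
  <div>Work closely with developers<br></br></div>
  <div><strong>Requirements:</strong></div>
  <ul><li>First item</li><li>Second item</li></ul>
</div>
</body></html>"""


def join_descendant_text(markup, div_id, delimiter=chr(10)):
    """Collect every descendant text node of the div and join them."""
    root = ET.fromstring(markup)
    # Rough equivalent of //div[@id='...'] in the XPath expression.
    div = root.find(f".//div[@id='{div_id}']")
    if div is None:
        return ""
    # itertext() walks all descendant text in document order,
    # which is what //text() selects in the spreadsheet formula.
    pieces = [text.strip() for text in div.itertext()]
    # Assumption: whitespace-only nodes are dropped to keep the output tidy.
    return delimiter.join(piece for piece in pieces if piece)


result = join_descendant_text(SAMPLE, "job_description")
print(result)
```

The whole description ends up in one string with one line per text node, which corresponds to a single cell in the spreadsheet.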
EDIT: Responding to your comment:
Yes, of course. CHAR returns a character and takes a number as input. In the case of CHAR(10), a newline character is returned (I assume because of &#10;).
In the formula, CHAR(10) is used as the first argument of JOIN, which is the delimiter of the objects that are to be joined.
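The role of CHAR(10) as a delimiter can be sketched in Python, where chr(10) plays the same part. This is only an analogy: the parts list is made-up sample data, and in Python the delimiter is the string that join is called on rather than the first argument.

```python
# chr(10) is the character with code 10, i.e. the line feed ("\n"),
# which is the Python counterpart of CHAR(10) in the formula.
newline = chr(10)

# Made-up sample values standing in for the imported text nodes.
parts = ["Basic Purpose:", "Duties and Responsibilities:", "Requirements:"]

# Counterpart of JOIN(CHAR(10), ...): the delimiter goes between the items.
joined = newline.join(parts)
print(joined)
```

Any other delimiter works the same way, for example ", " to keep everything on one line.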