I am trying to use XPath to get the headers of a DataTable.
My output should be:
ItemNum | Item | ResultCode | Status | ExtBackLinks | RefDomains | AnalysisResUnitsCost | ACRank | ItemType | IndexedURLs | GetTopBackLinksAnalysisResUnitsCost | DownloadBacklinksAnalysisResUnitsCost | DownloadRefDomainBacklinksAnalysisResUnitsCost | RefIPs | RefSubNets | RefDomainsEDU | ExtBackLinksEDU | RefDomainsGOV | ExtBackLinksGOV | RefDomainsEDU_Exact | ExtBackLinksEDU_Exact | RefDomainsGOV_Exact | ExtBackLinksGOV_Exact | CrawledFlag | LastCrawlDate | LastCrawlResult | RedirectFlag | FinalRedirectResult | OutDomainsExternal | OutLinksExternal | OutLinksInternal | OutLinksPages | LastSeen | Title | RedirectTo | Language | LanguageDesc | LanguageConfidence | LanguagePageRatios | LanguageTotalPages | RefLanguage | RefLanguageDesc | RefLanguageConfidence | RefLanguagePageRatios | RefLanguageTotalPages | CrawledURLs | RootDomainIPAddress | TotalNonUniqueLinks | NonUniqueLinkTypeHomepages | NonUniqueLinkTypeIndirect | NonUniqueLinkTypeDeleted | NonUniqueLinkTypeNoFollow | NonUniqueLinkTypeProtocolHTTPS | NonUniqueLinkTypeFrame | NonUniqueLinkTypeImageLink | NonUniqueLinkTypeRedirect | NonUniqueLinkTypeTextLink | RefDomainTypeLive | RefDomainTypeFollow | RefDomainTypeHomepageLink | RefDomainTypeDirect | RefDomainTypeProtocolHTTPS | CitationFlow | TrustFlow | TrustMetric | TopicalTrustFlow_Topic_0 | TopicalTrustFlow_Value_0 | TopicalTrustFlow_Topic_1 | TopicalTrustFlow_Value_1 | TopicalTrustFlow_Topic_2 | TopicalTrustFlow_Value_2
This is the original XML:
<Result Code="OK" ErrorMessage="" FullError="">
<GlobalVars FirstBackLinkDate="2012-09-21" IndexBuildDate="2018-05-24 19:47:18" IndexType="0" MostRecentBackLinkDate="2018-04-23" QueriedRootDomains="1" QueriedSubDomains="0" QueriedURLs="0" QueriedURLsMayExist="0" ServerBuild="2018-06-11 13:52:01" ServerName="BRUNO28" ServerVersion="1.0.6736.23160" UniqueIndexID="20180524194718-HISTORICAL"/>
<DataTables Count="1">
<DataTable Name="Results" RowsCount="1" Headers="ItemNum|Item|ResultCode|Status|ExtBackLinks|RefDomains|AnalysisResUnitsCost|ACRank|ItemType|IndexedURLs|GetTopBackLinksAnalysisResUnitsCost|DownloadBacklinksAnalysisResUnitsCost|DownloadRefDomainBacklinksAnalysisResUnitsCost|RefIPs|RefSubNets|RefDomainsEDU|ExtBackLinksEDU|RefDomainsGOV|ExtBackLinksGOV|RefDomainsEDU_Exact|ExtBackLinksEDU_Exact|RefDomainsGOV_Exact|ExtBackLinksGOV_Exact|CrawledFlag|LastCrawlDate|LastCrawlResult|RedirectFlag|FinalRedirectResult|OutDomainsExternal|OutLinksExternal|OutLinksInternal|OutLinksPages|LastSeen|Title|RedirectTo|Language|LanguageDesc|LanguageConfidence|LanguagePageRatios|LanguageTotalPages|RefLanguage|RefLanguageDesc|RefLanguageConfidence|RefLanguagePageRatios|RefLanguageTotalPages|CrawledURLs|RootDomainIPAddress|TotalNonUniqueLinks|NonUniqueLinkTypeHomepages|NonUniqueLinkTypeIndirect|NonUniqueLinkTypeDeleted|NonUniqueLinkTypeNoFollow|NonUniqueLinkTypeProtocolHTTPS|NonUniqueLinkTypeFrame|NonUniqueLinkTypeImageLink|NonUniqueLinkTypeRedirect|NonUniqueLinkTypeTextLink|RefDomainTypeLive|RefDomainTypeFollow|RefDomainTypeHomepageLink|RefDomainTypeDirect|RefDomainTypeProtocolHTTPS|CitationFlow|TrustFlow|TrustMetric|TopicalTrustFlow_Topic_0|TopicalTrustFlow_Value_0|TopicalTrustFlow_Topic_1|TopicalTrustFlow_Value_1|TopicalTrustFlow_Topic_2|TopicalTrustFlow_Value_2" MaxTopicsRootDomain="30" MaxTopicsSubDomain="20" MaxTopicsURL="10" TopicsCount="3">
<Row>
0|nu.nl|OK|Found|508322106|165344|508322106|-1|1|4149991|5000|512472097|3356880|59147|26204|233|3613|43|308|73|1757|4|12|False| | |True| |5|10|44|1722150| |NU - Het laatste nieuws het eerst op NU.nl|https://www.nu.nl/|nl|Dutch/Flemish|92|99.9|482980|nl,en,de|Dutch/Flemish,English,German|87,93,58|96.5,3.1,0.1|76319583|1915923|52.85.201.19|611833777|15034990|53120677|444371798|95283418|52384870|388104|53497551|5655999|552292123|102171|115787|21952|150164|49554|76|70|70|News/Breaking News|69|Sports/Resources|45|Arts/Radio|43
</Row>
</DataTable>
</DataTables>
</Result>
When I use this XPath command in Google Sheets:
=importxml("http://enterprise.majesticseo.com/api_command?privatekey=xxx&accessToken=xxx&cmd=GetIndexItemInfo&item0=nu.nl&items=1","//DataTable")
I get the row results. That is great, but I also need the header names in the first row of the sheet.
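Best answer
A short introduction to XPath :-)
With //DataTable you get the complete node of every <DataTable> found anywhere in the XML (no namespaces are involved here).
As a rule of thumb, it is best to be as specific as possible, so prefer /Result/DataTables/DataTable over //DataTable. But that is not the answer to your question...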
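The difference between the two forms can be sketched in Python. This is only an illustration: the choice of Python's standard xml.etree.ElementTree module and the extra <Archive> element are assumptions made for the example, not part of the question's XML. ElementTree paths are relative to the element they are called on, so .//DataTable stands in for //DataTable and DataTables/DataTable stands in for /Result/DataTables/DataTable.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a second <DataTable> sits outside <DataTables>.
doc = ET.fromstring(
    "<Result>"
    "<DataTables><DataTable Name='Results'/></DataTables>"
    "<Archive><DataTable Name='Old'/></Archive>"
    "</Result>"
)

# Stand-in for //DataTable: every <DataTable> at any depth.
anywhere = [t.get("Name") for t in doc.findall(".//DataTable")]

# Stand-in for /Result/DataTables/DataTable: only the table on that exact path.
specific = [t.get("Name") for t in doc.findall("DataTables/DataTable")]

print(anywhere)
print(specific)
```

With the broad form, the unrelated table under <Archive> is matched as well, which is why the specific path is usually the safer choice.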
Imagine an XML like this:
<root>
<innerNode attr="1"><a>Some a content</a><b>Some b content</b></innerNode>
<innerNode attr="2"><a>aaa</a><b>bbb</b></innerNode>
</root>
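With
/root/innerNode
you get both <innerNode> elements with all their content. With
/root/innerNode[(b/text())[1]="bbb"]
you get only the one <innerNode> whose <b> has a text() of "bbb". With
/root/innerNode[@attr="1"]
you get the one <innerNode> whose attribute attr has the value "1". All three XPath samples return whole nodes, including child nodes, attributes and so on. If you only need the value of an attribute, you have to ask for it: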
(/root/innerNode/@attr)[2]
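...returns the attribute value of the second <innerNode> (strictly speaking, the second occurrence of @attr).
/root/innerNode[(b/text())[1]="Some b content"]/@attr
...returns the attribute value of the <innerNode> whose <b> has a text() of "Some b content".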
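The samples above can be tried out with a small Python sketch. It assumes the standard xml.etree.ElementTree module, which supports only a subset of XPath: it has no text() function and cannot return an attribute directly, so the [b='...'] predicate and the .get("attr") call are approximations of the expressions shown, not the same expressions.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<root>'
    '<innerNode attr="1"><a>Some a content</a><b>Some b content</b></innerNode>'
    '<innerNode attr="2"><a>aaa</a><b>bbb</b></innerNode>'
    '</root>'
)

# /root/innerNode : both nodes (paths are relative to the root element here)
all_nodes = root.findall("innerNode")

# /root/innerNode[(b/text())[1]="bbb"] : approximated with a child-text predicate
bbb_nodes = root.findall("innerNode[b='bbb']")

# /root/innerNode[@attr="1"] : the node whose attr equals "1"
attr1_nodes = root.findall("innerNode[@attr='1']")

# (/root/innerNode/@attr)[2] : the second attr value (XPath counts from 1, Python from 0)
second_attr = [n.get("attr") for n in all_nodes][1]

# /root/innerNode[(b/text())[1]="Some b content"]/@attr
some_b_attr = root.find("innerNode[b='Some b content']").get("attr")

print(len(all_nodes), bbb_nodes[0].get("attr"), attr1_nodes[0].get("attr"))
print(second_attr, some_b_attr)
```

The first three calls return Element objects (whole nodes), while the last two return plain strings, which mirrors the distinction between selecting a node and selecting an attribute value.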
Back to your question
You want to read the attribute Headers, which sits on the <DataTable> element below /Result/DataTables. Just use
/Result/DataTables/DataTable/@Headers
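As a rough check outside the spreadsheet, the same lookup can be sketched in Python with the standard xml.etree.ElementTree module. The XML below is a shortened stand-in for the API response (only the first four headers and values are kept), and pairing headers with row values is an illustrative extra rather than something the question asks for.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the API response: first four headers and values only.
xml_text = """<Result Code="OK" ErrorMessage="" FullError="">
<DataTables Count="1">
<DataTable Name="Results" RowsCount="1" Headers="ItemNum|Item|ResultCode|Status">
<Row>
0|nu.nl|OK|Found
</Row>
</DataTable>
</DataTables>
</Result>"""

result = ET.fromstring(xml_text)

# Stand-in for /Result/DataTables/DataTable (path is relative to <Result>).
table = result.find("DataTables/DataTable")

# Stand-in for .../@Headers: read the attribute, then split on the pipe.
headers = table.get("Headers").split("|")

# The row text is pipe-delimited in the same order as the headers.
row = table.find("Row").text.strip().split("|")

record = dict(zip(headers, row))
print(headers)
print(record)
```

In the sheet itself, the XPath would go in as the second argument of importxml. The attribute comes back as a single pipe-delimited string, so it presumably still needs to be split on "|" (for example with the SPLIT function) to put one header in each cell.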
Source: "Xpath help needed for XML output" (tag: xml), a question on Stack Overflow: https://stackoverflow.com/questions/51040348/