I am scraping this page:
http://www.modeluxproperties.com/?act=list_web&m=search&purpose=sale&project=&type=32&beds=&lop=&Submit.x=37&Submit.y=20
I want to get the value of the Parking attribute.
The HTML looks like this:
<span class="smallredtext" style="font-size:12px;">
<img src="images/listwebpoint.png" width="6" height="6"> Status: for <b>Sale</b>
<img src="images/listwebpoint.png" width="6" height="6"> Ref No: <b>AFS503</b>
<img src="images/listwebpoint.png" width="6" height="6"> BUA: <b>1700 Sq.Ft.</b>
<img src="images/listwebpoint.png" width="6" height="6"> Bedroom: <b>2</b>
<img src="images/listwebpoint.png" width="6" height="6"> Bathroom: <b>3</b>
<img src="images/listwebpoint.png" width="6" height="6"> Parking: <b>1</b>
</span>
This is my XPath:
.//span[@class='smallredtext'][normalize-space(text())=Parking:]/following-sibling::b[1]/text()
I get this error:
raise ValueError("Invalid XPath: %s" % query)
ValueError: Invalid Xpath: //span[@class='smallredtext'][normalize-space(text())=Parking:]/following-sibling::b[1]/text()
I am using Scrapy with Python 2.7.
Best answer
Find the b tag and check its preceding-sibling text node:
.//span[@class='smallredtext']/b[preceding-sibling::text()=' Parking: ']/text()
UPD (using normalize-space(), so the whitespace around the label no longer matters):
.//span[@class='smallredtext']/b[preceding-sibling::text()[normalize-space() = 'Parking:']]/text()
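
To check the expressions outside of a spider, here is a minimal sketch that runs them against the markup from the question. It uses lxml directly rather than Scrapy's selectors, which is an assumption made for convenience. The names SNIPPET, is_valid_xpath, and value_for are hypothetical helpers, not part of the original question or answer. The sketch also shows why the original query is rejected: Parking: is not quoted, so it is not a string literal and the expression cannot be parsed. The last expression is a variant of my own, not from the answer above, that starts from the label's text node and takes the first following b; this matters when the label is not the last one in the span, because the preceding-sibling test matches every b that comes after the label.

```python
from lxml import etree, html

# Markup copied from the question, trimmed to three of the rows.
SNIPPET = """
<span class="smallredtext" style="font-size:12px;">
<img src="images/listwebpoint.png" width="6" height="6"> Bedroom: <b>2</b>
<img src="images/listwebpoint.png" width="6" height="6"> Bathroom: <b>3</b>
<img src="images/listwebpoint.png" width="6" height="6"> Parking: <b>1</b>
</span>
"""

# document_fromstring always returns the <html> root, so ".//span" can match.
root = html.document_fromstring(SNIPPET)


def is_valid_xpath(expr):
    """Hypothetical helper: True if lxml can compile the expression."""
    try:
        etree.XPath(expr)
        return True
    except etree.XPathError:
        return False


# The query from the question: Parking: is unquoted, so it is not a string literal.
BROKEN = (".//span[@class='smallredtext']"
          "[normalize-space(text())=Parking:]/following-sibling::b[1]/text()")

# The expression from the answer (normalize-space() version).
ANSWER = (".//span[@class='smallredtext']"
          "/b[preceding-sibling::text()[normalize-space() = 'Parking:']]/text()")

# Same test applied to an earlier label: every later <b> also qualifies.
LOOSE_BEDROOM = ANSWER.replace("Parking:", "Bedroom:")


def value_for(label):
    """Hypothetical helper: first <b> that follows the text node equal to label."""
    expr = (".//span[@class='smallredtext']"
            "/text()[normalize-space() = '{}']"
            "/following-sibling::b[1]/text()").format(label)
    return root.xpath(expr)


print(is_valid_xpath(BROKEN))     # False
print(root.xpath(ANSWER))         # ['1']
print(root.xpath(LOOSE_BEDROOM))  # ['2', '3', '1']
print(value_for("Bedroom:"))      # ['2']
```

In a Scrapy callback the same expression strings would be passed to the selector's xpath() method instead of lxml's; the exact call depends on the Scrapy version in use.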
Regarding "python - Exception in XPath when the next element is text", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/22361086/