Converting lxml code to Scrapy xxs selectors
Question
How can I convert this pure-Python lxml code to Scrapy's built-in xxs selectors (XmlXPathSelector)? The version below works, but I want to rewrite it with the Scrapy xxs selectors.
import urlparse  # Python 2 module, as used by the original code

import lxml.etree
from scrapy.http import Request

def parse_device_list(self, response):
    self.log("\n\n\n List of devices \n\n\n")
    self.log('Hi, this is the parse_device_list page! %s' % response.url)
    root = lxml.etree.fromstring(response.body)
    for row in root.xpath('//row'):
        allcells = row.xpath('./cell')
        # the first cell contains the link to follow
        detail_page_link = allcells[0].get("href")
        yield Request(urlparse.urljoin(response.url, detail_page_link), callback=self.parse_page)
Recommended answer
Try this:
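from scrapy.selector import XmlXPathSelector

def parse_device_list(self, response):
    xxs = XmlXPathSelector(response)
    for row in xxs.select('//row'):
        # './cell[1]' matches the first direct child cell, like allcells[0] in the lxml version
        detail_page_link = row.select('./cell[1]/@href')[0].extract()
        yield Request(urlparse.urljoin(response.url, detail_page_link), callback=self.parse_page)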