Problem description
Scrapy seems to be extracting the data correctly, but every value in my JSON output is wrapped in an array:
[{"price": ["$34"], "link": ["/product/product..."], "name": ["productname"]},
{"price": ["$37"], "link": ["/product/product"]...
My spider class looks like this:
def parse(self, response):
    sel = Selector(response)
    items = sel.xpath('//div/ul[@class="product"]')
    skateboards = []
    for item in items:
        skateboard = SkateboardItem()
        skateboard['name'] = item.xpath('li[@class="desc"]//text()').extract()
        skateboard['price'] = item.xpath('li[@class="price"]//text()[1]').extract()
        skateboard['link'] = item.xpath('li[@class="image"]').extract()
        skateboards.append(skateboard)
    return skateboards
How can I make sure Scrapy outputs a single value for each key, rather than the array it currently produces?
Recommended answer
.extract() always returns a list, even when only one node matches. To get a single string, join the list:
''.join(item.xpath('li[@class="desc"]//text()').extract())
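To illustrate the idea without a running spider, here is a minimal sketch that post-processes lists shaped like the ones .extract() returns. The helper names `join_text` and `first_or_none` and the sample values are hypothetical and not part of Scrapy. Unlike the plain ''.join above, this variation also strips whitespace-only fragments, which is an assumption about what the page's text nodes look like.

```python
def join_text(fragments):
    """Collapse a list of extracted text fragments into one string."""
    # Strip each fragment and drop those that were only whitespace.
    cleaned = [f.strip() for f in fragments if f.strip()]
    return " ".join(cleaned)


def first_or_none(fragments):
    """Return the first extracted fragment, or None when nothing matched."""
    return fragments[0].strip() if fragments else None


# Made-up lists in the shape .extract() produces.
name_fragments = ["\n  ", "productname", "\n"]
price_fragments = ["$34"]

name = join_text(name_fragments)        # a single string instead of a list
price = first_or_none(price_fragments)  # first match only
missing = first_or_none([])             # no match: None rather than an IndexError
```

Inside the spider, the same helpers could wrap each .extract() call before the value is assigned to the item. Newer Scrapy versions also provide .extract_first() (and its alias .get()) on selector lists, which returns the first match or None, so a helper like `first_or_none` may be unnecessary there.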