How to scrape coupon codes from a coupon site (the code appears when a button is clicked)
Problem description
I want to scrape a page, and I'm using Scrapy and Python to do it.
I want to scrape the button that you can see on the left in the picture below:
http://postimg.org/image/syhauheo7/
When I click the green button labeled View Code, it does three things:
- Redirects to another id.
- Opens a popup containing the code.
- Shows the code on the same page, as can be seen on the right in the picture above.
How can I scrape the code using Scrapy and Python?
Recommended answer
Here is your spider:
# Note: these import paths (BaseSpider, HtmlXPathSelector) come from older
# Scrapy releases; newer releases use scrapy.Spider and response.xpath().
from scrapy.http import Request
from scrapy.item import Item, Field
from scrapy.selector import HtmlXPathSelector
from scrapy.spider import BaseSpider


class VoucherItem(Item):
    voucher_id = Field()
    code = Field()


class CuponationSpider(BaseSpider):
    name = "cuponation"
    allowed_domains = ["cuponation.in"]
    start_urls = ["https://www.cuponation.in/babyoye-coupons"]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        # Each "View Code" button is a link carrying a data-voucher-id attribute.
        crawled_items = hxs.select('//div[@class="six columns voucher-btn"]/a')
        for button in crawled_items:
            voucher_id = button.select('@data-voucher-id').extract()[0]
            item = VoucherItem()
            item['voucher_id'] = voucher_id
            # Request the clickout page for this voucher and pass the item along.
            request = Request("https://www.cuponation.in/clickout/index/id/%s" % voucher_id,
                              callback=self.parse_code,
                              meta={'item': item})
            yield request

    def parse_code(self, response):
        hxs = HtmlXPathSelector(response)
        item = response.meta['item']
        # extract() returns a list, so 'code' is stored as a list of strings.
        item['code'] = hxs.select('//div[@class="code-field"]/span/text()').extract()
        return item
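The first step of the spider, reading each button's data-voucher-id and turning it into a clickout URL, can be tried without Scrapy or network access. The following is only a rough sketch using the standard library: the sample HTML is made up for illustration (it borrows two ids from the answer's output), VoucherIdParser and clickout_urls are hypothetical names, and the class matching only approximates the XPath used in the spider.

```python
from html.parser import HTMLParser

# URL pattern taken from the spider above.
CLICKOUT_URL = "https://www.cuponation.in/clickout/index/id/%s"


class VoucherIdParser(HTMLParser):
    """Collect data-voucher-id values from <a> tags whose nearest
    enclosing <div> has the voucher-btn class (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.div_stack = []    # one bool per open <div>: is it a voucher-btn div?
        self.voucher_ids = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            classes = (attrs.get("class") or "").split()
            self.div_stack.append("voucher-btn" in classes)
        elif tag == "a" and self.div_stack and self.div_stack[-1]:
            voucher_id = attrs.get("data-voucher-id")
            if voucher_id:
                self.voucher_ids.append(voucher_id)

    def handle_endtag(self, tag):
        if tag == "div" and self.div_stack:
            self.div_stack.pop()


def clickout_urls(html):
    """Return one clickout URL per voucher button found in the HTML."""
    parser = VoucherIdParser()
    parser.feed(html)
    return [CLICKOUT_URL % vid for vid in parser.voucher_ids]


# Made-up markup that mimics the structure the spider's XPath expects.
SAMPLE_HTML = """
<div class="six columns voucher-btn"><a data-voucher-id="5735" href="#">View Code</a></div>
<div class="other"><a data-voucher-id="9999" href="#">Ignored</a></div>
<div class="six columns voucher-btn"><a data-voucher-id="5446" href="#">View Code</a></div>
"""

urls = clickout_urls(SAMPLE_HTML)
for url in urls:
    print(url)
```

In the spider itself, each of these URLs is requested and the code is read from the response, so no button click or JavaScript execution is needed.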
If you run the spider via:
scrapy runspider <script_name.py> --output output.json
you'll see the following in output.json:
{"voucher_id": "5735", "code": ["MUM10"]}
{"voucher_id": "3634", "code": ["Deal Activated. Enjoy Shopping"]}
{"voucher_id": "5446", "code": ["APP20"]}
{"voucher_id": "5558", "code": ["No code for this deal"]}
{"voucher_id": "1673", "code": ["Deal Activated. Enjoy Shopping"]}
{"voucher_id": "3963", "code": ["CNATION150"]}
{"voucher_id": "5515", "code": ["Deal Activated. Enjoy Shopping"]}
{"voucher_id": "4313", "code": ["Deal Activated. Enjoy Shopping"]}
{"voucher_id": "4309", "code": ["Deal Activated. Enjoy Shopping"]}
{"voucher_id": "1540", "code": ["Deal Activated. Enjoy Shopping"]}
{"voucher_id": "4310", "code": ["Deal Activated. Enjoy Shopping"]}
{"voucher_id": "1539", "code": ["Deal Activated. Enjoy Shopping"]}
{"voucher_id": "4312", "code": ["Deal Activated. Enjoy Shopping"]}
{"voucher_id": "4311", "code": ["Deal Activated. Enjoy Shopping"]}
{"voucher_id": "2785", "code": ["Deal Activated. Enjoy Shopping"]}
{"voucher_id": "3631", "code": ["Deal Activated. Enjoy Shopping"]}
{"voucher_id": "4496", "code": ["Deal Activated. Enjoy Shopping"]}
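Many of these records hold a placeholder message rather than a usable code. As a rough post-processing sketch, the snippet below keeps only the vouchers with an actual code. It assumes the file contains one JSON object per line, as shown above; some Scrapy versions write a single JSON array for a .json file instead, in which case json.load on the whole file would be the starting point. The names PLACEHOLDERS and real_codes are hypothetical, and the placeholder set is taken only from the sample output.

```python
import json

# Placeholder texts that appear in the sample output instead of a real code.
PLACEHOLDERS = {"Deal Activated. Enjoy Shopping", "No code for this deal"}


def real_codes(lines):
    """Map voucher_id -> code for vouchers that expose an actual code."""
    result = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # "code" is a list because the spider stores the raw extract() result.
        codes = [c.strip() for c in record.get("code", [])]
        codes = [c for c in codes if c and c not in PLACEHOLDERS]
        if codes:
            result[record["voucher_id"]] = codes[0]
    return result


# A few lines copied from the output above.
SAMPLE_LINES = [
    '{"voucher_id": "5735", "code": ["MUM10"]}',
    '{"voucher_id": "3634", "code": ["Deal Activated. Enjoy Shopping"]}',
    '{"voucher_id": "5446", "code": ["APP20"]}',
    '{"voucher_id": "5558", "code": ["No code for this deal"]}',
    '{"voucher_id": "3963", "code": ["CNATION150"]}',
]

codes = real_codes(SAMPLE_LINES)
print(codes)  # {'5735': 'MUM10', '5446': 'APP20', '3963': 'CNATION150'}
```

To apply this to the real file, pass an open file handle in place of SAMPLE_LINES, since iterating over a file yields its lines.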
Happy crawling!