I'm trying to get data from a site using Ajax. The page loads and then JavaScript requests the content.
See this page for details: https://www.tele2.no/mobiltelefon.aspx

The problem is that when I try to simulate this process by calling this URL:

https://www.tele2.no/Services/Webshop/FilterService.svc/ApplyPhoneFilters

I get a 400 response telling me that the request is not allowed.

This is my code:

    # -*- coding: utf-8 -*-
    import scrapy
    import json


    class Tele2Spider(scrapy.Spider):
        name = "tele2"
        #allowed_domains = ["tele2.no/mobiltelefon.aspx"]
        start_urls = (
            'https://www.tele2.no/mobiltelefon.aspx/',
        )

        def parse(self, response):
            url = 'https://www.tele2.no/Services/Webshop/FilterService.svc/ApplyPhoneFilters'
            my_data = "{filters: []}"
            req = scrapy.Request(
                url,
                method='POST',
                body=json.dumps(my_data),
                headers={'X-Requested-With': 'XMLHttpRequest', 'Content-Type': 'application/json'},
                callback=self.parser2)
            yield req

        def parser2(self, response):
            print("test")

I'm new to Scrapy and Python, so there might be something obvious I'm missing.

Solution

The key problem is the missing quotes around filters in the body. In the question, my_data is already a string, and not a valid JSON one because the key is unquoted; json.dumps() then wraps that string in another pair of quotes, so the service receives a JSON string literal instead of a JSON object. Pass the body as a valid JSON string directly:

    url = 'https://www.tele2.no/Services/Webshop/FilterService.svc/ApplyPhoneFilters'
    req = scrapy.Request(url,
                         method='POST',
                         body='{"filters": []}',
                         headers={'X-Requested-With': 'XMLHttpRequest',
                                  'Content-Type': 'application/json; charset=UTF-8'},
                         callback=self.parser2)
    yield req

Or, you can define it as a dictionary and then call json.dumps() to dump it to a string:

    params = {"filters": []}
    req = scrapy.Request(url,
                         method='POST',
                         body=json.dumps(params),
                         headers={'X-Requested-With': 'XMLHttpRequest',
                                  'Content-Type': 'application/json; charset=UTF-8'},
                         callback=self.parser2)

As proof, here is what it gives me on the console:

    2014-12-30 12:30:38-0500 [tele2] DEBUG: Crawled (200) <GET https://www.tele2.no/mobiltelefon.aspx/> (referer: None)
    2014-12-30 12:30:42-0500 [tele2] DEBUG: Crawled (200) <POST https://www.tele2.no/Services/Webshop/FilterService.svc/ApplyPhoneFilters> (referer: https://www.tele2.no/mobiltelefon.aspx/)
    test
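To see why the body from the question was likely rejected, it may help to compare what json.dumps() produces for a string and for a dictionary. This is a minimal sketch using only the standard library; the variable names are illustrative and not taken from the original post, and it does not contact the tele2.no service.

```python
import json

# The string from the question. It only looks like JSON: the key is unquoted.
raw_string = "{filters: []}"

# Serializing a str yields a JSON *string literal* (note the extra quotes),
# so a server expecting an object would receive a plain string instead.
double_encoded = json.dumps(raw_string)

# Serializing a dict yields the JSON object with a properly quoted key.
params = {"filters": []}
encoded = json.dumps(params)

print(double_encoded)  # "{filters: []}"
print(encoded)         # {"filters": []}

# Decoding shows the difference in type on the receiving side.
print(type(json.loads(double_encoded)).__name__)  # str
print(type(json.loads(encoded)).__name__)         # dict
```

Printing the request body this way before sending it is a quick check whenever a JSON endpoint answers with a 400.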