时未提供架构和其他错误

时未提供架构和其他错误

本文介绍了使用 requests.get() 时未提供架构和其他错误的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在通过遵循自动化无聊的东西来学习 Python.这个程序应该去http://xkcd.com/ 并下载所有图像以供离线查看.

I'm learning Python by following Automate the Boring Stuff. This program is supposed to go to http://xkcd.com/ and download all the images for offline viewing.

我使用的是 2.7 版和 Mac.

I'm on version 2.7 and Mac.

出于某种原因,我收到诸如未提供架构"之类​​的错误以及使用 request.get() 本身的错误.

For some reason, I'm getting errors like "No schema supplied" and errors with using request.get() itself.

这是我的代码:

# Saves the XKCD comic page for offline read

import requests, os, bs4, shutil

url = 'http://xkcd.com/'

if os.path.isdir('xkcd') == True: # If xkcd folder already exists
    shutil.rmtree('xkcd') # delete it
else: # otherwise
    os.makedirs('xkcd') # Creates xkcd foulder.


while not url.endswith('#'): # If there are no more posts, it url will endswith #, exist while loop
    # Download the page
    print 'Downloading %s page...' % url
    res = requests.get(url) # Get the page
    res.raise_for_status() # Check for errors

    soup = bs4.BeautifulSoup(res.text) # Dowload the page
    # Find the URL of the comic image
    comicElem = soup.select('#comic img') # Any #comic img it finds will be saved as a list in comicElem
    if comicElem == []: # if the list is empty
        print 'Couldn\'t find the image!'
    else:
        comicUrl = comicElem[0].get('src') # Get the first index in comicElem (the image) and save to
        # comicUrl

        # Download the image
        print 'Downloading the %s image...' % (comicUrl)
        res = requests.get(comicUrl) # Get the image. Getting something will always use requests.get()
        res.raise_for_status() # Check for errors

        # Save image to ./xkcd
        imageFile = open(os.path.join('xkcd', os.path.basename(comicUrl)), 'wb')
        for chunk in res.iter_content(10000):
            imageFile.write(chunk)
        imageFile.close()
    # Get the Prev btn's URL
    prevLink = soup.select('a[rel="prev"]')[0]
    # The Previous button is first <a rel="prev" href="/1535/" accesskey="p">&lt; Prev</a>
    url = 'http://xkcd.com/' + prevLink.get('href')
    # adds /1535/ to http://xkcd.com/

print 'Done!'

以下是错误:

Traceback (most recent call last):
  File "/Users/XKCD.py", line 30, in <module>
    res = requests.get(comicUrl) # Get the image. Getting something will always use requests.get()
  File "/Library/Python/2.7/site-packages/requests/api.py", line 69, in get
    return request('get', url, params=params, **kwargs)
  File "/Library/Python/2.7/site-packages/requests/api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/Library/Python/2.7/site-packages/requests/sessions.py", line 451, in request
    prep = self.prepare_request(req)
  File "/Library/Python/2.7/site-packages/requests/sessions.py", line 382, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/Library/Python/2.7/site-packages/requests/models.py", line 304, in prepare
    self.prepare_url(url, params)
  File "/Library/Python/2.7/site-packages/requests/models.py", line 362, in prepare_url
    to_native_string(url, 'utf8')))
requests.exceptions.MissingSchema: Invalid URL '//imgs.xkcd.com/comics/the_martian.png': No schema supplied. Perhaps you meant http:////imgs.xkcd.com/comics/the_martian.png?

问题是我已经多次阅读书中有关该程序的部分,阅读请求文档,以及查看此处的其他问题.我的语法看起来正确.

The thing is I've been reading the section in the book about the program multiple times, reading the Requests doc, as well as looking at other questions on here. My syntax looks right.

感谢您的帮助!

这不起作用:

comicUrl = ("http:"+comicElem[0].get('src'))

我认为添加 http: 之前会消除没有提供架构的错误.

I thought adding the http: before would get rid of the no schema supplied error.

推荐答案

将您的 comicUrl 更改为此

comicUrl = comicElem[0].get('src').strip("http://")
comicUrl="http://"+comicUrl
if 'xkcd' not in comicUrl:
    comicUrl=comicUrl[:7]+'xkcd.com/'+comicUrl[7:]

print "comic url",comicUrl

这篇关于使用 requests.get() 时未提供架构和其他错误的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

08-06 03:26