This article covers an HTTP 403 error returned by the Glassdoor API when it is called from Python, and a fix that worked.

Problem description

I'm trying to get glassdoor data from their API in Python:

import urllib2

id1 = 'x'
key = 'y'
action = 'employers'
company = 'company'

basepath = 'http://api.glassdoor.com/api/api.htm?v=1&format=json&t.p='
url = basepath + id1 + '&t.k=' + key + '&action=' + action + '&q=' + company + '&userip=192.168.43.42&useragent=Mozilla/5.0'

response = urllib2.urlopen(url)
html = response.read()

And I'm getting the following error:

>>> response = urllib2.urlopen(url)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "//anaconda/lib/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "//anaconda/lib/python2.7/urllib2.py", line 437, in open
    response = meth(req, response)
  File "//anaconda/lib/python2.7/urllib2.py", line 550, in http_response
    'http', request, response, code, msg, hdrs)
  File "//anaconda/lib/python2.7/urllib2.py", line 475, in error
    return self._call_chain(*args)
  File "//anaconda/lib/python2.7/urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "//anaconda/lib/python2.7/urllib2.py", line 558, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden

Can anyone help...?

Thanks

Solution

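Below is working code with a few improvements: it sets a User-Agent header through the hdr variable and passes the response to BeautifulSoup. The useragent query parameter does not change the HTTP header that urllib2 actually sends (by default Python-urllib/2.7), and the server presumably rejects that default with a 403.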

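import urllib2
from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3 (Python 2)

# yourID and yourkey are placeholders for the partner ID and key; q= is left empty here
url = "http://api.glassdoor.com/api/api.htm?t.p=yourID&t.k=yourkey&userip=8.28.178.133&useragent=Mozilla&format=json&v=1&action=employers&q="
# Send a browser-like User-Agent header instead of urllib2's default
hdr = {'User-Agent': 'Mozilla/5.0'}
req = urllib2.Request(url, headers=hdr)
response = urllib2.urlopen(req)
soup = BeautifulSoup(response)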
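
For readers on Python 3, where urllib2 no longer exists, the same idea can be sketched with urllib.request. This is only a rough port under assumptions, not the answer's original code: the names API_URL, build_request and fetch_employers are made up for illustration, the query parameters are copied from the question, and since the request asks for format=json the body is decoded with the json module rather than BeautifulSoup.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the question; not verified against the current API.
API_URL = "http://api.glassdoor.com/api/api.htm"


def build_request(partner_id, key, query, user_ip="192.168.43.42"):
    """Build a Request for the employers action with an explicit User-Agent header.

    Hypothetical helper: parameter names mirror the query string in the question.
    """
    params = {
        "v": "1",
        "format": "json",
        "t.p": partner_id,
        "t.k": key,
        "action": "employers",
        "q": query,
        "userip": user_ip,
        "useragent": "Mozilla/5.0",
    }
    url = API_URL + "?" + urlencode(params)
    # The header is what the server sees; the useragent parameter alone is not enough.
    return Request(url, headers={"User-Agent": "Mozilla/5.0"})


def fetch_employers(partner_id, key, query):
    """Send the request and decode the JSON body (performs a network call)."""
    with urlopen(build_request(partner_id, key, query)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_employers('x', 'y', 'company') with a real partner ID and key would issue the request; build_request can be inspected on its own to check the URL and header before anything is sent.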

Hope it helps, thanks.

10-16 19:36