This article covers what to do when CSS selection with beautifulsoup4 does not work.
Problem description
I tried bs4, but the select method doesn't work.
What is wrong with my code?
import requests
import bs4

def main():
    r = requests.get("http://nodejs.org/download/")
    soup = bs4.BeautifulSoup(r.text)
    selector = "div.interior:nth-child(2) > table:nth-child(2) > tbody:nth-child(1) > tr:nth-child(1) > td:nth-child(3) > a:nth-child(1)"
    print(soup.select(selector)[0].text)

if __name__ == "__main__":
    main()
Recommended answer
The HTML that requests receives for this page differs from what you see in a browser. Have a look at your r.text and build your selector from that.
The response looks like this:
<div class="interior row">
<div id="installers">
<ul>
<li>
<a href="http://nodejs.org/dist/v0.10.26/node-v0.10.26-x86.msi">
<img alt="" height="50" src="http://nodejs.org/images/platform-icon-win.png" width="45">
Windows Installer
<small>node-v0.10.26-x86.msi</small>
</img></a>
</li>
<li>
<a href="http://nodejs.org/dist/v0.10.26/node-v0.10.26.pkg">
<img alt="" height="50" src="http://nodejs.org/images/platform-icon-osx.png" width="45">
Macintosh Installer
<small>node-v0.10.26.pkg</small>
So there is no table here. Hope this helps.
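One common way a selector copied from browser developer tools can miss is that the browser's view of the page is not the same as the raw HTML: browsers insert a tbody element into tables, for example, while the source may not contain one. The sketch below illustrates that mismatch on a small, made-up HTML string (the markup and URLs are assumptions for illustration, not the real nodejs.org page), parsed with Python's built-in html.parser.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a raw response: a table with no <tbody>.
RAW_HTML = """
<div class="interior">
  <table>
    <tr><td>Windows</td><td><a href="/dist/node-x86.msi">32-bit</a></td></tr>
  </table>
</div>
"""

soup = BeautifulSoup(RAW_HTML, "html.parser")

# Selector in the style of one copied from browser devtools, including tbody.
browser_style = soup.select("div.interior > table > tbody > tr > td > a")

# Selector written against the markup that was actually received.
source_style = soup.select("div.interior > table > tr > td > a")

print(len(browser_style))                    # no match: the source has no tbody
print([a["href"] for a in source_style])     # the link is found
```

Printing r.text (or soup.prettify()) first and writing the selector against that output avoids this kind of mismatch.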
This is the code I used to get that response:
def main():
    r = requests.get("http://nodejs.org/download/")
    soup = bs4.BeautifulSoup(r.text)
    # print(r.text)
    selector = "div.interior"
    print(soup.select(selector)[2])
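Since select returns a list, indexing it directly raises IndexError when nothing matches. A minimal sketch of a safer lookup follows, assuming a bs4 version that provides select_one; the helper name first_match_text and the sample markup are hypothetical and used only for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup loosely modeled on the response shown above.
HTML = (
    '<div class="interior row"><div id="installers"><ul>'
    '<li><a href="/a.msi">Windows Installer</a></li>'
    '</ul></div></div>'
)

def first_match_text(soup, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    node = soup.select_one(selector)
    return node.get_text(strip=True) if node is not None else None

soup = BeautifulSoup(HTML, "html.parser")

found = first_match_text(soup, "div.interior #installers a")
missing = first_match_text(soup, "div.interior table a")

print(found)
print(missing)
```

Here the second selector looks for a table that the markup does not contain, so the helper returns None instead of raising.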
Edit 2: You could try it with find. You are more flexible with that one.
soup = bs4.BeautifulSoup(r.text)
print(soup.find("a", text="64-bit"))
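The find call above matches a link by its text. A small offline sketch of the same idea is shown below on made-up markup (the table and hrefs are assumptions). It uses the string argument, which newer bs4 releases use as the name for what the snippet above passes as text.

```python
from bs4 import BeautifulSoup

# Hypothetical download table; only the link texts matter for this example.
HTML = """
<table>
  <tr>
    <td>Windows Installer (.msi)</td>
    <td><a href="/dist/x86.msi">32-bit</a></td>
    <td><a href="/dist/x64.msi">64-bit</a></td>
  </tr>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")

# Find the <a> whose text is exactly "64-bit"; find returns None if there is none.
link = soup.find("a", string="64-bit")
href = link["href"] if link is not None else None

print(href)
```

Matching on text is exact here, so surrounding whitespace or extra child tags inside the link would prevent a match.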
This should work:
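def main():
    # Send a browser-like User-Agent header with the request.
    headers = {"content-type": "text", "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.52 Safari/536.5"}
    r = requests.get("http://nodejs.org/download/", headers=headers)
    soup = bs4.BeautifulSoup(r.text)
    # First row of the first table: take the cell after the first one and read its link.
    print(soup.find("table").tr.td.find_next_sibling().a['href'])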
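The traversal in the last line can be tried without a network request. The sketch below walks the same path (first table, first row, first cell, next sibling cell, link) on a made-up table; the markup and hrefs are assumptions, not the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical table with a label cell followed by cells holding links.
HTML = """
<table>
  <tr>
    <td>Windows Installer (.msi)</td>
    <td><a href="/dist/x86.msi">32-bit</a></td>
    <td><a href="/dist/x64.msi">64-bit</a></td>
  </tr>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")

# .tr and .td each return the first matching descendant tag.
first_cell = soup.find("table").tr.td

# find_next_sibling skips the whitespace text nodes between cells;
# passing "td" restricts the search to <td> tags.
link_cell = first_cell.find_next_sibling("td")
href = link_cell.a["href"]

print(href)
```

Each step returns None when the element is missing, so on a page whose structure may vary it is worth checking the intermediate results before chaining further.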