All I want is to scrape all of the products. Also, why can't I use container.div? I'm really confused, because my tutorial only has <div></div><div>.
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
my_url = 'https://hbx.com/categories/sneakers'
# open the connection and fetch the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
# html parsing
page_soup = soup(page_html, "html.parser")
# grab each product
containers = page_soup.findAll("div",{"class":"product-wrapper col-xs-6 col-sm-4"})
filename = "kontol.csv"
f = open(filename, "w")
headers = "judul, brand, harga\n"
f.write(headers)
for container in containers:
    title_container = container.findAll("h3", {"class":"name"})
    judul = title_container[0].text
    brand_container = container.findAll("h4", {"class":"brand"})
    brand = brand_container[0].text
    price_container = container.findAll("span", {"class":"regular-price"})
    harga = price_container[0].text
    print("judul: " + judul)
    print("brand: " + brand)
    print("harga: " + harga)
    f.write(judul + "," + brand + "," + harga + "\n")
f.close()
When I try to call container.findAll("h3", {"class":"name"}), I get this error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python36\lib\site-packages\bs4\element.py", line 1807, in __getattr__
    "ResultSet object has no attribute '%s'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?" % key
AttributeError: ResultSet object has no attribute 'findAll'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?
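Best answer
Try the script below and let me know if it doesn't solve the problem. I used conditional expressions to avoid the errors that would otherwise occur when an item is missing; for example, the second result has no price. It works well now.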
import requests
from bs4 import BeautifulSoup
url = "https://hbx.com/categories/sneakers"
soup = BeautifulSoup(requests.get(url).text, "lxml")
for item in soup.find_all(class_="product-box"):
    name = item.find(class_="name").text if item.find(class_="name") else ""
    brand = item.find(class_="brand").text if item.find(class_="brand") else ""
    price = item.find(class_="regular-price").text if item.find(class_="regular-price") else ""
    print(name, brand, price)
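The conditional expressions in the script are needed because find returns None when nothing matches, and None has no .text attribute. Here is a minimal sketch of the same guard, with the lookup done once per field. The helper names text_or_empty and extract_rows are hypothetical, and the HTML snippet is made up for illustration (the real hbx.com markup may differ). It uses the built-in html.parser so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Made-up sample markup for illustration; the second product has no price.
SAMPLE_HTML = """
<div class="product-box">
  <h3 class="name">Club C 85</h3>
  <h4 class="brand">Reebok</h4>
  <span class="regular-price">USD 75.00</span>
</div>
<div class="product-box">
  <h3 class="name">NMD R2 Runner Primeknit</h3>
  <h4 class="brand">Adidas Originals</h4>
</div>
"""


def text_or_empty(parent, css_class):
    """Return the stripped text of the first match, or "" if there is none."""
    tag = parent.find(class_=css_class)  # a Tag, or None when nothing matches
    return tag.get_text(strip=True) if tag is not None else ""


def extract_rows(html):
    """Collect (name, brand, price) tuples, one per product box."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        (
            text_or_empty(item, "name"),
            text_or_empty(item, "brand"),
            text_or_empty(item, "regular-price"),
        )
        for item in soup.find_all(class_="product-box")
    ]


for row in extract_rows(SAMPLE_HTML):
    print(row)
```

Calling find once and testing the result avoids searching the same subtree twice for every field, which is the only real difference from the one-line conditionals.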
Or use find_all, if you prefer. The result is the same either way.
for item in soup.find_all(class_="product-box"):
    name = item.find_all(class_="name")[0].text if item.find_all(class_="name") else ""
    brand = item.find_all(class_="brand")[0].text if item.find_all(class_="brand") else ""
    price = item.find_all(class_="regular-price")[0].text if item.find_all(class_="regular-price") else ""
    print(name, brand, price)
Partial results:
Club C 85 Reebok USD 75.00
NMD R2 Runner Primeknit Adidas Originals
NMD R2 Runner Adidas Originals USD 155.00
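As for the error in the question itself: find_all (and its older alias findAll) returns a ResultSet, which behaves like a list of tags. Search methods, and attribute shortcuts such as .div or .h3, work on a single Tag, not on the list. The traceback names a "ResultSet object", which suggests the call was made on the whole result list rather than on one container, though that is an inference from the message. A small sketch with made-up markup (the class names are shortened for illustration):

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only.
html = """
<div class="product-wrapper"><h3 class="name">Club C 85</h3></div>
<div class="product-wrapper"><h3 class="name">NMD R2 Runner</h3></div>
"""
page_soup = BeautifulSoup(html, "html.parser")
containers = page_soup.find_all("div", {"class": "product-wrapper"})

# Calling a search method on the ResultSet itself raises AttributeError.
try:
    containers.find_all("h3", {"class": "name"})
    failed = False
except AttributeError:
    failed = True

# Calling it on each Tag inside the ResultSet works.
names = [c.find("h3", {"class": "name"}).text for c in containers]

# Attribute access is shorthand for find() and also needs a single Tag.
first_heading = containers[0].h3.text

print(failed, names, first_heading)
```

Looping over containers, or indexing a single element such as containers[0], gives a Tag on which find, find_all and the attribute shortcuts are all available.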
Regarding python - Why can't I call container.findAll("h3", {"class":"name"})?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46792631/