Problem description
I am using Python 3.8.2 and bs4 BeautifulSoup. I am trying to find all instances of a tag and have each one listed in the result set, one per row. However, the result set that is returned contains more lines than the original scrape of the website. This is because the first row of the result set contains all instances of the tag. The second row contains all instances except the first, the third contains all instances except the first and the second, and so on through the rest of the result set.
The code is as follows:
from bs4 import BeautifulSoup
import requests
url = "https://www.sainsburys.co.uk/shop/gb/groceries/drinks/seeall"
html_content = requests.get(url, timeout=5)
soup = BeautifulSoup(html_content.text)
test_1 = soup.find('ul',{"class": "productLister gridView"})
test = test_1.find_all("li", attrs={"class": "gridItem"})
How do I get it so that each instance of <li class="gridItem"> is listed by itself, one per row?
Thanks
Recommended answer
The website relies on JavaScript, which renders its data dynamically once the page loads.
The requests library cannot render JavaScript on the fly, so you can use selenium or requests_html instead. There are many other modules that can do this as well.
Now, we do have another option on the table: tracking where the data is rendered from. I was able to locate the XHR request that retrieves the data from the back-end API and renders it on the user's side.
You can find the XHR request by opening Developer Tools, going to the Network tab, and checking the XHR/JS requests, depending on the type of call (for example, fetch).
Note the following:
- The website holds 3068 items.
- I've increased the items per page to 120 using the parameter "pageSize": "120".
- So 3068 / 120 rounds up to 26, which means 26 pages of 120 items each.
- So you will need to loop over range(0, 3120, 120), which means 0 > 120 > 240 and so on, using the parameter "beginIndex": "0", which you increment inside the for loop.
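The page arithmetic in the notes above can be sketched as follows. This is a minimal illustration, not part of the original answer: the helper name page_offsets is hypothetical, and the item count of 3068 is simply the figure quoted above, which may differ when you run the scrape.

```python
import math

TOTAL_ITEMS = 3068  # item count quoted in the answer; may have changed on the site
PAGE_SIZE = 120     # value sent as the "pageSize" parameter


def page_offsets(total_items, page_size):
    """Return the "beginIndex" value for every page (hypothetical helper)."""
    pages = math.ceil(total_items / page_size)  # 3068 / 120 rounds up to 26
    return list(range(0, pages * page_size, page_size))


offsets = page_offsets(TOTAL_ITEMS, PAGE_SIZE)
print(len(offsets), offsets[:3], offsets[-1])  # prints: 26 [0, 120, 240] 3000
```

Each value in offsets would be sent as "beginIndex" in one request.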
The code below should get you there. Since you didn't tell us your end goal, I assume your target is the name, price, url, img, or something similar. You can extract whatever you need.
import requests
from bs4 import BeautifulSoup

# Query parameters taken from the XHR request seen in the browser's Network tab.
params = {
    "langId": "44",
    "storeId": "10151",
    "catalogId": "10241",
    "categoryId": "12192",
    "parent_category_rn": "",
    "top_category": "12192",
    "pageSize": "120",
    "orderBy": "FAVOURITES_FIRST",
    "searchTerm": "",
    "catSeeAll": "true",
    "beginIndex": "0",
    "categoryFacetId1": "12192",
    "categoryFacetId2": "",
    "requesttype": "ajax"
}


def main(url):
    with requests.Session() as req:
        # The endpoint answers with JSON; each product holds an HTML fragment in 'result'.
        r = req.post(url, params=params).json()
        for item in r[5]['productLists']:
            for nest in item['products']:
                soup = BeautifulSoup(nest['result'], 'html.parser')
                target = soup.find("div", class_="productNameAndPromotions")
                name = target.h3.a.text.strip()
                # Named 'link' so it does not shadow the 'url' argument.
                link = target.h3.a.get("href")
                img = "https" + target.h3.a.img.get("src")
                price = soup.find(
                    "p", class_="pricePerUnit").get_text(strip=True)
                print(name, price, img, link)


main("https://www.sainsburys.co.uk/webapp/wcs/stores/servlet/gb/groceries/drinks/AjaxApplyFilterSearchResultView")
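The code above requests only the first page ("beginIndex": "0"). A rough sketch of how it could be extended to every page is shown below. The function fetch_all_pages and its fetch_page callback are hypothetical names introduced for illustration, not part of the original answer. The callback stands in for the real network call, so the loop can be tried without hitting the site.

```python
def fetch_all_pages(fetch_page, base_params, total_items, page_size):
    """Call fetch_page once per page, advancing "beginIndex" each time.

    fetch_page is any callable that takes a params dict and returns that
    page's result (hypothetical callback; see the usage note below).
    """
    results = []
    for begin in range(0, total_items, page_size):
        # Copy the base parameters so the caller's dict is never modified.
        page_params = dict(base_params)
        page_params["pageSize"] = str(page_size)
        page_params["beginIndex"] = str(begin)
        results.append(fetch_page(page_params))
    return results


# Dry run with a stand-in callback that just reports the requested offset.
seen = fetch_all_pages(lambda p: p["beginIndex"], {"beginIndex": "0"}, 3068, 120)
print(len(seen), seen[:3], seen[-1])  # prints: 26 ['0', '120', '240'] 3000
```

With the real endpoint, the callback could be something like lambda p: req.post(url, params=p).json() inside the requests.Session block. This assumes the endpoint keeps accepting the same parameters for later pages.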
Brief output for name and price:
Sainsbury's British Semi Skimmed Milk 2.27L (4 pint) £1.10/unit
Sainsbury's British Semi Skimmed Milk 1.13L (2 pint) 80p/unit
Sainsbury's British Whole Milk 2.27L (4 pint) £1.10/unit
Cravendale Purefilter Semi Skimmed Milk 2L £1.90/unit
Sainsbury's British Skimmed Milk 2.27L (4 pint) £1.10/unit
Sainsbury's British Semi Skimmed Milk, SO Organic 2.27L (4 pint) £1.80/unit
Sainsbury's Sparkling Water, Basics 2L 25p/unit
Sainsbury's British Skimmed Milk 1.13L (2 pint) 80p/unit
Sainsbury's 100% Pure Squeezed Smooth Orange Juice, Not From Concentrate 1L £1.30/unit
Sainsbury's Water, Basics 2L 25p/unit
Sainsbury's British Whole Milk 1.13L (2 pint) 80p/unit
Sainsbury's Smooth Pure Orange Juice 1L 95p/unit
Pepsi Max 2L £1.90/unit
Sainsbury's Caledonian Still Water 4x2L £1.50/unit
Highland Spring Still Water 12x500ml £3.00/unit
Sainsbury's 100% Pressed Apple Juice, Not From Concentrate 1L £1.30/unit
Sainsbury's British Whole Milk, SO Organic 2.27L (4 Pint) £1.80/unit
Lactofree Semi Skimmed Lactose Free Fresh Dairy Drink 1L £1.50/unit
Diet Coke 8x330ml £4.00/unit
Alpro Roasted Almond Unsweetened UHT Drink 1L £1.80/unit
Robinsons Orange Squash No Added Sugar 1L £1.65/unit
Sainsbury's Soda Water 1L 60p/unit
Sainsbury's Caledonian Sparkling Water 4x2L £1.60/unit
Tropicana Smooth Orange Juice 950ml £2.45/unit
Sainsbury's Diet Indian Tonic Water 1L 60p/unit
Sainsbury's Pure Apple Juice 1L 95p/unit
Robinsons Apple & Blackcurrant Squash No Added Sugar 1L £1.65/unit
Sainsbury's Sparkling Flavoured Water, Lemon & Lime 1L 50p/unit
Sainsbury's Conegliano Prosecco, Taste the Difference 75cl £8.00/unit
Sainsbury's Unsweetened Soya Drink 1L 90p/unit
Sainsbury's British Semi Skimmed Milk, SO Organic 1.13L (2 pint) £1.15/unit
Sainsbury's Caledonian Sparkling Water 6x500ml £1.50/unit
Sainsbury's Apple & Blackcurrant Squash, No Added Sugar 1.5L £1.00/unit
Highland Spring Still Water 6x1.5L £3.00/unit
Alpro Roasted Almond Unsweetened Fresh Drink 1L £1.85/unit
Sainsbury's Semi Skimmed Long Life Milk 1L 90p/unit
Tropicana Smooth Orange Juice 1.6L £2.50/unit
Sainsbury's 100% Pure Squeezed Orange Juice with Bits, Not From Concentrate 1L £1.30/unit
Cravendale Purefilter Semi Skimmed Milk 1L £1.15/unit
Sainsbury's Caledonian Still Water Sports Cap 6x500ml £1.50/unit
Sainsbury's Double Strength Orange Squash, No Added Sugar 1.5L £1.00/unit
Diet Coke 18x330ml £7.00/unit
Sainsbury's Indian Tonic Water 1L 60p/unit
Sainsbury's Pure Orange Juice 1L 85p/unit
Sainsbury's Pure Apple Juice 6x200ml £1.50/unit
Buxton Still Natural Mineral Water 8x500ml £2.00/unit
Sainsbury's Whole Long Life Milk 1L £1.05/unit
Cravendale Purefilter Skimmed Milk 2L £1.90/unit
Sainsbury's Sparkling Flavoured Water, Blackcurrant & Cherry 1L 50p/unit
Innocent Smooth Orange Juice 1.35L £3.00/unit
Alpro Original Soya Fresh Drink 1L £1.55/unit
Sainsbury's Still Flavoured Water, Strawberry & Kiwi 1L 50p/unit
Sainsbury's British Filtered Semi Skimmed Milk 2L £1.35/unit
Sainsbury's Sparkling Flavoured Water, Mango & Passionfruit 1L 50p/unit
Sainsbury's Caledonian Still Water 5L £1.10/unit
McGuigan Estate Merlot 75cl £5.10/unit
Schweppes Slimline Tonic Water 1L £1.50/unit
PG tips Pyramid Tea Bags x240 696g £4.50/unit
Sainsbury's Sparkling Flavoured Water, Strawberry & Kiwi 1L 50p/unit
Sainsbury's Caledonian Sparkling Water 2L 55p/unit
Sainsbury's Sweetened Soya Drink 1L 90p/unit
Sainsbury's 100% Pure Squeezed Smooth Orange Juice, Not From Concentrate 1.75L £2.10/unit
Sainsbury's Diet Lemonade 2L 60p/unit
Sainsbury's Apple & Mango Juice, Not From Concentrate 1L £1.30/unit
Robinsons Summer Fruits Squash No Added Sugar 1L £1.65/unit
Sainsbury's 100% Pure Squeezed Pineapple Juice, Not From Concentrate 1L £1.30/unit
Clearsprings Sauvignon Blanc 75cl £5.50/unit
Phantom River Sauvignon Blanc 75cl £5.00/unit
Nestle Pure Life Still Spring Water 12x500ml £2.50/unit
Buxton Sparkling Natural Mineral Water 8x500ml £2.10/unit
Brancott Estate Sauvignon Blanc 75cl £6.75/unit
Schweppes Slimline Lemonade 2L £1.30/unit
McGuigan Estate South Australian Shiraz 75cl £5.10/unit
Coca-Cola Zero Sugar 8x330ml £4.00/unit
Villa Maria Private Bin Sauvignon Blanc 75cl £9.25/unit
Diet Coke Caffeine Free 8x330ml £4.00/unit
Sainsbury's British Skimmed Milk, SO Organic 1.13L (2 pint) £1.15/unit
Sainsbury's Kids Caledonian Still Water 6x300ml £1.10/unit
Canti Prosecco 75cl £7.50/unit
Oatly Enriched with Calcium Oat UHT Drink 1L £1.50/unit
Sainsbury's Pure Orange Juice 6x200ml £1.50/unit
Sainsbury's Still Flavoured Water, Lemon & Lime 1L 50p/unit
Valdo Prosecco Marca Oro 75cl £8.50/unit
Oyster Bay Sauvignon Blanc 75cl £8.00/unit
Ribena Blackcurrant Squash 850ml £2.30/unit
Volvic Mineral Water 6x1.5L £3.40/unit
Campo Viejo Rioja Tempranillo 75cl £6.75/unit
Nescafé Azera Americano Instant Coffee 100g £4.60/unit
Tropicana Orange Juice Original 950ml £2.45/unit
Sainsbury's Double Strength Orange & Mango Squash, No Added Sugar 1.5L £1.00/unit
Robinsons Lemon Squash No Added Sugar 1L £1.65/unit
Schweppes Lemonade 2L £1.30/unit
Robinsons Orange & Pineapple Squash No Added Sugar 1L £1.65/unit
Sainsbury's Diet Indian Tonic with Lime 1L 60p/unit
St Helen's Farm Semi Skimmed Goats Milk 1L £1.80/unit
Sainsbury's Double Strength Orange, Lemon & Pineapple Squash, No Added Sugar 1.5L £1.00/unit
Sainsbury's Double Strength Summerfruits Squash, No Added Sugar 1.5L £1.00/unit
Alpro Oat UHT Drink 1L £1.80/unit
Innocent Smooth Orange Juice 900ml £1.50/unit
Sainsbury's British Whole Milk, SO Organic 1.13L (2 pint) £1.15/unit
Sainsbury's Skimmed Long Life Milk 1L 80p/unit
Nescafé Gold Blend Instant Coffee 200g £7.00/unit
Highland Spring Still Water Sports Cap 12x330ml £3.00/unit
Sainsbury's Cava Brut 75cl £6.00/unit
Alpro Light Unsweetened Soya Fresh Drink 1L £1.55/unit
Sainsbury's Caledonian Still Water 2L 50p/unit
Koko Coconut UHT Drink 1L £1.50/unit
Sainsbury's House Pinot Grigio 75cl £4.50/unit
Sainsbury's Cola Zero 2L 45p/unit
St Helen's Farm Whole Goats Milk 1L £1.80/unit
Sainsbury's Double Strength Cherries & Berries Squash, No Added Sugar 1.5L £1.00/unit
Sainsbury's Lemonade 2L 60p/unit
Sainsbury's Pure Orange Juice With Bits 1L 85p/unit
Sainsbury's Pinot Grigio, Taste the Difference 75cl £6.00/unit
Schweppes Tonic Water 1L £1.50/unit
Sainsbury's Cranberry Juice Drink 1L 85p/unit
Nescafé Gold Blend Instant Coffee Refill 150g £3.50/unit
Sainsbury's Gold Roast Instant Coffee 200g £3.15/unit
Sainsbury's Pure Orange Juice with Juicy Bits 1L 95p/unit
Edizione 789 Di Mondelli Prosecco 75cl £6.25/unit