I want to get product data from carrefouruae.com. When I look up the product name by its div class, it returns empty brackets.

Problem description

This is the code I'm using. The output is just an empty list ([]) instead of the data inside the divs with that class.

    import requests
    from bs4 import BeautifulSoup as bs
    from http import cookiejar


    class BlockAll(cookiejar.CookiePolicy):
        """Cookie policy that rejects every cookie."""
        return_ok = set_ok = domain_return_ok = path_return_ok = lambda self, *args, **kwargs: False
        netscape = True
        rfc2965 = hide_cookie2 = False


    # Session that neither stores nor sends cookies
    session = requests.Session()
    session.cookies.set_policy(BlockAll())
    url = "https://www.carrefouruae.com/mafuae/en/c/F1600000?currentPage=0&filter=&nextPageOffset=0&pageSize=60&sortBy=relevance"

    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Safari/537.36'}

    r = session.get(url, headers=headers)
    soup = bs(r.text, 'html.parser')
    # Separate name for the result so the session is not overwritten
    products = soup.find_all("div", {"class": "ltr-12fzzt2"})
    print(products)  # prints [] (the problem described above)

Recommended answer

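Dynamic websites like this one build much of their content with JavaScript in the browser, so you need a tool such as Selenium, which drives a real browser, to scrape data from them. BeautifulSoup is not meant for this: it only parses the HTML that requests downloads and does not execute JavaScript, so elements created by scripts are missing from what it sees. There is nothing wrong with your code.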
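As a rough illustration of the Selenium approach, here is a minimal sketch. It assumes Selenium 4 and a local Chrome installation. The function names (class_in_html, fetch_rendered_html, find_product_divs, main) are hypothetical helpers introduced for this example, and the ltr-12fzzt2 class is taken from the question; it looks auto-generated and may change over time. Whether the site permits or blocks automated browsers is not covered here.

```python
def class_in_html(html: str, css_class: str) -> bool:
    """Rough diagnostic: does the class name appear anywhere in the markup?"""
    return css_class in html


def fetch_rendered_html(url: str, css_class: str, timeout: float = 15.0) -> str:
    """Load the page in headless Chrome and return the DOM after scripts have run."""
    # Imported lazily so the diagnostic helper works without Selenium installed.
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless=new")  # assumption: a recent Chrome build
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        try:
            # Wait until at least one matching div has been added to the DOM.
            WebDriverWait(driver, timeout).until(
                EC.presence_of_element_located((By.CSS_SELECTOR, f"div.{css_class}"))
            )
        except TimeoutException:
            pass  # return whatever was rendered; the caller sees an empty result
        return driver.page_source
    finally:
        driver.quit()


def find_product_divs(html: str, css_class: str):
    """Same BeautifulSoup lookup as in the question, applied to rendered HTML."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    return soup.find_all("div", {"class": css_class})


def main() -> None:
    url = ("https://www.carrefouruae.com/mafuae/en/c/F1600000"
           "?currentPage=0&filter=&nextPageOffset=0&pageSize=60&sortBy=relevance")
    html = fetch_rendered_html(url, "ltr-12fzzt2")
    print(class_in_html(html, "ltr-12fzzt2"))
    print(len(find_product_divs(html, "ltr-12fzzt2")))
```

Calling main() opens the listing page in a headless browser and then runs the original find_all lookup on the rendered page source. Running class_in_html on r.text from the requests version is a quick way to check whether the class is present in the static response at all.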


10-28 04:41