Is it possible to scrape a "dynamic webpage" with BeautifulSoup?


Problem description

I am just beginning to use BeautifulSoup to scrape websites. I think I have the basics, even though I lack theoretical knowledge about web pages, so I will do my best to formulate my question.

What I mean by a dynamic webpage is the following: a site whose HTML changes based on user action. In my case, the page has collapsible tables.

I want to obtain the data inside some "div" tag, but when you load the page, the data seems to be unavailable in the HTML code. When you click on the table, it expands, and the "class" of this "div" changes from something like "something blabla collapsible" to "something blabla collapsible active". Once it is in that expanded state, I can scrape it with what I already know.
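Before reaching for a browser, it may be worth checking whether the collapsed data is really absent from the downloaded HTML or only hidden by CSS. The sketch below assumes the second case and uses made-up markup; the class names come from the description above, but the table contents are hypothetical. It shows that a CSS class selector in BeautifulSoup matches a "div" whether or not "active" is present in its class list.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one collapsed and one expanded block.
# On a real site this string would be the body of the HTTP response.
html = """
<div class="something blabla collapsible">
  <table><tr><td>hidden value</td></tr></table>
</div>
<div class="something blabla collapsible active">
  <table><tr><td>visible value</td></tr></table>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# "div.collapsible" matches every div whose class list contains
# "collapsible", with or without the extra "active" class.
cells = [td.get_text(strip=True) for td in soup.select("div.collapsible td")]
print(cells)
```

If the same selector returns nothing on the real page, the content is probably inserted by JavaScript after the click, and BeautifulSoup alone cannot see it.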

Can I get this data using BeautifulSoup? In case I can't, I thought of using something like Selenium to click on all the tables and then download the HTML, which I could then scrape. Is there an easier way?
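The Selenium idea described above could look roughly like the following. This is only a sketch under assumptions: the selector "div.collapsible" and the helper names expand_and_parse and scrape are hypothetical, Chrome is an arbitrary browser choice, and a real page may need explicit waits if the data is fetched asynchronously after each click.

```python
from bs4 import BeautifulSoup


def expand_and_parse(driver, selector="div.collapsible"):
    """Click every element matching `selector`, then parse the updated DOM.

    `driver` is expected to behave like a Selenium WebDriver.
    The default selector is an assumption, not taken from a real site.
    """
    # "css selector" is the string value of selenium's By.CSS_SELECTOR.
    for element in driver.find_elements("css selector", selector):
        element.click()
    # page_source reflects the DOM after the clicks, not the original download.
    return BeautifulSoup(driver.page_source, "html.parser")


def scrape(url):
    """Open `url` in a real browser, expand the tables, return the soup."""
    # Imported here so the helper above can be used without a browser installed.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return expand_and_parse(driver)
    finally:
        driver.quit()
```

Calling scrape with the page address would return a BeautifulSoup object that can be queried in the usual way, for example with soup.select("div.collapsible.active").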

Thank you very much.

Recommended answer