Problem description
I want to get information from a public site.
The page http://www.asx.com.au/asx/research/company.do#!/ACB/details contains a div
with class 'view-content', which has the information I need.
But when I try to fetch this page via Python's urllib2.urlopen,
that div is empty:
import urllib2
from bs4 import BeautifulSoup
url = 'http://www.asx.com.au/asx/research/company.do#!/ACB/details'
page = urllib2.urlopen(url).read()
soup = BeautifulSoup(page, "html.parser")
contentDiv = soup.find("div", {"class": "view-content"})
print(contentDiv)
# the result is an empty div:
# <div class="view-content" ui-view=""></div>
Is it possible to access the contents of that div programmatically?
Edit: as per the comments, it appears that the content is rendered via Angular.js.
Is it possible to trigger the rendering of that content via Python?
Recommended answer
This page uses JavaScript to read data from the server and fill in the page.
I see you use the developer tools in Chrome. Look in the "Network" tab at the "XHR" or "JS" requests.
I found this URL:
http://data.asx.com.au/data/1/company/ACB?fields=primary_share,latest_annual_reports,last_dividend,primary_share.indices&callback=angular.callbacks._0
This URL returns all the data in almost-JSON format: the JSON appears to be wrapped in a call to the callback function (JSONP).
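If you do want to keep the callback URL, the wrapper can be stripped before parsing. This is only a minimal sketch: it assumes the response has the usual JSONP shape `name(...)`, and the sample body and the helper name `parse_jsonp` are made up for illustration, not taken from the real server.

```python
import json
import re

# Matches "name( ... )" with an optional trailing semicolon. The callback
# name may contain dots, as in angular.callbacks._0.
_JSONP_RE = re.compile(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', re.DOTALL)


def parse_jsonp(text):
    """Return the Python object wrapped in a JSONP response.

    Falls back to plain json.loads when no callback wrapper is present.
    """
    match = _JSONP_RE.match(text)
    payload = match.group(1) if match else text
    return json.loads(payload)


# Hypothetical response body; the real payload has different fields.
sample = 'angular.callbacks._0({"code": "ACB", "indices": []});'
print(parse_jsonp(sample)["code"])
```

Dropping the callback parameter is simpler, but this approach helps when a server only answers JSONP requests.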
But if you use this link without &callback=angular.callbacks._0,
you get the data in pure JSON format, and you can use the json
module to convert it to a Python dictionary.
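Rather than editing the link by hand, the callback parameter can be removed programmatically. A small sketch, assuming Python 3 (where `urllib.parse` replaces the Python 2 `urlparse` module); `drop_query_param` is a hypothetical helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def drop_query_param(url, name):
    """Return url with every occurrence of the query parameter `name` removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != name
    ]
    # safe="," keeps the comma-separated field list as-is instead of %2C.
    return urlunsplit(parts._replace(query=urlencode(kept, safe=",")))


jsonp_url = (
    "http://data.asx.com.au/data/1/company/ACB"
    "?fields=primary_share,latest_annual_reports,last_dividend,primary_share.indices"
    "&callback=angular.callbacks._0"
)
print(drop_query_param(jsonp_url, "callback"))
```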
Edit: working code
import urllib2
import json
# new url
url = 'http://data.asx.com.au/data/1/company/ACB?fields=primary_share,latest_annual_reports,last_dividend,primary_share.indices'
# read all data
page = urllib2.urlopen(url).read()
# convert json text to python dictionary
data = json.loads(page)
print(data['principal_activities'])
# Output: Mineral exploration in Botswana, China and Australia.
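The code above is for Python 2. On Python 3, `urllib2` no longer exists and `urlopen` lives in `urllib.request`. Here is a rough equivalent, assuming the server returns UTF-8 encoded JSON; the `fetch_json` name and its `opener` parameter are my own additions so the function can be tried without network access.

```python
import json
from urllib.request import urlopen


def fetch_json(url, opener=urlopen):
    """Fetch url and decode the response body as JSON.

    `opener` is any callable that takes a URL and returns an object with
    read() and close(). It defaults to urllib.request.urlopen.
    """
    response = opener(url)
    try:
        raw = response.read()
    finally:
        response.close()
    # Assumption: the body is UTF-8, the usual encoding for JSON.
    return json.loads(raw.decode("utf-8"))
```

Calling `fetch_json(url)` with the callback-free URL from above and then reading `data['principal_activities']` should behave like the Python 2 version, provided the endpoint is still available.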