Problem description
I'm trying to pull information from the 'Key Statistics' page for a ticker on Yahoo Finance (since this isn't supported in the pandas library).
Example for AAPL:
from bs4 import BeautifulSoup
import requests
url = 'http://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')
enterpriseValue = soup.findAll('$ENTERPRISE_VALUE', attrs={'class': 'yfnc_tablehead1'}) #HTML tag for where enterprise value is located
print(enterpriseValue)
Thanks, Andy!
Question: This prints an empty list. How do I change my findAll call to return 598.56B?
Recommended answer
The list that find_all returns is empty because that data is generated by a separate request, which a plain GET to the page URL does not trigger. If you open the Network tab in Chrome or Firefox, filter by XHR, and examine the request and response of each network action, you can find the URL you should be sending the GET request to.
In this case, the URL is https://query2.finance.yahoo.com/v10/finance/quoteSummary/AAPL?formatted=true&crumb=8ldhetOu7RJ&lang=en-US&region=US&modules=defaultKeyStatistics%2CfinancialData%2CcalendarEvents&corsDomain=finance.yahoo.com
So, how do we recreate this? Simple:
import requests
# Query the JSON endpoint directly; no HTML parsing is needed for this call.
r = requests.get('https://query2.finance.yahoo.com/v10/finance/quoteSummary/AAPL?formatted=true&crumb=8ldhetOu7RJ&lang=en-US&region=US&modules=defaultKeyStatistics%2CfinancialData%2CcalendarEvents&corsDomain=finance.yahoo.com')
# Decode the JSON body into Python objects.
data = r.json()
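The long query string is easier to read and adjust when it is assembled from a dict. Here is a minimal sketch, assuming the endpoint and parameters from the URL above. `build_quote_summary_url` is a hypothetical helper name, and the crumb value is simply copied from the example, so it may not be valid for another session:

```python
from urllib.parse import urlencode

BASE_URL = 'https://query2.finance.yahoo.com/v10/finance/quoteSummary/'

def build_quote_summary_url(ticker, crumb, modules):
    """Assemble the quoteSummary URL for one ticker (hypothetical helper)."""
    params = {
        'formatted': 'true',
        'crumb': crumb,  # assumption: copied from the example URL
        'lang': 'en-US',
        'region': 'US',
        'modules': ','.join(modules),  # urlencode escapes the commas as %2C
        'corsDomain': 'finance.yahoo.com',
    }
    return BASE_URL + ticker + '?' + urlencode(params)

url = build_quote_summary_url(
    'AAPL',
    '8ldhetOu7RJ',
    ['defaultKeyStatistics', 'financialData', 'calendarEvents'],
)
print(url)
```

The resulting string can be passed to requests.get in place of the hard-coded URL, and swapping the ticker or the module list no longer means editing the query string by hand.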
This returns the JSON response as a dict. From there, navigate through the dict until you find the data you're after:
key_stats = data['quoteSummary']['result'][0]['defaultKeyStatistics']
enterprise_value_dict = key_stats['enterpriseValue']
print(enterprise_value_dict)
# {'fmt': '598.56B', 'raw': 598563094528, 'longFmt': '598,563,094,528'}
print(enterprise_value_dict['fmt'])
# 598.56B
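If a module or field is missing from the response, the chained lookups above raise KeyError or IndexError. Here is a minimal sketch of a more defensive lookup. It runs against a hand-built sample dict that mirrors only the fields shown above, and `get_fmt` is a hypothetical helper, not part of any library:

```python
def get_fmt(data, module, field):
    """Return the formatted value for a field, or None if any level is missing."""
    try:
        result = data['quoteSummary']['result'][0]
        return result[module][field]['fmt']
    except (KeyError, IndexError, TypeError):
        return None

# Sample payload containing only the structure shown above (an assumption,
# not a full Yahoo response).
sample = {
    'quoteSummary': {
        'result': [
            {
                'defaultKeyStatistics': {
                    'enterpriseValue': {
                        'fmt': '598.56B',
                        'raw': 598563094528,
                        'longFmt': '598,563,094,528',
                    }
                }
            }
        ]
    }
}

print(get_fmt(sample, 'defaultKeyStatistics', 'enterpriseValue'))  # 598.56B
print(get_fmt(sample, 'defaultKeyStatistics', 'missingField'))     # None
```

Returning None keeps the calling code simple when looping over many tickers, though it hides which level of the lookup failed.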