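I am trying to figure out how to retrieve financial information (income statement, balance sheet, and cash flow) from Yahoo Finance. I have a list named symbols that contains all the ticker symbols (see the code below). Ultimately I want to end up with a csv that has one row for each of four consecutive years (2018, 2017, 2016, 2015).
I can do this manually, but what I want is to automate it so that it returns a .csv file with all the relevant information (77 columns and 4 * number-of-ticker-symbols rows).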
[Image]
I want to turn the image above into:
[Image]

I have already figured out how to scrape the data from Yahoo with a scraper.

import lxml.etree  # scrape_table calls lxml.etree.tostring
from lxml import html
import requests
import numpy as np
import pandas as pd

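def scrape_table(url):
    # Download the page and parse it into an lxml element tree
    page = requests.get(url)
    tree = html.fromstring(page.content)
    # The scraper expects exactly one <table> on the page
    table = tree.xpath('//table')
    assert len(table) == 1

    # Serialise the table back to HTML and let pandas parse it
    df = pd.read_html(lxml.etree.tostring(table[0], method='html'))[0]

    # Row labels become the index; after transposing, each row is one period
    df = df.set_index(0)
    df = df.dropna()
    df = df.transpose()
    df = df.replace('-', '0')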

    # The first column should be a date
    df[df.columns[0]] = pd.to_datetime(df[df.columns[0]])
    cols = list(df.columns)
    cols[0] = 'Date'
    # set_axis returns a new DataFrame; its inplace argument was removed in pandas 2.0
    df = df.set_axis(cols, axis='columns')

    numeric_columns = list(df.columns)[1:]
    df[numeric_columns] = df[numeric_columns].astype(np.float64)

    return df



def merge_IS_BS_CF(df_IS, df_BS, df_CF):
    """Merge the income statement, balance sheet and cash flow on 'Date'.

    Returns a single dataframe containing the columns of all three statements.
    """
    df_merge_IS_BS = pd.merge(df_IS, df_BS, on='Date')
    df_merge_IS_BS_CF = pd.merge(df_merge_IS_BS, df_CF, on='Date')
    return df_merge_IS_BS_CF

symbols = ['AAPL', 'MFT.NZ']

financials = {}
#create a dictionary of ticker names and their respective statements' urls
for symbol in symbols:
    base = 'https://finance.yahoo.com/quote/' + symbol
    financials[symbol] = [
        base + '/financials?p=' + symbol,
        base + '/balance-sheet?p=' + symbol,
        base + '/cash-flow?p=' + symbol,
    ]
print(financials['AAPL'][0])
data = pd.DataFrame([])


The problem is that the next ticker's data does not get concatenated into the pandas dataframe.
Thanks for your help.

Best answer

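Sorry, I solved this one myself. For the next person: my mistake was not realizing that I had to save the appended dataframe. Appending returns a new dataframe rather than modifying data in place, so the result has to be assigned back to data.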

symbols = ['AAPL', 'MFT.NZ']
financials = {}
#create a dictionary of ticker names and their respective statements' urls
for symbol in symbols:
    base = 'https://finance.yahoo.com/quote/' + symbol
    financials[symbol] = [
        base + '/financials?p=' + symbol,
        base + '/balance-sheet?p=' + symbol,
        base + '/cash-flow?p=' + symbol,
    ]
print(financials['AAPL'][0])
data = pd.DataFrame()

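for f in financials:
    print(f)
    # Scrape the three statements for this ticker
    df_income_statement = scrape_table(financials[f][0])
    df_balance_sheet = scrape_table(financials[f][1])
    df_cash_flow = scrape_table(financials[f][2])
    oldmerge = merge_IS_BS_CF(df_income_statement, df_balance_sheet, df_cash_flow)
    #print (oldmerge)
    # The result must be assigned back to data. On pandas < 2.0,
    # data = data.append(oldmerge) is equivalent; append was removed in 2.0.
    data = pd.concat([data, oldmerge])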
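
Since the end goal is one csv with four rows per ticker, it may help to tag each ticker's rows with its symbol before stacking them. What follows is only a minimal sketch under assumptions: combine_statements and the Symbol column are hypothetical names that do not appear in the original code, the toy frames with made-up numbers stand in for what merge_IS_BS_CF would return, and pd.concat is used to stack all frames in one call.

```python
import pandas as pd


def combine_statements(frames_by_symbol):
    """Stack per-ticker dataframes into one, tagging each row with its ticker.

    Hypothetical helper for illustration; not part of the original code.
    """
    tagged = []
    for symbol, frame in frames_by_symbol.items():
        # assign() returns a copy with an extra Symbol column, so rows stay
        # identifiable after the frames are stacked
        tagged.append(frame.assign(Symbol=symbol))
    # One concat over the whole list; ignore_index gives a clean 0..n-1 index
    return pd.concat(tagged, ignore_index=True)


# Toy stand-ins for the merged statements (made-up numbers, illustration only)
toy = {
    'AAPL': pd.DataFrame({
        'Date': pd.to_datetime(['2018-09-29', '2017-09-30']),
        'Total Revenue': [2.0, 1.0],
    }),
    'MFT.NZ': pd.DataFrame({
        'Date': pd.to_datetime(['2018-03-31', '2017-03-31']),
        'Total Revenue': [4.0, 3.0],
    }),
}

combined = combine_statements(toy)
# With no path, to_csv returns the csv text; pass a filename to write a file
csv_text = combined.to_csv(index=False)
```

In the scraping loop, the same idea would mean storing each oldmerge in a dict keyed by ticker and calling the helper once after the loop, instead of growing data one iteration at a time.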

This question about merging financial data in Python comes from Stack Overflow: https://stackoverflow.com/questions/56212436/
