Problem description
To iterate one row at a time, I can get an iterator as follows:
import pandas as pd
import numpy as np

a = np.zeros((100, 40))
X = pd.DataFrame(a)
# iterrows() yields one (index, row) pair per row
for index, row in X.iterrows():
    print(index)
    print(row)
Now I want each iteration to return a subset of rows instead: X[0:9, :], X[5:14, :], X[10:19, :], and so on. How do I achieve this with rolling (pandas.DataFrame.rolling)?
Recommended answer
I'll experiment with the following dataframe.
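import pandas as pd
import numpy as np
from string import ascii_uppercase  # Python 3 name for string.uppercase

def generic_portfolio_df(start, end, freq, num_port, num_sec, seed=314):
    # Seed the generator so the random weights are reproducible
    np.random.seed(seed)
    portfolios = pd.Index(['Portfolio {}'.format(i) for i in ascii_uppercase[:num_port]],
                          name='Portfolio')
    securities = ['s{:02d}'.format(i) for i in range(num_sec)]
    dates = pd.date_range(start, end, freq=freq)
    # Normalize within each date so every portfolio's weights sum to 1;
    # group_keys=False keeps the original (Date, Id) index in recent pandas
    return pd.DataFrame(np.random.rand(len(dates) * num_sec, num_port),
                        index=pd.MultiIndex.from_product([dates, securities],
                                                         names=['Date', 'Id']),
                        columns=portfolios
                        ).groupby(level=0, group_keys=False).apply(lambda x: x / x.sum())

# 'BM' is business month end; newer pandas versions spell it 'BME'
df = generic_portfolio_df('2014-12-31', '2015-05-30', 'BM', 3, 5)
df.head(10)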
I'll now introduce a function that rolls over a number of rows and concatenates the shifted copies into a single dataframe, adding a top level to the column index that indicates the position within the roll.
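def rolled(df, n):
    # Column levels to stack, and their negative positions for unstacking later
    k = list(range(df.columns.nlevels))
    _k = [i - len(k) for i in k]
    # Shift by 0..n-1 rows; keys label each copy with its position in the roll
    myroll = pd.concat([df.shift(i).stack(level=k) for i in range(n)],
                       axis=1, keys=range(n)).unstack(level=_k)
    # One (index, window) pair per original row
    return [(i, row.unstack(0)) for i, row in myroll.iterrows()]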
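To see the shift-and-concatenate step in isolation, here is a minimal sketch on a hypothetical toy frame (the names toy, wide and window are not from the answer). It assumes single-level columns, so the stack/unstack round trip used in rolled is left out.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: 4 rows, 2 columns.
toy = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0],
                    "b": [10.0, 20.0, 30.0, 40.0]})

n = 3  # window length
# Shift the frame down by 0, 1, ..., n-1 rows and place the copies side by side.
# `keys` adds a top column level holding the position within the window.
wide = pd.concat([toy.shift(i) for i in range(n)], axis=1, keys=range(n))

# Row 2 of `wide` holds rows 2, 1 and 0 of `toy` under top-level keys 0, 1 and 2.
# Unstacking the top level turns that row into a small frame: one row per
# original column, one column per position in the window.
window = wide.iloc[2].unstack(0)
print(window)
```

Rows near the top of the frame have no earlier rows to draw from, which is why the first windows contain NaN.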
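Although it is hidden inside the function, myroll is a wide dataframe with one row per original row, whose columns carry the roll position (0 to n-1) as a top level above the original column labels.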
Now we can use it just like an iterator.
for i, roll in rolled(df.head(5), 3):
    print(roll)
    print()
                    0         1         2
Portfolio
Portfolio A  0.326164       NaN       NaN
Portfolio B  0.201597       NaN       NaN
Portfolio C  0.085340       NaN       NaN

                    0         1         2
Portfolio
Portfolio A  0.278614  0.326164       NaN
Portfolio B  0.314448  0.201597       NaN
Portfolio C  0.266392  0.085340       NaN

                    0         1         2
Portfolio
Portfolio A  0.258958  0.278614  0.326164
Portfolio B  0.089224  0.314448  0.201597
Portfolio C  0.293570  0.266392  0.085340

                    0         1         2
Portfolio
Portfolio A  0.092760  0.258958  0.278614
Portfolio B  0.262511  0.089224  0.314448
Portfolio C  0.084208  0.293570  0.266392

                    0         1         2
Portfolio
Portfolio A  0.043503  0.092760  0.258958
Portfolio B  0.132221  0.262511  0.089224
Portfolio C  0.270490  0.084208  0.293570
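The question asked for windows that advance five rows at a time, whereas rolled moves one row per step. As a possible alternative, here is a minimal sketch that uses plain positional slicing rather than rolling. The helper name sliding_windows and its size and step parameters are hypothetical, and it assumes each window covers 10 rows (rows 0 through 9, then 5 through 14, and so on).

```python
import numpy as np
import pandas as pd

def sliding_windows(frame, size, step):
    """Yield (start, sub-frame) pairs of `size` consecutive rows, advancing by `step`."""
    # Stop once a full window no longer fits, so every window has `size` rows.
    for start in range(0, len(frame) - size + 1, step):
        yield start, frame.iloc[start:start + size]

X = pd.DataFrame(np.zeros((100, 40)))
windows = list(sliding_windows(X, size=10, step=5))

for start, window in windows[:3]:
    # Show the first and last row label of each window.
    print(start, window.index[0], window.index[-1])
```

Because iloc slices by position, this works regardless of the index labels, and unlike rolled it never produces partially filled windows.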