




A version of this problem was answered here, although that question uses minute-frequency data.

Counting the number of consecutive occurrences of numbers in a DataFrame with a MultiIndex

I have a DataFrame with a MultiIndex (stock ticker and date) and a 'Dummy' column that contains 1s and 0s. For each stock, I would like to count, row by row, how many consecutive times the current value has occurred in the 'Dummy' column. The count restarts at 1 whenever the value changes, counting up (1, 2, ...) for runs of 1s and down (-1, -2, ...) for runs of 0s. In the example below, the 'Counter' column shows what I would like to create:

import pandas as pd

df = pd.DataFrame({
    'stock': ['AAPL', 'AAPL', 'AAPL', 'AAPL', 'MSFT', 'MSFT', 'MSFT', 'MSFT'],
    'datetime': ['2015-01-02', '2015-01-03', '2015-01-04', '2015-01-05',
                 '2015-01-02', '2015-01-03', '2015-01-04', '2015-01-05'],
    'Dummy': [0, 0, 1, 1, 1, 1, 0, 1],
    'Counter': [-1, -2, 1, 2, 1, 2, -1, 1]})
df['datetime'] = pd.to_datetime(df['datetime'])
df.set_index(['stock', 'datetime'], inplace=True)
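To pin down the counting rule for a single stock, here is a minimal plain-Python sketch. The helper name `expected_counter` is made up for illustration and is not part of pandas; it only restates the rule described above, it is not a proposed solution for the DataFrame.

```python
def expected_counter(values):
    """Count up within runs of 1s and down within runs of 0s (hypothetical helper)."""
    counter = []
    previous = None
    run_length = 0
    for value in values:
        # Restart the run at 1 whenever the value changes, otherwise extend it
        run_length = run_length + 1 if value == previous else 1
        # Runs of 1s are reported as positive counts, runs of 0s as negative
        counter.append(run_length if value == 1 else -run_length)
        previous = value
    return counter


aapl = expected_counter([0, 0, 1, 1])  # the AAPL rows of 'Dummy'
msft = expected_counter([1, 1, 0, 1])  # the MSFT rows of 'Dummy'
```

Applied to each stock's 'Dummy' values separately, this reproduces the 'Counter' column in the example.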

Try something like:

import pandas as pd

df = pd.DataFrame({
    'stock': ['AAPL', 'AAPL', 'AAPL', 'AAPL',
              'MSFT', 'MSFT', 'MSFT', 'MSFT'],
    'datetime': ['2015-01-02', '2015-01-03',
                 '2015-01-04', '2015-01-05',
                 '2015-01-02', '2015-01-03',
                 '2015-01-04', '2015-01-05'],
    'Dummy': [0, 0, 1, 1, 1, 1, 0, 1]})
df['datetime'] = pd.to_datetime(df['datetime'])
df.set_index(['stock', 'datetime'], inplace=True)

# Label each run of consecutive equal 'Dummy' values within a stock:
# g.ne(g.shift()) is True where a new run starts, and cumsum() numbers the runs
df['group'] = df.groupby('stock')['Dummy'] \
    .transform(lambda g: g.ne(g.shift()).cumsum())
# Seed 'Counter' with the step for each row: 1 -> 1, 0 -> -1
df['Counter'] = df['Dummy'].apply(lambda x: 1 if x == 1 else -1)
# Cumulative sum of the steps within each (stock, run) pair
df['Counter'] = df.groupby(['stock', 'group'])['Counter'].cumsum().astype(int)
# Drop the helper column
df = df.drop(columns='group')

# For Display
print(df.to_string())


                  Dummy  Counter
stock datetime
AAPL  2015-01-02      0       -1
      2015-01-03      0       -2
      2015-01-04      1        1
      2015-01-05      1        2
MSFT  2015-01-02      1        1
      2015-01-03      1        2
      2015-01-04      0       -1
      2015-01-05      1        1
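If you'd rather not create and drop a temporary column, the same idea can be sketched with `cumcount`. This is only one possible variant, not the method above: the function name `signed_run_counter` and its `level` parameter are made up for illustration, and it assumes the index has a level named 'stock' and that 'Dummy' only contains 0s and 1s.

```python
import pandas as pd


def signed_run_counter(dummy, level="stock"):
    """Signed run length of a 0/1 Series, restarted per stock (hypothetical helper)."""
    keys = dummy.index.get_level_values(level)
    # True wherever the value differs from the previous row of the same stock
    changed = dummy.ne(dummy.groupby(keys).shift())
    # Number the runs within each stock
    run_id = changed.astype(int).groupby(keys).cumsum()
    # Position inside each (stock, run) pair, starting at 1
    run_length = dummy.groupby([keys, run_id]).cumcount() + 1
    # Keep the count for 1s, negate it for 0s
    return run_length.where(dummy.eq(1), -run_length)


df = pd.DataFrame({
    'stock': ['AAPL', 'AAPL', 'AAPL', 'AAPL',
              'MSFT', 'MSFT', 'MSFT', 'MSFT'],
    'datetime': ['2015-01-02', '2015-01-03',
                 '2015-01-04', '2015-01-05',
                 '2015-01-02', '2015-01-03',
                 '2015-01-04', '2015-01-05'],
    'Dummy': [0, 0, 1, 1, 1, 1, 0, 1]})
df['datetime'] = pd.to_datetime(df['datetime'])
df = df.set_index(['stock', 'datetime'])

df['Counter'] = signed_run_counter(df['Dummy'])
```

Because the shift and the run numbering are both done per stock, a run never carries over from the last AAPL row into the first MSFT row.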

