I'm running Python 2.7 on Windows 10 through the Spyder IDE.

I have a pandas DataFrame called df:

import random
import pandas as pd
tee = pd.date_range('2000-01-01','2005-01-01',freq = 'M')
owe = pd.date_range('2000-01-01','2005-02-29')
eye = random.sample(range(100), 60)
you = random.sample(range(100), 60)
df = pd.DataFrame({'tee': tee , 'owe':owe, 'eye': eye,'you': you})
#df.dtypes
#dtypes: tee and owe = 'datetime64[ns]' , eye and you = 'int64'


you is measured in days.

For every row whose tee falls in 2004, I want to replace the value of eye with the month of that row's owe + you (in int64 format).

Please let me know if I can provide more information.

Best answer

The comments pointed out some errors, so I changed how the DataFrames are created and then applied the condition using loc, Series.dt.year and Series.dt.month:

Note: after testing, change the column name eye_new to eye in the loc call.

import random
import numpy as np
import pandas as pd

tee = pd.date_range(pd.to_datetime('2000-01-01'),pd.to_datetime('2005-01-01'),freq = 'M')
owe = pd.date_range(pd.to_datetime('2000-01-01'),pd.to_datetime('2005-02-28'))
eye = random.sample(range(100), 60)
#http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.random.randint.html
you = np.random.randint(0, 100 , size=1886)

print len(tee)  #60
print len(owe)  #1886
print len(eye)  #60
print len(you)  #1886

df1 = pd.DataFrame({'tee': tee , 'eye': eye})
print df1.head()

   eye        tee
0   17 2000-01-31
1   71 2000-02-29
2   56 2000-03-31
3    2 2000-04-30
4   92 2000-05-31

df2 = pd.DataFrame({'owe':owe, 'you': you})
print df2.head()

         owe  you
0 2000-01-01   78
1 2000-01-02   76
2 2000-01-03   78
3 2000-01-04   51
4 2000-01-05   66
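
The merge used in this answer is an inner join by default, so only the days in owe that coincide with a month end in tee survive. Here is a minimal sketch on tiny made-up frames; the names left, right and merged and all values are hypothetical, and print() is written so that it should also run on Python 3:

```python
import pandas as pd

# Hypothetical miniature versions of df1 (month ends) and df2 (daily rows).
left = pd.DataFrame({'tee': pd.to_datetime(['2004-01-31', '2004-02-29']),
                     'eye': [10, 20]})
right = pd.DataFrame({'owe': pd.date_range('2004-01-30', '2004-02-02'),
                      'you': [1, 2, 3, 4]})

# how='inner' is the default: a row is kept only where tee equals owe.
merged = pd.merge(left, right, left_on='tee', right_on='owe')
print(merged)
```

Only 2004-01-31 appears in both frames here, so merged has a single row. In the answer's data the same effect reduces the 1886 daily rows to the 60 month-end rows.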




#merging df1 and df2 by columns tee and owe
result = pd.merge(df1, df2, left_on='tee', right_on='owe')

#after testing change eye_new to eye
#parentheses let the expression continue on the next line
result.loc[ result['tee'].dt.year == 2004 ,'eye_new'] = (result['owe'].dt.month +
                                                         result['you'])


    

print result

    eye        tee        owe  you  eye_new
0    17 2000-01-31 2000-01-31    4      NaN
1    71 2000-02-29 2000-02-29   27      NaN
2    56 2000-03-31 2000-03-31   14      NaN
3     2 2000-04-30 2000-04-30   13      NaN
4    92 2000-05-31 2000-05-31   10      NaN
5    11 2000-06-30 2000-06-30   54      NaN
6    93 2000-07-31 2000-07-31   54      NaN
...
45   49 2003-10-31 2003-10-31   50      NaN
46   79 2003-11-30 2003-11-30   93      NaN
47   68 2003-12-31 2003-12-31   20      NaN
48   96 2004-01-31 2004-01-31   49       50
49   53 2004-02-29 2004-02-29   24       26
50   83 2004-03-31 2004-03-31   56       59
51   39 2004-04-30 2004-04-30   18       22
52   40 2004-05-31 2004-05-31    4        9
53   86 2004-06-30 2004-06-30   42       48
54   48 2004-07-31 2004-07-31   76       83
55    9 2004-08-31 2004-08-31   21       29
56   89 2004-09-30 2004-09-30   24       33
57   18 2004-10-31 2004-10-31   32       42
58   13 2004-11-30 2004-11-30   58       69
59   12 2004-12-31 2004-12-31    6       18
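
The NaN values above come from how loc assignment works with a boolean mask: only the masked rows receive a value. Below is a small sketch, assuming a hypothetical three-row frame called demo, that contrasts writing into a new column with writing into the existing eye column:

```python
import pandas as pd

# Hypothetical stand-in for the merged frame.
demo = pd.DataFrame({
    'tee': pd.to_datetime(['2003-12-31', '2004-01-31', '2004-02-29']),
    'eye': [68, 96, 53],
    'you': [20, 49, 24],
})
demo['owe'] = demo['tee']

mask = demo['tee'].dt.year == 2004

# New column: rows outside the mask are never assigned, so they show NaN.
demo.loc[mask, 'eye_new'] = demo['owe'].dt.month + demo['you']

# Existing column: rows outside the mask keep their old value.
demo.loc[mask, 'eye'] = demo.loc[mask, 'owe'].dt.month + demo.loc[mask, 'you']
print(demo)
```

Writing to eye_new first makes it easy to check the result next to the old values before overwriting eye.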


Edit, following the comments:

#after testing change eye_new to eye
result.loc[ result['tee'].dt.year == 2004 ,'eye_new'] = result.apply(lambda x: x['owe'] +
                                               pd.DateOffset(days=x['you']), axis=1).dt.month
print result


    eye        tee        owe  you  eye_new
0     5 2000-01-31 2000-01-31  149      NaN
1    81 2000-02-29 2000-02-29    5      NaN
2    42 2000-03-31 2000-03-31   39      NaN

...
47   56 2003-12-31 2003-12-31   74      NaN
48   64 2004-01-31 2004-01-31   39        3
49    0 2004-02-29 2004-02-29  395        3
50   13 2004-03-31 2004-03-31  257       12
51   31 2004-04-30 2004-04-30  164       10
52   14 2004-05-31 2004-05-31  116        9
53   37 2004-06-30 2004-06-30  335        5
54   49 2004-07-31 2004-07-31  158        1
55   95 2004-08-31 2004-08-31  244        5
56   82 2004-09-30 2004-09-30  279        7
57   38 2004-10-31 2004-10-31   20       11
58   74 2004-11-30 2004-11-30   33        1
59   59 2004-12-31 2004-12-31  326       11
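
Calling apply with axis=1 runs a Python function once per row, which can be slow on large frames. One possible alternative, sketched here on a hypothetical frame called demo, is to convert the day counts with pd.to_timedelta and add them to the dates in one step. The three rows reuse owe and you values from the output above:

```python
import pandas as pd

# Hypothetical stand-in rows: owe is a date, you is a number of days.
demo = pd.DataFrame({
    'owe': pd.to_datetime(['2004-01-31', '2004-02-29', '2004-12-31']),
    'you': [39, 395, 326],
})

# Turn the day counts into timedeltas, add them to the dates, take the month.
shifted = demo['owe'] + pd.to_timedelta(demo['you'], unit='D')
months = shifted.dt.month

print(months.tolist())  # [3, 3, 11]
```

These months agree with the eye_new values shown for the same rows above. To use this in the answer, the right-hand side of the loc assignment would be replaced by the same expression built from result['owe'] and result['you'].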
