在pandas dataframe字符串列中,我想基于一行的值派生一个新列,直到下一个值再次出现。什么是最有效的方法?

输入数据框:

import pandas as pd

df = pd.DataFrame({'neighborhood':['Chicago City', 'Wicker Park', 'Bucktown','Lincoln Park','West Loop','River North','Milwaukee City','Bay View','East Side','South Side','Bronzeville','North Side','New York City','Harlem','Midtown','Chinatown']})


我想要的数据帧输出将是:

      neighborhood city
0     Chicago City Chicago
1      Wicker Park Chicago
2         Bucktown Chicago
3     Lincoln Park Chicago
4        West Loop Chicago
5      River North Chicago
6   Milwaukee City Milwaukee
7         Bay View Milwaukee
8        East Side Milwaukee
9       South Side Milwaukee
10     Bronzeville Milwaukee
11      North Side Milwaukee
12   New York City New York
13          Harlem New York
14         Midtown New York
15       Chinatown New York

最佳答案

1)如果第一列包含“城市”,则将其复制到第二列,但剪切掉“城市”部分

2)用正向填充方法填充NA

import numpy as np

df['city'] = np.where(
df.neighborhood.str.contains('City'),
df.neighborhood.str.replace(' City', '', case = False),
None)


结果:

      neighborhood       city
0     Chicago City    Chicago
1      Wicker Park       None
2         Bucktown       None
3     Lincoln Park       None
4        West Loop       None
5      River North       None
6   Milwaukee City  Milwaukee
7         Bay View       None
8        East Side       None
9       South Side       None
10     Bronzeville       None
11      North Side       None
12   New York City   New York
13          Harlem       None
14         Midtown       None
15       Chinatown       None


df['city'] = df['city'].fillna(method = 'ffill')


结果:

      neighborhood       city
0     Chicago City    Chicago
1      Wicker Park    Chicago
2         Bucktown    Chicago
3     Lincoln Park    Chicago
4        West Loop    Chicago
5      River North    Chicago
6   Milwaukee City  Milwaukee
7         Bay View  Milwaukee
8        East Side  Milwaukee
9       South Side  Milwaukee
10     Bronzeville  Milwaukee
11      North Side  Milwaukee
12   New York City   New York
13          Harlem   New York
14         Midtown   New York
15       Chinatown   New York

关于python - 根据某行的某个值派生一个新的pandas列,并应用直到下一个值再次出现,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/55112419/

10-13 08:37
查看更多