I have a DataFrame named df with a column called Account Number. I am trying to delete the rows where the first two letters of the account are "AA" or "BB".
import pandas as pd
df = pd.DataFrame(data=["AA121", "AB1212", "BB121"],columns=['Account'])
print(df)
df = df[df['Account'][2:] not in ['AA', 'BB']]
Error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Best answer
You can try str.contains:
import pandas as pd
df = pd.DataFrame(data=["AA121", "AB1212", "BB121"],columns=['Account'])
print(df)
  Account
0   AA121
1  AB1212
2   BB121
print(df['Account'].str[:2])
0    AA
1    AB
2    BB
Name: Account, dtype: object
print(df['Account'].str[:2].str.contains('AA|BB'))
0     True
1    False
2     True
Name: Account, dtype: bool
df = df[~(df['Account'].str[:2].str.contains('AA|BB'))]
print(df)
  Account
1  AB1212
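Because str.contains treats its pattern as a regular expression by default, the two-character slice can probably be dropped by anchoring the pattern at the start of the string. A minimal sketch, assuming the same sample data as above (the names mask and result are only illustrative):

```python
import pandas as pd

df = pd.DataFrame(data=["AA121", "AB1212", "BB121"], columns=["Account"])

# '^' anchors the match at the start of each string, so no .str[:2] slice is needed.
# The non-capturing group (?:...) keeps the alternation inside the anchor.
mask = df["Account"].str.contains(r"^(?:AA|BB)")

# Keep only the rows that do NOT start with one of the prefixes.
result = df[~mask]
print(result)
```

Without the anchor, 'AA|BB' would also match accounts that contain those letters anywhere in the string, which is why the original answer slices first.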
Or use startswith:
print((df['Account'].str[:2].str.startswith('AA')) |
      (df['Account'].str[:2].str.startswith('BB')))
0     True
1    False
2     True
Name: Account, dtype: bool
print(~((df['Account'].str[:2].str.startswith('AA')) |
        (df['Account'].str[:2].str.startswith('BB'))))
0    False
1     True
2    False
Name: Account, dtype: bool
df = df[~((df['Account'].str[:2].str.startswith('AA')) |
(df['Account'].str[:2].str.startswith('BB')))]
print(df)
  Account
1  AB1212
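When the list of prefixes grows, chaining startswith calls with | gets unwieldy. One alternative that is not part of the original answer is to compare the two-character slice against a list with isin. A short sketch, assuming the same sample data (prefixes and filtered are hypothetical names):

```python
import pandas as pd

df = pd.DataFrame(data=["AA121", "AB1212", "BB121"], columns=["Account"])

# Hypothetical list of two-letter prefixes to exclude.
prefixes = ["AA", "BB"]

# Slice the first two characters, test membership, then negate the boolean mask.
filtered = df[~df["Account"].str[:2].isin(prefixes)]
print(filtered)
```

This does exact string comparison rather than regex matching, so prefixes containing regex metacharacters need no escaping.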