I have a DataFrame named df with an account number column. I am trying to drop the rows where the first two letters of the account are "AA" or "BB".

import pandas as pd

df = pd.DataFrame(data=["AA121", "AB1212", "BB121"],columns=['Account'])
print(df)
# This is the line that raises the error shown below.
df = df[df['Account'][2:] not in ['AA', 'BB']]


Error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Best answer

You can try str.contains:

import pandas as pd

df = pd.DataFrame(data=["AA121", "AB1212", "BB121"],columns=['Account'])
print(df)
  Account
0   AA121
1  AB1212
2   BB121

print(df['Account'].str[:2])
0    AA
1    AB
2    BB
Name: Account, dtype: object

print(df['Account'].str[:2].str.contains('AA|BB'))
0     True
1    False
2     True
Name: Account, dtype: bool


df = df[~(df['Account'].str[:2].str.contains('AA|BB'))]
print(df)
  Account
1  AB1212
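
For context, the attempt in the question fails because `not in` needs a single True/False, while comparing a string against a Series yields one result per row. When the prefixes are plain strings rather than patterns, an elementwise membership test may read more directly than a regex. The following is only a sketch of that idea, not part of the original answer; `blocked_prefixes`, `has_blocked_prefix`, and `filtered` are hypothetical names chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame(data=["AA121", "AB1212", "BB121"], columns=["Account"])

# Hypothetical name: the two-letter prefixes whose rows should be dropped.
blocked_prefixes = ["AA", "BB"]

# Take the first two characters of each account, then test membership
# row by row. The result is a boolean Series, one value per row.
has_blocked_prefix = df["Account"].str[:2].isin(blocked_prefixes)

# Keep only the rows that do NOT have a blocked prefix.
filtered = df[~has_blocked_prefix]
print(filtered)
```

Because `isin` compares whole values, characters such as `|` or `.` in a prefix are treated literally, which is not the case with `str.contains` and its regex default.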


Or use str.startswith (starting again from the original, unfiltered df):

print((df['Account'].str[:2].str.startswith('AA')) |
      (df['Account'].str[:2].str.startswith('BB')))
0     True
1    False
2     True
Name: Account, dtype: bool

print(~((df['Account'].str[:2].str.startswith('AA')) |
        (df['Account'].str[:2].str.startswith('BB'))))
0    False
1     True
2    False
Name: Account, dtype: bool

df = df[~((df['Account'].str[:2].str.startswith('AA')) |
          (df['Account'].str[:2].str.startswith('BB')))]
print(df)
  Account
1  AB1212
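
If the list of prefixes may grow, or the column may contain missing values, the startswith approach above could be wrapped in a small helper. This is a minimal sketch under those assumptions: `drop_by_prefix` is a hypothetical function, and the extra `None` row is added here purely to illustrate the `na=False` argument. It is not data from the question.

```python
import pandas as pd


def drop_by_prefix(frame, column, prefixes):
    """Return the rows of `frame` whose `column` value starts with none of `prefixes`."""
    # Start with an all-False mask and OR in one startswith test per prefix.
    mask = pd.Series(False, index=frame.index)
    for prefix in prefixes:
        # na=False makes missing values count as "does not start with prefix".
        mask |= frame[column].str.startswith(prefix, na=False)
    return frame[~mask]


# Illustrative data: the question's three accounts plus one missing value.
df = pd.DataFrame(data=["AA121", "AB1212", "BB121", None], columns=["Account"])
result = drop_by_prefix(df, "Account", ["AA", "BB"])
print(result)
```

Here `startswith` is applied to the full string, so the extra `.str[:2]` slice is not needed. Rows with a missing account are kept; whether that is the desired behavior depends on the data.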
