python - Pandas DataFrame str split with a maximum number of splits

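I have a DataFrame with a single column of names. The names are not always in the same format, so I am trying to split the first and last names into separate columns. For example, I might see: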

Smith John

Smith, John

Smith, John A

Smith John A

Smith John and Jane

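The one consistent pattern is that the last name comes first. How can I create two separate columns: one for the last name, and a second for everything that is not the last name? This is what I have so far: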

owners_df['normal_name'] = owners_df['name'].str.replace(', ', ' ')
owners_df['lastname'] = owners_df["normal_name"].str.split(' ', 1)[0]
owners_df['firstname'] = owners_df["normal_name"].str.split(' ', 1)[1]

The problem is that I get the error "ValueError: Length of values does not match length of index".

Best answer

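As @Datanovice said in the comments, "when you run owners_df["normal_name"].str.split(' ', 1)[0], you are only grabbing the first row." That first row is a single list of name parts, and its length does not match the number of rows in the DataFrame, which is why the assignment raises the ValueError.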
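To make the difference concrete, here is a minimal sketch on a made-up Series (the sample names are taken from the question; the variable names are hypothetical). It contrasts plain indexing on the split result with indexing through the .str accessor.

```python
import pandas as pd

# Hypothetical sample data using names from the question
s = pd.Series(["Smith John", "Smith John A", "Smith John and Jane"])

# Each row becomes a list of at most two parts
split = s.str.split(" ", n=1)

# Plain [0] is a label lookup on the Series: it returns the list stored in row 0
first_row = split[0]

# .str[0] indexes into every row's list and returns a Series of the same length
per_row_first = split.str[0]

print(first_row)
print(per_row_first.tolist())
```

Assigning first_row to a column would try to fit a two-element list into a three-row frame, which is the length mismatch reported in the question.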

Use the .str accessor a second time to get the expected output:

owners_df['lastname'] = owners_df["normal_name"].str.split(' ', n=1).str[0]
owners_df['firstname'] = owners_df["normal_name"].str.split(' ', n=1).str[1]
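Put together with the normalization step from the question, the whole thing might look like the sketch below. The DataFrame contents are the sample names from the question; passing regex=False to str.replace is an assumption added here so that the comma and space are treated as a literal string.

```python
import pandas as pd

# Sample data built from the names listed in the question
owners_df = pd.DataFrame({
    "name": [
        "Smith John",
        "Smith, John",
        "Smith, John A",
        "Smith John A",
        "Smith John and Jane",
    ]
})

# Normalize "Smith, John" to "Smith John" (literal replacement, not a regex)
owners_df["normal_name"] = owners_df["name"].str.replace(", ", " ", regex=False)

# Split once on the first space, then pick list elements per row via .str
parts = owners_df["normal_name"].str.split(" ", n=1)
owners_df["lastname"] = parts.str[0]
owners_df["firstname"] = parts.str[1]

print(owners_df[["lastname", "firstname"]])
```

A name with no space at all would produce a one-element list, so .str[1] would give NaN for that row rather than raising an error.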

See the docs. Note that the n argument limits the split to a single split.
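As a rough illustration of what n changes, the sketch below splits the same made-up names with and without the limit. The expand=True variant at the end is not part of the answer; it is shown only as an alternative that pandas also offers for getting the pieces as columns directly.

```python
import pandas as pd

# Hypothetical sample data using names from the question
names = pd.Series(["Smith John and Jane", "Smith John A"])

# Without n, every space produces a split
unlimited = names.str.split(" ")

# With n=1, only the first space is used, so the rest stays together
limited = names.str.split(" ", n=1)

# Alternative: expand=True returns a DataFrame with one column per piece
two_cols = names.str.split(" ", n=1, expand=True)
two_cols.columns = ["lastname", "firstname"]

print(unlimited.tolist())
print(limited.tolist())
print(two_cols)
```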

Regarding python - Pandas DataFrame str split with a maximum number of splits, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59775009/