Problem description
In a pandas DataFrame, I have a column ['name'] containing various operating system classifications such as 'Windows 7', 'Windows 10', 'Linux', 'Mobile iOS 9.1', 'OS X 10.12', etc. The values are strings.
I am hoping to use this function to create a new column ['type'] that holds a more generalized version of each name:
def name_group(row):
    if 'Windows' in row:
        name = 'Microsoft Windows'
    elif 'iOS' in row:
        name = 'Apple iOS'
    elif 'OS X' in row:
        name = 'Apple Macintosh'
    elif 'Macintosh' in row:
        name = 'Apple Macintosh'
    elif 'Linux' in row:
        name = 'GNU/Linux'
    else:
        name = 'Other'
    return name
The function works correctly when I test it by passing in a single string, but for some reason, when I apply it to the DataFrame like this, it returns 'Other' for every row.
new_df['type'] = new_df.apply(name_group, axis=1)
Any thoughts on what could be causing this?
Recommended answer
You need to select the column by name and use Series.apply, so that the function receives each string value:
new_df['type'] = new_df['name'].apply(name_group)
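As for why the original call returned 'Other' everywhere: with axis=1, DataFrame.apply passes each row to the function as a Series, and the `in` operator on a Series tests its index labels (here, the column names) rather than its values. A minimal sketch, assuming a small made-up frame (the sample data and the variable names `in_labels`, `label_hit`, and `in_value` are illustrative, not from the question):

```python
import pandas as pd

# Hypothetical sample data for illustration only
new_df = pd.DataFrame({'name': ['Windows 7', 'Linux']})

# This is the kind of object apply(..., axis=1) hands to the function
row = new_df.iloc[0]

# Membership on a Series checks the index labels, which are the column names
in_labels = 'Windows' in row

# 'name' is a label of the row Series, so this membership test succeeds
label_hit = 'name' in row

# Substring test on the actual string value, which is what name_group expects
in_value = 'Windows' in row['name']

print(in_labels, label_hit, in_value)
```

Since none of the branches match a column label, every row falls through to the else branch.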
If you want to use DataFrame.apply instead, you need a lambda function that picks the column out of each row by name:
new_df['type'] = new_df.apply(lambda x: name_group(x['name']), axis=1)
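Both forms can be checked side by side. The following is a rough sketch, assuming a small sample frame built from the names in the question plus one hypothetical unmatched value ('Solaris'); the intermediate names `by_series` and `by_rows` are illustrative only:

```python
import pandas as pd

def name_group(name):
    # Same logic as the question's function; the argument is a single string
    if 'Windows' in name:
        return 'Microsoft Windows'
    elif 'iOS' in name:
        return 'Apple iOS'
    elif 'OS X' in name or 'Macintosh' in name:
        return 'Apple Macintosh'
    elif 'Linux' in name:
        return 'GNU/Linux'
    return 'Other'

# Sample values from the question, plus 'Solaris' as an assumed unmatched case
new_df = pd.DataFrame({'name': ['Windows 7', 'Windows 10', 'Linux',
                                'Mobile iOS 9.1', 'OS X 10.12', 'Solaris']})

# Series.apply: the function receives each string in the column
by_series = new_df['name'].apply(name_group)

# DataFrame.apply with axis=1: the lambda extracts the string from each row
by_rows = new_df.apply(lambda x: name_group(x['name']), axis=1)

new_df['type'] = by_series
print(new_df)
```

When only one column is involved, the Series.apply form is the simpler of the two, since it avoids building a Series object for every row.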