I have this data.frame, and I need to loop through every column, search for "SI", and replace it with the column's name.
I have this:
SKU            Tv.y.Video Cómputo Tecnología
2003091090002P NO         NO      NO
2003091090002  NO         NO      NO
2003120060006P NO         NO      NO
2003120060006P NO         NO      NO
2003120060006  NO         NO      NO
2004121460000P NO         SI      NO
2004121460000  NO         SI      NO
2004121440002P NO         SI      NO
2004121440002  NO         SI      NO
2004123030003P NO         SI      NO
Need to get this:
SKU            Tv.y.Video Cómputo Tecnología
2003091090002P NO         NO      NO
2003091090002  NO         NO      NO
2003120060006P NO         NO      NO
2003120060006P NO         NO      NO
2003120060006  NO         NO      NO
2004121460000P NO         Cómputo NO
2004121460000  NO         Cómputo NO
2004121440002P NO         Cómputo NO
2004121440002  NO         Cómputo NO
2004123030003P NO         Cómputo NO
My code:
I tried this code:
df$Tv.y.Video <- mutate(df$Tv.y.Video, Tv.y.Video = ifelse(sub("SI", Tv.y.Video), "Tv.y.Video", Tv.y.Video))
But got this message:
Error in UseMethod("mutate_") : no applicable method for 'mutate_' applied to an object of class "factor"
So I changed the class of that column to character with this:
df$Tv.y.Video <- as.character(df$Tv.y.Video)
And got this message:
Error in UseMethod("mutate_") : no applicable method for 'mutate_' applied to an object of class "character"
This is the result from str(df):
'data.frame': 10 obs. of 4 variables:
 $ SKU       : Factor w/ 9028 levels "2003014460004",..: 9 8 16 16 15 842 841 840 839 846
 $ Tv.y.Video: chr "NO" "NO" "NO" "NO" ...
 $ Cómputo   : Factor w/ 2 levels "NO","SI": 1 1 1 1 1 2 2 2 2 2
 $ Tecnología: Factor w/ 2 levels "NO","SI": 1 1 1 1 1 1 1 1 1 1
Solution
Here's a base R approach, if you want to give it a try:
# change the class to character for all columns:
df[] <- lapply(df, as.character)

# replace SI entries with column names:
df[] <- Map(function(cols, df_names) replace(cols, which(cols == "SI"), df_names),
            df,
            names(df)
            )
df
#              SKU Tv.y.Video C.mputo Tecnolog.a
#1  2003091090002P         NO      NO         NO
#2   2003091090002         NO      NO         NO
#3  2003120060006P         NO      NO         NO
#4  2003120060006P         NO      NO         NO
#5   2003120060006         NO      NO         NO
#6  2004121460000P         NO C.mputo         NO
#7   2004121460000         NO C.mputo         NO
#8  2004121440002P         NO C.mputo         NO
#9   2004121440002         NO C.mputo         NO
#10 2004123030003P         NO C.mputo         NO
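To try the base R approach without the original data, here is a minimal, self-contained sketch. The three-row data frame is a hypothetical sample modeled on the str(df) output in the question, and the accents are dropped from the column names (Computo, Tecnologia) purely to sidestep encoding differences between machines. That renaming is an assumption, not part of the original data.

```r
# Hypothetical sample modeled on the question's str(df) output;
# column names are written without accents (an assumption).
df <- data.frame(
  SKU        = c("2003120060006", "2004121460000P", "2004121460000"),
  Tv.y.Video = c("NO", "NO", "NO"),
  Computo    = factor(c("NO", "SI", "SI"), levels = c("NO", "SI")),
  Tecnologia = factor(c("NO", "NO", "NO"), levels = c("NO", "SI")),
  stringsAsFactors = FALSE
)

# Convert every column to character so the replacement text is not
# restricted to the existing factor levels ("NO", "SI").
df[] <- lapply(df, as.character)

# Walk over the columns and their names in parallel; wherever a cell
# equals "SI", overwrite it with the name of its own column.
df[] <- Map(
  function(col, col_name) replace(col, which(col == "SI"), col_name),
  df,
  names(df)
)

print(df)
```

Assigning into df[] rather than df keeps the object a data.frame, since Map on its own returns a plain list.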
Edit after comment:
The main problem with the attempted code in the OP:
df$Tv.y.Video <- mutate(df$Tv.y.Video, Tv.y.Video = ifelse(sub("SI", Tv.y.Video), "Tv.y.Video", Tv.y.Video))
is that you are trying to use mutate directly on a single column. Generally, dplyr works with data.frame-like objects, and most dplyr functions expect a data.frame-like object as their first argument. Here, that would be df, so you would need to start with something like the following:
df <- mutate(df,
  Tv.y.Video = ifelse(Tv.y.Video == "SI", "Tv.y.Video", Tv.y.Video)
)
Or you can use the "pipe" operator (%>%), which lets you specify the data.frame first and then "pipe" it into mutate. Note, however, that under the hood mutate still uses df as its first argument, as displayed above. The pipe mainly makes the code easier to read and allows you to create long sequences of manipulations connected by pipes. With the pipe operator it would be:
df <- df %>%
  mutate(
    Tv.y.Video = ifelse(Tv.y.Video == "SI", "Tv.y.Video", Tv.y.Video)
  )
Also note that replace would be faster than ifelse, which is why I used it in my base R approach.
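As a small illustration of how replace and ifelse relate, the sketch below applies both to a hypothetical character vector (the values and the accent-free name Computo are assumptions made for the example). It only compares results and makes no timing claim. It also shows why the conversion to character matters: when ifelse is given a factor as its fallback value, the untouched entries come back as the factor's integer codes rather than its labels.

```r
# Hypothetical column values, stored as plain character strings.
computo_chr <- c("NO", "SI", "SI", "NO")

# Both calls swap "SI" for the column name and leave "NO" alone.
via_replace <- replace(computo_chr, which(computo_chr == "SI"), "Computo")
via_ifelse  <- ifelse(computo_chr == "SI", "Computo", computo_chr)
same_result <- identical(via_replace, via_ifelse)

# The same values stored as a factor, as in the question's str(df).
computo_fct <- factor(computo_chr, levels = c("NO", "SI"))

# ifelse() drops the factor attributes of its 'no' argument, so the
# untouched entries are taken from the underlying integer codes.
from_factor <- ifelse(computo_fct == "SI", "Computo", computo_fct)

print(same_result)
print(from_factor)
```

Converting the columns with as.character first, as the base R approach does, avoids this surprise in both variants.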