R: dplyr - changing the factor value "SI" to the column's name with mutate

Problem description


I have this data.frame, and I need to loop through every column, search for "SI", and replace each match with that column's name.

I have this:

SKU             Tv.y.Video  Cómputo     Tecnología
2003091090002P     NO          NO           NO
2003091090002      NO          NO           NO
2003120060006P     NO          NO           NO
2003120060006P     NO          NO           NO
2003120060006      NO          NO           NO
2004121460000P     NO          SI           NO
2004121460000      NO          SI           NO
2004121440002P     NO          SI           NO
2004121440002      NO          SI           NO
2004123030003P     NO          SI           NO

Need to get this:

         SKU         Tv.y.Video   Cómputo       Tecnología
   2003091090002P      NO          NO           NO
   2003091090002       NO          NO           NO
   2003120060006P      NO          NO           NO
   2003120060006P      NO          NO           NO
   2003120060006       NO          NO           NO
   2004121460000P      NO          Cómputo      NO
   2004121460000       NO          Cómputo      NO
   2004121440002P      NO          Cómputo      NO
   2004121440002       NO          Cómputo      NO
   2004123030003P      NO          Cómputo      NO

I tried this code:

df$Tv.y.Video <- mutate(df$Tv.y.Video,
                Tv.y.Video = ifelse(sub("SI", Tv.y.Video), "Tv.y.Video", Tv.y.Video))

But I got this message:

Error in UseMethod("mutate_") :
  no applicable method for 'mutate_' applied to an object of class "factor"

So I changed the class of that column to character with this:

df$Tv.y.Video <- as.character(df$Tv.y.Video)

And then got this message:

Error in UseMethod("mutate_") :
  no applicable method for 'mutate_' applied to an object of class "character"

This is the result from str(df):

'data.frame':   10 obs. of  4 variables:
 $ SKU       : Factor w/ 9028 levels "2003014460004",..: 9 8 16 16 15 842 841 840 839 846
 $ Tv.y.Video: chr  "NO" "NO" "NO" "NO" ...
 $ Cómputo   : Factor w/ 2 levels "NO","SI": 1 1 1 1 1 2 2 2 2 2
 $ Tecnología: Factor w/ 2 levels "NO","SI": 1 1 1 1 1 1 1 1 1 1

Solution

Here's a base R approach, if you want to give it a try:

# change the class to character for all columns:
df[] <- lapply(df, as.character)
# replace SI entries with column names:
df[] <- Map(function(cols, df_names) replace(cols, which(cols == "SI"),
               df_names), df, names(df) )
df
#              SKU Tv.y.Video C.mputo Tecnolog.a
#1  2003091090002P         NO      NO         NO
#2   2003091090002         NO      NO         NO
#3  2003120060006P         NO      NO         NO
#4  2003120060006P         NO      NO         NO
#5   2003120060006         NO      NO         NO
#6  2004121460000P         NO C.mputo         NO
#7   2004121460000         NO C.mputo         NO
#8  2004121440002P         NO C.mputo         NO
#9   2004121440002         NO C.mputo         NO
#10 2004123030003P         NO C.mputo         NO
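
To try the approach above without the original data, the following sketch rebuilds the ten sample rows and applies the same two steps. The construction of df is an assumption based on the tables and the str(df) output in the question (the real SKU column has 9028 levels), and the ASCII column names Computo and Tecnologia are stand-ins chosen here to sidestep the encoding issue visible in the printed output.

# Rebuild the sample rows from the question (assumed; the real data is larger).
df <- data.frame(
  SKU = c("2003091090002P", "2003091090002", "2003120060006P",
          "2003120060006P", "2003120060006", "2004121460000P",
          "2004121460000", "2004121440002P", "2004121440002",
          "2004123030003P"),
  Tv.y.Video = rep("NO", 10),
  Computo    = factor(c(rep("NO", 5), rep("SI", 5)), levels = c("NO", "SI")),
  Tecnologia = factor(rep("NO", 10), levels = c("NO", "SI")),
  stringsAsFactors = TRUE
)

# Step 1: convert every column to character so new values can be assigned.
df[] <- lapply(df, as.character)

# Step 2: pair each column with its own name and overwrite the "SI" entries.
df[] <- Map(function(col, col_name) replace(col, which(col == "SI"), col_name),
            df, names(df))

# Inspect the result.
df

Converting to character first matters because assigning a value that is not an existing level to a factor would produce NA with a warning.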

Edit after comment:

The main problem with the attempted code in the OP:

df$Tv.y.Video <- mutate(df$Tv.y.Video,
                Tv.y.Video = ifelse(sub("SI", Tv.y.Video), "Tv.y.Video", Tv.y.Video))

is that you are calling mutate directly on a single column (a vector) rather than on the data.frame. Generally, dplyr works with data.frame-like objects, and most functions in dplyr expect a data.frame-like object as the first argument. Here, that would be df, so you would need to start with something like the following:

df <- mutate(df,
      Tv.y.Video = ifelse(Tv.y.Video == "SI", "Tv.y.Video", Tv.y.Video)
)

Or you can use the "pipe" operator (%>%) which lets you specify the data.frame first and then "pipe" it into mutate. Note however, that under the hood mutate still uses df as its first argument as displayed above. The pipe mainly makes it easier to read and allows you to create long sequences of manipulations connected by pipes. With the pipe operator it would be:

df <- df %>%
        mutate(
          Tv.y.Video = ifelse(Tv.y.Video == "SI", "Tv.y.Video", Tv.y.Video)
)
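
The mutate calls above handle one column at a time. To cover every column in a single dplyr call, one possible sketch uses across() together with cur_column(). Both require dplyr 1.0.0 or later, which is newer than the version implied by the mutate_ error in the question. The two-row df and the ASCII column names below are assumptions for illustration only.

library(dplyr)

# Small stand-in for the data in the question (assumed values).
df <- data.frame(
  SKU        = c("2004121460000P", "2003091090002"),
  Tv.y.Video = c("NO", "NO"),
  Computo    = factor(c("SI", "NO")),
  Tecnologia = factor(c("NO", "NO"))
)

# For each column: convert to character, then overwrite "SI" with the
# name of the column currently being processed.
df <- df %>%
  mutate(across(
    everything(),
    ~ replace(as.character(.x), as.character(.x) == "SI", cur_column())
  ))

df

cur_column() only works inside across(), where it returns the name of the column being transformed, so no explicit loop over names(df) is needed.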

Also note that replace would be faster than ifelse, which is why I used it in my base R approach.
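
A rough way to check this on your own machine is sketched below. The vector x, its size, and the replacement string are made up for the test; the timings depend on the R version and hardware, so none are quoted here.

# Build a large character vector of "NO"/"SI" values (assumed test data).
set.seed(1)
x <- sample(c("NO", "SI"), 1e6, replace = TRUE)

# Both calls should produce the same character vector.
with_ifelse  <- ifelse(x == "SI", "Computo", x)
with_replace <- replace(x, x == "SI", "Computo")
identical(with_ifelse, with_replace)

# Compare elapsed time of the two approaches.
system.time(ifelse(x == "SI", "Computo", x))
system.time(replace(x, x == "SI", "Computo"))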
