How to apply the same transformation to groups of variables in a data frame?

Problem description

I have a data frame with lots of variables whose names include tags.

    mydf <- data.frame(
      var_x = 1:5,
      var_y = runif(5),
      var_z = runif(5),
      other_x = 10:14,
      other_p = runif(5),
      other_r = runif(5)
    )

    mydf
      var_x     var_y      var_z other_x   other_p   other_r
    1     1 0.2700212 0.05893272      10 0.6212327 0.6177092
    2     2 0.1284033 0.27333098      11 0.6933060 0.7520978
    3     3 0.7313771 0.69352560      12 0.3154764 0.8479646
    4     4 0.2400357 0.25151053      13 0.2057361 0.5138406
    5     5 0.1797793 0.78550584      14 0.6671606 0.5801830

I would like to divide the var_* variables by var_x and the other_* variables by other_x. How can I do this easily?

I tried to use mutate_each from dplyr. The following works if there is only one group to scale. How can I automate this for each tag?

    library(dplyr)
    scale_var <- mydf$var_x
    mydf %>%
      mutate_each(funs(./scale_var), matches("^var"))

I tried to write my own function as follows.

    mymutate <- function(data, type) {
      scale_var <- mydf[[paste0(type, "_x")]]
      data %>%
        mutate_each(
          funs(./scale_var),
          matches(paste0("^", type))
        )
    }

But when I tried to run it on just one type,

    mymutate(mydf, type = "var")

it threw an error that I do not really understand:

    Error in paste0("^", type) : object 'type' not found

UPDATE

I would like to use only the new variables, so it does not matter that the method also divides the x variables by themselves.

I have lots of tags such as var and other, so I do not want to write them out in each case. That is why I tried to construct my own function to use later with lapply.

UPDATE2

These are the variables of my data frame.

     [1] "location_50_all_1"                "location_50_both_sides_important_1"
     [3] "location_50_left_important_1"     "location_50_other_important_1"
     [5] "location_50_right_important_1"    "ownership_all_1"
     [7] "ownership_both_sides_important_1" "ownership_left_important_1"
     [9] "ownership_other_important_1"      "ownership_right_important_1"
    [11] "person_all_1"                     "person_both_sides_important_1"
    [13] "person_left_important_1"          "person_other_important_1"
    [15] "person_right_important_1"         "union_all_1"
    [17] "union_both_sides_important_1"     "union_left_important_1"
    [19] "union_other_important_1"          "union_right_important_1"
    [21] "total_left_important"             "total_right_important"
    [23] "total_both_sides_important"       "total_other_important"
    [25] "total_firm_officials"             "left"
    [27] "right"                            "connected"

I would like to divide the location_50* variables by location_50_all_1, and do the same for location_200*, ownership*, person* and union*.

UPDATE3

Here is the answer to the question of why 'type' is not found.
Solution

This modified version of mymutate works even if the data frame is not nicely structured (that is, when the number of columns to be scaled is not the same for each group).

    library(dplyr)
    library(magrittr)  # provides the %<>% assignment pipe used below

    # mydf
    #   var_x     var_y     var_z other_x     other_p    other_r
    # 1     1 0.1913353 0.4706113      10 0.003120607 0.17808048
    # 2     2 0.1620725 0.6228830      11 0.844399758 0.01361841
    # 3     3 0.5148884 0.3671178      12 0.996055741 0.33513972
    # 4     4 0.8086168 0.3265216      13 0.984819261 0.96802056
    # 5     5 0.9902217 0.9087540      14 0.951119864 0.82479090

    mymutate <- function(data, type) {
      # Take the scaling column from the data argument, not from a global object
      scale_var <- data[[paste0(type, "_x")]]
      # Keep only the columns of this group and divide each by the scaling column
      data %<>%
        select(matches(paste0("^", type))) %>%
        mutate_each(funs(./scale_var))
      # Restore the original (unscaled) _x column
      data[[paste0(type, "_x")]] <- scale_var
      data
    }

    types <- c("var", "other")
    lapply(types, mymutate, data = mydf) %>% bind_cols(.)

    #   var_x      var_y      var_z other_x      other_p     other_r
    # 1     1 0.19133528 0.47061133      10 0.0003120607 0.017808048
    # 2     2 0.08103626 0.31144148      11 0.0767636144 0.001238037
    # 3     3 0.17162946 0.12237259      12 0.0830046451 0.027928310
    # 4     4 0.20215421 0.08163039      13 0.0757553278 0.074463120
    # 5     5 0.19804435 0.18175081      14 0.0679371332 0.058913635
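In current dplyr releases (1.0.0 and later), across() is available, and mutate_each() and funs() are deprecated. The following is a minimal sketch of the same idea under that assumption, not the answerer's original code. The helper name scale_group and its denom argument are hypothetical, and set.seed() is added only so that the example is reproducible.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed, only to make the example reproducible
mydf <- data.frame(
  var_x = 1:5, var_y = runif(5), var_z = runif(5),
  other_x = 10:14, other_p = runif(5), other_r = runif(5)
)

# Hypothetical helper: keep the columns whose names start with `type`
# and divide each of them, except the denominator column itself,
# by the denominator column (by default "<type>_x").
scale_group <- function(data, type, denom = paste0(type, "_x")) {
  scale_var <- data[[denom]]
  data %>%
    select(starts_with(type)) %>%
    mutate(across(-all_of(denom), ~ .x / scale_var))
}

types <- c("var", "other")
# Scale each group separately, then put the groups side by side again
scaled <- bind_cols(lapply(types, scale_group, data = mydf))
```

For column names like those in UPDATE2, where the denominator ends in _all_1 rather than _x, the same helper could presumably be called as scale_group(df, "location_50", denom = "location_50_all_1"), with df standing in for the real data frame.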