Problem description
These are my data frames:
# data
set.seed(1234321)
# Original data frame (e.g. questionnaire survey data)
answer <- c("Yes", "No")
likert_scale <- c("strongly disagree", "disagree", "undecided", "agree", "strongly agree")
d1 <- c(rnorm(10)*10)
d2 <- sample(x = c(letters), size = 10, replace = TRUE)
d3 <- sample(x = likert_scale, size = 10, replace = TRUE)
d4 <- sample(x = likert_scale, size = 10, replace = TRUE)
d5 <- sample(x = likert_scale, size = 10, replace = TRUE)
d6 <- sample(x = answer, size = 10, replace = TRUE)
d7 <- sample(x = answer, size = 10, replace = TRUE)
original_df <- data.frame(d1, d2, d3, d4, d5, d6, d7)
# Questionnaire codebook data frame
quest_section <- c("generic", "likert scale", "specific approval")
starting_column <- c(1, 3, 6)
ending_column <- c(2, 5, 7)
df_codebook <- data.frame(quest_section, starting_column, ending_column)
I would like to split the original data frame into several data frames according to the quest_section variable in df_codebook, using starting_column and ending_column as indices to select columns in original_df.
This is the function I tried to write in order to split original_df:
# splitting dataframe function
split_df <- function(my_df, my_codebook) {
  df_names <- df_codebook[,1] %>%
    map(set_names)
  for (i in 1:length(df_codebook[,1])) {
    df_names$`[i]` <- original_df %>%
      dplyr::select(df_codebook[[2]][i]:df_codebook[[3]][i])
  }
}
# apply function to two dataframes
my_df_list <- split_df(my_df = original_df, my_codebook = df_codebook)
The result was a NULL object instead of the following list:
> my_df_list
$generic
d1 d2
1 12.369081 z
2 15.616230 x
3 18.396185 f
4 3.173245 q
5 10.715115 j
6 -11.459955 p
7 2.488894 j
8 1.158625 n
9 26.200816 a
10 12.624048 b
$`likert scale`
d3 d4 d5
1 disagree strongly agree strongly agree
2 undecided undecided strongly disagree
3 strongly agree undecided strongly disagree
4 agree undecided undecided
5 strongly disagree agree undecided
6 disagree strongly disagree undecided
7 disagree agree disagree
8 disagree strongly disagree undecided
9 undecided strongly disagree disagree
10 strongly disagree disagree strongly agree
$`specific approval`
d6 d7
1 No No
2 No No
3 Yes No
4 Yes Yes
5 Yes Yes
6 Yes Yes
7 Yes No
8 No Yes
9 No No
10 No Yes
I am interested in any kind of solution: a tidyverse and purrr approach, or a functional one.
Recommended answer
You can use Map to build a sequence from each starting_column to the matching ending_column, and use that sequence to extract the relevant columns from original_df. setNames then assigns the section names to the resulting list.
setNames(Map(function(x, y) original_df[, x:y],
df_codebook$starting_column, df_codebook$ending_column),
df_codebook$quest_section)
This returns:
#$generic
# d1 d2
#1 12.369081 z
#2 15.616230 x
#3 18.396185 f
#4 3.173245 q
#5 10.715115 j
#6 -11.459955 p
#7 2.488894 j
#8 1.158625 n
#9 26.200816 a
#10 12.624048 b
#$`likert scale`
# d3 d4 d5
#1 disagree strongly agree strongly agree
#2 undecided undecided strongly disagree
#3 strongly agree undecided strongly disagree
#4 agree undecided undecided
#5 strongly disagree agree undecided
#6 disagree strongly disagree undecided
#7 disagree agree disagree
#8 disagree strongly disagree undecided
#9 undecided strongly disagree disagree
#10 strongly disagree disagree strongly agree
#$`specific approval`
# d6 d7
#1 No No
#2 No No
#3 Yes No
#4 Yes Yes
#5 Yes Yes
#6 Yes Yes
#7 Yes No
#8 No Yes
#9 No No
#10 No Yes
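As for why the attempt in the question gives NULL: the last expression evaluated in the function body is the for loop, and a for loop evaluates to NULL, so that is what the function returns. The body also reads the global df_codebook and original_df instead of its arguments, and df_names$`[i]` assigns to an element literally named "[i]" rather than to the i-th section. A minimal sketch of a corrected function follows. It is only one possible rewrite, not part of the original answer; the drop = FALSE argument and the as.character() call are additions assumed here to keep single-column sections as data frames and to handle a factor quest_section.

```r
# Rebuild the example data from the question.
set.seed(1234321)
answer <- c("Yes", "No")
likert_scale <- c("strongly disagree", "disagree", "undecided",
                  "agree", "strongly agree")
original_df <- data.frame(
  d1 = rnorm(10) * 10,
  d2 = sample(letters, size = 10, replace = TRUE),
  d3 = sample(likert_scale, size = 10, replace = TRUE),
  d4 = sample(likert_scale, size = 10, replace = TRUE),
  d5 = sample(likert_scale, size = 10, replace = TRUE),
  d6 = sample(answer, size = 10, replace = TRUE),
  d7 = sample(answer, size = 10, replace = TRUE)
)
df_codebook <- data.frame(
  quest_section   = c("generic", "likert scale", "specific approval"),
  starting_column = c(1, 3, 6),
  ending_column   = c(2, 5, 7)
)

# Sketch of a corrected split_df: it uses only its arguments and
# returns the list as the value of its last expression.
split_df <- function(my_df, my_codebook) {
  pieces <- Map(
    # drop = FALSE (an addition) keeps a one-column section as a data frame
    function(from, to) my_df[, from:to, drop = FALSE],
    my_codebook$starting_column,
    my_codebook$ending_column
  )
  # as.character() (an addition) guards against quest_section being a factor
  setNames(pieces, as.character(my_codebook$quest_section))
}

my_df_list <- split_df(my_df = original_df, my_codebook = df_codebook)
names(my_df_list)
```

Each element of my_df_list can then be accessed by section name, for example my_df_list$`likert scale`.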