Problem description
I am hoping to determine an efficient way to convert a list of data frames into a single data frame. Below is my reproducible MWE:
set.seed(1)
ABAge = runif(100)
ABPoints = rnorm(100)
ACAge = runif(100)
ACPoints = rnorm(100)
BCAge = runif(100)
BCPoints = rnorm(100)
A_B <- data.frame(ID = as.character(paste0("ID", 1:100)), Age = ABAge, Points = ABPoints)
A_C <- data.frame(ID = as.character(paste0("ID", 1:100)), Age = ACAge, Points = ACPoints)
B_C <- data.frame(ID = as.character(paste0("ID", 1:100)), Age = BCAge, Points = BCPoints)
A_B$ID <- as.character(A_B$ID)
A_C$ID <- as.character(A_C$ID)
B_C$ID <- as.character(B_C$ID)
listFormat <- list("A_B" = A_B, "A_C" = A_C, "B_C" = B_C)
dfFormat <- data.frame(ID = as.character(paste0("ID", 1:100)), A_B.Age = ABAge, A_B.Points = ABPoints, A_C.Age = ACAge, A_C.Points = ACPoints, B_C.Age = BCAge, B_C.Points = BCPoints)
dfFormat$ID = as.character(dfFormat$ID)
This results in a data frame (dfFormat) that looks like this:
'data.frame': 100 obs. of 7 variables:
$ ID : chr "ID1" "ID2" "ID3" "ID4" ...
$ A_B.Age : num 0.266 0.372 0.573 0.908 0.202 ...
$ A_B.Points: num 0.398 -0.612 0.341 -1.129 1.433 ...
$ A_C.Age : num 0.6737 0.0949 0.4926 0.4616 0.3752 ...
$ A_C.Points: num 0.409 1.689 1.587 -0.331 -2.285 ...
$ B_C.Age : num 0.814 0.929 0.147 0.75 0.976 ...
$ B_C.Points: num 1.474 0.677 0.38 -0.193 1.578 ...
and a list of data frames (listFormat) that looks like this:
List of 3
$ A_B:'data.frame': 100 obs. of 3 variables:
..$ ID : chr [1:100] "ID1" "ID2" "ID3" "ID4" ...
..$ Age : num [1:100] 0.266 0.372 0.573 0.908 0.202 ...
..$ Points: num [1:100] 0.398 -0.612 0.341 -1.129 1.433 ...
$ A_C:'data.frame': 100 obs. of 3 variables:
..$ ID : chr [1:100] "ID1" "ID2" "ID3" "ID4" ...
..$ Age : num [1:100] 0.6737 0.0949 0.4926 0.4616 0.3752 ...
..$ Points: num [1:100] 0.409 1.689 1.587 -0.331 -2.285 ...
$ B_C:'data.frame': 100 obs. of 3 variables:
..$ ID : chr [1:100] "ID1" "ID2" "ID3" "ID4" ...
..$ Age : num [1:100] 0.814 0.929 0.147 0.75 0.976 ...
..$ Points: num [1:100] 1.474 0.677 0.38 -0.193 1.578 ...
I am hoping to come up with an automated way to convert listFormat to dfFormat. As the objects above show, there are two main conditions:
1) If there is a common column (same name and contents) in every data frame of listFormat (ID in this example), it is not repeated in the output dfFormat (which ends up with a single ID column).
2) The remaining columns of each data frame in listFormat become columns of dfFormat, named with the list element's name (e.g. "A_B"), followed by a dot and the original column name (e.g. Age), giving e.g. "A_B.Age".
I have tried various combinations of unlist() and sapply() but have been unsuccessful so far. What is an efficient way to accomplish this?
Recommended answer
Copy listFormat to L in case we need to preserve the input. Remove the ID column from every component of L except the first, cbind the remaining pieces together, and then fix up the name of the first column (cbind names it A_B.ID). No packages are used.
L <- listFormat
L[-1] <- lapply(L[-1], transform, ID = NULL)
DF <- do.call(cbind, L)
names(DF)[1] <- "ID"
giving:
> str(DF)
'data.frame': 100 obs. of 7 variables:
$ ID : chr "ID1" "ID2" "ID3" "ID4" ...
$ A_B.Age : num 0.9932 0.1451 0.6166 0.0372 0.9039 ...
$ A_B.Points: num 0.4752 0.0288 1.0548 0.6113 0.0651 ...
$ A_C.Age : num 0.912 0.761 0.618 0.895 0.507 ...
$ A_C.Points: num -0.515 -0.945 0.398 0.502 -1.021 ...
$ B_C.Age : num 0.7935 0.2747 0.0487 0.6307 0.3499 ...
$ B_C.Points: num -0.963 -1.772 1.716 -0.819 0.577 ...
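One caveat with relying on cbind for the names: data.frame() only adds the "name." prefix when a component has more than one column, so a component left with a single column after dropping ID would keep its bare column name. A minimal sketch of a variant that builds the names explicitly is below. The helper name list_to_wide and its key argument are hypothetical, not part of the original answer, and it assumes every component carries an identical key column in the same row order.

```r
# Hypothetical helper: combine a named list of data frames into one wide
# data frame, keeping a single copy of the shared key column.
list_to_wide <- function(lst, key = "ID") {
  # Assumption: the key column is identical (values and order) in every component.
  keys <- lapply(lst, `[[`, key)
  stopifnot(all(vapply(keys, identical, logical(1), keys[[1]])))

  # Drop the key from each component and prefix the remaining columns
  # with the list element's name, e.g. "A_B.Age".
  parts <- lapply(names(lst), function(nm) {
    d <- lst[[nm]][setdiff(names(lst[[nm]]), key)]
    names(d) <- paste(nm, names(d), sep = ".")
    d
  })

  # Put one copy of the key column first, then bind everything column-wise.
  do.call(cbind, c(list(lst[[1]][key]), parts))
}

# Small made-up input for illustration only.
lst <- list(
  A_B = data.frame(ID = c("ID1", "ID2"), Age = c(0.1, 0.2), Points = c(1, 2),
                   stringsAsFactors = FALSE),
  A_C = data.frame(ID = c("ID1", "ID2"), Age = c(0.3, 0.4), Points = c(3, 4),
                   stringsAsFactors = FALSE)
)
wide <- list_to_wide(lst)
names(wide)
```

With the question's data, list_to_wide(listFormat) should give the same seven columns as DF above. If the rows are not guaranteed to be in the same order across components, a column-wise bind is not safe and a join on ID (for example with merge) would be needed instead.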