Reshaping long data to wide and computing a sum in R


Problem description

How do I reshape long data to wide format?

library(data.table)

x <- c('x1', 'x1', 'x2', 'x2')
y <- c('y1', 'y1', 'y2', 'y2')
z <- c('a', 'b', 'a', 'b')
n <- c(3, 5, 7, 2)
df1 <- data.table(x, y, z, n)
df1
    x  y z n
1: x1 y1 a 3
2: x1 y1 b 5
3: x2 y2 a 7
4: x2 y2 b 2

I want output in wide format like the one below: group by the x and y columns, spread the z values across columns, and compute the sum of the n column.

    x  y z n z.1 z.2
1: x1 y1 a 8 a   b
2: x2 y2 b 9 a   b

I tried reshape and dcast, but neither got me the result I want:

dcast(df1, x ~ y, value.var="value")
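
The call above cannot work as written: df1 has no column named value, and the formula x ~ y never mentions z. A minimal sketch of a dcast call that does spread z across columns, assuming data.table is installed (the names wide and total are made up for illustration):

```r
library(data.table)

df1 <- data.table(
  x = c("x1", "x1", "x2", "x2"),
  y = c("y1", "y1", "y2", "y2"),
  z = c("a", "b", "a", "b"),
  n = c(3, 5, 7, 2)
)

# One row per (x, y) pair; one column per distinct z value, filled from n.
wide <- dcast(df1, x + y ~ z, value.var = "n")

# Row-wise total over the two columns created from z.
# 'total' is an illustrative name, not part of the original question.
wide[, total := a + b]
wide
```

Here the left side of the formula (x + y) names the grouping columns, the right side (z) names the column whose values become new column headers, and value.var names the column that fills the cells.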

Recommended answer

I'm not clear on why you need z, z.1, and z.2 in the output table. What information do they give you in the desired output above?

My solution below should help. It also keeps the z values, so you can see which n value belongs to a and which to b.

df1 <- data.table(x, y, z, n)
# Create an integer id from z (a = 1, b = 2) so the z information is kept after reshaping
df1[, id := as.integer(as.factor(z))]

   x  y z n id
1: x1 y1 a 3  1
2: x1 y1 b 5  2
3: x2 y2 a 7  1
4: x2 y2 b 2  2
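
The id above is derived from the values of z, which works because every (x, y) group contains the same z values. If that did not hold, one alternative is to number rows by their position within each group instead. A minimal sketch using data.table's rowid(), offered as an assumption about what might be wanted, not as part of the original answer:

```r
library(data.table)

df1 <- data.table(
  x = c("x1", "x1", "x2", "x2"),
  y = c("y1", "y1", "y2", "y2"),
  z = c("a", "b", "a", "b"),
  n = c(3, 5, 7, 2)
)

# Number the rows 1, 2, ... within each (x, y) group, regardless of z's value.
df1[, id := rowid(x, y)]
df1
```

For this sample data both approaches give the same id column (1, 2, 1, 2).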

# Reshape to wide format, using id as the time variable
dt <- reshape(df1, timevar = "id", idvar = c("x", "y"), direction = "wide")

    x  y z.1 n.1 z.2 n.2
1: x1 y1   a   3   b   5
2: x2 y2   a   7   b   2

# Finally, sum the n.* columns row by row
dt[, Sum := rowSums(.SD, na.rm = TRUE), .SDcols = grep("^n\\.", names(dt))]
dt
    x  y z.1 n.1 z.2 n.2 Sum
1: x1 y1   a   3   b   5   8
2: x2 y2   a   7   b   2   9
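
If the per-z columns turn out not to be needed and only the total per (x, y) group matters, a grouped aggregation may be enough. A minimal sketch under that assumption (the names totals and z_values are made up for illustration):

```r
library(data.table)

df1 <- data.table(
  x = c("x1", "x1", "x2", "x2"),
  y = c("y1", "y1", "y2", "y2"),
  z = c("a", "b", "a", "b"),
  n = c(3, 5, 7, 2)
)

# Sum n within each (x, y) group and record which z values contributed.
totals <- df1[, .(n = sum(n), z_values = paste(z, collapse = ",")), by = .(x, y)]
totals
```

This skips the reshape step entirely, at the cost of collapsing the z values into a single string column.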


08-13 17:12