Alternatives to stats::reshape

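Question

The melt/cast functions in the reshape package are great, but I'm not sure whether there is a simple way to apply them when the measured variables are of different types. For example, here is a snippet of data in which each MD provides the gender and weight of three patients: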

ID PT1 WT1 PT2 WT2 PT3 WT3
1  "M" 170 "M" 175 "F" 145
...

The goal is to reshape the data so that each row is one patient:

ID PTNUM GENDER WEIGHT
1    1     "M"    170
1    2     "M"    175
1    3     "F"    145
...

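Using the reshape function in the stats package is one option I know of, but I'm posting here in the hope that R users more experienced than I am will suggest other, better methods. Many thanks!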

--

@Vincent Zoonekynd:

I liked your example a lot, so I generalized it to multiple variables.

# Sample data
n <- 5
d <- data.frame(
  id = 1:n,
  p1 = sample(c("M","F"),n,replace=TRUE),
  q1 = sample(c("Alpha","Beta"),n,replace=TRUE),
  w1 = round(runif(n,100,200)),
  y1 = round(runif(n,100,200)),
  p2 = sample(c("M","F"),n,replace=TRUE),
  q2 = sample(c("Alpha","Beta"),n,replace=TRUE),
  w2 = round(runif(n,100,200)),
  y2 = round(runif(n,100,200)),
  p3 = sample(c("M","F"),n,replace=TRUE),
  q3 = sample(c("Alpha","Beta"),n,replace=TRUE),
  w3 = round(runif(n,100,200)),
  y3 = round(runif(n,100,200))
  )
# Reshape the data.frame, one variable type at a time (character columns, then numeric columns)
library(reshape)
d1 <- melt(d, id.vars="id", measure.vars=c("p1","p2","p3","q1","q2","q3"))
d2 <- melt(d, id.vars="id", measure.vars=c("w1","w2","w3","y1","y2","y3"))
d1 = cbind(d1,colsplit(d1$variable,names=c("var","ptnum")))
d2 = cbind(d2,colsplit(d2$variable,names=c("var","ptnum")))
d1$variable = NULL
d2$variable = NULL
d1c = cast(d1,...~var)
d2c = cast(d2,...~var)
# Join the two data.frames
d3 = merge(d1c, d2c, by=c("id","ptnum"), all=TRUE)

--

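Final thoughts: my motivation for asking this question was to learn about alternatives to the reshape package other than the stats::reshape function. For the moment, I have reached the following conclusions: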

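  • If you can, stick with stats::reshape. As long as you remember to use a list rather than a simple vector for the "varying" argument, you will stay out of trouble. For smaller data sets (this time I was dealing with a few thousand patient cases and fewer than 200 variables in total), the function's lower speed is a fair price for the simplicity of the code.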

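  • To use the cast/melt approach in Hadley Wickham's reshape (or reshape2) package, you have to split your variables into two sets, one consisting of the numeric variables and the other of the character variables. Once your data set is large enough that stats::reshape becomes unbearably slow, I imagine the extra step of splitting the variables into two sets won't look so bad.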
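
For reference, here is a minimal sketch of the first point: stats::reshape with a list for "varying". The sample data and the column names (pt1, wt1, ..., gender, weight, ptnum) are made up to mirror the question and are not taken from any real data set. Each element of the list names the wide columns that belong to one long-format variable, so the character and numeric columns can be reshaped in a single call and keep their types.

```r
# Hypothetical sample data mirroring the question:
# one row per MD, with gender (pt*) and weight (wt*) for three patients
d <- data.frame(
  id  = 1:2,
  pt1 = c("M", "F"), wt1 = c(170, 150),
  pt2 = c("M", "F"), wt2 = c(175, 160),
  pt3 = c("F", "M"), wt3 = c(145, 180),
  stringsAsFactors = FALSE
)

# "varying" is a list with one character vector per long-format variable;
# each vector lists that variable's wide columns in patient order
long <- reshape(
  d,
  direction = "long",
  idvar     = "id",
  varying   = list(c("pt1", "pt2", "pt3"), c("wt1", "wt2", "wt3")),
  v.names   = c("gender", "weight"),
  timevar   = "ptnum",
  times     = 1:3
)

# Sort so that each MD's patients appear together, and drop the
# composite row names that reshape generates
long <- long[order(long$id, long$ptnum), ]
rownames(long) <- NULL
long
```

As far as I understand the documentation, passing "varying" as a plain vector makes reshape work out the grouping from the order or the names of the columns, which is easy to get wrong. Spelling the groups out as a list avoids that.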

Recommended answer

You can process each variable separately and then join the two resulting data.frames.

# Sample data
n <- 5
d <- data.frame(
  id = 1:n,
  pt1 = sample(c("M","F"),n,replace=TRUE),
  wt1 = round(runif(n,100,200)),
  pt2 = sample(c("M","F"),n,replace=TRUE),
  wt2 = round(runif(n,100,200)),
  pt3 = sample(c("M","F"),n,replace=TRUE),
  wt3 = round(runif(n,100,200))
)
# Reshape the data.frame, one variable at a time
library(reshape2)
d1 <- melt(d, 
  id.vars="id", measure.vars=c("pt1","pt2","pt3"), 
  variable.name="patient", value.name="gender"
)
d2 <- melt(d, 
  id.vars="id", measure.vars=c("wt1","wt2","wt3"), 
  variable.name="patient", value.name="weight"
)
d1$patient <- as.numeric(gsub("pt", "", d1$patient))
d2$patient <- as.numeric(gsub("wt", "", d2$patient))
# Join the two data.frames
merge(d1, d2, by=c("id","patient"), all=TRUE)

