Goal

My goal is to make a multi-level Sankey diagram in R using the googleVis package. The output should look similar to the following:

[Image: example of the desired multi-level Sankey diagram]

Data

I created some dummy data in R:

set.seed(1)

source <- sample(c("North","South","East","West"),100,replace=T)
mid <- sample(c("North ","South ","East ","West "),100,replace=T)
destination <- sample(c("North","South","East","West"),100,replace=T) # N.B. It is important to have a space after the second set of destinations to avoid a cycle
dummy <- rep(1,100) # For aggregation

dat <- data.frame(source,mid,destination,dummy)
aggdat <- aggregate(dummy~source+mid+destination,dat,sum)

What I have tried so far

If I only have a source and a destination, with no midpoint, I can build a Sankey from two variables:
aggdat <- aggregate(dummy~source+destination,dat,sum)

library(googleVis)

p <- gvisSankey(aggdat,from="source",to="destination",weight="dummy")
plot(p)

This code produces the following result:

[Image: two-level Sankey diagram from source to destination]

Question

How can I modify
p <- gvisSankey(aggdat,from="source",to="destination",weight="dummy")

so that it also accepts the mid variable?

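Best answer

gvisSankey does handle intermediate levels, but it has no separate argument for them: the levels must be encoded in the underlying data as additional from/to rows. Start by giving each stage its own set of labels: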

 source <- sample(c("NorthSrc", "SouthSrc", "EastSrc", "WestSrc"), 100, replace=TRUE)
 mid <- sample(c("NorthMid", "SouthMid", "EastMid", "WestMid"), 100, replace=TRUE)
 destination <- sample(c("NorthDes", "SouthDes", "EastDes", "WestDes"), 100, replace=TRUE)
 dummy <- rep(1,100) # For aggregation
 dat <- data.frame(source, mid, destination, dummy) # Rebuild dat from the relabelled columns
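
The stage-specific suffixes (Src, Mid, Des) matter because the Sankey chart identifies nodes by label: reusing "North" in two stages merges them into one node, and a chart whose links form a cycle does not render. If the raw columns share labels, one option is to tag each value with the name of its stage. The following is a minimal sketch in base R; the helper name tag_stage and the suffix format are illustrative assumptions, not part of googleVis.

```r
# Hypothetical helper: append a stage tag so labels are unique per column.
tag_stage <- function(x, stage) paste0(x, " (", stage, ")")

set.seed(1)
regions <- c("North", "South", "East", "West")

# Raw data in which all three stages share the same four labels.
raw <- data.frame(
  source      = sample(regions, 100, replace = TRUE),
  mid         = sample(regions, 100, replace = TRUE),
  destination = sample(regions, 100, replace = TRUE),
  dummy       = 1,
  stringsAsFactors = FALSE
)

# Tag every stage column with its own column name.
stages <- c("source", "mid", "destination")
tagged <- raw
for (s in stages) tagged[[s]] <- tag_stage(raw[[s]], s)

# Collect the distinct labels of each stage and confirm none is shared.
all_labels <- unlist(lapply(tagged[stages], unique), use.names = FALSE)
stopifnot(anyDuplicated(all_labels) == 0)
```

The final check fails loudly if any label still appears in more than one stage, which is cheaper to debug than a blank chart.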

Now we reshape the original data:
 library(dplyr)

 datSM <- dat %>%
  group_by(source, mid) %>%
  summarise(toMid = sum(dummy) ) %>%
  ungroup()

The data frame datSM summarises the number of units going from source to mid.
  datMD <- dat %>%
   group_by(mid, destination) %>%
   summarise(toDes = sum(dummy) ) %>%
   ungroup()

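The data frame datMD summarises the number of units going from mid to destination. It will be appended to datSM to form the final data frame. Both data frames must be ungrouped and have the same column names: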
  colnames(datSM) <- colnames(datMD) <- c("From", "To", "Dummy")

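Once datMD is appended after datSM, gvisSankey recognises the intermediate step automatically, because the mid labels appear in the To column of the datSM rows and in the From column of the datMD rows.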
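  library(googleVis)

  datVis <- rbind(datSM, datMD)

  # The weight column was renamed to "Dummy" above, so the name must match exactly
  p <- gvisSankey(datVis, from="From", to="To", weight="Dummy")
  plot(p)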

Here is the plot:
[Image: three-level Sankey diagram from source through mid to destination]
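
The same stacking idea extends to any number of stages: aggregate each pair of adjacent columns and bind the results into one From/To table. The following is a sketch in base R under that assumption; the function name make_sankey_links and the output column names are hypothetical choices made for illustration.

```r
# Hypothetical helper: turn ordered stage columns into one From/To/Weight table.
make_sankey_links <- function(dat, stages, weight) {
  stopifnot(length(stages) >= 2, all(c(stages, weight) %in% names(dat)))
  pairs <- lapply(seq_len(length(stages) - 1), function(i) {
    # Sum the weight for every observed (stage i, stage i + 1) combination.
    agg <- aggregate(dat[[weight]],
                     by = list(From = dat[[stages[i]]],
                               To   = dat[[stages[i + 1]]]),
                     FUN = sum)
    names(agg)[3] <- "Weight"
    agg
  })
  # Stack the per-transition tables in stage order.
  do.call(rbind, pairs)
}

set.seed(1)
dat <- data.frame(
  source      = sample(c("NorthSrc", "SouthSrc", "EastSrc", "WestSrc"), 100, replace = TRUE),
  mid         = sample(c("NorthMid", "SouthMid", "EastMid", "WestMid"), 100, replace = TRUE),
  destination = sample(c("NorthDes", "SouthDes", "EastDes", "WestDes"), 100, replace = TRUE),
  dummy       = 1
)

links <- make_sankey_links(dat, c("source", "mid", "destination"), "dummy")

# Each of the two transitions carries all 100 units, so the weights total 200.
stopifnot(sum(links$Weight) == 200)

# Plotting needs googleVis and a browser, so it is left commented out here.
# library(googleVis)
# plot(gvisSankey(links, from = "From", to = "To", weight = "Weight"))
```

Using aggregate keeps the sketch free of extra dependencies; the dplyr pipeline shown in the answer yields the same link rows, possibly in a different order.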

Regarding "r - How to make a googleVis multi-level Sankey from a data.frame?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45510421/
