Problem description
I am aiming to make a multi-level Sankey diagram in R using the googleVis package. The output should look similar to this:
I've created some dummy data in R:
set.seed(1)
source <- sample(c("North","South","East","West"),100,replace=T)
mid <- sample(c("North ","South ","East ","West "),100,replace=T)
destination <- sample(c("North  ","South  ","East  ","West  "),100,replace=T) # N.B. The trailing spaces keep the labels of each level distinct, which is important to avoid a cycle
dummy <- rep(1,100) # For aggregation
dat <- data.frame(source,mid,destination,dummy)
aggdat <- aggregate(dummy~source+mid+destination,dat,sum)
What I've tried so far
I can build a Sankey with two variables fine if I have just a source and a destination, but not with a middle point:
aggdat <- aggregate(dummy~source+destination,dat,sum)
library(googleVis)
p <- gvisSankey(aggdat,from="source",to="destination",weight="dummy")
plot(p)
The code produces the following result:
How can I modify
p <- gvisSankey(aggdat,from="source",to="destination",weight="dummy")
so that it also accepts the mid variable?
Recommended answer
The function gvisSankey does accept mid-levels directly, but these levels have to be encoded in the underlying data. Giving each level its own set of labels keeps the nodes distinct:
source <- sample(c("NorthSrc", "SouthSrc", "EastSrc", "WestSrc"), 100, replace=TRUE)
mid <- sample(c("NorthMid", "SouthMid", "EastMid", "WestMid"), 100, replace=TRUE)
destination <- sample(c("NorthDes", "SouthDes", "EastDes", "WestDes"), 100, replace=TRUE)
dummy <- rep(1,100) # For aggregation
dat <- data.frame(source, mid, destination, dummy) # Rebuild dat with the relabelled levels
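Before reshaping, it can help to confirm that no label is shared between levels, since a shared label would be treated as the same node. The following is a minimal sketch; `shared_labels` is a hypothetical helper written for this illustration, not part of googleVis.

```r
# Labels taken from the three levels used in the answer.
src_labels <- c("NorthSrc", "SouthSrc", "EastSrc", "WestSrc")
mid_labels <- c("NorthMid", "SouthMid", "EastMid", "WestMid")
des_labels <- c("NorthDes", "SouthDes", "EastDes", "WestDes")

# Hypothetical helper: returns the labels that occur in more than one level.
shared_labels <- function(...) {
  all_labels <- unlist(lapply(list(...), unique))
  unique(all_labels[duplicated(all_labels)])
}

# No overlap between the relabelled levels.
ok <- shared_labels(src_labels, mid_labels, des_labels)

# Reusing the same names for source and destination produces an overlap.
clash <- shared_labels(c("North", "South"), c("North ", "South "), c("North", "South"))
```

Here `ok` is empty, while `clash` contains the labels that would end up as a single node at two different levels.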
Now we'll reshape the original data:
library(dplyr)
datSM <- dat %>%
group_by(source, mid) %>%
summarise(toMid = sum(dummy) ) %>%
ungroup()
Data frame datSM summarises the number of units flowing from source to mid.
datMD <- dat %>%
group_by(mid, destination) %>%
summarise(toDes = sum(dummy) ) %>%
ungroup()
Data frame datMD summarises the number of units flowing from mid to destination. This data frame will be appended to the final data frame. Both data frames need to be ungrouped and have the same column names.
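The requirement for matching column names comes from how rbind() stacks data frames: columns are matched by name. A small sketch with hypothetical one-row tables (not the data above) shows the failure and the fix:

```r
# Hypothetical toy link tables with the column names produced by the summaries.
a <- data.frame(source = "NorthSrc", mid = "NorthMid", toMid = 3)
b <- data.frame(mid = "NorthMid", destination = "NorthDes", toDes = 2)

# rbind() matches data frame columns by name, so differing names raise an error.
mismatch <- tryCatch(rbind(a, b), error = function(e) conditionMessage(e))

# After renaming, the two tables stack into a single link list.
names(a) <- names(b) <- c("From", "To", "Dummy")
links <- rbind(a, b)
```

Here `mismatch` holds the error message from the first attempt, and `links` is the combined two-row table.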
colnames(datSM) <- colnames(datMD) <- c("From", "To", "Dummy")
With datMD appended after datSM, the mid labels appear both as targets and as sources, so gvisSankey will recognise the middle step automatically.
datVis <- rbind(datSM, datMD)
p <- gvisSankey(datVis, from="From", to="To", weight="Dummy") # the column was renamed to "Dummy" above
plot(p)
Here is the plot:
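The same idea extends to any number of levels: aggregate each pair of consecutive levels and stack the results. The sketch below uses base R only; `make_sankey_links`, the `toy` data and the `Weight` column name are assumptions made for this illustration and are not part of googleVis.

```r
# Hypothetical helper: builds a From/To/Weight link table for each pair of
# consecutive stage columns and stacks the results.
make_sankey_links <- function(data, stages, weight) {
  pairs <- Map(function(from, to) {
    agg <- aggregate(data[[weight]],
                     by = list(From = data[[from]], To = data[[to]]),
                     FUN = sum)
    names(agg)[3] <- "Weight"
    agg
  }, head(stages, -1), tail(stages, -1))
  do.call(rbind, unname(pairs))
}

# Toy data with distinct labels per level, as in the answer.
set.seed(1)
toy <- data.frame(
  source = sample(c("NorthSrc", "SouthSrc"), 20, replace = TRUE),
  mid = sample(c("NorthMid", "SouthMid"), 20, replace = TRUE),
  destination = sample(c("NorthDes", "SouthDes"), 20, replace = TRUE),
  dummy = 1
)

links <- make_sankey_links(toy, c("source", "mid", "destination"), "dummy")
# The result could then be plotted with, for example:
# plot(gvisSankey(links, from = "From", to = "To", weight = "Weight"))
```

Each mid label appears in both the From and To columns of `links`, which is what places it in the middle of the diagram.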