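I am trying to build a stacked bar chart with variable bar widths, so that the width represents the mean amount of the allocations and the height represents the number of allocations.

Below is my reproducible data: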

procedure = c("method1","method2", "method3", "method4","method1","method2", "method3", "method4","method1","method2", "method3","method4")
sector =c("construction","construction","construction","construction","delivery","delivery","delivery","delivery","service","service","service","service")
number = c(100,20,10,80,75,80,50,20,20,25,10,4)
amount_mean = c(1,1.2,0.2,0.5,1.3,0.8,1.5,1,0.8,0.6,0.2,0.9)

data0 = data.frame(procedure, sector, number, amount_mean)

When I use geom_bar and include width inside aes, as in the code below, I get an error message:


bar<-ggplot(data=data0,aes(x=sector,y=number,fill=procedure, width = amount_mean)) +
geom_bar(stat="identity")

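I also looked at the mekko package, but it seems to work only for bar charts.

This is what I would like to end up with (not based on the data above):

Any idea how I can solve my problem?

Best answer

I tried the same thing with geom_col() and ran into the same problem: with position = "stack", it seems we cannot assign the width parameter without breaking the stacking.

It turns out the solution is quite simple, though: we can use geom_rect() to build this kind of plot by hand.

Here is your data: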

df = data.frame(
  procedure   = rep(paste0("method", 1:4), times = 3),  # paste0 gives "method1", matching the question's data
  sector      = rep(c("construction", "delivery", "service"), each = 4),
  amount      = c(100, 20, 10, 80, 75, 80, 50, 20, 20, 25, 10, 4),
  amount_mean = c(1, 1.2, 0.2, 0.5, 1.3, 0.8, 1.5, 1, 0.8, 0.6, 0.2, 0.9)
)

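First, I transformed your dataset:
library(dplyr)

df <- df %>%
  # scale widths into (0, 1]; factor() is needed because as.numeric()
  # on a plain character column returns NA (the default since R 4.0)
  mutate(amount_mean = amount_mean/max(amount_mean),
         sector_num = as.numeric(factor(sector))) %>%
  arrange(desc(amount_mean)) %>%
  group_by(sector) %>%
  mutate(
    xmin = sector_num - amount_mean / 2,
    xmax = sector_num + amount_mean / 2,
    ymin = cumsum(lag(amount, default = 0)),  # top edge of the previous rectangle
    ymax = cumsum(amount)) %>%
  ungroup()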

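What I did here:
  • I scaled amount_mean down so that 0 <= amount_mean <= 1 (this plots better, and we have no other scale on which to show the actual values of amount_mean anyway);
  • I also encoded the sector variable as numeric (needed for plotting, see below);
  • I sorted the dataset by amount_mean in descending order (large means at the bottom, small means at the top);
  • Grouping by sector, I computed xmin and xmax to represent amount_mean, and ymin and ymax to represent amount. The last two are a bit tricky. ymax is straightforward: it is the cumulative sum of amount, starting from the first row. ymin also needs a cumulative sum, but one that starts from 0. So the first rectangle is drawn with ymin = 0, the second with ymin equal to the ymax of the previous rectangle, and so on. All of this is done separately within each sector group.

  • Plot the data: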
    df %>%
      ggplot(aes(xmin = xmin, xmax = xmax,
                 ymin = ymin, ymax = ymax,
                 fill = procedure
                 )
             ) +
      geom_rect() +
      scale_x_continuous(breaks = df$sector_num, labels = df$sector) +
      #ggthemes::theme_tufte() +
      theme_bw() +
      labs(title = "Question 51136471", x = "Sector", y = "Amount") +
      theme(
        axis.ticks.x = element_blank()
        )
    

    Result:

    [Image: stacked bar chart with variable bar widths in ggplot]
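The ymin/ymax step described in the list above can be checked on a single sector in isolation. The following is a minimal base-R sketch, not the answer's own code: it uses only the "construction" rows from the question's data, and it emulates dplyr's lag(amount, default = 0) with c(0, head(x, -1)), which is my assumption of an equivalent base-R idiom.

```r
# "construction" rows from the question's data (method1..method4)
amount      <- c(100, 20, 10, 80)
amount_mean <- c(1, 1.2, 0.2, 0.5)

# Sort by descending amount_mean, as arrange(desc(amount_mean)) does
ord           <- order(amount_mean, decreasing = TRUE)
amount_sorted <- amount[ord]

# Base-R stand-in for dplyr::lag(amount, default = 0): shift down by one, pad with 0
lagged <- c(0, head(amount_sorted, -1))

ymin <- cumsum(lagged)         # bottom edge of each rectangle
ymax <- cumsum(amount_sorted)  # top edge of each rectangle

print(data.frame(amount = amount_sorted, ymin, ymax))

# Every rectangle starts exactly where the previous one ends
stopifnot(all(ymin[-1] == ymax[-length(ymax)]))
```

Because each ymin equals the previous ymax, the rectangles stack without gaps or overlaps, which is what position = "stack" would otherwise do for us.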

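    Another option is to prevent the procedure variable from being reordered, so that, say, all the "red" segments sit at the bottom, the "green" ones above them, and so on. It looks ugly, though: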
    df <- df %>%
      mutate(amount_mean = amount_mean/max(amount_mean),
             sector_num = as.numeric(factor(sector))) %>%  # factor() so character sectors map to 1, 2, 3
      arrange(procedure, desc(amount), desc(amount_mean)) %>%
      group_by(sector) %>%
      mutate(
        xmin = sector_num - amount_mean / 2,
        xmax = sector_num + amount_mean /2,
        ymin = cumsum(lag(amount, default = 0)),
        ymax = cumsum(amount)
        ) %>%
      ungroup()
    

    [Image: stacked bar chart with variable bar widths, procedure order preserved]
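Both variants depend on mapping sector to numeric x positions and then labelling those positions on a continuous axis. The sketch below, in base R, shows one way that mapping could be made explicit; the names sector_f, x_breaks and x_labels are hypothetical and do not appear in the answer. It assumes the sector column is a character vector, as data.frame() produces by default since R 4.0.

```r
# Sector column as built in the answer's data frame
sector <- rep(c("construction", "delivery", "service"), each = 4)

# Convert to a factor first: as.numeric() on a plain character vector
# would return NA, whereas a factor yields its integer level codes
sector_f   <- factor(sector)
sector_num <- as.numeric(sector_f)

# One break and one label per sector, instead of one per row
x_breaks <- seq_along(levels(sector_f))
x_labels <- levels(sector_f)

print(data.frame(x_breaks, x_labels))
```

These two vectors could then be passed as scale_x_continuous(breaks = x_breaks, labels = x_labels), which avoids handing the scale repeated break values.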
