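I have a problem. I want to plot all four variables in RStudio. It seems I have three variables in two groups plus a Count, but I don't know how to plot this with ggplot2. The x-axis should show age_band and sex. The y-axis should show the counts of admitted and not-admitted patients. I would like the legend to sit below the overlaid bar chart. Because the analysis and the data are confidential, I have added a hand-drawn picture below. Can anyone help? I searched Stack Overflow and could not find good reproducible code.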

Here are the two data layouts I obtained after some data manipulation.

First data layout:

 # my_data1 is a name chosen here for reference; the original post does not name this object
 my_data1 <- structure(list(age_band = c("0 yrs", "0 yrs", "0 yrs", "0 yrs",
                       "1-4 yrs", "1-4 yrs", "1-4 yrs", "1-4 yrs",
                       "10-14 yrs", "10-14 yrs", "10-14 yrs", "10-14 yrs",
                       "15-19 yrs", "15-19 yrs", "15-19 yrs", "15-19 yrs"),
            sex = c("Female", "Female", "Male", "Male", "Female",
                    "Female", "Male", "Male", "Female", "Female",
                    "Male", "Male", "Female", "Female", "Male", "Male"),
            patient.class = c("Not Admitted", "ORDINARY ADMISSION",
                              "Not Admitted", "ORDINARY ADMISSION",
                              "Not Admitted", "ORDINARY ADMISSION",
                              "Not Admitted", "ORDINARY ADMISSION",
                              "Not Admitted", "ORDINARY ADMISSION",
                              "Not Admitted", "ORDINARY ADMISSION",
                              "Not Admitted", "ORDINARY ADMISSION",
                              "Not Admitted", "ORDINARY ADMISSION"),
            Count = c(5681L, 1458L, 7667L, 2154L, 8040L, 2481L, 11737L,
                      3601L, 2904L, 938L, 3883L, 1233L, 3251L, 1266L,
                      2465L, 1031L)),
            row.names = c(NA, -16L),
            class = c("tbl_df", "tbl", "data.frame"))


Second data layout:

   # Named my_data2 to match the answer's code. The grouping attributes
   # (vars, indices, ...) are kept as dput() produced them in the original post.
   my_data2 <- structure(list(age_band = c("0 yrs", "0 yrs", "0 yrs", "0 yrs",
                               "1-4 yrs", "1-4 yrs", "1-4 yrs", "1-4 yrs",
                               "10-14 yrs", "10-14 yrs",
                               "10-14 yrs", "10-14 yrs", "15-19 yrs",
                               "15-19 yrs", "15-19 yrs", "15-19 yrs"),
         sex_patient_class = c("female_admitted", "female_not_admitted",
                               "male_admitted", "male_not_admitted",
                               "female_admitted", "female_not_admitted",
                               "male_admitted", "male_not_admitted",
                               "female_admitted", "female_not_admitted",
                               "male_admitted", "male_not_admitted",
                               "female_admitted", "female_not_admitted",
                               "male_admitted", "male_not_admitted"),
         Count = c(1458L, 5681L, 2154L, 7667L, 2481L, 8040L, 3601L, 11737L,
                   938L, 2904L, 1233L, 3883L, 1266L, 3251L, 1031L, 2465L)),
         row.names = c(NA, -16L),
         class = c("grouped_df", "tbl_df", "tbl", "data.frame"),
         vars = "age_band", drop = TRUE,
         indices = list(0:3, 4:7, 8:11, 12:15),
         group_sizes = c(4L, 4L, 4L, 4L), biggest_group_size = 4L,
         labels = structure(list(age_band = c("0 yrs", "1-4 yrs",
                                              "10-14 yrs", "15-19 yrs")),
                            row.names = c(NA, -4L), class = "data.frame",
                            vars = "age_band", drop = TRUE))
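
The post does not show how the second layout was produced from the first. As a rough sketch of how the two relate, the combined sex_patient_class column can be built from sex and patient.class with base R alone. The names first_layout, status and second_layout are hypothetical and chosen only for this illustration; the values are copied from the data above.

```r
# First layout rebuilt as a plain data frame (same values as in the question)
first_layout <- data.frame(
  age_band = rep(c("0 yrs", "1-4 yrs", "10-14 yrs", "15-19 yrs"), each = 4),
  sex = rep(rep(c("Female", "Male"), each = 2), times = 4),
  patient.class = rep(c("Not Admitted", "ORDINARY ADMISSION"), times = 8),
  Count = c(5681L, 1458L, 7667L, 2154L, 8040L, 2481L, 11737L, 3601L,
            2904L, 938L, 3883L, 1233L, 3251L, 1266L, 2465L, 1031L),
  stringsAsFactors = FALSE
)

# Collapse the admission class to a short label
status <- ifelse(first_layout$patient.class == "Not Admitted",
                 "not_admitted", "admitted")

# Combine sex and admission status into one grouping column
second_layout <- data.frame(
  age_band = first_layout$age_band,
  sex_patient_class = paste(tolower(first_layout$sex), status, sep = "_"),
  Count = first_layout$Count,
  stringsAsFactors = FALSE
)

head(second_layout, 4)
```

The rows come out in the order of the first layout, so they are not sorted the same way as the second layout shown above; the plotting code does not depend on row order.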

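Best answer

To overlay the columns for admitted patients on those for not-admitted patients, filter the data twice, once for each set of columns. I define the aesthetics once in the initial ggplot() call so that both layers share a common fill mapping.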

library(tidyverse)

ggplot(my_data2, aes(age_band, Count, fill = sex_patient_class)) +
  geom_col(data = filter(my_data2, sex_patient_class %in% c("male_not_admitted", "female_not_admitted")),
           position = position_dodge()) +
  geom_col(data = filter(my_data2, sex_patient_class %in% c("male_admitted", "female_admitted")),
           position = position_dodge(0.9), width = 0.5) +
  scale_fill_manual(name = "",
                    breaks = c("male_admitted", "male_not_admitted",
                               "female_admitted", "female_not_admitted"),
                    labels = c("Male Admitted", "Male Not admitted",
                               "Female Admitted", "Female Not admitted"),
                    values = c("grey80", "black", "red", "orange"))
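
The question also asks for the legend to sit below the bars, which the code above does not cover. A minimal sketch, assuming that ggplot2's theme(legend.position = "bottom") gives the placement the asker had in mind: plot_data, is_admitted and p_legend are hypothetical names, and plot_data simply rebuilds the second data layout as a plain data frame so the block runs on its own.

```r
library(ggplot2)

# Second data layout rebuilt as a plain data frame (same values as my_data2)
plot_data <- data.frame(
  age_band = rep(c("0 yrs", "1-4 yrs", "10-14 yrs", "15-19 yrs"), each = 4),
  sex_patient_class = rep(c("female_admitted", "female_not_admitted",
                            "male_admitted", "male_not_admitted"), times = 4),
  Count = c(1458L, 5681L, 2154L, 7667L, 2481L, 8040L, 3601L, 11737L,
            938L, 2904L, 1233L, 3883L, 1266L, 3251L, 1031L, 2465L),
  stringsAsFactors = FALSE
)

# TRUE for the rows that belong to the narrow "front" columns
is_admitted <- plot_data$sex_patient_class %in%
  c("male_admitted", "female_admitted")

p_legend <- ggplot(plot_data, aes(age_band, Count, fill = sex_patient_class)) +
  # wide "back" columns: not admitted
  geom_col(data = plot_data[!is_admitted, ], position = position_dodge(0.9)) +
  # narrow "front" columns: admitted, drawn on top
  geom_col(data = plot_data[is_admitted, ], position = position_dodge(0.9),
           width = 0.5) +
  # move the legend from the right-hand side to below the panel
  theme(legend.position = "bottom")

p_legend
```

The same theme() call can be appended to any of the plots in this answer; the colour scale is left at its default here to keep the sketch short.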



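Detailed explanation

The actual overlay happens in the two geom_col() calls. The order of the calls matters because the second layer is drawn on top of the first. So we start with the "back" columns:

With filter() we select only the not-admitted patients and use them as the data for geom_col(). We do not need to repeat the aesthetics from the initial ggplot() call, because each layer inherits them unless told otherwise. position_dodge() places the columns side by side within each age band.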

p <- ggplot(my_data2, aes(age_band, Count, fill = sex_patient_class)) +
  geom_col(data = filter(my_data2, sex_patient_class %in% c("male_not_admitted", "female_not_admitted")),
           position = position_dodge())
p



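Now, to add the other columns on top, we change the filter statement to select the admitted patients. Because we want the "front" columns to be narrower than the "back" columns, we specify width = 0.5.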

p + geom_col(data = filter(my_data2, sex_patient_class %in% c("male_admitted", "female_admitted")),
             position = position_dodge(), width = 0.5)



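Now we are almost done. To move the "front" columns to the centre of the "back" columns, we need to specify the dodge width in position_dodge(). In this case a value of 0.9 centres them: the back columns keep the default column width of 0.9, whereas the narrower front layer would otherwise be dodged according to its own width of 0.5. To be on the safe side (that is, to make sure the front columns really sit at the centre of the back columns), specify the same dodge width in both geom_col() calls. Then we change the not-so-pretty colours (here to the ColorBrewer palette "Paired") and the legend information, and we are done: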

p + geom_col(data = filter(my_data2, sex_patient_class %in% c("male_admitted", "female_admitted")),
             position = position_dodge(0.9), width = 0.5) +
  scale_fill_brewer(name = "",
                    breaks = c("male_admitted", "male_not_admitted",
                               "female_admitted", "female_not_admitted"),
                    labels = c("Male Admitted", "Male Not admitted",
                               "Female Admitted", "Female Not admitted"),
                    palette = "Paired")
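
To see why a dodge width of 0.9 centres the front columns, the offsets can be worked out by hand. The sketch below is an assumption about how position_dodge() spaces bars (each of the n groups gets an equal share of the dodge width around the x position), not code taken from ggplot2; dodge_offsets is a hypothetical helper written only for this illustration.

```r
# Horizontal offsets of n dodged bars relative to their x position,
# assuming each bar gets an equal slice of the dodge width
dodge_offsets <- function(n, dodge_width) {
  dodge_width * ((seq_len(n) - 0.5) / n - 0.5)
}

# Back columns: two groups per age band, dodge width 0.9
back <- dodge_offsets(2, 0.9)

# Front columns if the dodge width follows their own width of 0.5
front_default <- dodge_offsets(2, 0.5)

# Front columns with the dodge width set explicitly to 0.9
front_fixed <- dodge_offsets(2, 0.9)

back           # -0.225  0.225
front_default  # -0.125  0.125
front_fixed    # -0.225  0.225
```

Under this assumption the front columns sit at plus or minus 0.125 when left to their own width, which is off-centre relative to the back columns at plus or minus 0.225. Using 0.9 for both layers makes the two sets of centres coincide.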



Regarding "r - Plotting 4 groups with ggplot2: female admitted and not admitted, male admitted and not admitted", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51875171/
