Problem description
The code below creates a pareto chart using base plotting functions in R. How do I create the same chart with ggplot?
* I do know that some people dislike plots with two y-axes. Please keep that debate out of this post. Thanks.
## Creating the d tribble
library(tidyverse)

d <- tribble(
  ~category,  ~defect,
  "price",    80,
  "schedule", 27,
  "supplier", 66,
  "contact",  94,
  "item",     33
)

## Creating new columns
d <- arrange(d, desc(defect)) %>%
  mutate(
    cumsum   = cumsum(defect),
    freq     = round(defect / sum(defect), 3),
    cum_freq = cumsum(freq)
  )

## Saving graphical parameters (writable ones only, so they restore without warnings)
def_par <- par(no.readonly = TRUE)

## New margins
par(mar = c(5, 5, 4, 5))

## Bar plot; pc will hold the x positions of the bars
pc <- barplot(d$defect,
  width = 1, space = 0.2, border = NA, axes = FALSE,
  ylim = c(0, 1.05 * max(d$cumsum, na.rm = TRUE)),
  ylab = "Cumulative Counts", cex.names = 0.7,
  names.arg = d$category,
  main = "Pareto Chart (version 1)")

## Cumulative counts line
lines(pc, d$cumsum, type = "b", cex = 0.7, pch = 19, col = "cyan4")

## Framing plot
box(col = "grey62")

## Adding axes
axis(side = 2, at = c(0, d$cumsum), las = 1, col.axis = "grey62", col = "grey62", cex.axis = 0.8)
axis(side = 4, at = c(0, d$cumsum), labels = paste0(c(0, round(d$cum_freq * 100)), "%"),
  las = 1, col.axis = "cyan4", col = "cyan4", cex.axis = 0.8)

## Restoring default parameters
par(def_par)

Recommended answer
Here's a start. I combined your dplyr functions into a single stream, just to avoid assigning and reassigning the variable d. I added a mutate call that makes category a factor, ordered based on corresponding values of defect, using fct_reorder from forcats (ships with tidyverse).
What I'm not sure about is how to get the left y-axis breaks. I set them manually by taking the unique values of d$cumsum, but there might be a way to write a function for it within the breaks argument in scale_y_continuous.
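One possible way to do that, as a minimal sketch: the breaks argument of scale_y_continuous also accepts a function that receives the axis limits and returns the break positions. The helper name cumsum_breaks below is hypothetical (it is not part of ggplot2), and the data frame is rebuilt with base R only to keep the example self-contained.

```r
library(ggplot2)

# Same data as in the question, already sorted by descending defect count
d <- data.frame(
  category = c("contact", "price", "supplier", "item", "schedule"),
  defect   = c(94, 80, 66, 33, 27)
)
d$cumsum <- cumsum(d$defect)

# Hypothetical helper (not a ggplot2 function): builds a function that
# ggplot2 can call with the axis limits. It returns 0 plus the cumulative
# counts, keeping only the values that fall inside those limits.
cumsum_breaks <- function(values) {
  function(limits) {
    brks <- sort(unique(c(0, values)))
    brks[brks >= limits[1] & brks <= limits[2]]
  }
}

p <- ggplot(d, aes(x = reorder(category, -defect))) +
  geom_col(aes(y = defect)) +
  geom_point(aes(y = cumsum)) +
  geom_line(aes(y = cumsum, group = 1)) +
  scale_y_continuous(breaks = cumsum_breaks(d$cumsum))
```

Printing p draws the chart. Whether this is preferable to passing the vector directly is a matter of taste; the function form mainly helps when the limits are not known in advance.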
The more recent versions of ggplot2 allow for a secondary axis, but it needs to be based on a transformation of the primary axis. In this case, that means it should take the primary axis's values and divide by the maximum value to get a percentage.
As pointed out in comments by @ClausWilke, to make sure the secondary axis aligns properly with the data, such that the top point is at 100%, use ~. / max(d$cumsum) in setting up your secondary axis.
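To make the arithmetic concrete, here is a small sketch of that transformation. Dividing each cumulative count by the total maps the last point to exactly 1, i.e. 100%. Placing the secondary breaks at each cumulative point, as the base-R plot does, is an assumption of this sketch rather than something the answer prescribes; the names total and pct_at_points are illustrative.

```r
library(ggplot2)

# Same data as in the question, already sorted by descending defect count
d <- data.frame(
  category = c("contact", "price", "supplier", "item", "schedule"),
  defect   = c(94, 80, 66, 33, 27)
)
d$cumsum <- cumsum(d$defect)

# Secondary axis value = primary axis value / total count,
# so the top cumulative point sits at 1 (100%).
total <- max(d$cumsum)
pct_at_points <- d$cumsum / total

p <- ggplot(d, aes(x = reorder(category, -defect))) +
  geom_col(aes(y = defect)) +
  geom_point(aes(y = cumsum)) +
  geom_line(aes(y = cumsum, group = 1)) +
  scale_y_continuous(
    breaks = c(0, d$cumsum),
    sec.axis = sec_axis(
      ~ . / total,
      # Assumption: label every cumulative point, mirroring the base plot.
      # Secondary breaks are given in secondary-axis units (proportions).
      breaks = c(0, pct_at_points),
      labels = scales::percent
    )
  )
```

Because the secondary breaks are the primary breaks divided by total, the tick marks on both sides line up with the plotted points.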
library(tidyverse)

d <- tribble(
  ~category,  ~defect,
  "price",    80,
  "schedule", 27,
  "supplier", 66,
  "contact",  94,
  "item",     33
) %>%
  arrange(desc(defect)) %>%
  mutate(
    cumsum   = cumsum(defect),
    freq     = round(defect / sum(defect), 3),
    cum_freq = cumsum(freq)
  ) %>%
  # Order the category factor by defect count
  mutate(category = as.factor(category) %>% fct_reorder(defect))

# Left-axis breaks: one per cumulative count
brks <- unique(d$cumsum)

# fct_rev() puts the largest category first on the x-axis
ggplot(d, aes(x = fct_rev(category))) +
  geom_col(aes(y = defect)) +
  geom_point(aes(y = cumsum)) +
  geom_line(aes(y = cumsum, group = 1)) +
  scale_y_continuous(
    sec.axis = sec_axis(~ . / max(d$cumsum), labels = scales::percent),
    breaks = brks
  )

Created on 2018-05-12 by the reprex package (v0.2.0).