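I've got some data that I've divided into enough groupings that standard boxplots look very crowded. Tufte has his own variant of the boxplot in which you drop all or half of the box, leaving little more than the whiskers and a mark for the median.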



Some sample data:

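# Bin the numeric Time variable into 4 equal-width intervals
cw <- transform(ChickWeight,
  Time = cut(ChickWeight$Time,4)
  )
# Replace Chick with a random 3-level factor (A/B/C); no seed is set, so results vary per run
cw$Chick <- as.factor( sample(LETTERS[seq(3)], nrow(cw), replace=TRUE) )
# Give the four diets readable labels
levels(cw$Diet) <- c("Low Fat","Hi Fat","Low Prot.","Hi Prot.")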


I want a boxplot of weight for each Diet * Time * Chick grouping.
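To make the target concrete, here is a minimal sketch of the numbers such a plot would draw: one five-number summary per Diet * Time * Chick group, computed with `boxplot.stats()`. The `set.seed()` call, the `grp` and `five_num` names, and the column labels are assumptions added for illustration; they are not part of the original sample data.

```r
# Rebuild the sample data (the seed is an addition, only for reproducibility)
set.seed(1)
cw <- transform(ChickWeight, Time = cut(ChickWeight$Time, 4))
cw$Chick <- as.factor(sample(LETTERS[seq(3)], nrow(cw), replace = TRUE))
levels(cw$Diet) <- c("Low Fat", "Hi Fat", "Low Prot.", "Hi Prot.")

# One factor level per Diet x Time x Chick combination that actually occurs
grp <- interaction(cw$Diet, cw$Time, cw$Chick, drop = TRUE)

# Whisker ends, hinges and median for every group (one row per group)
five_num <- t(sapply(split(cw$weight, grp),
                     function(v) boxplot.stats(v)$stats))
colnames(five_num) <- c("lower", "q1", "median", "q3", "upper")

head(five_num)
```

With up to 4 * 4 * 3 = 48 groups, this table shows why a full box per group gets crowded quickly.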

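I ran into this problem a few years ago and cobbled together a solution using grid graphics, which I'll post in a bit. But in tackling this new (and similar) problem, I'm wondering whether there is a canonical way to draw these, rather than patching up my hackish example.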

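As an aside, these seem to be among the less popular of Tufte's creations, but I really like them for densely displaying distribution patterns across a large number of groupings, and I'd use them more if there were a good function for them in ggplot2 or lattice.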

Best answer

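You apparently want only a vertical version, so I took the panel.bwplot code, stripped out everything non-essential such as the box and the caps, set horizontal = FALSE in the arguments, and created a panel.tuftebxp function. I also set the cex of the outlier points to half the default. There are still quite a few options you can adjust to taste. The "numeric" factor names for "Time" look sloppy, but I think the proof of concept is clear and you can clean up whatever matters to you: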

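library(lattice)  # provides bwplot(), panel.segments(), panel.points(), trellis.par.get()

# Vertical-only panel function derived from lattice::panel.bwplot:
# draws whiskers, a median mark and outliers, but no box and no caps.
panel.tuftebxp <-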
function (x, y, box.ratio = 1, box.width = box.ratio/(1 + box.ratio), horizontal=FALSE,
    pch = box.dot$pch, col = box.dot$col,
    alpha = box.dot$alpha, cex = box.dot$cex, font = box.dot$font,
    fontfamily = box.dot$fontfamily, fontface = box.dot$fontface,
    fill = box.rectangle$fill, varwidth = FALSE, notch = FALSE,
    notch.frac = 0.5, ..., levels.fos = if (horizontal) sort(unique(y)) else sort(unique(x)),
    stats = boxplot.stats, coef = 1.5, do.out = TRUE, identifier = "bwplot")
{
    if (all(is.na(x) | is.na(y)))
        return()
    x <- as.numeric(x)
    y <- as.numeric(y)
    box.dot <- trellis.par.get("box.dot")
    box.rectangle <- trellis.par.get("box.rectangle")
    box.umbrella <- trellis.par.get("box.umbrella")
    plot.symbol <- trellis.par.get("plot.symbol")
    fontsize.points <- trellis.par.get("fontsize")$points
    cur.limits <- current.panel.limits()
    xscale <- cur.limits$xlim
    yscale <- cur.limits$ylim
    if (!notch)
        notch.frac <- 0
    #removed horizontal code
     blist <- tapply(y, factor(x, levels = levels.fos), stats,
            coef = coef, do.out = do.out)
        blist.stats <- t(sapply(blist, "[[", "stats"))
        blist.out <- lapply(blist, "[[", "out")
        blist.height <- box.width
        if (varwidth) {
            maxn <- max(table(x))
            blist.n <- sapply(blist, "[[", "n")
            blist.height <- sqrt(blist.n/maxn) * blist.height
        }
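        # NOTE: blist.conf, xbnd/ybnd and xs/ys below are leftovers from the box
        # polygon in panel.bwplot; nothing is drawn from them in this version.
        blist.conf <- if (notch)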
            sapply(blist, "[[", "conf")
        else t(blist.stats[, c(2, 4), drop = FALSE])
        ybnd <- cbind(blist.stats[, 3], blist.conf[2, ], blist.stats[,
            4], blist.stats[, 4], blist.conf[2, ], blist.stats[,
            3], blist.conf[1, ], blist.stats[, 2], blist.stats[,
            2], blist.conf[1, ], blist.stats[, 3])
        xleft <- levels.fos - blist.height/2
        xright <- levels.fos + blist.height/2
        xbnd <- cbind(xleft + notch.frac * blist.height/2, xleft,
            xleft, xright, xright, xright - notch.frac * blist.height/2,
            xright, xright, xleft, xleft, xleft + notch.frac *
                blist.height/2)
        xs <- cbind(xbnd, NA_real_)
        ys <- cbind(ybnd, NA_real_)
        panel.segments(rep(levels.fos, 2), c(blist.stats[, 2],
            blist.stats[, 4]), rep(levels.fos, 2), c(blist.stats[,
            1], blist.stats[, 5]), col = box.umbrella$col, alpha = box.umbrella$alpha,
            lwd = box.umbrella$lwd, lty = box.umbrella$lty, identifier = paste(identifier,
                "whisker", sep = "."))

        if (all(pch == "|")) {
            mult <- if (notch)
                1 - notch.frac
            else 1
            panel.segments(levels.fos - mult * blist.height/2,
                blist.stats[, 3], levels.fos + mult * blist.height/2,
                blist.stats[, 3], lwd = box.rectangle$lwd, lty = box.rectangle$lty,
                col = box.rectangle$col, alpha = alpha, identifier = paste(identifier,
                  "dot", sep = "."))
        }
        else {
            panel.points(x = levels.fos, y = blist.stats[, 3],
                pch = pch, col = col, alpha = alpha, cex = cex,
                 identifier = paste(identifier,
                  "dot", sep = "."))
        }
        panel.points(x = rep(levels.fos, sapply(blist.out, length)),
            y = unlist(blist.out), pch = plot.symbol$pch, col = plot.symbol$col,
            alpha = plot.symbol$alpha, cex = plot.symbol$cex*0.5,
            identifier = paste(identifier, "outlier", sep = "."))

}
bwplot(weight ~ Diet + Time + Chick, data=cw, panel=
         function(x,y, ...) panel.tuftebxp(x=x,y=y,...))
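For readers who want to see the drawing logic without the lattice machinery, the same idea (whiskers only, a dot at the median, small outlier points) can be sketched in base graphics. This is only an illustration under stated assumptions: `tufte_bxp_sketch` and its arguments are hypothetical names, not part of lattice or of the answer above, and it uses the original `ChickWeight` data with Diet and binned Time as the grouping.

```r
# Hypothetical helper: a Tufte-style boxplot in base graphics
tufte_bxp_sketch <- function(values, groups, coef = 1.5) {
  groups <- droplevels(as.factor(groups))
  st <- lapply(split(values, groups), boxplot.stats, coef = coef)
  s  <- sapply(st, `[[`, "stats")   # 5 x k matrix: whisker, hinge, median, hinge, whisker
  xs <- seq_len(ncol(s))

  # Empty canvas with one x position per group
  plot(NA, xlim = c(0.5, ncol(s) + 0.5), ylim = range(values, na.rm = TRUE),
       xaxt = "n", xlab = "", ylab = "", bty = "n")
  axis(1, at = xs, labels = colnames(s), las = 2, cex.axis = 0.6)

  segments(xs, s[1, ], xs, s[2, ])         # lower whisker: whisker end to lower hinge
  segments(xs, s[4, ], xs, s[5, ])         # upper whisker: upper hinge to whisker end
  points(xs, s[3, ], pch = 16, cex = 0.5)  # median as a small dot, no box

  # Outliers, drawn small, at the x position of their group
  out <- unlist(lapply(st, `[[`, "out"))
  if (length(out) > 0) {
    out_x <- rep(xs, sapply(st, function(z) length(z$out)))
    points(out_x, out, pch = 1, cex = 0.4)
  }
  invisible(s)
}

# Usage: one column of marks per Diet x (binned) Time combination
s <- tufte_bxp_sketch(ChickWeight$weight,
                      interaction(ChickWeight$Diet, cut(ChickWeight$Time, 4)))
```

The gap between the two whisker segments stands in for the box, which is what lets many groups sit side by side without crowding.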

Source: the Stack Overflow question "r - Functions available for Tufte boxplots in R?": https://stackoverflow.com/questions/6973394/
