How do I plot a histogram with a custom distribution?

Question

In an old statistics textbook, I found a table of a distribution of ages for a country's population:


        Percent of
 Age    population
------------------
 0-5         8
 5-14       18
14-18        8
18-21        5
21-25        6
25-35       12
35-45       11
45-55       11
55-65        9
65-75        6
75-85        4

I wanted to plot this distribution as a histogram in R, with the age ranges as breaks and the percent of population as the density, but there didn't seem to be a straightforward way to do it. R's hist() function wants you to supply the individual data points, not a pre-computed distribution such as this.

This is what I came up with:

# Copy original textbook table into two data structures
ageRanges <- list(0:5, 5:14, 14:18, 18:21, 21:25, 25:35, 35:45, 45:55, 55:65, 65:75, 75:85)
pcPop <- c(8, 18, 8, 5, 6, 12, 11, 11, 9, 6, 4)
# Make up "fake" age data points from the distribution described by the table
ages <- lapply(1:length(ageRanges), function(i) {
    ageRange <- ageRanges[[i]]
    round(runif(pcPop[i] * 100, min=ageRange[1], max=ageRange[length(ageRange)-1]), 0)
})
ages <- unlist(ages)
# Use the endpoints of the age class intervals as breaks for the histogram
breaks <- append(0, sapply(ageRanges, function(x) x[length(x)]))
hist(ages, breaks=breaks)

It seems like there has to be a less verbose/hacky way of going about it.

EDIT: FWIW, here's what the resulting histogram looks like:

Recommended answer

This should get you what you want:

test <- read.table(textConnection("age popperc
0-5 8
5-14 18
14-18 8
18-21 5
21-25 6
25-35 12
35-45 11
45-55 11
55-65 9
65-75 6
75-85 4"),header=TRUE,stringsAsFactors=FALSE)

midval <- sapply(strsplit(test$age,"-"),function(x) mean(as.numeric(x)))
breakval <- strsplit(test$age,"-")
breakval <- as.numeric(c(sapply(breakval,head,1),tail(unlist(breakval),1)))
hist(rep(midval,test$popperc),breaks=breakval)
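
To see why repeating each bin midpoint works, it can help to inspect the object that hist() returns instead of plotting it. The sketch below is not part of the original answer: it re-enters the table as two vectors (the names breaks_age and pct are illustrative) and checks that each bar's area equals that bin's share of the total.

```r
# Bin edges and percentages copied from the textbook table
breaks_age <- c(0, 5, 14, 18, 21, 25, 35, 45, 55, 65, 75, 85)
pct <- c(8, 18, 8, 5, 6, 12, 11, 11, 9, 6, 4)

# Midpoint of every bin; each one lies strictly inside its bin
mids <- (head(breaks_age, -1) + tail(breaks_age, -1)) / 2

# One pseudo-observation per percentage point, all placed at the bin midpoint
h <- hist(rep(mids, pct), breaks = breaks_age, plot = FALSE)

# The counts reproduce the table column
h$counts

# Bar area (density * bin width) should equal the bin's share of the total
area <- h$density * diff(h$breaks)
all.equal(area, pct / sum(pct))
```

Because the bins have unequal widths, hist() draws densities by default, so the bar heights differ from the percentages while the bar areas stay proportional to them.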

You can also define your own object of class histogram and plot it directly, if you only want to plot the frequencies rather than the densities:

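# Define the histogram object by hand and plot it
# (reuses breakval, midval and test from the code above)
histres <- list(
    breaks = breakval,
    counts = test$popperc,
    mids = midval,
    xname = "ages",
    # The bins are not actually equal-width; TRUE makes plot() draw the counts
    equidist = TRUE
)
class(histres) <- "histogram"
plot(histres)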
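
If the densities are wanted after all, a hand-built object can carry them as well. The following is a sketch under stated assumptions, not part of the original answer: it assumes plot() only needs the components that hist() itself returns, computes each density as the bin's share divided by its width, and sets equidist to FALSE because the bins have different widths. The names breaks_age, pct and histobj are illustrative.

```r
# Bin edges and percentages copied from the textbook table
breaks_age <- c(0, 5, 14, 18, 21, 25, 35, 45, 55, 65, 75, 85)
pct <- c(8, 18, 8, 5, 6, 12, 11, 11, 9, 6, 4)

histobj <- structure(
  list(
    breaks   = breaks_age,
    counts   = pct,
    # Share of the total per unit of age, so that the bar areas sum to 1
    density  = (pct / sum(pct)) / diff(breaks_age),
    mids     = (head(breaks_age, -1) + tail(breaks_age, -1)) / 2,
    xname    = "ages",
    equidist = FALSE
  ),
  class = "histogram"
)

# freq = FALSE asks plot() to use the density component for the bar heights
plot(histobj, freq = FALSE)
```

With this version no pseudo-data is generated at all; the table values go straight into the plotted object.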
