Question

I'm trying to group a variable according to its values and plot a histogram of the groups.

For example, here is my data:

r <-c(1,899,1,2525,763,3,2,2,1863,695,9,4,2876,1173,1156,5098,3,3876,1,1,
     3023,76336,13,003,9898,1,10,843,10546,617,1375,1,1,5679,1,21,1,13,6,28,1,14088,682)

I want to group r by its value, for example: 1-5, 5-10, 10-100, 100-500, and more than 500. Then I want a histogram whose x axis shows those intervals (1-5, 5-10, 10-100, 100-500, and more than 500). How can I do that?

If I use the ggplot2 package, the code is as follows:

ggplot(data=r, aes(x=r))+geom_histogram(breaks = c(1, 5, 10, 100, 500,2000,Inf))

It doesn't work; R reports "missing value where TRUE/FALSE needed". Also, how can I make all the bins the same width?

Recommended answer

In base R:

r <-c(1,899,1,2525,763,3,2,2,1863,695,9,4,2876,1173,1156,5098,3,3876,1,1,5,
      3023,76336,13,003,9898,1,10,843,10546,617,1375,1,1,5679,1,21,1,13,6,28,1,14088,682)
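# right = FALSE makes each interval closed on the left: [1,5), [5,10), ...
cut.vals <- cut(r, breaks = c(1, 5, 10, 100, 500, Inf), right = FALSE)
xy <- data.frame(r, cut = cut.vals)
# Count the values in each interval and draw one bar per interval
barplot(table(xy$cut))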

Note that I added the xy data frame to make it easier to compare each value with the group it was assigned to. You can also pass cut.vals directly: barplot(table(cut.vals)).
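To check how the values fall into the intervals, a small sketch like the following may help. The label strings and the names breaks, labels, groups, and counts are my own choices for illustration and are not part of the original answer.

```r
r <- c(1, 899, 1, 2525, 763, 3, 2, 2, 1863, 695, 9, 4, 2876, 1173, 1156, 5098, 3, 3876, 1, 1, 5,
       3023, 76336, 13, 3, 9898, 1, 10, 843, 10546, 617, 1375, 1, 1, 5679, 1, 21, 1, 13, 6, 28, 1, 14088, 682)

breaks <- c(1, 5, 10, 100, 500, Inf)
# Hypothetical, human-readable labels (one per interval, so one fewer than breaks)
labels <- c("1-5", "5-10", "10-100", "100-500", "500+")

# right = FALSE: a value equal to a break goes into the interval that starts there
groups <- cut(r, breaks = breaks, labels = labels, right = FALSE)

# table() keeps empty factor levels, so an interval with no values shows a count of 0
counts <- table(groups)
print(counts)
```

With this data the 100-500 interval is empty, which is why keeping unused levels matters when plotting.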

To use ggplot2, you can pre-compute the bins as above and plot them as bars:

library(ggplot2)

# One bar per interval; drop = FALSE keeps empty intervals on the x axis
ggplot(xy, aes(x = cut)) +
  theme_bw() +
  geom_bar() +
  scale_x_discrete(drop = FALSE)

The geom_histogram parameter most commonly used to control bin size is binwidth, which is constant across all bins.
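Because a single bin width applies to every bin, one possible alternative for data spanning several orders of magnitude is to bin on a log scale, where equal-width bins still cover the whole range. This is a minimal base R sketch, not part of the original answer; the width of 0.5 in log10 units and the names log_breaks and h are assumptions made for illustration.

```r
r <- c(1, 899, 1, 2525, 763, 3, 2, 2, 1863, 695, 9, 4, 2876, 1173, 1156, 5098, 3, 3876, 1, 1, 5,
       3023, 76336, 13, 3, 9898, 1, 10, 843, 10546, 617, 1375, 1, 1, 5679, 1, 21, 1, 13, 6, 28, 1, 14088, 682)

# Constant-width bins on the log10 scale: each bin spans half an order of magnitude.
# The range 0..5 covers 10^0 to 10^5, which contains every value in r.
log_breaks <- seq(0, 5, by = 0.5)

# plot = FALSE returns the binning without drawing; plot(h) would draw the bars
h <- hist(log10(r), breaks = log_breaks, plot = FALSE)
print(h$counts)
```

In ggplot2, combining geom_histogram(binwidth = 0.5) with scale_x_log10() should give a comparable result, since the bin width is then applied on the transformed scale.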

