


Suppose I have a data frame df in R that looks like this:

Id  Name Price sales Profit Month Category Mode
1   A     2     5     8       1     X       K
1   A     2     6     9       2     X       K
1   A     2     5     8       3     X       K
1   B     2     4     6       1     Y       L
1   B     2     3     4       2     Y       L
1   B     2     5     7       3     Y       L
2   C     2     5    11       1     X       M
2   C     2     5    11       2     X       L
2   C     2     5    11       3     X       K
2   D     2     8    10       1     Y       M
2   D     2     8    10       2     Y       K
2   D     2     5    7        3     Y       K
3   E     2     5    9        1     Y       M
3   E     2     5    9        2     Y       L
3   E     2     5    9        3     Y       M
3   F     2     4    7        1     Z       M
3   F     2     5    8        2     Z       L
3   F     2     5    8        3     Z       M
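
For anyone who wants to run the calls discussed here, the table above can be rebuilt roughly as follows. This is only a sketch: the name df matches the calls in this post, but building the columns with rep() is just a convenient shortcut and not necessarily how the original data was created.

```r
# Rebuild the example table as a data frame named df
df <- data.frame(
  Id       = rep(1:3, each = 6),
  Name     = rep(c("A", "B", "C", "D", "E", "F"), each = 3),
  Price    = 2,
  sales    = c(5, 6, 5,  4, 3, 5,  5, 5, 5,  8, 8, 5,  5, 5, 5,  4, 5, 5),
  Profit   = c(8, 9, 8,  6, 4, 7,  11, 11, 11,  10, 10, 7,  9, 9, 9,  7, 8, 8),
  Month    = rep(1:3, times = 6),
  Category = rep(c("X", "Y", "X", "Y", "Y", "Z"), each = 3),
  Mode     = c("K", "K", "K",  "L", "L", "L",  "M", "L", "K",
               "M", "K", "K",  "M", "L", "M",  "M", "L", "M")
)

str(df)  # 18 rows, 8 columns
```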

If I use the table function on this data like:

table(df$Category, df$Mode)


It shows how many observations fall into each Category/Mode combination, i.e. it counts the rows in each cell.

But what if I want the table to show, for each Category and Mode, how much Profit was earned (sum or mean) instead of the count?


Is there any way to do this with the table function or another function in R?


We can use xtabs from base R. With a variable on the left-hand side of the formula, xtabs returns the sum of that variable in each cell (empty cells show as 0):

xtabs(Profit~Category+Mode, df)
#           Mode
#Category  K  L  M
#       X 36 11 11
#       Y 17 26 28
#       Z  0  8 15

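Another base R option is tapply, which is more flexible because FUN can be any summary function (sum, mean, median, etc.). Note that combinations with no rows come back as NA rather than 0: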

with(df, tapply(Profit, list(Category, Mode), FUN=sum))
#  K  L  M
#X 36 11 11
#Y 17 26 28
#Z NA  8 15
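
Since FUN is an ordinary function argument, swapping sum for mean should give the average profit per cell, which covers the "mean" half of the question. A minimal sketch, assuming a cut-down copy of the question's data with only the three columns involved (the name avg is made up for illustration):

```r
# Cut-down copy of the example data: only the columns tapply needs
df <- data.frame(
  Profit   = c(8, 9, 8,  6, 4, 7,  11, 11, 11,  10, 10, 7,  9, 9, 9,  7, 8, 8),
  Category = rep(c("X", "Y", "X", "Y", "Y", "Z"), each = 3),
  Mode     = c("K", "K", "K",  "L", "L", "L",  "M", "L", "K",
               "M", "K", "K",  "M", "L", "M",  "M", "L", "M")
)

# Same call as above, with mean instead of sum
avg <- with(df, tapply(Profit, list(Category, Mode), FUN = mean))

# Rows are Category, columns are Mode; Z/K has no rows, so it stays NA
round(avg, 2)
```

If zeros are preferred over NA for empty cells, something like avg[is.na(avg)] <- 0 can be applied afterwards, although for a mean an NA is arguably the more honest value.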

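Or we can use dcast (from the reshape2 package; data.table provides one as well) to convert from 'long' to 'wide' format. It is also flexible, as fun.aggregate can be set to sum, mean, median, etc.

library(reshape2)
# one row per Category, one column per Mode, cells hold the summed Profit
dcast(df, Category ~ Mode, value.var = 'Profit', fun.aggregate = sum)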
# Category  K  L  M
#1        X 36 11 11
#2        Y 17 26 28
#3        Z  0  8 15

If you need it in the 'long' format, here is one option with data.table. We convert the 'data.frame' to a 'data.table' with setDT(df), group by 'Category' and 'Mode', and take the sum of 'Profit'.

library(data.table)
# setDT converts df by reference; one output row per Category/Mode pair present in the data
setDT(df)[, list(Profit = sum(Profit)), by = .(Category, Mode)]
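
The same grouped call can return several summaries at once, which is handy when both the sum and the mean are wanted. A rough sketch, assuming the data.table package is installed and using a cut-down copy of the question's data; the column names Total and Mean and the object name long are made up for illustration:

```r
library(data.table)

# Cut-down copy of the example data: only the columns used here
df <- data.frame(
  Profit   = c(8, 9, 8,  6, 4, 7,  11, 11, 11,  10, 10, 7,  9, 9, 9,  7, 8, 8),
  Category = rep(c("X", "Y", "X", "Y", "Y", "Z"), each = 3),
  Mode     = c("K", "K", "K",  "L", "L", "L",  "M", "L", "K",
               "M", "K", "K",  "M", "L", "M",  "M", "L", "M")
)

# One row per Category/Mode pair that occurs in the data,
# with both the total and the average Profit
long <- setDT(df)[, .(Total = sum(Profit), Mean = mean(Profit)),
                  by = .(Category, Mode)]
print(long)
```

Unlike the wide tables above, the long result only contains combinations that actually occur, so there is no row for Category Z with Mode K.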

