How to add row totals/subtotals in a Julia DataFrame?

Problem description

Let's say I have a df like this one, with multiple categorical columns and several numeric variables:

df = wsv"""
region product year prod cons
US     apples  2010 1    2
US     apples  2011 3    4
US     banana  2010 5    6
US     banana  2011 7    8
EU     apples  2010 9    10
EU     apples  2011 11   12
EU     banana  2010 13   14
EU     banana  2011 15   16
"""

How can I transform it so that category totals and subtotals appear as new rows, i.e.

df2 = wsv"""
index  prod  cons
US     16    20
apples 4     6
2010   1     2
2011   3     4
banana 12    14
2010   5     6
2011   7     8
EU     48    52
apples 20    22
2010   9     10
2011   11    12
banana 28    30
2010   13    14
2011   15    16
"""

After proper formatting (e.g. totals in bold), this layout is often useful for reporting data, since many reports use exactly this kind of structure.

Recommended answer

You can use nested by calls to achieve something similar. Note that this relies on the older DataFrames API (by, NA and delete! on columns):

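# Outer grouping: one block per region
df2 = by(df, :region) do sub1
      # Region total row; product and year are left as NA
      t = DataFrame(product=NA, year=NA, prod=sum(sub1[:prod]), cons=sum(sub1[:cons]))
      # Inner grouping: one block per product within the region
      sub1mod = by(sub1, [:region,:product]) do sub2
        # Product subtotal row, followed by its detail rows
        t2 = DataFrame(year=NA, prod=sum(sub2[:prod]), cons=sum(sub2[:cons]))
        vcat(t2,sub2)
      end
      # Region total first, then the product blocks
      vcat(t,sub1mod)
end
# Drop the duplicated grouping columns produced by the nested by
delete!(df2,[:region_1,:region_2,:product_1])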

Output:

14×5 DataFrames.DataFrame
│ Row │ region │ product  │ year │ prod │ cons │
├─────┼────────┼──────────┼──────┼──────┼──────┤
│ 1   │ "EU"   │ NA       │ NA   │ 48   │ 52   │
│ 2   │ "EU"   │ "apples" │ NA   │ 20   │ 22   │
│ 3   │ "EU"   │ "apples" │ 2010 │ 9    │ 10   │
│ 4   │ "EU"   │ "apples" │ 2011 │ 11   │ 12   │
│ 5   │ "EU"   │ "banana" │ NA   │ 28   │ 30   │
│ 6   │ "EU"   │ "banana" │ 2010 │ 13   │ 14   │
│ 7   │ "EU"   │ "banana" │ 2011 │ 15   │ 16   │
│ 8   │ "US"   │ NA       │ NA   │ 16   │ 20   │
│ 9   │ "US"   │ "apples" │ NA   │ 4    │ 6    │
│ 10  │ "US"   │ "apples" │ 2010 │ 1    │ 2    │
│ 11  │ "US"   │ "apples" │ 2011 │ 3    │ 4    │
│ 12  │ "US"   │ "banana" │ NA   │ 12   │ 14   │
│ 13  │ "US"   │ "banana" │ 2010 │ 5    │ 6    │
│ 14  │ "US"   │ "banana" │ 2011 │ 7    │ 8    │
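
On a current DataFrames.jl release, where by and NA are no longer available, a similar table can probably be built with groupby, combine and vcat. The following is a minimal sketch under that assumption, not the method from the answer above: the names sums, region_tot and product_tot are illustrative, missing takes the place of NA, and the input table is built by hand with "apples" spelled consistently.

```julia
using DataFrames

df = DataFrame(
    region  = ["US", "US", "US", "US", "EU", "EU", "EU", "EU"],
    product = ["apples", "apples", "banana", "banana",
               "apples", "apples", "banana", "banana"],
    year    = [2010, 2011, 2010, 2011, 2010, 2011, 2010, 2011],
    prod    = [1, 3, 5, 7, 9, 11, 13, 15],
    cons    = [2, 4, 6, 8, 10, 12, 14, 16],
)

# Aggregations shared by both levels (illustrative helper name)
sums = [:prod => sum => :prod, :cons => sum => :cons]

# One total row per region, one subtotal row per (region, product)
region_tot  = combine(groupby(df, :region), sums...)
product_tot = combine(groupby(df, [:region, :product]), sums...)

# Stack totals, subtotals and detail rows; columns absent from a table
# (product, year) are filled with missing
df2 = vcat(region_tot, product_tot, df; cols = :union)
select!(df2, :region, :product, :year, :prod, :cons)

# missing sorts last by default, so map it to a small key
# to place each total row above the rows it summarises
sort!(df2, [:region,
            order(:product, by = p -> coalesce(p, "")),
            order(:year,    by = y -> coalesce(y, 0))])
```

Rows where product or year is missing are the totals and subtotals, so a reporting step can pick them out with ismissing and format them differently (for example, in bold).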


09-14 16:26