Question
I have a list of domain objects that relate to web access records. These domain objects can stretch into the thousands in number.
I don't have the resources or requirement to store them in a database in raw format, so instead I want to precompute aggregations and put the aggregated data in a database.
I need to aggregate the total bytes transferred in 5 minute windows, like the following SQL query:
select
    round(request_timestamp, '5') as window, --round timestamp to the nearest 5 minute
    cdn,
    isp,
    http_result_code,
    transaction_time,
    sum(bytes_transferred)
from web_records
group by
    round(request_timestamp, '5'),
    cdn,
    isp,
    http_result_code,
    transaction_time
In Java 8, my first stab looks like this. I am aware this solution is similar to the answer in "Group by multiple field names in java 8":
Map<Date, Map<String, Map<String, Map<String, Map<String, Integer>>>>> aggregatedData =
    webRecords
        .stream()
        .collect(Collectors.groupingBy(WebRecord::getFiveMinuteWindow,
            Collectors.groupingBy(WebRecord::getCdn,
                Collectors.groupingBy(WebRecord::getIsp,
                    Collectors.groupingBy(WebRecord::getResultCode,
                        Collectors.groupingBy(WebRecord::getTxnTime,
                            Collectors.reducing(0,
                                WebRecord::getReqBytes,
                                Integer::sum)))))));
This works, but it's ugly; all those nested maps are a nightmare! To "flatten" or "unroll" the map into rows, I have to do this:
for (Date window : aggregatedData.keySet()) {
    for (String cdn : aggregatedData.get(window).keySet()) {
        for (String isp : aggregatedData.get(window).get(cdn).keySet()) {
            for (String resultCode : aggregatedData.get(window).get(cdn).get(isp).keySet()) {
                for (String txnTime : aggregatedData.get(window).get(cdn).get(isp).get(resultCode).keySet()) {
                    Integer bytesTransferred = aggregatedData.get(window).get(cdn).get(isp).get(resultCode).get(txnTime);
                    AggregatedRow row = new AggregatedRow(window, cdn, distId...
As you can see, this is pretty messy and difficult to maintain.
Anyone have any ideas of a better way to do this? Any help would be greatly appreciated.
I'm wondering if there is a nicer way to unroll the nested maps, or if there is a library that allows you to do a GROUP BY on a collection.
Recommended answer
You should create a custom key for your map. The simplest way is to use Arrays.asList:
Function<WebRecord, List<Object>> keyExtractor = wr ->
    Arrays.<Object>asList(wr.getFiveMinuteWindow(), wr.getCdn(), wr.getIsp(),
        wr.getResultCode(), wr.getTxnTime());
Map<List<Object>, Integer> aggregatedData = webRecords.stream().collect(
    Collectors.groupingBy(keyExtractor, Collectors.summingInt(WebRecord::getReqBytes)));
In this case the keys are lists of 5 elements in a fixed order. Not quite object-oriented, but simple. Alternatively, you can define your own type that represents the custom key and give it proper hashCode/equals implementations.
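As a rough sketch of that second option, a dedicated key type might look something like the following. The names GroupByDemo, GroupKey and aggregate are hypothetical, and the WebRecord class here is only a stand-in with assumed field types (its getter names come from the question, its constructor and fields do not).

```java
import java.util.Arrays;
import java.util.Date;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

public class GroupByDemo {

    // Hypothetical stand-in for the question's WebRecord (field types are assumed).
    static final class WebRecord {
        private final Date window;
        private final String cdn, isp, resultCode, txnTime;
        private final int reqBytes;

        WebRecord(Date window, String cdn, String isp, String resultCode,
                  String txnTime, int reqBytes) {
            this.window = window; this.cdn = cdn; this.isp = isp;
            this.resultCode = resultCode; this.txnTime = txnTime;
            this.reqBytes = reqBytes;
        }
        Date getFiveMinuteWindow() { return window; }
        String getCdn() { return cdn; }
        String getIsp() { return isp; }
        String getResultCode() { return resultCode; }
        String getTxnTime() { return txnTime; }
        int getReqBytes() { return reqBytes; }
    }

    // Hypothetical composite key: one field per GROUP BY column.
    static final class GroupKey {
        final Date window;
        final String cdn, isp, resultCode, txnTime;

        GroupKey(WebRecord wr) {
            this.window = wr.getFiveMinuteWindow();
            this.cdn = wr.getCdn();
            this.isp = wr.getIsp();
            this.resultCode = wr.getResultCode();
            this.txnTime = wr.getTxnTime();
        }

        // Two keys are equal when every grouping column matches.
        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof GroupKey)) return false;
            GroupKey k = (GroupKey) o;
            return Objects.equals(window, k.window) && Objects.equals(cdn, k.cdn)
                && Objects.equals(isp, k.isp) && Objects.equals(resultCode, k.resultCode)
                && Objects.equals(txnTime, k.txnTime);
        }

        // Must be consistent with equals so the key works in a HashMap.
        @Override public int hashCode() {
            return Objects.hash(window, cdn, isp, resultCode, txnTime);
        }
    }

    // Groups by the composite key and sums the bytes per group.
    static Map<GroupKey, Integer> aggregate(List<WebRecord> webRecords) {
        return webRecords.stream().collect(
            Collectors.groupingBy(GroupKey::new,
                Collectors.summingInt(WebRecord::getReqBytes)));
    }

    public static void main(String[] args) {
        Date w = new Date(0L);
        List<WebRecord> records = Arrays.asList(
            new WebRecord(w, "cdnA", "ispX", "200", "10", 100),
            new WebRecord(w, "cdnA", "ispX", "200", "10", 250),
            new WebRecord(w, "cdnB", "ispX", "404", "10", 40));

        // A single loop over the flat map replaces the five nested loops.
        aggregate(records).forEach((key, bytes) ->
            System.out.println(key.window + " " + key.cdn + " " + key.isp + " "
                + key.resultCode + " " + key.txnTime + " -> " + bytes));
    }
}
```

With a flat map like this, each entry already corresponds to one aggregated row, so building the question's AggregatedRow objects would need only one pass over the entries.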