Problem description
I'd like to keep moving averages for a number of different categories while storing log records. Imagine a service that saves web server logs one entry at a time. Let's further imagine that we don't have access to the records once they are logged: we see each one once and cannot read it again later.
For each page, I'd like to know:
- the total number of hits (easy)
- a "recent" average (over the last month or so)
- a "long-term" average (over a year)
Is there a clever algorithm or data model that lets me maintain such moving averages without recalculating them by summing up huge quantities of data?
I don't need an exact average (over exactly 30 days, say), just trend indicators, so some fuzziness is not a problem at all. The only requirement is that newer entries are weighted more heavily than older ones.
One solution would probably be to auto-create a statistics record for each month. However, I don't even need statistics for past months, so this seems like overkill. And it wouldn't give me a moving average; the values would instead jump to a new set from one month to the next.
Recommended answer
An easy solution would be to keep an exponentially decaying total.
It can be calculated using the following formula:
newX = oldX * (p ^ (newT - oldT)) + delta
where oldX is the old value of your total (at time oldT), newX is the new value of your total (at time newT), delta is the contribution of new events to the total (for example, the number of hits today), and p is the decay factor, with 0 < p <= 1. If we take p = 1, we get the plain total number of hits. By decreasing p, we effectively shorten the interval that our total describes.
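The update rule can be sketched in a few lines of Python. This is only an illustration under some assumptions that are not in the answer itself: the class name DecayingTotal, the read method, and the helper decay_for_half_life are hypothetical, time is measured in days, and timestamps are assumed to arrive in non-decreasing order.

```python
from dataclasses import dataclass


@dataclass
class DecayingTotal:
    """Exponentially decaying total: newX = oldX * p ** (newT - oldT) + delta."""
    p: float             # decay factor per time unit, 0 < p <= 1
    value: float = 0.0   # current decayed total (X)
    last_t: float = 0.0  # time of the last update (T), in the same unit as p

    def add(self, t: float, delta: float = 1.0) -> float:
        # Decay the old total over the elapsed time, then add the new contribution.
        # Assumes t >= last_t; an older timestamp would inflate the total instead.
        self.value = self.value * self.p ** (t - self.last_t) + delta
        self.last_t = t
        return self.value

    def read(self, t: float) -> float:
        # Decayed total as seen at time t, without recording a new event.
        return self.value * self.p ** (t - self.last_t)


def decay_for_half_life(half_life: float) -> float:
    # Hypothetical helper: pick p so that a hit's weight halves after
    # `half_life` time units, since p ** half_life == 0.5.
    return 0.5 ** (1.0 / half_life)


# One counter per page and per horizon; only (value, last_t) need to be stored.
total = DecayingTotal(p=1.0)                         # plain hit count
recent = DecayingTotal(p=decay_for_half_life(30))    # roughly "the last month"
long_term = DecayingTotal(p=decay_for_half_life(365))  # roughly "the last year"

for day in (0.0, 5.0, 10.0):
    for counter in (total, recent, long_term):
        counter.add(day)
```

Each counter stores just two numbers, so no past log records are needed. If delta is a daily hit count and the total is updated once per day, a steady rate of r hits per day converges to r / (1 - p), so multiplying the total by (1 - p) gives a rough average daily rate. The half-life values of 30 and 365 days are only one possible way to map "recent" and "long-term" onto p.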