Is there a Perl statistics package that doesn't make me load the entire dataset at once?

Question

I'm looking for a statistics package for Perl (CPAN is fine) that allows me to add data incrementally instead of having to pass in an entire array of data.

Only the mean, median, stddev, max, and min are necessary; nothing too complicated.

The reason is that my dataset is far too large to fit into memory. The data is in a MySQL database, so for now I query a subset of the data, compute the statistics for it, and combine the results from all the manageable subsets later.

If you have other ideas on how to overcome this issue, I'd be much obliged!

Recommended answer

Statistics::Descriptive::Discrete lets you do this with an interface similar to Statistics::Descriptive, but it has been optimized for use with large data sets. (For example, its documentation reports a memory-usage improvement of two orders of magnitude, 100x.)
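
If installing a module is not an option, the same idea can be sketched in plain Perl. The code below is only an illustration, not the module's API: the names new_stats, add_value, merge_stats, stddev, and median are hypothetical. It assumes the values are numeric and that the number of distinct values is small enough to keep one counter per value (needed for the median); the mean and standard deviation use Welford's running update, and merge_stats combines two partial results, which matches the "compute per subset, combine later" approach described in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: an empty accumulator.
sub new_stats {
    return { n => 0, mean => 0, m2 => 0, min => undef, max => undef, counts => {} };
}

# Add one value without keeping the full data array (Welford's update).
sub add_value {
    my ($s, $x) = @_;
    $s->{n}++;
    my $delta = $x - $s->{mean};
    $s->{mean} += $delta / $s->{n};
    $s->{m2}   += $delta * ($x - $s->{mean});
    $s->{min} = $x if !defined $s->{min} || $x < $s->{min};
    $s->{max} = $x if !defined $s->{max} || $x > $s->{max};
    $s->{counts}{$x}++;    # one counter per distinct value, used for the median
    return;
}

# Combine two accumulators built from separate subsets of the data.
sub merge_stats {
    my ($left, $right) = @_;
    my $out = new_stats();
    my $n   = $left->{n} + $right->{n};
    return $out unless $n;
    my $delta = $right->{mean} - $left->{mean};
    $out->{n}    = $n;
    $out->{mean} = $left->{mean} + $delta * $right->{n} / $n;
    $out->{m2}   = $left->{m2} + $right->{m2}
                 + $delta**2 * $left->{n} * $right->{n} / $n;
    for my $part ($left, $right) {
        next unless $part->{n};
        $out->{min} = $part->{min} if !defined $out->{min} || $part->{min} < $out->{min};
        $out->{max} = $part->{max} if !defined $out->{max} || $part->{max} > $out->{max};
        $out->{counts}{$_} += $part->{counts}{$_} for keys %{ $part->{counts} };
    }
    return $out;
}

# Sample standard deviation (n - 1 in the denominator).
sub stddev {
    my ($s) = @_;
    return $s->{n} > 1 ? sqrt($s->{m2} / ($s->{n} - 1)) : 0;
}

# Median from the per-value counters; averages the two middle values when n is even.
sub median {
    my ($s) = @_;
    return unless $s->{n};
    my $lo_rank = int(($s->{n} + 1) / 2);    # 1-based rank of the lower middle value
    my $hi_rank = int($s->{n} / 2) + 1;      # 1-based rank of the upper middle value
    my ($seen, $lo, $hi) = (0);
    for my $v (sort { $a <=> $b } keys %{ $s->{counts} }) {
        $seen += $s->{counts}{$v};
        $lo = $v if !defined $lo && $seen >= $lo_rank;
        if ($seen >= $hi_rank) { $hi = $v; last; }
    }
    return ($lo + $hi) / 2;
}

# Usage: two chunks stand in for two query result subsets.
my ($chunk1, $chunk2) = (new_stats(), new_stats());
add_value($chunk1, $_) for (2, 4, 4, 4);
add_value($chunk2, $_) for (5, 5, 7, 9);
my $all = merge_stats($chunk1, $chunk2);
printf "n=%d mean=%.3f stddev=%.3f median=%.1f min=%s max=%s\n",
    $all->{n}, $all->{mean}, stddev($all), median($all), $all->{min}, $all->{max};
```

In practice the add_value loop would be fed from rows fetched one at a time from the database rather than from a literal list. If the data has too many distinct values for the counter table to fit in memory, the mean, stddev, min, and max parts still work, but the median would need a different approach.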

10-31 10:17