我的Web应用程序上有图表,每当我收到新信息时,都需要更新图表。

现在我正在做模拟,所以我用大约100000个数据(在json中)进行了回测(但是如果浏览器和硬件可以处理的话,可能是数百万个)。

因此,我需要对我的代码进行尽可能的优化。

我有这样的对象:

var trades = {"1515867288390":{price:"10", quantity:"500"},
            "1515867289541":{price:"9", quantity:"400"},
            "1515867295400":{price:"11", quantity:"750"},
            "1515867296500":{price:"7", quantity:"1100"},
            ...}


每次我在交易中扫描一个对象时,我都希望获得最近X秒钟的中间价格,因此我有$ .each(交易,getAverage ...)

getAverage = function (trade_time) {

var total_quantity = 0;
var total_trade_value = 0;
var startfrom = trade_time - duration;

Object.keys(trades).forEach(function (time) {
    if (time < startfrom)
        delete trades[time];
});

$.each(trades, function (key, value) {
    total_quantity += parseFloat(value.quantity);
    total_trade_value += (value.price * value.quantity);
});

var average = (total_trade_value / total_quantity);
return average;
}


80000笔交易的平均执行时间约为7.5s。

不错,我想是这样,但是问题是我需要var startfrom = trade_time - duration中的持续时间是可调的,这引起了问题,因为我的getAverage函数根据startfrom删除了所有元素,其本身取决于持续时间,所以如果在开始时持续时间= 10,然后持续时间更改为20,则无论如何都只能回溯过去10秒钟的平均值。

一种解决方案是复制数组以保留“完整”副本,但是我的函数每次都会初始化所有元素,并且速度会变慢。
我尝试过的第二个选择是不删除项目并使用:

Object.keys(trades).filter(t => t>=startfrom).forEach(function (time) {
    var value = trades[time];
    total_quantity += parseFloat(value.quantity);
    total_trade_value += (value.price * value.quantity);
});


它慢了大约300倍,所以这真的是个错误的选择,我想知道您会怎么想?

谢谢。

PS:我当时在考虑使用数组,因为我的键始终是数字(时间戳),但是如果我使用数组,我最终将得到数百万个空索引,这会不会再变慢?

最佳答案

这是几个(密切相关的)想法。

想法1

随着每笔交易的到来,将其推入一个数组中(如果交易可能会无序到达,则将其拼接到数组中)。通过钩子或弯曲,确保阵列按时间戳顺序保留。然后,当您的非套接字代码从套接字获取数据(作为交易)并计算出平均值时,所需的数据将始终位于数组的一端。一旦达到不合格交易,计算就可以停止(跳出循环)。

想法2

与“创意1”类似,但不维护一系列原始交易,而是存储一系列“统计对象”,每个对象代表一个时间片-可能价值不超过15秒的交易,却价值不超过5分钟。

在每个统计对象中,汇总trade.quantitytrade.quantity * trade.price。这将允许计算时间片的平均值,但是更重要的是,在计算平均值之前,可以通过简单的加法将两个或多个时间片进行组合。

这可以通过两个相互依赖的构造函数来实现:

/*
 * Stats_store() Constructor
 * Description:
 *    A constructor, instances of which maintain an array of Stats_store() instances (each representing a time-slice),
 *    and receive a series of timestamped "trade" objects of the form { price:"10", quantity:"500" }.
 *    On receipt of a trade object, an exiting Stats_store() instance is found (by key based on timestamp) or a new one is created,
 *    then the found/created Stats_store's .addTrade(trade)` method is called.
 * Methods:
 *    .addTrade(timestamp, trade): called externally
 *    .getMean(millisecondsAgo): called externally
 *    .timeStampToKey(timestamp): called internally
 *    .findByKey(key): called internally
 * Example: var myStats_store = new Stats_store(101075933);
 * Usage:
 */
const Stats_store = function(granularity) {
    this.buffer = [];
    this.granularity = granularity || 60000; // milliseconds (default 1 minute)
};
Stats_store.prototype = {
    'addTrade': function(timestamp, trade) {
        let key = this.timeStampToKey(timestamp);
        let statObj = this.findByKey(key);
        if (!statObj) {
            statObj = new StatObj(key);
            this.buffer.unshift(statObj);
        }
        statObj.addTrade(trade);
        return this;
    },
    'timeStampToKey': function (timestamp) {
        // Note: a key is a "granulated" timestamp - the leading edge of a timeslice.
        return Math.floor(timestamp / this.granularity); // faster than parseInt()
    },
    'findByKey': function(key) {
        for(let i=0; i<this.buffer.length; i++) {
            if(this.buffer[i].key === key) {
                return this.buffer[i];
                break;
            }
            return null;
        }
    },
    'getMean': function(millisecondsAgo) {
        let key = this.timeStampToKey(Date.now() - millisecondsAgo);
        let s = { 'n':0, 'sigma':0 };
        let c = 0;
        for(let i=0; i<this.buffer.length; i++) {
            if(this.buffer[i].isFresherThan(key)) {
                s.n += this.buffer[i].n;
                s.sigma += this.buffer[i].sigma;
                c++;
            } else {
                break;
            }
        }
        console.log(c, 'of', this.buffer.length);
        return s.sigma / s.n; // arithmetic mean
    }
};

/*
 * StatObj() Constructor
 * Description:
 *    A stats constructor, instances of which receive a series of "trade" objects of the form { price:"10", quantity:"500" }.
 *    and to aggregate data from the received trades:
 *       'this.key': represents a time window (passes on construction).
 *       'this.n': is an aggregate of Σ(trade.quantity)
 *       'this.sigma' is an aggregate of trade values Σ(trade.price * trade.quantity)
 *    Together, 'n' and 'sigma' are the raw data required for (or contributing to) an arithmetic mean (average).
 *    NOTE: If variance or SD was required, then the store object would need to accumulate 'sigmaSquared' in addition to 'n' and 'sigma'.
 * Methods:
 *    .addTrade(trade): called externally
 *    .isFresherThan(key): called externally
 * Example: var myStaObj = new StatObj(101075933);
 * Usage: should only be called by Stats_store()
 */
const StatObj = function(key) {
    this.key = key;
    this.n = 0;
    this.sigma = 0;
}
StatObj.prototype = {
    'addTrade': function(trade) { // eg. { price:"10", quantity:"500" }
        this.n += +trade.quantity;
        this.sigma += +trade.quantity * +trade.price;
    },
    'isFresherThan': function(key) {
        return this.key >= key;
    }
};


用法

// Initialisation
let mySocket = new WebSocket("ws://www.example.com/socketserver", "protocolOne");
const stats_store = new Stats_store(2 * 60000); // 2 minutes granularity

// On receiving a new trade (example)
mySocket.onmessage = function(event) {
    let trade = ....; // extract `trade` from event
    let timestamp = ....; // extract `timestamp` from event
    let mean = stats_store.addTrade(timestamp, trade).getMean(10 * 60000); // 10 minutes averaging timeslice.
    console.log(mean); // ... whatever you need to do with the calculated mean.
    // ... whatever else you need to do with `trade` and `timestamp`.
};


通过选择传递给new Stats_store().getMean()的值,可以提供一定程度的灵活性。只要确保第一个值小于第二个值即可。

(2)here的轻度测试(在中等性能的计算机上,在Win7下为Chrome浏览器)上,表明:


性能至少应足以满足您所谈论的那种“交易”率(12小时内为100,000个或每分钟140个)。
内存使用率很高,但短期内不会泄漏。从长远来看,您可能需要“管家”程序来扫尾。


最后,想法(1)和(2)并没有完全不同。

随着传递给granularity的(2)的new Stats_store()常数变小,因此(2)的行为将趋向于(1)的行为。

10-04 21:26