我的Web应用程序上有图表,每当我收到新信息时,都需要更新图表。
现在我正在做模拟,所以我用大约100000个数据(在json中)进行了回测(但是如果浏览器和硬件可以处理的话,可能是数百万个)。
因此,我需要对我的代码进行尽可能的优化。
我有这样的对象:
var trades = {"1515867288390":{price:"10", quantity:"500"},
"1515867289541":{price:"9", quantity:"400"},
"1515867295400":{price:"11", quantity:"750"},
"1515867296500":{price:"7", quantity:"1100"},
...}
每次我在交易中扫描一个对象时,我都希望获得最近X秒钟的中间价格,因此我有$ .each(交易,getAverage ...)
getAverage = function (trade_time) {
var total_quantity = 0;
var total_trade_value = 0;
var startfrom = trade_time - duration;
Object.keys(trades).forEach(function (time) {
if (time < startfrom)
delete trades[time];
});
$.each(trades, function (key, value) {
total_quantity += parseFloat(value.quantity);
total_trade_value += (value.price * value.quantity);
});
var average = (total_trade_value / total_quantity);
return average;
}
80000笔交易的平均执行时间约为7.5s。
不错,我想是这样,但是问题是我需要
var startfrom = trade_time - duration
中的持续时间是可调的,这引起了问题,因为我的getAverage函数根据startfrom删除了所有元素,其本身取决于持续时间,所以如果在开始时持续时间= 10,然后持续时间更改为20,则无论如何都只能回溯过去10秒钟的平均值。一种解决方案是复制数组以保留“完整”副本,但是我的函数每次都会初始化所有元素,并且速度会变慢。
我尝试过的第二个选择是不删除项目并使用:
Object.keys(trades).filter(t => t>=startfrom).forEach(function (time) {
var value = trades[time];
total_quantity += parseFloat(value.quantity);
total_trade_value += (value.price * value.quantity);
});
它慢了大约300倍,所以这真的是个错误的选择,我想知道您会怎么想?
谢谢。
PS:我当时在考虑使用数组,因为我的键始终是数字(时间戳),但是如果我使用数组,我最终将得到数百万个空索引,这会不会再变慢?
最佳答案
这是几个(密切相关的)想法。
想法1
随着每笔交易的到来,将其推入一个数组中(如果交易可能会无序到达,则将其拼接到数组中)。通过钩子或弯曲,确保阵列按时间戳顺序保留。然后,当您的非套接字代码从套接字获取数据(作为交易)并计算出平均值时,所需的数据将始终位于数组的一端。一旦达到不合格交易,计算就可以停止(跳出循环)。
想法2
与“创意1”类似,但不维护一系列原始交易,而是存储一系列“统计对象”,每个对象代表一个时间片-可能价值不超过15秒的交易,却价值不超过5分钟。
在每个统计对象中,汇总trade.quantity
和trade.quantity * trade.price
。这将允许计算时间片的平均值,但是更重要的是,在计算平均值之前,可以通过简单的加法将两个或多个时间片进行组合。
这可以通过两个相互依赖的构造函数来实现:
/*
* Stats_store() Constructor
* Description:
* A constructor, instances of which maintain an array of Stats_store() instances (each representing a time-slice),
* and receive a series of timestamped "trade" objects of the form { price:"10", quantity:"500" }.
* On receipt of a trade object, an exiting Stats_store() instance is found (by key based on timestamp) or a new one is created,
* then the found/created Stats_store's .addTrade(trade)` method is called.
* Methods:
* .addTrade(timestamp, trade): called externally
* .getMean(millisecondsAgo): called externally
* .timeStampToKey(timestamp): called internally
* .findByKey(key): called internally
* Example: var myStats_store = new Stats_store(101075933);
* Usage:
*/
const Stats_store = function(granularity) {
this.buffer = [];
this.granularity = granularity || 60000; // milliseconds (default 1 minute)
};
Stats_store.prototype = {
'addTrade': function(timestamp, trade) {
let key = this.timeStampToKey(timestamp);
let statObj = this.findByKey(key);
if (!statObj) {
statObj = new StatObj(key);
this.buffer.unshift(statObj);
}
statObj.addTrade(trade);
return this;
},
'timeStampToKey': function (timestamp) {
// Note: a key is a "granulated" timestamp - the leading edge of a timeslice.
return Math.floor(timestamp / this.granularity); // faster than parseInt()
},
'findByKey': function(key) {
for(let i=0; i<this.buffer.length; i++) {
if(this.buffer[i].key === key) {
return this.buffer[i];
break;
}
return null;
}
},
'getMean': function(millisecondsAgo) {
let key = this.timeStampToKey(Date.now() - millisecondsAgo);
let s = { 'n':0, 'sigma':0 };
let c = 0;
for(let i=0; i<this.buffer.length; i++) {
if(this.buffer[i].isFresherThan(key)) {
s.n += this.buffer[i].n;
s.sigma += this.buffer[i].sigma;
c++;
} else {
break;
}
}
console.log(c, 'of', this.buffer.length);
return s.sigma / s.n; // arithmetic mean
}
};
/*
* StatObj() Constructor
* Description:
* A stats constructor, instances of which receive a series of "trade" objects of the form { price:"10", quantity:"500" }.
* and to aggregate data from the received trades:
* 'this.key': represents a time window (passes on construction).
* 'this.n': is an aggregate of Σ(trade.quantity)
* 'this.sigma' is an aggregate of trade values Σ(trade.price * trade.quantity)
* Together, 'n' and 'sigma' are the raw data required for (or contributing to) an arithmetic mean (average).
* NOTE: If variance or SD was required, then the store object would need to accumulate 'sigmaSquared' in addition to 'n' and 'sigma'.
* Methods:
* .addTrade(trade): called externally
* .isFresherThan(key): called externally
* Example: var myStaObj = new StatObj(101075933);
* Usage: should only be called by Stats_store()
*/
const StatObj = function(key) {
this.key = key;
this.n = 0;
this.sigma = 0;
}
StatObj.prototype = {
'addTrade': function(trade) { // eg. { price:"10", quantity:"500" }
this.n += +trade.quantity;
this.sigma += +trade.quantity * +trade.price;
},
'isFresherThan': function(key) {
return this.key >= key;
}
};
用法
// Initialisation
let mySocket = new WebSocket("ws://www.example.com/socketserver", "protocolOne");
const stats_store = new Stats_store(2 * 60000); // 2 minutes granularity
// On receiving a new trade (example)
mySocket.onmessage = function(event) {
let trade = ....; // extract `trade` from event
let timestamp = ....; // extract `timestamp` from event
let mean = stats_store.addTrade(timestamp, trade).getMean(10 * 60000); // 10 minutes averaging timeslice.
console.log(mean); // ... whatever you need to do with the calculated mean.
// ... whatever else you need to do with `trade` and `timestamp`.
};
通过选择传递给
new Stats_store()
和.getMean()
的值,可以提供一定程度的灵活性。只要确保第一个值小于第二个值即可。(2)here的轻度测试(在中等性能的计算机上,在Win7下为Chrome浏览器)上,表明:
性能至少应足以满足您所谈论的那种“交易”率(12小时内为100,000个或每分钟140个)。
内存使用率很高,但短期内不会泄漏。从长远来看,您可能需要“管家”程序来扫尾。
最后,想法(1)和(2)并没有完全不同。
随着传递给
granularity
的(2)的new Stats_store()
常数变小,因此(2)的行为将趋向于(1)的行为。