
Question

I am writing an application whose purpose is to optimize a trading strategy. For the sake of simplicity, assume we have one strategy that says "enter here", another that says "exit here if in a trade", and two models: one says how much risk we should take (how much we lose if we're on the wrong side of the market) and the other says how much profit we should take (i.e. how much we gain if the market agrees with us).

For simplicity's sake, I will refer to historical realized trades as ticks. That means if I "enter on tick 28", I would have entered a trade at the time of the 28th trade in my dataset, at the price of that trade. Ticks are stored chronologically in my dataset.

Now, imagine the entry strategy on the whole dataset comes up with 500 entries. For each entry, I can precalculate the exact entry tick. I can also calculate the exit point determined by the exit strategy for each entry (again as a tick number). For each entry, I can also precalculate the modeled loss and profit and the ticks where these losses or profits would have been hit. The last thing that remains is calculating what would have happened first: exit on strategy, exit on a loss, or exit on a profit.

Hence, I iterate through the array of trades and calculate exitTick[i] = min(exitTickByStrat[i], exitTickByLoss[i], exitTickByProfit[i]). And the whole process is bloody slow (let's say I do this 100M times). I suspect cache misses are the main culprit. The question is: can this be made faster somehow? I have to iterate through 4 arrays of non-trivial length. One suggestion I have come up with is to group the data into tuples of four, i.e. have one array of structures like (entryTick, exitOnStrat, exitOnLoss, exitOnProfit). This might be faster due to better cache locality, but I cannot say for sure. The reason I haven't tested it so far is that instrumenting profilers somehow don't work on release binaries of my app, while sampling profilers seem unreliable to me (I have tried Intel's profiler).
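To make the comparison concrete, here is a minimal sketch of both layouts, assuming the arrays hold 32-bit tick indices. The function and struct names (evaluateExitsSoA, evaluateExitsAoS, TradeExits) are hypothetical; only the array names come from the question.

#include <vector>
#include <algorithm>
#include <cstdint>

// Current layout: four separate arrays ("structure of arrays").
// Each iteration may touch four different cache lines.
void evaluateExitsSoA(const std::vector<int32_t>& exitTickByStrat,
                      const std::vector<int32_t>& exitTickByLoss,
                      const std::vector<int32_t>& exitTickByProfit,
                      std::vector<int32_t>& exitTick)
{
    for (size_t i = 0; i < exitTick.size(); ++i)
        exitTick[i] = std::min(exitTickByStrat[i],
                      std::min(exitTickByLoss[i], exitTickByProfit[i]));
}

// Suggested layout: one array of structures ("array of structures"),
// so all fields of trade i sit on the same cache line.
struct TradeExits
{
    int32_t entryTick;
    int32_t exitOnStrat;
    int32_t exitOnLoss;
    int32_t exitOnProfit;
};

void evaluateExitsAoS(const std::vector<TradeExits>& trades,
                      std::vector<int32_t>& exitTick)
{
    for (size_t i = 0; i < trades.size(); ++i)
        exitTick[i] = std::min(trades[i].exitOnStrat,
                      std::min(trades[i].exitOnLoss, trades[i].exitOnProfit));
}

Note that with 500 entries, each int32_t array is only about 2 KB, so all four arrays fit in L1 cache together either way; the layout difference would mainly matter for much longer arrays.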

So the final questions are: can this problem be made faster? What is the best profiler to use for memory profiling of release binaries? I work on Win7, VS2010.

EDIT:

Many thanks to all. I tried to simplify my original question as much as possible, hence the confusion. Just to make sure it's readable: "target" means an envisaged/realized profit, "stop" means an envisaged/realized loss.

The optimizer is a brute-force one. So, I have some strat settings (e.g. indicator periods, whatever), then min/max breakEvenAfter/breakEvenBy, and then formulas that give you stop/target values in ticks. These formulas are also objects of the optimization. Hence, the optimization is structured like

for each in params
{
    calculateEntries()
    for each in beSettings
    {
        precalculateBeData()
        for each in targetFormulaSettings
        {
            precalculateTargetsAndRespectiveExitTicks()
            for each in stopFormulaSettings
            {
                precalculateStopsAndRespectiveExitTicks()
                evaluateExitsAndDetermineImprovement()
            }
        }
    }
}

So I precalculate as much as possible and only calculate something when I need it. And out of 30 seconds, the calculation spends 25 seconds in the evaluateExitsAndDetermineImprovement() function, which does just what I described in the original question, i.e. picks min(exitOnPattern, exitOnStop, exitOnTarget). The reason I need to call the function 100M times is that I have 100M combinations of all the params combined. But within the innermost loop, only the exitOnStops array changes. I can post some code if that helps. I'm grateful for all the comments!
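Since only exitOnStops changes in the innermost loop, one thing worth trying (a hypothetical sketch, not code from the question) is to hoist the loop-invariant half of the min one level up, so the inner loop does one comparison per trade instead of two. The helper name precomputeInvariantMin and the patternOrTarget array are assumptions for illustration.

#include <vector>
#include <algorithm>
#include <cstdint>

// Hypothetical helper: computed once per target-formula setting, i.e.
// just before the stopFormulaSettings loop, since exitOnPattern and
// exitOnTarget do not change inside that loop.
void precomputeInvariantMin(const std::vector<int32_t>& exitOnPattern,
                            const std::vector<int32_t>& exitOnTarget,
                            std::vector<int32_t>& patternOrTarget)
{
    patternOrTarget.resize(exitOnPattern.size());
    for (size_t i = 0; i < exitOnPattern.size(); ++i)
        patternOrTarget[i] = std::min(exitOnPattern[i], exitOnTarget[i]);
}

// Inner-loop work: only exitOnStop varies between calls, so each trade
// now costs a single min instead of two.
void evaluateExits(const std::vector<int32_t>& patternOrTarget,
                   const std::vector<int32_t>& exitOnStop,
                   std::vector<int32_t>& exitTick)
{
    for (size_t i = 0; i < exitTick.size(); ++i)
        exitTick[i] = std::min(patternOrTarget[i], exitOnStop[i]);
}

This roughly halves the comparisons in the hot path at the cost of one extra pass per target-formula iteration, which is amortized over all the stop-formula settings.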

Answer

So, after some work, I understood the advice by Alexandre C. When I ran cache-miss profiling, I found that out of 15M calls to the evaluateExits() function there were only 30K cache misses, hence the performance of this function cannot be hindered by the cache. I therefore had to "start believing" that VTune is actually producing valid results, albeit weird ones. Since the analysis of the VTune output no longer matches this thread's title, I decided to start a new thread. Thank you all for your opinions and recommendations.
