algorithm - Algorithm for optimizing the order of actions with cooldowns

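I can choose one action from a list of "actions" to perform once per second. Each action in the list has a numeric value representing how much it is worth, and a value representing its "cooldown": the number of seconds I have to wait before using that action again. The list might look like this:
Action A has a value of 1 and a cooldown of 2 seconds
Action B has a value of 1.5 and a cooldown of 3 seconds
Action C has a value of 2 and a cooldown of 5 seconds
Action D has a value of 3 and a cooldown of 10 seconds
So in this situation, the order ABA would have a total value of (1 + 1.5 + 1) = 3.5, and it would be acceptable because the first use of A happens at 1 second and the final use of A happens at 3 seconds, and the difference between the two, 2 seconds, is greater than or equal to A's cooldown of 2 seconds. The order AAB would not work, because the two uses of A are only one second apart, which is less than the cooldown.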
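The cooldown rule can be checked mechanically. Below is a minimal sketch, assuming one action per second with the first action used at second 1. The names `ACTIONS`, `isValidOrder`, and `totalValue` are hypothetical and only serve this illustration.

```javascript
// Hypothetical table mirroring the example list above.
const ACTIONS = {
  A: { value: 1, cooldown: 2 },
  B: { value: 1.5, cooldown: 3 },
  C: { value: 2, cooldown: 5 },
  D: { value: 3, cooldown: 10 },
};

// An order is a string such as "ABA"; the i-th character is used at second i + 1.
function isValidOrder(order) {
  const lastUsed = {}; // action name -> second of its most recent use
  for (let i = 0; i < order.length; i++) {
    const name = order[i];
    const second = i + 1;
    if (name in lastUsed && second - lastUsed[name] < ACTIONS[name].cooldown) {
      return false; // reused before its cooldown elapsed
    }
    lastUsed[name] = second;
  }
  return true;
}

// Sum of the values of every action in the order.
function totalValue(order) {
  let total = 0;
  for (const name of order) total += ACTIONS[name].value;
  return total;
}

console.log(isValidOrder("ABA"), totalValue("ABA")); // true 3.5
console.log(isValidOrder("AAB")); // false
```

Any search over orders, brute force or otherwise, could use a check like this to discard orders that break a cooldown.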
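My problem is trying to optimize the order of actions to maximize the total value over a given number of actions. Obviously, if you only use one action, the optimal order is to do action D, for a total value of 3. The maximum value from two actions comes from doing CD or DC, for a total value of 5. It gets more complicated when you do 10 or 20 or 100 actions. I can't find a way to optimize the order of actions without brute-forcing it, which makes the complexity exponential in the total number of actions you want to optimize the order for. That becomes infeasible past about 15 total.
So, is there any way to find the optimal order with less complexity? Has this problem been studied? I imagine there could be some kind of weighted-graph algorithm that works on this, but I have no idea how it would work, let alone how to implement it.
Sorry if this is confusing. It's kind of weird conceptually, and I couldn't find a better way to frame it.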

Best answer

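EDIT: Here is a correct solution using a heavily modified Dijkstra's algorithm:
Dijkstra's algorithm is used to find the shortest path, given a map (a graph, abstractly), which is a series of nodes (usually locations, but for this example let's say they are actions) connected to each other by arcs (in this case, instead of a distance, each arc has a 'value').
Here is the structure in essence.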

class Graph{//in most implementations these are not Arrays, but Maps. Honestly, for your needs you don't need a graph, just nodes and arcs... this is just used to keep track of them.
  constructor(){
    this.nodes=[];//Node[]
    this.arcs=[];//Arc[]
  }
}
class Node{//this represents an action
  constructor(value){
    this.options=[];//Arc[]; for this implementation, this will always be a list of all possible Actions to use.
    this.value=value;//Action value
  }
}
class Arc{
  constructor(start,end){
    this.start=start;//the last action used
    this.end=end;//the action after that
    this.dist=1;//1 second
  }
}

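We can use this data type to map out all of the viable options and pick the best solution, based on the final total of each path. Therefore, the more seconds ahead you search for a pattern, the more likely you are to find a very good path.
Every segment of road on the map has a distance, which represents its value, and every stop on the road is a one-second mark, since that is the time to decide where to go (what action to execute) next.
For simplicity's sake, let's say that A and B are the only viable options.
na means no action, because no actions are available.
If you are traveling for 4 seconds (the higher the amount, the better the results), your choices are...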

A->na->A->na->A
B->na->na->B->na
A->B->A->na->B
B->A->na->B->A
...
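The candidate paths listed above can be scored by summing the value of each stop, with na counted as 0. This is a small sketch using the values of A and B from the question; `VALUES` and `pathTotal` are hypothetical names introduced only for this illustration.

```javascript
// Values of A and B from the question; "na" (no action) contributes nothing.
const VALUES = { A: 1, B: 1.5, na: 0 };

// Total value of a path written like "A->B->A->na->B".
function pathTotal(path) {
  return path.split("->").reduce((sum, step) => sum + VALUES[step], 0);
}

const candidates = [
  "A->na->A->na->A",  // total 3
  "B->na->na->B->na", // total 3
  "A->B->A->na->B",   // total 5
  "B->A->na->B->A",   // total 5
];
for (const path of candidates) {
  console.log(path, pathTotal(path));
}
```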

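There are more, but I already know that the best path is B->A->na->B->A, because its total value (5) is the highest; A->B->A->na->B ties it. So, the best pattern for dealing with this combination of actions is (at least after analyzing it for 4 seconds) B->A->na->B->A
This is actually a very simple recursive algorithm.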

/*
 actions is the list of all actions (Nodes). In this example, every action is seen as a viable option at every step, so it's as if every 'place' on the map has a path going to every other place.
 numLeft is the number of seconds (steps) left to run the simulation. The higher the initial value, the more desirable the results.
 now is the current second, and lastUsed maps an action's name to the second it was last used.

 This is plain brute force: it tries every option at every second, so its running time grows exponentially with numLeft.
*/
function getOptimal(actions,numLeft,now,lastUsed){
  if(numLeft==0){
    return {total:0,path:[]};//nothing left to do, worth 0
  }
  //start with 'na' (no action this second) as the best option so far
  var rest=getOptimal(actions,numLeft-1,now+1,lastUsed);
  var best={total:rest.total,path:['na'].concat(rest.path)};
  for(var i=0;i<actions.length;i++){
    var opt=actions[i];
    if(opt.name in lastUsed&&now-lastUsed[opt.name]<opt.cooldown){
      continue;//still cooling down
    }
    var used=Object.assign({},lastUsed);//this is a COPY, so other branches are unaffected
    used[opt.name]=now;//everything below this in the recursion is one second ahead
    var potential=getOptimal(actions,numLeft-1,now+1,used);
    if(opt.value+potential.total>best.total){//if it makes it better, use it!
      best={total:opt.value+potential.total,path:[opt.name].concat(potential.path)};
    }
  }
  return best;
}
function getOptimalExample(){
  var A={name:'A',value:1,cooldown:2};
  var B={name:'B',value:1.5,cooldown:3};
  console.log(getOptimal([A,B],5,1,{}));//5 steps: the starting action plus 4 seconds of travel
}

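End of edit.
I'm a bit confused by the question, but...
If you have a limited number of actions, and that's it, then always pick the action with the most value, unless its cooldown hasn't elapsed yet.
It sounds like you want something like this (a rough sketch):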

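function getOptimal(a){//a is the array of actions, e.g. [A,B,C,D]; each has value, cooldown, and timeSinceLastUsed
  a.sort(function(x,y){return y.value-x.value;});//sort the actions by value, most valuable first
  var theBest=null;
  for(var i=0;i<a.length;++i){//find the most valuable action that is ready
    if(a[i].timeSinceLastUsed>=a[i].cooldown){//its cooldown has elapsed
      theBest=a[i];
      break;//it is worth the most among the ready actions, and you can use it now, so why not?
    }
  }
  for(var j=0;j<a.length;++j){//one second passes: add time to every action's timeSinceLastUsed...
    a[j].timeSinceLastUsed+=1;
  }//That way, some previously used, but more valuable actions will be freed up again.
  if(theBest!==null){
    theBest.timeSinceLastUsed=1;//...except the one just used, whose clock restarts
  }
  return theBest;//null means nothing was ready this second
}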
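Applied once per second, the rule "take the most valuable action that is off cooldown" yields a full schedule. The sketch below assumes the question's four actions and a hypothetical helper, `greedySchedule`, that is not part of the answer. It is a heuristic: it never holds an action back for later, so it does not prove that the resulting order is optimal.

```javascript
// Hypothetical action list taken from the question's example.
const actionList = [
  { name: "A", value: 1, cooldown: 2 },
  { name: "B", value: 1.5, cooldown: 3 },
  { name: "C", value: 2, cooldown: 5 },
  { name: "D", value: 3, cooldown: 10 },
];

// Each second, use the most valuable action that is off cooldown, or "na" if none is.
function greedySchedule(list, seconds) {
  const byValue = [...list].sort((x, y) => y.value - x.value); // most valuable first
  const lastUsed = new Map(); // action name -> second of its most recent use
  const path = [];
  let total = 0;
  for (let now = 1; now <= seconds; now++) {
    const pick = byValue.find(
      (act) => !lastUsed.has(act.name) || now - lastUsed.get(act.name) >= act.cooldown
    );
    if (pick === undefined) {
      path.push("na"); // everything is still cooling down
      continue;
    }
    lastUsed.set(pick.name, now);
    path.push(pick.name);
    total += pick.value;
  }
  return { path, total };
}

// Over 5 seconds this picks D, C, B, A and then has to idle, for a total of 7.5.
console.log(greedySchedule(actionList, 5));
```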