本文介绍了将工作流DAG转换为并行资源分配的算法?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
假设我有一个图,其中节点是各种工作负载,边是工作负载之间的依赖关系。 (这是一个DAG,因为周期性依赖关系不存在。)
我也有一组可以执行工作的代理。
某些工作负荷品种可能会提供给任何代理商,其他工作负荷品种必须交给特定代理商,其他代理商必须交给特定代理商组中的一位代理商。
如何分配工作负载,使其满足以下条件:
工作负载具有持续时间估计值,但为简单起见,假定每个工作负载需要相同的时间进行计算。 (只需将每个工作负载分解为多个串行相关的工作负载,直到每个工作负载实际上都是一个常量操作。)
我知道拓扑DAG排序,但这会产生一个单一的串行节点排序。我有多个代理并行运行,并且这些关系可以通过对任务进行非显而易见的重新排序来进行潜在的大时间优化。
这样做的结果是作为最小整体持续时间的甘特图表呈现最佳状态。事实上,如果你认为这个问题是在团队工程师的一个里程碑里分配bug故障单,为了尽快完成里程碑,你就明白了。 (不...请不要告诉我将我的图导入MS Project,然后导出它:) - 我对它背后的算法感兴趣!)
指向众所周知的算法,软件库或一般问题和原则,非常感谢! 解决方案
除非你拥有无限数量的代理,以便一旦任务的所有前任完成,兼容代理就可以使用,这是一个NP难题。
<无耻的插件>
在我的书
< /无耻插件
下面是该书中的问题和解决方案:
我们需要调度N在M教室讲课。其中一些讲座是其他人的先决条件。你如何选择何时何地举办讲座,尽快完成所有讲座?
解答:
给定一组N个单位持续时间讲座和M个教室。只要在同一个教室不需要同时发生两场讲座,并且所有的优先约束都得到满足,讲座就可以同时进行。
安排这些讲座以尽量减少完成所需时间的问题被称为NP-complete。
这个问题自然是使用图形建模的。我们将讲座建模为顶点,如果u是v的先决条件,则从顶点u到顶点v的边。显然,对于要满足的优先约束,图必须是非循环的。
如果只有一个演讲室,我们可以简单地按照拓扑顺序举行讲座,并以N次完成N次讲座(假设每个讲座都是单位持续时间)。
我们可以通过观察以下内容来制定启发式方法:在任何时候,都有一组优先约束得到满足的讲座。如果这个集合小于M,我们可以安排所有这些;否则,我们需要选择一个子集来安排。
子集选择可以基于以下几个度量:
- 排名顺序讲座基于最长依赖链的长度
- 根据讲座的数量排列讲座的顺序,他们是直接的先决条件。
- 排名讲座基于
我们也可以使用这些标准的组合来排列讲座,它们是直接或间接的先决条件。目前可以调度。
例如,对于每个顶点,我们将其临界性定义为从其到接收器的最长路径的长度。我们通过按照拓扑顺序处理顶点来安排讲座。在我们算法的任何时候,我们都有一套候选讲座 - 这些讲座的先决条件已经安排好了。
如果候选集小于M,我们安排所有的讲座;否则,我们选择M个最重要的讲座并安排这些讲座 - 这个想法是,他们应该更早地进行调度,因为他们正处于更长的依赖链的起点。
标准是启发式的,可能不会导致最优的时间表 - 这是可以预料的,因为问题是NP完全的。可以采用其他启发式方法,例如,我们可以使用依赖于讲座L的讲座的数量作为讲座L的临界性或标准的某种组合。
Say I have a graph where nodes are workloads of various kinds and edges are dependencies between the workloads. (This is a DAG since cyclical dependencies must not exist.)
I also have a set of multiple agents who can perform the work.
Some workload varieties may be given to any agent, others must be given to a specific agent, and others must be given to one agent among a particular group of agents.
How do I assign workloads such that:
No workload is given to an agent until all its blocking workloads are completed
The shortest possible time is required to complete the total workload graph. (Note that minimizing agent idle time is generally good, but not a fundamental requirement - there may be scenarios under which one particular agent idles for longer but the total time to complete all jobs across all agents is at a minimum.)
Workloads have duration estimates, but assume for simplicity's sake that every workload takes equal time to compute. (Just break each workload down into multiple, serially-dependent workloads until every workload is effectively a constant-time operation.)
I'm aware of topological DAG sorting, but that produces a single, serial ordering of nodes. I have multiple agents operating in parallel, and the relationships are such that potentially large timing optimizations can be made by non-obvious reordering of the tasks.
The result of this would be rendered best as a Gantt chart of minimum overall duration. In fact, if you think of the problem as the allocation of bug tickets in a milestone to engineers in a team, with the goal of getting the milestone done ASAP, then you get the idea. (No... please don't tell me to import my graph into MS Project and then export it :) - I'm interested in the algorithm behind it!)
Pointers to well known algorithms, software libraries, or general issues and principles are much appreciated!
解决方案
Unless you have infinite number of agents so that a compatible agent is available as soon as all the predecessors of a task is done, this is an NP-hard problem.
< shameless plug >
A very similar problem is there in my book "Algorithms For Interviews"
< /shameless plug >
Here is the problem and the solution from the book:
We need to schedule N lectures in M classrooms. Some of those lectures are prerequisites for others. How would you choose when and where to hold the lectures in order to finish all the lectures as soon as possible?
Solution:We are given a set of N unit duration lectures and M classrooms. The lectures can be held simultaneously as long as no two lectures need to happen in the same classroom at the same time and all the precedence constraints are met.The problem of scheduling these lectures so as to minimize the time taken to completion is known to be NP-complete.This problem is naturally modeled using graphs. We model lectures as vertices, with an edge from vertex u to vertex v if u is a prerequisite for v. Clearly, the graph must be acyclic for the precedence constraints to be satisfied.If there is just one lecture room, we can simply hold the lectures in topological order and complete the N lectures in N time (assuming each lecture is of unit duration).We can develop heuristics by observing the following: at any time, there is a set of lectures whose precedence constraints have been satisfied. If this set is smaller than M, we can schedule all of them; otherwise, we need to select a subset to schedule.The subset selection can be based on several metrics:
- Rank order lectures based on the length of the longest dependency chain that they are at the start of.
- Rank order lectures based on the number of lectures that they are immediate prerequisites for.
- Rank order lectures based on the total number of lectures that they are direct or indirect prerequisites for.
We can also use combinations of these criteria to order the lectures that are currently schedulable.For example, for each vertex, we define its criticality to be the length of a longest path from it to a sink. We schedule lectures by processing vertices in topological order. At any point in our algorithm, we have a set of candidate lectures-these are the lectures whose prerequisites have already been scheduled.If the candidate set is less than size M, we schedule all the lectures; otherwise, we choose the M most critical lectures and schedule those-the idea is that they should be scheduled sooner since they are at the start of longer dependency chains.The criterion is heuristic and may not lead to optimum schedules-this is to be expected since the problem is NP-complete. Other heuristics may be employed, e.g., we may use the number of lectures that depend on lecture L as the criticality of lecture L or some combination of the criterion.
这篇关于将工作流DAG转换为并行资源分配的算法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!