I have a weighted graph with (in practice) up to 50,000 vertices. Given a vertex, I want to randomly choose an adjacent vertex based on the relative weights of all adjacent edges.
How should I store this graph in memory so that making the selection is efficient? What is the best algorithm? It could be as simple as a key-value store for each vertex, but that might not lend itself to the most efficient algorithm. I'll also need to be able to update the network.
Note that I'd like to take only one "step" at a time.
More formally: given a weighted, directed, and potentially complete graph, let W(a,b) be the weight of edge a->b and let W_a be the sum of the weights of all edges from a. Given an input vertex v, I want to choose a vertex randomly, where the likelihood of choosing vertex x is W(v,x) / W_v.
Say W(v,a) = 2, W(v,b) = 1, W(v,c) = 1.
Given input v, the function should return a with probability 0.5 and b or c with probability 0.25.
If you are concerned about the performance of generating the random walk, you can use the alias method to build a data structure that fits your requirement of choosing a random outgoing edge quite well. The only overhead is that you have to assign each directed edge a probability weight and a so-called alias edge.
So for each node you have a vector of outgoing edges, together with the weight and the alias edge of each. Then you can choose a random edge in constant time (only the generation of the data structure takes linear time, with respect to the total number of edges or the number of edges of the node). In the example an edge is denoted by ->[NODE], and node v corresponds to the example given above:
Node v
->a (p=1, alias= ...)
->b (p=3/4, alias= ->a)
->c (p=3/4, alias= ->a)
Node a
->c (p=1/2, alias= ->b)
->b (p=1, alias= ...)
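As a rough sketch of how such a table could be generated for one node, here is one common construction of the alias method (often attributed to Vose). The function name `build_alias_table` and the `(target, p, alias)` tuple layout are my own assumptions for illustration, and exact fractions are used only so the result can be compared with the table for node v; in practice you would probably use floats. All weights are assumed to be positive.

```python
from fractions import Fraction

def build_alias_table(edges):
    """edges: list of (target, weight) pairs with positive weights.
    Returns a list of (target, p, alias_target), one entry per edge.
    Hypothetical helper, not part of any library."""
    n = len(edges)
    total = sum(Fraction(w) for _, w in edges)
    # Scale the weights so that the average entry is exactly 1.
    scaled = [Fraction(w) * n / total for _, w in edges]
    small = [i for i, s in enumerate(scaled) if s < 1]
    large = [i for i, s in enumerate(scaled) if s >= 1]
    prob = [Fraction(1)] * n   # entries never paired keep p = 1
    alias = [None] * n         # and need no alias
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]    # keep edge s with this probability
        alias[s] = l           # otherwise fall through to edge l
        scaled[l] -= 1 - scaled[s]  # l donated the missing mass
        (small if scaled[l] < 1 else large).append(l)
    return [
        (target, prob[i], None if alias[i] is None else edges[alias[i]][0])
        for i, (target, _) in enumerate(edges)
    ]

table_v = build_alias_table([("a", 2), ("b", 1), ("c", 1)])
```

For the weights 2, 1, 1 from the question this gives p=1 for ->a and p=3/4 with alias ->a for both ->b and ->c, which matches the table for node v. Updating an edge weight means rebuilding the table of that one node.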
If you want to choose an outgoing edge (i.e. the next node) you just have to generate a single random number r, uniform from the interval [0,1).
You then get no = floor(N[v] * r) and pv = frac(N[v] * r), where N[v] is the number of outgoing edges. I.e. you first pick each edge with the exact same probability (namely 1/3 in the example of node v).
Then you compare the assigned probability p of this edge with the generated value pv. If pv is less than p you keep the edge selected before, otherwise you choose its alias edge.
If for example we have r=0.6
from our random number generator we have
no = floor(0.6*3) = 1
pv = frac(0.6*3) = 0.8
Therefore we choose the second outgoing edge (note the index starts with zero) which is
->b (p=3/4, alias= ->a)
and switch to the alias edge ->a
since p=3/4 < pv
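The selection step and the r=0.6 example above could be sketched like this. The table for node v is hard-coded, and the names `TABLE_V` and `pick_next` are just placeholders I made up; passing `r` explicitly is only there so the worked example can be reproduced.

```python
import math
import random

# Alias table for node v from the example: (target, p, alias target).
TABLE_V = [("a", 1.0, None), ("b", 0.75, "a"), ("c", 0.75, "a")]

def pick_next(table, r=None):
    """Pick the next node from an alias table using one random number."""
    if r is None:
        r = random.random()      # uniform in [0, 1)
    x = len(table) * r
    no = math.floor(x)           # index of the edge picked uniformly
    pv = x - no                  # fractional part, again uniform in [0, 1)
    target, p, alias = table[no]
    # Keep the picked edge if pv < p, otherwise switch to its alias.
    return target if pv < p else alias

print(pick_next(TABLE_V, 0.6))   # no=1, pv is about 0.8 >= 3/4, so this prints a
```

Calling `pick_next(TABLE_V)` without `r` takes one random step from v, which is the "one step at a time" use case from the question.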
For the example of node v
we therefore
- choose edge ->b with probability 1/3*3/4 = 1/4 (i.e. whenever no=1 and pv<3/4)
- choose edge ->c with probability 1/3*3/4 = 1/4 (i.e. whenever no=2 and pv<3/4)
- choose edge ->a with probability 1/3 + 1/3*1/4 + 1/3*1/4 = 1/2 (i.e. whenever no=0 or pv>=3/4)
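If you want to double-check the probabilities listed above, the distribution implied by an alias table can be computed exactly. This is only a small sanity check under the same assumed table layout as in the example, with `exact_distribution` being a made-up helper name.

```python
from collections import defaultdict
from fractions import Fraction

# Alias table for node v from the example: (target, p, alias target).
TABLE_V = [
    ("a", Fraction(1), None),
    ("b", Fraction(3, 4), "a"),
    ("c", Fraction(3, 4), "a"),
]

def exact_distribution(table):
    """Return the exact probability of each target implied by the table."""
    n = len(table)
    dist = defaultdict(Fraction)
    for target, p, alias in table:
        dist[target] += p / n            # slot picked and edge kept
        if alias is not None:
            dist[alias] += (1 - p) / n   # slot picked, switched to alias
    return dict(dist)

dist_v = exact_distribution(TABLE_V)
print(dist_v)
```

For node v this yields 1/2 for a and 1/4 each for b and c, i.e. exactly W(v,x) divided by the total weight 4.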