javascript - 走树结构时如何避免`shareReplay`

我正在尝试使用RxJS（PM为pnpm，PR为here）重写程序包管理器。

在重写过程中，我使用了很多.shareReplay(Infinity)，这被告知是不好的（我是反应式编程的初学者）

有人可以提出一种替代方法，如何使用.shareReplay(Infinity)重写如下代码：

'use strict'
const Rx = require('@reactivex/rxjs')

const nodes = [
  {id: 'a', children$: Rx.Observable.empty()},
  {id: 'b', children$: Rx.Observable.empty()},
  {id: 'c', children$: Rx.Observable.from(['a', 'b', 'd'])},
  {id: 'd', children$: Rx.Observable.empty()},
  {id: 'e', children$: Rx.Observable.empty()},
]

// I want this stream to be executed once, that is why the .shareReplay
const node$ = Rx.Observable.from(nodes).shareReplay(Infinity)

const children$ = node$.mergeMap(node => node.children$.mergeMap(childId => node$.single(node => node.id === childId)))

children$.subscribe(v => console.log(v))

最佳答案

groupBy运算符应在此处工作。从PR来看，这可能是过分简化了，但这里有：

'use strict'
const Rx = require('@reactivex/rxjs')

const nodes = [
  {id: 'a', children$: Rx.Observable.empty()},
  {id: 'b', children$: Rx.Observable.empty()},
  {id: 'c', children$: Rx.Observable.from(['a', 'b', 'd'])},
  {id: 'd', children$: Rx.Observable.empty()},
  {id: 'e', children$: Rx.Observable.empty()},
]

Rx.Observable.from(nodes)
 // Group each of the nodes by its id
 .groupBy(node => node.id)
 // Flatten out each of the children by only forwarding children with the same id
 .flatMap(group$ => group$.single(childId => group$.key === childId))
 .subscribe(v => console.log(v));

编辑：比我想象的困难

好的，因此，在我通读第二篇文章时，我发现它需要的工作比我最初想象的要多，因此不能简单地简化它。基本上，由于没有神奇的子弹，因此您将不得不在内存复杂度和时间复杂度之间进行选择。

从优化的角度来看，如果初始源只是一个Array，则可以删除shareReplay，它会以完全相同的方式工作，因为订阅和ArrayObservable时，唯一的开销将是遍历Array，因此重新运行源代码实际上并没有任何额外成本。

基本上，我认为您可以想到两个维度，节点数m和子级平均数n。在速度优化版本中，您最终将不得不两次运行m，并且需要遍历“ n”个节点。由于您有m * n个孩子，最坏的情况是他们都是唯一的。这意味着如果我没有记错的话，您需要执行(m + m*n)简化为O(m*n)的操作。这种方法的缺点是您需要同时具有（nodeId-> Node）的Map和用于删除重复依赖项的Map。

'use strict'
const Rx = require('@reactivex/rxjs')

const nodes = [
  {id: 'a', children$: Rx.Observable.empty()},
  {id: 'b', children$: Rx.Observable.empty()},
  {id: 'c', children$: Rx.Observable.from(['a', 'b', 'd'])},
  {id: 'd', children$: Rx.Observable.empty()},
  {id: 'e', children$: Rx.Observable.empty()},
]

const node$ = Rx.Observable.from(nodes);

// Convert Nodes into a Map for faster lookup later
// Note this will increase your memory pressure.
const nodeMap$ = node$
  .reduce((map, node) => {
    map[node.id] = node;
    return map;
  });


node$
  // Flatten the children
  .flatMap(node => node.children$)
  // Emit only distinct children (you can remove this to relieve memory pressure
  // But you will still need to perform de-duping at some point.
  .distinct()
  // For each child find the associated node
  .withLatestFrom(nodeMap$, (childId, nodeMap) => nodeMap[childId])
  // Remove empty nodes, this could also be a throw if that is an error
  .filter(node => !!node)
  .subscribe(v => console.log(v));

替代方法是使用与您类似的方法，该方法侧重于以性能为代价降低内存压力。请注意，就像我说的那样，如果您的源是一个数组，则基本上可以删除shareReplay，因为重新评估时它所做的就是重新迭代该数组。这消除了附加Map的开销。虽然我认为您仍然需要去除重复项。如果O(m^2*n)很小，则最坏情况的运行时复杂度将为O(m^2)或简单地为n，因为您将需要遍历所有子项，并且对于每个子项，还需要再次遍历m找到匹配的节点。

const node$ = Rx.Observable.from(nodes);
node$
  // Flatten the children
  .flatMap(node => node.children$)
  // You may still need a distinct to do the de-duping
  .flatMap(childId => node$.single(n => n.id === childId)));

我想说第一种选择在几乎所有情况下都是可取的，但我将由您决定使用情况。在某些情况下，可能是您建立了一种启发式算法，以选择一种算法而不是另一种算法。

旁注：抱歉，这并不容易，但是喜欢pnpm，所以请继续努力！

关于javascript - 走树结构时如何避免`shareReplay`，我们在Stack Overflow上找到一个类似的问题：https://stackoverflow.com/questions/45908003/