问题描述
许多数据结构使用称为 表示。这是什么意思?为什么要使用它?左边的孩子,右兄弟的表示(LCRS)是一种编码 (一个树结构,其中每个节点可以有任何数量的孩子)使用 (树结构每个节点最多可以有两个孩子)
动机
为了激发这种表现如何工作,我们先考虑一个简单的多路树,就像这样:
A
// | \ \
/ / | \ \
B C D E F
| / | \ / \
GHIJKL
(对低质量ASCII图稿抱歉!)
在这个树结构中,我们可以从树中的任何节点向下导航到其任何子节点。例如,我们可以从A迁移到B,A到C,A到D等。
如果我们想在这样的树中表示一个节点,我们通常会在这里使用某种节点结构/节点类(用C ++编写):
struct Node {
DataType值;
std :: vector< Node *>孩子
};
在LCRS表示中,我们以每个节点最多需要的方式表示多路树两个指针。为了做到这一点,我们将稍微重塑树。而不是使树中的每个节点都存储指向其所有子节点的指针,所以我们将以稍微不同的方式来构造树,方法是让每个节点存储一个指向其子中一个的指针(在LCRS中,最左侧的子节点)。然后,我们将让每个节点存储一个指向其右边兄弟节点的指针,该树的另一个节点是同一个父节点的子节点。在上述树的情况下,我们可以以下列方式表示树:
A
/
/
/
B - > C - > D - > E - > F
/ / /
G H-> I-> J K-> L
请注意,在这个新结构中,仍然可以从父节点导航到其第k个子节点(零索引)。这样做的过程如下:
- 下降到当前节点的左侧子节点。 (这是其孩子列表中的第一个节点)。
- 跟随该孩子的右兄弟指针k次。 (这将带您到节点的第k个孩子)。
- 返回该节点。
例如,要找到根节点A的第三个(零索引子节点),我们将下降到最左侧的子节点(B),然后按照三个右侧链接(访问B,C,D,最后是E)。我们然后到达E节点,它拥有节点A的第三个子节点。
以这种方式表示树的主要原因是即使任何节点可能任何数量的孩子,表示对于每个节点最多需要两个指针:一个节点存储最左边的孩子,一个指针存储右边的兄弟。因此,我们可以使用更简单的节点结构存储多路树:
struct Node {
DataType data ;
Node * leftChild;
Node * rightSibling;
};
这个节点结构与二叉树中的一个节点的形状完全相同(数据加上两个指针) 。因此,LCRS表示可以使用每个节点仅两个指针来表示任意多路树。
分析
$ b $我们现在来看看多路树的两个不同表示的时间和空间复杂性以及一些基本操作。
让我们开始看看最初的动态数组儿童表示法所需的总空间使用量。 n节点树的总内存使用量将多少?那么我们知道如下:
-
每个节点不管子数如何,都为存储的原始数据贡献了空间sizeof(data))和动态数组的空间开销。动态数组(通常)有一个指针存储(指向分配的数组),动态数组中元素总数的一个机器字,以及数组的分配容量的一个机器字(可选)。这意味着每个节点占用sizeof(Data)+ sizeof(Node *)+ 2 * sizeof(机器字)空间。
-
在树中的数组中,将存在n-1个子节点,因为树中的n个节点n - 1个都有父节点。这增加了一个额外的(n-1)* sizeof(Node *)因子。
因此,用法是
现在,让我们与LCRS树形成对照。 LCRS树中的每个节点都存储两个指针(2 * sizeof(Node *))和一个数据元素(sizeof(Data)),因此其总空间为
这里我们看到win:注意,我们不是存储2n * sizeof(机器字)额外的内存以跟踪分配的阵列大小。这意味着LCRS表示使用比常规多路树少得多的内存。
然而,LCRS树结构的基本操作往往比它们在正常的多路树。具体来说,在标准格式(每个节点存储子指针数组)的多路树中,从一个节点导航到其第k个子节点所需的时间由跟随单个指针所需的时间给出。另一方面,在LCRS表示中,所需的时间由遵循k + 1个指针(一个左子指针,然后是k个子指针)所需的时间给出。因此,如果树具有大的分支因子,则在LCRS树结构中进行查找比在相应的多路树结构中要慢得多。
我们因此,可以将LCRS表示法视为提供数据结构存储空间和访问时间。 LCRS表示与原始多路树相比具有更少的内存开销,而多路树给出了每个孩子的恒定时间查找。
使用案例
由于LCRS表示中涉及的时空折衷,通常不使用LCRS表示,除非两个条件之一成立:
- 内存非常稀少,或
- 不需要随机访问节点的子级。
如果您需要在主内存中存储一个惊人的巨大的多路树,则可能会出现这种情况(1)。例如,如果您需要存储包含许多不同种类的,因此需要频繁更新,那么LCRS表示可能是合适的。
案例(2)出现在专门的数据结构中,树结构以非常具体的方式被使用。例如,使用多路树的许多类型的可以进行空间优化通过使用LCRS表示。主要原因是在堆数据结构中,最常见的操作往往是
- 删除树的根并处理每个的孩子,或
- 通过让一棵树与另一棵树一个孩子一起加入两棵树。
可以在LCRS表示中非常有效地执行操作(1)。在LCRS表示中,树的根目录永远不会有一个正确的子节点(因为它没有兄弟节点),因此删除根本根本就意味着剥离根节点并下降到它的左子树。从那里处理每个孩子可以通过简单地走下剩下的树的正确的脊柱并依次处理每个节点来完成。
操作(2)可以非常效率也好。从上面回想一下,在LCRS表示中,树的根没有一个正确的孩子。因此,以LCRS表示方式将两棵树联合在一起是非常容易的。从这两个树开始:
R1 R2
/ /
(children 1)(children 2 )
我们可以这样融合树木:
R1
/
R2
/ \
(children 2)(children 1)
这可以在O(1)时间完成,而且很简单。 LCRS表示具有此属性的事实意味着许多类型的堆优先级队列,例如或通常被表示作为LCRS树。
希望这有帮助!
Many data structures store multi-way trees as binary trees using a representation called the "left-child, right-sibling" representation. What does this mean? Why would you use it?
The left-child, right-sibling representation (LCRS) is a way of encoding a multi-way tree (a tree structure in which each node can have any number of children) using a binary tree (a tree structure in which each node can have at most two children).
Motivation
To motivate how this representation works, let's begin by considering a simple multi-way tree, like this one here:
A
//|\ \
/ / | \ \
B C D E F
| /|\ / \
G H I J K L
(Apologies for the low-quality ASCII artwork!)
In this tree structure, we can navigate downward from any node in the tree to any of its children. For example, we can migrate from A to B, A to C, A to D, etc.
If we wanted to represent a node in a tree like this one, we would normally use some sort of node structure / node class like this one here (written in C++):
struct Node {
DataType value;
std::vector<Node*> children;
};
In the LCRS representation, we represent the multi-way tree in a way where each node requires at most two pointers. To do this, we will slightly reshape the tree. Rather than having each node in the tree store pointers to all of its children, we will structure the tree in a slightly different way by having each node store a pointer to just one of its children (in LCRS, the leftmost child). We will then have each node store a pointer to its right sibling, the next node in the tree that is the child of the same parent node. In the case of the above tree, we might represent the tree in the following way:
A
/
/
/
B -> C -> D -> E -> F
/ / /
G H->I->J K->L
Notice that in this new structure it is still possible to navigate from a parent node to its kth child (zero-indexed). The procedure for doing so is the following:
- Descend into the left child of the current node. (This is the first node in the list of its children).
- Follow that child's right sibling pointer k times. (This takes you to the kth child of the node).
- Return that node.
For example, to find the third (zero-indexed child) of the root node A, we would descend to the leftmost child (B), then follow three right links (visiting B, C, D, and finally E). We then arrive at the node for E, which holds the third child of node A.
The main reason for representing the tree this way is that even though any node may have any number of children, the representation requires at most two pointers for each node: one node to store the leftmost child, and one pointer to store the right sibling. As a result, we can store the multiway tree using a much simpler node structure:
struct Node {
DataType data;
Node* leftChild;
Node* rightSibling;
};
This node structure has exactly the same shape of a node in a binary tree (data plus two pointers). As a result, the LCRS representation makes it possible to represent an arbitrary multiway tree using just two pointers per node.
Analysis
Let us now look at the time and space complexity of the two different representations of the multiway tree and some basic operations on it.
Let's begin by looking at the total space usage required for the initial "dynamic array of children" representation. How much total memory usage will there be for an n-node tree? Well, we know the following:
Each node, regardless of its number of children, contributes space for the raw data stored (sizeof(data)) and the space overhead of the dynamic array. The dynamic array (typically) has one pointer stored (which points at the allocated array), one machine word for the total number of elements in the dynamic array, and (optionally) one machine word for the allocated capacity of the array. This means that each node takes up sizeof(Data) + sizeof(Node *) + 2 * sizeof(machine word) space.
Across all of the dynamic arrays in the tree, there will be n - 1 pointers to children, since of the n nodes in the tree n - 1 of them have parents. That adds in an extra (n - 1) * sizeof(Node *) factor.
Therefore, the total space usage is
Now, let's contrast that with an LCRS tree. Each node in an LCRS tree stores two pointers (2 * sizeof(Node*)) and one data element (sizeof(Data)), so its total space is
And here we see the win: notice that we are not storing 2n * sizeof(machine word) extra memory to keep track of allocated array sizes. This means that the LCRS representation uses considerably less memory than a regular multiway tree.
However, basic operations on the LCRS tree structure tend to take longer than their corresponding operations on the normal multi-way tree. Specifically, in a multi-way tree represented in the standard form (each node stores an array of child pointers), the time required to navigate from one node to its kth child is given by the time required to follow a single pointers. On the other hand, in the LCRS representation, the time required to do this is given by the time required to follow k + 1 pointers (one left child pointer, then k right child pointers). As a result, if the tree has a large branching factor, it can be much slower to do lookups in an LCRS tree structure than in the corresponding multiway tree structure.
We can therefore think of the LCRS representation as offering a time-space tradeoff between data structure storage space and access times. The LCRS representation has less memory overhead than the original multiway tree, while the multiway tree gives constant-time lookups of each of its children.
Use Cases
Because of the time-space tradeoff involved in the LCRS representation, the LCRS representation is typically not used unless one of two criteria hold:
- Memory is extremely scarce, or
- Random access of a node's children is not required.
Case (1) might arise if you needed to store a staggeringly huge multiway tree in main memory. For example, if you needed to store a phylogenetic tree containing many different species subject to frequent updates, then the LCRS representation might be appropriate.
Case (2) arises in specialized data structures in which the tree structure is being used in very specific ways. For example, many types of heap data structures that use multiway trees can be space optimized by using the LCRS representation. The main reason for this is that in heap data structures, the most common operations tend to be
- Remove the root of a tree and process each of its children, or
- Join two trees together by making one tree a child of the other.
Operation (1) can be done very efficiently in the LCRS representation. In an LCRS representation, the root of the tree never has a right child (since it has no siblings), and therefore removing the root simply means peeling off the root node and descending into its left subtree. From there, processing each child can be done by simply walking down the right spine of the remaining tree and processing each node in turn.
Operation (2) can be done very efficiently as well. Recall from above that in an LCRS representation, the root of a tree never has a right child. Therefore, it is very easy to join together two trees in LCRS representation as follows. Beginning with two trees like this:
R1 R2
/ /
(children 1) (children 2)
We can fuse the trees together in this way:
R1
/
R2
/ \
(children 2) (children 1)
This can be done in O(1) time, and is quite simple. The fact that the LCRS representation has this property means that many types of heap priority queues, such as the binomial heap or rank-pairing heap are typically represented as LCRS trees.
Hope this helps!
这篇关于一棵树的左边,右边同胞的表示是什么?你为什么要用它?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!