The question
I've enabled iterator debugging in an application by defining
_HAS_ITERATOR_DEBUGGING = 1
I was expecting this to really just check vector bounds, but I have a feeling it's doing a lot more than that. What checks, etc., are actually being performed?
Dinkumware STL, by the way.
There are a number of operations on iterators that lead to undefined behavior; the goal of this flag is to activate runtime checks to prevent it from occurring (using asserts).
The issue
The obvious operation is using an invalid iterator, but this invalidity may arise for various reasons:
- An uninitialized iterator
- An iterator to an element that has been erased
- An iterator to an element whose physical location has changed (reallocation for a vector; illustrated below)
- An iterator outside of [begin, end)
The standard specifies in excruciating detail, for each container, which operations invalidate which iterators.
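As an illustration of the reallocation case above, here is a minimal sketch (the variable names are made up for the example); with _HAS_ITERATOR_DEBUGGING set to 1, the Dinkumware implementation asserts on the last line instead of silently exhibiting undefined behavior:

#include <vector>

int main() {
    std::vector<int> v(3, 0);
    std::vector<int>::iterator it = v.begin(); // valid for now

    v.push_back(42); // may reallocate: any prior iterator into v is invalidated

    // Undefined behavior in an unchecked build; with _HAS_ITERATOR_DEBUGGING=1
    // the debug machinery detects the stale iterator and asserts instead.
    return *it;
}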
There is a somewhat less obvious cause that people tend to forget: mixing iterators from different containers:
std::vector<Animal> cats, dogs;
for_each(cats.begin(), dogs.end(), /**/); // obvious bug
This pertains to a more general issue: the validity of the ranges passed to algorithms.
- [cats.begin(), dogs.end()) is invalid (unless one container is an alias for the other)
- [cats.end(), cats.begin()) is invalid (unless cats is empty)
The solution
The solution consists of adding information to the iterators so that their validity, and the validity of the ranges they define, can be asserted during execution, thus preventing undefined behavior from occurring.
The _HAS_ITERATOR_DEBUGGING symbol serves as a trigger for this capability, because it unfortunately slows the program down. It is quite simple in theory: each iterator is made an Observer of the container it was issued from, and is thus notified of modifications.
In Dinkumware this is achieved by two additions:
- Each iterator carries a pointer to its related container
- Each container holds a linked list of the iterators it created (a rough sketch of this bookkeeping follows)
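A rough sketch of what that bookkeeping amounts to (the names and layout here are invented for illustration; the actual Dinkumware code differs in its details):

// Illustrative sketch only, not the real Dinkumware implementation.
struct debug_iterator_base {
    const void*          parent; // the container this iterator was issued from
    debug_iterator_base* next;   // link in the container's list of live iterators
};

struct debug_container_base {
    debug_iterator_base* first_iterator; // head of the list of iterators it created

    // Called on mutating operations (erase, reallocation, ...): walk the list
    // and orphan every affected iterator so any further use trips an assertion.
    void orphan_all() {
        for (debug_iterator_base* p = first_iterator; p != 0; p = p->next)
            p->parent = 0;
    }
};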
And this neatly solves our problems:
- An uninitialized iterator does not have a parent container; most operations (apart from assignment and destruction) will trigger an assertion
- An iterator to an erased or moved element has been notified (thanks to the list) and knows of its invalidity
- When incrementing or decrementing an iterator, it can check that it stays within bounds
- Checking that two iterators belong to the same container is as simple as comparing their parent pointers
- Checking the validity of a range is as simple as checking that we reach the end of the range before we reach the end of the container (a linear operation for containers that are not randomly accessible, which is most of them; see the sketch below)
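That last check might look something like the following (a hypothetical helper written for this answer, not the library's actual routine):

#include <cassert>

// Hypothetical sketch: assert that [first, last) is a valid range within a
// container whose end is cont_end, by walking forward from first.
template <class InputIt>
void assert_valid_range(InputIt first, InputIt last, InputIt cont_end) {
    for (InputIt it = first; it != last; ++it)
        // If we hit the container's end before reaching last, the range is bogus.
        assert(it != cont_end && "invalid iterator range");
}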
The cost
The cost is heavy, but does correctness have a price? We can break the cost down, though:
- Extra memory allocation (the extra list of iterators each container maintains): O(NbIterators)
- Notification process on mutating operations: O(NbIterators) (note that push_back or insert do not necessarily invalidate iterators, but erase does)
- Range validity check: O(min(last - first, container.end() - first))
Most of the library algorithms have, of course, been implemented for maximum efficiency: typically the check is done once and for all at the beginning of the algorithm, and an unchecked version is then run. Still, the speed can drop severely, especially with hand-written loops:
for (iterator_t it = vec.begin();
it != vec.end(); // Oups
++it)
// body
We know the Oups line is in bad taste, but here it is even worse: on each pass of the loop we create a new end iterator and then destroy it, which means allocating and deallocating a node in vec's list of iterators... Do I have to underline the cost of allocating/deallocating memory in a tight loop?
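One common mitigation for such hand-written loops, assuming the body does not invalidate iterators (and taking vec to be, say, a std::vector<int> for concreteness), is to compute the end iterator once, outside the condition:

// Sketch: only two debug-iterator nodes are registered for the whole loop,
// instead of one allocation/deallocation per pass for the temporary end().
for (std::vector<int>::iterator it = vec.begin(), end = vec.end();
     it != end;
     ++it)
{
    // body
}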
Of course, a for_each would not run into this issue, which is yet another compelling argument for using STL algorithms instead of hand-coded versions.
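For instance (a sketch, with a made-up element type and per-element operation):

#include <algorithm>
#include <vector>

void touch(int& x) { ++x; } // made-up per-element operation

void increment_each(std::vector<int>& vec) {
    // begin() and end() are each evaluated exactly once, so the iterator-
    // debugging bookkeeping is paid per call rather than per element.
    std::for_each(vec.begin(), vec.end(), touch);
}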