Problem description
For debugging, it is often useful to tell whether a particular function is higher up on the call stack. For example, we often want to run debugging code only when a certain function has called us, directly or indirectly.
One solution is to examine all of the stack entries higher up, but if this is done in a function that is deep in the stack and called repeatedly, it leads to excessive overhead. The question is how to determine whether a particular function is higher up on the call stack in a reasonably efficient way.
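For reference, the straightforward approach described above might look like the following minimal sketch. The helper name is_on_stack is hypothetical (it is not from the question), and sys._getframe is a CPython implementation detail, so this assumes CPython. The cost is linear in the stack depth on every call.

```python
import sys

def is_on_stack(func):
    """Return True if func has an active frame above the caller (hypothetical helper)."""
    target = func.__code__
    frame = sys._getframe(1)  # start at the caller's frame
    while frame is not None:
        if frame.f_code is target:
            return True
        frame = frame.f_back  # move one frame up the stack
    return False

def helper():
    # Runs deep in the stack and asks whether outer() called us.
    return is_on_stack(outer)

def outer():
    return helper()
```

Comparing code objects rather than function objects is a design choice of this sketch: a frame only exposes its code object, not the function that was called.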
Similar
- Obtaining references to function objects on the execution stack from the frame object? - This question focuses on obtaining the function objects, rather than on determining whether we are in a particular function. Although the same techniques could be applied, they may end up being extremely inefficient.
Recommended answer
Unless the function you're aiming for does something very special to mark "one instance of me is active on the stack" (in other words: if the function is pristine and untouchable and can't possibly be made aware of this peculiar need of yours), there is no conceivable alternative to walking frame by frame up the stack until you hit either the top (and the function is not there) or a stack frame for your function of interest. As several comments to the question indicate, it's extremely doubtful whether it's worth striving to optimize this. But, assuming for the sake of argument that it is worthwhile:
Edit: the original answer (by the OP) had many defects, but some have since been fixed, so I'm editing to reflect the current situation and to explain why certain aspects are important.
First of all, it's crucial to use try/finally, or with, in the decorator, so that ANY exit from a function being monitored is properly accounted for, including exits caused by exceptions. The original version of the OP's own answer handled only normal returns.
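A minimal sketch of why this matters, using a hypothetical module-level counter named active and a hypothetical decorator named tracked: without the finally clause, the counter would stay at 1 after the exception below and every later check would wrongly report the function as active.

```python
import functools

active = {"count": 0}  # hypothetical counter of calls currently in progress

def tracked(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        active["count"] += 1
        try:
            return f(*args, **kwargs)
        finally:
            # Runs on normal return AND when f raises.
            active["count"] -= 1
    return wrapper

@tracked
def fails():
    raise RuntimeError("boom")

try:
    fails()
except RuntimeError:
    pass  # the exception propagated, yet the counter was restored
```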
Second, every decorator should keep the decorated function's __name__ and __doc__ intact. That's what functools.wraps is for (there are other ways, but wraps makes it simplest).
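The difference can be seen in a small sketch. The decorator names plain and preserving are hypothetical and exist only for this comparison.

```python
import functools

def plain(f):
    # No functools.wraps: the wrapper's own metadata replaces f's.
    def wrapper(*a, **k):
        return f(*a, **k)
    return wrapper

def preserving(f):
    @functools.wraps(f)  # copies __name__, __doc__ and more from f
    def wrapper(*a, **k):
        return f(*a, **k)
    return wrapper

@plain
def first():
    """Docstring of first."""

@preserving
def second():
    """Docstring of second."""
```

Here first.__name__ is "wrapper" and first.__doc__ is None, while second keeps its own name and docstring.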
Third, and just as crucial as the first point, a set (the data structure originally chosen by the OP) is the wrong choice: a function can be on the stack several times (direct or indirect recursion). We clearly need a "multiset" (also known as a "bag"), a set-like structure that keeps track of how many times each item is present. In Python, the natural implementation of a multiset is a dict mapping keys to counts, which in turn is most handily implemented as a collections.defaultdict(int).
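The recursion problem can be made concrete with a small sketch that tracks the same calls in both a set and a defaultdict(int). All names here (track, active_set, active_counts, countdown) are hypothetical.

```python
import collections
import functools

active_set = set()                             # the flawed choice
active_counts = collections.defaultdict(int)   # multiset: name -> active calls
observed = []

def track(f):
    @functools.wraps(f)
    def wrapper(*a, **k):
        active_set.add(f.__name__)
        active_counts[f.__name__] += 1
        try:
            return f(*a, **k)
        finally:
            active_set.discard(f.__name__)
            active_counts[f.__name__] -= 1
    return wrapper

@track
def countdown(n):
    if n > 0:
        countdown(n - 1)
        # The inner call has returned, but this outer call is still active.
        observed.append(("countdown" in active_set,
                         active_counts["countdown"] > 0))

countdown(1)
```

After the inner call returns, the set has already forgotten countdown even though the outer call is still running, while the count is still positive.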
Fourth, a general approach should be threadsafe (when that can be accomplished easily, at least ;-). Fortunately, threading.local makes this trivial when applicable, and here it surely is, since each thread has its own separate stack of calls.
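As a brief illustration of the behavior being relied on, the sketch below (with hypothetical names local, worker and results) shows that an attribute set on a threading.local object in one thread is invisible in another.

```python
import threading

local = threading.local()
results = {}

def worker(name):
    # The main thread's attribute is not visible in this thread.
    seen_before = getattr(local, "depth", None)
    local.depth = name
    results[name] = (seen_before, local.depth)

local.depth = "main"
threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each worker starts without a depth attribute and leaves the main thread's value untouched, which is the isolation a per-thread call counter needs.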
Fifth, an interesting issue has been broached in some comments, which noticed how badly the decorators offered in some answers play with other decorators: the monitoring decorator appears to have to be the LAST (outermost) one, otherwise the checking breaks. This comes from the natural but unfortunate choice of using the function object itself as the key into the monitoring dict.
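To illustrate the pitfall, here is a hypothetical sketch, not the code from the answers being discussed. It assumes a registry keyed by the object that the monitoring decorator returns; logged stands in for any other decorator.

```python
import collections
import functools

counts = collections.defaultdict(int)  # keyed by function object

def monitor_by_object(f):
    @functools.wraps(f)
    def wrapper(*a, **k):
        counts[wrapper] += 1
        try:
            return f(*a, **k)
        finally:
            counts[wrapper] -= 1
    return wrapper

def logged(f):  # stand-in for any unrelated decorator
    @functools.wraps(f)
    def wrapper(*a, **k):
        return f(*a, **k)
    return wrapper

seen = {}

@monitor_by_object  # monitoring is outermost: the name `good` is the key
def good():
    seen["good"] = counts[good]

@logged             # another decorator on top: the name `bad` is not the key
@monitor_by_object
def bad():
    seen["bad"] = counts[bad]

good()
bad()
```

The lookup inside bad uses the object returned by logged, which was never registered, so the check reports zero active calls even though bad is running.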
I propose to solve this with a different choice of key: make the decorator take an identifier argument (a string, say) that must be unique in each given thread, and use the identifier as the key into the monitoring dict. The code checking the stack must of course know the identifier and use it as well.
At decorating time, the decorator can check the uniqueness property (by using a separate set). The identifier may be left to default to the function name, so it is only explicitly required when you need to monitor homonymous functions in the same namespace. The uniqueness property may be explicitly renounced when several monitored functions are to be considered "the same" for monitoring purposes; this may be the case if a given def statement is meant to be executed multiple times in slightly different contexts, producing several function objects that the programmer wants to treat as "the same function". Finally, it should be possible to optionally revert to the "function object as identifier" approach for those rare cases in which further decoration is KNOWN to be impossible, since in those cases it may be the handiest way to guarantee uniqueness.
So, putting these many considerations together, we could have something like the following (including a threadlocal_var utility function that will probably already be in a toolbox module, of course ;-):
import collections
import functools
import threading

threadlocal = threading.local()

def threadlocal_var(varname, factory, *a, **k):
    # Return this thread's value for varname, creating it on first use.
    v = getattr(threadlocal, varname, None)
    if v is None:
        v = factory(*a, **k)
        setattr(threadlocal, varname, v)
    return v

def monitoring(identifier=None, unique=True, use_function=False):
    def inner(f):
        assert (not use_function) or (identifier is None)
        # Bind a new local name: assigning to `identifier` here would make it
        # local to inner() and raise UnboundLocalError on the checks above.
        if identifier is not None:
            ident = identifier
        elif use_function:
            ident = f
        else:
            ident = f.__name__
        if unique:
            monitored = threadlocal_var('uniques', set)
            if ident in monitored:
                raise ValueError('Duplicate monitoring identifier %r' % ident)
            monitored.add(ident)

        @functools.wraps(f)
        def wrapper(*a, **k):
            # Fetch the counts at call time so each thread updates its own.
            counts = threadlocal_var('counts', collections.defaultdict, int)
            counts[ident] += 1
            try:
                return f(*a, **k)
            finally:
                counts[ident] -= 1
        return wrapper
    return inner
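The answer does not show the checking side, so here is one way it might look. This is a self-contained sketch under assumptions: it uses a condensed variant of the decorator (identifier required, no uniqueness check), and the helper is_active along with the names _state, _counts, deep_helper and handle_request are hypothetical.

```python
import collections
import functools
import threading

_state = threading.local()

def _counts():
    # Per-thread multiset of active identifiers, created on first use.
    try:
        return _state.counts
    except AttributeError:
        _state.counts = collections.defaultdict(int)
        return _state.counts

def monitoring(identifier):
    def inner(f):
        @functools.wraps(f)
        def wrapper(*a, **k):
            counts = _counts()
            counts[identifier] += 1
            try:
                return f(*a, **k)
            finally:
                counts[identifier] -= 1
        return wrapper
    return inner

def is_active(identifier):
    """Hypothetical check: is a monitored call with this identifier
    currently on this thread's stack?"""
    return _counts().get(identifier, 0) > 0

log = []

def deep_helper():
    # Debugging code that runs only when handle_request is higher up.
    if is_active("request"):
        log.append("debug info")

@monitoring("request")
def handle_request():
    deep_helper()

deep_helper()      # not under handle_request: nothing is logged
handle_request()   # under handle_request: one entry is logged
```

The check is a single dict lookup regardless of stack depth, which is the efficiency gain over walking the frames.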
I have not tested this code, so it might contain a typo or the like, but I'm offering it because I hope it covers all the important technical points I explained above.
Is it all worth it? Probably not, as previously explained. However, I think along the lines of "if it's worth doing at all, then it's worth doing right" ;-).