Does a line profiler for code require a parse tree, and is that sufficient?

Question

I am trying to determine what is necessary to write a line profiler for a language, like those available for Python and Matlab.

A naive way to interpret "line profiler" is to assume that one can insert time logging around every line, but the definition of a line is dependent on how a parser handles whitespace, which is only the first problem. It seems that one needs to use the parse tree and insert timings around individual nodes.
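
To make the mismatch between text lines and statements concrete, here is a minimal sketch in R (the language this question eventually targets). The snippet in `src` is made up for illustration; only base R's `parse()` is used.

```r
# Hypothetical three-line snippet: three statements on line 1,
# then one statement spread over lines 2 and 3.
src <- "a <- 0; a <- 1; b <- 2
a <- (2 + 1) *
    (3 + 2)"

# Count physical lines versus statements seen by the parser.
n_lines <- length(strsplit(src, "\n", fixed = TRUE)[[1]])
n_stmts <- length(parse(text = src))

# Line 2 alone is not a complete expression, so code that wraps each
# text line in timing calls would not even parse.
line2_ok <- !inherits(
  try(parse(text = "a <- (2 + 1) *"), silent = TRUE),
  "try-error"
)
```

Here the line count and the statement count differ, and `line2_ok` is `FALSE`, which is why purely textual, line-by-line instrumentation breaks down.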

Is this conclusion correct? Does a line profiler require the parse tree, and is that all that is needed (beyond time logging)?

Update 1: Offering a bounty on this because the question is still unresolved.

Update 2: Here is a link for a well-known Python line profiler in case it is helpful for answering this question. I've not yet been able to make heads or tails of its behavior relative to parsing. I'm afraid that the code for the Matlab profiler is not accessible.

Also note that one could say that manually decorating the input code would eliminate a need for a parse tree, but that's not an automatic profiler.

Update 3: Although this question is language agnostic, this arose because I am thinking of creating such a tool for R (unless it exists and I haven't found it).

Update 4: Regarding use of a line profiler versus a call stack profiler - this post relating to using a call stack profiler (Rprof() in this case) exemplifies why it can be painful to work with the call stack rather than directly analyze things via a line profiler.

Recommended answer

I'd say that yes, you require a parse tree (and the source) - how else would you know what constitutes a "line" and a valid statement?

A practical simplification, though, might be a "statement profiler" instead of a "line profiler". In R, the parse tree is readily available: body(theFunction), so it should be fairly easy to insert measuring code around each statement. With some more work you can insert it around a group of statements that belong to the same line.
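
As a rough illustration of the "statement profiler" idea, the sketch below wraps each top-level statement of a function body in timing code. The names `instrument`, `record`, `.prof_t0` and `.prof_val` are hypothetical helpers invented here, not part of any existing package, and the sketch assumes the body is a `{` block.

```r
# Hypothetical helper: returns an instrumented copy of `fn` plus an
# environment that accumulates elapsed seconds per top-level statement.
instrument <- function(fn) {
  stopifnot(is.function(fn), identical(body(fn)[[1]], as.name("{")))
  stmts <- as.list(body(fn))[-1]            # drop the leading `{`
  timings <- new.env()
  timings$elapsed <- numeric(length(stmts))
  record <- function(i, dt) {
    timings$elapsed[i] <- timings$elapsed[i] + dt
  }
  wrapped <- lapply(seq_along(stmts), function(i) {
    # Splice statement i between two clock reads. Caveat: a statement
    # that calls return() exits before its time is recorded.
    bquote({
      .prof_t0 <- proc.time()[["elapsed"]]
      .prof_val <- .(stmts[[i]])
      .(record)(.(i), proc.time()[["elapsed"]] - .prof_t0)
      .prof_val
    })
  })
  body(fn) <- as.call(c(as.name("{"), wrapped))
  list(fn = fn, timings = timings)
}

# Usage on a made-up function with three statements.
g <- function(x) { a <- x + 1; b <- a * 2; b }
p <- instrument(g)
result <- p$fn(3)
per_statement <- p$timings$elapsed
```

This only handles top-level statements; nested blocks (loops, `if` bodies) would need a recursive walk over the tree, and the timer resolution of `proc.time()` is coarse for very fast statements.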

In R, the body of a function loaded from a file typically also has an attribute srcref that lists the source location of each "line" (actually each statement).

Here's a sample function (put in "example.R"):

f <- function(x, y=3)
{
    a <- 0; a <- 1  # Two statements on one line
    a <- (x + 1) *  # One statement on two lines
        (y + 2)

    a <- "foo
        bar"        # One string on two lines
}

Then, in R:

source("example.R")
dput(attr(body(f), "srcref"))

This prints the following line/column information:

list(structure(c(2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L), srcfile = <environment>, class = "srcref"),
    structure(c(3L, 2L, 3L, 7L, 9L, 14L, 3L, 3L), srcfile = <environment>, class = "srcref"),
    structure(c(3L, 10L, 3L, 15L, 17L, 22L, 3L, 3L), srcfile = <environment>, class = "srcref"),
    structure(c(4L, 2L, 5L, 15L, 9L, 15L, 4L, 5L), srcfile = <environment>, class = "srcref"),
    structure(c(7L, 2L, 8L, 6L, 9L, 20L, 7L, 8L), srcfile = <environment>, class = "srcref"))

As you can "see" (the last two numbers in each structure are begin/end line), the expressions a <- 0 and a <- 1 map to the same line...
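
Building on that, one possible way to turn per-statement information into per-line information is to group statements by the line on which they start. This is a sketch under assumptions: the function is re-created from a string with `parse(text = ..., keep.source = TRUE)` so that srcrefs exist even in a non-interactive session, and a trailing `a` statement is added to the example for illustration.

```r
# Define a function from text, keeping source references.
code <- "f <- function(x, y = 3)
{
    a <- 0; a <- 1
    a <- (x + 1) *
        (y + 2)
    a
}"
eval(parse(text = code, keep.source = TRUE))

# One srcref per element of the body; the first belongs to `{` itself.
refs <- attr(body(f), "srcref")

# Element 1 of a srcref is the line on which the statement starts.
first_line <- vapply(refs, function(r) as.integer(r)[1], integer(1))

# Map each starting line to the indices of the body elements on it.
groups <- split(seq_along(refs), first_line)
```

`groups[["3"]]` then holds the indices of both `a <- 0` and `a <- 1`, so their timings could be summed and reported against line 3.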

Good luck!
