I want to write tryCatch code to handle errors that occur when downloading from the web.

url <- c(
    "http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html",
    "http://en.wikipedia.org/wiki/Xz")
y <- mapply(readLines, con=url)

These two statements run successfully. Below, I create a web address that does not exist:
url <- c("xxxxx", "http://en.wikipedia.org/wiki/Xz")

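url[1] does not exist. How can I write a tryCatch loop (function) so that:
When the URL is wrong, the output is: "web URL is wrong, can't get".
When the URL is wrong, the code does not stop, but continues downloading until the end of the list of URLs?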

Best answer

Well then: welcome to the R world ;-)
Here you go.
Setting up the code

urls <- c(
    "http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html",
    "http://en.wikipedia.org/wiki/Xz",
    "xxxxx"
)
readUrl <- function(url) {
    out <- tryCatch(
        {
            # Just to highlight: if you want to use more than one
            # R expression in the "try" part then you'll have to
            # use curly brackets.
            # 'tryCatch()' will return the last evaluated expression
            # in case the "try" part was completed successfully

            message("This is the 'try' part")

            readLines(con=url, warn=FALSE)
            # The return value of `readLines()` is the actual value
            # that will be returned in case there is no condition
            # (e.g. warning or error).
            # You don't need to state the return value via `return()` as code
            # in the "try" part is not wrapped inside a function (unlike that
            # for the condition handlers for warnings and error below)
        },
        error=function(cond) {
            message(paste("URL does not seem to exist:", url))
            message("Here's the original error message:")
            message(cond)
            # Choose a return value in case of error
            return(NA)
        },
        warning=function(cond) {
            message(paste("URL caused a warning:", url))
            message("Here's the original warning message:")
            message(cond)
            # Choose a return value in case of warning
            return(NULL)
        },
        finally={
        # NOTE:
        # Here goes everything that should be executed at the end,
        # regardless of success or error.
        # If you want more than one expression to be executed, then you
        # need to wrap them in curly brackets ({...}); otherwise you could
        # just have written 'finally=<expression>'
            message(paste("Processed URL:", url))
            message("Some other message at the end")
        }
    )
    return(out)
}

Applying the code
> y <- lapply(urls, readUrl)
Processed URL: http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html
Some other message at the end
Processed URL: http://en.wikipedia.org/wiki/Xz
Some other message at the end
URL does not seem to exist: xxxxx
Here's the original error message:
cannot open the connection
Processed URL: xxxxx
Some other message at the end
Warning message:
In file(con, "r") : cannot open file 'xxxxx': No such file or directory

Investigating the output
> head(y[[1]])
[1] "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">"
[2] "<html><head><title>R: Functions to Manipulate Connections</title>"
[3] "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">"
[4] "<link rel=\"stylesheet\" type=\"text/css\" href=\"R.css\">"
[5] "</head><body>"
[6] ""

> length(y)
[1] 3

> y[[3]]
[1] NA

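Additional remarks
tryCatch
tryCatch() returns the value associated with executing expr unless there is an error or a warning. In that case, specific return values (see return(NA) above) can be specified by supplying a respective handler function (see the arguments error and warning in ?tryCatch). These can be functions that already exist, but you can also define them within tryCatch() (as I did above).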
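The following is a minimal offline sketch of that behaviour, not part of the original answer. The helper name safe_log is hypothetical and only serves the illustration; it relies on log(-1) signalling a warning and log("a") signalling an error.

```r
# Hypothetical helper: returns log(x), or a fallback value chosen by a handler.
safe_log <- function(x) {
    tryCatch(
        log(x),                               # value returned when no condition is signalled
        warning = function(cond) NA_real_,    # log(-1) signals a warning ("NaNs produced")
        error   = function(cond) NULL         # log("a") signals an error
    )
}

res_ok      <- safe_log(1)      # 0, the value of the expression itself
res_warning <- safe_log(-1)     # NA, chosen by the warning handler
res_error   <- safe_log("a")    # NULL, chosen by the error handler
```

Whichever handler fires, its return value becomes the value of the whole tryCatch() call.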
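The implications of choosing specific return values for the handler functions
Because we specified that NA should be returned in case of an error, the third element of y is NA. Had we chosen NULL as the return value instead, lapply() would still return a list of length 3, with NULL as its third element; the NULL entry only disappears in a later step that drops NULLs, such as unlist(). Also note that a handler function without an explicit return() returns the value of its last expression; in the handlers above that would be message(), which returns NULL invisibly (i.e. NULL in case of an error or warning condition).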
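A quick way to check how NA and NULL handler values show up in the result is a small offline example. This is only a sketch: the inputs list is made up for illustration, and log("a") stands in for a failing download. In this sketch lapply() keeps a NULL element in the list, and the element only disappears once the list is flattened with unlist().

```r
# Three inputs; the second one makes log() fail with an error.
inputs <- list(1, "a", 3)

# Same call, two different return values for the error handler.
with_na   <- lapply(inputs, function(x) tryCatch(log(x), error = function(cond) NA))
with_null <- lapply(inputs, function(x) tryCatch(log(x), error = function(cond) NULL))

n_na        <- length(with_na)             # 3
n_null      <- length(with_null)           # 3: lapply() keeps the NULL element
n_null_flat <- length(unlist(with_null))   # 2: unlist() drops it
```

Returning NA keeps a visible placeholder at the failing position even after flattening, which makes it easier to match results back to the input URLs.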
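"Unexpected" warning message
Because warn=FALSE does not seem to have any effect here (it only controls the warnings readLines() itself gives about a missing final end-of-line or embedded nuls, not the warning raised when the connection is opened), an alternative way to suppress the warning is to use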
suppressWarnings(readLines(con=url))

instead of
readLines(con=url, warn=FALSE)
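As a sketch of how suppressWarnings() combines with tryCatch(), the example below reads from a local path that does not exist, so it needs no network access. The variable names missing_path and result are made up for illustration.

```r
# tempfile() only generates a file name; no file is created at that path.
missing_path <- tempfile()

result <- tryCatch(
    # The "cannot open file" warning is muffled here; the error that
    # follows it ("cannot open the connection") still reaches the handler.
    suppressWarnings(readLines(con = missing_path)),
    error = function(cond) {
        message("Could not read: ", conditionMessage(cond))
        NA   # value returned in case of error
    }
)
```

Note that with the warning muffled inside the expression, a warning handler passed to tryCatch() would no longer be triggered for it.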

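Multiple expressions
Note that you can also place multiple expressions in the "actual expressions part" (argument expr of tryCatch()) if you wrap them in curly brackets (just as I illustrated in the finally part).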
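A small sketch of curly brackets in both places, again offline. The variable log_steps is hypothetical and only records the order in which the expressions run.

```r
log_steps <- character(0)   # hypothetical variable recording what ran

value <- tryCatch(
    {
        # Several expressions in the expr part, wrapped in curly brackets.
        log_steps <- c(log_steps, "step 1")
        log_steps <- c(log_steps, "step 2")
        42   # last expression: the value tryCatch() returns
    },
    finally = {
        # Several expressions in the finally part, run after expr.
        log_steps <- c(log_steps, "cleanup a")
        log_steps <- c(log_steps, "cleanup b")
    }
)
```

The value of the finally block is discarded; only the last expression of the expr block (or a handler's return value) is returned.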

Regarding "r - How to write trycatch in R", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/33733102/
