Problem description
I'm trying to read a table from a URL into a data.frame. In other examples, I found that the following code worked:
library(XML)
library(RCurl)
theurl <- "https://es.finance.yahoo.com/q/cp?s=BEL20.BR"
tables <- readHTMLTable(theurl)
As the warning says, the content doesn't seem to be XML:
Warning message: XML content does not seem to be XML: 'https://es.finance.yahoo.com/q/cp?s=BEL20.BR'
Alternatively, getURLContent(theurl, ssl.verifypeer = FALSE, useragent = "R") works, but I don't know how to extract the table from the result. Any help would be appreciated.
Thanks to @har07: using table <- readHTMLTable(getURLContent(theurl, ssl.verifypeer = FALSE, useragent = "R"))$yfncsumtab gives the output, but the result still has to be filtered.
Recommended answer
You can get the table if you use getURL to fetch the document content first. Sometimes readHTMLTable has trouble fetching the content itself. In those cases, try downloading the page with getURL and passing the result to readHTMLTable:
> library(XML)
> library(RCurl)
> URL <- getURL("https://es.finance.yahoo.com/q/cp?s=BEL20.BR")
> rt <- readHTMLTable(URL, header = TRUE)
> rt
You might need to adjust the header argument, and possibly others, but the tables are there.
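
To illustrate what comes back from readHTMLTable and how a single table can be picked out of it, here is a minimal offline sketch. It assumes a small hand-written HTML string standing in for the downloaded page; the table contents (AAA.BR, Example One, and so on) are made up for illustration, and only the id yfncsumtab is taken from the question. The stringsAsFactors = FALSE argument is an optional choice here, not something the original answer used.

```r
library(XML)

# Hypothetical stand-in for the text returned by getURL():
# a page containing one table that carries an id attribute.
html <- '<html><body>
<table id="yfncsumtab">
  <tr><th>Symbol</th><th>Name</th></tr>
  <tr><td>AAA.BR</td><td>Example One</td></tr>
  <tr><td>BBB.BR</td><td>Example Two</td></tr>
</table>
</body></html>'

# Parse the text as HTML, then read every <table> into a list of data frames.
doc <- htmlParse(html, asText = TRUE)
tables <- readHTMLTable(doc, header = TRUE, stringsAsFactors = FALSE)

# Tables with an id attribute are named after it in the returned list,
# so the names show which element to extract.
names(tables)
tbl <- tables[["yfncsumtab"]]
str(tbl)
```

On the real page the list will likely contain several tables, so checking names(tables) first is a quick way to find the one you need before filtering it further.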