Problem description
I am trying to scrape the table from this URL: https://hutdb.net/17/players. I have spent a lot of time learning rvest and using SelectorGadget, but whenever I try to get any output I always get the same empty result, character(0).
library(rvest)
library(magrittr)
url <- read_html("https://hutdb.net/17/players")
table <- url %>%
  html_nodes("td") %>%
  html_text()
Any help would be greatly appreciated.
Recommended answer
The data is loaded dynamically, so it cannot be retrieved directly from the page's HTML. However, the "Network" tab in Chrome DevTools, for instance, shows that the page requests nicely formatted JSON from https://hutdb.net/ajax/stats.php?year=17&page=0&selected=OVR&sort=DESC, which can be read directly:
library(jsonlite)
dat <- fromJSON("https://hutdb.net/ajax/stats.php?year=17&page=0&selected=OVR&sort=DESC")
The output looks like:
#    results aOVR    id League Year Card Team            Player Position Type Shoots  HGT
#  1    6308 6308  <NA>   <NA> <NA> <NA> <NA>              <NA>     <NA> <NA>   <NA> <NA>
#  2    <NA> 2030 11782    NHL   17  MOV  OTT     Erik Karlsson       RD  OFD  Right  6'0
#  3    <NA> 2060 11785    NHL   17  MOV  TBL     Victor Hedman       LD  TWD   Left  6'6
#  4    <NA> 2008 11791    NHL   17  MOV  CHI      Patrick Kane       RW  SNP   Left 5'11
#  5    <NA> 2058 13845    NHL   17  SCE  ANA      Ryan Getzlaf        C  PWF  Right  6'4
#  6    <NA> 2074 11824    NHL   17  MOV  BOS     Brad Marchand       LW  TWF   Left  5'9
#  7    <NA> 2008 11829    NHL   17  MOV  EDM    Connor McDavid        C  PLY   Left  6'2
#  8    <NA> 2048 11840    NHL   17  MOV  WSH Nicklas Backstrom        C  PLY   Left  6'1
#  9    <NA> 2058 11841    NHL   17  MOV  PIT     Sidney Crosby        C  PLY   Left 5'11
# 10    <NA> 2065 13644    NHL   17 TOTY  WPG      Patrik Laine       RW  TWF  Right  6'3
# 11    <NA> 2008 13645    NHL   17 TOTY  EDM    Connor McDavid        C  PLY   Left  6'2
# 12    <NA> 2039 13680    NHL   17 TOTY  LAK      Drew Doughty       RD  TWD  Right  6'1
# 13    <NA> 2063 13689    NHL   17 TOTY  BOS  Patrice Bergeron        C  TWF  Right  6'2
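To collect more than the first batch of rows, the same endpoint could be requested once per page and the results stacked. The following is only a sketch under stated assumptions: that the `page` query parameter advances through the result set, that every page returns the same columns, and that the leading row (which only carries a value in `results`) is a summary row rather than a player. The helper names `stats_url`, `drop_summary_rows` and `fetch_pages` are hypothetical and exist only for this illustration.

```r
# Build the endpoint URL for one page. The base URL and the parameter names
# are taken from the request seen in DevTools; that `page` steps through the
# result set is an assumption.
stats_url <- function(page, year = 17, selected = "OVR", sort = "DESC") {
  sprintf(
    "https://hutdb.net/ajax/stats.php?year=%d&page=%d&selected=%s&sort=%s",
    as.integer(year), as.integer(page), selected, sort
  )
}

# Keep only rows that actually describe a player, i.e. drop rows such as
# the leading one whose Player field is NA.
drop_summary_rows <- function(dat) {
  dat[!is.na(dat$Player), , drop = FALSE]
}

# Fetch pages 0 .. n_pages - 1 and stack them into one data frame.
# Requires the jsonlite package; assumes each page parses to a data frame
# with identical columns, as in the output above.
fetch_pages <- function(n_pages) {
  pages <- lapply(seq_len(n_pages) - 1L, function(p) {
    drop_summary_rows(jsonlite::fromJSON(stats_url(p)))
  })
  do.call(rbind, pages)
}

# Example call (performs network requests, so it is left commented out):
# players <- fetch_pages(3)
```

Filtering on `Player` rather than dropping the first row by position keeps the cleaning step from removing a real player if a page happens not to start with a summary row.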