How to save and read the output of read_html as an RDS file?
Question
Objects can be saved and read back like so:
# Save as file
saveRDS(iris, "mydata.RDS")
# Read back in
readRDS("mydata.RDS")
But this doesn't seem to work for objects created with xml2::read_html():
library(rvest)
someobject <- read_html("https://stackoverflow.com/")
saveRDS(someobject, "someobject.RDS")
This creates a file, but reading it back does not work as expected, i.e.
readRDS("someobject.RDS")
Error in doc_is_html(x$doc) : external pointer is not valid
What's going on, and what's the simplest way of saving an html object so that it can be loaded back in with minimal code/fuss?
Recommended answer
We can use write_xml and read_html from the xml2 package:
before <- read_html("https://stackoverflow.com/")
xml2::write_xml(before, "someobject1.xml")
after <- xml2::read_html("someobject1.xml")
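However, identical returns FALSE, most likely because each object holds external pointers to its own in-memory document rather than the parsed content itself: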
identical(before, after)
#[1] FALSE
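The FALSE result, and the "external pointer is not valid" error in the question, can be illustrated by looking inside the object. This is a minimal sketch that assumes the current xml2 object layout (a list of external pointers); the small inline HTML string and the variable names are illustrative and not from the original post.

```r
library(xml2)

# Small inline document so the sketch runs offline (illustrative only)
markup <- "<html><body><div>a</div></body></html>"
doc <- read_html(markup)

# Strip the class to see what the object is made of: the elements are
# external pointers into memory managed by libxml2, not R data
parts <- unclass(doc)
pointer_types <- vapply(parts, typeof, character(1))
print(pointer_types)

# Parsing the same markup a second time gives a separate in-memory document,
# so the two objects are not identical even though their markup matches
doc2 <- read_html(markup)
same_object <- identical(doc, doc2)
same_markup <- identical(as.character(doc), as.character(doc2))
```

Because saveRDS() stores only the R-side wrapper, the memory those pointers referred to no longer exists when the file is read back in a later session, which would explain the error.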
But queries on both of them seem to return the same result:
library(rvest)
before %>% html_nodes("div")
after %>% html_nodes("div")
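If an RDS file is specifically wanted, one option that is not part of the answer above is to store the document's markup as a character string and re-parse it after loading. This is a sketch under assumptions: it relies on as.character() producing markup that read_html() can parse back acceptably for your page, and the temporary file path and inline HTML are illustrative.

```r
library(xml2)

# Small inline document so the sketch runs offline (illustrative only)
before <- read_html("<html><body><div>a</div><div>b</div></body></html>")

# Save the markup as a plain character string, which RDS can store safely
rds_path <- tempfile(fileext = ".RDS")
saveRDS(as.character(before), rds_path)

# Re-parse the stored string to get a fresh document with valid pointers
after <- read_html(readRDS(rds_path))

# Compare the content of the same query on both documents
div_text_before <- xml_text(xml_find_all(before, "//div"))
div_text_after <- xml_text(xml_find_all(after, "//div"))
print(identical(div_text_before, div_text_after))
```

The trade-off is that the page is parsed again on every load, but the saved file contains only ordinary R data and stays usable across sessions.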