Problem description
I am trying to scrape my trip history data from the Capital Bikeshare website. I have to log in and go to the trips menu to see the data, but I get this error:
> No encoding supplied: defaulting to UTF-8.
> Error in (function (classes, fdef, mtable) :
> unable to find an inherited method for function ‘readHTMLTable’ for signature ‘"xml_document"’
Here is my code:
> library(httr)
> library(XML)
> handle <- handle("https://www.capitalbikeshare.com/")
> path <-"profile/trips"
> login <- list( profile_login="myemail", profile_pass ="mypassword", profile_redirect_url="https://secure.capitalbikeshare.com/profile/trips/QNURCMF2Q6")
> response <- POST(handle = handle, path = path, body = login)
> readHTMLTable(content(response))
I also tried using rvest, but I kept getting the error "Error: Unknown field names: _username, _password". Which field names should I use here? I tried the id, the name, and so on, and none of them worked.
Recommended answer
For a start, the member login page is different from the intro page that you listed above; its address is the login URL used in the code below.
This may not be correct, but try it as a possible rvest starting point:
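library(rvest)
# Member login page (not the public intro page)
login <- "https://secure.capitalbikeshare.com/profile/login"
# Start a session and take the first form on the page
pgsession <- html_session(login)
pgform <- html_form(pgsession)[[1]]
# Update the user id and password in the next line
filled_form <- set_values(pgform, "_username" = "your_email", "_password" = "password")
# Submit the login form within the same session
submit_form(pgsession, filled_form)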
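The "Unknown field names" error from the question means that the names passed in did not match the name attributes of the inputs in the selected form. One way to check is to list the field names before filling anything in. The sketch below does this on a small inline form, so it runs without logging in. The HTML, the action path and the `_submit` button are hypothetical stand-ins, not the real Capital Bikeshare markup. It also assumes rvest 1.0 or later, where set_values() is named html_form_set().

```r
library(rvest)

# Hypothetical login form; the real page's markup and field names may differ.
html <- '<form action="/profile/login_check" method="post">
  <input type="text" name="_username">
  <input type="password" name="_password">
  <input type="submit" name="_submit" value="Log in">
</form>'

form <- html_form(read_html(html))[[1]]

# The names listed here are the only ones the form-filling function accepts.
field_names <- names(form$fields)
print(field_names)

# Fill in the two text fields (html_form_set() requires rvest >= 1.0).
filled <- html_form_set(form, "_username" = "your_email", "_password" = "password")
print(filled$fields[["_username"]]$value)
```

On the real site, running names(pgform$fields) on the form returned by html_form() shows which names to use. If the expected names are missing, the [[1]] index may be selecting a different form on the page.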
Once you are logged in, you can use the jump_to function to move to the desired page:
page<-jump_to(pgsession, newurl) #newurl will be the address where to go to next.
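The readHTMLTable error in the question arises because httr's content() parses HTML into an xml2 "xml_document", and the XML package has no readHTMLTable method for that class. rvest's html_table() works on that class directly. The sketch below shows the call on a small inline table. The markup and column names are hypothetical stand-ins for the trips page, whose real layout may differ.

```r
library(rvest)

# Hypothetical stand-in for the trips page; the real markup may differ.
html <- '<table>
  <tr><th>Start</th><th>End</th><th>Duration</th></tr>
  <tr><td>Station A</td><td>Station B</td><td>12 min</td></tr>
  <tr><td>Station B</td><td>Station C</td><td>7 min</td></tr>
</table>'

doc <- read_html(html)      # an xml_document, the class readHTMLTable rejected
tables <- html_table(doc)   # one data frame per <table> element in the document
trips <- tables[[1]]
print(trips)
```

After jump_to, the same idea would be html_table(read_html(page)), followed by picking the list element that holds the trips. Which element that is depends on the page, so it is worth checking length(tables) first.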
Hope this helps. If it does not work, leave a comment and I'll delete the post.