Problem description
Using the rvest package, I am trying to scrape data from my LinkedIn profile.
These attempts:
library(rvest)
url = "https://www.linkedin.com/profile/view?id=AAIAAAFqgUsBB2262LNIUKpTcr0cF_ekoX9ZJh0&trk=nav_responsive_tab_profile"
li = read_html(url)
html_nodes(li, "#experience-316254584-view span.field-text")
html_nodes(li, xpath='//*[@id="experience-610617015-view"]/p/span/text()')
don't find any nodes:
#> {xml_nodeset (0)}
Q: How to return just the text?
#> "Quantitative hedge fund manager selection for $650m portfolio of alternative investments"
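For reference, once a selector does match a node, html_text() returns just its text. The sketch below is only an illustration under assumptions: it parses a hypothetical inline HTML snippet (the id "experience-1-view" and the shortened text are made up) rather than the real LinkedIn page, whose markup is not known here. The empty nodeset above suggests the fetched page does not contain the targeted elements at all.

```r
library(rvest)

# Hypothetical stand-in markup; NOT the real LinkedIn page structure.
html <- paste0(
  '<div id="experience-1-view"><p>',
  '<span class="field-text">Quantitative hedge fund manager selection</span>',
  '</p></div>'
)
page <- read_html(html)

# Match the span with a CSS selector, then keep only its text content.
nodes <- html_nodes(page, "#experience-1-view span.field-text")
text <- html_text(nodes)
```

If nodes is empty, as in the attempts above, html_text() returns an empty character vector, so it is worth checking length(nodes) first.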
EDIT:
LinkedIn has an API; however, for some reason the call below returns only the first two positions under experience and no other items (such as education or projects). Hence the scraping approach.
library("Rlinkedin")
auth = inOAuth(application_name, consumer_key, consumer_secret)
getProfile(auth, connections = FALSE, id = NULL) # returns very limited data
You are making things unnecessarily difficult. All you need to do is issue a GET request to https://api.linkedin.com/v1/people/~?format=json after obtaining an OAuth 2.0 token from LinkedIn. In R, you can do this using jsonlite:
library(jsonlite)
linkedin <- fromJSON('https://api.linkedin.com/v1/people/~?format=json')
position <- linkedin$headline
Your OAuth token must have the 'r_basicprofile' member permission.
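Note that a bare fromJSON() call on the URL does not attach the token to the request. One way to send it, sketched here as an assumption rather than as the only method, is to pass it as a Bearer header with httr and then parse the response body with jsonlite. The helper name get_linkedin_profile and the LINKEDIN_TOKEN environment variable are hypothetical names used for illustration.

```r
library(httr)
library(jsonlite)

# Hypothetical helper: request the profile endpoint with an OAuth 2.0 token.
get_linkedin_profile <- function(token) {
  resp <- GET(
    "https://api.linkedin.com/v1/people/~?format=json",
    add_headers(Authorization = paste("Bearer", token))
  )
  stop_for_status(resp)  # raise an R error on HTTP failures such as 401
  fromJSON(content(resp, as = "text", encoding = "UTF-8"))
}

# Usage (needs a valid token; LINKEDIN_TOKEN is an assumed variable name):
# linkedin <- get_linkedin_profile(Sys.getenv("LINKEDIN_TOKEN"))
# position <- linkedin$headline
```

Keeping the token out of the URL and out of the script itself avoids leaking it into logs or version control.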