我试图从R Studio上的Postgres服务器下载一个JSON文件,但是,rjsonio包中的fromJSON()方法仅在文件1)位于本地目录或2)位于URL时才起作用。有人知道在R Studio或Python上执行此操作的方法吗?
最佳答案
我很想知道你到底需要什么。以下是我要回答的问题:
标题:如何访问postgres数据库中json编码的数据?
正文:我需要访问postgres数据库中存储的JSON数据中的数据。连接后,我可以看到数据就在那里,但我不知道如何在SQL或R中获取各个元素。
样本数据
精心设计的问题应该包含示例数据,因此我将在此处添加示例表。此数据改编自http://www.postgresqltutorial.com/postgresql-json/:
library(DBI)
con <- dbConnect(RPostgres::Postgres(), ...) # also works with `odbc::odbc()`, untested with `RODBC`
dbExecute(con, "
create table mytable (
id SERIAL8 PRIMARY KEY NOT NULL,
valtext TEXT,
jsontext JSON
)")
d <- data.frame(
valtext = c('{ "customer": "John Doe", "items": {"product": "Beer","qty": 6}}',
'{ "customer": "Lily Bush", "items": {"product": "Diaper","qty": 24}}',
'{ "customer": "Josh William", "items": {"product": "Toy Car","qty": 1}}',
'{ "customer": "Mary Clark", "items": {"product": "Toy Train","qty": 2}}')
)
d$jsontext <- d$valtext
dbWriteTable(con, "mytable", d, append=TRUE)
dbGetQuery(con, "select * from mytable")
# id valtext
# 1 1 { "customer": "John Doe", "items": {"product": "Beer","qty": 6}}
# 2 2 { "customer": "Lily Bush", "items": {"product": "Diaper","qty": 24}}
# 3 3 { "customer": "Josh William", "items": {"product": "Toy Car","qty": 1}}
# 4 4 { "customer": "Mary Clark", "items": {"product": "Toy Train","qty": 2}}
# jsontext
# 1 { "customer": "John Doe", "items": {"product": "Beer","qty": 6}}
# 2 { "customer": "Lily Bush", "items": {"product": "Diaper","qty": 24}}
# 3 { "customer": "Josh William", "items": {"product": "Toy Car","qty": 1}}
# 4 { "customer": "Mary Clark", "items": {"product": "Toy Train","qty": 2}}
在数据库中提取JSON
如果继续学习本教程,您将看到Postgres包含两个运算符,它们只对
JSON
类型的字段起作用:dbGetQuery(con, "select id from mytable
where cast(jsontext -> 'customer' as text) like '%William%'")
# id
# 1 3
但不
TEXT
或类似:dbGetQuery(con, "select id from mytable
where cast(valtext -> 'customer' as text) like '%William%'")
# Error in result_create(conn@ptr, statement) (from functions.R#284) :
# ERROR: operator does not exist: text -> unknown
# LINE 2: where cast(valtext -> 'customer' as text) l...
# ^
# HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
您可以以类似的方式检索各个组件:
dbGetQuery(con, "select jsontext -> 'items' -> 'qty' as quantity from mytable
where cast(jsontext -> 'items' -> 'product' as text) like '%Toy%'")
# quantity
# 1 1
# 2 2
在-R中提取JSON
ret <- dbGetQuery(con, "select jsontext from mytable
where cast(jsontext -> 'items' -> 'product' as text) like '%Toy%'")
ret
# jsontext
# 1 { "customer": "Josh William", "items": {"product": "Toy Car","qty": 1}}
# 2 { "customer": "Mary Clark", "items": {"product": "Toy Train","qty": 2}}
暴力(但仍然有效)方法是在每个字段上应用
fromJSON
函数。(我正在使用jsonlite
,但我认为RJSONIO
在这里也同样有效:lapply(ret$jsontext, jsonlite::fromJSON)
# [[1]]
# [[1]]$customer
# [1] "Josh William"
# [[1]]$items
# [[1]]$items$product
# [1] "Toy Car"
# [[1]]$items$qty
# [1] 1
# [[2]]
# [[2]]$customer
# [1] "Mary Clark"
# [[2]]$items
# [[2]]$items$product
# [1] "Toy Train"
# [[2]]$items$qty
# [1] 2
jsonlite
提供的另一种方法是viajsonlite::stream_in
;我尝试使用RJSONIO::readJSONStream
但无法使其工作。我没有努力,我希望这也一样简单。jsonlite::stream_in(textConnection(ret$jsontext))
# Imported 2 records. Simplifying...
# customer items.product items.qty
# 1 Josh William Toy Car 1
# 2 Mary Clark Toy Train 2
jsonlite::fromJSON
也可以与jsonlite::stream_in
一起使用:当数据以比上面更复杂的方式嵌套时,我经常需要simplifyDataFrame=FALSE
(注意带有两个元素的“items”字典是如何“扁平化”为"items.product"
和"items.qty"
,这是默认操作的副作用)。还有更多的教程,可能还有无数的其他资源。