我试图从R Studio上的Postgres服务器下载一个JSON文件,但是,rjsonio包中的fromJSON()方法仅在文件1)位于本地目录或2)位于URL时才起作用。有人知道在R Studio或Python上执行此操作的方法吗?

最佳答案

我很想知道你到底需要什么。以下是我要回答的问题:
标题:如何访问postgres数据库中json编码的数据?
正文:我需要访问postgres数据库中存储的JSON数据中的数据。连接后,我可以看到数据就在那里,但我不知道如何在SQL或R中获取各个元素。
样本数据
精心设计的问题应该包含示例数据,因此我将在此处添加示例表。此数据改编自http://www.postgresqltutorial.com/postgresql-json/

library(DBI)
con <- dbConnect(RPostgres::Postgres(), ...) # also works with `odbc::odbc()`, untested with `RODBC`
dbExecute(con, "
  create table mytable (
    id SERIAL8 PRIMARY KEY NOT NULL,
    valtext TEXT,
    jsontext JSON
  )")
d <- data.frame(
  valtext = c('{ "customer": "John Doe", "items": {"product": "Beer","qty": 6}}',
              '{ "customer": "Lily Bush", "items": {"product": "Diaper","qty": 24}}',
              '{ "customer": "Josh William", "items": {"product": "Toy Car","qty": 1}}',
              '{ "customer": "Mary Clark", "items": {"product": "Toy Train","qty": 2}}')
 )
d$jsontext <- d$valtext
dbWriteTable(con, "mytable", d, append=TRUE)
dbGetQuery(con, "select * from mytable")
#   id                                                                 valtext
# 1  1        { "customer": "John Doe", "items": {"product": "Beer","qty": 6}}
# 2  2    { "customer": "Lily Bush", "items": {"product": "Diaper","qty": 24}}
# 3  3 { "customer": "Josh William", "items": {"product": "Toy Car","qty": 1}}
# 4  4 { "customer": "Mary Clark", "items": {"product": "Toy Train","qty": 2}}
#                                                                  jsontext
# 1        { "customer": "John Doe", "items": {"product": "Beer","qty": 6}}
# 2    { "customer": "Lily Bush", "items": {"product": "Diaper","qty": 24}}
# 3 { "customer": "Josh William", "items": {"product": "Toy Car","qty": 1}}
# 4 { "customer": "Mary Clark", "items": {"product": "Toy Train","qty": 2}}

在数据库中提取JSON
如果继续学习本教程,您将看到Postgres包含两个运算符,它们只对JSON类型的字段起作用:
dbGetQuery(con, "select id from mytable
                 where cast(jsontext -> 'customer' as text) like '%William%'")
#   id
# 1  3

但不TEXT或类似:
dbGetQuery(con, "select id from mytable
                 where cast(valtext -> 'customer' as text) like '%William%'")
# Error in result_create(conn@ptr, statement) (from functions.R#284) :
#   ERROR:  operator does not exist: text -> unknown
# LINE 2:                  where cast(valtext -> 'customer' as text) l...
#                                             ^
# HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

您可以以类似的方式检索各个组件:
dbGetQuery(con, "select jsontext -> 'items' -> 'qty' as quantity from mytable
                 where cast(jsontext -> 'items' -> 'product' as text) like '%Toy%'")
#   quantity
# 1        1
# 2        2

在-R中提取JSON
ret <- dbGetQuery(con, "select jsontext from mytable
                        where cast(jsontext -> 'items' -> 'product' as text) like '%Toy%'")
ret
#                                                                  jsontext
# 1 { "customer": "Josh William", "items": {"product": "Toy Car","qty": 1}}
# 2 { "customer": "Mary Clark", "items": {"product": "Toy Train","qty": 2}}

暴力(但仍然有效)方法是在每个字段上应用fromJSON函数。(我正在使用jsonlite,但我认为RJSONIO在这里也同样有效:
lapply(ret$jsontext, jsonlite::fromJSON)
# [[1]]
# [[1]]$customer
# [1] "Josh William"
# [[1]]$items
# [[1]]$items$product
# [1] "Toy Car"
# [[1]]$items$qty
# [1] 1
# [[2]]
# [[2]]$customer
# [1] "Mary Clark"
# [[2]]$items
# [[2]]$items$product
# [1] "Toy Train"
# [[2]]$items$qty
# [1] 2

jsonlite提供的另一种方法是viajsonlite::stream_in;我尝试使用RJSONIO::readJSONStream但无法使其工作。我没有努力,我希望这也一样简单。
jsonlite::stream_in(textConnection(ret$jsontext))
#  Imported 2 records. Simplifying...
#       customer items.product items.qty
# 1 Josh William       Toy Car         1
# 2   Mary Clark     Toy Train         2

jsonlite::fromJSON也可以与jsonlite::stream_in一起使用:当数据以比上面更复杂的方式嵌套时,我经常需要simplifyDataFrame=FALSE(注意带有两个元素的“items”字典是如何“扁平化”为"items.product""items.qty",这是默认操作的副作用)。
还有更多的教程,可能还有无数的其他资源。

10-04 14:51