问题描述
我正在尝试从一组 jsons 中的每个 json 中提取两个键(使用 sql legacy)目前我正在使用 json 提取功能:
json_extract(json_column , '$[1].X') AS X,json_extract(json_column, '$[1].Y') AS Y,
如何让它在json arry 列"中的每个 json 上运行,而不仅仅是 [1](例如)?
示例 json:
[{"blabla":000,"X":1,"blabla":000,"blabla":000,"blabla":000,,"Y":"2"},{"blabla":000,"X":3,"blabla":000,"blabla":000,"blabla":000,,"Y":"4"},]提前致谢!
2020 年更新:JSON_EXTRACT_ARRAY()
现在 BigQuery 支持 JSON_EXTRACT_ARRAY()
:
上一个回答
让我们从一个类似的问题开始——这不是从 json 数组中提取所有电子邮件的一种非常方便的方法:
SELECT id, [ JSON_EXTRACT_SCALAR(JSON_EXTRACT(payload, '$.commits'), '$[0].author.email'), JSON_EXTRACT_SCALAR(JSON_EXTRACT(payload, '$.commits'), '$[1].author.email'), JSON_EXTRACT_SCALAR(JSON_EXTRACT(payload, '$.commits'), '$[2].author.email'), JSON_EXTRACT_SCALAR(JSON_EXTRACT(payload, '$.commits'), '$[3].author.email')] 电子邮件来自`githubarchive.day.20180830`WHERE type='PushEvent'和 id='8188163772'
我们现在处理这个问题的最好方法是在 UDF 中使用一些 JavaScript 将 json 数组拆分为 SQL 数组:
CREATE TEMP FUNCTION json2array(json STRING)返回数组语言 js 为 """返回 JSON.parse(json).map(x=>JSON.stringify(x));""";SELECT * EXCEPT(array_commits),ARRAY(SELECT JSON_EXTRACT_SCALAR(x, '$.author.email') FROM UNNEST(array_commits) x) 电子邮件从 (选择 ID, json2array(JSON_EXTRACT(payload, '$.commits')) array_commits来自`githubarchive.day.20180830`WHERE type='PushEvent'和 id='8188163772')
i'm trying to extract two key from every json in an arry of jsons(using sql legacy)currently i am using json extract function :
json_extract(json_column , '$[1].X') AS X, json_extract(json_column , '$[1].Y') AS Y,
how can i make it run on every json at the 'json arry column', and not just [1] (for example)?
An example json:
[ {"blabla":000,"X":1,"blabla":000,"blabla":000,"blabla":000,,"Y":"2"}, {"blabla":000,"X":3,"blabla":000,"blabla":000,"blabla":000,,"Y":"4"}, ]
thanks in advance!
解决方案Update 2020: JSON_EXTRACT_ARRAY()
Now BigQuery supports
JSON_EXTRACT_ARRAY()
:For example, to solve this particular question:
SELECT id , ARRAY( SELECT JSON_EXTRACT_SCALAR(x, '$.author.email') FROM UNNEST(JSON_EXTRACT_ARRAY(payload, "$.commits"))x ) emails FROM `githubarchive.day.20180830` WHERE type='PushEvent' AND id='8188163772'
Previous answer
Let's start with a similar problem - this is not a very convenient way to extract all emails from a json array:
SELECT id , [ JSON_EXTRACT_SCALAR(JSON_EXTRACT(payload, '$.commits'), '$[0].author.email') , JSON_EXTRACT_SCALAR(JSON_EXTRACT(payload, '$.commits'), '$[1].author.email') , JSON_EXTRACT_SCALAR(JSON_EXTRACT(payload, '$.commits'), '$[2].author.email') , JSON_EXTRACT_SCALAR(JSON_EXTRACT(payload, '$.commits'), '$[3].author.email') ] emails FROM `githubarchive.day.20180830` WHERE type='PushEvent' AND id='8188163772'
The best way we have right now to deal with this is to use some JavaScript in an UDF to split a json-array into a SQL array:
CREATE TEMP FUNCTION json2array(json STRING) RETURNS ARRAY<STRING> LANGUAGE js AS """ return JSON.parse(json).map(x=>JSON.stringify(x)); """; SELECT * EXCEPT(array_commits), ARRAY(SELECT JSON_EXTRACT_SCALAR(x, '$.author.email') FROM UNNEST(array_commits) x) emails FROM ( SELECT id , json2array(JSON_EXTRACT(payload, '$.commits')) array_commits FROM `githubarchive.day.20180830` WHERE type='PushEvent' AND id='8188163772' )
这篇关于Bigquery - json_extract 从数组中提取所有元素的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!