Problem description
My data is in varchar format. I want to split both the elements of this array so that I can then extract a key value from the json.
Data format:
[
{
"skuId": "5bc87ae20d298a283c297ca1",
"unitPrice": 0,
"id": "5bc87ae20d298a283c297ca1",
"quantity": "1"
},
{
"skuId": "182784738484wefhdchs4848",
"unitPrice": 50,
"id": "5bc87ae20d298a283c297ca1",
"quantity": "4"
},
]
For example, I want to extract skuId from the above column, so my data after extraction should look like:
1 5bc87ae20d298a283c297ca1
2 182784738484wefhdchs4848
Casting to an array doesn't work. For example, select cast(col as array) gives the following error: Unknown type: array
So I am not able to unnest the array.
How do I solve this problem in Athena?
Recommended answer
You can use a combination of parsing the value as JSON, casting it to a structured SQL type (array/map/row), and UNNEST WITH ORDINALITY to extract the elements from the array as separate rows. A bare array is not a valid target type for a cast; the element type has to be spelled out, as in array(row(skuId varchar)) below. Note that this only works if the array in the JSON payload has no trailing comma after its last element. Your example has one, but it has been removed from the example below.
WITH data(value) AS (VALUES
'[
{
"skuId": "5bc87ae20d298a283c297ca1",
"unitPrice": 0,
"id": "5bc87ae20d298a283c297ca1",
"quantity": "1"
},
{
"skuId": "182784738484wefhdchs4848",
"unitPrice": 50,
"id": "5bc87ae20d298a283c297ca1",
"quantity": "4"
}
]'
),
parsed(entries) AS (
SELECT cast(json_parse(value) AS array(row(skuId varchar)))
FROM data
)
SELECT ordinal, skuId
FROM parsed, UNNEST(entries) WITH ORDINALITY t(skuId, ordinal)
produces:
ordinal | skuId
---------+--------------------------
1 | 5bc87ae20d298a283c297ca1
2 | 182784738484wefhdchs4848
(2 rows)
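If the stored values still carry the trailing comma shown in the question, json_parse will reject them. One possible workaround, offered only as a sketch, is to strip such commas with regexp_replace before parsing. It assumes that no string value inside the payload itself contains a comma followed by a closing bracket or brace, since the pattern would rewrite that text as well. The inline sample row is a shortened stand-in for the real column:

```sql
-- Sketch only: the VALUES row is a shortened stand-in for the real varchar column.
WITH data(value) AS (VALUES
    '[{"skuId": "5bc87ae20d298a283c297ca1"}, {"skuId": "182784738484wefhdchs4848"},]'
),
cleaned(value) AS (
    SELECT
        -- Drop a comma that is followed only by whitespace and a closing ] or }.
        -- Naive: it would also change matching text inside JSON string values.
        regexp_replace(value, ',\s*([\]}])', '$1')
    FROM data
),
parsed(entries) AS (
    SELECT cast(json_parse(value) AS array(row(skuId varchar)))
    FROM cleaned
)
SELECT ordinal, skuId
FROM parsed
CROSS JOIN UNNEST(entries) WITH ORDINALITY AS t(skuId, ordinal)
```

Cleaning the text in a separate step keeps the parsing and unnesting logic identical to the query above, so the workaround can be dropped once the source data is fixed.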