这是我的嵌套JSON:
{
"business_id":"pNQwnY_q4okdlnPiR-3RBA",
"full_address":"6105 S Fort Apache Rd\nSpring Valley\nLas Vegas, NV 89148",
"hours":{ },
"open":true,
"categories":[ ],
"city":"Las Vegas",
"review_count":68,
"name":"Empire Bagels",
"neighborhoods":[
"Spring Valley"
],
"longitude":-115.298175926911,
"state":"NV",
"stars":3.0,
"latitude":36.07728616051,
"attributes":{
"Take-out":true,
"Wi-Fi":"no",
"Good For":{
"dessert":false,
"latenight":false,
"lunch":false,
"dinner":false,
"breakfast":true,
"brunch":false
},
"Caters":true,
"Noise Level":"quiet",
"Takes Reservations":false,
"Delivery":false,
"Ambience":{
"romantic":false,
"intimate":false,
"classy":false,
"hipster":false,
"divey":false,
"touristy":false,
"trendy":false,
"upscale":false,
"casual":true
},
"Parking":{
"garage":false,
"street":false,
"validated":false,
"lot":true,
"valet":false
},
"Has TV":true,
"Outdoor Seating":true,
"Attire":"casual",
"Alcohol":"none",
"Waiter Service":false,
"Accepts Credit Cards":true,
"Good for Kids":true,
"Good For Groups":true,
"Price Range":1
},
"type":"business"
}
我正在查询使用apache演练。我想找出城市中所有餐厅的十大最常见``真实''属性,例如:
Accepts Credit Cards : 200,
Alcohol: 300,
Good For Kids : 500
我的查询看起来如何?这是我所做的:
select attributes, count(*) attributes from `yelp_dataset` group by attributes;
我收到此错误:
Error: SYSTEM ERROR: UnsupportedOperationException: Map, Array, Union or repeated scalar type should not be used in group by, order by or in a comparison operator. Drill does not support compare between MAP:REQUIRED and MAP:REQUIRED.
Fragment 0:0
[Error Id: 8fe8a616-92c7-4da0-ab65-b5542d391f47 on 192.168.10.104:31010] (state=,code=0)
我的查询应该是什么?
最佳答案
由于数据类型混合,我无法使用KVGEN()自动展平属性,但是您可以尝试使用UNION ALL蛮力进行CTE:
WITH ReviewAttributes AS (
SELECT
reviews.name,
'Accepts Credit Cards' as `AttributeName`,
CASE WHEN reviews.attributes.`Accepts Credit Cards` = true THEN 1 ELSE 0 END as `AttributeValue`
FROM
`yelp_dataset` reviews
UNION ALL
SELECT
reviews.name,
'Alcohol' as `AttributeName`,
CASE WHEN reviews.attributes.`Alcohol` <> 'none' THEN 1 ELSE 0 END as `AttributeValue`
FROM
`yelp_dataset` reviews
UNION ALL
SELECT
reviews.name,
'Good for Kids' as `AttributeName`,
CASE WHEN reviews.attributes.`Good for Kids` = true THEN 1 ELSE 0 END as `AttributeValue`
FROM
`yelp_dataset` reviews
)
SELECT
`AttributeName`,
SUM(`AttributeValue`) as `AttributeCount`
FROM
ReviewAttributes
GROUP BY
`AttributeName`;
CASE语句还可以帮助您解决布尔字段和枚举字段之间的某些差异,例如从样本中计算
Alcohol
。关于mysql - 查找嵌套JSON中具有“true”值的所有记录,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/40648431/