In the Google Analytics export to BigQuery, the following query tries to aggregate session-level and hit-level information:
SELECT
visitId,
trafficSource.source as source,
trafficSource.medium as medium,
device.browser as browser,
min(hits.hour) as firstHitHour,
boolean(count(hits.eventInfo.eventCategory = "SomeValue")) as hasSomeValue
FROM
[my-table.ga_sessions_20150216]
GROUP BY
visitId, source, medium, browser;
The line
boolean(count(hits.eventInfo.eventCategory = "SomeValue")) as hasSomeValue
should be true if, among all the hits of one session row, at least one hit has hits.eventInfo.eventCategory equal to SomeValue. The expected result looks like this:
sessionId  source         medium    browser  firstHitHour  hasSomeValue
-----------------------------------------------------------------------
12318      google         cpc       firefox  12            true
13317      google         organic   safari   14            null
13551      bing           organic   firefox  14            true
13610      orange         display   chrome   14            true
14381      stackoverflow  referral  safari   15            false
14422      google         organic   chrome   15            true
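However, the line above does not seem to work. Even if I use a dummy value, such as
hits.eventInfo.eventCategory = "Blablablabla"
it still outputs true for some rows (and obviously no hit has this dummy value). In fact, the expression
count(hits.eventInfo.eventCategory = "Blablablabla")
without boolean() returns seemingly random results that have nothing in common with the actual counts. What should the line be to produce the correct output?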
Best answer
This is easier with standard SQL. To check whether the value is present among a session's hits, use an EXISTS clause over the hits array. For example:
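SELECT
  visitId,
  trafficSource.source AS source,
  trafficSource.medium AS medium,
  device.browser AS browser,
  -- Earliest hit hour within this session's hits array.
  (SELECT MIN(hour) FROM UNNEST(hits)) AS firstHitHour,
  -- True if at least one hit in the session has the target event category.
  EXISTS (
    SELECT 1 FROM UNNEST(hits)
    WHERE eventInfo.eventCategory = "SomeValue"
  ) AS hasSomeValue
FROM
  `my-table.ga_sessions_20150216`;
-- No GROUP BY: each row is already one session, and the subqueries
-- read the ungrouped hits array of that row.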
See also the guide on migrating from legacy to standard SQL.
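As for why the original expression reports true for a dummy value: COUNT(expr) counts every non-NULL value of expr, and a comparison that evaluates to false is still non-NULL, so it gets counted too. The sketch below illustrates this under some loud assumptions: it uses Python's built-in sqlite3 as a stand-in for BigQuery, and the flattened hits table and its rows are hypothetical. Only the COUNT behaviour is meant to carry over.

```python
import sqlite3

# Hypothetical flattened data: one row per (visitId, hit).
# SQLite is only a stand-in here; it is not BigQuery.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (visitId INTEGER, eventCategory TEXT)")
conn.executemany(
    "INSERT INTO hits VALUES (?, ?)",
    [
        (1, "SomeValue"),
        (1, "Other"),
        (2, "Other"),
        (2, "Other"),
        (3, None),  # a hit with no event attached
    ],
)

rows = conn.execute(
    """
    SELECT
      visitId,
      -- Counts every non-NULL comparison result, true or false.
      COUNT(eventCategory = 'SomeValue') AS naive_count,
      -- Counts only the hits that actually match.
      SUM(CASE WHEN eventCategory = 'SomeValue' THEN 1 ELSE 0 END) AS match_count
    FROM hits
    GROUP BY visitId
    ORDER BY visitId
    """
).fetchall()

# A session "has the value" only when match_count is positive.
has_some_value = {visit_id: match > 0 for visit_id, _, match in rows}

for visit_id, naive, match in rows:
    print(visit_id, naive, match, has_some_value[visit_id])
```

In this toy data, session 2 has a non-zero naive count even though none of its hits match, which is the same symptom as in the question. Counting only the matching hits, or using EXISTS as above, avoids it.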
If you really do want to use legacy SQL, you need to combine WITHIN RECORD with a count over hits.
Original question on Stack Overflow (google-analytics: aggregating hit and session data from an Analytics query in BigQuery): https://stackoverflow.com/questions/40782028/