问题描述
我正在尝试使用 BigQuery 重新创建 GA 漏斗(Google360 上的自定义报告).GA 上的漏斗使用每个页面上发生的唯一事件计数.我在网上发现这个查询在大多数情况下都有效:
I am trying to recreate the GA funnel (custom report on Google360) using BigQuery. The funnel on GA is using the unique count of events that happen on each page. I found this query online that is working for the most part:
SELECT
COUNT( s0.firstHit) AS Landing_Page,
COUNT( s1.firstHit) AS Model_Selection
from(
SELECT
s0.fullvisitorID,
s0.firstHit,
s1.firstHit,
FROM (
# Begin Subquery #1 aka s0
SELECT
fullvisitorID,
MIN(hits.hitNumber) AS firstHit
FROm [64269470.ga_sessions_20170720]
WHERE
hits.eventInfo.eventAction in ('landing_page')
AND totals.visits = 1
GROUP BY
fullvisitorID
) s0
# End Subquery #1 aka s0
left join (
# Begin Subquery #2 aka s1
SELECT
fullvisitorID,
MIN(hits.hitNumber) AS firstHit
FROM [64269470.ga_sessions_20170720]
WHERE
hits.eventInfo.eventAction in ('model_selection_page')
AND totals.visits = 1
GROUP BY
fullvisitorID,
) s1
ON
s0.fullvisitorID = s1.fullvisitorID
)
查询工作正常,登陆页面的值与我在 GA 上获得的值相同,但 Model_Selection 高出约 10%.这种差异也沿着漏斗增加(为了清楚起见,我只发布了 2 个步骤).知道我在这里缺少什么吗?
The query works fine and the value for landing page is the same as I can get on GA, but Model_Selection is about 10% higher. This difference also increases along the funnel (I only posted 2 steps for clarity).Any idea what am I missing here?
推荐答案
此查询满足您的需求,但在 标准 SQL 版本:
This query does what you need but in Standard SQL Version:
#standardSQL
SELECT
SUM((SELECT COUNTIF(eventInfo.eventAction = 'landing_page') FROM UNNEST(hits))) Landing_Page,
SUM((SELECT COUNTIF(eventInfo.eventAction = 'model_selection_page') FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page'))) Model_Selection
FROM `64269470.ga_sessions_20170720`
就是这样.4 条线路,更快更便宜.
Just that. 4 lines, way faster and cheaper.
您还可以使用模拟数据,例如:
You can also play with simulated data, something like:
#standardSQL
WITH data AS(
SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo)] AS hits UNION ALL
SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo)] AS hits UNION ALL
SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo)] AS hits UNION ALL
SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo)] AS hits
)
SELECT
SUM((SELECT COUNTIF(eventInfo.eventAction = 'landing_page') FROM UNNEST(hits))) Landing_Page,
SUM((SELECT COUNTIF(eventInfo.eventAction = 'model_selection_page') FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page'))) Model_Selection
FROM data
请注意,在 GA 中构建这种类型的报告可能会更困难一些,因为您需要选择至少触发过一次事件landing_page"然后触发事件model_selection_page"的访问者.确保您在 GA 中也正确构建了此报告(一种方法可能是首先构建仅包含触发landing_page"的客户的自定义报告,然后应用第二个过滤器查找model_selection_page").
Notice that building this type of report in GA might be a bit more difficult as you need to select visitors who had at least fired once the event 'landing_page' and then had the event 'model_selection_page' fired. Make sure you got this report built correctly as well in your GA (one way might be to first build a customized report with only customers who had 'landing_page' fired and then apply the second filter looking for 'model_selection_page').
:
您在评论中提出要在会话和用户级别进行计数.为了计算每个会话,您可以将每个子查询评估的结果限制为 1,如下所示:
You asked in your comment about bringing this counting on the session and user level. For counting each session, you can limit the results to 1 for each sub-query evaluation, like so:
SELECT
SUM((SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page' LIMIT 1)) Landing_Page,
SUM((SELECT 1 FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page') AND eventInfo.eventAction = 'model_selection_page' LIMIT 1)) Model_Selection
FROM data
为了计算不同的用户,想法是一样的,但你必须应用 COUNT(DISTINCT)
操作,就像这样:
For counting distinct users, the idea is the same but you'd have to apply a COUNT(DISTINCT)
operation, like so:
SELECT
COUNT(DISTINCT(SELECT fullvisitorid FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page' LIMIT 1)) Landing_Page,
COUNT(DISTINCT(SELECT fullvisitorid FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page') AND eventInfo.eventAction = 'model_selection_page' LIMIT 1)) Model_Selection
FROM data
这篇关于在 BigQuery 上重新创建 GA 漏斗的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!