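This week I came across the standard SQL BigQuery documentation, which got me started on a Firebase Analytics closed funnel. However, I am getting wrong results (image below). No user should have "Tutorial_LessonCompleted" without first having triggered "Tutorial_LessonStarted >> Lesson = 1". There could be several reasons for this.
Questions:
Is it sensible to use User Property = "first_open_time", or is Event = "first_open" the better choice? How would the latter be implemented?
I suspect I may not be drilling down correctly to: Event (String = "Tutorial_LessonStarted") >> Parameter (String = "LessonNumber") >> Value (String = "lesson1")?
How would a filter on _TABLE_SUFFIX = '20170701' work? I have read that this would be cheaper. Any suggestions for optimized code will be received with open arms and upvotes!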
#standardSQL
SELECT
step1, step2, step3, step4, step5, step6,
COUNT(*) AS funnel_count,
COUNT(DISTINCT user_id) AS users
FROM (
SELECT
user_dim.app_info.app_instance_id AS user_id,
event.timestamp_micros AS event_timestamp,
event.name AS step1,
LEAD(event.name, 1) OVER (
PARTITION BY user_dim.app_info.app_instance_id
ORDER BY event.timestamp_micros ASC) as step2,
LEAD(event.name, 2) OVER (
PARTITION BY user_dim.app_info.app_instance_id
ORDER BY event.timestamp_micros ASC) as step3,
LEAD(event.name, 3) OVER (
PARTITION BY user_dim.app_info.app_instance_id
ORDER BY event.timestamp_micros ASC) as step4,
LEAD(event.name, 4) OVER (
PARTITION BY user_dim.app_info.app_instance_id
ORDER BY event.timestamp_micros ASC) as step5,
LEAD(event.name, 5) OVER (
PARTITION BY user_dim.app_info.app_instance_id
ORDER BY event.timestamp_micros ASC) as step6
FROM
`......`,
UNNEST(event_dim) AS event,
UNNEST(user_dim.user_properties) AS user_prop
WHERE user_prop.key = "first_open_time"
ORDER BY 1, 2, 3, 4, 5 ASC
)
WHERE step6 = "Tutorial_LessonStarted" AND EXISTS (
SELECT *
FROM `......`,
UNNEST(event_dim) AS event,
UNNEST(event.params)
WHERE key = 'LessonNumber' AND value.string_value = "lesson1")
GROUP BY step1, step2, step3, step4, step5, step6
ORDER BY funnel_count DESC
LIMIT 100;
Note:
Put your own table in the FROM clause, e.g.:
project_id.com_game_example_IOS.app_events_20170212
I have left out funnel_count and user_count.
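Output:
----------------------------------------------------------
Update since the original question above:
@Elliot: I don't understand why you wrote: -- ensure that an event with lesson1 precedes Tutorial_LessonStarted.
Tutorial_LessonStarted has a parameter "LessonNumber" whose values are lesson1, lesson2, lesson3, lesson4.
I want to count all funnels in which the last step is a Tutorial_LessonStarted event with LessonNumber = lesson1.
So, applied to the event log of a brand-new user's first session (i.e. a user who triggered first_open_time), the answer would be the following list: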
View.OnboardingWelcomePage
View.OnboardingFinalPage
View.JamLoading
View.JamLoading
Jam.UserViewsJam
Jam.ProjectOpened
View.JamMixer
Tutorial.LessonStarted (the "LessonNumber" parameter of this event would equal "lesson1")
Jam.ProjectPlayStarted
View.JamLoopSelector
View.JamMixer
View.JamLoopSelector
View.JamMixer
View.JamLoopSelector
View.JamMixer
Tutorial.LessonCompleted
Tutorial.LessonStarted (the "LessonNumber" parameter of this event would equal "lesson2")
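So the important part is to first get all users whose first_open_time falls on a specific date, then organize their events into funnels so that the last event in each funnel matches a given event name and a specific parameter value, and build the funnel "backwards" from there.
Best answer
Let me go through some explanation first, and then see if I can suggest a query to get you started.
It looks like you want to analyze the sequence of events in your analytics data, but the sequence is already there for you: each row holds an array of events. Looking at the Firebase schema for BigQuery, event_dim is the relevant column, and unless I am misunderstanding something, these events are ordered by time. If you want to inspect the name of the sixth event, you can use: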
event_dim[SAFE_ORDINAL(6)].name
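This evaluates to NULL if there are fewer than six events; otherwise it gives you the event name as a string. Another observation is that you are trying to analyze both event_dim and user_dim, but you are taking the cross product of the two, which multiplies the number of rows and makes the query results hard to reason about. To look for a specific user property, use an expression of this form:
(SELECT value.value.string_value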
FROM UNNEST(user_dim.user_properties)
WHERE key = 'first_open_time') = '<expected property value>'
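To see how these two expressions behave without touching real data, here is a minimal sketch that runs against a single inline row. The Sample row, its event names, and the property value '1498867200000' are made up for illustration; the nesting only mimics the event_dim and user_dim fields used above. The property is stored as a string here to match the expression above, so check in your own export which of the value fields is actually populated.

```sql
#standardSQL
-- Hypothetical inline row standing in for one row of the export table.
WITH Sample AS (
  SELECT
    [STRUCT('View.JamMixer' AS name),
     STRUCT('Tutorial_LessonStarted' AS name)] AS event_dim,
    STRUCT([
      STRUCT('first_open_time' AS key,
             STRUCT(STRUCT('1498867200000' AS string_value) AS value) AS value)
    ] AS user_properties) AS user_dim
)
SELECT
  -- The array has a second element, so this returns its name.
  event_dim[SAFE_ORDINAL(2)].name AS second_event,
  -- The array has only two elements, so SAFE_ORDINAL yields NULL
  -- where plain ORDINAL would raise an out-of-bounds error.
  event_dim[SAFE_ORDINAL(6)].name AS sixth_event,
  -- Scalar subquery: picks one property without multiplying rows.
  (SELECT value.value.string_value
   FROM UNNEST(user_dim.user_properties)
   WHERE key = 'first_open_time') AS first_open_time
FROM Sample;
```

Because the property lookup is a scalar subquery rather than a second UNNEST in the FROM clause, the result keeps one row per input row.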
Combining these two filters, your FROM and WHERE clauses would look something like this:
FROM `project_id.com_game_example_IOS.app_events_*`
WHERE _TABLE_SUFFIX = '20170701' AND
event_dim[SAFE_ORDINAL(6)].name = 'Tutorial_LessonStarted' AND
(SELECT value.value.string_value
FROM UNNEST(user_dim.user_properties)
WHERE key = 'first_open_time') = '<expected property value>'
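On the _TABLE_SUFFIX part of the question: with a wildcard table, _TABLE_SUFFIX holds whatever the * matched, and a filter that compares it against constants limits which daily tables are scanned, which is where the cost saving comes from. As a rough sketch, assuming the daily tables follow the app_events_YYYYMMDD naming shown in the question, a one-week range could look like this (the date range is only an example):

```sql
#standardSQL
-- Sketch: count rows per day across one week of daily export tables.
SELECT
  _TABLE_SUFFIX AS day,      -- the part of the table name matched by *
  COUNT(*) AS row_count
FROM `project_id.com_game_example_IOS.app_events_*`
-- Constant bounds let BigQuery skip tables outside the range.
WHERE _TABLE_SUFFIX BETWEEN '20170701' AND '20170707'
GROUP BY day
ORDER BY day;
```

The suffix is compared as a string, so the YYYYMMDD format is what makes BETWEEN behave like a date range here.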
Using the bracket operator to access the steps in event_dim, we can do something like this:
WITH FilteredInput AS (
SELECT *
FROM `project_id.com_game_example_IOS.app_events_*`
WHERE _TABLE_SUFFIX = '20170701' AND
event_dim[SAFE_ORDINAL(6)].name = 'Tutorial_LessonStarted' AND
(SELECT value.value.string_value
FROM UNNEST(user_dim.user_properties)
WHERE key = 'first_open_time') = '<expected property value>' AND
-- ensure that an event with lesson1 precedes Tutorial_LessonStarted
EXISTS (
SELECT 1
FROM UNNEST(event_dim) WITH OFFSET event_offset
CROSS JOIN UNNEST(params)
WHERE key = 'LessonNumber' AND
value.string_value = 'lesson1' AND
event_offset < 5
)
)
SELECT
event_dim[ORDINAL(1)].name AS step1,
event_dim[ORDINAL(2)].name AS step2,
event_dim[ORDINAL(3)].name AS step3,
event_dim[ORDINAL(4)].name AS step4,
event_dim[ORDINAL(5)].name AS step5,
event_dim[ORDINAL(6)].name AS step6,
COUNT(*) AS funnel_count,
COUNT(DISTINCT user_dim.user_id) AS users
FROM FilteredInput
GROUP BY step1, step2, step3, step4, step5, step6;
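This returns all of the unique "paths", along with a count and the number of distinct users for each. Note that I am writing this off the top of my head, and I don't have representative data to try it on, so there may be syntax or other errors.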
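Regarding the update to the question, where the funnel should end at the Tutorial_LessonStarted event whose LessonNumber is lesson1 and be read backwards from there: one possible approach, offered only as a sketch, is to find the array offset of that event and index relative to it. The Sample row, the 'screen' parameter, the Anchored CTE, and the anchor column are hypothetical names introduced for illustration, and the sketch looks back only two steps to stay short. A six-step funnel would extend the pattern down to anchor - 5.

```sql
#standardSQL
WITH Sample AS (
  -- Hypothetical row; real rows would come from the app_events_* tables.
  SELECT [
    STRUCT('Jam.ProjectOpened' AS name,
           [STRUCT('screen' AS key, STRUCT('jam' AS string_value) AS value)] AS params),
    STRUCT('View.JamMixer' AS name,
           [STRUCT('screen' AS key, STRUCT('mixer' AS string_value) AS value)] AS params),
    STRUCT('Tutorial_LessonStarted' AS name,
           [STRUCT('LessonNumber' AS key, STRUCT('lesson1' AS string_value) AS value)] AS params),
    STRUCT('Tutorial_LessonCompleted' AS name,
           [STRUCT('LessonNumber' AS key, STRUCT('lesson1' AS string_value) AS value)] AS params)
  ] AS event_dim
),
Anchored AS (
  SELECT
    event_dim,
    -- Zero-based offset of the first Tutorial_LessonStarted event
    -- whose LessonNumber parameter equals lesson1 (NULL if none).
    (SELECT MIN(event_offset)
     FROM UNNEST(event_dim) AS e WITH OFFSET event_offset
     WHERE e.name = 'Tutorial_LessonStarted' AND EXISTS (
       SELECT 1
       FROM UNNEST(e.params) AS p
       WHERE p.key = 'LessonNumber' AND
             p.value.string_value = 'lesson1')) AS anchor
  FROM Sample
)
SELECT
  -- SAFE_OFFSET gives NULL when fewer events precede the anchor.
  event_dim[SAFE_OFFSET(anchor - 2)].name AS step1,
  event_dim[SAFE_OFFSET(anchor - 1)].name AS step2,
  event_dim[SAFE_OFFSET(anchor)].name AS step3,
  COUNT(*) AS funnel_count
FROM Anchored
WHERE anchor IS NOT NULL
GROUP BY step1, step2, step3;
```

Checking the event name and the parameter inside the same subquery ties the lesson1 value to the Tutorial_LessonStarted event itself, rather than to any event in the row.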