Selecting the date of a user ID's first/most recent purchase

Question

I am working with Google Analytics data in BigQuery and want to aggregate the dates of the last visit and first visit up to UserID level. However, my code currently returns the user's maximum visit date, as long as they have purchased within the selected date range, because I am using MAX().

If I remove MAX(), I have to GROUP BY DATE, which I don't want because it returns multiple rows per UserID.

Here is my code, which returns a series of dates per user. last_visit_date currently works, as it is the only date that can simply look at the last date of user activity. Any advice on how I can get last_ord_date to select the date on which the order actually occurred?

SELECT
  customDimension.value AS UserID,
  # Last order date
  IF(COUNT(DISTINCT hits.transaction.transactionId) > 0,
    (MAX(DATE)),
    "unknown") AS last_ord_date,

  # first visit date
  IF(SUM(totals.newvisits) IS NOT NULL,
    (MAX(DATE)),
    "unknown") AS first_visit_date,

  # last visit date
  MAX(DATE) AS last_visit_date,

  # first order date
  IF(COUNT(DISTINCT hits.transaction.transactionId) > 0,
    (MIN(DATE)),
    "unknown") AS first_ord_date

FROM
  `XXX.XXX.ga_sessions_20*` AS t
CROSS JOIN
  UNNEST (hits) AS hits
CROSS JOIN
  UNNEST(t.customdimensions) AS customDimension
CROSS JOIN
  UNNEST(hits.product) AS hits_product
WHERE
  parse_DATE('%y%m%d',
    _table_suffix) BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 30 day)
  AND DATE_SUB(CURRENT_DATE(), INTERVAL 1 day)
  AND customDimension.index = 2
  AND customDimension.value NOT LIKE "true"
  AND customDimension.value NOT LIKE "false"
  AND customDimension.value NOT LIKE "undefined"
  AND customDimension.value IS NOT NULL
GROUP BY
  UserID

Recommended answer

The most efficient and clear way to do this (and also the most portable) is to have a simple table/view with two columns, userid and last_purchase, and another with the two columns userid and first_visit.

Then you inner join them with the original raw table on userid and hit timestamp to get, say, the session IDs you're interested in. That is three steps, but it is simple, readable and easy to maintain.
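
As a rough illustration, the two per-user tables could be sketched as CTEs in BigQuery Standard SQL. This is only a sketch under assumptions, not the answerer's exact query. The table path and custom dimension index 2 come from the question. The names user_hits, visit_date, new_visits and transaction_id are hypothetical. It also assumes that totals.newVisits = 1 marks a user's first session and that a non-null hits.transaction.transactionId marks an order. The final join back to the raw table on userid and hit timestamp is left out.

```sql
-- Sketch only: CTE and column names below are illustrative assumptions.
WITH user_hits AS (
  -- One row per (hit, custom dimension 2 value), with the session date parsed.
  SELECT
    cd.value AS userid,
    PARSE_DATE('%Y%m%d', t.date) AS visit_date,   -- GA export stores date as 'YYYYMMDD'
    t.totals.newVisits AS new_visits,             -- assumed: 1 on a first session, else NULL
    h.transaction.transactionId AS transaction_id -- assumed: non-null only on order hits
  FROM `XXX.XXX.ga_sessions_20*` AS t
  CROSS JOIN UNNEST(t.hits) AS h
  CROSS JOIN UNNEST(t.customDimensions) AS cd
  WHERE PARSE_DATE('%y%m%d', _TABLE_SUFFIX)
          BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
              AND DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
    AND cd.index = 2
    -- NOT IN also drops NULL values, so no separate IS NOT NULL check is needed.
    AND cd.value NOT IN ('true', 'false', 'undefined')
),
last_purchase AS (
  -- Only hits that carry a transaction contribute to the purchase date.
  SELECT userid, MAX(visit_date) AS last_purchase
  FROM user_hits
  WHERE transaction_id IS NOT NULL
  GROUP BY userid
),
first_visit AS (
  -- Only sessions flagged as new visits contribute to the first-visit date.
  SELECT userid, MIN(visit_date) AS first_visit
  FROM user_hits
  WHERE new_visits = 1
  GROUP BY userid
)
SELECT
  userid,
  first_visit,
  last_purchase
FROM first_visit
FULL OUTER JOIN last_purchase USING (userid)
```

Filtering before aggregating is what keeps each date tied to the event it describes: MAX(visit_date) in last_purchase only sees rows where an order happened, instead of every row for a user who ordered at some point in the range. The sketch also drops the UNNEST(hits.product) join from the question, since product-level rows are not needed for these dates.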

It's very easy for a query that relies on first or last purchase/action to become so complex (just look at the unnest operations you have there) that it is unusable, and you'll spend far too much time trying to figure out what the output means.

Also keep in mind that a wildcard query is limited to 1000 tables, so with one ga_sessions table per day your first and last visits fall within a rolling window of 1000 days.

07-29 11:47