Basics
Database Management
Dates and Times
How to Measure Cohort Retention
Cohort retention analysis helps you understand how well you’re retaining users over time. With BigQuery’s powerful data processing capabilities, you can analyze retention at scale and gain actionable insights.
1. What is Cohort Retention?
Cohort retention measures how many users from a particular group (cohort) continue to engage with your product over time. For example, you may group users by signup month and track how many of them are still active after 1, 2, or 3 months.
2. Prepare Your Data in BigQuery
Ensure your dataset contains at least the following columns:
user_id
— unique identifier for each usersignup_date
— when the user first joinedactivity_date
— when the user was active
3. Write the SQL Query
Here’s an example SQL query to calculate cohort retention by signup month:
WITH cohorts AS (
SELECT
user_id,
DATE_TRUNC(signup_date, MONTH) AS cohort_month
FROM `project.dataset.users`
),
activity AS (
SELECT
user_id,
DATE_TRUNC(activity_date, MONTH) AS activity_month
FROM `project.dataset.user_activity`
),
cohort_activity AS (
SELECT
c.cohort_month,
a.activity_month,
COUNT(DISTINCT a.user_id) AS active_users
FROM cohorts c
JOIN activity a
ON c.user_id = a.user_id
WHERE a.activity_month >= c.cohort_month
GROUP BY c.cohort_month, a.activity_month
)
SELECT
cohort_month,
FORMAT_DATE('%Y-%m', activity_month) AS activity_month,
active_users
FROM cohort_activity
ORDER BY cohort_month, activity_month;
4. Calculate Retention Rates
To turn counts into retention rates, divide the number of active users by the number of users in the cohort:
WITH cohort_sizes AS (
SELECT
cohort_month,
COUNT(DISTINCT user_id) AS cohort_size
FROM cohorts
GROUP BY cohort_month
)
SELECT
ca.cohort_month,
ca.activity_month,
ca.active_users,
ROUND(ca.active_users / cs.cohort_size, 2) AS retention_rate
FROM cohort_activity ca
JOIN cohort_sizes cs
ON ca.cohort_month = cs.cohort_month
ORDER BY ca.cohort_month, ca.activity_month;
5. Visualize the Results
Export your results to a tool like Google Sheets, Looker Studio, or Tableau. Heatmaps are a popular choice, where rows represent cohorts, columns represent months since signup, and the cells are shaded by retention rate.
6. Best Practices
- Define clear cohort criteria (e.g., signup date, first purchase).
- Exclude test or inactive users for cleaner analysis.
- Segment by user type or geography to uncover deeper insights.
Conclusion
Measuring cohort retention in BigQuery provides valuable insights into user engagement and product health. By following these steps, you can track performance, identify trends, and improve retention strategies over time.