Basics
Database Management
Dates and Times
How to Generate Series to Avoid Gaps in Data
Gaps in time series or numeric data can cause inaccurate reports, misleading charts, and broken dashboards. Fortunately, BigQuery provides tools to help generate complete series so you can maintain clean, reliable datasets.
Why Do Gaps Matter?
When you analyze time series data—such as daily sales, website traffic, or user activity—you often expect to see one row per day. But if a day has no activity, BigQuery won’t return a row unless you explicitly account for it. This can break visualizations or lead to undercounting in reports.
Using GENERATE_DATE_ARRAY
BigQuery provides GENERATE_DATE_ARRAY
to create a series of dates between a start and end date. For example:
SELECT date
FROM UNNEST(GENERATE_DATE_ARRAY('2024-01-01', '2024-01-31', INTERVAL 1 DAY)) AS date
This produces one row per day for January 2024.
Joining with Your Data
To fill gaps, join the generated series with your actual data using a LEFT JOIN
:
WITH date_series AS (
SELECT date
FROM UNNEST(GENERATE_DATE_ARRAY('2024-01-01', '2024-01-31', INTERVAL 1 DAY)) AS date
)
SELECT
ds.date,
COALESCE(SUM(sales.amount), 0) AS total_sales
FROM date_series ds
LEFT JOIN sales
ON DATE(sales.created_at) = ds.date
GROUP BY ds.date
ORDER BY ds.date
This guarantees one row per date, even if no sales occurred on that day.
Generating Numeric Series
For numeric sequences, use GENERATE_ARRAY
:
SELECT num
FROM UNNEST(GENERATE_ARRAY(1, 10)) AS num
Combine it similarly with your dataset to fill gaps in sequences or ranges.
Pro Tips
- Use
COALESCE()
to replaceNULL
values from missing data. - For large ranges, consider filtering only relevant periods to optimize performance.
- Add indexes or clustering on join keys to improve query speed.
Conclusion
By generating series in BigQuery, you ensure complete, reliable datasets—making your reports, dashboards, and analytics trustworthy. Whether dealing with dates or numbers, mastering these techniques is key to becoming a more effective data analyst.