How to Generate Series to Avoid Gaps in Data

Gaps in time series or numeric data can cause inaccurate reports, misleading charts, and broken dashboards. Fortunately, BigQuery provides tools to help generate complete series so you can maintain clean, reliable datasets.

Why Do Gaps Matter?

When you analyze time series data—such as daily sales, website traffic, or user activity—you often expect to see one row per day. But if a day has no activity, BigQuery won’t return a row unless you explicitly account for it. This can break visualizations or lead to undercounting in reports.

Using GENERATE_DATE_ARRAY

BigQuery provides GENERATE_DATE_ARRAY to create a series of dates between a start and end date. For example:

SELECT date
FROM UNNEST(GENERATE_DATE_ARRAY('2024-01-01', '2024-01-31', INTERVAL 1 DAY)) AS date

This produces one row per day for January 2024.

Joining with Your Data

To fill gaps, join the generated series with your actual data using a LEFT JOIN:

WITH date_series AS (
  SELECT date
  FROM UNNEST(GENERATE_DATE_ARRAY('2024-01-01', '2024-01-31', INTERVAL 1 DAY)) AS date
)
SELECT
  ds.date,
  COALESCE(SUM(sales.amount), 0) AS total_sales
FROM date_series ds
LEFT JOIN sales
  ON DATE(sales.created_at) = ds.date
GROUP BY ds.date
ORDER BY ds.date

This guarantees one row per date, even if no sales occurred on that day.

Generating Numeric Series

For numeric sequences, use GENERATE_ARRAY:

SELECT num
FROM UNNEST(GENERATE_ARRAY(1, 10)) AS num

Combine it similarly with your dataset to fill gaps in sequences or ranges.

Pro Tips

  • Use COALESCE() to replace NULL values from missing data.
  • For large ranges, consider filtering only relevant periods to optimize performance.
  • Add indexes or clustering on join keys to improve query speed.

Conclusion

By generating series in BigQuery, you ensure complete, reliable datasets—making your reports, dashboards, and analytics trustworthy. Whether dealing with dates or numbers, mastering these techniques is key to becoming a more effective data analyst.