Basics
- How to Insert
- How to Update
- How to Delete
- How to Trim Strings
- How to Use substring()
- How to Use substring() with RegEx to Extract a String
- How to Replace Substrings
- How to Modify Arrays
- How to Compare Arrays
- How to Concatenate Strings
- How to Convert the Case of a String
- How to Create an Array
- How to Insert Data Into an Array
- How to Query Arrays
- How to Use string_agg()
Database Management
- How to Create a Table
- How to Drop a Table
- How to Rename a Table
- How to Truncate a Table
- How to Duplicate a Table
- How to Add a Column
- How to Change a Column Name
- How to Add a Default Value
- How to Remove a Default Value From a Column
- How to Add a Not Null Constraint
- How to Remove a Not Null Constraint
- How to Add an Index
- How to Drop an Index
- How to Create a View
- How to Drop a View
- How to Reset Sequence
- How to Drop a Column
Dates and Times
Analysis
- How to Use nullif()
- How to Use Lateral Joins
- How to Calculate Percentiles
- How to Get the First Row per Group
- How to Use generate_series to Avoid Gaps in Data
- How to Do Type Casting
- How to Write a Common Table Expression
- How to Import a CSV using Copy
- How to Compare Two Values When One Is Null
- How to Use Coalesce
- How to Write a Case Statement
- How to Use Filter to Have Multiple Counts
- How to Calculate Cumulative Sum-Running Total
- How to Query a JSON Column
How to Use generate_series to Avoid Gaps in Data
PostgreSQL’s generate_series
function is a powerful tool for creating continuous ranges of values, especially useful when working with time-series or sequential data. This tutorial will show you how to use it to avoid gaps in your datasets.
Why Gaps in Data Matter
Missing data points can break reports, dashboards, and analysis. For example, if you’re tracking daily sales but no sales occurred on some days, those dates might be missing from your reports — leading to misleading trends.
What is generate_series
?
The generate_series
function creates a set of numbers, timestamps, or dates based on a start, end, and step value. For example:
SELECT generate_series('2023-01-01'::date, '2023-01-10', '1 day');
This will return every date from January 1 to January 10, 2023.
Example Use Case: Filling Missing Dates in Sales Data
Suppose you have a sales
table:
CREATE TABLE sales (
sale_date DATE,
amount NUMERIC
);
And your data looks like this:
sale_date | amount
------------|--------
2023-01-01 | 100
2023-01-03 | 150
2023-01-05 | 200
To generate a full date range and join it to your sales data:
SELECT gs.date, COALESCE(s.amount, 0) AS amount
FROM generate_series('2023-01-01'::date, '2023-01-05', '1 day') AS gs(date)
LEFT JOIN sales s ON s.sale_date = gs.date;
This will give you:
date | amount
-----------|--------
2023-01-01 | 100
2023-01-02 | 0
2023-01-03 | 150
2023-01-04 | 0
2023-01-05 | 200
Handling Numeric Gaps
You can also fill gaps in numeric sequences:
SELECT gs.id, COALESCE(d.value, 0) AS value
FROM generate_series(1, 5) AS gs(id)
LEFT JOIN data d ON d.id = gs.id;
Best Practices
- Always use
LEFT JOIN
to preserve the full series. - Use
COALESCE
to replaceNULL
with default values like0
. - For large datasets, limit the series to a reasonable range to avoid performance issues.
Conclusion
By using generate_series
, you can ensure that your reports and visualizations always show complete ranges, making your data clearer and more reliable. It’s a simple but essential tool for anyone working with PostgreSQL.
Happy querying!