Basics
- How to Insert
- How to Update
- How to Delete
- How to Trim Strings
- How to Use substring()
- How to Use substring() with RegEx to Extract a String
- How to Replace Substrings
- How to Modify Arrays
- How to Compare Arrays
- How to Concatenate Strings
- How to Convert the Case of a String
- How to Create an Array
- How to Insert Data Into an Array
- How to Query Arrays
- How to Use string_agg()
Database Management
- How to Create a Table
- How to Drop a Table
- How to Rename a Table
- How to Truncate a Table
- How to Duplicate a Table
- How to Add a Column
- How to Change a Column Name
- How to Add a Default Value
- How to Remove a Default Value From a Column
- How to Add a Not Null Constraint
- How to Remove a Not Null Constraint
- How to Add an Index
- How to Drop an Index
- How to Create a View
- How to Drop a View
- How to Reset Sequence
- How to Drop a Column
Dates and Times
Analysis
- How to Use nullif()
- How to Use Lateral Joins
- How to Calculate Percentiles
- How to Get the First Row per Group
- How to Use generate_series to Avoid Gaps in Data
- How to Do Type Casting
- How to Write a Common Table Expression
- How to Import a CSV using Copy
- How to Compare Two Values When One Is Null
- How to Use Coalesce
- How to Write a Case Statement
- How to Use Filter to Have Multiple Counts
- How to Calculate Cumulative Sum-Running Total
- How to Query a JSON Column
How to Get the First Row per Group in PostgreSQL
In PostgreSQL, retrieving the first row per group (for example, the latest order per customer or the top score per player) is a common SQL challenge. Let’s explore several effective methods to achieve this.
1. Using DISTINCT ON
The simplest and most PostgreSQL-specific solution is the DISTINCT ON
clause.
SELECT DISTINCT ON (customer_id) customer_id, order_date, amount
FROM orders
ORDER BY customer_id, order_date DESC;
This selects the first row per customer_id
based on the latest order_date
. The ORDER BY
clause is essential to control which row is considered “first.”
2. Using ROW_NUMBER()
window function
The ROW_NUMBER()
function numbers rows within each group. You can then filter the first row per group.
SELECT customer_id, order_date, amount
FROM (
SELECT customer_id, order_date, amount,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn
FROM orders
) sub
WHERE rn = 1;
This works across most SQL databases, not just PostgreSQL.
3. Using LEFT JOIN
Another approach is to join each row to its potential “next” row and filter rows with no higher-ranked match.
SELECT o1.*
FROM orders o1
LEFT JOIN orders o2
ON o1.customer_id = o2.customer_id AND o1.order_date < o2.order_date
WHERE o2.order_date IS NULL;
This can be slower on large datasets but works when window functions are not available.
Which Method Should You Use?
- Use
DISTINCT ON
when you are using PostgreSQL — it’s concise and fast. - Use
ROW_NUMBER()
when writing cross-database SQL. - Use
LEFT JOIN
when window functions are unavailable.
Final Thoughts
PostgreSQL offers multiple tools to get the first row per group. DISTINCT ON
is a powerful feature unique to PostgreSQL, while ROW_NUMBER()
gives you more flexibility across databases. Choose the method that best fits your project’s needs.
Happy querying!