How to Get the First Row per Group in PostgreSQL

In PostgreSQL, retrieving the first row per group (for example, the latest order per customer or the top score per player) is a common SQL challenge. Let’s explore several effective methods to achieve this.

1. Using DISTINCT ON

The simplest and most PostgreSQL-specific solution is the DISTINCT ON clause.

SELECT DISTINCT ON (customer_id) customer_id, order_date, amount
FROM orders
ORDER BY customer_id, order_date DESC;

This selects the first row per customer_id based on the latest order_date. The ORDER BY clause is essential to control which row is considered “first.”

2. Using ROW_NUMBER() window function

The ROW_NUMBER() function numbers rows within each group. You can then filter the first row per group.

SELECT customer_id, order_date, amount
FROM (
    SELECT customer_id, order_date, amount,
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn
    FROM orders
) sub
WHERE rn = 1;

This works across most SQL databases, not just PostgreSQL.

3. Using LEFT JOIN

Another approach is to join each row to its potential “next” row and filter rows with no higher-ranked match.

SELECT o1.*
FROM orders o1
LEFT JOIN orders o2
  ON o1.customer_id = o2.customer_id AND o1.order_date < o2.order_date
WHERE o2.order_date IS NULL;

This can be slower on large datasets but works when window functions are not available.

Which Method Should You Use?

  • Use DISTINCT ON when you are using PostgreSQL — it’s concise and fast.
  • Use ROW_NUMBER() when writing cross-database SQL.
  • Use LEFT JOIN when window functions are unavailable.

Final Thoughts

PostgreSQL offers multiple tools to get the first row per group. DISTINCT ON is a powerful feature unique to PostgreSQL, while ROW_NUMBER() gives you more flexibility across databases. Choose the method that best fits your project’s needs.

Happy querying!