How to Get First Row Per Group in Snowflake

When working with SQL queries in Snowflake, you may need to retrieve the first row for each group in a dataset. This is a common requirement in reporting and analytics, where you want to summarize data or extract key values from each group. Snowflake provides several methods to achieve this, including the use of ROW_NUMBER(), RANK(), and window functions.

Using ROW_NUMBER() for First Row Per Group

One of the most efficient ways to retrieve the first row per group in Snowflake is by using the ROW_NUMBER() window function. This function assigns a unique number to each row within a partition, allowing you to easily identify the first row within each group.


SELECT 
    column1, 
    column2, 
    column3
FROM (
    SELECT 
        column1, 
        column2, 
        column3,
        ROW_NUMBER() OVER (PARTITION BY column1 ORDER BY column2) AS row_num
    FROM your_table
) AS ranked
WHERE row_num = 1;
                                

In this example, we partition the data by column1 and order the rows within each group by column2. The row with the smallest value in column2 for each column1 group will be assigned a row_num of 1, and we filter the results to retrieve only those rows.

Alternative: Using QUALIFY

Snowflake also supports the QUALIFY clause, which simplifies the filtering process after applying window functions. Here's how you can use it:


SELECT 
    column1, 
    column2, 
    column3
FROM your_table
QUALIFY ROW_NUMBER() OVER (PARTITION BY column1 ORDER BY column2) = 1;
                                

The QUALIFY clause is a shorthand for filtering results after applying window functions, eliminating the need for a subquery.

Conclusion

Retrieving the first row per group in Snowflake can be done efficiently using window functions like ROW_NUMBER() and the QUALIFY clause. These methods provide flexibility and performance when working with large datasets, and are essential tools for analysts and data engineers. By using these techniques, you can easily isolate the first record for each group based on any criteria.