type
status
date
slug
summary
tags
category
icon
password
What is a Window Function?
A window function in SQL performs a calculation across a set of table rows that are somehow related to the current row. This is different from regular aggregate functions, which collapse rows into a single output row per group.
Key Point: A window function retains the individual rows and adds extra computed columns based on a "window" of rows.
Syntax:
Components:
OVER
: Mandatory. Defines the window over which the function operates.
PARTITION BY
(optional): Splits the data into groups (likeGROUP BY
) but doesn’t collapse the rows.
ORDER BY
(optional but common): Specifies the logical order of rows within a partition.
ROWS BETWEEN
(optional): Defines a specific range of rows to consider in the window.
Key Differences: Aggregate vs. Window Functions
Feature | Aggregate Function | Window Function |
Output per group | One row per group | One row per original row |
Collapses rows? | Yes | No |
Keeps original data? | No | Yes |
Typical use | SUM, COUNT, AVG, etc. | Running totals, ranking, lag/lead |
Example Dataset
Let’s say we have this
orders
table:order_id | user_id | order_date | amount |
1 | 101 | 2023-01-01 | 100 |
2 | 101 | 2023-02-01 | 200 |
3 | 102 | 2023-01-15 | 150 |
4 | 101 | 2023-03-01 | 300 |
5 | 102 | 2023-02-01 | 250 |
6 | 103 | 2023-01-10 | 120 |
Example 1: Running Total per User
Explanation:
SUM(amount)
is calculated for each user (PARTITION BY user_id
).
- It adds up the amounts in order of
order_date
.
- Each row keeps its original data + a running total.
- ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW This clause defines the "frame" of rows used for each window function within the partition. for each row, SQL includes all previous rows plus the current one in the window frame.
Clause | Meaning |
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING | Only includes 1 row before and after the current row |
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING | From current row to the end of the partition |
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW | Rolling window of 3 rows: current + 2 previous |
user_id | order_id | order_date | amount | running_total |
101 | 1 | 2023-01-01 | 100 | 100 |
101 | 2 | 2023-02-01 | 200 | 300 |
101 | 4 | 2023-03-01 | 300 | 600 |
102 | 3 | 2023-01-15 | 150 | 150 |
102 | 5 | 2023-02-01 | 250 | 400 |
103 | 6 | 2023-01-10 | 120 | 120 |
Example 2: Cumulative Average per User
Calculates the average order amount for each user up to each row (ordered by date).
user_id | order_id | order_date | amount | running_avg |
101 | 1 | 2023-01-01 | 100 | 100.0 |
101 | 2 | 2023-02-01 | 200 | 150.0 |
101 | 4 | 2023-03-01 | 300 | 200.0 |
Example 3: Ranking Users by Spend
Ranks users by total amount spent. Ties get the same rank, and the next rank is skipped (like Olympic medals).
user_id | total_spent | spending_rank |
101 | 600 | 1 |
102 | 400 | 2 |
103 | 120 | 3 |
Example 4: Compare to Previous Order (LAG)
Adds a column showing the amount from the previous order for each user.
user_id | order_id | order_date | amount | previous_amount |
101 | 1 | 2023-01-01 | 100 | NULL |
101 | 2 | 2023-02-01 | 200 | 100 |
101 | 4 | 2023-03-01 | 300 | 200 |
Common Window Functions
Function | Description |
ROW_NUMBER() | Unique rank per row in a partition |
RANK() | Ranks rows with possible ties |
DENSE_RANK() | Like RANK() but no gaps in ranks |
SUM() | Running total within a window |
AVG() | Moving average |
LAG() / LEAD() | Get previous/next row’s value |
FIRST_VALUE() | First value in the window |
LAST_VALUE() | Last value in the window |
🔚 Summary
- Window functions are powerful tools for advanced analytics in SQL.
- They operate over a “window” of rows, without removing rows from the result set.
- Ideal for tasks like running totals, rankings, comparisons, etc.
- Author:Entropyobserver
- URL:https://tangly1024.com/article/1e5d698f-3512-8038-b851-c54f85fce9d7
- Copyright:All articles in this blog, except for special statements, adopt BY-NC-SA agreement. Please indicate the source!