SQL Window Functions visualized

SQL Window Functions visualized

·

5 min read

In this article, I am going to talk/practice about window functions. Rishabh Mishra YT has beautifully explained about Window functions.

Diagram


Window function

Window function applies aggregate, ranking & analytic functions over a particular window(set of rows).

Over clause is used with window functions to define that window.

Diagram


Window function syntax,

select column_names(s),
        fun(expression) over ( [<partition by clause>]
                     [<order by clause>]
                     [<row or range clause>] )
from table_name

where fun() are aggregate fun or ranking fun or analytic fun

Define a window - partition by , - order by, rows

window function terms,

  1. window function -- applies aggregate, ranking & analytic function over a particular window(or set of rows) for example sum(),avg(), row_number()

  2. Expression -- is the name of the column that we want the window fun operate on. This may not be necessary depending on what function is used. like we pass a column name to a function

  3. Over -- is just to signify this is a window function and also used to define the window

  4. partition by -- divides the rows into partitions so we can specify which rows to use to compute the window function.

  5. order by -- is used so that we can order the row within each partition,

  6. rows -- can be used if we want to further limit the rows within a partition.


Data

Create table

create table test_data
    (
        new_id bigint,
        new_cat varchar(20)
    )

insert into

insert into test_data
values (100,'Agni'),
(200,'Agni'),
(200,'Vayu'),
(300,'Vayu'),
(500,'Vayu'),
(500,'Dharti'),
(700,'Dharti')

Window function example

I. Aggregate functions

i. using aggregate sum() functions,

SELECT new_id,new_cat,
       sum(new_id) over ( partition by new_cat
                          order by new_id ) as Total
FROM test_data

ii. using aggregate avg() functions,

SELECT new_id,new_cat,
       avg(new_id) over ( partition by new_cat
                          order by new_id ) as Avg
FROM test_data

iii. using aggregate count() functions,

SELECT new_id,new_cat,
       count(new_id) over ( partition by new_cat
                          order by new_id ) as Count
FROM test_data

iv. using aggregate min() functions,

SELECT new_id,new_cat,
       min(new_id) over ( partition by new_cat
                          order by new_id ) as Min
FROM test_data

v. using aggregate max() functions,

SELECT new_id,new_cat,
       Max(new_id) over ( partition by new_cat
                          order by new_id ) as Max
FROM test_data

vi. using rows/ range clause,

SELECT new_id,new_cat,
       sum(new_id) over ( order by new_id 
                          rows between unbounded preceding
                          and unbounded following  ) as Total,
       avg(new_id) over ( order by new_id 
                          rows between unbounded preceding
                          and unbounded following  ) as Avg,
       count(new_id) over ( order by new_id 
                          rows between unbounded preceding
                          and unbounded following  ) as Count,
       min(new_id) over ( order by new_id 
                          rows between unbounded preceding
                          and unbounded following  ) as Min,
       max(new_id) over ( order by new_id 
                          rows between unbounded preceding
                          and unbounded following  ) as Max

FROM test_data

Note:

  • Above used, 'rows between unbounded preceding and unbounded following' which will give a single output based on All input values/partition (if used)

Test cases

Test Case - 1: no partition, order by new_cat

select new_id, new_cat,
       sum(new_id) over (partition by new_cat order by new_cat) as Total_part,
       sum(new_id) over (order by new_cat) as Total,
       count(new_id) over (order by new_cat) as Count,
       min(new_id) over (order by new_cat) as Min,
       max(new_id) over (order by new_cat) as Max
from test_data

Test case -2: no partition, order by new_id

select new_id, new_cat,
       sum(new_id) over (partition by new_id order by new_id) as Total_part,
       sum(new_id) over (order by new_id) as Total,
       count(new_id) over (order by new_id) as Count,
       min(new_id) over (order by new_id) as Min,
       max(new_id) over (order by new_id) as Max
from test_data

II. Ranking functions

i. using row_number() ranking function

SELECT new_id,
       row_number() over(order by new_id) as Row_number
FROM test_data

ii. using rank() ranking function

rank() ranking function :

  • unique value - unique rank

  • duplicate values - same rank

  • Yes skips rank

SELECT new_id,
       rank() over(order by new_id) as Rank
FROM test_data

iii. using dense_rank() ranking function

dense_rank() ranking function:

  • unique value - unique dense_rank

  • duplicate values - same dense_rank

  • No skips rank

SELECT new_id,
       dense_rank() over(order by new_id) as Dense_rank
FROM test_data

iv. using percent_rank() ranking function

SELECT new_id,
       percent_rank() over(order by new_id) as Percent_rank
FROM test_data

III. Analytic/Value function

i. using first_value() Analytic/Value function

SELECT    new_id,new_cat,
          first_value(new_id) over ( order by new_id) as First_value
FROM      test_data

ii. using last_value() Analytic/Value function

SELECT    new_id,new_cat,
          last_value(new_id) over ( order by new_id) as Last_value
FROM      test_data

iii. using lead() Analytic/Value function

SELECT    new_id,new_cat,
          lead(new_id) over ( order by new_id) as Lead
FROM      test_data

iv. using lag() Analytic/Value function

SELECT    new_id,new_cat,
          lag(new_id) over ( order by new_id) as Lag
FROM      test_data

Window function Assignment

Offset the lead and lag values by 2 In the output columns.

Answer

SELECT    new_id,new_cat,
          lead(new_id,2) over ( order by new_id) as Lead,
          lag(new_id,2) over ( order by new_id) as Lag
FROM      test_data

Conclusion

Learning Objectives,

  1. Definition of window function in SQL,

  2. Syntax of window function,

  3. Examples of window functions using aggregate functions,

  4. Two Test Cases in aggregate window functions,

  5. Examples of window functions using ranking functions

  6. Examples of window functions using analytic/value functions

Source: Rishabh Mishra[Link]

Author: Dheeraj.y

Connect with me:

Did you find this article valuable?

Support dheerajy blog by becoming a sponsor. Any amount is appreciated!