Share via


LEAD (U-SQL)

Summary

The LEAD analytic function provides access to a row at a given physical offset that follows the current row. Use this analytic function in a SELECT expression to compare values in the current row with values in a following row.

LEAD can only be used in the context of a windowing expression.

Syntax

LEAD_Expression :=                                                                                
     'LEAD' '(' expression [ ',' offset ] [ ',' default ] ')'.

Remarks

  • expression
    The expression for which the first value gets calculated for the window.

  • offset
    The number of rows forward from the current row from which to obtain a value. If not specified, the default is 1. offset can be an expression that evaluates to a constant positive integral value. Column references and method calls are not allowed.

  • default
    The value to return when expression at offset is NULL. If a default value is not specified, NULL is returned. default must be a constant and type-compatible with expression. Column references and method calls are not allowed.

Return Type

The nullable type of the input.

Usage in Windowing Expression

This analytic function can be used in a windowing expression with the following restrictions:

Examples

  • The examples can be executed in Visual Studio with the Azure Data Lake Tools plug-in.

  • The scripts can be executed locally. An Azure subscription and Azure Data Lake Analytics account is not needed when executed locally.

  • The examples below are based on the dataset defined below. Ensure your execution includes the rowset variable.

      @storeSales =
          SELECT * FROM 
              ( VALUES
              (1, "NW", 2013, 100),
              (1, "NW", 2014, 150),
              (1, "NW", 2015, 300),
              (1, "NW", 2016, 640),
              (2, "SW", 2013, 200),
              (2, "SW", 2014, 350),
              (2, "SW", 2015, 500),
              (2, "SW", 2016, 650),
              (3, "NW", 2015, 75),
              (3, "NW", 2016, 100),
              (4, "NW", 2016, 375),
              (5, "SW", 2016, 700)
              ) AS T(StoreID, Region, Year, Sales);
    

A. Compare values between years
The following example uses the LEAD function to return the difference in sales for a specific store over previous years. Notice that because there is no lead value available for the first row, the default of zero (0) is returned.

@result =
    SELECT StoreID,
           Year AS SalesYear,
           Sales AS CurrentSales,
           LEAD(Sales, 1, 0) OVER(ORDER BY Year) AS SubsequentSales
    FROM @storeSales
    WHERE StoreID == 1 AND Year >= 2014;

OUTPUT @result
TO "/Output/ReferenceGuide/Analytic/lead/exampleA.csv"
ORDER BY SalesYear DESC
USING Outputters.Csv();

B. Dividing the result set using PARTITION BY
The following example uses the LEAD function to compare year-to-date sales between stores. Each record shows a store's sales and the sales of the store with the nearest higher sales. The PARTITION BY clause is specified to divide the rows in the result set by region. The LEAD function is applied to each partition separately and computation restarts for each partition. The ORDER BY clause in the OVER clause orders the rows in each partition. The ORDER BY clause in the OUTPUT statement sorts the rows in the whole result set. Notice that because there is no lead value available for the first row of each partition, the default of zero (0) is returned.

@result =
    SELECT Region,
           StoreID,
           Sales,
           LEAD(Sales, 1, 0) OVER(PARTITION BY Region ORDER BY Sales ASC) AS NearestHigherSales
    FROM @storeSales
    WHERE Year == 2016;

OUTPUT @result
TO "/Output/ReferenceGuide/Analytic/lead/exampleB.csv"
ORDER BY Region, Sales DESC
USING Outputters.Csv();

See Also