The LAG analytic function provides access to a row at a given physical offset that comes before the current row. Use this analytic function in a SELECT expression to compare values in the current row with values in a previous row.
LAG can only be used in the context of a windowing expression.
LAG_Expression := 'LAG' '(' expression [ ',' offset ] [ ',' default ] ')'.
The number of rows back from the current row from which to obtain a value. If not specified, the default is 1.
offsetcan be an expression that evaluates to a constant positive integral value. Column references and method calls are not allowed.
The value to return when
offsetis NULL. If a default value is not specified, NULL is returned.
defaultmust be a constant and type-compatible with
expression. Column references and method calls are not allowed.
The nullable type of the input.
Usage in Windowing Expression
This analytic function can be used in a windowing expression with the following restrictions:
The examples can be executed in Visual Studio with the Azure Data Lake Tools plug-in.
The scripts can be executed locally. An Azure subscription and Azure Data Lake Analytics account is not needed when executed locally.
The examples below are based on the dataset defined below. Ensure your execution includes the rowset variable.
@storeSales = SELECT * FROM ( VALUES (1, "NW", 2013, 100), (1, "NW", 2014, 150), (1, "NW", 2015, 300), (1, "NW", 2016, 640), (2, "SW", 2013, 200), (2, "SW", 2014, 350), (2, "SW", 2015, 500), (2, "SW", 2016, 650), (3, "NW", 2015, 75), (3, "NW", 2016, 100), (4, "NW", 2016, 375), (5, "SW", 2016, 700) ) AS T(StoreID, Region, Year, Sales);
A. Compare values between years
The following example uses the
LAG function to return the difference in sales for a specific store over previous years. Notice that because there is no lag value available for the last row, the default of zero (0) is returned.
@result = SELECT StoreID, Year AS SalesYear, Sales AS CurrentSales, LAG(Sales, 1, 0) OVER(ORDER BY Year) AS PreviousSales FROM @storeSales WHERE StoreID == 1 AND Year >= 2014; OUTPUT @result TO "/Output/ReferenceGuide/Analytic/lag/exampleA.csv" ORDER BY SalesYear DESC USING Outputters.Csv();
B. Dividing the result set using PARTITION BY
The following example uses the
LAG function to compare year-to-date sales between stores. Each record shows a store's sales and the sales of the store with the nearest lower sales. The PARTITION BY clause is specified to divide the rows in the result set by region. The
LAG function is applied to each partition separately and computation restarts for each partition. The ORDER BY clause in the OVER clause orders the rows in each partition. The ORDER BY clause in the
OUTPUT statement sorts the rows in the whole result set. Notice that because there is no lag value available for the last row of each partition, the default of zero (0) is returned.
@result = SELECT Region, StoreID, Sales, LAG(Sales, 1, 0) OVER(PARTITION BY Region ORDER BY Sales ASC ) AS NearestLowerSales FROM @storeSales WHERE Year == 2016; OUTPUT @result TO "/Output/ReferenceGuide/Analytic/lag/exampleB.csv" ORDER BY Region, Sales DESC USING Outputters.Csv();