Query Statements and Expressions (U-SQL)

Summary

The core processing capability of U-SQL is to transform between rowsets using query expressions. This is done by combining query statements in a U-SQL script.

Since the current version of the Azure Data Lake Analytics service is only executing batch scripts, a query expression cannot be returned directly. It always has to be either output into a table with an INSERT statement or a file using an OUTPUT statement, or assigned to a rowset variable.

Syntax U-SQL Query Statements

Query_Statement :=                                                                                       
     Rowset_Variable '=' Query_Expression_With_Fetch.
Rowset_Variable := '@' + Unquoted_Identifier.

Remarks

  • Rowset_Variable
    The rowset variable is the name given to the provided Query_Expression_With_Fetch. Note that this rowset variable is not containing the result of the query expression. It is a reference to the expression itself, similar to the SQL Common-Table Expression names or the names of functional lambda expressions in functional languages.

    The name can be used in subsequent query expressions to refer to the named expression that then will be inlined.

  • Query_Expression_With_Fetch
    Is the query expression followed by an optional ORDER BY/OFFSET FETCH clause.

Syntax U-SQL Query Expressions

  Query_Expression_With_Fetch :=
       Query_Expression [Order_By_Fetch_Clause].
Query_Expression := Primary_Rowset_Expression \| '(' Query_Expression_With_Fetch ')' \| Set_Rowset_Expression.
  • Query_Expression
    Is the actual transformation expression. A query expression can either be a primary rowset query, a query expression enclosed in parenthesis or a set expression over two rowsets.

  • Primary_Rowset_Expression
    U-SQL’s primary rowset expressions are the main rowset generating and transforming expressions of the language: The EXTRACT expression to generate a rowset from unstructured data, the SELECT expression, and the PROCESS, REDUCE, COMBINE expressions that apply the custom-defined user-defined operators (UDO) to the input rowset(s) and generate a new rowset. In addition it also includes the invocation of table-valued functions.

Syntax Primary Rowset Expression

    Primary_Rowset_Expression :=                                                                   
         Extract_Expression
    |    Select_Expression
    |    Process_Expression
    |    Reduce_Expression
    |    Combine_Expression
    |    Function_Call
  

Please review the individual expressions, above, for more detailed examples.

Examples

  • The examples can be executed in Visual Studio with the Azure Data Lake Tools plug-in.
  • The scripts can be executed locally. An Azure subscription and Azure Data Lake Analytics account is not needed when executed locally.
    @someBooks = 
        SELECT * FROM 
            ( VALUES
            ("The Book Thief", "Markus Zusak", "2005"),
            ("The Girl with the Dragon Tattoo", "Stieg Larsson", "2005"),
            ("The Silver Linings Playbook", "Matthew Quick", "2008"),
            ("Sarah's Key", "Tatiana de Rosnay", "2006")
            ) AS T(Book, Author, [Publication Year]);
    
    @rowsetVariable =
        SELECT Book, Author
        FROM @someBooks;
    
    OUTPUT @rowsetVariable
    TO "/ReferenceGuide/DML/QSE/exampleA.txt"
    USING Outputters.Tsv();
    

See Also