Clause PIVOT

S’applique à :case marquée oui Databricks SQL case marquée oui Databricks Runtime

Transforme les lignes de table_reference en faisant pivoter les valeurs uniques d’une liste de colonnes spécifiée dans des colonnes distinctes.

Syntaxe

table_reference PIVOT ( { aggregate_expression [ [ AS ] agg_column_alias ] } [, ...]
    FOR column_list IN ( expression_list ) )

column_list
 { column_name |
   ( column_name [, ...] ) }

expression_list
 { expression [ AS ] [ column_alias ] |
   { ( expression [, ...] ) [ AS ] [ column_alias] } [, ...] ) }

Paramètres

  • table_reference

    Identifie le sujet de l’opération PIVOT.

  • aggregate_expression

    Expression de tout type dans laquelle toutes les références de colonne à table_reference sont des arguments pour les fonctions d’agrégation.

  • agg_column_alias

    Alias facultatif pour le résultat de l’agrégation. Si aucun alias est spécifié, PIVOT génère un alias basé sur aggregate_expression.

  • column_list

    Ensemble de colonnes à faire pivoter.

  • expression_list

    Mappage des valeurs de column_list à des alias de colonne.

    • expression

      Expression littérale avec un type qui partage un type le moins commun avec column_name.

      Le nombre d’expressions dans chaque tuple doit correspondre au nombre de column_names dans column_list.

    • column_alias

      Alias facultatif spécifiant le nom de la colonne générée. Si aucun alias est spécifié, PIVOT génère un alias basé sur les expression.

Résultats

Une table temporaire au format suivant :

  • Toutes les colonnes du jeu de résultats intermédiaire de table_reference qui n’ont pas été spécifiées dans aggregate_expression ou column_list.

    Ces colonnes regroupent des colonnes.

  • Pour chaque tuple expression et combinaison aggregate_expression, PIVOT génère une colonne. Le type représente le type de aggregate_expression.

    S’il n’y a queaggregate_expression, la colonne nommée à l’aide de column_alias. Dans le cas contraire elle est nommée column_alias_agg_column_alias.

    La valeur de chaque cellule est le résultat de aggregation_expression utilisant FILTER ( WHERE column_list IN (expression, ...).

Exemples

-- A very basic PIVOT
-- Given a table with sales by quarter, return a table that returns sales across quarters per year.
> CREATE TEMP VIEW sales(year, quarter, region, sales) AS
   VALUES (2018, 1, 'east', 100),
          (2018, 2, 'east',  20),
          (2018, 3, 'east',  40),
          (2018, 4, 'east',  40),
          (2019, 1, 'east', 120),
          (2019, 2, 'east', 110),
          (2019, 3, 'east',  80),
          (2019, 4, 'east',  60),
          (2018, 1, 'west', 105),
          (2018, 2, 'west',  25),
          (2018, 3, 'west',  45),
          (2018, 4, 'west',  45),
          (2019, 1, 'west', 125),
          (2019, 2, 'west', 115),
          (2019, 3, 'west',  85),
          (2019, 4, 'west',  65);

> SELECT year, region, q1, q2, q3, q4
  FROM sales
  PIVOT (sum(sales) AS sales
    FOR quarter
    IN (1 AS q1, 2 AS q2, 3 AS q3, 4 AS q4));
 2018  east  100  20  40  40
 2019  east  120  110  80  60
 2018  west  105  25  45  45
 2019  west  125  115  85  65

-- The same query written without PIVOT
> SELECT year, region,
         sum(sales) FILTER(WHERE quarter = 1) AS q1,
         sum(sales) FILTER(WHERE quarter = 2) AS q2,
         sum(sales) FILTER(WHERE quarter = 3) AS q2,
         sum(sales) FILTER(WHERE quarter = 4) AS q4
  FROM sales
  GROUP BY year, region;
 2018  east  100  20  40  40
 2019  east  120  110  80  60
 2018  west  105  25  45  45
 2019  west  125  115  85  65

-- Also PIVOT on region
> SELECT year, q1_east, q1_west, q2_east, q2_west, q3_east, q3_west, q4_east, q4_west
    FROM sales
    PIVOT (sum(sales) AS sales
      FOR (quarter, region)
      IN ((1, 'east') AS q1_east, (1, 'west') AS q1_west, (2, 'east') AS q2_east, (2, 'west') AS q2_west,
          (3, 'east') AS q3_east, (3, 'west') AS q3_west, (4, 'east') AS q4_east, (4, 'west') AS q4_west));
 2018  100  105  20  25  40  45  40  45
 2019  120  125  110  115  80  85  60  65

-- The same query written without PIVOT
> SELECT year,
    sum(sales) FILTER(WHERE (quarter, region) = (1, 'east')) AS q1_east,
    sum(sales) FILTER(WHERE (quarter, region) = (1, 'west')) AS q1_west,
    sum(sales) FILTER(WHERE (quarter, region) = (2, 'east')) AS q2_east,
    sum(sales) FILTER(WHERE (quarter, region) = (2, 'west')) AS q2_west,
    sum(sales) FILTER(WHERE (quarter, region) = (3, 'east')) AS q3_east,
    sum(sales) FILTER(WHERE (quarter, region) = (3, 'west')) AS q3_west,
    sum(sales) FILTER(WHERE (quarter, region) = (4, 'east')) AS q4_east,
    sum(sales) FILTER(WHERE (quarter, region) = (4, 'west')) AS q4_west
    FROM sales
    GROUP BY year, region;
 2018  100  105  20  25  40  45  40  45
 2019  120  125  110  115  80  85  60  65

-- To aggregate across regions the column must be removed from the input.
> SELECT year, q1, q2, q3, q4
  FROM (SELECT year, quarter, sales FROM sales) AS s
  PIVOT (sum(sales) AS sales
    FOR quarter
    IN (1 AS q1, 2 AS q2, 3 AS q3, 4 AS q4));
  2018  205  45  85  85
  2019  245  225  165  125

-- The same query without PIVOT
> SELECT year,
    sum(sales) FILTER(WHERE quarter = 1) AS q1,
    sum(sales) FILTER(WHERE quarter = 2) AS q2,
    sum(sales) FILTER(WHERE quarter = 3) AS q3,
    sum(sales) FILTER(WHERE quarter = 4) AS q4
    FROM sales
    GROUP BY year;

-- A PIVOT with multiple aggregations
> SELECT year, q1_total, q1_avg, q2_total, q2_avg, q3_total, q3_avg, q4_total, q4_avg
    FROM (SELECT year, quarter, sales FROM sales) AS s
    PIVOT (sum(sales) AS total, avg(sales) AS avg
      FOR quarter
      IN (1 AS q1, 2 AS q2, 3 AS q3, 4 AS q4));
 2018  205  102.5  45  22.5  85  42.5  85  42.5
 2019  245  122.5  225  112.5  165  82.5  125  62.5

-- The same query without PIVOT
> SELECT year,
         sum(sales) FILTER(WHERE quarter = 1) AS q1_total,
         avg(sales) FILTER(WHERE quarter = 1) AS q1_avg,
         sum(sales) FILTER(WHERE quarter = 2) AS q2_total,
         avg(sales) FILTER(WHERE quarter = 2) AS q2_avg,
         sum(sales) FILTER(WHERE quarter = 3) AS q3_total,
         avg(sales) FILTER(WHERE quarter = 3) AS q3_avg,
         sum(sales) FILTER(WHERE quarter = 4) AS q4_total,
         avg(sales) FILTER(WHERE quarter = 4) AS q4_avg
    FROM sales
    GROUP BY year;

> CREATE TEMP VIEW person (id, name, age, class, address) AS
    VALUES (100, 'John', 30, 1, 'Street 1'),
           (200, 'Mary', NULL, 1, 'Street 2'),
           (300, 'Mike', 80, 3, 'Street 3'),
           (400, 'Dan', 50, 4, 'Street 4');
 2018  205  102.5  45  22.5  85  42.5  85  42.5
 2019  245  122.5  225  112.5  165  82.5  125  62.5