Поделиться через


Предложение PIVOT

Область применения:проверка помечено да Databricks SQL проверка помечено да Databricks Runtime

Преобразует строки table_reference путем поворота уникальных значений указанного списка столбцов в отдельные столбцы.

Синтаксис

table_reference PIVOT ( { aggregate_expression [ [ AS ] agg_column_alias ] } [, ...]
    FOR column_list IN ( expression_list ) )

column_list
 { column_name |
   ( column_name [, ...] ) }

expression_list
 { expression [ AS ] [ column_alias ] |
   { ( expression [, ...] ) [ AS ] [ column_alias] } [, ...] ) }

Параметры

  • table_reference

    Определяет субъект PIVOT операции.

  • aggregate_expression

    Выражение любого типа, где все ссылки на table_reference столбцы являются аргументами агрегатных функций.

  • agg_column_alias

    Необязательный псевдоним для результата агрегирования. Если псевдоним не указан, PIVOT создает псевдоним на aggregate_expressionоснове .

  • column_list

    Набор столбцов для поворота.

  • expression_list

    Сопоставляет значения из column_list с псевдонимами столбцов.

    • Выражение

      Литеральное выражение с типом , который использует наименее распространенный тип с соответствующим column_name.

      Число выражений в каждом кортеже должно совпадать с числом column_names в column_list.

    • column_alias

      Необязательный псевдоним, указывающий имя созданного столбца. Если псевдоним не указан PIVOT , создается псевдоним на expressionоснове .

Результат

Временная таблица следующего вида:

  • Все столбцы из промежуточного результирующих наборов table_reference , которые не были указаны в или aggregate_expressioncolumn_list.

    Эти столбцы являются столбцами группирования.

  • Для каждого expression кортежа и aggregate_expression комбинации PIVOT создается один столбец. Тип является типом aggregate_expression.

    Если имеется только один aggregate_expression , столбец называется с помощью column_alias. В противном случае он называется column_alias_agg_column_alias.

    Значение в каждой aggregation_expression ячейке является результатом использования FILTER ( WHERE column_list IN (expression, ...).

Примеры

-- A very basic PIVOT
-- Given a table with sales by quarter, return a table that returns sales across quarters per year.
> CREATE TEMP VIEW sales(year, quarter, region, sales) AS
   VALUES (2018, 1, 'east', 100),
          (2018, 2, 'east',  20),
          (2018, 3, 'east',  40),
          (2018, 4, 'east',  40),
          (2019, 1, 'east', 120),
          (2019, 2, 'east', 110),
          (2019, 3, 'east',  80),
          (2019, 4, 'east',  60),
          (2018, 1, 'west', 105),
          (2018, 2, 'west',  25),
          (2018, 3, 'west',  45),
          (2018, 4, 'west',  45),
          (2019, 1, 'west', 125),
          (2019, 2, 'west', 115),
          (2019, 3, 'west',  85),
          (2019, 4, 'west',  65);

> SELECT year, region, q1, q2, q3, q4
  FROM sales
  PIVOT (sum(sales) AS sales
    FOR quarter
    IN (1 AS q1, 2 AS q2, 3 AS q3, 4 AS q4));
 2018  east  100  20  40  40
 2019  east  120  110  80  60
 2018  west  105  25  45  45
 2019  west  125  115  85  65

-- The same query written without PIVOT
> SELECT year, region,
         sum(sales) FILTER(WHERE quarter = 1) AS q1,
         sum(sales) FILTER(WHERE quarter = 2) AS q2,
         sum(sales) FILTER(WHERE quarter = 3) AS q2,
         sum(sales) FILTER(WHERE quarter = 4) AS q4
  FROM sales
  GROUP BY year, region;
 2018  east  100  20  40  40
 2019  east  120  110  80  60
 2018  west  105  25  45  45
 2019  west  125  115  85  65

-- Also PIVOT on region
> SELECT year, q1_east, q1_west, q2_east, q2_west, q3_east, q3_west, q4_east, q4_west
    FROM sales
    PIVOT (sum(sales) AS sales
      FOR (quarter, region)
      IN ((1, 'east') AS q1_east, (1, 'west') AS q1_west, (2, 'east') AS q2_east, (2, 'west') AS q2_west,
          (3, 'east') AS q3_east, (3, 'west') AS q3_west, (4, 'east') AS q4_east, (4, 'west') AS q4_west));
 2018  100  105  20  25  40  45  40  45
 2019  120  125  110  115  80  85  60  65

-- The same query written without PIVOT
> SELECT year,
    sum(sales) FILTER(WHERE (quarter, region) = (1, 'east')) AS q1_east,
    sum(sales) FILTER(WHERE (quarter, region) = (1, 'west')) AS q1_west,
    sum(sales) FILTER(WHERE (quarter, region) = (2, 'east')) AS q2_east,
    sum(sales) FILTER(WHERE (quarter, region) = (2, 'west')) AS q2_west,
    sum(sales) FILTER(WHERE (quarter, region) = (3, 'east')) AS q3_east,
    sum(sales) FILTER(WHERE (quarter, region) = (3, 'west')) AS q3_west,
    sum(sales) FILTER(WHERE (quarter, region) = (4, 'east')) AS q4_east,
    sum(sales) FILTER(WHERE (quarter, region) = (4, 'west')) AS q4_west
    FROM sales
    GROUP BY year, region;
 2018  100  105  20  25  40  45  40  45
 2019  120  125  110  115  80  85  60  65

-- To aggregate across regions the column must be removed from the input.
> SELECT year, q1, q2, q3, q4
  FROM (SELECT year, quarter, sales FROM sales) AS s
  PIVOT (sum(sales) AS sales
    FOR quarter
    IN (1 AS q1, 2 AS q2, 3 AS q3, 4 AS q4));
  2018  205  45  85  85
  2019  245  225  165  125

-- The same query without PIVOT
> SELECT year,
    sum(sales) FILTER(WHERE quarter = 1) AS q1,
    sum(sales) FILTER(WHERE quarter = 2) AS q2,
    sum(sales) FILTER(WHERE quarter = 3) AS q3,
    sum(sales) FILTER(WHERE quarter = 4) AS q4
    FROM sales
    GROUP BY year;

-- A PIVOT with multiple aggregations
> SELECT year, q1_total, q1_avg, q2_total, q2_avg, q3_total, q3_avg, q4_total, q4_avg
    FROM (SELECT year, quarter, sales FROM sales) AS s
    PIVOT (sum(sales) AS total, avg(sales) AS avg
      FOR quarter
      IN (1 AS q1, 2 AS q2, 3 AS q3, 4 AS q4));
 2018  205  102.5  45  22.5  85  42.5  85  42.5
 2019  245  122.5  225  112.5  165  82.5  125  62.5

-- The same query without PIVOT
> SELECT year,
         sum(sales) FILTER(WHERE quarter = 1) AS q1_total,
         avg(sales) FILTER(WHERE quarter = 1) AS q1_avg,
         sum(sales) FILTER(WHERE quarter = 2) AS q2_total,
         avg(sales) FILTER(WHERE quarter = 2) AS q2_avg,
         sum(sales) FILTER(WHERE quarter = 3) AS q3_total,
         avg(sales) FILTER(WHERE quarter = 3) AS q3_avg,
         sum(sales) FILTER(WHERE quarter = 4) AS q4_total,
         avg(sales) FILTER(WHERE quarter = 4) AS q4_avg
    FROM sales
    GROUP BY year;

> CREATE TEMP VIEW person (id, name, age, class, address) AS
    VALUES (100, 'John', 30, 1, 'Street 1'),
           (200, 'Mary', NULL, 1, 'Street 2'),
           (300, 'Mike', 80, 3, 'Street 3'),
           (400, 'Dan', 50, 4, 'Street 4');
 2018  205  102.5  45  22.5  85  42.5  85  42.5
 2019  245  122.5  225  112.5  165  82.5  125  62.5