内置函数Built-in functions

聚合函数Aggregate functions

函数Function 说明Description
任何 (expr) any(expr) 如果至少有一个值 expr 为 true,则返回 true。Returns true if at least one value of expr is true.
approx_count_distinct (expr [,relativeSD] ) approx_count_distinct(expr[,relativeSD]) 通过 HyperLogLog + + 返回估计基数。Returns the estimated cardinality by HyperLogLog++. relativeSD 定义允许的最大估计错误数。relativeSD defines the maximum estimation error allowed.
approx_percentile (列,百分比 [,准确性] ) approx_percentile(col,percentage[,accuracy]) 返回给定百分比的数字列的近似百分位值 colReturns the approximate percentile value of numeric column col at the given percentage. 百分比值必须介于0.0 和1.0 之间。The value of percentage must be between 0.0 and 1.0. accuracy参数 (默认值: 10000) 是一个正数值文本,它以内存成本控制近似值精度。The accuracy parameter (default: 10000) is a positive numeric literal which controls approximation accuracy at the cost of memory. 更高的值 accuracy 可获得更好的准确性, 1.0/accuracy 是近似的相对错误。Higher value of accuracy yields better accuracy, 1.0/accuracy is the relative error of the approximation. percentage 是数组时,百分比数组的每个值必须介于0.0 和1.0 之间。When percentage is an array, each value of the percentage array must be between 0.0 and 1.0. 在这种情况下, col 将返回给定百分比数组中列的大约百分位数组。In this case, returns the approximate percentile array of column col at the given percentage array.
avg (expr) avg(expr) 返回从组的值计算出的平均值。Returns the mean calculated from values of a group.
bit_or (expr) bit_or(expr) 返回所有非 null 输入值的按位 "或",如果没有,则返回 null。Returns the bitwise OR of all non-null input values, or null if none.
bit_xor (expr) bit_xor(expr) 返回所有非 null 输入值的按位 XOR,如果没有,则返回 null。Returns the bitwise XOR of all non-null input values, or null if none.
bool_and (expr) bool_and(expr) 如果的所有值均为 true,则返回 true exprReturns true if all values of expr are true.
bool_or (expr) bool_or(expr) 如果至少有一个值 expr 为 true,则返回 true。Returns true if at least one value of expr is true.
collect_list (expr) collect_list(expr) 收集并返回非唯一元素的列表。Collects and returns a list of non-unique elements.
collect_set (expr) collect_set(expr) 收集并返回一组唯一元素。Collects and returns a set of unique elements.
corr (,表达式 2) corr(expr1,expr2) 返回一组数字对之间的关联的皮尔逊系数。Returns Pearson coefficient of correlation between a set of number pairs.
计数 ( * ) count(*) 返回检索到的行的总数,包括包含 null 的行。Returns the total number of retrieved rows, including rows containing null.
计数 (expr [,expr ...]) count(expr[,expr…]) 返回所提供的表达式 (s) 的行数均为非 null。Returns the number of rows for which the supplied expression(s) are all non-null.
计数 (DISTINCT expr [,expr ...]) count(DISTINCT expr[,expr…]) 返回所提供的表达式 (s) 的行数唯一且为非 null。Returns the number of rows for which the supplied expression(s) are unique and non-null.
count_if (expr) count_if(expr) 返回表达式的值的数目 TRUEReturns the number of TRUE values for the expression.
count_min_sketch (列、eps、置信度、种子) count_min_sketch(col,eps,confidence,seed) 返回具有给定 esp、置信度和种子的列的最小计数草案。Returns a count-min sketch of a column with the given esp, confidence and seed. 结果是字节数组,可以在使用之前对其进行反序列化 CountMinSketchThe result is an array of bytes, which can be deserialized to a CountMinSketch before usage. 计数-最小值草绘是一个概率数据结构,用于使用子线性空间的基数估算。Count-min sketch is a probabilistic data structure used for cardinality estimation using sub-linear space.
covar_pop (,表达式 2) covar_pop(expr1,expr2) 返回一组数字对的总体协方差。Returns the population covariance of a set of number pairs.
covar_samp (,表达式 2) covar_samp(expr1,expr2) 返回一组数字对的样本协方差。Returns the sample covariance of a set of number pairs.
每个 (expr) every(expr) 如果的所有值均为 true,则返回 true exprReturns true if all values of expr are true.
first (expr [,isIgnoreNull] ) first(expr[,isIgnoreNull]) 返回一组行的第一个值 exprReturns the first value of expr for a group of rows. 如果 isIgnoreNull 为 true,则仅返回非 null 值。If isIgnoreNull is true, returns only non-null values.
first_value (expr [,isIgnoreNull] ) first_value(expr[,isIgnoreNull]) 返回一组行的第一个值 exprReturns the first value of expr for a group of rows. 如果 isIgnoreNull 为 true,则仅返回非 null 值。If isIgnoreNull is true, returns only non-null values.
峰值 (expr) kurtosis(expr) 返回从组的值计算得出的峰值值。Returns the kurtosis value calculated from values of a group.
last (expr [,isIgnoreNull] ) last(expr[,isIgnoreNull]) 返回一组行的最后一个值 exprReturns the last value of expr for a group of rows. 如果 isIgnoreNull 为 true,则仅返回非 null 值。If isIgnoreNull is true, returns only non-null values.
last_value (expr [,isIgnoreNull] ) last_value(expr[,isIgnoreNull]) 返回一组行的最后一个值 exprReturns the last value of expr for a group of rows. 如果 isIgnoreNull 为 true,则仅返回非 null 值。If isIgnoreNull is true, returns only non-null values.
max (expr) max(expr) 返回的最大值 exprReturns the maximum value of expr.
max_by (x、y) max_by(x,y) 返回 x 与最大值关联的的值 yReturns the value of x associated with the maximum value of y.
(expr 的平均值) mean(expr) 返回从组的值计算出的平均值。Returns the mean calculated from values of a group.
min (expr) min(expr) 返回的最小值 exprReturns the minimum value of expr.
min_by (x、y) min_by(x,y) 返回 x 与的最小值相关联的值 yReturns the value of x associated with the minimum value of y.
百分点 (列,百分比 [,frequency] ) percentile(col,percentage[,frequency]) 返回以给定百分比表示的数值列的精确百分位值 colReturns the exact percentile value of numeric column col at the given percentage. 百分比值必须介于0.0 和1.0 之间。The value of percentage must be between 0.0 and 1.0. Frequency 的值应为正整数。The value of frequency should be positive integral.
百分点 (col、array (percentage1 [,percentage2] ... ) [,frequency] ) percentile(col,array(percentage1[,percentage2]…)[,frequency]) 返回) (给定百分比的数字列的精确百分比值数组 colReturns the exact percentile value array of numeric column col at the given percentage(s). 百分比数组的每个值必须介于0.0 和1.0 之间。Each value of the percentage array must be between 0.0 and 1.0. Frequency 的值应为正整数。The value of frequency should be positive integral.
percentile_approx (列,百分比 [,准确性] ) percentile_approx(col,percentage[,accuracy]) 返回给定百分比的数字列的近似百分位值 colReturns the approximate percentile value of numeric column col at the given percentage. 百分比值必须介于0.0 和1.0 之间。The value of percentage must be between 0.0 and 1.0. accuracy参数 (默认值: 10000) 是一个正数值文本,它以内存成本控制近似值精度。The accuracy parameter (default: 10000) is a positive numeric literal which controls approximation accuracy at the cost of memory. 更高的值 accuracy 可获得更好的准确性, 1.0/accuracy 是近似的相对错误。Higher value of accuracy yields better accuracy, 1.0/accuracy is the relative error of the approximation. percentage 是数组时,百分比数组的每个值必须介于0.0 和1.0 之间。When percentage is an array, each value of the percentage array must be between 0.0 and 1.0. 在这种情况下, col 将返回给定百分比数组中列的大约百分位数组。In this case, returns the approximate percentile array of column col at the given percentage array.
不对称 (expr) skewness(expr) 返回从组的值计算出的偏斜度值。Returns the skewness value calculated from values of a group.
某些 (expr) some(expr) 如果至少有一个值 expr 为 true,则返回 true。Returns true if at least one value of expr is true.
std (expr) std(expr) 返回从组的值计算出的样本标准偏差。Returns the sample standard deviation calculated from values of a group.
stddev (expr) stddev(expr) 返回从组的值计算出的样本标准偏差。Returns the sample standard deviation calculated from values of a group.
stddev_pop (expr) stddev_pop(expr) 返回通过组的值计算出的总体标准偏差。Returns the population standard deviation calculated from values of a group.
stddev_samp (expr) stddev_samp(expr) 返回从组的值计算出的样本标准偏差。Returns the sample standard deviation calculated from values of a group.
sum (expr) sum(expr) 返回从组的值计算得出的总和。Returns the sum calculated from values of a group.
var_pop (expr) var_pop(expr) 返回根据组的值计算出的总体方差。Returns the population variance calculated from values of a group.
var_samp (expr) var_samp(expr) 返回从组的值计算出的样本方差。Returns the sample variance calculated from values of a group.
(expr) 的差异variance(expr) 返回从组的值计算出的样本方差。Returns the sample variance calculated from values of a group.

示例Examples

-- any
SELECT any(col) FROM VALUES (true), (false), (false) AS tab(col);
+--------+
|any(col)|
+--------+
|    true|
+--------+

SELECT any(col) FROM VALUES (NULL), (true), (false) AS tab(col);
+--------+
|any(col)|
+--------+
|    true|
+--------+

SELECT any(col) FROM VALUES (false), (false), (NULL) AS tab(col);
+--------+
|any(col)|
+--------+
|   false|
+--------+

-- approx_count_distinct
SELECT approx_count_distinct(col1) FROM VALUES (1), (1), (2), (2), (3) tab(col1);
+---------------------------+
|approx_count_distinct(col1)|
+---------------------------+
|                          3|
+---------------------------+

-- approx_percentile
SELECT approx_percentile(10.0, array(0.5, 0.4, 0.1), 100);
+--------------------------------------------------+
|approx_percentile(10.0, array(0.5, 0.4, 0.1), 100)|
+--------------------------------------------------+
|                                [10.0, 10.0, 10.0]|
+--------------------------------------------------+

SELECT approx_percentile(10.0, 0.5, 100);
+-------------------------------------------------+
|approx_percentile(10.0, CAST(0.5 AS DOUBLE), 100)|
+-------------------------------------------------+
|                                             10.0|
+-------------------------------------------------+

-- avg
SELECT avg(col) FROM VALUES (1), (2), (3) AS tab(col);
+--------+
|avg(col)|
+--------+
|     2.0|
+--------+

SELECT avg(col) FROM VALUES (1), (2), (NULL) AS tab(col);
+--------+
|avg(col)|
+--------+
|     1.5|
+--------+

-- bit_or
SELECT bit_or(col) FROM VALUES (3), (5) AS tab(col);
+-----------+
|bit_or(col)|
+-----------+
|          7|
+-----------+

-- bit_xor
SELECT bit_xor(col) FROM VALUES (3), (5) AS tab(col);
+------------+
|bit_xor(col)|
+------------+
|           6|
+------------+

-- bool_and
SELECT bool_and(col) FROM VALUES (true), (true), (true) AS tab(col);
+-------------+
|bool_and(col)|
+-------------+
|         true|
+-------------+

SELECT bool_and(col) FROM VALUES (NULL), (true), (true) AS tab(col);
+-------------+
|bool_and(col)|
+-------------+
|         true|
+-------------+

SELECT bool_and(col) FROM VALUES (true), (false), (true) AS tab(col);
+-------------+
|bool_and(col)|
+-------------+
|        false|
+-------------+

-- bool_or
SELECT bool_or(col) FROM VALUES (true), (false), (false) AS tab(col);
+------------+
|bool_or(col)|
+------------+
|        true|
+------------+

SELECT bool_or(col) FROM VALUES (NULL), (true), (false) AS tab(col);
+------------+
|bool_or(col)|
+------------+
|        true|
+------------+

SELECT bool_or(col) FROM VALUES (false), (false), (NULL) AS tab(col);
+------------+
|bool_or(col)|
+------------+
|       false|
+------------+

-- collect_list
SELECT collect_list(col) FROM VALUES (1), (2), (1) AS tab(col);
+-----------------+
|collect_list(col)|
+-----------------+
|        [1, 2, 1]|
+-----------------+

-- collect_set
SELECT collect_set(col) FROM VALUES (1), (2), (1) AS tab(col);
+----------------+
|collect_set(col)|
+----------------+
|          [1, 2]|
+----------------+

-- corr
SELECT corr(c1, c2) FROM VALUES (3, 2), (3, 3), (6, 4) as tab(c1, c2);
+--------------------------------------------+
|corr(CAST(c1 AS DOUBLE), CAST(c2 AS DOUBLE))|
+--------------------------------------------+
|                          0.8660254037844387|
+--------------------------------------------+

-- count
SELECT count(*) FROM VALUES (NULL), (5), (5), (20) AS tab(col);
+--------+
|count(1)|
+--------+
|       4|
+--------+

SELECT count(col) FROM VALUES (NULL), (5), (5), (20) AS tab(col);
+----------+
|count(col)|
+----------+
|         3|
+----------+

SELECT count(DISTINCT col) FROM VALUES (NULL), (5), (5), (10) AS tab(col);
+-------------------+
|count(DISTINCT col)|
+-------------------+
|                  2|
+-------------------+

-- count_if
SELECT count_if(col % 2 = 0) FROM VALUES (NULL), (0), (1), (2), (3) AS tab(col);
+-------------------------+
|count_if(((col % 2) = 0))|
+-------------------------+
|                        2|
+-------------------------+

SELECT count_if(col IS NULL) FROM VALUES (NULL), (0), (1), (2), (3) AS tab(col);
+-----------------------+
|count_if((col IS NULL))|
+-----------------------+
|                      1|
+-----------------------+

-- covar_pop
SELECT covar_pop(c1, c2) FROM VALUES (1,1), (2,2), (3,3) AS tab(c1, c2);
+-------------------------------------------------+
|covar_pop(CAST(c1 AS DOUBLE), CAST(c2 AS DOUBLE))|
+-------------------------------------------------+
|                               0.6666666666666666|
+-------------------------------------------------+

-- covar_samp
SELECT covar_samp(c1, c2) FROM VALUES (1,1), (2,2), (3,3) AS tab(c1, c2);
+--------------------------------------------------+
|covar_samp(CAST(c1 AS DOUBLE), CAST(c2 AS DOUBLE))|
+--------------------------------------------------+
|                                               1.0|
+--------------------------------------------------+

-- every
SELECT every(col) FROM VALUES (true), (true), (true) AS tab(col);
+----------+
|every(col)|
+----------+
|      true|
+----------+

SELECT every(col) FROM VALUES (NULL), (true), (true) AS tab(col);
+----------+
|every(col)|
+----------+
|      true|
+----------+

SELECT every(col) FROM VALUES (true), (false), (true) AS tab(col);
+----------+
|every(col)|
+----------+
|     false|
+----------+

-- first
SELECT first(col) FROM VALUES (10), (5), (20) AS tab(col);
+-----------------+
|first(col, false)|
+-----------------+
|               10|
+-----------------+

SELECT first(col) FROM VALUES (NULL), (5), (20) AS tab(col);
+-----------------+
|first(col, false)|
+-----------------+
|             null|
+-----------------+

SELECT first(col, true) FROM VALUES (NULL), (5), (20) AS tab(col);
+----------------+
|first(col, true)|
+----------------+
|               5|
+----------------+

-- first_value
SELECT first_value(col) FROM VALUES (10), (5), (20) AS tab(col);
+-----------------------+
|first_value(col, false)|
+-----------------------+
|                     10|
+-----------------------+

SELECT first_value(col) FROM VALUES (NULL), (5), (20) AS tab(col);
+-----------------------+
|first_value(col, false)|
+-----------------------+
|                   null|
+-----------------------+

SELECT first_value(col, true) FROM VALUES (NULL), (5), (20) AS tab(col);
+----------------------+
|first_value(col, true)|
+----------------------+
|                     5|
+----------------------+

-- kurtosis
SELECT kurtosis(col) FROM VALUES (-10), (-20), (100), (1000) AS tab(col);
+-----------------------------+
|kurtosis(CAST(col AS DOUBLE))|
+-----------------------------+
|          -0.7014368047529618|
+-----------------------------+

SELECT kurtosis(col) FROM VALUES (1), (10), (100), (10), (1) as tab(col);
+-----------------------------+
|kurtosis(CAST(col AS DOUBLE))|
+-----------------------------+
|          0.19432323191698986|
+-----------------------------+

-- last
SELECT last(col) FROM VALUES (10), (5), (20) AS tab(col);
+----------------+
|last(col, false)|
+----------------+
|              20|
+----------------+

SELECT last(col) FROM VALUES (10), (5), (NULL) AS tab(col);
+----------------+
|last(col, false)|
+----------------+
|            null|
+----------------+

SELECT last(col, true) FROM VALUES (10), (5), (NULL) AS tab(col);
+---------------+
|last(col, true)|
+---------------+
|              5|
+---------------+

-- last_value
SELECT last_value(col) FROM VALUES (10), (5), (20) AS tab(col);
+----------------------+
|last_value(col, false)|
+----------------------+
|                    20|
+----------------------+

SELECT last_value(col) FROM VALUES (10), (5), (NULL) AS tab(col);
+----------------------+
|last_value(col, false)|
+----------------------+
|                  null|
+----------------------+

SELECT last_value(col, true) FROM VALUES (10), (5), (NULL) AS tab(col);
+---------------------+
|last_value(col, true)|
+---------------------+
|                    5|
+---------------------+

-- max
SELECT max(col) FROM VALUES (10), (50), (20) AS tab(col);
+--------+
|max(col)|
+--------+
|      50|
+--------+

-- max_by
SELECT max_by(x, y) FROM VALUES (('a', 10)), (('b', 50)), (('c', 20)) AS tab(x, y);
+------------+
|max_by(x, y)|
+------------+
|           b|
+------------+

-- mean
SELECT mean(col) FROM VALUES (1), (2), (3) AS tab(col);
+---------+
|mean(col)|
+---------+
|      2.0|
+---------+

SELECT mean(col) FROM VALUES (1), (2), (NULL) AS tab(col);
+---------+
|mean(col)|
+---------+
|      1.5|
+---------+

-- min
SELECT min(col) FROM VALUES (10), (-1), (20) AS tab(col);
+--------+
|min(col)|
+--------+
|      -1|
+--------+

-- min_by
SELECT min_by(x, y) FROM VALUES (('a', 10)), (('b', 50)), (('c', 20)) AS tab(x, y);
+------------+
|min_by(x, y)|
+------------+
|           a|
+------------+

-- percentile
SELECT percentile(col, 0.3) FROM VALUES (0), (10) AS tab(col);
+---------------------------------------+
|percentile(col, CAST(0.3 AS DOUBLE), 1)|
+---------------------------------------+
|                                    3.0|
+---------------------------------------+

SELECT percentile(col, array(0.25, 0.75)) FROM VALUES (0), (10) AS tab(col);
+-------------------------------------+
|percentile(col, array(0.25, 0.75), 1)|
+-------------------------------------+
|                           [2.5, 7.5]|
+-------------------------------------+

-- percentile_approx
SELECT percentile_approx(10.0, array(0.5, 0.4, 0.1), 100);
+--------------------------------------------------+
|percentile_approx(10.0, array(0.5, 0.4, 0.1), 100)|
+--------------------------------------------------+
|                                [10.0, 10.0, 10.0]|
+--------------------------------------------------+

SELECT percentile_approx(10.0, 0.5, 100);
+-------------------------------------------------+
|percentile_approx(10.0, CAST(0.5 AS DOUBLE), 100)|
+-------------------------------------------------+
|                                             10.0|
+-------------------------------------------------+

-- skewness
SELECT skewness(col) FROM VALUES (-10), (-20), (100), (1000) AS tab(col);
+-----------------------------+
|skewness(CAST(col AS DOUBLE))|
+-----------------------------+
|           1.1135657469022013|
+-----------------------------+

SELECT skewness(col) FROM VALUES (-1000), (-100), (10), (20) AS tab(col);
+-----------------------------+
|skewness(CAST(col AS DOUBLE))|
+-----------------------------+
|          -1.1135657469022011|
+-----------------------------+

-- some
SELECT some(col) FROM VALUES (true), (false), (false) AS tab(col);
+---------+
|some(col)|
+---------+
|     true|
+---------+

SELECT some(col) FROM VALUES (NULL), (true), (false) AS tab(col);
+---------+
|some(col)|
+---------+
|     true|
+---------+

SELECT some(col) FROM VALUES (false), (false), (NULL) AS tab(col);
+---------+
|some(col)|
+---------+
|    false|
+---------+

-- std
SELECT std(col) FROM VALUES (1), (2), (3) AS tab(col);
+------------------------+
|std(CAST(col AS DOUBLE))|
+------------------------+
|                     1.0|
+------------------------+

-- stddev
SELECT stddev(col) FROM VALUES (1), (2), (3) AS tab(col);
+---------------------------+
|stddev(CAST(col AS DOUBLE))|
+---------------------------+
|                        1.0|
+---------------------------+

-- stddev_pop
SELECT stddev_pop(col) FROM VALUES (1), (2), (3) AS tab(col);
+-------------------------------+
|stddev_pop(CAST(col AS DOUBLE))|
+-------------------------------+
|              0.816496580927726|
+-------------------------------+

-- stddev_samp
SELECT stddev_samp(col) FROM VALUES (1), (2), (3) AS tab(col);
+--------------------------------+
|stddev_samp(CAST(col AS DOUBLE))|
+--------------------------------+
|                             1.0|
+--------------------------------+

-- sum
SELECT sum(col) FROM VALUES (5), (10), (15) AS tab(col);
+--------+
|sum(col)|
+--------+
|      30|
+--------+

SELECT sum(col) FROM VALUES (NULL), (10), (15) AS tab(col);
+--------+
|sum(col)|
+--------+
|      25|
+--------+

SELECT sum(col) FROM VALUES (NULL), (NULL) AS tab(col);
+--------+
|sum(col)|
+--------+
|    null|
+--------+

-- var_pop
SELECT var_pop(col) FROM VALUES (1), (2), (3) AS tab(col);
+----------------------------+
|var_pop(CAST(col AS DOUBLE))|
+----------------------------+
|          0.6666666666666666|
+----------------------------+

-- var_samp
SELECT var_samp(col) FROM VALUES (1), (2), (3) AS tab(col);
+-----------------------------+
|var_samp(CAST(col AS DOUBLE))|
+-----------------------------+
|                          1.0|
+-----------------------------+

-- variance
SELECT variance(col) FROM VALUES (1), (2), (3) AS tab(col);
+-----------------------------+
|variance(CAST(col AS DOUBLE))|
+-----------------------------+
|                          1.0|
+-----------------------------+

开窗函数Window functions

函数Function 说明Description
cume_dist ( # A1cume_dist() 计算相对于分区中所有值的值的位置。Computes the position of a value relative to all values in the partition.
dense_rank ( # A1dense_rank() 计算某个值在值组中的排名。Computes the rank of a value in a group of values. 结果为一个加上之前分配的排名值。The result is one plus the previously assigned rank value. 与函数级别不同的是,dense_rank 不会在排名序列中产生空白。Unlike the function rank, dense_rank will not produce gaps in the ranking sequence.
lag (输入 [,偏移量 [,默认]] ) lag(input[,offset[,default]]) 返回 input offset 窗口中当前行之前第一行的值。Returns the value of input at the offsetth row before the current row in the window. 的默认值 offset 为1,默认值 default 为 null。The default value of offset is 1 and the default value of default is null. 如果 input offset 第一行的值为 null,则返回 null。If the value of input at the offsetth row is null, null is returned. 如果没有这样的偏移行 (例如,如果偏移量为1,则窗口的第一行不会有任何上一个行) defaultIf there is no such offset row (e.g., when the offset is 1, the first row of the window does not have any previous row), default is returned.
潜在顾客 (输入 [,偏移量 [,默认]] ) lead(input[,offset[,default]]) 返回 input offset 窗口中当前行之后的第一行的值。Returns the value of input at the offsetth row after the current row in the window. 的默认值 offset 为1,默认值 default 为 null。The default value of offset is 1 and the default value of default is null. 如果 input offset 第一行的值为 null,则返回 null。If the value of input at the offsetth row is null, null is returned. 如果没有这样的偏移行 (例如,如果偏移量为1,则窗口的最后一行不会返回任何后续行) defaultIf there is no such an offset row (e.g., when the offset is 1, the last row of the window does not have any subsequent row), default is returned.
ntile (n) ntile(n) 将每个窗口分区的行分割为 n 介于1到最多的存储桶中 nDivides the rows for each window partition into n buckets ranging from 1 to at most n.
percent_rank ( # A1percent_rank() 计算一组值中值的百分比排名。Computes the percentage ranking of a value in a group of values.
排名 ( # A1rank() 计算某个值在值组中的排名。Computes the rank of a value in a group of values. 结果是 1 加上前面的行数,或者等于当前行在分区中的顺序。The result is one plus the number of rows preceding or equal to the current row in the ordering of the partition. 值将在序列中生成空隙。The values will produce gaps in the sequence.
row_number()row_number() 根据窗口分区中的行顺序,为每一行指定一个唯一的顺序编号(从1开始)。Assigns a unique, sequential number to each row, starting with one, according to the ordering of rows within the window partition.

数组函数Array functions

函数Function 说明Description
array_contains (数组、值) array_contains(array,value) 如果数组包含该值,则返回 true。Returns true if the array contains the value.
array_distinct (数组) array_distinct(array) 从数组中移除重复值。Removes duplicate values from the array.
array_except (array1,array2) array_except(array1,array2) 返回 array1 中的元素的数组,而不是在 array2 中,不包含重复项。Returns an array of the elements in array1 but not in array2, without duplicates.
array_intersect (array1,array2) array_intersect(array1,array2) 返回 array1 和 array2 的交集中的元素的数组,不包含重复项。Returns an array of the elements in the intersection of array1 and array2, without duplicates.
array_join (数组,分隔符 [,nullReplacement] ) array_join(array,delimiter[,nullReplacement]) 使用分隔符和可选字符串连接给定数组的元素,以替换 null 值。Concatenates the elements of the given array using the delimiter and an optional string to replace nulls. 如果没有为 nullReplacement 设置任何值,则将筛选出任何 null 值。If no value is set for nullReplacement, any null value is filtered.
array_max (数组) array_max(array) 返回数组中的最大值。Returns the maximum value in the array. 将跳过空元素。NULL elements are skipped.
array_min (数组) array_min(array) 返回数组中的最小值。Returns the minimum value in the array. 将跳过空元素。NULL elements are skipped.
array_position (数组,元素) array_position(array,element) 返回数组中第一个元素的从 (1 开始的) 索引。Returns the (1-based) index of the first element of the array as long.
array_remove (数组,元素) array_remove(array,element) 从数组中移除等于元素的所有元素。Removes all elements that equal to element from array.
array_repeat (元素,计数) array_repeat(element,count) 返回包含元素计数时间的数组。Returns the array containing element count times.
array_union (array1,array2) array_union(array1,array2) 返回 array1 和 array2 的联合中的元素的数组,无重复项。Returns an array of the elements in the union of array1 and array2, without duplicates.
arrays_overlap (a1,a2) arrays_overlap(a1,a2) 如果 a1 至少包含 a2 中同时存在的非 null 元素,则返回 true。Returns true if a1 contains at least a non-null element present also in a2. 如果数组没有公共元素,并且它们都为非空,并且其中任何一个包含空元素 null,则返回 false; 否则为 false。If the arrays have no common element and they are both non-empty and either of them contains a null element null is returned, false otherwise.
(a1,a2,... arrays_zip ) arrays_zip(a1,a2,…) 返回结构的合并数组,其中第 N 个结构包含输入数组的所有第 N 个值。Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays.
concat (col1,col2,..., colN) concat(col1,col2,…,colN) 返回 col1,col2,...,colN 的串联。Returns the concatenation of col1, col2, …, colN.
(arrayOfArrays) 平展flatten(arrayOfArrays) 将数组数组转换为单个数组。Transforms an array of arrays into a single array.
反转 (数组) reverse(array) 返回反向字符串或具有元素反向顺序的数组。Returns a reversed string or an array with reverse order of elements.
序列 (启动、停止、步骤) sequence(start,stop,step) 从 start 到 stop (包含) 生成元素数组,按步骤递增。Generates an array of elements from start to stop (inclusive), incrementing by step. 返回元素的类型与参数表达式的类型相同。The type of the returned elements is the same as the type of argument expressions. 支持的类型为: byte、short、integer、long、date、timestamp。Supported types are: byte, short, integer, long, date, timestamp. 开始和停止表达式必须解析为相同的类型。The start and stop expressions must resolve to the same type. 如果启动和停止表达式解析为 "date" 或 "timestamp" 类型,则步骤表达式必须解析为 "interval" 类型,否则为与 start 和 stop 表达式相同的类型。If start and stop expressions resolve to the ‘date’ or ‘timestamp’ type then the step expression must resolve to the ‘interval’ type, otherwise to the same type as the start and stop expressions.
无序 (阵列) shuffle(array) 返回给定数组的随机排列。Returns a random permutation of the given array.
切片 (x、start、length) slice(x,start,length) 从索引开始开始的子集数组 x (数组索引从1开始,或从 end 开始(如果 start 为负) 具有指定长度)。Subsets array x starting from index start (array indices start at 1, or starting from the end if start is negative) with the specified length.
sort_array (array [,ascendingOrder] ) sort_array(array[,ascendingOrder]) 根据数组元素的自然顺序,按升序或降序对输入数组进行排序。Sorts the input array in ascending or descending order according to the natural ordering of the array elements. 空元素将按升序放置在返回数组的开头,或以降序顺序放置在返回数组的末尾。Null elements will be placed at the beginning of the returned array in ascending order or at the end of the returned array in descending order.

示例Examples

-- array_contains
SELECT array_contains(array(1, 2, 3), 2);
+---------------------------------+
|array_contains(array(1, 2, 3), 2)|
+---------------------------------+
|                             true|
+---------------------------------+

-- array_distinct
SELECT array_distinct(array(1, 2, 3, null, 3));
+----------------------------------------------------+
|array_distinct(array(1, 2, 3, CAST(NULL AS INT), 3))|
+----------------------------------------------------+
|                                          [1, 2, 3,]|
+----------------------------------------------------+

-- array_except
SELECT array_except(array(1, 2, 3), array(1, 3, 5));
+--------------------------------------------+
|array_except(array(1, 2, 3), array(1, 3, 5))|
+--------------------------------------------+
|                                         [2]|
+--------------------------------------------+

-- array_intersect
SELECT array_intersect(array(1, 2, 3), array(1, 3, 5));
+-----------------------------------------------+
|array_intersect(array(1, 2, 3), array(1, 3, 5))|
+-----------------------------------------------+
|                                         [1, 3]|
+-----------------------------------------------+

-- array_join
SELECT array_join(array('hello', 'world'), ' ');
+----------------------------------+
|array_join(array(hello, world),  )|
+----------------------------------+
|                       hello world|
+----------------------------------+

SELECT array_join(array('hello', null ,'world'), ' ');
+--------------------------------------------------------+
|array_join(array(hello, CAST(NULL AS STRING), world),  )|
+--------------------------------------------------------+
|                                             hello world|
+--------------------------------------------------------+

SELECT array_join(array('hello', null ,'world'), ' ', ',');
+-----------------------------------------------------------+
|array_join(array(hello, CAST(NULL AS STRING), world),  , ,)|
+-----------------------------------------------------------+
|                                              hello , world|
+-----------------------------------------------------------+

-- array_max
SELECT array_max(array(1, 20, null, 3));
+---------------------------------------------+
|array_max(array(1, 20, CAST(NULL AS INT), 3))|
+---------------------------------------------+
|                                           20|
+---------------------------------------------+

-- array_min
SELECT array_min(array(1, 20, null, 3));
+---------------------------------------------+
|array_min(array(1, 20, CAST(NULL AS INT), 3))|
+---------------------------------------------+
|                                            1|
+---------------------------------------------+

-- array_position
SELECT array_position(array(3, 2, 1), 1);
+---------------------------------+
|array_position(array(3, 2, 1), 1)|
+---------------------------------+
|                                3|
+---------------------------------+

-- array_remove
SELECT array_remove(array(1, 2, 3, null, 3), 3);
+-----------------------------------------------------+
|array_remove(array(1, 2, 3, CAST(NULL AS INT), 3), 3)|
+-----------------------------------------------------+
|                                              [1, 2,]|
+-----------------------------------------------------+

-- array_repeat
SELECT array_repeat('123', 2);
+--------------------+
|array_repeat(123, 2)|
+--------------------+
|          [123, 123]|
+--------------------+

-- array_union
SELECT array_union(array(1, 2, 3), array(1, 3, 5));
+-------------------------------------------+
|array_union(array(1, 2, 3), array(1, 3, 5))|
+-------------------------------------------+
|                               [1, 2, 3, 5]|
+-------------------------------------------+

-- arrays_overlap
SELECT arrays_overlap(array(1, 2, 3), array(3, 4, 5));
+----------------------------------------------+
|arrays_overlap(array(1, 2, 3), array(3, 4, 5))|
+----------------------------------------------+
|                                          true|
+----------------------------------------------+

-- arrays_zip
SELECT arrays_zip(array(1, 2, 3), array(2, 3, 4));
+------------------------------------------+
|arrays_zip(array(1, 2, 3), array(2, 3, 4))|
+------------------------------------------+
|                      [[1, 2], [2, 3], ...|
+------------------------------------------+

SELECT arrays_zip(array(1, 2), array(2, 3), array(3, 4));
+-------------------------------------------------+
|arrays_zip(array(1, 2), array(2, 3), array(3, 4))|
+-------------------------------------------------+
|                             [[1, 2, 3], [2, 3...|
+-------------------------------------------------+

-- concat
SELECT concat('Spark', 'SQL');
+------------------+
|concat(Spark, SQL)|
+------------------+
|          SparkSQL|
+------------------+

SELECT concat(array(1, 2, 3), array(4, 5), array(6));
+---------------------------------------------+
|concat(array(1, 2, 3), array(4, 5), array(6))|
+---------------------------------------------+
|                           [1, 2, 3, 4, 5, 6]|
+---------------------------------------------+

-- flatten
SELECT flatten(array(array(1, 2), array(3, 4)));
+----------------------------------------+
|flatten(array(array(1, 2), array(3, 4)))|
+----------------------------------------+
|                            [1, 2, 3, 4]|
+----------------------------------------+

-- reverse
SELECT reverse('Spark SQL');
+------------------+
|reverse(Spark SQL)|
+------------------+
|         LQS krapS|
+------------------+

SELECT reverse(array(2, 1, 4, 3));
+--------------------------+
|reverse(array(2, 1, 4, 3))|
+--------------------------+
|              [3, 4, 1, 2]|
+--------------------------+

-- sequence
SELECT sequence(1, 5);
+---------------+
| sequence(1, 5)|
+---------------+
|[1, 2, 3, 4, 5]|
+---------------+

SELECT sequence(5, 1);
+---------------+
| sequence(5, 1)|
+---------------+
|[5, 4, 3, 2, 1]|
+---------------+

SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval 1 month);
+---------------------------------------------------------------------------+
|sequence(to_date('2018-01-01'), to_date('2018-03-01'), INTERVAL '1 months')|
+---------------------------------------------------------------------------+
|                                                       [2018-01-01, 2018...|
+---------------------------------------------------------------------------+

-- shuffle
SELECT shuffle(array(1, 20, 3, 5));
+---------------------------+
|shuffle(array(1, 20, 3, 5))|
+---------------------------+
|              [5, 20, 3, 1]|
+---------------------------+

SELECT shuffle(array(1, 20, null, 3));
+-------------------------------------------+
|shuffle(array(1, 20, CAST(NULL AS INT), 3))|
+-------------------------------------------+
|                                [20, 3,, 1]|
+-------------------------------------------+

-- slice
SELECT slice(array(1, 2, 3, 4), 2, 2);
+------------------------------+
|slice(array(1, 2, 3, 4), 2, 2)|
+------------------------------+
|                        [2, 3]|
+------------------------------+

SELECT slice(array(1, 2, 3, 4), -2, 2);
+-------------------------------+
|slice(array(1, 2, 3, 4), -2, 2)|
+-------------------------------+
|                         [3, 4]|
+-------------------------------+

-- sort_array
SELECT sort_array(array('b', 'd', null, 'c', 'a'), true);
+---------------------------------------------------------+
|sort_array(array(b, d, CAST(NULL AS STRING), c, a), true)|
+---------------------------------------------------------+
|                                           [, a, b, c, d]|
+---------------------------------------------------------+

Map 函数Map functions

函数Function 说明Description
(地图,... map_concat ) map_concat(map,…) 返回所有给定映射的联合。Returns the union of all the given maps.
map_entries (映射) map_entries(map) 返回给定映射中所有项的无序数组。Returns an unordered array of all entries in the given map.
map_from_entries (arrayOfEntries) map_from_entries(arrayOfEntries) 返回从给定的项数组创建的映射。Returns a map created from the given array of entries.
map_keys (映射) map_keys(map) 返回包含映射键的无序数组。Returns an unordered array containing the keys of the map.
map_values (映射) map_values(map) 返回包含映射值的无序数组。Returns an unordered array containing the values of the map.

示例Examples

-- map_concat
SELECT map_concat(map(1, 'a', 2, 'b'), map(3, 'c'));
+--------------------------------------+
|map_concat(map(1, a, 2, b), map(3, c))|
+--------------------------------------+
|                  [1 -> a, 2 -> b, ...|
+--------------------------------------+

-- map_entries
SELECT map_entries(map(1, 'a', 2, 'b'));
+----------------------------+
|map_entries(map(1, a, 2, b))|
+----------------------------+
|            [[1, a], [2, b]]|
+----------------------------+

-- map_from_entries
SELECT map_from_entries(array(struct(1, 'a'), struct(2, 'b')));
+---------------------------------------------------+
|map_from_entries(array(struct(1, a), struct(2, b)))|
+---------------------------------------------------+
|                                   [1 -> a, 2 -> b]|
+---------------------------------------------------+

-- map_keys
SELECT map_keys(map(1, 'a', 2, 'b'));
+-------------------------+
|map_keys(map(1, a, 2, b))|
+-------------------------+
|                   [1, 2]|
+-------------------------+

-- map_values
SELECT map_values(map(1, 'a', 2, 'b'));
+---------------------------+
|map_values(map(1, a, 2, b))|
+---------------------------+
|                     [a, b]|
+---------------------------+

日期和时间戳函数Date and timestamp functions

有关日期和时间戳格式的信息,请参阅日期 时间模式For information on date and timestamp formats, see Datetime patterns.

函数Function 说明Description
add_months (start_date,num_months) add_months(start_date,num_months) 返回晚的日期 num_months start_dateReturns the date that is num_months after start_date.
current_date ( # A1current_date() 返回查询开始计算的当前日期。Returns the current date at the start of query evaluation.
current_timestamp ( # A1current_timestamp() 返回查询计算开始时的当前时间戳。Returns the current timestamp at the start of query evaluation.
date_add (start_date,num_days) date_add(start_date,num_days) 返回晚的日期 num_days start_dateReturns the date that is num_days after start_date.
date_format (时间戳,bcp.fmt) date_format(timestamp,fmt) 转换 timestamp 为日期格式所指定格式的字符串值 fmtConverts timestamp to a value of string in the format specified by the date format fmt.
date_part (字段、源) date_part(field,source) 提取日期/时间戳或间隔源的部分。Extracts a part of the date/timestamp or interval source.
date_sub (start_date,num_days) date_sub(start_date,num_days) 返回之前的日期 num_days start_dateReturns the date that is num_days before start_date.
date_trunc (bcp.fmt,ts) date_trunc(fmt,ts) 返回 ts 与格式模型指定的单元截断的时间戳 fmtReturns timestamp ts truncated to the unit specified by the format model fmt.
datediff (终止日期,开始) datediff(endDate,startDate) 返回到的天数 startDate endDateReturns the number of days from startDate to endDate.
dayofweek (日期) dayofweek(date) 返回日期/时间戳 (1 = 星期日,2 = 星期一,...,7 = 星期六) 的周日期。Returns the day of the week for date/timestamp (1 = Sunday, 2 = Monday, …, 7 = Saturday).
dayofyear (日期) dayofyear(date) 返回日期/时间戳的年中日期。Returns the day of year of the date/timestamp.
from_unixtime (unix_time,格式) from_unixtime(unix_time,format) unix_time在指定的中返回 formatReturns unix_time in the specified format.
from_utc_timestamp (时间戳,时区) from_utc_timestamp(timestamp,timezone) 给定时间戳(如 "2017-07-14 02:40: 00.0"),将其解释为 UTC 时间,并将该时间呈现为给定时区中的时间戳。Given a timestamp like ‘2017-07-14 02:40:00.0’, interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. 例如,"GMT + 1" 将产生 "2017-07-14 03:40: 00.0"。For example, ‘GMT+1’ would yield ‘2017-07-14 03:40:00.0’.
小时 (时间戳) hour(timestamp) 返回字符串/时间戳的小时部分。Returns the hour component of the string/timestamp.
last_day (日期) last_day(date) 返回日期所属月份的最后一天。Returns the last day of the month which the date belongs to.
(年、月、日 make_date) make_date(year,month,day) 从年、月和日字段创建日期Create date from year, month and day fields
make_timestamp (年、月、日、小时、分钟、秒 [,时区] ) make_timestamp(year,month,day,hour,min,sec[,timezone]) 从年、月、日、小时、分钟、秒和时区字段创建时间戳。Create timestamp from year, month, day, hour, min, sec and timezone fields.
分钟 (时间戳) minute(timestamp) 返回字符串/时间戳的分钟部分。Returns the minute component of the string/timestamp.
月份 (日期) month(date) 返回日期/时间戳的月份部分。Returns the month component of the date/timestamp.
months_between (timestamp1,timestamp2 [,roundOff] ) months_between(timestamp1,timestamp2[,roundOff]) 如果 timestamp1 晚于 timestamp2 ,则结果为正。If timestamp1 is later than timestamp2, then the result is positive. 如果 timestamp1 和都 timestamp2 在同一月的同一天,或者两者都是月份的最后一天,则将忽略当天的时间。If timestamp1 and timestamp2 are on the same day of month, or both are the last day of month, time of day will be ignored. 否则,差异基于每月31天计算,舍入到8位数,除非 roundOff = false。Otherwise, the difference is calculated based on 31 days per month, and rounded to 8 digits unless roundOff=false.
next_day (start_date,day_of_week) next_day(start_date,day_of_week) 返回晚于并按指示进行命名的第一个日期 start_dateReturns the first date which is later than start_date and named as indicated.
now()now() 返回查询计算开始时的当前时间戳。Returns the current timestamp at the start of query evaluation.
季度 (日期) quarter(date) 返回日期的年份的季度,范围为1到4。Returns the quarter of the year for date, in the range 1 to 4.
第二个 (时间戳) second(timestamp) 返回字符串/时间戳的第二个组件。Returns the second component of the string/timestamp.
to_date (date_str [,bcp.fmt] ) to_date(date_str[,fmt]) date_str使用表达式将表达式分析 fmt 为日期。Parses the date_str expression with the fmt expression to a date. 返回无效输入的 null 值。Returns null with invalid input. 默认情况下,如果省略了,它将遵循转换规则到日期 fmtBy default, it follows casting rules to a date if the fmt is omitted
to_timestamp (timestamp_str [,bcp.fmt] ) to_timestamp(timestamp_str[,fmt]) timestamp_str使用表达式将表达式分析 fmt 为时间戳。Parses the timestamp_str expression with the fmt expression to a timestamp. 返回无效输入的 null 值。Returns null with invalid input. 如果省略,则默认情况下,它将强制转换规则转换为时间戳 fmtBy default, it follows casting rules to a timestamp if the fmt is omitted.
to_unix_timestamp (timeExp [,format] ) to_unix_timestamp(timeExp[,format]) 返回给定时间的 UNIX 时间戳。Returns the UNIX timestamp of the given time.
to_utc_timestamp (时间戳,时区) to_utc_timestamp(timestamp,timezone) 给定时间戳(如 "2017-07-14 02:40: 00.0"),将其解释为给定时区中的时间,并将该时间呈现为 UTC 时间戳。Given a timestamp like ‘2017-07-14 02:40:00.0’, interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. 例如,"GMT + 1" 将产生 "2017-07-14 01:40: 00.0"For example, ‘GMT+1’ would yield ‘2017-07-14 01:40:00.0’
trunc (datefmt) trunc(datefmt) 返回, date 其中一天的时间部分被截断为格式模型指定的单位 fmtReturns date with the time portion of the day truncated to the unit specified by the format model fmt.
unix_timestamp ( [timeExp [format]] ) unix_timestamp([timeExp[format]]) 返回当前或指定时间的 UNIX 时间戳。Returns the UNIX timestamp of current or specified time.
工作日 (日期) weekday(date) 返回日期/时间戳的星期日期 (0 = 星期一,1 = 星期二,...,6 = 星期日) 。Returns the day of the week for date/timestamp (0 = Monday, 1 = Tuesday, …, 6 = Sunday).
weekofyear (日期) weekofyear(date) 返回给定日期是一年中的第几周。Returns the week of the year of the given date. 一周的开始时间是星期一,第1周是第一周的 >3 天。A week is considered to start on a Monday and week 1 is the first week with >3 days.
年份 (日期) year(date) 返回日期/时间戳的年份部分。Returns the year component of the date/timestamp.

示例Examples

-- add_months
SELECT add_months('2016-08-31', 1);
+---------------------------------------+
|add_months(CAST(2016-08-31 AS DATE), 1)|
+---------------------------------------+
|                             2016-09-30|
+---------------------------------------+

-- current_date
SELECT current_date();
+--------------+
|current_date()|
+--------------+
|    2020-06-06|
+--------------+

SELECT current_date;
+--------------+
|current_date()|
+--------------+
|    2020-06-06|
+--------------+

-- current_timestamp
SELECT current_timestamp();
+--------------------+
| current_timestamp()|
+--------------------+
|2020-06-06 14:00:...|
+--------------------+

SELECT current_timestamp;
+--------------------+
| current_timestamp()|
+--------------------+
|2020-06-06 14:00:...|
+--------------------+

-- date_add
SELECT date_add('2016-07-30', 1);
+-------------------------------------+
|date_add(CAST(2016-07-30 AS DATE), 1)|
+-------------------------------------+
|                           2016-07-31|
+-------------------------------------+

-- date_format
SELECT date_format('2016-04-08', 'y');
+---------------------------------------------+
|date_format(CAST(2016-04-08 AS TIMESTAMP), y)|
+---------------------------------------------+
|                                         2016|
+---------------------------------------------+

-- date_part
SELECT date_part('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456');
+---------------------------------------------------------+
|date_part('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456')|
+---------------------------------------------------------+
|                                                     2019|
+---------------------------------------------------------+

SELECT date_part('week', timestamp'2019-08-12 01:00:00.123456');
+---------------------------------------------------------+
|date_part('week', TIMESTAMP '2019-08-12 01:00:00.123456')|
+---------------------------------------------------------+
|                                                       33|
+---------------------------------------------------------+

SELECT date_part('doy', DATE'2019-08-12');
+-----------------------------------+
|date_part('doy', DATE '2019-08-12')|
+-----------------------------------+
|                                224|
+-----------------------------------+

SELECT date_part('SECONDS', timestamp'2019-10-01 00:00:01.000001');
+------------------------------------------------------------+
|date_part('SECONDS', TIMESTAMP '2019-10-01 00:00:01.000001')|
+------------------------------------------------------------+
|                                                    1.000001|
+------------------------------------------------------------+

SELECT date_part('days', interval 1 year 10 months 5 days);
+------------------------------------------------------+
|date_part('days', INTERVAL '1 years 10 months 5 days')|
+------------------------------------------------------+
|                                                     5|
+------------------------------------------------------+

SELECT date_part('seconds', interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
+----------------------------------------------------------+
|date_part('seconds', INTERVAL '5 hours 30.001001 seconds')|
+----------------------------------------------------------+
|                                                 30.001001|
+----------------------------------------------------------+

-- date_sub
SELECT date_sub('2016-07-30', 1);
+-------------------------------------+
|date_sub(CAST(2016-07-30 AS DATE), 1)|
+-------------------------------------+
|                           2016-07-29|
+-------------------------------------+

-- date_trunc
SELECT date_trunc('YEAR', '2015-03-05T09:32:05.359');
+------------------------------------------------------------+
|date_trunc(YEAR, CAST(2015-03-05T09:32:05.359 AS TIMESTAMP))|
+------------------------------------------------------------+
|                                         2015-01-01 00:00:00|
+------------------------------------------------------------+

SELECT date_trunc('MM', '2015-03-05T09:32:05.359');
+----------------------------------------------------------+
|date_trunc(MM, CAST(2015-03-05T09:32:05.359 AS TIMESTAMP))|
+----------------------------------------------------------+
|                                       2015-03-01 00:00:00|
+----------------------------------------------------------+

SELECT date_trunc('DD', '2015-03-05T09:32:05.359');
+----------------------------------------------------------+
|date_trunc(DD, CAST(2015-03-05T09:32:05.359 AS TIMESTAMP))|
+----------------------------------------------------------+
|                                       2015-03-05 00:00:00|
+----------------------------------------------------------+

SELECT date_trunc('HOUR', '2015-03-05T09:32:05.359');
+------------------------------------------------------------+
|date_trunc(HOUR, CAST(2015-03-05T09:32:05.359 AS TIMESTAMP))|
+------------------------------------------------------------+
|                                         2015-03-05 09:00:00|
+------------------------------------------------------------+

SELECT date_trunc('MILLISECOND', '2015-03-05T09:32:05.123456');
+----------------------------------------------------------------------+
|date_trunc(MILLISECOND, CAST(2015-03-05T09:32:05.123456 AS TIMESTAMP))|
+----------------------------------------------------------------------+
|                                                  2015-03-05 09:32:...|
+----------------------------------------------------------------------+

-- datediff
SELECT datediff('2009-07-31', '2009-07-30');
+------------------------------------------------------------+
|datediff(CAST(2009-07-31 AS DATE), CAST(2009-07-30 AS DATE))|
+------------------------------------------------------------+
|                                                           1|
+------------------------------------------------------------+

SELECT datediff('2009-07-30', '2009-07-31');
+------------------------------------------------------------+
|datediff(CAST(2009-07-30 AS DATE), CAST(2009-07-31 AS DATE))|
+------------------------------------------------------------+
|                                                          -1|
+------------------------------------------------------------+

-- dayofweek
SELECT dayofweek('2009-07-30');
+-----------------------------------+
|dayofweek(CAST(2009-07-30 AS DATE))|
+-----------------------------------+
|                                  5|
+-----------------------------------+

-- dayofyear
SELECT dayofyear('2016-04-09');
+-----------------------------------+
|dayofyear(CAST(2016-04-09 AS DATE))|
+-----------------------------------+
|                                100|
+-----------------------------------+

-- from_unixtime
SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss');
+-----------------------------------------------------+
|from_unixtime(CAST(0 AS BIGINT), yyyy-MM-dd HH:mm:ss)|
+-----------------------------------------------------+
|                                  1970-01-01 00:00:00|
+-----------------------------------------------------+

-- from_utc_timestamp
SELECT from_utc_timestamp('2016-08-31', 'Asia/Seoul');
+-------------------------------------------------------------+
|from_utc_timestamp(CAST(2016-08-31 AS TIMESTAMP), Asia/Seoul)|
+-------------------------------------------------------------+
|                                          2016-08-31 09:00:00|
+-------------------------------------------------------------+

-- hour
SELECT hour('2009-07-30 12:58:59');
+--------------------------------------------+
|hour(CAST(2009-07-30 12:58:59 AS TIMESTAMP))|
+--------------------------------------------+
|                                          12|
+--------------------------------------------+

-- last_day
SELECT last_day('2009-01-12');
+----------------------------------+
|last_day(CAST(2009-01-12 AS DATE))|
+----------------------------------+
|                        2009-01-31|
+----------------------------------+

-- make_date
SELECT make_date(2013, 7, 15);
+----------------------+
|make_date(2013, 7, 15)|
+----------------------+
|            2013-07-15|
+----------------------+

SELECT make_date(2019, 13, 1);
+----------------------+
|make_date(2019, 13, 1)|
+----------------------+
|                  null|
+----------------------+

SELECT make_date(2019, 7, NULL);
+-------------------------------------+
|make_date(2019, 7, CAST(NULL AS INT))|
+-------------------------------------+
|                                 null|
+-------------------------------------+

SELECT make_date(2019, 2, 30);
+----------------------+
|make_date(2019, 2, 30)|
+----------------------+
|                  null|
+----------------------+

-- make_timestamp
SELECT make_timestamp(2014, 12, 28, 6, 30, 45.887);
+-----------------------------------------------------------------+
|make_timestamp(2014, 12, 28, 6, 30, CAST(45.887 AS DECIMAL(8,6)))|
+-----------------------------------------------------------------+
|                                             2014-12-28 06:30:...|
+-----------------------------------------------------------------+

SELECT make_timestamp(2014, 12, 28, 6, 30, 45.887, 'CET');
+----------------------------------------------------------------------+
|make_timestamp(2014, 12, 28, 6, 30, CAST(45.887 AS DECIMAL(8,6)), CET)|
+----------------------------------------------------------------------+
|                                                  2014-12-28 05:30:...|
+----------------------------------------------------------------------+

SELECT make_timestamp(2019, 6, 30, 23, 59, 60);
+-------------------------------------------------------------+
|make_timestamp(2019, 6, 30, 23, 59, CAST(60 AS DECIMAL(8,6)))|
+-------------------------------------------------------------+
|                                          2019-07-01 00:00:00|
+-------------------------------------------------------------+

SELECT make_timestamp(2019, 13, 1, 10, 11, 12, 'PST');
+------------------------------------------------------------------+
|make_timestamp(2019, 13, 1, 10, 11, CAST(12 AS DECIMAL(8,6)), PST)|
+------------------------------------------------------------------+
|                                                              null|
+------------------------------------------------------------------+

SELECT make_timestamp(null, 7, 22, 15, 30, 0);
+-------------------------------------------------------------------------+
|make_timestamp(CAST(NULL AS INT), 7, 22, 15, 30, CAST(0 AS DECIMAL(8,6)))|
+-------------------------------------------------------------------------+
|                                                                     null|
+-------------------------------------------------------------------------+

-- minute
SELECT minute('2009-07-30 12:58:59');
+----------------------------------------------+
|minute(CAST(2009-07-30 12:58:59 AS TIMESTAMP))|
+----------------------------------------------+
|                                            58|
+----------------------------------------------+

-- month
SELECT month('2016-07-30');
+-------------------------------+
|month(CAST(2016-07-30 AS DATE))|
+-------------------------------+
|                              7|
+-------------------------------+

-- months_between
SELECT months_between('1997-02-28 10:30:00', '1996-10-30');
+-------------------------------------------------------------------------------------------+
|months_between(CAST(1997-02-28 10:30:00 AS TIMESTAMP), CAST(1996-10-30 AS TIMESTAMP), true)|
+-------------------------------------------------------------------------------------------+
|                                                                                 3.94959677|
+-------------------------------------------------------------------------------------------+

SELECT months_between('1997-02-28 10:30:00', '1996-10-30', false);
+--------------------------------------------------------------------------------------------+
|months_between(CAST(1997-02-28 10:30:00 AS TIMESTAMP), CAST(1996-10-30 AS TIMESTAMP), false)|
+--------------------------------------------------------------------------------------------+
|                                                                          3.9495967741935485|
+--------------------------------------------------------------------------------------------+

-- next_day
SELECT next_day('2015-01-14', 'TU');
+--------------------------------------+
|next_day(CAST(2015-01-14 AS DATE), TU)|
+--------------------------------------+
|                            2015-01-20|
+--------------------------------------+

-- now
SELECT now();
+--------------------+
|               now()|
+--------------------+
|2020-06-06 14:00:...|
+--------------------+

-- quarter
SELECT quarter('2016-08-31');
+---------------------------------+
|quarter(CAST(2016-08-31 AS DATE))|
+---------------------------------+
|                                3|
+---------------------------------+

-- second
SELECT second('2009-07-30 12:58:59');
+----------------------------------------------+
|second(CAST(2009-07-30 12:58:59 AS TIMESTAMP))|
+----------------------------------------------+
|                                            59|
+----------------------------------------------+

-- to_date
SELECT to_date('2009-07-30 04:17:52');
+------------------------------+
|to_date('2009-07-30 04:17:52')|
+------------------------------+
|                    2009-07-30|
+------------------------------+

SELECT to_date('2016-12-31', 'yyyy-MM-dd');
+-----------------------------------+
|to_date('2016-12-31', 'yyyy-MM-dd')|
+-----------------------------------+
|                         2016-12-31|
+-----------------------------------+

-- to_timestamp
SELECT to_timestamp('2016-12-31 00:12:00');
+-----------------------------------+
|to_timestamp('2016-12-31 00:12:00')|
+-----------------------------------+
|                2016-12-31 00:12:00|
+-----------------------------------+

SELECT to_timestamp('2016-12-31', 'yyyy-MM-dd');
+----------------------------------------+
|to_timestamp('2016-12-31', 'yyyy-MM-dd')|
+----------------------------------------+
|                     2016-12-31 00:00:00|
+----------------------------------------+

-- to_unix_timestamp
SELECT to_unix_timestamp('2016-04-08', 'yyyy-MM-dd');
+-----------------------------------------+
|to_unix_timestamp(2016-04-08, yyyy-MM-dd)|
+-----------------------------------------+
|                               1460073600|
+-----------------------------------------+

-- to_utc_timestamp
SELECT to_utc_timestamp('2016-08-31', 'Asia/Seoul');
+-----------------------------------------------------------+
|to_utc_timestamp(CAST(2016-08-31 AS TIMESTAMP), Asia/Seoul)|
+-----------------------------------------------------------+
|                                        2016-08-30 15:00:00|
+-----------------------------------------------------------+

-- trunc
SELECT trunc('2019-08-04', 'week');
+-------------------------------------+
|trunc(CAST(2019-08-04 AS DATE), week)|
+-------------------------------------+
|                           2019-07-29|
+-------------------------------------+

SELECT trunc('2019-08-04', 'quarter');
+----------------------------------------+
|trunc(CAST(2019-08-04 AS DATE), quarter)|
+----------------------------------------+
|                              2019-07-01|
+----------------------------------------+

SELECT trunc('2009-02-12', 'MM');
+-----------------------------------+
|trunc(CAST(2009-02-12 AS DATE), MM)|
+-----------------------------------+
|                         2009-02-01|
+-----------------------------------+

SELECT trunc('2015-10-27', 'YEAR');
+-------------------------------------+
|trunc(CAST(2015-10-27 AS DATE), YEAR)|
+-------------------------------------+
|                           2015-01-01|
+-------------------------------------+

-- unix_timestamp
SELECT unix_timestamp();
+--------------------------------------------------------+
|unix_timestamp(current_timestamp(), yyyy-MM-dd HH:mm:ss)|
+--------------------------------------------------------+
|                                              1591452047|
+--------------------------------------------------------+

SELECT unix_timestamp('2016-04-08', 'yyyy-MM-dd');
+--------------------------------------+
|unix_timestamp(2016-04-08, yyyy-MM-dd)|
+--------------------------------------+
|                            1460073600|
+--------------------------------------+

-- weekday
SELECT weekday('2009-07-30');
+---------------------------------+
|weekday(CAST(2009-07-30 AS DATE))|
+---------------------------------+
|                                3|
+---------------------------------+

-- weekofyear
SELECT weekofyear('2008-02-20');
+------------------------------------+
|weekofyear(CAST(2008-02-20 AS DATE))|
+------------------------------------+
|                                   8|
+------------------------------------+

-- year
SELECT year('2016-07-30');
+------------------------------+
|year(CAST(2016-07-30 AS DATE))|
+------------------------------+
|                          2016|
+------------------------------+

JSON 函数JSON functions

函数Function 说明Description
from_json (jsonStr,schema [,options] ) from_json(jsonStr, schema[, options]) 返回具有给定和的结构值 jsonStr schemaReturns a struct value with the given jsonStr and schema.
get_json_object (json_txt,路径) get_json_object(json_txt, path) 从中提取 json 对象 pathExtracts a json object from path.
json_tuple (jsonStr、p1、p2、...、pn) json_tuple(jsonStr, p1, p2, …, pn) 返回类似于函数的元组 get_json_object ,但它采用多个名称。Returns a tuple like the function get_json_object, but it takes multiple names. 所有输入参数和输出列类型均为字符串。All the input parameters and output column types are string.
schema_of_json (json [,options] ) schema_of_json(json[, options]) 返回 JSON 字符串 DDL 格式的架构。Returns schema in the DDL format of JSON string.
to_json (expr [,options] ) to_json(expr[, options]) 返回具有给定结构值的 JSON 字符串。Returns a JSON string with a given struct value.

示例Examples

-- from_json
SELECT from_json('{"a":1, "b":0.8}', 'a INT, b DOUBLE');
+---------------------------+
|from_json({"a":1, "b":0.8})|
+---------------------------+
|                   [1, 0.8]|
+---------------------------+

SELECT from_json('{"time":"26/08/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));
+--------------------------------+
|from_json({"time":"26/08/2015"})|
+--------------------------------+
|            [2015-08-26 00:00...|
+--------------------------------+

-- get_json_object
SELECT get_json_object('{"a":"b"}', '$.a');
+-------------------------------+
|get_json_object({"a":"b"}, $.a)|
+-------------------------------+
|                              b|
+-------------------------------+

-- json_tuple
SELECT json_tuple('{"a":1, "b":2}', 'a', 'b');
+---+---+
| c0| c1|
+---+---+
|  1|  2|
+---+---+

-- schema_of_json
SELECT schema_of_json('[{"col":0}]');
+---------------------------+
|schema_of_json([{"col":0}])|
+---------------------------+
|       array<struct<col:...|
+---------------------------+

SELECT schema_of_json('[{"col":01}]', map('allowNumericLeadingZeros', 'true'));
+----------------------------+
|schema_of_json([{"col":01}])|
+----------------------------+
|        array<struct<col:...|
+----------------------------+

-- to_json
SELECT to_json(named_struct('a', 1, 'b', 2));
+---------------------------------+
|to_json(named_struct(a, 1, b, 2))|
+---------------------------------+
|                    {"a":1,"b":2}|
+---------------------------------+

SELECT to_json(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+---------------------------------------------------------------------+
|to_json(named_struct(time, to_timestamp('2015-08-26', 'yyyy-MM-dd')))|
+---------------------------------------------------------------------+
|                                                 {"time":"26/08/20...|
+---------------------------------------------------------------------+

SELECT to_json(array(named_struct('a', 1, 'b', 2)));
+----------------------------------------+
|to_json(array(named_struct(a, 1, b, 2)))|
+----------------------------------------+
|                         [{"a":1,"b":2}]|
+----------------------------------------+

SELECT to_json(map('a', named_struct('b', 1)));
+-----------------------------------+
|to_json(map(a, named_struct(b, 1)))|
+-----------------------------------+
|                      {"a":{"b":1}}|
+-----------------------------------+

SELECT to_json(map(named_struct('a', 1),named_struct('b', 2)));
+----------------------------------------------------+
|to_json(map(named_struct(a, 1), named_struct(b, 2)))|
+----------------------------------------------------+
|                                     {"[1]":{"b":2}}|
+----------------------------------------------------+

SELECT to_json(map('a', 1));
+------------------+
|to_json(map(a, 1))|
+------------------+
|           {"a":1}|
+------------------+

SELECT to_json(array((map('a', 1))));
+-------------------------+
|to_json(array(map(a, 1)))|
+-------------------------+
|                [{"a":1}]|
+-------------------------+