

Note
The * can be used to specify all columns in the column expression list, while <table name>.* can be used to specify all columns from the given table.
Columns can be double-quoted to use reserved words as column names, e.g., "PERCENT"; or to use special characters in column names, e.g., "key:value".
TOP <n> returns the first n records (up to 20000 records by default, though this limit is configurable).
The grouping expression list may contain column names, aliases, expressions, or positions (e.g., GROUP BY 2 to aggregate on the 2nd column in the SELECT list).
The having expression list may contain grouping expressions or any grouping expression aliases defined in the SELECT list.
The ordering expression list may contain column names, expressions, or column positions (e.g., ORDER BY 2 to order by the 2nd column in the SELECT list). The default ordering is ASC. The default null ordering is NULLS FIRST when using ascending order and NULLS LAST when using descending order. The general format for each comma-separated ordering expression in the list is:
ORDER BY Expression Syntax
<column name/alias/expression/position> [ASC | DESC] [NULLS FIRST | NULLS LAST]
LIMIT applies paging to the result set, starting at the 0-based offset (if specified) and returning num rows records.


Tableless Query
A query without a FROM clause can be used to return a single row of data containing a constant or expression.
For example, to select the current day of the week:


Note
A tableless query will create a result set backed by a replicated table, by default.
Join
The supported join types are:
 INNER - matching rows between two tables
 LEFT - matching rows between two tables, and rows in the left-hand table with no matching rows in the right-hand table
 RIGHT - matching rows between two tables, and rows in the right-hand table with no matching rows in the left-hand table
 FULL OUTER - matching rows between two tables, and rows in both tables with no matching rows in the other
 CROSS - all rows in one table matched against all rows in the other
There are two execution schemes that are used to process joins, depending on the distribution of the joined tables:
 Local - highly performant, but native join criteria must be met
 Distributed - highly flexible, as native join restrictions are lifted, but less performant due to interprocessor communication overhead
Important
Though the data distribution restrictions on native joins do not exist for joins made via SQL, following the join guidelines on sharding will result in much more performant queries.
Kinetica supports both JOIN...ON and WHERE clause syntax for inner joins; all outer join types (LEFT, RIGHT, & FULL OUTER) require JOIN...ON syntax.
For example, to list the name of each employee and the name of the employee's manager, using the WHERE clause to specify the join condition:


To list the name of each employee and the associated manager, even for employees that don't have a manager, using the JOIN...ON syntax to specify the join condition:
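As a rough illustration of the two syntaxes, the sketch below runs both forms against SQLite through Python as a stand-in engine; it is not Kinetica. The employee table, its columns, and its rows are hypothetical.

```python
import sqlite3

# Hypothetical employee table; SQLite stands in for the SQL engine here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, manager_id INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Ann", None), (2, "Bob", 1), (3, "Cy", 1)])

# Inner join expressed with a WHERE clause: employees without a manager drop out.
inner_rows = conn.execute("""
    SELECT e.name, m.name
    FROM employee e, employee m
    WHERE e.manager_id = m.id
    ORDER BY e.id
""").fetchall()

# Outer join expressed with JOIN...ON: every employee is kept.
outer_rows = conn.execute("""
    SELECT e.name, m.name
    FROM employee e
    LEFT JOIN employee m ON e.manager_id = m.id
    ORDER BY e.id
""").fetchall()

print(inner_rows)  # [('Bob', 'Ann'), ('Cy', 'Ann')]
print(outer_rows)  # [('Ann', None), ('Bob', 'Ann'), ('Cy', 'Ann')]
```

The employee with no manager appears only in the outer join result, paired with a null manager name.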


ASOF
Kinetica supports the notion of an inexact match join via the ASOF join function. This feature allows each left-side table record to be matched to a single right-side table record whose join column value is the smallest or largest value within a range relative to the left-side join column value. In the case where multiple right-side table records have the same smallest or largest value for a given left-side table record, only one of the right-side table records will be chosen and returned as part of the join.


The five parameters are:
 left_column - name of the column to join on from the left-side table
 right_column - name of the column to join on from the right-side table
 rel_range_begin - constant value defining the position, relative to each left-side column value, of the beginning of the range in which to match right-side column values; use a negative constant to begin the range before the left-side column value, or a positive one to begin after it
 rel_range_end - constant value defining the position, relative to each left-side column value, of the end of the range in which to match right-side column values; use a negative constant to end the range before the left-side column value, or a positive one to end after it
 MIN|MAX - use MIN to return the right-side matched record with the smallest join column value; use MAX to return the right-side matched record with the greatest join column value
Effectively, each matched right-side column value must be:
 >= <left-side column value> + rel_range_begin
 <= <left-side column value> + rel_range_end
Within the set of right-side matches for each left-side record, the one with the MIN or MAX column value will be returned in the join. In the case of a tie for the MIN or MAX column value, only one right-side record will be selected for return in the join for that left-side record.
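The matching rule above can be modeled in a few lines of plain Python. This is only a sketch of the described semantics, not Kinetica's implementation; the function name asof_match and the sample times (minutes since midnight) are hypothetical.

```python
from typing import Optional, Sequence

def asof_match(left_value: float,
               right_values: Sequence[float],
               rel_range_begin: float,
               rel_range_end: float,
               pick: str = "MIN") -> Optional[float]:
    """Return the right-side value an ASOF-style match would select, or None."""
    lo = left_value + rel_range_begin   # inclusive start of the match range
    hi = left_value + rel_range_end     # inclusive end of the match range
    candidates = [r for r in right_values if lo <= r <= hi]
    if not candidates:
        return None                     # no right-side record falls in the range
    return min(candidates) if pick == "MIN" else max(candidates)

# Hypothetical data: arrival and departure times in minutes since midnight.
arrivals = [600, 720]
departures = [615, 640, 680, 700, 900]

# Soonest departure between 30 and 90 minutes after each arrival.
connections = {a: asof_match(a, departures, 30, 90, "MIN") for a in arrivals}
print(connections)  # {600: 640, 720: None}
```

Passing "MAX" instead would select 680 for the first arrival, the latest departure still inside the range.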
Examples
The following ASOF call might be used to list, for each flight arrival time, the soonest flight departure time that occurs between half an hour and an hour and a half after the arrival; effectively, the time-matching portion of a connecting flight query:


This ASOF call returns right-side locations that are nearest eastward to each left-side location, for locations within 5 degrees of the left-side:


For example, to match a set of stock trades to the opening prices for those stocks (if an opening price record exists within 24 hours prior to the trade), and to include trades for which there is no opening stock price record:


While the ASOF join function can only be used as part of a join, it can effectively be made into a filter condition by subselecting the filter criteria in the FROM clause and joining on that criteria.
For instance, to look up the stock price for a given company as of a given date:


Important
The KI_HINT_NO_LATE_MATERIALIZATION hint is key here: the join requires a materialized table in order to succeed, and this hint ensures one.
Aggregation
The GROUP BY clause can be used to segment data into groups and apply aggregate functions over the values within each group. Aggregation functions applied to data without a GROUP BY clause will be applied over the entire result set.
Note
GROUP BY can operate on columns, column expressions, column aliases, or the position of a member of the SELECT clause (where 1 is the first element).
For example, to find the average cab fare from the taxi data set:


To find the minimum, maximum, & average trip distances, as well as the average passenger count for each vendor per year from the taxi data set (weeding out data with errant trip distances):


Grouping
The GROUP BY clause can also be used to apply the following grouping functions over the values within each group:
 ROLLUP
 CUBE
 GROUPING SETS
With each of these, the GROUPING() aggregate function can be used to distinguish aggregated null values in the data from null values generated by the ROLLUP, CUBE, or GROUPING SETS grouping function.
For instance, the following CASE will turn the aggregated null values in the Sector column into an <UNKNOWN SECTOR> group and the null value generated by the grouping function into an <ALL SECTORS> group:


ROLLUP
The ROLLUP(expr list) function calculates n + 1 aggregates for n number of columns in expr list.
For example, the following query will aggregate the average opening stock price for these groups:
 Each market sector & stock symbol pair
 Each market sector
 All sectors and symbols


CUBE
The CUBE(expr list) function calculates 2^n aggregates for n number of columns in expr list.
For example, the following query will aggregate the average opening stock price for these groups:
 Each market sector & stock symbol pair
 Each market sector
 Each stock symbol
 All sectors and symbols
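The n + 1 and 2^n counts follow from which column combinations each function groups by. The pure-Python sketch below enumerates those combinations; the helper names rollup_sets and cube_sets and the column names are hypothetical, and the empty tuple stands for the grand total over all rows.

```python
from itertools import combinations

def rollup_sets(columns):
    """Grouping sets for a ROLLUP: every prefix of the list, longest first."""
    return [tuple(columns[:k]) for k in range(len(columns), -1, -1)]

def cube_sets(columns):
    """Grouping sets for a CUBE: every subset of the list, largest first."""
    return [combo
            for k in range(len(columns), -1, -1)
            for combo in combinations(columns, k)]

cols = ["sector", "symbol"]
print(rollup_sets(cols))  # [('sector', 'symbol'), ('sector',), ()]
print(cube_sets(cols))    # [('sector', 'symbol'), ('sector',), ('symbol',), ()]
```

With two columns, ROLLUP yields 3 groupings and CUBE yields 4, matching the lists of groups given for each example.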


GROUPING SETS
The GROUPING SETS(expr list) function calculates aggregates for each group of columns in expr list.
For example, the following query will aggregate the average opening stock price for these groups:
 Each market sector
 Each stock symbol
 All sectors and symbols


Window
Window functions are available through the use of the OVER clause, which can partition rows into frames. Different types of functions can be used to aggregate data over a sliding window.


Note
 If the PARTITION BY clause is not specified, the window function will be computed over the entire data set.
 The ORDER BY clause is not required when the window function is either FIRST_VALUE() or LAST_VALUE(). These two functions are also the only ranking functions that can contain a RANGE or ROWS frame clause.
The FIRST_VALUE(), LAST_VALUE(), LEAD(), and LAG() window functions support ignoring or respecting null values in the window calculation using the IGNORE NULLS or RESPECT NULLS syntax respectively. The general format for applying this additional syntax is:


The default ordering of records within each partition is ASC. The default null ordering is NULLS FIRST when using ascending order and NULLS LAST when using descending order. The general format for each comma-separated ordering expression in the list is:


Note
Only one column can be specified in the ordering expression list when using RANGE. When using ROWS, the frame is applied after any ordering; so, while several columns may appear in the order expression list, there will be only one ROWS clause following the list.
When a RANGE frame is specified, CURRENT ROW includes all peer rows (rows with the same ordering values). Thus, when the first of a set of peer rows is encountered, all associated peer rows are included in the frame (not just the first one).
In contrast, when a ROWS frame is specified, CURRENT ROW will direct that only the peer rows up to and including the current row are contained within the frame; the following peer rows will not be included.
The default frame type is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
If a frame clause is specified without a BETWEEN, the clause is applied to the frame start; the frame end will still be the default of CURRENT ROW.
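The difference between RANGE and ROWS handling of peer rows can be seen in a small running-sum sketch. It uses SQLite through Python as a stand-in engine (window functions require SQLite 3.25 or later), not Kinetica; the trip table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trip (ts INTEGER, amount INTEGER)")
# Two rows share ts = 2, so they are peers under ORDER BY ts.
conn.executemany("INSERT INTO trip VALUES (?, ?)",
                 [(1, 10), (2, 20), (2, 30), (3, 40)])

rows = conn.execute("""
    SELECT ts,
           SUM(amount) OVER (ORDER BY ts
               RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS range_sum,
           SUM(amount) OVER (ORDER BY ts
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_sum
    FROM trip
    ORDER BY ts, rows_sum
""").fetchall()

range_sums = [r[1] for r in rows]
rows_sums = [r[2] for r in rows]
print(range_sums)  # [10, 60, 60, 100]
print(rows_sums)
```

Under RANGE, both peer rows see the full sum through ts = 2. Under ROWS, only the later of the two peers does; which peer comes first is up to the engine, so the earlier one sees either 30 or 40.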
For example, to calculate the rolling sum of total amounts collected by each taxi vendor over the course of a given day, as well as the number of other trips that occurred within 5 minutes of each trip:


To calculate a 5-before and 10-after moving average of 4-passenger trip distances per vendor over the course of a given day:


To rank, by vendor, the total amounts collected from 3-passenger trips on a given day:


To compare each trip's total amount to the lowest (ignoring nulls), highest (ignoring nulls), & average total amount for 5-passenger trips for each vendor over the course of a given day:


To compare each vendor's average total amount to their average total amount within the interquartile range:


PIVOT
The PIVOT clause can be used to pivot columns, "rotating" column values into multiple columns (one for each value), creating wider and shorter denormalized tables from longer, more normalized tables.


For example, given a source table phone_number, which lists each phone number for a customer as a separate record in the table, a pivot operation can be performed, creating a single record per customer with the home, work, & cell phone numbers as separate columns.
With this data:
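+------+------------+--------------+
| name | phone_type | phone_number |
+------+------------+--------------+
| Jane | Home       | 1234567890   |
| Jane | Work       | 1112223333   |
| John | Home       | 1234567890   |
| John | Cell       | 3332221111   |
+------+------------+--------------+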
The following pivot operation can be applied:


The data will be pivoted into a table like this:
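+------+------------+------------+------------+
| name | Home_Phone | Work_Phone | Cell_Phone |
+------+------------+------------+------------+
| Jane | 1234567890 | 1112223333 |            |
| John | 1234567890 |            | 3332221111 |
+------+------------+------------+------------+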
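The same rotation can be reproduced without a PIVOT clause by using conditional aggregation. The sketch below does so in SQLite through Python as a stand-in engine; it illustrates the shape of the result and is not Kinetica's PIVOT syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phone_number (name TEXT, phone_type TEXT, phone_number TEXT)")
conn.executemany("INSERT INTO phone_number VALUES (?, ?, ?)", [
    ("Jane", "Home", "1234567890"),
    ("Jane", "Work", "1112223333"),
    ("John", "Home", "1234567890"),
    ("John", "Cell", "3332221111"),
])

# One output column per phone type; MAX() picks the single non-null value per group.
pivoted = conn.execute("""
    SELECT name,
           MAX(CASE WHEN phone_type = 'Home' THEN phone_number END) AS Home_Phone,
           MAX(CASE WHEN phone_type = 'Work' THEN phone_number END) AS Work_Phone,
           MAX(CASE WHEN phone_type = 'Cell' THEN phone_number END) AS Cell_Phone
    FROM phone_number
    GROUP BY name
    ORDER BY name
""").fetchall()

print(pivoted)
# [('Jane', '1234567890', '1112223333', None), ('John', '1234567890', None, '3332221111')]
```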
UNPIVOT
The UNPIVOT clause can be used to unpivot columns, "rotating" row values into column values, creating longer, more normalized tables from shorter, more denormalized tables.


For example, given a source table customer_contact, which lists the home, work, & cell phone numbers for each customer in the table, an unpivot operation can be performed, creating separate home, work, & cell phone records for each customer.
With this data:
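+------+------------+------------+------------+
| name | home_phone | work_phone | cell_phone |
+------+------------+------------+------------+
| Jane | 1234567890 | 1112223333 |            |
| John | 1234567890 |            | 3332221111 |
+------+------------+------------+------------+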
The following unpivot operation can be applied:


The data will be unpivoted into a table like this:
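+------+--------------+------------+
| name | phone_number | phone_type |
+------+--------------+------------+
| Jane | 1234567890   | Home       |
| Jane | 1112223333   | Work       |
| Jane |              | Cell       |
| John | 1234567890   | Home       |
| John |              | Work       |
| John | 3332221111   | Cell       |
+------+--------------+------------+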
Note
If the original column names can be used as the values of the unpivot key, as is, the preselection and renaming of those columns using a subquery in the FROM clause can be eliminated.
For example, unpivoting without aliasing the quarterly grade columns will result in those exact column names being used as the quarter values:




Set Operations
There are three types of supported set operations, each having the option of returning duplicate records in the result set by using the keyword ALL:
 UNION [ALL] - return all records from both source data sets
 INTERSECT [ALL] - return only records that exist in both source data sets
 EXCEPT [ALL] - return all records that exist in the first data set, but not in the second
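A quick way to compare the three operators is to run them over the same pair of tables. The sketch below uses SQLite through Python as a stand-in engine, not Kinetica; the lunch_menu and dinner_menu tables and their rows are hypothetical. SQLite supports ALL only with UNION, so INTERSECT ALL and EXCEPT ALL are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lunch_menu (food_name TEXT, price INTEGER)")
conn.execute("CREATE TABLE dinner_menu (food_name TEXT, price INTEGER)")
conn.executemany("INSERT INTO lunch_menu VALUES (?, ?)",
                 [("soup", 5), ("salad", 7), ("burger", 9)])
conn.executemany("INSERT INTO dinner_menu VALUES (?, ?)",
                 [("soup", 5), ("salad", 8), ("steak", 20)])

def run(op):
    """Apply one set operator between the two menus; sort for stable output."""
    sql = (f"SELECT food_name, price FROM lunch_menu {op} "
           "SELECT food_name, price FROM dinner_menu")
    return sorted(conn.execute(sql).fetchall())

union = run("UNION")          # soup appears once; both salad prices are kept
union_all = run("UNION ALL")  # all six rows, including the duplicate soup
intersect = run("INTERSECT")  # only the item with the same name and price
except_ = run("EXCEPT")       # lunch rows with no exact dinner match

print(len(union), len(union_all))  # 5 6
print(intersect)                   # [('soup', 5)]
print(except_)                     # [('burger', 9), ('salad', 7)]
```

Because price is among the selected columns, the salad counts as two different records and so survives both UNION and EXCEPT.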
UNION
The UNION set operator creates a single list of records from the results of two SELECT statements. Use the ALL keyword to keep all records from both sets; omit it to remove duplicate records and form a single list of records unique between the two sets. See Limitations and Cautions for limitations.


For example, given a table of lunch menu items and another table of dinner menu items, a UNION can be used to return all unique lunch & dinner menu items together, including items that are the same on both menus, but of a different price:


Note
Since the example includes price and all columns selected must match between the two sets for an item to be considered a duplicate, a lunch item that is priced differently as a dinner item would also appear in the result set.
A UNION ALL can be used to return all lunch & dinner menu items together, including duplicates:


INTERSECT
The INTERSECT set operator creates a single list of records that exist in both of the result sets from two SELECT statements. Use the ALL keyword to keep duplicate records that exist in both sets; omit it to remove duplicate records and form a single list of records that exist in both sets. See Limitations for limitations.


For example, given a table of lunch menu items and another table of dinner menu items, an INTERSECT can be used to return all lunch menu items (excluding duplicates) that are also dinner items for the same price:


Note
Since the example includes price and all columns selected must match between the two sets for an item to be included, a lunch item that is priced differently as a dinner item would not appear in the result set.
EXCEPT
The EXCEPT set operator performs set subtraction, creating a single list of records that exist in the first SELECT statement's result set, but not in the second SELECT statement's result set. Use the ALL keyword to keep duplicate records that exist in the first set but not in the second; omit it to remove duplicate records and form a single list of records that exist in the first set but not the second. See Limitations for limitations:


For example, given a table of lunch menu items and another table of dinner menu items, an EXCEPT can be used to return all lunch menu items (excluding duplicates) that are not also dinner items for the same price:


Note
Since the example includes price and all columns selected must match between the two sets for an item to be eliminated, a lunch item that is priced differently as a dinner item would still appear in the result set.
WITH (Common Table Expressions)
The WITH operation, also known as a Common Table Expression (CTE), creates a set of data that can be assigned table & column aliases and used one or more times in subsequent operations. The aliased set can be used within the SELECT, FROM, or WHERE clauses of a subsequent query or a subsequent CTE within the same WITH operation.
Recursive WITH operations are not supported; the aliased set cannot refer to itself. The column aliases must be unique within the WITH statement; no other column or column alias can be similarly named, for example. Also, when used in a FROM clause and given a table alias, the table alias must be preceded with AS.
A CTE can be made available to a DML or DDL statement by having the WITH statement follow the CREATE TABLE...AS, INSERT, UPDATE, or DELETE statement (not precede it).
Each CTE definition within a WITH statement is structured as follows:


Here, <cte name> is the table alias given to the data set returned by the CTE, and <column alias list> is the list of aliases assigned to the columns that the CTE returns. Each column alias is matched to each column returned in the order that each were defined; e.g., for column aliases (A, B, C) used with a CTE which performs a SELECT x, y, z ..., alias A will reference column x, B will reference y, and C will reference z. If no aliases are used, the names of the source columns themselves can be used to reference the data set returned by the CTE.
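The positional mapping of column aliases can be checked with a small sketch. It runs a CTE in SQLite through Python as a stand-in engine, not Kinetica; the src table, the CTE name totals, and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (x INTEGER, y INTEGER, z INTEGER)")
conn.executemany("INSERT INTO src VALUES (?, ?, ?)", [(1, 2, 3), (4, 5, 6)])

# Aliases (A, B, C) map by position onto the selected columns x, y, z.
rows = conn.execute("""
    WITH totals (A, B, C) AS (
        SELECT x, y, z FROM src
    )
    SELECT A, B + C AS b_plus_c
    FROM totals
    ORDER BY A
""").fetchall()

print(rows)  # [(1, 5), (4, 11)]
```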
Each WITH statement can contain one or more CTE definitions, followed by a SELECT statement, as shown here:




To apply the CTE to an INSERT statement, follow the INSERT clause with the WITH clause:


Iteration
Kinetica supports iteration over each record within a data set for the purpose of creating a result set with 0 to N result records per record in the original set.
This iteration can be variable, based on some value within each record, or fixed, based on a given constant value.
The iteration is performed by joining against the virtual ITER table, as follows:


The <column expression> can be replaced by a constant for fixed iteration.
For example, to extract all of the individual letters from a column of words, with one record per letter extracted (using variable iteration):


To duplicate the set of words five times (using fixed iteration):
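Both kinds of iteration can be modeled in plain Python to show the shape of the result. This sketch only mirrors the described semantics and does not use the ITER table itself; the helper name iterate, the 0-based index, and the sample words are assumptions.

```python
def iterate(records, count_of):
    """Yield (record, i) pairs, with i running from 0 to count_of(record) - 1."""
    for record in records:
        for i in range(count_of(record)):
            yield record, i

words = ["cat", "to"]

# Variable iteration: one result per letter, driven by each word's length.
letters = [word[i] for word, i in iterate(words, len)]

# Fixed iteration: every word repeated a constant five times.
copies = [word for word, _ in iterate(words, lambda _word: 5)]

print(letters)      # ['c', 'a', 't', 't', 'o']
print(len(copies))  # 10
```

A record whose count is 0 contributes no rows, which is how a result set can hold 0 to N records per source record.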


For more detail, examples, and limitations, see Iteration.
Constants
Each data type has an associated literal constant syntax, which can be used, for instance, to insert constant data values into those columns.
Numeric Constants
Integer and floating point data types can be either single-quoted or not.


String-Based Constants
String-based data types should be single-quoted.


Binary Constants
Binary types can be represented in either of the following forms:
 single-quoted or unquoted base-10
 single-quoted hexadecimal


Date/Time Constants
Kinetica accepts unqualified single-quoted date/time values, ANSI SQL, and ODBC escape sequences in the following formats:
| Data Type | Native | ANSI | ODBC |
| --- | --- | --- | --- |
| Date | 'YYYY-[M]M-[D]D' | DATE 'YYYY-MM-DD' | {d 'YYYY-MM-DD'} |
| Time | '[H]H24:MI:SS.mmm' | TIME 'HH24:MI:SS.mmm' | {t 'HH24:MI:SS.mmm'} |
| DateTime | 'YYYY-[M]M-[D]D[T ][H]H24:MI:SS.mmm[Z]' | TIMESTAMP 'YYYY-MM-DD HH24:MI:SS.mmm' | {ts 'YYYY-MM-DD HH24:MI:SS.mmm'} |
| Timestamp | 'YYYY-[M]M-[D]D [H]H24:MI:SS.mmm' | TIMESTAMP 'YYYY-MM-DD HH24:MI:SS.mmm' | {ts 'YYYY-MM-DD HH24:MI:SS.mmm'} |






Expressions
An expression can consist of a literal constant, a column name, or a function applied to a constant or column name. A compound expression is an operation or function applied to one or more expressions.
The following are the supported expression operators:
 + addition
 - subtraction
 * multiplication
 / division
 () grouping
 || string concatenation
Note
Use double quotes to specify column names in a case-sensitive manner.
Conditional Functions
Conditional functions are subject to short-circuiting to aid in error-checking.
| Function | Description |
| --- | --- |
| DECODE(expr, match_a, value_a, ..., match_N, value_N[, unmatched_value]) | Evaluates expr: returns the first value whose corresponding match is equal to expr; returns the optional unmatched_value (or null), if no match is found |
| IF(expr, value_if_true, value_if_false) | Evaluates expr: if true, returns value_if_true; otherwise, if false or null, returns value_if_false |

CASE
The case statement acts as a scalar function, but has two more complex forms. Note that for each of these CASE statements, the value expressions must all be of the same or convertible data type.
In the first form, each WHEN is followed by a conditional expression whose corresponding THEN expression will have its value returned, if true. Control will continue through each WHEN until a match is found and the corresponding value returned; if no match is found, the value of the ELSE expression will be returned, or null, if no ELSE clause exists.


In the second form, the CASE expression is evaluated. A match of that result will be attempted against each WHEN expression until a match is found and the value of the corresponding THEN expression returned; if no match is found, the value of the ELSE expression will be returned, or null, if no ELSE clause exists.
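The two forms can be compared side by side in a small sketch. It uses SQLite through Python as a stand-in engine, not Kinetica; the stock table, the sector codes, and the labels are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (id INTEGER, sector TEXT)")
conn.executemany("INSERT INTO stock VALUES (?, ?)",
                 [(1, "TECH"), (2, "FIN"), (3, None)])

rows = conn.execute("""
    SELECT sector,
           -- First form: each WHEN carries its own condition
           CASE WHEN sector IS NULL THEN '<UNKNOWN SECTOR>'
                ELSE sector
           END AS searched,
           -- Second form: the CASE expression is matched against each WHEN value
           CASE sector
                WHEN 'TECH' THEN 'Technology'
                WHEN 'FIN'  THEN 'Finance'
                ELSE 'Other'
           END AS simple
    FROM stock
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

In this sketch the null sector falls through to ELSE in the second form, because a null does not compare equal to any WHEN value; the first form can test for it explicitly with IS NULL.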


Note
The second form has greater optimization than the first.




Conversion Functions
| Function | Description |
| --- | --- |
| CAST(expr AS [SQL_]<conv_type>) or CONVERT(expr, [SQL_]<conv_type>) | Converts expr into conv_type data type. Conversion Types: Note: When using the SQL_ prefix, UNSIGNED BIGINT becomes SQL_UNSIGNED_BIGINT |
| CHAR(expr) | Returns the character associated with the ASCII code given in expr |
| CHAR1(expr) | Converts the given expr to VARCHAR(1) type |
| CHAR2(expr) | Converts the given expr to VARCHAR(2) type |
| CHAR4(expr) | Converts the given expr to VARCHAR(4) type |
| CHAR8(expr) | Converts the given expr to VARCHAR(8) type |
| CHAR16(expr) | Converts the given expr to VARCHAR(16) type |
| CHAR32(expr) | Converts the given expr to VARCHAR(32) type |
| CHAR64(expr) | Converts the given expr to VARCHAR(64) type |
| CHAR128(expr) | Converts the given expr to VARCHAR(128) type |
| CHAR256(expr) | Converts the given expr to VARCHAR(256) type |
| DATE(expr) | Converts expr to date (YYYY-MM-DD) format |
| DATETIME(expr) | Converts expr to datetime (YYYY-MM-DD HH24:MI:SS.mmm) format |
| DECIMAL(expr) | Converts the given expr to DECIMAL type |
| DOUBLE(expr) | Converts the given expr to DOUBLE type |
| FLOAT(expr) | Converts the given expr to REAL type |
| INT(expr) | Converts the given expr to INTEGER type |
| LONG(expr) | Converts the given expr to BIGINT type |
| TIME(expr) | Converts expr to time (HH24:MI:SS) format |
| TIMESTAMP(expr) | Converts expr to the number of milliseconds since the epoch |
| TO_CHAR(expr, format) | Converts the given date/time expr to a string matching the given format. The format can be a string literal or expression. Arbitrary text can be injected into the format string using double-quotes. The returned string will be truncated at 32 characters. Valid format codes include: Example: |
| ULONG(expr) | Converts the given expr to UNSIGNED BIGINT type |
Date/Time Functions
This section comprises the following functions:
 Date/Time Base Functions, which can extract parts of date/time expressions, convert back and forth between data types, and return the current date/time
 Date/Time Complex Conversion Functions, which can perform more complex date/time conversions
Date/Time Base Functions
| Function | Description |
| --- | --- |
| CURRENT_DATE() | Returns the date as YYYY-MM-DD |
| CURRENT_DATETIME() | Returns the date & time as YYYY-MM-DD HH24:MI:SS.mmm |
| CURRENT_TIME() | Returns the time as HH24:MI:SS.mmm |
| CURRENT_TIMESTAMP() | Returns the date & time as YYYY-MM-DD HH24:MI:SS.mmm; to return the date & time as the number of milliseconds since the epoch, pass the result of this function to LONG() |
| DATEADD(unit, amount, expr) | Adds the positive or negative integral amount of unit date/time intervals to the date/time in expr. The following date/time intervals are supported for unit: Note: Any of these unit types can have a SQL_TSI_ prefix prepended to them; e.g., both DAY and SQL_TSI_DAY are valid unit types for specifying a day interval. Examples: |
| DATEDIFF(expr_begin, expr_end) | Determines the difference between two dates, irrespective of time component, as the number of days when expr_begin is subtracted from expr_end; returns a negative number of days if expr_begin occurs after expr_end |
| DAY(expr) | Alias for DAYOFMONTH(expr) |
| DAYNAME(expr) | Extracts the day of the week from expr and converts it to the corresponding day name [Sunday - Saturday] |
| DAYOFMONTH(expr) | Extracts the day of the month from expr [1 - 31] |
| DAYOFWEEK(expr) | Extracts the day of the week from expr [1 - 7] |
| DAY_OF_WEEK(expr) | Alias for DAYOFWEEK(expr) |
| DAYOFYEAR(expr) | Extracts the day of the year from expr [1 - 366] |
| DAY_OF_YEAR(expr) | Alias for DAYOFYEAR(expr) |
| HOUR(expr) | Extracts the hour of the day from expr [0 - 23] |
| <expr> + INTERVAL '<amount>' <part>, <expr> - INTERVAL '<amount>' <part> | Adds to or subtracts from the date/time expr the integral amount units of type part. This mirrors the behavior of the TIMESTAMPADD function, only with a different format and different date/time part constants. The following date/time constants are supported for part: |
| LAST_DAY(date) | Returns the date of the last day of the month in the given date |
| MINUTE(expr) | Extracts the minute of the day from expr [0 - 59] |
| MONTH(expr) | Extracts the month of the year from expr [1 - 12] |
| MONTHNAME(expr) | Extracts the month of the year from expr and converts it to the corresponding month name [January - December] |
| MSEC(expr) | Extracts the millisecond of the second from expr [0 - 999] |
| NEXT_DAY(date, day_of_week) | Returns the date of the next day of the week, provided as a day name in day_of_week, that occurs after the given date. Some examples, given that 2000-10-10 is a Tuesday: |
| NOW() | Alias for CURRENT_DATETIME() |
| QUARTER(expr) | Extracts the quarter of the year from expr [1 - 4] |
| SECOND(expr) | Extracts the seconds of the minute from expr [0 - 59] |
| SEC(expr) | Alias for SECOND(expr) |
| TIMESTAMPADD(unit, amount, expr) | Adds the positive or negative integral amount of unit date/time intervals to the date/time in expr. The following date/time intervals are supported for unit: Note: Any of these unit types can have a SQL_TSI_ prefix prepended to them; e.g., both DAY and SQL_TSI_DAY are valid unit types for specifying a day interval. Examples: |
| TIMESTAMPDIFF(unit, begin, end) | Calculates the difference between two date/time expressions, returning the result as an integral difference in the units specified; more precisely, how many whole date/time intervals of type unit need to be added to (or subtracted from) begin to equal end (or get as close as possible without going past it) using the unit types and rules specified in TIMESTAMPADD. Note: This is not symmetric with TIMESTAMPADD in all cases, as adding 1 MONTH to Mar 31st results in Apr 30th, but the TIMESTAMPDIFF in MONTH units between those two dates is 0. Examples: |
| WEEK(expr) | Extracts the week of the year from expr [1 - 54]; each full week starts on Sunday (a 1 is returned for the week containing Jan 1st) |
| YEAR(expr) | Extracts the year from expr; 4-digit year, A.D. |
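The month asymmetry noted for TIMESTAMPDIFF can be reproduced with a small pure-Python model. This is a sketch that matches the documented Mar 31st / Apr 30th outcome, not Kinetica's implementation; the helper names, the day-of-month comparison rule, and the year 2023 are assumptions, and the difference helper only covers the case where begin does not occur after end.

```python
import calendar
from datetime import date

def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    idx = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(idx, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))

def whole_months_between(begin: date, end: date) -> int:
    """Whole months from begin to end (begin <= end), judged by day-of-month."""
    months = (end.year - begin.year) * 12 + (end.month - begin.month)
    if end.day < begin.day:
        months -= 1   # the final month is not complete yet
    return months

mar31 = date(2023, 3, 31)
apr30 = add_months(mar31, 1)
print(apr30)                               # 2023-04-30
print(whole_months_between(mar31, apr30))  # 0
```

Adding a month clamps the 31st to the 30th, while the difference counts April as incomplete because the 30th precedes the 31st, hence 0.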
Date/Time Complex Conversion Functions
| Function | Description |
| --- | --- |
| DATE_TO_EPOCH_MSECS(year, month, day, hour, min, sec, msec) | Converts the full date to milliseconds since the epoch; negative values are accepted. Example: |
| DATE_TO_EPOCH_SECS(year, month, day, hour, min, sec) | Converts the full date to seconds since the epoch; negative values are accepted. Example: |
| MSECS_SINCE_EPOCH(timestamp) | Converts the timestamp to milliseconds since the epoch. Example: |
| TIMESTAMP_FROM_DATE_TIME(date, time) | Converts the given date and time to a composite date/time format. Example: |
| WEEK_TO_EPOCH_MSECS(year, week_number) | Converts the year and week number to milliseconds since the epoch; negative values are accepted. Example: |
| WEEK_TO_EPOCH_SECS(year, week_number) | Converts the year and week number to seconds since the epoch. Negative values are accepted. Each new week begins Sunday at midnight. Example: |

Error-Checking Functions
Errors will only be successfully caught if the expressions are floating-point types (i.e., double and float).
| Function | Description |
| --- | --- |
| IFERROR(expr, val) | Alias for IF_ERROR(expr, val) |
| IFINF(expr, val) | Alias for IF_INF(expr, val) |
| IFINFINITY(expr, val) | Alias for IF_INF(expr, val) |
| IFNAN(expr, val) | Alias for IF_NAN(expr, val) |
| IF_ERROR(expr, val) | Evaluates the given expr and if it resolves to infinity or NaN, return val. Tip: Conceptually, this function is the same as IF_INF(IF_NAN(expr, val), val). Example: |
| IF_INF(expr, val) | Evaluates the given expr and if it resolves to infinity, return val. Example: |
| IF_INFINITY(expr, val) | Alias for IF_INF(expr, val) |
| IF_NAN(expr, val) | Evaluates the given expr and if it resolves to NaN, return val. Example: |
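The relationship between the three base functions can be modeled in plain Python. This is only a sketch of the described behavior for floating-point inputs; the lowercase helper names are hypothetical, and treating negative infinity as infinity is an assumption.

```python
import math

def if_nan(x: float, val: float) -> float:
    """Return val when x is NaN, otherwise x."""
    return val if math.isnan(x) else x

def if_inf(x: float, val: float) -> float:
    """Return val when x is infinite (either sign, by assumption), otherwise x."""
    return val if math.isinf(x) else x

def if_error(x: float, val: float) -> float:
    """Composition described in the tip: IF_INF(IF_NAN(expr, val), val)."""
    return if_inf(if_nan(x, val), val)

print(if_error(float("inf"), 0.0))   # 0.0
print(if_error(float("nan"), -1.0))  # -1.0
print(if_error(2.5, 0.0))            # 2.5
```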

Geospatial/Geometry Functions
Tip
 Use ST_ISVALID to determine if a geometry object is valid. The functions below work best with valid geometry objects.
 Use the REMOVE_NULLABLE function to remove any nullable column types that could result from calculating a derived column (e.g., as in Projections) using one of the functions below.
Enhanced Performance Scalar Functions
The functions below all compare x and y coordinates to geometry objects (or vice versa), thus increasing their performance in queries. Each of these functions has a geometry-to-geometry version listed in the next section.
| Function | Description |
| --- | --- |
| STXY_CONTAINS(geom, x, y) | Returns 1 (true) if geom contains the x and y coordinate, i.e., the coordinate lies in the interior of geom. The coordinate cannot be on the boundary and also be contained because geom does not contain its boundary |
| STXY_CONTAINSPROPERLY(geom, x, y) | Returns 1 (true) if the x and y coordinate intersects the interior of geom but not the boundary (or exterior) because geom does not contain its boundary but does contain itself |
| STXY_COVEREDBY(x, y, geom) | Returns 1 (true) if the x and y coordinate is covered by geom |
| STXY_COVERS(geom, x, y) | Returns 1 (true) if geom covers the x and y coordinate |
| STXY_DISJOINT(x, y, geom) | Returns 1 (true) if the given x and y coordinate and the geometry geom do not spatially intersect. |
| STXY_DISTANCE(x, y, geom[, solution]) | Calculates the minimum distance between the given x and y coordinate and geom using the specified solution type. Solution types available: Note: If the x and y coordinate and geom intersect (verify using ST_INTERSECTS), the distance will always be 0. |
| STXY_DWITHIN(x, y, geom, distance[, solution]) | Returns 1 (true) if the x and y coordinate is within the specified distance from geom using the specified solution type. Solution types available: |
| STXY_ENVDWITHIN(x, y, geom, distance[, solution]) | Returns 1 (true) if the x and y coordinate is within the specified distance from the bounding box of geom using the specified solution type. Solution types available: |
| STXY_ENVINTERSECTS(x, y, geom) | Returns 1 (true) if the bounding box of the given geometry geom intersects the x and y coordinate. |
| STXY_H3(x, y, resolution) | Returns an H3 grid index, similar to a geohash, for the cell containing the x and y coordinate, with the given resolution. The higher the resolution, the more precise the index is. The resolution must be an integer between 0 and 15. |
| STXY_INTERSECTION(x, y, geom) | Returns the shared portion between the x and y coordinate and the given geometry geom, i.e., the point itself. |
| STXY_INTERSECTS(x, y, geom) | Returns 1 (true) if the x and y coordinate and geom intersect in 2-D. |
| STXY_TOUCHES(x, y, geom) | Returns 1 (true) if the x and y coordinate and geometry geom have at least one point in common but their interiors do not intersect. If geom is a GEOMETRYCOLLECTION, a 0 is returned regardless of whether the point and geometry touch |
| STXY_WITHIN(x, y, geom) | Returns 1 (true) if the x and y coordinate is completely inside the geom geometry, i.e., not on the boundary |
Scalar Functions
Function  Description 

DIST(x1, y1, x2, y2)  Computes the Euclidean distance (in degrees), i.e. SQRT( (x1-x2)*(x1-x2) + (y1-y2)*(y1-y2) ). 
GEODIST(lon1, lat1, lon2, lat2)  Computes the geographic great-circle distance (in meters) between two lat/lon points. 
GEOHASH_DECODE_LATITUDE(geohash)  Decodes a given geohash and returns the latitude value for the given hash string. Supports a maximum geohash character length of 16. 
GEOHASH_DECODE_LONGITUDE(geohash)  Decodes a given geohash and returns the longitude value for the given hash string. Supports a maximum geohash character length of 16. 
GEOHASH_ENCODE(lat, lon, precision)  Encodes a given coordinate pair and returns a hash string with a given precision. 
ST_ADDPOINT(linestring, point, position)  Adds a given point geometry to the given linestring geometry at the specified position, which is a 0-based index. 
ST_ALMOSTEQUALS(geom1, geom2, decimal)  Returns 1 (true) if given geometries, geom1 and geom2, are almost spatially equal within the given amount of decimal scale. Note that geometries will still be considered equal if the decimal scale for the geometries is within a half order of magnitude of each other, e.g., if decimal is set to 2, then POINT(63.4 123.45) and POINT(63.4 123.454) are equal, but POINT(63.4 123.45) and POINT(63.4 123.459) are not equal. The geometry types must match to be considered equal. 
ST_AREA(geom[, solution])  Returns the area of the given geometry geom if it is a POLYGON or MULTIPOLYGON using the specified solution type. Returns 0 if the input geometry type is (MULTI)POINT or (MULTI)LINESTRING. Solution types available:

ST_AZIMUTH(geom1, geom2)  Returns the azimuth in radians defined by the segment between two POINTs, geom1 and geom2. Returns a null if the input geometry type is MULTIPOINT, (MULTI)LINESTRING, or (MULTI)POLYGON. 
ST_BOUNDARY(geom)  Returns the closure of the combinatorial boundary of a given geometry geom. Returns an empty geometry if geom is an empty geometry. Returns a null if geom is a GEOMETRYCOLLECTION 
ST_BOUNDINGDIAGONAL(geom)  Returns the diagonal of the given geometry's (geom) bounding box. 
ST_BUFFER(geom, radius[, style[, solution]])  Returns a geometry that represents all points whose distance from the given geometry geom is less than or equal to the given distance radius. The radius units can be specified by the solution type (default is in degrees) and the radius is created in the provided style. The style options are specified as a list of blank-separated key-value pairs, e.g., 'quad_segs=8 endcap=round'. If an empty style list ('') is provided, the default settings will be used. The style parameter must be specified to provide a solution type. Available style options:
Available solution types:
Tip: To create a 5-meter buffer around geom using the default styles: ST_BUFFER(geom, 5, '', 1). To create a 5-foot (converting feet to meters) buffer around geom using the following styles: ST_BUFFER(geom, 5*0.3048, 'quad_segs=4 endcap=flat', 1) 
ST_CENTROID(geom)  Calculates the center of the given geometry geom as a POINT. For (MULTI)POINTs, the center is calculated as the average of the input coordinates. For (MULTI)LINESTRINGs, the center is calculated as the weighted length of each given LINESTRING. For (MULTI)POLYGONs, the center is calculated as the weighted area of each given POLYGON. If geom is an empty geometry, an empty GEOMETRYCOLLECTION is returned 
ST_CLIP(geom1, geom2)  Returns the geometry shared between given geometries geom1 and geom2 
ST_CLOSESTPOINT(geom1, geom2[, solution])  Calculates the 2D POINT in geom1 that is closest to geom2 using the specified solution type. If geom1 or geom2 is empty, a null is returned. Solution types available:

ST_COLLECT(geom1, geom2)  Returns a MULTI* or GEOMETRYCOLLECTION comprising geom1 and geom2. If geom1 and geom2 are the same, singular geometry type, a MULTI* is returned, e.g., if geom1 and geom2 are both POINTs (empty or not), a MULTIPOINT is returned. If geom1 and geom2 are neither the same type nor singular geometries, a GEOMETRYCOLLECTION is returned. 
ST_COLLECTIONEXTRACT(collection, type)  Returns only the specified type from the given geometry collection. Type is a number that maps to the following:

ST_COLLECTIONHOMOGENIZE(collection)  Returns the simplest form of the given collection, e.g., a collection with a single POINT will be returned as POINT(x y), and a collection with multiple individual points will be returned as a MULTIPOINT. 
ST_CONCAVEHULL(geom, target_percent[, allow_holes])  Returns a potentially concave geometry that encloses all geometries found in the given geom set. Use target_percent (values between 0 and 1) to determine the percent of area of a convex hull the concave hull will attempt to fill; 1 will return the same geometry as an ST_CONVEXHULL operation. Set allow_holes to 1 (true) to allow holes in the resulting geometry; default value is 0 (false). Note that allow_holes is independent of the area of target_percent. 
ST_CONTAINS(geom1, geom2)  Returns 1 (true) if no points of geom2 lie in the exterior of geom1 and at least one point of geom2 lies in the interior of geom1. Note that geom1 does not contain its boundary but does contain itself. 
ST_CONTAINSPROPERLY(geom1, geom2)  Returns 1 (true) if geom2 intersects the interior of geom1 but not the boundary (or exterior). Note that geom1 does not contain its boundary but does contain itself. 
ST_CONVEXHULL(geom)  Returns the minimum convex geometry that encloses all geometries in the given geom set. 
ST_COORDDIM(geom)  Returns the coordinate dimension of the given geom, e.g., a geometry with x, y, and z coordinates would return 3. 
ST_COVEREDBY(geom1, geom2)  Returns 1 (true) if no point in geom1 is outside geom2. 
ST_COVERS(geom1, geom2)  Returns 1 (true) if no point in geom2 is outside geom1. 
ST_CROSSES(geom1, geom2)  Returns 1 (true) if the given geometries, geom1 and geom2, spatially cross, meaning some but not all interior points in common. If geom1 and/or geom2 are a GEOMETRYCOLLECTION, a 0 is returned regardless if the two geometries cross 
ST_DIFFERENCE(geom1, geom2)  Returns a geometry that represents the part of geom1 that does not intersect with geom2. 
ST_DIMENSION(geom)  Returns the dimension of the given geometry geom, which is less than or equal to the coordinate dimension. If geom is a single geometry, a 0 is for POINT, a 1 is for LINESTRING, and a 2 is for POLYGON. If geom is a collection, it will return the largest dimension from the collection. If geom is empty, 0 is returned. 
ST_DISJOINT(geom1, geom2)  Returns 1 (true) if the given geometries, geom1 and geom2, do not spatially intersect. 
ST_DISTANCE(geom1, geom2[, solution])  Calculates the minimum distance between the given geometries, geom1 and geom2, using the specified solution type. Solution types available:
Note: If geom1 and geom2 intersect (verify using ST_INTERSECTS), the distance will always be 0. 
ST_DISTANCEPOINTS(x1, y1, x2, y2[, solution])  Calculates the minimum distance between the given points, x1, y1 and x2, y2, using the specified solution type. Solution types available:

ST_DFULLYWITHIN(geom1, geom2, distance[, solution])  Returns 1 (true) if the maximum distance between geometries geom1 and geom2 is less than or equal to the specified distance of each other using the specified solution type. If geom1 or geom2 is null, 0 (false) is returned. Solution types available:

ST_DWITHIN(geom1, geom2, distance[, solution])  Returns 1 (true) if the minimum distance between geometries geom1 and geom2 is within the specified distance of each other using the specified solution type. Solution types available:

ST_ELLIPSE(centerx, centery, height, width)  Returns an ellipse using the following values:

ST_ENDPOINT(geom)  Returns the last point of the given geom as a POINT if it's a LINESTRING. If geom is not a LINESTRING, null is returned. 
ST_ENVDWITHIN(geom1, geom2, distance[, solution])  Returns 1 (true) if geom1 is within the specified distance of the bounding box of geom2 using the specified solution type. Solution types available:

ST_ENVELOPE(geom)  Returns the bounding box of a given geometry geom. 
ST_ENVINTERSECTS(geom1, geom2)  Returns 1 (true) if the bounding box of the given geometries, geom1 and geom2, intersect. 
ST_EQUALS(geom1, geom2)  Returns 1 (true) if the given geometries, geom1 and geom2, are spatially equal. Note that order does not matter. 
ST_EQUALSEXACT(geom1, geom2, tolerance)  Returns 1 (true) if the given geometries, geom1 and geom2, are almost spatially equal within some given tolerance. If the values within the given geometries are within the tolerance value of each other, they're considered equal, e.g., if tolerance is 2, POINT(1 1) and POINT(1 3) are considered equal, but POINT(1 1) and POINT(1 3.1) are not. Note that the geometry types have to match for them to be considered equal. 
ST_ERASE(geom1, geom2)  Returns the result of erasing a portion of geom1 equal to the size of geom2. 
ST_EXPAND(geom, units)  Returns the bounding box expanded in all directions by the given units of the given geom. The expansion can also be defined for separate directions by providing separate parameters for each direction, e.g., ST_EXPAND(geom, unitsx, unitsy, unitsz, unitsm). 
ST_EXPANDBYRATE(geom, rate)  Returns the bounding box expanded by a given rate (a ratio of width and height) for the given geometry geom. The rate must be between 0 and 1. 
ST_EXTERIORRING(geom)  Returns a LINESTRING representing the exterior ring of the given POLYGON geom 
ST_FORCE2D(geom)  Returns the 2-dimensional version (e.g., X and Y coordinates) of geom, the provided geometry or set of geometries (e.g., via GEOMETRYCOLLECTION or WKT column name). 
ST_FORCE3D(geom[, z])  Returns the 3-dimensional version (e.g., X, Y, and Z coordinates) of geom, a provided geometry or set of geometries (e.g., via GEOMETRYCOLLECTION or WKT column name), using z as the geometry's new z-value. The provided z-values can also be derived from a numeric column. If no z is provided, a 0 will be applied. Note: If a WKT column is provided for geom and a numeric column is provided for z, the z values will be matched to the provided geometries by row in the source table. If a singular geometry is provided for geom and a column is provided for z, three-dimensional versions of the provided geometry will be returned for each z value found in the provided z column. If columns are provided for both geom and z and nulls are present in either column, the row containing null values will be skipped in the results. 
ST_GENERATEPOINTS(geom, num)  Creates a MULTIPOINT containing a number num of randomly generated points within the boundary of geom. 
ST_GEOHASH(geom, precision)  Returns a hash string representation of the given geometry geom with specified precision (the length of the geohash string). The longer the precision, the more precise the hash is. By default, precision is set to 20; the max for precision is 32. Returns null if geom is an empty geometry. Note The value returned will not be a geohash of the exact geometry but a geohash of the centroid of the given geometry 
ST_GEOMETRYN(geom, index)  Returns the geometry at position index within the given geom geometry. The index ranges from 1 to the number of geometries in geom. 
ST_GEOMETRYTYPE(geom)  Returns the type of geometry from the given geom. 
ST_GEOMETRYTYPEID(geom)  Returns the type ID of geom. Type and ID mappings:

ST_GEOMFROMGEOHASH(geohash, precision)  Returns a POLYGON boundary box using the given geohash with a precision set by the integer precision. If precision is specified, the function will use as many characters in the hash equal to precision to create the geometry. If no precision is specified, the full length of the geohash is used. 
ST_GEOMFROMH3(h3_index)  Returns a POLYGON boundary box of the hex grid identified by the given h3_index. 
ST_GEOMFROMTEXT(wkt)  Returns a geometry from the given Well-Known Text representation wkt. Note that this function is only compatible with constants. 
ST_H3(wkt, resolution)  Returns an H3 grid index, similar to a geohash, for the cell containing the centroid of the wkt, with the given resolution. The higher the resolution, the more precise the index is. The resolution must be an integer between 0 and 15. 
ST_HEXGRID(xmin, ymin, xmax, ymax, cell_side[, limit])  Creates a MULTIPOLYGON containing a grid of hexagons between given minimum and maximum points of a bounding box. The minimum point cannot be greater than or equal to the maximum point. The size (in meters) of the individual hexagons' sides is determined by cell_side. The cell_side cannot be greater than the width or height of the bounding box. The maximum number of cells that can be produced is determined by limit, a positive integer. Supported values for limit:
If the custom limit request specifies more cells (based on the bounding box and the cell_side) than the system limit, a null is returned. 
ST_INTERIORRINGN(geom, n)  Returns the nth interior LINESTRING ring of the POLYGON geom. If geom is not a POLYGON or the given n is out of range, a null is returned. The index begins at 1 
ST_INTERSECTION(geom1, geom2)  Returns the shared portion between given geometries geom1 and geom2 
ST_INTERSECTS(geom1, geom2)  Returns 1 (true) if the given geometries, geom1 and geom2, intersect in 2D 
ST_ISCLOSED(geom)  Returns 1 (true) if the given geometry's (geom) start and end points coincide 
ST_ISCOLLECTION(geom)  Returns 1 (true) if geom is a collection, e.g., GEOMETRYCOLLECTION, MULTIPOINT, MULTILINESTRING, etc. 
ST_ISEMPTY(geom)  Returns 1 (true) if geom is empty 
ST_ISRING(geom)  Returns 1 (true) if LINESTRING geom is both closed (per ST_ISCLOSED) and "simple" (per ST_ISSIMPLE). Returns 0 if geom is not a LINESTRING 
ST_ISSIMPLE(geom)  Returns 1 (true) if geom has no anomalous geometric points, e.g., self-intersection or self-tangency 
ST_ISVALID(geom)  Returns 1 (true) if geom (typically a [MULTI]POLYGON) is well formed. A POLYGON is valid if its rings do not cross and its boundary intersects only at POINTs (not along a line). The POLYGON must also not have dangling LINESTRINGs. A MULTIPOLYGON is valid if all of its elements are also valid and the interior rings of those elements do not intersect. Each element's boundaries may touch but only at POINTs (not along a line) 
ST_LENGTH(geom[, solution])  Returns the length of the geometry if it is a LINESTRING or MULTILINESTRING. Returns 0 if another type of geometry, e.g., POINT, MULTIPOINT, etc. GEOMETRYCOLLECTIONs are also supported but the aforementioned type limitation still applies; the collection will be recursively searched for LINESTRINGs and MULTILINESTRINGs and the summation of all supported geometry types is returned (unsupported types are ignored). Solution types available:

ST_LINEFROMMULTIPOINT(geom)  Creates a LINESTRING from geom if it is a MULTIPOINT. Returns null if geom is not a MULTIPOINT 
ST_LINEINTERPOLATEPOINT(geom, fraction)  Returns a POINT that represents the specified fraction of the LINESTRING geom. If geom is either empty or not a LINESTRING, null is returned 
ST_LINELOCATEPOINT(linestring, point)  Returns the location of the closest point in the given linestring to the given point as a value between 0 and 1. The return value is a fraction of the total linestring length. 
ST_LINEMERGE(geom)  Returns a LINESTRING or MULTILINESTRING from a given geom. If geom is a MULTILINESTRING comprising LINESTRINGs with shared endpoints, a contiguous LINESTRING is returned. If geom is a LINESTRING or a MULTILINESTRING comprising LINESTRINGs without shared endpoints, geom is returned. If geom is an empty (MULTI)LINESTRING or a (MULTI)POINT or (MULTI)POLYGON, an empty GEOMETRYCOLLECTION is returned. 
ST_LINESUBSTRING(geom, start_fraction, end_fraction)  Returns the fraction of a given geom LINESTRING where start_fraction and end_fraction are between 0 and 1. For example, given LINESTRING(1 1, 2 2, 3 3) a start_fraction of 0 and an end_fraction of 0.25 would yield the first quarter of the given LINESTRING, or LINESTRING(1 1, 1.5 1.5). Returns null if start_fraction is greater than end_fraction. Returns null if input geometry is (MULTI)POINT, MULTILINESTRING, or (MULTI)POLYGON. Returns null if start_fraction and/or end_fraction are less than 0 or more than 1. 
ST_LONGESTLINE(geom1, geom2[, solution])  Returns the LINESTRING that represents the longest line of points between the two geometries. If multiple longest lines are found, only the first line found is returned. If geom1 or geom2 is empty, null is returned. Solution types available:

ST_MAKEENVELOPE(xmin, ymin, xmax, ymax)  Creates a rectangular POLYGON from the given min and max parameters 
ST_MAKELINE(geom[, geom2])  Creates a LINESTRING from geom if it is a MULTIPOINT. If geom is a POINT, there must be at least one other POINT to construct a LINESTRING. If geom is a LINESTRING, it must have at least two points. Returns null if geom is not a POINT, MULTIPOINT, or LINESTRING. Note: This function can be rather costly in terms of performance. 
ST_MAKEPOINT(x, y)  Creates a POINT at the given coordinate. Note: This function can be rather costly in terms of performance. 
ST_MAKEPOLYGON(geom)  Creates a POLYGON from geom. Inputs must be closed LINESTRINGs. Note: This function can be rather costly in terms of performance. 
ST_MAKETRIANGLE2D(x1, y1, x2, y2, x3, y3)  Creates a closed 2D POLYGON with three vertices 
ST_MAKETRIANGLE3D(x1, y1, z1, x2, y2, z2, x3, y3, z3)  Creates a closed 3D POLYGON with three vertices 
ST_MAXDISTANCE(geom1, geom2[, solution])  Returns the maximum distance between the given geom1 and geom2 geometries using the specified solution type. If geom1 or geom2 is empty, null is returned. Solution types available:

ST_MAXX(geom)  Returns the maximum x coordinate of a bounding box for the given geom geometry. This function works for 2D and 3D geometries. 
ST_MAXY(geom)  Returns the maximum y coordinate of a bounding box for the given geom geometry. This function works for 2D and 3D geometries. 
ST_MAXZ(geom)  Returns the maximum z coordinate of a bounding box for the given geom geometry. This function works for 2D and 3D geometries. 
ST_MINX(geom)  Returns the minimum x coordinate of a bounding box for the given geom geometry. This function works for 2D and 3D geometries. 
ST_MINY(geom)  Returns the minimum y coordinate of a bounding box for the given geom geometry. This function works for 2D and 3D geometries. 
ST_MINZ(geom)  Returns the minimum z coordinate of a bounding box for the given geom geometry. This function works for 2D and 3D geometries. 
ST_MULTI(geom)  Returns geom as a MULTI geometry, e.g., a POINT would return a MULTIPOINT. 
ST_MULTIPLERINGBUFFERS(geom, distance, outside)  Creates multiple buffers at specified distance around the given geom geometry. Multiple distances are specified as comma-separated values in an array, e.g., [10,20,30]. Valid values for outside are:

ST_NEAR(geom1, geom2)  Returns the portion of geom2 that is closest to geom1. If geom2 is a singular geometry object (e.g., POINT, LINESTRING, POLYGON), geom2 will be returned. If geom2 is a multi-geometry, e.g., MULTIPOINT, MULTILINESTRING, etc., the nearest singular geometry in geom2 will be returned. 
ST_NORMALIZE(geom)  Returns geom in its normalized (canonical) form, which may rearrange the points in lexicographical order. 
ST_NPOINTS(geom)  Returns the number of points (vertices) in geom. 
ST_NUMGEOMETRIES(geom)  If geom is a collection or MULTI geometry, returns the number of geometries. If geom is a single geometry, returns 1. 
ST_NUMINTERIORRINGS(geom)  Returns the number of interior rings if geom is a POLYGON. Returns null if geom is anything else. 
ST_NUMPOINTS(geom)  Returns the number of points in the geom LINESTRING. Returns null if geom is not a LINESTRING. 
ST_OVERLAPS(geom1, geom2)  Returns 1 (true) if given geometries geom1 and geom2 share space. If geom1 and/or geom2 are a GEOMETRYCOLLECTION, a 0 is returned regardless if the two geometries overlap 
ST_PARTITION(geom, threshold)  Returns a MULTIPOLYGON representing the given geom partitioned into a number of POLYGONs with a maximum number of vertices equal to the given threshold. Minimum value for threshold is 10; default value is 10000. If geom is not a POLYGON or MULTIPOLYGON, geom is returned. If the number of vertices in geom is less than the threshold, geom is returned. 
ST_PERIMETER(geom[, solution])  Returns the perimeter of the geometry if it is a POLYGON or MULTIPOLYGON. Returns 0 if another type of geometry, e.g., POINT, MULTIPOINT, LINESTRING, or MULTILINESTRING. GEOMETRYCOLLECTIONs are also supported but the aforementioned type limitation still applies; the collection will be recursively searched for POLYGONs and MULTIPOLYGONs and the summation of all supported geometry types is returned (unsupported types are ignored). Solution types available:

ST_POINT(x, y)  Returns a POINT with the given x and y coordinates. 
ST_POINTFROMGEOHASH(geohash, precision)  Returns a POINT using the given geohash with a precision set by the integer precision. If precision is specified, the function will use as many characters in the hash equal to precision to create the geometry. If no precision is specified, the full length of the geohash is used. Note The POINT returned represents the center of the bounding box of the geohash 
ST_POINTGRID(xmin, ymin, xmax, ymax, cell_side[, limit])  Creates a MULTIPOLYGON containing a square-shaped grid of points between given minimum and maximum points of a bounding box. The minimum point cannot be greater than or equal to the maximum point. The distance between the points (in meters) is determined by cell_side. The cell_side cannot be greater than the width or height of the bounding box. The maximum number of cells that can be produced is determined by limit, a positive integer. Supported values for limit:
If the custom limit request specifies more cells (based on the bounding box and the cell_side) than the system limit, a null is returned. 
ST_POINTN(geom, n)  Returns the nth point in LINESTRING geom. Negative values are valid, but note that they are counted backwards from the end of geom. A null is returned if geom is not a LINESTRING. 
ST_POINTS(geom)  Returns a MULTIPOINT containing all of the coordinates of geom. 
ST_PROJECT(geom, distance, azimuth)  Returns a POINT projected from a start point geom along a geodesic calculated using distance and azimuth. If geom is not a POINT, null is returned. 
ST_REMOVEPOINT(geom, offset)  Removes a point from LINESTRING geom using offset to skip over POINTs in the LINESTRING. The offset is 0-based. 
ST_REMOVEREPEATEDPOINTS(geom, tolerance)  Removes points from geom if the point's vertices are greater than or equal to the tolerance of the previous point in the geometry's list. If geom is not a MULTIPOINT, MULTILINESTRING, or a MULTIPOLYGON, no points will be removed. 
ST_REVERSE(geom)  Returns the geometry with its coordinate order reversed. 
ST_SCALE(geom, x, y)  Scales geom by multiplying its respective vertices by the given x and y values. This function also supports scaling geom using another geometry object, e.g., ST_SCALE('POINT(3 4)', 'POINT(5 6)') would return POINT(15 24). If specifying x and y for scale, note that the default value is 0, e.g., ST_SCALE('POINT(1 3)', 4) would return POINT(4 0). 
ST_SEGMENTIZE(geom, max_segment_length[, solution])  Returns the given geom but segmentized n number of times depending on how the max_segment_length distance (in units based on the solution type) divides up the original geometry. The new geom is guaranteed to have segments that are smaller than the given max_segment_length. Note that POINTs are not able to be segmentized. Collection geometries (GEOMETRYCOLLECTION, MULTILINESTRING, MULTIPOINT, etc.) can be segmentized, but only the individual parts will be segmentized, not the collection as a whole. Solution types available:

ST_SETPOINT(geom1, position, geom2)  Replaces a point of LINESTRING geom1 with POINT geom2 at position (base 0). Negative values are valid, but note that they are counted backwards from the end of geom1. 
ST_SHAREDPATH(geom1, geom2)  Returns a collection containing paths shared by geom1 and geom2. 
ST_SHORTESTLINE(geom1, geom2)  Returns the 2D LINESTRING that represents the shortest line of points between the two geometries. If multiple shortest lines are found, only the first line found is returned. If geom1 or geom2 is empty, null is returned 
ST_SIMPLIFY(geom, tolerance)  Returns a simplified version of the given geom using an algorithm to reduce the number of points comprising a given geometry while attempting to best retain the original shape. The given tolerance determines how much to simplify the geometry. The higher the tolerance, the more simplified the returned geometry. Some holes might be removed and some invalid polygons (e.g., self-intersecting, etc.) might be present in the returned geometry. Only (MULTI)LINESTRINGs and (MULTI)POLYGONs can be simplified, including those found within GEOMETRYCOLLECTIONs; any other geometry objects will be returned unsimplified. Note: The tolerance should be provided in the same units as the data. As a rule of thumb, a tolerance of 0.00001 would correspond to about one meter. 
ST_SIMPLIFYPRESERVETOPOLOGY(geom, tolerance)  Returns a simplified version of the given geom using an algorithm to reduce the number of points comprising a given geometry while attempting to best retain the original shape. The given tolerance determines how much to simplify the geometry. The higher the tolerance, the more simplified the returned geometry. No holes will be removed and no invalid polygons (e.g., self-intersecting, etc.) will be present in the returned geometry. Only (MULTI)LINESTRINGs and (MULTI)POLYGONs can be simplified, including those found within GEOMETRYCOLLECTIONs; any other geometry objects will be returned unsimplified. Note: The tolerance should be provided in the same units as the data. As a rule of thumb, a tolerance of 0.00001 would correspond to about one meter. 
ST_SNAP(geom1, geom2, tolerance)  Snaps geom1 to geom2 within the given tolerance. If the tolerance causes geom1 to not snap, the geometries will be returned unchanged. 
ST_SPLIT(geom1, geom2)  Returns a collection of geometries resulting from the split between geom1 and geom2 geometries. 
ST_SQUAREGRID(xmin, ymin, xmax, ymax, cell_side[, limit])  Creates a MULTIPOLYGON containing a grid of squares between given minimum and maximum points of a bounding box. The minimum point cannot be greater than or equal to the maximum point. The size (in meters) of the individual squares' sides is determined by cell_side. The cell_side cannot be greater than the width or height of the bounding box. The maximum number of cells that can be produced is determined by limit, a positive integer. Supported values for limit:
If the custom limit request specifies more cells (based on the bounding box and the cell_side) than the system limit, a null is returned. 
ST_STARTPOINT(geom)  Returns the first point of LINESTRING geom as a POINT. Returns null if geom is not a LINESTRING. 
ST_SYMDIFFERENCE(geom1, geom2)  Returns a geometry that represents the portions of geom1 and geom2 geometries that do not intersect. 
ST_TOUCHES(geom1, geom2)  Returns 1 (true) if the given geometries, geom1 and geom2, have at least one point in common but their interiors do not intersect. If geom1 and/or geom2 are a GEOMETRYCOLLECTION, a 0 is returned regardless if the two geometries touch 
ST_TRANSLATE(geom, deltax, deltay[, deltaz])  Translates geom by the given offsets deltax and deltay. A z-coordinate offset can be applied using deltaz. 
ST_TRIANGLEGRID(xmin, ymin, xmax, ymax, cell_side[, limit])  Creates a MULTIPOLYGON containing a grid of triangles between given minimum and maximum points of a bounding box. The minimum point cannot be greater than or equal to the maximum point. The size (in meters) of the individual triangles' sides is determined by cell_side. The cell_side cannot be greater than the width or height of the bounding box. The maximum number of cells that can be produced is determined by limit, a positive integer. Supported values for limit:
If the custom limit request specifies more cells (based on the bounding box and the cell_side) than the system limit, a null is returned. 
ST_UNION(geom1, geom2)  Returns a geometry that represents the point set union of the two given geometries, geom1 and geom2. 
ST_UNIONCOLLECTION(geom)  Returns a geometry that represents the point set union of a single given geometry geom. 
ST_UPDATE(geom1, geom2)  Returns a geometry that is geom1 geometry updated by geom2 geometry 
ST_VORONOIPOLYGONS(geom, tolerance)  Returns a GEOMETRYCOLLECTION containing Voronoi polygons (regions consisting of points closer to a vertex in geom than any other vertices in geom) calculated from the vertices in geom and the given tolerance. The tolerance determines the distance at which points will be considered the same. An empty GEOMETRYCOLLECTION is returned if geom is an empty geometry, a single POINT, or a LINESTRING or POLYGON composed of equivalent vertices (e.g., POLYGON((0 0, 0 0, 0 0, 0 0)), LINESTRING(0 0, 0 0)). 
ST_WITHIN(geom1, geom2)  Returns 1 (true) if the geom1 geometry is inside the geom2 geometry. Note that as long as at least one point is inside of geom2, geom1 is considered within geom2 even if the rest of the geom1 lies along the boundary of geom2 
ST_WKTTOWKB(geom)  Returns the binary form (WKB) of a geom (WKT) Note This function can only be used in queries against a single table. 
ST_X(geom)  Returns the X coordinate of the POINT geom; if the coordinate is not available, null is returned. geom must be a POINT. 
ST_XMAX(geom)  Alias for ST_MAXX() 
ST_XMIN(geom)  Alias for ST_MINX() 
ST_Y(geom)  Returns the Y coordinate of the POINT geom; if the coordinate is not available, null is returned. geom must be a POINT. 
ST_YMAX(geom)  Alias for ST_MAXY() 
ST_YMIN(geom)  Alias for ST_MINY() 
ST_ZMAX(geom)  Alias for ST_MAXZ() 
ST_ZMIN(geom)  Alias for ST_MINZ() 
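Several entries in the table above state a formula or a worked example (DIST, ST_ALMOSTEQUALS, ST_EQUALSEXACT, ST_SCALE, ST_LINESUBSTRING). The Python sketch below re-implements just those stated behaviors so the quoted examples can be checked by hand. It is an illustration under assumptions, not the database's implementation: the function names are hypothetical, points are plain (x, y) tuples, a LINESTRING is a list of such tuples, and the "half order of magnitude" rule for ST_ALMOSTEQUALS is read here as a coordinate difference of at most 0.5 * 10^-decimal.

```python
import math


def dist(x1, y1, x2, y2):
    """Euclidean distance, following the DIST formula."""
    return math.sqrt((x1 - x2) * (x1 - x2) + (y1 - y2) * (y1 - y2))


def almost_equals(p, q, decimal):
    """Points match if every coordinate differs by at most half a unit
    of the given decimal place (an assumed reading of the table's note)."""
    limit = 0.5 * 10 ** (-decimal)
    return 1 if all(abs(a - b) <= limit for a, b in zip(p, q)) else 0


def equals_exact(p, q, tolerance):
    """Points match if every coordinate differs by at most tolerance."""
    return 1 if all(abs(a - b) <= tolerance for a, b in zip(p, q)) else 0


def scale(p, x, y=0):
    """Multiply a point's coordinates by x and y; y defaults to 0."""
    return (p[0] * x, p[1] * y)


def _point_at(coords, cum, target):
    """Interpolate the point lying `target` units along the line."""
    last = len(coords) - 1
    for i in range(1, len(coords)):
        if target <= cum[i] or i == last:
            seg = cum[i] - cum[i - 1]
            t = 0.0 if seg == 0 else (target - cum[i - 1]) / seg
            (x0, y0), (x1, y1) = coords[i - 1], coords[i]
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))


def line_substring(coords, start_fraction, end_fraction):
    """Return the part of the line between two length fractions, or None
    when the fractions are out of range or out of order."""
    if len(coords) < 2 or not (0 <= start_fraction <= end_fraction <= 1):
        return None
    cum = [0.0]  # cumulative length at each vertex
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    start_d, end_d = start_fraction * cum[-1], end_fraction * cum[-1]
    result = [_point_at(coords, cum, start_d)]
    # Keep original vertices that fall strictly between the two cut points.
    result += [pt for pt, d in zip(coords, cum) if start_d < d < end_d]
    result.append(_point_at(coords, cum, end_d))
    return result


print(dist(0, 0, 3, 4))                                  # 5.0
print(almost_equals((63.4, 123.45), (63.4, 123.454), 2)) # 1
print(almost_equals((63.4, 123.45), (63.4, 123.459), 2)) # 0
print(equals_exact((1, 1), (1, 3), 2))                   # 1
print(equals_exact((1, 1), (1, 3.1), 2))                 # 0
print(scale((3, 4), 5, 6))                               # (15, 24)
print(scale((1, 3), 4))                                  # (4, 0)
print(line_substring([(1, 1), (2, 2), (3, 3)], 0, 0.25)) # [(1.0, 1.0), (1.5, 1.5)]
```

Each printed value corresponds to an example quoted in the table, e.g., the last line reproduces LINESTRING(1 1, 1.5 1.5) for the first quarter of LINESTRING(1 1, 2 2, 3 3).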
Aggregation Functions
Function  Description 

ST_AGGREGATE_COLLECT(geom)  Alias for ST_COLLECT_AGGREGATE() 
ST_AGGREGATE_INTERSECTION(geom)  Alias for ST_INTERSECTION_AGGREGATE() 
ST_COLLECT_AGGREGATE(geom)  Returns a GEOMETRYCOLLECTION comprising all geometries found in the geom set. Any MULTI* geometries will be divided into separate singular geometries, e.g., MULTIPOINT((0 0), (1 1)) would be divided into POINT(0 0) and POINT(1 1) in the results; the same is true for elements of a GEOMETRYCOLLECTION found in geom, where a GEOMETRYCOLLECTION within the provided geom set will also be parsed, effectively flattening it and adding the individual geometries to the resulting GEOMETRYCOLLECTION. Any empty geometries in geom are ignored even if they are part of a GEOMETRYCOLLECTION. Any duplicate WKTs will be retained. 
ST_DISSOLVE(geom)  Dissolves all geometries within a given set into a single geometry. Note that the resulting single geometry can still be a group of non-contiguous geometries but represented as a single group, e.g., a GEOMETRYCOLLECTION. Best performance when used in conjunction with adjacent geometries 
ST_DISSOLVEOVERLAPPING(geom)  Dissolves all geometries within a given set into a single geometry. Note that the resulting single geometry can still be a group of non-contiguous geometries but represented as a single group, e.g., a GEOMETRYCOLLECTION. Best performance when used in conjunction with overlapping geometries 
ST_INTERSECTION_AGGREGATE(geom)  Returns a POLYGON or MULTIPOLYGON comprising the shared portion between all geometries found in the geom set. Returns an empty GEOMETRYCOLLECTION if there is no shared portion between all geometries. Functionally equivalent to ST_INTERSECTION(ST_INTERSECTION(ST_INTERSECTION(geom1, geom2), geom3), ... geomN). 
ST_LINESTRINGFROMORDEREDPOINTS(x, y, t)  Returns a LINESTRING that represents a "track" of the given points (x, y) ordered by the given sort column t (e.g., a timestamp or sequence number). If any of the values in the specified columns are null, the null "point" will be left out of the resulting LINESTRING. If there's only one non-null "point" in the source table, a POINT is returned. If there are no non-null "points" in the source table, a null is returned 
ST_LINESTRINGFROMORDEREDPOINTS3D(x, y, z, t)  Returns a LINESTRING that represents a "track" of the given 3D points (x, y, z) ordered by the given sort column t (e.g., a timestamp or sequence number). If any of the values in the specified columns are null, the null "point" will be left out of the resulting LINESTRING. If there's only one non-null "point" in the source table, a POINT is returned. If there are no non-null "points" in the source table, a null is returned 
ST_POLYGONIZE(geom)  Returns a GEOMETRYCOLLECTION containing POLYGONs comprising the provided (MULTI)LINESTRING(s). (MULTI)POINT and (MULTI)POLYGON geometries are ignored when calculating the resulting GEOMETRYCOLLECTION. If a valid POLYGON cannot be constructed from the provided (MULTI)LINESTRING(s), an empty GEOMETRYCOLLECTION will be returned. 
Math Functions
Function  Description  

ABS(expr)  Calculates the absolute value of expr  
ACOS(expr)  Returns the inverse cosine (arccosine) of expr as a double  
ACOSF(expr)  Returns the inverse cosine (arccosine) of expr as a float  
ACOSH(expr)  Returns the inverse hyperbolic cosine of expr as a double  
ACOSHF(expr)  Returns the inverse hyperbolic cosine of expr as a float  
ASIN(expr)  Returns the inverse sine (arcsine) of expr as a double  
ASINF(expr)  Returns the inverse sine (arcsine) of expr as a float  
ASINH(expr)  Returns the inverse hyperbolic sine of expr as a double  
ASINHF(expr)  Returns the inverse hyperbolic sine of expr as a float  
ATAN(expr)  Returns the inverse tangent (arctangent) of expr as a double  
ATANF(expr)  Returns the inverse tangent (arctangent) of expr as a float  
ATANH(expr)  Returns the inverse hyperbolic tangent of expr as a double  
ATANHF(expr)  Returns the inverse hyperbolic tangent of expr as a float  
ATAN2(x, y)  Returns the inverse tangent (arctangent) using two arguments as a double  
ATAN2F(x, y)  Returns the inverse tangent (arctangent) using two arguments as a float  
ATN2(x, y)  Alias for ATAN2  
ATN2F(x, y)  Alias for ATAN2F  
CBRT(expr)  Returns the cube root of expr as a double  
CBRTF(expr)  Returns the cube root of expr as a float  
CEIL(expr)  Alias for CEILING  
CEILING(expr)  Rounds expr up to the next highest integer  
COS(expr)  Returns the cosine of expr as a double  
COSF(expr)  Returns the cosine of expr as a float  
COSH(expr)  Returns the hyperbolic cosine of expr as a double  
COSHF(expr)  Returns the hyperbolic cosine of expr as a float  
COT(expr)  Returns the cotangent of expr as a double  
COTF(expr)  Returns the cotangent of expr as a float  
DEGREES(expr)  Returns the conversion of expr (in radians) to degrees as a double  
DEGREESF(expr)  Returns the conversion of expr (in radians) to degrees as a float  
DIVZ(a, b, c)  Returns the quotient a / b unless b == 0, in which case it returns c  
EXP(expr)  Returns e to the power of expr as a double  
EXPF(expr)  Returns e to the power of expr as a float  
FLOOR(expr)  Rounds expr down to the next lowest integer  
GREATER(expr_a, expr_b)  Returns whichever of expr_a and expr_b has the larger value, based on typed comparison  
HYPOT(x, y)  Returns the hypotenuse of x and y as a double  
HYPOTF(x, y)  Returns the hypotenuse of x and y as a float  
ISNAN(expr)  Returns 1 (true) if expr is not a number by IEEE standard; otherwise, returns 0 (false)  
IS_NAN(expr)  Alias for ISNAN  
ISINFINITY(expr)  Returns 1 (true) if expr is infinity by IEEE standard; otherwise, returns 0 (false)  
IS_INFINITY(expr)  Alias for ISINFINITY  
LDEXP(x, exp)  Returns the value of x * 2^exp as a double  
LDEXPF(x, exp)  Returns the value of x * 2^exp as a float  
LESSER(expr_a, expr_b)  Returns whichever of expr_a and expr_b has the smaller value, based on typed comparison  
LN(expr)  Returns the natural logarithm of expr as a double  
LNF(expr)  Returns the natural logarithm of expr as a float  
LOG(expr)  Alias for LN  
LOG10(expr)  Returns the base-10 logarithm of expr as a double  
LOG10F(expr)  Returns the base-10 logarithm of expr as a float  
LOG1P(expr)  Returns the natural logarithm of one plus expr as a double  
LOG1PF(expr)  Returns the natural logarithm of one plus expr as a float  
LOG2(expr)  Returns the binary (base-2) logarithm of expr as a double  
LOG2F(expr)  Returns the binary (base-2) logarithm of expr as a float  
LOGF(expr)  Alias for LNF  
MAX_CONSECUTIVE_BITS(expr)  Calculates the length of the longest series of consecutive 1 bits in the integer expr  
MOD(dividend, divisor)  Calculates the remainder after integer division of dividend by divisor  
PI()  Returns the value of pi  
POW(base, exponent)  Alias for POWER  
POWF(base, exponent)  Alias for POWERF  
POWER(base, exponent)  Returns base raised to the power of exponent as a double  
POWERF(base, exponent)  Returns base raised to the power of exponent as a float  
RADIANS(expr)  Returns the conversion of expr (in degrees) to radians as a double  
RADIANSF(expr)  Returns the conversion of expr (in degrees) to radians as a float  
RAND()  Returns a random floating-point value.  
REGR_VALX(y, x)  Returns NULL if the first argument is NULL; otherwise, returns the second argument.  
REGR_VALY(y, x)  Returns NULL if the second argument is NULL; otherwise, returns the first argument.  
ROUND(expr, scale)  Rounds expr to the nearest decimal number with scale decimal places when scale is a positive number; rounds to the nearest number such that the result has -(scale) zeros to the left of the decimal point when scale is negative; use scale of 0 to round to the nearest integer. Examples:
 
SIGN(expr)  Determines whether a number is positive, negative, or zero; returns 1 if expr is positive, 0 if expr is zero, or -1 if expr is negative  
SIN(expr)  Returns the sine of expr as a double  
SINF(expr)  Returns the sine of expr as a float  
SINH(expr)  Returns the hyperbolic sine of expr as a double  
SINHF(expr)  Returns the hyperbolic sine of expr as a float  
SQRT(expr)  Returns the square root of expr as a double  
SQRTF(expr)  Returns the square root of expr as a float  
TAN(expr)  Returns the tangent of expr as a double  
TANF(expr)  Returns the tangent of expr as a float  
TANH(expr)  Returns the hyperbolic tangent of expr as a double  
TANHF(expr)  Returns the hyperbolic tangent of expr as a float  
TRUNCATE(expr, scale)  Rounds expr down to the nearest decimal number with scale decimal places, following the same rules as ROUND. Examples:
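The scale rules described for ROUND and TRUNCATE can be modeled outside the database. The following is a minimal Python sketch, not the engine's implementation: the helper names sql_round and sql_truncate are hypothetical, and two behaviors are assumptions that the text above does not specify, namely that ROUND breaks ties away from zero and that TRUNCATE rounds toward zero.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def _quantize(value, scale, rounding):
    # Step is 10^(-scale): 0.01 for scale=2, 1 for scale=0, 100 for scale=-2
    step = Decimal(1).scaleb(-scale)
    return float(Decimal(str(value)).quantize(step, rounding=rounding))


def sql_round(value, scale):
    """Model of ROUND(expr, scale); half-up tie-breaking is an assumption."""
    return _quantize(value, scale, ROUND_HALF_UP)


def sql_truncate(value, scale):
    """Model of TRUNCATE(expr, scale); truncation toward zero is an assumption."""
    return _quantize(value, scale, ROUND_DOWN)


print(sql_round(12345.678, 2))      # 12345.68
print(sql_round(12345.678, 0))      # 12346.0
print(sql_round(12345.678, -2))     # 12300.0
print(sql_truncate(12345.678, 2))   # 12345.67
print(sql_truncate(12345.678, -2))  # 12300.0
```

A negative scale moves the rounding position to the left of the decimal point, which is why a scale of -2 leaves two zeros before the decimal point.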

Null Functions
Some of the following null functions require parameters to be of convertible data types. Note that limited-width (charN) & unlimited-width (non-charN) string types are not convertible.
Function  Description 

COALESCE(expr_a, ..., expr_N)  Returns the value of the first expression that is not null starting with expr_a and ending with expr_N. If all are null, then null is returned. All expressions should be of the same or convertible data type. 
IFNULL(expr_a, expr_b)  Returns expr_a if it is not null; otherwise, returns expr_b. Both should be of the same or convertible data type. See Short-Circuiting for error-checking details. 
ISNULL(expr)  Returns 1 if expr is null; otherwise, returns 0 
IS_NULL(expr)  Synonymous with ISNULL(expr) 
NULLIF(expr_a, expr_b)  Returns null if expr_a equals expr_b; otherwise, returns the value of expr_a; both expressions should be of the same or convertible data type. 
NVL(expr_a, expr_b)  Alias for IFNULL 
NVL2(expr, value_if_not_null, value_if_null)  Evaluates expr: if not null, returns value_if_not_null; if null, returns value_if_null. Both value_if_not_null & value_if_null should be of the same data type as expr or implicitly convertible; see Short-Circuiting for error-checking details. 
REMOVE_NULLABLE(expr)  Alias for ZEROIFNULL 
ZEROIFNULL(expr)  Replaces null values with appropriate values based on the column type (e.g., 0 if numeric column, an empty string if charN column, etc.). Also removes the nullable column property if used to calculate a derived column. 
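The selection rules of the null functions above can be sketched in Python, with None standing in for null. This is only an illustration of the documented return values; the function names mirror the SQL functions but are hypothetical helpers, and Python evaluates every argument eagerly, so the sketch does not model short-circuiting or type conversion.

```python
def coalesce(*exprs):
    """First argument that is not None, or None if all are None."""
    return next((e for e in exprs if e is not None), None)


def ifnull(expr_a, expr_b):
    """expr_a if it is not None; otherwise expr_b (NVL behaves the same)."""
    return expr_a if expr_a is not None else expr_b


def nullif(expr_a, expr_b):
    """None if the two values are equal; otherwise expr_a."""
    return None if expr_a == expr_b else expr_a


def nvl2(expr, value_if_not_null, value_if_null):
    """Pick one of two values depending on whether expr is None."""
    return value_if_not_null if expr is not None else value_if_null


print(coalesce(None, None, 3, 4))  # 3
print(ifnull(None, "fallback"))    # fallback
print(nullif(5, 5))                # None
print(nullif(5, 6))                # 5
print(nvl2(None, "set", "unset"))  # unset
```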
String Functions
There are three different types of string functions:
CharN Functions
The following functions will only work with fixed-width string columns (VARCHAR(1) - VARCHAR(256)).
Function  Description  

ASCII(expr)  Returns the ASCII code for the first character in expr  
CHAR(expr)  The character represented by the standard ASCII code expr in the range [ 0 - 127 ]  
CONCAT(expr_a, expr_b)  Performs a string concatenation of expr_a & expr_b; use nested CONCAT calls to concatenate more than two strings. Note: The resulting field size of any CONCAT will be a charN field big enough to hold the concatenated fields, e.g., concatenating a char32 column and a char64 column will result in a char128 column. Columns of type char256 cannot be used with CONCAT.  
CONCAT_TRUNCATE(expr_a, expr_b)  Returns the concatenation of expr_a and expr_b, truncated at the maximum size of expr_a. For columns, this size is explicit; for string constants, this will be the smallest charN type that can hold the expr_a string. Examples:
 
CONTAINS(match_expr, ref_expr)  Returns 1 if ref_expr contains match_expr by string-literal comparison; otherwise, returns 0  
DIFFERENCE(expr_a, expr_b)  Returns a value between 0 and 4 that represents the difference between the sounds of expr_a and expr_b based on the SOUNDEX() value of the strings; a value of 4 is the best possible sound match  
EDIT_DISTANCE(expr_a, expr_b)  Returns the Levenshtein edit distance between expr_a and expr_b; the lower the value, the more similar the two strings are  
ENDS_WITH(match_expr, ref_expr)  Returns 1 if ref_expr ends with match_expr by string-literal comparison; otherwise, returns 0  
INITCAP(expr)  Returns expr with the first letter of each word in uppercase  
IPV4_PART(expr, part_num)  Returns the octet of the IP address given in expr at the position specified by part_num. Valid part_num values are constants from 1 to 4. Examples:
 
IS_IPV4(expr)  Returns 1 if expr is an IPV4 address; returns 0 otherwise  
LCASE(expr)  Converts expr to lowercase  
LEFT(expr, num_chars)  Returns the leftmost num_chars characters from expr  
LENGTH(expr)  Returns the number of characters in expr  
LOCATE(match_expr, ref_expr, [start_pos])  Returns the starting position of the first match of match_expr in ref_expr, starting from position 1 or start_pos (if specified)  
LOWER(expr)  Alias for LCASE  
LPAD(base_expr, length, pad_expr)  Left pads the given base_expr string with the pad_expr string to the given length of characters. If base_expr is longer than length, the return value is shortened to length characters. If length is larger than 256, it will be truncated to 256. Examples:
 
LTRIM(expr)  Removes whitespace from the left side of expr  
POSITION(match_expr, ref_expr, [start_pos])  Alias for LOCATE  
REPLACE(ref_expr, match_expr, repl_expr)  Replaces every occurrence of match_expr in ref_expr with repl_expr  
REVERSE(expr)  Returns expr with the order of characters reversed. Examples:
 
RIGHT(expr, num_chars)  Returns the rightmost num_chars characters from expr  
RPAD(base_expr, length, pad_expr)  Right pads the given base_expr string with the pad_expr string to the given length of characters. If base_expr is longer than length, the return value is shortened to length characters. If length is larger than 256, it will be truncated to 256. Examples:
 
RTRIM(expr)  Removes whitespace from the right side of expr  
SOUNDEX(expr)  Returns a soundex value from expr. Only the first word in the string will be considered in the calculation. Note: This is the algorithm used by most programming languages.  
SPACE(n)  Returns a string consisting of n space characters. The value of n can only be within the range of 0-256.  
SPLIT(expr, delim, group_num)  Splits expr into groups delimited by the delim character and returns the group_num split group. If group_num is positive, groups will be counted from the beginning of expr; if negative, groups will be counted from the end of expr going backwards. Two consecutive delimiters will result in an empty string being added to the list of selectable groups. If no instances of delim exist in expr, the entire string is available at group 1 (and -1). Group 0 returns nothing. Examples:
 
STARTS_WITH(match_expr, ref_expr)  Returns 1 if ref_expr starts with match_expr by string-literal comparison; otherwise, returns 0  
STRCMP(expr_a, expr_b)  Compares expr_a to expr_b in a lexicographical sort
 
SUBSTR(expr, start_pos, num_chars)  Alias for SUBSTRING  
SUBSTRING(expr, start_pos, num_chars)  Returns num_chars characters from the expr, starting at the 1-based start_pos  
TRIM(expr)  Removes whitespace from both sides of expr  
UCASE(expr)  Converts expr to uppercase  
UPPER(expr)  Alias for UCASE 
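The group-numbering rules given for SPLIT in the table above can be modeled with a short Python sketch. The helper name split is hypothetical, and two points are assumptions: that "returns nothing" for group 0 corresponds to a null (None here), and that a group number beyond the available groups also yields null.

```python
def split(expr, delim, group_num):
    """Model of SPLIT(expr, delim, group_num) using 1-based group numbers."""
    # Consecutive delimiters yield empty-string groups, as described above
    groups = expr.split(delim)
    # Assumption: group 0 and out-of-range groups are treated as null
    if group_num == 0 or abs(group_num) > len(groups):
        return None
    if group_num > 0:
        return groups[group_num - 1]  # counted from the beginning
    return groups[group_num]          # counted from the end


text = "apple,banana,,cherry"
print(repr(split(text, ",", 1)))          # 'apple'
print(repr(split(text, ",", 3)))          # ''
print(repr(split(text, ",", -1)))         # 'cherry'
print(repr(split("no-delims", ",", 1)))   # 'no-delims'
print(repr(split("no-delims", ",", -1)))  # 'no-delims'
print(repr(split(text, ",", 0)))          # None
```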
LIKE
Simple pattern-matching capability is provided through the use of the LIKE clause, for fixed-width string columns (VARCHAR(1) - VARCHAR(256)).


This clause matches records where reference expression ref_expr does (or does NOT) match the string value of match_expr. The match is a string literal one, with the following exceptions:
 % matches any string of 0 or more characters
 _ matches any single character
For example, to search for employees whose last name starts with C, ends with a, and has exactly five letters:


To search for employees whose first name does not start with Brook:
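The two wildcard rules above can be modeled in Python by translating a LIKE pattern into a regular expression. This is a sketch of the matching semantics only, not the SQL syntax: the helper names and the sample names are hypothetical, matching is assumed to be case-sensitive, and escape characters are not modeled.

```python
import re


def like_to_regex(pattern):
    """Translate a LIKE pattern: % -> any string, _ -> any single character."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # everything else matches literally
    return re.compile("".join(parts), re.DOTALL)


def like(ref_expr, match_expr):
    """True when the whole of ref_expr matches the LIKE pattern match_expr."""
    return like_to_regex(match_expr).fullmatch(ref_expr) is not None


# Last name starts with C, ends with a, and has exactly five letters
last_names = ["Costa", "Clark", "Chanda", "Acosta"]
print([n for n in last_names if like(n, "C___a")])  # ['Costa']

# First name does not start with Brook
first_names = ["Brook", "Brooklyn", "Bruce"]
print([n for n in first_names if not like(n, "Brook%")])  # ['Bruce']
```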


FILTER_BY_STRING
The FILTER_BY_STRING table function allows for several types of pattern matching for fixed-width string columns (VARCHAR(1) - VARCHAR(256)), as well as full text search capability for unrestricted-width string columns (VARCHAR). Its features are based on the /filter/bystring endpoint.
The basic form of the FILTER_BY_STRING function when called in a SELECT statement follows.


The basic form of the FILTER_BY_STRING function when called in an EXECUTE FUNCTION statement follows. When called this way, the results are saved in the specified view.


Parameters  Description  

TABLE_NAME  The data set to filter, which should be passed via query to the INPUT_TABLE function. To perform a string filter on a column in the customer table, pass the name of the table to INPUT_TABLE: INPUT_TABLE(customer). To perform a string filter on a column in the result of a query, pass the query to INPUT_TABLE: INPUT_TABLE(SELECT * FROM customer_west UNION SELECT * FROM customer_east).  
VIEW_NAME  Name of the view in which the results are to be stored, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. Important: This parameter is only applicable when using EXECUTE FUNCTION syntax.  
COLUMN_NAMES  Comma-separated list of columns to which the filter will be applied. Important: Omit this parameter when using the search filter MODE.  
MODE  Type of string filter to apply:
 
EXPRESSION  String literal or pattern, depending on MODE selected, to use in filtering string columns  
OPTIONS  Optional list of string filtering options, specified as a set of key/value pairs passed as a comma-delimited list of <key> = '<value>' assignments to the KV_PAIRS function; e.g.: KV_PAIRS(case_sensitive = 'false')

To see the message text & time for events in the event_log table containing the word ERROR:


User/Security Functions
Function  Description  

CURRENT_USER()  Alias for USER  
HASH(column[, seed])  Returns a nonnegative integer representing an obfuscated version of column, using the given seed; default seed is 0  
IS_MEMBER(role[, user]) or IS_ROLEMEMBER(role[, user])  Returns whether the current user (or the given user, if specified) has been assigned the given role, either directly or indirectly:
 
MASK(expr, start, length[, char])  Masks length characters of expr, beginning at the position identified by start, with * characters (or the character specified in char):
 
OBFUSCATE(column[, seed])  Alias for HASH  
SHA256(expr)  Returns the hex digits of the SHA256 hash of the given value expr as a char64 string.  
SYSTEM_USER()  Alias for USER  
USER()  Returns the username of the current user 
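The char64 result type of SHA256 follows from the digest size: a SHA-256 hash is 32 bytes, which is 64 hex digits. A minimal Python sketch using the standard hashlib module shows this. The helper name sha256_hex is hypothetical, and the UTF-8 encoding of the input and the lowercase hex output are assumptions about details the table does not state.

```python
import hashlib


def sha256_hex(expr):
    """Hex digits of the SHA-256 hash of expr (input encoded as UTF-8 here)."""
    return hashlib.sha256(str(expr).encode("utf-8")).hexdigest()


digest = sha256_hex("abc")
print(len(digest))  # 64
print(digest)
```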
Aggregation Functions
Function  Description 

ATTR(expr)  If MIN(expr) = MAX(expr), returns expr; otherwise * 
ARG_MIN(agg_expr, ret_expr)  The value of ret_expr where agg_expr is the minimum value (e.g. ARG_MIN(cost, product_id) returns the product ID of the lowest cost product) 
ARG_MAX(agg_expr, ret_expr)  The value of ret_expr where agg_expr is the maximum value (e.g. ARG_MAX(cost, product_id) returns the product ID of the highest cost product) 
AVG(expr)  Calculates the average value of expr 
CORR(expr1, expr2)  Calculates the correlation coefficient of expr1 and expr2 
CORRELATION(expr1, expr2)  Alias for CORR 
CORRCOEF(expr1, expr2)  Alias for CORR 
COUNT(*)  Returns the number of records in a table 
COUNT(expr)  Returns the number of non-null data values in expr 
COUNT(DISTINCT expr)  Returns the number of distinct non-null data values in expr 
COV(expr1, expr2)  Alias for COVAR_POP 
COVAR(expr1, expr2)  Alias for COVAR_POP 
COVARIANCE(expr1, expr2)  Alias for COVAR_POP 
COVAR_POP(expr1, expr2)  Calculates the population covariance of expr1 and expr2 
COVAR_SAMP(expr1, expr2)  Calculates the sample covariance of expr1 and expr2 
GROUPING(expr)  Used primarily with ROLLUP, CUBE, and GROUPING SETS, to distinguish the source of null values in an aggregated result set, returns whether expr is part of the aggregation set used to calculate the values in a given result set row. Returns 0 if expr is part of the row's aggregation set, 1 if expr is not (meaning that aggregation took place across all expr values). For example, in a ROLLUP(A) operation, there will be two potential rows with null in the result set for column A. One row will contain null values of A aggregated together, and the other will contain null, but be an aggregation over the entire table, irrespective of A values. In this case, GROUPING(A) will return 0 for the null values of A aggregated together (as well as all other grouped A values) and 1 for the row resulting from aggregating across all A values. 
KURT(expr)  Alias for KURTOSIS_POP 
KURTOSIS(expr)  Alias for KURTOSIS_POP 
KURTOSIS_POP(expr)  Calculates the population kurtosis of expr 
KURTOSIS_SAMP(expr)  Calculates the sample kurtosis of expr 
KURT_POP(expr)  Alias for KURTOSIS_POP 
KURT_SAMP(expr)  Alias for KURTOSIS_SAMP 
MAX(expr)  Finds the maximum value of expr 
MEAN(expr)  Alias for AVG 
MIN(expr)  Finds the minimum value of expr 
PRODUCT(expr)  Multiplies all the values of expr 
REGR_AVGX(y, x)  Average of the independent variable (SUM(x)/N) 
REGR_AVGY(y, x)  Average of the dependent variable (SUM(y)/N) 
REGR_COUNT(y, x)  Number of input rows in which both expressions are non-null 
REGR_INTERCEPT(y, x)  Y-intercept of the least-squares-fit linear equation determined by the (X, Y) pairs 
REGR_R2(y, x)  Square of the correlation coefficient 
REGR_SLOPE(y, x)  Slope of the least-squares-fit linear equation determined by the (X, Y) pairs 
REGR_SXX(y, x)  "Sum of squares" of the independent variable (SUM(x^2) - SUM(x)^2/N) 
REGR_SXY(y, x)  "Sum of products" of independent variable times dependent variable (SUM(x * y) - SUM(x) * SUM(y)/N) 
REGR_SYY(y, x)  "Sum of squares" of the dependent variable (SUM(y^2) - SUM(y)^2/N) 
SKEW(expr)  Alias for SKEWNESS_POP 
SKEWNESS(expr)  Alias for SKEWNESS_POP 
SKEWNESS_POP(expr)  Calculates the population skew of expr 
SKEWNESS_SAMP(expr)  Calculates the sample skew of expr 
SKEW_POP(expr)  Alias for SKEWNESS_POP 
SKEW_SAMP(expr)  Alias for SKEWNESS_SAMP 
STDDEV(expr)  Alias for STDDEV_POP 
STDDEV_POP(expr)  Calculates the population standard deviation of the values of expr 
STDDEV_SAMP(expr)  Calculates the sample standard deviation of the values of expr 
SUM(expr)  Sums all the values of expr 
VAR(expr)  Alias for VAR_POP 
VAR_POP(expr)  Calculates the population variance of the values of expr 
VAR_SAMP(expr)  Calculates the sample variance of the values of expr 
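The REGR_* formulas in the table above can be checked with a small Python sketch. It uses the sums exactly as written for REGR_SXX, REGR_SXY, and REGR_SYY, and derives the slope, intercept, and R2 from them using the standard least-squares identities (slope = SXY/SXX, intercept = AVGY - slope * AVGX, R2 = SXY^2/(SXX * SYY)). The helper name regr_stats is hypothetical, and applying the "both expressions are non-null" rule to every REGR_* function, not only REGR_COUNT, is an assumption.

```python
def regr_stats(pairs):
    """Compute REGR_* style aggregates from (y, x) pairs; None models null."""
    # Assumption: only rows where both y and x are non-null take part
    rows = [(y, x) for y, x in pairs if y is not None and x is not None]
    n = len(rows)
    sum_x = sum(x for _, x in rows)
    sum_y = sum(y for y, _ in rows)
    sxx = sum(x * x for _, x in rows) - sum_x ** 2 / n
    syy = sum(y * y for y, _ in rows) - sum_y ** 2 / n
    sxy = sum(x * y for y, x in rows) - sum_x * sum_y / n
    slope = sxy / sxx
    return {
        "REGR_COUNT": n,
        "REGR_AVGX": sum_x / n,
        "REGR_AVGY": sum_y / n,
        "REGR_SXX": sxx,
        "REGR_SXY": sxy,
        "REGR_SYY": syy,
        "REGR_SLOPE": slope,
        "REGR_INTERCEPT": sum_y / n - slope * sum_x / n,
        "REGR_R2": sxy ** 2 / (sxx * syy),
    }


# Points on the line y = 2x + 1, plus one row with a null y that is skipped
stats = regr_stats([(3, 1), (5, 2), (7, 3), (9, 4), (None, 5)])
print(stats["REGR_COUNT"])      # 4
print(stats["REGR_SLOPE"])      # 2.0
print(stats["REGR_INTERCEPT"])  # 1.0
print(stats["REGR_R2"])         # 1.0
```

The sketch does not guard against an empty input or a constant x column, either of which would divide by zero.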
Grouping Functions
Function  Description 

ROLLUP(expr)  Calculates n + 1 aggregates for n number of columns in expr 
CUBE(expr)  Calculates 2^n aggregates for n number of columns in expr 
GROUPING SETS(expr)  Calculates aggregates for each grouping set given in expr, which may include ROLLUP() and CUBE() 
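The n + 1 and 2^n counts above come from the grouping sets each function expands to. The following Python sketch enumerates them for a hypothetical three-column list; the helper names are illustrative, and the grouping helper mirrors the 0/1 convention described for GROUPING(expr) in the aggregation functions table.

```python
from itertools import combinations


def rollup_sets(columns):
    """ROLLUP: each leading prefix of the column list, down to the empty set."""
    return [tuple(columns[:k]) for k in range(len(columns), -1, -1)]


def cube_sets(columns):
    """CUBE: every subset of the column list."""
    return [
        combo
        for k in range(len(columns), -1, -1)
        for combo in combinations(columns, k)
    ]


def grouping(column, grouping_set):
    """0 if column is part of the row's aggregation set, 1 if it is not."""
    return 0 if column in grouping_set else 1


cols = ["region", "product", "year"]  # hypothetical column names
print(len(rollup_sets(cols)))  # 4
print(len(cube_sets(cols)))    # 8
print(rollup_sets(cols))
# [('region', 'product', 'year'), ('region', 'product'), ('region',), ()]
print([grouping("product", s) for s in rollup_sets(cols)])  # [0, 0, 1, 1]
```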
Distribution Functions
Distribution functions are column expressions that affect the sharded/replicated nature of the result set of a given query. It may be necessary to force a result set to be distributed in a certain way for a subsequent operation on that result set to be performant.
Important
Employing these functions will prevent any automatic resharding of data to allow the query to succeed. Use only when a better query plan (with respect to data distribution) is known than any the system can devise.
Function  Description 

KI_REPLICATE()  Force a scalar result set to be replicated (query with no GROUP BY) 
KI_REPLICATE_GROUP_BY(0)  Force an aggregated result set to be replicated (query with GROUP BY) 
KI_MATCH_COLUMN(0)  Aligns the column count of queries that are part of a UNION, INTERSECT or EXCEPT with a query whose column list has been amended with either KI_REPLICATE_GROUP_BY or KI_SHARD_KEY 
KI_SHARD_KEY(<column list>)  Force the result set to be sharded on the given columns. This will override any implicitly-derived or explicitly-defined replication status the table would have had. Note: The column(s) listed in column list must also appear in the SELECT list; KI_SHARD_KEY merely identifies which of the selected columns should be used as the shard key. 
Sharding Example
For example, a query for all employees and their total employees managed, including employees who don't manage anyone, could employ a UNION like this:


In this example, the employee table is sharded on id. Since the first part of the UNION aggregates on manager_id, the result will be replicated. The second part of the UNION does no aggregation and includes the shard key in the SELECT list; the result of this will be sharded.
Prior to Kinetica v7.0, UNION operations required both parts of a UNION to be distributed the same way, so this query would fail with the following message:
GPUdb Error: either all input tables must be replicated or all input tables must be nonreplicated
To work around this limitation, a distribution function could be used. The SQL engine in Kinetica v7.0 and later applies this workaround automatically.
One option is to shard the first part of the UNION to match the second part:


Here, the distribution function KI_SHARD_KEY is used to make the selected manager_id column the new shard key for the first part of the UNION. Now, the shard key for the first part of the UNION (manager_id) aligns with the shard key for the second part (id), and the query succeeds. Note the use of KI_MATCH_COLUMN, which aligns the selected column lists on each side of the UNION. Without this matching distribution function, the UNION would appear to be merging three columns from the first part of the query into two columns in the second part and would fail.
Note
The manager_id column must exist in the SELECT list in order for the KI_SHARD_KEY function to designate it as the shard key.
Predicates
Predicates are generally used within a SQL WHERE clause to query records. They compare the values of two or more expressions; whenever a record meets the criteria defined in a predicate clause, it will be marked as eligible to be part of the query result set. If it meets all predicate clauses defined within a query, it will be returned in the result set.
A single predicate clause may use a simple predicate operator to compare the values of two expressions or a more complex predicate clause form. A compound predicate clause uses a compound predicate operator to link together multiple predicate clauses to further refine a result set.
Unlimited-width (non-charN) strings can only be used within equality-based predicates, e.g. =, IN, etc.
Predicate Operators
 = equality
 != or <> inequality
 < less than
 <= less than or equal to
 > greater than
 >= greater than or equal to
Predicate Clauses
In the following list of predicate clauses, ref_expr is the reference expression to apply the predicate to; note that EXISTS has no reference expression.
Predicate Clause  Description 

<expr_a> <pred_op> <expr_b>  Matches records where expr_a relates to expr_b according to predicate operator pred_op. 
<ref_expr> <pred_op> ALL (<select statement>)  Matches records where the reference expression ref_expr relates to all of the results of select statement according to the predicate operator pred_op 
<ref_expr> <pred_op> ANY (<select statement>)  Matches records where the reference expression ref_expr relates to any of the results of select statement according to the predicate operator pred_op 
<ref_expr> [NOT] BETWEEN <begin_expr> AND <end_expr>  Matches records where the reference expression ref_expr is (or is NOT) between the values of begin_expr and end_expr 
<ref_expr> [NOT] IN (<match_list>)  Matches records where the reference expression ref_expr is (or is NOT) in the match_list list of match values. The list can either be a comma-separated list of terms/expressions or the result of a SELECT statement. 
<ref_expr> IS [NOT] NULL  Matches records where the reference expression ref_expr is (or is NOT) null. 
[NOT] EXISTS (<select statement>)  Matches records where select statement returns (or does NOT return) 1 or more records. 
Compound Predicate Operators
Predicate Operator  Description 

<pred_a> AND <pred_b>  Matches records where both pred_a & pred_b are true 
<pred_a> OR <pred_b>  Matches records where either pred_a or pred_b is true 
NOT <pred>  Matches records where pred is false 
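The ALL, ANY, BETWEEN, and EXISTS clauses above can be read as quantifiers over a list of values. The following Python sketch models that reading; the helper names are hypothetical, the subquery result is represented as a plain list, BETWEEN is assumed to include both bounds (as in standard SQL), and null handling is not modeled.

```python
import operator


def pred_all(ref_expr, pred_op, results):
    """<ref_expr> <pred_op> ALL (...): the comparison holds for every result."""
    return all(pred_op(ref_expr, value) for value in results)


def pred_any(ref_expr, pred_op, results):
    """<ref_expr> <pred_op> ANY (...): the comparison holds for some result."""
    return any(pred_op(ref_expr, value) for value in results)


def between(ref_expr, begin_expr, end_expr):
    # Assumption: both bounds are inclusive, as in standard SQL
    return begin_expr <= ref_expr <= end_expr


def exists(results):
    """EXISTS (...): the select statement returned 1 or more records."""
    return len(results) > 0


subquery_results = [10, 20, 30]
print(pred_all(50, operator.gt, subquery_results))   # True
print(pred_all(15, operator.gt, subquery_results))   # False
print(pred_any(15, operator.gt, subquery_results))   # True
print(between(20, 10, 20), not between(25, 10, 20))  # True True
print(exists(subquery_results), not exists([]))      # True True
```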
Subqueries
Hints
Hint strings (KI_HINT) can be added as comments within queries, and affect just the query in which they appear. They will override the corresponding client & server settings (when such settings exist). For example:


Hint  Description 

KI_HINT_BATCH_SIZE(n)  Use an ingest batch size of n records. Default: 10,000. Only applicable when issuing INSERT statements. 
KI_HINT_CHUNK_SIZE(n)  Use chunk sizes of n records per chunk within result sets. Suffixes of K & M can be used to represent thousands or millions of records; e.g., 20K, 50M. 
KI_HINT_COMPARABLE_EXPLAIN  Simplify EXPLAIN output by removing unique identifiers from table names. Note: Potentially unsafe for use in a multi-user environment, as the table names produced will not be unique and may collide with the table names of other users who execute EXPLAIN with this hint. 
KI_HINT_DICT_PROJECTION  Retain the dictionary encoding attributes of source columns in a projection result set. 
KI_HINT_DISTRIBUTED_OPERATIONS  Reshard data when doing so would be the only way to process one or more operations in this query. 
KI_HINT_DONT_COMBINE  Don’t combine joins and unions for this query. 
KI_HINT_GROUP_BY_FORCE_REPLICATED  Make all result tables within a single query replicated; useful when meeting the input table requirements of JOIN, UNION, etc. in a query containing aggregated subqueries which generate differently-sharded result tables. 
KI_HINT_GROUP_BY_PK  Create primary keys for all GROUP BY result sets on the grouped-upon columns/expressions within a given query; often used to create a primary key on the result of a CREATE TABLE...AS that ends in a GROUP BY, and can also make materialized views containing grouping operations more performant. Note: If any of the grouped-on expressions are nullable, no primary key will be applied. 
KI_HINT_HAS_HEADER  Assume the first line of the source CSV file has a Kinetica header row. Only used with CSV ingestion. 
KI_HINT_INDEX(<column list>)  Create an index on each of the commaseparated columns in the given list; often used with CREATE TABLE...AS to create an index on a persisted result set. 
KI_HINT_JOBID_PREFIX(x)  Tag corresponding database job name(s) with x; e.g., KI_HINT_JOBID_PREFIX(tag) will result in job names like ODBC_tag_0123456789abcdef0123456789abcdef. 
KI_HINT_JOIN_TABLE_CHUNK_SIZE(n)  Use chunk sizes of n records per chunk within joins. Suffixes of K & M can be used to represent thousands or millions of records; e.g., 20K, 50M. 
KI_HINT_KEEP_TEMP_TABLES  Don’t erase temp tables created by this query. 
KI_HINT_NO_COST_BASED_OPTIMIZATION  Don't use the cost-based optimizer when calculating the query plan. 
KI_HINT_NO_DICT_PROJECTION  Don't retain the dictionary encoding attributes of source columns in a projection result set. 
KI_HINT_NO_DISTRIBUTED_OPERATIONS  Don't reshard data when doing so would be the only way to process one or more operations in this query. 
KI_HINT_NO_HEADER  Assume the source CSV file has no Kinetica header row. Only used with CSV ingestion. 
KI_HINT_NO_JOIN_COUNT  Optimize joins by not calculating intermediate set counts. 
KI_HINT_NO_LATE_MATERIALIZATION  Force the materialization of intermediary result sets. 
KI_HINT_NO_PARALLEL_EXECUTION  Execute all components of this query in series. 
KI_HINT_NO_PLAN_CACHE  Don't cache the query plan calculated for this query. 
KI_HINT_NO_RULE_BASED_OPTIMIZATION  Don't use the rule-based optimizer when calculating the query plan. 
KI_HINT_NO_VALIDATE_CHANGE  Don't fail an ALTER TABLE command when changing column types/sizes and the column data is too long/large. Truncate the data instead and allow the modification to succeed. 
KI_HINT_PROJECT_MATERIALIZED_VIEW  Force the materialization of a materialized view. Some materialized views containing JOIN clauses will be backed by a native join view. This is done to improve the performance of materialized view refreshes and reduce memory usage at the cost of reduced query performance. This hint will induce the reverse of this trade-off: increased query performance at the cost of reduced refresh performance and increased memory usage. 
KI_HINT_REPL_SYNC  Instruct the target database to treat this statement as one that should be run synchronously across all HA clusters in its ring. See High Availability Operation Handling for details. Note: The target database must be configured for high availability for this to have an effect. See High Availability Configuration & Management for details. 
KI_HINT_REPL_ASYNC  Instruct the target database to treat this statement as one that should be run asynchronously across all HA clusters in its ring. See High Availability Operation Handling for details. Note: The target database must be configured for high availability for this to have an effect. See High Availability Configuration & Management for details. 
KI_HINT_REQUEST_TIMEOUT(m)  Use a timeout of m minutes when processing this command. 
KI_HINT_TRUNCATE_STRINGS  Truncate all strings being inserted into restricted-width (charN) columns to their max width. Used with any INSERT INTO statement, including CSV ingestion. 
KI_HINT_UPDATE_ON_EXISTING_PK  Change the record collision policy for inserting into a table with a primary key to an upsert scheme; any existing table record with a primary key that matches a record being inserted will be replaced by that new record. Without this hint, the record being inserted will be discarded. If the specified table does not have a primary key, then this hint is ignored. 
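The K & M suffixes accepted by KI_HINT_CHUNK_SIZE and KI_HINT_JOIN_TABLE_CHUNK_SIZE stand for thousands and millions of records. As an illustration of that convention only (this is not the database's parser), the following Python sketch expands such a value into a record count. The helper name parse_record_count is hypothetical, and accepting only uppercase suffixes is an assumption.

```python
def parse_record_count(text):
    """Expand a count such as '20K' or '50M' into a number of records."""
    multipliers = {"K": 1_000, "M": 1_000_000}
    text = text.strip()
    # Assumption: only the uppercase suffixes shown in the table are accepted
    if text and text[-1] in multipliers:
        return int(text[:-1]) * multipliers[text[-1]]
    return int(text)


print(parse_record_count("20K"))   # 20000
print(parse_record_count("50M"))   # 50000000
print(parse_record_count("8000"))  # 8000
```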
EXPLAIN
Outputs the execution plan of a given SQL statement.
Tip
For the visual explain plan utility in GAdmin, see the Explain feature under SQL Tool.


If LOGICAL is specified, the algebraic execution tree of the statement is output. Otherwise, the physical execution plan will be output.
Each supporting API endpoint call that is made in servicing the request is listed as an execution plan step in the output, along with any input or output tables associated with the call and the prior plan execution steps on which a given execution step depends.
The following options can be specified:
 LOGICAL  outputs the algebraic execution tree of the statement
 PHYSICAL  (default) outputs the physical execution plan with the following endpoint-level details per step:
   ID  execution step number
   ENDPOINT  name of native API endpoint called
   INPUT_TABLES  input tables used by the endpoint (if any)
   OUTPUT_TABLE  output table created by the endpoint (if any)
   DEPENDENCIES  list of prior execution steps upon which this step depends
 ANALYZE  same as PHYSICAL, including additional runtime details:
   RUN_TIME  execution time of each endpoint call
   RESULT_ROWS  number of records produced in the endpoint call
 VERBOSE  same as PHYSICAL, including endpoint parameter details:
   COLUMNS  columns passed to the endpoint call
   EXPRESSIONS  expressions passed to the endpoint call
   OPTIONS  option keys & values passed to the endpoint call
   LAST_USE_TABLES  list of tables that will not be used by any following execution step
   ADDITIONAL_INFO  other parameters passed to the endpoint call
 VERBOSE ANALYZE  same as VERBOSE & ANALYZE together, including the execution plan for any joins contained within the query
 FORMAT JSON  outputs the result in JSON format
 FORMAT TABLE  (default) outputs the result in tabular format
Important
Specifying ANALYZE will cause the statement to be executed in order to collect runtime statistics on the endpoint calls made.
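The ID and DEPENDENCIES details of a physical plan describe a dependency graph: a step can only run once the prior steps it depends on have finished. The following Python sketch shows one way to read such a plan. The plan itself is hypothetical sample data, not real EXPLAIN output, and representing each step as a dictionary keyed by the documented field names, with DEPENDENCIES as a list of step IDs, is an assumption about the layout.

```python
def step_levels(plan):
    """Map each step ID to its depth in the dependency chain (1 = no deps)."""
    levels = {}
    for step in plan:  # steps are assumed to be listed in execution order
        deps = step["DEPENDENCIES"]
        missing = [d for d in deps if d not in levels]
        if missing:
            raise ValueError(f"step {step['ID']} depends on later steps {missing}")
        levels[step["ID"]] = 1 + max((levels[d] for d in deps), default=0)
    return levels


# Hypothetical plan: two filters feed a join, which feeds an aggregation
plan = [
    {"ID": 0, "ENDPOINT": "/filter", "DEPENDENCIES": []},
    {"ID": 1, "ENDPOINT": "/filter", "DEPENDENCIES": []},
    {"ID": 2, "ENDPOINT": "/create/jointable", "DEPENDENCIES": [0, 1]},
    {"ID": 3, "ENDPOINT": "/aggregate/groupby", "DEPENDENCIES": [2]},
]
print(step_levels(plan))  # {0: 1, 1: 1, 2: 2, 3: 3}
```

Steps that share a level have no dependency on each other, so in this reading steps 0 and 1 are independent, while step 2 must wait for both.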
For example, to output the execution plan for a query that aggregates the number of taxi rides between boroughs (using the KI_HINT_COMPARABLE_EXPLAIN hint to simplify output):


The execution plan is listed in table format, as follows:


If there is an error processing a query, the error can be returned in the JSON-formatted execution plan:


The execution plan is listed in JSON format with the query error, as follows:

