# Data transformation expressions in mapping data flow

APPLIES TO: Azure Data Factory Azure Synapse Analytics

Data flows are available both in Azure Data Factory and Azure Synapse Pipelines. This article applies to mapping data flows. If you are new to transformations, please refer to the introductory article Transform data using a mapping data flow.

This article provides details about expressions and functions supported by Azure Data Factory and Azure Synapse Analytics in mapping data flows.

## Expression functions

In Data Factory and Synapse pipelines, use the expression language of the mapping data flow feature to configure data transformations.

abs Absolute value of a number.
acos Calculates a cosine inverse value.
add Adds a pair of strings or numbers. Adds a date to a number of days. Adds a duration to a timestamp. Appends one array of similar type to another. Same as the + operator.
addDays Add days to a date or timestamp. Same as the + operator for date.
addMonths Add months to a date or timestamp. You can optionally pass a timezone.
and Logical AND operator. Same as &&.
asin Calculates an inverse sine value.
atan Calculates a inverse tangent value.
atan2 Returns the angle in radians between the positive x-axis of a plane and the point given by the coordinates.
between Checks if the first value is in between two other values inclusively. Numeric, string and datetime values can be compared
bitwiseAnd Bitwise And operator across integral types. Same as & operator
bitwiseOr Bitwise Or operator across integral types. Same as | operator
bitwiseXor Bitwise Or operator across integral types. Same as | operator
blake2b Calculates the Blake2 digest of set of column of varying primitive datatypes given a bit length which can only be multiples of 8 between 8 & 512. It can be used to calculate a fingerprint for a row
blake2bBinary Calculates the Blake2 digest of set of column of varying primitive datatypes given a bit length which can only be multiples of 8 between 8 & 512. It can be used to calculate a fingerprint for a row
case Based on alternating conditions applies one value or the other. If the number of inputs are even, the other is defaulted to NULL for last condition.
cbrt Calculates the cube root of a number.
ceil Returns the smallest integer not smaller than the number.
coalesce Returns the first not null value from a set of inputs. All inputs should be of the same type.
columnNames Gets the names of all output columns for a stream. You can pass an optional stream name as the second argument.
columns Gets the values of all output columns for a stream. You can pass an optional stream name as the second argument.
compare Compares two values of the same type. Returns negative integer if value1 < value2, 0 if value1 == value2, positive value if value1 > value2.
concat Concatenates a variable number of strings together. Same as the + operator with strings.
concatWS Concatenates a variable number of strings together with a separator. The first parameter is the separator.
cos Calculates a cosine value.
cosh Calculates a hyperbolic cosine of a value.
crc32 Calculates the CRC32 hash of set of column of varying primitive datatypes given a bit length which can only be of values 0(256), 224, 256, 384, 512. It can be used to calculate a fingerprint for a row.
currentDate Gets the current date when this job starts to run. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.
currentTimestamp Gets the current timestamp when the job starts to run with local time zone.
currentUTC Gets the current timestamp as UTC. If you want your current time to be interpreted in a different timezone than your cluster time zone, you can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. It is defaulted to the current timezone. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html. To convert the UTC time to a different timezone use `fromUTC()`.
dayOfMonth Gets the day of the month given a date.
dayOfWeek Gets the day of the week given a date. 1 - Sunday, 2 - Monday ..., 7 - Saturday.
dayOfYear Gets the day of the year given a date.
days Duration in milliseconds for number of days.
divide Divides pair of numbers. Same as the `/` operator.
dropLeft Removes as many characters from the left of the string. If the drop requested exceeds the length of the string, an empty string is returned.
dropRight Removes as many characters from the right of the string. If the drop requested exceeds the length of the string, an empty string is returned.
endsWith Checks if the string ends with the supplied string.
equals Comparison equals operator. Same as == operator.
equalsIgnoreCase Comparison equals operator ignoring case. Same as <=> operator.
escape Escapes a string according to a format. Literal values for acceptable format are 'json', 'xml', 'ecmascript', 'html', 'java'.
expr Results in a expression from a string. This is the same as writing this expression in a non-literal form. This can be used to pass parameters as string representations.
factorial Calculates the factorial of a number.
false Always returns a false value. Use the function `syntax(false())` if there is a column named 'false'.
floor Returns the largest integer not greater than the number.
fromBase64 Decodes the given base64-encoded string.
fromUTC Converts to the timestamp from UTC. You can optionally pass the timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. It is defaulted to the current timezone. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.
greater Comparison greater operator. Same as > operator.
greaterOrEqual Comparison greater than or equal operator. Same as >= operator.
greatest Returns the greatest value among the list of values as input skipping null values. Returns null if all inputs are null.
hasColumn Checks for a column value by name in the stream. You can pass a optional stream name as the second argument. Column names known at design time should be addressed just by their name. Computed inputs are not supported but you can use parameter substitutions.
hour Gets the hour value of a timestamp. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.
hours Duration in milliseconds for number of hours.
iif Based on a condition applies one value or the other. If other is unspecified it is considered NULL. Both the values must be compatible(numeric, string...).
iifNull Checks if the first parameter is null. If not null, the first parameter is returned. If null, the second parameter is returned. If three parameters are specified, the behavior is the same as iif(isNull(value1), value2, value3) and the third parameter is returned if the first value is not null.
initCap Converts the first letter of every word to uppercase. Words are identified as separated by whitespace.
instr Finds the position(1 based) of the substring within a string. 0 is returned if not found.
isDelete Checks if the row is marked for delete. For transformations taking more than one input stream you can pass the (1-based) index of the stream. The stream index should be either 1 or 2 and the default value is 1.
isError Checks if the row is marked as error. For transformations taking more than one input stream you can pass the (1-based) index of the stream. The stream index should be either 1 or 2 and the default value is 1.
isIgnore Checks if the row is marked to be ignored. For transformations taking more than one input stream you can pass the (1-based) index of the stream. The stream index should be either 1 or 2 and the default value is 1.
isInsert Checks if the row is marked for insert. For transformations taking more than one input stream you can pass the (1-based) index of the stream. The stream index should be either 1 or 2 and the default value is 1.
isMatch Checks if the row is matched at lookup. For transformations taking more than one input stream you can pass the (1-based) index of the stream. The stream index should be either 1 or 2 and the default value is 1.
isNull Checks if the value is NULL.
isUpdate Checks if the row is marked for update. For transformations taking more than one input stream you can pass the (1-based) index of the stream. The stream index should be either 1 or 2 and the default value is 1.
isUpsert Checks if the row is marked for insert. For transformations taking more than one input stream you can pass the (1-based) index of the stream. The stream index should be either 1 or 2 and the default value is 1.
jaroWinkler Gets the JaroWinkler distance between two strings.
lastDayOfMonth Gets the last date of the month given a date.
least Comparison lesser than or equal operator. Same as <= operator.
left Extracts a substring start at index 1 with number of characters. Same as SUBSTRING(str, 1, n).
length Returns the length of the string.
lesser Comparison less operator. Same as < operator.
lesserOrEqual Comparison lesser than or equal operator. Same as <= operator.
levenshtein Gets the levenshtein distance between two strings.
like The pattern is a string that is matched literally. The exceptions are the following special symbols: _ matches any one character in the input (similar to . in `posix` regular expressions)
locate Finds the position(1 based) of the substring within a string starting a certain position. If the position is omitted it is considered from the beginning of the string. 0 is returned if not found.
log Calculates log value. An optional base can be supplied else a Euler number if used.
log10 Calculates log value based on 10 base.
lower Lowercases a string.
lpad Left pads the string by the supplied padding until it is of a certain length. If the string is equal to or greater than the length, then it is trimmed to the length.
ltrim Left trims a string of leading characters. If second parameter is unspecified, it trims whitespace. Else it trims any character specified in the second parameter.
md5 Calculates the MD5 digest of set of column of varying primitive datatypes and returns a 32 character hex string. It can be used to calculate a fingerprint for a row.
millisecond Gets the millisecond value of a date. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.
milliseconds Duration in milliseconds for number of milliseconds.
minus Subtracts numbers. Subtract number of days from a date. Subtract duration from a timestamp. Subtract two timestamps to get difference in milliseconds. Same as the - operator.
minute Gets the minute value of a timestamp. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.
minutes Duration in milliseconds for number of minutes.
mod Modulus of pair of numbers. Same as the % operator.
month Gets the month value of a date or timestamp.
monthsBetween Gets the number of months between two dates. You can round off the calculation.You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.
multiply Multiplies pair of numbers. Same as the * operator.
negate Negates a number. Turns positive numbers to negative and vice versa.
nextSequence Returns the next unique sequence. The number is consecutive only within a partition and is prefixed by the partitionId.
normalize Normalizes the string value to separate accented unicode characters.
not Logical negation operator.
notEquals Comparison not equals operator. Same as != operator.
notNull Checks if the value is not NULL.
null Returns a NULL value. Use the function `syntax(null())` if there is a column named 'null'. Any operation that uses will result in a NULL.
or Logical OR operator. Same as ||.
pMod Positive Modulus of pair of numbers.
partitionId Returns the current partition ID the input row is in.
power Raises one number to the power of another.
random Returns a random number given an optional seed within a partition. The seed should be a fixed value and is used in conjunction with the partitionId to produce random values
regexExtract Extract a matching substring for a given regex pattern. The last parameter identifies the match group and is defaulted to 1 if omitted. Use `<regex>`(back quote) to match a string without escaping.
regexMatch Checks if the string matches the given regex pattern. Use `<regex>`(back quote) to match a string without escaping.
regexReplace Replace all occurrences of a regex pattern with another substring in the given string Use `<regex>`(back quote) to match a string without escaping.
regexSplit Splits a string based on a delimiter based on regex and returns an array of strings.
replace Replace all occurrences of a substring with another substring in the given string. If the last parameter is omitted, it is default to empty string.
reverse Reverses a string.
right Extracts a substring with number of characters from the right. Same as SUBSTRING(str, LENGTH(str) - n, n).
rlike Checks if the string matches the given regex pattern.
round Rounds a number given an optional scale and an optional rounding mode. If the scale is omitted, it is defaulted to 0. If the mode is omitted, it is defaulted to ROUND_HALF_UP(5). The values for rounding include
rpad Right pads the string by the supplied padding until it is of a certain length. If the string is equal to or greater than the length, then it is trimmed to the length.
rtrim Right trims a string of trailing characters. If second parameter is unspecified, it trims whitespace. Else it trims any character specified in the second parameter.
second Gets the second value of a date. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.
seconds Duration in milliseconds for number of seconds.
sha1 Calculates the SHA-1 digest of set of column of varying primitive datatypes and returns a 40 character hex string. It can be used to calculate a fingerprint for a row.
sha2 Calculates the SHA-2 digest of set of column of varying primitive datatypes given a bit length which can only be of values 0(256), 224, 256, 384, 512. It can be used to calculate a fingerprint for a row.
sin Calculates a sine value.
sinh Calculates a hyperbolic sine value.
soundex Gets the `soundex` code for the string.
split Splits a string based on a delimiter and returns an array of strings.
sqrt Calculates the square root of a number.
startsWith Checks if the string starts with the supplied string.
subDays Subtract days from a date or timestamp. Same as the - operator for date.
subMonths Subtract months from a date or timestamp.
substring Extracts a substring of a certain length from a position. Position is 1 based. If the length is omitted, it is defaulted to end of the string.
tan Calculates a tangent value.
tanh Calculates a hyperbolic tangent value.
translate Replace one set of characters by another set of characters in the string. Characters have 1 to 1 replacement.
trim Trims a string of leading and trailing characters. If second parameter is unspecified, it trims whitespace. Else it trims any character specified in the second parameter.
true Always returns a true value. Use the function `syntax(true())` if there is a column named 'true'.
typeMatch Matches the type of the column. Can only be used in pattern expressions.number matches short, integer, long, double, float or decimal, integral matches short, integer, long, fractional matches double, float, decimal and datetime matches date or timestamp type.
unescape Unescapes a string according to a format. Literal values for acceptable format are 'json', 'xml', 'ecmascript', 'html', 'java'.
upper Uppercases a string.
uuid Returns the generated UUID.
weekOfYear Gets the week of the year given a date.
weeks Duration in milliseconds for number of weeks.
xor Logical XOR operator. Same as ^ operator.
year Gets the year value of a date.

## Aggregate functions

The following functions are only available in aggregate, pivot, unpivot, and window transformations.

approxDistinctCount Gets the approximate aggregate count of distinct values for a column. The optional second parameter is to control the estimation error.
avg Gets the average of values of a column.
avgIf Based on a criteria gets the average of values of a column.
collect Collects all values of the expression in the aggregated group into an array. Structures can be collected and transformed to alternate structures during this process. The number of items will be equal to the number of rows in that group and can contain null values. The number of collected items should be small.
count Gets the aggregate count of values. If the optional column(s) is specified, it ignores NULL values in the count.
countDistinct Gets the aggregate count of distinct values of a set of columns.
countIf Based on a criteria gets the aggregate count of values. If the optional column is specified, it ignores NULL values in the count.
covariancePopulation Gets the population covariance between two columns.
covariancePopulationIf Based on a criteria, gets the population covariance of two columns.
covarianceSample Gets the sample covariance of two columns.
covarianceSampleIf Based on a criteria, gets the sample covariance of two columns.
first Gets the first value of a column group. If the second parameter ignoreNulls is omitted, it is assumed false.
isDistinct Finds if a column or set of columns is distinct. It does not count null as a distinct value
kurtosis Gets the kurtosis of a column.
kurtosisIf Based on a criteria, gets the kurtosis of a column.
last Gets the last value of a column group. If the second parameter ignoreNulls is omitted, it is assumed false.
max Gets the maximum value of a column.
maxIf Based on a criteria, gets the maximum value of a column.
mean Gets the mean of values of a column. Same as AVG.
meanIf Based on a criteria gets the mean of values of a column. Same as avgIf.
min Gets the minimum value of a column.
minIf Based on a criteria, gets the minimum value of a column.
skewness Gets the skewness of a column.
skewnessIf Based on a criteria, gets the skewness of a column.
stddev Gets the standard deviation of a column.
stddevIf Based on a criteria, gets the standard deviation of a column.
stddevPopulation Gets the population standard deviation of a column.
stddevPopulationIf Based on a criteria, gets the population standard deviation of a column.
stddevSample Gets the sample standard deviation of a column.
stddevSampleIf Based on a criteria, gets the sample standard deviation of a column.
sum Gets the aggregate sum of a numeric column.
sumDistinct Gets the aggregate sum of distinct values of a numeric column.
sumDistinctIf Based on criteria gets the aggregate sum of a numeric column. The condition can be based on any column.
sumIf Based on criteria gets the aggregate sum of a numeric column. The condition can be based on any column.
variance Gets the variance of a column.
varianceIf Based on a criteria, gets the variance of a column.
variancePopulation Gets the population variance of a column.
variancePopulationIf Based on a criteria, gets the population variance of a column.
varianceSample Gets the unbiased variance of a column.
varianceSampleIf Based on a criteria, gets the unbiased variance of a column.

## Array functions

Array functions perform transformations on data structures that are arrays. These include special keywords to address array elements and indexes:

• `#acc` represents a value that you wish to include in your single output when reducing an array
• `#index` represents the current array index, along with array index numbers `#index2, #index3 ...`
• `#item` represents the current element value in the array
array Creates an array of items. All items should be of the same type. If no items are specified, an empty string array is the default. Same as a [] creation operator.
at Finds the element at an array index. The index is 1-based. Out of bounds index results in a null value. Finds a value in a map given a key. If the key is not found it returns null.
contains Returns true if any element in the provided array evaluates as true in the provided predicate. Contains expects a reference to one element in the predicate function as #item.
distinct Returns a distinct set of items from an array.
except Returns a difference set of one array from another dropping duplicates.
filter Filters elements out of the array that do not meet the provided predicate. Filter expects a reference to one element in the predicate function as #item.
find Find the first item from an array that match the condition. It takes a filter function where you can address the item in the array as #item. For deeply nested maps you can refer to the parent maps using the #item_n(#item_1, #item_2...) notation.
flatten Flattens array or arrays into a single array. Arrays of atomic items are returned unaltered. The last argument is optional and is defaulted to false to flatten recursively more than one level deep.
in Checks if an item is in the array.
intersect Returns an intersection set of distinct items from 2 arrays.
map Maps each element of the array to a new element using the provided expression. Map expects a reference to one element in the expression function as #item.
mapIf Conditionally maps an array to another array of same or smaller length. The values can be of any datatype including structTypes. It takes a mapping function where you can address the item in the array as #item and current index as #index. For deeply nested maps you can refer to the parent maps using the `#item_[n](#item_1, #index_1...)` notation.
mapIndex Maps each element of the array to a new element using the provided expression. Map expects a reference to one element in the expression function as #item and a reference to the element index as #index.
mapLoop Loops through from 1 to length to create an array of that length. It takes a mapping function where you can address the index in the array as #index. For deeply nested maps you can refer to the parent maps using the #index_n(#index_1, #index_2...) notation.
reduce Accumulates elements in an array. Reduce expects a reference to an accumulator and one element in the first expression function as #acc and #item and it expects the resulting value as #result to be used in the second expression function.
size Finds the size of an array or map type
slice Extracts a subset of an array from a position. Position is 1 based. If the length is omitted, it is defaulted to end of the string.
sort Sorts the array using the provided predicate function. Sort expects a reference to two consecutive elements in the expression function as #item1 and #item2.
unfold Unfolds an array into a set of rows and repeats the values for the remaining columns in every row.
union Returns a union set of distinct items from 2 arrays.

## Cached lookup functions

The following functions are only available when using a cached lookup when you've included a cached sink.

lookup Looks up the first row from the cached sink using the specified keys that match the keys from the cached sink.
mlookup Looks up the all matching rows from the cached sink using the specified keys that match the keys from the cached sink.
output Returns the first row of the results of the cache sink
outputs Returns the entire output row set of the results of the cache sink

## Conversion functions

Conversion functions are used to convert data and test for data types

isBitSet Checks if a bit position is set in this bitset
setBitSet Sets bit positions in this bitset
isBoolean Checks if the string value is a boolean value according to the rules of `toBoolean()`
isByte Checks if the string value is a byte value given an optional format according to the rules of `toByte()`
isDate Checks if the input date string is a date using an optional input date format. Refer Java's SimpleDateFormat for available formats. If the input date format is omitted, default format is `yyyy-[M]M-[d]d`. Accepted formats are `[ yyyy, yyyy-[M]M, yyyy-[M]M-[d]d, yyyy-[M]M-[d]dT* ]`
isShort Checks of the string value is a short value given an optional format according to the rules of `toShort()`
isInteger Checks of the string value is a integer value given an optional format according to the rules of `toInteger()`
isLong Checks of the string value is a long value given an optional format according to the rules of `toLong()`
isNan Check if this is not a number.
isFloat Checks of the string value is a float value given an optional format according to the rules of `toFloat()`
isDouble Checks of the string value is a double value given an optional format according to the rules of `toDouble()`
isDecimal Checks of the string value is a decimal value given an optional format according to the rules of `toDecimal()`
isTimestamp Checks if the input date string is a timestamp using an optional input timestamp format. Refer to Java's SimpleDateFormat for available formats. If the timestamp is omitted the default pattern `yyyy-[M]M-[d]d hh:mm:ss[.f...]` is used. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. Timestamp supports up to millisecond accuracy with value of 999 Refer to Java's SimpleDateFormat for available formats.
toBase64 Encodes the given string in base64.
toBinary Converts any numeric/date/timestamp/string to binary representation.
toBoolean Converts a value of ('t', 'true', 'y', 'yes', '1') to true and ('f', 'false', 'n', 'no', '0') to false and NULL for any other value.
toByte Converts any numeric or string to a byte value. An optional Java decimal format can be used for the conversion.
toDate Converts input date string to date using an optional input date format. Refer Java's `SimpleDateFormat` class for available formats. If the input date format is omitted, default format is yyyy-[M]M-[d]d. Accepted formats are :[ yyyy, yyyy-[M]M, yyyy-[M]M-[d]d, yyyy-[M]M-[d]dT* ].
toDecimal Converts any numeric or string to a decimal value. If precision and scale are not specified, it is defaulted to (10,2).An optional Java decimal format can be used for the conversion. An optional locale format in the form of BCP47 language like en-US, de, zh-CN.
toDouble Converts any numeric or string to a double value. An optional Java decimal format can be used for the conversion. An optional locale format in the form of BCP47 language like en-US, de, zh-CN.
toFloat Converts any numeric or string to a float value. An optional Java decimal format can be used for the conversion. Truncates any double.
toInteger Converts any numeric or string to an integer value. An optional Java decimal format can be used for the conversion. Truncates any long, float, double.
toLong Converts any numeric or string to a long value. An optional Java decimal format can be used for the conversion. Truncates any float, double.
toShort Converts any numeric or string to a short value. An optional Java decimal format can be used for the conversion. Truncates any integer, long, float, double.
toString Converts a primitive datatype to a string. For numbers and date a format can be specified. If unspecified the system default is picked.Java decimal format is used for numbers. Refer to Java SimpleDateFormat for all possible date formats; the default format is yyyy-MM-dd.
toTimestamp Converts a string to a timestamp given an optional timestamp format. If the timestamp is omitted the default pattern yyyy-[M]M-[d]d hh:mm:ss[.f...] is used. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. Timestamp supports up to millisecond accuracy with value of 999. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.
toUTC Converts the timestamp to UTC. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. It is defaulted to the current timezone. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

## Map functions

Map functions perform operations on map data types

associate Creates a map of key/values. All the keys & values should be of the same type. If no items are specified, it is defaulted to a map of string to string type.Same as a `[ -> ]` creation operator. Keys and values should alternate with each other.
keyValues Creates a map of key/values. The first parameter is an array of keys and second is the array of values. Both arrays should have equal length.
mapAssociation Transforms a map by associating the keys to new values. Returns an array. It takes a mapping function where you can address the item as #key and current value as #value.
reassociate Transforms a map by associating the keys to new values. It takes a mapping function where you can address the item as #key and current value as #value.

## Metafunctions

byItem Find a sub item within a structure or array of structure If there are multiple matches, the first match is returned. If no match it returns a NULL value. The returned value has to be type converted by one of the type conversion actions(? date, ? string ...). Column names known at design time should be addressed just by their name. Computed inputs are not supported but you can use parameter substitutions
byOrigin Selects a column value by name in the origin stream. The second argument is the origin stream name. If there are multiple matches, the first match is returned. If no match it returns a NULL value. The returned value has to be type converted by one of the type conversion functions(TO_DATE, TO_STRING ...). Column names known at design time should be addressed just by their name. Computed inputs are not supported but you can use parameter substitutions.
byOrigins Selects an array of columns by name in the stream. The second argument is the stream where it originated from. If there are multiple matches, the first match is returned. If no match it returns a NULL value. The returned value has to be type converted by one of the type conversion functions(TO_DATE, TO_STRING ...) Column names known at design time should be addressed just by their name. Computed inputs are not supported but you can use parameter substitutions.
byName Selects a column value by name in the stream. You can pass a optional stream name as the second argument. If there are multiple matches, the first match is returned. If no match it returns a NULL value. The returned value has to be type converted by one of the type conversion functions(TO_DATE, TO_STRING ...). Column names known at design time should be addressed just by their name. Computed inputs are not supported but you can use parameter substitutions.
byNames Select an array of columns by name in the stream. You can pass a optional stream name as the second argument. If there are multiple matches, the first match is returned. If there are no matches for a column, the entire output is a NULL value. The returned value requires a type conversion functions (toDate, toString, ...). Column names known at design time should be addressed just by their name. Computed inputs are not supported but you can use parameter substitutions.
byPath Finds a hierarchical path by name in the stream. You can pass an optional stream name as the second argument. If no such path is found it returns null. Column names/paths known at design time should be addressed just by their name or dot notation path. Computed inputs are not supported but you can use parameter substitutions.
byPosition Selects a column value by its relative position(1 based) in the stream. If the position is out of bounds it returns a NULL value. The returned value has to be type converted by one of the type conversion functions(TO_DATE, TO_STRING ...) Computed inputs are not supported but you can use parameter substitutions.
hasPath Checks if a certain hierarchical path exists by name in the stream. You can pass an optional stream name as the second argument. Column names/paths known at design time should be addressed just by their name or dot notation path. Computed inputs are not supported but you can use parameter substitutions.
originColumns Gets all output columns for a origin stream where columns were created. Must be enclosed in another function.
hex Returns a hex string representation of a binary value
unhex Unhexes a binary value from its string representation. This can be used in conjunction with sha2, md5 to convert from string to binary representation

## Window functions

The following functions are only available in window transformations.

cumeDist The CumeDist function computes the position of a value relative to all values in the partition. The result is the number of rows preceding or equal to the current row in the ordering of the partition divided by the total number of rows in the window partition. Any tie values in the ordering will evaluate to the same position.
denseRank Computes the rank of a value in a group of values specified in a window's order by clause. The result is one plus the number of rows preceding or equal to the current row in the ordering of the partition. The values will not produce gaps in the sequence. Dense Rank works even when data is not sorted and looks for change in values.
lag Gets the value of the first parameter evaluated n rows before the current row. The second parameter is the number of rows to look back and the default value is 1. If there are not as many rows a value of null is returned unless a default value is specified.
lead Gets the value of the first parameter evaluated n rows after the current row. The second parameter is the number of rows to look forward and the default value is 1. If there are not as many rows a value of null is returned unless a default value is specified.
nTile The `NTile` function divides the rows for each window partition into `n` buckets ranging from 1 to at most `n`. Bucket values will differ by at most 1. If the number of rows in the partition does not divide evenly into the number of buckets, then the remainder values are distributed one per bucket, starting with the first bucket. The `NTile` function is useful for the calculation of `tertiles`, quartiles, deciles, and other common summary statistics. The function calculates two variables during initialization: The size of a regular bucket will have one extra row added to it. Both variables are based on the size of the current partition. During the calculation process the function keeps track of the current row number, the current bucket number, and the row number at which the bucket will change (bucketThreshold). When the current row number reaches bucket threshold, the bucket value is increased by one and the threshold is increased by the bucket size (plus one extra if the current bucket is padded).
rank Computes the rank of a value in a group of values specified in a window's order by clause. The result is one plus the number of rows preceding or equal to the current row in the ordering of the partition. The values will produce gaps in the sequence. Rank works even when data is not sorted and looks for change in values.
rowNumber Assigns a sequential row numbering for rows in a window starting with 1.

## Alphabetical listing of all functions

Following is an alphabetical listing of all functions available in mapping data flows.

### `abs`

`abs(<value1> : number) => number`

Absolute value of a number.

• `abs(-20) -> 20`
• `abs(10) -> 10`

### `acos`

`acos(<value1> : number) => double`

Calculates a cosine inverse value.

• `acos(1) -> 0.0`

### `add`

`add(<value1> : any, <value2> : any) => any`

Adds a pair of strings or numbers. Adds a date to a number of days. Adds a duration to a timestamp. Appends one array of similar type to another. Same as the + operator.

• `add(10, 20) -> 30`
• `10 + 20 -> 30`
• `add('ice', 'cream') -> 'icecream'`
• `'ice' + 'cream' + ' cone' -> 'icecream cone'`
• `add(toDate('2012-12-12'), 3) -> toDate('2012-12-15')`
• `toDate('2012-12-12') + 3 -> toDate('2012-12-15')`
• `[10, 20] + [30, 40] -> [10, 20, 30, 40]`
• `toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS') + (days(1) + hours(2) - seconds(10)) -> toTimestamp('2019-02-04 07:19:18.871', 'yyyy-MM-dd HH:mm:ss.SSS')`

### `addDays`

`addDays(<date/timestamp> : datetime, <days to add> : integral) => datetime`

Add days to a date or timestamp. Same as the + operator for date.

• `addDays(toDate('2016-08-08'), 1) -> toDate('2016-08-09')`

### `addMonths`

`addMonths(<date/timestamp> : datetime, <months to add> : integral, [<value3> : string]) => datetime`

Add months to a date or timestamp. You can optionally pass a timezone.

• `addMonths(toDate('2016-08-31'), 1) -> toDate('2016-09-30')`
• `addMonths(toTimestamp('2016-09-30 10:10:10'), -1) -> toTimestamp('2016-08-31 10:10:10')`

### `and`

`and(<value1> : boolean, <value2> : boolean) => boolean`

Logical AND operator. Same as &&.

• `and(true, false) -> false`
• `true && false -> false`

### `approxDistinctCount`

`approxDistinctCount(<value1> : any, [ <value2> : double ]) => long`

Gets the approximate aggregate count of distinct values for a column. The optional second parameter is to control the estimation error.

• `approxDistinctCount(ProductID, .05) => long`

### `array`

`array([<value1> : any], ...) => array`

Creates an array of items. All items should be of the same type. If no items are specified, an empty string array is the default. Same as a [] creation operator.

• `array('Seattle', 'Washington')`
• `['Seattle', 'Washington']`
• `['Seattle', 'Washington']`
• `'Washington'`

### `asin`

`asin(<value1> : number) => double`

Calculates an inverse sine value.

• `asin(0) -> 0.0`

### `associate`

`reassociate(<value1> : map, <value2> : binaryFunction) => map`

Creates a map of key/values. All the keys & values should be of the same type. If no items are specified, it is defaulted to a map of string to string type.Same as a `[ -> ]` creation operator. Keys and values should alternate with each other.

• `associate('fruit', 'apple', 'vegetable', 'carrot' )=> ['fruit' -> 'apple', 'vegetable' -> 'carrot']`

### `at`

`at(<value1> : array/map, <value2> : integer/key type) => array`

Finds the element at an array index. The index is 1-based. Out of bounds index results in a null value. Finds a value in a map given a key. If the key is not found it returns null.

• `at(['apples', 'pears'], 1) => 'apples'`
• `at(['fruit' -> 'apples', 'vegetable' -> 'carrot'], 'fruit') => 'apples'`

### `atan`

`atan(<value1> : number) => double`

Calculates a inverse tangent value.

• `atan(0) -> 0.0`

### `atan2`

`atan2(<value1> : number, <value2> : number) => double`

Returns the angle in radians between the positive x-axis of a plane and the point given by the coordinates.

• `atan2(0, 0) -> 0.0`

### `avg`

`avg(<value1> : number) => number`

Gets the average of values of a column.

• `avg(sales)`

### `avgIf`

`avgIf(<value1> : boolean, <value2> : number) => number`

Based on a criteria gets the average of values of a column.

• `avgIf(region == 'West', sales)`

### `between`

`between(<value1> : any, <value2> : any, <value3> : any) => boolean`

Checks if the first value is in between two other values inclusively. Numeric, string and datetime values can be compared

• `between(10, 5, 24)`
• `true`
• `between(currentDate(), currentDate() + 10, currentDate() + 20)`
• `false`

### `bitwiseAnd`

`bitwiseAnd(<value1> : integral, <value2> : integral) => integral`

Bitwise And operator across integral types. Same as & operator

• `bitwiseAnd(0xf4, 0xef)`
• `0xe4`
• `(0xf4 & 0xef)`
• `0xe4`

### `bitwiseOr`

`bitwiseOr(<value1> : integral, <value2> : integral) => integral`

Bitwise Or operator across integral types. Same as | operator

• `bitwiseOr(0xf4, 0xef)`
• `0xff`
• `(0xf4 | 0xef)`
• `0xff`

### `bitwiseXor`

`bitwiseXor(<value1> : any, <value2> : any) => any`

Bitwise Or operator across integral types. Same as | operator

• `bitwiseXor(0xf4, 0xef)`
• `0x1b`
• `(0xf4 ^ 0xef)`
• `0x1b`
• `(true ^ false)`
• `true`
• `(true ^ true)`
• `false`

### `blake2b`

`blake2b(<value1> : integer, <value2> : any, ...) => string`

Calculates the Blake2 digest of set of column of varying primitive datatypes given a bit length which can only be multiples of 8 between 8 & 512. It can be used to calculate a fingerprint for a row

• `blake2b(256, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4'))`
• `'c9521a5080d8da30dffb430c50ce253c345cc4c4effc315dab2162dac974711d'`

### `blake2bBinary`

`blake2bBinary(<value1> : integer, <value2> : any, ...) => binary`

Calculates the Blake2 digest of set of column of varying primitive datatypes given a bit length which can only be multiples of 8 between 8 & 512. It can be used to calculate a fingerprint for a row

• `blake2bBinary(256, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4'))`
• `unHex('c9521a5080d8da30dffb430c50ce253c345cc4c4effc315dab2162dac974711d')`

### `byItem`

`byItem(<parent column> : any, <column name> : string) => any`

Find a sub item within a structure or array of structure If there are multiple matches, the first match is returned. If no match it returns a NULL value. The returned value has to be type converted by one of the type conversion actions(? date, ? string ...). Column names known at design time should be addressed just by their name. Computed inputs are not supported but you can use parameter substitutions

• `byItem( byName('customer'), 'orderItems') ? (itemName as string, itemQty as integer)`
• `byItem( byItem( byName('customer'), 'orderItems'), 'itemName') ? string`

### `byName`

`byName(<column name> : string, [<stream name> : string]) => any`

Selects a column value by name in the stream. You can pass a optional stream name as the second argument. If there are multiple matches, the first match is returned. If no match it returns a NULL value. The returned value has to be type converted by one of the type conversion functions(TO_DATE, TO_STRING ...). Column names known at design time should be addressed just by their name. Computed inputs are not supported but you can use parameter substitutions.

• `toString(byName('parent'))`
• `toLong(byName('income'))`
• `toBoolean(byName('foster'))`
• `toLong(byName(\$debtCol))`
• `toString(byName('Bogus Column'))`
• `toString(byName('Bogus Column', 'DeriveStream'))`

### `byNames`

`byNames(<column names> : array, [<stream name> : string]) => any`

Select an array of columns by name in the stream. You can pass a optional stream name as the second argument. If there are multiple matches, the first match is returned. If there are no matches for a column, the entire output is a NULL value. The returned value requires a type conversion functions (toDate, toString, ...). Column names known at design time should be addressed just by their name. Computed inputs are not supported but you can use parameter substitutions.

• `toString(byNames(['parent', 'child']))`
• `byNames(['parent']) ? string`
• `toLong(byNames(['income']))`
• `byNames(['income']) ? long`
• `toBoolean(byNames(['foster']))`
• `toLong(byNames(\$debtCols))`
• `toString(byNames(['a Column']))`
• `toString(byNames(['a Column'], 'DeriveStream'))`
• `byNames(['orderItem']) ? (itemName as string, itemQty as integer)`

### `byOrigin`

`byOrigin(<column name> : string, [<origin stream name> : string]) => any`

Selects a column value by name in the origin stream. The second argument is the origin stream name. If there are multiple matches, the first match is returned. If no match it returns a NULL value. The returned value has to be type converted by one of the type conversion functions(TO_DATE, TO_STRING ...). Column names known at design time should be addressed just by their name. Computed inputs are not supported but you can use parameter substitutions.

• `toString(byOrigin('ancestor', 'ancestorStream'))`

### `byOrigins`

`byOrigins(<column names> : array, [<origin stream name> : string]) => any`

Selects an array of columns by name in the stream. The second argument is the stream where it originated from. If there are multiple matches, the first match is returned. If no match it returns a NULL value. The returned value has to be type converted by one of the type conversion functions(TO_DATE, TO_STRING ...) Column names known at design time should be addressed just by their name. Computed inputs are not supported but you can use parameter substitutions.

• `toString(byOrigins(['ancestor1', 'ancestor2'], 'ancestorStream'))`

### `byPath`

`byPath(<value1> : string, [<streamName> : string]) => any`

Finds a hierarchical path by name in the stream. You can pass an optional stream name as the second argument. If no such path is found it returns null. Column names/paths known at design time should be addressed just by their name or dot notation path. Computed inputs are not supported but you can use parameter substitutions.

• `byPath('grandpa.parent.child') => column`

### `byPosition`

`byPosition(<position> : integer) => any`

Selects a column value by its relative position(1 based) in the stream. If the position is out of bounds it returns a NULL value. The returned value has to be type converted by one of the type conversion functions(TO_DATE, TO_STRING ...) Computed inputs are not supported but you can use parameter substitutions.

• `toString(byPosition(1))`
• `toDecimal(byPosition(2), 10, 2)`
• `toBoolean(byName(4))`
• `toString(byName(\$colName))`
• `toString(byPosition(1234))`

### `case`

`case(<condition> : boolean, <true_expression> : any, <false_expression> : any, ...) => any`

Based on alternating conditions applies one value or the other. If the number of inputs are even, the other is defaulted to NULL for last condition.

• `case(10 + 20 == 30, 'dumbo', 'gumbo') -> 'dumbo'`
• `case(10 + 20 == 25, 'bojjus', 'do' < 'go', 'gunchus') -> 'gunchus'`
• `isNull(case(10 + 20 == 25, 'bojjus', 'do' > 'go', 'gunchus')) -> true`
• `case(10 + 20 == 25, 'bojjus', 'do' > 'go', 'gunchus', 'dumbo') -> 'dumbo'`

### `cbrt`

`cbrt(<value1> : number) => double`

Calculates the cube root of a number.

• `cbrt(8) -> 2.0`

### `ceil`

`ceil(<value1> : number) => number`

Returns the smallest integer not smaller than the number.

• `ceil(-0.1) -> 0`

### `coalesce`

`coalesce(<value1> : any, ...) => any`

Returns the first not null value from a set of inputs. All inputs should be of the same type.

• `coalesce(10, 20) -> 10`
• `coalesce(toString(null), toString(null), 'dumbo', 'bo', 'go') -> 'dumbo'`

### `collect`

`collect(<value1> : any) => array`

Collects all values of the expression in the aggregated group into an array. Structures can be collected and transformed to alternate structures during this process. The number of items will be equal to the number of rows in that group and can contain null values. The number of collected items should be small.

• `collect(salesPerson)`
• `collect(firstName + lastName))`
• `collect(@(name = salesPerson, sales = salesAmount) )`

### `columnNames`

`columnNames(<value1> : string) => array`

Gets the names of all output columns for a stream. You can pass an optional stream name as the second argument.

• `columnNames()`
• `columnNames('DeriveStream')`

### `columns`

`columns([<stream name> : string]) => any`

Gets the values of all output columns for a stream. You can pass an optional stream name as the second argument.

• `columns()`
• `columns('DeriveStream')`

### `compare`

`compare(<value1> : any, <value2> : any) => integer`

Compares two values of the same type. Returns negative integer if value1 < value2, 0 if value1 == value2, positive value if value1 > value2.

• `(compare(12, 24) < 1) -> true`
• `(compare('dumbo', 'dum') > 0) -> true`

### `concat`

`concat(<this> : string, <that> : string, ...) => string`

Concatenates a variable number of strings together. Same as the + operator with strings.

• `concat('dataflow', 'is', 'awesome') -> 'dataflowisawesome'`
• `'dataflow' + 'is' + 'awesome' -> 'dataflowisawesome'`
• `isNull('sql' + null) -> true`

### `concatWS`

`concatWS(<separator> : string, <this> : string, <that> : string, ...) => string`

Concatenates a variable number of strings together with a separator. The first parameter is the separator.

• `concatWS(' ', 'dataflow', 'is', 'awesome') -> 'dataflow is awesome'`
• `isNull(concatWS(null, 'dataflow', 'is', 'awesome')) -> true`
• `concatWS(' is ', 'dataflow', 'awesome') -> 'dataflow is awesome'`

### `contains`

`contains(<value1> : array, <value2> : unaryfunction) => boolean`

Returns true if any element in the provided array evaluates as true in the provided predicate. Contains expects a reference to one element in the predicate function as #item.

• `contains([1, 2, 3, 4], #item == 3) -> true`
• `contains([1, 2, 3, 4], #item > 5) -> false`

### `cos`

`cos(<value1> : number) => double`

Calculates a cosine value.

• `cos(10) -> -0.8390715290764524`

### `cosh`

`cosh(<value1> : number) => double`

Calculates a hyperbolic cosine of a value.

• `cosh(0) -> 1.0`

### `count`

`count([<value1> : any]) => long`

Gets the aggregate count of values. If the optional column(s) is specified, it ignores NULL values in the count.

• `count(custId)`
• `count(custId, custName)`
• `count()`
• `count(iif(isNull(custId), 1, NULL))`

### `countDistinct`

`countDistinct(<value1> : any, [<value2> : any], ...) => long`

Gets the aggregate count of distinct values of a set of columns.

• `countDistinct(custId, custName)`

### `countIf`

`countIf(<value1> : boolean, [<value2> : any]) => long`

Based on a criteria gets the aggregate count of values. If the optional column is specified, it ignores NULL values in the count.

• `countIf(state == 'CA' && commission < 10000, name)`

### `covariancePopulation`

`covariancePopulation(<value1> : number, <value2> : number) => double`

Gets the population covariance between two columns.

• `covariancePopulation(sales, profit)`

### `covariancePopulationIf`

`covariancePopulationIf(<value1> : boolean, <value2> : number, <value3> : number) => double`

Based on a criteria, gets the population covariance of two columns.

• `covariancePopulationIf(region == 'West', sales)`

### `covarianceSample`

`covarianceSample(<value1> : number, <value2> : number) => double`

Gets the sample covariance of two columns.

• `covarianceSample(sales, profit)`

### `covarianceSampleIf`

`covarianceSampleIf(<value1> : boolean, <value2> : number, <value3> : number) => double`

Based on a criteria, gets the sample covariance of two columns.

• `covarianceSampleIf(region == 'West', sales, profit)`

### `crc32`

`crc32(<value1> : any, ...) => long`

Calculates the CRC32 hash of set of column of varying primitive datatypes given a bit length which can only be of values 0(256), 224, 256, 384, 512. It can be used to calculate a fingerprint for a row.

• `crc32(256, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) -> 3630253689L`

### `cumeDist`

`cumeDist() => integer`

The CumeDist function computes the position of a value relative to all values in the partition. The result is the number of rows preceding or equal to the current row in the ordering of the partition divided by the total number of rows in the window partition. Any tie values in the ordering will evaluate to the same position.

• `cumeDist()`

### `currentDate`

`currentDate([<value1> : string]) => date`

Gets the current date when this job starts to run. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

• `currentDate() == toDate('2250-12-31') -> false`
• `currentDate('PST') == toDate('2250-12-31') -> false`
• `currentDate('America/New_York') == toDate('2250-12-31') -> false`

### `currentTimestamp`

`currentTimestamp() => timestamp`

Gets the current timestamp when the job starts to run with local time zone.

• `currentTimestamp() == toTimestamp('2250-12-31 12:12:12') -> false`

### `currentUTC`

`currentUTC([<value1> : string]) => timestamp`

Gets the current timestamp as UTC. If you want your current time to be interpreted in a different timezone than your cluster time zone, you can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. It is defaulted to the current timezone. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html. To convert the UTC time to a different timezone use `fromUTC()`.

• `currentUTC() == toTimestamp('2050-12-12 19:18:12') -> false`
• `currentUTC() != toTimestamp('2050-12-12 19:18:12') -> true`
• `fromUTC(currentUTC(), 'Asia/Seoul') != toTimestamp('2050-12-12 19:18:12') -> true`

### `dayOfMonth`

`dayOfMonth(<value1> : datetime) => integer`

Gets the day of the month given a date.

• `dayOfMonth(toDate('2018-06-08')) -> 8`

### `dayOfWeek`

`dayOfWeek(<value1> : datetime) => integer`

Gets the day of the week given a date. 1 - Sunday, 2 - Monday ..., 7 - Saturday.

• `dayOfWeek(toDate('2018-06-08')) -> 6`

### `dayOfYear`

`dayOfYear(<value1> : datetime) => integer`

Gets the day of the year given a date.

• `dayOfYear(toDate('2016-04-09')) -> 100`

### `days`

`days(<value1> : integer) => long`

Duration in milliseconds for number of days.

• `days(2) -> 172800000L`

### `degrees`

`degrees(<value1> : number) => double`

• `degrees(3.141592653589793) -> 180`

### `denseRank`

`denseRank() => integer`

Computes the rank of a value in a group of values specified in a window's order by clause. The result is one plus the number of rows preceding or equal to the current row in the ordering of the partition. The values will not produce gaps in the sequence. Dense Rank works even when data is not sorted and looks for change in values.

• `denseRank()`

### `distinct`

`distinct(<value1> : array) => array`

Returns a distinct set of items from an array.

• `distinct([10, 20, 30, 10]) => [10, 20, 30]`

### `divide`

`divide(<value1> : any, <value2> : any) => any`

Divides pair of numbers. Same as the `/` operator.

• `divide(20, 10) -> 2`
• `20 / 10 -> 2`

### `dropLeft`

`dropLeft(<value1> : string, <value2> : integer) => string`

Removes as many characters from the left of the string. If the drop requested exceeds the length of the string, an empty string is returned.

• dropLeft('bojjus', 2) => 'jjus'
• dropLeft('cake', 10) => ''

### `dropRight`

`dropRight(<value1> : string, <value2> : integer) => string`

Removes as many characters from the right of the string. If the drop requested exceeds the length of the string, an empty string is returned.

• dropRight('bojjus', 2) => 'bojj'
• dropRight('cake', 10) => ''

### `endsWith`

`endsWith(<string> : string, <substring to check> : string) => boolean`

Checks if the string ends with the supplied string.

• `endsWith('dumbo', 'mbo') -> true`

### `equals`

`equals(<value1> : any, <value2> : any) => boolean`

Comparison equals operator. Same as == operator.

• `equals(12, 24) -> false`
• `12 == 24 -> false`
• `'bad' == 'bad' -> true`
• `isNull('good' == toString(null)) -> true`
• `isNull(null == null) -> true`

### `equalsIgnoreCase`

`equalsIgnoreCase(<value1> : string, <value2> : string) => boolean`

Comparison equals operator ignoring case. Same as <=> operator.

• `'abc'<=>'Abc' -> true`
• `equalsIgnoreCase('abc', 'Abc') -> true`

### `escape`

`escape(<string_to_escape> : string, <format> : string) => string`

Escapes a string according to a format. Literal values for acceptable format are 'json', 'xml', 'ecmascript', 'html', 'java'.

### `except`

`except(<value1> : array, <value2> : array) => array`

Returns a difference set of one array from another dropping duplicates.

• `except([10, 20, 30], [20, 40]) => [10, 30]`

### `expr`

`expr(<expr> : string) => any`

Results in a expression from a string. This is the same as writing this expression in a non-literal form. This can be used to pass parameters as string representations.

• expr('price * discount') => any

### `factorial`

`factorial(<value1> : number) => long`

Calculates the factorial of a number.

• `factorial(5) -> 120`

### `false`

`false() => boolean`

Always returns a false value. Use the function `syntax(false())` if there is a column named 'false'.

• `(10 + 20 > 30) -> false`
• `(10 + 20 > 30) -> false()`

### `filter`

`filter(<value1> : array, <value2> : unaryfunction) => array`

Filters elements out of the array that do not meet the provided predicate. Filter expects a reference to one element in the predicate function as #item.

• `filter([1, 2, 3, 4], #item > 2) -> [3, 4]`
• `filter(['a', 'b', 'c', 'd'], #item == 'a' || #item == 'b') -> ['a', 'b']`

### `find`

`find(<value1> : array, <value2> : unaryfunction) => any`

Find the first item from an array that match the condition. It takes a filter function where you can address the item in the array as #item. For deeply nested maps you can refer to the parent maps using the #item_n(#item_1, #item_2...) notation.

• `find([10, 20, 30], #item > 10) -> 20`
• `find(['azure', 'data', 'factory'], length(#item) > 4) -> 'azure'`
• `find([ @( name = 'Daniel', types = [ @(mood = 'jovial', behavior = 'terrific'), @(mood = 'grumpy', behavior = 'bad') ] ), @( name = 'Mark', types = [ @(mood = 'happy', behavior = 'awesome'), @(mood = 'calm', behavior = 'reclusive') ] ) ], contains(#item.types, #item.mood=='happy') /*Filter out the happy kid*/ )`
• `@( name = 'Mark', types = [ @(mood = 'happy', behavior = 'awesome'), @(mood = 'calm', behavior = 'reclusive') ] )`

### `first`

`first(<value1> : any, [<value2> : boolean]) => any`

Gets the first value of a column group. If the second parameter ignoreNulls is omitted, it is assumed false.

• `first(sales)`
• `first(sales, false)`

### `flatten`

`flatten(<array> : array, <value2> : array ..., <value2> : boolean) => array`

Flattens array or arrays into a single array. Arrays of atomic items are returned unaltered. The last argument is optional and is defaulted to false to flatten recursively more than one level deep.

• `flatten([['bojjus', 'girl'], ['gunchus', 'boy']]) => ['bojjus', 'girl', 'gunchus', 'boy']`
• `flatten([[['bojjus', 'gunchus']]] , true) => ['bojjus', 'gunchus']`

### `floor`

`floor(<value1> : number) => number`

Returns the largest integer not greater than the number.

• `floor(-0.1) -> -1`

### `fromBase64`

`fromBase64(<value1> : string) => string`

Decodes the given base64-encoded string.

• `fromBase64('Z3VuY2h1cw==') -> 'gunchus'`

### `fromUTC`

`fromUTC(<value1> : timestamp, [<value2> : string]) => timestamp`

Converts to the timestamp from UTC. You can optionally pass the timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. It is defaulted to the current timezone. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

• `fromUTC(currentTimestamp()) == toTimestamp('2050-12-12 19:18:12') -> false`
• `fromUTC(currentTimestamp(), 'Asia/Seoul') != toTimestamp('2050-12-12 19:18:12') -> true`

### `greater`

`greater(<value1> : any, <value2> : any) => boolean`

Comparison greater operator. Same as > operator.

• `greater(12, 24) -> false`
• `('dumbo' > 'dum') -> true`
• `(toTimestamp('2019-02-05 08:21:34.890', 'yyyy-MM-dd HH:mm:ss.SSS') > toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS')) -> true`

### `greaterOrEqual`

`greaterOrEqual(<value1> : any, <value2> : any) => boolean`

Comparison greater than or equal operator. Same as >= operator.

• `greaterOrEqual(12, 12) -> true`
• `('dumbo' >= 'dum') -> true`

### `greatest`

`greatest(<value1> : any, ...) => any`

Returns the greatest value among the list of values as input skipping null values. Returns null if all inputs are null.

• `greatest(10, 30, 15, 20) -> 30`
• `greatest(10, toInteger(null), 20) -> 20`
• `greatest(toDate('2010-12-12'), toDate('2011-12-12'), toDate('2000-12-12')) -> toDate('2011-12-12')`
• `greatest(toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS'), toTimestamp('2019-02-05 08:21:34.890', 'yyyy-MM-dd HH:mm:ss.SSS')) -> toTimestamp('2019-02-05 08:21:34.890', 'yyyy-MM-dd HH:mm:ss.SSS')`

### `hasColumn`

`hasColumn(<column name> : string, [<stream name> : string]) => boolean`

Checks for a column value by name in the stream. You can pass a optional stream name as the second argument. Column names known at design time should be addressed just by their name. Computed inputs are not supported but you can use parameter substitutions.

• `hasColumn('parent')`

### `hasPath`

`hasPath(<value1> : string, [<streamName> : string]) => boolean`

Checks if a certain hierarchical path exists by name in the stream. You can pass an optional stream name as the second argument. Column names/paths known at design time should be addressed just by their name or dot notation path. Computed inputs are not supported but you can use parameter substitutions.

• `hasPath('grandpa.parent.child') => boolean`

### `hex`

`hex(<value1>: binary) => string`

Returns a hex string representation of a binary value

• `hex(toBinary([toByte(0x1f), toByte(0xad), toByte(0xbe)])) -> '1fadbe'`

### `hour`

`hour(<value1> : timestamp, [<value2> : string]) => integer`

Gets the hour value of a timestamp. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

• `hour(toTimestamp('2009-07-30 12:58:59')) -> 12`
• `hour(toTimestamp('2009-07-30 12:58:59'), 'PST') -> 12`

### `hours`

`hours(<value1> : integer) => long`

Duration in milliseconds for number of hours.

• `hours(2) -> 7200000L`

### `iif`

`iif(<condition> : boolean, <true_expression> : any, [<false_expression> : any]) => any`

Based on a condition applies one value or the other. If other is unspecified it is considered NULL. Both the values must be compatible(numeric, string...).

• `iif(10 + 20 == 30, 'dumbo', 'gumbo') -> 'dumbo'`
• `iif(10 > 30, 'dumbo', 'gumbo') -> 'gumbo'`
• `iif(month(toDate('2018-12-01')) == 12, 345.12, 102.67) -> 345.12`

### `iifNull`

`iifNull(<value1> : any, [<value2> : any], ...) => any`

Checks if the first parameter is null. If not null, the first parameter is returned. If null, the second parameter is returned. If three parameters are specified, the behavior is the same as iif(isNull(value1), value2, value3) and the third parameter is returned if the first value is not null.

• `iifNull(10, 20) -> 10`
• `iifNull(null, 20, 40) -> 20`
• `iifNull('azure', 'data', 'factory') -> 'factory'`
• `iifNull(null, 'data', 'factory') -> 'data'`

### `in`

`in(<array of items> : array, <item to find> : any) => boolean`

Checks if an item is in the array.

• `in([10, 20, 30], 10) -> true`
• `in(['good', 'kid'], 'bad') -> false`

### `initCap`

`initCap(<value1> : string) => string`

Converts the first letter of every word to uppercase. Words are identified as separated by whitespace.

• `initCap('cool iceCREAM') -> 'Cool Icecream'`

### `instr`

`instr(<string> : string, <substring to find> : string) => integer`

Finds the position(1 based) of the substring within a string. 0 is returned if not found.

• `instr('dumbo', 'mbo') -> 3`
• `instr('microsoft', 'o') -> 5`
• `instr('good', 'bad') -> 0`

### `intersect`

`intersect(<value1> : array, <value2> : array) => array`

Returns an intersection set of distinct items from 2 arrays.

• `intersect([10, 20, 30], [20, 40]) => `

### `isBitSet`

`isBitSet (<value1> : array, <value2>:integer ) => boolean`

Checks if a bit position is set in this bitset

• `isBitSet(toBitSet([10, 32, 98]), 10) => true`

### `isBoolean`

`isBoolean(<value1>: string) => boolean`

Checks if the string value is a boolean value according to the rules of `toBoolean()`

• `isBoolean('true') -> true`
• `isBoolean('no') -> true`
• `isBoolean('microsoft') -> false`

### `isByte`

`isByte(<value1> : string) => boolean`

Checks if the string value is a byte value given an optional format according to the rules of `toByte()`

• `isByte('123') -> true`
• `isByte('chocolate') -> false`

### `isDate`

`isDate (<value1> : string, [<format>: string]) => boolean`

Checks if the input date string is a date using an optional input date format. Refer Java's SimpleDateFormat for available formats. If the input date format is omitted, default format is `yyyy-[M]M-[d]d`. Accepted formats are `[ yyyy, yyyy-[M]M, yyyy-[M]M-[d]d, yyyy-[M]M-[d]dT* ]`

• `isDate('2012-8-18') -> true`
• `isDate('12/18--234234' -> 'MM/dd/yyyy') -> false`

### `isDecimal`

`isDecimal (<value1> : string) => boolean`

Checks of the string value is a decimal value given an optional format according to the rules of `toDecimal()`

• `isDecimal('123.45') -> true`
• `isDecimal('12/12/2000') -> false`

### `isDelete`

`isDelete([<value1> : integer]) => boolean`

Checks if the row is marked for delete. For transformations taking more than one input stream you can pass the (1-based) index of the stream. The stream index should be either 1 or 2 and the default value is 1.

• `isDelete()`
• `isDelete(1)`

### `isDistinct`

`isDistinct(<value1> : any , <value1> : any) => boolean`

Finds if a column or set of columns is distinct. It does not count null as a distinct value

• `isDistinct(custId, custName) => boolean`

### `isDouble`

`isDouble (<value1> : string, [<format>: string]) => boolean`

Checks of the string value is a double value given an optional format according to the rules of `toDouble()`

• `isDouble('123') -> true`
• `isDouble('\$123.45' -> '\$###.00') -> true`
• `isDouble('icecream') -> false`

### `isError`

`isError([<value1> : integer]) => boolean`

Checks if the row is marked as error. For transformations taking more than one input stream you can pass the (1-based) index of the stream. The stream index should be either 1 or 2 and the default value is 1.

• `isError()`
• `isError(1)`

### `isFloat`

`isFloat (<value1> : string, [<format>: string]) => boolean`

Checks of the string value is a float value given an optional format according to the rules of `toFloat()`

• `isFloat('123') -> true`
• `isFloat('\$123.45' -> '\$###.00') -> true`
• `isFloat('icecream') -> false`

### `isIgnore`

`isIgnore([<value1> : integer]) => boolean`

Checks if the row is marked to be ignored. For transformations taking more than one input stream you can pass the (1-based) index of the stream. The stream index should be either 1 or 2 and the default value is 1.

• `isIgnore()`
• `isIgnore(1)`

### `isInsert`

`isInsert([<value1> : integer]) => boolean`

Checks if the row is marked for insert. For transformations taking more than one input stream you can pass the (1-based) index of the stream. The stream index should be either 1 or 2 and the default value is 1.

• `isInsert()`
• `isInsert(1)`

### `isInteger`

`isInteger (<value1> : string, [<format>: string]) => boolean`

Checks of the string value is a integer value given an optional format according to the rules of `toInteger()`

• `isInteger('123') -> true`
• `isInteger('\$123' -> '\$###') -> true`
• `isInteger('microsoft') -> false`

### `isLong`

`isLong (<value1> : string, [<format>: string]) => boolean`

Checks of the string value is a long value given an optional format according to the rules of `toLong()`

• `isLong('123') -> true`
• `isLong('\$123' -> '\$###') -> true`
• `isLong('gunchus') -> false`

### `isMatch`

`isMatch([<value1> : integer]) => boolean`

Checks if the row is matched at lookup. For transformations taking more than one input stream you can pass the (1-based) index of the stream. The stream index should be either 1 or 2 and the default value is 1.

• `isMatch()`
• `isMatch(1)`

### `isNan`

`isNan (<value1> : integral) => boolean`

Check if this is not a number.

• `isNan(10.2) => false`

### `isNull`

`isNull(<value1> : any) => boolean`

Checks if the value is NULL.

• `isNull(NULL()) -> true`
• `isNull('') -> false`

### `isShort`

`isShort (<value1> : string, [<format>: string]) => boolean`

Checks of the string value is a short value given an optional format according to the rules of `toShort()`

• `isShort('123') -> true`
• `isShort('\$123' -> '\$###') -> true`
• `isShort('microsoft') -> false`

### `isTimestamp`

`isTimestamp (<value1> : string, [<format>: string]) => boolean`

Checks if the input date string is a timestamp using an optional input timestamp format. Refer to Java's SimpleDateFormat for available formats. If the timestamp is omitted the default pattern `yyyy-[M]M-[d]d hh:mm:ss[.f...]` is used. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. Timestamp supports up to millisecond accuracy with value of 999 Refer to Java's SimpleDateFormat for available formats.

• `isTimestamp('2016-12-31 00:12:00') -> true`
• `isTimestamp('2016-12-31T00:12:00' -> 'yyyy-MM-dd\\'T\\'HH:mm:ss' -> 'PST') -> true`
• `isTimestamp('2012-8222.18') -> false`

### `isUpdate`

`isUpdate([<value1> : integer]) => boolean`

Checks if the row is marked for update. For transformations taking more than one input stream you can pass the (1-based) index of the stream. The stream index should be either 1 or 2 and the default value is 1.

• `isUpdate()`
• `isUpdate(1)`

### `isUpsert`

`isUpsert([<value1> : integer]) => boolean`

Checks if the row is marked for insert. For transformations taking more than one input stream you can pass the (1-based) index of the stream. The stream index should be either 1 or 2 and the default value is 1.

• `isUpsert()`
• `isUpsert(1)`

### `jaroWinkler`

`jaroWinkler(<value1> : string, <value2> : string) => double`

Gets the JaroWinkler distance between two strings.

• `jaroWinkler('frog', 'frog') => 1.0`

### `keyValues`

`keyValues(<value1> : array, <value2> : array) => map`

Creates a map of key/values. The first parameter is an array of keys and second is the array of values. Both arrays should have equal length.

• `keyValues(['bojjus', 'appa'], ['gunchus', 'ammi']) => ['bojjus' -> 'gunchus', 'appa' -> 'ammi']`

### `kurtosis`

`kurtosis(<value1> : number) => double`

Gets the kurtosis of a column.

• `kurtosis(sales)`

### `kurtosisIf`

`kurtosisIf(<value1> : boolean, <value2> : number) => double`

Based on a criteria, gets the kurtosis of a column.

• `kurtosisIf(region == 'West', sales)`

### `lag`

`lag(<value> : any, [<number of rows to look before> : number], [<default value> : any]) => any`

Gets the value of the first parameter evaluated n rows before the current row. The second parameter is the number of rows to look back and the default value is 1. If there are not as many rows a value of null is returned unless a default value is specified.

• `lag(amount, 2)`
• `lag(amount, 2000, 100)`

### `last`

`last(<value1> : any, [<value2> : boolean]) => any`

Gets the last value of a column group. If the second parameter ignoreNulls is omitted, it is assumed false.

• `last(sales)`
• `last(sales, false)`

### `lastDayOfMonth`

`lastDayOfMonth(<value1> : datetime) => date`

Gets the last date of the month given a date.

• `lastDayOfMonth(toDate('2009-01-12')) -> toDate('2009-01-31')`

### `lead`

`lead(<value> : any, [<number of rows to look after> : number], [<default value> : any]) => any`

Gets the value of the first parameter evaluated n rows after the current row. The second parameter is the number of rows to look forward and the default value is 1. If there are not as many rows a value of null is returned unless a default value is specified.

• `lead(amount, 2)`
• `lead(amount, 2000, 100)`

### `least`

`least(<value1> : any, ...) => any`

Comparison lesser than or equal operator. Same as <= operator.

• `least(10, 30, 15, 20) -> 10`
• `least(toDate('2010-12-12'), toDate('2011-12-12'), toDate('2000-12-12')) -> toDate('2000-12-12')`

### `left`

`left(<string to subset> : string, <number of characters> : integral) => string`

Extracts a substring start at index 1 with number of characters. Same as SUBSTRING(str, 1, n).

• `left('bojjus', 2) -> 'bo'`
• `left('bojjus', 20) -> 'bojjus'`

### `length`

`length(<value1> : string) => integer`

Returns the length of the string.

• `length('dumbo') -> 5`

### `lesser`

`lesser(<value1> : any, <value2> : any) => boolean`

Comparison less operator. Same as < operator.

• `lesser(12, 24) -> true`
• `('abcd' < 'abc') -> false`
• `(toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS') < toTimestamp('2019-02-05 08:21:34.890', 'yyyy-MM-dd HH:mm:ss.SSS')) -> true`

### `lesserOrEqual`

`lesserOrEqual(<value1> : any, <value2> : any) => boolean`

Comparison lesser than or equal operator. Same as <= operator.

• `lesserOrEqual(12, 12) -> true`
• `('dumbo' <= 'dum') -> false`

### `levenshtein`

`levenshtein(<from string> : string, <to string> : string) => integer`

Gets the levenshtein distance between two strings.

• `levenshtein('boys', 'girls') -> 4`

### `like`

`like(<string> : string, <pattern match> : string) => boolean`

The pattern is a string that is matched literally. The exceptions are the following special symbols: _ matches any one character in the input (similar to . in `posix` regular expressions) % matches zero or more characters in the input (similar to .* in `posix` regular expressions). The escape character is ''. If an escape character precedes a special symbol or another escape character, the following character is matched literally. It is invalid to escape any other character.

• `like('icecream', 'ice%') -> true`

### `locate`

`locate(<substring to find> : string, <string> : string, [<from index - 1-based> : integral]) => integer`

Finds the position(1 based) of the substring within a string starting a certain position. If the position is omitted it is considered from the beginning of the string. 0 is returned if not found.

• `locate('mbo', 'dumbo') -> 3`
• `locate('o', 'microsoft', 6) -> 7`
• `locate('bad', 'good') -> 0`

### `log`

`log(<value1> : number, [<value2> : number]) => double`

Calculates log value. An optional base can be supplied else a Euler number if used.

• `log(100, 10) -> 2`

### `log10`

`log10(<value1> : number) => double`

Calculates log value based on 10 base.

• `log10(100) -> 2`

### `lookup`

`lookup(key, key2, ...) => complex[]`

Looks up the first row from the cached sink using the specified keys that match the keys from the cached sink.

• `cacheSink#lookup(movieId)`

### `lower`

`lower(<value1> : string) => string`

Lowercases a string.

• `lower('GunChus') -> 'gunchus'`

### `lpad`

`lpad(<string to pad> : string, <final padded length> : integral, <padding> : string) => string`

Left pads the string by the supplied padding until it is of a certain length. If the string is equal to or greater than the length, then it is trimmed to the length.

• `lpad('dumbo', 10, '-') -> '-----dumbo'`
• `lpad('dumbo', 4, '-') -> 'dumb'`
• `lpad('dumbo', 8, '<>') -> '<><dumbo'`

### `ltrim`

`ltrim(<string to trim> : string, [<trim characters> : string]) => string`

Left trims a string of leading characters. If second parameter is unspecified, it trims whitespace. Else it trims any character specified in the second parameter.

• `ltrim(' dumbo ') -> 'dumbo '`
• `ltrim('!--!du!mbo!', '-!') -> 'du!mbo!'`

### `map`

`map(<value1> : array, <value2> : unaryfunction) => any`

Maps each element of the array to a new element using the provided expression. Map expects a reference to one element in the expression function as #item.

• `map([1, 2, 3, 4], #item + 2) -> [3, 4, 5, 6]`
• `map(['a', 'b', 'c', 'd'], #item + '_processed') -> ['a_processed', 'b_processed', 'c_processed', 'd_processed']`

### `mapAssociation`

`mapAssociation(<value1> : map, <value2> : binaryFunction) => array`

Transforms a map by associating the keys to new values. Returns an array. It takes a mapping function where you can address the item as #key and current value as #value.

• `mapAssociation(['bojjus' -> 'gunchus', 'appa' -> 'ammi'], @(key = #key, value = #value)) => [@(key = 'bojjus', value = 'gunchus'), @(key = 'appa', value = 'ammi')]`

### `mapIf`

`mapIf (<value1> : array, <value2> : binaryfunction, <value3>: binaryFunction) => any`

Conditionally maps an array to another array of same or smaller length. The values can be of any datatype including structTypes. It takes a mapping function where you can address the item in the array as #item and current index as #index. For deeply nested maps you can refer to the parent maps using the `#item_[n](#item_1, #index_1...)` notation.

• `mapIf([10, 20, 30], #item > 10, #item + 5) -> [25, 35]`
• `mapIf(['icecream', 'cake', 'soda'], length(#item) > 4, upper(#item)) -> ['ICECREAM', 'CAKE']`

### `mapIndex`

`mapIndex(<value1> : array, <value2> : binaryfunction) => any`

Maps each element of the array to a new element using the provided expression. Map expects a reference to one element in the expression function as #item and a reference to the element index as #index.

• `mapIndex([1, 2, 3, 4], #item + 2 + #index) -> [4, 6, 8, 10]`

### `mapLoop`

`mapLoop(<value1> : integer, <value2> : unaryfunction) => any`

Loops through from 1 to length to create an array of that length. It takes a mapping function where you can address the index in the array as #index. For deeply nested maps you can refer to the parent maps using the #index_n(#index_1, #index_2...) notation.

• `mapLoop(3, #index * 10) -> [10, 20, 30]`

### `max`

`max(<value1> : any) => any`

Gets the maximum value of a column.

• `max(sales)`

### `maxIf`

`maxIf(<value1> : boolean, <value2> : any) => any`

Based on a criteria, gets the maximum value of a column.

• `maxIf(region == 'West', sales)`

### `md5`

`md5(<value1> : any, ...) => string`

Calculates the MD5 digest of set of column of varying primitive datatypes and returns a 32 character hex string. It can be used to calculate a fingerprint for a row.

• `md5(5, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) -> '4ce8a880bd621a1ffad0bca905e1bc5a'`

### `mean`

`mean(<value1> : number) => number`

Gets the mean of values of a column. Same as AVG.

• `mean(sales)`

### `meanIf`

`meanIf(<value1> : boolean, <value2> : number) => number`

Based on a criteria gets the mean of values of a column. Same as avgIf.

• `meanIf(region == 'West', sales)`

### `millisecond`

`millisecond(<value1> : timestamp, [<value2> : string]) => integer`

Gets the millisecond value of a date. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

• `millisecond(toTimestamp('2009-07-30 12:58:59.871', 'yyyy-MM-dd HH:mm:ss.SSS')) -> 871`

### `milliseconds`

`milliseconds(<value1> : integer) => long`

Duration in milliseconds for number of milliseconds.

• `milliseconds(2) -> 2L`

### `min`

`min(<value1> : any) => any`

Gets the minimum value of a column.

• `min(sales)`

### `minIf`

`minIf(<value1> : boolean, <value2> : any) => any`

Based on a criteria, gets the minimum value of a column.

• `minIf(region == 'West', sales)`

### `minus`

`minus(<value1> : any, <value2> : any) => any`

Subtracts numbers. Subtract number of days from a date. Subtract duration from a timestamp. Subtract two timestamps to get difference in milliseconds. Same as the - operator.

• `minus(20, 10) -> 10`
• `20 - 10 -> 10`
• `minus(toDate('2012-12-15'), 3) -> toDate('2012-12-12')`
• `toDate('2012-12-15') - 3 -> toDate('2012-12-12')`
• `toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS') + (days(1) + hours(2) - seconds(10)) -> toTimestamp('2019-02-04 07:19:18.871', 'yyyy-MM-dd HH:mm:ss.SSS')`
• `toTimestamp('2019-02-03 05:21:34.851', 'yyyy-MM-dd HH:mm:ss.SSS') - toTimestamp('2019-02-03 05:21:36.923', 'yyyy-MM-dd HH:mm:ss.SSS') -> -2072`

### `minute`

`minute(<value1> : timestamp, [<value2> : string]) => integer`

Gets the minute value of a timestamp. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

• `minute(toTimestamp('2009-07-30 12:58:59')) -> 58`
• `minute(toTimestamp('2009-07-30 12:58:59'), 'PST') -> 58`

### `minutes`

`minutes(<value1> : integer) => long`

Duration in milliseconds for number of minutes.

• `minutes(2) -> 120000L`

### `mlookup`

`mlookup(key, key2, ...) => complex[]`

Looks up the all matching rows from the cached sink using the specified keys that match the keys from the cached sink.

• `cacheSink#mlookup(movieId)`

### `mod`

`mod(<value1> : any, <value2> : any) => any`

Modulus of pair of numbers. Same as the % operator.

• `mod(20, 8) -> 4`
• `20 % 8 -> 4`

### `month`

`month(<value1> : datetime) => integer`

Gets the month value of a date or timestamp.

• `month(toDate('2012-8-8')) -> 8`

### `monthsBetween`

`monthsBetween(<from date/timestamp> : datetime, <to date/timestamp> : datetime, [<roundoff> : boolean], [<time zone> : string]) => double`

Gets the number of months between two dates. You can round off the calculation.You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

• `monthsBetween(toTimestamp('1997-02-28 10:30:00'), toDate('1996-10-30')) -> 3.94959677`

### `multiply`

`multiply(<value1> : any, <value2> : any) => any`

Multiplies pair of numbers. Same as the * operator.

• `multiply(20, 10) -> 200`
• `20 * 10 -> 200`

### `negate`

`negate(<value1> : number) => number`

Negates a number. Turns positive numbers to negative and vice versa.

• `negate(13) -> -13`

### `nextSequence`

`nextSequence() => long`

Returns the next unique sequence. The number is consecutive only within a partition and is prefixed by the partitionId.

• `nextSequence() == 12313112 -> false`

### `normalize`

`normalize(<String to normalize> : string) => string`

Normalizes the string value to separate accented unicode characters.

• `regexReplace(normalize('bo²s'), `\p{M}`, '') -> 'boys'`

### `not`

`not(<value1> : boolean) => boolean`

Logical negation operator.

• `not(true) -> false`
• `not(10 == 20) -> true`

### `notEquals`

`notEquals(<value1> : any, <value2> : any) => boolean`

Comparison not equals operator. Same as != operator.

• `12 != 24 -> true`
• `'bojjus' != 'bo' + 'jjus' -> false`

### `notNull`

`notNull(<value1> : any) => boolean`

Checks if the value is not NULL.

• `notNull(NULL()) -> false`
• `notNull('') -> true`

### `nTile`

`nTile([<value1> : integer]) => integer`

The `NTile` function divides the rows for each window partition into `n` buckets ranging from 1 to at most `n`. Bucket values will differ by at most 1. If the number of rows in the partition does not divide evenly into the number of buckets, then the remainder values are distributed one per bucket, starting with the first bucket. The `NTile` function is useful for the calculation of `tertiles`, quartiles, deciles, and other common summary statistics. The function calculates two variables during initialization: The size of a regular bucket will have one extra row added to it. Both variables are based on the size of the current partition. During the calculation process the function keeps track of the current row number, the current bucket number, and the row number at which the bucket will change (bucketThreshold). When the current row number reaches bucket threshold, the bucket value is increased by one and the threshold is increased by the bucket size (plus one extra if the current bucket is padded).

• `nTile()`
• `nTile(numOfBuckets)`

### `null`

`null() => null`

Returns a NULL value. Use the function `syntax(null())` if there is a column named 'null'. Any operation that uses will result in a NULL.

• `isNull('dumbo' + null) -> true`
• `isNull(10 * null) -> true`
• `isNull('') -> false`
• `isNull(10 + 20) -> false`
• `isNull(10/0) -> true`

### `or`

`or(<value1> : boolean, <value2> : boolean) => boolean`

Logical OR operator. Same as ||.

• `or(true, false) -> true`
• `true || false -> true`

### `originColumns`

`originColumns(<streamName> : string) => any`

Gets all output columns for a origin stream where columns were created. Must be enclosed in another function.

• `array(toString(originColumns('source1')))`

### `output`

`output() => any`

Returns the first row of the results of the cache sink

• `cacheSink#output()`

### `outputs`

`output() => any`

Returns the entire output row set of the results of the cache sink

• `cacheSink#outputs()`

### `partitionId`

`partitionId() => integer`

Returns the current partition ID the input row is in.

• `partitionId()`

### `pMod`

`pMod(<value1> : any, <value2> : any) => any`

Positive Modulus of pair of numbers.

• `pmod(-20, 8) -> 4`

### `power`

`power(<value1> : number, <value2> : number) => double`

Raises one number to the power of another.

• `power(10, 2) -> 100`

### `radians`

`radians(<value1> : number) => double`

• `radians(180) => 3.141592653589793`

### `random`

`random(<value1> : integral) => long`

Returns a random number given an optional seed within a partition. The seed should be a fixed value and is used in conjunction with the partitionId to produce random values

• `random(1) == 1 -> false`

### `rank`

`rank() => integer`

Computes the rank of a value in a group of values specified in a window's order by clause. The result is one plus the number of rows preceding or equal to the current row in the ordering of the partition. The values will produce gaps in the sequence. Rank works even when data is not sorted and looks for change in values.

• `rank()`

### `reassociate`

`reassociate(<value1> : map, <value2> : binaryFunction) => map`

Transforms a map by associating the keys to new values. It takes a mapping function where you can address the item as #key and current value as #value.

• `reassociate(['fruit' -> 'apple', 'vegetable' -> 'tomato'], substring(#key, 1, 1) + substring(#value, 1, 1)) => ['fruit' -> 'fa', 'vegetable' -> 'vt']`

### `reduce`

`reduce(<value1> : array, <value2> : any, <value3> : binaryfunction, <value4> : unaryfunction) => any`

Accumulates elements in an array. Reduce expects a reference to an accumulator and one element in the first expression function as #acc and #item and it expects the resulting value as #result to be used in the second expression function.

• `toString(reduce(['1', '2', '3', '4'], '0', #acc + #item, #result)) -> '01234'`

### `regexExtract`

`regexExtract(<string> : string, <regex to find> : string, [<match group 1-based index> : integral]) => string`

Extract a matching substring for a given regex pattern. The last parameter identifies the match group and is defaulted to 1 if omitted. Use `<regex>`(back quote) to match a string without escaping.

• `regexExtract('Cost is between 600 and 800 dollars', '(\\d+) and (\\d+)', 2) -> '800'`
• `regexExtract('Cost is between 600 and 800 dollars', `(\d+) and (\d+)`, 2) -> '800'`

### `regexMatch`

`regexMatch(<string> : string, <regex to match> : string) => boolean`

Checks if the string matches the given regex pattern. Use `<regex>`(back quote) to match a string without escaping.

• `regexMatch('200.50', '(\\d+).(\\d+)') -> true`
• `regexMatch('200.50', `(\d+).(\d+)`) -> true`

### `regexReplace`

`regexReplace(<string> : string, <regex to find> : string, <substring to replace> : string) => string`

Replace all occurrences of a regex pattern with another substring in the given string Use `<regex>`(back quote) to match a string without escaping.

• `regexReplace('100 and 200', '(\\d+)', 'bojjus') -> 'bojjus and bojjus'`
• `regexReplace('100 and 200', `(\d+)`, 'gunchus') -> 'gunchus and gunchus'`

### `regexSplit`

`regexSplit(<string to split> : string, <regex expression> : string) => array`

Splits a string based on a delimiter based on regex and returns an array of strings.

• `regexSplit('bojjusAgunchusBdumbo', `[CAB]`) -> ['bojjus', 'gunchus', 'dumbo']`
• `regexSplit('bojjusAgunchusBdumboC', `[CAB]`) -> ['bojjus', 'gunchus', 'dumbo', '']`
• `(regexSplit('bojjusAgunchusBdumboC', `[CAB]`)) -> 'bojjus'`
• `isNull(regexSplit('bojjusAgunchusBdumboC', `[CAB]`)) -> true`

### `replace`

`replace(<string> : string, <substring to find> : string, [<substring to replace> : string]) => string`

Replace all occurrences of a substring with another substring in the given string. If the last parameter is omitted, it is default to empty string.

• `replace('doggie dog', 'dog', 'cat') -> 'catgie cat'`
• `replace('doggie dog', 'dog', '') -> 'gie '`
• `replace('doggie dog', 'dog') -> 'gie '`

### `reverse`

`reverse(<value1> : string) => string`

Reverses a string.

• `reverse('gunchus') -> 'suhcnug'`

`right(<string to subset> : string, <number of characters> : integral) => string`

Extracts a substring with number of characters from the right. Same as SUBSTRING(str, LENGTH(str) - n, n).

• `right('bojjus', 2) -> 'us'`
• `right('bojjus', 20) -> 'bojjus'`

### `rlike`

`rlike(<string> : string, <pattern match> : string) => boolean`

Checks if the string matches the given regex pattern.

• `rlike('200.50', `(\d+).(\d+)`) -> true`
• `rlike('bogus', `M[0-9]+.*`) -> false`

### `round`

`round(<number> : number, [<scale to round> : number], [<rounding option> : integral]) => double`

Rounds a number given an optional scale and an optional rounding mode. If the scale is omitted, it is defaulted to 0. If the mode is omitted, it is defaulted to ROUND_HALF_UP(5). The values for rounding include 1 - ROUND_UP 2 - ROUND_DOWN 3 - ROUND_CEILING 4 - ROUND_FLOOR 5 - ROUND_HALF_UP 6 - ROUND_HALF_DOWN 7 - ROUND_HALF_EVEN 8 - ROUND_UNNECESSARY.

• `round(100.123) -> 100.0`
• `round(2.5, 0) -> 3.0`
• `round(5.3999999999999995, 2, 7) -> 5.40`

### `rowNumber`

`rowNumber() => integer`

Assigns a sequential row numbering for rows in a window starting with 1.

• `rowNumber()`

### `rpad`

`rpad(<string to pad> : string, <final padded length> : integral, <padding> : string) => string`

Right pads the string by the supplied padding until it is of a certain length. If the string is equal to or greater than the length, then it is trimmed to the length.

• `rpad('dumbo', 10, '-') -> 'dumbo-----'`
• `rpad('dumbo', 4, '-') -> 'dumb'`
• `rpad('dumbo', 8, '<>') -> 'dumbo<><'`

### `rtrim`

`rtrim(<string to trim> : string, [<trim characters> : string]) => string`

Right trims a string of trailing characters. If second parameter is unspecified, it trims whitespace. Else it trims any character specified in the second parameter.

• `rtrim(' dumbo ') -> ' dumbo'`
• `rtrim('!--!du!mbo!', '-!') -> '!--!du!mbo'`

### `second`

`second(<value1> : timestamp, [<value2> : string]) => integer`

Gets the second value of a date. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

• `second(toTimestamp('2009-07-30 12:58:59')) -> 59`

### `seconds`

`seconds(<value1> : integer) => long`

Duration in milliseconds for number of seconds.

• `seconds(2) -> 2000L`

### `setBitSet`

`setBitSet (<value1>: array, <value2>:array) => array`

Sets bit positions in this bitset

• `setBitSet(toBitSet([10, 32]), ) => [4294968320L, 17179869184L]`

### `sha1`

`sha1(<value1> : any, ...) => string`

Calculates the SHA-1 digest of set of column of varying primitive datatypes and returns a 40 character hex string. It can be used to calculate a fingerprint for a row.

• `sha1(5, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) -> '46d3b478e8ec4e1f3b453ac3d8e59d5854e282bb'`

### `sha2`

`sha2(<value1> : integer, <value2> : any, ...) => string`

Calculates the SHA-2 digest of set of column of varying primitive datatypes given a bit length which can only be of values 0(256), 224, 256, 384, 512. It can be used to calculate a fingerprint for a row.

• `sha2(256, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) -> 'afe8a553b1761c67d76f8c31ceef7f71b66a1ee6f4e6d3b5478bf68b47d06bd3'`

### `sin`

`sin(<value1> : number) => double`

Calculates a sine value.

• `sin(2) -> 0.9092974268256817`

### `sinh`

`sinh(<value1> : number) => double`

Calculates a hyperbolic sine value.

• `sinh(0) -> 0.0`

### `size`

`size(<value1> : any) => integer`

Finds the size of an array or map type

• `size(['element1', 'element2']) -> 2`
• `size([1,2,3]) -> 3`

### `skewness`

`skewness(<value1> : number) => double`

Gets the skewness of a column.

• `skewness(sales)`

### `skewnessIf`

`skewnessIf(<value1> : boolean, <value2> : number) => double`

Based on a criteria, gets the skewness of a column.

• `skewnessIf(region == 'West', sales)`

### `slice`

`slice(<array to slice> : array, <from 1-based index> : integral, [<number of items> : integral]) => array`

Extracts a subset of an array from a position. Position is 1 based. If the length is omitted, it is defaulted to end of the string.

• `slice([10, 20, 30, 40], 1, 2) -> [10, 20]`
• `slice([10, 20, 30, 40], 2) -> [20, 30, 40]`
• `slice([10, 20, 30, 40], 2) -> 20`
• `isNull(slice([10, 20, 30, 40], 2)) -> true`
• `isNull(slice([10, 20, 30, 40], 2)) -> true`
• `slice(['a', 'b', 'c', 'd'], 8) -> []`

### `sort`

`sort(<value1> : array, <value2> : binaryfunction) => array`

Sorts the array using the provided predicate function. Sort expects a reference to two consecutive elements in the expression function as #item1 and #item2.

• `sort([4, 8, 2, 3], compare(#item1, #item2)) -> [2, 3, 4, 8]`
• `sort(['a3', 'b2', 'c1'], iif(right(#item1, 1) >= right(#item2, 1), 1, -1)) -> ['c1', 'b2', 'a3']`

### `soundex`

`soundex(<value1> : string) => string`

Gets the `soundex` code for the string.

• `soundex('genius') -> 'G520'`

### `split`

`split(<string to split> : string, <split characters> : string) => array`

Splits a string based on a delimiter and returns an array of strings.

• `split('bojjus,guchus,dumbo', ',') -> ['bojjus', 'guchus', 'dumbo']`
• `split('bojjus,guchus,dumbo', '|') -> ['bojjus,guchus,dumbo']`
• `split('bojjus, guchus, dumbo', ', ') -> ['bojjus', 'guchus', 'dumbo']`
• `split('bojjus, guchus, dumbo', ', ') -> 'bojjus'`
• `isNull(split('bojjus, guchus, dumbo', ', ')) -> true`
• `isNull(split('bojjus, guchus, dumbo', ', ')) -> true`
• `split('bojjusguchusdumbo', ',') -> ['bojjusguchusdumbo']`

### `sqrt`

`sqrt(<value1> : number) => double`

Calculates the square root of a number.

• `sqrt(9) -> 3`

### `startsWith`

`startsWith(<string> : string, <substring to check> : string) => boolean`

Checks if the string starts with the supplied string.

• `startsWith('dumbo', 'du') -> true`

### `stddev`

`stddev(<value1> : number) => double`

Gets the standard deviation of a column.

• `stdDev(sales)`

### `stddevIf`

`stddevIf(<value1> : boolean, <value2> : number) => double`

Based on a criteria, gets the standard deviation of a column.

• `stddevIf(region == 'West', sales)`

### `stddevPopulation`

`stddevPopulation(<value1> : number) => double`

Gets the population standard deviation of a column.

• `stddevPopulation(sales)`

### `stddevPopulationIf`

`stddevPopulationIf(<value1> : boolean, <value2> : number) => double`

Based on a criteria, gets the population standard deviation of a column.

• `stddevPopulationIf(region == 'West', sales)`

### `stddevSample`

`stddevSample(<value1> : number) => double`

Gets the sample standard deviation of a column.

• `stddevSample(sales)`

### `stddevSampleIf`

`stddevSampleIf(<value1> : boolean, <value2> : number) => double`

Based on a criteria, gets the sample standard deviation of a column.

• `stddevSampleIf(region == 'West', sales)`

### `subDays`

`subDays(<date/timestamp> : datetime, <days to subtract> : integral) => datetime`

Subtract days from a date or timestamp. Same as the - operator for date.

• `subDays(toDate('2016-08-08'), 1) -> toDate('2016-08-07')`

### `subMonths`

`subMonths(<date/timestamp> : datetime, <months to subtract> : integral) => datetime`

Subtract months from a date or timestamp.

• `subMonths(toDate('2016-09-30'), 1) -> toDate('2016-08-31')`

### `substring`

`substring(<string to subset> : string, <from 1-based index> : integral, [<number of characters> : integral]) => string`

Extracts a substring of a certain length from a position. Position is 1 based. If the length is omitted, it is defaulted to end of the string.

• `substring('Cat in the hat', 5, 2) -> 'in'`
• `substring('Cat in the hat', 5, 100) -> 'in the hat'`
• `substring('Cat in the hat', 5) -> 'in the hat'`
• `substring('Cat in the hat', 100, 100) -> ''`

### `sum`

`sum(<value1> : number) => number`

Gets the aggregate sum of a numeric column.

• `sum(col)`

### `sumDistinct`

`sumDistinct(<value1> : number) => number`

Gets the aggregate sum of distinct values of a numeric column.

• `sumDistinct(col)`

### `sumDistinctIf`

`sumDistinctIf(<value1> : boolean, <value2> : number) => number`

Based on criteria gets the aggregate sum of a numeric column. The condition can be based on any column.

• `sumDistinctIf(state == 'CA' && commission < 10000, sales)`
• `sumDistinctIf(true, sales)`

### `sumIf`

`sumIf(<value1> : boolean, <value2> : number) => number`

Based on criteria gets the aggregate sum of a numeric column. The condition can be based on any column.

• `sumIf(state == 'CA' && commission < 10000, sales)`
• `sumIf(true, sales)`

### `tan`

`tan(<value1> : number) => double`

Calculates a tangent value.

• `tan(0) -> 0.0`

### `tanh`

`tanh(<value1> : number) => double`

Calculates a hyperbolic tangent value.

• `tanh(0) -> 0.0`

### `toBase64`

`toBase64(<value1> : string) => string`

Encodes the given string in base64.

• `toBase64('bojjus') -> 'Ym9qanVz'`

### `toBinary`

`toBinary(<value1> : any) => binary`

Converts any numeric/date/timestamp/string to binary representation.

• `toBinary(3) -> [0x11]`

### `toBoolean`

`toBoolean(<value1> : string) => boolean`

Converts a value of ('t', 'true', 'y', 'yes', '1') to true and ('f', 'false', 'n', 'no', '0') to false and NULL for any other value.

• `toBoolean('true') -> true`
• `toBoolean('n') -> false`
• `isNull(toBoolean('truthy')) -> true`

### `toByte`

`toByte(<value> : any, [<format> : string], [<locale> : string]) => byte`

Converts any numeric or string to a byte value. An optional Java decimal format can be used for the conversion.

• `toByte(123)`
• `123`
• `toByte(0xFF)`
• `-1`
• `toByte('123')`
• `123`

### `toDate`

`toDate(<string> : any, [<date format> : string]) => date`

Converts input date string to date using an optional input date format. Refer Java's `SimpleDateFormat` class for available formats. If the input date format is omitted, default format is yyyy-[M]M-[d]d. Accepted formats are :[ yyyy, yyyy-[M]M, yyyy-[M]M-[d]d, yyyy-[M]M-[d]dT* ].

• `toDate('2012-8-18') -> toDate('2012-08-18')`
• `toDate('12/18/2012', 'MM/dd/yyyy') -> toDate('2012-12-18')`

### `toDecimal`

`toDecimal(<value> : any, [<precision> : integral], [<scale> : integral], [<format> : string], [<locale> : string]) => decimal(10,0)`

Converts any numeric or string to a decimal value. If precision and scale are not specified, it is defaulted to (10,2).An optional Java decimal format can be used for the conversion. An optional locale format in the form of BCP47 language like en-US, de, zh-CN.

• `toDecimal(123.45) -> 123.45`
• `toDecimal('123.45', 8, 4) -> 123.4500`
• `toDecimal('\$123.45', 8, 4,'\$###.00') -> 123.4500`
• `toDecimal('Ç123,45', 10, 2, 'Ç###,##', 'de') -> 123.45`

### `toDouble`

`toDouble(<value> : any, [<format> : string], [<locale> : string]) => double`

Converts any numeric or string to a double value. An optional Java decimal format can be used for the conversion. An optional locale format in the form of BCP47 language like en-US, de, zh-CN.

• `toDouble(123.45) -> 123.45`
• `toDouble('123.45') -> 123.45`
• `toDouble('\$123.45', '\$###.00') -> 123.45`
• `toDouble('Ç123,45', 'Ç###,##', 'de') -> 123.45`

### `toFloat`

`toFloat(<value> : any, [<format> : string], [<locale> : string]) => float`

Converts any numeric or string to a float value. An optional Java decimal format can be used for the conversion. Truncates any double.

• `toFloat(123.45) -> 123.45f`
• `toFloat('123.45') -> 123.45f`
• `toFloat('\$123.45', '\$###.00') -> 123.45f`

### `toInteger`

`toInteger(<value> : any, [<format> : string], [<locale> : string]) => integer`

Converts any numeric or string to an integer value. An optional Java decimal format can be used for the conversion. Truncates any long, float, double.

• `toInteger(123) -> 123`
• `toInteger('123') -> 123`
• `toInteger('\$123', '\$###') -> 123`

### `toLong`

`toLong(<value> : any, [<format> : string], [<locale> : string]) => long`

Converts any numeric or string to a long value. An optional Java decimal format can be used for the conversion. Truncates any float, double.

• `toLong(123) -> 123`
• `toLong('123') -> 123`
• `toLong('\$123', '\$###') -> 123`

### `toShort`

`toShort(<value> : any, [<format> : string], [<locale> : string]) => short`

Converts any numeric or string to a short value. An optional Java decimal format can be used for the conversion. Truncates any integer, long, float, double.

• `toShort(123) -> 123`
• `toShort('123') -> 123`
• `toShort('\$123', '\$###') -> 123`

### `toString`

`toString(<value> : any, [<number format/date format> : string]) => string`

Converts a primitive datatype to a string. For numbers and date a format can be specified. If unspecified the system default is picked.Java decimal format is used for numbers. Refer to Java SimpleDateFormat for all possible date formats; the default format is yyyy-MM-dd.

• `toString(10) -> '10'`
• `toString('engineer') -> 'engineer'`
• `toString(123456.789, '##,###.##') -> '123,456.79'`
• `toString(123.78, '000000.000') -> '000123.780'`
• `toString(12345, '##0.#####E0') -> '12.345E3'`
• `toString(toDate('2018-12-31')) -> '2018-12-31'`
• `isNull(toString(toDate('2018-12-31', 'MM/dd/yy'))) -> true`
• `toString(4 == 20) -> 'false'`

### `toTimestamp`

`toTimestamp(<string> : any, [<timestamp format> : string], [<time zone> : string]) => timestamp`

Converts a string to a timestamp given an optional timestamp format. If the timestamp is omitted the default pattern yyyy-[M]M-[d]d hh:mm:ss[.f...] is used. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. Timestamp supports up to millisecond accuracy with value of 999. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

• `toTimestamp('2016-12-31 00:12:00') -> toTimestamp('2016-12-31 00:12:00')`
• `toTimestamp('2016-12-31T00:12:00', 'yyyy-MM-dd\'T\'HH:mm:ss', 'PST') -> toTimestamp('2016-12-31 00:12:00')`
• `toTimestamp('12/31/2016T00:12:00', 'MM/dd/yyyy\'T\'HH:mm:ss') -> toTimestamp('2016-12-31 00:12:00')`
• `millisecond(toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS')) -> 871`

### `toUTC`

`toUTC(<value1> : timestamp, [<value2> : string]) => timestamp`

Converts the timestamp to UTC. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. It is defaulted to the current timezone. Refer Java's `SimpleDateFormat` class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

• `toUTC(currentTimestamp()) == toTimestamp('2050-12-12 19:18:12') -> false`
• `toUTC(currentTimestamp(), 'Asia/Seoul') != toTimestamp('2050-12-12 19:18:12') -> true`

### `translate`

`translate(<string to translate> : string, <lookup characters> : string, <replace characters> : string) => string`

Replace one set of characters by another set of characters in the string. Characters have 1 to 1 replacement.

• `translate('(bojjus)', '()', '[]') -> '[bojjus]'`
• `translate('(gunchus)', '()', '[') -> '[gunchus'`

### `trim`

`trim(<string to trim> : string, [<trim characters> : string]) => string`

Trims a string of leading and trailing characters. If second parameter is unspecified, it trims whitespace. Else it trims any character specified in the second parameter.

• `trim(' dumbo ') -> 'dumbo'`
• `trim('!--!du!mbo!', '-!') -> 'du!mbo'`

### `true`

`true() => boolean`

Always returns a true value. Use the function `syntax(true())` if there is a column named 'true'.

• `(10 + 20 == 30) -> true`
• `(10 + 20 == 30) -> true()`

### `typeMatch`

`typeMatch(<type> : string, <base type> : string) => boolean`

Matches the type of the column. Can only be used in pattern expressions.number matches short, integer, long, double, float or decimal, integral matches short, integer, long, fractional matches double, float, decimal and datetime matches date or timestamp type.

• `typeMatch(type, 'number')`
• `typeMatch('date', 'datetime')`

### `unescape`

`unescape(<string_to_escape> : string, <format> : string) => string`

Unescapes a string according to a format. Literal values for acceptable format are 'json', 'xml', 'ecmascript', 'html', 'java'.

• `unescape('{\\\\\"value\\\\\": 10}', 'json')`
• `'{\\\"value\\\": 10}'`

### `unfold`

`unfold (<value1>: array) => any`

Unfolds an array into a set of rows and repeats the values for the remaining columns in every row.

• `unfold(addresses) => any`
• `unfold( @(name = salesPerson, sales = salesAmount) ) => any`

### `unhex`

`unhex(<value1>: string) => binary`

Unhexes a binary value from its string representation. This can be used in conjunction with sha2, md5 to convert from string to binary representation

• `unhex('1fadbe') -> toBinary([toByte(0x1f), toByte(0xad), toByte(0xbe)])`
• `unhex(md5(5, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4'))) -> toBinary([toByte(0x4c),toByte(0xe8),toByte(0xa8),toByte(0x80),toByte(0xbd),toByte(0x62),toByte(0x1a),toByte(0x1f),toByte(0xfa),toByte(0xd0),toByte(0xbc),toByte(0xa9),toByte(0x05),toByte(0xe1),toByte(0xbc),toByte(0x5a)])`

### `union`

`union(<value1>: array, <value2> : array) => array`

Returns a union set of distinct items from 2 arrays.

• `union([10, 20, 30], [20, 40]) => [10, 20, 30, 40]`

### `upper`

`upper(<value1> : string) => string`

Uppercases a string.

• `upper('bojjus') -> 'BOJJUS'`

### `uuid`

`uuid() => string`

Returns the generated UUID.

• `uuid()`

### `variance`

`variance(<value1> : number) => double`

Gets the variance of a column.

• `variance(sales)`

### `varianceIf`

`varianceIf(<value1> : boolean, <value2> : number) => double`

Based on a criteria, gets the variance of a column.

• `varianceIf(region == 'West', sales)`

### `variancePopulation`

`variancePopulation(<value1> : number) => double`

Gets the population variance of a column.

• `variancePopulation(sales)`

### `variancePopulationIf`

`variancePopulationIf(<value1> : boolean, <value2> : number) => double`

Based on a criteria, gets the population variance of a column.

• `variancePopulationIf(region == 'West', sales)`

### `varianceSample`

`varianceSample(<value1> : number) => double`

Gets the unbiased variance of a column.

• `varianceSample(sales)`

### `varianceSampleIf`

`varianceSampleIf(<value1> : boolean, <value2> : number) => double`

Based on a criteria, gets the unbiased variance of a column.

• `varianceSampleIf(region == 'West', sales)`

### `weekOfYear`

`weekOfYear(<value1> : datetime) => integer`

Gets the week of the year given a date.

• `weekOfYear(toDate('2008-02-20')) -> 8`

### `weeks`

`weeks(<value1> : integer) => long`

Duration in milliseconds for number of weeks.

• `weeks(2) -> 1209600000L`

### `xor`

`xor(<value1> : boolean, <value2> : boolean) => boolean`

Logical XOR operator. Same as ^ operator.

• `xor(true, false) -> true`
• `xor(true, true) -> false`
• `true ^ false -> true`

### `year`

`year(<value1> : datetime) => integer`

Gets the year value of a date.

• `year(toDate('2012-8-8')) -> 2012`