join operator

Merge the rows of two tables to form a new table by matching values of the specified column(s) from each table.

Table1 | join (Table2) on CommonColumn, $left.Col1 == $right.Col2

Syntax

LeftTable | join [JoinParameters] ( RightTable ) on Attributes

Arguments

  • LeftTable: The left table or tabular expression (sometimes called outer table) whose rows are to be merged. Denoted as $left.

  • RightTable: The right table or tabular expression (sometimes called *inner table) whose rows are to be merged. Denoted as $right.

  • Attributes: One or more (comma-separated) rules that describe how rows from LeftTable are matched to rows from RightTable. Multiple rules are evaluated using the and logical operator. A rule can be one of:

    Rule kind Syntax Predicate
    Equality by name ColumnName where LeftTable.ColumnName == RightTable.ColumnName
    Equality by value $left.LeftColumn == $right.RightColumn where $left.LeftColumn == $right.RightColumn

Note

In case of 'equality by value', the column names must be qualified with the applicable owner table denoted by $left and $right notations.

  • JoinParameters: Zero or more (space-separated) parameters in the form of Name = Value that control the behavior of the row-match operation and execution plan. The following parameters are supported:

    Name Values Description
    kind Join flavors See Join Flavors
    hint.remote auto, left, local, right See Cross-Cluster Join
    hint.strategy Execution hints See Join hints

Warning

The default join flavor, if kind is not specified, is innerunique. This is different than some other analytics products, which have inner as the default flavor. Please read carefully below to understand the differences between the different kinds and to make sure the query yields the intended results.

Returns

A table with:

  • A column for every column in each of the two tables, including the matching keys. The columns of the right side will be automatically renamed if there are name clashes.

  • A row for every match between the input tables. A match is a row selected from one table that has the same value for all the on fields as a row in the other table.

  • kind unspecified, kind=innerunique

    Only one row from the left side is matched for each value of the on key. The output contains a row for each match of this row with rows from the right.

  • Kind=inner

    There's a row in the output for every combination of matching rows from left and right.

  • kind=leftouter (or kind=rightouter or kind=fullouter)

    In addition to the inner matches, there's a row for every row on the left (and/or right), even if it has no match. In that case, the unmatched output cells contain nulls.

  • kind=leftanti (or kind=rightanti)

    Returns all the records from the left side that do not have matches from the right. The result table just has the columns from the left side. Equivalent to kind=leftantisemi.

  • kind=leftsemi (or kind=rightsemi)

    Returns all the records from the left side that have matches from the right. The result table contains columns from the left side only.

If there are several rows with the same values for those fields, you'll get rows for all the combinations.

Tips

For best performance:

  • Use where and project to reduce the numbers of rows and columns in the input tables, before the join.
  • If one table is always smaller than the other, use it as the left (piped) side of the join.

Example

Get extended activities from a log in which some entries mark the start and end of an activity.

let Events = MyLogTable | where type=="Event" ;
Events
| where Name == "Start"
| project Name, City, ActivityId, StartTime=timestamp
| join (Events
    | where Name == "Stop"
        | project StopTime=timestamp, ActivityId)
    on ActivityId
| project City, ActivityId, StartTime, StopTime, Duration = StopTime - StartTime
let Events = MyLogTable | where type=="Event" ;
Events
| where Name == "Start"
| project Name, City, ActivityIdLeft = ActivityId, StartTime=timestamp
| join (Events
        | where Name == "Stop"
        | project StopTime=timestamp, ActivityIdRight = ActivityId)
    on $left.ActivityIdLeft == $right.ActivityIdRight
| project City, ActivityId, StartTime, StopTime, Duration = StopTime - StartTime

More about this example.

Join flavors

The exact flavor of the join operator is specified with the kind keyword. As of today, Kusto supports the following flavors of the join operator:

Join kind Description
innerunique (or empty as default) Inner join with left side deduplication
inner Standard inner join
leftouter Left outer join
rightouter Right outer join
fullouter Full outer join
leftanti, anti or leftantisemi Left anti join
rightanti or rightantisemi Right anti join
leftsemi Left semi join
rightsemi Right semi join

inner and innerunique join operator flavors

  • When using inner join flavor, there will be a row in the output for every combination of matching rows from left and right without left keys deduplications. The output will be a cartesian product of left and right keys. Example of inner join:
let t1 = datatable(key:long, value:string)  
[
1, "val1.1",  
1, "val1.2"  
];
let t2 = datatable(key:long, value:string)  
[  
1, "val1.3", 
1, "val1.4"  
];
t1
| join kind = inner
    t2
on key
key value key1 value1
1 val1.2 1 val1.3
1 val1.1 1 val1.3
1 val1.2 1 val1.4
1 val1.1 1 val1.4
  • Using innerunique join flavor will deduplicate keys from left side and there will be a row in the output from every combination of deduplicated left keys and right keys. Example of innerunique join for the same datasets used above, Please note that innerunique flavor for this case may yield two possible outputs and both are correct. In the first output, the join operator randomly picked the first key which appears in t1 with the value "val1.1" and matched it with t2 keys while in the second one, the join operator randomly picked the second key appears in t1 which has the value "val1.2" and matched it with t2 keys:
let t1 = datatable(key:long, value:string)  
[
1, "val1.1",  
1, "val1.2"  
];
let t2 = datatable(key:long, value:string)  
[  
1, "val1.3", 
1, "val1.4"  
];
t1
| join kind = innerunique
    t2
on key
key value key1 value1
1 val1.1 1 val1.3
1 val1.1 1 val1.4
let t1 = datatable(key:long, value:string)  
[
1, "val1.1",  
1, "val1.2"  
];
let t2 = datatable(key:long, value:string)  
[  
1, "val1.3", 
1, "val1.4"  
];
t1
| join kind = innerunique
    t2
on key
key value key1 value1
1 val1.2 1 val1.3
1 val1.2 1 val1.4
  • Kusto is optimized to push filters that come after the join towards the appropriate join side left or right when possible. Sometimes, when the flavor used is innerunique and the filter can be propagated to the left side of the join, then it will be propagated automatically and the keys which applies to that filter will always appear in the output. for example, using the example above and adding filter where value == "val1.2" will always give the second result and will never give the first result for the used datasets :
let t1 = datatable(key:long, value:string)  
[
1, "val1.1",  
1, "val1.2"  
];
let t2 = datatable(key:long, value:string)  
[  
1, "val1.3", 
1, "val1.4"  
];
t1
| join kind = innerunique
    t2
on key
| where value == "val1.2"
key value key1 value1
1 val1.2 1 val1.3
1 val1.2 1 val1.4

Default join flavor

X | join Y on Key
X | join kind=innerunique Y on Key
 

Let's use two sample tables to explain the operation of the join:

Table X

Key Value1
a 1
b 2
b 3
c 4

Table Y

Key Value2
b 10
c 20
c 30
d 40

The default join performs an inner join after de-duplicating the left side on the join key (deduplication retains the first record). Given this statement:

X | join Y on Key 

the effective left side of the join (table X after de-duplication) would be:

Key Value1
a 1
b 2
c 4

and the result of the join would be:

let X = datatable(Key:string, Value1:long)
[
    'a',1,
    'b',2,
    'b',3,
    'c',4
];
let Y = datatable(Key:string, Value2:long)
[
    'b',10,
    'c',20,
    'c',30,
    'd',40
];
X | join Y on Key
Key Value1 Key1 Value2
b 2 b 10
c 4 c 20
c 4 c 30

(Note that the keys 'a' and 'd' do not appear in the output, since there were no matching keys on both left and right sides).

(Historically, this was the first implementation of the join supported by the initial version of Kusto; it is useful in the typical log/trace analysis scenarios where we want to correlate two events (each matching some filtering criterion) under the same correlation ID, and get back all appearances of the phenomenon we're looking for, ignoring multiple appearances of the contributing trace records.)

Inner join

This is the standard inner join as known from the SQL world. Output record is produced whenever a record on the left side has the same join key as the record on the right side.

let X = datatable(Key:string, Value1:long)
[
    'a',1,
    'b',2,
    'b',3,
    'c',4
];
let Y = datatable(Key:string, Value2:long)
[
    'b',10,
    'c',20,
    'c',30,
    'd',40
];
X | join kind=inner Y on Key
Key Value1 Key1 Value2
b 3 b 10
b 2 b 10
c 4 c 20
c 4 c 30

Note that (b,10) coming from the right side was joined twice: with both (b,2) and (b,3) on the left; also (c,4) on the left was joined twice: with both (c,20) and (c,30) on the right.

Left outer join

The result of a left outer join for tables X and Y always contains all records of the left table (X), even if the join condition does not find any matching record in the right table (Y).

let X = datatable(Key:string, Value1:long)
[
    'a',1,
    'b',2,
    'b',3,
    'c',4
];
let Y = datatable(Key:string, Value2:long)
[
    'b',10,
    'c',20,
    'c',30,
    'd',40
];
X | join kind=leftouter Y on Key
Key Value1 Key1 Value2
b 3 b 10
b 2 b 10
c 4 c 20
c 4 c 30
a 1

Right outer join

Resembles the left outer join, but the treatment of the tables is reversed.

let X = datatable(Key:string, Value1:long)
[
    'a',1,
    'b',2,
    'b',3,
    'c',4
];
let Y = datatable(Key:string, Value2:long)
[
    'b',10,
    'c',20,
    'c',30,
    'd',40
];
X | join kind=rightouter Y on Key
Key Value1 Key1 Value2
b 3 b 10
b 2 b 10
c 4 c 20
c 4 c 30
d 40

Full outer join

Conceptually, a full outer join combines the effect of applying both left and right outer joins. Where records in the joined tables do not match, the result set will have NULL values for every column of the table that lacks a matching row. For those records that do match, a single row will be produced in the result set (containing fields populated from both tables).

let X = datatable(Key:string, Value1:long)
[
    'a',1,
    'b',2,
    'b',3,
    'c',4
];
let Y = datatable(Key:string, Value2:long)
[
    'b',10,
    'c',20,
    'c',30,
    'd',40
];
X | join kind=fullouter Y on Key
Key Value1 Key1 Value2
b 3 b 10
b 2 b 10
c 4 c 20
c 4 c 30
d 40
a 1

Left anti join

Left anti join returns all records from the left side that do not match any record from the right side.

let X = datatable(Key:string, Value1:long)
[
    'a',1,
    'b',2,
    'b',3,
    'c',4
];
let Y = datatable(Key:string, Value2:long)
[
    'b',10,
    'c',20,
    'c',30,
    'd',40
];
X | join kind=leftanti Y on Key
Key Value1
a 1

Anti-join models the "NOT IN" query.

Right anti join

Right anti join returns all records from the right side that do not match any record from the left side.

let X = datatable(Key:string, Value1:long)
[
    'a',1,
    'b',2,
    'b',3,
    'c',4
];
let Y = datatable(Key:string, Value2:long)
[
    'b',10,
    'c',20,
    'c',30,
    'd',40
];
X | join kind=rightanti Y on Key
Key Value2
d 40

Anti-join models the "NOT IN" query.

Left semi join

Left semi join returns all records from the left side that match a record from the right side. Only columns from the left side are returned.

let X = datatable(Key:string, Value1:long)
[
    'a',1,
    'b',2,
    'b',3,
    'c',4
];
let Y = datatable(Key:string, Value2:long)
[
    'b',10,
    'c',20,
    'c',30,
    'd',40
];
X | join kind=leftsemi Y on Key
Key Value1
b 3
b 2
c 4

Right semi join

Right semi join returns all records from the right side that match a record from the left side. Only columns from the right side are returned.

let X = datatable(Key:string, Value1:long)
[
    'a',1,
    'b',2,
    'b',3,
    'c',4
];
let Y = datatable(Key:string, Value2:long)
[
    'b',10,
    'c',20,
    'c',30,
    'd',40
];
X | join kind=rightsemi Y on Key
Key Value2
b 10
c 20
c 30

Cross join

Kusto doesn't natively provide a cross-join flavor (i.e., you can't mark the operator with kind=cross). It isn't difficult to simulate this, however, by coming up with a dummy key:

X | extend dummy=1 | join kind=inner (Y | extend dummy=1) on dummy

Join hints

The join operator supports a number of hints that control the way a query executes. These do not change the semantic of join, but may affect its performance.