DataStreamWriter.ForeachBatch(Action<DataFrame,Int64>) Method

Definition

Sets the output of the streaming query to be processed using the provided function. This is supported only in the micro-batch execution modes (that is, when the trigger is not continuous). In every micro-batch, the provided function is called with (i) the output rows as a DataFrame and (ii) the batch identifier. The batchId can be used to deduplicate and transactionally write the output (that is, the provided DataFrame) to external systems. The output DataFrame is guaranteed to be exactly the same for the same batchId (assuming all operations in the query are deterministic).

[Microsoft.Spark.Since("2.4.0")]
public Microsoft.Spark.Sql.Streaming.DataStreamWriter ForeachBatch (Action<Microsoft.Spark.Sql.DataFrame,long> func);
[<Microsoft.Spark.Since("2.4.0")>]
member this.ForeachBatch : Action<Microsoft.Spark.Sql.DataFrame, int64> -> Microsoft.Spark.Sql.Streaming.DataStreamWriter
Public Function ForeachBatch (func As Action(Of DataFrame, Long)) As DataStreamWriter

Parameters

func
Action<DataFrame,Int64>

The function to apply to each micro-batch. It receives the output rows as a DataFrame and the batch identifier as an Int64.

Returns

This DataStreamWriter object
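
Examples

The following is a minimal sketch, not an official sample, of how the batch identifier might be used to make writes idempotent. It assumes a working .NET for Apache Spark environment on Spark 2.4.0 or later, uses Spark's built-in `rate` test source, and writes to a hypothetical output directory (`/tmp/foreach-batch-output`). Because each micro-batch is written to a path derived from its batchId in overwrite mode, a batch that is replayed after a failure replaces its earlier output instead of duplicating it.

```csharp
using Microsoft.Spark.Sql;
using Microsoft.Spark.Sql.Streaming;

class ForeachBatchExample
{
    static void Main()
    {
        SparkSession spark = SparkSession
            .Builder()
            .AppName("ForeachBatchExample")
            .GetOrCreate();

        // The built-in "rate" source generates (timestamp, value) rows for testing.
        DataFrame stream = spark
            .ReadStream()
            .Format("rate")
            .Option("rowsPerSecond", 5L)
            .Load();

        StreamingQuery query = stream
            .WriteStream()
            .ForeachBatch((DataFrame batch, long batchId) =>
            {
                // Hypothetical output location (an assumption for this sketch):
                // one directory per batchId, so a replayed batch overwrites
                // its own earlier output rather than appending duplicates.
                string path = $"/tmp/foreach-batch-output/batch={batchId}";
                batch.Write().Mode("overwrite").Parquet(path);
            })
            .Start();

        // Block until the query is stopped or fails.
        query.AwaitTermination();
    }
}
```

The per-batch directory layout is only one possible way to deduplicate. A sink that supports transactions could instead record the batchId alongside the data and skip batches it has already committed.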

Attributes

Applies to