IDataView.GetRowCursorSet(IEnumerable<DataViewSchema.Column>, Int32, Random) Method


This constructs a set of parallel batch cursors. The value n is a recommended limit on the number of cursors returned. If n is non-positive, the caller has no recommendation, and the implementation should fall back to some default behavior. Note that this is strictly a recommendation: an implementation may return a different number of cursors.

The cursors should return the same data as would be returned through GetRowCursor(IEnumerable<DataViewSchema.Column>, Random), except partitioned: no two cursors should return the "same" row as would have been returned through the regular serial cursor, and every row should be returned by exactly one of the cursors returned from this method. The cursors can have their values reconciled downstream through the use of the Batch property.

The typical usage pattern is that a set of cursors is requested, each cursor is handed to a worker thread that consumes from it independently, and the results are finally collated by exploiting the ordering of the Batch property described above. Most scenarios, however, will be content with pulling from the single serial cursor of GetRowCursor(IEnumerable<DataViewSchema.Column>, Random).
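The usage pattern described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the Sample class, the "Value" column name, and the choice of n = 4 are assumptions made for the example, and the collation step assumes (as the Batch property documentation describes) that batch numbers are non-decreasing within a cursor and that a given batch number appears in only one cursor of the set.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using Microsoft.ML;

// Hypothetical row type, used only for this illustration.
public class Sample
{
    public float Value { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext();
        IDataView data = mlContext.Data.LoadFromEnumerable(
            Enumerable.Range(0, 1000).Select(i => new Sample { Value = i }));

        DataViewSchema.Column column = data.Schema["Value"];

        // n = 4 is only a suggestion; the implementation may return
        // fewer or more cursors (possibly just one).
        DataViewRowCursor[] cursors = data.GetRowCursorSet(new[] { column }, 4);

        // One result list per cursor, so no locking is needed while reading.
        var perCursor = new List<(long Batch, float Value)>[cursors.Length];
        var threads = new Thread[cursors.Length];

        for (int i = 0; i < cursors.Length; i++)
        {
            int index = i; // stable copy of the loop variable for the closure
            perCursor[index] = new List<(long Batch, float Value)>();
            threads[index] = new Thread(() =>
            {
                using (DataViewRowCursor cursor = cursors[index])
                {
                    ValueGetter<float> getter = cursor.GetGetter<float>(column);
                    float value = 0;
                    while (cursor.MoveNext())
                    {
                        getter(ref value);
                        // Record the batch number alongside the value so the
                        // rows can be put back in order afterwards.
                        perCursor[index].Add((cursor.Batch, value));
                    }
                }
            });
            threads[index].Start();
        }

        foreach (Thread thread in threads)
        {
            thread.Join();
        }

        // Enumerable.OrderBy is a stable sort, so rows that share a batch
        // number keep the order in which their cursor produced them.
        List<float> ordered = perCursor
            .SelectMany(rows => rows)
            .OrderBy(row => row.Batch)
            .Select(row => row.Value)
            .ToList();

        Console.WriteLine(
            $"Cursors returned: {cursors.Length}, rows read: {ordered.Count}");
    }
}
```

Each cursor gets its own dedicated thread here, rather than being scheduled on a shared pool, so that every cursor in the set is being consumed at the same time. When the original order does not matter, the sort on Batch can simply be dropped.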

public Microsoft.ML.DataViewRowCursor[] GetRowCursorSet (System.Collections.Generic.IEnumerable<Microsoft.ML.DataViewSchema.Column> columnsNeeded, int n, Random rand = default);
abstract member GetRowCursorSet : seq<Microsoft.ML.DataViewSchema.Column> * int * Random -> Microsoft.ML.DataViewRowCursor[]
Public Function GetRowCursorSet (columnsNeeded As IEnumerable(Of DataViewSchema.Column), n As Integer, Optional rand As Random = Nothing) As DataViewRowCursor()



columnsNeeded
The active columns needed. If passed an empty IEnumerable, no column is requested.

n
The suggested degree of parallelism.

rand
An instance of Random used to seed randomized access.


