DataViewRowCursor Class
Class used to cursor through rows of an IDataView.
public abstract class DataViewRowCursor : Microsoft.ML.DataViewRow
type DataViewRowCursor = class
    inherit DataViewRow
Public MustInherit Class DataViewRowCursor
Inherits DataViewRow
Note that this is also a DataViewRow. The Position is
incremented by MoveNext(). Prior to the first call to MoveNext(), or after
MoveNext() returns false, Position is -1.
Otherwise, when MoveNext() returns
true, Position >= 0.
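The contract described above can be observed with a short loop. The following is a minimal sketch, assuming the Microsoft.ML NuGet package is referenced; the `Sample` row type, the `PositionSketch` class, and the `TracePositions` helper are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;

// Hypothetical row type, used only to build a small in-memory IDataView.
public sealed class Sample
{
    public float Value { get; set; }
}

public static class PositionSketch
{
    // Records Position before the first MoveNext(), on every row,
    // and once more after MoveNext() has returned false.
    public static List<long> TracePositions(IDataView data)
    {
        var positions = new List<long>();
        using (DataViewRowCursor cursor = data.GetRowCursor(data.Schema))
        {
            // Getters are created once, before iterating, and reused per row.
            ValueGetter<float> getter = cursor.GetGetter<float>(data.Schema["Value"]);
            float value = 0;

            positions.Add(cursor.Position); // no row is available yet
            while (cursor.MoveNext())
            {
                getter(ref value); // fetch the current row's value
                positions.Add(cursor.Position);
            }
            positions.Add(cursor.Position); // the cursor is exhausted
        }
        return positions;
    }

    public static void Main()
    {
        var mlContext = new MLContext();
        IDataView data = mlContext.Data.LoadFromEnumerable(new[]
        {
            new Sample { Value = 1.5f },
            new Sample { Value = 2.5f },
            new Sample { Value = 3.5f },
        });

        Console.WriteLine(string.Join(", ", TracePositions(data)));
    }
}
```

Going by the description above, the first and last recorded positions should be -1, with non-negative values in between.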
This provides a means of reconciling multiple rows that have been produced, generally, from GetRowCursorSet(IEnumerable<DataViewSchema.Column>, Int32, Random). When getting a set, the aim is that the original order should always be recoverable, even while parallel processing proceeds. Whether a user cares about that original order in a specific application is another matter (most callers, as a practical matter, do not, otherwise they would not call it), but at least in principle it should be possible to reconstruct the order one would get from an identically configured GetRowCursor(IEnumerable<DataViewSchema.Column>, Random). So, for any cursor implementation, batch numbers should be non-decreasing. Furthermore, any given batch number should appear in only one of the cursors returned by GetRowCursorSet(IEnumerable<DataViewSchema.Column>, Int32, Random). In this way, order is determined by batch number. An operation that reconciles these cursors into a single consistent cursoring could do so by always drawing from the one cursor, among all cursors in the set, that has the smallest batch number available.
Note that there is no suggestion that the batches for a particular entry will be consistent from cursoring to cursoring, beyond resulting in the same overall ordering. The same entry could have different batch numbers from one cursoring to another. There is also no requirement that any given batch number appear at all. It is merely a mechanism for recovering ordering from a possibly arbitrary partitioning of the data. It follows that treating the batch as a property of the data is invalid. (Inherited from DataViewRow)
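The reconciliation rule described above (always draw from the cursor with the smallest available batch number) can be sketched without the library. The following is an illustrative example only: `SketchCursor` and `BatchReconciler` are hypothetical stand-ins, not ML.NET types, and the sketch assumes the two stated invariants hold (batch numbers are non-decreasing within a cursor, and each batch number appears in only one cursor).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a cursor: exposes only a batch number and a value.
public sealed class SketchCursor
{
    private readonly (long Batch, string Value)[] _rows;
    private int _index = -1;

    public SketchCursor(params (long Batch, string Value)[] rows) => _rows = rows;

    public long Batch => _rows[_index].Batch;
    public string Value => _rows[_index].Value;

    // Mirrors the MoveNext() contract: true while a row is available.
    public bool MoveNext()
    {
        if (_index < _rows.Length)
            _index++;
        return _index < _rows.Length;
    }
}

public static class BatchReconciler
{
    // Merges the cursors into one sequence by repeatedly taking a row from
    // the cursor whose current row has the smallest batch number.
    public static List<string> Merge(IReadOnlyList<SketchCursor> cursors)
    {
        var hasRow = new bool[cursors.Count];
        for (int i = 0; i < cursors.Count; i++)
            hasRow[i] = cursors[i].MoveNext();

        var merged = new List<string>();
        while (true)
        {
            int best = -1;
            for (int i = 0; i < cursors.Count; i++)
            {
                if (hasRow[i] && (best < 0 || cursors[i].Batch < cursors[best].Batch))
                    best = i;
            }
            if (best < 0)
                return merged; // every cursor is exhausted

            merged.Add(cursors[best].Value);
            hasRow[best] = cursors[best].MoveNext();
        }
    }
}

public static class BatchDemo
{
    public static void Main()
    {
        // Batch numbers may skip values (here, 3 never appears).
        var first = new SketchCursor((0, "r0"), (0, "r1"), (4, "r5"));
        var second = new SketchCursor((1, "r2"), (2, "r3"), (2, "r4"));

        List<string> merged = BatchReconciler.Merge(new[] { first, second });
        Console.WriteLine(string.Join(" ", merged)); // prints "r0 r1 r2 r3 r4 r5"
    }
}
```

The sketch reads all cursors from one thread for simplicity; whether a real cursor set can safely be consumed that way depends on the IDataView implementation and is not something the text above guarantees.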
This is incremented when the underlying contents change, giving clients a way to detect change. It should be
-1 when the object is in a state where values cannot be fetched. In particular, for a DataViewRowCursor,
this will be the case before MoveNext() is called for the first time, or after the first time
MoveNext() is called and returns false.
Note that this position is not the position within the underlying data, but the position of this cursor only. If one, for example, opened a set of parallel streaming cursors, or a shuffled cursor, each such cursor's first valid entry would always have position 0. (Inherited from DataViewRow)
Gets a Schema, which provides name and type information for variables (i.e., columns in ML.NET's type system) stored in this row. (Inherited from DataViewRow)
Implementation of dispose. Calls Dispose(Boolean) with true.
The disposable method for the disposable pattern. This default implementation does nothing. (Inherited from DataViewRow)
Returns a value getter delegate to fetch the value of the given column.
A getter for a 128-bit ID value. It is common for objects to serve multiple DataViewRow instances that iterate over what is supposed to be the same data. For example, in an IDataView, a cursor set will produce the same data as a serial cursor, just partitioned, and a shuffled cursor will produce the same data as a serial cursor or any other shuffled cursor, only shuffled. The ID exists for applications that need to reconcile which entry is actually which. Ideally this ID should be unique, but for practical reasons it suffices if collisions are extremely improbable.
Note that this ID, while it must be consistent across multiple streams according to the semantics above, is not considered part of the data per se. So, to take the example of a data view specifically, a single data view must render consistent IDs across all cursorings, but there is no suggestion at all that if the "same" data were presented in a different data view (by, say, being transformed, cached, or saved), the IDs in the two data views would have any discernible relationship. (Inherited from DataViewRow)
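As an illustration of the consistency requirement above, the following sketch collects the row IDs from two separate cursorings of the same data view so that they can be compared. It assumes the Microsoft.ML NuGet package is referenced; the `IdSample` row type and the `RowIdSketch.CollectIds` helper are hypothetical names introduced for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical row type, used only to build a small in-memory IDataView.
public sealed class IdSample
{
    public float Value { get; set; }
}

public static class RowIdSketch
{
    // Opens a fresh cursor on the data view and returns the ID of every row.
    public static List<DataViewRowId> CollectIds(IDataView data)
    {
        var ids = new List<DataViewRowId>();
        using (DataViewRowCursor cursor = data.GetRowCursor(data.Schema))
        {
            ValueGetter<DataViewRowId> idGetter = cursor.GetIdGetter();
            DataViewRowId id = default;
            while (cursor.MoveNext())
            {
                idGetter(ref id); // 128-bit ID of the current row
                ids.Add(id);
            }
        }
        return ids;
    }

    public static void Main()
    {
        var mlContext = new MLContext();
        IDataView data = mlContext.Data.LoadFromEnumerable(new[]
        {
            new IdSample { Value = 1f },
            new IdSample { Value = 2f },
        });

        // Two independent cursorings over the same data view.
        List<DataViewRowId> firstPass = CollectIds(data);
        List<DataViewRowId> secondPass = CollectIds(data);
        Console.WriteLine(firstPass.SequenceEqual(secondPass));
    }
}
```

Comparing IDs this way is only meaningful within a single data view; as noted above, IDs from two different data views over the "same" data need not be related.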
Returns whether the given column is active in this row. (Inherited from DataViewRow)
Advances to the next row. When the cursor is first created, this method should be called to
move to the first row. Returns true if the cursor moved to a row, and false when no rows remain.