DataOperationsCatalog.LoadFromEnumerable Method

Definition

Overloads

LoadFromEnumerable<TRow>(IEnumerable<TRow>, SchemaDefinition)

Creates a new IDataView over an enumerable of items of a user-defined type. The user maintains ownership of the data, and the resulting data view never alters its contents. Because IDataView is assumed to be immutable, the user is expected to support multiple enumerations of the data that return the same results, unless the user knows that the data will be cursored only once.

A typical use of a streaming data view is to create a data view that lazily loads data as needed, apply pre-trained transformations to it, and then cursor through it to obtain the transformed results.

LoadFromEnumerable<TRow>(IEnumerable<TRow>, DataViewSchema)

Creates a new IDataView over an enumerable of items of a user-defined type, using the provided DataViewSchema, which might contain more information about the schema than the type can capture.

LoadFromEnumerable<TRow>(IEnumerable<TRow>, SchemaDefinition)

Creates a new IDataView over an enumerable of items of a user-defined type. The user maintains ownership of the data, and the resulting data view never alters its contents. Because IDataView is assumed to be immutable, the user is expected to support multiple enumerations of the data that return the same results, unless the user knows that the data will be cursored only once.

A typical use of a streaming data view is to create a data view that lazily loads data as needed, apply pre-trained transformations to it, and then cursor through it to obtain the transformed results.

public Microsoft.ML.IDataView LoadFromEnumerable<TRow> (System.Collections.Generic.IEnumerable<TRow> data, Microsoft.ML.Data.SchemaDefinition schemaDefinition = default) where TRow : class;
member this.LoadFromEnumerable : seq<'Row (requires 'Row : null)> * Microsoft.ML.Data.SchemaDefinition -> Microsoft.ML.IDataView (requires 'Row : null)
Public Function LoadFromEnumerable(Of TRow As Class) (data As IEnumerable(Of TRow), Optional schemaDefinition As SchemaDefinition = Nothing) As IDataView

Type Parameters

TRow

The user-defined item type.

Parameters

data
IEnumerable<TRow>

The enumerable data containing type TRow to convert to an IDataView.

schemaDefinition
SchemaDefinition

The optional schema definition of the data view to create. If null, the schema definition is inferred from TRow.

Returns

The constructed IDataView.

Examples

using System;
using System.Collections.Generic;
using Microsoft.ML;

namespace Samples.Dynamic
{
    public static class DataViewEnumerable
    {
        // A simple case of creating an IDataView from
        // an IEnumerable.
        public static void Example()
        {
            // Create a new context for ML.NET operations. It can be used for
            // exception tracking and logging,
            // as a catalog of available operations and as the source of randomness.
            var mlContext = new MLContext();

            // Get a small dataset as an IEnumerable.
            IEnumerable<SampleTemperatureData> enumerableOfData =
                GetSampleTemperatureData(5);

            // Load dataset into an IDataView. 
            IDataView data = mlContext.Data.LoadFromEnumerable(enumerableOfData);

            // We can now examine the records in the IDataView. We first create an
            // enumerable of rows in the IDataView.
            var rowEnumerable = mlContext.Data
                .CreateEnumerable<SampleTemperatureData>(data,
                reuseRowObject: true);

            // SampleTemperatureDataWithLatitude has the definition of a Latitude
            // column of type float. We can set the parameter ignoreMissingColumns
            // to true to ignore any missing columns in the IDataView. The produced
            // enumerable will have the Latitude field set to the default for the
            // data type, in this case 0.
            var rowEnumerableIgnoreMissing = mlContext.Data
                .CreateEnumerable<SampleTemperatureDataWithLatitude>(data, 
                reuseRowObject: true, ignoreMissingColumns: true);

            Console.WriteLine($"Date\tTemperature");
            foreach (var row in rowEnumerable)
                Console.WriteLine(
                    $"{row.Date.ToString("d")}\t{row.Temperature}");

            // Expected output:
            //  Date    Temperature
            //  1/2/2012        36
            //  1/3/2012        36
            //  1/4/2012        34
            //  1/5/2012        35
            //  1/6/2012        35

            Console.WriteLine($"Date\tTemperature\tLatitude");
            foreach (var row in rowEnumerableIgnoreMissing)
                Console.WriteLine($"{row.Date.ToString("d")}\t{row.Temperature}" 
                    + $"\t{row.Latitude}");

            // Expected output:
            //  Date    Temperature     Latitude
            //  1/2/2012        36      0
            //  1/3/2012        36      0
            //  1/4/2012        34      0
            //  1/5/2012        35      0
            //  1/6/2012        35      0
        }

        private class SampleTemperatureData
        {
            public DateTime Date { get; set; }
            public float Temperature { get; set; }
        }
        
        private class SampleTemperatureDataWithLatitude
        {
            public float Latitude { get; set; }
            public DateTime Date { get; set; }
            public float Temperature { get; set; }
        }
        
        /// <summary>
        /// Get a fake temperature dataset.
        /// </summary>
        /// <param name="exampleCount">The number of examples to return.</param>
        /// <returns>An enumerable of <see cref="SampleTemperatureData"/>.</returns>
        private static IEnumerable<SampleTemperatureData> GetSampleTemperatureData(
            int exampleCount)
        {
            var rng = new Random(1234321);
            var date = new DateTime(2012, 1, 1);
            float temperature = 39.0f;

            for (int i = 0; i < exampleCount; i++)
            {
                date = date.AddDays(1);
                temperature += rng.Next(-5, 5);
                yield return new SampleTemperatureData
                {
                    Date = date,
                    Temperature = temperature
                };
            }
        }
    }
}
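
The schemaDefinition parameter is useful when TRow alone cannot express the schema, for example when a vector size is known only at run time. The following is a minimal sketch, not part of the official sample: the DataPoint type, the SchemaDefinitionSketch class, and the feature dimension are hypothetical. It starts from the schema definition inferred from the type and overrides the type of one column before loading.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class SchemaDefinitionSketch
{
    // Hypothetical row type: the array length is not known at compile time,
    // so the inferred column type would be a variable-size vector.
    public class DataPoint
    {
        public float[] Features { get; set; }
    }

    // Builds an IDataView whose Features column is a fixed-size vector.
    public static IDataView Load(int featureDimension)
    {
        var mlContext = new MLContext();

        var points = new List<DataPoint>
        {
            new DataPoint { Features = new float[featureDimension] },
            new DataPoint { Features = new float[featureDimension] },
        };

        // Start from the schema definition inferred from DataPoint, then
        // override the Features column type with a vector of known size.
        var schemaDefinition = SchemaDefinition.Create(typeof(DataPoint));
        schemaDefinition["Features"].ColumnType =
            new VectorDataViewType(NumberDataViewType.Single, featureDimension);

        return mlContext.Data.LoadFromEnumerable(points, schemaDefinition);
    }

    public static void Example()
    {
        IDataView data = Load(3);

        // Inspect the resulting column type.
        Console.WriteLine(data.Schema["Features"].Type);
    }
}
```

A List<T> is used here because it can be enumerated repeatedly with the same results, which matches the immutability expectation described above.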

LoadFromEnumerable<TRow>(IEnumerable<TRow>, DataViewSchema)

Creates a new IDataView over an enumerable of items of a user-defined type, using the provided DataViewSchema, which might contain more information about the schema than the type can capture.

public Microsoft.ML.IDataView LoadFromEnumerable<TRow> (System.Collections.Generic.IEnumerable<TRow> data, Microsoft.ML.DataViewSchema schema) where TRow : class;
member this.LoadFromEnumerable : seq<'Row (requires 'Row : null)> * Microsoft.ML.DataViewSchema -> Microsoft.ML.IDataView (requires 'Row : null)
Public Function LoadFromEnumerable(Of TRow As Class) (data As IEnumerable(Of TRow), schema As DataViewSchema) As IDataView

Type Parameters

TRow

The user-defined item type.

Parameters

data
IEnumerable<TRow>

The enumerable data containing type TRow to convert to an IDataView.

schema
DataViewSchema

The schema of the returned IDataView.

Returns

An IDataView with the given schema.

Remarks

The user maintains ownership of the data, and the resulting data view never alters its contents. Because IDataView is assumed to be immutable, the user is expected to support multiple enumerations of the data that return the same results, unless the user knows that the data will be cursored only once. A typical use of a streaming data view is to create a data view that lazily loads data as needed, apply pre-trained transformations to it, and then cursor through it to obtain the transformed results. One practical use of this overload is to supply the feature column names through DataViewSchema.Annotations.
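
Supplying feature names through annotations might look like the following sketch. The DataPoint type, the SchemaAnnotationsSketch class, and the feature names are hypothetical. The sketch also assumes that every column in the supplied schema has a same-named member on TRow, and that the annotations attached to that column are carried over to the resulting data view.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class SchemaAnnotationsSketch
{
    // Hypothetical row type with a fixed-size feature vector.
    public class DataPoint
    {
        [VectorType(3)]
        public float[] Features { get; set; }
    }

    // Builds an IDataView whose Features column carries slot names.
    public static IDataView Load()
    {
        var mlContext = new MLContext();

        var points = new List<DataPoint>
        {
            new DataPoint { Features = new[] { 1f, 2f, 3f } },
            new DataPoint { Features = new[] { 4f, 5f, 6f } },
        };

        // Hypothetical feature names, one per slot of the Features vector.
        var names = new[]
        {
            "Age".AsMemory(), "Height".AsMemory(), "Weight".AsMemory()
        };

        // Attach the names to the column as slot-name annotations.
        var annotations = new DataViewSchema.Annotations.Builder();
        annotations.AddSlotNames(names.Length,
            (ref VBuffer<ReadOnlyMemory<char>> dst) =>
                dst = new VBuffer<ReadOnlyMemory<char>>(names.Length, names));

        var schemaBuilder = new DataViewSchema.Builder();
        schemaBuilder.AddColumn("Features",
            new VectorDataViewType(NumberDataViewType.Single, 3),
            annotations.ToAnnotations());

        return mlContext.Data.LoadFromEnumerable(points, schemaBuilder.ToSchema());
    }

    public static void Example()
    {
        IDataView data = Load();

        // Read the slot names back from the loaded data view.
        VBuffer<ReadOnlyMemory<char>> slotNames = default;
        data.Schema["Features"].GetSlotNames(ref slotNames);
        foreach (var name in slotNames.DenseValues())
            Console.WriteLine(name.ToString());
    }
}
```

Downstream components that report per-feature information can then refer to the slots by name rather than by index.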
