NimbusML

Overview

NimbusML provides experimental Python bindings for ML.NET, an open source machine learning framework. ML.NET was originally developed in Microsoft Research and is used across many product groups in Microsoft, such as Windows, Bing, Office, Azure and SQL. However, we also know that people often work in multiple programming languages, or in teams whose members use different languages.

Python has become very popular in machine learning. We want as many people as possible to benefit from the ML.NET machine learning framework, and we want teams to be able to work together, so we created this project as experimental Python bindings for ML.NET. NimbusML enables data scientists to train and use machine learning models in Python, and also to save models that can easily be used in .NET applications (see Loading, Saving and Serving Models).

This is an open source project located at https://github.com/Microsoft/NimbusML. We'd love for you to try it out and/or contribute!

Quick links

  • Installation guide
  • Quick Start introduces two simple end-to-end examples for training a machine learning model using NimbusML.
  • Important Concepts introduces the concepts NimbusML relies on to get the best model performance, such as the use of FileDataStream.
  • More Examples illustrates helpful features like pipeline visualization and exporting models for use in ML.NET.

What is "NimbusML"?

NimbusML is interoperable with scikit-learn estimators and transforms, and adds a suite of highly optimized algorithms written in C++ and C# for speed and performance. NimbusML trainers and transforms accept the following data structures in their fit() and transform() methods:

  • numpy.ndarray

  • scipy.sparse.csr_matrix

  • pandas.DataFrame
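
As a rough illustration of the list above, the sketch below builds the same small feature matrix as a NumPy array, a SciPy CSR sparse matrix and a pandas DataFrame. The helper names (as_supported_inputs, fit_with_nimbusml) and the toy data are hypothetical. The choice of LogisticRegressionBinaryClassifier is an assumption made for the example, and it is imported lazily so the rest of the snippet runs without nimbusml installed.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix


def as_supported_inputs(rows, columns):
    """Return one feature matrix in each of the three container types."""
    dense = np.asarray(rows, dtype=np.float32)
    return {
        "numpy": dense,                                  # numpy.ndarray
        "scipy": csr_matrix(dense),                      # sparse CSR matrix
        "pandas": pd.DataFrame(dense, columns=columns),  # named columns
    }


def fit_with_nimbusml(X, y):
    """Fit a NimbusML trainer on X, y.

    Assumption: nimbusml is installed. The import is local so that the
    container-building code above does not depend on it.
    """
    from nimbusml.linear_model import LogisticRegressionBinaryClassifier

    return LogisticRegressionBinaryClassifier().fit(X, y)


# Hypothetical toy data: four examples, two features, binary labels.
rows = [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0], [0.0, 4.0]]
labels = np.array([0, 1, 0, 1])
inputs = as_supported_inputs(rows, columns=["f0", "f1"])
```

With nimbusml available, any of the three values in inputs could be passed as X, for example fit_with_nimbusml(inputs["pandas"], labels).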

NimbusML also supports streaming from files through FileDataStream, without loading the dataset into memory, which allows training on data that significantly exceeds available memory.

With FileDataStream, NimbusML is able to handle up to one billion features and billions of training examples for select algorithms.
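
A minimal sketch of file-based training might look like the following. The file name, the column names (x1, x2, y) and the helper functions are hypothetical, and FastLinearRegressor is chosen only for illustration. The nimbusml import is kept inside train_from_file, so writing the sample file works even where nimbusml is not installed.

```python
import csv
import os
import tempfile


def write_sample_csv(path, n_rows=5):
    """Write a tiny CSV with a header row and hypothetical columns x1, x2, y."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["x1", "x2", "y"])
        for i in range(n_rows):
            writer.writerow([i, 2 * i, 3 * i + 1])
    return path


def train_from_file(path):
    """Train directly from a file on disk.

    Assumption: nimbusml is installed. The data is described by a
    FileDataStream rather than loaded into a DataFrame first.
    """
    from nimbusml import FileDataStream, Pipeline
    from nimbusml.linear_model import FastLinearRegressor

    stream = FileDataStream.read_csv(path, sep=",")
    pipeline = Pipeline([
        FastLinearRegressor(feature=["x1", "x2"], label="y"),
    ])
    pipeline.fit(stream)
    return pipeline


csv_path = write_sample_csv(os.path.join(tempfile.mkdtemp(), "train.csv"))
```

Calling train_from_file(csv_path) would then fit the pipeline from the file itself. The same pattern is what lets a real dataset be larger than memory.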

NimbusML can be easily used for the following problems:

[Four images illustrating the supported problem types]