Vowpal Wabbit for Fast Learning

This blog post is authored by John Langford, Principal Researcher at Microsoft Research, New York City.

Vowpal Wabbit (VW) is an open source machine learning (ML) system sponsored by Microsoft. VW is the essence of speed in machine learning, able to learn from terafeature datasets with ease. Via parallel learning, it can exceed the throughput of any single machine network interface when doing linear learning, a first amongst learning algorithms.

The name has three references: the vorpal blade of Jabberwocky, the rabbit of Monty Python, and Elmer Fudd, who hunted the wascally wabbit throughout my childhood.

VW sees use inside of Microsoft for ad relevance and other natural-language related tasks. Its external footprint is quite large with known applications across a broad spectrum of companies including Amazon, American Express, AOL, Baidu, eHarmony, Facebook, FTI Consulting, GraphLab, IBM, Twitter, Yahoo! and Yandex.

Why? Several tricks started with or are unique to VW:

  • VW supports online learning and optimization by default. Online learning is an old approach which is becoming much more common. Various alterations to standard stochastic gradient descent make the default rule more robust across many datasets, and progressive validation allows debugging learning applications in sub-linear time.

  • VW does feature hashing, which allows learning from dramatically rawer representations, reducing the need to preprocess data, speeding execution, and sometimes even improving accuracy.

  • The conjunction of online learning and feature hashing implies the ability to learn from any amount of information via network streaming. This makes the system a reliable baseline tool.

  • VW has also been parallelized to be the most scalable public ML algorithm, as measured by the quantity of data effectively learned from (more information here).

  • VW has a reduction stack which allows the basic core learning algorithm to address many advanced problem types, such as cost-sensitive multiclass classification. Some of these advanced problem types, such as those for interactive learning, exist only in VW.
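
To make the first three tricks concrete, here is a rough sketch in plain Python of how online learning, feature hashing, and progressive validation fit together. This is not VW's code or its actual update rule. The hash function (CRC32), the table size, the plain logistic SGD update, the learning rate, and the names hash_features, predict, and learn_online are all assumptions chosen for illustration.

```python
import math
import zlib

NUM_BITS = 18                  # assumed table size: 2**18 weight slots
NUM_SLOTS = 1 << NUM_BITS


def hash_features(tokens):
    """Feature hashing: map raw string tokens straight to weight slots."""
    features = {}
    for token in tokens:
        # CRC32 is a stand-in hash for this sketch, not the one VW uses.
        index = zlib.crc32(token.encode("utf-8")) % NUM_SLOTS
        features[index] = features.get(index, 0.0) + 1.0
    return features


def predict(weights, features):
    """Logistic prediction from a sparse {slot: value} feature dict."""
    score = sum(weights.get(i, 0.0) * v for i, v in features.items())
    score = max(-30.0, min(30.0, score))  # keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-score))


def learn_online(stream, learning_rate=0.5):
    """Single pass over (tokens, label) pairs with labels in {0, 1}.

    Returns the learned weights and the progressive validation loss:
    the average log loss of predictions made before each label is used.
    """
    weights = {}
    total_loss = 0.0
    count = 0
    for tokens, label in stream:
        features = hash_features(tokens)
        p = predict(weights, features)

        # Progressive validation: score the example before learning from it.
        p_safe = min(max(p, 1e-12), 1.0 - 1e-12)
        total_loss -= label * math.log(p_safe) + (1 - label) * math.log(1.0 - p_safe)
        count += 1

        # Plain stochastic gradient step on log loss (assumed update rule).
        gradient = p - label
        for i, v in features.items():
            weights[i] = weights.get(i, 0.0) - learning_rate * gradient * v

    average_loss = total_loss / count if count else 0.0
    return weights, average_loss


# Toy stream: any iterable works, including one read lazily from a socket.
examples = [
    ("cheap pills buy now".split(), 1),
    ("meeting agenda for monday".split(), 0),
] * 50
weights, pv_loss = learn_online(examples)
print("progressive validation loss:", pv_loss)
```

Because each example is scored before it is learned from, the running loss acts as a held-out estimate that is available from the very first examples, which is what makes quick debugging possible. Because tokens are hashed directly into a fixed-size table, no dictionary of feature names has to be built in a separate preprocessing pass.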

There is more, of course, but the above gives you the general idea – VW has several advanced designs and technologies which make it particularly compelling for some applications. 

In terms of deployment, VW runs as a library or a standalone daemon, and Microsoft Azure ML adds the possibility of cloud deployment. Imagine operationalizing a learned model for traffic from all over the world in seconds. Azure ML presently exposes the feature hashing capability inside of VW via a module of the same name.

What of the future? We hunt the Wascally Wabbit. Technology-wise, I intend to use VW to experiment with other advanced ML ideas: 

Good answers to these questions can change the scope of future ML applications wadically.

Follow my personal blog here.