The F# 3.0 Freebase Type Provider Sample – Integrating Internet-Scale Data Sources into a Strongly Typed Language
[ Update: the Freebase type provider is now available as part of the FSharp.Data NuGet package and library. The namespace has changed from "Samples.DataStore.Freebase" to "FSharp.Data". Find out more about F# at fsharp.org. ]
Part 1 - The Freebase Type Provider Sample - Integrating Internet-Scale Data Sources into a Strongly Typed Language
Part 2 - The Freebase Type Provider Sample - Static Parameters
Part 3 - The Freebase Type Provider Sample - Sample Queries for Presidents, Books and Stars
Part 4 - The Freebase Type Provider Sample - Some Info On Queries
Today we have added a new sample to the F# 3.0 Sample Pack: the Freebase Type Provider Sample! You can also read the hot-off-the-presses technical report from our partner team in Microsoft Research, which gives some of the background to this sample, watch "F# 3.0: data, services, Web, cloud, at your fingertips", where we first demonstrated this type provider, or watch the F# 3.0 Information Rich Programming launch video.
Freebase is self-described as “an entity graph of people, places and things, built by a community that loves open data”. It contains a great deal of interesting and useful structured data, suited to integration into applications and data scripting, and to simply having fun exploring a world of information. Much of the information is drawn from open sources such as Wikipedia. It has a REST API.
The F# 3.0 Freebase Type Provider Sample uses the unrivalled data-integration capabilities of F# 3.0 to integrate the entire "Commons" schema into the F# 3.0 programming language in a natural, fluent, fun and powerful way.
From the traditional programming languages perspective, Freebase is just another example of an information source with a “type-like” system. This raises the question: why can’t we use this web database directly, as if it were “part of our program”? Well, that’s what the F# 3.0 Freebase Type Provider Sample gives you.
To get going with the sample,
(2) Download and unzip the latest HEAD copy of the F# 3.0 Sample Pack
(3) Go to SampleProviders\Samples.DataStore.Freebase. Either run “build.cmd” or open and build Samples.DataStore.Freebase.sln
(4) Create a new script referencing Samples.DataStore.Freebase.dll
let data = Samples.DataStore.Freebase.FreebaseData.GetDataContext()
let elements = data.``Science and Technology``.Chemistry.``Chemical Elements`` |> Seq.toList
let hydrogen = data.``Science and Technology``.Chemistry.``Chemical Elements``.Individuals.Hydrogen
All going well, you've downloaded the Chemical Elements, including Hydrogen, in three lines of code!
You can now explore the Freebase data schema by using “.” after "data" or "hydrogen". To go further,
(a) Take a look at the samples in Tests\Net45ScriptUsingTypeProvider.fsx. In particular, you will need to learn to write F# 3.0 queries to query Freebase efficiently. We'll be giving more details and samples about writing queries for Freebase in F# in later blog posts (or just see the tests in the sample pack).
(b) The Freebase API is rate limited, and initially you are using a small quota intended for debugging purposes. You will soon need an API key with the Freebase service enabled. This gives you 100,000 requests/day. See the sample Tests\Net45ScriptUsingTypeProviderWithApiKey.fsx for how to add the API key to your data scripting. If you get a (403) Forbidden error, you are hitting the rate limit.
(c) The Freebase type provider can be used with .NET 3.5, .NET 4.0, .NET 4.5, Silverlight and Portable programming. A proxy may be needed in some cases. The projects in Tests\ProjectsUsingTypeProvider.sln show some sample libraries for these different cases.
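As a rough illustration of points (a) and (b), here is a minimal sketch of a script that writes a small F# 3.0 query and then switches to a data context that carries an API key. It assumes the built Samples.DataStore.Freebase.dll sits next to the script, that entities expose a Name property, and that the provider accepts a Key static parameter. The key value shown is a placeholder, not a real key. The scripts under Tests\ remain the authoritative versions.

```fsharp
// Assumption: the DLL built in step (3) has been copied next to this script.
#r "Samples.DataStore.Freebase.dll"

open Samples.DataStore.Freebase

// Default data context: uses the small debugging quota, no API key.
let data = FreebaseData.GetDataContext()

// A small query: ask for ten element names rather than downloading
// the whole collection. Assumes each element exposes a Name property.
let someElementNames =
    query { for e in data.``Science and Technology``.Chemistry.``Chemical Elements`` do
            take 10
            select e.Name }
    |> Seq.toList

someElementNames |> List.iter (printfn "%s")

// Placeholder only: replace with your own key that has the Freebase
// service enabled, otherwise requests are likely to be rejected.
[<Literal>]
let ApiKey = "YOUR-API-KEY"

// Assumption: the key is supplied through a static parameter named Key.
type FreebaseDataWithKey = FreebaseDataProvider<Key = ApiKey>

let dataWithKey = FreebaseDataWithKey.GetDataContext()
```

Keeping the key in a literal means it can be passed as a static parameter, which must be a compile-time constant.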
We plan to blog more about using the Freebase type provider. But even better, get in and do it for us first! We look forward to what you do with this fun sample.
Some nice features are:
- Many queries are translated efficiently into the MQL language. Those that can't be translated execute on the client side by default.
- A selection of sample individuals is given under the "Individuals" entry for each collection of objects. This allows you to program against strongly named individuals such as Hydrogen or ``Bob Dylan``.
- Freebase features such as approximate counts are supported
- The implementation uses the latest Freebase API
- Image URLs are provided with GetImages(), and the first image is provided using the MainImage property
- Snapshot dates for Freebase are supported
- Optional client-side caching of schema information makes type checking quick and efficient
- API keys are supported
- Units of measure are supported (more details in future blog posts)
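A few of these features can be sketched in a short script. This is only an illustration under assumptions: it assumes the built DLL sits next to the script, and it prints values with %A because the exact return types of MainImage and GetImages() are not stated here.

```fsharp
// Assumption: the DLL built from the sample pack is next to this script.
#r "Samples.DataStore.Freebase.dll"

let data = Samples.DataStore.Freebase.FreebaseData.GetDataContext()

// A strongly named individual from the "Individuals" entry of a collection.
let hydrogen =
    data.``Science and Technology``.Chemistry.``Chemical Elements``.Individuals.Hydrogen

// The first image for the entity, exposed as a property.
printfn "Main image: %A" hydrogen.MainImage

// All image URLs for the entity.
for image in hydrogen.GetImages() do
    printfn "Image: %A" image
```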
Some limitations of the Freebase type provider at the moment are:
- Writes are not supported as yet
- Queries selecting compound objects are not particularly efficient
- You need to be connected to compile & run.
Here's a screenshot of getting a data context and then using auto-completion to explore the information space: