Enterprise Search and Bing Services – Part 2: Bing Geo-coding Structured/Unstructured Text for Search
Web-based mapping services from Microsoft and others have been around for a number of years, but all of them depend on data that contains well-defined geo-coordinate information, usually latitude and longitude pairs. But what if you have content without clearly defined geo-tags? For example, news stories may contain references to places around the world, but they don’t always come pre-tagged with geo-coordinate information for these places. Is it possible to infer this information and still be able to use a map for search results?
The short answer is yes. By combining mapping services like the Bing Maps Web Services SDK 1.0 with information extraction techniques available in more advanced enterprise search platforms, we can create interesting geo-aware search applications even against content that contains no explicit geo-coordinate data.
Let’s take a look at a simple example. Specifically, we’ll look at an example that takes advantage of the FAST Enterprise Search Platform (ESP) and its ability to automatically recognize words and phrases by type (entities) within full text. For this application, we care about the “location” extractor. Once location information is extracted, we can use the geocoding capabilities of the Bing Maps SDK to tag each document with the appropriate geo-coordinate information. For the front end, we’ll use a Silverlight 3 prototype UI based on the Bing Maps Silverlight Control.
The steps for pulling this example together are:
1. Crawl a news source (unstructured text) and retrieve sample news articles (using the FAST ESP Enterprise Crawler).
2. Tag each document with extracted locations that are identified as most important in each article (using FAST ESP’s built-in location entity extractor).
3. Create a geocoder client (sample code) using the Bing Maps SDK to submit extracted locations and retrieve their lat/long.
4. Call the geocoder client from your content processor to submit each extracted location, retrieve the geodata, and tag the document with lat/long (using the FAST ESP content processing pipeline).
5. Finally, tie this all up in a nice UI using the Bing Maps Silverlight Control.
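Steps 3 and 4 can be sketched roughly as follows. This is a minimal illustration in Python, not the sample code from the post: the document layout (a dict with a "locations" list), the function names, and the tiny in-memory gazetteer are all hypothetical. The gazetteer stands in for the real call to the Bing Maps geocoding service, which is not shown here. The sketch also caches lookups, on the assumption that the same place names recur across many articles.

```python
from typing import Callable, Dict, List, Optional, Tuple

LatLong = Tuple[float, float]
Geocoder = Callable[[str], Optional[LatLong]]


def make_cached_geocoder(geocode: Geocoder) -> Geocoder:
    """Wrap a geocoder so each distinct place name is looked up only once."""
    cache: Dict[str, Optional[LatLong]] = {}

    def cached(location: str) -> Optional[LatLong]:
        key = location.strip().lower()
        if key not in cache:
            cache[key] = geocode(location)
        return cache[key]

    return cached


def tag_document(doc: dict, geocode: Geocoder) -> dict:
    """Return a copy of doc with a 'geo' list of resolved locations.

    Locations the geocoder cannot resolve are skipped rather than guessed.
    """
    tagged = dict(doc)
    coords: List[dict] = []
    for name in doc.get("locations", []):
        point = geocode(name)
        if point is not None:
            coords.append({"name": name, "lat": point[0], "lon": point[1]})
    tagged["geo"] = coords
    return tagged


# Hypothetical stand-in for the geocoding service: a tiny lookup table.
FAKE_GAZETTEER: Dict[str, LatLong] = {
    "mexico city": (19.4326, -99.1332),
    "geneva": (46.2044, 6.1432),
}


def fake_geocode(location: str) -> Optional[LatLong]:
    return FAKE_GAZETTEER.get(location.strip().lower())


if __name__ == "__main__":
    geocoder = make_cached_geocoder(fake_geocode)
    article = {"id": "a1", "locations": ["Mexico City", "Atlantis"]}
    print(tag_document(article, geocoder))
```

In a real pipeline, fake_geocode would be replaced by a function that calls the geocoding service, and tag_document would run once per document inside the content processing stage.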
A more detailed “how to” with some sample code is posted here.
Below is a screenshot of the resulting example app showing a query for “Swine Flu”. The larger the red pin, the hotter that location is for that query in the target time range we searched:
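The post does not say how pin sizes are computed, so the following is only one plausible approach, with hypothetical names and pixel limits. It counts how many matching documents mention each location, then scales the pin radius with the square root of that count so that a pin's area, rather than its radius, grows with the number of hits.

```python
import math
from collections import Counter
from typing import Iterable


def count_hits_by_location(results: Iterable[dict]) -> Counter:
    """Count matching documents per location name.

    A location mentioned several times in one document is counted once.
    """
    counts: Counter = Counter()
    for doc in results:
        for name in {g["name"] for g in doc.get("geo", [])}:
            counts[name] += 1
    return counts


def pin_radius(count: int, max_count: int,
               min_px: float = 6.0, max_px: float = 30.0) -> float:
    """Map a hit count to a pin radius in pixels (assumed limits)."""
    if count <= 0 or max_count <= 0:
        return min_px
    scale = math.sqrt(min(count, max_count) / max_count)
    return min_px + (max_px - min_px) * scale


if __name__ == "__main__":
    results = [
        {"id": "a1", "geo": [{"name": "Mexico City"}]},
        {"id": "a2", "geo": [{"name": "Mexico City"}, {"name": "Geneva"}]},
    ]
    counts = count_hits_by_location(results)
    hottest = max(counts.values())
    for name, n in counts.items():
        print(name, n, pin_radius(n, hottest))
```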
This UI can also be used to track newly arriving news articles (a live feed) and to visualize where hot spots are emerging in the news. Below is a video of a simulated and accelerated day (5 seconds = 1 hour) showing incoming news articles and how news spreads around the world through the course of a day.
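The time compression behind such a replay is simple arithmetic. As a rough sketch, assuming each article carries a publication timestamp (the field names and dates below are made up), the moment an article should appear in the replay is its elapsed time since the start of the day, in hours, multiplied by 5 seconds. A full day therefore plays back in 24 × 5 = 120 seconds.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

SECONDS_PER_SIMULATED_HOUR = 5.0  # the 5 seconds = 1 hour ratio from the post


def playback_offset(published: datetime, start: datetime,
                    seconds_per_hour: float = SECONDS_PER_SIMULATED_HOUR) -> float:
    """Seconds into the replay at which an article should appear."""
    elapsed_hours = (published - start).total_seconds() / 3600.0
    return elapsed_hours * seconds_per_hour


def playback_schedule(articles: Iterable[dict],
                      start: datetime) -> List[Tuple[float, str]]:
    """Return (offset_seconds, article_id) pairs in playback order."""
    return sorted(
        (playback_offset(a["published"], start), a["id"]) for a in articles
    )


if __name__ == "__main__":
    day_start = datetime(2020, 1, 1)  # arbitrary illustrative date
    feed = [
        {"id": "a", "published": day_start + timedelta(hours=2)},
        {"id": "b", "published": day_start + timedelta(minutes=30)},
    ]
    for offset, article_id in playback_schedule(feed, day_start):
        print(f"{offset:6.1f}s  {article_id}")
```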
This is just one simple example of extending search results to a map where geocoded data is not readily available. I hope it helps to inspire other ideas for combining search with the functionality in the Bing Maps SDK.
Microsoft Enterprise Search Group