RSS Aggregation - Part 2: Our Implementation
Problems facing RSS right now
RSS presents the user with an entirely new way of retrieving, monitoring, and following up on web-based information sources. The technology is still in its infancy: two standards (RSS and Atom) are competing to become the default syndication format, while the industry is struggling to find the right place for RSS in the user’s daily information management functions.
To the typical end user, RSS is currently just a way to check the headlines on their favorite websites and blogs. The information itself is very dynamic, with the feed updating as often as the publisher creates new content, but the end user is left with little in the way of actions they can perform on the feed items once they are downloaded. The ease with which a user can sign up for these feeds commonly leads to information overload: the user ends up checking too many feeds in their aggregator, and the feeds blend into noise. The user has no effective way to categorize, manage, or control the view of information they see coming in from RSS feeds.
Aside from client-side problems with RSS, the Internet itself has begun to feel the effects of a rapidly increasing RSS user base. Major news companies that have begun putting out RSS feeds have seen a big spike in reader response, and this response has increased bandwidth congestion across the Internet. Users always want to stay as up-to-date as possible, so they will set the sync time in their aggregators (how often the feed is downloaded) to a very short interval. This wreaks havoc on web servers, especially small ones hosting a personal blog, by flooding them with requests and eating up their available bandwidth.
Most sites that syndicate content through RSS have begun to fight back, using a combination of two methods. First, they put a recommended sync interval into the XML of the feed so that the aggregator can parse it and use it as its sync interval. This helps because sites often update the feed only every so many minutes, so checking more often than that does not return information that is any more ‘up to date’. Second, sites have begun banning users who sync more often than the recommended interval, usually by responding with a “You have been banned” message and then disabling the user’s access for a specified period of time.
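As a rough illustration of the first method, here is a minimal Python sketch of how an aggregator might read a publisher's recommended interval. It assumes the hint is carried in the optional RSS 2.0 channel element <ttl> (a number of minutes); the sample feed and the function name recommended_interval_minutes are made up for this example and are not Outlook code.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; only the <ttl> element matters here.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/</link>
    <description>Sample feed</description>
    <ttl>60</ttl>
  </channel>
</rss>"""


def recommended_interval_minutes(feed_xml, default=None):
    """Return the channel's <ttl> in minutes, or `default` if absent or invalid."""
    channel = ET.fromstring(feed_xml).find("channel")
    if channel is None:
        return default
    text = channel.findtext("ttl")
    if text is None:
        return default
    try:
        minutes = int(text.strip())
    except ValueError:
        return default
    # A zero or negative value is not a usable interval.
    return minutes if minutes > 0 else default


print(recommended_interval_minutes(SAMPLE_FEED))  # prints 60
```

A feed without a usable <ttl> simply yields the default, so the aggregator can fall back to its own interval.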
The Outlook RSS Aggregation Experience
The challenge of making RSS aggregation in Outlook an effective tool for all of our users can be broken up into two main areas: Workflow Integration and Feed Management.
Outlook is in a position to offer a unique value proposition to the user by bringing RSS directly into their information workflow. Instead of bringing RSS down into a separate aggregation program or module, users will be able to interact with RSS items just like mail items.
For the user, this means easy and immediate acceptance of RSS items as first-class citizens in their Outlook experience. Flagging, assigning categories, sorting in the Navigation pane: all of these functions will work seamlessly with the new RSS items that can exist in a user’s Outlook folders. RSS items can be forwarded, shared, and flagged for follow-up in our new To-Do Bar.
You can create rules that act on individual RSS feeds or on all of your feeds. For example, a rule can monitor multiple feeds for a certain keyword or phrase and then automatically move and/or flag the matching items. Roll-up views are also easily made using our Search Folder functionality, allowing you to create customized lists of RSS items that update automatically. I have over 100 feeds subscribed in Outlook right now, and use a few Search Folders to easily find feed items that are relevant to me from across all of the subscribed feeds. For example, one RSS Search Folder I made shows only unread and new items from all my feeds that have been downloaded today or yesterday.
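To make the rule and Search Folder criteria concrete, here is a small Python sketch of the two filters described above: a keyword match across all feeds, and "unread items downloaded today or yesterday". The FeedItem class and both function names are hypothetical stand-ins for illustration; they are not part of Outlook or its object model.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class FeedItem:
    """Hypothetical stand-in for an RSS item stored in a folder."""
    feed: str
    title: str
    unread: bool
    downloaded: date


def recent_unread(items, today):
    """Search Folder style view: unread items downloaded today or yesterday."""
    yesterday = today - timedelta(days=1)
    return [i for i in items if i.unread and yesterday <= i.downloaded <= today]


def matching_keyword(items, keyword):
    """Rule style filter: items whose title mentions the keyword, in any feed."""
    needle = keyword.lower()
    return [i for i in items if needle in i.title.lower()]


today = date(2005, 6, 15)
items = [
    FeedItem("News", "Outlook ships RSS support", True, today),
    FeedItem("Blog", "Weekend photos", False, today),
    FeedItem("News", "Old outlook story", True, today - timedelta(days=5)),
]
print([i.title for i in recent_unread(items, today)])
print([i.title for i in matching_keyword(items, "outlook")])
```

Both filters run over the combined list of items, which is what lets a single view or rule span every subscribed feed.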
Enclosure support will be extensive. Enclosures are files of any type that are ‘attached’ to an RSS item and can be surfaced in a number of ways in an aggregator. We will automatically parse and download the individual enclosures from a feed item and surface them as attachments to the RSS item itself. Using Outlook’s robust attachment previewing, the user can interact with most standard media and document types directly inside of the rendering pane.
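For readers unfamiliar with enclosures, the following Python sketch shows roughly what parsing them involves. It relies on the RSS 2.0 <enclosure> element and its url, length, and type attributes; the sample item and the enclosures helper are invented for this example, and the actual download and attachment steps are left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed item carrying one enclosure.
ITEM_XML = """<item>
  <title>Episode 1</title>
  <enclosure url="http://example.com/ep1.mp3" length="12216320" type="audio/mpeg"/>
</item>"""


def enclosures(item_xml):
    """Return a list of dicts describing each <enclosure> on the item."""
    item = ET.fromstring(item_xml)
    found = []
    for enc in item.findall("enclosure"):
        url = enc.get("url")
        if not url:
            continue  # an enclosure without a URL cannot be downloaded
        length = enc.get("length", "")
        found.append({
            "url": url,
            "type": enc.get("type", ""),
            # length is a size in bytes; keep None when it is missing or malformed
            "length": int(length) if length.isdigit() else None,
        })
    return found


print(enclosures(ITEM_XML))
```

An aggregator would then fetch each URL and store the result alongside the item, using the declared type to decide how to preview it.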
Our main goal with workflow integration is to make the information that comes down through RSS actionable, allowing the user to employ Outlook’s rich feature set, familiar interface, and comprehensive information management techniques to interact with the content.
The ability to add, remove, and configure RSS feed subscriptions is standard fare for any RSS aggregator. Outlook will excel in its ability to merge feed management directly into our accounts architecture, reducing the number of places a user must go to manage incoming information sources (email, SharePoint, iCal, etc.).
This will significantly reduce the learning curve for users who are new to RSS, making management of their feed subscriptions just like managing their mail accounts. It will also reduce redundancy in design, saving the resources that building a totally new management infrastructure for RSS items would require. New functionality built into Outlook 12 allows us to change the delivery location of incoming accounts; this allows the user to deliver an RSS subscription to a specific folder inside of their mailbox without using a rule. This is a distinct advantage of our implementation: users can have a large number of feeds delivered to numerous locations without cluttering their rules menu.
At the individual item level, an RSS item is a derivation of the Post message class, specifically IPM.Post.Rss. The decision to build on the Post item as opposed to creating a new message class was made so we would inherit a majority of the mail functionality from the Post, and because the Post item structure is so similar to what we want to display for RSS items. RSS items will have custom fields, rendering frames, and object model support. This deep integration into the mail module allows users to deliver RSS feeds to already existing mail folders. A feed from a news source about a specific technology could be delivered directly to a relevant project folder in your folder hierarchy, or a feed from a friend’s blog can be delivered right to your inbox.
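The idea of inheriting behavior through the message class name can be sketched as follows. This Python example assumes a simple scheme in which an item with no handler of its own falls back to the nearest parent class, found by dropping the last dot-separated segment; the handlers table and resolve_handler are hypothetical and only illustrate the concept, not how Outlook is implemented.

```python
def resolve_handler(message_class, handlers):
    """Find a handler for the most specific matching message class.

    Tries the full class name first, then each parent class obtained by
    dropping the last dot-separated segment. Returns None if nothing matches.
    """
    parts = message_class.split(".")
    while parts:
        candidate = ".".join(parts)
        if candidate in handlers:
            return handlers[candidate]
        parts.pop()
    return None


# Hypothetical table: only the generic Post class has a handler registered.
handlers = {"IPM.Post": "post handler", "IPM.Note": "mail handler"}

print(resolve_handler("IPM.Post.Rss", handlers))  # prints "post handler"
```

Registering a dedicated entry for IPM.Post.Rss would take precedence over the Post handler, which is how RSS-specific fields or rendering could be layered on top of inherited behavior.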
Being a good ‘Net citizen
Integration with the Outlook Sync Manager (OSM) will allow us to intelligently manage feed polling and present status to the user. More importantly, we will attempt to prevent users from burning feed servers, and potentially getting banned, by reading a feed’s recommended sync interval and presenting that to the user as a default choice.
This is a key advantage of Outlook, as we will respect the server’s recommended update limit by default. A common complaint from many content providers is that aggregators neither respect this limit nor make it easy for the user to discover what it is and how to turn it on. Outlook’s design puts control back in the hands of the publisher, similar to what Cached Exchange Mode does with Exchange Server: the publisher can throttle requests from the client by setting a known value inside the content, with the knowledge that the client will use it to control sync. This is still a user-controlled decision, so you can turn it off if you’d like. I will delve more into our sync architecture and how it works in a later post.
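Until that post, here is a deliberately simplified Python sketch of the throttling decision described above. It assumes that "respecting the limit" means never polling more often than the publisher's recommended interval unless the user switches that behavior off; the function and parameter names are invented for illustration, and the real sync logic may differ.

```python
def effective_interval(user_minutes, publisher_minutes, respect_publisher=True):
    """Pick the polling interval in minutes.

    user_minutes: the interval the user asked for.
    publisher_minutes: the feed's recommended interval, or None if not given.
    respect_publisher: on by default; the user can turn it off.
    """
    if respect_publisher and publisher_minutes is not None:
        # Never poll more often than the publisher recommends.
        return max(user_minutes, publisher_minutes)
    return user_minutes


print(effective_interval(5, 60))                           # prints 60
print(effective_interval(5, 60, respect_publisher=False))  # prints 5
print(effective_interval(120, 60))                         # prints 120
```

With the default setting, an impatient five-minute sync against a feed that recommends sixty minutes is stretched to sixty, while a user who already syncs less often than recommended is left alone.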
In the next post I'll focus on some advanced ways you can interact with RSS items and on our sync architecture, and I'll answer more of the questions posed in your comments. Thanks again!