Enhancing Offline Favorites
Microsoft Active Channel technology is obsolete as of Windows Internet Explorer 7 and should not be used. To provide users with the best browsing experience, Microsoft Internet Explorer 4.0 introduced offline browsing to the Microsoft Win32 platform. Microsoft Internet Explorer 5 extends offline browsing, supporting "smarter" offline Favorites.
This article describes the creation and implementation of smart offline Favorites. Using a combination of Channel Definition Format (CDF) and HTML elements, Web authors can enhance the offline browsing experience for users with Internet Explorer 5 and later.
- Discover Offline Favorites
- LINK REL="Offline"
- Channel Definition Format (CDF)
- Sample Uses
- How It's Done
- Content Development Considerations for Offline Favorites
- Disabling Site Crawling
- Related topics
Discover Offline Favorites
Offline browsing allows users to view Web pages from the cache, a local repository of files gathered from the Web through normal browsing and through the delivery of content subscriptions.
Users can choose to work offline by selecting Work Offline from the File menu in Internet Explorer 4.0 and later. When working offline, the system functions independently, regardless of any network connection, and content is read exclusively from the cache. If the content is not available locally, Windows Internet Explorer asks users if they want to go online to view the content or continue working offline.
Offline browsing is an easy feature for users to discover and use in Internet Explorer 5 and later. Users adding Web pages to Favorites see a Make available offline check box in the Add Favorite dialog box. When users select the box, the Customize command button is enabled. The Offline Favorite Wizard is activated by clicking the Customize button. The wizard enables users to make linked pages available offline, create a synchronization (update) schedule, and enter site-specific user names and passwords.
LINK REL="Offline"
<LINK REL="Offline" HREF="/filename.cdf">
LINK is an HTML element that specifies a relationship with another object. REL is an attribute of LINK that sets or retrieves the specified relationship. For offline Favorites, the value of the REL attribute is "Offline", and the HREF attribute points to the CDF file that describes the pages to make available offline.
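As a rough illustration of how a tool might discover this relationship, the following Python sketch scans a page for LINK elements whose REL value is "Offline" and returns their HREF values. The class and function names are hypothetical, and the case-insensitive comparison of the REL value is an assumption, not documented Internet Explorer behavior.

```python
from html.parser import HTMLParser


class OfflineLinkFinder(HTMLParser):
    """Collects HREF values of LINK elements whose REL is "Offline"."""

    def __init__(self):
        super().__init__()
        self.cdf_hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "link":
            return
        attr_map = dict(attrs)
        rel = (attr_map.get("rel") or "").strip().lower()
        href = attr_map.get("href")
        # Assumption: the REL value is compared without regard to case.
        if rel == "offline" and href:
            self.cdf_hrefs.append(href)


def find_offline_cdf(page_html):
    """Return the CDF references declared by a page, in document order."""
    finder = OfflineLinkFinder()
    finder.feed(page_html)
    finder.close()
    return finder.cdf_hrefs
```

Calling `find_offline_cdf` on a page whose head contains the LINK element shown above would return a list holding `/filename.cdf`.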
Channel Definition Format (CDF)
CDF is an XML vocabulary, or XML-based data format, that can be used to organize a set of related Web documents into a logical hierarchy. CDF enables developers to describe the structure and logically present various structured views of their HTML-based sites. Individual Web pages can be described by a CDF file to specify a hierarchy of associated Web pages.
In general, authoring a CDF file for offline Favorites is similar to developing Active Channel content. The main difference is that you use a subset of CDF elements when developing an offline Favorite; the CHANNEL and ITEM elements are currently supported.
ITEMs specified by the author of the CDF file supersede the Offline Favorite Wizard, in which the user can choose whether to include one level of linked pages for offline use. In this way, the Web author can assist the offline user by selecting a subset of linked pages, selecting content more than one level deep, or even selecting content that isn't linked.
Note Although the offline Favorite feature of Internet Explorer 5 and later uses a subset of the CDF vocabulary, Internet Explorer ignores the CDF elements it does not use. It may not be necessary, therefore, to author separate CDF files for Active Channel sites and offline Favorites.
You can create a CDF file with any text editor.
Sample Uses
Offline Favorites save connection charges, navigation and download time, and enable users to take Web content with them wherever they can go with their computers. All of the desired text, graphics, and multimedia files can be placed on users' hard drives for offline browsing. By synchronizing their offline Favorites, users can quickly and easily update their offline content.
Previously, users had two choices when adding offline Favorites. They could choose to have all the article's links available offline, or none of them. Authors can now offer users a more meaningful alternative.
By choosing which pages go offline with a given Web page, authors can give users access to an intelligent subset of additional pages, enhancing the experience for users viewing content offline. This is a much more friendly alternative than "all or nothing."
The offline browsing experience can be enhanced for almost any Web page a user might add to offline Favorites. Web pages contain links to other Web pages, and some of the linked pages are more useful than others in offline mode. Going a step further, pages linked to these linked pages might also be useful to the user offline.
Perhaps you are authoring a research article for an educational institution. The article will contain links to the department that sponsored your work and links to online references that supported your research. Perhaps the article is hosted in a frame containing links to the whole institution. Offline readers of your work might like to have your references available offline. They might find offline access to your institution's home page less useful.
But if a user precaches all the linked pages one level down, they get all the links, useful or not. Prior to Internet Explorer 5, the only other choice was precaching none of the linked pages, which can also detract from the offline browsing experience.
Consider the reader of this article, for example. A user adding this article to Favorites for offline browsing is likely to be a Web author who has an interest in implementing the technology. This user would find the CDF references useful offline also. The HTML link element and the REL attribute are summarized adequately within the article.
A CDF file for this article might look like this:
<?XML VERSION="1.0" ENCODING="windows-1252"?>
<CHANNEL HREF="/Workshop/delivery/offline/linkrel.asp" LASTMOD="1998-09-19T22:12" PRECACHE="YES" LEVEL="0">
    <ITEM HREF="/Workshop/delivery/cdf/reference/CHANNEL.asp" LASTMOD="1998-09-19T22:12" PRECACHE="YES" LEVEL="0">
    </ITEM>
    <ITEM HREF="/Workshop/delivery/cdf/reference/ITEM.asp" LASTMOD="1998-09-19T22:12" PRECACHE="YES" LEVEL="0">
    </ITEM>
</CHANNEL>
In the sample above, the CHANNEL element defines the top level of the hierarchy. Its attributes specify the offline Favorite's URL (HREF), the date the page was last modified (LASTMOD), whether the content should be precached (PRECACHE), and the number of levels deep the browser should site crawl and precache the CHANNEL's content (LEVEL).
The ITEMs nested within the CHANNEL element define the next level in the hierarchy of associated Web pages. Each of the above ITEMs specifies a URL that represents the location of the page, the date the page was last modified, and the number of levels deep the browser should site crawl and precache the ITEM's content.
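To make the hierarchy concrete, the following Python sketch reads a CDF document and lists each CHANNEL and ITEM with its HREF and LEVEL. It is an illustration only: the function name is hypothetical, the leading declaration is stripped because strict XML parsers may reject the upper-case form used in these samples, and treating a missing LEVEL as 0 is an assumption.

```python
import re
import xml.etree.ElementTree as ET


def precache_plan(cdf_text):
    """Return (tag, href, level) tuples for the CHANNEL and its ITEMs."""
    # Drop a leading <?XML ...?> declaration, whatever its letter case,
    # so the remaining markup can be handed to a strict XML parser.
    body = re.sub(r"^\s*<\?xml\b.*?\?>", "", cdf_text, count=1,
                  flags=re.IGNORECASE | re.DOTALL).strip()
    channel = ET.fromstring(body)
    entries = [channel] + list(channel.iter("ITEM"))
    # Assumption: an element without a LEVEL attribute is treated as LEVEL 0.
    return [(e.tag, e.get("HREF"), int(e.get("LEVEL", "0"))) for e in entries]
```

Applied to a CDF file like the sample above, the function would return one tuple for the CHANNEL followed by one tuple per ITEM, which is a quick way to review what an offline Favorite would ask the browser to precache.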
How It's Done
In this section, you'll see how easy it is to author a basic CDF file to support smarter offline Favorites. You'll see the actual HTML you include in your Web page to instruct Internet Explorer 5 and later to crawl the CDF file when users choose to make your page available offline.
Add the following HTML within the head section of your Web page.
<LINK REL="Offline" HREF="/FolderName/FileName.cdf">
This line specifies a relationship between the Web page and the CDF file you create in the next step.
Author a CDF file with the path and file name you specified in the HREF attribute above. The following example shows a simple CDF file that instructs Internet Explorer to cache the Web page and all the pages linked to it.
<?XML VERSION="1.0" ENCODING="windows-1252"?>
<CHANNEL HREF="/FolderName/WebPage.htm" PRECACHE="YES" LEVEL="1">
</CHANNEL>
What if you don't want all of the linked pages precached? Perhaps there are too many, or they aren't all relevant. It is not always appropriate to cache all the pages linked to a Web page, and sometimes pages that are not linked directly to a Web page should be included. When only specific pages would benefit a user choosing a given page as an offline Favorite, set the value of the CHANNEL element's LEVEL attribute to 0 and specify the relevant pages individually using the ITEM element.
<?XML VERSION="1.0" ENCODING="windows-1252"?>
<CHANNEL HREF="/FolderName/WebPage.htm" PRECACHE="YES" LEVEL="0">
    <ITEM HREF="/FolderName/RelatedPage1.htm"></ITEM>
    <ITEM HREF="/FolderName/RelatedPage2.htm"></ITEM>
</CHANNEL>
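For sites with many pages, a CDF file of this shape could be generated rather than written by hand. The following Python sketch is one possible approach, not a Microsoft tool: the function name and parameters are hypothetical, and the declaration line is copied verbatim from the samples in this article. It requires Python 3.9 or later for `ET.indent`.

```python
import xml.etree.ElementTree as ET

# Declaration copied from the article's samples (assumed to be what the
# browser expects; it is written as a plain string, not produced by the
# XML library).
CDF_DECLARATION = '<?XML VERSION="1.0" ENCODING="windows-1252"?>'


def build_offline_cdf(page_href, related_hrefs, level=0):
    """Return CDF text for one page plus an explicit list of related pages."""
    channel = ET.Element(
        "CHANNEL",
        {"HREF": page_href, "PRECACHE": "YES", "LEVEL": str(level)},
    )
    for href in related_hrefs:
        # Use each HREF exactly as given; no normalization is applied.
        ET.SubElement(channel, "ITEM", {"HREF": href})
    ET.indent(channel)
    body = ET.tostring(channel, encoding="unicode", short_empty_elements=False)
    return CDF_DECLARATION + "\n" + body + "\n"
```

For example, `build_offline_cdf("/FolderName/WebPage.htm", ["/FolderName/RelatedPage1.htm", "/FolderName/RelatedPage2.htm"])` produces a document equivalent to the sample above. When saving the result, write it with the `cp1252` codec so the bytes agree with the declared encoding.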
Content Development Considerations for Offline Favorites
Developers should be aware that the URLs of cached pages must exactly match the HREF values of the anchor tags. Whether or not you are authoring content specifically intended for offline use, it is a good practice to be careful authoring anchor tags. If a user takes a page and its links offline (adds an offline Favorite), and an HREF value in an anchor tag isn't identical to the file name of the linked page, the user won't have offline access to the linked page unless he or she knows to enter the correct URL in the address bar.
To ensure that hyperlinks for your smart offline Favorites work when the user is browsing offline, author the HREF values for ITEMs in your CDF file exactly the same way you authored the hyperlinks on your Web pages. If the hyperlink on your Web page says /FolderName/RelatedPage1.htm, the HREF value for the corresponding ITEM in your CDF file should also say /FolderName/RelatedPage1.htm. Likewise, if the hyperlink on your Web page only says /RelatedPage1.htm, the HREF value of the corresponding ITEM in your CDF file should also only say /RelatedPage1.htm.
The reason for this is that Internet Explorer respects the URL of the page the user is viewing when adding it to offline Favorites.
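A simple consistency check can catch mismatches of this kind before publishing. The following Python sketch compares ITEM HREF values against the anchor HREF values found in a page and reports any ITEM that has no exact textual match. The names are hypothetical, and an unmatched ITEM is not necessarily an error, since a CDF file may deliberately include pages that are not linked from the page.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects the HREF value of every anchor tag, exactly as authored."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def unmatched_items(page_html, item_hrefs):
    """Return ITEM HREFs with no character-for-character match among anchors."""
    collector = AnchorCollector()
    collector.feed(page_html)
    collector.close()
    anchors = set(collector.hrefs)
    return [href for href in item_hrefs if href not in anchors]
```

If a page links to `RelatedPage2.htm` while the CDF file lists `/FolderName/RelatedPage2.htm`, the function returns the CDF value, flagging the two spellings for review.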
Disabling Site Crawling
Web authors can prevent Internet Explorer from crawling the links on any Web page by inserting the following HTML into the head section of the page. This prevents users from making linked pages available offline when they use the Customize option in the Offline Favorite Wizard:
<META NAME="ROBOTS" CONTENT="NOFOLLOW">
To prevent Internet Explorer from crawling all pages on a Web site, a file named Robots.txt should be placed in the root directory of the Web server. This file should contain the following lines.
User-agent: MSIECrawler
Disallow: path
In this example, path is the name of the path that Internet Explorer should not site crawl. A path of "/" can be used to exclude all pages on the server. Note that "MSIECrawler" can be replaced with an asterisk (*) to indicate that all automated crawling robots (including Internet Explorer) should be excluded.
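Exclusion rules of this kind can be checked before deployment. The following Python sketch uses the standard library's `urllib.robotparser` as a stand-in for a crawler. It assumes the Robots.txt file uses the standard User-agent and Disallow lines, the function name and the `/private` path are hypothetical, and the parser's matching rules are not guaranteed to mirror those of Internet Explorer.

```python
from urllib.robotparser import RobotFileParser


def crawler_allowed(robots_txt, url, agent="MSIECrawler"):
    """Return True if the given agent may fetch url under these rules."""
    parser = RobotFileParser()
    # parse() accepts the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical rules that exclude one path for the Internet Explorer crawler.
EXAMPLE_RULES = "User-agent: MSIECrawler\nDisallow: /private\n"
```

With `EXAMPLE_RULES`, `crawler_allowed(EXAMPLE_RULES, "/private/page.htm")` reports that the page is excluded, while a URL outside `/private` remains allowed. Replacing the path with `/` excludes every page on the server for that agent.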