Power was out for 9 days, from December 14th to December 23rd. Being stuck in freezing weather really helped take the edge off of Christmas :D Also, I'm pretty much useless without some sort of electronic gadget.

The MSHTML COM object issue I was having before isn't so much a *problem*. What I was running into was done by design, and for good reason. In my scenario, though, it didn't make much sense.

I was writing a program to scrape relevant values off of a series of HTML pages. When you're using Passport (Windows Live ID) as an authentication mechanism, the login page always resides on login.live.com. The page you actually visit hosts a set of iframes, and the content you see outside of the login form itself is framed in from other sources on other domains.

In this scenario, the host page originates from login.live.com, and the content in its iframes originates from www.microsoft.com. I wanted to fetch the data in those iframes programmatically. No problem! I wrote an application that used the .NET 2.0 WebBrowser control to navigate to the login page, then walked the Window.Frames collection to get the underlying HTML documents. That's when I hit a System.UnauthorizedAccessException. What? Why should that happen?

My understanding is that JavaScript works by manipulating the same MSHTML COM object, and that blocking access to the content in the iframes is by design: it's the browser's same-origin policy, a countermeasure against cross-site scripting (XSS) that prevents one domain from reading sensitive information in another. That makes total sense for client-side script. But in this scenario, it meant there simply wasn't a "clean" way to pull any of the information that didn't originate from login.live.com.
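A rough way to picture that rule, sketched in Python rather than .NET: two URLs count as the same origin only if scheme, host, and port all match. This is a simplified illustration, not how MSHTML actually implements the check (real browsers have extra rules), and the function names and URL paths below are made up for the example.

```python
from urllib.parse import urlsplit

# Default ports, so that "https://host" and "https://host:443" compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return a simplified (scheme, host, port) origin tuple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """True when both URLs share scheme, host, and port."""
    return origin(url_a) == origin(url_b)


# Hypothetical paths; only the host names come from the scenario above.
host_page = "https://login.live.com/login"
framed_page = "https://www.microsoft.com/some-frame"

print(same_origin(host_page, host_page))
print(same_origin(host_page, framed_page))
```

Under this simplified rule, the host page and the framed page are different origins, which is the situation where reaching into a frame's document gets refused.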

My workaround? A somewhat convoluted system in which, after the initial request, three other hidden web browser objects (one for each iframe) request the URL of each iframe directly. In each one's DocumentCompleted event, I'm able to use the existing object model and scrape all of the relevant information. It's not ideal, but it works, and I'm not recreating a ton of existing functionality.
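The first half of that idea (find each iframe's URL so it can be requested on its own) can be sketched outside of .NET too. Here's a minimal Python version using only the standard library. It's an illustration of the approach rather than the code from my application, and the class name, function name, and sample markup are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IframeSrcCollector(HTMLParser):
    """Collects the absolute URL of every iframe/frame in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag in ("iframe", "frame"):
            src = dict(attrs).get("src")
            if src:
                # Resolve relative src values against the host page URL.
                self.urls.append(urljoin(self.base_url, src))


def iframe_urls(html, base_url):
    """Return the frame URLs found in `html`, in document order."""
    collector = IframeSrcCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.urls


# Made-up markup standing in for the host page.
sample = """
<html><body>
  <iframe src="https://www.microsoft.com/frame-one"></iframe>
  <iframe src="/frame-two"></iframe>
</body></html>
"""

for url in iframe_urls(sample, "https://login.live.com/login"):
    print(url)
```

Each URL that comes back can then be loaded as a top-level document, where the cross-domain restriction on frames no longer applies because there is no parent page from another domain doing the asking.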