Web Test correlation

I've been thinking a lot about web test correlation over the last 6 months. What exactly is correlation? It's when you have to scrape a value out of the response to pass to a subsequent request. The hidden field rules we automatically apply after a recording are an example of this.
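To make that concrete, here is a minimal Python sketch of correlating a hidden field by hand. It is not how the web test engine is implemented; the field name, its value, and the `quantity` parameter are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collects name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        # Attribute values can be None when the attribute has no value.
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def extract_hidden_fields(html):
    """Scrape every hidden field out of a response body."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


# Hypothetical response body from the first request.
first_response = (
    '<form><input type="hidden" name="__VIEWSTATE" value="dDwtMTIz"></form>'
)

# Pass the scraped value along in the body of the next request.
hidden = extract_hidden_fields(first_response)
next_body = urlencode({**hidden, "quantity": "2"})
```

The important part is that the next request's body is built from what the server just sent back, not from whatever was captured at recording time.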

Web tests work at the http layer and do not run through IE. The web test playback viewer is a bit misleading, because we preview the page that was returned in a hosted IE window, so it kind of looks like the playback went through IE. But in fact, all requests during playback go through the web test engine. We then save the responses to disk and fix up the resource urls to point to downloaded resources (e.g. if there's an image on the page, we download it to the temp directory and then rewrite the url to point to the temp dir). The IE browser in playback works by pointing IE to the downloaded file in the temp directory. This instance of IE has JavaScript turned off, one because the playback engine doesn't run JavaScript, and two because it would be a security hole, since files loaded from disk are most likely in a different security zone than files loaded from a web site.

But I'm rambling a bit. The main point is that all requests go through our http engine. We chose this (rather than running through IE) because our primary scenario for web tests is running them in a load test. There are a bunch of implications to this, as IE obviously does a lot of stuff. In order to play back correctly, the things IE does that affect what goes over http also need to be done in the web test.

The good news is http is very simple. There are only three ways to send data over http: the url (including query string parameters), the http headers, and the form post body.
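Here is a small Python sketch that puts a value in each of the three places on a single request. The host, parameter names, and values are all hypothetical, and nothing is actually sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

# 1. The url, including query string parameters.
query = urlencode({"cartId": "12345"})

# 3. The form post body.
body = urlencode({"item": "widget", "quantity": "2"}).encode("ascii")

request = Request(
    url="http://example.com/checkout?" + query,
    data=body,
    # 2. The http headers (cookies travel here).
    headers={"Cookie": "sessionId=abc123"},
)
```

Because a body is supplied, `urllib` treats this as a POST. Any dynamic value the server expects back has to ride in one of these three slots.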

Most of what IE does actually doesn't affect what goes over the wire. E.g. visual effects in IE using dhtml often don't send any data. Also, for web apps, perf problems are almost always in the mid-tier, and are not caused by static content served up by web servers (IIS is extremely fast at serving up content).

So what does matter? The biggest problem is dynamically generated data. That is, data that changes every time you run the app. One example of this is a session ID that uniquely identifies a user session. Another is a cart ID to identify a shopping cart. With these parameter values, you can't just record and play back the recorded values, because the server will reject the stale values and playback will fail. So tying this back into the three ways data is sent, how are dynamic variables sent?

The three most common ways are:

  1. cookie values
  2. hidden fields
  3. query string parameters

For #1, the VS 2005 web tests work well. Cookies are automatically propagated in a virtual user session (a session is one iteration of a web test). We did recently find some problems in this area as the system.net classes (which we use in our http engine) implement cookie handling per the RFC, but IE does not in a couple of cases. We're working on a solution where we bypass system.net's cookie handling and use our own to properly handle these cases.

For #2, VS 2005 web tests have a good solution as well, although we have a key fix in this area in SP1. So if you are having trouble with record/playback, get SP1! The specific problem there was that we’d correlate a hidden field even if it was set in JavaScript, which would result in sending the wrong value. The SP1 fix is the following: if the hidden field parameter value posted is different from the value that’s received in the response, it means it was set in JavaScript. In that case, the recorder will record the value posted rather than correlate the hidden field.
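The SP1 rule boils down to a single comparison. Here is a rough Python sketch of the decision; the function names and the shape of the returned record are made up for illustration and are not the recorder's actual code.

```python
def should_correlate(posted_value, response_value):
    """Correlate only when the browser posted exactly what the server sent."""
    return posted_value == response_value


def record_hidden_field(name, posted_value, response_value):
    """Decide how a hidden field parameter is stored in the recorded test."""
    if should_correlate(posted_value, response_value):
        # Value came straight from the response: extract it at playback time.
        return {"name": name, "source": "extracted"}
    # Value was changed on the client (e.g. by script): keep what was posted.
    return {"name": name, "source": "recorded", "value": posted_value}
```

So a field that comes back unchanged gets correlated, while a field that script overwrote before the post is recorded with the literal value that was sent.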

VS 2005 doesn’t have a solution for #3, and when this happens it’s hard to figure out exactly what to do. So we’re adding a cool feature in Orcas to figure out when this happens and suggest correlations for the user to add to their test. This feature should save you a ton of time on your tests that require correlation, and us a ton of support calls and forum posts. :)

During recording, we record extraction rules for every single form post and query string parameter using the extract text rule, along with the value that was recorded. Then in the correlate command, we’ll fire the extraction rule against the playback response and compare the result to the recorded value. If they differ, we’ve found a dynamic parameter and will suggest you apply the rule to the test. So far it has worked great! Unfortunately this feature did not make it in for beta 1.
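Here is a rough Python sketch of that detection idea, assuming an extract text rule is just a pair of strings that bracket the value in the response. The rule format, the `starts_with`/`ends_with` names, and the sample data are my simplifications for illustration, not the Orcas implementation.

```python
def extract_text(response, starts_with, ends_with):
    """Return the text between starts_with and ends_with, or None if absent."""
    start = response.find(starts_with)
    if start == -1:
        return None
    start += len(starts_with)
    end = response.find(ends_with, start)
    if end == -1:
        return None
    return response[start:end]


def find_dynamic_parameters(rules, playback_response):
    """Fire each recorded rule and report parameters whose value changed."""
    suggestions = []
    for rule in rules:
        value = extract_text(
            playback_response, rule["starts_with"], rule["ends_with"]
        )
        if value is not None and value != rule["recorded_value"]:
            suggestions.append(rule["parameter"])
    return suggestions


# Hypothetical rules captured at recording time.
rules = [
    {"parameter": "cartId", "starts_with": "cartId=", "ends_with": '"',
     "recorded_value": "1001"},
    {"parameter": "lang", "starts_with": "lang=", "ends_with": "&",
     "recorded_value": "en"},
]

# Hypothetical response seen during playback.
playback_response = '<a href="/cart?lang=en&cartId=2002">View cart</a>'
suggested = find_dynamic_parameters(rules, playback_response)
```

In this made-up data the `cartId` value changes between recording and playback while `lang` stays the same, so only `cartId` would be suggested for correlation.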

If you have web sites for which record/playback does not work, and that we can test against, I’d love it if you would send me the link and instructions for navigating the site.