Chapter 7 - Data Access and Transactions

Article
12/09/2009

To be useful, data must be accurate and accessible, and it must conform to the needs of users. Users should be able to respond to, and act upon, the information presented to them. The tool that makes this flexibility possible is the database, an essential element in today's Web applications.

The previous chapter introduced the n-tier application model, and described how to use a combination of client-side scripts and components with Active Server Pages (ASP) to create dynamic Web-based applications. This chapter builds on those concepts, beginning with a conceptual overview of Web database technologies, followed by technical discussions of how to harness the power of a data-driven approach for content publishing and information management on the Web.

Web Database Technologies

The Internet is changing our expectations about the availability of information. What we expect from the Internet and from the Web is changing as well. Once you've visited a site that lets you browse a product catalogue and initiate a sales transaction online, nothing is more frustrating than visiting another site that talks about products, but doesn't let you purchase them immediately.

As online commerce and electronic publishing become increasingly common, sites that provide a high level of interactivity will replace those that simply present information. Interactivity and complexity call for information to be stored in a way that makes it easy to manipulate and modify. This is the central role of the database in today's Web applications.

Why a Web Database?

What makes the Web such a good mechanism for accessing a database? When you use a data management solution as part of your site, you reap the following benefits:

Ease of Deployment It's no secret that the World Wide Web is a cheap and practical alternative to traditional client/server application deployment, and that it provides immediate cross-platform support on the client side. Implementing a dynamic Web database solution can be done relatively quickly and doesn't require a large team of developers.
Database Standards The components that enable Web database access are built on proven standards. Web pages can access data from a variety of sources, such as Microsoft® Access, Microsoft® SQL Server", or any database compliant with Microsoft® OLE DB or Open Database Connectivity (ODBC) Operator.
Data Security By using IIS 5.0 and SQL Server, you leverage the security model of Microsoft® Windows® 2000 Server. By using Microsoft® Component Services, you automatically gain the data protection and operational integrity provided by a distributed transaction coordinator. You have a lot invested in your content-protect it! (For more information about transactions, see "Transaction Processing on the Web" later in this chapter.)
Dynamic Content Dynamically generating HTML from a database is easier than making numerous manual changes to individual pages. By automating the creation of HTML from content stored in a database, you save time and make site management easier. In the end, you can focus on updating your content, not your HTML.

Data Publishing Considerations

Of course, not everything great about Web database access is free. Before you start combining databases and HTML, there are some important issues you should consider:

Static vs. Dynamic How much of your site really needs data access? How often does your content change? Dynamic solutions, especially if they involve accessing a database, are slower than plain, static HTML pages. If you display data that doesn't change frequently, you can improve performance (for your server and your client) by converting dynamic pages to HTML.
Server Load Be sure you have sufficient server resources to handle the increased demands of database access. Consider memory, CPU speed, Internet connection speed, disk subsystems and other critical hardware factors. If you are expecting heavy database traffic, you may need to separate your Web server and database management system (DBMS) onto two computers (or more). Also, use existing database and performance management tools to help you measure and balance your server load.
Tool Support The tools used to develop Web-based applications sometimes aren't being updated as fast as the technology is changing. Research and choose your tools carefully before you implement a large-scale database project.
Database Scalability and Reliability Determine how much the database is likely to grow. On average, how often will users access it? What kinds of tasks will they perform? What is your Web site's overall growth estimate-in both content and readership?
Client Presentation How will users access the data on your site? Will they be able to add to it, or modify it? Will the users have their own copies of the data, or will they only have access to the information while online? Using Microsoft® Data Access Components (MDAC), information can either be manipulated on the server as part of a server-side query, or be bundled as a package and transmitted to the client. Choosing how the information will be presented to the user is perhaps the hardest decision you'll have to make-often a hybrid approach is best.

Industrial Strength Information

Database-centric publishing is not just a convenience-it's a form of commerce applied to the commodity of information. You should consider the content of your company's Web site as you would your company's product. In addition to providing the latest information, a good Web administrator makes sure the information suits the needs of the customer.

Intranet and Extranets

Using the same networking technologies as the Internet, the corporate intranet publishes and disseminates information. The intranet is quickly becoming an effective way to capture and present information throughout the enterprise. With the same technology used on your Web site and intranet, combined with a tight security model, you can also extend the intranet to your trading partners, contractors, and suppliers via an extranet. The extranet has become a primary means of interaction between companies that contribute to each other's everyday business.

Extending the reach of information is a key feature of intranets and extranets. Users need to have access to information that is stored in more than one location, across multiple databases, or different file formats. Many technologies exist today to give you full access to information stored in mail systems, relational databases, file systems, and mainframes.

If information truly is your commodity, you should constantly be searching for ways to improve its content and distribution. Not only can you replace paper forms with online data entry, teams and working groups can improve their communication and productivity by sharing common resources, data, knowledge, and new ideas.

Database-Centric Publishing

Instead of authoring directly in HTML, many successful Web sites initially store their content in a database and combine it with HTML layout tags only when deploying the content for publication. Combining raw information with a layout template, or HTML boilerplate, imbues the content with the same look and feel as other pages on the site. The end result is a site that has a uniform appearance from page to page, even though the information has been created by dozens of people.

For instance, Microsoft® Sidewalk®, the company's family of city-centric entertainment guide Web sites, uses a custom-built application based on HTML forms and SQL Server to collect content from its contributors. Once the information is entered, it is managed separately from the visual representation of the site. Sidewalk.com employs a large number of freelance correspondents to keep the content fresh, and a smaller team of Web administrators to manage the technological aspects of the site.

Content Search and Personalization

Data in a database can be sorted, filtered, and queried for relevance. Why should it not be the same on your Web site or intranet? It is surprising how much raw data a site can contain. Not every piece of information is of interest to everyone. Users don't want to spend time searching through data, and will be less likely to visit sites with poor search tools.

To attract visitors, many sites provide a list of current headlines, or compilations of topics that appeal to people with common interests. Others allow the user to select which items will be displayed on the site's home page. Without a data-driven solution, it would be impossible to develop a site that could be tailored to display the latest items of interest each time a visitor returned.

Real-Time Information Systems

There's nothing wrong with collecting information about customers, as long as that information is not misused, or sold without permission. Building a database of customer habits is useful not only as a mechanism for feedback to management, but also as a means of providing individually tailored service to repeat customers. Likewise, field sales information, design and production schedules, product inventory, online purchases, sales transactions, marketing research, and competitive information all have an impact on your ability to provide service to your customers. Making that information available in real time (or as a continuous up-to-date flow of information) dramatically affects your ability to make rapid and incremental improvements to your business processes.

Data Warehousing and Online Analytical Processing

As organizations collect increasing volumes of data, the need to analyze and derive conclusions from raw data becomes more acute. A data warehouse is often used as the basis for a decision-support system (also referred to as a business intelligence system). It is designed to overcome some of the problems encountered when an organization attempts to perform strategic analysis using the same database that is used to perform online transaction processing (OLTP).

Typically, OLTP systems are designed specifically to manage transaction processing and minimize disk storage requirements using a series of related and normalized tables. However, when users need to analyze their data, a myriad of problems often prohibits the data from being used:

Application databases can be segmented across multiple servers, making it difficult for users to find the data in the first place.
Users may not understand the complex relationships among the tables, and therefore cannot generate ad hoc queries.
Security restrictions may prevent users from accessing the detailed data they need. Database administrators prohibit ad hoc querying of OLTP systems. This prevents analytical users from running queries that could slow down the performance of mission-critical production databases.

Data warehousing offers a viable solution to these problems. Basically, data warehousing is an approach to storing data in which heterogeneous data sources from across the enterprise (typically from multiple OLTP databases) are migrated to a common homogenous data store. Sometimes organizations maintain smaller, more topic-oriented data stores called data marts. Whereas data warehouses or data marts are the stores for data, online analytical processing (OLAP) is the technology that enables client applications to efficiently process the data. Data warehouses (combined with OLAP) provide the following benefits to analytical users:

Differences among data structures across multiple heterogeneous databases can be resolved. Data transformation rules can be applied to validate and consolidate data, when data is moved from the OLTP database into the data warehouse.
Data is organized to facilitate analytical queries rather than transaction processing. Frequently-queried data is pre-aggregated and the results are stored as "cubes," which are table-like structures that enable very fast response time to ad hoc queries.
Security and performance issues can be resolved without requiring changes in the production systems. OLAP is a powerful tool for creating new views of data, based upon a rich array of ad hoc calculation functions.

Microsoft Data Access Components

The previous section discussed the benefits of providing database access solutions on the Web. This section discusses how you can develop these solutions using Microsoft data access technologies.

Universal Data Access is Microsoft's initiative for providing high-performance access to all types of information (including relational and nonrelational data) across an organization, from the desktop to the enterprise, using practically any tool or language.

In a parallel effort with the Microsoft® Windows® Distributed interNet Applications (Windows DNA) architecture, the Universal Data Access initiative is a platform, application, and tools strategy that defines and delivers standards and technologies essential for application development.

MDAC is comprised of the data access technologies that enable Universal Data Access: ODBC, OLE DB, and Microsoft® ActiveX® Data Objects (ADO). Figure 7.1 shows how these data access components interact. For the latest information and current releases of MDAC see https://www.microsoft.com/data.

Bb742412.iis0701(en-us,TechNet.10).gif

Figure 7.1 Microsoft Data Access Components

The next few sections describe each of the MDAC technologies. Most of your Web applications will use ADO, which is described in detail later.

ODBC and OLE DB

The ODBC standard is a widely recognized method of accessing data in a variety of relational databases. It is fast, lightweight, and provides a common method that is not optimized for any specific data source.

Like ODBC, OLE DB is an open specification. It was designed to build on the success of ODBC by providing another standard for accessing data. Whereas ODBC was created to access relational databases, OLE DB interfaces are designed to communicate with any data source including relational and nonrelational data, such as Microsoft® Excel spreadsheets, as well as e-mail, and text files. There is no restriction on the type of data you can access with OLE DB-relational databases, indexed sequential access method (ISAM), text, or hierarchical data sources.

OLE DB is set of programming interfaces designed for driver vendors who want to expose a data source, and for C++ developers wanting to develop custom data components. Microsoft® Visual Basic®, which does not support Automation objects, cannot use OLE DB directly.

Applications that use OLE DB fall into two categories: consumers and providers. A consumer application uses (or consumes) data through the OLE DB interfaces or components. A provider is any component or data source that allows consumers to access data in a uniform way through the OLE DB interfaces. In a sense, an OLE DB provider is similar to an ODBC driver that provides a uniform mechanism for accessing relational data.

Several OLE DB providers are available, including the Microsoft® ODBC provider, which exposes any ODBC-compliant database through OLE DB. Developers can implement an OLE DB provider for whatever data access they require, if one does not already exist.

ADO and RDS

ADO and Remote Data Service (RDS) use OLE DB providers to communicate with local and remote data sources respectively. Any application that uses ADO objects gets its data indirectly from OLE DB. If there is an OLE DB provider for it, the data is accessible through ADO.

You can use ADO to write both server-side and client-side applications that can access and manipulate data. ADO, designed to provide a universal high-level data access method, is a collection of Automation objects that can retrieve, update, and create records from any OLE DB provider.

ADO exposes a set of functions that all data sources are expected to implement. Using these core functions, ADO can access the unique features of specific data sources through OLE DB. Additionally, unlike earlier data access methods, you no longer need to navigate through a hierarchy to create objects. You can create most ADO objects independently and reuse them in different contexts. If used correctly, the result is fewer ADO object calls and a smaller working set.

There's a downside to all this flexibility, however. Because ADO is an OLE DB consumer, the peculiarities of the OLE DB provider that you are using directly influence the behavior of ADO. Just because you can write an application in ADO doesn't mean that the provider will support it. Often, ADO errors are a direct result of performing operations not supported by the OLE DB provider, or the underlying data source. As you develop database access components and applications, keep in mind that there are sometimes multiple ways to perform any given action.

RDS is a feature of ADO that facilitates client-side programming by optimizing the transfer of data between the client and the ADO components in the middle tier of a Web application. RDS uses ADO as a programming interface between the code and the data exposed by the underlying OLE DB provider. The client-side components of RDS are Microsoft® ActiveX® controls that use either Microsoft® Component Object Model (COM) components or Hypertext Transfer Protocol (HTTP) to communicate with the server components. Microsoft® Internet Explorer includes the RDS client-side components.

Note: For Microsoft® Internet Explorer 3.x users, MDAC 2.0 provides service components that are compatible with RDS server-side and client-side components. Client-side components are included with Microsoft® Internet Explorer 5. Since later versions of MDAC are not 100 percent compatible with earlier versions of the browser, you might need to upgrade your clients before using the more advanced features of RDS and ADO.

A detailed look at RDS and ADO is included in the section "Client-Side Data Access" later in this chapter.

Other Data Access Methods

In today's fast-paced technological arena, it is not surprising to find that last year's good idea is this year's outdated software. For the sake of those that cannot throw away older technology just to adopt the latest craze, the following section enumerates some older data access technologies that are still supported by IIS 5.0. However, unless you have a good business reason to use them, you should rely on ADO instead. ADO is designed to balance flexibility with programmatic simplicity. In most cases, it is the only data access method you will need.

ADC

The Advanced Data Connector (ADC) can be considered the parent of RDS. In fact, the RDS technology used to access remote data is inherited directly from ADC. The early design of ADC was less flexible than the ADO programming model, so it was integrated with ADO to provide a uniform means of accessing remote data. ADC itself is now considered obsolete; RDS (and ADO, which is used by RDS in the middle tier) has replaced ADC programming.

Instead of ADC, use RDS when you need to provide a common programming model for accessing either local or remote data. RDS objects are installed with Microsoft® Internet Explorer 4.0 and Internet Explorer 5 on your client, or you can download them at run time from the .cab files shipped with MDAC components.

Jet Database Engine and DAO

The Jet database engine is a workstation-based storage system. Jet databases can be accessed using Data Access Objects (DAO). You can also access Jet databases with the ODBC drivers provided with Access, but only limited functionality is exposed using these drivers. The Jet database engine has its own query and result-set processors and is capable of executing queries against homogeneous or heterogeneous data sources. Developers who are familiar with DAO can use ODBCDirect to bypass the Jet database engine when connecting to back-end data sources.

DAO requires that you change programming models depending on whether your data is stored in a Jet database or some other data store. ADO provides a common programming model for Jet databases or any other OLE DB data source.

RDO

The Remote Data Objects (RDO) were specifically designed to access remote ODBC relational data sources, and to add a thin object layer to the ODBC application programming interface (API). RDO performance is, in most cases, close to that of the ODBC API.

RDO was specifically designed to deal with remote, intelligent data sources (such as SQL Server or Oracle, as opposed to ISAM databases), so it does not support some of the DAO table-based interfaces or Dynamic Data Exchange (DDE). RDO can execute ordinary table-based queries, but it is especially adept at building and executing queries against stored procedures. It also handles all types of result sets including those generated by multiple result set procedures, those returning output arguments, and those requiring complex input parameters. RDO 2.0 provides a high level of control over remote data sources, so it is not necessary to expose the underlying ODBC handles in order to manipulate the data sources, except in the most unusual cases. It also can create client cursors to manage "disconnected" result sets.

Once again, newer technologies have surpassed older ones. For example, ADO provides equivalent functionality and performance to RDO, with an easier-to-use object model; ADO can also access a much larger variety of data stores.

The Cost of Data Access

Technology, like MDAC, comes with a price.

Suppose you're creating a site that publishes bus schedules for more than a hundred routes. The "static" solution might be an index page-perhaps with an HTML form-that allows the user to select and view route-specific pages. The "dynamic" solution might use a query page to look up each bus schedule as it is requested, and return it on a customized, dynamic page.

Both approaches offer the same solution, but the "static" one offers better performance, for two reasons:

Delivering a static page demands far less processing than creating the same page on the fly.
The static page solution creates each page once; the dynamic page solution might create a new page for each request (depending on how the server's cache is utilized) for information that generally doesn't change much.

Consider how your data will typically be used. Often a static approach is best for static data (such as bus schedules), and a dynamic one is generally best for dynamic data (such as stock quotes). However, the best solution is to provide a controlled mix of static and dynamic pages, as your users require them, that your site can support. For example, if people need infrequent access to large amounts of data, the best solution may be the dynamic approach: a query page. But if they can read a relatively short list of articles while online, it's better to generate the page once and display a static listing.

Once you've determined that the dynamic approach is necessary, choosing a data access method is based on what your application requires in the following areas:

Performance
Ease of development
Ease of maintenance
Ease of deployment

The emphasis you place on each factor should be motivated by both the current and future needs of your application. Given the robust growth patterns of the Web in recent years, application performance and scalability should be your foremost goals. It is important to note, however, that throughput numbers should in no way dictate your choice of technology. For example, although the cost of developing an ADO application can be significantly less than with ODBC, the number of transactions ADO can process per second (TPS) is significantly lower than what ODBC can achieve. If both ADO and ODBC exceed your target TPS, your decision on which technology to use should be based on factors other than performance.

Table 7.1 compares the TPS results of similar tests performed with each data access technology. For these tests, Component Services (implemented in Microsoft® Visual C++®) and connection pooling were used. The SQL Server database was scaled to 800 TPS, with 384 megabytes (MB) RAM, and 4 percent procedure cache.

Table 7.1 TPS Per Number of Threads by MDAC Technology

	1	2	5	10	20	50
ODBC	66.37	146.28	350.46	626.76	900.24	859.91
OLE DB	67.30	141.92	326.19	590.57	794.91	715.78
OLE DB 2.0	61.73	126.93	297.29	506.75	575.35	526.61
ADO 2.0	51.24	108.12	240.91	377.30	361.26	310.34

The results of using relational data are striking (the comparison is not valid if the data is not relational). The ODBC component produced the highest throughput (as much as 277 percent higher than ADO using 50 threads). All data access technologies increased in throughput up to approximately 20 threads, after which performance dropped off.

These test results make the cost of data access very clear. If you use data access frivolously, you will be plagued with excessive delays and bottlenecks as pages are generated over and over again, and as transactions are processed for thousands of concurrent requests. By using the right data access methodology, and judiciously choosing when to generate dynamic pages, you can enhance and complement the static elements of your site.

Client-Side Data Access

The components that make up MDAC are designed for distributed applications that take advantage of the processing power on client, middle-tier, and database server computers. These components are part of a simple yet rich programming model for manipulating data and building applications that are easy to configure and maintain.

ADO and RDS are suitable for applications that need a high degree of database accessibility. When you create an RDS application for an intranet, you normally would use IIS 5.0 and Internet Explorer to communicate between the client and business services tiers. However, if the client- and middle-tier components are on computers within a local area network (LAN), you can also use COM to marshal the interfaces and method arguments across the network. RDS can also be used by applications written in Visual Basic to manage remote information.

ADO, with RDS, enables the Web-application developer to:

Bind ADO Recordset objects to intrinsic Dynamic HTML (DHTML) controls (and other data-aware controls) hosted in the browser, by using the DHTML "databinding" model.
Create and manage remote and disconnected ADO Recordset objects. ADO normally maintains a persistent connection to the database, but RDS works with locally cached, or disconnected, data.
Asynchronously request information from the server, and respond to events that are triggered when operations are complete. Asynchronous fetching is a feature specific to Microsoft® Cursor Service for OLE DB. It returns the first rows from a query and then continues fetching in the background, while the user manipulates the rows that have already been sent.
Invoke Automation objects on the Web server over either HTTP or COM. You can use RDS for applications thatbrowse records, or that connect to and modify data on the middle tier.
Work with hierarchical and multidimensional recordsets. With the newly available Microsoft® Data Shaping Service for OLE DB (Microsoft® Datashape), RDS can finally overcome its previous limitation of only one recordset per query.

The RDS client-side and server-side components are described in detail in the following sections.

Client-Tier Elements

As described in the previous chapter, the client services tier provides the visual interface for presenting and gathering data. In a Web-based RDS application, the Web page represents the RDS front end. The RDS client tier contains the following components:

A Web page or Visual Basic application containing an RDS.DataControl object and one or more data-aware controls.
An RDS.DataSpace object and client-side proxy for middle-tier business objects. The default business object is RDSServer.DataFactory. You can develop custom business objects as well.
A data cache and the client-side cursor engine, Cursor Service for OLE DB.

Data-Aware Controls

You can bind data-aware controls (for example, a DHTML text box, FlexGrid control, and so on) to data from remote servers, allowing you to view, edit, and update data from a Web page. RDS is only one of several data binding controls that can execute query results and bind them to other DHTML elements in Internet Explorer.

In RDS, the data-binding mechanism is the RDS.DataControl object. This is a nonvisual control that can bind a single ADO Recordset object, which represents the results of a query, to one or more data-aware controls on a Web page. It is possible to have more than one RDS.DataControl object on the page. The following HTML establishes a connection to the MySite server, and selects all the authors from the Pubs database:

<!-- RDS.DataControl -->
<OBJECT ID=RDS classid="clsid:BD96C556-65A3-11D0-983A-00C04FC29E33"> 
<PARAM NAME="SQL" VALUE="Select * from Authors"> 
<PARAM NAME="SERVER" VALUE="https://MySite">
<PARAM NAME="CONNECT" VALUE="dsn=Pubs;UID=sa;PWD=;"> 
</OBJECT>

When you use the RDS.DataControl to obtain a disconnected Recordset, RDS calls the RDSServer.DataFactory object on the middle tier for you. You don't need to call RDSServer.DataFactory or RDS.DataSpace explicitly.

When the RDS proxy invokes the RDSServer.DataFactory object on the server, the server-side object creates an ADO Recordset and executes the query against the specified data source. The results are then transmitted to the client where the RDS.DataControl reconstructs the Recordset for the client application. Once the data has been retrieved, the client disconnects from the data source. This helps to eliminate the contention for database connections that sometimes occurs as multiple clients simultaneously access the same data source.

Figure 7.2 shows a data-bound grid control in Internet Explorer. When the page loads, the grid control is automatically filled with records from the Adventure Works database. The following code listing shows the Microsoft® Visual Basic® Scripting Edition (VBScript) used to create this page. Note that very little of the script actually executes on the server; most of it operates using objects created on the client.

Example: The Sheridan Grid Control

The example uses the Sheridan data-bound grid control, which is available from the Sheridan Web site, https://www.shersoft.com , or as part of the MDAC samples. If you have installed the MDAC samples, the .cab file for the grid control can be found at https://<yourserver>msadc/Samples/ssdatb32.cab.

Bb742412.iis0702(en-us,TechNet.10).gif

Figure 7.2 A Data-Bound Grid Control in Internet Explorer

<%@ LANGUAGE=VBScript EnableSessionState=False %>
<!-- #include file="adcvbs.inc" -->
<HTML>
<HEAD>
<TITLE>A databound grid control</TITLE>
</HEAD>
<BODY BGCOLOR=#FFFFFF TEXT="#000000">
<!-- RDS.DataControl -->
<OBJECT ID="DATA" WIDTH=1 HEIGHT=1
CLASSID="clsid:BD96C556-65A3-11D0-983A-00C04FC29E33">
</OBJECT>
<!-- Sheridan data-bound grid control (note DATASRC) -->
<OBJECT ID="GRID" WIDTH=600 HEIGHT=200 DATASRC="#DATA"
CLASSID="clsid:00028C00-0000-0000-0000-000000000046">
</OBJECT>
--><SCRIPT Language=VBScript>
Sub Window_OnLoad
If DATA.ReadyState <> <%=adcReadyStateComplete%> Then
MsgBox "Query results still arriving, Please wait"
Else
DATA.Server = "https://<%Request.ServerVariables("SERVER_NAME")%>"
DATA.Connect = "DSN=AdvWorks"
DATA.SQL = "Select * from Products"
DATA.Refresh
GRID.Rebind
End if
END SUB
</SCRIPT>
</BODY>
</HTML>

Each RDS.DataControl object can be connected to a different data source and contain the results of a separate query. When you use the RDS.DataControl object to send a query to a database, the RDS server-side components return an ADODB.Recordset object to the client. On the client, the RDS proxy creates a client-side cursor and an ADOR.Recordset object to manipulate the returned records. You don't necessarily have to write any ADOspecific code to make this happen-RDS handles this for you when you use the RDS.DataControl object.

Figure 7.3 shows how the client-side and server-side components of a Web-based ADO application work together to process a user's query and display information from a database.

The user enters the query text, chooses a preformatted request, or navigates to a page containing an embedded query.
When an event fires on the Web page, such as the Window_OnLoad routine or OnClick event of a Search button, ADO creates a RDSServer.DataFactory proxy (or business object proxy) on the client.
The proxy translates the user request into an HTTP request, by formatting the parameters of the business object method as URL parameters. It then sends the request to the Web server specified in the RDS.DataControlServer property. IIS 5.0 forwards the HTTP request to an Internet Server Application Programming Interface (ISAPI) extension.
Advanced Data ISAPI (ADISAPI) interprets the URL parameters, creates an instance of the requested business object, and makes the method call. (By default, it calls the Query method of the server-side RDS.DataFactory object.)
The RDS.DataFactory object executes the user's query via OLE DB. It sets the CursorLocation property of the Recordset so that Cursor Service is used as its buffering component.
OLE DB passes the complete results of the query to the Cursor Service, which populates its buffers with the data, including all of the metadata for tables that are part of the result set.
The Client Cursor Engine passes a reference to the result set back to the RDS.DataFactory object.
The RDS.DataFactory object creates an instance of an ADO Recordset and initializes it with the result set. The ADO Recordset is then passed back as the return value of the original Query call from step 4.
The RDS.DataFactory object passes the Recordset back to ADISAPI, which packages the return value of the call into Multipurpose Internet Mail Extensions (MIME) format.
ADISAPI sends the Recordset over HTTP as multipart MIME packets to the business object proxy on the client side.
The client-side proxy unpacks the results of the method call, and recreates the Recordset in the client-side Data Cache.
Finally, the embedded RDS.DataControl object binds the data in the client-side Data Cache to the visual controls.

Figure 7.3 Process Diagram of an ADO Application Using RDS for Data Partitioning

Data Cache

One of the most important features of the RDS is its in-memory data caching on both the client and middle tiers. The Microsoft® Cursor Service caches the set of records returned by the query on the client computer. The only client actions that require another trip to the server are updates to the data or requests for new data. The client-side data cache does the following:

Reduces the number of requests for data between client-side application components and the database server. The performance improvement is especially noticeable for data access across the Internet.
Makes data immediately available to client-side application logic, without the application having to wait for data to travel across the network.

Because the data is cached on the client workstation, a user can quickly sort, filter, and scroll through the data without another round-trip to the server.

Client Cursor Engine

Remote Data Service calls Cursor Service (invisible to the user) to perform tasks for you automatically. It caches, in memory (or temporarily on disk, for large sets of data), the set of query results retrieved from a data source, as well as client updates to those results. It also contains layout information about the data such as table layouts, row counts, primary and secondary keys, column names, and timestamps, as well as the actual table data itself. To manage the cache, Cursor Service can:

Create and delete temporary tables.
Populate tables.
Manage updates to the data values.
Provide schema information (such as base tables, key columns, read-write columns, and computed columns).
Provide the mechanism to send updates as a batch to the server, with minimum network traffic, by sending only the modified records.

For more information about the differences between the client-side and server-side cursor engines, see "Recordsets and Cursors" later in this chapter.

Business Object Proxies and the RDS.DataSpace Object

RDS uses proxies for business objects that enable client-side components to communicate with business objects located on the middle tier. Proxies facilitate the packaging, unpackaging, and transport (marshaling) of the application's data across process or computer boundaries.

Like the RDS.DataControl object, the RDS.DataSpace object is a nonvisual ActiveX control that creates client-side proxies for custom business objects located on the middle tier. Client-side proxies facilitate packing, transporting, and unpackaging of disconnected recordsets (and other standard data types) across machine boundaries. For example, the following client-side script instantiates a Customer component using an RDS.DataSpace object:

<!-- RDS.DataSpace object -->
<OBJECT ID=RDSDataSpace WIDTH=1 HEIGHT=1
CLSID="clsid:BD96C556-65A3-11D0-983A-00C04FC29E36">
</OBJECT>
<SCRIPT LANGUAGE=VBScript>
Set objProxyCust = RDSDataSpace.CreateObject("MyCls.Customer", & _
"https://<%=Request.ServerVariables("SERVER_NAME")%>")
Set rs = objProxyCust.Orders(Request.Form("UserID"))
</SCRIPT>

Note: In order to create server-side object proxies with the RDS.DataSpace control, the ProgID of the business component must be registered in the key in the Microsoft® Windows® 2000 Server registry.

HKEY_LOCAL_MACHINE
\SYSTEM
\CurrentControlSet
\Services
\W3SVC
\Parameters
\ADCLaunch

Caution Do not use a registry editor to edit the registry directly unless you have no alternative. The registry editors bypass the standard safeguards provided by administrative tools. These safeguards prevent you from entering conflicting settings or settings that are likely to degrade performance or damage your system. Editing the registry directly can have serious, unexpected consequences that can prevent the system from starting and require that you reinstall Windows 2000. To configure or customize Windows 2000, use the programs in Control Panel or Microsoft® Management Console (MMC) whenever possible.

Middle-Tier Elements of Client-Side Data Access

The middle tier is the "bridge" between client computers and database servers. The components on the middle tier respond to requests from the user (or business services) in order to execute a business task.

In an RDS application, business objects on the middle tier handle the data request from the client sent through a transport protocol such as HTTP. Data, in the form of Recordset objects, is made available as an update to client-side controls through the OLE DB provider. The middle tier consists of the following components:

IIS 5.0 and ADISAPI
RDSServer.DataFactory or business objects

IIS 5.0 and ADISAPI

The underlying RDS code uses an ISAPI extension that helps to create server-side stubs, in order to communicate with client-side business object proxies. The ADISAPI breaks up the MIME packets that pass through IIS 5.0 and invokes methods on the server-side Data Factory objects.

The ADISAPI component communicates with business objects for you, and provides parsing, automation control, and Recordset object marshaling; it also provides tabular data packaging, streaming, and unpackaging. This ISAPI extension performs the necessary work to instantiate business objects, invoke their services through automation interfaces, and process the return parameters for transport back to the calling client through the tabular data streaming protocol.

Note: Requests to the ADISAPI component are made through the msadc virtual directory of IIS 5.0. This directory is created in the default Web site when IIS 5.0 is installed. If the default Web site has been renamed or replaced (for example, to improve administrative control of installed Web applications), the developer must recreate the msadc virtual directory on the new default Web site. The other option would be to specify the port address of the site that includes it in all RDS client-side requests.

The RDS Data Factory and Custom Business Objects

There are two ways to pass a Recordset object back from your server to the client with RDS. You can:

Use the RDSServer.DataFactory object.
Create a custom business object that exposes data access methods.

RDS includes a default business object, the RDSServer.DataFactory object, that provides read and write access to data sources, but contains no business rules or application logic. The RDSServer.DataFactory is a server-side business object-specifically an ActiveX dynamic-link library (DLL)-that uses ADO to dispatch SQL statements to a database management system (DBMS) through the OLE DB provider, and packages the results for transfer across the Internet or an intranet.

Designing Custom Business Objects

The main application components on the middle tier are business objects that contain information such as application logic, business rules, and specialized data-access code, in order to retrieve information from underlying databases. Business objects can be created with Visual Basic, C++, or any other language that supports COM.

The life span of a business object is as long as the execution of the method call invoked by the client-side RDS.DataSpace object. Business objects are created and destroyed with each method call, so no interim state is maintained. Therefore, custom business objects designed for use with RDS need to be stateless (don't use properties or static data). If state needs to be saved between invocations to the object, consider using the Shared Property Manager. A feature of Component Services, the Shared Property Manager simplifies programming of components that need to save state information or share state with other components. For more about working with business objects, see "Transactional Components" later in this chapter.

When designing business objects, keep the following in mind:

RDS can only marshal back one Recordset at a time. Beware of procedures that return two or more result sets, since they will require special handling by your business object. You can use the new MSDataShape provider to return hierarchical Recordset objects, but the data shape query language doesn't support stored procedures at this time.
RDS cannot marshal user-defined data types. Return only basic data types (Integer, String, and so on), Recordset objects, or Variants. Variant arrays are also supported.
Pass and return ADOR recordsets, not ADODB.
Use BYVAL parameters whenever possible. BYREF (by reference) parameters require two network trips, so using the BYVAL keyword to pass parameters by value saves time. BYREF is the default behavior of Visual Basic, so you must type BYVAL wherever you need it.
Business objects created with the RDS.DataSpace object cannot access ASP objects, including Session or Application state.

You can use either RDS or ADO, or a combination of both technologies, in the same application. For example, your application can create a custom business object that uses ADO to manage server-side data, and that uses RDS to transfer remote data to the client tier, where the user can interact with it.

The next section discusses the issues involved in creating a server-side ADO application, using ASP and COM components.

Accessing Data with ASP and COM Components

When you create a data-enhanced application, you need to decide whether data manipulation functionality or far-reaching browser support is more important to you. Unlike RDS, a server-side ADO solution is browser-independent. Because ADO can be accessed from server-side ASP and COM components, it doesn't require any ActiveX controls on the client side-a serious consideration for mixed-browser environments like the Internet.

Preparing the Database

Before you can connect to a database using ADO, you must take a few preliminary steps to properly configure your data source.

Connection Strings

ADO uses OLE DB connection strings to define the specifics of the data source. The connection string consists of a series of keyword/value pairs separated by semicolons, which specify information used for establishing a connection to the data source. Such a string might look like this:

"Provider=sqloledb; Driver={SQL Server}; Server=MySite;" & _
"Initial Catalog=pubs; User Id=sa; Password=;"

When using the OLE DB provider so that ODBC can connect to an ODBC driver through ADO, elements of the OLE DB connection string are passed down to the driver for the data source.

Creating an ODBC-Compliant Data Source

When designing a Web database application, it is often easier to initially prototype with a desktop database like Access, and then scale up to an enterprise-level database like SQL Server once the design work has been finished. This is an easy alteration to make if you are using script, but it would require you to recompile your business object DLLs. With a connection string, you can make such a change without altering code in your application.

When you use an ODBC Data Source Name (DSN) entry to store the connection string values externally, you simplify the information needed in the connection string. This makes changes to the data source completely transparent to the code itself.

To create a new ODBC DSN

Click the Start menu button, point to Programs, then Administrative Tools.
Click the Data Sources (ODBC) icon, and select the System DSN tab.
Click Add to create a new DSN entry, and then select the appropriate OLE DB provider for your database from the list of installed providers. Click Finish to configure your provider.

Note: Make sure you create either a File or System DSN. ADO does not recognize user (or local) DSNs. Because they store settings in the system registry, System DSNs offer slightly faster performance than File DSNs, which store connection parameters in a file on your hard disk.

"DSN-less" Connections

You can open a connection to a database without creating or specifying a named data source. Connections made in this way are called "DSN-less," because they don't require the system administrator to create an ODBC DSN. Rather than relying on information stored in a file or in the system registry, DSN-less connections specify the driver name, and all driver-specific information in the connection string.

Whether or not to use a DSN depends on how much flexibility you want. Connection parameters are readily visible to system administrators. As a result, no matter which connection style you use, there are no extra security benefits. Probably the most common reason for using a DSN-less connection is to connect to a database on a system that is not under your direct control. This makes DSN-less connections good for testing and for applications under development.

The following DSN-less connection strings contain the minimum parameters required by the ODBC drivers of Access and SQL 6.5:

strConAccess = "Driver={Microsoft Access Driver (*.mdb)};DBQ=C:\db.mdb"
strConSQL = "Driver={SQL Server};Server=(local);UID=sa;PWD="

Essentially, a DSN-less connection is "hard-coded" to use a certain driver, user identity, and network location, which makes it bothersome and difficult to update when the database parameters change. Because the variety of connection parameters can differ greatly from one ODBC data source drive to another, it is recommended that you use a DSN whenever possible.

Selecting an OLE DB Provider

A provider is a component that manages access to a data source. The Access and SQL Server providers are only two of the OLE DB providers available with ADO. In most cases, you don't have to specify the OLE DB provider; you can just use the default ODBC driver. However, not all OLE DB providers are alike; some connect to nonstandard data sources and some have specialized functionality. Microsoft continues to update and add to the following list in Table 7.2 of OLE DB providers installed with ADO 2.0:

Table 7.2 Common OLE DB Providers

Driver	Description
MSDASQL	OLE DB provider for ODBC. Connects to existing ODBC data sources, either with a System DSN or from connection string parameters. This is the default, if no provider is specified.
Microsoft.Jet.OLEDB.3.51	Access data provider. Connects to Jet databases.
SQLOLEDB	SQL Server OLE DB data provider.
MSDAORA	OLE DB provider for Oracle.
MSIDXS	Provider for Microsoft® Indexing Service.
ADSDSOObject	Microsoft® Active Directory" Services provider.
SNAOLEDB	OLE DB provider for Microsoft® SNA Server (requires SNA Server Client).
MSDataShape	Hierarchical recordset service provider.
MSPersist	Persisted recordset (local storage) service provider.
MS Remote	RDS disconnected recordset service provider.
MSDAOSP	Simple OLE DB provider. This provider can be used as the basis of custom providers that you create.

Choosing a data provider is as simple as setting the Provider property of the Connection object. The following ASP example uses a sample provider ("sampprov") to create a Recordset object from a Comma Separated Values (CSV) file:

Set cn = Server.CreateObject("ADODB.Connection")
cn.Provider = "sampprov"
cn.Open "Data Source=C:\oledbsdk\samples\sampclnt"
Set rs = cn.Execute("customer.csv")

Note: The "sampprov" provider is shipped with OLE DB SDK 1.5 and later releases.

Data Source Permissions

Setting up a DSN is the first step toward connecting to a database. You also need the correct permissions before you can access data from the Web. If you don't have these and you try to open an Access database stored on an NTFS file system, you might get the following error:

Microsoft OLE DB provider for ODBC Drivers error '80004005'
[Microsoft][ODBC Access 97 Driver] The Microsoft Jet database engine cannot 
open the file '(unknown)'. It is already opened exclusively by another user, 
or you need permission to view its data.

This problem occurs when the anonymous (IUSR_computername) account doesn't have the necessary permissions for the directory containing the database (.mdb) file. In order to make modifications to the data, the Jet database engine (on which Access is built) needs to create a lock file, a temporary working file in the same directory as the database. If the anonymous user can't write to this lock file, an error will occur. Make sure that you grant sufficient permission on the database and its directory for the anonymous user account. Also verify that exclusive access and record locking are turned off.

Security and SQL Server

For SQL Server, permissions issues multiply, especially if SQL Server and IIS 5.0 are running on different computers. By default, connections to a SQL Server use a service of the Windows 2000 operating system known as a "named pipe." In order for a SQL server client to gain access to a Windows 2000 named pipe, the client needs to be validated by the server. This is normally accomplished either by means of a Workgroups-style validation (identical user names and passwords are created on the client and the server), or by using the domain method (both the client and server are domain members).

The SQL Server connection uses the identity of the user associated with the Web connection. If the connection is anonymous, you need to create a guest account that corresponds to the IUSR_computername account. If this guest account already exists, then make sure it has rights to log onto the SQL Server computer.

Support for the anonymous user account can be configured by any of the following means:

Add the IUSR_computername account to the local user account database on the server that hosts SQL Server.
Make the account a member of the domain that SQL Server resides in.
Enable the Windows 2000 Guest user account on the remote SQL Server computer.

If you have configured IIS 5.0 to use integrated Windows authentication in the Windows 2000 operating system (by either disabling Anonymous access, or forcing a logon by returning a "401 Access Denied" response), IIS 5.0 tries to connect to the SQL Server by using the user's security context. See Figure 7.4. If SQL Server resides on a separate computer from IIS 5.0, the Windows 2000 operating system detects the attempt to use a network named pipe handle that had been opened in a different user context. It then forces the pipe closed, according to its security rules.

The following OLE DB error is an indication of this problem:

Microsoft OLE DB provider for ODBC Drivers error '80004005'
[Microsoft][ODBC SQL Server Driver][DBNMPNTW]ConnectionWrite (GetOverLappedResult()).

If SQL Server is running on the same server as IIS 5.0, use a local named pipe connection instead of a network named pipe connection. In the SQL Server connection string or in the DSN configuration, change "SERVER=computername" to "SERVER=(local)." The name "(local)" (with parentheses) is a special keyword to the SQL Server ODBC driver, and indicates that a local connection should be used.

If SQL Server is running on a different server than IIS 5.0, you can use a nonauthenticated protocol between IIS 5.0 and SQL Server, such as Transmission Control Protocol/Internet Protocol (TCP/IP) sockets. To use these, you must configure both the SQL Server and the SQL Server client on the IIS 5.0 server.

To set up a TCP/IP connection on the server hosting SQL Server

Run SQL Server Setup.
In the Options dialog box, click Change Network Support, and click Continue.
Select the entry for TCP/IP Sockets (leave Named Pipes selected also) and click OK. Accept the default Named Pipe name and TCP/IP Socket number.
Exit SQL Server Setup. Stop and restart the SQL Server.

To set up a TCP/IP connection on the server hosting IIS 5.0

From Control Panel, point to Administrative Tools. Select Data Sources (ODBC). Select a SQL Server data source, and click Configure to start the SQL Server DSN Configuration Wizard. Click Next, then click Client Configuration. Figure 7.4 shows the resulting dialog box.
Select the Network Library tab, and select TCP/IP as the default network protocol. Click OK. IIS 5.0 will now use TCP/IP sockets when connecting to the SQL Server specified in this DSN.

Figure 7.4 Configuring the SQL Server Client Connection

The Database Connection

There are several ways to establish a database connection. One way is to create an ADO Connection object explicitly, and use its Open method to connect to the database using a DSN, user name, and password. The following example shows you how this would look in an ASP page:

<%
'Open the database.
Set cn = Server.CreateObject("ADODB.Connection")
cn.Open strDSN, strUserName, strPassword
%>

You can also dynamically create a connection when you create a Recordset or Command object by using a connection string in place of a Connection object. When called, the object creates a new database connection. The following example uses a Recordset object in conjunction with a connection string in order to create a new database connection:

'Open the database and retrieve the records.
Set rs = Server.CreateObject("ADODB.Recordset")
rs.Open strSQL, strDSN

Note: When you bypass the Connection object in this way, you run the risk of inadvertently creating multiple database connections, which uses extra server resources and slows down the processing of your queries. This is particularly a problem when ODBC Connection Pooling is disabled. If you have already created a Connection object, it is always more efficient to reuse it when creating new ADO objects.

ODBC Connection Pooling

Database applications that frequently open and close connections can reduce database server performance. Fortunately, ODBC 3.5 implements connection pooling, which enables applications to share connections across user requests, and thereby improves performance by reducing the number of idle connections.

Before making a new connection to an ODBC data source, the ODBC driver searches the connection pool for any idle connections that may satisfy the connection request. For the connection to be reused, the connection string and user context of the pooled connection must exactly match that of the request. If a matching idle connection is found, it is returned to the calling application.

When an ODBC connection is released, the connection returns to the pool, rather than being closed. The Connection Pool Timeout (CPTimeout) setting of your ODBC driver determines the length of time that a connection remains in the connection pool. If the connection is not reused during this time, it is closed and removed from the pool. The default CPTimeout value is 60 seconds.

You can modify the CPTimeout value of each ODBC database driver by changing the following Windows 2000 Server registry key:

HKEY_LOCAL_MACHINE \SOFTWARE
\ODBC
\ODBCINST.INI
\driver-name
\CPTimeout = timeout (REG_SZ, units are in seconds)

For example, the following setting configures the ODBC driver for the SQL Server connection pool time-out to 180 seconds:

HKEY_LOCAL_MACHINE \SOFTWARE
\ODBC
\ODBCINST.INI
\SQL Server
\CPTimeout = "180"

A severed connection to the database could prevent applications that use a connection pool from connecting successfully. Client applications would continue to make new connection requests even though the connection is broken. It takes time for each request to determine that the server is unavailable, and new connection requests must wait for others to time out. Eventually, the server will be unable to accept any more requests.

The ODBC Connection Manager can be configured to retry dead connections on a preset interval. If the connection attempt fails, the connection is marked as "bad" and placed back in the pool. Once a bad server has been identified, subsequent connection requests for that server immediately return an error. Periodically, the Connection Manager attempts to reestablish the connection. If the attempt succeeds, the connection returns to a valid state and normal processing resumes.

You can configure the Retry Wait property for a specific ODBC driver by creating a registry key with the following settings:

HKEY_LOCAL_MACHINE \SOFTWARE
\ODBC
\ODBCINST.INI
\driver-name
\Retry Wait = timeout (REG_SZ, units are in seconds)

The following setting instructs the ODBC driver for SQL Server Connection Manager to retry a dead connection after a 60-second wait:

HKEY_LOCAL_MACHINE \SOFTWARE
\ODBC
\ODBCINST.INI
\SQL Server
\Retry Wait = "60"

Every process uses its own connection pool. If you are using an out-of-process component (or LocalServer application), you cannot take full advantage of ODBC connection pooling. Because each application process uses a separate pool, your application can only share connections with itself. In order to share connections with other components, you must write your business logic as a DLL. OLE DB providers automatically handle connection pooling.

Tips for Optimizing Database Connections

One of the main challenges of designing a sophisticated Web database application is maintaining a properly managed database connection. Opening and maintaining connections, even when no information is transmitted, can deplete a database server's resources and result in connectivity problems. Well-designed Web database applications manage connections wisely and compensate for delays due to network traffic.

Here are six tips to help you optimize your use of database connections.

Tip 1: Enhance Performance on SQL Server Systems

If you don't intend to connect to Access databases with ADO, you can enhance your application's performance by changing the threading model for the main ADO components from "Apartment" to "Both" in the registry.

By default, the ADO objects are assigned the Apartment threading model in the registry. This model guarantees that each object is allocated a dedicated thread for the life of the object, and all calls to the object execute on the same thread. Although the Apartment model provides significant improvements over the single-threading model (in which many objects share one thread), and works with providers that are not free-threaded (like Access), it also has its performance drawbacks. For instance, if you store ADO components, such as the Connection object, in the ASP Session object, IIS 5.0 will enforce a limit of one thread per user session.

To switch ADO to a Both threading model, open Windows Explorer and doubleclick Makefre15.bat in the ADO installation folder (C:\Program Files\Common Files\System\Ado, by default). To reverse the process (that is, to return the threading model to the Apartment model), double-click Makeapt15.bat in the ADO installation folder.

Tip 2: Set the Connection Time-Out

Limit the amount of time your application waits before abandoning a connection attempt and issuing an error message.

A database server experiencing a sudden increase in activity can become backlogged, which greatly increases the time required to establish a database connection. Excessive connection delays increase the time that users wait to find out that requests cannot be processed.

By changing the Connection object's ConnectionTimeout property, you can reduce the time it takes for a connection to time out. Not all providers support this property, but it can dramatically increase the perceived responsiveness of those that do. The default for the ConnectionTimeout property is 30 seconds.

The following script sets the ConnectionTimeout property to wait 20 seconds before canceling the connection attempt:

Set cn = Server.CreateObject("ADODB.Connection")
cn.ConnectionTimeout = 20
cn.Open "FILEDSN=MyDatabase.dsn"

Tip 3: Close Your Connections

Close your connections as soon as you are finished with them.

By proactively closing connections when they are no longer needed, you reduce demand on the database server and make resources available to other users. Connections are closed when they go out of scope, but you can use the Close method to close a connection at any time.

The following code creates a Command object, implicitly creates a new connection, and calls a stored procedure.

<%
Set cmd = Server.CreateObject("ADODB.Command")
cmd.ActiveConnection = "dsn=sqlsrvr"
cmd.CommandText = "{call mysproc}"
cmd.Execute
%>

The Command object opens a connection automatically and releases it when the Command object goes out of scope at the end of the page. To release it before that time, you can either set the Command object reference to Nothing, as in:

Set cmd = Nothing

or you could close the connection by closing the active connection:

cmd.ActiveConnection.Close

Remember that storing Connection objects in the ASP Session object is not a good idea, because active connections cause database resources to remain open and would defeat the purpose of the ODBC connection pool.

Don't use more database connections than you need. Share connections wherever possible.

Whenever you specify a connection string, rather than a connection variable, you are requesting a new connection to the server. Instead of a connection string, create a single connection variable, and use it with the ActiveConnection property of the Command and Recordset objects.

If your application needs to perform more than one database operation, this approach is a good choice, because it creates the Connection separately from other ADO objects. The object can be reused as necessary and closed when all operations have been performed. Simply specify your existing Connection object when opening Recordset and Command objects. For example, the following script stores a reference to an open database connection in a global variable, and uses it to perform two separate queries against the same database:

<%@ LANGUAGE=VBScript EnableSessionState=False %>
<HTML>
<HEAD>
<TITLE>Products and Customers</TITLE>
</HEAD>
<BODY BGCOLOR=#FFFFFF>
<H2>Products</H2>
<%
Set cn = Server.CreateObject("ADODB.Connection")
cn.ConnectionTimeout = 20
cn.Open "DSN=AdvWorks"
Set rs = Server.CreateObject("ADODB.Recordset")
rs.CursorType = adOpenForwardOnly
rs.LockType = adLockReadOnly
rs.Open "Products", cn, , , adCmdTableDirect
Do Until rs.EOF
Response.Write rs("Product Name") & "<BR>"
rs.MoveNext
Loop
rs.Close
%>
<H2>Customers</H2>
<%
rs.Open "Customers", cn, , , adCmdTableDirect
Do Until rs.EOF
Response.Write rs("Company Name") & "<BR>"
rs.MoveNext
Loop
rs.Close
cn.Close
%>
</BODY>
</HTML>

Tip 5: Restrict Connections Across Pages

If your provider doesn't support automatic connection pooling, find a balance between the greatest number of connections on the one hand, and the hidden costs of creating and destroying connections on the other.

Storing Connection objects (or any of the database access components, for that matter) in the ASP Application object is not recommended. Unless your OLE DB provider supports the Both threading model, you will cause the Web server to serialize all user requests for the application-not a way to improve performance. Access does not support this, though SQL Server does; see tip 1. Even if your OLE DB provider supports free threading, you must be cautious when storing any components in the Application object.

If you must hold connections open, it is better to create them as individual users require, and store them in the user's Session object. Like the Application object, Apartment threaded providers such as Access lock the session to a single thread for all requests. Because sessions must time out before the server resources are finally released, applications that store connections in the Session object should also provide a means for terminating the session. This allows other clients to use the connection. For example, you could provide a Log Off button to explicitly end the user's session and release the active connection. (See "Developing Web Applications" in this book.)

The following example opens a connection at the start of the user's session. The connection is automatically closed when the session ends.

<OBJECT ID=cnSession RUNAT=Server SCOPE=Session
CLASSID="clsid:00000514-0000-0010-8000-00AA006D2EA4">
</OBJECT>
Sub Session_OnStart
'Open ADO connection to "UsersDB1" database.
cnSession.Open "UsersDB1", "userdblogin","userdbpassword"
End Sub

Remember, although you can utilize a single connection across more than one page, doing so holds the connection open and defeats the advantages of connection pooling.

Tip 6: Increase the Size of the Record Cache

Increase connection throughput by requesting multiple records at once.

By default, the CacheSize property of the Recordset object is set to 1 for all cursor types. By increasing CacheSize, you can increase the number of records the provider will retrieve at one time into local memory. For example, if CacheSize is 10, the provider will retrieve the first 10 records when first opening the Recordset object. As you move through the Recordset object, the provider uses the data from the local memory buffer. When you move past the last record in the cache, the provider retrieves the next 10 records from the data source.

rs.CacheSize = 10
rs.Open strSQL

Recordset caching doesn't work well for binary data, like images and text streams, because only the data from the last record in the cache will be available. When using binary data types, you must retrieve one row at a time. Also, records retrieved from the cache don't reflect changes other users may have made since the records were retrieved. To force an update of all the cached data, use a dynamic cursor and call the Resync method.

Recordsets and Cursors

A Recordset, in its simplest form, is a collection of rows returned from an ADO database query. In addition to the data, each Recordset object includes the database schema (or metadata) and a connection to the database. You can create and access a Recordset with just a few lines of script:

<%
Set rs = Server.CreateObject("ADODB.Recordset")
rs.Open "SELECT * FROM Authors", "DSN=Pubs;UID=sa;PWD="
'Loop through the recordset with MoveNext.
Do Until rs.EOF
'Access record fields here.
rs.MoveNext
Loop
rs.Close
%>

All Recordset objects include a cursor, which marks the current record. When a Recordset is first opened, the cursor is positioned to the first record that matched the query. As you move the cursor forward with MoveNext, it will eventually run out of records, causing the Recordset's EOF (End Of File) property to change to True.

All of the cursor types will let you insert and update records in a Recordset, unless you have specified adLockReadOnly as the lock type. Different cursor types give you varying degrees of visibility for database actions performed by other users. Table 7.3 lists cursor types based on their level of functionality.

Table 7.3 Recordset Functionality by Cursor Type

Cursor Type	Insert, Update Records	View External Updates/Deletions	View External Inserts
Forward-only, read-only (default)	False	False	False
Static	True	False	False
Keyset	True	True	False
Dynamic	True	True	True

Table 7.4 names and describes the four types of cursors available for ADO Recordset objects. There is no single cursor type that you should always use. Your choice of cursor type should depend on the functionality you require. The following sections further describe the differences in these cursor types.

Table 7.4 ADO Cursor Types

Cursor Type	Description
adOpenForwardOnly	Forward-only cursor. You can only scroll forward through records. This is the default cursor type.
adOpenStatic	Static cursor. A static copy of a set of records that you can use to find data or generate reports. Additions, changes, or deletions by other users are not visible. The Recordset is fully navigable, forward and backward.
adOpenDynamic	Dynamic cursor. Additions, changes, and deletions by other users are visible, and all types of movement through the Recordset are allowed (except for bookmarks if the provider doesn't support them).
adOpenKeyset	Keyset cursor (keysets are values used to access specific records or rows). Like the dynamic cursor type, except that you can't see records that other users add. Deletions and other modifications made by other users are still visible.

Since most applications typically access the Recordset sequentially from the first record, the default cursor type is optimized for forward-only traversal. (In the case of SQL Server, it is read-only as well. If you use the Execute method of the Connection (and Command) object, you will always get a forward-only, read-only cursor. This is also the default behavior of Recordset.Open if no extra parameters are specified.

When you use the Recordset.Open method, however, you have the opportunity to configure your Recordset exactly as you'd like it (cursor type, record locking method, number of records to cache, and so on). The following example demonstrates how to create Recordset objects with varying degrees of cursor functionality:

Function SimpleCursor()
Set cn = Server.CreateObject("ADODB.Connection")
cn.Open "pubs", "sa", ""
'Creates a forward-only, read-only cursor.
Set SimpleCursor = cn.Execute("select * from authors")
End Function
Function MoreFunctionalCursor()
Set cn = Server.CreateObject("ADODB.connection")
cn.Open "pubs", "sa", ""
'Requests dynamic keyset cursor and optimistic 
'Record locking (concurrency).
Set rs = Server.CreateObject("ADODB.Recordset")
rs.Open "select * from authors", cn, adOpenKeyset, adLockOptimistic
Set MoreFunctionalCursor = rs
End Function

Forward-Only Cursors

The first type of cursor is optimized for the most common type of database access-sequential, forward traversal of records. This is the default cursor type for Recordset objects created when you use the Execute method of the Command and Connection objects. Although you can still add, delete, and update records with a forward-only (but not read-only) cursor (depending on LockType), you cannot set a bookmark in the current row to return to the record later. If your cursor type supports bookmarks, you could save a bookmark to the current row if you need to come back later. Also, you cannot use the Resync method to view changes made externally to the Recordset.

Some providers (like SQL Server) implement the default forward-only, read-only cursor using a "firehose" mode, meaning that the connection is held open as long as there is data to retrieve. Once the cursor has been opened in this mode, the server does not wait for the client to request the next record and then provide it. Instead, the server sends the records as a continuous stream of data. Firehose cursoring is very fast, since overhead is only incurred as the data is read. Firehose cursors do not allow updates to the data.

Forward-Only Restrictions on Binary Large Objects

The Binary Large Object (BLOB) is a type of data field that contains large blocks of text, or binary data such as graphics, sound, or compiled code. BLOBs can be fetched with a forward-only or "firehose" cursor, but there are some access restrictions on the resulting Recordset.

The CacheSize property of the Recordset must be set to 1. If you cache more than a single row, only the BLOB on the last row is available.
All BLOB columns must be to the right of any scalar (non-BLOB) columns fetched in the query and you must access them in a left-to-right order.

These restrictions have nothing to do with the underlying table layout-only with the Recordset object. For instance, you could have a BLOB column in the middle of your table definition (with scalar columns to the left and right), but when you execute your SELECT statement, you must select the BLOB column after the scalar columns. The storage engine on SQL Server doesn't impose limitations on where a BLOB column lies, but it is significant to the ODBC driver.

Static vs. Dynamic Cursors

When you request a static cursor, you are requesting a snapshot of the data at the time the Recordset is created. Once the client has received all the rows, the cursor can scroll through all data without any further interaction with the server. The cursor position can be changed using both relative positioning (offsets from the current, top or bottom row) and absolute positioning**(using the row number). The static cursor's only shortcoming is that changes made to the database while the Recordset** is open aren't made available to a client using a static cursor.

A dynamic cursor, on the other hand, makes database changes available as they happen. Because the dynamic cursor requires the database provider to reorder data in the Recordset as it changes on the server, it is much more expensive than the static cursor in terms of the processing it requires. Also, because the order of the underlying records could change, only relative positioning is available to dynamic cursors. Lastly, dynamic cursors don't support bookmarks. If you need bookmark support, use a keyset or static cursor.

Keyset Cursors

A keyset cursor combines the functionality of the static and dynamic cursor types. It can view all database updates made by other users, and it can search using both absolute and relative positioning. A keyset is a set of values used to access specific rows or records in a database. The "key" is actually based on the database indexes used by the current table. When the database needs to update the cursor's value, it does so based on the row's key. Therefore, a "keyset recordset" must include enough columns to ensure unique records in a Recordset that's being updated.

Under normal circumstances, when you request records from a data source, you get all the data at once. However, when a keyset cursor is used, only the keys are returned, giving a performance boost. The data for the keys is retrieved when you scroll through the recordset. Except for the first query, which returns both keys and data, the next block of data is fetched only when it is needed. This way the cursor only has to manage small keys rather than large records. Therefore, keyset cursors are more suited for use with a Recordset in situations where not all the data will be needed at once.

Cursor Concurrency

The server-side Recordset supports four types of record locking (also called concurrency), as shown in Table 7.5.

Table 7.5 ADO Cursor Lock Types

LockType	Meaning
adLockReadOnly	The database doesn't lock records, since you are only accessing them in readonly state. This is the default concurrency.
adLockPessimistic	The database locks the records being changed as soon as editing begins. The records are unlocked when all changes are complete. No two users can access the same records at the same time.
adLockOptimistic	The database locks the records being changed only when the changes are committed. Two users can access the same record at the same time, and the database must be able to reconcile (or simply reject) conflicts.
adLockBatchOptimistic	This mode is required for batch updates using client cursors, and is similar to optimistic concurrency.

Note: If you fetch more than a single record into the record cache, you should use optimistic concurrency. Doing so allows the server to forgo locks on the database until they are needed, thus freeing resources. However, if there is high contention for the resource, pessimistic concurrency may be preferred. It is easier to reject a request to access a database and have the server try again than it is to reconcile data that is rapidly becoming out-of-date in a record cache.

Cursor Location

When you use RDS to open and populate a disconnected Recordset, you're relying on the client-side cursor engine. Although you would normally use a server-side cursor in applications hosted by the middle tier, it is possible to create and use client-side cursors on the server as well.

Client-Side Cursors

In order to use a client-side cursor, you must change the CursorLocation property before you open a Recordset, as in the following:

rs.CursorLocation = adUseClient

The client-side cursor engine creates a disconnected RDS Recordset, such as you would find on the client tier. Of course, this also requires the database to return all the records that satisfy the query when the Recordset is opened. For this reason, no matter what value you select for CursorType, you will really only have a static snapshot of the data when using client-side cursors.

Note: Because updates to a disconnected Recordset aren't immediately communicated to the server, you won't be notified of Recordset concurrency errors immediately, as you are with server-side cursors.

Two additional CursorLocation values are available: adUseClientBatch (for use with adLockBatchOptimistic concurrency setting) and adUseNone. The adUseNone value instructs ADO to forgo its normal rowset helper support, which provides extra cursor functionality beyond that which the OLE DB provider may support. When you specify adUseNone, you are effectively telling ADO not to include this additional service provider.

Client-side cursors can be used to extend the functionality of the existing driver-supplied OLE DB cursor support. It is possible that the client-side cursor engine can support functionality that is unavailable from the DBMS driver itself, as is the case with the Oracle driver. The RDS client-side cursor may also support features that your server-side cursors do not. For example, you can only use absolute page positioning against a data source by using a client-side cursor.

Server-Side Cursors

Although client-side cursors have their advantages, most applications should be written using server-side cursors for the following reasons:

Memory Usage When using server-side cursors, the client does not need to cache large amounts of data or maintain information about the cursor position; the server provides that functionality.
Performance If you only intend to access a small fraction of the data in a Recordset (typical of many browsing applications), server-side cursors boost your performance. This is because only the required data (and not the entire result set) is loaded into memory.
Additional Cursor Types Client cursors are either forward-only with read-only concurrency, or static with read-only and optimistic concurrency. Server-side cursors support the full range of concurrency values and a variety of different cursor types. For instance, keyset and dynamic cursor types are available only if you use server-side cursors.

Managing Records in a Recordset

No discussion of the ADO Recordset is complete without talking about how a developer can use it to accomplish the task at hand. In other words, how do you best use a Recordset object to add new records, delete and update existing ones, and work through the results of queries? Although this section describes a few techniques to help you through the more troublesome aspects of ADO Recordset management, it is not intended as a complete tutorial of ADO development. For more information, see the IIS 5.0 online product documentation.

Keeping Track of New Records

When you add a new record, several factors determine where in the Recordset it will be inserted. If you need to move the cursor after you've added a new record, keeping track of where the record was inserted can be a little tricky.

If your cursor type supports bookmarks, you could save a bookmark to the current row and come back to the record later. However, if you insert records into a table without an index, you will not be able to get back to the newly inserted row until after you use the Requery method to request a new Recordset.

Identity fields, which are updated automatically when you insert a record, can be useful for tracking entities in your database. However, you have no control over the value in this field, and from ADO you have no completely foolproof way of determining what the value will be before it is assigned. Since the value is assigned by the database system itself, you must use a dynamic or keyset cursor, and call the Resync method after you Updatethe new inserted row. The following example demonstrates this technique by adding a new record from data posted using an HTML form, and by displaying the new row's identity value:

<%
'Open a static cursor on the Survey table, and add record.
Set rs = Server.CreateObject("ADODB.Recordset")
rs.Open "Survey", Application("ConnectionString"), _
adOpenKeyset, adLockOptimistic, adCmdTableDirect
rs.AddNew
'Add form fields to Recordset, using field names as columns.
For Each Item In Request.Form
strItem = Trim(Request.Form(Item))
If strItem <> "" Then
rs(Item) = Server.HTMLEncode(strItem)
End If
Next
'Set time of update, and update new record with form values.
rs("Date") = Now()
rs.Update
'Force an update of the Recordset with the identity information.
rs.Resync
Response.Write "You are response number: " & rs("ID") & "<BR>"
rs.Close
%>

Important: Although this technique provides reasonable performance for smaller tables with few records, performance slows down considerably with larger tables. If you are using SQL Server, you should strongly consider using stored procedures to add new records. For more information, see "Stored Procedures" later in this chapter.

Avoiding Query Time-outs

It's not uncommon to find database tables of 10 million to 50 million records. On these systems, even stored procedures can sometimes take longer than a few minutes to run. A user running a report knows that the query may take several minutes and this is acceptable. However, a Web page doesn't normally wait that long before it will time out.

To solve this problem, you first need to set the CommandTimeout property of the Connection object. This property indicates how long to wait for a query to execute and applies to the Connection.Execute and Recordset.Open methods when the Source property is not a Command object. If the Recordset.Open call is using a genuine Command object, you'll need to set the CommandTimeout property of the Command object instead. This property establishes the length of time the application should wait for a Command objectto execute.

You will also need to increase the values of the ScriptTimeout property of the ASP Server object. If this property isn't set, it doesn't matter how long the ADO query takes; the script will have stopped executing in the meantime. Both the CommandTimeout and ScriptTimeout properties accept values indicating the number of seconds to wait before canceling the operation.

Purging Deleted Records

After you delete a record, the RecordCount property still includes the deleted record.

If you refresh (or requery) the Recordset, it will not include the deleted record and RecordCount will be accurate. If you set CacheSize to 1 (so that you are caching only a single record at a time), RecordCount will accurately exclude deleted rows. You should avoid increasing CacheSize with a keyset cursor type in circumstances when you are expecting to be able to use RecordCount after you delete records.

References to Field Values

One of the more surprising consequences of using a loosely typed VBScript variable is that you sometimes don't know exactly what you are referencing. In the following ASP example, the session variable strAuthor apparently loses its value once the Recordset is closed.

Set Session("strAuthor") = rstPublishers.Fields("Author")
rstPublishers.Close

And then later in the script:

Response.Write Session("strAuthor") 'Where'd it go?

The problem is that strAuthor is a Variant data type containing a reference to a Field object, not a string, as it appears to be. When the Recordset is closed, the Field object is no longer valid, and the variable appears empty. You can avoid the problem by qualifying the code with the Value property:

Session("strAuthor") = rstPublishers.Fields("Author").Value.

Rather than perform a lookup from the Recordset collection each time you need a field (which is what happens when you use the column name), you can use a reference to a Field object to keep track of the current value of a field.

<%
Set rs = Server.CreateObject("ADODB.Recordset")
rs.Open "Products", "DSN=AdvWorks",_
adOpenForwardOnly, adLockReadOnly, adCmdTableDirect
'Select fields now, using single lookups.
Set fldProdId = rs("ProductID")
Set fldProdName = rs("ProductName")
Set fldProdDesc = rs("ProductDescription")
'Loop through records using field references.
Do Until rs.EOF
Response.Write fldProdId & ": <b>" & fldProdName &_
"</b>: " & fldProdDesc & "<BR>"
rs.MoveNext
Loop
rs.Close
%>

With this technique, it is possible to perform a query even without specifying a SQL command.

VBScript Example: Filling a List Box

When using a server-side script to fill a list box for use on the client's HTML page, keep in mind that the list box doesn't exist yet when your server-side code is executed. For this reason, you need to generate client-side code using server-side scripting. You can use either the standard HTML SELECT tag (which creates a drop-down list box), or a custom ActiveX Listbox control. Using the SELECT tag, you simply use server-side HTML to fill in the VALUEs. The following example demonstrates how you could fill in a SELECT control from an ADO Recordset:

<SELECT>
<% Do While NOT rs.EOF %>
<OPTION VALUE="<%= rs("Name") %>"><%= rs("Name") %></OPTION>
<% rs.MoveNext
Loop %>
</SELECT>

When you are working with client-side ActiveX objects like the Listbox control, it's a good idea to use server-side scripting to programmatically pass values to ListBox.AddItem calls during the Window_OnLoad event. Here is an example (the groups of double quotation marks are used to keep the Name data together as a string):

<SCRIPT LANGUAGE="VBScript">
<!--
Sub Window_OnLoad()
<% Do While NOT rs.EOF
Response.Write "ListBox.AddItem """" & rs("Name") & """" & vbCr
rs.MoveNext
Loop %>
End Sub
" >
</SCRIPT>

PerlScript Example: Filling a Table

Filling a table is nearly identical to filling a SELECT tag. Here's an example that uses PerlScript to fill the rows of a table:

<%@ LANGUAGE=PerlScript %>
<% #--- Open the connection and query.
$Conn = $Server->CreateObject("ADODB.Connection");
$Conn->Open( "AdvWorks" );
$RS = $Conn->Execute( "SELECT * FROM Products" );
%>
<TABLE BORDER=1>
<TR>
<% #--- Create a row of column headings.
$count = $RS->Fields->Count;
for ( $i = 0; $i < $count; $i++ ) {
%><TH><%= $RS->Fields($i)->Name %></TH><%
};
%>
</TR>
<% #--- Fill in the rows of the table with data.
while ( ! $RS->EOF ) {
%><TR><%
for ( $i = 0; $i < $count; $i++ ) {
%><TD VALIGN=TOP>
<%= $RS->Fields($i)->Value %></TD><%
};
%></TR><%
$RS->MoveNext;
};
#--- Close connection.
$RS->Close;
$Conn->Close;
%>
</TABLE>

Limiting the Number of Records

If you know you'll require only the first few records, it makes sense to limit the number of records retrieved from the database. You can do this with the MaxRecords property of the Recordset object. The default setting for this property is zero, which means that the provider normally returns all requested records. Setting it to some other value will limit the size of the rowset returned from the query. The effect is the same as if you had used the Microsoft® SQL Rowcount directive as part of your query.

The MaxRecords property is often used together with a SQL "ORDER BY" clause to produce a "Top Ten" list based on some attribute of the data. The next example returns a Recordset containing the 10 most expensive publications in the Titles table.

<%
'NOTE: DSN-less connection.
strCnn = "driver={SQL Server};server=(local);" &_
"uid=sa;pwd=;database=pubs"
Set rs = Server.CreateObject("ADODB.Recordset")
rs.MaxRecords = 10
rs.Open "SELECT Title, Price FROM Titles ORDER BY Price DESC",_
strCnn, adOpenForwardOnly, adLockReadOnly, adCmdText
%>

Note: You must set the MaxRecords property before the Recordset is opened. After setting MaxPrecords, the property is read-only.

Visual Basic Example: Paging through a Recordset

After browsing the top ten results, users may want to view the remaining records. In this case, you don't necessarily want to limit the records. You just want to stop after the first page is filled, and wait for the signal to continue. It's easy enough to stop after a certain number of records. But what's the best way to pick up where you left off?

The Recordset object exposes three properties to assist you in your task: PageSize, PageCount, and AbsolutePage. Once you have set the page size, your database provider can calculate how many pages of data it will return to you, and let you jump to locations at the beginning of each page. (If you set the CacheSize equal to PageSize, only the records you display will be cached in the RecordSet.) Note that you must use a client-side cursor to enable absolute positioning in ODBC data sources.

The following Visual Basic example uses the AbsolutePage, PageCount, and PageSize properties to display names and hire dates from the Employee table (of the Pubs database), listing five records at a time in a message box:

Public Sub AbsolutePageExample()
Dim rstEmployees As New ADODB.Recordset
Dim strCnn As String
Dim strMessage As String
Dim intPage As Integer
Dim intPageCount As Integer
Dim intRecord As Integer
strCnn = "driver={SQL Server};server=(local);" & _
"uid=sa;pwd=;database=pubs"
'Use client cursor to enable AbsolutePage property.
rstEmployees.CursorLocation = adUseClient
rstEmployees.CacheSize = 5
rstEmployees.Open "employee", strCnn
rstEmployees.MoveFirst
'Display names and hire dates, five records at a time.
rstEmployees.PageSize = 5
intPageCount = rstEmployees.PageCount
For intPage = 1 To intPageCount
rstEmployees.AbsolutePage = intPage
strMessage = ""
For intRecord = 1 To rstEmployees.PageSize
strMessage = strMessage & _
rstEmployees!fname & " " & _
rstEmployees!lname & " " & _
rstEmployees!hire_date & vbCr
rstEmployees.MoveNext
If rstEmployees.EOF Then Exit For
Next intRecord
MsgBox strMessage
Next intPage
rstEmployees.Close
End Sub

Recordset paging gets a little more complex when you are using an ASP page. Because you can return only a single page of data at a time, you'll have to decide what to do with the open Recordset object while waiting for the next client request. You have two options-throw away the results and keep just the current page number, or stow the Recordset object in the user's Session object for later retrieval. Which approach you choose will be based on the needs of your user and your application. You must then decide whether you want to save time by caching the results, or if you would rather release the memory for another application to use.

Note: An example of Recordset paging is included on the Microsoft® Windows® 2000 Server Resource Kit companion CD, as part of the Feedback application.

Retrieving Image Data

Images (as with long bodies of text) are stored in a database as BLOB fields. Because of the added overhead for the DBMS, retrieving images from a database is slower than referencing an image URL on disk, so whether or not you store images in a database depends on the requirements of your application. You can retrieve image data with ADO by using the GetChunk method of the Field object. This method requires that you specify the number of bytes or characters that you wish to retrieve.

It takes two files working together to retrieve multiple images on the same page. The main file, which contains HTML formatting and IMAGE tags, requires the use of a separate ASP file to perform the actual image query. The secondary ASP file, Image.asp, retrieves the requested image, and returns it as a binary object in the HTTP response using a MIME content type of "image/gif." The following is the source for Image.asp:

<%@ LANGUAGE=JScript EnableSessionState=False %>
<%
var ID, rs, fld, cBytes
ID = Request.QueryString("ID");
if (ID + "" != "undefined") {
rs = Server.CreateObject("ADODB.Recordset");
rs.Filter = "ImageID=" + ID; //--- Search criteria
rs.Open ("Images", Application("ConnectionString"),
adOpenForwardOnly, adLockReadOnly, adCmdTableDirect);
if (!rs.EOF) {
fld = rs("ImageData"); //Get field reference.
cBytes = fld.ActualSize; //Determine size.
//Return raw binary image data as "image/jpeg" MIME type.
Response.ContentType = "image/jpeg";
Response.BinaryWrite(fld.GetChunk(cBytes));
}
else {
Response.Write("Image '" + ID + "' not found.");
}
rs.Close();
}
else {
Response.Write("No ID");
}
%>

In the main file, images stored in the database can now be retrieved using the image ID as a URL parameter to Image.asp, like this:

<IMG SRC="./image.asp?ID=<%=ImageID%>">

When you use a firehose cursor, there isn't a good way to discover the size of a BLOB field before you read it. If you must use a firehose cursor, consider maintaining a separate column in the database table that stores the image's size in bytes. Otherwise, you could call the GetChunk method repeatedly until it returns nothing. Alternatively, if you think that the image will fit into available memory, simply use the Value property to retrieve the data all at once.

Stored Procedures

If you find yourself performing complex data manipulation, consider organizing database dependencies and rules into stored procedures.

Stored procedures are precompiled queries stored on the database server. They can simplify development and significantly speed up complex queries. When a stored procedure is called, it controls which operations are performed and which database fields are accessed. A single stored procedure can even execute multiple queries.

Stored procedures have explicitly defined parameters, each with a specific data type, direction, and size. Before calling a stored procedure, the Parameter collection of the Command object must be prepared to precisely match the number and type of parameters defined for the procedure on the server. Although you can request the complete Parameter collection by calling the Refresh method, building the collection parameter by parameter is preferred. Calling this method results in faster execution and avoids a network round-trip to the server. (Also, some providers do not support populating the Parameter collection with the Refresh method.) The code, however, ends up looking a bit more complex, as shown here:

<%
Set cm = Server.CreateObject("ADODB.Command")
cm.CommandText = "AddCustomer"
cm.CommandType = adCmdStoredProc
Set p = cm.Parameters
p.Append cm.CreateParameter("@Name", adChar, adParamInput, 40)
p.Append cm.CreateParameter("@Address", adChar, adParamInput, 80)
p.Append cm.CreateParameter("@City", adChar, adParamInput, 20)
p.Append cm.CreateParameter("@State", adChar, adParamInput, 2)
p.Append cm.CreateParameter("@Zip", adChar, adParamInput, 11)
cm("@Name") = Trim(Request.Form("Name"))
cm("@Address") = Trim(Request.Form("Address"))
cm("@City") = Trim(Request.Form("City"))
cm("@State") = Trim(Request.Form("State"))
cm("@Zip") = Trim(Request.Form("Zip"))
cm.Execute
%>

Returning Values from Stored Procedures

The following lines of OSQL code (a utility that executes Transact-SQL commands) define two stored procedures-one that returns the value 10 directly, and one that returns a value by reference in its output parameter:

Create procedure sp_retvalparam as return 10
Go
Create procedure sp_inoutparam(@in char(200),@out char(200) out) as select @out = @in
Go

The following examples use two different but equally valid means of invoking these stored procedures. Here's the first method:

cmd.CommandText = "sp_retvalparam"
cmd.CommandType = adCmdStoredProc
'Quietly retrieve parameter info from server.
cmd(0).Direction = adParamReturnValue
cmd.Execute
Response.Write cmd(0)

The collection lookup, cmd(0).Direction = adParamReturnValue, causes an implicit Refresh to retrieve parameter descriptions automatically, which causes an extra trip to the server. You can avoid this extra trip by creating your own parameter collection with the CreateParameter method, as follows:

cmd.CommandText = "{ call sp_inoutparam(?,?) }"
cmd.CommandType = adCmdText
'Specify parameter info.
p.Append cmd.CreateParameter("in", adChar, adParamInput, 200, "foo")
p.Append cmd.CreateParameter("out", adChar, adParamOutput, 200)
cmd.Execute
Response.Write cmd(1)

Prepared Queries

If a SQL statement will be used multiple times, you can potentially improve the performance of your application with prepared queries. When a SQL statement is prepared, a temporary stored procedure is created and compiled. This procedure is executed when the prepared statement is called, which saves the overhead of parsing the command each time it is used.

The following example demonstrates the use of a prepared query:

<%
Set cmd = Server.CreateObject("ADODB.Command")
cmd.ActiveConnection = Application("ConnectionString")
cmd.Prepared = True
cmd.CommandText = "Select * From Catalogue Where Id=?"
Set p = cmd.Parameters
p.Append cmd.CreateParameter("prodId",adInteger,adParamInput)
cmd("prodId") = Request("Id1")
Set rs = cmd.Execute
cmd("prodId") = Request("Id2")
Set rs = cmd.Execute
%>

Usually, prepared queries are dropped when the connection is released. When connection pooling is enabled, however, you risk running out of space in your temporary database if your connections are recycled often enough. If you use prepared queries, you can configure the SQL Server driver to drop queries "as appropriate" when connection pooling is enabled. You can select this option in the Data Sources (ODBC) application (in Administrative Tools) in Control Panel.

Transaction Processing on the Web

People have been talking about client/server applications and middleware for years. But the truth is that, even with supposedly simple facilities like RPC (remote procedure call) and named pipes, writing server-based applications that readily and efficiently interact with desktop-based applications has remained just out of reach for most developers. Component Services makes it possible to easily write true n-tier applications.

Component Services extends the functionality of applications running on Windows 2000 Server and IIS 5.0. The integration of IIS 5.0 and Component Services makes it easy to manage transactions-complex, distributed database updates. In addition to managing transactions, Component Services provides other benefits that are perhaps even more important:

Simplified Programming Component programming in the Component Services environment is as noninvasive as possible. In most cases, fewer than five lines of code are required to make an object transaction-aware. In fact, most COM objects can participate in Component Services transactions without modification.
Distributed Applications Framework In the Component Services run-time environment, distributed applications are the norm, not the exception. Component Services seamlessly integrates the Component Services Executive, server processes, resource managers, and resource dispensers with the Microsoft® Distributed Transaction Coordinator (DTC) to create an environment that can run on the same system or be scattered across multiple systems.
Components Application components can be built in any language that supports the creation of COM objects.
Security Component Services automatically enforces user roles and Windows 2000 security for you.
Recovery and Restart When failures occur, COM+ applications are automatically restarted, and data consistency is maintained.
Scalability Context and thread management, automatic resource recycling, and just-in-time (JIT) activation help COM+ applications perform extremely well in high-usage scenarios.
Transactions COM+ applications inherit the power and reliability of automatic distributed transactions. Any DTC-compliant OLE DB or ODBC data source, such as Message Queuing, can participate in those transactions.

Transactions Explained

A transaction, simply put, is an "all or nothing" sequence of database transformations. No modifications are committed to the database until all steps of the transaction have completed successfully. If any of the actions cannot be completed, the entire transaction is automatically "rolled back," or undone. Transactions are a technique applied to guarantee the "correctness" of database operations.

The properties of transactions are collectively known by the acronym ACID:

Atomicity Transactions are either committed or rolled back as a whole. Unless all database transformations are successful, none of them will be committed to the database.
Consistency A transaction never leaves the system in a state that cannot be recovered if the transaction fails.
Isolation Until the transaction has completed successfully, its modifications are not visible to other users.
Durability Committed transactions persist beyond any subsequent software or hardware failures. Transactions that have not committed are rolled back when the system is restarted.

At the most basic level, transactions ensure that data is protected from accidental modifications that would invalidate it. If an event occurs that upsets the intended sequence of changes, all the previous changes can be undone to restore the database to its original form. What happens if the second (or third) update fails? What if the application that is making changes crashes? What if the computer is accidentally turned off? Transactions that are stopped short are guaranteed to have no lasting effect on your data.

A transaction defines a boundary that encapsulates several database interactions and makes them appear as a single atomic interaction. Once a transaction has begun, an application can make changes to multiple records, and the effect of these changes is isolated from the rest of the database and from other users. When the transaction is committed, all the changes appear to happen simultaneously, in such a way as to guarantee that no data is lost or compromised.

Although transactions are important in commerce, transactions aren't just about money. They arbitrate the contention that occurs with demand for any "hard" resource that cannot be created or destroyed.

Extending the Limits of Transactions

Although you can perform transactions using the ADO Connection object, the transaction is limited to a single server. In fact, most transaction processing systems allow only one server to participate in a transaction at a time.

But what if, for example, you are transferring money from checking accounts on two separate systems, and the systems use different database management systems? In this case, transaction protection limited to a single connection isn't enough. To manage both database operations as a single transaction, you need a distributed transaction coordinator like Component Services.

With Component Services, applications can easily use transactions that access multiple databases with true two-phase commit. The two-phase commit protocol ensures that transactions that apply to more than one server are completed on all servers or none at all. Two-phase commit is coordinated by the transaction manager and supported by resource managers.

Component Services supports a component-based programming model and run-time environment for developing and deploying enterprise Web server applications. You can configure components to run in the Component Services environment by adding them to a Component Services package. The packaged components run together in the same server process. Packages also define declarative security constructs that define how access to components is controlled at run time. With packages, developers and administrators can configure applications at deployment time to match the topology and security requirements of their target environment.

Transactional ASP

IIS 5.0 makes it possible to easily write scalable, reliable, and transactional Web applications. The features of Component Services are available to all ASP and ISAPI applications. IIS frees developers to focus on what is really important in an application
-the business logic.

For all ASP transactions, the "unit of work" is the ASP page. Transactions are limited to a single page; they cannot span multiple pages.

To perform a transaction, the ASP file containing the application script simply needs to include a command at the beginning of the script to declare that a transaction is needed: <%@ transaction=required %>. For these pages, a new ASP built-in object, ObjectContext, is defined that is used to commit or abort the transaction.

You indicate when the transaction is over by using the ObjectContext object, and then, if all is well, you commit the transaction using the SetComplete method. If the operation has failed, use SetAbort. Unless one of these methods is called explicitly, the transaction will commit automatically when the page has completed processing. Two event handlers are available on transactional pages, OnTransactionCommit and OnTransactionAbort.

The following example demonstrates each of these transaction elements by randomly aborting a transaction. Although it doesn't do any real database processing, it illustrates how easy it is to add the power of transactions to Web-based applications. Click the browser's Refresh button repeatedly to see the event handlers at work.

<%@ LANGUAGE=VBScript Transaction=Required EnableSessionState=False %>
<HTML>
<HEAD>
<TITLE>Transactional ASP</TITLE>
<SCRIPT Language=VBScript Runat=Server>
Sub OnTransactionCommit()
'Code used when transaction succeeds.
Response.Write "<FONT color=green>committed</FONT>"
End Sub
Sub OnTransactionAbort()
'Code used when transaction fails.
Response.Write "<FONT color=red>aborted</FONT>"
End Sub
</SCRIPT>
</HEAD>
<BODY BGCOLOR=#FFFFFF>
The transaction was:
<% 'Randomly abort the transaction.
Randomize
If Rnd > 0.5 Then
ObjectContext.SetAbort
End If
%>
</BODY>
</HTML>

Because the transactional event handlers aren't called until the page has completely finished processing, the final messages in the example appear after the closing HTML tag. A better implementation would consist of pure script in ASP pages, whose event handlers redirect to a separate page containing the appropriate HTML response. For a discussion of redirection, see "Developing Web Applications" in this book.

Any components used on the transactional ASP page indirectly affect the outcome of the transaction, even if the components themselves are not transactional.

Business Objects vs. Script in ASP Pages

It is a common mistake to code business logic by exclusively using script in ASP pages. ASP is interpreted at run time, so pure script implementations are much slower than those that use precompiled objects. Also, because the business logic is exposed as script, it can be viewed by anyone with access to the server, making it vulnerable to tampering. Lastly, scripted business logic is not easily and cleanly reusable. As the application grows, so does its complexity. Objects become enmeshed with other objects, the system becomes hard to manage, and the bugs multiply.

For all these reasons, ASP should only be used as the "glue" that holds components together. If you're serious about scalability, you must implement your business logic as components.

Transactional Components

Applications that access databases can be built two ways: Business logic can reside in a particular database, or it can be packaged into components. The advantage of using components is that business rules can be generalized to work with any database, and can be reused in multiple applications.

When business logic is segregated into components that work in multiple scenarios, the components become building blocks for applications. The same components can be used for both network and Web-based applications. Reusing components speeds development time and lowers costs.

Business Logic in Components

A component doesn't have to access a database to be considered a business object. In fact, some business objects simply perform complex calculations and automate common business tasks. This section presents some guidelines for designing effective COM components.

Component Granularity

The number of tasks a component performs determines its granularity.

A fine-grained component consumes and releases resources quickly after completing a task. Components such as these isolate individual tasks in well-defined modules, which are easily reused in other packages. For example, you might design a component to facilitate adding and removing customer records in a database. Because its task is simple, the component can be written efficiently and is easy to debug and maintain. It is also more likely to be reused in other applications that maintain customer data.

A coarse-grained component performs multiple tasks that are often unrelated to each other. For example, a component called PlaceOrder might not only create a new order, but modify inventory and update customer data, too. Coarse-grained components are somewhat harder to debug, and are less likely to be reused.

Component State

Objects that discard information between uses are considered stateless. Business objects usually don't need to maintain state to correctly process new requests; nor do they need to maintain database connections between calls. Stateless objects use fewer system resources, so they can scale better and perform more quickly.

Stateful objects, on the other hand, accumulate information over several method calls. Because a COM+ application cannot commit transactions until SetComplete is called, stateful objects effectively prevent the COM+ application from completing its work. Frequent network roundtrips and slower connections extend the lifetime of the object. The extra delay might cause server resources, such as database connections, to be held longer, and thus decrease the system resources available for other clients. A decision to hold state information in an object should be considered carefully.

As the number of concurrent transactions increases, stateless objects begin to outperform stateful ones significantly. In other words, stateless components scale better. Despite the limitations of stateful objects, it sometimes makes sense to maintain some state information, especially if it is the result of a complex or time-consuming query. Stateful objects might be used if reconstructing the object state is potentially more costly than the resources held open while it remains active.

Note: Objects that need to retain state across multiple calls from a client can protect themselves from having their work committed prematurely by the client. By calling the DisableCommit method of the ObjectContext object before returning control to the client, the object can guarantee that its transaction cannot be committed until the object has called EnableCommit.

Participating in Transactions

You can create a transactional component that takes advantage of the benefits of a COM+ application, with only a few extra lines of code.

To create a transactional component

Call GetObjectContext to get a reference to the ObjectContext object, which enables your component to "vote" on the success or failure of the transaction in progress.
Create other components by using the ObjectContext object's CreateInstance method. When a base client instantiates an object by using the CreateInstance method, the new object and its descendants will participate in the transaction, unless the new object's transaction attribute is set to Requires a new transaction or Does not support transactions.
Call either SetComplete when the object has completed successfully, or SetAbort to cancel the transaction.

Just-In-Time Activation

This discussion about component state really only makes sense within the scope of a transaction. But what happens to the object once the transaction has been completed? Component Services deactivates it. Whenever an application calls SetComplete, whether from an ASP page or from within a COM component, the system recycles all the objects involved in the transaction-even stateful ones. In the process of deactivation, the object's member variables are reinitialized to their initial values.

This process is part of just-in-time activation, which allows the Component Services run-time environment to free up object resources, including any database connections it holds, without requiring the application to release its references to component objects. Components are activated only as needed for executing requests from clients, allowing otherwise idle server resources to be used more productively. Just-In-Time activation allows your application to conserve system resources as it scales up to multiple users.

This is yet another reason to be careful about maintaining state in objects. Clients of a stateful COM component object must be aware of how it uses SetComplete to ensure that any state the object maintains won't be needed after the object undergoes just-in-time activation.

The following Visual Basic code template demonstrates how to incorporate these elements into a component:

Sub DoMTSTransaction()
Dim objCtx As ObjectContext
Dim objNew As NewObject
On Error Goto ErrHandler
Set objCtx = GetObjectContext()
Set objNew = objCtx.CreateInstance("MyObject.NewObject")
'---------------------------------
'--- More component logic here ---
'---------------------------------
objCtx.SetComplete
Exit Sub
ErrHandler:
objCtx.SetAbort
End Sub

Be sure not to use CreateObject or GetObject when creating objects in a COM component. Role-based security in Component Services works only if you use Server.CreateObject from ASP or ObjectContext.CreateInstance from within your COM component. When you use CreateObject or GetObject, the component identity is inherited from the application process. Conversely, when you use Server.CreateObject, the identity is that of the impersonated user. In order for COM role-based security to work properly, the correct caller identity must be determined.

For more information about COM security and roles in Component Services, see "Security" in this book. For more examples of how to create transactional components, refer to the Exploration Air sample site included on the Resource Kit companion CD.

Using Database Access Interfaces with Component Services

Because the ODBC Driver Manager is a Component Services resource dispenser, data accessed via ODBC is automatically protected by your object's transaction. For object transactions, an ODBC-compliant database must support the following features:

The database's ODBC driver must be thread-safe. Component Services must be able to connect to the database, utilize the connection, and disconnect by using different threads. (The Access driver cannot participate in Component Services transactions, because it is not completely thread-safe.)
If ODBC is used from within a transactional component, the ODBC driver must also support the OdbcSqlAttrEnlistInDtc connection attribute. It's through the use of this attribute that the ODBC Driver Manager can allow the ODBC driver to enlist a connection on a transaction. If you are using a database without a resource dispenser that can recognize Component Services transactions, contact your database vendor to obtain the required support.

Table 7.6 summarizes database requirements for full COM+ application support.

Table 7.6 ODBC Driver Requirements for Component Services Compliance

Requirements	Description
Support for the OLE transactions specification, or support for XA protocol	Enables direct interaction with the Distributed Transaction Coordinator (DTC) by using the XA Mapper.
ODBC driver	Platform requirement for COM+ application server.
Support for ODBC version 3.0's SqlAttrEnlistInDtc attribute	COM+ applications use this call to pass the transaction identifier to the ODBC driver. The ODBC driver then passes the transaction identifier to the database engine.
Fully thread-safe ODBC driver	ODBC driver must be able to handle concurrent calls from any thread at any time.

Because ADO is an indirect consumer of ODBC data sources, it makes a good candidate for data access from COM components. In fact, there is very little semantic difference between using ADO inside a COM component, and using it from a Web page.

Distribution and Scaling Issues

Hosting a large volume of business transactions from clients around the world introduces a demanding set of programming and administrative requirements. Distributed applications very often rely on diverse resources beyond the application's scope of control. Web applications must be prepared to host a large volume of connections without suffering a significant loss in performance.

The key factors that influence a component's ability to service a large volume of users include:

Resources allocated per user (memory, data connections).
Amount of information retrieved from and sent to the browser.
Location of processing (client, server, or both).
Impact of component distribution topology on response time.
Time to process common requests.

Components should be located as close as possible to the data source. If you are building a distributed application with a number of COM component packages running on local and remote servers, try to group your components according to the location of your data. For example, the Accounting server should host both the Accounting Component Services package and the Accounting database.

Pool your resources by server process. Note that Component Services runs each hosted package in a separate server process. The fewer pools that are running on a server, the more efficiently resources are pooled. Try to group components so that they share "expensive" resources, such as connections to a specific database. If you reuse these resources within your package, you will improve the performance and scaling of your application. For example, if a database lookup and a database update component are running in a customer maintenance application, package those components together so that they can share database connections.

Introducing Message Queuing

For components that are not directly connected, diminished network throughput can be a severe impairment to the scalability of an application. This section discusses the use of Message Queuing to increase the reliability of transactions that use disconnected components.

Message Queuing is a fast store-and-forward service that enables applications to communicate across heterogeneous networks and systems. Applications send messages to Message Queuing, and Message Queuing ensures that the messages eventually reach their destination. Message Queuing provides guaranteed message delivery, efficient routing, security, and priority-based messaging.

Most importantly, Message Queuing also supports Component Services transactions. Because it is a resource manager under the control of the DTC, you can use Message Queuing to safely implement transaction-compliant applications that ensure that message operations either succeed or fail in conjunction with other transactions. A transactional application can send a message through Message Queuing and update a database as part of the same transaction. The transaction coordinator ensures that both actions succeed or fail together.

For instance, a single online purchase might consist of a credit card validation, an adjustment to inventory on hand, and a notification to an external supplier. Of course, the external supplier's system may be temporarily offline, which prevents the system from notifying the external supplier. In the context of a transaction, failures such as this dictate that the entire process be aborted, whether or not the inventory was available and in spite of a successful credit card debit.

A better solution is to use Message Queuing, which guarantees that the vendor will be notified eventually. In this way, Message Queuing enables systems to respond more flexibly to factors that can cause transaction failures; as a result, this effectively eliminates errors caused by server outages and long pauses caused by network latency, making Message Queuing ideal for "time-independent" transactions.

Time-Independent Transaction Processing with Message Queuing

Complex transactions may have several smaller subtransactions, each with differing levels of importance. Determining which operations are required to complete the transaction, and which may be eligible for deferral, is a crucial first step in designing a robust system with Message Queuing. You should only consider using messages for operations that are not essential to the overall stability of the system.

Message Queuing operations always commit, and return, control to your application quickly. When used to process part of your transaction, Message Queuing saves a "message" to an on-disk queue, instead of performing a database operation. The message should contain enough information to describe the operation that would have been performed on the remote system, had it been connected.

Some time later (or at a time you choose), Message Queuing establishes a connection with Message Queuing that is running on the remote system and then transmits the message. As soon as the remote system has successfully received it, the message is dequeued and processed as part of a new database transaction. If all goes well, a confirmation message is sent back to the Message Queuing system that initiated the transaction. This message path is shown in Figure 7.5:

Bb742412.iis0705(en-us,TechNet.10).gif

Figure 7.5 Diagram of a Message Queuing Transaction

Since Message Queuing messages are system-specific, you are responsible for collecting the information your system will need to complete the transaction. Message Queuing merely guarantees that your message will be delivered to the appropriate queue on the remote system. The following example demonstrates how you might add support for Message Queuing to a COM component within a system that processes orders for a vendor's products:

Sub OrderProduct(
OrderID As Int,
ProductID As Int,
Quantity As Int)
Dim objCtx As ObjectContext
Dim query As New MSMQQuery
Dim msg As New MSMQMessage
Dim infoSend As New MSMQQueueInfo
Dim infoResp As MSMQQueueInfo
Dim queue As MSMQQueue
On Error Goto ErrHandler
Set objCtx = GetObjectContext()
'Open destination queue.
infoSend.PathName = "Contractor\ProductOrderQueue"
Set queue = infoSend.Open(MQ_SEND_ACCESS,MQ_DENY_NONE)
'Construct application specific message.
msg.Label = "Product Order"
msg.Body = Str(OrderID) & ";" & Str(ProductID) & ";" & Str(Quantity)
msg.PrivLevel = MQMSG_PRIV_LEVEL_BODY
'Lookup response queue.
Set infoResp = query.LookupQueue(Label:="ContractorResponse")
infoResp.Reset
'Set application specific response queue.
Set msg.ResponseQueueInfo = infoResp.Next
'Send message to remote queue.
msg.Send queue, MQ_MTS_TRANSACTION
queue.Close
objCtx.SetComplete
Exit Sub
ErrHandler:
objCtx.SetAbort
End Sub

A Message Queuing transaction merely sends a message, so the application must assume that the work can be performed. In this case, no effort has been made to determine if the contractor can supply the requested number of products. This inability to determine, in advance, if the transaction will be honored is a problem known as "deferred integrity." A good example of deferred integrity in the real world might be that of a bank handling the case of insufficient funds for a check that has already been cashed.

Once the decision has been made to allow portions of a transaction to take place asynchronously, a business policy must be implemented to determine what happens if the transaction cannot be satisfied. In certain cases, transaction failures may require an application-level rollback of prior-committed transactions. Or, they may require correspondence with the customer in order to decide the best course of action.

Additional Resources

The following Web sites and books provide additional information about IIS 5.0 and about other features of Windows 2000 Server. It also provides resources for Web server data access and transactions.

Web Links

https://www.microsoft.com/data/

Provides a central location for information about Universal Data Access and the technologies that make it possible. Here you will find information and the latest news on ADO, OLE DB, ODBC, and much more.

https://www.microsoft.com/com/

The Microsoft® Component Object Model (COM) technologies Web site encompasses information about COM-based technologies such as Microsoft® Distributed Component Object Model (DCOM), Component Services, ActiveX controls, and more.

https://msdn.microsoft.com

The Microsoft Web site for application developers and Web site builders provides information about developing Web applications. Includes information about access to data and transactions.

Books

Teach Yourself Active Web Database Programming in 21 Days byFleet, Warren, Chen, and Stojanovic, 1997, Indianapolis: Sams.net Publishing.

A step-by-step tutorial of ADO, and data-centric business object development fundamentals.

ADO 2.0 Programmer's Reference by Sussman and Homer, 1998, Chicago: Wrox Press Ltd.

Provides a concise, comprehensive guide to the way ADO 2.0 can be used in all kinds of applications. Demonstrates ADO using VB, C++, J++, and scripting languages with ASP.

Teach Yourself Database Programming with Visual C++ in 21 Days byL. Robinson, 1998, Indianapolis: Sams.net Publishing.

Covers Visual C++ programming for ADO, RDS, OLE DB, ODBC, Component Services, COM, as well as DAO and the MFC ODBC classes. You will learn how to choose the best database model/technology for your application. You will also learn about database design, ways to leverage a database server, how to create C++ component software for database applications, and multitier application development.