SQL Server Best Practices Article
Writers: Denny Lee, Nicholas Dritsas
Technical Reviewers: Lubor Kollar, Lindsey Allen
Applies To: SQL Server 2005 RTM
Summary: This white paper describes how to set up a load-balanced scalable querying environment for Microsoft SQL Server 2005 Analysis Services so that you can handle a large number of concurrent queries to your Analysis Services servers. Load-balanced querying ensures that readers of OLAP cubes can consistently query for the latest aggregations throughout the day and distribute the load of all queries among the available servers. This scale-out querying architecture optimizes cube processing time, increases the frequency of cube update, and makes processing more robust as you can afford more frequent processing and transparent error recovery.
On This Page
This white paper provides details on how to set up a load-balanced scalable querying environment for Microsoft® SQL Server™ 2005 Analysis Services. Load-balanced querying ensures that readers of OLAP cubes can consistently query for the latest aggregations throughout the day and distribute the load of all queries among the available servers. This architecture provides the following improvements:
Optimizes cube processing time. With a server devoted to cube processing, you can optimize your tasks and/or configurations specific to cube processing.
Increases the frequency of cube updates. Because there are separate servers for querying and processing, cubes can be updated more often if necessary.
Makes the processing mechanism more robust. With a server devoted to cube processing, you can afford more frequent processing and transparent error recovery.
Figure 1 Analysis Services scale-out querying architecture
Figure 1 is a template based on multiple existing customer implementations that have been operating in production for a number of years. As can be seen in the diagram, the scale-out querying architecture presented involves creating three separate layers: the presentation layer, the Analysis Services querying layer, and the processing layer. The presentation layer presents users with the data and is comprised of four Web servers. The Analysis Services querying layer contains the cubes that provide the data for the presentation layer and consists of two Analysis Services servers. The processing layer, which processes the data, consists of a server running SQL Server and an Analysis Services server. The actual number of servers will vary based on your environment, but this is a good starting point.
The presentation layer in our template (see Figure 1) has four Web servers, which present the OLAP cube data to the user. In typical environments, accessing cube data involves the use of SQL Server Reporting Services, a custom Web UI, or another Web application. The template shown in Figure 1 uses a Web application or SQL Server Reporting Services. Each of the four Web servers have the specifications of IIS 6.0, 32-bit Microsoft Windows® 2003 Standard Edition, 4-GB RAM, two CPUs, and about 250-GB (gigabyte) of storage. Because the goal of this architecture is to scale out, this is a standard Web reporting application setup suitable for most Analysis Services Web applications.
The Analysis Services querying layer has two Analysis Services servers; each server contains two instances of Analysis Services running on Windows 2003 Enterprise Edition in 64-bit. As many current hardware vendors can run in either 32-bit or 64-bit mode, it makes sense to make use of the Analysis Services ability to access memory within a 64-bit context. To properly use Analysis Services in 64-bit, it is suggested that you have at least 8-GB RAM and four CPUs (or two dual cores). Approximately 700 GB of storage space is required as well.
The processing layer includes one Analysis Services processing server and one server running SQL Server. The Analysis Services processing server has almost the same specifications as the Analysis Services querying servers except that only one instance of Analysis Services is required on this server. The server that is running SQL Server is more powerful so that it can process all of the data to feed the cubes and to be the source for ROLAP dimensions. This server is running SQL Server 2005 and Windows 2003 Enterprise Edition in 64-bit. As it is a more powerful server, it is suggested that you have 16-GB RAM and eight CPUs (or four dual cores). Approximately 1 terabyte of storage space is required so that SQL Server can process and store the data, indexes, and logs.
This document provides details and benefits based on this configuration. It is possible that your environment will require modifications. For example, if you have very large OLAP cubes with many distinct count calculations, it may be necessary to have more memory and/or CPUs to optimize performance on your Analysis Services servers. If your custom UI application is especially memory intensive and caches a lot of data, it may be necessary to increase your Web server’s memory and storage space. Your reporting application may have its own unique requirements. But the details provided in this white paper provide a starting point to make hardware requirement decisions.
The benefit of this scale-out querying architecture is to provide you with the ability to handle a large number of users who are concurrently querying your Analysis Services servers. Even if your cubes are not inherently large, if many users are trying to query this information, it is beneficial to implement a scale-out environment to ensure that multiple queries do not deplete the resources of your reporting environment. Therefore, it is important for you to consider this architecture if a large number of users concurrently access your cube data.
As you can see in Figure 1, the architecture is a scale-out design that involves the use of processing servers and separate querying servers, which are synchronized upon completion of processing. Processing requires committing a new state for the OLAP cube, thus taking an exclusive lock on the database. But long-running queries will block this request, so processing must wait until those queries complete. Having separate processing and querying Analysis Services servers ensures that the two functionalities do not interfere with each other. This allows your processing server to have a much larger processing window to complete its task. Meanwhile, the querying servers are devoted to only querying tasks, thereby improving the ability to cache the queries. During the synchronization of data from processing to querying servers, the transfer occurs to idle Analysis Services instances. Upon completion, the newly updated instances become the primary query instance. This allows users to continue querying the Analysis Services server instances without experiencing any delays in response.
Note that this process is just as time efficient as processing by using SQL Server 2000 and Network Load Balancing. By the same token, this particular design does not preclude the use of hardware load balancing to help optimize the individual Web or Analysis Services query layers.
The hardware types and the number of servers in our configuration (Figure 1) successfully delivers OLAP query cellsets in a matter of a few seconds to 30 concurrent users per Analysis Services server before maxing out. These queries are running against primarily count and summation aggregations, with one distinct count measure and a total OLAP database size of approximately 100 GB. It is important to note that a large number of distinct count measures will probably reduce the number of concurrent users per Analysis Services server and/or require more memory and CPUs to provide good performance. This “typical” Analysis Services reporting environment used in our configuration can handle approximately 60 concurrent users.
The actual number of servers that you require depends on your particular workload; therefore, querying and processing performance testing should be done in order to determine the actual number of servers required. Generally, you should use two Web servers per Analysis Services querying server to allow the Web servers to handle some of the load. To best determine the number of Analysis Services querying servers required, start by determining the number of concurrent users you expect to have. As previously noted, each Analysis Services server can handle about 30 concurrent users; our current environment can handle 60 concurrent users. If you expect 100 concurrent users, double the number of Analysis Services querying servers. It will also be necessary for you to double the number of Web servers to more evenly spread the query load.
Our architecture is dependent on a single Analysis Services processing server, which can easily handle the OLAP database size of 100 GB. For your environment, you must determine the hardware resources required to handle your OLAP cube. For example, if your reporting environment querying characteristics requires more memory on your Analysis Services processing server, it is necessary to increase the memory on your Analysis Services querying servers as well. First, determine your Analysis Services processing server size to define your hardware requirements, and then use these same requirements for your Analysis Services querying servers.
The primary limitation of this architecture is that you must update the connection string for the UI that is accessing your Analysis Services query server instances. Generally, this implies that you would want this to be controlled from your Web server rather than allowing customers to hardcode the Analysis Services query server instances (for example, using Microsoft Excel to connect directly to the Analysis Services query server instance). However, many server-based UIs that interact with Analysis Services will easily allow you to perform this connection string change.
As shown in Figure 2, the basic architecture for this system consists of four IIS servers, two reporting Analysis Services servers, one processing Analysis Services server, and the server running SQL Server that contains the relational data source.
Figure 2 Overview of the Analysis Services scale-out querying environment
As you can see from this diagram, the process overview involves:
OLAP cubes are processed on a dedicated Analysis Services Processing Server; this server is processing against a dedicated server running SQL Server.
Once the OLAP cubes are processed, the data is synchronized from the Analysis Services Processing Server to the idle Analysis Services Staging server instances (Analysis Services Server).
After synchronization is complete and the Analysis Services Staging server instances are ready to serve query results, the connection strings from the UI (in this case, the IIS servers) are updated to point to the previously idle Analysis Services instances.
After these connection strings are changed, new Analysis Services queries go to the new recently synchronized Analysis Services instances. The original Analysis Services instances will still be online to complete any of the original submitted queries and then will go idle upon completion.
Note In this hardware configuration, the SQL Server and Analysis Services servers use local disk. This design does not preclude the use of a Storage Area Network (SAN). We chose this configuration as an inexpensive robust solution. With multiple copies of the Analysis Services database, a failure in any of these Analysis Services servers can be mitigated when the query load is distributed out to the other Analysis Services query servers until the server can be fixed. If the processing server fails, the OLAP database can be quickly re-created by copying over an existing Analysis Services query server instance, and processing can be restarted from this point.
The details for each stage from processing to post-synchronization are covered in this section. As previously noted, we use the ratio of four Web servers, two Analysis Services query servers (each with two Analysis Services instances), one Analysis Services processing server, and one server running SQL Server. While this is a good starting point, the overall number and ratio of servers are best determined by query and processing performance testing for your specific environment.
The initial state can be seen diagrammatically in the following figure.
Figure 3 Analysis Services scale-out querying initial state
For this stage, all four Web servers are publishing Analysis Services result sets that were generated from the two Analysis Services servers. Note that all of these queries are delivered by the only two instances (AS1 and AS3 in Figure 3) of the Analysis Services query servers. ROLAP queries go through these two instances directly to SQL Server. Meanwhile, the other two instances (AS2 and AS4) are idle.
Any time throughout the day, the Analysis Services Processing server will perform the task of processing its OLAP database against the server that is running SQL Server. There may be some potential resource constraints against the SQL Server server to handle simultaneous ROLAP queries and Analysis Services processing queries (the queries sent to the server running SQL Server from the Analysis Services processing server for the purpose of OLAP dB processing). But, these queries do not lock the underlying relational data source. The impact of these conflicts should be minimal and may result in slightly slower return of SQL queries, if there is any impact at all.
After processing on the Analysis Services processing server is finished, set up your system so that it will synchronize this OLAP database with the existing older copies of the same OLAP databases located on the idle Analysis Services server instances (AS2, AS4). The set up for our configuration is shown in Figure 4.
Figure 4 Analysis Services scale-out querying synchronization state
Because the synchronization is occurring on the previously idle instances, none of the current queries are impacted heavily. There will be some slight resource conflicts to allow the transfer of the data from the Analysis Services processing server to the Analysis Services query server instances. But the impact is similar to a copying files across the network, which is the primary (but not the entire) functionality of the Analysis Services synchronize feature. Meanwhile, long-running queries on AS1 and AS3 can continue to run with little effect.
Analysis Services server sync
The default choice for synchronization is to use Analysis Services synchronization, which is quite robust and can handle large volumes of data. For more information, see SQL Server Books Online, particularly Synchronizing Analysis Services Databases. As well, to automate this functionality, you can use SQL Server Integration Services (SSIS) or the ascmd.exe sample tool to execute an XMLA script by using the Synchronize Element. The ascmd.exe tool is included in the SQL Server 2005 samples at SQL Server 2005 Samples and Sample Databases on the Microsoft Download Center.
Custom multi-server parallel server sync
In this particular architecture, we want to perform multiple server synchronization tasks in parallel. This design is also applicable to cases where there is frequent synchronization to non-robust networks. The idea is to use a fast copy utility, such as Robocopy, to copy the changed or added (delta) files within the Analysis Services data folder. A limiting factor is that you must copy all of the files within this data folder, which means that you will be copying all of the OLAP databases on the processing server. Therefore, we suggest that when using this architecture you limit the number of OLAP databases on the processing server to only the ones that you will want to distribute to the querying servers.
The steps involve creating a SSIS package that shuts down the Analysis Services idle instances on the query servers. This ensures that any dropped, uncompleted, or long-running queries will not interfere with this synchronization process.
Note You can write your own .NET code to shut down or restart the Analysis Services instances or you can use command-line utilities such as sc.exe and netsvc.exe to control services remotely. (For more information, see Using Sc.exe and Netsvc.exe to Control Services Remotely.)
Figure 5 Example of performing four parallel custom server synchronizations
With the idle instances shut down, the SSIS package can execute multiple threads of Robocopy to copy the entire Analysis Services data folder (by default, this is C:\Program Files\Microsoft SQL Server\MSSQL.2\OLAP\Data) from your Analysis Services processing server to the idle instance data folders of the query servers. We use Robocopy because by default it will copy only the newly added or updated files from the processing server to the query server. (Robocopy is included in Windows Vista™ and you can also find it within the Windows Server 2003 Resource Kit Tools.) Once this is completed, the Analysis Services idle instances can now be restarted and will now be updated with the latest aggregates.
Change Connection Strings
When database synchronization finishes, update the connection string of your Web servers (or UI) to point to the previously idle Analysis Services instances (AS2, AS4).
Figure 6 Analysis Services scale-out querying: changing connection string state
As you are most likely doing this in an automated fashion, you can use (for example), an SSIS package to perform the connection string change after it executes the XMLA to synchronize the databases. Meanwhile, the previous instances (AS1, AS3) continue running their previously requested queries. Any new queries are sent to the new instances.
Back to the Initial State
After the connection strings have been changed, the active Analysis Services query server instances are AS2 and AS4. Once the older queries are completed, the AS1 and AS3 instances are idle and ready for the next synchronization events. That is, now we are back to the initial state as before (Figure 3 in the Initial State section) except that AS1 and AS3 are now the idle instances and AS2 and AS4 are the active instances.
Figure 7: Back to the initial Analysis Services scale-out querying state (AS1 and AS3 are now idle)
While there are two Analysis Services instances on a single Analysis Services query server, the vast majority of the queries go to only one instance at any period of time. Therefore, the two Analysis Services instances have minimal impact on each other.
This load-balanced query architecture provides the ability to distribute out concurrent queries across multiple servers to keep fast query performance. It also separates the query and processing tasks for Analysis Services so that they do not impact each other. In addition to removing task conflicts, this architecture means that the individual servers can be configured for specific tasks. Therefore, cube processing becomes more robust and transparent to end users. Due to the multiple Analysis Services instances on the query servers, you can perform more frequent updates of the cubes if necessary. With multiple Analysis Services query servers, you can balance the query load among Analysis Services servers to provide faster results.