The Real Cost of Indexes
[This article was contributed by the SQL Azure team.]
In a previous blog post I discussed creating covered indexes to increase performance and reduce I/O usage. Covered indexes are a type of non-clustered index that “covers” all the columns in your query to create better performance than a table scan of all the data. An observant reader pointed out that there is a financial cost associated with creating covered indexes. He was right, SQL Azure charges for the room required to store the covered index on our server. I thought it would be interesting to write a few Transact-SQL queries to determine the cost of my covered index; that is what I will be doing in this blog post.
Covered Indexes Consume Resources
There is no such thing as a free covered index; regardless of whether you are on SQL Server or SQL Azure covered indexes consume resources. With SQL Server, you pay some of the costs upfront when you purchase the machine to run SQL Server; by buying additional RAM and hard drive space for the index. There are other costs associated with an on-premise SQL Server, like warranty, backup, power, cooling, and maintenance. All of these are hard to calculate on a monthly basis, you need to depreciate the machines over time, and some expenses are variable and unplanned for, like hard drive failures. These are overall server operating costs, there is no way to drill down and determine the cost of a single covered index. If there was, wouldn’t it would be nice to run a Transact-SQL statement to compute the monthly cost of the index? With SQL Azure you can do just that.
SQL Azure Pricing
Currently, SQL Azure charges $9.99 per gigabyte of data per month (official pricing can be found here). That is the cost for the range in which the actually size of data you want to store falls, not the cap size of the database. In other words, if you are storing just a few megabytes on a 1 GB Web edition database, the cost is $9.99 a month. The ranges are 1, 5, 10, 20, 30, 40, and 50 gigabytes – the closer you are to those sizes that lower the cost per byte to store your data. Here is a Transact-SQL statement that will calculate the cost per byte.
DECLARE @SizeInBytes bigint SELECT @SizeInBytes = (SUM(reserved_page_count) * 8192) FROM sys.dm_db_partition_stats DECLARE @Edition sql_variant SELECT @Edition =DATABASEPROPERTYEX ( DB_Name() , 'Edition' ) SELECT (CASE WHEN @SizeInBytes/1073741824.0 < 1 THEN (CASE @Edition WHEN 'Web' THEN 9.99 ELSE 99.99 END) WHEN @SizeInBytes/1073741824.0 < 5 THEN (CASE @Edition WHEN 'Web' THEN 49.95 ELSE 99.99 END) WHEN @SizeInBytes/1073741824.0 < 10 THEN 99.99 WHEN @SizeInBytes/1073741824.0 < 20 THEN 199.98 WHEN @SizeInBytes/1073741824.0 < 30 THEN 299.97 WHEN @SizeInBytes/1073741824.0 < 40 THEN 399.96 WHEN @SizeInBytes/1073741824.0 < 50 THEN 499.95 END) / @SizeInBytes
The Cost of Covered Indexes
Now that we know our true cost per byte, let’s figure out what each covered index costs. Note that the Transact-SQL can’t tell which indexes are covered indexes from all the non-clustered indexes, only the index creator will know why the index was created.
Here is some Transact-SQL to get our cost per month for each non-clustered index.
DECLARE @SizeInBytes bigint SELECT @SizeInBytes = (SUM(reserved_page_count) * 8192) FROM sys.dm_db_partition_stats DECLARE @Edition sql_variant SELECT @Edition =DATABASEPROPERTYEX ( DB_Name() , 'Edition' ) DECLARE @CostPerByte float SELECT @CostPerByte = (CASE WHEN @SizeInBytes/1073741824.0 < 1 THEN (CASE @Edition WHEN 'Web' THEN 9.99 ELSE 99.99 END) WHEN @SizeInBytes/1073741824.0 < 5 THEN (CASE @Edition WHEN 'Web' THEN 49.95 ELSE 99.99 END) WHEN @SizeInBytes/1073741824.0 < 10 THEN 99.99 WHEN @SizeInBytes/1073741824.0 < 20 THEN 199.98 WHEN @SizeInBytes/1073741824.0 < 30 THEN 299.97 WHEN @SizeInBytes/1073741824.0 < 40 THEN 399.96 WHEN @SizeInBytes/1073741824.0 < 50 THEN 499.95 END) / @SizeInBytes
SELECT idx.name, SUM(reserved_page_count) * 8192 'bytes', (SUM(reserved_page_count) * 8192) * @CostPerByte 'cost' FROM sys.dm_db_partition_stats AS ps INNER JOIN sys.indexes AS idx ON idx.object_id = ps.object_id AND idx.index_id = ps.index_id WHERE type_desc = 'NONCLUSTERED' GROUP BY idx.name ORDER BY 3 DESC
The results of my Adventure Works database look like this:
The most expense index is: IX_Address_AddressLine1_AddressLine2_City_StateProvince_PostalCode_CountryRegion which is a covered index to increase the lookup time for the address table. It costs me almost 50 cents per month. With this information, I can make a cost comparison against the performance benefits of having this covered index and determine if I want to keep it.
Keeping It All in Perspective
If I delete my IX_Address_AddressLine1_AddressLine2_City_StateProvince_PostalCode_CountryRegion index does it save me money? Not if it doesn’t change the range I fall into. Because the Adventure Works database is less than 1 Gigabyte, I pay $9.99 per month; as long as I don’t go over the top side of the range, creating another clustered index doesn’t cost me anymore. If you look at it that way, creating a covered index that doesn’t completely fill your database is basically free. Once you are committed to spending the money for the range you need for your data, you should create as many covered indexes (that increase performance) as will fit inside the maximum size.
This is basically the same as an on-premise SQL Server, once you have committed to spending the money for the server, adding additional clustered indexes within the machines resources doesn’t cost you anymore.
The prices for SQL Azure and the maximum database sizes can change in the future, make sure to compare the current costs against the queries provided to make sure your index costs are accurate.
Do you have a better way to accomplish the same thing? Post it in the comments below. Do you have questions, concerns, comments? Post them below and we will try to address them.