Physical Database Storage Design and Disk Allocation

The system page file should not be on the same physical spindle as any database files, otherwise you will not get optimum performance from the drive as its disk head will be jumping between the page file and the database files. Ditto for placement of log files and tempdb. Ideally you’d have page file, data files, log files, and tempdb on 4 separate physical spindles.


A very good whie paper on Database Design and Disk Allocation:


In addition above, following are some conversations I had with various experts on this topic:


My Question:

How should SQL database storage configured if it involves SAN? I have heard different answers:

1. Similar to what I have mentioned in the begining of this blog – dedicated LUNs RAIDED 1+0 each for tempdb, tranlogs, data files, and system database (this could go on a mirrored local disks as well – not all eggs in one basket)

2. With SAN - RAID doesn’t matter as it involves 100s of spindles with huge Non-volatile RAM cache with disk carved out off of shared LUNs which contains multiple spindles. You could get a huge benefit of that big cache for read/write activity and disk distributed over lots of spindles. In this case, the LUNs could be shared across multiple applications, databases, etc.

Out of two which one is recommended/best practice? I have worked with systems similar to #1 and I have seen them performing very well. I don’t have much exposure with large SAN. Is #2 true with large SAN? If SAN admin doesn’t let me have dedicated LUNs, is it worth to fight for it?

Expert's Answer - I:

There really isn’t a completely clear-cut answer to this, and honestly there are SAN experts whose careers are built on understanding how to optimize specific vendors’ SAN hardware.

That said, the generalities as they apply to SQL are:

#1 represents conventional wisdom in configuring SANs, and is carried over from the locally-attached disks. You want to isolate traffic and load to each of the classes of SQL storage so that they do not interfere with each other. Certainly whatever scheme you choose at a very minimum the LUNs dedicated to the log files must be backed by RAID 0+1 sets. The update penalty on RAID 5 is too high for logs which have very high update rates.

From there, it depends on your ratio of update to read IO, and cost considerations. If you can afford the storage, go RAID 0+1 across the board. On the other hand, if you have significant sets of data which are not updated often (historical data, catalog data, etc.) then RAID 5 is a good choice, as it uses less physical disk.

Most SAN implementations have the capability to subdivide the physical spindles into separate groups upon which your LUNs can be built, via one or two levels of indirection. Still, you can use this technique to isolate the IO load for various SQL workloads.

Finally, there is an emerging trend in the industry which says that the performance difference gained by this level of tuning does not justify the expense of managing it. They advocate a simpler approach in which all spindles are grouped in one large group, and the array migrates data as needed to balance load. This would be your #2 option. Even in this case, you may want to have separate LUNs for different databases for management purposes, unless you have a HUGE number of databases, as in a hosting environment.

So, the bottom line is that it depends on the capabilities of your SAN hardware and your need to tune that last increment of performance out of your IO subsystem. Balancing this against the cost of managing a more complex storage subsystem.

Expert's Answer - II:

we used the first approach because we were using relatively low-end SAN equipment with <56 spindles and <2GB of cache memory. in this case it was important to make sure that the LUNs were seperated by function so that there would not be contention and queuing.

in some SAN scenarios there is so much cache memory available and the bandwidth to the disc media is so high that contention and queuing are virtually impossible regardless of the LUN configuration. a high-end Hitachi SAN, for instance, could have up to 128GB of cache memory, and up to 1024 individual disc drives installed.