DavidBeavon-9445 avatar image
0 Votes"
DavidBeavon-9445 asked DavidBeavon-9445 answered

How do I get performance metrics for Azure SQL DB?

Is Microsoft deliberately hiding useful metrics for monitoring the performance of SQL? In the metrics section of the Azure portal, I can only find very vague and unhelpful numbers like "% data IO" and "% log IO". These are way too abstract and are essentially meaningless for our purposes.

I'd like to see the throughput rates in bytes/sec. And the number of IOPS. And the latencies on those IOPS.

Any help would be greatly appreciated. I am using single databases (provisioned) with vCore licensing. I realize that there is resource governance going on that is limiting my performance in terms of memory and I/O. However, without being able to review the performance characteristics of the currently provisioned database, it is impossible to determine how much of an upgrade we need to get to our desired goals.

I have read a number of articles, but none of them are helpful in telling me how to get to the throughput, IOPS, or latencies.

Any help would be appreciated.

5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

DavidBeavon-9445 avatar image
2 Votes"
DavidBeavon-9445 answered

It appears that Microsoft has made it really hard to discover the underlying details related to SQL IOPS in Azure. I think they are assuming that most of their customers just want to know how close they are to reaching the "% of tier" limit. The problem is that if you are under the limit and your application still seems slow then the "% of tier" is essentially meaningless for any troubleshooting purposes. For that matter, even if you are at 100% you probably need to complement that information with the absolute number of IOPS you are achieving, along with the absolute throughput rate (ie. byte rate for reads/writes).

Given that these hosted SQL databases have service tiers, the Azure limitations we run into are rarely due to hardware limitations but are due to the SQL resource governor. Because the governor is hyper-focused on "allowances" and "percentages-of-tier", that is probably the reason why our performance metrics are also being reported to us as percentages . The percentages are unhelpful, to say the least.

Say, for example, you are attempting to compare your cloud performance to how things worked on-premises. There is virtually no way to make a comparison because "100%" means something totally different in each context (or rather, it means virtually nothing in either context).

The good news is that I was able to open a case with Microsoft and they pointed me to some of the metrics I was looking for. The absolute IOPS and throughput rates are part of the resource governor's DMV. Here is the DMV that they pointed me to:

It is a bit obscure and I probably wouldn't have ever found it on my own. Thank you Microsoft support!

I haven't tested with both of them, but I think this DMV will work properly with SQL database, and SQL elastic pool. We are currently deploying our own solution to SQL elastic pool. Note that SQL managed-instances have DMV's that are far easier to use and provide better information, so a managed-instance customer would probably not use the DMV that I'm referring to here.

Below is a sample query, but I came up with this after playing for only an hour. You may want to tinker with it to suit your own needs. The fields within it are also a bit obscure. The fields are reported at intervals, by resource pool name. The pool names are a bit obsure as well.

     snapshot_time , 
     name as resource_governor_resource_pool_name,
     from sys.dm_resource_governor_resource_pools_history_ex 
     where name in ('SloSharedPool1', 'internal')
     order by snapshot_time desc

Hope this helps anyone else who needs actual IOP metrics rather than "% of tier". I'm excited to say that this Q&A question will probably be the best result in a google search for the DMV, ("dm_resource_governor_resource_pools_history_ex "). Let me know if I'm wrong!

5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

pituach avatar image
0 Votes"
pituach answered DavidBeavon-9445 commented

Good day,

The following document provides all you need regarding monitoring the Azure SQL Database:

· 1
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

There is nothing mentioned in there about how to get the actual throughput and IOPS and latencies. I had already reviewed that link and pasted it into my original question.

The "% of IO" metrics are fairly useless. You can hit IO limitations for a lot of reasons and the portal doesn't clarify anything... IE. The metrics available in the portal don't give you the underlying components that the percentages are calculating against. It reminds me of going into a retail store where there is a sale that gets you"50%" off everything... but you end up paying the same prices as you would at any other store...

If I want to get the actual underlying components (ie. throughput and IOPS and latencies) then should I open a support ticket with Microsoft? Would they have visibility on this stuff?

0 Votes 0 ·
DavidBeavon-9445 avatar image
0 Votes"
DavidBeavon-9445 answered

It appears that Azure SQL users who run "manage instance" may have the IO metrics. It looks like they can see the actual IOPS and throughput numbers for their databases (rather than the meaningless "percent of tier").

See this DMV for example:

  • io_request

  • io_byte_read

  • io_bytes_written

Is there some equivalent for the other Azure SQL databases? A percentage is essentially a ratio that is calculated from two other numbers. Surely Microsoft understands how silly it is to give us their "IO percentage" without its two underlying metrics. It almost seems like there is a deliberate effort to hide the details. Why would IOPS and throughput be provided for "managed instance" and not for Azure SQL Database? Is there a way to back into the underlying metrics? I think there is additional detail that is available for individual queries, but I'm more interested in being able to monitor the database workload as a whole.

5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.