
Marcin-3613 asked

Really slow DISCOVER_CSDL_METADATA in SQL Server Profiler

Our setup is SQL Server 2019 Standard Edition + SSAS + multidimensional OLAP + PowerBI on top of it.

A recent issue we've noticed is that when a user loads a Power BI report, the aforementioned event occasionally fires. This leads to a really long wait until the report loads (the event takes up to 45-60 seconds).

Our cube has ~20 dimensions and ~50 measures, 2-3 million rows in 3 partitions, MOLAP storage.

What can we do about it? How can we debug it? We don't have SSAS experts on board, and googling this event didn't help much. Where should we look for the cause of this behavior?

sql-server-analysis-services

DarrenGosbell-0123 answered

How about warming up the cache? Is it a good approach to capture the most popular queries with SQL Server Profiler and run them after each data update?

If you can confirm that caching is the cause of this issue then warming up the cache after processing by running some of the more popular queries would be a good idea.
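
As a sketch of what that warm-up automation could look like (the file layout, connection string, and the pyadomd client below are assumptions, not part of this thread; any client that can send MDX to SSAS, such as Invoke-ASCmd or an SSIS script task, works the same way):

```python
# Sketch: replay the most popular MDX queries (captured with SQL Server
# Profiler, one query per .mdx file) after each cube process, so the first
# real user doesn't pay the cold-cache cost.
from pathlib import Path


def load_warmup_queries(folder: str) -> list[str]:
    """Read one MDX query per *.mdx file, in file-name order; skip empty files."""
    texts = (p.read_text(encoding="utf-8").strip()
             for p in sorted(Path(folder).glob("*.mdx")))
    return [t for t in texts if t]


def warm_cache(conn_str: str, queries: list[str]) -> None:
    """Run each query and drain the results so SSAS populates its caches."""
    # pyadomd is a third-party ADOMD.NET wrapper (an assumption here);
    # imported lazily so the pure file-loading helper works without it.
    from pyadomd import Pyadomd

    with Pyadomd(conn_str) as conn:
        for q in queries:
            with conn.cursor().execute(q) as cur:
                cur.fetchall()


# Example usage (placeholder names):
#   qs = load_warmup_queries(r"C:\olap\warmup")
#   warm_cache("Provider=MSOLAP;Data Source=localhost;Catalog=OurCatalog;", qs)
```

Capturing the Query Begin/End events in Profiler during a busy period is one way to collect the candidate .mdx files.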


What about the proactive caching options that are available in Enterprise Edition?

That will not help. That feature is poorly named; it is actually more like an automatic processing option and will not fix your issue if it is cache-related.

Or having more than 3 partitions? We are running Standard Edition now and wonder whether changing the license would solve any of our problems.

You could test this with your current 3 partitions by just re-processing one partition and seeing if the DISCOVER call is any faster than when you process all 3. But from memory I think the MOLAP caches are managed at the cube level, not at the partition level, so I'm not sure this will make any difference.


Thank you for your time. This entire technology stack is really sophisticated and we are self-taught, so sometimes it's really hard for us to understand what's going on behind the scenes. This discussion was very helpful; it brought many meaningful insights and also gave us some bitter answers.

I believe that right now we are doing a full process on only one partition - the one with current data. It's still not entirely clear to us what exactly happens if we change the processing options, or what the best approach would be, considering incremental updates of fact tables and not-so-frequent updates of dimensions.

If I may ask another question - what is the by-the-book approach to serving live data from OLAP cubes? Is the tool even meant to provide users with live data? Or have we chosen the wrong solution for providing our key stakeholders with near-live data?

LukasYu-msft answered

Does this happen each time the SSAS server is started? Or are you aware of what triggers the long data retrieval?

What is the normal time for DISCOVER_CSDL_METADATA shown in your Profiler trace?


Marcin-3613 answered

SSAS runs continuously. The cubes are refreshed every 30 minutes and fully reprocessed once before the workday starts.
DISCOVER_CSDL_METADATA usually takes around 40-90 seconds (this is our "normal" time), several times a day, for every user.

It kind of looks like some cache is being dropped, so as time passes the event has to fire again.

I am also concerned about our SSAS RAM/CPU usage - it is actually really low, even when it is processing cubes. SSAS is the only application on the server, and it uses 3 out of 256 GB of RAM and ~5% of the CPU.
We have 6 cubes with around 300-400 GB of data behind them and several million rows.


dgosbell answered

It kind of looks like some cache is being dropped, so as time passes the event has to fire again.

Each time you process the cube, all the caches are flushed, but normally only the first query after a refresh would be affected. Are you using dynamic security roles (i.e. security roles referencing the USERNAME() function), or do you have a large number of roles? Typically the global cache storing the metadata can be shared across all the users in a role, but if you have lots of roles or dynamic security, that could account for why multiple users see this slow performance.

Processing every half hour probably also accounts for the low RAM usage. Multi-dim populates its memory caches based on usage, but when you re-process, all the caches are flushed.

What sort of disk subsystem are you using? Putting the data onto fast SSD drives might help here. You could try running Performance Monitor on the server to see which resource is causing the bottleneck during the DISCOVER_CSDL_METADATA call (my guess would be that the disks are more likely to be the issue than the CPU or network).
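
If it helps, that Performance Monitor collection can also be scripted with typeperf, e.g. `typeperf -cf counters.txt -si 5 -o ssas_perf.csv`. A counter file with standard Windows counters might look like the fragment below; SSAS also exposes its own counter objects, but their object name varies by version and instance, so browse Performance Monitor for the MSAS/MSOLAP objects rather than copying a name from here:

```text
\PhysicalDisk(_Total)\Avg. Disk sec/Read
\PhysicalDisk(_Total)\Avg. Disk sec/Write
\Processor(_Total)\% Processor Time
\Memory\Available MBytes
```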


Marcin-3613 answered

I'll try my best to answer your questions precisely. Thank you for your detailed input!

I believe we're not using any dynamic security functions. The only roles we use are defined inside the cube and split users into readers and admins based on Active Directory groups. Overall we have defined 4 roles, each with 6-30 members.

We were aware that multi-dim caches its results; we've also designed usage-based aggregations. How can we keep our data up to date without dropping this cache?

We are using a Dell Express Flash PM1725B 1.6 TB AIC, so as far as I understand we are good here in terms of disk and data storage.

As for Performance Monitor - we are waiting for our DBA, as we can't set it up properly due to insufficient privileges; we will get back to you if we find something.


dgosbell answered

How can we keep our data up to date without dropping this cache?

You can't. The cache is derived from your data; if you change the data, the cache no longer matches it.

I have not seriously used Power BI over multi-dim in years, so I'm speculating a bit here. But another possibility is that the CSDL contains some statistics about the data (e.g. min and max values for each column). These values are trivial to calculate in tabular but would be more expensive in multi-dim. It would be interesting to see whether the slow responses to DISCOVER_CSDL_METADATA correlate with the first query after a data refresh. If so, one option might be to warm the cache by adding a call to DISCOVER_CSDL_METADATA after each data refresh completes, or possibly even running a small set of common queries to warm up the data caches as well.
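
A small helper for building that warm-up call might look like this (a sketch; the function name and placeholder catalog/cube names are mine, and the restricted-schema DMV text is the same shape a Profiler trace shows for the metadata request):

```python
def csdl_warmup_query(catalog: str, perspective: str, version: str = "2.0") -> str:
    """Return the SYSTEMRESTRICTSCHEMA query over Discover_csdl_metadata
    that can be fired right after processing to pre-populate the CSDL cache."""
    return (
        "SELECT * FROM SYSTEMRESTRICTSCHEMA ($System.Discover_csdl_metadata, "
        f"[CATALOG_NAME] = '{catalog}', "
        f"[PERSPECTIVE_NAME] = '{perspective}', "
        f"[VERSION] = '{version}')"
    )


# Example (placeholder names): csdl_warmup_query("OurCatalog", "OurCube")
```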

You might also be able to get more information about what is going on by running a Profiler trace. If you looked at the command begin/end and discover begin/end events and some of the progress events, you might get some insight into what SSAS is doing during these 40-60 seconds, which may help identify areas to focus on (but this probably needs help from your DBA, as you would need admin rights to run these traces).


Marcin-3613 answered

Again - thank you for your answer.

Warming up the cache is on our bucket list; however, most of the materials on that topic are gone by the time I google them. I believe the concept is to create some kind of SSIS package, executed after the one that processes the cube once the data has been updated.

Another slow event is MDSCHEMA_PROPERTIES. I'll take my chances and guess that this one is also strongly related to refreshing the cube and dropping the cache, eh?

I'll try to have a look at some traces, at least now we probably know what we should look for, thank you!


Did you make any progress or get any results from this?


Hey,

What we've managed to do so far - as we don't have much time on our hands right now - is to warm up the cube's cache by running these queries from an SSIS package after every process:

select *
from $system.mdschema_properties;

select *
from SYSTEMRESTRICTSCHEMA ($System.Discover_csdl_metadata,
    [CATALOG_NAME] = 'OUR CATALOG NAME',
    [PERSPECTIVE_NAME] = 'OUR CUBE NAME',
    [VERSION] = '2.0');

And it feels like it helped a little. We still wonder how to configure the data updates so that our users can work with up-to-date data without the cache being dropped each time we process. But we lack knowledge about the impact of processing measures/dimensions on the OLAP cache.

dgosbell answered

We still wonder how to configure the data updates so that our users can work with up-to-date data without the cache being dropped each time we process.

You cannot do this. The cache is based on the data. If you update the data you need to rebuild the cache.


So we can either update our data, say, once a day and have the cache rebuilt in the morning when users begin to use the tool, or keep it up to date with users constantly hitting uncached data?

How about warming up the cache? Is it a good approach to capture the most popular queries with SQL Server Profiler and run them after each data update?

What about the proactive caching options available in Enterprise Edition? Or having more than 3 partitions? We are running Standard Edition now and wonder whether changing the license would solve any of our problems.
