I am seeing some very strange behaviour in a two-tier application with a SQL Server 2017 backend. The max server memory setting is the default of 2147483647 MB, which seems a crazy value, but as I understand it this version of SQL Server should manage memory dynamically. We have recently had a performance problem where update transactions from users block for an unusually long time: 30 seconds to 2 minutes, where previously they took perhaps 5 seconds.
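For reference, this is how I am reading the current setting (via sys.configurations, so there is no need to toggle 'show advanced options'):

    SELECT name, value, value_in_use
    FROM sys.configurations
    WHERE name = N'max server memory (MB)';

value_in_use comes back as 2147483647, i.e. the default.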
I have been collecting high-resolution metrics on the system and cannot spot anything obvious. However, the SQLServer:Memory Manager\Free Memory (KB) counter is swinging up and down every few minutes, from a peak of about 4 GB down to a trough of around 120 MB. The system has about 768 GB of memory and 40 cores. It has been difficult, but some direct observations of the slow saves do correlate with the troughs in free memory. The system does not appear to be hard faulting (Page Reads/sec stays low), but Page Faults/sec continually toggles between tens (i.e. really low) and thousands every 5 seconds or so.
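In case it helps anyone reproduce my numbers, the same counters can also be sampled from inside SQL Server with something like the following (standard sys.dm_os_performance_counters DMV; the LIKE on object_name is just to cover named instances):

    SELECT [object_name], counter_name, cntr_value
    FROM sys.dm_os_performance_counters
    WHERE [object_name] LIKE N'%Memory Manager%'
      AND counter_name IN (N'Free Memory (KB)',
                           N'Total Server Memory (KB)',
                           N'Target Server Memory (KB)');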
My questions are:

1. Does max server memory actually need to be set, and if so, what would be a sensible value for this server? (The syntax I assume I would use is sketched just below this list.)
2. Are there any other metrics I could look at to confirm or rule out a memory management problem? (I have also sketched a memory-clerk query at the end of this post that I am planning to try.)
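If the answer to question 1 is yes, I assume the change itself is just the standard sp_configure call, something like this (the 720 GB figure is only a placeholder to show the syntax, not a value I have settled on):

    EXEC sys.sp_configure N'show advanced options', 1;
    RECONFIGURE;
    -- Placeholder cap of 720 GB (737280 MB), leaving headroom for the OS and other processes
    EXEC sys.sp_configure N'max server memory (MB)', 737280;
    RECONFIGURE;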
I have attached an image of the free-memory trace (converted from KB to MB). The circle marks the point where a user reported slow performance.
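Regarding question 2: one thing I am planning to try during the next slowdown is a breakdown of memory by clerk, to see what is consuming the memory as the free counter bottoms out. Something like:

    SELECT TOP (10)
           [type],
           SUM(pages_kb) / 1024 AS size_mb
    FROM sys.dm_os_memory_clerks
    GROUP BY [type]
    ORDER BY size_mb DESC;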