Serverless architectures, like the one behind Azure Synapse Serverless SQL pools, do not keep dedicated resources running persistently. Instead, resources are allocated dynamically on demand. When a query runs after a period of inactivity, the system first has to allocate and start up compute and memory. This initialization, often called a "cold start", adds overhead to the first query execution.
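To confirm that the delay you see is a cold start and not something in the query itself, it can help to time the same query several times back to back and compare the first run with the rest. The following is a minimal sketch: `run_query` is a hypothetical callable that you would wrap around your own client call, and `measure_latencies` and `cold_start_overhead` are illustrative names, not part of any Azure SDK.

```python
import time
from typing import Callable, List


def measure_latencies(
    run_query: Callable[[], object],
    runs: int = 3,
    clock: Callable[[], float] = time.perf_counter,
) -> List[float]:
    """Run the same query `runs` times back to back; return each duration in seconds."""
    durations = []
    for _ in range(runs):
        start = clock()
        run_query()  # hypothetical: wrap your own client call here
        durations.append(clock() - start)
    return durations


def cold_start_overhead(durations: List[float]) -> float:
    """Estimate overhead as first-run time minus the median of the remaining runs."""
    if len(durations) < 2:
        raise ValueError("need at least two runs to compare")
    rest = sorted(durations[1:])
    mid = len(rest) // 2
    median = rest[mid] if len(rest) % 2 else (rest[mid - 1] + rest[mid]) / 2
    return durations[0] - median
```

If the first duration is consistently much larger than the later ones after an idle period, and the gap disappears when the queries run close together, that pattern is consistent with a cold start.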
Since your query reads data stored in Delta format on Azure Data Lake, the first query also pays for opening connections to the Data Lake and for retrieving the metadata needed to interpret the Delta files. Delta Lake maintains a transaction log that tracks every change to the dataset, and reading that log to construct a consistent view of the data for the first time can be time-consuming.
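To give a feel for what "constructing a consistent view" from the transaction log involves, here is a simplified sketch that replays the JSON commit files in a table's `_delta_log` directory and works out which data files are currently live. It is only an illustration of the idea, not how Synapse does it: it ignores Parquet checkpoint files, so it assumes the full JSON commit history is still present, and `active_files` is a name made up for this example.

```python
import json
from pathlib import Path
from typing import Set


def active_files(table_root: str) -> Set[str]:
    """Replay JSON commits in _delta_log in version order; return live data file paths.

    Simplification: checkpoint files are ignored, so this only works when the
    full JSON commit history is still present.
    """
    log_dir = Path(table_root) / "_delta_log"
    files: Set[str] = set()
    # Commit file names are zero-padded version numbers, so name order is version order.
    commits = sorted(p for p in log_dir.glob("*.json") if p.stem.isdigit())
    for commit in commits:
        with commit.open() as fh:
            for line in fh:
                if not line.strip():
                    continue
                action = json.loads(line)  # one action object per line
                if "add" in action:
                    files.add(action["add"]["path"])
                elif "remove" in action:
                    files.discard(action["remove"]["path"])
    return files
```

Even in this stripped-down form, every commit has to be listed, fetched, and parsed before a single data file is read, which is why a long log on remote storage adds noticeable latency to the first query.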
The Azure Synapse Serverless SQL pool also likely caches metadata for the files and datasets it accesses. After the first query, metadata such as file locations, schema information, and statistics might be cached, which speeds up subsequent queries. If the system then sits idle for a prolonged period (around 5 minutes in your case), these caches might be cleared, so the initialization and metadata retrieval have to happen again on the next query.
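The behaviour described above can be pictured with a toy cache whose entries expire after a period without access. This is purely a mental model, not Azure's implementation: `IdleExpiringCache` is a hypothetical class, and the 300-second timeout in the usage note is taken from the roughly 5 minutes you observed, not from any documented setting.

```python
import time
from typing import Callable, Dict, Tuple


class IdleExpiringCache:
    """Toy cache whose entries expire after `idle_seconds` without access."""

    def __init__(self, idle_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.idle_seconds = idle_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[object, float]] = {}

    def get(self, key: str, load: Callable[[], object]) -> Tuple[object, bool]:
        """Return (value, was_hit). On a miss, call `load` (the slow path)."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] <= self.idle_seconds:
            value, hit = entry[0], True
        else:
            value, hit = load(), False  # expired or never loaded: pay the full cost
        self._entries[key] = (value, now)  # every access resets the idle timer
        return value, hit
```

With `IdleExpiringCache(idle_seconds=300)`, a lookup one minute after the previous one is a hit, while a lookup after more than five idle minutes is a miss and triggers `load` again, which mirrors the fast-then-slow pattern you are seeing.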
As for extending the idle time before the next "cold start", this is generally controlled by Azure and depends on its internal management and optimization algorithms. Serverless platforms prioritize efficient resource utilization, and keeping idle resources alive is costly for the provider. As a result, Azure does not typically give users direct control over how long resources or caches are retained while idle in serverless environments.
If you'd like to avoid these delays, you might consider using a dedicated (provisioned) SQL pool within Azure Synapse, where resources are reserved for you and stay allocated for as long as the pool is running. You give up the pay-per-use flexibility of serverless, but you get more predictable performance in return.