Azure SQL Database large tables performance issue

Arundhati Sen 151 Reputation points
2024-03-19T11:58:45.5833333+00:00

In my Azure SQL Database, there is a considerably large table, let's call it 'tableA'. My application runs several distinct queries (approximately 10) that involve different sets of columns from this table. For example, 'query1' uses columns 'col1', 'col2', and 'col3'; 'query2' uses 'col2', 'col3', and 'col4'; 'query3' uses 'col7', 'col8', and 'col9'; and so forth.

There are almost 50 such tables with the same issue as 'tableA'.

Without indexes for these individual queries, I observe a significant drop in read performance. On the flip side, when I create a separate index for each of these SELECT queries, the insert speed on the table declines drastically.

That said, I am struggling to strike an optimal balance between read and write performance. Could anyone advise on industry best practices for managing such a scenario? What would be the ideal indexing strategy to boost overall efficiency without excessively compromising insert speed? Any guidance on navigating this situation would be greatly appreciated.

Tags: Azure SQL Database, SQL Server, Transact-SQL

4 answers

  1. Amira Bedhiafi 16,951 Reputation points
    2024-03-19T16:15:37.31+00:00

    I think that your situation is compounded by the fact that you have multiple distinct queries accessing different column sets across a large number of tables.

    So instead of creating separate indexes for each query, consider creating indexes that cover multiple queries. This means identifying columns that are frequently accessed together and indexing those. For instance, if col2 and col3 are commonly queried together, a single index on both could serve multiple queries.
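
    A minimal sketch, reusing the illustrative column names from the question (the right key order and INCLUDE list depend on your actual predicates):

        -- One composite index that can serve both query1 (col1, col2, col3)
        -- and query2 (col2, col3, col4): key on the shared columns, with the
        -- remaining columns carried as included (leaf-level only) columns.
        CREATE NONCLUSTERED INDEX IX_tableA_col2_col3
            ON dbo.tableA (col2, col3)
            INCLUDE (col1, col4);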

    For queries that target a specific subset of data, filtered indexes can be extremely efficient. These indexes are smaller and less expensive to maintain because they only include rows that meet certain criteria. If your queries have WHERE clauses that filter rows on the same or similar conditions, filtered indexes might be a good choice.
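
    For example, assuming a hypothetical is_active flag that most reads filter on:

        -- Index only the rows the queries actually touch; the index is
        -- smaller and cheaper to maintain than a full-table index.
        CREATE NONCLUSTERED INDEX IX_tableA_col3_active
            ON dbo.tableA (col3)
            WHERE is_active = 1;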

    If your workload involves a lot of aggregate functions or scanning large numbers of rows, columnstore indexes can significantly improve performance. These indexes are designed to be highly efficient for read-heavy operations, particularly for analytics workloads. However, they might not be suitable for all scenarios, especially where transactional updates and deletes are frequent.
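
    A sketch of a nonclustered columnstore index for such analytic scans (the column list is illustrative):

        -- Columnstore for scan/aggregate-heavy reads; the underlying
        -- rowstore table keeps serving the transactional workload.
        CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_tableA
            ON dbo.tableA (col1, col2, col3, col4);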

    Having many indexes can lead to increased overhead during data modification operations. Implement a robust index maintenance strategy that includes regular reviews of index usage and fragmentation. Dropping unused or less frequently used indexes and reorganizing or rebuilding fragmented indexes can help in maintaining optimal performance.
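
    The index usage DMVs can surface candidates for removal, for example:

        -- Nonclustered indexes with many writes but few reads since the last
        -- restart are review candidates (usage stats reset on restart).
        SELECT OBJECT_NAME(s.object_id) AS table_name,
               i.name AS index_name,
               s.user_seeks + s.user_scans + s.user_lookups AS reads,
               s.user_updates AS writes
        FROM sys.dm_db_index_usage_stats AS s
        JOIN sys.indexes AS i
            ON i.object_id = s.object_id AND i.index_id = s.index_id
        WHERE s.database_id = DB_ID()
          AND i.type_desc = 'NONCLUSTERED'
        ORDER BY writes DESC;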

    For tables with very high transaction rates, consider using In-Memory OLTP features. Tables can be moved into memory-optimized tables, and you can also use natively compiled stored procedures for accessing these tables. This can significantly reduce latency for both reads and writes, although there are some limitations and considerations for using in-memory OLTP.
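
    A minimal sketch of a memory-optimized table (available in Azure SQL Database Premium/Business Critical tiers; table and column names are illustrative):

        CREATE TABLE dbo.tableA_hot
        (
            id   INT IDENTITY(1,1) NOT NULL PRIMARY KEY NONCLUSTERED,
            col1 INT NOT NULL,
            col2 INT NOT NULL
        )
        WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);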

    Table partitioning can help manage large tables by splitting them into smaller, more manageable pieces, while still treating them as a single logical entity. This can improve performance for certain types of queries and also make maintenance tasks like indexing more efficient.
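
    A sketch of range partitioning on a hypothetical created_date column (the boundary values and the existing clustered index name are assumptions):

        CREATE PARTITION FUNCTION pf_byYear (date)
            AS RANGE RIGHT FOR VALUES ('2023-01-01', '2024-01-01');

        CREATE PARTITION SCHEME ps_byYear
            AS PARTITION pf_byYear ALL TO ([PRIMARY]);

        -- Rebuild the existing clustered index on the scheme to partition
        -- the table in place.
        CREATE CLUSTERED INDEX CIX_tableA
            ON dbo.tableA (created_date)
            WITH (DROP_EXISTING = ON)
            ON ps_byYear (created_date);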


  2. hossein jalilian 4,385 Reputation points
    2024-03-19T22:36:31.16+00:00

    Thanks for posting your question in the Microsoft Q&A forum.

    I have several recommendations regarding this matter:

    1. Partitioning: Divide large tables into smaller, manageable partitions based on specific data ranges to improve query performance and reduce data scanning.
    2. Stored Procedures (SP): Use stored procedures to pre-compile and cache execution plans, enhancing query performance, code reusability, and security (see the sketch after this list).
    3. Smart Indexing: Implement targeted indexing strategies tailored to query patterns and frequently accessed columns to optimize query execution.
    4. Caching: Introduce caching mechanisms, such as a helper service or in-memory cache, to reduce database accesses and minimize latency, enhancing overall system performance.
    5. Command Query Responsibility Segregation (CQRS): Separate command and query operations into distinct databases or data models to optimize each operation independently, promoting scalability, performance, and flexibility.
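
    As a sketch of item 2, a parameterized procedure whose plan is compiled once and reused (table and column names are the illustrative ones from the question):

        CREATE OR ALTER PROCEDURE dbo.usp_GetTableA_ByCol2
            @col2 INT
        AS
        BEGIN
            SET NOCOUNT ON;
            -- Parameterized predicate, so the cached plan is reusable.
            SELECT col1, col2, col3
            FROM dbo.tableA
            WHERE col2 = @col2;
        END;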

    Please don't forget to close out the thread by upvoting and accepting this as the answer if it is helpful.


  3. Deleted

    This answer has been deleted due to a violation of the Code of Conduct.

  4. Erland Sommarskog 102.5K Reputation points
    2024-03-27T22:45:29.02+00:00

    What is the pattern of INSERT and SELECT? Are you inserting constantly, or are you running nightly loads? In the latter case, you could disable the indexes before the load, and rebuild them after the load.
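
    A minimal sketch of that pattern (the index name is illustrative):

        -- Before the nightly load: disable nonclustered indexes so inserts
        -- don't have to maintain them. Never disable the clustered index;
        -- that makes the table inaccessible.
        ALTER INDEX IX_tableA_col2_col3 ON dbo.tableA DISABLE;

        -- ... run the load ...

        -- After the load: rebuilding re-enables the index.
        ALTER INDEX IX_tableA_col2_col3 ON dbo.tableA REBUILD;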
