SQL Server Big Data Clusters Configuration Properties

Applies to: SQL Server 2019 (15.x)

Important

The Microsoft SQL Server 2019 Big Data Clusters add-on will be retired. Support for SQL Server 2019 Big Data Clusters will end on February 28, 2025. All existing users of SQL Server 2019 with Software Assurance will be fully supported on the platform and the software will continue to be maintained through SQL Server cumulative updates until that time. For more information, see the announcement blog post and Big data options on the Microsoft SQL Server platform.

Big Data Clusters configuration settings can be defined at the following scopes: cluster, service, and resource. The hierarchy of the settings follows in this order as well, from highest to lowest. BDC components will take the value of the setting defined at the lowest scope. If the setting is not defined at a given scope, it will inherit the value from its higher parent scope. Below is a list of available settings for each component of BDC at the various scopes. You can also view configurable settings for your BDC using azdata.

Cluster-scope settings

You can configure the following settings at the cluster scope.

Setting Name Description Type Default Value Deployment-time Only
bdc.telemetry.customerFeedback Controls whether this cluster participates in the Customer Experience Improvement Program (CEIP) which sends product usage and diagnostic data to Microsoft. boolean true
bdc.resourcemanagement.enabled Controls whether or not this cluster applies memory and cpu requests and limits. Only applied to the spark-related containers within the gateway, nmnode, sparkhead, spark, storage, and zookeeper pods. bool false
bdc.resourcemanagement.cpuCoefficient If resource management is enabled, controls the coefficient scaling the base cpu request value for pods with mem/cpu requests/limits applied to them. Only applied to the spark-related containers within the gateway, nmnode, sparkhead, spark, storage, and zookeeper pods. float 1.0
bdc.resourcemanagement.memoryCoefficient If resource management is enabled, controls the coefficient scaling the base memory request value for pods with mem/cpu requests/limits applied to them. Only applied to the spark-related containers within the gateway, nmnode, sparkhead, spark, storage, and zookeeper pods. float 1.0

SQL service-scope settings

You can configure the following settings at the SQL service scope.

Setting Name Description Type Default Value Deployment-time Only
mssql.language.lcid Changes the SQL Server locale to any supported language identifier (LCID). int 1033

Spark service-scope settings

Visit the Apache Spark & Apache Hadoop configuration article to see all supported and unsupported settings.

HDFS service-scope settings

Visit the Apache Spark & Apache Hadoop configuration article to see all supported and unsupported settings.

Gateway service-scope settings

No gateway service-scope settings configurable. Configure settings at the gateway resource-scope.

App service-scope settings

None available

Master Pool resource-scope settings

Setting Name Description Type Default Value Deployment-time Only
mssql.licensing.pid SQL Server edition. string Developer
mssql.sqlagent.enabled Enables SQL Server agent. bool false
mssql.collation Changes the SQL Server collation to any of the supported collations. string SQL_Latin1_General_CP1_CI_AS true
hadr.enabled Boolean for enabling availability groups for SQL Server master pool. bool false true
hadr.leaseDurationInSeconds Lease expiration Timeout for HA Agent. int 30
hadr.externalLeasePollingEnabled Boolean for enabling external lease polling API. bool true true
mssql.telemetry.userRequestedLocalAuditDirectory Enables Sql Server Local Audit and lets user set the directory where the 'Local Audit' logs are created. The directory must be under '/var/opt/mssql/audit'. string
mssql.network.tlscert The absolute path to the certificate file that SQL Server uses for TLS. To use a self-signed certificate the file must be placed in the container before changing this path. See the SQL Server Big Data Clusters notebook for instructions to do so. string /var/run/secrets/certificates/sql/sql-certificate.pem
mssql.network.tlskey The absolute path to the private key file that SQL Server uses for TLS. To use a self-signed certificate the file must be placed in the container before changing this path. See the SQL Server Big Data Clusters notebook for instructions to do so. string /var/run/secrets/certificates/sql/sql-privatekey.pem
mssql.network.forceencryption If 1, then SQL Server forces all connections to be encrypted. By default, this option is 0. int 0
mssql.network.tlsprotocols A comma-separated list of which TLS protocols are allowed by SQL Server. SQL Server always attempts to negotiate the strongest allowed protocol. If a client does not support any allowed protocol, SQL Server rejects the connection attempt. For compatibility, all supported protocols are allowed by default (1.2, 1.1, 1.0). If your clients support TLS 1.2, Microsoft recommends allowing only TLS 1.2. string 1.2
mssql.network.tlsciphers Specifies which ciphers are allowed by SQL Server for TLS. This string must be formatted per OpenSSL's cipher list format. In general, you should not need to change this option. string ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA384

Storage Pool resource-scope settings

The storage pool consists of SQL, Spark, and HDFS components.

Available SQL configurations

Setting Name Description Type Default Value Deployment-time Only
mssql.degreeOfParallelism The number of processors employed to run a single statement, for each parallel plan execution, per SQL instance. int 0
mssql.maxServerMemory The maximum amount of memory (in megabytes) for a SQL Server process used by an instance of SQL Server. int 2147483647
mssql.minServerMemory The minimum amount of memory (in megabytes) for a SQL Server process used by an instance of SQL Server. int 0
mssql.processAffinityCpuRange Distributes SQL Server worker threads to each CPU within the specified range. CPUs outside the specified range won't have assigned threads. AUTO is the default value and specifies that no thread is assigned a CPU and can freely move among CPUs. Example: "1, 3-5, 6" assigns threads to CPUs 1, 3, 4, 5, and 6. string AUTO
mssql.storagePoolCacheSize Size (in megabytes) of the cache for each SQL instance in the Storage Pool. int 8
mssql.storagePoolMaxCacheSize Maximum size (in megabytes) of the cache for each SQL instance in the Storage Pool. int 16384
mssql.storagePoolCacheAutogrowth Autogrowth factor (in MBs) for the Storage Pool cache. int 256
mssql.tempdb.autogrowthPerDataFile Autogrowth (in MBs) for each TempDB data file. int 64
mssql.tempdb.autogrowthPerLogFile Autogrowth (in MBs) for each TempDB log file. int 64
mssql.tempdb.dataFileSize File size (in MBs) for each TempDB data file. int 8
mssql.tempdb.dataFileMaxSize Maximum file size (in MBs) for each TempDB data file. int 16777215
mssql.tempdb.logFileSize File size (in MBs) for each TempDB log file. int 8
mssql.tempdb.logFileMaxSize Maximum file size (in MBs) for each TempDB log file. int 2097151
mssql.tempdb.numberOfDataFiles Number of data files for TempDB. int 8
mssql.traceflags Enables or disables traceflags for the startup of the SQL Server service. Provide a space separated list of traceflags to apply. string 3614

Available Apache Spark and Hadoop configurations

Visit the Apache Spark & Apache Hadoop configuration article to see all supported and unsupported settings.

Data Pool resource-scope settings

Setting Name Description Type Default Value Deployment-time Only
mssql.degreeOfParallelism The number of processors employed to run a single statement, for each parallel plan execution, per SQL instance. int 0
mssql.maxServerMemory The maximum amount of memory (in megabytes) for a SQL Server process used by an instance of SQL Server. int 2147483647
mssql.minServerMemory The minimum amount of memory (in megabytes) for a SQL Server process used by an instance of SQL Server. int 0
mssql.processAffinityCpuRange Distributes SQL Server worker threads to each CPU within the specified range. CPUs outside the specified range won't have assigned threads. AUTO is the default value and specifies that no thread is assigned a CPU and can freely move among CPUs. Example: "1, 3-5, 6" assigns threads to CPUs 1, 3, 4, 5, and 6. string AUTO
mssql.tempdb.autogrowthPerDataFile Autogrowth (in MBs) for each TempDB data file. int 64
mssql.tempdb.autogrowthPerLogFile Autogrowth (in MBs) for each TempDB log file. int 64
mssql.tempdb.dataFileSize File size (in MBs) for each TempDB data file. int 8
mssql.tempdb.dataFileMaxSize Maximum file size (in MBs) for each TempDB data file. int 16777215
mssql.tempdb.logFileSize File size (in MBs) for each TempDB log file. int 8
mssql.tempdb.logFileMaxSize Maximum file size (in MBs) for each TempDB log file. int 2097151
mssql.tempdb.numberOfDataFiles Number of data files for TempDB. int 8
mssql.traceflags Enables or disables traceflags for the startup of the SQL Server service. Provide a space separated list of traceflags to apply. string 3614

Compute Pool resource-scope settings

Setting Name Description Type Default Value Deployment-time Only
mssql.degreeOfParallelism The number of processors employed to run a single statement, for each parallel plan execution, per SQL instance. int 0
mssql.maxServerMemory The maximum amount of memory (in megabytes) for a SQL Server process used by an instance of SQL Server. int 2147483647
mssql.minServerMemory The minimum amount of memory (in megabytes) for a SQL Server process used by an instance of SQL Server. int 0
mssql.processAffinityCpuRange Distributes SQL Server worker threads to each CPU within the specified range. CPUs outside the specified range won't have assigned threads. AUTO is the default value and specifies that no thread is assigned a CPU and can freely move among CPUs. Example: "1, 3-5, 6" assigns threads to CPUs 1, 3, 4, 5, and 6. string AUTO
mssql.tempdb.autogrowthPerDataFile Autogrowth (in MBs) for each TempDB data file. int 64
mssql.tempdb.autogrowthPerLogFile Autogrowth (in MBs) for each TempDB log file. int 64
mssql.tempdb.dataFileSize File size (in MBs) for each TempDB data file. int 8
mssql.tempdb.dataFileMaxSize Maximum file size (in MBs) for each TempDB data file. int 16777215
mssql.tempdb.logFileSize File size (in MBs) for each TempDB log file. int 8
mssql.tempdb.logFileMaxSize Maximum file size (in MBs) for each TempDB log file. int 2097151
mssql.tempdb.numberOfDataFiles Number of data files for TempDB. int 8
mssql.traceflags Enables or disables traceflags for the startup of the SQL Server service. Provide a space separated list of traceflags to apply. string 3614

Spark Pool resource-scope settings

Visit the Apache Spark & Apache Hadoop configuration article to see all supported and unsupported settings.

Gateway resource-scope settings

Visit the Apache Spark & Apache Hadoop configuration article to see all supported and unsupported settings.

Sparkhead resource-scope settings

Visit the Apache Spark & Apache Hadoop configuration article to see all supported and unsupported settings.

Zookeeper resource-scope settings

Visit the Apache Spark & Apache Hadoop configuration article to see all supported and unsupported settings.

Namenode resource-scope settings

Visit the Apache Spark & Apache Hadoop configuration article to see all supported and unsupported settings.

App Proxy resource-scope settings

None available

Next steps

Configure SQL Server Big Data Clusters