Azure Cosmos DB for PostgreSQL server parameters

Article
12/17/2023

APPLIES TO: Azure Cosmos DB for PostgreSQL (powered by the Citus database extension to PostgreSQL)

There are various server parameters that affect the behavior of Azure Cosmos DB for PostgreSQL, both from standard PostgreSQL, and specific to Azure Cosmos DB for PostgreSQL. These parameters can be set in the Azure portal for a cluster. Under the Settings category, choose Worker node parameters or Coordinator node parameters. These pages allow you to set parameters for all worker nodes, or just for the coordinator node.

Azure Cosmos DB for PostgreSQL parameters

Note

Clusters running older versions of the Citus extension might not offer all the parameters listed below.

General configuration

citus.use_secondary_nodes (enum)

Sets the policy to use when choosing nodes for SELECT queries. If it's set to 'always', then the planner queries only nodes that are marked as 'secondary' noderole in pg_dist_node.

The supported values for this enum are:

never: (default) All reads happen on primary nodes.
always: Reads run against secondary nodes instead, and insert/update statements are disabled.

citus.cluster_name (text)

Informs the coordinator node planner which cluster it coordinates. Once cluster_name is set, the planner queries worker nodes in that cluster alone.

citus.enable_version_checks (boolean)

Upgrading Azure Cosmos DB for PostgreSQL version requires a server restart (to pick up the new shared-library), followed by the ALTER EXTENSION UPDATE command. The failure to execute both steps could potentially cause errors or crashes. Azure Cosmos DB for PostgreSQL thus validates the version of the code and that of the extension match, and errors out if they don't.

This value defaults to true, and is effective on the coordinator. In rare cases, complex upgrade processes might require setting this parameter to false, thus disabling the check.

citus.log_distributed_deadlock_detection (boolean)

Whether to log distributed deadlock detection-related processing in the server log. It defaults to false.

citus.distributed_deadlock_detection_factor (floating point)

Sets the time to wait before checking for distributed deadlocks. In particular, the time to wait is this value multiplied by PostgreSQL's deadlock_timeout setting. The default value is 2. A value of -1 disables distributed deadlock detection.

citus.node_connection_timeout (integer)

The citus.node_connection_timeout GUC sets the maximum duration (in milliseconds) to wait for connection establishment. Azure Cosmos DB for PostgreSQL raises an error if the timeout elapses before at least one worker connection is established. This GUC affects connections from the coordinator to workers, and workers to each other.

Default: five seconds
Minimum: 10 milliseconds
Maximum: one hour

-- set to 30 seconds
ALTER DATABASE foo
SET citus.node_connection_timeout = 30000;

citus.log_remote_commands (boolean)

Log all commands that the coordinator sends to worker nodes. For instance:

-- reveal the per-shard queries behind the scenes
SET citus.log_remote_commands TO on;

-- run a query on distributed table "github_users"
SELECT count(*) FROM github_users;

The output reveals several queries running on workers because of the single count(*) query on the coordinator.

NOTICE:  issuing SELECT count(*) AS count FROM public.github_events_102040 github_events WHERE true
DETAIL:  on server citus@private-c.demo.postgres.database.azure.com:5432 connectionId: 1
NOTICE:  issuing SELECT count(*) AS count FROM public.github_events_102041 github_events WHERE true
DETAIL:  on server citus@private-c.demo.postgres.database.azure.com:5432 connectionId: 1
NOTICE:  issuing SELECT count(*) AS count FROM public.github_events_102042 github_events WHERE true
DETAIL:  on server citus@private-c.demo.postgres.database.azure.com:5432 connectionId: 1
... etc, one for each of the 32 shards

citus.show_shards_for_app_name_prefixes (text)

By default, Azure Cosmos DB for PostgreSQL hides shards from the list of tables PostgreSQL gives to SQL clients. It does this because there are multiple shards per distributed table, and the shards can be distracting to the SQL client.

The citus.show_shards_for_app_name_prefixes GUC allows shards to be displayed for selected clients that want to see them. Its default value is ''.

-- show shards to psql only (hide in other clients, like pgAdmin)

SET citus.show_shards_for_app_name_prefixes TO 'psql';

-- also accepts a comma separated list

SET citus.show_shards_for_app_name_prefixes TO 'psql,pg_dump';

Shard hiding can be disabled entirely using citus.override_table_visibility.

citus.override_table_visibility (boolean)

Determines whether citus.show_shards_for_app_name_prefixes is active. The default value is 'true'. When set to 'false', shards are visible to all client applications.

citus.use_citus_managed_tables (boolean)

Allow new local tables to be accessed by queries on worker nodes. Adds all newly created tables to Citus metadata when enabled. The default value is 'false'.

citus.rebalancer_by_disk_size_base_cost (integer)

Using the by_disk_size rebalance strategy each shard group gets this cost in bytes added to its actual disk size. This value is used to avoid creating a bad balance when there’s little data in some of the shards. The assumption is that even empty shards have some cost, because of parallelism and because empty shard groups are likely to grow in the future.

The default value is 100MB.

Query Statistics

citus.stat_statements_purge_interval (integer)

Sets the frequency at which the maintenance daemon removes records from citus_stat_statements that are unmatched in pg_stat_statements. This configuration value sets the time interval between purges in seconds, with a default value of 10. A value of 0 disables the purges.

SET citus.stat_statements_purge_interval TO 5;

This parameter is effective on the coordinator and can be changed at runtime.

citus.stat_statements_max (integer)

The maximum number of rows to store in citus_stat_statements. Defaults to 50000, and can be changed to any value in the range 1000 - 10000000. Each row requires 140 bytes of storage, so setting stat_statements_max to its maximum value of 10M would consume 1.4 GB of memory.

Changing this GUC doesn't take effect until PostgreSQL is restarted.

citus.stat_statements_track (enum)

Recording statistics for citus_stat_statements requires extra CPU resources. When the database is experiencing load, the administrator can disable statement tracking by setting citus.stat_statements_track to none.

all: (default) Track all statements.
none: Disable tracking.

citus.stat_tenants_untracked_sample_rate

Sampling rate for new tenants in citus_stat_tenants. The rate can be of range between 0.0 and 1.0. Default is 1.0 meaning 100% of untracked tenant queries are sampled. Setting it to a lower value means that already tracked tenants have 100% queries sampled, but tenants that are currently untracked are sampled only at the provided rate.

Data Loading

citus.multi_shard_commit_protocol (enum)

Sets the commit protocol to use when performing COPY on a hash distributed table. On each individual shard placement, the COPY is performed in a transaction block to ensure that no data is ingested if an error occurs during the COPY. However, there's a particular failure case in which the COPY succeeds on all placements, but a (hardware) failure occurs before all transactions commit. This parameter can be used to prevent data loss in that case by choosing between the following commit protocols:

2pc: (default) The transactions in which COPY is performed on the shard placements are first prepared using PostgreSQL's two-phase commit and then committed. Failed commits can be manually recovered or aborted using COMMIT PREPARED or ROLLBACK PREPARED, respectively. When using 2pc, max_prepared_transactions should be increased on all the workers, typically to the same value as max_connections.
1pc: The transactions in which COPY is performed on the shard placements is committed in a single round. Data might be lost if a commit fails after COPY succeeds on all placements (rare).

citus.shard_replication_factor (integer)

Sets the replication factor for shards that is, the number of nodes on which shards are placed, and defaults to 1. This parameter can be set at run-time and is effective on the coordinator. The ideal value for this parameter depends on the size of the cluster and rate of node failure. For example, you can increase this replication factor if you run large clusters and observe node failures on a more frequent basis.

Planner Configuration

citus.local_table_join_policy (enum)

This GUC determines how Azure Cosmos DB for PostgreSQL moves data when doing a join between local and distributed tables. Customizing the join policy can help reduce the amount of data sent between worker nodes.

Azure Cosmos DB for PostgreSQL sends either the local or distributed tables to nodes as necessary to support the join. Copying table data is referred to as a “conversion.” If a local table is converted, then it is sent to any workers that need its data to perform the join. If a distributed table is converted, then it is collected in the coordinator to support the join. The Azure Cosmos DB for PostgreSQL planner sends only the necessary rows doing a conversion.

There are four modes available to express conversion preference:

auto: (Default) Azure Cosmos DB for PostgreSQL converts either all local or all distributed tables to support local and distributed table joins. Azure Cosmos DB for PostgreSQL decides which to convert using a heuristic. It converts distributed tables if they're joined using a constant filter on a unique index (such as a primary key). The conversion ensures less data gets moved between workers.
never: Azure Cosmos DB for PostgreSQL doesn't allow joins between local and distributed tables.
prefer-local: Azure Cosmos DB for PostgreSQL prefers converting local tables to support local and distributed table joins.
prefer-distributed: Azure Cosmos DB for PostgreSQL prefers converting distributed tables to support local and distributed table joins. If the distributed tables are huge, using this option might result in moving lots of data between workers.

For example, assume citus_table is a distributed table distributed by the column x, and that postgres_table is a local table:

CREATE TABLE citus_table(x int primary key, y int);
SELECT create_distributed_table('citus_table', 'x');

CREATE TABLE postgres_table(x int, y int);

-- even though the join is on primary key, there isn't a constant filter
-- hence postgres_table will be sent to worker nodes to support the join
SELECT * FROM citus_table JOIN postgres_table USING (x);

-- there is a constant filter on a primary key, hence the filtered row
-- from the distributed table will be pulled to coordinator to support the join
SELECT * FROM citus_table JOIN postgres_table USING (x) WHERE citus_table.x = 10;

SET citus.local_table_join_policy to 'prefer-distributed';
-- since we prefer distributed tables, citus_table will be pulled to coordinator
-- to support the join. Note that citus_table can be huge.
SELECT * FROM citus_table JOIN postgres_table USING (x);

SET citus.local_table_join_policy to 'prefer-local';
-- even though there is a constant filter on primary key for citus_table
-- postgres_table will be sent to necessary workers because we are using 'prefer-local'.
SELECT * FROM citus_table JOIN postgres_table USING (x) WHERE citus_table.x = 10;

citus.limit_clause_row_fetch_count (integer)

Sets the number of rows to fetch per task for limit clause optimization. In some cases, select queries with limit clauses might need to fetch all rows from each task to generate results. In those cases, and where an approximation would produce meaningful results, this configuration value sets the number of rows to fetch from each shard. Limit approximations are disabled by default and this parameter is set to -1. This value can be set at run-time and is effective on the coordinator.

citus.count_distinct_error_rate (floating point)

Azure Cosmos DB for PostgreSQL can calculate count(distinct) approximates using the postgresql-hll extension. This configuration entry sets the desired error rate when calculating count(distinct). 0.0, which is the default, disables approximations for count(distinct); and 1.0 provides no guarantees about the accuracy of results. We recommend setting this parameter to 0.005 for best results. This value can be set at run-time and is effective on the coordinator.

citus.task_assignment_policy (enum)

Note

This GUC is applicable only when shard_replication_factor is greater than one, or for queries against reference_tables.

Sets the policy to use when assigning tasks to workers. The coordinator assigns tasks to workers based on shard locations. This configuration value specifies the policy to use when making these assignments. Currently, there are three possible task assignment policies that can be used.

greedy: The greedy policy is the default and aims to evenly distribute tasks across workers.
round-robin: The round-robin policy assigns tasks to workers in a round-robin fashion alternating between different replicas. This policy enables better cluster utilization when the shard count for a table is low compared to the number of workers.
first-replica: The first-replica policy assigns tasks based on the insertion order of placements (replicas) for the shards. In other words, the fragment query for a shard is assigned to the worker that has the first replica of that shard. This method allows you to have strong guarantees about which shards are used on which nodes (that is, stronger memory residency guarantees).

This parameter can be set at run-time and is effective on the coordinator.

citus.enable_non_colocated_router_query_pushdown (boolean)

Enables router planner for the queries that reference non-colocated distributed tables.

The router planner is only enabled for queries that reference colocated distributed tables because otherwise shards might not be on the same node. Enabling this flag allows optimization for queries that reference such tables, but the query might not work after rebalancing the shards or altering the shard count of those tables.

The default is off.

Intermediate Data Transfer

citus.max_intermediate_result_size (integer)

The maximum size in KB of intermediate results for CTEs that are unable to be pushed down to worker nodes for execution, and for complex subqueries. The default is 1 GB, and a value of -1 means no limit. Queries exceeding the limit are canceled and produce an error message.

DDL

citus.enable_schema_based_sharding

With the parameter set to ON, all created schemas are distributed by default. Distributed schemas are automatically associated with individual colocation groups such that the tables created in those schemas are converted to colocated distributed tables without a shard key. This setting can be modified for individual sessions.

For an example of using this GUC, see how to design for microservice.

Executor Configuration

General

citus.all_modifications_commutative

Azure Cosmos DB for PostgreSQL enforces commutativity rules and acquires appropriate locks for modify operations in order to guarantee correctness of behavior. For example, it assumes that an INSERT statement commutes with another INSERT statement, but not with an UPDATE or DELETE statement. Similarly, it assumes that an UPDATE or DELETE statement doesn't commute with another UPDATE or DELETE statement. This precaution means that UPDATEs and DELETEs require Azure Cosmos DB for PostgreSQL to acquire stronger locks.

If you have UPDATE statements that are commutative with your INSERTs or other UPDATEs, then you can relax these commutativity assumptions by setting this parameter to true. When this parameter is set to true, all commands are considered commutative and claim a shared lock, which can improve overall throughput. This parameter can be set at runtime and is effective on the coordinator.

citus.remote_task_check_interval (integer)

Sets the frequency at which Azure Cosmos DB for PostgreSQL checks for statuses of jobs managed by the task tracker executor. It defaults to 10 ms. The coordinator assigns tasks to workers, and then regularly checks with them about each task's progress. This configuration value sets the time interval between two consequent checks. This parameter is effective on the coordinator and can be set at runtime.

citus.task_executor_type (enum)

Azure Cosmos DB for PostgreSQL has three executor types for running distributed SELECT queries. The desired executor can be selected by setting this configuration parameter. The accepted values for this parameter are:

adaptive: The default. It's optimal for fast responses to queries that involve aggregations and colocated joins spanning across multiple shards.
task-tracker: The task-tracker executor is well suited for long running, complex queries that require shuffling of data across worker nodes and efficient resource management.
real-time: (deprecated) Serves a similar purpose as the adaptive executor, but is less flexible and can cause more connection pressure on worker nodes.

This parameter can be set at run-time and is effective on the coordinator.

citus.multi_task_query_log_level (enum) {#multi_task_logging}

Sets a log-level for any query that generates more than one task (that is, which hits more than one shard). Logging is useful during a multitenant application migration, as you can choose to error or warn for such queries, to find them and add a tenant_id filter to them. This parameter can be set at runtime and is effective on the coordinator. The default value for this parameter is 'off'.

The supported values for this enum are:

off: Turn off logging any queries that generate multiple tasks (that is, span multiple shards)
debug: Logs statement at DEBUG severity level.
log: Logs statement at LOG severity level. The log line includes the SQL query that was run.
notice: Logs statement at NOTICE severity level.
warning: Logs statement at WARNING severity level.
error: Logs statement at ERROR severity level.

It could be useful to use error during development testing, and a lower log-level like log during actual production deployment. Choosing log will cause multi-task queries to appear in the database logs with the query itself shown after "STATEMENT."

LOG:  multi-task query about to be executed
HINT:  Queries are split to multiple tasks if they have to be split into several queries on the workers.
STATEMENT:  select * from foo;

citus.propagate_set_commands (enum)

Determines which SET commands are propagated from the coordinator to workers. The default value for this parameter is ‘none’.

The supported values are:

none: no SET commands are propagated.
local: only SET LOCAL commands are propagated.

citus.create_object_propagation (enum)

Controls the behavior of CREATE statements in transactions for supported objects.

When objects are created in a multi-statement transaction block, Azure Cosmos DB for PostgreSQL switches sequential mode to ensure created objects are visible to later statements on shards. However, the switch to sequential mode is not always desirable. By overriding this behavior, the user can trade off performance for full transactional consistency in the creation of new objects.

The default value for this parameter is 'immediate'.

The supported values are:

immediate: raises error in transactions where parallel operations like create_distributed_table happen before an attempted CREATE TYPE.
automatic: defer creation of types when sharing a transaction with a parallel operation on distributed tables. There might be some inconsistency between which database objects exist on different nodes.
deferred: return to pre-11.0 behavior, which is like automatic but with other subtle corner cases. We recommend the automatic setting over deferred, unless you require true backward compatibility.

For an example of this GUC in action, see type propagation.

citus.enable_repartition_joins (boolean)

Ordinarily, attempting to perform repartition joins with the adaptive executor fails with an error message. However setting citus.enable_repartition_joins to true allows Azure Cosmos DB for PostgreSQL to temporarily switch into the task-tracker executor to perform the join. The default value is false.

citus.enable_repartitioned_insert_select (boolean)

By default, an INSERT INTO … SELECT statement that can’t be pushed down attempts to repartition rows from the SELECT statement and transfer them between workers for insertion. However, if the target table has too many shards then repartitioning will probably not perform well. The overhead of processing the shard intervals when determining how to partition the results is too great. Repartitioning can be disabled manually by setting citus.enable_repartitioned_insert_select to false.

citus.enable_binary_protocol (boolean)

Setting this parameter to true instructs the coordinator node to use PostgreSQL’s binary serialization format (when applicable) to transfer data with workers. Some column types don't support binary serialization.

Enabling this parameter is mostly useful when the workers must return large amounts of data. Examples are when many rows are requested, the rows have many columns, or they use wide types such as hll from the postgresql-hll extension.

The default value is true. When set to false, all results are encoded and transferred in text format.

citus.max_adaptive_executor_pool_size (integer)

Max_adaptive_executor_pool_size limits worker connections from the current session. This GUC is useful for:

Preventing a single backend from getting all the worker resources
Providing priority management: designate low priority sessions with low max_adaptive_executor_pool_size, and high priority sessions with higher values

The default value is 16.

citus.executor_slow_start_interval (integer)

Time to wait in milliseconds between opening connections to the same worker node.

When the individual tasks of a multi-shard query take little time, they can often be finished over a single (often already cached) connection. To avoid redundantly opening more connections, the executor waits between connection attempts for the configured number of milliseconds. At the end of the interval, it increases the number of connections it's allowed to open next time.

For long queries (those taking >500 ms), slow start might add latency, but for short queries it’s faster. The default value is 10 ms.

citus.max_cached_conns_per_worker (integer)

Each backend opens connections to the workers to query the shards. At the end of the transaction, the configured number of connections is kept open to speed up subsequent commands. Increasing this value reduces the latency of multi-shard queries, but also increases overhead on the workers.

The default value is 1. A larger value such as 2 might be helpful for clusters that use a small number of concurrent sessions, but it’s not wise to go much further (for example, 16 would be too high).

citus.force_max_query_parallelization (boolean)

Simulates the deprecated and now nonexistent real-time executor. This parameter is used to open as many connections as possible to maximize query parallelization.

When this GUC is enabled, Azure Cosmos DB for PostgreSQL forces the adaptive executor to use as many connections as possible while executing a parallel distributed query. If not enabled, the executor might choose to use fewer connections to optimize overall query execution throughput. Internally, setting this to true ends up using one connection per task.

One place where this parameter is useful is in a transaction whose first query is lightweight and requires few connections, while a subsequent query would benefit from more connections. Azure Cosmos DB for PostgreSQL decides how many connections to use in a transaction based on the first statement, which can throttle other queries unless we use the GUC to provide a hint.

BEGIN;
-- add this hint
SET citus.force_max_query_parallelization TO ON;

-- a lightweight query that doesn't require many connections
SELECT count(*) FROM table WHERE filter = x;

-- a query that benefits from more connections, and can obtain
-- them since we forced max parallelization above
SELECT ... very .. complex .. SQL;
COMMIT;

The default value is false.

Task tracker executor configuration

citus.task_tracker_delay (integer)

This parameter sets the task tracker sleep time between task management rounds and defaults to 200 ms. The task tracker process wakes up regularly, walks over all tasks assigned to it, and schedules and executes these tasks. Then, the task tracker sleeps for a time period before walking over these tasks again. This configuration value determines the length of that sleeping period. This parameter is effective on the workers and needs to be changed in the postgresql.conf file. After editing the config file, users can send a SIGHUP signal or restart the server for the change to take effect.

This parameter can be decreased to trim the delay caused due to the task tracker executor by reducing the time gap between the management rounds. Decreasing the delay is useful in cases when the shard queries are short and hence update their status regularly.

citus.max_assign_task_batch_size (integer)

The task tracker executor on the coordinator synchronously assigns tasks in batches to the daemon on the workers. This parameter sets the maximum number of tasks to assign in a single batch. Choosing a larger batch size allows for faster task assignment. However, if the number of workers is large, then it might take longer for all workers to get tasks. This parameter can be set at runtime and is effective on the coordinator.

citus.max_running_tasks_per_node (integer)

The task tracker process schedules and executes the tasks assigned to it as appropriate. This configuration value sets the maximum number of tasks to execute concurrently on one node at any given time and defaults to 8.

The limit ensures that you don't have many tasks hitting disk at the same time, and helps in avoiding disk I/O contention. If your queries are served from memory or SSDs, you can increase max_running_tasks_per_node without much concern.

citus.partition_buffer_size (integer)

Sets the buffer size to use for partition operations and defaults to 8 MB. Azure Cosmos DB for PostgreSQL allows for table data to be repartitioned into multiple files when two large tables are being joined. After this partition buffer fills up, the repartitioned data is flushed into files on disk. This configuration entry can be set at run-time and is effective on the workers.

Explain output

citus.explain_all_tasks (boolean)

By default, Azure Cosmos DB for PostgreSQL shows the output of a single, arbitrary task when running EXPLAIN on a distributed query. In most cases, the explain output is similar across tasks. Occasionally, some of the tasks are planned differently or have much higher execution times. In those cases, it can be useful to enable this parameter, after which the EXPLAIN output includes all tasks. Explaining all tasks might cause the EXPLAIN to take longer.

citus.explain_analyze_sort_method (enum)

Determines the sort method of the tasks in the output of EXPLAIN ANALYZE. The default value of citus.explain_analyze_sort_method is execution-time.

The supported values are:

execution-time: sort by execution time.
taskId: sort by task ID.

Managed PgBouncer parameters

The following managed PgBouncer parameters can be configured on single node or coordinator.

Parameter Name	Description	Default
pgbouncer.default_pool_size	Set this parameter value to the number of connections per user/database pair.	295
pgbouncer.ignore_startup_parameters	Comma-separated list of parameters that PgBouncer can ignore. For example, you can let PgBouncer ignore `extra_float_digits` parameter. Some parameters are allowed, all others raise error. This ability is needed to tolerate overenthusiastic JDBC wanting to unconditionally set 'extra_float_digits=2' in startup packet. Use this option if the library you use report errors such as `pq: unsupported startup parameter: extra_float_digits`.	extra_float_digits, ssl_renegotiation_limit
pgBouncer.max_client_conn	Set this parameter value to the highest number of client connections to PgBouncer that you want to support.	2000
pgBouncer.min_pool_size	Add more server connections to pool if below this number.	0 (Disabled)
pgBouncer.pool_mode	Set this parameter value to TRANSACTION for transaction pooling (which is the recommended setting for most workloads).	TRANSACTION
pgbouncer.query_wait_timeout	Maximum time (in seconds) queries are allowed to spend waiting for execution. If the query isn't assigned to a server during that time, the client is disconnected.	20s
pgbouncer.server_idle_timeout	If a server connection has been idle more than this many seconds, it is closed. If 0 then this timeout is disabled.	60s

PostgreSQL parameters

DateStyle - Sets the display format for date and time values
IntervalStyle - Sets the display format for interval values
TimeZone - Sets the time zone for displaying and interpreting time stamps
application_name - Sets the application name to be reported in statistics and logs
array_nulls - Enables input of NULL elements in arrays
autovacuum - Starts the autovacuum subprocess
autovacuum_analyze_scale_factor - Number of tuple inserts, updates, or deletes prior to analyze as a fraction of reltuples
autovacuum_analyze_threshold - Minimum number of tuple inserts, updates, or deletes prior to analyze
autovacuum_naptime - Time to sleep between autovacuum runs
autovacuum_vacuum_cost_delay - Vacuum cost delay in milliseconds, for autovacuum
autovacuum_vacuum_cost_limit - Vacuum cost amount available before napping, for autovacuum
autovacuum_vacuum_scale_factor - Number of tuple updates or deletes prior to vacuum as a fraction of reltuples
autovacuum_vacuum_threshold - Minimum number of tuple updates or deletes prior to vacuum
autovacuum_work_mem - Sets the maximum memory to be used by each autovacuum worker process
backend_flush_after - Number of pages after which previously performed writes are flushed to disk
backslash_quote - Sets whether "'" is allowed in string literals
bgwriter_delay - Background writer sleep time between rounds
bgwriter_flush_after - Number of pages after which previously performed writes are flushed to disk
bgwriter_lru_maxpages - Background writer maximum number of LRU pages to flush per round
bgwriter_lru_multiplier - Multiple of the average buffer usage to free per round
bytea_output - Sets the output format for bytea
check_function_bodies - Checks function bodies during CREATE FUNCTION
checkpoint_completion_target - Time spent flushing dirty buffers during checkpoint, as fraction of checkpoint interval
checkpoint_timeout - Sets the maximum time between automatic WAL checkpoints
checkpoint_warning - Enables warnings if checkpoint segments are filled more frequently than this
client_encoding - Sets the client's character set encoding
client_min_messages - Sets the message levels that are sent to the client
commit_delay - Sets the delay in microseconds between transaction commit and flushing WAL to disk
commit_siblings - Sets the minimum concurrent open transactions before performing commit_delay
constraint_exclusion - Enables the planner to use constraints to optimize queries
cpu_index_tuple_cost - Sets the planner's estimate of the cost of processing each index entry during an index scan
cpu_operator_cost - Sets the planner's estimate of the cost of processing each operator or function call
cpu_tuple_cost - Sets the planner's estimate of the cost of processing each tuple (row)
cursor_tuple_fraction - Sets the planner's estimate of the fraction of a cursor's rows that are retrieved
deadlock_timeout - Sets the amount of time, in milliseconds, to wait on a lock before checking for deadlock
debug_pretty_print - Indents parse and plan tree displays
debug_print_parse - Logs each query's parse tree
debug_print_plan - Logs each query's execution plan
debug_print_rewritten - Logs each query's rewritten parse tree
default_statistics_target - Sets the default statistics target
default_tablespace - Sets the default tablespace to create tables and indexes in
default_text_search_config - Sets default text search configuration
default_transaction_deferrable - Sets the default deferrable status of new transactions
default_transaction_isolation - Sets the transaction isolation level of each new transaction
default_transaction_read_only - Sets the default read-only status of new transactions
default_with_oids - Creates new tables with OIDs by default
effective_cache_size - Sets the planner's assumption about the size of the disk cache
enable_bitmapscan - Enables the planner's use of bitmap-scan plans
enable_gathermerge - Enables the planner's use of gather merge plans
enable_hashagg - Enables the planner's use of hashed aggregation plans
enable_hashjoin - Enables the planner's use of hash join plans
enable_indexonlyscan - Enables the planner's use of index-only-scan plans
enable_indexscan - Enables the planner's use of index-scan plans
enable_material - Enables the planner's use of materialization
enable_mergejoin - Enables the planner's use of merge join plans
enable_nestloop - Enables the planner's use of nested loop join plans
enable_seqscan - Enables the planner's use of sequential-scan plans
enable_sort - Enables the planner's use of explicit sort steps
enable_tidscan - Enables the planner's use of TID scan plans
escape_string_warning - Warns about backslash escapes in ordinary string literals
exit_on_error - Terminates session on any error
extra_float_digits - Sets the number of digits displayed for floating-point values
force_parallel_mode - Forces use of parallel query facilities
from_collapse_limit - Sets the FROM-list size beyond which subqueries aren't collapsed
geqo - Enables genetic query optimization
geqo_effort - GEQO: effort is used to set the default for other GEQO parameters
geqo_generations - GEQO: number of iterations of the algorithm
geqo_pool_size - GEQO: number of individuals in the population
geqo_seed - GEQO: seed for random path selection
geqo_selection_bias - GEQO: selective pressure within the population
geqo_threshold - Sets the threshold of FROM items beyond which GEQO is used
gin_fuzzy_search_limit - Sets the maximum allowed result for exact search by GIN
gin_pending_list_limit - Sets the maximum size of the pending list for GIN index
idle_in_transaction_session_timeout - Sets the maximum allowed duration of any idling transaction
join_collapse_limit - Sets the FROM-list size beyond which JOIN constructs aren't flattened
lc_monetary - Sets the locale for formatting monetary amounts
lc_numeric - Sets the locale for formatting numbers
lo_compat_privileges - Enables backward compatibility mode for privilege checks on large objects
lock_timeout - Sets the maximum allowed duration (in milliseconds) of any wait for a lock. 0 turns this off
log_autovacuum_min_duration - Sets the minimum execution time above which autovacuum actions are logged
log_connections - Logs each successful connection
log_destination - Sets the destination for server log output
log_disconnections - Logs end of a session, including duration
log_duration - Logs the duration of each completed SQL statement
log_error_verbosity - Sets the verbosity of logged messages
log_lock_waits - Logs long lock waits
log_min_duration_statement - Sets the minimum execution time (in milliseconds) above which statements are logged. -1 disables logging statement durations
log_min_error_statement - Causes all statements generating error at or above this level to be logged
log_min_messages - Sets the message levels that are logged
log_replication_commands - Logs each replication command
log_statement - Sets the type of statements logged
log_statement_stats - For each query, writes cumulative performance statistics to the server log
log_temp_files - Logs the use of temporary files larger than this number of kilobytes
maintenance_work_mem - Sets the maximum memory to be used for maintenance operations
max_parallel_workers - Sets the maximum number of parallel workers that can be active at one time
max_parallel_workers_per_gather - Sets the maximum number of parallel processes per executor node
max_pred_locks_per_page - Sets the maximum number of predicate-locked tuples per page
max_pred_locks_per_relation - Sets the maximum number of predicate-locked pages and tuples per relation
max_standby_archive_delay - Sets the maximum delay before canceling queries when a hot standby server is processing archived WAL data
max_standby_streaming_delay - Sets the maximum delay before canceling queries when a hot standby server is processing streamed WAL data
max_sync_workers_per_subscription - Maximum number of table synchronization workers per subscription
max_wal_size - Sets the WAL size that triggers a checkpoint
min_parallel_index_scan_size - Sets the minimum amount of index data for a parallel scan
min_wal_size - Sets the minimum size to shrink the WAL to
operator_precedence_warning - Emits a warning for constructs that changed meaning since PostgreSQL 9.4
parallel_setup_cost - Sets the planner's estimate of the cost of starting up worker processes for parallel query
parallel_tuple_cost - Sets the planner's estimate of the cost of passing each tuple (row) from worker to main backend
pg_stat_statements.save - Saves pg_stat_statements statistics across server shutdowns
pg_stat_statements.track - Selects which statements are tracked by pg_stat_statements
pg_stat_statements.track_utility - Selects whether utility commands are tracked by pg_stat_statements
quote_all_identifiers - When generating SQL fragments, quotes all identifiers
random_page_cost - Sets the planner's estimate of the cost of a nonsequentially fetched disk page
row_security - Enables row security
search_path - Sets the schema search order for names that aren't schema-qualified
seq_page_cost - Sets the planner's estimate of the cost of a sequentially fetched disk page
session_replication_role - Sets the session's behavior for triggers and rewrite rules
standard_conforming_strings - Causes '...' strings to treat backslashes literally
statement_timeout - Sets the maximum allowed duration (in milliseconds) of any statement. 0 turns this off
synchronize_seqscans - Enables synchronized sequential scans
synchronous_commit - Sets the current transaction's synchronization level
tcp_keepalives_count - Maximum number of TCP keepalive retransmits
tcp_keepalives_idle - Time between issuing TCP keepalives
tcp_keepalives_interval - Time between TCP keepalive retransmits
temp_buffers - Sets the maximum number of temporary buffers used by each database session
temp_tablespaces - Sets the tablespace(s) to use for temporary tables and sort files
track_activities - Collects information about executing commands
track_counts - Collects statistics on database activity
track_functions - Collects function-level statistics on database activity
track_io_timing - Collects timing statistics for database I/O activity
transform_null_equals - Treats "expr=NULL" as "expr IS NULL"
vacuum_cost_delay - Vacuum cost delay in milliseconds
vacuum_cost_limit - Vacuum cost amount available before napping
vacuum_cost_page_dirty - Vacuum cost for a page dirtied by vacuum
vacuum_cost_page_hit - Vacuum cost for a page found in the buffer cache
vacuum_cost_page_miss - Vacuum cost for a page not found in the buffer cache
vacuum_defer_cleanup_age - Number of transactions by which VACUUM and HOT cleanup should be deferred, if any
vacuum_freeze_min_age - Minimum age at which VACUUM should freeze a table row
vacuum_freeze_table_age - Age at which VACUUM should scan whole table to freeze tuples
vacuum_multixact_freeze_min_age - Minimum age at which VACUUM should freeze a MultiXactId in a table row
vacuum_multixact_freeze_table_age - Multixact age at which VACUUM should scan whole table to freeze tuples
wal_receiver_status_interval - Sets the maximum interval between WAL receiver status reports to the primary
wal_writer_delay - Time between WAL flushes performed in the WAL writer
wal_writer_flush_after - Amount of WAL written out by WAL writer that triggers a flush
work_mem - Sets the amount of memory to be used by internal sort operations and hash tables before writing to temporary disk files
xmlbinary - Sets how binary values are to be encoded in XML
xmloption - Sets whether XML data in implicit parsing and serialization operations is to be considered as documents or content fragments

Next steps

Another form of configuration, besides server parameters, are the resource configuration options in a cluster.
The underlying PostgreSQL data base also has configuration parameters.

Azure Cosmos DB for PostgreSQL server parameters

Azure Cosmos DB for PostgreSQL parameters

General configuration

citus.use_secondary_nodes (enum)

citus.cluster_name (text)

citus.enable_version_checks (boolean)

citus.log_distributed_deadlock_detection (boolean)

citus.distributed_deadlock_detection_factor (floating point)

citus.node_connection_timeout (integer)

citus.log_remote_commands (boolean)

citus.show_shards_for_app_name_prefixes (text)

citus.override_table_visibility (boolean)

citus.use_citus_managed_tables (boolean)

citus.rebalancer_by_disk_size_base_cost (integer)

Query Statistics

citus.stat_statements_purge_interval (integer)

citus.stat_statements_max (integer)

citus.stat_statements_track (enum)

citus.stat_tenants_untracked_sample_rate

Data Loading

citus.multi_shard_commit_protocol (enum)

citus.shard_replication_factor (integer)

Planner Configuration

citus.local_table_join_policy (enum)

citus.limit_clause_row_fetch_count (integer)

citus.count_distinct_error_rate (floating point)

citus.task_assignment_policy (enum)

citus.enable_non_colocated_router_query_pushdown (boolean)

Intermediate Data Transfer

citus.max_intermediate_result_size (integer)

DDL

citus.enable_schema_based_sharding

Executor Configuration

General

citus.all_modifications_commutative

citus.remote_task_check_interval (integer)

citus.task_executor_type (enum)

citus.multi_task_query_log_level (enum) {#multi_task_logging}

citus.propagate_set_commands (enum)

citus.create_object_propagation (enum)

citus.enable_repartition_joins (boolean)

citus.enable_repartitioned_insert_select (boolean)

citus.enable_binary_protocol (boolean)

citus.max_adaptive_executor_pool_size (integer)

citus.executor_slow_start_interval (integer)

citus.max_cached_conns_per_worker (integer)

citus.force_max_query_parallelization (boolean)

Task tracker executor configuration

citus.task_tracker_delay (integer)

citus.max_assign_task_batch_size (integer)

citus.max_running_tasks_per_node (integer)

citus.partition_buffer_size (integer)

Explain output

citus.explain_all_tasks (boolean)

citus.explain_analyze_sort_method (enum)

Managed PgBouncer parameters

PostgreSQL parameters

Next steps

Feedback

Additional resources