Multi-tenant monitoring

APPLIES TO: Azure Cosmos DB for PostgreSQL (powered by the Citus database extension to PostgreSQL)

Important

Applicable to Citus 11.3 & newer versions

Tenant monitoring is a crucial aspect of managing a multi-tenant SaaS platform. The feature offers insights into CPU usage and the overall query volume, attributed to specific tenants. Tracking of statistics is done locally on nodes where tenant resides and are made accessible at both tenant and node level of granularity. The collected statistics allows in quick identification of noisy neighbors & helps in better optimizing the resource usage in your cluster.

Conceptual model

Azure Cosmos DB for PostgreSQL uses row-based sharding at core, which means that each distributed table contains a distributed column - denoted as tenant key. Tenant key enables splitting data across shards, where each shard contains one or more tenants. Shards themselves are distributed among the nodes in the cluster, enabling horizontal scaling of your database.

Diagram shows three node cluster hosting tenants, with top three tenants usage of CPU_in_seconds & Overall query count traced in citus_stat_tenants view.

The CPU time is measured in seconds, consumed by tenant, across all available cores on a node in a cluster. Total CPU time available on cluster, can be estimated as: number_of_vcores * citus.stat_tenants_period * number_of_nodes.

Node parameters

citus.stat_tenants_log_level (text)

Controls which message levels are written to the server log. Valid values are DEBUG5, DEBUG4, DEBUG3, DEBUG2, DEBUG1, DEBUG, LOG, INFO, NOTICE, WARNING, ERROR. Default is LOG.

citus.stat_tenants_limit (int)

Controls the number of tenants (top N) tracked within a single time window. Default is 100.

citus.stat_tenants_period (int)

Controls the time window (in seconds) to which tenant statistics are allocated. Default is 60 seconds.

citus.stat_tenants_track (boolean)

Enables or disables the tracking of tenant statistics. Default is None. Modification to the node parameter requires a restart.

Statistics views

pg_catalog.citus_stat_tenants

The citus_stat_tenants view tracks these statistics within time buckets. Once a period ends, its statistics are stored in the last period, providing you with ongoing and completed period insights. Metrics currently tracked within the view include read query count, overall query count & CPU cycle consumed at the node, colocation group & tenant granularity.

ColumnName Type Description
nodeid INTEGER Autogenerated identifier for an individual node.
colocation_id INTEGER Colocation group to which this table belongs.
tenant_attribute TEXT The distribution column\shard key.
read_count_in_this_period INTEGER SELECT queries in current citus.stat_tenants_period.
read_count_in_last_period INTEGER SELECT queries in completed citus.stat_tenants_period.
query_count_in_this_period INTEGER SELECT, INSERT, UPDATE and DELETE queries in current citus.stat_tenants_period.
query_count_in_last_period INTEGER SELECT, INSERT, UPDATE and DELETE queries in completed citus.stat_tenants_period.
cpu_usage_in_this_period DOUBLE PRECISION Seconds of CPU time consumed in current period.
cpu_usage_in_last_period DOUBLE PRECISION Seconds of CPU time consumed in completed period.

Note

Privileges (execute,select) on view is granted to the role pg_monitor.

Tracking tenant level statistics adds an overhead, and by default is disabled.

pg_catalog.citus_stat_tenants_reset()

The function resets the collected metrics within citus_stat_tenants view.

Operations tracked

When you enable this feature, accounting is activated for SQL commands such as INSERT, UPDATE, DELETE, and SELECT. This accounting is specifically designed for a single tenant. A query qualifies to be a single tenant query, if the query planner can restrict the query to a single shard or single tenant.

In a multi-tenant environment, each tenant typically has access to their own dataset, and as a result, queries are filtered based on tenant keys. Any query in such systems, which operates across tenants gets initiated by the system administrator and aren't traced back to individual tenants. For example, to generate a report that includes aggregated data from all tenants - cross-tenant query is needed which can't be attributed to a specific tenant.

CREATE TABLE organizations (id BIGSERIAL PRIMARY KEY, name TEXT);

SELECT create_distributed_table ('organizations', 'id');

INSERT INTO organizations (name) VALUES ('Teleflora'); -- tracked
INSERT INTO organizations (name) VALUES ('BloomThat'); -- tracked
INSERT INTO organizations (name) VALUES ('UrbanStems');-- tracked

SELECT COUNT(*) FROM organizations where id = 1; -- tracked
 count 
-------
     1
(1 row)
SELECT COUNT(*) FROM organizations where id IN (1,2); -- untracked
 count
-------
     1
(1 row)
UPDATE organizations SET name = 'Bloomers' WHERE id = 2; -- tracked
UPDATE 1
DELETE FROM organizations WHERE id = 3; -- tracked
DELETE 1
SELECT tenant_attribute,
       read_count_in_this_period,
       query_count_in_this_period,
       cpu_usage_in_this_period
FROM citus_stat_tenants;
 tenant_attribute | read_count_in_this_period | query_count_in_this_period | cpu_usage_in_this_period 
------------------+---------------------------+----------------------------+--------------------------
 1                |                         2 |                          3 |                        0
 2                |                         0 |                          2 |                        0
 3                |                         0 |                          2 |                        0
(3 rows)

Note

Both COPY & \COPY command isn't tracked with the feature.

Next steps

Learn more about modeling multi-tenant schema and reviewing statistics.