How to: Join mirrored Azure Cosmos DB data with other mirrored databases in Microsoft Fabric (Preview)

In this guide, join two Azure Cosmos DB for NoSQL containers from separate databases using Fabric mirroring.

You can join data from Cosmos DB with any other mirrored databases, warehouses, or lakehouses within same Fabric workspace.

Important

Mirroring for Azure Cosmos DB is currently in preview. Production workloads aren't supported during preview. Currently, only Azure Cosmos DB for NoSQL accounts are supported.

Prerequisites

Tip

During the public preview, it's recommended to use a test or development copy of your existing Azure Cosmos DB data that can be recovered quickly from a backup.

Setup mirroring and prerequisites

Configure mirroring for the Azure Cosmos DB for NoSQL database. If you're unsure how to configure mirroring, refer to the configure mirrored database tutorial.

  1. Navigate to the Fabric portal.

  2. Create a new connection using your Azure Cosmos DB account's credentials.

  3. Mirror the first database using the connection you configured.

  4. Now, mirror the second database.

  5. Wait for replication to finish the initial snapshot of data for both mirrors.

Create a query that joins databases

Now, use the SQL analytics endpoint to create a query across two mirrored database items, without the need for data movement. Both items should be in the same workspace.

  1. Navigate to one of the mirrored databases in the Fabric portal.

  2. Switch from Mirrored Azure Cosmos DB to SQL analytics endpoint.

    Screenshot of the selector to switch between items in the Fabric portal.

  3. In the menu, select + Warehouses. Select the SQL analytics endpoint item for the other mirrored database.

    Screenshot of the OneLake data hub with multiple mirrored database endpoints.

  4. Open the context menu for the table and select New SQL Query. Write an example query that combines both databases.

    Screenshot of the query editor with multiple mirrored databases available.

    For example, this query would execute across multiple containers and databases, without any data movement. This example assumes the name of your table and columns. Use your own table and columns when writing your SQL query.

    SELECT
        product_category_count = COUNT (product_category),
        product_category 
    FROM
        [StoreSalesDB].[dbo].[storeorders_Sql] as StoreSales 
    INNER JOIN
        [dbo].[OrdersDB_order_status] as OrderStatus 
            ON StoreSales.order_id = OrderStatus.order_id 
    WHERE
        order_status='delivered' 
        AND OrderStatus.order_month_year > '6/1/2022' 
    GROUP BY
        product_category 
    ORDER BY
        product_category_count desc 
    

    You can add data from more sources and query them seamlessly. Fabric simplifies and eases bringing your organizational data together.