Non-real Time Dashboard Reference Architectures

This reference architecture represents a simple analytics pipeline that you can build on Azure. Use it when your data does not require real-time analysis and you plan to review it periodically instead (daily, weekly, bi-weekly, or monthly). The presentation layer is a dashboard that you can customize at will. You can use this architecture both while you are developing your game and in production.

Gathering analytics at a small scale

This article describes the architecture used in this sample on GitHub. Keep in mind that the code from this reference architecture is only an example for guidance, and there may be places to optimize the code before using it in a production environment.

Architecture diagram

Non-real time dashboard reference architecture

Relevant services

  • Azure Function - Used as the API receiving the events from the device clients.
  • Azure Event Hub - A service tailored for analytics pipelines that is simple to use, with little configuration or management overhead. As a bonus, you can keep using it if you later decide to process events in real time.
  • Azure Databricks - Transforms the data from Azure Event Hubs Capture (AVRO format) to JSON files, and also converts the data into CSV files compatible with Power BI. Streaming data from Azure Event Hubs to Azure Blob Storage in a performant way is not trivial, except at a very small scale.
  • Azure Blob Storage - Optimized for storing massive amounts of unstructured data.
  • Power BI - A fully customizable dashboard. Note that Azure can integrate with other data visualization products like IBM SPSS or Tableau, but at the moment those can't connect directly with Azure Blob Storage. Please see the alternative architecture below leveraging Azure SQL Data Warehouse if you are interested in using these data visualization products.

Step by step

  1. Invoke the Azure Function from the device client. Alternatively, you could use virtual machines with a load balancer.
  2. Transfer data from the Azure Function to Azure Event Hubs.
  3. The out-of-the-box Event Hubs Capture feature generates AVRO files containing the data.
  4. An Azure Databricks job reads the data from the AVRO files and extracts the JSON events in the payload. Azure Databricks does the data preparation and deposits the output (CSV files) into Azure Blob Storage.
  5. Power BI reads the CSV files stored in Azure Blob Storage and displays them in a dashboard/report.
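
Step 4 above can be sketched in plain Python. In the sample this work runs as a job in Azure Databricks; the version below is only an illustration that uses the standard library and assumes the Capture records have already been decoded from AVRO into dictionaries whose `Body` field holds the JSON payload. The event fields (`version`, `eventName`, `playerId`) and the function name are hypothetical.

```python
import csv
import io
import json

# Hypothetical event fields; the real sample may track different ones.
CSV_COLUMNS = ["version", "eventName", "playerId", "enqueuedTimeUtc"]


def capture_records_to_csv(records):
    """Extract the JSON event from each Capture record and return CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=CSV_COLUMNS,
        extrasaction="ignore",  # drop event fields that are not CSV columns
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        body = record["Body"]
        if isinstance(body, bytes):
            body = body.decode("utf-8")
        try:
            event = json.loads(body)
        except json.JSONDecodeError:
            continue  # skip payloads that are not valid JSON
        if not isinstance(event, dict):
            continue  # skip JSON values that are not objects
        # Carry over the enqueue time that Capture stores next to the payload.
        event["enqueuedTimeUtc"] = record.get("EnqueuedTimeUtc", "")
        writer.writerow(event)
    return buffer.getvalue()


sample_records = [
    {
        "Body": b'{"version": 1, "eventName": "level_start", "playerId": "p1"}',
        "EnqueuedTimeUtc": "2019-01-01T00:00:00Z",
    },
    {"Body": b"not json", "EnqueuedTimeUtc": "2019-01-01T00:00:01Z"},
]
csv_text = capture_records_to_csv(sample_records)
```

Malformed payloads are skipped rather than failing the whole batch, which is one possible choice; a production job might route them to a separate location for inspection instead.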

Deployment template

Click the following button to deploy the project to your Azure subscription:

Deploy to Azure

This operation will trigger a template deployment of the azuredeploy.json ARM template file to your Azure subscription, which will create the necessary Azure resources. This may incur charges in your Azure account.

Have a look at the general guidelines documentation that includes a section summarizing the naming rules and restrictions for Azure services.

Note

If you're interested in how the ARM template works, review the Azure Resource Manager template documentation from each of the different services leveraged in this reference architecture:

Add these Function application settings so the sample project can connect to the Azure services:

  • EVENTHUB_CONNECTION_STRING - The connection string to the Azure Event Hub namespace that was created.

Next, start the Azure Databricks cluster either through the portal or API.

Then either mount the Azure Storage account using DBFS, or set up the access key to use the APIs directly. See the document on how to access Azure Blob Storage from Azure Databricks for all the details. The recommended path is to use secrets and mount a container, or a folder within the container, and then access the files as if they were local files.
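
As a rough sketch of the mounting approach, the helper below builds the arguments that a notebook would pass to `dbutils.fs.mount`. The storage account name, mount point, secret scope and secret key are hypothetical placeholders, and `dbutils` is only available inside a Databricks notebook, so that call is shown as a comment.

```python
def build_mount_arguments(storage_account, container, mount_point, account_key):
    """Return keyword arguments for mounting a Blob Storage container."""
    source = f"wasbs://{container}@{storage_account}.blob.core.windows.net"
    config_key = f"fs.azure.account.key.{storage_account}.blob.core.windows.net"
    return {
        "source": source,
        "mount_point": mount_point,
        "extra_configs": {config_key: account_key},
    }


# Inside a Databricks notebook (all names below are placeholders):
#   key = dbutils.secrets.get(scope="analytics", key="storage-account-key")
#   dbutils.fs.mount(**build_mount_arguments(
#       "mystorageaccount", "ehcapture-analytics", "/mnt/ehcapture", key))

example = build_mount_arguments(
    "mystorageaccount", "ehcapture-analytics", "/mnt/ehcapture", "<account-key>"
)
```

Reading the key from a secret scope keeps it out of the notebook source, which matches the recommendation above.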

Finally, go to the Azure Databricks cluster that you created, open the folder you want to use (or create a new one), select Import, then URL, and use this script to import the Notebook.

Tip

To run the Azure Functions locally, update the local.settings.json file with these same app settings.

Implementation details

Including a version number in each event will be helpful once the tracked parameters evolve in future game updates.
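
To illustrate the versioning advice, here is a minimal sketch of an event payload that carries a version number. The field names (`version`, `eventName`, `timestamp`, `parameters`) are assumptions made for this example and are not taken from the sample.

```python
import json
import time

# Bump this whenever the set of tracked parameters changes.
EVENT_SCHEMA_VERSION = 1


def build_event(name, parameters, timestamp=None):
    """Build a versioned analytics event as a plain dictionary."""
    return {
        "version": EVENT_SCHEMA_VERSION,
        "eventName": name,
        "timestamp": int(time.time()) if timestamp is None else timestamp,
        "parameters": dict(parameters),
    }


event = build_event("level_start", {"level": 3}, timestamp=0)
payload = json.dumps(event)  # JSON text the client would send to the API
```

With the version stored in every event, the data preparation step can branch on it and keep reading old files after a game update changes the parameters.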

Event Hub partitions

Have a look at the general guidelines documentation to understand the Azure Event Hub requirements and the rule of thumb to select the partition count.

Blob storage performance and limits

Have a look at the general guidelines documentation to learn more about the limits of an Azure storage account and how to avoid throttling.

API

In this reference architecture, the API is implemented as an Azure Function (serverless), so you do not have to consider load balancing and scaling servers. The input of the Azure Function is an HTTP trigger and the output is an Event Hub.

[return: EventHub("ehnrtanalytics-output", Connection = "EventHubConnectionAppSetting")]
    public static async Task<IActionResult> Run([HttpTrigger(AuthorizationLevel.Function, "get", "post", Route = null)] HttpRequest req, ILogger log)
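
From the client side, calling this function could look like the Python sketch below. Because the trigger uses `AuthorizationLevel.Function`, the request carries a function key, passed here in the `x-functions-key` header. The URL, key and event fields are hypothetical placeholders, and the request is only built, not sent.

```python
import json
import urllib.request


def build_event_request(function_url, function_key, event):
    """Build (but do not send) the HTTP POST that delivers one event."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        function_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-functions-key": function_key,  # function-level access key
        },
    )


request = build_event_request(
    "https://example.azurewebsites.net/api/events",  # placeholder URL
    "<function-key>",
    {"version": 1, "eventName": "level_start"},
)
# To actually send it: urllib.request.urlopen(request, timeout=10)
```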

Azure Databricks jobs

The goal is to keep the amount of time the Azure Databricks cluster is running to a minimum. Set up the cluster to terminate after a few minutes of inactivity. Then schedule the Azure Databricks Notebook to run as a job, for example once a day.
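
The two settings described above can be sketched as the payloads you might send to the Databricks REST API. The field names (`autotermination_minutes`, `existing_cluster_id`, `notebook_task`, `schedule.quartz_cron_expression`, `schedule.timezone_id`) reflect my understanding of that API and should be verified against the current Databricks documentation; the job name, notebook path and cluster ID are placeholders.

```python
def daily_quartz_cron(hour, minute=0):
    """Return a Quartz cron expression that fires once a day at hour:minute."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # Quartz field order: seconds minutes hours day-of-month month day-of-week
    return f"0 {minute} {hour} * * ?"


def build_cluster_settings(idle_minutes):
    """Cluster setting that terminates the cluster after a period of inactivity."""
    return {"autotermination_minutes": idle_minutes}


def build_job_settings(notebook_path, cluster_id, hour, timezone_id="UTC"):
    """Job settings that run a notebook once a day on an existing cluster."""
    return {
        "name": "daily-analytics-export",  # placeholder job name
        "existing_cluster_id": cluster_id,
        "notebook_task": {"notebook_path": notebook_path},
        "schedule": {
            "quartz_cron_expression": daily_quartz_cron(hour),
            "timezone_id": timezone_id,
        },
    }


cluster_settings = build_cluster_settings(idle_minutes=10)
job_settings = build_job_settings("/Shared/analytics-export", "<cluster-id>", hour=2)
```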

Power BI dashboard

Follow these steps to pull the information from Azure Blob Storage and prepare the data for visualization:

  1. Open Power BI.
  2. Select Get Data and pick Azure.
  3. Choose Azure Blob Storage.
  4. You will be asked for the URL, which you can find in the Azure Portal:
    1. Open the resource group.
    2. Select the Storage account.
    3. Select Blobs (in the left menu under Blob service).
    4. Select the path (ehcapture-analytics in this example).
    5. Finally, you'll find the URL in the Properties section.
  5. You will be asked for the Storage account key, which can also be found in the Azure portal:
    1. Open the resource group.
    2. Select the Storage account.
    3. Select Access keys (in the left menu under Settings).
    4. Copy the key only, not the connection string.
  6. Then Power BI will connect with the Azure Blob Storage and display a preview of some files discovered. Click Load.
  7. To massage the data:
    1. Select Edit Queries.
    2. Filter the Name column to CSV so only those types of files show up.
    3. Click on the Combine Files icon (it looks like two arrows pointing downwards) from the Content column.
    4. When the pop-up dialog appears, click the OK button.
    5. Close and apply.
  8. Select the fields to display in the query and start adding them to the dashboard in any of the available Power BI visualizations.

Security considerations

Do not hard-code the Event Hub connection string into the source of the Function. At a minimum, store it in the Function App Settings or, for even stronger security, use Key Vault. There is a tutorial explaining how to create a Key Vault, how to use a managed service identity with a Function, and finally how to read the secret stored in Key Vault from a Function.
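
Function App Settings are exposed to the running function as environment variables, so the code can read the connection string at run time instead of embedding it. The sketch below uses Python and the `EVENTHUB_CONNECTION_STRING` setting named earlier in this article; the helper name and the fake value are hypothetical.

```python
import os


def get_event_hub_connection_string(environ=os.environ):
    """Read the Event Hub connection string from the app settings."""
    value = environ.get("EVENTHUB_CONNECTION_STRING", "").strip()
    if not value:
        raise RuntimeError(
            "EVENTHUB_CONNECTION_STRING is not set; add it to the Function "
            "App Settings (or to local.settings.json when running locally)."
        )
    return value


# Example with an explicit mapping instead of the real environment:
fake_settings = {"EVENTHUB_CONNECTION_STRING": "Endpoint=sb://example/;..."}
connection_string = get_event_hub_connection_string(fake_settings)
```

Failing fast with a clear message when the setting is missing makes a misconfigured deployment easier to diagnose than a connection error later on.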

Review the Event Hub authentication and security model overview and put it into practice to ensure only your Azure Function can talk to the Event Hub.

Optimization considerations

You can transition blobs stored in Azure Blob Storage to a "cooler" storage tier (Hot to Cool, Hot to Archive, or Cool to Archive), or delete blobs at the end of their lifecycles, to optimize for performance and cost using an Azure Blob Storage lifecycle management policy.
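
A lifecycle management policy is a JSON document made of rules. The sketch below builds one such rule in Python; the rule name, the blob prefix and the day thresholds are illustrative assumptions, not values recommended by the sample.

```python
import json


def build_lifecycle_policy(prefix, cool_after_days, archive_after_days, delete_after_days):
    """Build a policy that cools, archives and finally deletes block blobs."""
    if not cool_after_days < archive_after_days < delete_after_days:
        raise ValueError("thresholds must increase: cool < archive < delete")
    return {
        "rules": [
            {
                "name": "capture-files-lifecycle",  # placeholder rule name
                "enabled": True,
                "type": "Lifecycle",
                "definition": {
                    "filters": {"blobTypes": ["blockBlob"], "prefixMatch": [prefix]},
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {"daysAfterModificationGreaterThan": cool_after_days},
                            "tierToArchive": {"daysAfterModificationGreaterThan": archive_after_days},
                            "delete": {"daysAfterModificationGreaterThan": delete_after_days},
                        }
                    },
                },
            }
        ]
    }


policy = build_lifecycle_policy("ehcapture-analytics/", 30, 90, 365)
policy_json = json.dumps(policy, indent=2)  # text to apply to the storage account
```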

Alternatives

You could consider replacing Azure Databricks with Azure HDInsight. The main difference in this scenario is that Azure Databricks handles spinning clusters up and down for you, while with Azure HDInsight you have to take care of that yourself.

Gathering analytics at a large scale

Architecture diagram

Large scale analytics with Azure SQL Data Warehouse

Implementation details

By leveraging Azure Event Hubs Capture and Azure Event Grid, you can get the data sent by your players into Azure SQL Data Warehouse. For a full step-by-step walkthrough, see migrate captured Event Hubs data to a SQL Data Warehouse using Event Grid and Azure Functions, including how to use Power BI with SQL Data Warehouse.

Deployment template

Click the following button to deploy the project to your Azure subscription:

Deploy to Azure

This operation will trigger a template deployment of the EventHubsDataMigration.json ARM template file to your Azure subscription, which will create the necessary Azure resources.

Have a look at the general guidelines documentation that includes an article summarizing the naming rules and restrictions for Azure services.

Involving Azure Databricks and Azure HDInsight

Optionally you can choose to prepare data outside of your warehouse to leverage new skills and tooling that are emerging in your studio.

  • With Azure Databricks, data scientists can use the full power of the Databricks Runtime with a variety of language choices to develop ETL (extract, transform, load) processes that can scale as your game grows, and write the data directly into Azure SQL Data Warehouse.
  • Or you could also perform fast, interactive SQL queries at scale over structured or unstructured data produced by your game by hooking up Azure SQL Data Warehouse with Azure HDInsight.

Additional resources and samples

Pricing

If you don't have an Azure subscription, create a free account to get started with 12 months of free services. You're not charged for services included for free with an Azure free account, unless you exceed the limits of these services. Learn how to check usage through the Azure portal or through the usage file.

You are responsible for the cost of the Azure services used while running these reference architectures. The total amount will vary based on usage. See the pricing webpages for each of the services that were used in the reference architecture:

You can also use the Azure pricing calculator to configure and estimate the costs for the Azure services that you are planning to use. Prices are estimates and are not intended as actual price quotes. Actual prices may vary depending upon the date of purchase, currency of payment, and type of agreement you enter with Microsoft. Contact a Microsoft sales representative for additional information on pricing.