Create a Stream Analytics job to analyze phone call data and visualize results in a Power BI dashboard
This tutorial shows how to use Azure Stream Analytics to analyze sample phone call data that is generated by a client application. The generated data contains some fraudulent calls, and you define a Stream Analytics job to filter for those calls.
In this tutorial, you learn how to:
- Generate sample phone call data and send it to Azure Event Hubs
- Create a Stream Analytics job
- Configure input and output to the job
- Define a query to filter fraudulent calls
- Test and start the job
- Visualize results in Power BI
Before you start, make sure you have the following:
- If you don't have an Azure subscription, create a free account.
- Log in to the Azure portal.
- Download the phone call event generator app TelcoGenerator.zip from the Microsoft Download Center or you can get the source code from GitHub.
Create an Azure Event Hub
Before Stream Analytics can analyze the fraudulent calls data stream, you should send the data to Azure. In this tutorial, you will send data to Azure by using Azure Event Hubs. For this tutorial, you create an event hub and make the event generator app send call data to that event hub. Run the following steps to create an event hub:
- Log in to the Azure portal.
Select Create a resource > Internet of Things > Event Hubs.
Fill out the Create namespace pane with the following values:
| Setting | Suggested value | Description |
|---|---|---|
| Name | myEventHubNS | A unique name to identify the event hub namespace. |
| Subscription | <Your subscription> | Select an Azure subscription where you want to create the event hub. |
| Resource group | MyASADemoRG | Select Create New and enter a new resource-group name for your account. |
| Location | West US2 | Location where the event hub namespace can be deployed. |
Use default options on the remaining settings and select Create.
When the namespace has finished deploying, go to All resources > find "myEventHubNS" in the list of Azure resources > select to open it.
Next select +Event Hub > Name the event hub “MyEventHub”. You can use a different name. Use default options on remaining settings, select Create and wait for the deployment to succeed.
Grant access to the event hub and get a connection string
Before an application can send data to Azure Event Hubs, the event hub must have a policy that allows appropriate access. The access policy produces a connection string that includes authorization information.
- Navigate to the event hub you created in the previous step (“MyEventHub”) > select Shared access policies from the event hub pane > select +Add.
Set the Policy name to Mypolicy, select Manage, and then select Create.
After the policy has been deployed, select it to open it, find Connection string–primary key, and select the copy button next to the connection string.
Paste the connection string into a text editor. You need this connection string in the next section.
The connection string looks as follows:
Endpoint=sb://<Your event hub namespace>.servicebus.windows.net/;SharedAccessKeyName=<Your shared access policy name>;SharedAccessKey=<generated key>;EntityPath=<Your event hub name>
Notice that the connection string contains multiple key-value pairs, separated with semicolons: Endpoint, SharedAccessKeyName, SharedAccessKey, and EntityPath.
Remove the EntityPath pair from the connection string (don't forget to remove the semicolon that precedes it).
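If you prefer not to edit the string by hand, the split can be sketched in a few lines of Python. This is a minimal illustration, not part of the tutorial's tooling: the function name `split_connection_string` and the placeholder values in `example` are assumptions made for this sketch.

```python
from typing import Optional, Tuple


def split_connection_string(conn_str: str) -> Tuple[str, Optional[str]]:
    """Return (connection string without EntityPath, EntityPath value or None)."""
    kept = []
    entity_path = None
    for pair in conn_str.strip().split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only, because keys may end with "=" padding.
        key, _, value = pair.partition("=")
        if key.strip().lower() == "entitypath":
            entity_path = value
        else:
            kept.append(pair)
    return ";".join(kept), entity_path


# Placeholder values only; substitute the string you copied from the portal.
example = (
    "Endpoint=sb://myeventhubns.servicebus.windows.net/;"
    "SharedAccessKeyName=Mypolicy;"
    "SharedAccessKey=PLACEHOLDERKEY=;"
    "EntityPath=myeventhub"
)
trimmed, hub_name = split_connection_string(example)
print(trimmed)   # connection string without the EntityPath pair
print(hub_name)  # the event hub name taken from EntityPath
```

Keep both results: the trimmed string and the event hub name are each used when you configure the generator app.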
Start the event generator application
Before you start the TelcoGenerator app, configure it to send data to the event hub you created earlier.
- Extract the contents of the TelcoGenerator.zip file.
Open the TelcoGenerator\TelcoGenerator\telcodatagen.exe.config file in a text editor of your choice. (There is more than one .config file, so be sure that you open the right one.)
Update the <appSettings> element in the config file with the following details:
- Set the value of the EventHubName key to the value of the EntityPath in the connection string.
- Set the value of the Microsoft.ServiceBus.ConnectionString key to the connection string without the EntityPath pair, that is, the edited string from the last step of the previous section.
Save the file.
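The same edit can be scripted. The sketch below is an illustration under stated assumptions: it assumes the two keys are stored as `<add key="..." value="..."/>` entries under an `<appSettings>` element (the usual .NET layout), and `SAMPLE_CONFIG` is a hypothetical stand-in, not the real contents of telcodatagen.exe.config. The helper name `set_app_settings` is also made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the config file; the real file may contain more.
SAMPLE_CONFIG = """<configuration>
  <appSettings>
    <add key="EventHubName" value="" />
    <add key="Microsoft.ServiceBus.ConnectionString" value="" />
  </appSettings>
</configuration>"""


def set_app_settings(config_xml: str, settings: dict) -> str:
    """Return config_xml with matching <add key=...> values replaced."""
    root = ET.fromstring(config_xml)
    app_settings = root.find("appSettings")
    if app_settings is None:
        raise ValueError("no <appSettings> element found")
    for entry in app_settings.findall("add"):
        key = entry.get("key")
        if key in settings:
            entry.set("value", settings[key])
    return ET.tostring(root, encoding="unicode")


# Placeholder values only; use your own event hub name and connection string.
updated = set_app_settings(
    SAMPLE_CONFIG,
    {
        "EventHubName": "myeventhub",
        "Microsoft.ServiceBus.ConnectionString": (
            "Endpoint=sb://myeventhubns.servicebus.windows.net/;"
            "SharedAccessKeyName=Mypolicy;SharedAccessKey=PLACEHOLDERKEY="
        ),
    },
)
print(updated)
```

Editing the file in a text editor, as described above, achieves the same result; the script is only useful if you need to repeat the setup.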
Next, open a command window, change to the folder where you unzipped the TelcoGenerator application, and enter the following command:
telcodatagen.exe 1000 .2 2
This command takes the following parameters:
- Number of call data records per hour.
- Fraud probability - how often the app should simulate a fraudulent call. The value .2 means that about 20% of the call records will look fraudulent.
- Duration in hours - the number of hours that the app should run. You can also stop the app any time by ending the process (Ctrl+C) at the command line.
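As a back-of-the-envelope check, the three parameters can be combined to estimate how much data a run produces. The helper below is a hypothetical sketch based only on the parameter descriptions above; the app's actual output is randomized and may differ.

```python
def expected_volume(records_per_hour, fraud_probability, duration_hours):
    """Estimate total records and simulated fraudulent records for one run."""
    total = records_per_hour * duration_hours
    fraudulent = total * fraud_probability
    return total, fraudulent


# Mirrors the command line: telcodatagen.exe 1000 .2 2
total, fraudulent = expected_volume(1000, 0.2, 2)
print(f"about {total:.0f} records, roughly {fraudulent:.0f} simulated as fraudulent")
```

For the command shown, that works out to about 2000 records over two hours, of which roughly 400 are simulated as fraudulent.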
After a few seconds, the app starts displaying phone call records on the screen as it sends them to the event hub. The phone call data contains the following fields:
| Record | Definition |
|---|---|
| CallrecTime | The timestamp for the call start time. |
| SwitchNum | The telephone switch used to connect the call. For this example, the switches are strings that represent the country of origin (US, China, UK, Germany, or Australia). |
| CallingNum | The phone number of the caller. |
| CallingIMSI | The International Mobile Subscriber Identity (IMSI). It's a unique identifier of the caller. |
| CalledNum | The phone number of the call recipient. |
| CalledIMSI | International Mobile Subscriber Identity (IMSI). It's a unique identifier of the call recipient. |
Create a Stream Analytics job
Now that you have a stream of call events, you can create a Stream Analytics job that reads data from the event hub.
To create a Stream Analytics job, navigate to the Azure portal
Select Create a resource > Internet of Things > Stream Analytics job.
Fill out the New Stream Analytics job pane with the following values:
| Setting | Suggested value | Description |
|---|---|---|
| Job name | ASATutorial | A unique name to identify the Stream Analytics job. |
| Subscription | <Your subscription> | Select an Azure subscription where you want to create the job. |
| Resource group | MyASADemoRG | Select Use existing and choose the resource group you created for the event hub namespace. |
| Location | West US2 | Location where the job can be deployed. It's recommended to place the job and the event hub in the same region for best performance and so that you don't pay to transfer data between regions. |
| Hosting environment | Cloud | Stream Analytics jobs can be deployed to cloud or edge. Cloud allows you to deploy to Azure Cloud, and Edge allows you to deploy to an IoT edge device. |
| Streaming units | 1 | Streaming units represent the computing resources that are required to execute a job. By default, this value is set to 1. To learn about scaling streaming units, see the understanding and adjusting streaming units article. |
Use default options on the remaining settings, select Create and wait for the deployment to succeed.
Configure job input
The next step is to define an input source for the job to read data. For this tutorial, you'll use the event hub you created in the previous section as input. Run the following steps to configure input to your job:
In the Azure portal, open the All resources pane and find the ASATutorial Stream Analytics job.
In the Job Topology section of the Stream Analytics job pane, select the Inputs option.
Select +Add stream input > Event hub, and then fill out the pane with the following values. (The other option, reference input, refers to static lookup data, which you won't use in this tutorial.)
| Setting | Suggested value | Description |
|---|---|---|
| Input alias | CallStream | Provide a friendly name to identify your input. Input alias can contain alphanumeric characters, hyphens, and underscores only and must be 3-63 characters long. |
| Subscription | <Your subscription> | Select the Azure subscription where you created the event hub. The event hub can be in the same subscription as the Stream Analytics job or in a different one. |
| Event hub namespace | MyEventHubNS | Select the event hub namespace you created in the previous section. All the event hub namespaces available in your current subscription are listed in the dropdown. |
| Event Hub name | MyEventHub | Select the event hub you created in the previous section. All the event hubs available in your current subscription are listed in the dropdown. |
| Event Hub policy name | Mypolicy | Select the event hub shared access policy you created in the previous section. All the event hub policies available in your current subscription are listed in the dropdown. |
Use default options on the remaining settings, select Save and wait for the deployment to succeed.
Configure job output
The last step is to define an output sink where the job can write the transformed data. For this tutorial, you'll output results to Power BI and visualize the data. Run the following steps to configure output to your job:
In the Azure portal, open the All resources pane and open the ASATutorial Stream Analytics job.
In the Job Topology section of the Stream Analytics job pane, select the Outputs option.
Select +Add > Power BI and fill the form with the following details (you can provide a friendly name to identify Output alias, Dataset name and Table name as shown in the table) and select Authorize:
| Setting | Suggested value |
|---|---|
| Output alias | MyPBIoutput |
| Dataset name | ASAdataset |
| Table name | ASATable |

After you select Authorize, a pop-up window opens and asks you to provide credentials to authenticate to your Power BI account. Once the authorization is successful, save the settings.
Define a query to analyze input data
After you have a Stream Analytics job set up to read an incoming data stream, the next step is to create a transformation that analyzes data in real time. You define the transformation query by using Stream Analytics Query Language. In this tutorial, you define a query that detects fraudulent calls in the phone data.
For this example, we consider a call fraudulent when the same user makes calls from separate locations within five seconds of each other. For example, the same user can't legitimately make a call from the US and Australia at the same time. To define the transformation query for your Stream Analytics job, run the following steps:
In the Azure portal, open the All resources pane and open the ASATutorial Stream Analytics job you created earlier.
In the Job Topology section of the Stream Analytics job pane, select the Query option. The pop-up window lists the inputs and outputs that are configured for the job, and lets you create a query to transform the input stream.
Next, replace the existing query in the editor with the following query, which performs a self-join over a 5-second interval of call data:
SELECT System.Timestamp AS WindowEnd, COUNT(*) AS FraudulentCalls
INTO "MyPBIoutput"
FROM "CallStream" CS1 TIMESTAMP BY CallRecTime
JOIN "CallStream" CS2 TIMESTAMP BY CallRecTime
ON CS1.CallingIMSI = CS2.CallingIMSI
AND DATEDIFF(ss, CS1, CS2) BETWEEN 1 AND 5
WHERE CS1.SwitchNum != CS2.SwitchNum
GROUP BY TumblingWindow(Duration(second, 1))
To check for fraudulent calls, you self-join the streaming data based on the CallRecTime value. You can then look for call records where the CallingIMSI value (the originating number) is the same, but the SwitchNum value (country of origin) is different. When you use a JOIN operation with streaming data, the join must provide some limits on how far the matching rows can be separated in time. Because the streaming data is endless, the time bounds for the relationship are specified within the ON clause of the join, using the DATEDIFF function.
This query is just like a normal SQL join except for the DATEDIFF function. The DATEDIFF function used in this query is specific to Stream Analytics, and it must appear within the ON clause of the join, bounded with BETWEEN as in the query above.
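To make the join condition concrete, here is a small Python sketch that applies the same three tests to an in-memory list of records: same CallingIMSI, different SwitchNum, and a gap of 1 to 5 seconds. It is an approximation for illustration only. The `CallRecord` class, the `fraudulent_pairs` function, and the sample values are hypothetical. It also measures elapsed seconds rather than reproducing DATEDIFF exactly, and counting matches by the whole second of the later call is only a rough stand-in for the one-second tumbling window.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CallRecord:
    call_rec_time: datetime
    switch_num: str
    calling_imsi: str


def fraudulent_pairs(records):
    """Yield (first, second) pairs: same IMSI, different switch, 1-5 s apart."""
    ordered = sorted(records, key=lambda r: r.call_rec_time)
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            gap = (second.call_rec_time - first.call_rec_time).total_seconds()
            if gap > 5:
                break  # records are time-ordered, so later ones are further away
            if (gap >= 1
                    and first.calling_imsi == second.calling_imsi
                    and first.switch_num != second.switch_num):
                yield first, second


# Hypothetical sample records, not output from the generator app.
t0 = datetime(2024, 1, 1, 12, 0, 0)
records = [
    CallRecord(t0, "US", "imsi-1"),
    CallRecord(t0 + timedelta(seconds=3), "Australia", "imsi-1"),
    CallRecord(t0 + timedelta(seconds=4), "US", "imsi-2"),
    CallRecord(t0 + timedelta(seconds=20), "UK", "imsi-1"),
]

pairs = list(fraudulent_pairs(records))
# Rough analogue of COUNT(*) per window: bucket by the later call's second.
per_second = Counter(second.call_rec_time.replace(microsecond=0) for _, second in pairs)
print(len(pairs), dict(per_second))
```

Only the first two records match: they share an IMSI, come from different switches, and are three seconds apart. The UK call is from the same IMSI but falls outside the five-second bound.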
Save the query.
Test your query
You can test a query from the query editor, but you need sample data to do so. For this walkthrough, you'll extract sample data from the phone call stream that's coming into the event hub. Run the following steps to test the query:
Make sure that the TelcoGenerator app is running and producing phone call records.
In the Query pane, select the dots next to the CallStream input and then select Sample data from input. This opens a pane that lets you specify how much sample data to read from the input stream.
Set Minutes to 3 and select OK. Three minutes' worth of data is sampled from the input stream, and you are notified when the sample data is ready. You can view the status of sampling from the notification bar.
The sample data is stored temporarily and is available while you have the query window open. If you close the query window, the sample data is discarded, and you'll have to create a new set of sample data. As an alternative, you can get a .json file that has sample data in it from [GitHub](https://github.com/Azure/azure-stream-analytics/blob/master/Sample%20Data/telco.json), and then upload that .json file to use as sample data for the CallStream input.
Select Test to run the query against the sample data, and then review the output results.
Start the job and visualize output
To start the job, navigate to the Overview pane of your job, and select Start.
Select Now for job output start time and select Start. The job starts in a few minutes, and you can view the status in the notification bar.
After the job starts successfully, navigate to Powerbi.com and sign in with your work or school account. If the Stream Analytics job query outputs results, your dataset is already created. Navigate to the Datasets tab to view a dataset named “ASAdataset”.
From your workspace, select +Create. Create a new dashboard and name it Fraudulent Calls. You will add two tiles to this dashboard: one tile shows the count of fraudulent calls at a given instant, and the other shows a line chart visualization.
At the top of the window, select Add tile > Custom Streaming Data > Next > choose the ASAdataset > for Visualization type select Card > and for Fields select fraudulentcalls. Select Next, enter a name for the tile, and then select Apply.
Repeat the previous step to add a second tile, with the following options:
- When you get to Visualization Type, select Line chart.
- Add an axis and select windowend.
- Add a value and select fraudulentcalls.
- For Time window to display, select the last 10 minutes.
After both tiles are added, your dashboard shows the card and the line chart. If your event hub sender application and Stream Analytics job are running, your Power BI dashboard periodically updates as new data arrives.
Embedding your Power BI Dashboard in a Web Application
For this part of the tutorial, you'll use a sample ASP.NET web application created by the Power BI team to embed your dashboard. For more information about embedding dashboards, see the embedding with Power BI article.
In this tutorial, we'll follow the steps for the User Owns Data application. To set up the application, go to the PowerBI-Developer-Samples GitHub repository and follow the instructions under the User Owns Data section (use the redirect and homepage URLs under the integrate-dashboard-web-app subsection). Because we are using the Dashboard example, use the integrate-dashboard-web-app sample code located in the [GitHub repository](https://github.com/Microsoft/PowerBI-Developer-Samples/tree/master/User%20Owns%20Data/integrate-dashboard-web-app). Once you've got the application running in your browser, follow these steps to embed the dashboard you created earlier into the web page:
Select Sign in to Power BI, which grants the application access to the dashboards in your Power BI account.
Select the Get Dashboards button, which displays your account's Dashboards in a table. Find the name of the dashboard you created earlier (powerbi-embedded-dashboard) and copy the corresponding EmbedUrl.
Finally, paste the EmbedUrl into the corresponding text field and select Embed Dashboard. You can now view the same dashboard embedded within a web application.
In this tutorial, you have created a simple Stream Analytics job, analyzed the incoming data and presented results in a Power BI dashboard. To learn more about Stream Analytics jobs, continue to the next tutorial: