The Azure Stream Analytics Query Language

This post is by Janet Yeilding, a Program Manager on the Azure Stream Analytics team, and is based on a recent session from the Microsoft Advanced Analytics webinar series.

Azure Stream Analytics (ASA) is a cloud service that enables real-time processing over streams of data flowing in from devices, sensors, websites and other live systems. The stream-processing logic in ASA is expressed in a SQL-like query language with some added extensions such as windowing for performing temporal calculations. At a recent webinar, we explained key language constructs and query patterns and did an Internet of Things (IoT) demonstration – you can view a recording of the webinar here – and, in this post, we share a few highlights from the webinar and about this technology.

Events and Time

ASA is a temporal system, so every event that flows through it has a timestamp. A timestamp is assigned automatically based on the event's arrival time to the input source but you can also access a timestamp in your event payload explicitly using TIMESTAMP BY:

SELECT * FROM SensorReadings TIMESTAMP BY time

As you perform computations and aggregations over your input and generate output events, Stream Analytics will assign timestamps to them, as well. You can access this timestamp with System.Timestamp:

SELECT System.Timestamp AS Time FROM SensorReadings

Windowing

Windowing is a core requirement for stream processing applications to perform set-based operations like counts or aggregations over events that arrive within a specified period of time. ASA supports three types of windows: Tumbling, Hopping, and Sliding.

Tumbling Windows

Tumbling Windows define a repeating, non-overlapping window of time.  


Example: Calculate the count of sensor readings per device every 10 seconds

SELECT sensorId, COUNT(*) AS Count
FROM SensorReadings TIMESTAMP BY time
GROUP BY sensorId, TumblingWindow(second, 10)

Hopping Windows

Like Tumbling Windows, Hopping Windows move forward in time by a fixed period but they can overlap with one another.


Example: Every 5 seconds calculate the count of sensor readings and the average temperature over the last 10 seconds

SELECT sensorId, COUNT(*) AS Count, AVG(temp)
FROM SensorReadings TIMESTAMP BY time
GROUP BY sensorId, HoppingWindow(second, 10 , 5) 

Sliding Windows

With a Sliding Window, the system is asked to logically consider all possible windows of a given length and output events for cases when the content of the window actually changes – that is, when an event entered or existed the window.


Example : Generate an output event if the temperature is above 75 for a total of 5 seconds

SELECT sensorId, MIN(temp) as temp
FROM SensorReadings
TIMESTAMP BY time
GROUP BY sensorId, SlidingWindow(second, 5)
HAVING MIN(temp) > 75 

Joining Multiple Streams

Like standard T-SQL, JOIN in ASA is used to combine records from two or more input sources. JOINs in ASA are temporal in nature, meaning each JOIN must provide some limits on how far the matching rows can be separated in time. 

Example: Find cases where the humidity reading changed by at least .1 within 5 seconds

SELECT s1.time, s1.dspl, s1.hmdt as previousHmdt, s2.hmdt as newHmdt, datediff(ss, s1.time, s2.time) as secondsApart
FROM SensorData s1 timestamp by time
JOIN SensorData s2 timestamp by time
ON s1.dspl = s2.dspl
AND DATEDIFF(s, s1, s2) BETWEEN 0 AND 5
WHERE (s2.hmdt - s1.hmdt >= .1) or (s1.hmdt - s2.hmdt >= .1) 

Reference Data JOIN

Reference data is static or slow-changing data. It is often used for performing lookups or correlations and can be incorporated into a ASA query using a Reference Data JOIN.

Example: Use a lookup table to perform dynamic temperature threshold alerting for each device

SELECT SensorReadings.sensorID, SensorReadings.temp
FROM SensorReadings
JOIN thresholdRefData
ON SensorReadings.sensorID = thresholdRefData.sensorID
WHERE SensorReadings.temp > thresholdRefData.value 

Multiple Outputs

ASA supports multiple computations and output targets in a single job.  You can use the INTO keyword to specify an output target explicitly. 

Example: Write all events to outputLog and write events above 75 degrees to outputTempAlert

SELECT *
INTO outputLog
FROM SensorReadings

SELECT *
INTO outputTempAlert
FROM SensorReadings
WHERE temp > 75

ML Functions

Azure ML and ASA are now integrated in private preview, enabling you to call ML models over streaming data in an ASA job. For details on supported scenarios and signup information, see Real Time scoring of streaming data using Machine Learning models.

Stay Connected!

You can learn more about ASA from the documentation page and blog or follow us at @AzureStreaming for the latest news.

Janet