Public preview of derived tables and views on graph tables in MATCH queries
SQL Server 2017 and Azure SQL Database introduced native graph database capabilities for modeling many-to-many relationships. The first implementation of SQL Graph introduced support for nodes to represent entities, edges to represent relationships, and a new MATCH predicate to support graph pattern matching and traversal.
We will be further expanding the graph database capabilities with several new features. In this blog we will discuss one of those features that is now available for public preview in Azure SQL Database and SQL Server 2019 CTP2.1: use of derived tables and views on graph tables in MATCH queries.
Graph queries on Azure SQL Database now support using view and derived table aliases in the MATCH syntax. To use these aliases in MATCH, the views and derived tables must be created either on a node or edge table which may or may not have some filters on it or a set of node or edge tables combined together using the UNION ALL operator. The ability to use derived table and view aliases in MATCH queries, could be very useful in scenarios where you are looking to query heterogeneous entities or heterogeneous connections between two or more entities in your graph.
In this post we will discuss a few examples, based on the following graph schema created in the database, to see how derived tables and views can be used to query heterogeneous associations in a graph. The scripts to create the required graph schema can be downloaded from here. In the graph above, we see two types of heterogeneous associations between entities:
- Heterogeneous Nodes: A node is connected to two or more nodes via the same edge, in the graph. For example, consider that WideWorldImporters would like to find all the customers who bought a StockItem from them . In the graph above, we see that customers of WideWorldImporters could be either an individual Customer or a Supplier. Both Customer and Supplier are connected to StockItem via the same edge type Hence, the query will need to find a Supplier or a Customer who bought a StockItem from WideWorldImporters.
- Heterogeneous Edges: Two nodes or entities in a graph are connected to each other via two or more relationships or edges. For instance, in the WideWorldImporters example above, to find a Supplier who operates in a given City, the query has to find a Supplier who is either locatedIn or takes deliveryIn that
These types of queries are generally implemented in the relational model by keeping an extra type or ID column on one of the tables. Queries extract the required information or rows of data, based on the value of this extra type or ID column. But, as the schema and application evolve with more data and relationships, writing queries that involve joins and filters on multiple ID or type columns may not be trivial. With derived table/view support in graph queries, users can write such queries easily using a simple MATCH syntax. In the following section, we will look at some examples to understand how this can be done.
Querying heterogeneous edges
Consider that WideWorldImporters wants to find all the Suppliers that operate in a City. As we discussed earlier, this means they have to find all the suppliers who are either located in or take delivery in the given city. Since , WideWorldImporters can define a view on heterogeneous relationships involved and then use the view alias in the MATCH query as follows:
CREATE VIEW OperatesIn AS SELECT *, 'located' AS relation FROM locatedIn UNION ALL SELECT *, 'delivery' FROM deliveryIn GO
Now, they can use the OperatesIn view in the following query and other queries which might involve querying same relationships.
SELECT SupplierID, SupplierName, PhoneNumber, relation FROM Supplier, City, OperatesIn WHERE MATCH(Supplier-(OperatesIn)->City) AND City.CityName = 'San Francisco'
This query will return information about all the suppliers who operate in San Francisco.
Querying heterogeneous nodes connected via same edge
WideWorldImporters wants to find all of their customers located in San Francisco. This means, they have to find all the Customers (distributor or organization) and Suppliers who are locatedIn San Francisco. They can now create a view to combine all heterogeneous types of customers into one entity as follows:
CREATE VIEW Customer AS SELECT SupplierID AS ID, SupplierName AS NAME, SupplierCategory AS CATEGORY FROM Supplier UNION ALL SELECT CustomerID, CustomerName, CustomerCategory FROM Customers GO
Now, to find all customers locatedIn San Francisco, they can run the following MATCH query:
SELECT Customer.ID, Customer.NAME, Customer.CATEGORY FROM Customer, City, locatedIn WHERE MATCH(Customer-(locatedIn)->City) AND City.CityName = 'San Francisco'
Querying heterogeneous nodes and edges
Extending the scenario in the first example above, let's consider that WideWorldImporters now wants to find all the customers (distributor, organization or suppliers) who operate in San Francisco. Note that here, both the involved edges (locatedIn and deliveryIn) and the involved customer nodes (Supplier and Customers) are heterogeneous. Since WideWorldImporters has already created Customer and OperatesIn views, they can now write the following MATCH query to get the desired results:
SELECT Customer.ID, Customer.NAME, Customer.CATEGORY FROM Customer, City, OperatesIn WHERE MATCH(Customer-(OperatesIn)->City) AND City.CityName = 'San Francisco'
Nested derived tables or views on node or edge tables
Assume that WideWorldImporters wants to find the customers or suppliers who are a store or supplier for Novelty goods, Toys or Gifts. They can create the following views on Supplier and Customers tables to filter the required rows:
CREATE VIEW Novelty_Supplier AS SELECT SupplierID, SupplierName , SupplierCategory , ValidTo FROM Supplier WHERE SupplierCategory LIKE '%Novelty%' OR SupplierCategory LIKE '%Toy%' GO CREATE VIEW Novelty_Customer AS SELECT CustomerID, CustomerName, CustomerCategory, ValidTo FROM Customers WHERE CustomerCategory LIKE '%Novelty%' OR CustomerCategory LIKE '%Gift%' GO
Now, they want to find out all stores or suppliers for novelty goods, toys or gifts who operate in San Francisco and they have purchased ‘White Chocolate Snow Balls 250g’ from WideWorldImporters. They want to make sure that the supplier or customer still has a valid membership with WideWorldImporters. The following query helps them gather all the information:
SELECT Name, ID, Category FROM (SELECT SupplierID AS ID, SupplierName AS Name, SupplierCategory AS Category, ValidTo FROM Novelty_Supplier WHERE ValidTo > getdate() UNION ALL SELECT CustomerID, CustomerName, CustomerCategory, ValidTo FROM Novelty_Customer WHERE ValidTo > getdate()) AS NoveltyCust, StockItems, bought, Operates, City WHERE MATCH(City<-(Operates)-NoveltyCust-(bought)->Stockitems) AND StockItemName = 'White chocolate snow balls 250g' AND city.cityname = 'San Francisco' GO
The ability to use view and derived table aliases in a MATCH query make many scenarios easier. For example, for fraud detection in banking, finance or insurance organizations, often one needs to find the heterogeneous relationships that a given customer shares with other customers in the organization. Derived tables on nodes or edge tables will make writing those queries easy and these derived tables can be used in several places in an application for different type of queries.
It is now possible to use view and derived table aliases within a MATCH query in Azure SQL Database. For this feature to work, the view or derived table alias must be defined on
- Either a node or an edge table with some filter(s) on it
- Or a view or derived table which combines several node or edge tables together using the UNION ALL operator.
Combining node or edge tables using other operators like join, is possible, but such view and derived table aliases cannot be used inside a MATCH query.
You can now use derived table and view aliases within a graph match query in Azure SQL Database and SQL Server 2019 CTP2.1. Please give it a try and send us your feedback.