Is dynamic SQL Queries supported on Azur Databricks SQL Cluster?

Question

Hello,

I'm planning to implement Dynamic SQL function to query data on Databricks table. Tables and access for the users are governed by a custom access matrix using the Unity catalog.

The problem is that in a custom matrix, there are two types of users: individual users and aggregated users. Individual users are governed through UC but for Aggregated users, they should get result from the Databricks table only if the columns selected by users with metric has a value > 10.

As it depends on user selection, the fields that user will select is dynamic and accordingly aggregate value which will be populated should be dynamic

For example:

This is my dummy table:

User's image

User Can select at one instance Gender wise values, sometimes users can select Gender,Age-Group wise Values.

In whatever select query user fires, for aggregate user, if value >10 then it should show result set.

Actual table will have more than 50+ columns, so i have to design dynamic SQL query/functions.

In actual solution, these queries will be received to Databricks SQL Cluster from PowerBI

Can someone please let me know if dynamic sql queries are supported in Databricks SQL, if no, then any alternative solution please let me know.

Answer

@Jayesh Potwade - Thanks for the question and using MS Q&A platform.

Yes, dynamic SQL queries are supported on Azure Databricks SQL cluster. You can use the spark-sql command to execute dynamic SQL queries on Azure Databricks SQL cluster. Here is an example:

%sql SELECT * FROM my_table WHERE column_name = '${value}'

In this example, ${value} is a parameter that can be passed dynamically to the query. You can also use the spark.sql() function to execute dynamic SQL queries in Python or Scala. Here is an example:

from pyspark.sql.functions import lit

value = "some_value"
query = "SELECT * FROM my_table WHERE column_name = '{}'".format(value)

df = spark.sql(query)

In this example, value is a parameter that can be passed dynamically to the query. The spark.sql() function is used to execute the query and return the result as a DataFrame.

You can use the EXECUTE IMMEDIATE statement to execute dynamic SQL queries in Databricks SQL. Here is an example:

DECLARE @query NVARCHAR(MAX)
SET @query = 'SELECT * FROM my_table WHERE column_name = ''' + @value + ''''

EXECUTE IMMEDIATE @query

In this example, @value is a parameter that can be passed dynamically to the query. The EXECUTE IMMEDIATE statement is used to execute the query.

Regarding your requirement for aggregated users, you can use a CASE statement to check if the selected columns have a value greater than 10. Here is an example:

SELECT
  Gender,
  Age_Group,
  SUM(CASE WHEN Value > 10 THEN Value ELSE 0 END) AS Aggregate_Value
FROM
  my_table
GROUP BY
  Gender,
  Age_Group

In this example, the SUM function is used to calculate the aggregate value for each group. The CASE statement is used to check if the Value column is greater than 10. If it is, then the Value is included in the sum. Otherwise, 0 is included in the sum.

You can modify this query to include only the selected columns by the user. You can also modify the CASE statement to check if the selected columns have a value greater than 10.

Hope this helps. Do let us know if you any further queries.

Share via

Is dynamic SQL Queries supported on Azur Databricks SQL Cluster?

1 answer