September 2019

Volume 34 Number 9

[.NET Development]

Expression Trees in Visual Basic and C#

By Zev Spitz

Imagine if you had objects that represented parts of a program. Your code could combine these parts in various ways, creating an object that represented a new program. You could compile this object into a new program and run it; your program could even rewrite its own logic. You could pass this object to a different environment, where the individual parts would be parsed as a set of instructions to be run in this other environment.

Welcome to expression trees. This article aims to provide a high-level overview of expression trees, their use in modeling code operations and runtime code compilation, and how to create and transform them.

A Note About Language For many language features in Visual Basic and C#, the syntax is actually just a thin wrapper over language-agnostic .NET types and methods. For example, C#’s foreach loop and Visual Basic’s For Each construct both call into the IEnumerable-implementing type’s GetEnumerator method. LINQ keyword syntax is resolved into methods such as Select and Where with the appropriate signature. A call to IDisposable.Dispose wrapped in error handling is generated every time you write Using in Visual Basic or a using block in C#.

This is true of expression trees, as well—both Visual Basic and C# have syntactic support for creating expression trees, and can use the .NET infrastructure (in the System.Linq.Expressions namespace) for working with them. As such, the concepts described here apply equally well to both languages. Whether your primary development language is C# or Visual Basic, I believe you’ll find value in this article.

Expressions, Expression Trees and the Types in System.Linq.Expressions

An expression in Visual Basic and C# is a piece of code that returns a value when evaluated, for example:

42
"abcd"
True
n

Expressions can be composed of other expressions, such as:

x + y
"abcd".Length < 5 * 2

These form a tree of expressions, or an expression tree. Consider n + 42 = 27 in Visual Basic, or n + 42 == 27 in C#, which can be separated into parts as shown in Figure 1.

Figure 1 Breakdown of an Expression Tree

.NET provides a set of types in the System.Linq.Expressions namespace for constructing data structures that represent expression trees. For example, n + 42 = 27 can be represented with objects and properties as shown in Figure 2. (Note that this code will not compile—these types don’t have public constructors, and they’re immutable, so no object initializers.)

Figure 2 Expression Tree Objects Using Object Notation

New BinaryExpression With {
  .NodeType = ExpressionType.Equal,
  .Left = New BinaryExpression With {
    .NodeType = ExpressionType.Add,
    .Left = New ParameterExpression With {
      .Name = "n"
    },
    .Right = New ConstantExpression With {
      .Value = 42
    }
  },
  .Right = New ConstantExpression With {
    .Value = 27
  }
}

new BinaryExpression {
  NodeType = ExpressionType.Equal,
  Left = new BinaryExpression {
    NodeType = ExpressionType.Add,
    Left = new ParameterExpression {
      Name = "n"
    },
    Right = new ConstantExpression {
      Value = 42
    }
  },
  Right = new ConstantExpression {
    Value = 27
  }
}

The term expression tree in .NET is used for both syntactical expression trees (x + y), and expression tree objects (a BinaryExpression instance).

The type of a node object (BinaryExpression, ConstantExpression, ParameterExpression) describes only the shape of the expression that the node represents. For example, the BinaryExpression type tells me that the represented expression has two operands, but not which operation combines the operands. Are the operands added together, multiplied together or compared numerically? That information is contained in the NodeType property, defined in the Expression base class. (Note that some node types don’t represent code operations; see “Meta Node Types.”)

Meta Node Types

Most node types represent code operations. However, there are three “meta” node types—node types that provide information about the tree, but do not map directly to code.

ExpressionType.Quote This type of node always wraps a LambdaExpression, and specifies that the LambdaExpression defines a new expression tree, not a delegate. For example, the tree generated from the following Visual Basic code:

Dim expr As Expression(Of Func(Of Func(Of Boolean))) = Function() Function() True

or the following C# code:

Expression<Func<Func<bool>>> expr = () => () => true;

represents a delegate that produces another delegate. If you want to represent a delegate that produces another expression tree, you have to wrap the inner one in a Quote node. The compiler does this automatically with the following Visual Basic code:

Dim expr As Expression(Of Func(Of Expression(Of Func(Of Boolean)))) =
  Function() Function() True

Or the following C# code:

 

Expression<Func<Expression<Func<bool>>>> expr = () => () => true;

ExpressionType.DebugInfo This node emits debug information, so when you debug compiled expressions, the IL at a particular point can be mapped to the right place in your source code.

ExpressionType.RuntimeVariables Consider the arguments object in ES3; it’s available within a function without having been declared as an explicit variable. Python exposes a locals function, which returns a dictionary of variables defined in the local namespace. These “virtual” variables are described within an expression tree using a RuntimeVariablesExpression.
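
The Quote node described above can be observed directly on compiler-generated trees. The following C# sketch (the variable names are mine, chosen for illustration) builds both variants of the nested lambda and prints the node type of each outer body:

```csharp
using System;
using System.Linq.Expressions;

// A lambda that returns a delegate: the body is a plain Lambda node.
Expression<Func<Func<bool>>> returnsDelegate = () => () => true;

// A lambda that returns an expression tree: the compiler wraps
// the inner lambda in a Quote node.
Expression<Func<Expression<Func<bool>>>> returnsTree = () => () => true;

Console.WriteLine(returnsDelegate.Body.NodeType);  // Lambda
Console.WriteLine(returnsTree.Body.NodeType);      // Quote

// The Quote node is a UnaryExpression whose operand is the inner lambda.
var quote = (UnaryExpression)returnsTree.Body;
Console.WriteLine(quote.Operand.NodeType);         // Lambda
```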

An additional property of note on Expression is the Type property. In both Visual Basic and C#, every expression has a type—adding two Integers will produce an Integer, while adding two Doubles will produce a Double. The Type property returns the System.Type of the expression.

When visualizing the structure of an expression tree, it’s useful to focus on these two properties, as depicted in Figure 3.

Figure 3 NodeType, Type, Value and Name Properties
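
To see these two properties on real objects, here is a small C# sketch that lets the compiler build the tree for n + 42 == 27 and then prints NodeType and Type for each node. The variable names are only for illustration.

```csharp
using System;
using System.Linq.Expressions;

// The compiler builds the expression tree for the lambda body.
Expression<Func<int, bool>> expr = n => n + 42 == 27;

var equal = (BinaryExpression)expr.Body;   // n + 42 == 27
var add = (BinaryExpression)equal.Left;    // n + 42

Console.WriteLine($"{equal.NodeType} : {equal.Type.Name}");             // Equal : Boolean
Console.WriteLine($"{add.NodeType} : {add.Type.Name}");                 // Add : Int32
Console.WriteLine($"{add.Left.NodeType} : {add.Left.Type.Name}");       // Parameter : Int32
Console.WriteLine($"{add.Right.NodeType} : {add.Right.Type.Name}");     // Constant : Int32
Console.WriteLine($"{equal.Right.NodeType} : {equal.Right.Type.Name}"); // Constant : Int32
```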

If there are no public constructors for these types, how do we create them? You have two choices. The compiler can generate them for you if you write lambda syntax where an expression type is expected. Or you can use the Shared (static in C#) factory methods on the Expression class.

Constructing Expression Tree Objects I: Using the Compiler

The simple way to construct these objects is to have the compiler generate them for you, as mentioned previously, by using lambda syntax anywhere an Expression(Of TDelegate) (Expression<TDelegate> in C#) is expected: either assignment to a typed variable, or as an argument to a typed parameter. Lambda syntax is introduced in Visual Basic using either the Function or Sub keywords, followed by a parameter list and a method body; in C# the => operator indicates lambda syntax.

Lambda syntax that defines an expression tree is called an expression lambda (as opposed to a statement lambda), and looks like this in C#:

Expression<Func<int, string>> expr = i => i.ToString();
IQueryable<Person> personSource = ...
var qry = personSource.Select(x => x.LastName);

And like this in Visual Basic:

Dim expr As Expression(Of Func(Of Integer, String)) = Function(i) i.ToString
Dim personSource As IQueryable(Of Person) = ...
Dim qry = personSource.Select(Function(x) x.LastName)

Producing an expression tree in this way has some limitations:

  • Limited by lambda syntax
  • Supports only single-line lambda expressions, with no multiline lambdas
  • Expression lambdas can only contain expressions, not statements such as If…Then or Try…Catch
  • No late binding (or dynamic in C#)
  • In C#, no named arguments or omitted optional arguments
  • No null-propagation operator

The complete structure of the constructed expression tree—that is, node types, types of sub-expressions and overload resolution of methods—is determined at compile time. Because expression trees are immutable, the structure cannot be modified once created.

Note also that compiler-generated trees mimic the behavior of the compiler, so the generated tree might be different from what you’d expect given the original code (Figure 4). Some examples:

Figure 4 Compiler-Generated Trees vs. Source Code

Closed-over Variables In order to reference the current value of variables when going in and out of lambda expressions, the compiler creates a hidden class whose members correspond to each of the referenced variables; the lambda expression’s function becomes a method of the class (see “Closed-over Variables”). The corresponding expression tree will render closed-over variables as MemberAccess nodes on the hidden instance. In Visual Basic the $VB$Local_ prefix will also be prepended to the variable name.

Closed-over Variables

When a lambda expression references a variable defined outside of it, we say the lambda expression “closes over” the variable. The compiler has to give this variable some special attention, because the lambda expression might be used after the value of the variable has been changed, and the lambda expression is expected to reference the new value. Or vice versa, the lambda expression might change the value, and that change should be visible outside of the lambda expression. For example, in the following C# code:

var i = 5;
Action lmbd = () => Console.WriteLine(i);
i = 6;
lmbd();

or in the following Visual Basic code:

Dim i = 5
Dim lmbd = Sub() Console.WriteLine(i)
i = 6
lmbd()

the expected output would be 6, not 5, because the lambda expression should use the value of i at the point when the lambda expression is invoked, after i has been set to 6.

The C# compiler accomplishes this by creating a hidden class with the needed variables as fields of the class, and the lambda expressions as methods on the class. The compiler then replaces all references to that variable with member access on the class instance. The class instance doesn’t change, but the value of its fields can. The resulting IL looks something like the following in C#:

[CompilerGenerated]
private sealed class <>c__DisplayClass0_0 {
  public int i;
  internal void <Main>b__0() => Console.WriteLine(i);
}
var @object = new <>c__DisplayClass0_0();
@object.i = 5;
Action lmbd = @object.<Main>b__0;
@object.i = 6;
lmbd();

The Visual Basic compiler does something similar, with one difference: $VB$Local_ is prepended to the field name, like so:

 

<CompilerGenerated> Friend NotInheritable Class _Closure$__0-0
  Public $VB$Local_i As Integer
  Sub _Lambda$__0()
    Console.WriteLine($VB$Local_i)
  End Sub
End Class
Dim targetObject = New _Closure$__0-0 With { .$VB$Local_i = 5 }
Dim lmbd As Action = AddressOf targetObject._Lambda$__0
targetObject.$VB$Local_i = 6
lmbd()
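
The same hidden class shows up in expression trees. In this C# sketch (variable names are illustrative), the closed-over variable i appears as a MemberAccess node on a Constant node holding the hidden instance, and the compiled tree reads the current value of i:

```csharp
using System;
using System.Linq.Expressions;

int i = 5;
Expression<Func<int>> expr = () => i;

// The reference to i is a member access on the compiler-generated closure object.
var member = (MemberExpression)expr.Body;
Console.WriteLine(member.NodeType);             // MemberAccess
Console.WriteLine(member.Member.Name);          // i
Console.WriteLine(member.Expression.NodeType);  // Constant

// The tree holds the closure instance, not a copy of the value.
i = 6;
Console.WriteLine(expr.Compile()());            // 6
```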

NameOf Operator The result of the NameOf operator is rendered as a Constant string value.

String Interpolation and Boxing Conversion This is resolved into a call to String.Format and a Constant format string. Because the interpolation expression is typed as a value type, Date (the Visual Basic alias for DateTime), while the corresponding parameter of String.Format is expecting an Object, the compiler also wraps the interpolation expression with a Convert node to Object.

Extension Method Calls These are actually calls to module-level methods (static methods in C#), and they’re rendered as such in expression trees. Module-level and Shared methods don’t have an instance, so the corresponding MethodCallExpression’s Object property will return Nothing (or null). If the MethodCallExpression represents an instance method call, the Object property will not be Nothing.

The way extension methods are represented in expression trees highlights an important difference between expression trees and Roslyn compiler syntax trees, which preserve the syntax as is; expression trees are focused less on the precise syntax and more on the underlying operations. The Roslyn syntax tree for an extension method call would look the same as a standard instance method call, not a Shared or static method call.

Conversions When an expression’s type doesn’t match the expected type, and there’s an implicit conversion from the expression’s type, the compiler will wrap the inner expression in a Convert node to the expected type. The Visual Basic compiler will do the same when generating expression trees with an expression whose type implements or inherits from the expected type. For example, in Figure 4, the Count extension method expects an IEnumerable(Of Char), but the actual type of the expression is String.
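
Several of these behaviors can be checked with a few lines of C#. The sketch below is not the code behind Figure 4; the three lambdas are my own minimal examples of the NameOf, boxing conversion and extension method cases.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

Expression<Func<string>> nameOfExpr = () => nameof(Console);
Expression<Func<int, object>> boxing = i => i;
Expression<Func<string, int>> extension = s => s.Count();

// nameof is evaluated by the compiler and stored as a constant string.
Console.WriteLine(nameOfExpr.Body.NodeType);                      // Constant
Console.WriteLine(((ConstantExpression)nameOfExpr.Body).Value);   // Console

// Returning an int where object is expected adds a Convert node.
Console.WriteLine(boxing.Body.NodeType);                          // Convert

// The extension method call is a static call: no instance, declared on Enumerable.
var call = (MethodCallExpression)extension.Body;
Console.WriteLine(call.Object == null);                             // True
Console.WriteLine(call.Method.DeclaringType == typeof(Enumerable)); // True
```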

Constructing Expression Tree Objects II: Using the Factory Methods

You can also construct expression trees using the Shared (static in C#) factory methods at System.Linq.Expressions.Expression. For example, to construct the expression tree objects for i.ToString, where i is an Integer in Visual Basic (or an int in C#), use code like the following:

' Imports System.Linq.Expressions.Expression
Dim prm As ParameterExpression = Parameter(GetType(Integer), "i")
' GetMethod with an empty type array selects the parameterless ToString overload
Dim expr As Expression = [Call](
  prm,
  GetType(Integer).GetMethod("ToString", Type.EmptyTypes)
)

In C#, the code should look like this:

// using static System.Linq.Expressions.Expression;
ParameterExpression prm = Parameter(typeof(int), "i");
// GetMethod with an empty type array selects the parameterless ToString overload
Expression expr = Call(
  prm,
  typeof(int).GetMethod("ToString", Type.EmptyTypes)
);

While building expression trees in this manner usually requires a fair amount of reflection and multiple calls to the factory methods, you have greater flexibility to tailor the precise expression tree needed. With the compiler syntax, you would have to write out all possible variations of the expression tree that your program might ever require.
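
As a rough illustration of that flexibility, the following C# sketch wraps the i.ToString call in a lambda and compiles it, and then builds a property getter whose member name is only known at run time. The memberName variable is a stand-in for a value that might come from configuration or user input.

```csharp
using System;
using System.Linq.Expressions;

// i => i.ToString(), built entirely with factory methods
var prm = Expression.Parameter(typeof(int), "i");
var call = Expression.Call(prm, typeof(int).GetMethod("ToString", Type.EmptyTypes));
var toStringLambda = Expression.Lambda<Func<int, string>>(call, prm);
Console.WriteLine(toStringLambda.Compile()(42));   // 42

// s => s.<memberName>, where the member is chosen at run time
string memberName = "Length";
var s = Expression.Parameter(typeof(string), "s");
var getter = Expression.Lambda<Func<string, int>>(Expression.Property(s, memberName), s);
Console.WriteLine(getter.Compile()("abcd"));       // 4
```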

In addition, the factory method API allows you to build some kinds of expressions that aren’t currently supported in compiler-generated expressions. A few examples:

Statements such as a System.Void-returning ConditionalExpression, which corresponds to If..Then and If..Then..Else..End If (in C# known as if (...) { ...} else { ... }); or a TryCatchExpression representing a Try..Catch or a try { ... } catch (...) { ... } block.

Assignments Dim x As Integer: x = 17.

Blocks for grouping multiple statements together.

For example, consider the following Visual Basic code:

Dim msg As String = "Hello"
If DateTime.Now.Hour > 18 Then msg = "Good night"
Console.WriteLine(msg)

or the following equivalent C# code:

string msg = "Hello";
if (DateTime.Now.Hour > 18) {
  msg = "Good night";
}
Console.WriteLine(msg);

You could construct a corresponding expression tree in Visual Basic or in C# using the factory methods, as shown in Figure 5.

Figure 5 Blocks, Assignments and Statements in Expression Trees

' Imports System.Linq
' Imports System.Linq.Expressions.Expression
Dim msg = Parameter(GetType(String), "msg")
' Passing {msg} as the first argument declares msg as a variable scoped to the block
Dim body = Block(
  {msg},
  Assign(msg, Constant("Hello")),
  IfThen(
    GreaterThan(
      MakeMemberAccess(
        MakeMemberAccess(
          Nothing,
          GetType(DateTime).GetMember("Now").Single
        ),
        GetType(DateTime).GetMember("Hour").Single
      ),
      Constant(18)
    ),
    Assign(msg, Constant("Good night"))
  ),
  [Call](
    GetType(Console).GetMethod("WriteLine", { GetType(String) }),
    msg
  )
)

// using System.Linq;
// using static System.Linq.Expressions.Expression;
var msg = Parameter(typeof(string), "msg");
// Passing new[] { msg } as the first argument declares msg as a variable scoped to the block
var body = Block(
  new[] { msg },
  Assign(msg, Constant("Hello")),
  IfThen(
    GreaterThan(
      MakeMemberAccess(
        MakeMemberAccess(
          null,
          typeof(DateTime).GetMember("Now").Single()
        ),
        typeof(DateTime).GetMember("Hour").Single()
      ),
      Constant(18)
    ),
    Assign(msg, Constant("Good night"))
  ),
  Call(
    typeof(Console).GetMethod("WriteLine", new[] { typeof(string) }),
    msg
  )
);

Using Expression Trees I: Mapping Code Constructs to External APIs

Expression trees were originally designed to enable mapping of Visual Basic or C# syntax into a different API. The classic use case is generating a SQL statement, like this:

SELECT * FROM Persons WHERE Persons.LastName LIKE N'D%'

This statement could be derived from code like the following snippet, which uses member access (. operator), method calls inside the expression, and the Queryable.Where method. Here’s the code in Visual Basic:

Dim personSource As IQueryable(Of Person) = ...
Dim qry = personSource.Where(Function(x) x.LastName.StartsWith("D"))

and here’s the code in C#:

IQueryable<Person> personSource = ...
var qry = personSource.Where(x => x.LastName.StartsWith("D"));

How does this work? There are two methods that could be used with the lambda expression: Enumerable.Where, which takes a delegate, and Queryable.Where, which takes an expression tree. Because personSource is typed as IQueryable(Of Person) (IQueryable<Person> in C#), overload resolution prefers Queryable.Where, whose first parameter matches that type exactly, and so the lambda is compiled as an expression lambda. The compiler then replaces the lambda syntax with calls to the appropriate factory methods.

At runtime, the Queryable.Where method wraps the passed-in expression tree with a Call node whose Method property references Queryable.Where itself, and which takes two parameters—personSource, and the expression tree from the lambda syntax (Figure 6). (The Quote node indicates that the inner expression tree is being passed to Queryable.Where as an expression tree and not as a delegate.)

Figure 6 Visualization of Final Expression Tree
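
The shape of this final tree can be inspected without a database. The C# sketch below uses an in-memory array wrapped with AsQueryable in place of a real LINQ provider, an assumption made only to keep the example self-contained:

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

IQueryable<string> source = new[] { "Dollard", "Spitz" }.AsQueryable();
IQueryable<string> qry = source.Where(x => x.StartsWith("D"));

// Queryable.Where did not run the filter; it recorded a call to itself.
var call = (MethodCallExpression)qry.Expression;
Console.WriteLine(call.Method.DeclaringType == typeof(Queryable)); // True
Console.WriteLine(call.Method.Name);                               // Where
Console.WriteLine(call.Arguments.Count);                           // 2

// The second argument is the lambda's expression tree, wrapped in a Quote node.
Console.WriteLine(call.Arguments[1].NodeType);                     // Quote
```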

A LINQ database provider (such as Entity Framework, LINQ2SQL or NHibernate) can take such an expression tree and map the different parts into the SQL statement at the beginning of this section. Here’s how:

  • ExpressionType.Call to Queryable.Where is parsed as a SQL WHERE clause
  • ExpressionType.MemberAccess of LastName on an instance of Person becomes reading the LastName field in the Persons table—Persons.LastName
  • ExpressionType.Call to the StartsWith method with a Constant argument is translated into the SQL LIKE operator, against a pattern that matches the beginning of a constant string:
LIKE N'D%'
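
The mapping in this list can be sketched as a toy translator in C#. This is not how any real provider is implemented: the Translate function, the Person class and the "type name plus s" table naming rule are all assumptions made for illustration, and the sketch handles only the node shapes named above and does no escaping.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical toy translator: supports only the node shapes from the list above.
string Translate(Expression e)
{
    switch (e)
    {
        case LambdaExpression lambda:
            return Translate(lambda.Body);
        // Call to StartsWith with a Constant string argument becomes LIKE N'prefix%'
        case MethodCallExpression call
            when call.Method.Name == "StartsWith"
              && call.Object != null
              && call.Arguments.Count == 1
              && call.Arguments[0] is ConstantExpression { Value: string prefix }:
            return $"{Translate(call.Object)} LIKE N'{prefix}%'";
        // MemberAccess on the lambda parameter becomes Table.Column
        // (assumed convention: table name = type name + "s")
        case MemberExpression { Expression: ParameterExpression p } member:
            return $"{p.Type.Name}s.{member.Member.Name}";
        default:
            throw new NotSupportedException(e.NodeType.ToString());
    }
}

Expression<Func<Person, bool>> filter = x => x.LastName.StartsWith("D");
string where = Translate(filter);
string sql = $"SELECT * FROM Persons WHERE {where}";
Console.WriteLine(sql);  // SELECT * FROM Persons WHERE Persons.LastName LIKE N'D%'

// Hypothetical entity type, standing in for the article's Person
class Person
{
    public string LastName { get; set; }
}
```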

It’s thus possible to control external APIs using code constructs and conventions, with all the benefits of the compiler—type safety and correct syntax on the various parts of the expression tree, and IDE autocompletion. Some other examples:

Creating Web Requests Using the Simple.OData.Client library (bit.ly/2YyDrsx), you can create OData requests by passing in expression trees to the various methods. The library will output the correct request (Figure 7 shows the code for both Visual Basic and C#).

Figure 7 Requests Using the Simple.OData.Client Library and Expression Trees

Dim client = New ODataClient("https://services.odata.org/v4/TripPinServiceRW/")
Dim people = Await client.For(Of People)
  .Filter(Function(x) x.Trips.Any(Function(y) y.Budget > 3000))
  .Top(2)
  .Select(Function(x) New With { x.FirstName, x.LastName})
  .FindEntriesAsync

var client = new ODataClient("https://services.odata.org/v4/TripPinServiceRW/");
var people = await client.For<People>()
  .Filter(x => x.Trips.Any(y => y.Budget > 3000))
  .Top(2)
  .Select(x => new {x.FirstName, x.LastName})
  .FindEntriesAsync();

Outgoing request:

 

> https://services.odata.org/v4/TripPinServiceRW/People?$top=2 &
  $select=FirstName, LastName & $filter=Trips/any(d:d/Budget gt 3000)

Reflection by Example Instead of using reflection to get hold of a MethodInfo, you can write a method call within an expression, and pass that expression to a function that extracts the specific overload used in the call. Here’s the reflection code first in Visual Basic:

Dim writeLine as MethodInfo = GetType(Console).GetMethod(
  "WriteLine", { GetType(String) })

And the same code in C#:

MethodInfo writeLine = typeof(Console).GetMethod(
  "WriteLine", new [] { typeof(string) });

Now here’s how this function and its usage could look in Visual Basic:

Function GetMethod(expr As Expression(Of Action)) As MethodInfo
  Return CType(expr.Body, MethodCallExpression).Method
End Function
Dim mi As MethodInfo = GetMethod(Sub() Console.WriteLine(""))

and here’s how it could look in C#:

public static MethodInfo GetMethod(Expression<Action> expr) =>
  (expr.Body as MethodCallExpression).Method;
MethodInfo mi = GetMethod(() => Console.WriteLine(""));

This approach also simplifies getting at extension methods and constructing closed generic methods, as shown here in Visual Basic:

Dim wherePerson As MethodInfo = GetMethod(
  Sub() CType(Nothing, IQueryable(Of Person)).Where(Function(x) True))
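
A C# counterpart might look like the following sketch. To stay self-contained it uses IQueryable<string> rather than a Person type, and the GetMethod helper is restated as a local function; the lambda passed to it is never executed, so the null receiver is harmless.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Extracts the MethodInfo from a lambda whose body is a single method call.
MethodInfo GetMethod(Expression<Action> expr) =>
    ((MethodCallExpression)expr.Body).Method;

// The compiler resolves the extension method and closes the generic over string.
MethodInfo whereString = GetMethod(() => ((IQueryable<string>)null).Where(x => true));

Console.WriteLine(whereString.DeclaringType == typeof(Queryable));          // True
Console.WriteLine(whereString.IsGenericMethodDefinition);                   // False
Console.WriteLine(whereString.GetGenericArguments()[0] == typeof(string));  // True
```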

It also provides a compile-time guarantee that the method and overload exist, as shown here in C#:

 

// Won’t compile, because GetMethod expects Expression<Action>, not Expression<Func<..>>
MethodInfo getMethod = GetMethod(() => GetMethod(() => null));

Grid Column Configuration If you have some kind of grid UI, and you want to allow defining the columns declaratively, you could have a method that builds the columns based on subexpressions in an array literal (using Visual Basic):

grid.SetColumns(Function(x As Person) {x.LastName, x.FirstName, x.DateOfBirth})

Or from the subexpressions in an anonymous type (in C#):

grid.SetColumns((Person x) => new {x.LastName, x.FirstName, DOB = x.DateOfBirth});
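
One way such a method could read the column definitions is sketched below in C#. The ColumnHeaders function is hypothetical (it is not part of any grid library), and DateTime stands in for Person so the example needs no extra types. It reads the anonymous type's member names from the NewExpression at the root of the lambda body.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical helper: returns one header per member of the anonymous type.
string[] ColumnHeaders<T, TResult>(Expression<Func<T, TResult>> columns)
{
    // For an anonymous type, the body is a constructor call (a New node),
    // and Members lists the property that each argument initializes.
    var body = (NewExpression)columns.Body;
    return body.Members.Select(m => m.Name).ToArray();
}

var headers = ColumnHeaders((DateTime x) => new { x.Year, x.Month, Weekday = x.DayOfWeek });
Console.WriteLine(string.Join(", ", headers));  // Year, Month, Weekday
```

Each entry in the NewExpression's Arguments collection is itself an expression tree (here a MemberAccess node), which a grid could compile or translate to fetch the cell values.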

Using Expression Trees II: Compiling Invocable Code at Runtime

The second major use case for expression trees is to generate executable code at runtime. Remember the body variable from before? You can wrap it in a LambdaExpression, compile it into a delegate, and invoke the delegate, all at runtime. The code would look like this in Visual Basic:

Dim lambdaExpression = Lambda(Of Action)(body)
Dim compiled = lambdaExpression.Compile
compiled.Invoke
' prints either "Hello" or "Good night"

and like this in C#:

var lambdaExpression = Lambda<Action>(body);
var compiled = lambdaExpression.Compile();
compiled.Invoke();
// Prints either "Hello" or "Good night"

Compiling an expression tree into executable code is very useful for implementing other languages on top of the CLR, as it’s much easier to work with expression trees than with direct IL manipulation. But if you’re programming in C# or Visual Basic, and you know the program’s logic at compile time, why not embed that logic in your existing method or assembly at compile time, instead of compiling at runtime?

However, runtime compilation really shines when you don’t know the best path or the right algorithm at design time. Using expression trees and runtime compilation, it’s relatively easy to iteratively rewrite and refine your program’s logic at runtime in response to actual conditions or data from the field.

Self-Rewriting Code: Call-Site Caching in Dynamically Typed Languages As an example, consider call-site caching in the Dynamic Language Runtime (DLR), which enables dynamic typing in CLR-targeting language implementations. It uses expression trees to provide a powerful optimization, by iteratively rewriting the delegate assigned to a specific call site when needed.

Both C# and Visual Basic are (for the most part) statically typed languages—for every expression in the language we can resolve a fixed type that doesn’t change over the lifetime of the program. In other words, if the variables x and y have been declared as an Integer (or an int in C#) and the program contains a line of code x + y, the resolution of the value of that expression will always use the “add” instruction for two Integers.

However, dynamically typed languages have no such guarantee. Generally, x and y have no inherent type, so the evaluation of x + y must take into account that x and y could be of any type, say String, and resolving x + y in that case would mean using String.Concat. On the other hand, if the first time the program hits the expression x and y are Integers, it’s highly likely that successive hits will also have the same types for x and y. The DLR takes advantage of this fact with call-site caching, which uses expression trees to rewrite the delegate of the call site each time a new type is encountered.

Each call site gets assigned a CallSite(Of T) (or in C# a CallSite<T> ) instance with a Target property that points to a compiled delegate. Each site also gets a set of tests, along with the actions that should be taken when each test succeeds. Initially, the Target delegate only has code to update itself, like so:

' Visual Basic code representation of Target delegate
Return site.Update(site, x, y)

At the first iteration, the Update method will retrieve an applicable test and action from the language implementation (for example, “if both arguments are Integers, use the ‘add’ instruction”). It will then generate an expression tree that performs the action only if the test succeeds. A code-equivalent of the resulting expression tree might look something like this:

' Visual Basic code representation of expression tree
If TypeOf x Is Integer AndAlso TypeOf y Is Integer Then Return CInt(x) + CInt(y)
Return site.Update(site, x, y)

The expression tree will then be compiled into a new delegate, and stored at the Target property, while the test and action will be stored in the call site object.

In later iterations, the call site will use the new delegate to resolve x + y. Within the new delegate, if the test passes, the resolved CLR operation will be used. Only if the tests fail (in this case, if either x or y isn’t an Integer) will the Update method have to turn again to the language implementation. But when the Update method is called it will add the new test and action, and recompile the Target delegate to account for them. At that point, the Target delegate will contain tests for all the previously encountered type pairs, and the value resolution strategy for each type, as shown in the following code:

If TypeOf x Is Integer AndAlso TypeOf y Is Integer Then Return CInt(x) + CInt(y)
If TypeOf x Is String Then Return String.Concat(x, y)
Return site.Update(site, x, y)

This runtime code rewriting in response to facts on the ground would be very difficult—if not impossible—without the ability to compile code from an expression tree at runtime.
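
The idea can be imitated in a few dozen lines of C#. The sketch below is not the DLR's CallSite implementation: target, seenTypes, Rebuild and Update are names I made up, and the rules are simplified to "both operands have the same type." It does show the core loop, though: the delegate starts out only knowing how to update itself, and is recompiled from a new expression tree each time an unseen type arrives.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

var seenTypes = new List<Type>();            // one rule per operand type seen so far
int recompilations = 0;
Func<object, object, object> target = null;  // stands in for the call site's Target

// Action for a rule: how to combine two operands of the given type.
Expression BuildAction(Expression x, Expression y, Type type) =>
    type == typeof(string)
        ? Expression.Call(
            typeof(string).GetMethod("Concat", new[] { typeof(string), typeof(string) }), x, y)
        : Expression.Add(x, y);

// Builds: if (test1) action1 else if (test2) action2 ... else Update(x, y)
Func<object, object, object> Rebuild()
{
    var x = Expression.Parameter(typeof(object), "x");
    var y = Expression.Parameter(typeof(object), "y");
    Func<object, object, object> update = Update;
    Expression body = Expression.Invoke(Expression.Constant(update), x, y);
    for (int i = seenTypes.Count - 1; i >= 0; i--)
    {
        Type type = seenTypes[i];
        var test = Expression.AndAlso(
            Expression.TypeEqual(x, type), Expression.TypeEqual(y, type));
        var action = BuildAction(
            Expression.Convert(x, type), Expression.Convert(y, type), type);
        body = Expression.Condition(test, Expression.Convert(action, typeof(object)), body);
    }
    return Expression.Lambda<Func<object, object, object>>(body, x, y).Compile();
}

// Called when no existing test matches: add a rule, recompile, retry.
object Update(object x, object y)
{
    if (x.GetType() != y.GetType())
        throw new NotSupportedException("This sketch handles same-typed operands only.");
    seenTypes.Add(x.GetType());
    target = Rebuild();
    recompilations++;
    return target(x, y);
}

target = Update;  // initially, the target can only update itself

Console.WriteLine(target(1, 2));      // 3
Console.WriteLine(target(3, 4));      // 7  (served by the cached Integer rule)
Console.WriteLine(target("a", "b"));  // ab (a new rule is added and compiled)
Console.WriteLine(recompilations);    // 2
```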

Dynamic Compilation with Expression Trees vs. with Roslyn You can also dynamically compile code from simple strings rather than from an expression tree with relative ease using Roslyn. In fact, this approach is actually preferred if you’re starting off with Visual Basic or C# syntax, or if it’s important to preserve the Visual Basic or C# syntax you’ve generated. As noted earlier, Roslyn syntax trees model syntax, whereas expression trees only represent code operations without regard for syntax.

Also, if you try to construct an invalid expression tree, you’ll get back only a single exception. When parsing and compiling strings to runnable code using Roslyn, you can get multiple pieces of diagnostic information on different parts of the compilation, just as you would when writing C# or Visual Basic in Visual Studio.

On the other hand, Roslyn is a large and complicated dependency to add to your project. You may already have a set of code operations that came from a source other than Visual Basic or C# source code; rewriting into the Roslyn semantic model may be unnecessary. Also, keep in mind that Roslyn requires multithreading, and cannot be used if new threads aren’t allowed (such as within a Visual Studio debugging visualizer).

Rewriting Expression Trees: Implementing Visual Basic’s Like Operator

I mentioned that expression trees are immutable; but you can create a new expression tree that reuses parts of the original. Let’s imagine you want to query a database for people whose first name contains an “e” and a subsequent “i.” Visual Basic has a Like operator, which returns True if a string matches against a pattern, as shown here:

Dim personSource As IQueryable(Of Person) = ...
Dim qry = personSource.Where(Function(x) x.FirstName Like "*e*i*")
For Each person In qry
  Console.WriteLine($"LastName: {person.LastName}, FirstName: {person.FirstName}")
Next

But if you try this on an Entity Framework 6 DbContext, you’ll get an exception with the following message:

'LINQ to Entities does not recognize the method 'Boolean LikeString(System.String, System.String, Microsoft.VisualBasic.CompareMethod)' method, and this method cannot be translated into a store expression.'

The Visual Basic Like operator resolves to the LikeOperator.LikeString method (in the Microsoft.VisualBasic.CompilerServices namespace), which EF6 can’t translate into a SQL LIKE expression. Thus, the error.

Now EF6 does support similar functionality via the DbFunctions.Like method, which EF6 can map to the corresponding LIKE. You need to replace the original expression tree, which uses the Visual Basic Like, with one that uses DbFunctions.Like, but without changing any other parts of the tree. The common idiom for doing this is to inherit from the .NET ExpressionVisitor class, and override the Visit* base methods of interest. In my case, because I want to replace a method call, I’ll override VisitMethodCall, as shown in Figure 8.

Figure 8 ExpressionVisitor Replacing Visual Basic Like with DbFunctions.Like

Class LikeVisitor
  Inherits ExpressionVisitor
  Shared LikeString As MethodInfo =
    GetType(CompilerServices.LikeOperator).GetMethod("LikeString")
  Shared DbFunctionsLike As MethodInfo = GetType(DbFunctions).GetMethod(
    "Like", {GetType(String), GetType(String)})
  Protected Overrides Function VisitMethodCall(
    node As MethodCallExpression) As Expression
    ' Is this node using the LikeString method? If not, leave it alone.
    If node.Method <> LikeString Then Return MyBase.VisitMethodCall(node)
    Dim patternExpression = node.Arguments(1)
    If patternExpression.NodeType = ExpressionType.Constant Then
      Dim oldPattern =
        CType(CType(patternExpression, ConstantExpression).Value, String)
      ' partial mapping of Visual Basic's Like syntax to SQL LIKE syntax
      Dim newPattern = oldPattern.Replace("*", "%")
      patternExpression = Constant(newPattern)
    End If
    Return [Call](DbFunctionsLike,
      node.Arguments(0),
      patternExpression
    )
  End Function
End Class

The Like operator’s pattern syntax is different from that of SQL LIKE, so I’ll replace the special characters used in the Visual Basic Like with the corresponding ones used by SQL LIKE. (This mapping is incomplete—it doesn’t map all the pattern syntax of the Visual Basic Like; and it doesn’t escape SQL LIKE special characters, or unescape Visual Basic Like special characters. A full implementation can be found on GitHub at bit.ly/2yku7tx, together with the C# version.)

Note that I can only replace these characters if the pattern is part of the expression tree, and that the expression node is a Constant. If the pattern is another expression type—such as the result of a method call, or the result of a BinaryExpression that concatenates two other strings—then the value of the pattern doesn’t exist until the expression has been evaluated.

I can now replace the expression with the rewritten one, and use the new expression in my query, like so:

Dim expr As Expression(Of Func(Of Person, Boolean)) =
  Function(x) x.FirstName Like "*e*i*"
Dim visitor As New LikeVisitor
expr = CType(visitor.Visit(expr), Expression(Of Func(Of Person, Boolean)))
Dim personSource As IQueryable(Of Person) = ...
Dim qry = personSource.Where(expr)
For Each person In qry
  Console.WriteLine($"LastName: {person.LastName}, FirstName: {person.FirstName}")
Next

Ideally, this sort of transformation would be done within the LINQ to Entities provider, where the entire expression tree—which might include other expression trees and Queryable method calls—could be rewritten in one go, instead of having to rewrite each expression before passing it into the Queryable methods. But the transformation at its core would be the same—some class or function that visits all the nodes and plugs in a replacement node where needed.
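
For readers who want to try the visitor idiom without Entity Framework, here is a self-contained C# sketch of the same pattern. It is not the C# version of Figure 8: the StartsWithToContainsVisitor class is a made-up example that swaps String.StartsWith for String.Contains, chosen only because both methods ship with .NET.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

Expression<Func<string, bool>> original = s => s.Trim().StartsWith("e");

// Visit returns a new tree; the original is left untouched.
var rewritten = (Expression<Func<string, bool>>)
    new StartsWithToContainsVisitor().Visit(original);

Console.WriteLine(original.Compile()("zev"));   // False
Console.WriteLine(rewritten.Compile()("zev"));  // True

// Hypothetical visitor: replaces calls to StartsWith(string) with Contains(string).
class StartsWithToContainsVisitor : ExpressionVisitor
{
    static readonly MethodInfo StartsWith =
        typeof(string).GetMethod("StartsWith", new[] { typeof(string) });
    static readonly MethodInfo Contains =
        typeof(string).GetMethod("Contains", new[] { typeof(string) });

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        // Is this node using StartsWith? If not, leave it alone.
        if (node.Method != StartsWith) return base.VisitMethodCall(node);
        // Reuse the (visited) instance and argument, but call Contains instead.
        return Expression.Call(Visit(node.Object), Contains, Visit(node.Arguments[0]));
    }
}
```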

Wrapping Up

Expression trees model various code operations and can be used to expose APIs without requiring developers to learn a new language or vocabulary. The developer can leverage Visual Basic or C# to drive these APIs, while the compiler provides type checking and syntactic correctness, and the IDE supplies IntelliSense. A modified copy of an expression tree can be created, with added, removed, or replaced nodes. Expression trees can also be used for dynamically compiling code at runtime, and even for self-rewriting code, although Roslyn is the preferred path for dynamic compilation when you’re starting from source code.

Code samples for this article can be found at bit.ly/2yku7tx.

Zev Spitz has written a library for rendering expression trees as strings in multiple formats—C#, Visual Basic and factory method calls; and a Visual Studio debugging visualizer for expression trees.

Thanks to the following Microsoft technical expert for reviewing this article: Kathleen Dollard