Only a few hours left (part 5)

The previous post ended up showing that while visitors are available in C#, they lack the usability that built-in language constructs could give them, and that usability is what would make them an ideal choice for solving our problem.  In Java, we saw that anonymous inner classes provide such a convenient construct, and in this post I want to show how several of the future language enhancements we're bringing to C# will also make this convenient usage possible.  Now, from what you've seen from PDC so far, the new Linq enhancements are targeted at object, data, and XML querying and manipulation.  However, as this post will show, the core functionality we've added to C# can be used for far more than just that.

Specifically, we're going to make use of two of the new core language features: Lambda Expressions and Object Initializers.  Let's take a look at how we could use those two features to make visitors far more enjoyable.  To begin with, we're going to define our visitor interface and DefaultTokenVisitor slightly differently than how we did in Java.  In C# they're now going to look like this:

    public delegate void Action<A>(A a);

    public interface ITokenVisitor
    {
        Action<Token>                  VisitToken                  { get; set; }
        Action<KeywordToken>           VisitKeywordToken           { get; set; }
        Action<InterfaceToken>         VisitInterfaceToken         { get; set; }
        Action<ClassToken>             VisitClassToken             { get; set; }
        Action<IdentifierToken>        VisitIdentifierToken        { get; set; }
        Action<ContextualKeywordToken> VisitContextualKeywordToken { get; set; }
        Action<AccessibilityToken>     VisitAccessibilityToken     { get; set; }
        Action<NoisyToken>             VisitNoisyToken             { get; set; }
        Action<CommentToken>           VisitCommentToken           { get; set; }
        Action<WhitespaceToken>        VisitWhitespaceToken        { get; set; }
    }

    public class DefaultTokenVisitor : ITokenVisitor
    {
        public virtual Action<Token>                  Default                     { get { ... } set { ... } }
        public virtual Action<Token>                  VisitToken                  { get { ... } set { ... } }
        public virtual Action<KeywordToken>           VisitKeywordToken           { get { ... } set { ... } }
        public virtual Action<InterfaceToken>         VisitInterfaceToken         { get { ... } set { ... } }
        public virtual Action<ClassToken>             VisitClassToken             { get { ... } set { ... } }
        public virtual Action<IdentifierToken>        VisitIdentifierToken        { get { ... } set { ... } }
        public virtual Action<ContextualKeywordToken> VisitContextualKeywordToken { get { ... } set { ... } }
        public virtual Action<AccessibilityToken>     VisitAccessibilityToken     { get { ... } set { ... } }
        public virtual Action<NoisyToken>             VisitNoisyToken             { get { ... } set { ... } }
        public virtual Action<CommentToken>           VisitCommentToken           { get { ... } set { ... } }
        public virtual Action<WhitespaceToken>        VisitWhitespaceToken        { get { ... } set { ... } }
    }

So far, this code just makes use of C# 2.0 features.  It might take a couple of minutes to wrap your mind around it, but what we've ended up doing here is creating a type whose methods can be overridden trivially at runtime.  We do this by simulating methods with properties that return delegates.  For example, you can provide a method implementation by writing ".VisitToken = ..." and that method can then be invoked just as you would expect: "VisitToken()".  While they are actually properties and delegates, they appear (for all intents and purposes) as methods that you can change at runtime.  In fact, from a syntactic perspective, the calling code is indistinguishable from an ordinary method call, and so we do not even need to change any of the visitor accept methods on our token classes.
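To make the "properties that return delegates" idea concrete, here is a minimal sketch that sticks to C# 2.0 syntax.  Everything in it is an assumption made for illustration: the cut-down SketchVisitor type, the two-token hierarchy, and in particular the choice to fall back to Default when a property has not been assigned.  The property bodies of DefaultTokenVisitor are elided above, so this is only one plausible way to fill them in.  It also uses the framework's System.Action<T>, which has the same shape as the Action<A> delegate declared earlier.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, cut-down token hierarchy (only two concrete tokens).
public abstract class Token
{
    public abstract void AcceptVisitor(SketchVisitor visitor);
}

public class ClassToken : Token
{
    // Reads like a method call, but actually fetches the delegate
    // from the property and then invokes it.
    public override void AcceptVisitor(SketchVisitor visitor) { visitor.VisitClassToken(this); }
}

public class InterfaceToken : Token
{
    public override void AcceptVisitor(SketchVisitor visitor) { visitor.VisitInterfaceToken(this); }
}

// Hypothetical stand-in for DefaultTokenVisitor, reduced to two cases.
public class SketchVisitor
{
    private Action<Token> defaultAction = delegate (Token t) { };
    private Action<ClassToken> visitClassToken;
    private Action<InterfaceToken> visitInterfaceToken;

    public virtual Action<Token> Default
    {
        get { return defaultAction; }
        set { defaultAction = value; }
    }

    public virtual Action<ClassToken> VisitClassToken
    {
        get
        {
            // Assumption: an unassigned case forwards to Default.
            if (visitClassToken != null) return visitClassToken;
            return delegate (ClassToken t) { Default(t); };
        }
        set { visitClassToken = value; }
    }

    public virtual Action<InterfaceToken> VisitInterfaceToken
    {
        get
        {
            if (visitInterfaceToken != null) return visitInterfaceToken;
            return delegate (InterfaceToken t) { Default(t); };
        }
        set { visitInterfaceToken = value; }
    }
}

public static class SketchDemo
{
    // Records which handler ran for each token.
    public static List<string> Run()
    {
        List<string> log = new List<string>();
        SketchVisitor visitor = new SketchVisitor();
        visitor.Default = delegate (Token t) { log.Add("default"); };
        visitor.VisitClassToken = delegate (ClassToken t) { log.Add("class"); };

        new ClassToken().AcceptVisitor(visitor);     // uses the assigned delegate
        new InterfaceToken().AcceptVisitor(visitor); // nothing assigned, so Default runs
        return log;
    }
}
```

Calling SketchDemo.Run() should yield the entries "class" and then "default", since only the class case was replaced at runtime.  Note that the AcceptVisitor bodies look exactly as they would if the visitor exposed real methods.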

So what can we do with this?  Well, let's see what we can turn our parser code into thanks to the above classes:

        void parseType()
        {
            parseModifiers();

            CurrentToken.AcceptVisitor(new DefaultTokenVisitor {
                VisitClassToken     = token => parseClass(),
                VisitInterfaceToken = token => parseInterface(),
                /* include other cases */
                Default             = token => { /* handle error */ }
            });

            //Parse rest of type
        }

The code behaves how it reads.  We create a simple visitor with most of the logic built in.  But at construction time we tell it how to behave when it sees a "class" or "interface" token.  And in those cases, we are able to specify the behavior succinctly through a lambda expression.

The "Object Initializer" feature is what allows us to instantiate an object and assign into its fields and properties (much like attribute constructors), in an expression form rather than a statement form.  The lambda expressions then allow us to easily create closures of code that we wish to execute at a later point in time.  Both features are simply lightweight syntactic constructs on top of what's already available in C# 2.0, but as you can see from the following C# 2.0 version of the same code, the difference they make is enormous:

        void parseType()
        {
            parseModifiers();

            // Typed as DefaultTokenVisitor because Default is not a member of ITokenVisitor.
            DefaultTokenVisitor visitor = new DefaultTokenVisitor();
            visitor.VisitClassToken = delegate (ClassToken token) {
                this.parseClass();
            };

            visitor.VisitInterfaceToken = delegate (InterfaceToken token) {
                this.parseInterface();
            };

            /* include other cases */
            
            visitor.Default = delegate (Token token) {
                /*handle error*/
            };

            CurrentToken.AcceptVisitor(visitor);

            //Parse rest of type
        }

(In fact, it's probably worse than the initial example of visitors in C#.)  Completely unintelligible IMO.

While the lambda-based code appears similar to Java's anonymous inner classes, the two actually share almost nothing in common.  In Java you would be creating an instance of an unnamed subclass of the DefaultTokenVisitor type, whereas in C# you are just instantiating the actual DefaultTokenVisitor class.  In Java the method declarations we used were overrides of the methods defined in the DefaultTokenVisitor class, whereas in C# we're just assigning lambda expressions into the properties which we're using to simulate methods.  In Java, if we were to use the "this" keyword, we'd be referring to the instance of the anonymous inner class the code was executing in, whereas in C# "this" still refers to the parser instance in scope.  By using these two constructs we end up with the ability to use visitors from C# easily, with a syntax that is even less heavyweight than the corresponding Java constructs.  There are many ways in which the code is simpler:

  1. The C# code doesn't need () on the instantiation of DefaultTokenVisitor.  Very minor.
  2. The C# code can eschew the "public void" on all its visitation operations.  Nice, and cleans things up well.
  3. The C# code allows the type of the parameter to be optional.  If you want it, you can keep it.  However, you can leave it out and still have things statically checked at compile time.  It helps with code duplication to not have to say: VisitClassToken(ClassToken..., and instead say: VisitClassToken(token...
  4. The C# code doesn't need the {}'s if the code is a trivial expression or statement.  Very minor.
  5. The C# code doesn't need to disambiguate the instance of the parser and the instance of the inner class like Java does.  So there's no need to write: Parser.this.parseInterface().  Instead you can just do the simple: this.parseInterface()

All in all, the C# version is about two-thirds the length and cuts out a lot of the cruft.  This allows you to intermingle your visitor logic with the rest of your code just as you'd like to be able to do.
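Points 3 and 5 in the list above are easy to check with a small sketch.  The MiniVisitor and MiniParser types below are hypothetical stand-ins, not the real parser: the sketch only shows that the lambda parameter type can be written or omitted, and that "this" inside the lambda refers to the enclosing parser instance rather than to the visitor.

```csharp
using System;

public class ClassToken { }

// Hypothetical one-case visitor, just enough to host a lambda.
public class MiniVisitor
{
    public Action<ClassToken> VisitClassToken { get; set; }
}

// Hypothetical parser that counts how many times ParseClass ran.
public class MiniParser
{
    public int ClassesParsed;

    private void ParseClass() { ClassesParsed++; }

    public MiniVisitor MakeVisitor(bool explicitType)
    {
        if (explicitType)
        {
            // Parameter type spelled out.
            return new MiniVisitor { VisitClassToken = (ClassToken token) => this.ParseClass() };
        }

        // Parameter type inferred from Action<ClassToken>; still statically checked.
        return new MiniVisitor { VisitClassToken = token => this.ParseClass() };
    }
}

public static class ThisDemo
{
    public static int Run()
    {
        MiniParser parser = new MiniParser();
        parser.MakeVisitor(true).VisitClassToken(new ClassToken());
        parser.MakeVisitor(false).VisitClassToken(new ClassToken());
        // Both lambdas called ParseClass on the same parser instance.
        return parser.ClassesParsed;
    }
}
```

ThisDemo.Run() should return 2: in both lambdas "this" is the MiniParser that created the visitor, so no Parser.this-style qualification is needed.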

So while Linq is a focused effort from many different teams at Microsoft to provide powerful integrated query support, you can see that the individual features that have been introduced in C# can be used in a lot of powerful ways beyond just that.  Hopefully you'll have your own good ideas about how to use these new language features in other interesting ways.  If so, let me know so we can share them with the rest of the development community out there!