A developer’s view of Workflow

A new programming language

Windows Workflow Foundation (WF) 4.0 offers many features to simplify business application development, deployment and management. In this post I’d like to explore workflow from a developer’s angle as a new programming language.

WF has a lot of “new” features compared to mainstream industry languages like C/C++/Java/C#/VB. As a developer, I personally appreciate the following ones the most:

1. Fully declarative programming model

2. Support for continuation

3. Built-in persistence functionality

4. Instance management in service host

Declarative programming

#1 is pretty easy to understand: in WF, code is data. Developers can use a visual editor to create a workflow definition by drag-n-drop. The resulting workflow definition is an AST for the workflow, which any tool can directly traverse and make sense of.
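To make the “code is data” idea a bit more concrete, here is a minimal C# sketch. It is not WF’s actual object model: the Node, Sequence, Step, and DefinitionTools types are hypothetical stand-ins invented for illustration. The point is only that a tool can walk a definition tree without ever executing it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for a workflow definition tree (not WF types).
public abstract class Node
{
    public string Name { get; }
    protected Node(string name) { Name = name; }
    public virtual IEnumerable<Node> Children => Array.Empty<Node>();
}

public sealed class Step : Node
{
    public Step(string name) : base(name) { }
}

public sealed class Sequence : Node
{
    private readonly List<Node> _children;

    public Sequence(string name, params Node[] children) : base(name)
    {
        _children = new List<Node>(children);
    }

    public override IEnumerable<Node> Children => _children;
}

public static class DefinitionTools
{
    // A "tool" that never runs the workflow; it only walks the definition.
    public static List<string> Outline(Node root)
    {
        var lines = new List<string>();
        Walk(root, 0, lines);
        return lines;
    }

    private static void Walk(Node node, int depth, List<string> lines)
    {
        // Indent each node by two spaces per nesting level.
        lines.Add(new string(' ', depth * 2) + node.Name);
        foreach (Node child in node.Children)
            Walk(child, depth + 1, lines);
    }
}

public static class Program
{
    public static void Main()
    {
        Node approval = new Sequence("DocumentApproval",
            new Step("ReceiveSubmission"),
            new Step("EmailApprover"),
            new Step("ReceiveDecision"),
            new Step("EmailSubmitter"));

        foreach (string line in DefinitionTools.Outline(approval))
            Console.WriteLine(line);
    }
}
```

The same traversal could just as well drive a validator, a documentation generator, or a visual designer, which is what a declarative definition buys you.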

I’d like to expand on the other three points using an example in the rest of this post.

Continuation and instance management

The old world

Now let’s assume we are building a program to track a document approval process without using WF. It accepts a document from a user, sends a notice to an approver, waits for the approver’s decision (approve or reject), then notifies the original submitter.

First let’s try a simple console version as proof of concept using C# style pseudo-code:

         string document = Console.ReadLine();

         Console.WriteLine("Approver, please respond");

         string result = Console.ReadLine();

         Console.WriteLine("Submitter, result for your document approval is {0}", result);

                    Program 1

 

This is pretty easy, but when we turn it into a real service which could accept different requests from different users at any time, the program needs to deal with several new issues:

1. State management. In the simple code above, the program has states, but developers don’t need to track them explicitly: when the first ReadLine returns, the program automatically resumes at the point where it sends the approver a message; when the second ReadLine returns, the program automatically resumes at the point where it sends the result back to the submitter. Here the program’s state is saved on the call stack, which is taken care of by the compiler and operating system. But in the real world, it could take the approver days to respond to a request, so it’s impractical for the program to block a thread waiting for the response. As a result, there is no call stack to save the program’s state; the developer has to track state explicitly.

2. Instance management. Since the system could handle multiple documents at the same time, the service needs to track multiple approval process instances and their states separately.

3. Scalability and availability. Because of both points above, we know this service is stateful. For any stateful service, it’s always a concern how the service can scale with increasing demand and stay available despite inevitable hardware/software failures. Generally speaking, state has to be maintained without any affinity to the service process to achieve those goals.

Here is an over-simplified pseudo-code snippet showing how this service might be implemented. We could certainly optimize it using overlapped IO and a thread pool, or use WCF to simplify certain things, but that’s not the focus here.

       while (true)

       {

           Socket requestSocket = socket.Accept();

           new Thread(HandleRequest).Start(requestSocket);

       }

       

        static void HandleRequest(object state)

        {

            Socket socket = state as Socket;

            // Receive bytes from the socket and deserialize them into a Request object

            Request request = …

            if (request.Type == RequestType.Submit)

            {

                // Read more bytes from the socket and deserialize them into a SubmitRequest object

               SubmitRequest submit = …

                HandleSubmitRequest(submit);

            }

            else if (request.Type == RequestType.Approval)

            {

                // Read more bytes from the socket and deserialize them into an ApprovalRequest object

                ApprovalRequest approval = …

                HandleApprovalRequest(approval);

            }

        }

 

        static void HandleSubmitRequest(SubmitRequest submit)

        {

            int documentId = CreateDocumentInstanceInDB(submit);

            SendEmailToApprover(submit.Approver);

            SetDocumentInstanceStateInDB(documentId, DocumentState.WaitForApproval);

        }

 

        static void HandleApprovalRequest(ApprovalRequest approval)

        {

            SubmitRequest submit = LookupDocumentInstanceInDB(approval.DocumentId);

            SendEmailToSubmitter(submit.Submitter, approval.Approved);

            SetDocumentInstanceStateInDB(approval.DocumentId, approval.Approved ? DocumentState.Approved : DocumentState.Rejected);

        }

                 Program 2

As we can see here, the very straightforward Program 1 suddenly becomes much more complicated once the developer needs to track states and instances explicitly, independent of the code’s control flow.

Following good OO practice, one natural evolution of the program is to abstract the document approval life cycle into a class of its own:

    class DocumentApproval

    {

        public int Id {…}

        public DocumentState State {…}

        public string Submitter {…}

        public string Approver {…}

 

        public void OnSubmitted()

        {

            SendEmailToApprover(this.Approver);

            this.State = DocumentState.WaitForApproval;

        }

 

        public void OnApprovalReceived(bool approved)

        {

            SendEmailToSubmitter(this.Submitter, approved);

            this.State = approved ? DocumentState.Approved : DocumentState.Rejected;

        }

    }

        Program 3

 

With this class, we could separate the real business logic from the socket, DB, and instance management code. With most of the code the same as in Program 2, the two request handling methods could be rewritten as:

        static void HandleSubmitRequest(SubmitRequest submit)

        {

            //create a DocumentApproval object from a SubmitRequest

            DocumentApproval doc = …;

            doc.OnSubmitted();

            PersistToDB(doc);

        }

 

        static void HandleApprovalRequest(ApprovalRequest approval)

        {

            DocumentApproval doc = ReadFromDB(approval.DocumentId);

            doc.OnApprovalReceived(approval.Approved);

            PersistToDB(doc);

        }

                 Program 4

Now the code is certainly cleaner. Business logic is separated from plumbing code. However, the DocumentApproval class now has its logic scattered across several methods, and it relies on the plumbing code to call the right method at the right time in the right order. This is much better than Program 2, but still quite different from the sequential coding experience we had in Program 1.

What if we could push the plumbing code down to a Runtime and ask developers to focus on business logic like the DocumentApproval class? Workflow Foundation does exactly that.

The new world

Now let’s look at how the same service could be implemented by WF4.

[Image: the document approval workflow]

Program 5

Assuming you have some basic understanding of Workflow Foundation, Activities, Workflow Services, and Messaging Activities, I’m not going to explain this workflow in detail. Basically it models a service that first receives a message for document submission, sends an email to the approver (using a custom Activity), receives a second message for document approval, then sends an email to the submitter (again with a custom Activity).

Interestingly enough, this workflow looks as simple as the code in Program 1. It describes the program flow for one single document approval process, and it describes the process as a continuous sequence, as if there were no breakpoints in the middle. However, its functionality is as powerful as Program 2: this service will not block any thread while it waits for new messages; it can track multiple document instances at the same time; all state can be persisted outside of the service processes so that a different machine can pick up a document approval process if the previous machine is down or overloaded. All of this is possible without the developer writing any plumbing code because of the following features of the WF runtime:

Continuation

In this workflow, when the Receive activity is executing, the workflow becomes “idle”, meaning it’s waiting for something to happen without doing anything. At this point, the workflow runtime can take this instance off the running thread and start to run other instances. When a message arrives for this document, the WF runtime has a way to find the corresponding instance and resume it from the idle point. All state of the instance is preserved, as if the “call stack” had never changed for the thread.
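To get a feel for what “resume from the idle point” means, C# iterators offer a rough in-memory analogy. This is only an analogy and not how the WF runtime is implemented; the ApprovalFlow class and its members are hypothetical names made up for this sketch. The flow is written as one sequential method, like Program 1, it pauses at each yield return, and a host resumes it later without any thread blocking in between.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; none of these types belong to WF.
public sealed class ApprovalFlow
{
    public string Input;   // set by the host just before resuming
    public readonly List<string> Log = new List<string>();
    private readonly IEnumerator<string> _steps;

    public ApprovalFlow()
    {
        // Creating the enumerator does not run any of the flow yet.
        _steps = Run().GetEnumerator();
    }

    // Written as one sequential method, like Program 1.
    private IEnumerable<string> Run()
    {
        yield return "document";   // idle: waiting for the submission
        string document = Input;
        Log.Add("Approver, please respond to " + document);

        yield return "decision";   // idle: waiting for the approver
        string result = Input;
        Log.Add("Submitter, result for " + document + " is " + result);
    }

    // Starts the flow and returns the name of what it is waiting for.
    public string Start()
    {
        return _steps.MoveNext() ? _steps.Current : null;
    }

    // Delivers a message and resumes from the last idle point.
    // Returns the next thing awaited, or null once the flow has completed.
    public string Resume(string message)
    {
        Input = message;
        return _steps.MoveNext() ? _steps.Current : null;
    }
}

public static class Program
{
    public static void Main()
    {
        var flow = new ApprovalFlow();
        Console.WriteLine("waiting for: " + flow.Start());
        // The calling thread is free to do other work between these calls.
        Console.WriteLine("waiting for: " + flow.Resume("expense-report.docx"));
        flow.Resume("approved");
        foreach (string line in flow.Log)
            Console.WriteLine(line);
    }
}
```

Note that the local variable document survives across the second pause, much like instance state survives an idle point. Unlike a WF instance, though, this iterator lives only in memory and cannot be persisted or moved to another machine.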

Continuation is not a new thing; a lot of languages, including Lisp, Scheme, Ruby, and Perl, have similar concepts. But WF’s continuation is much easier to use. It comes so naturally that developers often don’t even notice it: I often have to explain to WF beginners what continuation means in WF, which is evidence of that. Also, combined with the persistence feature, WF continuation is durable and easy to scale for long-running asynchronous services.

Built in instance management

When this workflow definition is hosted by WorkflowServiceHost, the service host automatically starts a new instance of the workflow for each document submission request. Whenever a workflow instance reaches an idle point, the instance data can be unloaded from memory using the persistence feature. When a document approval request comes in, the service host finds out which existing document process the request is for, reactivates that instance, resumes it, and feeds the new request to it. Another cool thing is that after persistence, a WF instance is not sticky to its hosting process, so it’s possible for another host on a different machine to load the same instance, which enables great scalability and availability.

WF3.5 already has a feature to activate a workflow instance with a WCF message. What’s new in WF4 is the ability to correlate a message with a WF instance based on message content, using custom queries on values like OrderId or CustomerId.
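Conceptually, the routing the service host performs looks something like the C# sketch below. It is a toy model, not the WorkflowServiceHost API: the Message, Instance, and ToyHost types, and the choice of DocumentId as the correlation value, are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; in WF4 the service host does this work for you.
public sealed class Message
{
    public string Kind;      // "Submit" or "Approval"
    public int DocumentId;   // the message content used for correlation
    public string Body;
}

public sealed class Instance
{
    public int DocumentId;
    public string Document;
    public string State = "WaitForApproval";
}

public sealed class ToyHost
{
    private readonly Dictionary<int, Instance> _instances =
        new Dictionary<int, Instance>();

    public int ActiveCount => _instances.Count;

    // Routes a message to a new or existing instance and returns its state.
    public string Dispatch(Message message)
    {
        if (message.Kind == "Submit")
        {
            // An activating message starts a new instance.
            var created = new Instance
            {
                DocumentId = message.DocumentId,
                Document = message.Body
            };
            _instances.Add(created.DocumentId, created);
            return created.State;
        }

        // Any other message is correlated to an existing instance by content.
        Instance instance;
        if (!_instances.TryGetValue(message.DocumentId, out instance))
            throw new InvalidOperationException(
                "No instance for document " + message.DocumentId);

        instance.State = message.Body == "approve" ? "Approved" : "Rejected";
        _instances.Remove(message.DocumentId);   // the process is complete
        return instance.State;
    }
}

public static class Program
{
    public static void Main()
    {
        var host = new ToyHost();
        host.Dispatch(new Message { Kind = "Submit", DocumentId = 1, Body = "a.docx" });
        host.Dispatch(new Message { Kind = "Submit", DocumentId = 2, Body = "b.docx" });
        // Approvals may arrive in any order; content decides the target.
        Console.WriteLine(host.Dispatch(
            new Message { Kind = "Approval", DocumentId = 2, Body = "reject" }));
        Console.WriteLine(host.Dispatch(
            new Message { Kind = "Approval", DocumentId = 1, Body = "approve" }));
    }
}
```

The dictionary here keeps every instance in memory, which is exactly what the persistence feature lets a real host avoid.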

Built in persistence support

As we saw previously, to make continuation and instance management durable and scalable, all instance data has to be persisted in a durable format and detached from the host process. The WF runtime has a great feature set to support that functionality. Also, WF4’s programming model enforces separation between workflow metadata and instance data, which makes persistence much easier than in WF3.5.
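The idea of detaching instance data from the host process can be sketched in a few lines of C#. Everything here is hypothetical and much simpler than real WF persistence: InstanceState, SharedStore, and Host are invented names, the store is an in-memory dictionary standing in for a database, and the pipe-delimited format is just the simplest thing that works for the demo.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; these types are illustrative, not WF persistence APIs.
public sealed class InstanceState
{
    public int DocumentId;
    public string Submitter;
    public string Step;   // where the process is idle, e.g. "WaitForApproval"

    // Naive format: assumes Submitter and Step never contain '|'.
    public string Serialize()
    {
        return DocumentId + "|" + Submitter + "|" + Step;
    }

    public static InstanceState Deserialize(string data)
    {
        string[] parts = data.Split('|');
        return new InstanceState
        {
            DocumentId = int.Parse(parts[0]),
            Submitter = parts[1],
            Step = parts[2]
        };
    }
}

// Stands in for a database shared by all hosts.
public sealed class SharedStore
{
    private readonly Dictionary<int, string> _rows = new Dictionary<int, string>();

    public void Save(InstanceState state)
    {
        _rows[state.DocumentId] = state.Serialize();
    }

    public InstanceState Load(int documentId)
    {
        return InstanceState.Deserialize(_rows[documentId]);
    }
}

public sealed class Host
{
    private readonly SharedStore _store;

    public Host(SharedStore store) { _store = store; }

    public void Submit(int documentId, string submitter)
    {
        // Run up to the idle point, then persist; the host keeps nothing in memory.
        _store.Save(new InstanceState
        {
            DocumentId = documentId,
            Submitter = submitter,
            Step = "WaitForApproval"
        });
    }

    public string Approve(int documentId, bool approved)
    {
        InstanceState state = _store.Load(documentId);   // any host can load it
        state.Step = approved ? "Approved" : "Rejected";
        _store.Save(state);
        return state.Submitter + ": " + state.Step;
    }
}

public static class Program
{
    public static void Main()
    {
        var store = new SharedStore();
        var hostA = new Host(store);
        var hostB = new Host(store);   // imagine this runs on another machine

        hostA.Submit(42, "alice");
        Console.WriteLine(hostB.Approve(42, true));
    }
}
```

Because neither Host object holds instance data between calls, the approval can be handled by a different host than the one that took the submission, which is the property that makes scale-out and failover possible.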

Continuation of the blog

This post took a peek at why a developer might be interested in using Workflow Foundation as a new tool for long-running business applications. It demoed how to build a powerful, durable, and scalable service (which takes a substantial number of lines of code in C#) with a few drag-n-drops in WF4. It quickly touched on several important feature areas in WF4.0, like continuation, the new data model, persistence, WCF integration, and content-based correlation. Our team will continue to blog about each of those categories in detail. Please add our feed to your favorite blog reader and wait for updates. :)

Yun Jin
Development Lead
https://blogs.msdn.com/yunjin