Messaging Patterns in Service Oriented Architecture, Part 2
Cap Gemini Ernst & Young
Summary: Explores the contract patterns that illustrate the behavioral specifications required to maintain smooth communications between service provider and service consumer; also explores message construction patterns that describe creation of message content that travels across the messaging system. (9 printed pages)
In part one of this paper published in issue 2 of JOURNAL, we described how messaging patterns exist at different levels of abstraction in SOA. Specifically, Message Type Patterns were used to describe different varieties of messages in SOA, Message Channel Patterns explained messaging transport systems, and finally Routing Patterns explained mechanisms to route messages between the Service Provider and Service Consumer. In this second part of the paper, we will cover Contract Patterns that illustrate the behavioral specifications required to maintain smooth communications between Service Provider and Service Consumer, and Message Construction Patterns that describe creation of message content that travels across the messaging system.
Contracts and Information Hiding
An interface contract is a published agreement between a service provider and a service consumer. The contract specifies not only the arguments and return values that a service supplies, but also the service's pre-conditions and post-conditions. Parnas and Clements best describe the principles of information hiding:
"Our module structure is based on the decomposition criterion known as information hiding [IH]. According to this principle, system details that are likely to change independently should be the secrets of separate modules; the only assumptions that should appear in the interfaces between modules are those that are considered unlikely to change. Each data structure is used in only one module; one or more programs within the module may directly access it. Any other program that requires information stored in a module's data structures must obtain it by calling access programs belonging to that module." (Parnas and Clements 1984)
Applying this statement to SOA, a service should never expose its internal data structures. Doing so creates unnecessary dependencies (tight coupling) between the service provider and its consumers. Internal implementation details are exposed when a parameterized interface design is mapped to the service's implementation aspects rather than to its functional aspects.
How can behaviors be defined independent of implementations?
The concept of an interface contract was added to programming languages like C# and Java to describe a behavior both in syntax and semantics. Internal data semantics must be mapped into the external semantics of an independent contract. The contract depends only on the interface's problem domain, not on any implementation details.
The methods, method types, method parameter types, and field types prescribe the interface syntax. The comments, method names, and field names describe the semantics of the interface. An object can implement multiple interfaces.
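To make the idea concrete, here is a minimal sketch of an interface contract in Python, using an abstract base class in place of a C# or Java interface. All names (`InvoiceService`, `InvoiceSummary`, `InMemoryInvoiceService`) are hypothetical and chosen only for illustration. The signatures carry the syntax, the docstring carries the pre-conditions and post-conditions, and the internal row layout stays hidden behind the contract.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class InvoiceSummary:
    """External representation: problem-domain fields only."""
    invoice_id: str
    total_cents: int


class InvoiceService(ABC):
    """The contract: signatures give the syntax, docstrings give the semantics."""

    @abstractmethod
    def get_invoice(self, invoice_id: str) -> InvoiceSummary:
        """Pre-condition: invoice_id is a non-empty string.
        Post-condition: the returned summary carries the same invoice_id.
        Raises KeyError if no such invoice exists."""


class InMemoryInvoiceService(InvoiceService):
    """One possible implementation; consumers depend only on InvoiceService."""

    def __init__(self) -> None:
        # Internal structure (id, unit price in cents, quantity).
        # It can change freely because it never appears in the contract.
        self._rows = {"INV-1": ("INV-1", 120, 5)}

    def get_invoice(self, invoice_id: str) -> InvoiceSummary:
        if not invoice_id:
            raise ValueError("invoice_id must be non-empty")
        row_id, unit_cents, quantity = self._rows[invoice_id]
        # Map internal data semantics onto the external contract type.
        return InvoiceSummary(invoice_id=row_id, total_cents=unit_cents * quantity)
```

A consumer written against `InvoiceService` keeps working if the implementation later swaps its tuple rows for a database table, because only the mapping inside `get_invoice` has to change.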
The message itself is simply some sort of data structure—such as a string, a byte array, a record, or an object. It can be interpreted simply as data, as the description of a command to be invoked on the receiver, or as the description of an event that occurred in the sender. When two applications wish to exchange a piece of data, they do so by wrapping it in a message. Message construction introduces the design issues to be considered after generating the message. In this message construction patterns catalogue we will present three important message construction patterns.
In any messaging system, a consumer might send several message requests to different service providers. As a result it receives several replies. There must be some mechanism to correlate the replies to the original request.
Figure 1. Correlation identifier
Each reply message should contain a correlation identifier: a unique ID that indicates which request message the reply is for. The correlation ID is taken from a unique ID contained within the request message.
There are six parts to Correlation Identifier:
- Requestor—Consumer application.
- Replier—Service Provider. It receives the request ID and stores it as the correlation ID in the reply.
- Request—A Message sent from the consumer to the Service Provider containing a request ID.
- Reply—A Message sent from the Service Provider to the Consumer containing a correlation ID.
- Request ID—A token in the request that uniquely identifies the request.
- Correlation ID—A token in the reply that has the same value as the request ID in the request.
When a request message is created, it is assigned a request ID.
When the service provider processes the request, it saves the request ID and adds that ID to the reply as a correlation ID, which allows each reply to be matched to its request. A correlation ID (and also the request ID) is usually carried in the message header rather than the body, and can be treated as metadata of the message.
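The request-reply matching described above can be sketched as follows. This is an illustrative sketch only: the `Message` class, the header names `request_id` and `correlation_id`, and the `Requestor` bookkeeping are assumptions, not the API of any particular messaging product.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Message:
    body: str
    headers: dict = field(default_factory=dict)  # IDs live in the header, not the body


def new_request(body: str) -> Message:
    # The request ID is assigned when the request is created.
    return Message(body, {"request_id": str(uuid.uuid4())})


def reply_to(request: Message, body: str) -> Message:
    # The replier copies the request ID into the reply as the correlation ID.
    return Message(body, {"correlation_id": request.headers["request_id"]})


class Requestor:
    """Remembers outstanding requests so replies can be matched in any order."""

    def __init__(self) -> None:
        self._pending = {}

    def send(self, body: str) -> Message:
        request = new_request(body)
        self._pending[request.headers["request_id"]] = request
        return request

    def match(self, reply: Message) -> Message:
        # Look up (and forget) the request this reply belongs to.
        return self._pending.pop(reply.headers["correlation_id"])
```

Because the requestor keys its pending requests by request ID, replies from different providers can arrive in any order and still be paired with the right request.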
Because messaging is inherently distributed, communication generally occurs over a network, so you must use the available bandwidth sensibly to maintain good performance. In certain scenarios (such as sending a list of invoices for a particular customer) you might need to send large amounts of data (100 MB or more). In such cases it is recommended to divide the data into smaller chunks and send the data as a set of messages. The problem is how to reassemble the data chunks into the whole set.
Figure 2. Message sequence indicating size
Use a message sequence, and mark each message with sequence identification fields.
The three message sequence identification fields are:
- Sequence ID—Used to differentiate one sequence from another.
- Position ID—A relative unique ID to identify a message position within a particular sequence.
- End of Sequence indicator—Used to indicate the end of a sequence.
Figure 3. Message sequence with message end indicator
The sequences are typically designed such that each message in a sequence indicates the total size of the sequence; that is, the number of messages in the sequence (see Figure 2). As an alternative, you can design the sequences such that each message indicates whether it is the last message in that sequence (see Figure 3). Let's take a real-life example. Suppose we want to generate a report for all invoices from 01/01/2001 to 31/12/2003. This might return millions of records. To handle this scenario, divide the timeframe into quarters and return data for each quarter. The sender sends the quarterly data as messages, and the receiver uses the position ID of each message to reassemble the data and recognizes that all data has been received from the End of Sequence indicator.
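A minimal sketch of the end-indicator variant follows. The `SequencedMessage` type and the `split` and `reassemble` helpers are hypothetical names introduced for illustration; a real system would also have to deal with duplicates and timeouts, which are left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequencedMessage:
    sequence_id: str       # distinguishes this sequence from others
    position: int          # position of the message within the sequence
    end_of_sequence: bool  # True only on the last message
    payload: list


def split(sequence_id: str, records: list, chunk_size: int) -> list:
    """Divide records into chunks and mark each chunk with sequence fields."""
    chunks = [records[i:i + chunk_size]
              for i in range(0, len(records), chunk_size)] or [[]]
    last = len(chunks) - 1
    return [SequencedMessage(sequence_id, pos, pos == last, chunk)
            for pos, chunk in enumerate(chunks)]


def reassemble(messages: list) -> list:
    """Rebuild the original records from messages received in any order."""
    ordered = sorted(messages, key=lambda m: m.position)
    if not ordered or not ordered[-1].end_of_sequence:
        raise ValueError("sequence incomplete: end-of-sequence message missing")
    if [m.position for m in ordered] != list(range(len(ordered))):
        raise ValueError("sequence incomplete: gap in positions")
    return [record for m in ordered for record in m.payload]
```

For example, `split("invoices-2001-2003", records, 4)` on ten records yields three messages, and `reassemble` returns the ten records in order even if the messages arrive reversed.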
Messages are stored on disk or persistent media. With the growing number of messages, disk space is consumed. At the end of messaging life cycle, messages should be expired and destroyed to reclaim disk space.
Figure 4. Message expiration
Set the message expiration to specify a time limit for preservation of messages on persisting media.
A message expiration is a timestamp (date and time) that determines the lifetime of the message. When a message expires, the messaging system might simply discard it or move it to a dead letter channel.
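The following sketch shows one way expiration could work, assuming a hypothetical in-memory `Channel` that moves expired messages to a dead letter list. The current time is passed in explicitly so the behavior is easy to follow; a real messaging system would read its own clock.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Message:
    body: str
    expires_at: datetime  # the expiration timestamp


class Channel:
    def __init__(self) -> None:
        self.messages = []
        self.dead_letters = []  # stands in for a dead letter channel

    def send(self, body: str, ttl: timedelta, now: datetime) -> None:
        # The sender sets the expiration as send time plus time to live.
        self.messages.append(Message(body, now + ttl))

    def purge_expired(self, now: datetime) -> None:
        # Expired messages leave the channel and go to the dead letter list.
        expired = [m for m in self.messages if m.expires_at <= now]
        self.messages = [m for m in self.messages if m.expires_at > now]
        self.dead_letters.extend(expired)


t0 = datetime(2004, 1, 1, tzinfo=timezone.utc)
channel = Channel()
channel.send("short-lived", timedelta(minutes=1), t0)
channel.send("long-lived", timedelta(hours=1), t0)
channel.purge_expired(t0 + timedelta(minutes=5))
```

After the purge, only the long-lived message remains on the channel; the short-lived one sits in `dead_letters`, where it could be inspected or discarded.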
Various applications might not agree on the format for the same conceptual data; the sender formats the message one way, but the receiver expects it to be formatted another way. To reconcile this, the message must go through an intermediate conversion procedure that converts the message from one format to another. Message transformation might involve changing data in existing nodes (data addition, data removal, or temporary data removal) by applying business rules. Sometimes it might populate an empty node as well. Here we present a few important message transformation patterns.
When one message format is encapsulated inside another, the system might not be able to access node data. Most messaging systems allow components (for example, a content based router) to access only data fields that are part of the defined message header. If one message is packaged into a data field inside another message, the component might not be able to use the fields to perform routing or business rule based transformation. Therefore, some data fields might have to be elevated from the original message into the message header of the new message format.
Use an envelope wrapper to wrap data inside an envelope that is compliant with the messaging infrastructure. Unwrap the message when it arrives at the destination.
The process of wrapping and unwrapping a message consists of five steps:
- The message source publishes a message in a raw format.
- The wrapper takes the raw message and transforms it into a message format that complies with the messaging system. This may include adding message header fields, encrypting the message, adding security credentials etc.
- The messaging system processes the compliant messages.
- A resulting message is delivered to the unwrapper. The unwrapper reverses any modifications the wrapper made. This may include removing header fields, decrypting the message or verifying security credentials.
- The message consumer receives a 'clear text' message.
An envelope typically wraps both the message header and the message body or payload. We can think of the header as being the information on the outside of the envelope – it is used by the messaging system to route and track the message. The contents of the envelope are the payload, or body – the messaging infrastructure does not care about it until it arrives at the destination.
Figure 5. Envelope wrapper
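The wrap, process, and unwrap steps can be sketched as below. The JSON envelope layout and the `routing_key` header are assumptions made for this example, and Base64 encoding merely stands in for whatever transformation (such as encryption) a real wrapper would apply; it provides no security.

```python
import base64
import json


def wrap(raw: bytes, routing_key: str) -> str:
    """Wrap a raw message in an envelope the messaging system understands."""
    envelope = {
        # Header: the 'outside of the envelope', readable by the infrastructure.
        "header": {"routing_key": routing_key, "content_encoding": "base64"},
        # Body: the payload, opaque to the infrastructure.
        "body": base64.b64encode(raw).decode("ascii"),
    }
    return json.dumps(envelope)


def route(envelope_json: str) -> str:
    """The messaging system looks only at header fields, never at the body."""
    return json.loads(envelope_json)["header"]["routing_key"]


def unwrap(envelope_json: str) -> bytes:
    """Reverse what the wrapper did and hand back the original raw message."""
    envelope = json.loads(envelope_json)
    return base64.b64decode(envelope["body"])


raw_message = b"LOAN|4711|approve"
envelope = wrap(raw_message, "loans")
```

Here `route(envelope)` can pick the destination from the header alone, and `unwrap(envelope)` gives the consumer back exactly the bytes the source published.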
Let's consider an example. An online loan processing system receives information including a customer credit card number and an SSN. In order to complete the approval process, it needs to perform a complete credit history check. However, this loan processing system doesn't have the credit history data. How do we communicate with another system if the message originator does not have all the required data fields available?
Figure 6. Content enricher
Use a specialized transformer, a content enricher, to access an external data source in order to enrich a message with missing information.
The content enricher uses information embedded in the incoming message to retrieve data from an external source. After successfully retrieving the required data, it appends the data to the message. The content enricher is often used to resolve reference IDs contained in a message. To keep messages small, manageable, and easy to transport, we very often pass simple object references or keys rather than a complete object with all its data elements. The content enricher retrieves the required data based on the object references included in the original message.
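As a rough sketch of the loan example, the enricher below uses the SSN in the incoming message as the key into an external source. The `CREDIT_HISTORY` dictionary is a hypothetical stand-in for that external system, and the field names are invented for illustration.

```python
# Hypothetical external data source, keyed by SSN.
CREDIT_HISTORY = {
    "123-45-6789": {"score": 710, "defaults": 0},
}


def enrich(message: dict, credit_source: dict) -> dict:
    """Return a copy of the message with credit history appended."""
    enriched = dict(message)  # leave the incoming message untouched
    # Use the reference carried in the message to fetch the missing data.
    enriched["credit_history"] = credit_source[message["ssn"]]
    return enriched


loan_request = {"customer": "A. Smith", "ssn": "123-45-6789", "amount": 25000}
loan_with_history = enrich(loan_request, CREDIT_HISTORY)
```

The loan processing system only ever sends the small `loan_request`; the enricher adds the credit history before the message reaches the component that needs it.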
The content enricher helps us in situations where a message receiver requires more (or different) data elements than are contained in the original message. There are surprisingly many situations where the reverse is desired: the removal of data elements from a message. Data is removed from the original message to simplify message handling, to remove sensitive security data, and to reduce network traffic. Therefore, we need to simplify the incoming documents to include only the elements we are actually interested in.
Figure 7. Content filter
Use a content filter to remove unimportant data items from a message.
The content filter not only removes data elements but also simplifies the message structure. Many messages originating from external systems or packaged services contain multiple levels of nested, repeating groups because they are modeled after generic, normalized database structures. The content filter flattens this complex nested message hierarchy. Multiple content filters can be used to break one complex message into individual messages that each deal with a certain aspect of the large message.
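A content filter that both drops unwanted elements and flattens the hierarchy might look like the sketch below. The `content_filter` function and the way it describes wanted fields (output name mapped to a path of keys) are assumptions chosen for this example.

```python
def content_filter(message: dict, wanted: dict) -> dict:
    """Keep only the wanted fields and flatten them into one level.

    `wanted` maps each output field name to the path of keys that
    locates the value inside the nested incoming message.
    """
    flat = {}
    for name, path in wanted.items():
        node = message
        for key in path:  # walk down the nested structure
            node = node[key]
        flat[name] = node
    return flat


incoming = {
    "customer": {
        "name": "A. Smith",
        "ssn": "123-45-6789",  # sensitive, not needed downstream
        "address": {"city": "Pune", "zip": "411001"},
    },
    "audit": {"trace": "t-1"},
}
filtered = content_filter(
    incoming,
    {"name": ("customer", "name"), "city": ("customer", "address", "city")},
)
```

The result is the flat message `{"name": "A. Smith", "city": "Pune"}`; the SSN and audit data never leave the filter. Running several filters with different `wanted` maps over the same input produces several small messages, each covering one aspect of the original.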
A content enricher enriches message data and a content filter removes unneeded data items from a message. Sometimes, however, the scenario might be a little different. Moving large amounts of data via messages might be inefficient due to network limitations or hard limits on message size, so we might need to temporarily remove fields for specific processing steps where they are not required, and add them back into the message at a later point.
Figure 8. Claim check
Store message data in a persistent store and pass a claim check to subsequent components. These components can use the claim check to retrieve the stored information using a content enricher.
The Claim Check pattern consists of the following five steps:
- A message with data arrives.
- The 'check luggage' component generates a unique key that is used at a later stage as the claim check.
- The check luggage component extracts the data from the message and stores it in the persistent store under the unique key.
- It removes the persisted data from the message and adds the claim check.
- A later component uses a content enricher to retrieve the checked data based on the claim check.
This process is analogous to a luggage check at the airport. If you do not want to carry your luggage with you, you simply check it with the airline counter. In return you receive a sticker on your ticket that has a reference number that uniquely identifies each piece of luggage you checked. Once you reach your final destination, you can retrieve your luggage.
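The five steps can be sketched as follows. `ClaimCheckStore` is a hypothetical class whose in-memory dictionary stands in for the persistent store, and the `claim_check` field name is likewise an assumption.

```python
import uuid


class ClaimCheckStore:
    """In-memory stand-in for the persistent store behind the claim check."""

    def __init__(self) -> None:
        self._store = {}

    def check_in(self, message: dict, field: str) -> dict:
        """'Check luggage': store one field and replace it with a claim check."""
        claim = str(uuid.uuid4())          # generate the unique key
        self._store[claim] = message[field]  # persist the data under the key
        slim = {k: v for k, v in message.items() if k != field}
        slim["claim_check"] = claim        # the message now carries only the key
        return slim

    def redeem(self, message: dict, field: str) -> dict:
        """Enricher side: use the claim check to put the stored data back."""
        restored = {k: v for k, v in message.items() if k != "claim_check"}
        restored[field] = self._store.pop(message["claim_check"])
        return restored


store = ClaimCheckStore()
original = {"order_id": 42, "scanned_document": b"x" * 1000}
slim_message = store.check_in(original, "scanned_document")
restored_message = store.redeem(slim_message, "scanned_document")
```

Intermediate components handle only `slim_message`, which carries the order ID and the claim check; the large field rejoins the message when `redeem` is called at the point where it is needed again.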
SOA stresses interoperability: the ability of different platforms and languages to communicate with each other. Today's enterprise needs a technology-neutral solution to orchestrate business processes across verticals. SOA, then, presents a shift from the traditional paradigm of enterprise application integration (EAI), where automation of a business process required specific connectivity between applications.
According to Robert Shimp, vice president of Technology Marketing at Oracle: "EAI requires specific knowledge of what each application provided ahead of time. SOA views each application as a service provider and enables dynamic introspection of services via a common service directory, Universal Description Discovery and Integration of Web services (UDDI)."
Messaging is the backbone of SOA. Steven Cheah, director of Software Engineering and Architecture at Microsoft Singapore, states: "We now finally have a standard vehicle for achieving SOA. We can now define the message standards for SOA using these Web services standards."
Cheah considers SOA 'a refinement of EAI'. Specifically, SOA recommends some principles, which actually help achieve better application integration. These principles include the description of services by the business functions they perform; the presentation of services as loosely-coupled functions with the details of their inner workings not visible to parties who wish to use them; the use of messages as the only way 'in' or 'out' of the services; and federated control of the SOA across organizational domains, with no single party having total control of it.
We started at the ten-thousand-foot level with a vision of the service-oriented enterprise. We then descended through a common architecture (SOA) and proceeded by outlining messaging. Now we are armed with the messaging patterns needed to attack the complexities of SOA and to achieve the vision of a dynamic, process-oriented, service bus enterprise.
1. Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions, Gregor Hohpe and Bobby Woolf, Addison-Wesley, 2004
2. Service Oriented architecture: A Primer, Michael S Pallos, EAI Journal, December 2001
3. Solving Information Integration Challenges in a Service-Oriented Enterprise, ZapThink Whitepaper, http://www.zapthink.com
4. SOA and EAI, De Gamma Website, http://www.2gamma.com/en/produit/soa/eai.asp
5. Introduction to Service-Oriented Programming, Guy Bieber and Jeff Carpenter, Project Openwings, Motorola ISD, 2002
6. Java Web Services Architecture, James McGovern, Sameer Tyagi, Michael Stevens, and Sunil Mathew, Morgan Kaufmann, 2003
7. Using Service-Oriented Architecture and Component-Based Development to Build Web Service Applications, Alan Brown, Simon Johnston, and Kevin Kelly, IBM, June 2003
8. The Modular Structure of Complex Systems, Parnas D and Clements P, IEEE Journal, 1984
9. Design Patterns: Elements of Reusable Object-Oriented Software, Gamma E, Helm R, Johnson R, and Vlissides J, Addison-Wesley, 1994
10. Computerworld Year-End Special: 2004 Unplugged, Vol. 10, Issue No. 10, 15 December 2003 - 6 January 2004, http://www.computerworld.com.sg/pcwsg.nsf/currentfp/fp
11. Applying UML and Patterns – An introduction to OOA/D and the Unified Process, Craig Larman, 2001
G Hohpe & B Woolf, ENTERPRISE INTEGRATION PATTERNS, (adapted material from pages 59-83), © 2004 Pearson Education, Inc. Reproduced by permission of Pearson Education, Inc. Publishing as Pearson Addison Wesley. All rights reserved.
About the author
Soumen Chatterjee is a Microsoft Certified Professional and Sun Certified Enterprise Architect. He's significantly involved in enterprise application integration and distributed object oriented system development using Java/J2EE technology to serve global giants in the finance and health care industries. With expertise in EAI design patterns, messaging patterns and testing strategies he designs and develops scalable, reusable, maintainable and performance tuned EAI architectures. Soumen is a Senior Consultant with Cap Gemini Ernst & Young. He's an admirer of extreme programming methodology and has primary interests in AOP and EAI. Besides software, Soumen likes movies, music, and follows mind power technologies. Soumen can be reached at firstname.lastname@example.org.
This article was published in the Architecture Journal, a print and online publication produced by Microsoft. For more articles from this publication, please visit the Architecture Journal website.