More Discussion of SOA is like the Night Sky...

I received some thoughtful commentary from someone named John, and I thought I would share his comments and some of my responses with you all. I’m still getting used to blogging and don’t know how to give John better attribution than his first name. Here goes:

Sender: John


re: SOA is like the Night Sky...

[John] This seems overly simplified to me. You say 'data being copied is unlocked'.

[John] In my mind I have always considered that simply by having data I have an implicit read-lock on that data. This is an optimistic read-lock, but a lock nevertheless. If I am not in the sole execution context (basically 'logical thread') that manages this data my optimistic read-lock can become stale any moment after I have received it. In short, *data is a read-lock*. It's what you know and you have to act on it until you know it's now obsolete and you have been wasting your time.

[Pat] This is a great question (or comment). The issue cuts to what you believe about a service and its relationship to the outside world. My assertion is that a distrusting relationship with autonomy WILL NOT include data updates of the backend database (even with optimistic concurrency control). A distrusting service will insist on verifying the behavior implied by incoming work through its business logic. I’m fully aware that optimistic concurrency control can be made to work across distances and without holding locks. The issue is that this is unsafe behavior.

[Pat] If I (a service) value my data, I’m not going to let others change it. This is about independence, autonomy, and trust. The IRS does not let me perform optimistic concurrency control against their backend database when I post my tax return. The premise behind services is a loose coupling and distrust between the participating services. In my opinion, this means that a read/write semantic against backend data is completely unacceptable.

[John] You say the 'act of sending a message by the remote service involves its unlocking of the records containing the data being transmitted' but this is not true, it is issuing an optimistic read-lock. Optimistic concurrency is not a new idea.

[Pat] Again, it’s about the semantics available across distrusting boundaries, not the ability to use optimistic concurrency control.

[John] If I was providing a service-interface (foo) that only supported read operations then I guess I could disregard these locks and not care about the fact that my client’s data is becoming stale. Most real world services (bars) will also receive requests for alterations to data via messages on their service-interfaces (foos). Pessimistic locking doesn't strike me as feasible in any way, shape or form for a distributed system (where a 'distributed system' is over a network where one node can fail (or degrade) independently of the entire system), but clients of my service must know what data I have before they can send me a request to update it. If their read-lock has become stale I must fail them with a concurrency error, and then force them to begin again or move them into the 'merge' process. This is not a new idea, and in my view it doesn't fall outside the scope of 'transaction management'. Is SOA simply a new vocabulary? What's wrong with the one that we already have? Why is 'SOA like the night sky'? Isn't it just like optimistic concurrency?
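To make John's model concrete, here is a minimal sketch of the version-check scheme he describes. The `VersionedStore` class, the `StaleReadError` exception, and the room example are all hypothetical names invented for illustration; they are not from John's comment or from any particular product.

```python
class StaleReadError(Exception):
    """Raised when an update is based on an outdated version of a record."""


class VersionedStore:
    """Hypothetical in-memory store that tags each record with a version number."""

    def __init__(self):
        self._records = {}  # key -> (version, value)

    def read(self, key):
        # The returned version plays the role of John's "optimistic read-lock".
        return self._records.get(key, (0, None))

    def update(self, key, expected_version, new_value):
        current_version, _ = self._records.get(key, (0, None))
        if current_version != expected_version:
            # Someone else changed the record since the caller read it.
            raise StaleReadError(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        self._records[key] = (current_version + 1, new_value)
        return current_version + 1


store = VersionedStore()
store.update("room-101", 0, "vacant")
version_a, _ = store.read("room-101")  # client A reads
version_b, _ = store.read("room-101")  # client B reads the same version
store.update("room-101", version_a, "booked by A")
try:
    store.update("room-101", version_b, "booked by B")
    b_failed = False
except StaleReadError:
    b_failed = True  # B must re-read and retry, or merge
```

Note that in this sketch both clients write the record directly; that direct access is exactly the point under debate.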

[Pat] If this were about optimistic concurrency control, we would be pursuing a reincarnation of the same behavior. In that case, you would be correct that it is simply inventing a new vocabulary. It is not about the same behavior, though.

[Pat] SOA is about interacting with a business-function semantic. It is also about the assumption that when you do your business function, it is only connected via messaging. This leads us to a style of interaction that is reminiscent of the way we interact with businesses. I may place a hotel reservation (and, perhaps, later on cancel that reservation). I don’t fiddle with the hotel’s backend database records.
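A rough sketch of the hotel example may help, assuming a hypothetical `HotelService` whose only entry point is a `handle` method that accepts messages. The message shapes and field names are made up for illustration. The caller asks for a business function to be performed, and the service's own logic decides what happens to its private data.

```python
import uuid


class HotelService:
    """Hypothetical service: the only way in is handle(message)."""

    def __init__(self, rooms_available):
        self._rooms_available = rooms_available  # private backend state
        self._reservations = {}

    def handle(self, message):
        # Dispatch on the business function requested, not on a record to edit.
        kind = message.get("type")
        if kind == "PlaceReservation":
            return self._place(message)
        if kind == "CancelReservation":
            return self._cancel(message)
        return {"status": "rejected", "reason": "unknown request"}

    def _place(self, message):
        # The service's own business logic decides whether to accept the request.
        if self._rooms_available <= 0:
            return {"status": "rejected", "reason": "no rooms"}
        self._rooms_available -= 1
        confirmation = uuid.uuid4().hex
        self._reservations[confirmation] = message["guest"]
        return {"status": "confirmed", "confirmation": confirmation}

    def _cancel(self, message):
        if self._reservations.pop(message.get("confirmation"), None) is None:
            return {"status": "rejected", "reason": "no such reservation"}
        self._rooms_available += 1
        return {"status": "cancelled"}


hotel = HotelService(rooms_available=1)
first = hotel.handle({"type": "PlaceReservation", "guest": "Pat"})
second = hotel.handle({"type": "PlaceReservation", "guest": "John"})
cancel = hotel.handle(
    {"type": "CancelReservation", "confirmation": first["confirmation"]}
)
```

The second request is turned down by the hotel's logic rather than by a version conflict on a record, and the caller never learns how rooms are stored.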

[Pat] This is why there’s a lot of excitement about SOA. While it was done before we came up with the new name (e.g. EDI, MQ), it has not been worked on with the same intensity and with the same hope for broad impact. What you posit for interaction (with optimistic concurrency control over direct access to the partner’s data) is definitely not SOA.

[John] Since a client can take these read-locks, they are implicitly involved in a distributed transaction whenever they hold data they requested from a service where there might be some intention to post a message to the service based on the contents of that data. This type of distributed transaction's ACID principles still apply, but there is always the risk of processing or viewing stale data (because we don't serialize access to data). These ideas don't strike me as really new or ground breaking. A paradigm where a service maintained a record of state known to all clients and managed a message dispatch system that let them know that 'the sun just blew up' (or more likely 'data you hold a read-lock on just got modified') would be, but optimistic concurrency is not.

[Pat] Same comment as above. The interaction is not about record reading and is not about ACID transactions that span the services (“bar”s).

[John] By the way, if the Sun blows up it is telling all its clients as soon as it can about that change in state (all practical latency aside). By virtue of time and motion there isn't such a thing as 'real-time' when you have more than one execution context, but there is 'as close to real-time as possible' (invariably race conditions will need to be dealt with). I'm pretty sure that SOA doesn't imply that all services will notify all clients about a change in state that they would have an interest in at the speed of light (but the Sun would if it blew up (and I didn't even ask for a read-lock, I just get one, it streams its state at me as fast as it can)).

[Pat] In fact, I am trying to point out that SOA is about looseness between the services. The behavior of a collection of services should be identical even if one of them goes up and down intermittently (with the exception, of course, that the responsiveness of the collection of services is impacted). The use of queues for the messages that connect the services allows for a great deal of tolerance of intermittent availability. Amazon.Com (and most scalable web sites) have a scale-out front end and a centralized back end. Browsing and shopping happen on the front end. When you push SUBMIT, a message is enqueued for delivery to the back end system. Normally, you get an email from the back end system a few seconds later. Sometimes the back end system is down for a while and you get the email in an hour or so. You still get your books.
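The SUBMIT flow can be sketched roughly as follows. This is only an illustration under stated assumptions: the `BackEnd` class, `submit`, and `drain` are hypothetical names, and an in-memory `deque` stands in for what would really need to be a durable message queue.

```python
from collections import deque


class BackEnd:
    """Hypothetical back end that can be up or down."""

    def __init__(self):
        self.up = True
        self.emails_sent = []

    def process(self, order):
        self.emails_sent.append(f"Your order {order} is confirmed")


order_queue = deque()  # assumption: stands in for a durable message queue


def submit(order):
    # The front end only enqueues; it never waits for the back end.
    order_queue.append(order)
    return "Thanks! A confirmation email will follow."


def drain(back_end):
    # Run whenever the back end is available; does nothing while it is down.
    while back_end.up and order_queue:
        back_end.process(order_queue.popleft())


back_end = BackEnd()
back_end.up = False
submit("book-1")
submit("book-2")
drain(back_end)  # back end is down: orders simply wait in the queue
pending_while_down = len(order_queue)
back_end.up = True
drain(back_end)  # back end is back: the waiting orders flow through
```

The outcome is the same whether or not the back end was down when the orders were submitted; only the delay before the emails differs.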

[Pat] So, one of the ideas is to tolerate the fact that these systems are in different time domains as much as possible. That is the opposite of believing it is at the speed of light.

[John] Also, I'm still not really comfortable with these exploding layers of abstraction that all do the same thing. For example, you define bar as: “a collection of data and logic that is completely isolated from anything else except through incoming messages. A bar has explicit boundaries and is autonomous. Typically (i.e. in real applications), a bar is implemented as a bunch of code surrounding a set of tables in a single database.”

[John] That sounds like a function to me. Oh, and a class. Oh, and an API. Oh, and a process. Oh, and an operating system. But apparently it's this new and 'different' thing..? Concurrency has been an issue with all types of messaging paradigms with multiple execution contexts. The context could be threads, local processes, remote processes and beyond (into real life if you like), basically any situation where you can lose a deterministic order of events.

[Pat] Oh, would that this were a function, a class, or (dare I say it) a component. When was the last time you saw a system in which all of the components minded their own business and didn’t fiddle with the same data that other components fiddled with? When I look at enterprise applications that manage data stored in the database, I see lots of different components fiddling with the same data. Furthermore, you have to look pretty far to actually be able to group a collection of database tables and a collection of code into a chunk where no transaction spans this chunk and that chunk.

[Pat] Some of this may be a community perspective. Language and object folks don’t think about the disjointedness of the database data that must exist to have true encapsulation. They think about member variables and avoiding the use of globals (which is a fine thing). If, however, this component and that component fiddle with the same database records, what kind of encapsulation does your component have?

[John] Schema, contract and policy also seem to me to have been around for a very long time, at many different levels. Didn't the word for this used to be 'type'?

[Pat] Not an unreasonable perspective but missing a few things. As we are trying to hook together our services, we do not want to be as chatty as we have been with component-based systems. It becomes important to package up a big request and ship it off to the services with the possibility of as much looseness and independence in the definition as possible. I totally agree that schema is very much like types and, indeed, very much like interfaces. Interfaces are almost always finer-grained and chattier. Interfaces did not include notions comparable to contract or policy, though. The interface did not say the allowable order of method calls. Similarly, interfaces and types did not attempt to address the domain that policy is targeting.
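One way to picture what a contract adds beyond an interface is a table of which message may follow which. The sketch below is purely illustrative: the `CONTRACT` table, its state names, and its message types are hypothetical, not taken from any specification.

```python
# Hypothetical contract: for each conversation state, the message types
# that are allowed next and the state each one leads to.
CONTRACT = {
    "start": {"RequestQuote": "quoted"},
    "quoted": {"PlaceOrder": "ordered", "Decline": "closed"},
    "ordered": {"CancelOrder": "closed"},
    "closed": {},
}


def follows_contract(messages):
    """Return True if the sequence of message types is allowed by CONTRACT."""
    state = "start"
    for message_type in messages:
        allowed = CONTRACT[state]
        if message_type not in allowed:
            return False
        state = allowed[message_type]
    return True


ok = follows_contract(["RequestQuote", "PlaceOrder", "CancelOrder"])
out_of_order = follows_contract(["PlaceOrder"])
```

An interface alone would list the three operations; it would not say that placing an order before requesting a quote is out of bounds.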

[John] If SOA is going to be thrown around as the flavour of the month, I reckon it'd be worth admitting that it was something far more specific. Give it real concrete bounds, not vague wishy-washy ones that could just as easily describe how my class relies on the 'services' of a function, my API relies on the 'services' of a class, my process relies on the 'services' of an API, my OS relies on the 'services' of a process, etc.

[John] Rebranding old ideas isn't progress. It's marketing.

[Pat] I think there is a difference. I will totally grant you that there is ambiguity about the crisp delineation between what a service is and what a component is amongst the SOA crowd. We are debating these issues and trying to form a consensus just as the object folks went through some churn during their formative times.

[Pat] It is my opinion that there is a substantive difference and, hence, this is more than simply rebranding. I would be pleased to have your commentary and to do my best to address anything you have to say in this forum. I am pleased as can be about your firmness in expressing your opinions and concerns!

[John] If service is foo, and service-interface is bar, then SOA is snafu.

[Pat] I had intended that a bar be a service and a foo be a service-interface in my earlier discussion. As I said, these terms are subject to discussion and redefinition. Still, I don’t think that’s a big part of your concern compared to the issues discussed above.

[Pat] Thanks, again, for your vociferous comments! Please let me know what you think!