UML and DSLs Again

I’m often asked by audiences, visitors to Microsoft and journalists to explain our position with respect to UML (e.g. VSLive! Interview). Many people who read our views on model driven development, as described in these postings and other places, assume that our emphasis on domain specific (modeling) languages, or DSLs, has somehow put us into an anti-UML position. We want to make it clear that this is not true. While I laughed out loud at some points in Alex Bell’s excellent article in the March 2004 ACM Queue, Death By UML Fever, we still agree with many of the points made by Grady Booch in his response. Before UML, there was an unproductive diversity of modeling approaches, and their convergence into UML 1.0 was a significant step forward in using models in software development.


But for whatever reasons, the existence of UML and UML-based tools has not significantly changed the way developers build applications, nor has it significantly contributed to developer productivity. Since Microsoft ships one of the most-used UML toolsets – the one based on Visio in Visual Studio Enterprise Architect – we anonymously survey developers (not just our customers) on tool usage. We have discovered that only a very small population claims to use UML tools in support of their tasks, and most usage clusters around class diagrams and use case diagrams. When we drill into those who claim to use class diagrams, only a tiny fraction actually uses them to generate code.


This was one of the driving forces behind Whitehorse. We really wanted to take tasks that developers and architects find difficult, and figure out ways that modeling tools could add value and help them. We are enthusiastic supporters of UML notation and diagrams. A walk around the developers’ offices just in my corridor reveals whiteboards covered with UML class diagrams and sequence diagrams. Just the other day, my colleagues and I were in the cafeteria sketching out an architecture on a napkin using UML diagram fragments. We use UML notation in specification documents, and I have a data model of a large chunk of our Business Solution division’s applications hanging as a poster in my office – yes, drawn with UML notation! To support our customers’ need to produce documentation and conceptual sketches, we’ll continue to ship the UML toolset with Visual Studio 2005. In fact, at Microsoft generally, we use UML for many purposes – mostly documentation or sharing of conceptual ideas – but almost never for any purpose where those documents relate to actual software development artifacts.


Those same whiteboards in offices in my corridor are also covered with scrawled code. But again – these are sketches – they are rarely precise, compilable program source. That’s a key difference for developers. Any artifact that contributes to actual software development must be capable of being manipulated digitally. Source code has a well-defined syntax, a comprehensible semantics – often defined behaviorally by the compiler’s transformation of it to lower level code or IL – and can be manipulated consistently by compilers, debuggers and refactoring programs, to name but a few. To be useful to developers, a model must have the same status as source code. It too must have a precise syntax, a comprehensible semantics, and well-defined mappings to source code or other well-defined models. And it must be more than just documentation.


Take the Whitehorse Service Designer, for example. It’s not just documentation, although it can serve that purpose. Rather it allows a developer (or architect) to focus on one aspect of her system – the connectivity between services in a service-based architecture. She can design this aspect of the system before building projects, WSDL files, code and schemas, or ask the tool to document connectivity between services if those artifacts already exist. Since connectivity information is scattered throughout many development artifacts, the holistic view such a diagram gives is fundamentally useful even though all the information it conveys could be deduced by careful scrutiny of the implementation artifacts. The Service Designer has a well-defined syntax (its DSL metamodel), and predictable, always-synchronized mappings to the various implementation artifacts. The underlying designer framework plays the role of a compiler for Service Designer diagrams, much like the role played by a traditional compiler with respect to source code files.
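
To make the idea of a model with "the same status as source code" concrete, here is a minimal sketch in Python. It is not how the Service Designer is implemented; every name in it (Endpoint, Service, ConnectivityModel, client_config, the IBilling contract) is hypothetical and chosen only for illustration. It shows a tiny purpose-built metamodel for service connectivity, a rule that rejects ill-formed connections when they are made, and one mapping from the model to an implementation-style artifact.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Endpoint:
    """A named port on a service (hypothetical metamodel element)."""
    name: str
    role: str      # "provider" or "consumer"
    contract: str  # identifier of the contract the endpoint speaks


@dataclass
class Service:
    name: str
    endpoints: list = field(default_factory=list)


@dataclass
class ConnectivityModel:
    services: dict = field(default_factory=dict)
    connections: list = field(default_factory=list)

    def add_service(self, service):
        self.services[service.name] = service

    def _lookup(self, ref):
        # ref is a (service name, endpoint name) pair
        service_name, endpoint_name = ref
        service = self.services.get(service_name)
        if service is None:
            raise ValueError(f"unknown service: {service_name}")
        for endpoint in service.endpoints:
            if endpoint.name == endpoint_name:
                return endpoint
        raise ValueError(f"unknown endpoint: {service_name}.{endpoint_name}")

    def connect(self, consumer, provider):
        # The "syntax" of the language: only well-formed connections are accepted.
        c, p = self._lookup(consumer), self._lookup(provider)
        if c.role != "consumer" or p.role != "provider":
            raise ValueError("a connection runs from a consumer to a provider")
        if c.contract != p.contract:
            raise ValueError(f"contract mismatch: {c.contract} vs {p.contract}")
        self.connections.append((consumer, provider))

    def client_config(self, service_name):
        # The "mapping": derive one made-up artifact (config lines) from the model.
        return [
            f"{consumer[1]} -> {provider[0]}.{provider[1]}"
            for consumer, provider in self.connections
            if consumer[0] == service_name
        ]


model = ConnectivityModel()
model.add_service(Service("Orders", [Endpoint("billing", "consumer", "IBilling")]))
model.add_service(Service("Billing", [Endpoint("charge", "provider", "IBilling")]))
model.connect(("Orders", "billing"), ("Billing", "charge"))
print(model.client_config("Orders"))
```

Because the rules live in the model itself, a connection to a missing endpoint or a mismatched contract fails at the moment it is drawn, much as a compiler rejects ill-formed source, rather than surfacing later in the generated artifacts.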


But, you say, why couldn’t the Whitehorse team have just built this new “language” of service connectivity as an extension to UML – especially with the improvements made to UML 2.0?


Well, when we looked at the direction the UML 2.0 specification had taken, we realized that there were several reasons this could still not be the basis for development artifacts that were anything more than documentation of a system. Since no natural sub-language of UML fits the requirements for service connectivity, we would have had to resort to describing our Service Connectivity DSL using stereotypes and tags on an existing UML sub-language. This would have led to an overly complicated model within what has been described by many in the industry as an already bloated and complex specification. Using standard UML notation, in which an existing shape corresponding to whatever sub-language we’d extended is reused, would have compromised the readability and clarity of the diagrams. Lastly, we’d have been dogged by the lack of precision in the specification in key areas, and the mismatch of type systems inherent in UML compared to the .NET and XML languages.


For these reasons, we elected to define the Service Connectivity DSL using a purpose-built metamodel, itself defined as a member of a family of related metamodels. This gives us a natural and precise foundation for service connectivity, and gives us a high-fidelity mapping to the underlying implementation artifacts, which include, of course, some amount of code. This is the same conclusion we have reached on other focused development tasks, and it led to similar DSLs for the Class Designer and the Logical Data Center Designer described in other postings. Support for extensible DSLs, built as a family, with well-defined mappings between them and other development artifacts has thus become the basis of Microsoft’s model driven development strategy.
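
As a rough illustration of what a well-defined mapping between two DSLs in a family might look like, the Python sketch below checks a placement of services onto hosts across two small models. Both "languages", their element types (ServiceDecl, HostDecl) and the check_placement function are assumptions invented for this example; they are not taken from the Whitehorse designers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDecl:
    """Element of a hypothetical service DSL."""
    name: str
    protocol: str  # e.g. "http" or "https"


@dataclass(frozen=True)
class HostDecl:
    """Element of a hypothetical hosting DSL."""
    name: str
    allowed_protocols: frozenset


def check_placement(services, hosts, placement):
    """Check a placement (service name -> host name) against both models.

    Returns a list of problems; an empty list means the two models agree.
    """
    service_by_name = {s.name: s for s in services}
    host_by_name = {h.name: h for h in hosts}
    problems = []
    for service_name, host_name in placement.items():
        service = service_by_name.get(service_name)
        host = host_by_name.get(host_name)
        if service is None:
            problems.append(f"placement names unknown service {service_name}")
        elif host is None:
            problems.append(f"placement names unknown host {host_name}")
        elif service.protocol not in host.allowed_protocols:
            problems.append(
                f"{service_name} needs {service.protocol}, which {host_name} does not allow"
            )
    for name in service_by_name:
        if name not in placement:
            problems.append(f"{name} is not placed on any host")
    return problems


services = [ServiceDecl("Orders", "https"), ServiceDecl("Billing", "http")]
hosts = [HostDecl("web", frozenset({"https"}))]
print(check_placement(services, hosts, {"Orders": "web", "Billing": "web"}))
```

The point of the sketch is only that each model stays small and focused, while the relationship between them is expressed as something a tool can evaluate rather than as a convention readers must remember.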


To summarize, we’d recommend using UML and UML-based tools for

  • Sketching,
  • Whiteboarding,
  • Napkins,
  • Documentation,
  • Conceptual drawings that do not directly relate to code.

We’d recommend precisely defined DSLs and DSL-based tools for

  • Precise abstractions from which code is generated
  • Precise abstractions that map to variabilities in frameworks and components
  • Precise mappings between DSLs
  • Conceptual drawings that have precisely specifiable mappings to other DSLs or to code artifacts.

We’d recommend neither of the above for visual programming of detailed program logic (at least not for many years yet), despite Grady Booch’s optimism about executable UML expressed in an interview for the British Computer Society Bulletin that can be found here.