An Overview of Cω

 

Dare Obasanjo
Microsoft Corporation

January 13, 2005

Summary: Dare Obasanjo covers the Cω programming language created by Microsoft Research by augmenting C# with constructs to make it better at processing information, such as XML and relational data. (18 printed pages)

Introduction

One of the main reasons for XML's rise to prominence as the lingua franca for information interchange is that, unlike prior data interchange formats, XML can easily represent both rigidly-structured tabular data (relational data or serialized objects) and semi-structured data (office documents). The former tends to be strongly typed and is typically processed using Object<->XML mapping technologies, while the latter tends to be untyped and is usually processed using traditional XML technologies like DOM, SAX, and XSLT. However, in both cases, there is somewhat of a disconnect for developers processing XML using traditional object oriented programming languages.

In the case of processing strongly typed XML using Object<->XML mapping technologies, there is the impedance mismatch between programming language objects and XML schema languages like DTDs or W3C XML Schema. Notions such as the distinction between elements and attributes, document order, and content models that specify a choice of elements are all intrinsic to XML schema languages, but have no counterpart in traditional object oriented programming. These mismatches tend to lead to some contortions and data loss when mapping XML to objects. Processing untyped XML documents using technologies such as XSLT or the DOM has a different set of issues. In the case of XSLT or other XML-specific languages like XQuery, the developer has to learn a whole other language, in addition to their programming language of choice, to process XML effectively. Often, the benefits of the integrated development environment of the host language, such as compile time checking and IntelliSense, cannot be utilized when processing XML. In the case of processing XML with APIs such as the DOM or SAX, developers often complain that the code they have to write tends to become unwieldy and cumbersome.

As design patterns and Application Programming Interfaces (APIs) for performing particular tasks become widely used, they sometimes become incorporated into programming languages. Programming languages like C# have promoted concepts that exist as design patterns and API calls in other languages, such as native string types, memory management using garbage collection, and event handling to core constructs within the language. This evolutionary process has now begun to involve XML. As XML has grown in popularity, certain parties have begun to integrate constructs for creating and manipulating XML into mainstream programming languages. The expectation is that making XML processing a native part of these programming languages will ease some of the problems facing developers that use traditional XML processing techniques.

Two of the most notable examples of XML being integrated into conventional programming languages are Cω (C-Omega), an extension of C# produced by Microsoft Research, and ECMAScript for XML (E4X), an extension of ECMAScript produced by ECMA International. This article provides an overview of the XML features of Cω, and an upcoming article will explore the E4X language. The article begins with an introduction to the changes to the C# type system made in Cω, followed by a look at the operators added to the C# language to enable easier processing of relational and XML data.

The Cω Type System

The goal of the Cω type system is to bridge the gap between Relational, Object, and XML data access by creating a type system that is a combination of all three data models. Instead of adding built-in XML or relation types to the C# language, the approach favored by the Cω type system has been to add certain general changes to the C# type system that make it more conducive for programming against both structured relational data and semi-structured XML data.

A number of the changes to C# made in Cω make it more conducive for programming against strongly typed XML, specifically XML constrained using W3C XML Schema. Several concepts from XML and XML Schema have analogous features in Cω. Concepts such as document order, the distinction between elements and attributes, having multiple fields with the same name but different values, and content models that specify a choice of types for a given field exist in Cω. A number of these concepts are handled in traditional Object<->XML mapping technologies, but it is often with awkwardness. Cω aims to make programming against strongly typed XML as natural as programming against arrays or strings in traditional programming languages.

Streams

Streams in Cω are analogous to sequences in XQuery and XPath 2.0 and to the System.Collections.Generic.IEnumerable<T> type, which will exist in version 2.0 of the .NET Framework. With the existence of streams, Cω promotes the concept of an ordered, homogeneous collection of zero or more items into a programming language construct. Streams are a fundamental aspect of the Cω type system upon which a number of the other extensions to C# rest.

A stream is declared by appending the '*' operator to the type name in the declaration of the variable. Typically, streams are generated using iterator functions. An iterator function is a function that returns an ordered sequence of values by using a yield statement to return each value in turn. When a value is yielded, the state of the iterator function is preserved and the caller is allowed to execute. The next time the iterator is invoked, it continues from the previous state and yields the next value. Iterator functions in Cω work like the iterator functions planned for C# 2.0. The most obvious difference between iterator functions in Cω and iterator functions in C# is that Cω iterators return a stream (T*), while C# 2.0 iterators return an enumerator (IEnumerator<T>). However, there is little difference in behavior when interacting with a stream or an enumerator. The more significant difference is that, just like sequences in XQuery, streams in Cω cannot contain other streams. Instead, when multiple streams are combined, the results are flattened into a single stream. For example, appending the stream (4,5) to the stream (1,2,3) results in a stream containing five items (1,2,3,4,5), not a stream with four items (1, 2, 3, (4, 5)). In C# 2.0, it isn't possible to combine multiple enumerators in such a manner, although it is possible to create an enumerator of enumerators.
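
The flattening rule can be illustrated outside Cω. The following is a minimal Python sketch, not Cω code: Python generators stand in for iterator functions, and the names count_up and append_streams are hypothetical helpers introduced only for this illustration.

```python
from itertools import chain


def count_up(limit):
    """Generator standing in for an iterator function.

    Execution pauses at each yield and resumes from the same state
    the next time a value is requested.
    """
    n = 1
    while n <= limit:
        yield n
        n += 1


def append_streams(*streams):
    """Model of combining streams: the result is one flat sequence."""
    return chain.from_iterable(streams)


combined = list(append_streams(count_up(3), (4, 5)))
nested = [1, 2, 3, (4, 5)]  # what flattening avoids: the last item is itself a sequence

print(combined)       # [1, 2, 3, 4, 5]
print(len(combined))  # 5
print(len(nested))    # 4
```

Python does not flatten by default, so the sketch has to ask for it explicitly with chain.from_iterable. In Cω, flattening is part of the type system itself.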

The following iterator function returns a stream containing the three books in the Lord of the Rings trilogy.

  public string* LoTR(){

    yield return "The Fellowship of the Ring";
    yield return "The Two Towers";
    yield return "The Return of the King";

  }

The results of the above function can be processed using a traditional C# foreach loop as shown below.

  public void PrintTrilogyTitles(){

    foreach(string title in LoTR()) 
      Console.WriteLine(title);
  }

A powerful feature of Cω streams is that one can invoke methods on a stream, which are then translated to subsequent method calls on each item in the stream. The following method shows an example of this feature in action:

  public void PrintTrilogyTitleLengths(){

    foreach(int size in LoTR().Length) 
      Console.WriteLine(size);
  }

The expression LoTR().Length above results in the Length property being read on each string returned by the LoTR() method, producing a stream of integers. The ability to access the properties of the contents of a stream in such an aggregate manner allows one to write XPath-style queries over object graphs.

There is also the concept of an apply-to-all-expressions construct that allows one to apply an anonymous method directly to each member of a stream. These anonymous methods may contain the special variable it, which is bound to each successive element of the iterated stream. Below is an alternate implementation of the PrintTrilogyTitles() method which uses the apply-to-all-expressions construct.

  public void PrintTrilogyTitles(){
  
    LoTR().{Console.WriteLine(it)};
  }
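
As a rough model of these two features, the Python sketch below lifts an accessor over every item of a stream and applies an action to each item in turn. It is an illustration under stated assumptions, not Cω: lift and apply_to_all are hypothetical helper names, and len plays the role of the Length property.

```python
def lotr():
    """Generator standing in for the LoTR() iterator function."""
    yield "The Fellowship of the Ring"
    yield "The Two Towers"
    yield "The Return of the King"


def lift(stream, accessor):
    """Model of stream.Member: apply the accessor to each item, lazily."""
    return (accessor(item) for item in stream)


def apply_to_all(stream, action):
    """Model of stream.{ ... it ... }: run the action with each item bound to 'it'."""
    for it in stream:
        action(it)


lengths = list(lift(lotr(), len))
print(lengths)  # [26, 14, 22]

seen = []
apply_to_all(lotr(), seen.append)  # collect each title
apply_to_all(lotr(), print)        # print each title on its own line
```

In Cω the lifting is implicit in the member access syntax. Here it has to be spelled out as a function call.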

Choice Types and Nullable Types

Cω's choice types are very similar to union types in programming languages like C and C++, the '|' operator in DTDs and the xs:choice element in W3C XML Schema. The following is an example of a class that uses a choice type:

public class NewsItem{

    string title;
    string author;
    choice {string pubdate; DateTime date;};
    string body;    
}

In this example, an instance of the NewsItem class can have a pubdate field of type System.String or a date field of type System.DateTime, but not both. It should be noted that the Cω compiler enforces that each field in the choice has a different name; otherwise there would be ambiguity as to what type was intended when the field is accessed. The way fields in a choice type are accessed in Cω differs from union types in C and C++. In C/C++, the programmer has to keep track of what type the value in a particular union type represents because no static type checking is done by the compiler. Cω choice types are statically checked by the compiler, which allows one to declare them like so:

  choice{string;DateTime;} x =  DateTime.Now;
  choice{string;DateTime;} y =  "12/12/2004";

However, there is still the problem of how to statically type an access to a member that exists on only some of the alternatives in a choice type. For example, in the above sample, x.Length is not a meaningful property access because x is initialized with an instance of System.DateTime, whereas y.Length returns 10 because y is initialized with the string "12/12/2004". This is where nullable types come into play.

In both the worlds of W3C XML Schema and relational databases, it is possible for all types to have instances whose value is null. Languages such as Java and existing versions of C# do not allow one to assign null to an integer or floating point value type. However, when working with XML or relational data it is valuable to be able to state that null is a valid value for a type. In such cases, one would not want property accesses on the value to result in NullReferenceExceptions being thrown. Nullable types make this possible: a property access on a nullable value that is null evaluates to null instead of throwing. Below are some examples of using nullable types.

   string? s = null;
   int? size = s.Length;  // returns null instead of throwing  
                          // NullReferenceException
  
   if(size == null)
     Console.WriteLine("The value of size is NULL");

   choice{string;DateTime;} pubdate =  DateTime.Now;
   int? dateLen  = pubdate.Length; //works since it returns null because 
                                   //Length is a property of System.String
   int dateLen2  = (int) pubdate.Length; //throws NullReferenceException 

This feature is similar to but not the same as nullable types in C# 2.0. A nullable type in C# 2.0 is an instance of Nullable<T>, which contains a value and an indication whether the value is null or not. This is basically a wrapper for value types such as ints and floats that can't be null. Cω takes this one step further with the behavior of returning null instead of throwing a NullReferenceException on accessing a field or property of a nullable type whose value is null.
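
The null-propagating behavior can be modeled in Python, where None plays the role of null. This is a sketch of the semantics described above and not of the Cω implementation. The helper names length_of and to_int are hypothetical, and ValueError is an arbitrary stand-in for the NullReferenceException thrown by the cast.

```python
from datetime import datetime
from typing import Optional, Union


def length_of(value: Union[str, datetime, None]) -> Optional[int]:
    """Model of value.Length on a string? or choice{string; DateTime;} value."""
    if isinstance(value, str):
        return len(value)
    # Either the value is null, or the choice currently holds a DateTime,
    # which has no Length member: the access evaluates to null.
    return None


def to_int(value: Optional[int]) -> int:
    """Model of the (int) cast: converting null to a non-nullable int fails."""
    if value is None:
        raise ValueError("cannot convert null to int")
    return value


print(length_of(None))            # None
print(length_of("12/12/2004"))    # 10
print(length_of(datetime.now()))  # None

try:
    to_int(length_of(datetime.now()))
except ValueError as error:
    print("cast failed:", error)
```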

Anonymous Structs

Anonymous structs are analogous to the xs:sequence element in W3C XML Schema. Anonymous structs enable one to model certain XML-centric notions in Cω, such as document order and the fact that an element may have multiple child elements with the same name that have different values. An anonymous struct is similar to a regular struct in C# with a few key distinctions:

  1. There is no explicit type name for an anonymous struct.
  2. Fields within an anonymous struct are ordered, which allows them to be accessed through the array index operator.
  3. The fields within an anonymous struct do not have to have a name. They can only have a type.
  4. An anonymous struct can have multiple fields with the same name. In this case, accessing these fields by name results in a stream being returned.
  5. Anonymous structs with similar structure (that is, same member types in the same order) are compatible and variables of such structs can be assigned back and forth.

The following examples highlight the various characteristics of anonymous structs in Cω.

    struct{ int; string; string; DateTime date; string; } x =
        new {47, "Hello World", "Dare Obasanjo",
             date=DateTime.Now, "This is my first story"};
    Console.WriteLine(x[1]);
    DateTime pubDate = x.date;

    struct{ long; string; string; 
            DateTime date; string;} newsItem = x;     
    Console.WriteLine(newsItem[1] + " by " + newsItem[2] + " on " + newsItem.date);
    
    struct {string field; string field; string field;} field3 =
        new {field="one", field="two", field="three"};

    string* strField = field3.field; 
    //string strField = field3.field doesn't work since field3.field returns a stream    

    struct {int value; string value;} tricky = new {value=10, value="ten"}; 
    choice {int; string;}* values = tricky.value;     
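
To make the access rules concrete, here is a small Python model of an anonymous struct as an ordered list of optionally named fields. It is only a sketch: the AnonStruct class and its named method are hypothetical, and unlike Cω, which returns a single value when a name is unique, this model always returns a list from a lookup by name.

```python
from datetime import datetime


class AnonStruct:
    """Ordered fields stored as (name, value) pairs; the name may be None."""

    def __init__(self, *fields):
        self._fields = list(fields)

    def __getitem__(self, index):
        """Positional access, like x[1] in the Cω sample."""
        return self._fields[index][1]

    def named(self, name):
        """All values stored under the given name, in declaration order."""
        return [value for field_name, value in self._fields if field_name == name]


x = AnonStruct((None, 47), (None, "Hello World"), (None, "Dare Obasanjo"),
               ("date", datetime(2005, 1, 13)), (None, "This is my first story"))
field3 = AnonStruct(("field", "one"), ("field", "two"), ("field", "three"))
tricky = AnonStruct(("value", 10), ("value", "ten"))

print(x[1])                   # Hello World
print(field3.named("field"))  # ['one', 'two', 'three']
print(tricky.named("value"))  # [10, 'ten']
```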

Content Classes

A content class is a class that has its members grouped into distinct units using the struct keyword. To some degree, content classes are analogous to DTDs in XML. Consider the following content class for a Books object:

public class Books{
  
  struct{
    string title;
    string author;
    string publisher;
    string? onloan;
  }* Book;      

} 

This sample is analogous to the following DTD:

<!ELEMENT Books (Book*)>
<!ELEMENT Book  (title, author,publisher, onloan?)>
<!ELEMENT title  (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
<!ELEMENT onloan (#PCDATA)>

As well as the following XML schema:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="Books">
    <xs:complexType>
      <xs:sequence minOccurs="0" maxOccurs="unbounded">
        <xs:element name="Book">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="title" type="xs:string" />
              <xs:element name="author" type="xs:string" />
              <xs:element name="publisher" type="xs:string" />
              <xs:element name="onloan" type="xs:string" minOccurs="0"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

The following code sample shows how the Books object would be used:

using Microsoft.Comega;
using System;

public class Books{
  
  struct{
    string title;
    string author;
    string publisher;
    string? onloan;
  }* Book;      


  public static void Main(){

    Books books = new Books();

    books.Book = new struct {title="Essential.NET", author="Don Box", 
        publisher="Addison-Wesley", onloan =  (string?) null};

    Console.WriteLine((string) books.Book.author + " is the author of " + 
        (string) books.Book.title);
  }
}

Query Operators in Cω

Cω adds two broad classes of query operators to the C# language:

  • XPath-based operators for querying the member variables of an object by name or by type.
  • SQL-based operators for performing sophisticated queries involving projection, grouping, and joining of data from one or more objects.

XPath-based Operators

With the existence of streams and anonymous structs that can have multiple members with the same name, even ordinary direct member access with the '.' operator in Cω can be considered a query operation. For example, the operation books.Book.title from the previous section returns the titles of all the Book objects contained within the Books class. This is equivalent to the XPath query '/Books/Book/title' that returns all the titles of Book elements contained within the Books element.

The wildcard member access operator '.*' is used to retrieve all the fields of a type. This operator is equivalent to the XPath query child::* that returns all the child nodes of the principal node type of the current node. For example, the operation books.Book.* returns a stream containing all the members of all the Book objects contained within the Books class. This is equivalent to the XPath query '/Books/Book/*' that returns all the child elements of the Book elements contained within the Books element.

Cω also supports transitive member access using the '...' operator, which is analogous to the descendant-or-self axis or '//' abbreviated path in XPath. The operation books...title returns a stream containing all title member fields that are contained within the Books class or are member fields of any of its contents in a recursive manner. This is equivalent to the XPath query '/Books//title' that returns all title elements that are descendants of the Books element. The transitive member access operator can also be used to match nodes restricted to a particular type using syntax of the form '...typename::*'. For example, the operation books...string::* returns a stream containing all the member fields of type System.String that are contained within the Books class or are member fields of any of its contents in a recursive manner. This is analogous to the XPath 2.0 query /Books//element(*, xs:string) that matches any descendant of the Books element of type xs:string.

Filter operations can be applied to the results of a transitive member access in the same manner that predicates can be used to filter XPath queries. Just as in XPath, a Cω filter is applied to a query operation by using the '[expression]' operator placed after the query. As is the case with apply-to-all-expressions, the filter may contain the special variable it, which is bound to each successive element of the iterated stream. Below is an example, which queries all the fields of type System.Int32 in an anonymous struct, and then filters the results to those whose value is greater than 8.

 struct {int a; int b; int c;} z = new {a=5, b=10, c= 15}; 
 int* values = z...int::*[it > 8];

 foreach(int i in values){
    Console.WriteLine(i + " is greater than 8");
 }
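
The combination of transitive member access, a type restriction, and a filter can be approximated in Python by a recursive walk over an object's fields. The sketch below is a model of the idea, not of how the Cω compiler evaluates such queries. The Inner and Outer classes and the members_of_type helper are hypothetical, and the nesting is added only to show the recursion.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Iterator


@dataclass
class Inner:
    c: int
    label: str


@dataclass
class Outer:
    a: int
    b: int
    inner: Inner


def members_of_type(obj: Any, wanted: type) -> Iterator[Any]:
    """Yield every field value of the wanted type reachable from obj, depth first."""
    for f in fields(obj):
        value = getattr(obj, f.name)
        if is_dataclass(value):
            yield from members_of_type(value, wanted)  # descend into nested objects
        elif isinstance(value, wanted):
            yield value


z = Outer(a=5, b=10, inner=Inner(c=15, label="ignored"))

# Rough equivalent of z...int::*[it > 8]
values = [it for it in members_of_type(z, int) if it > 8]
print(values)  # [10, 15]
```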

SQL-based Operators

Cω includes a number of constructs from the SQL language as keywords. Operators for performing selection with projection, filtering, ordering, grouping, and joins are all built into Cω. The SQL operators can be applied to both in-memory objects and relational stores that can be accessed using ADO.NET. When applied to a relational database, the Cω query operators are converted to SQL queries over the underlying store. The primary advantage of using the SQL operators from the Cω language is that the query syntax and results can be checked at compile time instead of at runtime, as is the case with embedding SQL expressions in strings using traditional relational APIs.

To connect to a SQL database in Cω, the database must be exposed as a managed assembly (that is, a .NET library file), which is then referenced by the application. A relational database can be exposed to Cω as a managed assembly either by using the sql2comega.exe command line tool or the Add Database Schema... dialog from within Visual Studio. Database objects are used by Cω to represent the relational database hosted by the server. A Database object has a public property for each table or view, and a method for each table-valued function found in the database. To query a relational database, a table, view, or table-valued function must be specified as input to one or more of the SQL-based operators.

The following sample program shows some of the capabilities of using the SQL-based operators to query a relational database in Cω. The database used in this example is the sample Northwind database that comes with Microsoft SQL Server. The name DB used in the example refers to a global instance of a Database object in the Northwind namespace of the Northwind.dll assembly generated using sql2comega.exe.

  using System;
  using System.Data.SqlTypes;
  using Northwind;
  
  class Test {
     static void Main() {
  
       // The foreach statement can infer the type of the
       // iteration variable 'row' by statically evaluating
       // the result type of the select expression
       foreach( row in select ContactName from DB.Customers ) {
         Console.WriteLine("{0}", row.ContactName);
       }
     }
  }

The following sample program and its output show some of the capabilities of using the SQL-based operators to query in-memory objects in Cω.

Code Sample

  using Microsoft.Comega;
  using System;
  
  public class Test{
  
    enum CDStyle {Alt, Classic, HipHop}
  
    static struct{ string Title; string Artist; CDStyle Style; int Year;}* CDs =
      new{new{ Title="Lucky Frog", Artist="Holly Holt", Style=CDStyle.Alt, Year=2001},
          new{ Title="Kamikaze", Artist="Twista", Style=CDStyle.HipHop, Year=2004},
          new{ Title="Stop Light Green", Artist="Robert O'Hara", Style=CDStyle.Alt, Year=1981},
          new{ Title="Noctures", Artist="Chopin", Style=CDStyle.Classic, Year=1892},
          new{ Title="Mimosa!", Artist="Brian Groth", Style=CDStyle.Alt, Year=1980},
          new{ Title="Beg For Mercy", Artist="G-Unit", Style=CDStyle.HipHop, Year=2003}
      };
  
  
    public static void Main(){
  
      struct { string Title; string Artist;}* results; 
      
      Console.WriteLine("QUERY #1: select Title, Artist from CDs where Style == CDStyle.HipHop");
      results = select Title, Artist from CDs where Style == CDStyle.HipHop;
      results.{ Console.WriteLine("Title = {0}, Artist =  {1}", it.Title, it.Artist); };
  
      Console.WriteLine();
  
      struct { string Title; string Artist; int Year;}* results2; 
  
      Console.WriteLine("QUERY #2: select Title, Artist, Year from CDs order by Year");
      results2 = select Title, Artist, Year from CDs order by Year;
      results2.{ Console.WriteLine("Title = {0}, Artist =  {1}, Year = {2}", it.Title, it.Artist, it.Year); };    
    }
  }

Output

  QUERY #1: select Title, Artist from CDs where Style == CDStyle.HipHop
  Title = Kamikaze, Artist =  Twista
  Title = Beg For Mercy, Artist =  G-Unit
  
  QUERY #2: select Title, Artist, Year from CDs order by Year
  Title = Noctures, Artist =  Chopin, Year = 1892
  Title = Mimosa!, Artist =  Brian Groth, Year = 1980
  Title = Stop Light Green, Artist =  Robert O'Hara, Year = 1981
  Title = Lucky Frog, Artist =  Holly Holt, Year = 2001
  Title = Beg For Mercy, Artist =  G-Unit, Year = 2003
  Title = Kamikaze, Artist =  Twista, Year = 2004
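
For readers without access to a Cω compiler, the two in-memory queries above can be approximated in Python with a list comprehension and sorted. This is only an analogy: unlike Cω's select, nothing here checks column names at compile time. The CD tuple type is a hypothetical stand-in for the anonymous struct, and the data is copied from the sample above.

```python
from typing import NamedTuple


class CD(NamedTuple):
    title: str
    artist: str
    style: str
    year: int


cds = [
    CD("Lucky Frog", "Holly Holt", "Alt", 2001),
    CD("Kamikaze", "Twista", "HipHop", 2004),
    CD("Stop Light Green", "Robert O'Hara", "Alt", 1981),
    CD("Noctures", "Chopin", "Classic", 1892),
    CD("Mimosa!", "Brian Groth", "Alt", 1980),
    CD("Beg For Mercy", "G-Unit", "HipHop", 2003),
]

# select Title, Artist from CDs where Style == CDStyle.HipHop
hiphop = [(cd.title, cd.artist) for cd in cds if cd.style == "HipHop"]

# select Title, Artist, Year from CDs order by Year
by_year = [(cd.title, cd.artist, cd.year)
           for cd in sorted(cds, key=lambda cd: cd.year)]

print(hiphop)      # [('Kamikaze', 'Twista'), ('Beg For Mercy', 'G-Unit')]
print(by_year[0])  # ('Noctures', 'Chopin', 1892)
```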

A number of operations that require tedious nested loops can be processed in a straightforward manner using the declarative SQL-like operators in Cω. Provided below is a brief description of the major classes of SQL operators included in Cω.

Projection

The projection of the select expression is the list of expressions following the select keyword. The projection is executed once for each row specified by the from clause. The job of the projection is to shape the resulting rows of data into rows containing only the columns required. The simplest form of the select command consists of the select keyword followed by a projection list of one or more expressions identifying columns from the source, followed by the from keyword, and then an expression identifying the source of the query. Here's an example:

    rows = select ContactName, Phone from DB.Customers;
     
    foreach( row in rows ) {
     Console.WriteLine("{0}", row.ContactName);
    }

In this example, the type designator for the results of the select query is not specified. The Cω compiler automatically infers the correct type. The actual type of an individual result row is a Cω tuple type. One can specify the result type directly using a tuple type (that is, an anonymous struct) and the asterisk (*) to designate a stream of results. For example:

   struct{SqlString ContactName;}* rows =
                  select ContactName from DB.Customers;

   struct{SqlString ContactName; SqlString Phone;}* rows = 
                  select ContactName, Phone from DB.Customers;

Filtering

The results of a select expression can be filtered using one of three keywords—distinct, top, and where. The distinct keyword is used to restrict the resulting rows to only unique values. The top keyword is used to restrict the total number of rows produced by the query. The keyword top is followed by a constant numeric expression that specifies the number of rows to return. One can also create a distinct top selection that restricts the total number of unique rows returned by the query. The where clause is used to specify a Boolean expression for filtering the rows returned by the query source. Rows where the expression evaluates to true are retained, while the rest are discarded. The example below shows all three filter operators in action:

  select distinct top 10 ContactName from DB.Customers where City == "London";

Ordering

The resulting rows from a select expression can be sorted by using the order by clause. The order by clause is always the last clause of the select expression, if it is specified at all. The order by clause consists of the two keywords order by followed immediately by a comma-separated list of expressions that define the values that determine the order. The first expression in the list defines the ordering criteria with the most precedence. It is also possible to specify whether each expression should be considered in ascending or descending order. The default for all expressions is to be considered in ascending order. The example below shows the order by clause at work:

   rows = select ContactName, Phone from DB.Customers
            order by ContactName desc, Phone asc;

Grouping

Values can be aggregated across multiple rows using the group by clause and the built-in aggregate functions. The group by clause enables one to specify how different rows are related, so they can be grouped together. Aggregate functions can then be applied to the columns to compute values over the group. An aggregate function is a function that computes a single value from a series of inputs, such as computing the sum or average of a series of numbers. There are six aggregate functions built into Cω. They are Count, Min, Max, Avg, Sum, and Stddev. To use these functions in a query, one must first import the System.Query namespace. The following example shows how to use the group by clause and the built-in aggregate functions.

    rows = select Country, Count(Country) from DB.Customers 
      group by Country; 

This example uses an aggregate to produce the set of all countries and the count of customers within each country. The Count() aggregate tallies the number of items in the group.

Aggregates can be used in all clauses evaluated after the group by clause. The projection list is evaluated after the group by clause, even though it is specified earlier in the query. A consequence of this is that aggregate functions cannot be applied in the where clause. However, it is still possible to filter grouped rows using the having clause. The having clause acts just like the where clause, except that it is evaluated after the group by clause. The following example shows how the having clause is used.

    rows = select Country, Count(Country) from DB.Customers
           group by Country
           having Count(Country) > 1;
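
The order of evaluation (group first, then filter the groups) can be seen in a small Python sketch. The customer rows below are invented purely for illustration and are not taken from the Northwind database.

```python
from collections import Counter

# Hypothetical (ContactName, Country) rows, invented for this sketch.
customers = [
    ("Alice", "UK"),
    ("Bob", "UK"),
    ("Chloe", "France"),
    ("Dmitri", "Germany"),
    ("Eva", "Germany"),
]

# group by Country, with Count(Country) as the aggregate
counts = Counter(country for _, country in customers)

# having Count(Country) > 1: the filter runs on the groups, not on the rows
repeated = {country: n for country, n in counts.items() if n > 1}

print(dict(counts))  # {'UK': 2, 'France': 1, 'Germany': 2}
print(repeated)      # {'UK': 2, 'Germany': 2}
```

A where clause would correspond to a condition inside the generator expression that feeds Counter, which is why it cannot see the aggregate values.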

Joins

Select queries can be used to combine results from multiple tables. A SQL join is based on the Cartesian product of two or more tables, in which each row from one table is paired up with each row from another table. The full Cartesian product consists of all such pairings, and a where clause is typically used to keep only the pairings that are related. To select multiple sources whose data should be joined to perform a query, the from clause can contain a comma-separated list of source expressions, each with its own iteration alias. The following example pairs up all Customer rows with their corresponding Order rows, and produces a table listing the customer's name and the shipping date for the order.

   rows = select c.ContactName, o.ShippedDate
            from c in DB.Customers, o in DB.Orders
            where c.CustomerID == o.CustomerID;

Cω also supports more sophisticated table joining semantics from the SQL world, including inner join, left join, right join, and outer join, using the corresponding keywords. Explanations of the semantics of the various kinds of joins are available at the W3Schools tutorial on SQL JOIN. The following example shows a select expression that uses the inner join keyword.

   rows = select c.ContactName, o.ShippedDate
          from c in DB.Customers
          inner join o in DB.Orders
          on c.CustomerID == o.CustomerID;
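
The description of a join as a filtered Cartesian product can be shown directly in Python. This is a sketch with made-up rows, where the identifiers C1 and C2 and the dates are hypothetical. A real database would normally use indexes rather than enumerate every pairing.

```python
from itertools import product

# Hypothetical rows: (CustomerID, ContactName) and (OrderID, CustomerID, ShippedDate)
customers = [("C1", "Ann"), ("C2", "Ben")]
orders = [(1, "C1", "2004-01-05"), (2, "C1", "2004-02-11"), (3, "C2", "2004-03-20")]

# The full Cartesian product pairs every customer with every order.
all_pairs = list(product(customers, orders))

# Keeping only pairings whose CustomerID values match yields the inner join.
rows = [
    (name, shipped)
    for (customer_id, name), (_order_id, order_customer, shipped) in all_pairs
    if customer_id == order_customer
]

print(len(all_pairs))  # 6
print(rows)            # [('Ann', '2004-01-05'), ('Ann', '2004-02-11'), ('Ben', '2004-03-20')]
```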

Data Modification

The relational data access capabilities of Cω are not limited to querying data. One can also insert new rows into a table using the insert command, modify existing rows within a table using the update command, or delete rows from a table using the delete command.

The insert command is an expression that evaluates to the number of successful inserts that have been made to the table as a result of executing the command. The following example inserts a new customer in the Customers table.

   int n = insert CustomerID = "ABCDE", ContactName="Frank", CompanyName="Acme"
           into DB.Customers;

The same effect can be obtained by using an anonymous struct as opposed to setting each field directly as shown below:

   row = new{CustomerID = "ABCDE", ContactName="Frank", CompanyName="Acme"};
   int n = insert row into DB.Customers;

The update command is an expression that evaluates to the number of rows that were successfully modified as a result of executing the command. The following example shows how to do a global replace of all misspelled references to the city "London".

  int n = update DB.Customers
          set City = "London"
          where Country == "UK" && City == "Lundon";

It is also possible to modify all rows in the table by omitting the where clause. The delete command is an expression that evaluates to the number of rows that were successfully deleted as a result of executing the command. The following example deletes all the orders for customers in London.

 int n = delete o from c in DB.Customers
         inner join o in DB.Orders
         on c.CustomerID == o.CustomerID
         where c.City == "London";

Most applications that use insert, update, and delete expressions will also use transactions to guarantee ACIDity (atomicity, consistency, isolation, and durability) of one or more changes to the database. The Cω language includes a transact statement that promotes the notion of initiating and exiting transactions into a programming language feature. The transact statement bounds a block of code associated with a database transaction. If the code executes successfully and attempts to exit the block, the transaction will automatically be committed. If the block is exited due to a thrown exception, the transaction will automatically be aborted. The developer can also specify handlers that execute some user-defined code once the transact block is exited, using the commit and rollback handlers. The following example shows transactions in action:

   transact(DB) {
      delete from DB.Customers where CustomerID == "ALFKI";
   }
   commit {
      Console.WriteLine("committed");
   }
   rollback {
      Console.WriteLine("aborted");
   }
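
A comparable commit-on-success, rollback-on-exception pattern exists in other environments. The sketch below uses Python's standard sqlite3 module, whose connection object acts as a context manager with those semantics. The in-memory table, its single row, and the delete_customer helper are assumptions made for the illustration, and the returned strings play the role of the commit and rollback handlers.

```python
import sqlite3


def delete_customer(conn, customer_id, fail=False):
    """Delete one customer inside a transaction; report how the block exited."""
    try:
        with conn:  # commits on normal exit, rolls back if an exception escapes
            conn.execute("DELETE FROM Customers WHERE CustomerID = ?", (customer_id,))
            if fail:
                raise RuntimeError("simulated failure inside the transaction")
    except RuntimeError:
        return "aborted"    # plays the role of the rollback handler
    return "committed"      # plays the role of the commit handler


def customer_count(conn):
    return conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerID TEXT PRIMARY KEY, ContactName TEXT)")
conn.execute("INSERT INTO Customers VALUES ('ALFKI', 'Example Contact')")
conn.commit()

first = delete_customer(conn, "ALFKI", fail=True)
count_after_abort = customer_count(conn)
second = delete_customer(conn, "ALFKI")
count_after_commit = customer_count(conn)

print(first, count_after_abort)     # aborted 1
print(second, count_after_commit)   # committed 0
```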

Using XML Literals in Cω

In Cω, one can construct object instances using XML syntax. This feature is modeled after the ability to construct elements in languages like XQuery and XSLT. Like XQuery and XSLT, the XML literals can contain embedded code for constructing values. However, because Cω is statically typed, the names of members and types must be known at compile time, and cannot be constructed dynamically.

One can control the shape of the XML using a number of constructs. One can specify that fields should be treated as attributes in the XML literal with the use of the attribute modifier on the member declaration. Similarly, choice types and anonymous structs are treated as the children of the XML element that maps to the content class. The following example shows how to initialize a content class from an XML literal.

using Microsoft.Comega;
using System;

public class NewsItem{

  attribute  string title;
  attribute  string author;
  struct { DateTime date; string body; } 

  public static void Main(){

    NewsItem news = <NewsItem title="Hello World" author="Dare Obasanjo">
                      <date>{DateTime.Now}</date>
                      <body>I am the first post of the New Year.</body>
                    </NewsItem>;

    Console.WriteLine(news.title + " by " + news.author + " on " + news.date);
    
  }
}

XML literals are intended to make the process of constructing strongly typed XML much simpler. Consider the following XQuery example taken from the W3C XQuery Use Cases document. It iterates over a bibliography that contains a number of books. For each book in the bibliography, it lists the title and authors, grouped inside a result element.

for $b in $bs/book
return
 <result>
   {$b/title}
   {$b/author}
 </result>

Performing the equivalent task in Cω looks very similar.

foreach (b in bs.book){

  yield return <result>
         {b.title}
         {b.author}
        </result>;
}
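
For comparison, the same transformation written against a conventional tree API looks as follows. This Python sketch uses the standard xml.etree.ElementTree module. The books list is hypothetical sample data that reuses the book mentioned earlier in this article, and the result is untyped, unlike Cω's XML literals.

```python
import xml.etree.ElementTree as ET

# Hypothetical bibliography data for this sketch.
books = [
    {"title": "Essential.NET", "authors": ["Don Box"]},
]


def results(bibliography):
    """Yield one <result> element per book, holding its title and authors."""
    for b in bibliography:
        result = ET.Element("result")
        ET.SubElement(result, "title").text = b["title"]
        for author in b["authors"]:
            ET.SubElement(result, "author").text = author
        yield result


serialized = [ET.tostring(r, encoding="unicode") for r in results(books)]
print(serialized[0])
# <result><title>Essential.NET</title><author>Don Box</author></result>
```

The API version needs an explicit call for every element, which is the verbosity that XML literals are meant to remove.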

Conclusion

The Cω language is an interesting attempt to bridge the impedance mismatches involved in typical enterprise development efforts when crossing the boundaries of the relational, object oriented, and XML worlds. A number of the ideas in the language have taken hold in academia and within Microsoft, as is evidenced in comments by Anders Hejlsberg about the direction of C# 3.0. Until then, developers can explore what it means to program in a more data oriented programming language by downloading the Cω compiler or Visual Studio .NET plug-in.

Acknowledgements

I'd like to thank Don Box, Mike Champion, Denise Draper, and Erik Meijer for their ideas and feedback while writing this article.

Dare Obasanjo is a program manager on the MSN Communication Services Platform team. He brings his love of solving problems with XML to building the server infrastructure utilized by the MSN Messenger, MSN Hotmail, and MSN Spaces teams.

Feel free to post any questions or comments about this article on the Extreme XML message board on GotDotNet.