A First Look at ObjectSpaces in Visual Studio 2005

08/15/2008

Dino Esposito
Wintellect

February 2004

Applies to:
   Microsoft® Visual Studio® 2005
   Microsoft® ADO.NET
   Microsoft® SQL Server™ 2000
   SQL Language

Summary: ObjectSpaces is one of the most interesting new features in Microsoft Visual Studio 2005 (formerly code-named "Whidbey"). An Object/Relational mapping tool fully integrated with Microsoft ADO.NET and Microsoft .NET technologies, ObjectSpaces puts an abstraction layer between your business tier and the raw-data tier where the physical connection to the data source is handled. You think about and design application features using objects, and ObjectSpaces reads and writes data across a variety of data sources using SQL statements. (15 printed pages)

Note This article is based on the Visual Studio Whidbey preview code that was distributed at the Microsoft Professional Developers Conference in October 2003.

Introduction
ObjectSpaces "In Person"
Mapping Tables to Classes
Getting Data Objects
Persisting Changes
Working with Object Graphs
Delay Loading
Benefits of ObjectSpaces

Introduction

When designing the Data Access Layer (DAL) of a .NET-based application, architects normally have two options for setting up a bidirectional communication infrastructure between the DAL itself and the business and presentation layers. The first option is to write classes that move data in and out of the data source using Microsoft ADO.NET objects. The second possibility entails using classes that abstract the schema of the underlying tables, and optionally add logic and application-specific capabilities. In both cases, the application moves structured data through the tiers, while reading and writing data using SQL commands.

In the former scenario, extra code is required only to bind fields and tables to the user-interface elements, a task that the data-binding features in .NET-based applications greatly simplify. The main drawback is that a lot of SQL must be written by hand, and as the application grows that SQL becomes very complex to write and maintain. Overall, the design of the application is data-centric.

Opting for a more abstract and object-oriented model is not a route to take lightly, either. In this case, you have a strong business logic layer, but you need an extra layer to persist the object model to the underlying storage medium. This extra layer completely shields you from dealing with the syntax of the storage medium.

The bottom line is that an approach based on objects is elegant and neat on paper, but may consume significant time in practice if you have to write it from scratch. However, when large applications manage tightly interrelated, hierarchical data, thinking in terms of objects is extremely helpful and is often the only safe way out. For this reason, Object/Relational Mapping (O/RM) tools have existed for a long time and are available from several vendors.

An O/RM system allows you to persist an existing object model to a given storage medium. You should use an O/RM system because you have an object model to persist. You shouldn't create an object model in order to use an O/RM system.

You define how your classes map to physical tables and columns and the O/RM system will take care of querying and updating the underlying tables for you. You work with an object-oriented interface and have the O/RM transpose your high-level language into raw SQL calls. At its core, this is just what ObjectSpaces is and does.

ObjectSpaces "In Person"

ObjectSpaces is an O/RM framework embedded in the .NET Framework in Visual Studio 2005. It provides a suite of classes that read from and write to SQL Server 2000 and SQL Server 2005 tables. Client modules invoke ObjectSpaces classes in lieu of ADO.NET classes and pass them data packed in .NET custom classes. The ObjectSpaces engine translates object queries into SQL queries and translates modifications to your objects into modifications to the underlying tables. Data that is fetched is processed and stored in instances of .NET classes before being returned to the callers. The figure below provides a 10,000-foot view of ObjectSpaces in the context of an application.

Figure 1. The overall architecture of an application based on ObjectSpaces

By design, business rules express the logic of the application and govern the interaction of the various entities that form the problem's domain. Business rules are formulated in terms of objects that closely match specific business entities such as customers, orders, invoices and the like, not general-purpose data containers such as DataSets and DataTables.

ObjectSpaces lets developers concentrate on business entities and design application reasoning in terms of objects instead of data streams. ObjectSpaces requires a preliminary effort to map classes to data tables. After that, the ObjectSpaces engine deals with the data source and largely shields you from the details of the interaction. As a result, you design your application using a flexible, reusable, and extensible object-oriented paradigm; at the same time, you keep your data in a relational data store.

The ObjectSpaces architecture sits between the application logic and the data source and enables the developer to manage data without a deep knowledge of the underlying physical schema. By using ObjectSpaces, you persist objects to a data source and retrieve objects from a data source without having to write any SQL code.

ObjectSpaces is part of Visual Studio 2005, and for the time being supports only two data sources—SQL Server 2000 and SQL Server 2005.

Mapping Tables to Classes

The root class of the ObjectSpaces architecture is ObjectSpace. The ObjectSpace class handles communication with the data source and governs the query and retrieve activity that occurs on the data source. ObjectSpace is responsible for persisting objects to tables and for instantiating objects out of the results of a query. In order to work, the ObjectSpace class needs a mapping schema and an ADO.NET connection object. The mapping schema can be a static resource such as an XML file; alternatively, it can be dynamically built through the interface of the MappingSchema object.

The MappingSchema object determines which fields and tables will be used to persist the object data, and from which fields and tables the state of an object will be retrieved. The following code snippet demonstrates how to instantiate an object space.

Dim conn As SqlConnection = New SqlConnection(ConnString)
Dim os As ObjectSpace = New ObjectSpace("myMappings.xml", conn)

The connection object is a plain SqlConnection object that contains the parameters to connect to the specified instance of SQL Server.

The mapping schema is divided into three parts: the relational schema definition (RSD), the object schema definition (OSD), and the mapping schema proper, which links the previous two schemas together. For your convenience, you can store each schema component in a distinct XML file. In this case, only the mapping schema file must be passed to the ObjectSpace constructor. The mapping schema will then implicitly reference the relational and object schemas.

The following listing shows the mapping schema file that binds together a few tables in the Northwind database (rsd.xml file) and an object schema defined in the osd.xml file.

<m:MappingSchema 
    xmlns:m="https://schemas.microsoft.com/data/2002/09/28/mapping">
   <m:DataSources>
      <m:DataSource Name="NorthwindRSD" Type="SQL Server" 
                    Direction="Source">
         <m:Schema Location="RSD.XML" />
         <m:Variable Name="Customers" Select="Customers" />
      </m:DataSource>
      <m:DataSource Name="DataTypesOSD" Type="Object" Direction="Target">
         <m:Schema Location="OSD.XML" />
      </m:DataSource>
   </m:DataSources>
   <m:Mappings>
      <m:Map SourceVariable="Customers" TargetSelect="Samples.Customer">
         <m:FieldMap SourceField="CustomerID" TargetField="Id" />
         <m:FieldMap SourceField="CompanyName" TargetField="Company" />
         <m:FieldMap SourceField="ContactName" TargetField="Name" />
         <m:FieldMap SourceField="Phone" TargetField="Phone" />
      </m:Map>
   </m:Mappings>
</m:MappingSchema>

The <Mappings> section of the XML document above defines the bindings between a database table and a .NET class. In particular, the binding is set between the Customers table and the Customer class. The SourceField attribute indicates a table column; the TargetField indicates a class property. For example, the CustomerID column is bound to the Id property.

The relational schema of the source data table is shown below.

<rsd:Database Name="Northwind" Owner="sa" 
       xmlns:rsd="https://schemas.microsoft.com/data/2002/09/28/rsd">
  <r:Schema Name="dbo"
       xmlns:r="https://schemas.microsoft.com/data/2002/09/28/rsd">
    <rsd:Tables>
      <rsd:Table Name="Customers">
        <rsd:Columns>
          <rsd:Column Name="CustomerID" SqlType="nchar" Precision="5" />
          <rsd:Column Name="CompanyName" SqlType="nvarchar" 
                      Precision="40" />
          <rsd:Column AllowDbNull="true" Name="ContactName" 
                      SqlType="nvarchar" Precision="30" />
          <rsd:Column AllowDbNull="true" Name="Phone" SqlType="nvarchar" 
                      Precision="24" />
        </rsd:Columns>
        <rsd:Constraints>
          <rsd:PrimaryKey Name="PK_Customers">
            <rsd:ColumnRef Name="CustomerID" />
          </rsd:PrimaryKey>
        </rsd:Constraints>
      </rsd:Table>
    </rsd:Tables>
  </r:Schema>
</rsd:Database>

As you can see, the schema selects only a few columns out of the Customers table. The column list includes the primary key. The schema is merely an XML description of the table view of interest.

The object schema definition is an XML file that looks like the listing below.

<osd:ExtendedObjectSchema Name="DataTypesOSD" 
     xmlns:osd="https://schemas.microsoft.com/data/.../persistenceschema">
  <osd:Classes>
    <osd:Class Name="Samples.Customer">
      <osd:Member Name="Id" Key="true" />
      <osd:Member Name="Company" />
      <osd:Member Name="Name" />
      <osd:Member Name="Phone" />
    </osd:Class>
  </osd:Classes>
</osd:ExtendedObjectSchema>

The file describes a class like the one shown below.

Namespace Samples
Public Class Customer 
    Public Id As String
    Public Name As String
    Public Company As String 
    Public Phone As String
End Class
End Namespace

In summary, the mapping information above instructs the ObjectSpaces system to serialize and deserialize the contents of a Samples.Customer object to and from a given set of columns in the Customers table.
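
As a rough illustration of what that mapping amounts to, the sketch below does by hand what the engine is described as doing with the FieldMap entries: it derives a SELECT statement from the mapped columns and copies a row into a Samples.Customer instance. It does not use ObjectSpaces at all. MappingSketch, BuildSelect, and ToCustomer are hypothetical names introduced only for this example, and the real engine may work quite differently.

```vb
Imports System
Imports System.Collections.Generic

Namespace Samples
    ' Same shape as the Customer class shown above.
    Public Class Customer
        Public Id As String
        Public Name As String
        Public Company As String
        Public Phone As String
    End Class
End Namespace

' Hypothetical helper module; not part of ObjectSpaces.
Public Module MappingSketch
    ' Column/member pairs copied from the <m:FieldMap> entries.
    Private ReadOnly FieldMap As String()() = {
        New String() {"CustomerID", "Id"},
        New String() {"CompanyName", "Company"},
        New String() {"ContactName", "Name"},
        New String() {"Phone", "Phone"}
    }

    ' Builds the SELECT statement implied by the mapped columns.
    Public Function BuildSelect(table As String) As String
        Dim columns As New List(Of String)()
        For Each pair As String() In FieldMap
            columns.Add(pair(0))
        Next
        Return "SELECT " & String.Join(", ", columns) & " FROM " & table
    End Function

    ' Copies one row (column name -> value) into a Customer by setting
    ' each mapped public field through reflection.
    Public Function ToCustomer(row As IDictionary(Of String, Object)) As Samples.Customer
        Dim result As New Samples.Customer()
        For Each pair As String() In FieldMap
            Dim value As Object = Nothing
            If row.TryGetValue(pair(0), value) AndAlso value IsNot Nothing Then
                GetType(Samples.Customer).GetField(pair(1)).SetValue(result, value)
            End If
        Next
        Return result
    End Function
End Module
```

Calling MappingSketch.ToCustomer with a row whose CustomerID column holds ALFKI yields a Customer whose Id field is ALFKI. Columns missing from the row leave the corresponding field unset.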

Let's see how to read and write data using the Customer class. To start out, have a look at the methods defined on the ObjectSpace class. (See Table 1.)

Table 1. Methods of the ObjectSpace class

Method	Description
BeginTransaction	Begins a transaction at the data source
Commit	Commits the current transaction at the data source
GetObject	Returns a single object from the data source based on the type and the query string specified
GetObjectReader	Returns a stream of objects from the data source based on the type and the query string specified
GetObjectSet	Returns a collection of objects from the data source based on the type and the query string specified
MarkForDeletion	Marks the specified object for deletion. The object (and related database rows) will be deleted at the data source when PersistChanges is next called
PersistChanges	Propagates inserts, updates, and deletes to the data source
Resync	Refreshes the state of the object with current values read from the data source
Rollback	Rolls back the current transaction at the data source
StartTracking	Identifies an object as persistent. When marked as persistent, an object is assigned a state and taken into account for I/O operations against the data source

The methods can be divided into three logical groups: transactional, reading, and writing. Methods such as BeginTransaction, Commit, and Rollback belong to the first category. Their role and implementation are straightforward and rather self-explanatory.

Reading methods are Resync, GetObject, GetObjectReader, and GetObjectSet. The GetXXX methods return data packed into instances of the mapped class(es). They differ in the number of data objects retrieved and in the state of the underlying connection. The GetObject and GetObjectSet methods retrieve their data by calling GetObjectReader internally.

GetObject extracts only the first object out of the result set and throws an exception if multiple data objects are returned. This method broadly maps to the IDbCommand's ExecuteScalar method.

GetObjectReader returns a stream of objects (an instance of the ObjectReader class), much like a data reader does with rows in plain ADO.NET. When the method returns, the connection is still busy; it is closed as soon as you close the ObjectReader. I'll provide an example of this in a moment. This method is similar to the IDbCommand's ExecuteReader method.

Finally, GetObjectSet returns a disconnected collection of data objects retrieved by the query. The return type is ObjectSet. The method is the ObjectSpaces counterpart to the data adapter's Fill method.

How does the mapping between application objects and data source types take place? Let's have a look at the signature of the various methods.

Function GetObject(t As Type, query As String) As Object
Function GetObjectReader(t As Type, query As String) As ObjectReader
Function GetObjectSet(t As Type, query As String) As ObjectSet

The first parameter is a Type object which identifies the data object to work with. Here's a code snippet.

Dim dataSet As ObjectSet
dataSet = os.GetObjectSet(GetType(Customer), "Id = 'ALFKI'")

The system creates a query object based on the specified query string and runs it against the data source. Internally, GetObjectSet gets a stream of results and processes individual rows, loading data into fresh instances of the specified type. Finally, a collection of those objects is returned and the connection is closed.

It is important to point out that the objects returned by all GetXXX methods are automatically tracked for changes by the ObjectSpace engine. It is not necessary to attach them to the ObjectSpace engine using the StartTracking method. More on this later.

The Resync method takes a single object, or a collection of objects, and refreshes their data by running a new query to collect up-to-date information.

os.Resync(dataSet, Depth.SingleObject)

The method takes a second parameter, a Depth enum value, which indicates whether only the specified object is refreshed or whether related object data (that is, child objects) is refreshed as well.

Writing-related methods are StartTracking, MarkForDeletion, and PersistChanges. StartTracking identifies the object as a persistent object. When the method is invoked, the object is given a state value, is added to the context of the ObjectSpace system, and is tracked for changes. The tracked state of an object is broadly equivalent to the RowState property of a DataRow object in an ADO.NET table. The state indicates whether the object should be added to the data source, deleted from it, or just updated during the next call to PersistChanges. An object marked as unchanged will not be processed. StartTracking must be used to insert new objects in the data store, but this requirement should be gone by the time Visual Studio 2005 ships.
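
To make the role of the tracked state more concrete, here is a minimal sketch of a change tracker that decides, per object, which kind of statement the next persist call would need. It is an assumption-laden stand-in, not the ObjectSpaces implementation: TrackerSketch, TrackedState, and MarkModified are hypothetical names, and the article itself names only the Inserted and Unchanged initial states.

```vb
Imports System
Imports System.Collections.Generic

' Hypothetical state values. Only Inserted and Unchanged are named in
' the article; Modified and Deleted are assumed for illustration.
Public Enum TrackedState
    Unchanged
    Inserted
    Modified
    Deleted
End Enum

' Hypothetical tracker; not the ObjectSpaces engine.
Public Class TrackerSketch
    Private ReadOnly states As New Dictionary(Of Object, TrackedState)()
    Private ReadOnly tracked As New List(Of Object)()

    Public Sub StartTracking(item As Object, initial As TrackedState)
        If Not states.ContainsKey(item) Then tracked.Add(item)
        states(item) = initial
    End Sub

    Public Sub MarkModified(item As Object)
        ' A new object stays Inserted; an existing one becomes Modified.
        If states(item) = TrackedState.Unchanged Then states(item) = TrackedState.Modified
    End Sub

    Public Sub MarkForDeletion(item As Object)
        states(item) = TrackedState.Deleted
    End Sub

    ' Returns the statement kind needed for each tracked object, in
    ' tracking order, then resets the states as a real persist would.
    Public Function PersistChanges() As List(Of String)
        Dim commands As New List(Of String)()
        For Each item As Object In tracked.ToArray()
            Select Case states(item)
                Case TrackedState.Inserted
                    commands.Add("INSERT")
                    states(item) = TrackedState.Unchanged
                Case TrackedState.Modified
                    commands.Add("UPDATE")
                    states(item) = TrackedState.Unchanged
                Case TrackedState.Deleted
                    commands.Add("DELETE")
                    states.Remove(item)
                    tracked.Remove(item)
                Case Else
                    ' Unchanged objects are skipped.
            End Select
        Next
        Return commands
    End Function
End Class
```

Tracking one Inserted object, modifying a second, and marking a third for deletion makes PersistChanges report INSERT, UPDATE, and DELETE in that order. A second call reports nothing, because every surviving object is Unchanged again.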

MarkForDeletion modifies the state of the object so that it will be deleted the next time PersistChanges is called. MarkForDeletion is equivalent to the Delete method of the ADO.NET DataRow object.

PersistChanges starts a batch update process and persists all ongoing changes held in memory to the data source. You may pass a single object to the method, or a collection of objects. The PersistChanges method implicitly starts a local transaction before performing the updates at the data source. If the update fails, the transaction is rolled back and an exception is thrown. Otherwise, the transaction is committed and the updates are written to the data source.

Let's take a closer look at these methods and the overall ObjectSpaces architecture by working out an example.

Getting Data Objects

To take advantage of ObjectSpaces in your Visual Studio 2005 applications, you need to reference a couple of assemblies—System.Data.ObjectSpaces and System.Data.SqlXml. The latter is needed because it is internally referenced by the primary ObjectSpaces assembly. For your convenience, you can also import the System.Data.ObjectSpaces namespace in the source.

Imports System.Data.ObjectSpaces

Make sure you have the mapping schema files in the same directory as the executables, and then use the following code.

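' Requires Imports System.Data.SqlClient and System.Data.ObjectSpaces.
Dim ConnString As String = "..."
Dim conn As New SqlConnection(ConnString)
Dim os As New ObjectSpace("map.xml", conn)

' The filter refers to the Id member of Customer, not to a column name.
Dim query As New ObjectQuery(GetType(Customer), "Id = 'ALFKI'")
Dim reader As ObjectReader = os.GetObjectReader(query)

' Each item in the reader is a ready-made Customer instance.
For Each c As Customer In reader
     CustID.Text = c.Id
     CustName.Text = c.Name
     CustCompany.Text = c.Company
     CustPhone.Text = c.Phone
Next
' Closing the reader releases the underlying connection.
reader.Close()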

The code is part of a Windows Forms application that displays a form with a few text boxes—customer ID, name, company, and phone number. The query to run is represented by an instance of the ObjectQuery class.

The class constructor needs a Type object and a query string. The type object specifies the business object to use to exchange information with the data source. An essential point to note here is that the query string must be written according to the persistent fields of the object (fields or properties), not the columns in the database. The query string above returns Customer objects in which the Id property equals ALFKI.

Queries for object data are written in a new query language named OPath. As the name suggests, OPath is similar to XPath and enables you to specify queries for objects in object-oriented syntax.
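
The difference between member names and column names can be illustrated with a small sketch that rewrites a simple member-based filter into the column-based condition a SQL WHERE clause would need. This is not an OPath parser and makes no claim about how ObjectSpaces translates queries. FilterSketch and ToWhereClause are hypothetical names, and the member-to-column pairs are simply the FieldMap entries from the mapping file read in reverse.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

' Hypothetical helper module; not part of ObjectSpaces.
Public Module FilterSketch
    ' Member-to-column pairs: the <m:FieldMap> entries read in reverse.
    Private ReadOnly Columns As New Dictionary(Of String, String) From {
        {"Id", "CustomerID"},
        {"Company", "CompanyName"},
        {"Name", "ContactName"},
        {"Phone", "Phone"}
    }

    ' Rewrites member names as column names. Splitting on single quotes
    ' leaves the text outside string literals at the even indexes, so
    ' only those segments are rewritten.
    Public Function ToWhereClause(filter As String) As String
        Dim parts As String() = filter.Split("'"c)
        For i As Integer = 0 To parts.Length - 1 Step 2
            parts(i) = Regex.Replace(parts(i), "[A-Za-z_]\w*", AddressOf MapName)
        Next
        Return String.Join("'", parts)
    End Function

    ' Returns the mapped column for a known member, else the word itself.
    Private Function MapName(m As Match) As String
        Dim column As String = Nothing
        Return If(Columns.TryGetValue(m.Value, column), column, m.Value)
    End Function
End Module
```

With these pairs, the filter Id = 'ALFKI' becomes CustomerID = 'ALFKI'. Words that are not mapped members, and anything inside a quoted literal, pass through unchanged.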

The query is passed to the GetObjectReader method and a stream-based ObjectReader object is returned. The contents of an object reader can be enumerated using a For..Each construct. The reader component returns ready-made instances of the specified type, Customer.

For Each c As Customer In reader
    ' Process information here
Next

Once you have each database row mapped to an instance of a user-defined class, binding properties to user-interface elements is child's play. The figure below demonstrates a simple form populated using the above code.

Figure 2. A Windows form filled using ObjectSpaces classes

To read the contents of an object reader, you can also resort to the Read method, which would position your code on the subsequent object. (This approach is not really different from using the Read method on an ADO.NET data reader object.)

The object reader must be closed when you're done with it. Objects returned by the GetObjectReader method are automatically tracked by the ObjectSpace system, and there's no need to add them to the ObjectSpace context.

Persisting Changes

The PersistChanges method is responsible for writing back to the data source any object that is currently being tracked by the ObjectSpace object. Objects retrieved through GetXXX methods are automatically tracked, but what about new objects? Here's how to add a new object to the tracking system.

Dim c As New Customer
c.Name = "Belinda Newman"
c.Company = "Litware, Inc."
c.Phone = "(425) 707-9790"
c.Fax = "(425) 707-9799" 
os.StartTracking(c, InitialState.Inserted)

You create and fill a new object, and then add it to the ObjectSpaces context through a call to the StartTracking method. The call requires an initial state that identifies the object either as new to the data source or as one that already exists there. The InitialState enum has two values: Inserted and Unchanged. When PersistChanges is called on the object, a new row is added to the data source only if the initial state equals Inserted.

os.PersistChanges(c)

It is interesting to notice that the Customer class in the above example has a Fax property. However, this property is not mapped to the Fax column in the underlying Customers Northwind table. (See the map.xml listing above.) As a result, when the above change is persisted, a new record is added to the Customers table, but the unmapped Fax value is not written to it.

Figure 3. Changes to the Customers table made by the ObjectSpaces system

Existing applications that rely on batch updates and DataSet objects will find a direct counterpart in the pairing of the ObjectSet class with the PersistChanges method.

Dim objSet As ObjectSet = os.GetObjectSet(GetType(Customer), query)
' Add a new customer and update
objSet.Add(c) 
os.PersistChanges(objSet)

ObjectSpaces guarantees that all the objects added to an ObjectSet object are automatically tracked. You work with an instance of the ObjectSet class in much the same way you work with a disconnected DataSet. When you're done, instead of updating through a data adapter, you call the PersistChanges method.

Working with Object Graphs

ObjectSpaces can also handle hierarchical data and graphs of objects. A typical scenario is when a one-to-many relationship exists between the Customer and the Order object. Consider a Customer class enhanced as follows.

Public Class Customer 
    Public Id As String
    Public Name As String
    Public Company As String
    Public Phone As String
    Public Fax As String
    Public Orders As ArrayList = New ArrayList()
End Class

The new Orders property will contain an array of orders: all the orders issued by the given customer. The code that processes the data retrieved looks like the following:

Dim reader As ObjectReader
Dim oq As New ObjectQuery(GetType(Customer), "Id = 'ALFKI'", "Orders")
reader = os.GetObjectReader(oq) 
For Each c As Customer In reader
    OutputCustomer(c)
    For Each o As Order In c.Orders
        OutputOrder(o)
    Next
Next
reader.Close()

Two elements make this code differ significantly from the similar-looking code we have considered so far. The most important aspect is hidden from view here: a new mapping schema is used that is aware of a join relationship between the Customers and Orders data. The constructor of the ObjectQuery class is different, too.

In this case, the query object is built from three parameters: the type of the object to return, the OPath query string, and a third string known as the span.

A span is a comma-separated string that identifies related objects that will be returned by the query. Specifying a span value of "Orders" ensures that the Order objects related to any Customer object are returned by the query as well. All orders are automatically packed into the Orders property.
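
The effect of a span on the returned object graph can be pictured with a small in-memory sketch: child objects are distributed to their parents by matching the key on which the two are joined. The code below is only an illustration of that outcome, not of how ObjectSpaces loads related data. SpanSketch, AttachOrders, and the CustomerId member on Order are hypothetical names assumed for the example.

```vb
Imports System
Imports System.Collections
Imports System.Collections.Generic

Public Class Order
    Public Id As Integer
    ' Hypothetical member holding the parent customer's key.
    Public CustomerId As String
End Class

Public Class Customer
    Public Id As String
    Public Orders As ArrayList = New ArrayList()
End Class

' Hypothetical helper module; not part of ObjectSpaces.
Public Module SpanSketch
    ' Adds each order to the Orders list of the customer whose Id
    ' matches the order's CustomerId. Orders without a match are skipped.
    Public Sub AttachOrders(customers As IEnumerable(Of Customer),
                            orders As IEnumerable(Of Order))
        Dim byId As New Dictionary(Of String, Customer)()
        For Each c As Customer In customers
            byId(c.Id) = c
        Next
        For Each o As Order In orders
            Dim parent As Customer = Nothing
            If o.CustomerId IsNot Nothing AndAlso byId.TryGetValue(o.CustomerId, parent) Then
                parent.Orders.Add(o)
            End If
        Next
    End Sub
End Module
```

After the call, iterating c.Orders for a given customer visits only that customer's orders, which is the shape the nested For Each loop in the listing above relies on.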

As for the mapping schema, the <DataSource> section undergoes some changes and now includes a new <Relationship> node.

<m:DataSource Name="NorthwindRSD" Type="SQL Server" Direction="Source">
     <m:Schema Location="hRSD.XML" />
     <m:Variable Name="Customers" Select="Customers" />
     <m:Variable Name="Orders" Select="Orders" />
     <m:Relationship Name="Customers_Orders" 
        FromVariable="Customers" ToVariable="Orders">
        <m:FieldJoin From="CustomerID" To="CustomerID"/>
     </m:Relationship>
</m:DataSource>

The <Relationship> node defines the inner join between Customers and Orders on the common CustomerID field. In the relational schema file (*.rsd), the description of the Orders table must include a foreign key constraint.

<rsd:ForeignKey ForeignTable="Customers" Name="FK_Orders_Customers">
    <rsd:ColumnMatch ForeignName="CustomerID" Name="CustomerID" />
</rsd:ForeignKey>

The object schema file now features a new node named <ObjectRelationships>. The node describes the type of relationship set between a parent class (Customer) and a child class (Order), both defined in the OSD resource.

<osd:ObjectRelationships>
   <osd:ObjectRelationship Name="Customers_Orders" Type="OneToMany" 
        ParentClass="Customer" ParentMember="Orders"
        ChildClass="Order" ChildMember="Customer" />
</osd:ObjectRelationships>

Figure 4 demonstrates a sample console application that makes use of spans to retrieve hierarchical data.

Figure 4. Hierarchical data retrieved through ObjectSpaces

Delay Loading

To improve performance and memory use in parent/child relationships, ObjectSpaces provides a facility known as "delay loading". It works for both one-to-many and one-to-one relationships. The idea is that child objects are loaded into memory on demand, only when they are actually requested.

Two new objects are involved with this functionality. The ObjectList provides delay loading for a one-to-many relationship; the ObjectHolder provides delay loading for a one-to-one relationship. Both objects can be considered as container objects for the data to load on demand. ObjectList exposes the actual object through the InnerList property. ObjectHolder exposes the delay-loaded object through the InnerObject property. Both properties are declared as type Object.

To gain the benefits of strongly typed programming, you typically wrap the delay-loaded object with an ad hoc property of a particular type. In the following code snippet, the property Orders is implemented through an internal member of type ObjectList. The get accessor of the property casts InnerList to its actual type, OrderList.

Public Class Customer
   Private m_Orders As ObjectList = New ObjectList()
   Public Property Orders As OrderList
      Get
         ' Cast the loosely typed InnerList to its actual type.
         Return CType(m_Orders.InnerList, OrderList)
      End Get
      Set (ByVal Value As OrderList)
         m_Orders.InnerList = Value
      End Set
   End Property
   ' ...
End Class

Delay-loaded properties must be properly mapped to ObjectSpaces. You use the LazyLoad attribute in the OSD mapping file.

<osd:Member Name="m_Orders" Alias="Orders" LazyLoad="true" />

Additionally, the OSD needs to reflect that the relationship is based on the private member, not the public member.

<osd:ObjectRelationship Name="Customers_Orders" Type="OneToMany" 
     ParentClass="Customer" ParentMember="m_Orders"
     ChildClass="Order" ChildMember="Customer" />

Data associated with the delay-loaded property is retrieved the first time the property is programmatically accessed. This happens transparently to your code. Note that ObjectSpaces does not implicitly refresh the contents of a delay-loaded property once it has been retrieved. To force a refresh of all related objects for a specified delay-loaded property, you use the Fetch method on the ObjectEngine object.
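
The behavior just described (load on first access, no implicit refresh, explicit re-fetch) can be sketched with a small generic holder. This is a hypothetical stand-in written for illustration, not the ObjectHolder or ObjectList class. DelayLoaded, Value, and Refresh are assumed names, and the loader delegate stands in for whatever query the engine would run.

```vb
Imports System

' Hypothetical stand-in for a delay-loading container.
Public Class DelayLoaded(Of T)
    Private ReadOnly loader As Func(Of T)
    Private loaded As Boolean = False
    Private current As T

    ' The loader runs only when data is first needed.
    Public Sub New(loader As Func(Of T))
        Me.loader = loader
    End Sub

    ' Loads on first access, then keeps returning the cached value.
    Public ReadOnly Property Value As T
        Get
            If Not loaded Then Refresh()
            Return current
        End Get
    End Property

    ' Re-runs the loader on request, playing the role the article
    ' assigns to an explicit fetch.
    Public Sub Refresh()
        current = loader.Invoke()
        loaded = True
    End Sub
End Class
```

Constructing the holder does not invoke the loader. Reading Value twice invokes it once, and only a call to Refresh causes the data to be loaded again.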

Benefits of ObjectSpaces

Although far from being complete, ObjectSpaces qualifies as one of the most interesting new features in Visual Studio 2005. It is an Object/Relational mapping tool fully integrated with ADO.NET and .NET technologies. It puts an abstraction layer between your business tier and the raw-data tier where the physical connection to the data source is handled. You think about and design application features using objects, and ObjectSpaces does the dirty work of reading and writing data across a variety of data sources (currently it works only with SQL Server 2000 and 2005) using SQL statements.

ObjectSpaces plugs into an application as easily as any other .NET assembly. The programming model of ObjectSpaces is aligned to ADO.NET as much as possible.

ObjectSpaces, like any similar framework, adds some performance overhead to an application and requires that developers get acquainted with its programming model. Today, a few points appear critical for the success of ObjectSpaces. In no particular order, they are: the efficiency of the tracking system, the lack of ad hoc mapping tools, the quality of the generated SQL code, and perhaps overall performance. Mapping tools to automate the creation of schemas will be available by the time Visual Studio 2005 ships, if not in the Beta 1 timeframe. A sample mapper utility was demonstrated at PDC. Changes to the tracking system are in the works to improve its overall performance and effectiveness.

Bear in mind that Visual Studio 2005 is a product that today is far from being released and has not even entered a public beta testing phase. I want to emphasize that this is only the beginning. Let's wait and see; it sounds terrifically promising.

About the Author

Dino Esposito is a trainer and consultant based in Rome, Italy. A member of the Wintellect team, Dino specializes in ASP.NET and ADO.NET and spends most of his time teaching and consulting across Europe and the United States. In particular, Dino manages the ADO.NET and .NET Framework courseware for Wintellect and writes the "Cutting Edge" column for MSDN Magazine.