Programming for the New Platform
For the past year or so, I've been focusing my attention on the Microsoft® .NET common language runtime platform. In my opinion, most new development will target this platform because it makes application development so much easier and faster. I also expect existing application development to move toward .NET at a rapid pace.
True Object-oriented Design

For programmers using the Win32® SDK, access to most of the operating system features is through a set of standalone functions exported from DLLs. These standalone functions are very easy to call from non-object-oriented languages like C. However, it is quite daunting for new developers to face literally thousands of independent functions that, on the surface, seem unrelated. Making things more difficult is the fact that many functions start with the word Get (for example, GetCurrentProcess and GetStockObject). In addition, the Win32 API has evolved over the years and Microsoft has added new functions having similar semantics but offering slightly different features over the earlier functions. You can usually identify the newer functions because their names are similar to the original function's name (such as CreateWindow/CreateWindowEx, CreateTypeLib/CreateTypeLib2, and one of my personal favorites: CreatePen/CreatePenIndirect/ExtCreatePen).
All of these issues have given programmers the impression that developing for Windows® is difficult. With .NET, Microsoft is finally addressing developers' cries for help by creating an entirely object-oriented platform. Platform services are now divided into individual namespaces (such as System.Collections, System.Data, System.IO, System.Security, System.Web, and so on), and each namespace contains a set of related class types that allow access to the platform's services.
Since class methods may be overloaded, methods that differ just slightly in their behavior are given identical names and differ only by their prototypes. For example, a class might offer three different versions of a CreatePen method. All methods do the same thing: create a pen. However, each method takes a different set of parameters and has slightly different behavior. In the future, should Microsoft need to create a fourth CreatePen method, the new method would fit seamlessly into the class as a first-class citizen.
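To make the overloading point concrete, here is a minimal sketch. The Pen and PenFactory types and their members are hypothetical names invented for this illustration; they are not part of the .NET Framework class library.

```csharp
using System;

// Hypothetical types for illustration only; not part of the .NET Framework.
public class Pen
{
    public readonly string Color;
    public readonly int Width;
    public readonly bool Dashed;

    public Pen(string color, int width, bool dashed)
    {
        Color = color;
        Width = width;
        Dashed = dashed;
    }
}

public static class PenFactory
{
    // Same name, three prototypes: the compiler selects one by argument list.
    public static Pen CreatePen(string color)
    {
        return CreatePen(color, 1);
    }

    public static Pen CreatePen(string color, int width)
    {
        return CreatePen(color, width, false);
    }

    public static Pen CreatePen(string color, int width, bool dashed)
    {
        return new Pen(color, width, dashed);
    }
}

public static class OverloadDemo
{
    public static void Main()
    {
        Pen thin = PenFactory.CreatePen("black");
        Pen thick = PenFactory.CreatePen("black", 3);
        Pen dashed = PenFactory.CreatePen("red", 2, true);
        Console.WriteLine("{0} {1} {2}", thin.Width, thick.Width, dashed.Dashed);
    }
}
```

A fourth overload could be added to PenFactory later without renaming or disturbing the three existing ones.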
Since all platform services are offered via this object-oriented paradigm, software developers should have some understanding of object-oriented programming. The object-oriented paradigm enables other features as well. For example, it is now easy to create specialized versions of the base class library's types using inheritance and polymorphism. Again, I strongly suggest you become familiar with these concepts, as they are now critical to working with the Microsoft .NET Framework.
System.Object

In .NET, every object is ultimately derived from the System.Object type. This means that the following two type definitions (shown using C#) are identical:
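A minimal sketch of such a pair of definitions, assuming an otherwise empty class. Two types in one program cannot share a name, so the second form is called JeffExplicit here; that name is an assumption made only so that the sketch compiles as a single program.

```csharp
using System;

// Implicitly derived from System.Object.
class Jeff
{
}

// Explicitly derived from System.Object; equivalent to the form above.
// (Named JeffExplicit only so both definitions can coexist in one program.)
class JeffExplicit : System.Object
{
}

static class BaseTypeDemo
{
    static void Main()
    {
        // Both types report System.Object as their base type.
        Console.WriteLine(typeof(Jeff).BaseType == typeof(Object));
        Console.WriteLine(typeof(JeffExplicit).BaseType == typeof(Object));
    }
}
```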
Since all object types are ultimately derived from System.Object, you are guaranteed that every object of every type has a minimum set of capabilities. The public methods available in the System.Object class are shown in Figure 1.
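As a rough illustration of that minimum set of capabilities, the sketch below calls four of the public instance methods that System.Object provides (GetType, ToString, Equals, and GetHashCode) on an empty class that overrides none of them. The Jeff class here is an assumed, member-less stand-in.

```csharp
using System;

// Assumed, empty stand-in type; it inherits everything from System.Object.
class Jeff
{
}

static class ObjectMethodsDemo
{
    static void Main()
    {
        Jeff a = new Jeff();
        Jeff b = new Jeff();

        // GetType returns the object's exact runtime type.
        Console.WriteLine(a.GetType().Name);

        // ToString, unless overridden, returns the full name of the type.
        Console.WriteLine(a.ToString());

        // Equals, unless overridden, compares reference identity for class types.
        Console.WriteLine(a.Equals(b));
        Console.WriteLine(a.Equals(a));

        // GetHashCode returns a value suitable for use as a hash table key.
        Console.WriteLine(a.GetHashCode() == a.GetHashCode());
    }
}
```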
The common language runtime requires that all object instances be created using the new operator (which compiles to the newobj IL instruction). The following code shows how to create an instance of the Jeff type (declared previously):
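A minimal sketch of such a call. The shape of Jeff used here (one constructor taking a string, plus a Param field to hold it) is an assumption based on the "ConstructorParam1" argument mentioned in the text.

```csharp
using System;

// Assumed shape of the Jeff type: a single constructor taking a string.
class Jeff
{
    public readonly string Param;

    public Jeff(string param)
    {
        Param = param;
    }
}

static class NewDemo
{
    static void Main()
    {
        // new allocates the object on the managed heap, runs the
        // constructor, and returns a reference to the new object.
        Jeff j = new Jeff("ConstructorParam1");
        Console.WriteLine(j.Param);
    }
}
```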
The new operator first creates the object by allocating the number of bytes required for the specified type from the managed heap. It then initializes the object's overhead members: every object has some additional bytes that the common language runtime uses to manage the object, such as the object's virtual table pointer and a reference to a sync block.
Finally, the class type's constructor is called, passing it any parameters (the string "ConstructorParam1" in the earlier example) specified in the call to new. Note that most languages compile constructors so that they call the base type's constructor; however, this is not required by the common language runtime.
After new has performed all of the operations I just mentioned, it returns a reference to the newly created object. In the example code, this reference is saved in the variable j, which is of type Jeff.
By the way, the new operator has no counterpart. That is, there is no way to explicitly free or destroy an object. The common language runtime offers a garbage-collected environment that automatically detects when objects are no longer being used or accessed and frees the objects automatically, which is a topic I plan to cover in an upcoming issue of MSDN® Magazine.
Casting Between Data Types

When programming, it is quite common to cast an object from one data type to another. In this section, I'll look at the rules that govern how objects are cast between data types. To start, look at the following line:
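A line of this kind, sketched inside a complete program. The Jeff constructor signature is assumed from the earlier "ConstructorParam1" example.

```csharp
using System;

// Assumed shape of the Jeff type.
class Jeff
{
    public Jeff(string param) { }
}

static class ImplicitCastDemo
{
    static void Main()
    {
        // Implicit cast: a reference to a Jeff is stored in an Object variable.
        Object o = new Jeff("ConstructorParam1");

        // The variable's type is Object, but the object itself is still a Jeff.
        Console.WriteLine(o.GetType().Name);
    }
}
```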
The previous line of code compiles and executes correctly because there is an implied cast. The new operator returns a reference to a Jeff type, but o is a reference to a System.Object type. Since all types (including the Jeff type) can be cast to System.Object, the implied cast is successful.
However, if you try to compile the following line, you get a compiler error because the compiler does not provide an implicit cast from a base type to a derived type.
To get the line to compile, you must insert an explicit cast, as follows:
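A minimal sketch of both forms, with the rejected line left as a comment so that the program compiles. The Jeff type is the same assumed stand-in as before.

```csharp
using System;

// Assumed shape of the Jeff type.
class Jeff
{
    public Jeff(string param) { }
}

static class ExplicitCastDemo
{
    static void Main()
    {
        Object o = new Jeff("ConstructorParam1");

        // Jeff j = o;      // compiler error: no implicit cast from base to derived
        Jeff j = (Jeff) o;  // explicit cast; verified by the runtime when executed

        // Both variables refer to the same object.
        Console.WriteLine(Object.ReferenceEquals(o, j));
    }
}
```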
Now the code compiles and executes successfully.
Let's look at another example:
On the first line, I have created an object of type System.Object. On the second line, I am attempting to convert a reference of type System.Object to a reference of type Jeff. Both lines of code compile just fine. However, when executed, the second line throws an InvalidCastException, which, if not caught, forces the application to terminate.
When the second line of code executes, the common language runtime verifies that the object referred to by o is in fact an object of type Jeff (or any type derived from type Jeff). If so, the common language runtime allows the cast. However, if the object referenced by o has no relationship to Jeff, or is a base class of Jeff, then the common language runtime prevents the unsafe cast and throws an InvalidCastException.
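The runtime check described above can be sketched as follows. The CastFails helper is a hypothetical method added for this illustration; it catches the exception so that the program can report the outcome instead of terminating.

```csharp
using System;

// Assumed, empty stand-in type.
class Jeff
{
}

static class InvalidCastDemo
{
    // Returns true if casting o to Jeff throws InvalidCastException.
    public static bool CastFails(Object o)
    {
        try
        {
            Jeff j = (Jeff) o;   // compiles for any Object; checked when executed
            return j == null && o != null;   // always false when the cast succeeds
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    static void Main()
    {
        // A plain Object is a base type of Jeff, so the cast is refused.
        Console.WriteLine(CastFails(new Object()));

        // The object really is a Jeff, so the cast is allowed.
        Console.WriteLine(CastFails(new Jeff()));
    }
}
```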
C# offers another way to perform a cast using the as operator:
The as operator attempts to cast an object to the specified type. However, unlike normal casting, the as operator never throws an exception. Instead, if the object cannot be cast to the specified type, the result is null. If that null reference is then used, a NullReferenceException is thrown. The following code demonstrates this concept.
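A minimal sketch of that behavior. The UseAfterAsThrows helper is a hypothetical name; it exists only to show where the exception arises (at the point of use, not at the as expression).

```csharp
using System;

// Assumed, empty stand-in type.
class Jeff
{
}

static class AsDemo
{
    // Returns true if using the result of "o as Jeff" throws.
    public static bool UseAfterAsThrows(Object o)
    {
        Jeff j = o as Jeff;    // null if o is not a Jeff; never throws here
        try
        {
            j.ToString();      // throws NullReferenceException if j is null
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(UseAfterAsThrows(new Object()));  // cast failed, use throws
        Console.WriteLine(UseAfterAsThrows(new Jeff()));    // cast succeeded
    }
}
```

In practice, the result of as is usually compared against null before it is used, which avoids the exception altogether.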
In addition to the as operator, C# also offers an is operator. The is operator checks whether an object instance is compatible with a given type, and the result of the evaluation is either true or false. The is operator never throws an exception.
Note that if the object reference is null, the is operator always returns false since there is no object whose type can be checked.
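A short sketch of the is operator, again using an assumed, empty Jeff class. The IsJeff helper is a hypothetical wrapper added so that the check can be reused.

```csharp
using System;

// Assumed, empty stand-in type.
class Jeff
{
}

static class IsDemo
{
    // Wraps the is operator; never throws, whatever o refers to.
    public static bool IsJeff(Object o)
    {
        return o is Jeff;
    }

    static void Main()
    {
        Console.WriteLine(IsJeff(new Object()));  // an Object is not a Jeff
        Console.WriteLine(IsJeff(new Jeff()));    // a Jeff is a Jeff
        Console.WriteLine(IsJeff(null));          // null refers to no object at all
    }
}
```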
To make sure you understand everything just presented, assume that the following two class definitions exist.
Now, check Figure 2 to see which lines would compile and execute successfully (ES), which would cause a compiler error (CE), and which would cause a common language runtime error (RE).
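As a rough companion to that exercise, the sketch below applies the same three labels to a handful of statements. The classes B (a base class) and D (derived from B) are hypothetical stand-ins, and the statements are illustrative choices rather than the contents of Figure 2.

```csharp
using System;

// Hypothetical stand-ins: B is a base class, D derives from it.
class B
{
}

class D : B
{
}

static class CastRulesDemo
{
    // Returns true if casting b to D fails at run time.
    public static bool CastToDFails(B b)
    {
        try
        {
            D d = (D) b;
            return d == null && b != null;   // always false when the cast succeeds
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    static void Main()
    {
        B b1 = new B();       // ES
        B b2 = new D();       // ES: implicit cast from derived to base
        // D d1 = new B();    // CE: no implicit cast from base to derived
        D d2 = (D) b2;        // ES: b2 really refers to a D

        Console.WriteLine(CastToDFails(b1));   // RE: b1 refers to a B, not a D
        Console.WriteLine(CastToDFails(b2));   // ES
        Console.WriteLine(d2 is B);            // ES: every D is also a B
    }
}
```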
Assemblies and Namespaces

A collection of types can be grouped together into an assembly (a set of one or more files) and deployed. Within an assembly, individual namespaces can exist. To the application developer, namespaces look like logical groupings of related types. For example, the base class library assembly contains many namespaces. The System namespace includes core low-level types such as the Object base type, Byte, Int32, Exception, Math, and Delegate, while the System.Collections namespace includes such types as ArrayList, BitArray, Queue, and Stack.
To the compiler, a namespace is simply an easy way of making a type's name longer and ensuring uniqueness by preceding the name with some symbols separated by dots. To the compiler, the Object type in the System namespace really just identifies a type called System.Object. Similarly, the Queue type in the System.Collections namespace simply identifies a type called System.Collections.Queue.
The runtime engine doesn't know anything about namespaces. When you access a type, the common language runtime just needs to know the full name of the type and which assembly contains the definition of the type so that the common language runtime can load the correct assembly, find the type, and manipulate it.
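One way to see the full name and the defining assembly that the runtime works with is to ask a type for them through reflection. This is only a sketch; the assembly name it prints depends on the runtime version installed.

```csharp
using System;
using System.Collections;

static class FullNameDemo
{
    static void Main()
    {
        Type t = typeof(Queue);

        // The full name is the namespace and the type name joined by dots.
        Console.WriteLine(t.FullName);

        // The assembly that contains the type's definition (varies by runtime).
        Console.WriteLine(t.Assembly.FullName);

        Console.WriteLine(typeof(Object).FullName);
    }
}
```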
Programmers usually want the most concise way of expressing their algorithms; it is extremely inconvenient to refer to every class type using its fully qualified name. For this reason, many programming languages offer a statement that instructs the compiler to try prepending various prefixes to a type name until a match is made. When coding in C#, I usually place the following line at the top of my source code modules:
When I refer to a type in my code, the compiler needs to ensure that the type is defined and that my code accesses the type in the correct way. If the compiler can't find a type with the specified name, it tries prepending "System." to the type name and checks if the generated name matches an existing type. The previous line of code allows me to use Object in my code, and the compiler will automatically expand the name to System.Object. I'm sure you can easily imagine how much typing this saves.
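A minimal sketch of the effect, assuming the line in question is a using directive for the System namespace: the short name and the fully qualified name resolve to the same type.

```csharp
using System;   // lets the compiler resolve Object as System.Object

static class UsingDemo
{
    static void Main()
    {
        Object a = new Object();                 // short name, resolved via the using line
        System.Object b = new System.Object();   // fully qualified name

        // Both names identify the same type.
        Console.WriteLine(a.GetType() == b.GetType());
        Console.WriteLine(a.GetType().FullName);
    }
}
```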
When checking for a type's definition, the compiler must know which assembly contains the type so that the assembly information and the type information can be emitted into the resulting file. To get the assembly information, you must pass the assembly that defines any referenced types to the compiler.
As you might imagine, there are some potential problems with this scheme. For programming convenience, you should avoid creating types that have conflicting names. However, in some cases, it is simply not possible. .NET encourages the reuse of components. Your application may take advantage of a component created by Microsoft and another component created by me. Both of these companies' components may offer a type called FooBar: Microsoft's FooBar does one thing and Richter's FooBar does something entirely different. In this scenario, you had no control over the naming of the class types. To reference the Microsoft FooBar, you'd use Microsoft.FooBar and to reference my FooBar, you'd use Richter.FooBar.
In the following code, the reference to FooBar is ambiguous. Early prerelease builds of the C# compiler simply picked one of the possible FooBar types, so the problem surfaced only at run time; released versions of the compiler report the ambiguity as a compile-time error (CS0104, ambiguous reference):
To remove the ambiguity, you must explicitly tell the compiler which FooBar you want to create.
There is another form of the using statement that allows you to create an alias for a single type. This is handy if you have just a few types that you use from a namespace and don't want to pollute the global namespace with all of a namespace's types. The following code demonstrates another way to solve the ambiguity problem.
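Both ways of removing the ambiguity can be sketched in one program. The Microsoft.FooBar and Richter.FooBar types are declared locally here purely so that the sketch compiles; their Describe method and the MSFooBar alias name are assumptions for illustration.

```csharp
using System;
using MSFooBar = Microsoft.FooBar;   // alias for a single type

// Hypothetical namespaces and types, declared here only so the sketch compiles.
namespace Microsoft
{
    public class FooBar
    {
        public string Describe() { return "Microsoft.FooBar"; }
    }
}

namespace Richter
{
    public class FooBar
    {
        public string Describe() { return "Richter.FooBar"; }
    }
}

static class AmbiguityDemo
{
    static void Main()
    {
        // Option 1: name each type by its fully qualified name.
        Microsoft.FooBar f1 = new Microsoft.FooBar();
        Richter.FooBar f2 = new Richter.FooBar();

        // Option 2: use the alias declared at the top of the file.
        MSFooBar f3 = new MSFooBar();

        Console.WriteLine(f1.Describe());
        Console.WriteLine(f2.Describe());
        Console.WriteLine(f3.Describe());
    }
}
```

The alias brings in just the one type, so the rest of either namespace stays out of scope.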
These methods of disambiguating a type are useful, but there are scenarios where this is still not good enough. Imagine that the Australian Boomerang Company (ABC) and the Alaskan Boat Corporation (ABC) are each creating a type, called BuyProduct, which they intend to ship in their respective assemblies. It is likely that both companies would create a namespace called ABC that contains a type called BuyProduct. Anyone who tries to develop an application that needs to buy both boomerangs and boats would be in for some trouble unless your programming language gives you a way to programmatically distinguish between the assemblies, not just between namespaces.
Unfortunately, the C# using statement only supports namespaces and does not offer any way to specify an assembly. However, in the real world, this problem does not come up very often, so it's rarely an issue. If you are designing component types that you expect third parties to use, it is highly recommended that you define these types in a namespace so compilers can easily disambiguate types. In fact, you should use your full company name (not an acronym) as your top-level namespace name to reduce the likelihood of conflict. You can see that Microsoft uses a namespace of "Microsoft".
Creating a namespace is simply a matter of writing a namespace declaration into your code, as follows:
Note that namespaces are implicitly public. You cannot change this by including any access modifiers. However, you can define types within a namespace that are internal (can't be used outside the assembly) or public (can be accessed by any assembly). Namespaces denote logical containment only; accessibility or packaging is accomplished by placing the namespace into an assembly.
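A minimal sketch of a namespace declaration containing public and internal types. CompanyName, Tools, Widget, Helper, and Gadget are hypothetical names chosen for this illustration.

```csharp
using System;

namespace CompanyName           // hypothetical top-level namespace
{
    public class Widget         // CompanyName.Widget, usable from any assembly
    {
    }

    internal class Helper       // CompanyName.Helper, usable only inside this assembly
    {
    }

    namespace Tools             // nested namespace: CompanyName.Tools
    {
        public class Gadget     // CompanyName.Tools.Gadget
        {
        }
    }
}

static class NamespaceDemo
{
    static void Main()
    {
        Console.WriteLine(typeof(CompanyName.Widget).FullName);
        Console.WriteLine(typeof(CompanyName.Tools.Gadget).FullName);

        // Accessibility belongs to the types, not to the namespace.
        Console.WriteLine(typeof(CompanyName.Widget).IsPublic);
        Console.WriteLine(typeof(CompanyName.Helper).IsPublic);
    }
}
```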
In my next column, I'll describe the different kinds of types that all .NET programmers must be aware of: primitive types, reference types, and value types. A good understanding of value types is extremely important to every .NET programmer.
Jeffrey Richter is the author of Programming Applications for Microsoft Windows (Microsoft Press, 1999). Jeff specializes in programming/design for .NET and Win32. He is also a cofounder of Wintellect, a software consulting and education firm, and can be reached at http://www.JeffreyRichter.com.
From the October 2000 issue of MSDN Magazine.