Nullable syntax

06/15/2004

Had a long talk with Renaud today about nullable types and the interesting ideas they've been pushing through the language. Specifically, we've added a few niceties in the compiler to make using nullable value types as easy as using the actual value type.

For example, in C# you can type:

 
    int i = GetSomeInt();
    int j = GetSomeInt();
    int k = i + j;

    short a = GetSomeShort();
    int b = a;

If you were to use Nullables then you'd have to write:

 
    Nullable<int> i = GetSomeNullableInt();
    Nullable<int> j = GetSomeNullableInt();
    Nullable<int> k = i.HasValue && j.HasValue ? new Nullable<int>(i.Value + j.Value) : (Nullable<int>)null;

    Nullable<short> a = GetSomeNullableShort();
    Nullable<int> b = a.HasValue : new Nullable<int>(a.Value) : (Nullable<int>)null;

Pretty verbose and unwieldy. In C# 2.0 you can now write that as:

 
    int? i = GetSomeNullableInt();
    int? j = GetSomeNullableInt();
    int? k = i + j;

    short? a = GetSomeNullableShort();
    int? b = a;

Far, far easier than the version where you have to write out Nullable, and pretty close to the code you would have written in the original non-nullable case. The two additions to the language are "nullable conversions" and "lifted operators". A "nullable conversion" is where we allow predefined and user-defined conversions on a type T to be used on a Nullable<T>; i.e. since there is a predefined conversion from byte to int, there is automatically a conversion from byte? to int?. Similarly, for operators on value types (like op_Addition, etc.) there is automatically a "lifted" operator on the nullable type that does the appropriate null checking. So if you have "int +(int i, int j)" there is now "int? +(int? i, int? j)". If any of the arguments is null, then the result is null. If all the arguments are non-null, then the underlying operation is applied to the actual values.
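To make the two rules concrete, here is a small sketch of how they behave in practice. The class and method names (LiftedDemo, Add, Widen) are made up for illustration; only the `+` on int? and the byte? to int? conversion come from the language itself.

```csharp
using System;

public static class LiftedDemo
{
    // Lifted operator: int? + int? is null if either operand is null,
    // otherwise it is the ordinary int sum.
    public static int? Add(int? i, int? j)
    {
        return i + j;
    }

    // Nullable conversion: the predefined byte -> int conversion
    // is available as byte? -> int? with no cast.
    public static int? Widen(byte? b)
    {
        return b;
    }

    public static void Main()
    {
        Console.WriteLine(Add(2, 3).HasValue);          // True
        Console.WriteLine(Add(2, null).HasValue);       // False
        Console.WriteLine(Widen((byte)7).HasValue);     // True
        Console.WriteLine(Widen(null).HasValue);        // False
    }
}
```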

We were thinking about how that made using nullable types far easier, and how it allows you to think of a nullable struct as the underlying struct... except in one way. Operators and conversions are made available to you automatically; methods, properties and fields are not. i.e. of the two usages below, the non-nullable one compiles but the nullable one does not:

 
    public struct Vector {
        public Vector Normalize { get; }
        public double Dot(Vector v);
    }

    ...

    // Fine: members are called on the struct itself.
    Vector v1 = GetSomeVector();
    Vector v2 = v1.Normalize;
    double d = v1.Dot(v2);

    ...

    // Does not compile: Normalize and Dot are not members of Vector?.
    Vector? v1 = GetSomeNullableVector();
    Vector? v2 = v1.Normalize;
    double? d = v1.Dot(v2);

This is because properties and methods aren't lifted up into the nullable type. This is kind of a shame (IMO). I have an entire graphics library written around primitives like colors/points/vectors/matrices, none of which will get the benefit. So I'll have to write:

 
    Matrix? m = GetSomeMatrix();
    Vector? v = m.HasValue ? m.Value.EigenVector : (Vector?)null;

    ... // instead of

    Vector? v = m.EigenVector;
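Short of new syntax, the HasValue boilerplate can at least be pushed into one place. The following is only a sketch of that idea, not anything the compiler team proposed: Lift is a hypothetical helper, the Vector here is a toy two-field stand-in for a real library type, and the code uses lambdas and Func, which are not part of C# 2.0.

```csharp
using System;

// Toy stand-in for a graphics-library vector (assumed for illustration).
public struct Vector
{
    public double X, Y;
    public Vector(double x, double y) { X = x; Y = y; }
    public double Length { get { return Math.Sqrt(X * X + Y * Y); } }
    public Vector Normalize
    {
        get { double n = Length; return new Vector(X / n, Y / n); }
    }
}

public static class NullableLift
{
    // Hypothetical helper: applies f to the underlying value,
    // or propagates null when there is no value.
    public static TResult? Lift<T, TResult>(T? value, Func<T, TResult> f)
        where T : struct
        where TResult : struct
    {
        return value.HasValue ? f(value.Value) : (TResult?)null;
    }

    public static void Main()
    {
        Vector? v1 = new Vector(3, 4);
        Vector? none = null;

        // Chained member access without repeating the HasValue check.
        double? len = Lift(Lift(v1, v => v.Normalize), v => v.Length);
        Console.WriteLine(len.HasValue);                              // True
        Console.WriteLine(Lift(none, v => v.Normalize).HasValue);     // False
    }
}
```

It works, but every call site pays for a delegate and reads inside-out, which is part of why dedicated syntax is attractive.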

This would be especially bad if I had to do something like "m.EigenVector.Normalize", checking for 'HasValue' every time. One of the issues with lifting properties is the ambiguities that would arise. Consider the following:

 
     struct S {
           public bool HasValue { get; }
     }

     ...

     S? s;
     bool b = s.HasValue;

Does the 'HasValue' access refer to Nullable<S>.HasValue or S.HasValue? Renaud, Luke and I talked about this and considered possible syntax to take care of the situation. One thing Renaud thought was that we could use the -> syntax to accomplish this. For example, you could write things like:

 
   Vector? v1 = GetSomeNullableVector();
   Vector? v2 = v1->Normalize;
   double? d = v1->Dot(v2);

Because you could never have a "T?*" (a Nullable<T>*), the -> operator would never be ambiguous. However, Luke thinks that bringing in the arrow operator would be incredibly bad because of all the weight it carries from its C#/C/C++ meaning of pointer dereferencing. Still, I see a certain symmetry between pointers and nullable types. For example, you have:

 
     int* i;
     int j = (*i).CompareTo(4);
     int k = i->CompareTo(4);

    ...

    int? i;
    int j = (?i).CompareTo(4);
    int k = i->CompareTo(4);

In the second-to-last line the (?i) acts to pull the value out of the nullable (or throw if it's null). This is similar to how (*i) will dereference the pointer to the int, or throw if the pointer is null.

An alternative to the -> syntax would be to have something like => . It's concise, unambiguous, and doesn't seem to carry any baggage along with it (well, Perl probably uses it for something, but then again Perl has a construct for anything). The addition of this syntax would allow nullable types to be interchanged with regular value types with very little syntactic overhead. What do you think?
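For comparison, here is roughly what the two proposed forms would correspond to in syntax that already exists. This is my reading of the proposal, not a spec: (?i) is taken to mean "unwrap or throw" (i.e. the Value property), and the arrow is taken in the null-propagating sense used in the Vector example above. UnwrapDemo and its method names are invented for the sketch.

```csharp
using System;

public static class UnwrapDemo
{
    // Assumed meaning of (?i).CompareTo(4): unwrap via Value, which
    // throws InvalidOperationException when i has no value.
    public static int CompareUnwrapped(int? i)
    {
        return i.Value.CompareTo(4);
    }

    // Assumed meaning of a null-propagating i->CompareTo(4):
    // call through when there is a value, otherwise yield null.
    public static int? CompareLifted(int? i)
    {
        return i.HasValue ? i.Value.CompareTo(4) : (int?)null;
    }

    public static void Main()
    {
        Console.WriteLine(CompareUnwrapped(7) > 0);          // True
        Console.WriteLine(CompareLifted(null).HasValue);     // False
        try
        {
            CompareUnwrapped(null);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("null has no Value");
        }
    }
}
```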

Edit: Daniel raised a good point about the niceties of having specialized syntax to lift a method call onto the actual underlying value. Specifically, say you had the following definition of Nullable:

 
    public class Nullable<T> where T : struct {
        public readonly T Value;
        public Nullable(T value) {
            this.Value = value;
        }
    }

Then if you did:

 
     MyStruct? nullableMe;
     MyStruct me = nullableMe.Value;
     me.DoSomethingThatChangesMyState();
     MethodWithNullableMyStructParameter(nullableMe);

Then you'd have a problem because the changed state in "me" wouldn't be visible in "nullableMe". However, if you had:

 
     MyStruct? nullableMe;
     nullableMe->DoSomethingThatChangesMyState();
     MethodWithNullableMyStructParameter(nullableMe);

Then everything would be ok (I think). Mutable structs are still really confusing and you should definitely _not_ use them :-) However, if you do, then they will work in the nullable system as well. I don't have a copy of the spec in front of me, but I'm pretty sure that accessing a struct through a field does not give you a copy of the struct but a reference to the actual struct itself.
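The copy-versus-reference question is easy to check with a small experiment. The sketch below uses made-up types (Counter, Holder) and the real System.Nullable<T>, whose Value is a property rather than the field in the hypothetical definition above. It suggests the answer depends on the kind of field: a plain field is mutated in place, but a method called on a readonly field runs against a copy.

```csharp
using System;

// Deliberately mutable struct, for demonstration only.
public struct Counter
{
    public int Count;
    public void Increment() { Count++; }
}

public class Holder
{
    public Counter Field;             // plain field
    public readonly Counter Frozen;   // readonly field
}

public static class CopyDemo
{
    // Copying the value out and mutating the copy leaves the nullable untouched.
    public static int ThroughValue()
    {
        Counter? nullableMe = new Counter();
        Counter me = nullableMe.Value;
        me.Increment();
        return nullableMe.Value.Count;
    }

    // Calling through a plain field mutates the stored struct in place.
    public static int ThroughField()
    {
        Holder h = new Holder();
        h.Field.Increment();
        return h.Field.Count;
    }

    // Calling through a readonly field mutates a temporary copy.
    public static int ThroughReadonlyField()
    {
        Holder h = new Holder();
        h.Frozen.Increment();
        return h.Frozen.Count;
    }

    public static void Main()
    {
        Console.WriteLine(ThroughValue());           // 0
        Console.WriteLine(ThroughField());           // 1
        Console.WriteLine(ThroughReadonlyField());   // 0
    }
}
```

So for an arrow-style call to mutate the wrapped value, the wrapper would presumably need to hold it in a non-readonly field.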
