A Brief History of DateTime [Anthony Moore]
For the .NET 3.5 release, the BCL team has been trying to solve some problems related to dates, times and time zones that customers have been asking us to address for some time. We’ve invested in two significant features:
- TimeZoneInfo: Extended Time Zone Support. This allows for enumeration, conversion and serialization of time zones other than a machine’s current time zone. This includes support for time zones that change Daylight Saving Time rules over time, such as the recent 2007 change in the USA.
- DateTimeOffset: A DateTime that includes a UTC Offset. This allows a local time to map to an absolute UTC time and preserves the local time even if the instance is moved from one machine to another.
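As a rough illustration of the two features, here is a minimal sketch. The class and method names and the zone id `Example/Plus9` are invented for this example, and a fixed-offset custom zone is used so that the result does not depend on the time zone data installed on the machine running it.

```csharp
using System;

public static class OffsetBasics
{
    // A clock reading of 09:30 taken in a zone that is 7 hours behind UTC.
    public static DateTimeOffset Sample()
    {
        return new DateTimeOffset(2007, 6, 1, 9, 30, 0, TimeSpan.FromHours(-7));
    }

    // Hypothetical fixed-offset zone with no DST rules (illustration only).
    public static TimeZoneInfo FixedZone(string id, int hours)
    {
        return TimeZoneInfo.CreateCustomTimeZone(id, TimeSpan.FromHours(hours), id, id);
    }

    // Re-express the same instant as a clock reading in another zone.
    public static DateTimeOffset ConvertToFixedZone(DateTimeOffset value, string id, int hours)
    {
        return TimeZoneInfo.ConvertTime(value, FixedZone(id, hours));
    }

    // A TimeZoneInfo can be turned into a string and restored later.
    public static TimeZoneInfo RoundTrip(TimeZoneInfo zone)
    {
        string text = zone.ToSerializedString();
        return TimeZoneInfo.FromSerializedString(text);
    }
}
```

For the sample value, `DateTime` returns the original 09:30 clock reading, `Offset` returns -07:00, and `UtcDateTime` returns 16:30 on the same day. Converting it to the hypothetical +09:00 zone gives 01:30 on 2 June, which still compares equal to the original because both describe the same instant.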
DateTimeOffset is a completely new DateTime type that stores the UTC Offset in addition to the DateTime. In presenting the design of this type, there are a couple of questions that keep coming up that we would like to address. The full rationale behind these provisional decisions requires going back through some of the history of DateTime over the last 8 years.
In .NET 3.5 there will be a new DateTime type called DateTimeOffset. The name of this type is an interesting topic in itself, but current plans are that this new type will complement rather than fully replace the existing DateTime type, and thus it will not be called DateTime2. A couple of questions keep coming up about this which I will address in this post by covering some relevant history:
1. Why can’t the existing DateTime type simply be extended to have the UTC offset in it?
2. If (1) is not possible, why can’t the new type be called DateTime2 and completely replace all the usage scenarios of the existing DateTime?
It is necessary to start with the original rationale and design issues surrounding the original DateTime type.
In the Beginning (V1.0)
The original DateTime type was mostly designed in 1998-1999, as part of the new .NET Platform type system. Up until that time, almost all platforms (e.g. Win32, C, VB, COM, databases) represented dates and times with a combined date and time type that had no time zone information. The time zone context was outside the type’s data and in the application’s context, in the same way that using Decimal to represent money involves the context of the currency unit living outside the Decimal instance. The option of a DateTime with a UTC offset or a time zone ID was discussed and there was a lot of design back and forth. There was a strong desire to keep the type system simple, and while a time zone aware DateTime was better at representing absolute points in time such as a file time-stamp, it got awkward when doing certain things:
- Interoperating with database, COM or legacy systems where the time zone could not be reliably inferred.
- Representing genuinely abstract dates and times, e.g. the opening times for stores of a company spanning time zones.
- Representing whole dates, e.g. the type for a calendar control to return. (Obviously a “Date” type is an even better way of representing this, and many people have asked for separate Date and Time types, but in V1.0 there was a desire to keep the number of base types small to limit complexity of the platform. This feature request is definitely still on the table, although not planned for .NET 3.5).
Another factor was that many prior platforms had lived with DateTime types like this without running into serious issues. In the end, the desire to keep the platform simple won out, and the DateTime was a date and a time and nothing else, although support for conversion between local and UTC was provided using the machine’s local time zone.
Once this was shipped, a number of serious problems became clear:
- Data loss during DST rollback. Unfortunately the decision to use a simple DateTime type was combined with the convention of defaulting to returning local instances from almost all .NET Framework APIs. Of 84 APIs returning “absolute” points in time in the framework, 81 returned local instances. This resulted in data loss during the repeated hour when the clock was rolled back coming out of DST. Win32 did not have this problem because APIs defaulted to UTC. It was desirable for the .NET Framework to return more user-friendly local times, but the reliability implications were not fully evaluated before shipping. This was definitely an unfortunate mistake.
- Arithmetic errors during DST transitions. As above, the prevalent use of local DateTime instances created a usability problem in that people would do arithmetic with local instances, which was wrong more often than not because of changes to the clock during transitions in and out of DST. Again, Win32 did not have this problem because of the default to UTC. This can be worked around, although the data loss problem above meant that it could not be fully worked around.
- Serialization in XML defaulted to assuming all DateTime instances were local, which prevented correct serialization of UTC instances, abstract instances or date-only instances. It would have been better to not assume anything about the time zone since the instance had no such information.
- DateTime could not handle the server-side scenario of working with local instances from time zones other than that of the current machine. Handling this correctly required using two fields to represent a DateTime, which was decidedly clumsy.
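The repeated hour can be made concrete with a small sketch. It uses DateTimeOffset purely as a measuring tool, and the class and member names are invented for this example. The offsets -04:00 and -05:00 assume a US Eastern-style zone on 4 November 2007, the day the clock was rolled back that year.

```csharp
using System;

public static class RepeatedHour
{
    // 01:30 while DST is still in effect (UTC-4)...
    public static readonly DateTimeOffset BeforeRollback =
        new DateTimeOffset(2007, 11, 4, 1, 30, 0, TimeSpan.FromHours(-4));

    // ...and 01:30 again an hour later, after the rollback (UTC-5).
    public static readonly DateTimeOffset AfterRollback =
        new DateTimeOffset(2007, 11, 4, 1, 30, 0, TimeSpan.FromHours(-5));

    // A job that starts at 00:45 before the rollback and ends at 01:15 after it.
    public static readonly DateTimeOffset JobStart =
        new DateTimeOffset(2007, 11, 4, 0, 45, 0, TimeSpan.FromHours(-4));
    public static readonly DateTimeOffset JobEnd =
        new DateTimeOffset(2007, 11, 4, 1, 15, 0, TimeSpan.FromHours(-5));

    // Data loss: a bare local clock reading cannot tell the two instants apart.
    public static bool ClockReadingsAreIdentical()
    {
        return BeforeRollback.DateTime == AfterRollback.DateTime;
    }

    // Arithmetic error: subtracting the local clock readings.
    public static TimeSpan ElapsedByLocalClock()
    {
        return JobEnd.DateTime - JobStart.DateTime;
    }

    // Correct elapsed time: subtracting the absolute instants.
    public static TimeSpan ElapsedInReality()
    {
        return JobEnd - JobStart;
    }
}
```

Here the two 01:30 readings are an hour apart in reality yet identical as clock readings, and the job appears to take 30 minutes by the local clock when it actually ran for an hour and a half.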
Changing the Existing DateTime (V2.0)
In V2.0, when the severity of these issues was fully understood, design began on a solution. There was actually a fully developed specification to address the above issues by adding a UTC offset to the existing DateTime type. This was going to be a challenging operation since this was a significant change to what the type was. The only way to keep a reasonable compatibility guarantee was to make the time zone offset optional. However, this proposed change simply did not stand up to design scrutiny and implementation trials, and we discovered that our options of changing the existing type were significantly more limited than we had hoped. Here were some of the problems that emerged:
- Compatibility was the main issue. The only way to have stayed completely compatible would have been for existing APIs to have never set or consumed the offset, even for existing APIs on the DateTime class itself. The optional offset was fine for new code, but having something like DateTime.Now actually set the local time offset, which would have been required to address the data loss problem, would break any code that decided to act differently in the presence of the offset. So you could either have DateTime respect the new offset in existing APIs but not allow any existing API to return an instance with the offset, or allow existing APIs to return such instances but have DateTime only opt in to actually using the offset. Neither option actually addressed the data loss in the existing APIs, so those APIs would have had to be deprecated anyway. There is a case described below that provides a concrete example of the compatibility issues even after the proposed solution was significantly scaled back.
- A lot of systems special cased the binary layout of base types such as DateTime. The design required making the DateTime at least 2 bytes larger. It is not clear if this would have been practical from a compatibility standpoint either.
- If we were to deprecate the existing APIs taking or returning DateTime instances, adding a completely new type would have been a much clearer and more compatible way to do this anyway. It would have been unclear what to name the new versions of the existing APIs, and there would have been sets of APIs with identical signatures that differed only by the kind of DateTime instance they took or returned.
- The optional UTC offset created a lot of design consistency problems, because instances without an offset might be used with instances with one. It is not clear what comparison should do. You would either have to disallow comparison, in which case it might as well be a new type anyway, or you could have a compatible comparison that would have odd behavior, such as intransitive comparison. This also creates issues with extra argument validation as most APIs would want to take either absolute or floating instances and not either one.
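The intransitive comparison mentioned in the last point can be sketched with a made-up type. `OptionalOffsetTime` and `LooselyEquals` are hypothetical and were never part of the .NET Framework; the comparison rule below is only one plausible reading of what a "compatible" comparison might have looked like.

```csharp
using System;

// Hypothetical type for illustration only: a DateTime with an optional offset.
public readonly struct OptionalOffsetTime
{
    public DateTime Clock { get; }
    public TimeSpan? Offset { get; }

    public OptionalOffsetTime(DateTime clock, TimeSpan? offset)
    {
        Clock = clock;
        Offset = offset;
    }

    // Assumed "compatible" rule: two absolute values compare as instants,
    // anything involving a floating value compares clock readings only.
    public bool LooselyEquals(OptionalOffsetTime other)
    {
        if (Offset.HasValue && other.Offset.HasValue)
        {
            return Clock - Offset.Value == other.Clock - other.Offset.Value;
        }
        return Clock == other.Clock;
    }
}

public static class IntransitiveDemo
{
    private static readonly DateTime Noon = new DateTime(2007, 6, 1, 12, 0, 0);

    public static readonly OptionalOffsetTime Floating =
        new OptionalOffsetTime(Noon, null);
    public static readonly OptionalOffsetTime NoonUtc =
        new OptionalOffsetTime(Noon, TimeSpan.Zero);
    public static readonly OptionalOffsetTime NoonMinusFive =
        new OptionalOffsetTime(Noon, TimeSpan.FromHours(-5));
}
```

Under this rule `Floating` matches both `NoonUtc` and `NoonMinusFive`, yet those two do not match each other, so equality is no longer transitive. Any code that sorts or de-duplicates such values would behave unpredictably.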
This was a difficult situation. It was extremely undesirable to leave every existing API taking DateTime and accepting or returning local instances with unreliable behavior around DST transitions. But a comprehensive change to DateTime was not possible in a compatible way either. It was decided to do a very minimal change to DateTime by using some unused bits in the structure to indicate if an instance was UTC or Local, and using an extra state to disambiguate the hour repeated during DST transition to solve the reliability problem. The DateTime.Kind property was introduced to solve the reliability problem without doing a full deprecation of usage of the type. However, it was not a comprehensive solution because to maintain compatibility, the Kind could only be very minimally used in existing APIs. It was decided that we would try to make it possible for existing APIs to use the Kind, but make any behavior changes on DateTime itself opt-in as much as possible. Thus the Kind is used in the critical conversions to and from Local and UTC, but not when DateTime instances are compared or formatted by default.
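A short sketch of how this split looks in practice follows; the class and member names are invented for this example. Only behavior that does not depend on the machine's time zone is shown.

```csharp
using System;

public static class KindDemo
{
    public static readonly DateTime NoonUtc =
        new DateTime(2007, 6, 1, 12, 0, 0, DateTimeKind.Utc);

    public static readonly DateTime NoonUnspecified =
        new DateTime(2007, 6, 1, 12, 0, 0, DateTimeKind.Unspecified);

    // Comparison looks only at the ticks, so Kind plays no part in it.
    public static bool ComparisonIgnoresKind()
    {
        return NoonUtc == NoonUnspecified;
    }

    // Conversion does consult Kind: a value already marked Utc is left alone.
    public static bool UtcToUniversalIsNoOp()
    {
        DateTime converted = NoonUtc.ToUniversalTime();
        return converted.Ticks == NoonUtc.Ticks && converted.Kind == DateTimeKind.Utc;
    }

    // SpecifyKind relabels a value without moving its clock reading.
    public static DateTime MarkAsUtc(DateTime value)
    {
        return DateTime.SpecifyKind(value, DateTimeKind.Utc);
    }
}
```

By contrast, calling `ToUniversalTime()` on the Unspecified value treats it as a local time and shifts it by the machine's offset, which is why the result of that call is not shown here.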
Even this minimal change presented significant compatibility challenges. An example of this was DateTime.ToString. If the instance was marked with DateTimeKind.Utc, it would make sense to output its time zone offset as “+00:00” instead of pulling in the machine’s local offset, which only makes sense for local instances. At one point the code actually implemented this more correct behavior. However, we found that a significant number of users of DateTime had taken a dependency on the incorrect behavior and were broken by this change. They were serializing UTC instances in XML, and because it defaulted to formatting as a local time and because Parse converted back to the local time zone by default, their code was working most of the time, even though the persisted string would have been incorrect if parsed by a 3rd party system. To maintain compatibility even this change had to be backed out, and correct formatting of the time zone offset had to be made opt-in via the new “o” and “K” format options.
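The opt-in formatting can be sketched as follows; the class and method names are invented for this example. Note that in the shipped specifiers a UTC instance is written with a trailing `Z` rather than `+00:00`, and an Unspecified instance gets no suffix at all.

```csharp
using System;
using System.Globalization;

public static class RoundTripFormat
{
    // "o" is the round-trip standard format; it includes the Kind information.
    public static string Format(DateTime value)
    {
        return value.ToString("o", CultureInfo.InvariantCulture);
    }

    // "K" is the matching custom specifier, usable inside a longer pattern.
    public static string FormatShort(DateTime value)
    {
        return value.ToString("yyyy-MM-dd HH:mmK", CultureInfo.InvariantCulture);
    }

    // RoundtripKind asks Parse to keep the Kind found in the string instead
    // of converting the result to the machine's local time.
    public static DateTime Parse(string text)
    {
        return DateTime.Parse(text, CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind);
    }
}
```

Formatting noon on 1 June 2007 with DateTimeKind.Utc yields `2007-06-01T12:00:00.0000000Z`, and parsing that string back returns the same ticks with Kind still set to Utc.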
The DateTimeKind solved the most pressing issues with DateTime, but still left the space with a lot of unresolved issues. It was still very difficult to use correctly, and it could not deal well with the server scenarios where instances from different machines in different time zones needed to be handled.
So, this should address the first question of why we are not currently pursuing further the option of adding an optional offset to the existing DateTime type.
The Comprehensive Solution
After .NET 2.0 it became clear that there was still considerable demand for a type with DateTime and a UTC offset combined. This is the type we are delivering in .NET 3.5 (DateTimeOffset). The second question then comes up as to why we are planning a new type that complements DateTime instead of creating a complete replacement and calling it DateTime2. It is definitely compelling to have a single new type so that there does not have to be complex guidance around which of the types to use for what scenarios.
Even with compatibility taken out of the picture, a lot of the reasoning above still makes it compelling to have a new type that is always absolute. First, consider the landscape of date and time related scenarios:
A. Legacy DateTime Interop (databases, COM, Win32, etc).
B. Abstract DateTime scenarios (e.g. store starting time).
C. Date-only scenarios.
D. Time-only scenarios.
E. Absolute time scenarios (e.g. file time).
F. Server scenarios with dates from multiple time zones where the original local time must be retained.
(Scenarios C-D might be even better handled by dedicated Date and Time types. These options are still on the table, although they will not be in the .NET 3.5 release).
The existing DateTime can do scenarios A-E. But it does a mediocre job of E, because you have to convert it to and from UTC and Local depending on whether you want human-readable display or accurate arithmetic. For scenario F, you need to store an additional field to make it work. The proposed new DateTime with UTC offset is significantly better for scenario E and enables scenario F with a single type instance. However, it is not as good for scenarios A-D, where it would have to default to some arbitrary time zone offset. As a point of reference, the majority of DateTime scenarios involve E, although A and C are fairly common also, and F becomes more important when richer Time Zone support is available.
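The server scenario can be sketched as below. The class name, field names and the city labels are invented for this example, and the offsets are plain assumptions rather than looked-up time zone data.

```csharp
using System;

public static class ServerScenario
{
    // Three events reported by machines in different zones.
    public static readonly DateTimeOffset FromSeattle =
        new DateTimeOffset(2007, 6, 1, 9, 0, 0, TimeSpan.FromHours(-7));
    public static readonly DateTimeOffset FromLondon =
        new DateTimeOffset(2007, 6, 1, 17, 0, 0, TimeSpan.FromHours(1));
    public static readonly DateTimeOffset FromTokyo =
        new DateTimeOffset(2007, 6, 2, 0, 30, 0, TimeSpan.FromHours(9));

    // Equals compares the absolute instant, ignoring how it was expressed.
    public static bool SameInstant(DateTimeOffset a, DateTimeOffset b)
    {
        return a.Equals(b);
    }

    // EqualsExact also requires the offsets to match.
    public static bool SameInstantAndOffset(DateTimeOffset a, DateTimeOffset b)
    {
        return a.EqualsExact(b);
    }

    // Sorting orders by instant, while each element keeps its local reading.
    public static DateTimeOffset[] InChronologicalOrder()
    {
        DateTimeOffset[] events = { FromSeattle, FromLondon, FromTokyo };
        Array.Sort(events);
        return events;
    }
}
```

The Seattle and London values describe the same instant (16:00 UTC), so they are equal but not exactly equal. The Tokyo value sorts first even though its calendar date is a day later, because it corresponds to 15:30 UTC on 1 June, and it still reports 00:30 as its local clock reading after sorting.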
It is possible to create a new type that addresses scenarios A-D as well by making the UTC Offset optional rather than required. But this creates a lot of the same problems encountered during the aborted attempt to create such a type in .NET 2.0:
- The design consistency problems mentioned above related to having interaction of abstract and absolute instances. With an optional offset, you either get behavior different from DateTime, intransitive comparison or the inability to compare or subtract the two instance types at all.
- Most APIs taking DateTime as input will want something that is always abstract or always absolute and would be unlikely to need either at runtime. As such, most APIs would have to do extra argument validation to special case the less desirable behavior. There is value in having the type always constrained to represent an absolute and unambiguous point in time.
- Deprecating DateTime by adding DateTime2 would imply a second-hand deprecation of any API that takes or returns DateTime. While the new type is definitely better for some scenarios, DateTime is working just fine in the majority of existing uses, and is no longer subject to data loss from the V2.0 timeframe forward.
- The majority of APIs on the new type would end up having dual behaviors depending on the mode, which would make the class difficult to understand and use.
- Databases that do support dates and times with UTC offsets also do so by having a separate column type that always has an offset. Breaking from this pattern would make moving data to and from a database more complex.
- The main benefit of a single type is to avoid having to describe or understand which type to use for which scenario. However if there was such a combined type with or without a UTC offset, you would just have to transfer the complex explanation from the selection of the type to the use of the type.
An interesting question that is always useful to ask about a design is: Given what we know now, what would we do differently if we could start again?
It would be hard to get broad consensus on this point. My personal opinion is that the date and time space is too complicated to address with a single type. If I could start again, I would have something like the V1.0 DateTime, but have a separate DateTimeOffset type also. The main difference to where we will land with the current plan of record is that we would not need the DateTimeKind that was added to DateTime in V2.0. This was really a band-aid that would not have been needed if a DateTimeOffset had been available in V1. That being said, my own opinion is that the DateTimeKind solution was critical to salvage reliable usage of the existing APIs given that there was no better type to use in V1.0.
1. Why can’t the existing DateTime type simply be extended to have the offset in it and enable these new scenarios? The option of doing this has been thoroughly explored, and there is no practical way to do it without breaking compatibility.
2. If (1) is not possible, why can’t the new type be called DateTime2 and completely replace all the usage scenarios of the existing DateTime? The new type is less effective for scenarios where there is genuinely no time zone, such as dealing with DateTime instances from legacy systems, abstract instances, date-only instances and time-only instances. Having it support these scenarios as well and fully deprecating the existing DateTime has issues that may not be outweighed by the benefits.