You Can’t Always Get What You Want, You Get What You Need
I get a lot of feedback from customers on Assembly Resolution. It normally comes in the form of "I can't get the reference I want in my project". And more often than not, I find that what they want is really not what they want at all. Instead, a myriad of confusion and lore have clouded the way. This post will hopefully dispel the myths.
While we could start at the beginning, let's start at the end and work backwards toward a better understanding. And, we can start with our real expectation. We expect that when we write an app that references other assemblies, those assemblies can be found and are loaded as needed by our code, so that we can execute the code in them. That is pretty simple.
At runtime, our code is loaded into a process. At this point, all of the static references that we have made are resolved.
Most of this work is handled by the binder in the CLR. The binder (aka Fusion or CLR loader) cracks open the referencing assembly or EXE to obtain its manifest. The manifest contains a list of all the assemblies that are being referenced. For example, a 2.0 entry might have a reference like:
.assembly extern System
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89 ) // .z\V.4..
  .ver 2:0:0:0
}
The entry above is one I pulled out of a random assembly on my disk, and references the System assembly. To get this, I used ILDASM.exe to open the assembly and double-clicked on the manifest.
It specifies the public key token and the specific version that needs to be loaded. The binder uses this information to then locate the assembly and bind to it. (The algorithm for binding is documented here.) Since the manifest explicitly calls out each assembly it references, all that needs to be done is to find it and bind to it. And the binder does this work for us.
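To make the manifest entry a little less opaque, here is a minimal sketch (plain Python, used only for illustration and not part of any .NET tooling) that turns the token bytes printed by ILDASM into the compact lowercase form used in assembly display names, and reproduces the ASCII rendering seen in the trailing comment. The helper names are made up for this example.

```python
# Token bytes exactly as they appear in the manifest entry above.
MANIFEST_TOKEN = "B7 7A 5C 56 19 34 E0 89"


def token_to_display_form(hex_bytes: str) -> str:
    """Return the lowercase, unspaced form used in assembly display names."""
    return bytes.fromhex(hex_bytes).hex()


def ascii_comment(hex_bytes: str) -> str:
    """Render each byte as printable ASCII, or '.' when it is not printable."""
    return "".join(
        chr(b) if 0x20 <= b < 0x7F else "." for b in bytes.fromhex(hex_bytes)
    )


print(token_to_display_form(MANIFEST_TOKEN))  # b77a5c561934e089
print(ascii_comment(MANIFEST_TOKEN))          # .z\V.4..
```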
But, there are times when you may need to load an assembly on the fly. Dynamic assembly binding can be done through Assembly.Load(…) and Assembly.LoadFrom(…). Suzanne Cook's old blog has a lot of good notes on this. To call them, you need to know the assembly you want. The locating and binding, just like the static case, is handled for you.
The important thing to note at this point is that we already know the assembly that we want to bind to, because it was stored in the assembly's manifest.
Note: If you have built your assembly, and you are having trouble getting the assembly to load, you can use FusLogVw.exe to debug the issue. (FusLogVw docs are here) These are issues with the binder probing for the assembly and not finding it.
So, how do we just know the assembly that we want? Well, obviously this starts with a need for some library or piece of code that we want to run. (Duh!) But, how does the manifest get its list of assemblies?
The answer is that assemblies are resolved at build time through a process called RAR (Resolve Assembly References). Other references, such as COM references and the like, are resolved through similar mechanisms such as RCR (Resolve COM References). The resolution process takes the name of the library that you requested and finds the reference that you need.
Why does RAR need to do anything? I just told you what I want! To discuss why this is, let's first talk about multi-targeting, and then it should become clear.
Targeting is the term we use to describe the fact that your software will run on some specific device or machine. Now, a long time ago, there may have been only one version of a Framework, or one version of DOS, Windows, etc. that you might want to run your application on. And generally, each version of Windows tries to be backwards compatible with the last. So, unless you are using some new feature, you are good to go. But, how do you know? What machines will your software run on? If there is only one target, then "no problem", everything you can code should just work. This was true at the release of .NET v2.0 and Visual Studio 2005.
Note that the targeting problem is to make sure that, when you run the software on a target machine:
- The software was built for that target environment
- The code that you are using is available on the target
And, to simplify things, we are not including the architecture (x86, x64, etc.) of the machine that you are targeting for this discussion. We are really only considering the target runtime environment.
With the release of Visual Studio 2008 and .NET Framework v3.5, there were suddenly two (2) frameworks that you could choose from. There was the .NET Framework v2.0 and the new .NET Framework v3.5. Note that both of these frameworks actually run on a single CLR (Common Language Runtime); the v2.0 CLR to be specific. The difference between these two frameworks is basically the new set of functionality delivered in v3.5, which included the new LINQ functionality and the Entity Framework. As a result, Visual Studio 2008 introduced a new concept that we now call "pseudo multi-targeting" which allowed you to specify that you wanted to target v2.0 or v3.5.
At compile time, if you had selected to target v2.0 and you attempted to use something that was only available in v3.5, you would get a compiler error. This allowed you to specifically target the 2.0 framework and be sure that you would not load up your app and find that the machine you were running on did not have the required components to run your application.
For assembly resolution, this means that a company might provide two different components in their SDK. One of those components might be for v2.0, and the other might use new features in v3.5, and as a result have a dependency on the v3.5 framework.
In Visual Studio 2010, things got quite a bit more complex. With this release, the new .NET Framework v4.0 was released. And with it came a new CLR v4.0. This means that the CLR that the design time environment is running on may not be the same as the actual runtime environment. This is where "real" multi-targeting kicked in.
With Visual Studio 2010, you can target .NET Framework v2.0. If you build your app and run it, it will always run on a machine that has .NET v2.0 installed on it. If you target v4.0, your app will have the 4.0 CLR reference, and hence will not run on v2.0. But, you will be able to use all the cool new features and changes in .NET v4.0 (for example, the new Task Parallel Library).
At this point, I think you pretty much get the idea. At build time, the process determines what you are targeting, takes the reference that you have requested, looks at the SDKs and components installed on your machine, and selectively chooses the right assembly to match your target environment. Then, after having figured out all the specific references that you need, we pass those to the compiler, which then makes sure that you don't write code against some feature that won't be available when you go to run your application.
The rules for how we do the actual resolution are very specific, and they are, to be fair, quite complicated. But, with Visual Studio, you don't need to worry about that. When you go to the Add Reference dialog, you select an assembly or project from one of the tabs. We then write down your choice in the project file. When we build the project, we select the references for you based on what is the best selection for your target. If you later change your mind and re-target your application to something else, there is no need to fix up any of the references. You rebuild the project, and we then select the right references for the new target, and voila.
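The idea that the project only records what you asked for, and that the concrete file is chosen again on every build, can be sketched with a toy model. This is emphatically not the real RAR algorithm: the SDK name, the install paths, and the "newest candidate the target can still run" rule are all assumptions made up for illustration.

```python
from typing import NamedTuple, Optional, Tuple


class Candidate(NamedTuple):
    path: str                  # hypothetical install location
    requires: Tuple[int, int]  # minimum framework version the assembly needs


# Hypothetical SDK that ships one build of its component per framework.
CANDIDATES = {
    "Contoso.Widgets": [
        Candidate(r"C:\SDKs\Contoso\v2.0\Contoso.Widgets.dll", (2, 0)),
        Candidate(r"C:\SDKs\Contoso\v3.5\Contoso.Widgets.dll", (3, 5)),
    ],
}


def resolve(short_name: str, target: Tuple[int, int]) -> Optional[Candidate]:
    """Pick the newest candidate whose requirement does not exceed the target."""
    usable = [c for c in CANDIDATES.get(short_name, []) if c.requires <= target]
    return max(usable, key=lambda c: c.requires, default=None)


# The "project file" stores only the short name; the target decides the rest.
for target in [(2, 0), (3, 5)]:
    print(target, resolve("Contoso.Widgets", target).path)
```

Re-targeting in this model changes only the `target` argument; the recorded reference stays the same, which is the point of the paragraph above.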
I hear this all the time, that "the Add Reference dialog is a Window to the GAC". It is not. The GAC is a runtime mechanism. However, we will look for items in the GAC for resolution as needed. In the Add Reference dialog, we very specifically show you the Framework assemblies for the target framework, and the matching registered SDK assemblies (as specified in AssemblyFoldersEx).
You can also add what we refer to as a "P2P" or Project to Project reference from the Add Reference dialog. This will get you a reference to another project in your build. We also use this in MSBuild to determine the dependencies so we know what to build first. (And I would not recommend the use of Solution Dependencies. But, that's another article.)
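To see why P2P references are enough to work out what to build first, here is a small sketch that orders a hypothetical set of projects so that each one is built after the projects it references. It uses Python's standard `graphlib` module (Python 3.9 or later) as a stand-in; it is not how MSBuild is implemented, and the project names are invented.

```python
from graphlib import TopologicalSorter

# Hypothetical solution: each project maps to the projects it references.
P2P_REFERENCES = {
    "App": {"Services", "Common"},
    "Services": {"Data", "Common"},
    "Data": {"Common"},
    "Common": set(),
}


def build_order(references):
    """Return an order in which every project comes after its references."""
    return list(TopologicalSorter(references).static_order())


print(build_order(P2P_REFERENCES))  # ['Common', 'Data', 'Services', 'App']
```

A circular set of references would make `static_order()` raise `graphlib.CycleError`, which corresponds to a build that has no valid order.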
If you literally have an assembly that you can't reference any other way, given to you by another developer for example, then you can put it in your project folder or in a sub-folder. (I don't recommend the "bin" folder, as I blow this directory away all the time, and then you won't have your reference; plus, if you put it in source control, it's a nightmare.) Next, use the Browse button in the Add Reference dialog to add the reference to it. This will use a HintPath to specify where the file is. Since it's not registered on the machine (that is, not part of an SDK and hence not registered appropriately in AssemblyFoldersEx), we really don't have any information to go on regarding multi-targeting. As a result, you are pretty much on your own if things go wrong at runtime. And the warnings and errors you may get at build time when things really go wrong are confusing at best if you are not sure what you are doing. Don't get me wrong, this should work just fine; it's just not the preferred way of doing things.
I get lots of feedback from customers about issues with referencing. Most of the time the issues I see are related to the fact that we did a poor job, in my mind, of explaining how this all works. As a result, there are a lot of folks out there attempting to force a specific version of a reference, using HintPath and other such nonsense to get their project to build. In reality, for referencing, "less is more". If you tell us you are targeting a specific framework, and you give us a reference like "System.Drawing", we will find the right "System.Drawing" to match your needs.
And actually, if you tell us you want to reference "System.Drawing, version=4.0.30319, Culture=neutral, PublicKeyToken=b77a5c561934e089" and you are targeting .NET v2.0, we will politely give you a warning, and give you the one that goes with the v2.0 framework. In this case we know the right version because we know all about the frameworks and which assemblies belong to each framework target. For third party assemblies supplied in SDKs, we only have limited information. There, either you will get an error when you compile your code, or worse, your application will fall over when you try to run it in the target environment.
So, leave the resolution to us. Let Visual Studio use the short name for an assembly. Don't use HintPath unless you have no other way to specify the reference, and realize that you're off the beaten path, and you are not protected by multi-targeting.
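The "warn, then give you the one that matches the target" behavior can be modeled roughly as follows. This is a simplified sketch under stated assumptions, not the real resolution code: the lookup table, the function names, and the wording of the warning are all invented for illustration.

```python
# Hypothetical table: the version of a framework assembly that goes with each target.
FRAMEWORK_ASSEMBLIES = {
    (2, 0): {"System.Drawing": "2.0.0.0"},
    (4, 0): {"System.Drawing": "4.0.0.0"},
}


def parse_display_name(name):
    """Split 'Name, Key=Value, ...' into the short name and a dict of lowercase keys."""
    short, *parts = [p.strip() for p in name.split(",")]
    attrs = {}
    for part in parts:
        key, _, value = part.partition("=")
        attrs[key.strip().lower()] = value.strip()
    return short, attrs


def resolve_framework_reference(reference, target):
    """Return (short_name, version, warning); warning is None when nothing was overridden."""
    short, attrs = parse_display_name(reference)
    resolved = FRAMEWORK_ASSEMBLIES[target][short]
    requested = attrs.get("version")
    warning = None
    if requested is not None and requested != resolved:
        warning = f"{short}: requested version {requested}, using {resolved} for the target"
    return short, resolved, warning


full_name = ("System.Drawing, version=4.0.30319, Culture=neutral, "
             "PublicKeyToken=b77a5c561934e089")
print(resolve_framework_reference(full_name, (2, 0)))
print(resolve_framework_reference("System.Drawing", (2, 0)))
```

In this model the short name resolves cleanly for any target, while the fully qualified name only adds a warning, which is the "less is more" point made above.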