XSLTC — Compile XSLT to .NET Assembly
In my two previous posts I described a potential performance hit caused by XSLT-to-MSIL compilation and JIT-compilation when you load and run some XSLT stylesheet with the
XslCompiledTransform engine for the first time. Since the .NET Framework 2.0 did not allow you to save compiled stylesheets, you had to pay the compilation price on each application run.
XSLT Compiler Utility
The good news is we are providing the XSLT Compiler command-line utility
xsltc.exe (announced here) that can be used to compile multiple stylesheets into one assembly. The changes to the System.Xml assembly required for this utility to work are shipped with .NET Framework 2.0 Service Pack 1, and the utility itself is shipped with Windows SDK 6.0, which absorbs the .NET Framework SDK. Both of these components will be installed by Visual Studio 2008. Below is the usage screen of xsltc.exe:
C:\>xsltc.exe /?
Microsoft (R) XSLT Compiler version 3.5
[Microsoft (R) .NET Framework version 2.0.50727]
Copyright (C) Microsoft Corporation. All rights reserved.

xsltc [options] [/class:<name>] <source file> [[/class:<name>] <source file>...]

                      XSLT Compiler Options

                        - OUTPUT FILES -
/out:<file>           Specify name of binary output file (default: name of the first file)
/platform:<string>    Limit which platforms this code can run on: x86, Itanium, x64,
                      or anycpu, which is the default

                        - CODE GENERATION -
/class:<name>         Specify name of the class for compiled stylesheet (short form: /c)
/debug[+|-]           Emit debugging information
/settings:<list>      Specify security settings in the format (dtd|document|script)[+|-],...
                      Dtd enables DTDs in stylesheets, document enables document() function,
                      script enables <msxsl:script> element

                        - MISCELLANEOUS -
@<file>               Insert command-line settings from a text file
/help                 Display this usage message (short form: /?)
/nologo               Suppress compiler copyright message
The most useful options are /class and
/out. If you have not specified the class name for some stylesheet, it defaults to the name of the file containing that stylesheet, without the extension. The
/debug option disables practically all optimizations (beware of performance degradation!) and creates a PDB file for the output assembly, which allows you to debug stylesheets with a debugger. For security reasons, DTDs in stylesheets, the
document XSLT function, and
msxsl:script elements are disabled by default; you have to explicitly enable them using the
/settings option if required. Each stylesheet is compiled into an abstract class, which can be loaded later by a new XslCompiledTransform.Load overload:
public void Load(Type compiledStylesheet);
Compiling stylesheets into an assembly both simplifies the deployment (you don't have to deploy multiple stylesheet files) and eliminates XSLT-to-MSIL compilation time. Moreover, you may also eliminate JIT-compilation time by installing the resulting assembly in the native image cache.
How to Use It
C:\docbook-xsl-1.72.0>xsltc /settings:dtd+,document+ /class:DocBookToHtml html\docbook.xsl /class:DocBookToFO fo\docbook.xsl
If you run the ILDASM tool on the resulting
docbook.dll assembly, you will see two classes,
DocBookToHtml and DocBookToFO, generated for the stylesheets specified on the command line, along with two helper
$ArrayType$... classes used internally to initialize XSLT engine runtime tables:
Assembly with compiled DocBook stylesheets
To use compiled stylesheets from your favorite .NET language, you need to add a reference to
docbook.dll to your project, and pass the desired class to the
XslCompiledTransform.Load method. After that you may call
Transform methods on the loaded
XslCompiledTransform object the usual way:
XslCompiledTransform stylesheet = new XslCompiledTransform();
stylesheet.Load(typeof(DocBookToHtml));
stylesheet.Transform("input.xml", "output.html");
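If you would rather not add a compile-time reference to docbook.dll, the same Load(Type) overload can be fed a type obtained through reflection. The following is only a sketch: it assumes docbook.dll sits next to the application executable and that the class was compiled with /class:DocBookToHtml, as in the command line above.

```csharp
using System;
using System.Reflection;
using System.Xml.Xsl;

static class LateBoundLoad
{
    static void Main()
    {
        // Assumption: docbook.dll is in the application base directory.
        Assembly compiled = Assembly.Load("docbook");

        // The type name is whatever was passed to /class; throwOnError = true
        // turns a misspelled name into an exception instead of a null reference.
        Type stylesheetType = compiled.GetType("DocBookToHtml", true);

        XslCompiledTransform stylesheet = new XslCompiledTransform();
        stylesheet.Load(stylesheetType);
        stylesheet.Transform("input.xml", "output.html");
    }
}
```

This keeps the choice of stylesheet class a run-time decision, at the cost of losing compile-time checking of the class name.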
To improve startup time you may choose to "pre-JIT" the assembly, installing a native image for it in the native image cache. However, before that you probably want to change the preferred base address of the assembly to avoid rebasing (I recommend reading Improving Application Startup Time and NGen Revs Up Your Performance with Powerful New Features articles). The
xsltc.exe utility does not support the
/baseaddress option, but you may use either
editbin.exe tool, both of which come with Visual Studio®:
C:\docbook-xsl-1.72.0>editbin.exe /rebase:base=0x60000000 docbook.dll /nologo

C:\docbook-xsl-1.72.0>ngen install docbook.dll /nologo
Installing assembly C:\docbook-xsl-1.72.0\docbook.dll
Compiling 1 assembly:
    Compiling assembly C:\docbook-xsl-1.72.0\docbook.dll ...
docbook, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null
You may ask why we decided to compile stylesheets to abstract classes instead of implementing some common interface similar to
IXmlTransform from the Mvp.Xml project. There were two main reasons. First,
System.Xml is a "red" assembly, and changes in the red bits have been greatly limited in Orcas. We tried to make public API changes as minimal as possible. Second, implementing XSLT 2.0 in the next release of the .NET Framework will probably require us to change the interface anyway.
If the stylesheet contains
msxsl:script elements, their content is compiled to one or more separate assemblies using the CodeDOM technology. Since the CodeDOM does not allow having code snippets in different languages in a single assembly, one script assembly per script language is created. Suppose, for example, that the stylesheet
MyTransform.xsl contains C# and Visual Basic .NET script blocks. When you compile it, three assemblies will be created:
MyTransform.dll, containing compiled XSLT code,
MyTransform.Script.cs.dll, containing compiled C# script blocks, and
MyTransform.Script.vb.dll, containing compiled Visual Basic .NET script blocks. You may merge script assemblies with the XSLT assembly using the ILMerge utility:
C:\MyTransform>ILMerge /out:MyTransform.dll MyTransform.dll MyTransform.Script.cs.dll MyTransform.Script.vb.dll
xsltc.exe does not allow you to embed XML files as resources. Why might you need that? Suppose that the stylesheet
C:\MyTransform\MyTransform.xsl contains a relative document reference such as
document('config.xml'). If you compile it and deploy it to another machine, it will try to read the
C:\MyTransform\config.xml file, which will result in an error unless you deploy
config.xml in the same folder as on the build machine. You may think that relative document references should be resolved relative to the location of the compiled XSLT assembly, or that all documents referenced with relative URIs should be embedded in the assembly, but there are always cases when you need a different behavior. Fortunately, this problem may be resolved by modifying
xsltc.exe to use a custom
XmlResolver; I may write on this later.
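Until then, one possible workaround on the consuming side (not the xsltc.exe modification mentioned above) is to pass your own resolver to Transform. The sketch below is untested against real deployments and rests on several assumptions: DeploymentDirResolver is a hypothetical class name, MyTransform is assumed to be the class generated from MyTransform.xsl with /settings:document+, and the engine is assumed to route document() lookups through the resolver's ResolveUri method.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical resolver: resolves relative URIs against a fixed directory
// instead of the base URI that was recorded when the stylesheet was compiled.
sealed class DeploymentDirResolver : XmlUrlResolver
{
    private readonly Uri baseDir;

    public DeploymentDirResolver(string directory)
    {
        // The trailing separator makes the Uri denote a directory, so relative
        // names are appended to it rather than replacing its last segment.
        baseDir = new Uri(Path.GetFullPath(directory) + Path.DirectorySeparatorChar);
    }

    public override Uri ResolveUri(Uri baseUri, string relativeUri)
    {
        // Empty references (for example document('')) keep the default behavior.
        if (string.IsNullOrEmpty(relativeUri))
            return base.ResolveUri(baseUri, relativeUri);

        Uri absolute;
        if (Uri.TryCreate(relativeUri, UriKind.Absolute, out absolute))
            return absolute;                  // already absolute: leave it alone

        return new Uri(baseDir, relativeUri); // relative: ignore the compile-time base
    }
}

static class Program
{
    static void Main()
    {
        // Assumption: MyTransform comes from a referenced, xsltc-generated assembly.
        string deployDir = Path.GetDirectoryName(typeof(MyTransform).Assembly.Location);

        XslCompiledTransform xslt = new XslCompiledTransform();
        xslt.Load(typeof(MyTransform));

        using (XmlReader input = XmlReader.Create("input.xml"))
        using (XmlWriter output = XmlWriter.Create("output.xml", xslt.OutputSettings))
        {
            xslt.Transform(input, null, output, new DeploymentDirResolver(deployDir));
        }
    }
}
```

Note that such a resolver affects every relative URI resolved during the transformation, not just config.xml, so it only fits stylesheets whose external documents all live beside the compiled assembly.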
Another limitation is that while
XslCompiledTransform compiles a stylesheet to a set of
DynamicMethods that can be unloaded (reclaimed by the garbage collector), an assembly generated by
xsltc.exe cannot be unloaded until you shut down all
AppDomains that used it (an infamous CLR limitation). This should not be a problem if you have a small set of fixed stylesheets, but it becomes a real issue in server scenarios when thousands of stylesheets are generated dynamically based on user settings and customizations. We are actively investigating possible solutions for server scenarios, which do not require complicated
Under the Hood
Under the hood,
xsltc.exe is a wrapper around the new
XslCompiledTransform.CompileToType static method. You don't need to know about it unless you are developing your own version of the XSLT compiler. We expect that very few people will ever need to call this low-level method directly, as most will use
xsltc.exe and optionally do some post-processing with other command-line utilities. However, for the sake of completeness, here is its brief description. (WARNING: The signature of the
CompileToType method in beta releases of .NET Framework 2.0 SP1 may differ from the one given below.)
// Compiles an XSLT stylesheet to a System.Type
public static CompilerErrorCollection CompileToType(
    XmlReader stylesheet,
    XsltSettings settings,
    XmlResolver stylesheetResolver,
    bool debug,
    TypeBuilder typeBuilder,
    string scriptAssemblyPath);
stylesheet: An XmlReader positioned at the beginning of the stylesheet.
settings: The XsltSettings to apply to the stylesheet. If this is null, XsltSettings.Default settings are applied.
stylesheetResolver: The XmlResolver used to resolve any stylesheet modules referenced in xsl:import and xsl:include elements. If this is null, external resources are not resolved.
debug: true to compile in debug mode; otherwise false. Setting this to true enables debugging the stylesheet with a debugger.
typeBuilder: The TypeBuilder to use for the stylesheet compilation.
scriptAssemblyPath: The base path for the assemblies generated for msxsl:script elements. If only one script assembly is generated, this parameter specifies the path for that assembly. In case of multiple script assemblies, a distinctive suffix will be appended to the file name to ensure uniqueness of assembly names.
Return value: A CompilerErrorCollection object containing compiler errors and warnings that indicate the results of the compilation.
Note that the first three parameters are the same as in the
XslCompiledTransform.Load method. The
xsltc.exe utility creates an
AssemblyBuilder and a
ModuleBuilder, then for each stylesheet specified on the command line creates a
TypeBuilder, and compiles the stylesheet into it using the
CompileToType method. Compiler errors and warnings returned from the
CompileToType method are output to the console. If all stylesheets have been compiled successfully, the dynamic assembly is saved to disk. If you are new to Reflection.Emit, you may find this dynamic assembly sample code useful.
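The sequence described above can be sketched roughly as follows. This is not the actual xsltc.exe source, only a minimal single-stylesheet illustration targeting the .NET Framework. MiniXsltc is a made-up name, the type attributes are my assumption, scripts are left disabled (so scriptAssemblyPath can be null), and the stylesheet is assumed to contain no DTD.

```csharp
using System;
using System.CodeDom.Compiler;
using System.Reflection;
using System.Reflection.Emit;
using System.Xml;
using System.Xml.Xsl;

static class MiniXsltc
{
    // Usage (hypothetical): MiniXsltc.exe MyTransform.xsl MyTransform
    static int Main(string[] args)
    {
        string stylesheetPath = args[0];
        string className = args[1];
        string assemblyFile = className + ".dll";

        // A dynamic assembly that can only be saved to disk, not run in place.
        AssemblyBuilder assembly = AppDomain.CurrentDomain.DefineDynamicAssembly(
            new AssemblyName(className), AssemblyBuilderAccess.Save);
        ModuleBuilder module = assembly.DefineDynamicModule(assemblyFile, assemblyFile);

        // Assumption: a public abstract sealed class, one per stylesheet.
        TypeBuilder type = module.DefineType(
            className,
            TypeAttributes.Public | TypeAttributes.Abstract | TypeAttributes.Sealed);

        CompilerErrorCollection errors;
        using (XmlReader reader = XmlReader.Create(stylesheetPath))
        {
            errors = XslCompiledTransform.CompileToType(
                reader,
                XsltSettings.Default,   // DTDs aside, document() and scripts stay disabled
                new XmlUrlResolver(),   // resolves xsl:import and xsl:include
                false,                  // no debug information
                type,
                null);                  // no script assemblies expected
        }

        // Report errors and warnings to the console.
        foreach (CompilerError error in errors)
            Console.Error.WriteLine(error);

        if (errors.HasErrors)
            return 1;

        // Written to the current directory.
        assembly.Save(assemblyFile);
        return 0;
    }
}
```

A fuller tool would loop over several stylesheets, creating one TypeBuilder per stylesheet inside the same module before saving the assembly once at the end.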
The xsltc.exe utility allows you to precompile XSLT stylesheets so that your application will not incur the performance penalty of XSLT-to-MSIL and JIT-compilation on the first stylesheet execution. It also makes deployment of complex XSLT solutions, consisting of dozens of files, less cumbersome and protects your source XSLT code. Multiple stylesheets may be compiled into a single assembly, and the resulting assembly may be merged with the main DLL or EXE file of your application using the ILMerge utility.