Short notes on the Rotor JIT

I wrote some notes on the Rotor JIT before I started at Microsoft. My intention was to play around and figure out how to replace or extend it, perhaps giving Rotor the ability to spit out code that targets specific processor extensions (MMX, 3DNow!, etc.). I never did get around to finishing the work.

What kind of cool JIT related stuff would you guys like to see? 

Dumping the native code the JIT emits:

  • Set the COMPlus_JitHalt environment variable to the name of the method whose JIT output you wish to see

  • cdb clix.exe ProgramName.exe
    g
    !sos.u <value of EIP>

  • You may also want to view the runtime stack:

    0:000> dds esp

Tracing the compilation of IL to native code:

With the following steps, you can see the JIT emitting the x86 on the fly into memory.

  • Start up a trusty VS.NET debugging session with clix.exe ProgramName.exe (you can do this by typing: devenv clix.exe programname.exe, then right-clicking clix and choosing Step Into)
  • Set a breakpoint at the following places:
    - fjit.cpp[2607] -> FJitResult FJit::jitCompile(...)
    - fjit.cpp[2958] -> switch (opcode)
  • Add a watch to the following variables:
    - codeBuffer
    - opcode
    - szDebugClassName
  • Run until you hit your class name in szDebugClassName, then step into some of the macro emissions for the IL opcode it's switching on. All the native code output goes straight into codeBuffer, so you can compare it with the disassembled output (as above).

Macro emission:

  • fjit.cpp calls abstracted processor functions which are #defined away in the relevant processor-specific fjit file. In this case: x86fjit.h. In turn, more stuff is #defined away and used in x86def.h

  • x86def.h is where the fun stuff happens (x86 native emission). These are the defines which the JIT uses to output the target processor's native code.

    For an example of how the flow works, we'll look at emitting the method return.

    fjit.cpp
    emit_return(0, mapInfo.hasRetBuff);

    x86fjit.h
    #define emit_return(argsSize,hasRetBuff) x86_emit_return(argsSize)

    x86fjit.h
    #define x86_emit_return(argsSize) \
        x86_mov_reg(x86DirTo, x86Big, x86_mod_ind_disp(X86_ESI, X86_EBP, 0-sizeof(void *))); \
        x86_mov_reg(x86DirTo, x86Big, x86_mod_reg(X86_ESP,X86_EBP)); \
        x86_pop(X86_EBP); \
        x86_ret(argsSize)

    x86def.h
    #define x86_mov_reg(dir, size, addMode) \
        (/*_ASSERTE(size == x86Byte | size == x86Big),*/ \
        cmdBlock2(cmdByte(expNum(0x88 | dir | size)),addMode))

  • The above shows a basic flow of how the code is #defined away from abstract functions to processor specific definitions. Remember -> Most x86def.h functions will eventually emit native processor codes into memory. These codes are the ``0x88'' (and similar) hex codes we see scattered around in functions like expNum(..).
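
    To make the layering concrete, here is a minimal standalone sketch of the same idea. It is not the Rotor source: the toy_ macros, the toyEmitByte helper, the enum values and the global toyCodeBuffer are all made up for illustration, chosen so that the bytes come out the same as the emit_return output. The ModRM helpers only cover the two addressing forms the return sequence needs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the Rotor macros. Every name and constant
// below is an assumption made for this sketch, not the real x86def.h.
static std::vector<uint8_t> toyCodeBuffer;

enum { TOY_ESP = 4, TOY_EBP = 5, TOY_ESI = 6 };  // x86 register numbers
enum { toyDirTo = 2 };  // direction bit: the register is the destination
enum { toyBig = 1 };    // size bit: full 32-bit operand

// Lowest layer: append one raw byte to the code buffer.
inline void toyEmitByte(unsigned b) {
    assert(b <= 0xFF);
    toyCodeBuffer.push_back(static_cast<uint8_t>(b));
}

// ModRM helpers. Only two forms are covered:
//   mod=11  register direct
//   mod=01  [base + disp8] (base must not be ESP, which would need a SIB byte)
#define toy_mod_reg(reg, rm) \
    toyEmitByte(0xC0 | ((reg) << 3) | (rm))
#define toy_mod_ind_disp8(reg, base, disp) \
    (toyEmitByte(0x40 | ((reg) << 3) | (base)), toyEmitByte((disp) & 0xFF))

// Middle layer: one macro per instruction form.
#define toy_mov_reg(dir, size, addrMode) \
    (toyEmitByte(0x88 | (dir) | (size)), addrMode)
#define toy_pop(reg) toyEmitByte(0x58 + (reg))
#define toy_ret()    toyEmitByte(0xC3)

// Top layer: the abstract operation the JIT front end asks for.
#define toy_emit_return() \
    do { \
        toy_mov_reg(toyDirTo, toyBig, toy_mod_ind_disp8(TOY_ESI, TOY_EBP, -4)); \
        toy_mov_reg(toyDirTo, toyBig, toy_mod_reg(TOY_ESP, TOY_EBP)); \
        toy_pop(TOY_EBP); \
        toy_ret(); \
    } while (0)

// Emits the return sequence into a fresh buffer and hands back the bytes.
std::vector<uint8_t> toyEmitReturnBytes() {
    toyCodeBuffer.clear();
    toy_emit_return();
    return toyCodeBuffer;
}
```

    Calling toyEmitReturnBytes() gives back seven bytes, one mov with a displacement, one register-to-register mov, a pop and a ret. The point is only that each layer is a thin textual substitution over the one below it, and the bottom layer is nothing more than "append this byte".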

Mapping the native Jit output to processor hex codes:

  • ntsd and cdb do a great job of that for us. Just ``!sos.u'' away the method and we come across both the human-readable and the hex formats. In the case of emit_return the output looks like this:

    8b 75 fc 8b e5 5d c3    (we can see this in codeBuffer)

    (ntsd.exe output)
    8b75fc          mov     esi,[ebp-0x4]
    8be5            mov     esp,ebp
    5d              pop     ebp
    c3              ret

  • Just for completeness -> the c3 ret instruction is emitted under #define x86_ret(bytes)

    cmdByte(expNum(0xC3))
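
    If you want to read the bytes by hand rather than trust the debugger, the second byte of each mov is a standard x86 ModRM byte: two bits of mod, three bits of reg, three bits of rm. A small sketch of a decoder follows. The names ModRM, decodeModRM and reg32Name are made up for this example and do not exist in Rotor.

```cpp
#include <cassert>
#include <cstdint>

// The three fields packed into an x86 ModRM byte.
struct ModRM {
    unsigned mod;  // 2 bits: addressing form (1 = [rm + disp8], 3 = register direct)
    unsigned reg;  // 3 bits: register operand
    unsigned rm;   // 3 bits: register or base of the memory operand
};

// Split a ModRM byte into its fields.
inline ModRM decodeModRM(uint8_t b) {
    return ModRM{ (b >> 6) & 0x3u, (b >> 3) & 0x7u, b & 0x7u };
}

// 32-bit register names in x86 encoding order.
inline const char* reg32Name(unsigned n) {
    static const char* const names[8] = {
        "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
    };
    assert(n < 8);
    return names[n];
}
```

    Feeding it 0x75 gives mod=1, reg=6 (esi), rm=5 (ebp), that is esi with [ebp+disp8], and the following fc is the displacement -4. Feeding it 0xe5 gives mod=3, reg=4 (esp), rm=5 (ebp), which is the register-to-register mov esp,ebp.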

General flow:

  • The overall picture is a fairly simple one -> Call jitCompile on all methods; emit processor-specific code, switching on the IL opcode; once we've hit the ``ret'' IL opcode, bail out and update the MethodTable with the offset of the native code buffer we've just used for native code emission.

    FJitResult FJit::jitCompile(BYTE ** ReturnAddress, unsigned ReturncodeSize)

    while (!FinishedJitting)
    {
        ...
        switch (opcode)
        {
            // emit native code into codeBuffer (outPtr)
            case CEE_RET:
                // handled in FJitResult FJit::compileCEE_RET()
                FinishedJitting = true;
        }
    }
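
    The same loop can be sketched as a tiny standalone function. This is a toy, not FJit: toyJitCompile and the TOY_ constants are invented for the example, it understands only two IL opcodes (nop = 0x00 and ret = 0x2A, which are the real IL encodings), and it skips the prolog, the stack tracking and the MethodTable update entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// IL opcode encodings for nop and ret; the TOY_ names are made up.
const uint8_t TOY_CEE_NOP = 0x00;
const uint8_t TOY_CEE_RET = 0x2A;

// Walks the IL stream and appends x86 bytes for each opcode until it
// sees ret. Returns an empty vector if it meets an opcode this sketch
// does not handle, or runs out of IL before reaching ret.
std::vector<uint8_t> toyJitCompile(const std::vector<uint8_t>& il) {
    std::vector<uint8_t> codeBuffer;
    bool finishedJitting = false;
    std::size_t ip = 0;

    while (!finishedJitting) {
        if (ip >= il.size()) {
            return {};  // ran off the end without seeing ret
        }
        uint8_t opcode = il[ip++];
        switch (opcode) {
        case TOY_CEE_NOP:
            codeBuffer.push_back(0x90);  // x86 nop
            break;
        case TOY_CEE_RET: {
            // mov esi,[ebp-4]; mov esp,ebp; pop ebp; ret
            static const uint8_t epilogue[] = {
                0x8B, 0x75, 0xFC, 0x8B, 0xE5, 0x5D, 0xC3
            };
            codeBuffer.insert(codeBuffer.end(), epilogue,
                              epilogue + sizeof(epilogue));
            finishedJitting = true;
            break;
        }
        default:
            return {};  // opcode not handled by this sketch
        }
    }
    assert(!codeBuffer.empty());  // the ret epilogue was emitted
    return codeBuffer;
}
```

    Passing it the two-byte IL stream 0x00 0x2A yields one nop followed by the seven-byte return sequence shown earlier. A real JIT has one case per IL opcode and a lot more bookkeeping, but the shape (fetch opcode, switch, emit, stop on ret) is the same as in the pseudocode above.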