Access to modified closure in Resharper/C#

We all know that Resharper complains about modified closure when you write something like this:

 
        static void Main(string[] args)
        {
            Action action = () => Console.WriteLine(0);
            for (int i = 0; i < 10; i++)
            {
                if (i == 1)
                    action = () => Console.WriteLine(i);
            }

            action();
        }

But it stops, once you write it this way:

 
        static void Main(string[] args)
        {
            Action action = () => Console.WriteLine(0);
            for (int i = 0; i < 10; i++)
            {
                int copy = i;
                if (i == 1)
                    action = () => Console.WriteLine(copy);
            }

            action();
        }

The difference is that the result of the first code is 10, but the second is 1. We know that it somehow has to do with closures and variable binding, and we just should write it the second way. But what’s going on under the hood?
Let’s use Reflector to see:

 
    internal class Program
  {
       private static void Main(string[] args)
     {
           Action action = (<>c.<>9__0_0 != null) ?
                <>c.<>9__0_0 :
              (<>c.<>9__0_0 = new Action(this.<Main>b__0_0));
           <>c__DisplayClass0_0 class_ = new <>c__DisplayClass0_0();
           class_.i = 0;
           while (class_.i < 10)
            {
               if (class_.i == 1)
              {
                   action = new Action(class_.<Main>b__1);
               }
               int i = class_.i;
               class_.i = i + 1;
           }
           action();
       }

       [Serializable, CompilerGenerated]
       private sealed class <>c
      {
           public static readonly Program.<>c <>9 = new Program.<>c();
           public static Action <>9__0_0;

            internal void <Main>b__0_0()
          {
               Console.WriteLine(0);
           }
       }
   }

   [CompilerGenerated]
 private sealed class <>c__DisplayClass0_0
 {
       public int i;

       internal void <Main>b__1()
        {
           Console.WriteLine(this.i);
      }
   }

Now we can see, that once we used a free variable in a closure, it’s no longer free, and it has been bound since that place during the compilation. It’s even easier to think that the variable now belongs to the closure, and just happen to be used outside, not vice versa.
As the variable is now bound, we can see that it keeps incrementing it inside the closure as the loop goes on. Hence the result. So the closure gets created one time before the loop, and persists the whole way.

Now what about the second approach:

 
        private static void Main(string[] args)
        {
            Action action = (<>c.<>9__0_0 != null) ?
               <>c.<>9__0_0 :
              (<>c.<>9__0_0 = new Action(this.<Main>b__0_0));
            for (int i = 0; i < 10; i++)
            {
                <>c__DisplayClass0_0 class_ = new <>c__DisplayClass0_0();
                class_.copy = i;
                if (i == 1)
                {
                    action = new Action(class_.<Main>b__1);
                }
            }
            action();
        }

        [Serializable, CompilerGenerated]
        private sealed class <>c
        {
            public static readonly Program.<>c <>9 = new Program.<>c();
            public static Action <>9__0_0;

            internal void <Main>b__0_0()
            {
                Console.WriteLine(0);
            }
        }

Ok, now it’s completely different. The original index variable is now free, and the closure is constructed on each loop iteration, and the copy of the index is bound to the closure. So, in total, we create 10 closures, and collect 9 of them, and only one of them persists after the loop. So, in terms of garbage collection, it doesn’t look like a very efficient code, as it creates much more garbage.

What if we don’t bind the closure to anything, will we get the same number of collections? Let’s see:

 
 private static void Main(string[] args)
 {
       Action action = (<>c.<>9__0_0 != null) ?
            <>c.<>9__0_0 :
          (<>c.<>9__0_0 = new Action(this.<Main>b__0_0));
       for (int i = 0; i < 10; i++)
     {
           if (i == 1)
         {
               action = (<>c.<>9__0_1 != null) ?
                   <>c.<>9__0_1 :
                  (<>c.<>9__0_1 = new Action(this.<Main>b__0_1));
           }
       }
       action();
   }

And we can see that no, and actually no closure class will be constructed at all.

Let's get back and see if optimizer does anything with it:

 
        private static void Main(string[] args)
        {
            Action action = (<>c.<>9__0_0 != null) ?
                <>c.<>9__0_0 :
              (<>c.<>9__0_0 = new Action(this.<Main>b__0_0));
            for (int i = 0; i < 10; i++)
            {
                <>c__DisplayClass0_0 class_ = new <>c__DisplayClass0_0();
                class_.copy = i;
                if (i == 1)
                {
                    action = new Action(class_.<Main>b__1);
                }
            }
            action();
        }

Looks like not much. Let’s look at the IL (using dotPeek):

 
    // IL_0022: br.s         IL_0046
    // // start of loop, entry point: IL_0046
    // IL_0024: newobj       instance void ModifiedClosure1.Program/'<>c__DisplayClass0_0'::.ctor()
      // IL_0029: stloc.2      // 'cDisplayClass00 [Range(Instruction(IL_0029 stloc.2)-Instruction(IL_0035 ldloc.2))]'
      // IL_002a: ldloc.2      // 'cDisplayClass00 [Range(Instruction(IL_0029 stloc.2)-Instruction(IL_0035 ldloc.2))]'
      // IL_002b: ldloc.1      // 'index [Range(Instruction(IL_0021 stloc.1)-Instruction(IL_0046 ldloc.1))]'
      // IL_002c: stfld        int32 ModifiedClosure1.Program/'<>c__DisplayClass0_0'::copy
      // IL_0031: ldloc.1      // 'index [Range(Instruction(IL_0021 stloc.1)-Instruction(IL_0046 ldloc.1))]'
      // IL_0032: ldc.i4.1     
      // IL_0033: bne.un.s     IL_0042
      // IL_0035: ldloc.2      // 'cDisplayClass00 [Range(Instruction(IL_0029 stloc.2)-Instruction(IL_0035 ldloc.2))]'
      // IL_0036: ldftn        instance void ModifiedClosure1.Program/'<>c__DisplayClass0_0'::'<Main>b__1'()
      // IL_003c: newobj       instance void [mscorlib]System.Action::.ctor(object, native int)
      // IL_0041: stloc.0      // 'action [Range(Instruction(IL_001f stloc.0)-Instruction(IL_004b ldloc.0))]'
      // IL_0042: ldloc.1      // 'index [Range(Instruction(IL_0021 stloc.1)-Instruction(IL_0046 ldloc.1))]'
      // IL_0043: ldc.i4.1     
      // IL_0044: add          
      // IL_0045: stloc.1      // 'index [Range(Instruction(IL_0021 stloc.1)-Instruction(IL_0046 ldloc.1))]'
      // IL_0046: ldloc.1      // 'index [Range(Instruction(IL_0021 stloc.1)-Instruction(IL_0046 ldloc.1))]'
      // IL_0047: ldc.i4.s     10 // 0x0a
      // IL_0049: blt.s        IL_0024
      // // end of loop

So yeah, looks like it’s still inside the loop. And it makes sense, if the optimizer couldn’t prove that the object creation is free of any side effects.
C# currently has no way of specifying purity. It has a [Pure] attribute, but it’s not really enforced:

From https://msdn.microsoft.com/en-us/library/system.diagnostics.contracts.pureattribute.aspx:

Methods and types that are marked with this attribute can be used in calls to System.Diagnostics.Contracts.Contract methods. Pure methods do not make any visible state changes. This attribute is not enforced by the current analysis tools; you should use this attribute only if you are sure that the methods are pure.

Getting back to the original problem, however, you have to be careful, as Resharper won't catch everything. Let's see what happens if we use reference type as an "index":

 
    class AdvancedInt
    {
        public int i = 0;
    }

    class Program
    {
        static void Main(string[] args)
        {
            Action action = () => Console.WriteLine(0);
            AdvancedInt i = new AdvancedInt();
            while (i.i < 10)
            {
                AdvancedInt copy = i;
                if (i.i == 1)
                    action = () => Console.WriteLine(copy.i);

                ++i.i;
            }

            action();
        }
    }

The result will be 10 again, and Resharper won't complain. It's easy to see why, because the copy is only a reference. We don't even need Reflector to see this.

So we can conclude several facts:

  • closure variables are bound at their first use, and closure class is created at that point
  • using closures with bound variables inside long loops creates lots of garbage collection
  • don't rely on Resharper catching all closure binding errors