Thursday, July 06, 2006 10:15 PM
bart
C# 2.0 Iterators
Introduction
In my post about LINQ a couple of days ago, I promised to do a deep-dive post on iterators in C# 2.0 (and later). Promise kept, here it is.
So, what’s in a name? Iterators are defined in the C# 2.0 specification, section 22. From the spec we learn the following:
An iterator block is a block (§8.2) that yields an ordered sequence of values. An iterator block is distinguished from a normal statement block by the presence of one or more yield statements.
- The yield return statement produces the next value of the iteration.
- The yield break statement indicates that the iteration is complete.
An iterator block may be used as a method-body, operator-body or accessor-body as long as the return type of the corresponding function member is one of the enumerator interfaces (§22.1.1) or one of the enumerable interfaces (§22.1.2).
A few keywords have been marked in bold. We’ll focus on each of these individually in a minute. But let’s make this spec definition concrete with a small example:
using System;
using System.Collections.Generic;
class Test
{
    public static void Main()
    {
        foreach (string s in GetItems())
            Console.WriteLine(s);
    }
    private static IEnumerable<string> GetItems()
    {
        yield return "Hello yield 1";
        yield return "Hello yield 2";
        yield return "Hello yield 3";
        yield return "Hello yield 4";
        yield return "Hello yield 5";
    }
}
In here, the iterator is the method GetItems. Two elements indicate this:
- The presence of the yield keyword in the method body.
- The use of an IEnumerable<T> return type.
But hang on, where is the object, of a type that implements IEnumerable<T>, that gets returned? Enter the powerful world of iterators!
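One way to convince yourself that a real object comes back is to ask it for its runtime type. The following is only a minimal sketch: the class IteratorPeek and its helper methods are names I made up for illustration, and the exact name of the generated type depends on the compiler, so no output is shown.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, not part of the original sample.
static class IteratorPeek
{
    static IEnumerable<string> GetItems()
    {
        yield return "Hello yield 1";
        yield return "Hello yield 2";
    }

    // Name of the compiler-generated class that backs the iterator.
    public static string GeneratedTypeName()
    {
        return GetItems().GetType().Name;
    }

    // True when the iterator call handed back an object implementing the interface.
    public static bool IsEnumerable()
    {
        object o = GetItems();
        return o is IEnumerable<string>;
    }
}
```

Printing IteratorPeek.GeneratedTypeName() from a Main method should show a name you never wrote yourself, which is exactly the type we’ll dissect further on.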
Exercise
What does the following code fragment put on the screen? Think about it for a while and move on to the next section.
using System;
using System.Collections.Generic;
class Test
{
    public static void Main()
    {
        foreach (int i in EvenNumbers())
            Console.WriteLine(i);
    }
    public static IEnumerable<int> EvenNumbers()
    {
        for (int i = 0; true; i += 2)
            yield return i;
    }
}
It's all about laziness
I’d like to summarize iterators with one simple statement: “iterators are sequence generators”. As such, an iterator is lazy and just sits there idle until a consumer asks it for the next element of the sequence. The sample above (exercise) shows this. The method EvenNumbers is a generator for even numbers, that’s clear. At first glance it might look like a method that never stops executing because of the endless for loop (which I made explicit by means of the true condition; there are of course dirtier ways of creating endless loops). But what’s really going on?
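The laziness is easy to observe by logging from both sides. This is a minimal sketch, assuming a hypothetical LazinessDemo class with a shared Log list (neither is part of the original sample); the iterator body is the same endless loop as in the exercise.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class: records the order in which producer and consumer run.
static class LazinessDemo
{
    public static readonly List<string> Log = new List<string>();

    static IEnumerable<int> EvenNumbers()
    {
        Log.Add("iterator body started");
        for (int i = 0; true; i += 2)
        {
            Log.Add("producing " + i);
            yield return i;
        }
    }

    public static void Run()
    {
        Log.Clear();
        IEnumerable<int> numbers = EvenNumbers();   // no body code runs here
        Log.Add("got the sequence object");

        foreach (int n in numbers)
        {
            Log.Add("consumed " + n);
            if (n >= 2)
                break;                              // the consumer decides when to stop
        }
    }
}
```

After Run, the log reads: got the sequence object, iterator body started, producing 0, consumed 0, producing 2, consumed 2. The body only starts on the first request for an element, and the endless loop is harmless because nobody asks for a third one.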
1. Dissecting the foreach construction
Let’s start at the consumer side:
foreach (int i in EvenNumbers())
    Console.WriteLine(i);
As a matter of fact, the foreach loop construction is built around the IEnumerable and IEnumerable<T> interfaces (C# spec, sections 8.8.4 (non-generic) and 20.8.10 (generic)). The spec states that the foreach statement shown above is equivalent to the following (cf. 20.8.10):
IEnumerator<int> enumerator = ((IEnumerable<int>)(EvenNumbers())).GetEnumerator();
try
{
    while (enumerator.MoveNext())
    {
        int element = (int)enumerator.Current; //notice the cast isn’t required
        Console.WriteLine(element);
    }
}
finally
{
    enumerator.Dispose(); //see note below on the absence of a null-check
}
Of course you can validate this by using the much beloved ildasm tool (source compiled with /o flag):
.method public hidebysig static void Main() cil managed
{
.entrypoint
.locals init (int32 V_0,
class [mscorlib]System.Collections.Generic.IEnumerator`1<int32> V_1)
IL_0000: call class [mscorlib]System.Collections.Generic.IEnumerable`1<int32> Test::EvenNumbers()
IL_0005: callvirt instance class [mscorlib]System.Collections.Generic.IEnumerator`1<!0> class [mscorlib]System.Collections.Generic.IEnumerable`1<int32>::GetEnumerator()
IL_000a: stloc.1
.try
{
IL_000b: br.s IL_001a
IL_000d: ldloc.1
IL_000e: callvirt instance !0 class [mscorlib]System.Collections.Generic.IEnumerator`1<int32>::get_Current()
IL_0013: stloc.0
IL_0014: ldloc.0
IL_0015: call void [mscorlib]System.Console::WriteLine(int32)
IL_001a: ldloc.1
IL_001b: callvirt instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
IL_0020: brtrue.s IL_000d
IL_0022: leave.s IL_002e
}
finally
{
IL_0024: ldloc.1
IL_0025: brfalse.s IL_002d
IL_0027: ldloc.1
IL_0028: callvirt instance void [mscorlib]System.IDisposable::Dispose()
IL_002d: endfinally
}
IL_002e: ret
}
Note: There’s a little more in this code than specified by the official spec. The finally-block contains an additional check to see whether the local V_1 (the IEnumerator<int>) isn’t null (see line IL_0025). You can see this even better when compiling the source without the /o compiler flag (you’ll notice a ldnull – ceq series of statements to perform the null check in that build).
The most important thing to remember for now is that the foreach statement is simply a short form to deal with an IEnumerator to iterate over the sequence (notice I don’t say collection) in a forward-only manner.
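Since foreach is only shorthand, nothing stops you from driving the enumerator by hand. Here is a minimal sketch, assuming a hypothetical ManualEnumeration class with a TakeFirst helper of my own invention, that pulls a fixed number of values out of the endless sequence.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, not part of the original sample.
static class ManualEnumeration
{
    static IEnumerable<int> EvenNumbers()
    {
        for (int i = 0; true; i += 2)
            yield return i;
    }

    // Pulls at most 'count' values by calling MoveNext and Current directly.
    public static List<int> TakeFirst(int count)
    {
        List<int> result = new List<int>();
        IEnumerator<int> enumerator = EvenNumbers().GetEnumerator();
        try
        {
            // MoveNext is only called while more values are still wanted.
            while (result.Count < count && enumerator.MoveNext())
                result.Add(enumerator.Current);
        }
        finally
        {
            enumerator.Dispose();
        }
        return result;
    }
}
```

TakeFirst(5) returns 0, 2, 4, 6 and 8. The sequence is walked forward only, one MoveNext call per element, which is all foreach ever does.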
2. Behind the scenes of the iterator
Now jump to the iterator’s definition itself:
public static IEnumerable<int> EvenNumbers()
{
    for (int i = 0; true; i += 2)
        yield return i;
}
When you take a look at the IL of this method, you’ll find the following:
.method public hidebysig static class [mscorlib]System.Collections.Generic.IEnumerable`1<int32>
EvenNumbers() cil managed
{
.locals init (class Test/'<EvenNumbers>d__0' V_0)
IL_0000: ldc.i4.s -2
IL_0002: newobj instance void Test/'<EvenNumbers>d__0'::.ctor(int32)
IL_0007: stloc.0
IL_0008: ldloc.0
IL_0009: ret
}
A big unknown type appears – Test/’<EvenNumbers>d__0’ – which can be found in the same assembly of course. (Don’t worry about the mysterious -2 parameter passed to the constructor of this unknown type.)
<Intermezzo>
As you can already feel, compilers today are doing much more than compilers a decade ago. More and more easy-to-learn productivity constructs in a language require a complex mapping under the covers.
Other examples include:
- using-statement: translation into a try-finally block with IDisposable
- lock-statement: translation to a try-finally block with Monitor.Enter and Monitor.Exit calls
- anonymous methods: creation of a “cached anonymous delegate” and another private method
- events: add and remove handler stuff
- properties: getter and setter methods
I’m sure you can think of many more (not to mention late-bound languages such as VB).
</Intermezzo>
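To give an idea of what two of those mappings look like, here is a rough sketch of the hand-written equivalents of a lock statement and a using statement as a C# 2.0 compiler would expand them. The class SugarExpansions, the Gate object and both methods are hypothetical names chosen for this illustration, and newer compilers may emit a slightly different pattern for lock.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical class showing the expanded forms by hand.
static class SugarExpansions
{
    static readonly object Gate = new object();
    static int counter;

    // Roughly what "lock (Gate) { counter++; return counter; }" turns into.
    public static int IncrementLocked()
    {
        Monitor.Enter(Gate);
        try
        {
            counter++;
            return counter;
        }
        finally
        {
            Monitor.Exit(Gate);
        }
    }

    // Roughly what "using (StringReader reader = new StringReader(text)) { ... }" turns into.
    public static string FirstLine(string text)
    {
        StringReader reader = new StringReader(text);
        try
        {
            return reader.ReadLine();
        }
        finally
        {
            if (reader != null)
                ((IDisposable)reader).Dispose();
        }
    }
}
```

In both cases one keyword in the source stands for a try-finally block you’d otherwise have to write (and get right) yourself.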
Back to the unknown type I was referring to. What’s in a name?
Let’s start with the class definition
.class auto ansi sealed nested private beforefieldinit '<EvenNumbers>d__0'
extends [mscorlib]System.Object
implements class [mscorlib]System.Collections.Generic.IEnumerable`1<int32>,
[mscorlib]System.Collections.IEnumerable,
class [mscorlib]System.Collections.Generic.IEnumerator`1<int32>,
[mscorlib]System.Collections.IEnumerator,
[mscorlib]System.IDisposable
{
.custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = ( 01 00 00 00 )
}
Notice the type implements two generic interfaces, IEnumerable<T> and IEnumerator<T>. That makes it possible to be used in combination with the foreach statement (see above).
Next, what are the fields of the type?
.field private int32 '<>1__state'
.field private int32 '<>2__current'
.field public int32 '<i>5__1'
The first two, the state and the current field, are the most important ones for now.
On to the constructor
.method public hidebysig specialname rtspecialname
instance void .ctor(int32 '<>1__state') cil managed
{
IL_0000: ldarg.0
IL_0001: call instance void [mscorlib]System.Object::.ctor()
IL_0006: ldarg.0
IL_0007: ldarg.1
IL_0008: stfld int32 Test/'<EvenNumbers>d__0'::'<>1__state'
IL_000d: ret
}
Nothing special again: the base constructor of System.Object is called and the ‘<>1__state’ field is populated with the value of the constructor’s first (and only) argument.
What about the properties?
Because of the implementation of IEnumerator<int32> and IEnumerator we expect two properties (a generic and a non-generic one), a getter for the current item during the iteration over the sequence:
.property instance int32 'System.Collections.Generic.IEnumerator<System.Int32>.Current'()
{
.get instance int32 Test/'<EvenNumbers>d__0'::'System.Collections.Generic.IEnumerator<System.Int32>.get_Current'()
}
.property instance object System.Collections.IEnumerator.Current()
{
.get instance object Test/'<EvenNumbers>d__0'::System.Collections.IEnumerator.get_Current()
}
The corresponding methods look as follows:
.method private hidebysig newslot specialname virtual final
instance int32 'System.Collections.Generic.IEnumerator<System.Int32>.get_Current'() cil managed
{
.override method instance !0 class [mscorlib]System.Collections.Generic.IEnumerator`1<int32>::get_Current()
IL_0000: ldarg.0
IL_0001: ldfld int32 Test/'<EvenNumbers>d__0'::'<>2__current'
IL_0006: ret
}
.method private hidebysig newslot specialname virtual final
instance object System.Collections.IEnumerator.get_Current() cil managed
{
.override [mscorlib]System.Collections.IEnumerator::get_Current
IL_0000: ldarg.0
IL_0001: ldfld int32 Test/'<EvenNumbers>d__0'::'<>2__current'
IL_0006: box [mscorlib]System.Int32
IL_000b: ret
}
Basically, the field holding the current value ('<>2__current', not to be confused with the state field, see further) is returned by the Current property getter. Notice the boxing in the non-generic case, as we need to return an object of the “big mother” type System.Object.
<Intermezzo>
Just in case you wonder why there’s an ldarg.0 instruction at the beginning of all these methods… This first (hidden) parameter is the current instance pointer (“this”).
</Intermezzo>
Some less interesting methods
To save the best for last, first some less interesting methods:
- The Reset method (.override [mscorlib]System.Collections.IEnumerator::Reset) just throws a NotSupportedException. Once the iteration has started, you can’t go back to the initial state unless you create a new instance of the enumerator, either by calling the iterator (method) again, or by calling GetEnumerator (see the foreach statement explanation).
- The Dispose method (.override [mscorlib]System.IDisposable::Dispose) is empty.
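The Reset behaviour is easy to check from C#. Below is a minimal sketch, assuming a hypothetical ResetDemo class whose two methods are my own names: one shows that Reset is rejected, the other that asking for an enumerator again is the way to start over.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, not part of the original sample.
static class ResetDemo
{
    static IEnumerable<int> EvenNumbers()
    {
        for (int i = 0; true; i += 2)
            yield return i;
    }

    // Returns true when Reset is rejected by the compiler-generated enumerator.
    public static bool ResetThrows()
    {
        using (IEnumerator<int> e = EvenNumbers().GetEnumerator())
        {
            e.MoveNext();
            try
            {
                e.Reset();
                return false;
            }
            catch (NotSupportedException)
            {
                return true;
            }
        }
    }

    // Starting over means calling GetEnumerator again.
    public static int FirstAfterRestart()
    {
        IEnumerable<int> numbers = EvenNumbers();
        using (IEnumerator<int> first = numbers.GetEnumerator())
        {
            first.MoveNext();
            first.MoveNext();               // first.Current is now 2
        }
        using (IEnumerator<int> second = numbers.GetEnumerator())
        {
            second.MoveNext();
            return second.Current;          // a fresh GetEnumerator call starts at 0 again
        }
    }
}
```

ResetThrows() returns true and FirstAfterRestart() returns 0.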
Getting the enumerator
Now it becomes more and more interesting: the interface IEnumerable has a method called GetEnumerator used to return the corresponding IEnumerator (read this sentence both non-generic and generic please).
<Intermezzo>
Why a getter method and not a property? Because GetEnumerator, as you’ll see in just a couple of seconds, can be time-consuming. For time-consuming getter actions, you should use a method instead (cf. D.2.1.1 in the Common Language Infrastructure Annotated Standard which states: “Do use a method in the following situations. (…) The operation is expensive (orders of magnitude slower than a field set would be).”).
</Intermezzo>
Time to investigate what’s happening in here… The non-generic one is the less exciting of the two:
.method private hidebysig newslot virtual final
instance class [mscorlib]System.Collections.IEnumerator
System.Collections.IEnumerable.GetEnumerator() cil managed
{
.override [mscorlib]System.Collections.IEnumerable::GetEnumerator
IL_0000: ldarg.0
IL_0001: call instance class [mscorlib]System.Collections.Generic.IEnumerator`1<int32> Test/'<EvenNumbers>d__0'::'System.Collections.Generic.IEnumerable<System.Int32>.GetEnumerator'()
IL_0006: ret
}
So, the hunted secret (of the GetEnumerator anyway) should be in the generic brother method. Before you continue, make sure you understand the crucial role this method plays in respect to the “consumer” (foreach-statement equivalent).
.method private hidebysig newslot virtual final
instance class [mscorlib]System.Collections.Generic.IEnumerator`1<int32>
'System.Collections.Generic.IEnumerable<System.Int32>.GetEnumerator'() cil managed
{
.override method instance class [mscorlib]System.Collections.Generic.IEnumerator`1<!0> class [mscorlib]System.Collections.Generic.IEnumerable`1<int32>::GetEnumerator()
.locals init (class Test/'<EvenNumbers>d__0' V_0)
IL_0000: ldarg.0
IL_0001: ldflda int32 Test/'<EvenNumbers>d__0'::'<>1__state'
IL_0006: ldc.i4.0
IL_0007: ldc.i4.s -2
IL_0009: call int32 [mscorlib]System.Threading.Interlocked::CompareExchange(int32&,int32,int32)
IL_000e: ldc.i4.s -2
IL_0010: bne.un.s IL_0016
IL_0012: ldarg.0
IL_0013: stloc.0
IL_0014: br.s IL_001d
IL_0016: ldc.i4.0
IL_0017: newobj instance void Test/'<EvenNumbers>d__0'::.ctor(int32)
IL_001c: stloc.0
IL_001d: ldloc.0
IL_001e: ret
}
Wow, pretty complex at first sight isn’t it? Let’s analyze what’s happening:
IL_0001: ldflda int32 Test/'<EvenNumbers>d__0'::'<>1__state'
IL_0006: ldc.i4.0
IL_0007: ldc.i4.s -2
IL_0009: call int32 [mscorlib]System.Threading.Interlocked::CompareExchange(int32&,int32,int32)
Note: ldflda stands for “Load Field Address” (cf. 4.10 in the Common Language Infrastructure Annotated Standard).
Stack = ..., '<>1__state'&, 0, -2
Call = System.Threading.Interlocked::CompareExchange
public static int CompareExchange (
ref int location1,
int value,
int comparand
)
The (conceptual) equivalent of these instructions, executed as one atomic step, is (C#):
'<>1__state' = ('<>1__state' == -2 ? 0 : '<>1__state')
A threading library method is used because of the need for an atomic compare and exchange operation to ensure correctness. The CompareExchange method always returns the original value of the first operand (in this case '<>1__state'), so the original (state) value ends up on top of the stack:
Stack = ..., '<>1__state'
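The return-the-old-value behaviour of CompareExchange is worth seeing in isolation. A minimal sketch, with a hypothetical CompareExchangeDemo class and Claim method of my own naming, that performs the same call as the IL on an ordinary int:

```csharp
using System;
using System.Threading;

// Hypothetical helper class, not part of the generated code.
static class CompareExchangeDemo
{
    // Mimics the IL: swap in 0 only if the state is still -2, and report
    // the value that was stored there before the call.
    public static int Claim(ref int state)
    {
        return Interlocked.CompareExchange(ref state, 0, -2);
    }
}
```

Calling Claim on a variable holding -2 returns -2 and leaves 0 behind; calling it on a variable holding any other value (say 1) returns that value and changes nothing.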
Let’s continue:
IL_000e: ldc.i4.s -2
IL_0010: bne.un.s IL_0016
IL_0012: ldarg.0
IL_0013: stloc.0
IL_0014: br.s IL_001d
IL_0016: ldc.i4.0
IL_0017: newobj instance void Test/'<EvenNumbers>d__0'::.ctor(int32)
IL_001c: stloc.0
The result of the CompareExchange (which is the original value of '<>1__state', see above) is compared to -2 (read the bne.un.s instruction as “if the result of CompareExchange is not equal to -2, then jump to IL_0016”).
- In case of equality, nothing has happened yet (see further) since the constructor of the object was called, i.e. no iteration has started. The instructions IL_0012 and IL_0013 are executed, after which control is transferred to IL_001d with the local V_0 set to the current instance (recall that ldarg.0 stands for “this”), whose state CompareExchange has just set to 0.
- In case of inequality, the current instance is already being used (i.e. an iteration has started, see further). In order to answer the call to GetEnumerator() we have to create a brand new instance of our '<EvenNumbers>d__0' type and return that. This is done in the instructions IL_0016 and IL_0017, after which IL_001c stores the newly created instance in the local V_0 and execution continues at IL_001d. The new instance has its internal state '<>1__state' set to 0 (cf. IL_0016 and the constructor’s IL code, see above).
To recap:
A call to GetEnumerator therefore results in a ready-to-be-used enumerator object. This means that, whenever you launch a foreach loop over the iterator (which implicitly calls GetEnumerator, see foreach-statement explanation), you end up with a unique instance of our internal class, with internal (initial) state set to 0.
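Translated back to C#, the GetEnumerator logic looks roughly like the sketch below. HandWrittenIterator is a hypothetical stand-in for the compiler-generated class: only the GetEnumerator logic is modelled after the IL above, the sequence itself is deliberately left empty, and compilers newer than the one used here may add further checks.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading;

// Hypothetical hand-written stand-in for '<EvenNumbers>d__0'.
sealed class HandWrittenIterator : IEnumerable<int>, IEnumerator<int>
{
    private int state;

    public HandWrittenIterator(int state) { this.state = state; }

    public IEnumerator<int> GetEnumerator()
    {
        // Atomically: if state is -2, set it to 0. The old value is returned.
        if (Interlocked.CompareExchange(ref state, 0, -2) == -2)
            return this;                        // first caller reuses this object
        return new HandWrittenIterator(0);      // later callers get a fresh one
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    public int Current { get { return 0; } }
    object IEnumerator.Current { get { return Current; } }
    public bool MoveNext() { return false; }    // empty sequence, for brevity
    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}
```

Create an instance with state -2 (as the iterator method does), call GetEnumerator twice, and the first call hands back the object itself while the second hands back a new one. Reusing the object for the first caller saves an allocation in the common case of a single foreach loop.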
Iterating over the sequence – MoveNext
One crucial method remains in order to be able to iterate over the sequence, the MoveNext method (of the IEnumerator). This method is implemented as a state machine, which I’ll explain in a minute:
.method private hidebysig newslot virtual final
instance bool MoveNext() cil managed
{
.override [mscorlib]System.Collections.IEnumerator::MoveNext
.locals init (int32 V_0)
IL_0000: ldarg.0
IL_0001: ldfld int32 Test/'<EvenNumbers>d__0'::'<>1__state'
IL_0006: stloc.0
IL_0007: ldloc.0
IL_0008: switch (
IL_0017,
IL_003a)
IL_0015: br.s IL_0051
IL_0017: ldarg.0
IL_0018: ldc.i4.m1
IL_0019: stfld int32 Test/'<EvenNumbers>d__0'::'<>1__state'
IL_001e: ldarg.0
IL_001f: ldc.i4.0
IL_0020: stfld int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
IL_0025: ldarg.0
IL_0026: ldarg.0
IL_0027: ldfld int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
IL_002c: stfld int32 Test/'<EvenNumbers>d__0'::'<>2__current'
IL_0031: ldarg.0
IL_0032: ldc.i4.1
IL_0033: stfld int32 Test/'<EvenNumbers>d__0'::'<>1__state'
IL_0038: ldc.i4.1
IL_0039: ret
IL_003a: ldarg.0
IL_003b: ldc.i4.m1
IL_003c: stfld int32 Test/'<EvenNumbers>d__0'::'<>1__state'
IL_0041: ldarg.0
IL_0042: dup
IL_0043: ldfld int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
IL_0048: ldc.i4.2
IL_0049: add
IL_004a: stfld int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
IL_004f: br.s IL_0025
IL_0051: ldc.i4.0
IL_0052: ret
}
Again, pretty scary at first sight. Starting at the top:
IL_0000: ldarg.0
IL_0001: ldfld int32 Test/'<EvenNumbers>d__0'::'<>1__state'
IL_0006: stloc.0
IL_0007: ldloc.0
IL_0008: switch (
IL_0017,
IL_003a)
IL_0015: br.s IL_0051
This is a switch statement (which also contains a switch instruction) and works as follows:
- IL_0000 loads the “this” instance.
- IL_0001 loads the current internal state from the current (“this”) instance.
- IL_0006 puts the state (which is now on top of the stack) in the local variable V_0 (of type int32).
- IL_0007 loads this local V_0 again (to the top of the stack).
- IL_0008 is a switch instruction (cf. 3.66 in the Common Language Infrastructure Annotated Standard):
  - If the value on top of the stack equals 0, go to IL_0017.
  - If the value on top of the stack equals 1, go to IL_003a.
- IL_0015 is the fall-through after the switch instruction and can be seen as the “default” case (go to IL_0051).
We end up with two blocks: IL_0017 to IL_0039 and IL_003a to IL_004f.
Let’s start with the first one (IL_0017 to IL_0039):
IL_0017: ldarg.0
IL_0018: ldc.i4.m1
IL_0019: stfld int32 Test/'<EvenNumbers>d__0'::'<>1__state'
IL_0017 to IL_0019 set the internal state to -1 (ldc.i4.m1). This is not a final state yet, but it’s already different from the mystery number -2, which causes another call to GetEnumerator to return a new instance of the enumerator class (see explanation above). Now the real work starts:
IL_001e: ldarg.0
IL_001f: ldc.i4.0
IL_0020: stfld int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
IL_0025: ldarg.0
IL_0026: ldarg.0
IL_0027: ldfld int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
IL_002c: stfld int32 Test/'<EvenNumbers>d__0'::'<>2__current'
We can translate this as follows:
this.'<i>5__1' = 0; //IL_001e to IL_0020
this.'<>2__current' = this.'<i>5__1'; //LHS: IL_0025, IL_002c | RHS: IL_0026, IL_0027
Then the magic continues by setting the internal state to 1:
IL_0031: ldarg.0
IL_0032: ldc.i4.1
IL_0033: stfld int32 Test/'<EvenNumbers>d__0'::'<>1__state'
So, the next time we call MoveNext, the switch statement (IL_0008) will jump to IL_003a. Finally, true is returned for the MoveNext call, indicating there is more to be yielded (or stated otherwise: foreach can continue to run).
IL_0038: ldc.i4.1 //1 == true
IL_0039: ret
You can trace this all the way back to the initializer (int i = 0) and the yield return i statement in:
for (int i = 0; true; i += 2)
    yield return i;
which you should (of course) translate to the IEnumerator-based equivalent (see above) to get the clearest possible view of the code: the loop variable i is not returned by the ret instruction in IL_0039; rather, it’s returned through the Current property, which we examined earlier and which returns '<>2__current', set in IL_002c.
Time for the second block (IL_003a to IL_004f):
IL_003a: ldarg.0
IL_003b: ldc.i4.m1
IL_003c: stfld int32 Test/'<EvenNumbers>d__0'::'<>1__state'
Again the story starts by setting the internal state to -1.
IL_0041: ldarg.0
IL_0042: dup
IL_0043: ldfld int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
Next, the current value of i (which is stored in a helper field) is retrieved. Note: The dup instruction in IL_0042 duplicates the value on top of the stack. In fact, IL_0026 could be replaced by a dup instruction as well, so there seems to be a little inconsistency in the C# compiler’s IL generation (although both approaches, i.e. IL_0041+IL_0042 and IL_0025+IL_0026, have the same result).
IL_0048: ldc.i4.2
IL_0049: add
Now two (2) is added to the value on top of the stack…
IL_004a: stfld int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
…and the result is stored in the value for i ('<i>5__1'). So far, we’ve done nothing more than:
'<i>5__1' += 2;
which is traced back to the increment (i += 2) in:
for (int i = 0; true; i += 2)
    yield return i;
Finally, the system jumps to IL_0025,
IL_004f: br.s IL_0025
which triggers the execution of instructions IL_0025 to IL_0039:
IL_0025: ldarg.0
IL_0026: ldarg.0
IL_0027: ldfld int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
IL_002c: stfld int32 Test/'<EvenNumbers>d__0'::'<>2__current'
IL_0031: ldarg.0
IL_0032: ldc.i4.1
IL_0033: stfld int32 Test/'<EvenNumbers>d__0'::'<>1__state'
IL_0038: ldc.i4.1
IL_0039: ret
I explained these when talking about the first (switch-case) block. Basically, the helper field for i ('<i>5__1') becomes the current value (this.'<>2__current'), the internal state is set to 1 and the method returns true (“there is more to find in this sequence, foreach is allowed to continue”).
The “default case” block just returns false (0), which means “nothing more to find over here”, but it should never be reached in this example. The only state transitions we saw are -2 to 0, 0 to 1 and 1 to 1 (plus the intermediary state change from/to -1).
IL_0051: ldc.i4.0
IL_0052: ret
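Putting the pieces together, the MoveNext state machine can be written by hand in C#. This is a minimal sketch rather than a decompilation: EvenNumbersEnumerator is a hypothetical class, the field names are simplified to state, current and i, and the object starts directly in state 0 (the state GetEnumerator leaves behind).

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical hand-written counterpart of the generated enumerator.
sealed class EvenNumbersEnumerator : IEnumerator<int>
{
    private int state = 0;
    private int current;
    private int i;

    public int Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }

    public bool MoveNext()
    {
        switch (state)
        {
            case 0:                 // first call: run the loop initializer
                state = -1;
                i = 0;
                break;
            case 1:                 // resume after the previous yield return
                state = -1;
                i += 2;
                break;
            default:                // any other state: nothing to produce
                return false;
        }
        current = i;                // the "yield return i" part
        state = 1;                  // remember where to resume next time
        return true;
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}
```

Three successive MoveNext calls leave Current at 0, 2 and 4. Each call runs a small slice of the original loop and then returns, which is what makes the endless for loop safe to write.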
3. What about yield break?
Besides the yield return statement, there’s also yield break to indicate that the sequence ends (no further yielding can be done, no further iteration should be done, MoveNext returns false). The only thing this causes is a more complex state machine (same number of states, but additional conditional logic to check whether yielding should stop). Consider the following (trivial) example:
public static IEnumerable<int> EvenNumbers()
{
    for (int i = 0; true; i += 2)
    {
        if (i == 100)
            yield break;
        yield return i;
    }
}
Now MoveNext has the following look:
.method private hidebysig newslot virtual final
instance bool MoveNext() cil managed
{
.override [mscorlib]System.Collections.IEnumerator::MoveNext
.locals init (bool V_0,
int32 V_1,
bool V_2)
IL_0000: ldarg.0
IL_0001: ldfld int32 Test/'<EvenNumbers>d__0'::'<>1__state'
IL_0006: stloc.1
IL_0007: ldloc.1
IL_0008: switch (
IL_0019,
IL_0017)
IL_0015: br.s IL_001b
IL_0017: br.s IL_0059
IL_0019: br.s IL_001d
IL_001b: br.s IL_0073
IL_001d: ldarg.0
IL_001e: ldc.i4.m1
IL_001f: stfld int32 Test/'<EvenNumbers>d__0'::'<>1__state'
IL_0024: nop
IL_0025: ldarg.0
IL_0026: ldc.i4.0
IL_0027: stfld int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
IL_002c: br.s IL_006f
IL_002e: nop
IL_002f: ldarg.0
IL_0030: ldfld int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
IL_0035: ldc.i4.s 100
IL_0037: ceq
IL_0039: ldc.i4.0
IL_003a: ceq
IL_003c: stloc.2
IL_003d: ldloc.2
IL_003e: brtrue.s IL_0042
IL_0040: br.s IL_0073
IL_0042: ldarg.0
IL_0043: ldarg.0
IL_0044: ldfld int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
IL_0049: stfld int32 Test/'<EvenNumbers>d__0'::'<>2__current'
IL_004e: ldarg.0
IL_004f: ldc.i4.1
IL_0050: stfld int32 Test/'<EvenNumbers>d__0'::'<>1__state'
IL_0055: ldc.i4.1
IL_0056: stloc.0
IL_0057: br.s IL_0077
IL_0059: ldarg.0
IL_005a: ldc.i4.m1
IL_005b: stfld int32 Test/'<EvenNumbers>d__0'::'<>1__state'
IL_0060: nop
IL_0061: ldarg.0
IL_0062: dup
IL_0063: ldfld int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
IL_0068: ldc.i4.2
IL_0069: add
IL_006a: stfld int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
IL_006f: ldc.i4.1
IL_0070: stloc.2
IL_0071: br.s IL_002e
IL_0073: ldc.i4.0
IL_0074: stloc.0
IL_0075: br.s IL_0077
IL_0077: ldloc.0
IL_0078: ret
}
When i equals 100, the path through the method runs as follows: IL_002f to IL_003a compute the negation of i == 100, the brtrue.s at IL_003e is not taken, IL_0040 branches to IL_0073, and the method returns false (IL_0073 to IL_0078). In case the current value isn’t 100, the branch instruction at IL_003e jumps to IL_0042 and true (IL_0055) is returned.
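From the consumer’s point of view, the effect of yield break is simply that the sequence ends. A minimal sketch, assuming a hypothetical YieldBreakDemo class with a Collect helper of my own naming, that drains the iterator into a list:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class wrapping the iterator shown above.
static class YieldBreakDemo
{
    static IEnumerable<int> EvenNumbers()
    {
        for (int i = 0; true; i += 2)
        {
            if (i == 100)
                yield break;        // MoveNext returns false from here on
            yield return i;
        }
    }

    // Copies the whole (now finite) sequence into a list.
    public static List<int> Collect()
    {
        return new List<int>(EvenNumbers());
    }
}
```

Collect() returns 50 elements, from 0 up to and including 98. Without the yield break, the same call would never return, because the list constructor keeps calling MoveNext until it gets false.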
Homework
Try to find out what the following enumerator translates to in IL (without CTRL-C, WIN-R, notepad, ENTER, CTRL-V, …, csc, …, ildasm, … you know what I mean):
private static IEnumerable<string> GetItems()
{
    yield return "Hello yield 1";
    yield return "Hello yield 2";
    yield return "Hello yield 3";
    yield return "Hello yield 4";
    yield return "Hello yield 5";
}
Conclusion
The iterator feature of C# is far more complex than it might seem at first glance. It hides a complete state machine taking care of state transitions to keep the current “cursor” position in the sequence. This is in sharp contrast to the methodology where one returns a pre-populated collection (say List<SomeType>) that can be traversed using foreach as well (because it’s of course IEnumerable<SomeType>). Iterators provide a lazy pattern where stuff can be calculated when it’s needed. It’s up to the consumer to decide how much of the sequence to consume effectively.
This brings us to the world of so-called “continuations”, where the execution of a piece of code is virtually suspended until the consumer decides it wants more, which imposes a stateful approach under the covers (as we investigated in this post). Call it a small (procedurally defined) Windows Workflow Foundation state machine if you want to take it that far and if that helps you to understand it (maybe it just makes things more complex, my apologies if that’s the case)… Take a look at Don Box’s post at http://pluralsight.com/blogs/dbox/archive/2005/04/17/7467.aspx as well.
Maybe one last thing: why should I bother about this? One answer is LINQ; take a look at my previous LINQ post to get more information on the Standard Query Operators and download the source to count the number of yield statements encountered. (Tip: My FindString Windows PowerShell cmdlet might be useful to perform the count).
Filed under: C# 2.0