Monday, August 10, 2009 9:05 AM
bart
Expression Trees, Take Two – Introducing System.Linq.Expressions v4.0
Introduction
Avid blog readers know I have a weak spot for expression trees in particular and the broader picture of meta-programming facilities. With the introducing of LINQ in the .NET Framework 3.5 timeframe, we ended up adding expression trees to the framework as a way to represent code as data. A fundamental requirement to make remote LINQ scenarios such as LINQ to SQL work, as the LINQ expression written by the developer needs to be cross-compiled into the corresponding query language, e.g. SQL. To enable this, two routes could be taken:
- Decompile IL code for the query at runtime, and turn it into the target language – quite roundabout
- Enable representing the query’s code as data at runtime by emitting an expression tree at compile-time
Turning the query into the required target language at compile time isn’t a viable option either, since the front-end compilers would have to know about all of the target query domains; clearly not a good idea.
A quick recap of the 3.5 landscape
Before diving into the .NET 4.0 feature set around enhanced expression trees, let’s have a quick recap of the corresponding baseline features introduced in .NET 3.5. Since LINQ was the driver for this feature after all, it makes sense to take a top-down approach and turn a LINQ query into more primitive building blocks.
Consider the following query expression:
var res = from p in db.Products
where p.UnitPrice > 100
select p.ProductName;
Given this, we can’t say much about the query just yet. All we know is that from a language’s point of view, we can “desugar” the above into the below:
var res = db.Products
.Where(p => p.UnitPrice > 100)
.Select(p => p.ProductName);
From this point on, regular method resolution rules kick in to find a Where method on db.Products, and to find a Select method on the result of calling Where. The argument to both those methods is a lambda expression and here’s where the flexibility comes from. Lambda expressions by themselves don’t have a type:
var twice = (int x) => x * 2;
Though the right-hand side seems to have all the information to infer a type (going from int to an expression of type int, ought to be Func<int, int> right?) the above doesn’t compile. Lambda expressions can be assigned to variables of two distinct types: regular delegates or expression trees thereof. The below illustrates this:
Func<int, int> twiceD = x => x * 2;
Expression<Func<int, int>> twiceE = x => x * 2;
The first one is shorthand for an anonymous method we already supported in C# 2.0:
Func<int, int> twiceD = delegate(int x) { return x * 2; };
However, the second one gives rise to quite a bit more of generated code; in particular, code is emitted that constructs an expression tree at runtime to represent the code the user wrote as data. Given this code, the library function calls (like Where and Select in case of LINQ) can consume that expression tree to analyze the code the user wrote, extracting the intent and turning it into execution in whatever way is right for the task at hand (e.g. by cross-compiling the expression into a target query language like SQL). In particular, the twiceE lambda expression gets turned into:
ParameterExpression x = Expression.Parameter(typeof(int), “x”);
Expression<Func<int, int>> twiceE = Expression.Lambda<Func<int, int>>(
Expression.Multiply(
x,
Expression.Constant(2)
),
x
);
That’s exactly what happens in the LINQ case when using an IQueryable<T> data source: the query operator methods like Where and Select don’t take in a plain delegate (like a Func<T, bool> for the filter fed in to the Where operator) but instead they take in an Expression<TDelegate> thereof (like an Expression<Func<T, bool>>). This causes the compiler to emit code generating an expression tree (by means of the Expression factory methods as shown above) for the argument. Ultimately the Queryable class puts together all those expression tree fragments into a big one representing the whole query expression, ready for the query provider to do whatever magic it sees fit. We won’t go there now.
To finish up this recap, a single picture that tells it all:
There’s on thing I haven’t mentioned yet: the Compile method on the LambdaExpression type as produced by the Expression.Lambda call. This plan simple API surface gives access to a gigantic machinery behind the scenes that turns the expression tree into executable IL code at runtime: the tip of the Compiler-as-a-Service (CaaS) iceberg.
As a little extra, let’s dig a little deeper and show the generated IL code through the Son-of-Strike (SOS) debugger extension. I’m using a Visual Studio 2010 project targeting .NET 3.5 for this demo. If it weren’t for the fact I have a 64-bit machine, I could equally well have used Visual Studio 2008, but mixed mode debugging (native + managed, as required by SOS) isn’t available on 64-bit in the 2008 version. In Visual Studio 2010, this restriction is gone (hooray).
Inspired by the code above, we can write a simple expression tree and have it compile at runtime. The resulting delegate object is based on dynamically emitted code, which we’ll visualize now:
First, we load the SOS extension from the .NET 2.0 installation folder. To do this, you’ll have to enable unmanaged code debugging:
Next, we use !dso to dump stack objects, which will reveal our twice object which is of type Func<int, int>. Dumping the object using !do reveals the target object, the underlying method, etc. We’ll have a look at the _methodBase next, where we’ll be able to see the dynamic method’s code:
The !dumpil command is all we need to dump the contents of the dynamically generated IL body, revealing the “times two” logic. Feel free to experiment with more complex lambda expressions revealing their contents as IL after runtime compilation. For example, what lambda expression did I write to arrive with the following?
!dumpil 026b89f0
This is dynamic IL. Exception info is not reported at this time.
If a token is unresolved, run "!do <addr>" on the addr given
in parenthesis. You can also look at the token table yourself, by
running "!DumpArray 026bde24".
IL_0000: ldarg.1
IL_0001: ldarg.2
IL_0002: callvirt 6000002 System.String.Substring(Int32)
IL_0007: ret
So, to wrap up, the .NET 3.5 story of expression trees consisted of two core parts:
- Library support for expression trees, in the System.Linq.Expressions namespace defined in the System.Core.dll assembly (meaning one can manually new up expression trees and obviously inspect their properties).
- Language support for expression trees, triggered by assigning a lambda expression to an Expression<TDelegate> type, available in C# and VB (as invisible glue to make LINQ work in certain scenarios).
After expressions come … statements
Looking at expression trees in .NET 3.5 one observes they’re only able to represent expressions, i.e. those portions from languages that are valued in their very nature. Things like if and while don’t fall under this umbrella and are statements, and one more step higher up we land at full-blown declarations (like classes and interfaces). It’s clear if we want to represent more than valued expressions and aim our arrows at full method bodies, we need more than pure expression trees. So .NET 4.0 introduces statement trees as an extension to the existing expression tree API (so, as a little unfortunate side-effect of this, the Expressions part in the containing namespace becomes a bit of a misnomer).
The following picture illustrates the gaps we have in expression trees that hinder us from representing full method bodies. The reason we’re looking after such support is to be able to use the expression trees as one of the driving forces behind the Dynamic Language Runtime (DLR). To communicate code to the runtime, one uses expression trees. One could think of this in a rather extremist fashion as follows: where IL is used to represent code in the CLR, expression trees are used in the DLR. Ultimately the DLR is layered on top of the CLR, so at the end of the day expression trees are compiled – at runtime – into IL instructions as shown above with the Compile method of the v3.5 API. And from that point on, the IL can be turned into native code by the JIT compiler.
Everything in green was already exposed through the expression tree APIs in .NET 3.5. Things like method calls, various operators, constants, constructor calls, and variables (though you couldn’t assign to them but use them as parameters to lambdas) can be represented through factory method calls on the Expression type.
One first thing we need to add is all sorts of assignments so that mutation of variables can be achieved. Lots of languages live and breathe mutation, so assignment becomes an essential building block to realize the potential of those languages on top of the DLR. This is indicated in orange. Notice things like n++ fall under this category as well, due to their dual increment and assign nature. In fact, assignments in C# are categorized as expressions (why?), so the old API didn’t expose the whole expression surface of the language. Instead, the LINQ expressions API provided just enough expressiveness as was needed for commonly used operations in the context of writing queries. Writing things like the following doesn’t make sense in that context:
var res = from p in db.Products
where p.UnitPrice > 100
select p.ProductName = “Changed”;
(Note: The above could represent some kind of update statement, but together with LINQ’s deeply rooted lazy evaluation the above would turn out very cumbersome from an operational point of view.)
Finally, we require a way to perform sequencing of operations (the semi-colon in C# if you will) and organize such operations in blocks where the concept of scoping is alive and kicking (the idea of curly braces in C#). In addition, every statement that influences control flow is useful to have too. Examples include if and loops (together with their break and continue mechanisms), returning from a method, throwing and catching exceptions, etc.
Adding all those ingredients to the mix results in the v4.0 version of the expression tree API, including the ability to do dynamic dispatch (which is what the DLR is all about, and is surfaced through the new C# 4.0 dynamic keyword). Note: the “table” mentioned in the rectangle for dynamic dispatch above refers to all the operations supported by the DLR, such as calling properties, invoking methods, etc.
The following query can be used on both .NET 3.5 and .NET 4.0 to illustrate the differences between the APIs:
var ops = from m in typeof(Expression).GetMethods()
where m.IsStatic
group m by m.Name into g
orderby g.Key
select new { g.Key, Overloads = g.Count() };
foreach (var op in ops)
Console.WriteLine(op);
In .NET 3.5, this is the result:
{ Key = Add, Overloads = 2 }
{ Key = AddChecked, Overloads = 2 }
{ Key = And, Overloads = 2 }
{ Key = AndAlso, Overloads = 2 }
{ Key = ArrayIndex, Overloads = 3 }
{ Key = ArrayLength, Overloads = 1 }
{ Key = Bind, Overloads = 2 }
{ Key = Call, Overloads = 6 }
{ Key = Coalesce, Overloads = 2 }
{ Key = Condition, Overloads = 1 }
{ Key = Constant, Overloads = 2 }
{ Key = Convert, Overloads = 2 }
{ Key = ConvertChecked, Overloads = 2 }
{ Key = Divide, Overloads = 2 }
{ Key = ElementInit, Overloads = 2 }
{ Key = Equal, Overloads = 2 }
{ Key = ExclusiveOr, Overloads = 2 }
{ Key = Field, Overloads = 2 }
{ Key = GetActionType, Overloads = 1 }
{ Key = GetFuncType, Overloads = 1 }
{ Key = GreaterThan, Overloads = 2 }
{ Key = GreaterThanOrEqual, Overloads = 2 }
{ Key = Invoke, Overloads = 2 }
{ Key = Lambda, Overloads = 5 }
{ Key = LeftShift, Overloads = 2 }
{ Key = LessThan, Overloads = 2 }
{ Key = LessThanOrEqual, Overloads = 2 }
{ Key = ListBind, Overloads = 4 }
{ Key = ListInit, Overloads = 6 }
{ Key = MakeBinary, Overloads = 3 }
{ Key = MakeMemberAccess, Overloads = 1 }
{ Key = MakeUnary, Overloads = 2 }
{ Key = MemberBind, Overloads = 4 }
{ Key = MemberInit, Overloads = 2 }
{ Key = Modulo, Overloads = 2 }
{ Key = Multiply, Overloads = 2 }
{ Key = MultiplyChecked, Overloads = 2 }
{ Key = Negate, Overloads = 2 }
{ Key = NegateChecked, Overloads = 2 }
{ Key = New, Overloads = 6 }
{ Key = NewArrayBounds, Overloads = 2 }
{ Key = NewArrayInit, Overloads = 2 }
{ Key = Not, Overloads = 2 }
{ Key = NotEqual, Overloads = 2 }
{ Key = Or, Overloads = 2 }
{ Key = OrElse, Overloads = 2 }
{ Key = Parameter, Overloads = 1 }
{ Key = Power, Overloads = 2 }
{ Key = Property, Overloads = 3 }
{ Key = PropertyOrField, Overloads = 1 }
{ Key = Quote, Overloads = 1 }
{ Key = RightShift, Overloads = 2 }
{ Key = Subtract, Overloads = 2 }
{ Key = SubtractChecked, Overloads = 2 }
{ Key = TypeAs, Overloads = 1 }
{ Key = TypeIs, Overloads = 1 }
{ Key = UnaryPlus, Overloads = 2 }
And in .NET 4.0, where I’ve used the same color scheme to outline differences from the previous version (notice some additions exist to the “expression” subset of available tree node types):
{ Key = Add, Overloads = 2 }
{ Key = AddAssign, Overloads = 3 }
{ Key = AddAssignChecked, Overloads = 3 }
{ Key = AddChecked, Overloads = 2 }
{ Key = And, Overloads = 2 }
{ Key = AndAlso, Overloads = 2 }
{ Key = AndAssign, Overloads = 3 }
{ Key = ArrayAccess, Overloads = 2 }
{ Key = ArrayIndex, Overloads = 3 }
{ Key = ArrayLength, Overloads = 1 }
{ Key = Assign, Overloads = 1 }
{ Key = Bind, Overloads = 2 }
{ Key = Block, Overloads = 12 }
{ Key = Break, Overloads = 4 }
{ Key = Call, Overloads = 14 }
{ Key = Catch, Overloads = 4 }
{ Key = ClearDebugInfo, Overloads = 1 }
{ Key = Coalesce, Overloads = 2 }
{ Key = Condition, Overloads = 2 }
{ Key = Constant, Overloads = 2 }
{ Key = Continue, Overloads = 2 }
{ Key = Convert, Overloads = 2 }
{ Key = ConvertChecked, Overloads = 2 }
{ Key = DebugInfo, Overloads = 1 }
{ Key = Decrement, Overloads = 2 }
{ Key = Default, Overloads = 1 }
{ Key = Divide, Overloads = 2 }
{ Key = DivideAssign, Overloads = 3 }
{ Key = Dynamic, Overloads = 6 }
{ Key = ElementInit, Overloads = 2 }
{ Key = Empty, Overloads = 1 }
{ Key = Equal, Overloads = 2 }
{ Key = ExclusiveOr, Overloads = 2 }
{ Key = ExclusiveOrAssign, Overloads = 3 }
{ Key = Field, Overloads = 3 }
{ Key = GetActionType, Overloads = 1 }
{ Key = GetDelegateType, Overloads = 1 }
{ Key = GetFuncType, Overloads = 1 }
{ Key = Goto, Overloads = 4 }
{ Key = GreaterThan, Overloads = 2 }
{ Key = GreaterThanOrEqual, Overloads = 2 }
{ Key = IfThen, Overloads = 1 }
{ Key = IfThenElse, Overloads = 1 }
{ Key = Increment, Overloads = 2 }
{ Key = Invoke, Overloads = 2 }
{ Key = IsFalse, Overloads = 2 }
{ Key = IsTrue, Overloads = 2 }
{ Key = Label, Overloads = 6 }
{ Key = Lambda, Overloads = 18 }
{ Key = LeftShift, Overloads = 2 }
{ Key = LeftShiftAssign, Overloads = 3 }
{ Key = LessThan, Overloads = 2 }
{ Key = LessThanOrEqual, Overloads = 2 }
{ Key = ListBind, Overloads = 4 }
{ Key = ListInit, Overloads = 6 }
{ Key = Loop, Overloads = 3 }
{ Key = MakeBinary, Overloads = 3 }
{ Key = MakeCatchBlock, Overloads = 1 }
{ Key = MakeDynamic, Overloads = 6 }
{ Key = MakeGoto, Overloads = 1 }
{ Key = MakeIndex, Overloads = 1 }
{ Key = MakeMemberAccess, Overloads = 1 }
{ Key = MakeTry, Overloads = 1 }
{ Key = MakeUnary, Overloads = 2 }
{ Key = MemberBind, Overloads = 4 }
{ Key = MemberInit, Overloads = 2 }
{ Key = Modulo, Overloads = 2 }
{ Key = ModuloAssign, Overloads = 3 }
{ Key = Multiply, Overloads = 2 }
{ Key = MultiplyAssign, Overloads = 3 }
{ Key = MultiplyAssignChecked, Overloads = 3 }
{ Key = MultiplyChecked, Overloads = 2 }
{ Key = Negate, Overloads = 2 }
{ Key = NegateChecked, Overloads = 2 }
{ Key = New, Overloads = 6 }
{ Key = NewArrayBounds, Overloads = 2 }
{ Key = NewArrayInit, Overloads = 2 }
{ Key = Not, Overloads = 2 }
{ Key = NotEqual, Overloads = 2 }
{ Key = OnesComplement, Overloads = 2 }
{ Key = Or, Overloads = 2 }
{ Key = OrAssign, Overloads = 3 }
{ Key = OrElse, Overloads = 2 }
{ Key = Parameter, Overloads = 2 }
{ Key = PostDecrementAssign, Overloads = 2 }
{ Key = PostIncrementAssign, Overloads = 2 }
{ Key = Power, Overloads = 2 }
{ Key = PowerAssign, Overloads = 3 }
{ Key = PreDecrementAssign, Overloads = 2 }
{ Key = PreIncrementAssign, Overloads = 2 }
{ Key = Property, Overloads = 7 }
{ Key = PropertyOrField, Overloads = 1 }
{ Key = Quote, Overloads = 1 }
{ Key = ReferenceEqual, Overloads = 1 }
{ Key = ReferenceNotEqual, Overloads = 1 }
{ Key = Rethrow, Overloads = 2 }
{ Key = Return, Overloads = 4 }
{ Key = RightShift, Overloads = 2 }
{ Key = RightShiftAssign, Overloads = 3 }
{ Key = RuntimeVariables, Overloads = 2 }
{ Key = Subtract, Overloads = 2 }
{ Key = SubtractAssign, Overloads = 3 }
{ Key = SubtractAssignChecked, Overloads = 3 }
{ Key = SubtractChecked, Overloads = 2 }
{ Key = Switch, Overloads = 6 }
{ Key = SwitchCase, Overloads = 2 }
{ Key = SymbolDocument, Overloads = 4 }
{ Key = Throw, Overloads = 2 }
{ Key = TryCatch, Overloads = 1 }
{ Key = TryCatchFinally, Overloads = 1 }
{ Key = TryFault, Overloads = 1 }
{ Key = TryFinally, Overloads = 1 }
{ Key = TryGetActionType, Overloads = 1 }
{ Key = TryGetFuncType, Overloads = 1 }
{ Key = TypeAs, Overloads = 1 }
{ Key = TypeEqual, Overloads = 1 }
{ Key = TypeIs, Overloads = 1 }
{ Key = UnaryPlus, Overloads = 2 }
{ Key = Unbox, Overloads = 1 }
{ Key = Variable, Overloads = 2 }
As you can see, there’s a wealth of new capabilities added to the expression trees API and one can’t but love this from the bottom of his .NET developer’s heart. What hasn’t changed though, at the moment, is the language support for expression tree generation. That is, when writing a statement bodied lambda expression and assigning it to an Expression<TDelegate> variable, the compiler won’t turn it into an expression tree using the new nodes for you. It keeps on supporting the subset required for LINQ. Till we truly enter the domain of meta-programming, I except the language to stay with the true expression tree subset.
A first sample
To get readers excited about this API enrichment, let’s have a go at writing a bigger expression tree containing statement and control flow nodes. Our running primes sample is a good choice for this task at hand. But first, the regular C# code to set the scene:
Func<int, List<int>> getPrimes = to =>
{
var res = new List<int>();
for (int n = 2; n <= to; n++)
{
bool found = false;
for (int d = 2; d <= Math.Sqrt(n); d++)
{
if (n % d == 0)
{
found = true;
break;
}
}
if (!found)
res.Add(n);
}
return res;
};
foreach (var num in getPrimes(100))
Console.WriteLine(num);
This code should be fairly straightforward to grasp. Notice I’m using a lambda expression to define the “method” we’re talking about, so that we have a local variable called getPrimes that we can invoke through. Our goal is to generate this code at runtime using the expression trees APIs, so that the use site (the foreach loop at the bottom) can remain the same.
Next, we’re on for a stroll in expression tree land. The top level node is a plain old LambdaExpression but then things start to change from what we’re used to. First of all, we’ll need sequencing which can be achieved through a BlockExpression. Each block can have a number of variables that are in scope, as well as a number of expressions that make part of it. In our case, the body of the lambda has a res variable and two statements: one for the for-loop and one for the return-statement:
var to = Expression.Parameter(typeof(int), "to");
var res = Expression.Variable(typeof(List<int>), "res");
var getPrimes =
// Func<int, List<int>> getPrimes =
Expression.Lambda<Func<int, List<int>>>(
// {
Expression.Block(
// List<int> res;
new [] { res },
// res = new List<int>();
Expression.Assign(
res,
Expression.New(typeof(List<int>))
),
// loop goes here
res
),
to
// }
).Compile();
foreach (var num in getPrimes(100))
Console.WriteLine(num);
Again, this piece of code is not too hard to understand. The Lambda factory method takes in the body and the lambda parameters. Difference from the past API landscape is that the body now can be a BlockExpression, and in our case it has a “res” variable in scope. Assignment is trivial using the Assign factory method passing in the left-hand side and right-hand side. Returning from the lambda expression can simply be done by using the “res” expression since the last valued expression from a block becomes its return value (an Expression.Return factory exists for more complicated cases where one returns from the middle of a block of code, which boils down to a “value-carrying” GotoExpression behind the scenes). Notice this is more general property of expression trees, as languages like Ruby use this property throughout (e.g. the value of an if-statement is the last expression’s value from either of the branches).
Now the loop. Although a whole bunch of loop constructs exist in front-end languages like C#, the expression tree APIs only provide one out of the box. Its goal is to centralize the plumbing involved with re-executing the loop’s body while having break and continue capabilities, as well as maintaining the last iteration’s value as the result of the loop (again for languages like Ruby that work on the DLR using expression trees). As we’ve written a for-loop in the original fragment, we need a way to cross-compile a for-loop into this basic building block for loops:
for (initializer; condition; update)
body
stands for
{
initializer;
while (condition)
{
body;
continueLabel:
update;
}
breakLabel:
}
where the body’s immediate inner statements get a substitution applied to rewrite break and continue statements in terms of the added labels. Notice the use of a separate block, keeping the for-loop’s initialized variables in a separate scope. More-over, the expression tree API’s loop construct doesn’t have a terminating condition, so that too needs to be replaced by a custom jump to the break label:
{
initializer;
while (true)
{
if (!condition)
goto breakLabel;
body;
continueLabel:
update;
}
breakLabel:
}
As our loops don’t use continue statements, we can omit that complication. For the outer loop we end up with:
// {
Expression.Block(
// int n;
new [] { n },
// n = 2;
Expression.Assign(
n,
Expression.Constant(2)
),
// while (true)
Expression.Loop(
// {
Expression.Block(
// if
Expression.IfThen(
// (!
Expression.Not(
// (n <= to)
Expression.LessThanOrEqual(
n,
to
)
// )
),
// break;
Expression.Break(breakOuter)
),
// loop body goes here
// n++;
Expression.PostIncrementAssign(n)
// }
),
breakOuter
)
)
A more thorough explanation is in order here. Think of Expression.Loop as providing an infinite loop. So, right inside the loop’s body (the first argument to the Loop call) we provide a condition to jump out of the loop when the terminating condition holds. As part of the loop’s body we also need to have the inner loop (see further) as well as the update part of the for-statement, which is a post-increment-assign node. One more thing is required here though: the use of a break label to denote the statement right after the loop. We create such a beast using the following call:
var breakOuter = Expression.Label();
and specify it as the last argument to the Loop call, where the expression tree API can take care of emitting the label in the right spot. Inside the loop, we can simply use the Break factory method to emit a GotoExpression that essentially breaks out of the loop. Also notice the use of the outer loop’s block as the scope for the variable “n”, which was declared like this:
var n = Expression.Variable(typeof(int), "n");
No rocket science either. Finally, we need to write the loop’s body which is piece of cake again: use a block to group multiple statements. It also contains an inner for-loop for which we use the same tricks, including the use of a break label etc:
// {
Expression.Block(
// bool found;
new[] { found },
// found = false;
Expression.Assign(
found,
Expression.Constant(false)
),
// {
Expression.Block(
// int d;
new [] { d },
// d = 2;
Expression.Assign(
d,
Expression.Constant(2)
),
// while (true)
Expression.Loop(
// {
Expression.Block(
// if
Expression.IfThen(
// (!
Expression.Not(
// d <= Math.Sqrt(n)
Expression.LessThanOrEqual(
d,
Expression.Convert(
Expression.Call(
null,
typeof(Math).GetMethod("Sqrt"),
Expression.Convert(
n,
typeof(double)
)
),
typeof(int)
)
)
// )
),
// break;
Expression.Break(breakInner)
),
// {
Expression.Block(
// if (n % d == 0)
Expression.IfThen(
Expression.Equal(
Expression.Modulo(
n,
d
),
Expression.Constant(0)
),
// {
Expression.Block(
// found = true;
Expression.Assign(
found,
Expression.Constant(true)
),
// break;
Expression.Break(breakInner)
// }
)
)
// }
),
// d++;
Expression.PostIncrementAssign(d)
// }
),
breakInner
)
),
// if
Expression.IfThen(
// (!found)
Expression.Not(found),
// res.Add(n);
Expression.Call(
res,
typeof(List<int>).GetMethod("Add"),
n
)
)
),
It should be straightforward to find out how the variables d, found and the label breakInner have been created upfront. Around the d <= Math.Sqrt(n) code there’s quite a bit of plumbing needed because of the implicit conversions that take place all over, resulting in the use of a few convert nodes. Other than that, the code above doesn’t use anything new.
For simplicity when trying this out for yourself in .NET 4.0 and Visual Studio 2010, here’s the whole fragment:
var to = Expression.Parameter(typeof(int), "to");
var res = Expression.Variable(typeof(List<int>), "res");
var n = Expression.Variable(typeof(int), "n");
var found = Expression.Variable(typeof(bool), "found");
var d = Expression.Variable(typeof(int), "d");
var breakOuter = Expression.Label();
var breakInner = Expression.Label();
var getPrimes =
// Func<int, List<int>> getPrimes =
Expression.Lambda<Func<int, List<int>>>(
// {
Expression.Block(
// List<int> res;
new [] { res },
// res = new List<int>();
Expression.Assign(
res,
Expression.New(typeof(List<int>))
),
// {
Expression.Block(
// int n;
new [] { n },
// n = 2;
Expression.Assign(
n,
Expression.Constant(2)
),
// while (true)
Expression.Loop(
// {
Expression.Block(
// if
Expression.IfThen(
// (!
Expression.Not(
// (n <= to)
Expression.LessThanOrEqual(
n,
to
)
// )
),
// break;
Expression.Break(breakOuter)
),
// {
Expression.Block(
// bool found;
new[] { found },
// found = false;
Expression.Assign(
found,
Expression.Constant(false)
),
// {
Expression.Block(
// int d;
new [] { d },
// d = 2;
Expression.Assign(
d,
Expression.Constant(2)
),
// while (true)
Expression.Loop(
// {
Expression.Block(
// if
Expression.IfThen(
// (!
Expression.Not(
// d <= Math.Sqrt(n)
Expression.LessThanOrEqual(
d,
Expression.Convert(
Expression.Call(
null,
typeof(Math).GetMethod("Sqrt"),
Expression.Convert(
n,
typeof(double)
)
),
typeof(int)
)
)
// )
),
// break;
Expression.Break(breakInner)
),
// {
Expression.Block(
// if (n % d == 0)
Expression.IfThen(
Expression.Equal(
Expression.Modulo(
n,
d
),
Expression.Constant(0)
),
// {
Expression.Block(
// found = true;
Expression.Assign(
found,
Expression.Constant(true)
),
// break;
Expression.Break(breakInner)
// }
)
)
// }
),
// d++;
Expression.PostIncrementAssign(d)
// }
),
breakInner
)
),
// if
Expression.IfThen(
// (!found)
Expression.Not(found),
// res.Add(n);
Expression.Call(
res,
typeof(List<int>).GetMethod("Add"),
n
)
)
),
// n++;
Expression.PostIncrementAssign(n)
// }
),
breakOuter
)
),
res
),
to
// }
).Compile();
foreach (var num in getPrimes(100))
Console.WriteLine(num);
Inspecting the IL code
Obviously you want to see the generated code after calling Compile. Near the top of this post you saw how to do this using SOS, so I won’t repeat the recipe over here (just make sure to load the sos.dll file from the v4.0 installation folder instead), but below is the result:
!dumpil 023f2bc0
This is dynamic IL. Exception info is not reported at this time.
If a token is unresolved, run "!do <addr>" on the addr given
in parenthesis. You can also look at the token table yourself, by
running "!DumpArray 023f985c".
// res = new List<int>();
IL_0000: newobj 023ef050 is not a MethodDesc
6000003
IL_0005: stloc.0
// n = 2
IL_0006: ldc.i4.2
IL_0007: stloc.1
// n <= to
IL_0008: ldloc.1
IL_0009: ldarg.1
IL_000a: ble.s IL_000f
// “mumble, mumble”
IL_000c: ldc.i4.0
IL_000d: br.s IL_0010
IL_000f: ldc.i4.1
IL_0010: ldc.i4.0
IL_0011: ceq
IL_0013: brfalse IL_001d
IL_0018: br IL_0079
// found = false;
IL_001d: ldc.i4.0
IL_001e: stloc.2
// d = 2
IL_001f: ldc.i4.2
IL_0020: stloc.3
// d <= Math.Sqrt(n)
IL_0021: ldloc.3
IL_0022: ldloc.1
IL_0023: conv.r8
IL_0024: call 023ef9a0 is not a MethodDesc
6000004
IL_0029: conv.i4
IL_002a: ble.s IL_002f
// “mumble, mumble”
IL_002c: ldc.i4.0
IL_002d: br.s IL_0030
IL_002f: ldc.i4.1
IL_0030: ldc.i4.0
IL_0031: ceq
IL_0033: brfalse IL_003d
IL_0038: br IL_005c
// n % d == 0
IL_003d: ldloc.1
IL_003e: ldloc.3
IL_003f: rem
IL_0040: ldc.i4.0
IL_0041: ceq
IL_0043: brfalse IL_004f
// found = true;
IL_0048: ldc.i4.1
IL_0049: stloc.2
IL_004a: br IL_005c
// d++;
IL_004f: ldloc.3
IL_0050: stloc.s VAR OR ARG 4
IL_0052: ldloc.s VAR OR ARG 4
IL_0054: ldc.i4.1
IL_0055: add
IL_0056: stloc.3
IL_0057: br IL_0021
// if (!found)
IL_005c: ldloc.2
IL_005d: ldc.i4.0
IL_005e: ceq
IL_0060: brfalse IL_006c
// res.Add(n);
IL_0065: ldloc.0
IL_0066: ldloc.1
IL_0067: callvirt 023f0548 is not a MethodDesc
6000005
// n++;
IL_006c: ldloc.1
IL_006d: stloc.s VAR OR ARG 5
IL_006f: ldloc.s VAR OR ARG 5
IL_0071: ldc.i4.1
IL_0072: add
IL_0073: stloc.1
IL_0074: br IL_0008
// return res;
IL_0079: ldloc.0
IL_007a: ret
Amazing, isn’t it? That’s it for now. Expect to see more expression tree fun in the future, in my continuation to the Control Library series of posts.
Del.icio.us |
Digg It |
Technorati |
Blinklist |
Furl |
reddit |
DotNetKicks
Filed under: LINQ, .NET Framework v4.0