Thursday, August 28, 2008 10:35 AM bart

To Bind or Not To Bind – Dynamic Expression Trees – Part 3

Last time in this series we were able to compile a stunningly complex “dynamic lambda” x => x – also known as I in the world of combinators – into IL code at runtime. As that’s not particularly useful, we want to move on to slightly more complex expressions like:

var o = DynamicExpression.Parameter("o");
var a = DynamicExpression.Parameter("a");
var b = DynamicExpression.Parameter("b");
var call = DynamicExpression.Call(o, "Substring", a, b);
var func = DynamicExpression.Lambda(call, o, a, b);
Console.WriteLine(func);
Console.WriteLine(func.Compile().DynamicInvoke("Bart", 1, 2));

Or, in pretty print,

(o, a, b) => o.Substring(a, b)

 

Setting the scene

We explained how the translation of a dynamic expression tree takes place in general: as we traverse the tree, individual nodes are visited asking them to append code capturing the expression’s semantics to an IL stream, pushing a value on the stack that corresponds to the evaluated expression. The method that does this translation for every dynamic expression is called “Compile”:

/// <summary>
/// Appends IL instructions to calculate the expression's runtime value, putting it on top of the evaluation stack.
/// </summary>
/// <param name="ilgen">IL generator to append to.</param>
/// <param name="ldArgs">Lambda argument mappings.</param>
protected internal abstract void Compile(ILGenerator ilgen, Dictionary<ParameterDynamicExpression, int> ldArgs);

In here we’re using a simplified concept of “lambda parameters in scope” using ldArgs, avoiding getting into slightly more complex techniques such as hoisting that are required for more involved expression trees. Previously you saw how to implement this method for ParameterDynamicExpression and LambdaDynamicExpression, respectively:

protected internal override void Compile(ILGenerator ilgen, Dictionary<ParameterDynamicExpression, int> ldArgs)
{
    if (!ldArgs.ContainsKey(this))
        throw new InvalidOperationException("Parameter expression " + Name + " is not in scope.");

    ilgen.Emit(OpCodes.Ldarg, ldArgs[this]);
}

and

protected internal override void Compile(ILGenerator ilgen, Dictionary<ParameterDynamicExpression, int> ldArgs)
{
    Body.Compile(ilgen, ldArgs);
    ilgen.Emit(OpCodes.Ret);
}

 

Dynamic method calls and binders

For MethodCallExpression things are a bit more involved than for the expression types above. Before we start, remember the most important portion of the Compile method contract: leave one value on top of the stack that corresponds to the evaluated expression, in this case a method call. What does a method call consist of? Here are the ingredients:

public ReadOnlyCollection<DynamicExpression> Arguments { get; private set; }
public DynamicExpression Object { get; private set; }
public string Method { get; private set; }

As we’re seeing other DynamicExpression objects being referenced in here, it’s already clear we’ll have to evaluate those by recursively calling Compile. So, we could do something along the lines of:

compile Object
foreach argument in Arguments
    compile argument
call Method

That’s the typical structure of a call site, pushing arguments on the stack including the object to invoke the method on. From a stack point of view: n + 1 arguments are pushed, where n is the number of arguments and 1 accounts for the instance to invoke the method one, and next all of those stack citizens are eaten by the method call, producing the single return value on top of the stack. This follows the contract of our Compile method.

There’s a slight problem though: we can’t emit the call because we don’t know what type to invoke it one. The reason is well-known in the meantime: the Object nor the Arguments have strongly-typed information, so just given a string named “Method”, we can’t get the required method metadata to emit a call(virt) instruction. Bummer. But that’s the whole point of dynamic programming, delaying the decision about the executed method/function till runtime because the type might dynamically grow with new members (think of ETS in PowerShell as a sample of such a capability).

One way to solve this problem is to emit a bunch of reflection code to investigate the type of Object at runtime, do the same for all of the arguments, try to find a suitable method to call, etc etc. I shouldn’t explain how complicated this would become :-). There are lots of drawbacks to this: we’re baking in the whole dynamic call infrastructure into the call site and as we’re emitting all of that code, the odds to adapt it without having to recompile the code are off. This whole “locate a suitable method” algorithm could be made extensible too if we’re not emitting it into the generated code straight away. In other words, we want to get out of the IL generating business as soon as we can, and introduce a level of indirection. That particular kind of indirection is what we call a binder.

So what’s a binder precisely? It’s simply a class that contains all the functionality to make well-formed decisions (based on certain desired semantics) about method calls (amongst other invocation mechanisms). Actually we have such a thing in the framework already: System.Reflection.Binder. As the documentation says:

Selects a member from a list of candidates, and performs type conversion from actual argument type to formal argument type.

The list of candidates is something that can be made extensible, allowing methods to be “imported” or “attached” to existing types at runtime. The type conversion clause in the sentence above outlines that the binder is responsible to take the actual passed in arguments (in our case weakly typed) and turn them into (i.e. casting) formal argument types that are suitable for consumption by the selected candidate method. The sample on MSDN for System.Reflection.Binder shows what it takes to implement such a beast. We’re not going to do that though, just to simplify matters a bit. As we’re only interested in method calls, we’ll just implement the bare minimum binder to get the job done and explained. Furthermore, we won’t spend time on implicit conversions for built-in types (like int to long) as the mentioned sample illustrates that already. Last but not least, generics are not brought in the equation either.

Without further delay, let’s show a possible binder implementation:

public class DynamicBinder
{
    public static object Call(object @this, string methodName, params object[] args)
    {
        //
        // Here we're going to be lazy for demo purposes only. Our overload resolution
        // will pick the first applicable method without applying "betterness" rules
        // as outlined in the C# specification (v2.0, section $7.4.2). We don't care
        // about extension methods either (how could the namespace be brought in scope
        // in the context of an expression tree...?) nor other dynamic type extensions
        // such as IExpando (~ IMarshalEx) or e.g. PowerShell ETS.
        //

        var result = (from method in @this.GetType().GetMethods()
                      where method.Name == methodName
                      let parameters = method.GetParameters()
                      where parameters.Length == args.Length
                            && parameters.Where((p, i) => p.ParameterType.IsAssignableFrom(args[i].GetType())).Count() == args.Length
                      select new { Method = method, Parameters = parameters }).SingleOrDefault();

        if (result == null)
        {
            StringBuilder sb = new StringBuilder();
            sb.Append("Failed to bind method call: ");
            sb.Append(@this.GetType());
            sb.Append(".");
            sb.Append(methodName);
            sb.Append("(");

            int n = args.Length;
            for (int i = 0; i < n; i++)
                sb.Append(args[i].GetType().ToString() + (i != n - 1 ? ", " : ""));

            sb.Append(").");
            throw new InvalidOperationException(sb.ToString());
        }

        return result.Method.Invoke(@this, args);
    }
}

This needs some explanation I assume. The signature should be straightforward: given an object @this, we want to call method methodName with zero or more arguments args. The result of this will be an object again (notice we don’t support void return types for methods being called, which isn’t too big of deal when considering functions as lambdas – i.e. no statement lambdas). What’s more interesting though is the way we find a suitable method. I chose to write it as a gigantic LINQ expression just to show how powerful LINQ can be. Let me walk you through it:

var result = (from method in @this.GetType().GetMethods()
              where method.Name == methodName
              let parameters = method.GetParameters()

For all methods available on the left-hand side of the call (i.e. @this) select those methods that have the same name (case sensitive compare – this would be a binder that mimics C# name resolution for method calls) and let parameters be a variable containing the parameters for each of the selected methods going forward. In other words, in what follows we’re seeing a sequence of (method, parameters) pairs mapping each suitable (at least concerning the name) method on the parameters it takes. Next we need to do overload resolution:

              where parameters.Length == args.Length

Here we make sure the number of arguments on the candidate method matches the number of arguments passed in to the binder’s Call call. This implies we don’t consider things like optional arguments supported by some languages which would mean that having less matching parameters (but not more!) would keep the method as a candidate, although there would need to be some ordering to make sure that methods with more arguments take precedence over methods with arguments supplied through optional values. Notice this simple check makes it also impossible to call a “params” method without stiffing the argument in an array upfront.

                    && parameters.Where((p, i) => p.ParameterType.IsAssignableFrom(args[i].GetType())).Count() == args.Length

Now we’re in the clause that’s maybe the most interesting. Here we’re taking all the parameters of the candidate and check that the parameter p on position i has a type that’s assignable from the type of the argument passed in to the binder’s Call method. In essence this is contravariance for arguments. Assume we’re examining a candidate method like this:

class ExperimentalZoo
{
    Animal CloneBeast(Mammal g);
}

and we’re calling the binder as follows:

DynamicBinder.Call(new ExperimentalZoo(), “CloneBeast”, new Giraffe())

As we’re calling the binder with an argument of type Giraffe (args[0].GetType()) and Giraffe inherits from Mammal (parameters[i].ParameterType), the candidate is compatible. However, if we’d call the method with an argument of type Goldfish it would clearly not be compatible (as a fish is not a mammal). This is precisely what the Where clause above enforces. The Count() == args.Length trick at the end makes sure all of the arguments pass the test (using the All operator would be ideal but it hasn’t an overload passing in the index; alternatively a Zip operator would be beneficial too).

Finally we have the select clause:

              select new { Method = method, Parameters = parameters }).SingleOrDefault();

which simply extracts the method (of type MethodInfo) and the parameters (of type ParameterInfo[]) and makes sure we only found one match. This is another simplification for illustrative purposes only – to be fully compliant with e.g. the C# language, we’d have to implement all of the overload resolution rules including “betterness rules” that select the most optimal overload. More information on this can be found in the C# specification, in v3.0 under “7.4.3 Overload Resolution”. The key takeaway though is that we can tweak this binder as much as we want (e.g., left as an exercise, we could implement resolution that takes extension methods into account) without affecting the generated IL code that will simply call into the binder’s Call method.

If we find one result, we can just go ahead and call it by calling through the retrieved Method using the Invoke method, passing in the @this pointer and the args array.

 

Connecting the pieces

Now that we have our beloved binder, we need to glue it together with our dynamic expression compilation. In concrete terms this means we need to emit a call to DynamicBinder.Call in the generated IL for the DynamicCallExpression. This isn’t too hard either:

protected internal override void Compile(ILGenerator ilgen, Dictionary<ParameterDynamicExpression, int> ldArgs)
{
    if (Object == null)
        ilgen.Emit(OpCodes.Ldnull);
    else
        Object.Compile(ilgen, ldArgs);

    ilgen.Emit(OpCodes.Ldstr, Method);

    ilgen.Emit(OpCodes.Ldc_I4, Arguments.Count);
    ilgen.Emit(OpCodes.Newarr, typeof(object));

    LocalBuilder arr = ilgen.DeclareLocal(typeof(object[]));
    ilgen.Emit(OpCodes.Stloc, arr);

    int i = 0;
    foreach (DynamicExpression arg in Arguments)
    {
        ilgen.Emit(OpCodes.Ldloc, arr);
        ilgen.Emit(OpCodes.Ldc_I4, i++);
        arg.Compile(ilgen, ldArgs);
        ilgen.Emit(OpCodes.Stelem_Ref);
    }

    ilgen.Emit(OpCodes.Ldloc, arr);

    ilgen.EmitCall(OpCodes.Call, typeof(DynamicBinder).GetMethod("Call"), null);
}

What’s going on here? First, we check whether an Object has been specified. This is more of an extensibility point in case our binder would like to implement global functions (also left as an exercise, for example you could recognize null.Add(1, 2) as a global Add call, translating into Math.Add(…); or, the Method property could be set to “Math.Add” to denote a static method call). We’ll assume the else case holds true for our samples, causing us to call Compile recursively on the Object dynamic expression. This will add the value corresponding to the Object expression tree’s evaluation on top of the stack (note: you can smell call-by-value semantics already, don’t you?). Next, we load the string specified in the Method property onto the stack as well. Currently the stack looks like:

(string) Method
(object) Object.Compile result

Now we get into interesting stuff as our binder’s Call method expects to see an object[] as its third parameter. How many arguments? On for each of the DynamicExpression objects in the Arguments collection, so we do Newarr passing in the object type object after pushing the number of elements to be allocated on the stack using Ldc_I4 passing in Arguments.Count. Now we have our array, we can store it in a local variable we call “arr”. Time to fill the array by first loading the local, then pushing the index followed by a push of the argument’s value – again obtained by a recursive Compile call on the argument “arg” – and finally calling stelem_ref (as we’re dealing with System.Object we need _ref). The loop invariant is that it doesn’t change the stack height: it cleanly loads three “arguments” to stelem_ref which brings the stack delta back to 0).

Ultimately, we load the array local variable and the stack looks like (semantically):

(object[]) Arguments.Select(arg => arg.Compile()).ToArray()
(string) Method
(object) Object.Compile()

ready for a call to DynamicBinder.Call which turns the stack into:

(object) DynamicBinder.Call(Object.Compile(), Method, Arguments.Select(arg => arg.Compile()).ToArray())

Again, we have managed to keep the house clean with regards to the stack behavior, i.e. the element on top of the stack contains the value corresponding to the entire (MethodCall)DynamicExpression.

 

Testing it

Does it work? Let’s try with our running sample:

var o = DynamicExpression.Parameter("o");
var a = DynamicExpression.Parameter("a");
var b = DynamicExpression.Parameter("b");
var call = DynamicExpression.Call(o, "Substring", a, b);
var func = DynamicExpression.Lambda(call, o, a, b);
Console.WriteLine(func);
Console.WriteLine(func.Compile().DynamicInvoke("Bart", 1, 2));

Recognize the patterns in the output IL?

image

A quick walk-through:

  • IL_0000 loads MethodCallDynamicExpression.Object which in turn was compiled into a ldarg V_0 by the ParameterDynamicExpression’s Compile method (this corresponds to “o”)
  • IL_0006 loads MethodCallDynamicExpression.Method
  • IL_000b to IL_0015 prepares the array for the method call arguments to be passed to the binder
  • IL_0016 to IL_0022 puts the first argument (corresponding to “a” translated into ldarg V_1 through ParameterDynamicExpression.Compile) in the array
  • IL_0023 to IL_002f does the same for the second argument (corresponding to “b” translated into ldarg V_2 through ParameterDynamicExpression.Compile)
  • IL_0030 to IL_0036 finally makes the call through the binder, passing in the results of the above and returning the value produced by the binder

If we now set a breakpoint in the DynamicBinder.Call method and let execution continue, we’ll see:

image

The third line in the Call Stack is where DynamicInvoke is happening:

Console.WriteLine(func.Compile().DynamicInvoke("Bart", 1, 2));

and through the “External Code” corresponding to our emitted dynamic method we got back into the DynamicBinder that now will pick the right Substring method given lhs “Bart” and arguments 1 and 2. Ultimately the following prints to the screen:

image

Magic. To show it’s really extensible we can start to compose things endlessly with our two main ingredients: parameter and method call expressions. Here’s a sample (reverse engineering the nested DynamicExpression factory calls is left as an exercise):

image

Also left as an exercise to the reader is to find values for o and a through h that produce the displayed output above :-). For the record, here’s the corresponding IL:

IL_0000: ldarg      V_0
IL_0004: nop       
IL_0005: nop       
IL_0006: ldstr      "Replace"
IL_000b: ldc.i4     2
IL_0010: newarr     Object
IL_0015: stloc.0   
IL_0016: ldloc.0   
IL_0017: ldc.i4     0
IL_001c: ldarg      V_3
IL_0020: nop       
IL_0021: nop       
IL_0022: stelem.ref
IL_0023: ldloc.0   
IL_0024: ldc.i4     1
IL_0029: ldarg      V_4
IL_002d: nop       
IL_002e: nop       
IL_002f: stelem.ref
IL_0030: ldloc.0   
IL_0031: call       System.Object Call(System.Object, System.String, System.Object[])/BinderFun.DynamicBinder
IL_0036: ldstr      "Substring"
IL_003b: ldc.i4     2
IL_0040: newarr     Object
IL_0045: stloc.1   
IL_0046: ldloc.1   
IL_0047: ldc.i4     0
IL_004c: ldarg      V_1
IL_0050: nop       
IL_0051: nop       
IL_0052: stelem.ref
IL_0053: ldloc.1   
IL_0054: ldc.i4     1
IL_0059: ldarg      V_2
IL_005d: nop       
IL_005e: nop       
IL_005f: stelem.ref
IL_0060: ldloc.1   
IL_0061: call       System.Object Call(System.Object, System.String, System.Object[])/BinderFun.DynamicBinder
IL_0066: ldstr      "Replace"
IL_006b: ldc.i4     2
IL_0070: newarr     Object
IL_0075: stloc.2   
IL_0076: ldloc.2   
IL_0077: ldc.i4     0
IL_007c: ldarg      V_5
IL_0080: nop       
IL_0081: nop       
IL_0082: stelem.ref
IL_0083: ldloc.2   
IL_0084: ldc.i4     1
IL_0089: ldarg      V_6
IL_008d: nop       
IL_008e: nop       
IL_008f: stelem.ref
IL_0090: ldloc.2   
IL_0091: call       System.Object Call(System.Object, System.String, System.Object[])/BinderFun.DynamicBinder
IL_0096: ldstr      "PadRight"
IL_009b: ldc.i4     2
IL_00a0: newarr     Object
IL_00a5: stloc.3   
IL_00a6: ldloc.3   
IL_00a7: ldc.i4     0
IL_00ac: ldarg      V_7
IL_00b0: nop       
IL_00b1: nop       
IL_00b2: stelem.ref
IL_00b3: ldloc.3   
IL_00b4: ldc.i4     1
IL_00b9: ldarg      V_8
IL_00bd: nop       
IL_00be: nop       
IL_00bf: stelem.ref
IL_00c0: ldloc.3   
IL_00c1: call       System.Object Call(System.Object, System.String, System.Object[])/BinderFun.DynamicBinder
IL_00c6: ldstr      "ToUpper"
IL_00cb: ldc.i4     0
IL_00d0: newarr     Object
IL_00d5: stloc.s    V_4
IL_00d7: ldloc.s    V_4
IL_00d9: call       System.Object Call(System.Object, System.String, System.Object[])/BinderFun.DynamicBinder
IL_00de: ret       

Enjoy! Next time … who knows what?

Del.icio.us | Digg It | Technorati | Blinklist | Furl | reddit | DotNetKicks

Filed under: ,

Comments

# To Bind or Not To Bind ??? Dynamic Expression Trees ??? Part 3

Pingback from  To Bind or Not To Bind ??? Dynamic Expression Trees ??? Part 3

# Reflective Perspective - Chris Alcock &raquo; The Morning Brew #168

Pingback from  Reflective Perspective - Chris Alcock  &raquo; The Morning Brew #168

# Dew Drop - August 29, 2008 | Alvin Ashcraft's Morning Dew

Pingback from  Dew Drop - August 29, 2008 | Alvin Ashcraft's Morning Dew

# re: To Bind or Not To Bind – Dynamic Expression Trees – Part 3

Monday, September 01, 2008 8:17 AM by Tuomas Hietanen

Hmmm...

1. Using dynamic compilation it would be possible to create a (evolution algorithm/neural network based) software that takes a set of unit tests as input and makes automatically a software that passes those unit tests.

Will the future of software engineering be writing only unit tests??

2. What about dynamic recursive lambdas?-) This is a nice article:

<blogs.msdn.com/.../recursive-lambda-expressions.aspx>

# WMOC#18 - Microsoft Solver Foundation released - Service Endpoint

Pingback from  WMOC#18 - Microsoft Solver Foundation released - Service Endpoint

# Recent Links Tagged With "stream" - JabberTags

Saturday, September 13, 2008 9:05 PM by Recent Links Tagged With "stream" - JabberTags

Pingback from  Recent Links Tagged With "stream" - JabberTags

# Websites tagged "opcodes" on Postsaver

Friday, September 19, 2008 11:02 PM by Websites tagged "opcodes" on Postsaver

Pingback from  Websites tagged "opcodes" on Postsaver

# Bookmarks about Powershell

Sunday, September 28, 2008 7:00 PM by Bookmarks about Powershell

Pingback from  Bookmarks about Powershell

# Dynamic Expressions Example &laquo; Angel &#8220;Java&#8221; Lopez on Blog

Thursday, January 29, 2009 10:29 PM by Dynamic Expressions Example « Angel “Java” Lopez on Blog

Pingback from  Dynamic Expressions Example &laquo; Angel &#8220;Java&#8221; Lopez on Blog