##
LINQ and The Matrix – Introducing MLinq

Today we’ll take a look at project “MLinq”, a very simple way to perform symbolic matrix calculations based on LINQ expression trees.

# Introduction

But first, why would you even want to do this? Let me tell you I’m a big believer of the power of mathematical notation. Say you’ve been asked to write a piece of code that will carry out rotation of points in an n-dimensional space. To simplify matters, just consider two dimensions. Lots of readers will know this isn’t that hard using matrices:

So far so good, but now you need to know matrix multiplication in order to concretize the formula for both coordinates. Not a big deal for this particular case, but composability suffers: next you want to take the output of this rotation and feed it in to a transposition, another rotation, and so on. Wouldn’t it be nice just to write the formula as-is and let the system figure out how to turn it into execution? Obviously matrices are just the tip of the iceberg, arbitrary mathematical expressions can be represented in a symbolic format and turned to execution somehow. However, we don’t intend to write a whole symbolic math framework as part of the journey, as others have done this before (Maple, Matlab, Mathematica, to name a few)...

Let’s go back to our sample of matrix multiplication. What we’re after is something along those lines:

var t = MathExpression.Symbol<double>("t"); var p = MathExpression.Symbol<Point>("p"); MatrixExpression<double> rotate = new Expression[2, 2] { { MathExpression.Cos(t), -MathExpression.Sin(t) }, { MathExpression.Sin(t), MathExpression.Cos(t) } }; MatrixExpression<double> point = new Expression[2, 1] { { p.X() }, { p.Y() } }; var rot = (rotate * point).Compile<Func<Point, double, Point>>(p, t); Console.WriteLine(rot(new Point(1, 0), Math.PI / 4));

First we define two symbolic constants, t and p. The first symbol, t, stands for the rotation angle (theta) in radians; the second symbol, p, stands for the point to rotate. Next, we define two matrices: one for the rotation matrix (using factory methods for the symbolic representation of cos(theta) and sin(theta)), and another one for the components of the point (actually the X and Y over there could benefit from some dynamic typing features, see further, as the empty method call parentheses pairs are aesthetically unpleasing). Where the magic happens is in the line below:

var rot = (rotate * point).Compile<Func<Point, double, Point>>(p, t);

Here we do a bit too much all at once, but it should still be comprehendible enough. First we multiply the two matrices in a “natural” way (you can already smell operator overloading everywhere, don’t ya?), and friendly ask the resulting expression to compile itself into a delegate that given a point and a rotation angle carries out the rotation, returning the resulting point.

Once we have all of this, we have a compiled piece of code at our service to rotate as much points as we see fit to. The sample above prints the following to the screen, rotating point (1,0) over an angle of 45° (counter-clockwise that is):

(0.707106781186548,0.707106781186547)

… something every self-respecting math-aware programmer should have predicted almost instantaneously.

# Matrix expressions

The first piece of glue we need to get this to work is a way to express matrices. Core question becomes what makes up a matrix. The answer is: cells. What do we want those cells to contain? Answer: expressions. So we end up with an array of cells (let’s stick with two dimensions as we’re talking about matrices and not a higher-order construct like tensors).

Next, what can we do with a single matrix by itself? It makes sense to be able to retrieve an arbitrary cell’s “value”, i.e. the expression that lives in the cell with indices i and j. Notice this “value” isn’t actually a value, it’s an expression representing the cell’s contents in a symbolic way (e.g. x + y, where x and y come from somewhere, maybe as a result of adding together two matrices of the same dimensions).

Now that we know our intent with individual matrices, what about operations on one or more matrices, like negation, addition, subtraction, multiplication, and more matrix-specific operations like transposition and calculating the determinant of a matrix? An obvious choice here is to leverage the power of operator overloading. Say we want to add up two matrices, what will the result look like? Assuming the dimensions of both arrays match up (a requirement to add two matrices together), each cell will consist of an expression that is the sum of the corresponding expressions in both the input matrices. In other words, we do cell-by-cell addition. Operations like multiplication are a bit more involved though.

An important question in this whole discussion is whether we want to build up the individual cell expressions at the time a matrix operation is carried out (like addition), or we just want to keep around the fact we applied an operation between different matrices, without building up the expressions for individual cells. In some sense, the former approach is eager while the latter is lazy. And actually, the lazy approach makes sense as the user might just be interested in retrieving the contents of cell (123, 654) after carrying out a bunch of operations on 1024x1024 matrices. It would clearly be a waste of effort to build up expression trees for all other cells. This said, it should be possible to forcefully build up the resulting (symbolic) matrix expression if the user thinks that’s the thing to do (maybe because all cell values are needed). So we’ll go with the lazy variant as the foundation.

Enough about design, let’s go into implementation. Or wait, one more thing: we’ll treat expressions as immutable for very good reasons and stay in a functional world as much as possible. This said, here’s the class with factory methods for matrices (omitted null checks and such for clarity):

public static class MatrixExpression { public static NewMatrixExpression<T> New<T>(Expression[,] elements) { return new NewMatrixExpression<T>(elements); } public static BinaryMatrixExpression<T> Sum<T>(MatrixExpression<T> left, MatrixExpression<T> right) { return new SumMatrixExpression<T>(left, right); } public static BinaryMatrixExpression<T> Product<T>(MatrixExpression<T> left, MatrixExpression<T> right) { return new ProductMatrixExpression<T>(left, right); } public static UnaryMatrixExpression<T> Negate<T>(MatrixExpression<T> operand) { return new NegateMatrixExpression<T>(operand); } public static UnaryMatrixExpression<T> Transpose<T>(MatrixExpression<T> operand) { return new TransposeMatrixExpression<T>(operand); } public static Expression Determinant<T>(MatrixExpression<T> operand) { throw new NotImplementedException("Left as an exercise for the reader."); } }

For instance, to get into matrix world, we have to start by calling MatrixExpression.New<T> at some point. This takes in a two-dimensional array of LINQ expression trees that represent, symbolically, the contents of the matrix’s cells. I didn’t want to call this guy a “constant expression” because of its possibly parameterized nature (containing x’s and y’s). The other operators are quite easy to grasp I guess: most of them build up new matrix expressions based on existing ones, with the exception of Determinant that returns to LINQ expression tree land (and is left as a fun exercise for the reader :-)). In essence all of those play the role of constructors, which will become more convenient once we have operator overloads in place (see further).

So, what does a matrix expression by itself look like?

public abstract class MatrixExpression<T> { internal MatrixExpression(MatrixExpressionType type) { NodeType = type; } public Expression this[int i, int j] { get { if (i < 0 || i >= Rows) throw new ArgumentOutOfRangeException("i");

if (j < 0 || j >= Columns) throw new ArgumentOutOfRangeException("j"); return Retrieve(i, j); } } public MatrixExpressionType NodeType { get; private set; } public int Rows { get; internal set; } public int Columns { get; internal set; } internal abstract Expression Retrieve(int i, int j); public Expression ToBLOCKED EXPRESSION {

// See further.

} public override string ToString() { // See further.

} public R Compile<R>(params ParameterExpression[] parameters) {

// See further. } public static BinaryMatrixExpression<T> operator +(MatrixExpression<T> left, MatrixExpression<T> right) { return MatrixExpression.Sum(left, right); } public static BinaryMatrixExpression<T> operator *(MatrixExpression<T> left, MatrixExpression<T> right) { return MatrixExpression.Product(left, right); } public static UnaryMatrixExpression<T> operator -(MatrixExpression<T> operand) { return MatrixExpression.Negate(operand); } public static UnaryMatrixExpression<T> operator ~(MatrixExpression<T> operand) { return MatrixExpression.Transpose(operand); } public static implicit operator MatrixExpression<T>(Expression[,] ex) { return MatrixExpression.New<T>(ex); } }

The operator overloads at the bottom shouldn’t be too strange (I’ve omitted a few for simplicity), but maybe the last one deserves some more attention. It’s nothing more but an implicit conversion operator between a two-dimensional array of expressions into a matrix expression, allowing you to write things like:

MatrixExpression<double> rotate = new Expression[2, 2] { { MathExpression.Cos(t), -MathExpression.Sin(t) }, { MathExpression.Sin(t), MathExpression.Cos(t) } };

As you can see, a matrix expression by itself is quite an abstract beast (at least on the instance level). It knows how much rows and columns it has, what its node-type is (just to reduce type casts) and how to retrieve elements (the abstract method will be implemented by the various subclasses and omits bounds checking, we control the bounds of the indices that come in through the indexer and trust implementers to provide a meaningful implementation for indices that are within range). There’s some common functionality though, like pretty printing (ToString). Let’s show that one first:

public override string ToString() { StringBuilder sb = new StringBuilder(); sb.Append("{ "); for (int i = 0; i < Rows; i++) { sb.Append("{ "); for (int j = 0; j < Columns; j++) { sb.Append(Retrieve(i, j).ToString()); if (j != Columns - 1) sb.Append(", "); } sb.Append(" }"); if (i != Rows - 1) sb.Append(", "); } sb.Append(" }"); return sb.ToString(); }

No surprises here, just calling into ToString on the individual expressions for the cells retrieved through the Retrieve method. So, in essence, ToString forces the cells to be “computed” in a symbolic fashion (e.g. ToString invoked on a summation matrix expression will yield *cell-left + cell-right *in a symbolic fashion, e.g. x + y).

Printing isn’t the most exciting thing though, but we’ll defer the discussion of ToExpression and Compile for a little while longer, and show how to implement concrete matrix expressions first.

# Implementing sum and product

So, how does a concrete matrix expression implementation look like? Let’s start by taking a quick look at the NewMatrixExpression, which will act as a leaf node in matrix expression trees:

public sealed class NewMatrixExpression<T> : MatrixExpression<T> { private Expression[,] _elements; internal NewMatrixExpression(Expression[,] elements) : base(MatrixExpressionType.New) { _elements = elements; Rows = elements.GetLength(0); Columns = elements.GetLength(1); } internal override Expression Retrieve(int i, int j) { return _elements[i, j]; } }

This thing does nothing more but storing a reference to the array of expressions (representing each individual cell), with a trivial implementation of Retrieve on top of it. All we’ve done is abstracted the concept of a two-dimensional array a bit further. (Notice we don’t have true immutability here as we pass in a reference to an existing array of expressions, modifying that will have effects on the whole matrix expression tree defined on top of it…).

Enough trivialities, let’s turn to the real stuff by taking a glimpse of the SumMatrixExpression:

public sealed class SumMatrixExpression<T> : BinaryMatrixExpression<T> { internal SumMatrixExpression(MatrixExpression<T> left, MatrixExpression<T> right) : base(MatrixExpressionType.Sum, left, right) { if (left.Rows != right.Rows || left.Columns != right.Columns) throw new InvalidOperationException("Dimensions do not match."); Rows = left.Rows; Columns = left.Columns; } internal override Expression Retrieve(int i, int j) { return Expression.Add(Left.Retrieve(i, j), Right.Retrieve(i, j)); } }

Still we’re fairly minimalistic in the implementation. We do a little checking of the incoming matrices to make sure the dimensions match, and call into our base class constructor to create a “binary matrix expression” (nothing more than a container for a two MatrixExpression instances, of the same generic parameter T, named Left and Right). The interesting part is Retrieve, which locates the requested cell in both matrices and returns a new expression tree representing the sum of the cells. So, internally matrix expressions perform their operations based on the underlying LINQ expression trees.

Cool, so how does a product matrix look like? A bit more complicated due to the less trivial mathematical definition, but again relatively straightforward and without boilerplate code:

public sealed class ProductMatrixExpression<T> : BinaryMatrixExpression<T> { private int _k; internal ProductMatrixExpression(MatrixExpression<T> left, MatrixExpression<T> right) : base(MatrixExpressionType.Product, left, right) { if (left.Columns != right.Rows) throw new InvalidOperationException("Dimensions do not line up."); Rows = left.Rows; Columns = right.Columns; _k = left.Columns; } internal override Expression Retrieve(int i, int j) { var res = Expression.Multiply(Left.Retrieve(i, 0), Right.Retrieve(0, j)); for (int k = 1; k < _k; k++) res = Expression.Add(res, Expression.Multiply(Left.Retrieve(i, k), Right.Retrieve(k, j))); return res; } }

If you can’t remember how to multiply matrices, open up your favorite search engine to find the answer. You’ll see we’re simply encoding that algorithm above by means of a single for-loop, building up an expression tree that’s a sum of products of cells.

I’ll leave the implementation of Transpose (rows become columns and vice versa) and Negate (negate every cell) to the reader.

# Preparing for evaluation

Next we need to take a look at how to evaluate a matrix expression, i.e. how to yield concrete values when supplying substitutions for the symbols. But wait a minute, where to those symbols come from? Say, we’re creating a new matrix and want to stick x and y in some cells, what precisely are x and y? The answer is: ParameterExpression, one of the node types of LINQ expression trees. That by itself is simple enough, but on the surface we might want to abstract that a little to make things more general and more aesthetically pleasing. (Actually, I’m deviating a little here into the world of OMMLinq, where MatrixExpressions are a subset of MathExpressions.) Here’s the glue we want to add:

public class MathExpression { private Expression _ex; internal MathExpression(Expression ex) { _ex = ex; } public static MathExpression operator -(MathExpression op) { return new MathExpression(Expression.Negate(op._ex)); } public static MathExpression Cos(MathExpression arg) { return new MathExpression(Expression.Call(typeof(Math).GetMethod("Cos"), arg._ex)); } public static MathExpression Sin(MathExpression arg) { return new MathExpression(Expression.Call(typeof(Math).GetMethod("Sin"), arg._ex)); } public static SymbolMathExpression Symbol<T>(string name) { return new SymbolMathExpression(Expression.Parameter(typeof(T), name)); } public static implicit operator MathExpression(Expression ex) { return new MathExpression(ex); } public static implicit operator Expression(MathExpression ex) { return ex._ex; } } public class SymbolMathExpression : MathExpression { private ParameterExpression _ex; internal SymbolMathExpression(ParameterExpression ex) : base(ex) { _ex = ex; } public static implicit operator ParameterExpression(SymbolMathExpression ex) { return ex._ex; } }

I’ve omitted a bunch of operators (trigonometry functions, exponentials and logarithms, combinatorials, etc), to reduce the concept of MathExpression to the bare minimum needed for our rotation sample. The key thing here is to see how implicit conversions allow us to go back and forth between LINQ expression trees and math expressions. It’s a bit unfortunate the LINQ expression trees form a closed world (but for a very good reason), so you don’t get to extend that library, ending up with a common base type for all expression tree nodes, so we cook our own. But in essence, once we’re in the world of MathExpression, operations are defined to apply operators and such, without leaking the LINQ expression tree implementation to the outside as part of those operations (unless the raw expression tree is needed, it stays within the walls of this domain-specific expression tree format).

From the above, you can see how we’ve introduce syntactical sugar for a ParameterExpression in the form of MathExpression.Symbol<T>:

var t = MathExpression.Symbol<double>("t"); var p = MathExpression.Symbol<Point>("p");

This feels much more natural and domain-specific than a raw ParameterExpression from LINQ. What Point is defined like shouldn’t be too much of a surprise: two double-valued properties X and Y (and again, immutable).

Given all of this, you can see how (with additional operators on MathExpression if required) larger and more complex math expressions are built: simply by using the constructor factory methods on MathExpression. That’s how we end up with the definition of the rotation and point matrices:

MatrixExpression<double> rotate = new Expression[2, 2] { { MathExpression.Cos(t), -MathExpression.Sin(t) }, { MathExpression.Sin(t), MathExpression.Cos(t) } }; MatrixExpression<double> point = new Expression[2, 1] { { p.X() }, { p.Y() } };

A few things should stand out here. First, why did I decide to make MatrixExpression “compatible with” a two-dimensional array of LINQ expressions, as opposed to MathExpression objects? The sole reason right now is because of simplification in the implementation of matrices; more specifically, Expression.Add (and its brothers) can’t do operator overload resolution by themselves. For instance, if you’d have two matrices of complex numbers and the ComplexNumber class has an operator overload for +, Expression.Add on two such terms won’t be able to find that operator overload by itself. To avoid distraction on implementing overload resolution rules, I’ve omitted this kind of plumbing (although you can still make it crash by using “unconventional” matrices, e.g. of string elements, which do not have a built-in + operator either).

The second observation is the use of methods on p to retrieve the point’s coordinates. As I wanted to show the point’s coordinates explicitly (as opposed to multiplying the rotation matrix, which is 2 x 2, with a point, that would look like 1 x 1), I needed a way to extract the coordinates. Simply calling the properties for X and Y doesn’t help, as those return the double values directly. Rather, we need some kind of indirection: a symbolic representation of “take the X coordinate of point p”. I’ve hacked this up using a couple of extension methods:

static class PointExtensions { public static MathExpression X(this MathExpression ex) { // Could be generalized using C# 4.0 "dynamic". return new MathExpression(Expression.MakeMemberAccess(ex, ((Expression)ex).Type.GetProperty("X"))); } public static MathExpression Y(this MathExpression ex) { // Could be generalized using C# 4.0 "dynamic". return new MathExpression(Expression.MakeMemberAccess(ex, ((Expression)ex).Type.GetProperty("Y"))); } }

Essentially, those build up math expressions based on expression trees that refer to the respective properties. We can’t make this happen automatically while still having somewhat nice syntax. Expression trees can be created for lambda expressions in C#, like – with a closure capturing outer variable p – () => p.X, but not for “stand-alone” expressions. Even if we’d go with the lambda syntax, using p.X won’t work as p is of type SymbolMathExpression and that doesn’t have an X property by itself (the thing it refers to might have though, as is the case in our sample). That’s where “dynamic” in expression trees would come in handy.

# Evaluation

So, how do we want to go about evaluating the result of a symbolic computation using those matrix expressions? The answer is, we need to be able to turn cells or the whole matrix into code that can compute values, given the substitutions for the symbols. For example, if we multiply rotate by point and want to know the topmost, leftmost cell’s value, we still need to supply values for the point and the rotation angle, because the cell’s computation might depend on those. So, ultimately we need to turn the cell into the body of a lambda expression that’s parameterized in the symbols.

Let’s start by looking at the plumbing to evaluate a single cell:

MatrixExpression<double> res = rotate * point; Expression cell00 = res[0, 0]; Console.WriteLine(cell00); LambdaExpression fCell00 = Expression.Lambda(cell00, p, t); var f = (Func<Point, double, double>)fCell00.Compile(); double xNew = f(new Point(1, 0), Math.PI / 4); Console.WriteLine(xNew);

Step-by-step now. The first line is obvious, we take our two matrices and multiply them, giving us back a new matrix containing double-valued (though symbol) cells. Next, we call into the public indexer to retrieve the expression tree for the topmost, leftmost cell. Again, that’s symbolic. You should be able to predicate the third line therefore prints (modulo precise knowledge of Expression.ToString):

((Cos(t) * p.X) + (-Sin(t) * p.Y))

As you can see, this contains ingredients from different cells in the original matrices, but composed through LINQ expression tree operations like product and sum. Next, we want to evaluate this, which means we need to get rid of the symbolic representation of p (the pont) and t (the rotation angle). So, we wrap this in a lambda expression that has p and t as parameters. This is what fCell00 is all about. If you’d printf that one, you’d see something like:

(p, t) => ((Cos(t) * p.X) + (-Sin(t) * p.Y))

We know the type of p and t, as well as the result (a cell of the matrix is of type double), so we can provide a signature for the to-be-compiled lambda: Point –> double –> double, or in C# style: Func<Point, double, double>. That’s what gets stored in f. And finally, we can call through the delegate, applying our function to a point (x = 1, y = 0) and with a rotation angle of 45° (Pi/4 in radians).

Quite some boilerplate code to get this to work, but if you want you could compactify the whole thing quite a bit:

var f = (Func<Point, double, double>)Expression.Lambda((rotate * point)[0,0], p, t).Compile();

and indeed, this kind of logic would rightfully belong inside MatrixExpression, returning a delegate given cell indices and a (symbols) parameter order.

Now that we got an idea about how to evaluate a single cell, what about evaluating a whole matrix? What would the result of that be? Clearly, a two-dimensional array of concrete values that are computed based on substitution values for the symbols. That’s exactly what the ToExpression method on MatrixExpression is meant for:

public Expression ToBLOCKED EXPRESSION { var res = Expression.NewArrayInit(typeof(T[]), Enumerable.Range(0, Rows).Select(i => Expression.NewArrayInit(typeof(T), Enumerable.Range(0, Columns).Select(j => Retrieve(i, j) ))).Cast<Expression>() // Missing generic variance (C# 4.0). ); return res; }

This might look quite dense, but uses the power of LINQ to build up (the expression tree for) an array of symbolic computations, based on the Retrieve method for each individual cell. Insert a few curly braces and it starts to look like nested foreach loops:

public Expression ToBLOCKED EXPRESSION { var res = Expression.NewArrayInit(typeof(T[]), Enumerable.Range(0, Rows).Select(i =>

{ Expression.NewArrayInit(typeof(T), Enumerable.Range(0, Columns).Select(j =>

{ Retrieve(i, j); });

})).Cast<Expression>() // Missing generic variance (C# 4.0). ); return res; }

What this creates is an expression tree to create an instance of a T[][] array. Notice I’m using jagged arrays because expression trees do not support the creation of multi-dimensional arrays at the time of writing. The use of the Cast<Expression> is interesting. If you emit this, you’ll get a compile error saying: cannot convert an IEnumerable<NewArrayExpression> into an IEnumerable<Expression>. Clearly this is valid because of covariance (IEnumerable<Elephant> should be assignable to IEnumerable<Animal>), but we don’t have this feature at our service for the moment (but C# 4.0 will, yippie, hence my comment in the code).

Anyway, let’s try the following:

MatrixExpression<double> res = rotate * point; Expression matrix = res.ToBLOCKED EXPRESSION; Console.WriteLine(matrix); LambdaExpression fMatrix = Expression.Lambda(matrix, p, t); var f = (Func<Point, double, double[][]>)fMatrix.Compile(); double[][] pNew = f(new Point(1, 0), Math.PI / 4); pNew.Print();

Notice the parallels with the code to evaluate a single cell. Now the first Console.WriteLine prints:

new [] {new [] {((Cos(t) * p.X) + (-Sin(t) * p.Y))}, new [] {((Sin(t) * p.X) + (Cos(t) * p.Y))}}

or, with some manual touch-ups,

new [] {

new [] {((Cos(t) * p.X) + (-Sin(t) * p.Y))},

new [] {((Sin(t) * p.X) + ( Cos(t) * p.Y))}

}

Clearly the formula we expected for both coordinates, in a LINQ expression tree format. Again, the compilation code is fairly similar, but notice how the return type of the delegate is degraded to a double[][] instead of a Point. We’ll do something about this in a minute, but first the output of the call to Print():

{ { 0.707106781186548 }, { 0.707106781186547 } }

What’s Print anyway? Just a plain extension method on jagged arrays:

static class Extensions { public static void Print<T>(this T[][] array) { StringBuilder sb = new StringBuilder(); sb.Append("{ "); int m = array.Length; for (int i = 0; i < m; i++) { int n = array[i].Length; sb.Append("{ "); for (int j = 0; j < n; j++) { sb.Append(array[i][j].ToString()); if (j != n - 1) sb.Append(", "); } sb.Append(" }"); if (i != m - 1) sb.Append(", "); } sb.Append(" }"); Console.WriteLine(sb); } }

So, what about that broken return type in our delegate (also some kind of covariance, although a double[][] is not really to be treated as a Point in the general case)? Can we do better? With a little hack we can. Say we can trust the user to call Compile on a MatrixExpression, passing in a compatible delegate for the result, like this:

var rot = (rotate * point).Compile<Func<Point, double, Point>>(p, t);

(Notice this trust relationship is of a kind that we can still throw exceptions in the abuser’s face :-).) It’s clear we can take a look at the generic parameter passed in to Compile, make sure it’s a Func<…>, an extract the return type from it (the last generic parameter). Once we have that, we can emit a LINQ expression tree that wraps the result of a ToExpression call in a Convert expression tree node that makes the return value of the desired type (Expression.Convert is smart enough to find such conversions):

public R Compile<R>(params ParameterExpression[] parameters) { var t = typeof(R); if (!t.IsGenericType || !t.GetGenericTypeDefinition().Name.StartsWith("Func")) throw new Exception("Not a Func<...>."); var a = t.GetGenericArguments(); var r = a[a.Length - 1]; return Expression.Lambda<R>(Expression.Convert(this.ToBLOCKED EXPRESSION, r), parameters).Compile(); }

Isn’t that beautiful? We just need to have a conversion available from double[][] to Point, obviously one that checks – omitted below – the input dimensions to be conform with expectations for a point’s dimensions (1 x n, with n = 2 in our case, thus a vector):

public static implicit operator Point(double[][] point) { return new Point(point[0][0], point[1][0]); }

This will do the trick for now, but notice how this makes the intermediate array allocation totally redundant. Can we eliminate that inefficiency? Sure, we have all the materials in the room to do so. We just shouldn’t apply the "Point conversion” from the outside, but extract the two relevant cells from the resulting matrix and feed them in to the Point constructor straight away. Adding this intelligence to our little engine isn’t that hard, even in a somewhat generic way (i.e. “intelligent conversions” as expression tree rewriters), but let’s keep that for another time when we talk about optimizing techniques (tip: try this at home with sparse matrices, i.e. lots of zero-valued cells, and watch the output of simple matrix calculations…).

Just for the record, here are the results for rotating 10,000,000 (randomly chosen) points with a direct (but naive, i.e. equivalent to the generated expressions from the matrix multiplication, without extracting common subexpressions for sine and cosine calculation) implementation versus our symbolic one (excluding the initial compilation of the lambda):

Symbolic: 00:00:05.2810564

Classic: 00:00:03.0292995

Applying the optimization for Point-conversion I’m hinting at above puts them on par actually (we even did a little better, but that’s just a test noise artifact):

Symbolic: 00:00:03.1099769

Classic: 00:00:03.2106196

and if we’d go all the way in the optimizer, we could even eliminate common subexpressions (a bit tricky though, but doable), at the cost of additional compile and analysis time upfront (and getting into territory where LINQ expression trees do not longer help as we need to generate multiple statements now, welcome DLR!).

Did I tell you I’m a big believer of the power of mathematical notation? Yielding control to the runtime might sound frightening to some, but it has its merits as you can see…

Have fun!

Del.icio.us | Digg It | Technorati | Blinklist | Furl | reddit | DotNetKicksFiled under: C# 3.0, LINQ, Crazy Sundays

## # LINQ and The Matrix – Introducing MLinq - B# .NET Blog

Thank you for submitting this cool story - Trackback from DotNetShoutout

## # Link Listing - February 10, 2009

Link Listing - February 10, 2009

## # re: LINQ and The Matrix – Introducing MLinq

"ToBLOCKED EXPRESSION" ? :)

## # re: LINQ and The Matrix – Introducing MLinq

Something funny happened on the way to the forum:

"

That’s exactly what the ToExpression method on MatrixExpression is meant for:

public Expression ToBLOCKED EXPRESSION

"

## # Reflective Perspective - Chris Alcock » The Morning Brew #284

Pingback from Reflective Perspective - Chris Alcock » The Morning Brew #284

## # re: LINQ and The Matrix – Introducing MLinq

There was something dissatisfying about the solution. In particular, having to mirror so much of the Math operations (were this carried to a full implementation), as well as creating a parallel expression stack when, arguably, the stack already in System.Core should suffice.

So I followed along, writing my own implementation.

Oy! I see why you made this the way you did. No user-defined up/down casts, the fact that you can't dot into a symbolic parameter of some arbitrary type, etc. All these start forcing the solution down a path of workarounds.

At this point, I start considering the wisdom of using Lambdas to hold parameters, with the thought of performing a binding operation later on. But in that case, you can't do multi-dimensional array initialization (WHY NOT!?!).

So.. good article. Really shows some current shortcomings in and otherwise fantastic language.

A small wish list, to make this work the way I'd like:

- strongly-typed parameters,

- the ability to create bare expressions (non-Func), using said strongly-typed parameters

- the ability to dot into strongly-typed parameters (part of creating bare BLOCKED EXPRESSION

## # re: LINQ and The Matrix – Introducing MLinq

Posts like this are why I subscribe to your RSS feed.

Also, ToBLOCKED EXPRESSION?

## # re: LINQ and The Matrix – Introducing MLinq

Hi folks,

Thanks for the feedback on the BLOCKED EXPRESSION thing (you'll never know what I wrote here...); I'll look into that. It seems my blog engine has some hidden treasure I haven't encountered before :-). However, my blog engine and I seem to agree on one thing: expression trees are the C# developer's **** (oops, got blocked again :-)).

Thanks,

-Bart

## # re: LINQ and The Matrix – Introducing MLinq

Hi Keith,

You're reflecting exactly my state of mind when I was hacking this up over the holidays. The fact the expression tree API is a closed world is a blessing and a curse at the same time. Since I need different tree nodes from a conceptual point of view anyway (like matrices, n-ary operators, etc), I don't worry about that too much and use LINQ expression trees as the underlying glue.

However, it would be much nicer if that could be done within the same class hierarchy, which would cause a few problems for the API though:

First, things like Compile would need to allow to interoperate with custom visitors that know how to translate unknown tree nodes (also the ExpressionType enum becomes a problematic point lacking extensibility).

Second, some of my expressions are dynamically typed in their very nature (points have X and Y, and possibly more coordinates) and declaration time static typing conflicts with that a bit. The v4.0 expression trees will allow dynamic, so things will get better on that field.

Third, the expression tree APIs do not allow use for intermediate representations that are not necessarily sound in the type dimension, or the other way around, the type system is often not expressive enough to cope with my custom nodes (operator overloads, conversions, etc are all bound to the static "interface" of a type). More flexibility in constructing expression trees would help to decouple construction from verification checks which would come in handy when dealing with custom nodes.

The main thing I have to work around though (see a future blog post) is compiling and combining IL fragments. The only compilable construct is the LambdaExpression, which often is too limiting as we want to mix and match LINQ expression compilations with custom expression node compilations (e.g. a SumMathExpression would become a call to Enumerable.Range, or with full statement tree support, a foreach loop accumulating values). In this post, I omitted that complexity by just presenting a subset of my work that can be expressed purely in terms of expressions.

Thanks,

-Bart

## # re: LINQ and The Matrix – Introducing MLinq

Combining IL fragments sounds like asking for the DLR to me. You can do that.

The strongly typed parameters, though, are something that would require a blessed class. They'd have to allow something in generics which is not, AFAIK, possible, that is:

Symbol<T>: T

.. such that an instance of Symbol<T> can, at compile time, have members of T safely dotted off of it. Dynamic *could* do this, but at that point you're no longer strongly typed.

In a way, what's lacking is expression composability, and that's been noted elsewhere. If you had Expression<T>, and if the language allowed you to -- strictly in the context of creating new expressions -- lift members of T, then you could have the following code:

class Point { double X; double Y }

var a = Expression.Symbol<Point>();

var b = Expression.Symbol<Point>():

var inner = a.X * b.X + a.Y * b.Y; // Expression<double>

## # RealTime - Questions: "What are words that start with Matri-?"

Pingback from RealTime - Questions: "What are words that start with Matri-?"

## # Expresiones Lambda | Mi blog

Pingback from Expresiones Lambda | Mi blog