Tuesday, April 08, 2008 2:00 AM
bart
Pattern Matching in C# - Part 2
In our journey to find the perfect match, we've arrived at an interpreter-based pattern matcher in the previous post. Although there are quite some limitations and the syntax isn't as sweet as we'd like it to be, it's fully functional. However, what about the performance of our pattern matcher? Consider the following benchmark:
Random rand = new Random();
var persons = new List<Person>();
for (int i = 0; i < 100000; i++)
persons.Add(new Person("Bart", rand.Next(1, 10)));
var watch = new Stopwatch();
var baseline = persons.Select(p =>
{
Person person = p as Person;
if (person != null && person.Age == 5)
return true;
else
return false;
});
watch.Start();
foreach (var p in baseline)
;
watch.Stop();
long baseMs = watch.ElapsedMilliseconds;
Console.WriteLine(watch.Elapsed);
var res = persons.Select(p => new Pattern<bool>(p)
.Match((string name) => new Person(name, 5), name => { return true; })
.Else(() => { return false; })
);
watch.Reset();
watch.Start();
foreach (var p in res)
;
watch.Stop();
Console.WriteLine(watch.Elapsed + " " + watch.ElapsedMilliseconds / baseMs);
Debug.Assert(baseline.SequenceEqual(res));
This comparison is "fair enough" - given a series of 100,000 randomized Person objects, we want to find matches with age 5. Although our pattern match will do slightly more work (by design) to extract the Name property to bind it to the name variable (which is unused), the essentials are pretty much the same: in both cases we use iterators, we measure the time to iterate over the sequence and the pattern match has identical semantics in both cases. Of course we check the validity of the result using an Assert operation. Here's the result of this benchmark:
Some profiling
Correctness has proven to be okay, but speed has been proven to be a little problematic, but where's the bottleneck? Although it's pretty apparent from the code itself (explanation to follow), let's go with CLR Profiler - the art of engineering is measurement after all.
| Manual pattern match | Our pattern match |
Summary for C:\temp\Switch++\Pattern\bin\Release\Pattern.exe Allocated bytes: 442,449 Relocated bytes: 0 Final Heap bytes: 442,449 Objects finalized: 0 Critical objects finalized: 0 Gen 0 collections: 0 Gen 1 collections: 0 Gen 2 collections: 0 Induced collections: 0 Gen 0 Heap bytes: Unknown Gen 1 Heap bytes: Unknown Gen 2 Heap bytes: Unknown Large Object Heap bytes: Unknown Handles created: 68 Handles destroyed: 0 Handles surviving: 68 Heap Dumps: 0 Comments: 0 | Summary for C:\temp\Switch++\Pattern\bin\Release\Pattern.exe Allocated bytes: 1,286,172 Relocated bytes: 55,642 Final Heap bytes: 296,884 Objects finalized: 1 Critical objects finalized: 1 Gen 0 collections: 1 Gen 1 collections: 0 Gen 2 collections: 0 Induced collections: 0 Gen 0 Heap bytes: 1,048,612 Gen 1 Heap bytes: 57,660 Gen 2 Heap bytes: 12 Large Object Heap bytes: 13,440 Handles created: 44 Handles destroyed: 0 Handles surviving: 44 Heap Dumps: 0 Comments: 0 |
Definitely the allocated bytes go much higher (notice I've reduced the number of iterations to only 1000). The allocation graph shows clearly where the huge number of allocations comes from:
<Benchmark>b__2 is an anonymous method generated by the lambda expression in here:
var res = persons.Select(p => new Pattern<bool>(p)
.Match((string name) => new Person(name, 5), name => { return true; })
.Else(() => { return false; })
);
which results in Pattern<T> objects but also an bunch of LambdaExpression objects because the call to Match takes in an Expression<...> so the compiler generates the expression tree objects by calling things like Expression.New and Expression.Lambda. If you were to follow the MethodBase::GetMethodFromHandle path, this is the code responsible to get the ConstructorInfo object for the new Person(...) call inside our lambda, which becomes a parameter to the NewExpression.
A breakdown of the allocations shows us clearly how much we're allocating for expression trees and reflection info. The age of objects tells us something too but shouldn't be taken too serious since our run didn't last long enough to really tell what's going on there (exercise: play a bit with the profiler for longer runs):
Telling even more is the call-graph:
And here's a closer look at the call graph for an individual iteration:
Enough stats for now - what are we going to do about it? Those match objects don't really change much in a series of pattern matches inside a loop body. Yes, we're dragging around some metadata that tells the pattern match is already has taken on a value (_hasValue) to suppress subsequent match clause evaluations, but other than that a pattern match object can be reused. There are certain edge cases of course where reuse might get trickier, such a multi-threading, but that too we can fix by eliminating global state.
Lazy evaluation
Our first attempt uses eager evaluation and mimics a classic switch statement expression in a syntactical fashion:
int res = new Pattern<int>(o) .Match(clause, (...) => { // body }) .Match(clause, (...) => { // body }) .Else(() => { // body }) ; | int res = switch(o) { case clause: { // body } case clause: { // body } default: { // body } }; |
This would be fine if the pattern matcher would be code instead of an object. This form of a pattern match could be called closed since it's already bound to an object (passed in through the constructor) and it's eager because every match clause evaluates straight away. What we really want is to reuse a pattern match object to match many different objects, something we can denote as an open pattern match since it's not bound to an object at construction time. It will also be lazy since it won't evaluate till we force it to do so by passing it an object. This is much like LINQ queries that simply capture the intended operation without performing it straight away, rather it waits for a stimulus (which is iteration in that case) to go ahead and execute whatever is needed to obtain the requested results.
Let's start by building the open form of the Pattern<T> object by adding a constructor:
class Pattern<T>
{
private bool _hasValue;
public Pattern() { }
public Pattern(object o)
{
if (o == null)
throw new ArgumentNullException("o");
Object = o;
}
public object Object { get; private set; }
public T Result { get; private set; }
...
}
No rocket science so far. We've just added a default constructor which implicitly captures some state. The fact the Object property will be null in that case acts as the distinction mechanism between lazy and eager and for the same reason we disallow a closed pattern match to be bound to a null value, in an attempt to keep things simple. You could certainly go ahead and add another boolean to distinguish between closed and open (exercise).
The idea for an open pattern is to capture all the information required to operate on an object at a later point in time, so we'll change our Match and Else methods. First of all, Else can no longer return the result since the whole intent is to defer execution. Since there's no overloading on a method's return type, we'll simply change it as follows:
public Pattern<T> Else(Func<T> f)
{
if (this.Object == null)
{
if (_else != null)
throw new InvalidOperationException("Else clause has been set already.");
_else = f;
}
else
{
if (!_hasValue)
{
_hasValue = true;
Result = f();
}
}
return this;
}
introducing another member variable _else:
private Func<T> _else;
The main difference is obvious - if we're in the case of a closed pattern match, the old code does all the work and assigns the result to the Result property if no other match clause has caused a final value to be set. In the case of an open pattern match, we simply cache the body delegate of the Else clause and make sure there's only one (something we don't protect against in the eager case - exercise). This change means the caller of a close pattern must add a call to the Result property getter to obtain the result:
var res = persons.Select(p => new Pattern<bool>(p)
.Match((string name) => new Person(name, 5), name => { return true; })
.Else(() => { return false; }).Result
);
This doesn't trigger the eager evaluation though - it simply returns the result that has already been calculated. One could even try (exercise) to reduce the need for calling Result by introducing an implicit conversion operator, converting a Pattern<T> into a T but there's a caveat (what would that be?).
How are we going to change the Match calls? Here's the revised MatchInternal:
private Pattern<T> MatchInternal(LambdaExpression e, Delegate f)
{
if (this.Object == null)
{
if (_else != null)
throw new InvalidOperationException("Else clause has been set already.");
_matches.Add(new MatchEntry() { Match = e, Action = f });
return this;
}
...
}
Again, we simply distinguish between closed and open, and in the latter case we make sure there's still room for a match clause and if so, we add a new MatchEntry to a list of match entries. A what? Here's the definition of a match entry:
internal class MatchEntry
{
public LambdaExpression Match { get; set; }
public Delegate Action { get; set; }
}
Those entries keep the pair of the match clause (Match) and match body (Action) and are stored in a list:
private List<MatchEntry> _matches = new List<MatchEntry>();
Finally, we need to introduce an evaluation mechanism and what's better than an Execute method, taking in an object and returning a T?
public T Execute(object o)
{
if (Object != null)
throw new InvalidOperationException("Execution is only valid on unbound pattern match objects.");
if (_else == null)
throw new InvalidOperationException("Can't evaluate the match. Incomplete match object. Are you missing an Else clause?");
foreach (MatchEntry entry in _matches)
{
Dictionary<ParameterExpression, PropertyInfo> bindings;
if (TryMatch(o, entry.Match, out bindings))
{
return Evaluate(o, entry.Match, entry.Action, bindings);
}
}
return _else();
}
We don't use the Result property since the Pattern<T> has an open character: it shouldn't know about any object whatsoever, neither about any result. The logic is simple: only unbound (open) pattern match objects can be used with the Execute method and _else should be present (an alternative is to either return default(T) or to throw an exception, which is not really functional). Finally we cycle through the list of match entries and try to find the right match. In case we find one, we evaluate and return the result. Otherwise, control reaches the Else case.
Re-evaluating the performance
We expect this to be better already, but let's measure through our extended benchmark:
var pattern = new Pattern<bool>()
.Match((string name) => new Person(name, 5), name => { return true; })
.Else(() => { return false; });
watch.Reset();
watch.Start();
var res2 = persons.Select(p => pattern.Execute(p));
foreach (var p in res2)
;
watch.Stop();
Console.WriteLine(watch.Elapsed + " " + watch.ElapsedMilliseconds / baseMs);
Debug.Assert(baseline.SequenceEqual(res2));
Here's the result for 100,000 iterations:
Definitely better with just a simple change. However, there's definitely more room for improvement and by making our move towards an open pattern match, we're enabling lots of new scenarios. We'll cover those next time, but let's finish off this episode by looking into the profiling data for this run. As expected, allocation numbers have gone down:
Summary for C:\temp\Switch++\Pattern\bin\Release\Pattern.exe
Allocated bytes: 756,860
Relocated bytes: 0
Final Heap bytes: 756,860
Objects finalized: 0
Critical objects finalized: 0
Gen 0 collections: 0
Gen 1 collections: 0
Gen 2 collections: 0
Induced collections: 0
Gen 0 Heap bytes: Unknown
Gen 1 Heap bytes: Unknown
Gen 2 Heap bytes: Unknown
Large Object Heap bytes: Unknown
Handles created: 44
Handles destroyed: 0
Handles surviving: 44
Heap Dumps: 0
Comments: 0
There's obviously still an allocation cost to build up the open pattern match but things have improved. The number of calls triggered by the iteration's select clause have gone down too:
but this figure allows us to pick some other targets: GetCustomAttributes to extract the constructor's mapping metadata is a huge cost. Again expanding a single iteration, we now see:
Notice the number of calls on the top line: 485. We come from 637 in the previous round with the closed pattern, due to the cost of allocating the new pattern object (with all its expression tree magic etc). Eliminating the loop body's cost is our next goal.
Stay tuned!
Del.icio.us |
Digg It |
Technorati |
Blinklist |
Furl |
reddit |
DotNetKicks
Filed under: C# 3.0, Functional programming, Crazy Sundays