Monday, August 17, 2009 8:20 PM
bart
LINQ to Ducks – Bringing Back The Duck-Typed foreach Statement To LINQ
I promise, it will be a (relatively) short post this time. You all know the foreach statement in C#, don’t you? Think twice before you answer and tell me exactly how the following works:
foreach (int x in src)
{
    // Do something with x.
}
Got an answer? Let me disappoint you: if you have the answer, you’re wrong. There’s no single answer to the question above as you need to know more about the type of src to make a final decision on how the above works…
You may say that clearly that object needs to implement IEnumerable or IEnumerable<T>, and maybe you’ll even mention that in the former case the compiler inserts a cast for you when it gets “x” back from the call to the IEnumerator’s Current property getter. In other words, the code gets translated like this:
var e = src.GetEnumerator();
while (e.MoveNext())
{
    var x = (int)e.Current; // without the cast if src was an IEnumerable<T>
    // Do something with x.
}
A worthy attempt at the translation but not quite right. First of all, the variable x is declared in an outer scope (causing some grief when talking about closures, but that’s a whole different topic…). Secondly, the enumerator may implement IDisposable, in which case the foreach-statement will ensure proper disposal a la “using”:
{
    int x;
    using (var e = src.GetEnumerator())
    {
        while (e.MoveNext())
        {
            x = (int)e.Current; // without the cast if src was an IEnumerable<T>
            // Do something with x.
        }
    }
}
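To see why the outer-scope declaration of x matters for closures, here is a minimal sketch that writes out the expansion above by hand and captures x in a lambda on every iteration. ClosureDemo and CaptureOuterScoped are made-up names for illustration, and the source is assumed to be an IEnumerable<int> so no cast is needed. Note that C# 5.0 and later declare the loop variable of a real foreach inside the loop, so only the hand-written expansion is guaranteed to behave this way.

```csharp
using System;
using System.Collections.Generic;

static class ClosureDemo
{
    // Hypothetical helper: expands a foreach by hand, following the
    // translation above, and captures the loop variable in a lambda.
    public static List<Func<int>> CaptureOuterScoped(IEnumerable<int> src)
    {
        var captured = new List<Func<int>>();
        {
            int x; // a single variable shared by all iterations
            using (var e = src.GetEnumerator())
            {
                while (e.MoveNext())
                {
                    x = e.Current;
                    captured.Add(() => x); // every lambda closes over the same x
                }
            }
        }
        return captured;
    }

    static void Main()
    {
        foreach (var f in CaptureOuterScoped(new[] { 1, 2, 3 }))
            Console.WriteLine(f()); // prints 3 three times
    }
}
```

All three delegates observe the last value assigned to x, because there is only one x to capture.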
That’s a bit more sane, but we’re missing out on another kind of source foreach can work with: any object, as long as it exposes the enumeration pattern of GetEnumerator in tandem with MoveNext and Current. Here’s a sample object that just works fine with the foreach-statement:
class Source
{
    public SourceEnumerator GetEnumerator()
    {
        return new SourceEnumerator();
    }
}
class SourceEnumerator
{
    private Random rand = new Random();
    public bool MoveNext()
    {
        return rand.Next(100) != 0;
    }
    public int Current
    {
        get
        {
            return rand.Next(100);
        }
    }
}
With its usage shown below:
foreach (int x in new Source())
    Console.WriteLine(x);
Okay, that’s flexible, isn’t it? In fact, the foreach-statement can be said to be duck typed: it’s not the nominal type that matters (i.e. whether Source is explicitly declared to be an IEnumerable, and SourceEnumerator an IEnumerator) but just the structure of the object that determines “compatibility” with the foreach-statement.
But whoever says foreach over a collection immediately starts thinking about LINQ, no? Say the consumer of Source looked like this:
List<int> res = new List<int>();
foreach (int x in new Source())
    if (x % 2 == 0)
        res.Add(x);
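A great candidate for LINQ it seems, especially as we start adding more and more logic to the “query”. Nothing surprising about this conclusion, but trying to realize it by querying new Source() directly fails miserably: the compiler rejects the query because it finds no Where method that applies to Source.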
Why? Because LINQ is statically typed (update: to be taken with a grain of salt, see comments below this post; agreed, it'd be more precise to write LINQ to Objects as the subject of this sentence), so it expects what I’ve referred to as a nominal enumerator implementation: something that has been explicitly declared to be an IEnumerable<T>, not something that “accidentally” happens to look like one. Question of the day: how to morph an existing structural enumerator into a nominal one so it can be used with LINQ? Sure, we could write specialized code for the Source object above that essentially creates an iterator on top of Source:
static void Main()
{
    var res = from x in IterateOver(new Source())
              where x % 2 == 0
              select x;
    foreach (var x in res)
        Console.WriteLine(x);
}
static IEnumerable<int> IterateOver(Source s)
{
    foreach (int i in s)
        yield return i;
}
But maybe you’re in a scenario with plenty of those structural enumerator constructs around (e.g. some Office automation libraries expose GetEnumerator on types like Range, while the Range object itself doesn’t implement IEnumerable hence it’s not usable with LINQ), so you want to generalize the above. Essentially, given any object you’d like to provide a duck-typed iterator over it, a suitable task for another extension method and C# 4.0 dynamic:
static class DuckEnumerable
{
    public static IEnumerable<T> AsDuckEnumerable<T>(this object source)
    {
        dynamic src = source;
        var e = src.GetEnumerator();
        try
        {
            while (e.MoveNext())
                yield return e.Current;
        }
        finally
        {
            var d = e as IDisposable;
            if (d != null)
            {
                d.Dispose();
            }
        }
    }
}
Question to the reader: why can’t we simply write a foreach-loop over the source object cast to dynamic? Tip: how would you implement the translation of foreach when encountering a dynamic object as its source?
Yes, by extending object you’re cluttering the apparent member list on System.Object, so use with caution or just use a plain old static method call to do the “translation”. What matters more is the inside of the operator, which uses the dynamic type quite a bit to realize the enumeration pattern. Notice how easy on the eye dynamically typed code looks in C# 4.0. With many more explicit casts, it’d look like this:
static class DuckEnumerable
{
    public static IEnumerable<T> AsDuckEnumerable<T>(this object source)
    {
        dynamic src = (dynamic)source;
        dynamic e = src.GetEnumerator();
        try
        {
            while ((bool)e.MoveNext())
                yield return (T)e.Current;
        }
        finally
        {
            var d = e as IDisposable;
            if (d != null)
            {
                d.Dispose();
            }
        }
    }
}
And now we can write:
var res = from x in new Source().AsDuckEnumerable<int>()
          where x % 2 == 0
          select x;
foreach (var x in res)
    Console.WriteLine(x);
Dynamic glue – why not? In fact, even objects from other languages (like Ruby or Python) that follow the pattern will now work with LINQ, and for existing compatible objects the operator call is harmless (but wasteful). Oh, and notice you can also have an IEnumerable of “dynamic” objects if you’re dealing with objects originating from dynamic languages...
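As for the “harmless (but wasteful)” part, one way to avoid the waste is to check for the nominal interfaces first and only fall back to dynamic dispatch for true ducks. Below is a sketch of that idea, which is not part of the operator shown earlier: AsDuckEnumerableOrSelf, DuckIterate, Countdown and CountdownEnumerator are hypothetical names introduced here purely for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

static class DuckEnumerableSketch
{
    // Hypothetical variant: returns nominal sequences as-is (or via Cast<T>)
    // and uses dynamic only when no enumerable interface is implemented.
    public static IEnumerable<T> AsDuckEnumerableOrSelf<T>(this object source)
    {
        var typed = source as IEnumerable<T>;
        if (typed != null)
            return typed;

        var untyped = source as IEnumerable;
        if (untyped != null)
            return untyped.Cast<T>();

        return DuckIterate<T>(source);
    }

    private static IEnumerable<T> DuckIterate<T>(object source)
    {
        dynamic src = source;
        dynamic e = src.GetEnumerator();
        try
        {
            while ((bool)e.MoveNext())
                yield return (T)e.Current;
        }
        finally
        {
            var d = e as IDisposable;
            if (d != null)
                d.Dispose();
        }
    }
}

// Hypothetical duck-typed source: implements no enumerable interface.
class Countdown
{
    private readonly int start;
    public Countdown(int start) { this.start = start; }
    public CountdownEnumerator GetEnumerator() { return new CountdownEnumerator(start); }
}

class CountdownEnumerator
{
    private int next;
    private int current;
    public CountdownEnumerator(int start) { next = start; }
    public bool MoveNext()
    {
        if (next <= 0) return false;
        current = next--;
        return true;
    }
    public int Current { get { return current; } }
}

static class DuckDemo
{
    static void Main()
    {
        var list = new List<int> { 1, 2, 3 };
        // A nominal source comes back untouched, with no dynamic dispatch involved.
        Console.WriteLine(object.ReferenceEquals(list, list.AsDuckEnumerableOrSelf<int>())); // True

        // A structural source still goes through the dynamic path.
        foreach (var x in new Countdown(4).AsDuckEnumerableOrSelf<int>().Where(x => x % 2 == 0))
            Console.WriteLine(x); // prints 4, then 2
    }
}
```

Whether the extra type checks are worth it depends on how often nominal sources get passed in; for pure ducks the dynamic path is taken regardless.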
Can you implement the AsDuckEnumerable operator in C# 3.0? Absolutely, if you limit yourself to reflection-based discovery methods (left as an exercise for the reader).
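For the impatient, here’s one possible take on that exercise, a rough sketch rather than a hardened implementation. It assumes public, parameterless GetEnumerator and MoveNext methods and a public Current property, and it does no caching of the reflection lookups. AsDuckEnumerableViaReflection, Iterate, Countdown and CountdownEnumerator are names made up for this example; the code sticks to language features available in C# 3.0.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

static class ReflectionDuckEnumerable
{
    public static IEnumerable<T> AsDuckEnumerableViaReflection<T>(this object source)
    {
        if (source == null)
            throw new ArgumentNullException("source");

        // Look for the structural pattern: a public, parameterless GetEnumerator.
        MethodInfo getEnumerator = source.GetType().GetMethod("GetEnumerator", Type.EmptyTypes);
        if (getEnumerator == null)
            throw new InvalidOperationException("No public parameterless GetEnumerator method found.");

        return Iterate<T>(source, getEnumerator);
    }

    private static IEnumerable<T> Iterate<T>(object source, MethodInfo getEnumerator)
    {
        object e = getEnumerator.Invoke(source, null);
        try
        {
            if (e == null)
                throw new InvalidOperationException("GetEnumerator returned null.");

            MethodInfo moveNext = e.GetType().GetMethod("MoveNext", Type.EmptyTypes);
            PropertyInfo current = e.GetType().GetProperty("Current");
            if (moveNext == null || current == null)
                throw new InvalidOperationException("Enumerator lacks a public MoveNext method or Current property.");

            // Late-bound equivalent of: while (e.MoveNext()) yield return (T)e.Current;
            while ((bool)moveNext.Invoke(e, null))
                yield return (T)current.GetValue(e, null);
        }
        finally
        {
            var d = e as IDisposable;
            if (d != null)
                d.Dispose();
        }
    }
}

// Hypothetical duck-typed source: implements no enumerable interface.
class Countdown
{
    private readonly int start;
    public Countdown(int start) { this.start = start; }
    public CountdownEnumerator GetEnumerator() { return new CountdownEnumerator(start); }
}

class CountdownEnumerator
{
    private int next;
    private int current;
    public CountdownEnumerator(int start) { next = start; }
    public bool MoveNext()
    {
        if (next <= 0) return false;
        current = next--;
        return true;
    }
    public int Current { get { return current; } }
}

static class ReflectionDemo
{
    static void Main()
    {
        var res = from x in new Countdown(4).AsDuckEnumerableViaReflection<int>()
                  where x % 2 == 0
                  select x;
        foreach (var x in res)
            Console.WriteLine(x); // prints 4, then 2
    }
}
```

Compared to the dynamic version, the pattern lookup is spelled out by hand, and exceptions thrown inside MoveNext or Current surface wrapped in a TargetInvocationException.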
Enjoy!
Filed under: LINQ, Dynamic languages, C# 4.0