Tuesday, February 26, 2008 10:58 PM bart

The missing operator - ForEach

A small post this time. While playing around with LINQ queries lately I noticed one minor missing piece. It's merely a convenience thing, but I thought I'd share it with the world anyhow: a ForEach operator. Such a "sequence operator" (to use an old-fashioned word; remember LINQ to Objects used to be called the Standard Query Operators, which explains the abbreviation used in my LINQSQO project at http://www.codeplex.com/LINQSQO) would allow us to write a query and iterate over it directly; look at it as a postfix variant of the foreach keyword if you want.

Here's what it looks like:

using System;
using System.Collections.Generic;

static class MoreEnumerable
{
   public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
   {
      if (source == null)
         throw new ArgumentNullException("source");

      if (action == null)
         throw new ArgumentNullException("action");

      foreach (T item in source)
         action(item); // run the caller's action on every element, in order
   }
}

I won't elaborate on possible combinations with the Parallel FX extensions library and will leave that to the reader. Anyway, here's how your brand new home-brew operator would be used:

(from p in products where p.UnitPrice > 123 select new { Name = p.ProductName, Price = p.UnitPrice }).ForEach(p => {
   // ... work with p.Name and p.Price here ...
});

which is similar to List<T>'s ForEach method. Notice you'll have full IntelliSense inside the lambda body - the type of p is inferred through the generic parameter T, which is the anonymous (projection) type in the sample above.

Have fun!

Update: Apparently people read my posts as late as I'm posting them :-) which is of course well appreciated. I've posted a few personal insights on the pros and cons of this pattern in this post's comments section. Actually the original goal of the post was just to show some "more extensions" one could envision (have a set of other "functional style operators" coming up) but I like the idea of turning it into discussion mode :-). All feedback is welcome!



# re: The missing operator - ForEach

Wednesday, February 27, 2008 12:18 AM by Keith J. Farmer

Whether or not to include such an operator has been the subject of debate.  I seem to recall some rather good arguments (from Eric Lippert?) as to why it's not necessarily a good thing to have, after someone asked for it earlier.

Admittedly, I have an entire section of one project devoted to variations of this theme.  Maybe after we get our new server up and running (yay IIS7!) I'll post what I've been cobbling together.

# re: The missing operator - ForEach

Wednesday, February 27, 2008 12:22 AM by Frank Quednau

Alternatively, do a .ToList().ForEach(...) on your LINQ. Only a couple characters more and you save yourself the extension.

# re: The missing operator - ForEach

Wednesday, February 27, 2008 12:53 AM by Mikael Söderström

Hi Bart,

PLinq has something similar to that.

public static void ForAll<T>(this IParallelEnumerable<T> source, Action<T> action);

/Mikael Söderström
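
For readers on a released framework: in .NET Framework 4 this operator shipped as ParallelEnumerable.ForAll over ParallelQuery<T> rather than IParallelEnumerable<T>. A minimal sketch of how it is typically called follows; the class and method names ForAllDemo and SumInParallel are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Threading;

static class ForAllDemo
{
    // Sums 1..100 with PLINQ's ForAll; the action may run on several threads at once.
    public static int SumInParallel()
    {
        int sum = 0;
        Enumerable.Range(1, 100)
                  .AsParallel()
                  .ForAll(n => Interlocked.Add(ref sum, n)); // thread-safe accumulation
        return sum;
    }
}
```

Because the action can run concurrently and in no particular order, anything it touches has to be thread-safe, which is why the sketch uses Interlocked.Add instead of a plain +=.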

# re: The missing operator - ForEach

Wednesday, February 27, 2008 1:00 AM by TraumaPony

I thought there was a method that did this (ForAll)?

# re: The missing operator - ForEach

Wednesday, February 27, 2008 1:10 AM by bart

Hi folks,

Thanks for the responses! There are definitely different ways to think about it. I wouldn't rate it a "must have" (especially since creating such extensions is plain easy nowadays for people who really want it - whether or not excessive use of extension methods is a good thing is obviously the subject of another discussion) but rather a "nice to have".

This being said, I should immediately add that - as always - abstraction might become a trap if one forgets what's really going on behind the scenes (and if the abstraction leaves room for abuse because of lack of knowledge of "internals", one might well wonder whether it's a good abstraction in the first place).

To name one thing: piggybacking on a known pattern can be dangerous. The foreach loop has always been breakable since it's (duh) a loop construct. A ForEach in this style effectively blocks that capability, which inherently conflicts with the use of IEnumerable<T> in iterators and in any sort of lazy pattern, including the different flavors of LINQ (implementations based on a data stream or server-side paging had better be breakable in a proper way). In other words, to some extent it's a lazy evaluation blocker. One could work around it by e.g. making the loop body a Func<T, bool> that returns true to continue and false to break from the loop, but that makes the foreach pattern evaporate.
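
The Func<T, bool> workaround could look roughly like the sketch below. The names ForEachWhile, BreakableEnumerable and BreakableDemo are hypothetical; none of them come from LINQ or from the post itself.

```csharp
using System;
using System.Collections.Generic;

static class BreakableEnumerable
{
    // Hypothetical operator: the body returns true to continue and false to stop,
    // which stands in for the "break" keyword of a regular foreach loop.
    public static void ForEachWhile<T>(this IEnumerable<T> source, Func<T, bool> body)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        if (body == null)
            throw new ArgumentNullException("body");

        foreach (T item in source)
        {
            if (!body(item))
                break; // stop pulling from the (possibly lazy) source
        }
    }
}

static class BreakableDemo
{
    // An endless lazy sequence: 1, 2, 3, ...
    public static IEnumerable<int> Naturals()
    {
        for (int i = 1; ; i++)
            yield return i;
    }

    // Collects items until the body asks to stop.
    public static List<int> TakeUntilThree()
    {
        var seen = new List<int>();
        Naturals().ForEachWhile(n =>
        {
            seen.Add(n);
            return n < 3; // false at n == 3, so the loop ends there
        });
        return seen;
    }
}
```

The endless Naturals sequence is there on purpose: a plain ForEach would never return on it, while the breakable variant stops after three items. The price is a lambda that has to return a bool on every path, which is exactly the loss of the foreach feel mentioned above.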

Similarly, I wouldn't recommend the use of ToList() all too often unless you really need all the results in a materialized form.
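
To make the difference concrete, here is a small sketch that logs when items are produced and when they are consumed. MoreEnumerableSketch repeats the operator from the post so the example stands alone; MaterializationDemo and its members are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class MoreEnumerableSketch
{
    // Same shape as the ForEach operator from the post.
    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (action == null) throw new ArgumentNullException("action");
        foreach (T item in source)
            action(item);
    }
}

static class MaterializationDemo
{
    // A lazy source that records each item it produces.
    static IEnumerable<int> Source(List<string> log)
    {
        for (int i = 1; i <= 2; i++)
        {
            log.Add("produce " + i);
            yield return i;
        }
    }

    // Extension method on IEnumerable<T>: items are consumed one by one as they are produced.
    public static List<string> Streaming()
    {
        var log = new List<string>();
        Source(log).ForEach(i => log.Add("consume " + i));
        return log;
    }

    // ToList() drains the whole source first; List<T>.ForEach only runs afterwards.
    public static List<string> Materialized()
    {
        var log = new List<string>();
        Source(log).ToList().ForEach(i => log.Add("consume " + i));
        return log;
    }
}
```

Streaming() logs produce 1, consume 1, produce 2, consume 2, whereas Materialized() logs both produce entries before the first consume. With two items that hardly matters; with a large or remote result set it means everything sits in memory before the first action runs.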

The goal of this post (and I have some others on other "extension-style functional additions" in the queue for posting sooner or later) is to show some possible patterns - I might want to turn them into quizzes to stimulate these kinds of discussions :-).



# re: The missing operator - ForEach

Wednesday, February 27, 2008 1:35 AM by Matt Ellis

My take on this is that LINQ is all about Query, and adding a ForEach operator mixes the query with the processing of the results of the query. I would personally prefer to maintain separation, and see code that described the query, and then explicitly looped over it.

var results = from p in products where p.UnitPrice > 123 select new { Name = p.ProductName, Price = p.UnitPrice };
foreach (var result in results)
   Console.WriteLine(result);

This just looks clearer to me.

Cheers
Matt

# re: The missing operator - ForEach

Wednesday, February 27, 2008 11:00 AM by bart

Hi Mikael,

My reference to the "Parallel FX extensions" was a hint in that direction - I have some posts on the Parallel class coming up that will elaborate on the subject.



# re: The missing operator - ForEach

Wednesday, February 27, 2008 11:14 AM by bart

Hi Matt,

Sure - LINQ is all about query, hence its name. Strictly speaking this is about IEnumerable<T> and it just so happens to be that LINQ for very well-defined reasons layers on top of that. Combining iteration and querying (or any kind of IEnumerable<T> grabbing) consolidates the pattern of "query followed by foreach" which often doesn't have any instructions in between. Even if there are, the point between defining the query and iterating over it is a dead zone (something to elaborate on in the context of parallel FX, more specifically futures) since the query doesn't magically start to (pre)fetch results before you actually start iterating over it.
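
The "dead zone" is easy to observe with a predicate that counts its own invocations. This is only a sketch; DeferredDemo and CountPredicateCalls are names invented for the example.

```csharp
using System;
using System.Linq;

static class DeferredDemo
{
    // Returns { predicate calls before iterating, predicate calls after iterating, matches }.
    public static int[] CountPredicateCalls()
    {
        int calls = 0;
        var numbers = new[] { 1, 2, 3, 4 };

        // Defining the query runs nothing yet; the lambda is only stored.
        var query = numbers.Where(n => { calls++; return n % 2 == 0; });
        int beforeIteration = calls;

        int matches = 0;
        foreach (var n in query) // the predicate starts running here
            matches++;

        return new[] { beforeIteration, calls, matches };
    }
}
```

The first number comes back as 0: nothing is evaluated between the definition of the query and the foreach, no matter how much code sits in between.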

Essentially that's a somewhat bigger "problem" (in fact it's just all about awareness) because of lazy constructs / declarative sugar (but oh so sweet!) in an imperative program. So, this post captures a typical pattern in one expression: query and iterate over it (with limitations outlined in a previous comment of mine) straight away.

From a broader point of view one could argue querying and iteration could be captured in a separate language feature (foreach x in (from ... select) doesn't really qualify for the title of eye candy), more along the lines of query - keyword - loop body (which would support continue and break keywords, eliminating the issues addressed in my previous comment).



# re: The missing operator - ForEach

Wednesday, February 27, 2008 11:17 AM by bart

Hi TraumaPony,

There are similar constructs indeed: ForEach on List<T> and ForAll in the Parallel FX library, as pointed out by Mikael.



# re: The missing operator - ForEach

Wednesday, February 27, 2008 2:31 PM by Keith J. Farmer

Actually, I think the bigger issue isn't about breaking a loop, it's about handling exceptions. That's a problem in general, but (IMHO) more likely to happen in processing loops since they play a little looser with the idea of what people think they *should* put into the expression (i.e., it's more likely by culture, not by capability).

There are of course various foreach patterns:

ForEach (consuming)
ForEachThen (processes, then emits the original sequence)
ForEachAlso (processes, and emits each sequence item as it's processed)

... this comes up, I think, because of the problem of side effects. Consider the order of Console.WriteLine executions and how it differs between the xThen and xAlso variants. One is sequential, the other is nested. Unravelling these in an expression isn't (in my experience) easy, and the usefulness of the extension starts to degrade thereby. And that's where I think part of the "we can, but should we?" position comes into being.
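
One possible reading of the xThen and xAlso variants is sketched below. The implementations are guesses based only on the one-line descriptions above (the original code was never posted), and buffering the source in ForEachThen is an assumption made to avoid enumerating it twice. Null checks are left out for brevity.

```csharp
using System;
using System.Collections.Generic;

static class ForEachVariants
{
    // Processes every item first, then emits the original sequence.
    public static IEnumerable<T> ForEachThen<T>(this IEnumerable<T> source, Action<T> action)
    {
        var buffer = new List<T>(source); // assumption: enumerate the source only once
        foreach (T item in buffer)
            action(item);
        foreach (T item in buffer)
            yield return item;
    }

    // Processes an item and emits it right away, before touching the next one.
    public static IEnumerable<T> ForEachAlso<T>(this IEnumerable<T> source, Action<T> action)
    {
        foreach (T item in source)
        {
            action(item);
            yield return item;
        }
    }
}

static class ForEachVariantsDemo
{
    // Chains two stages, "a" and "b", and records the order in which they run.
    public static string ThenOrder()
    {
        var log = new List<string>();
        var pipeline = new[] { 1, 2 }
            .ForEachThen(i => log.Add("a" + i))
            .ForEachThen(i => log.Add("b" + i));
        foreach (var item in pipeline) { } // drain the pipeline to trigger the side effects
        return string.Join(" ", log);
    }

    public static string AlsoOrder()
    {
        var log = new List<string>();
        var pipeline = new[] { 1, 2 }
            .ForEachAlso(i => log.Add("a" + i))
            .ForEachAlso(i => log.Add("b" + i));
        foreach (var item in pipeline) { } // drain the pipeline to trigger the side effects
        return string.Join(" ", log);
    }
}
```

With these implementations ThenOrder() yields "a1 a2 b1 b2" (stage by stage) while AlsoOrder() yields "a1 b1 a2 b2" (item by item). Neither does anything until the result is iterated, which adds one more thing to keep in mind when reading such an expression.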

# re: The missing operator - ForEach

Wednesday, February 27, 2008 5:28 PM by bart

Hi Keith,

Thanks for your thorough analysis - I agree on the different shapes different patterns can yield, and each has its subtleties. When an apparently sequential thing starts to take on different behavior, things become tricky. The ForEach I proposed is a typical consumer but a "continuation" (to abuse the noble word :-)) style of extension belongs to the alternatives as you point out.

Yet another alternative is to play dirty with AsEnumerable (when required - since we don't have "statement trees" at the time of writing), followed by an impure Select (an F#-ish "unit" type would come to the rescue to avoid the need for a return type - Action and Func are only siblings) and triggering the iteration using (fill in your greedy operator of choice). Weird, weirder, weirdest...
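
For the curious, the "dirty" variant could be sketched as follows. The Unit struct is a hypothetical stand-in for F#'s unit, ImpureSelectDemo is a made-up name, and ToList is picked as the greedy operator simply because it has to produce every projected element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical placeholder for F#'s unit type: a value that carries no information.
struct Unit { }

static class ImpureSelectDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var prices = new[] { 10, 200, 150 };

        var query = from p in prices where p > 123 select p;

        query.AsEnumerable()                    // no-op for in-memory data; switches to LINQ to Objects otherwise
             .Select(p =>
             {
                 log.Add("price " + p);         // the side effect is the whole point
                 return new Unit();             // Select needs a return value, so hand back a dummy
             })
             .ToList();                         // greedy operator that forces the iteration

        return log;
    }
}
```

The throwaway list of Unit values is pure overhead, and nothing in the expression tells the reader that the Select exists only for its side effect, which is part of why this sits at the weird end of the scale.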

Exception handling is indeed a painful art in general and especially in these kinds of scenarios - just look at the recent invention of AggregateException to deal with parallel exceptions (oh my :-)). We definitely need such a thing but the code by itself should try to avoid leaking exceptions from the (parallel) loop as much as possible (much like isolated worker processes do on a larger scale). Concise curly-free syntax for lambdas definitely doesn't stimulate the addition of big fat bodies (although Parallel.For and its siblings are sold by their likeness to their imperative nephew) containing all required exception handling to avoid leaking exceptions out of the processing body. Or how adding impurity... (I don't have to elaborate on this I guess :-)).

As an aside, the concept of error streams (as in PowerShell) has its beauties too especially in pipeline-based systems (objects and errors flow through the pipe as first class citizens, without blocking progress) where the units of work stand by themselves (weak ACID - if 99 out of 100 succeed with the 20th failing, don't keep tasks 21-100 from running). LINQ is pretty much a pipeline (although not as explicit - thanks to abstractions - as >> and |> operators in F#) after all, with the most notable difference being its pull-based (foreach sucks data out of the pipeline) character instead of push-based (dir/get-childitem feeds data into the pipeline). Of course, the current frameworks and runtimes think differently about error handling and debates around side-effects (and concepts like purity annotations) in a more functional world are - to say the least - hot nowadays.
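
A very rough approximation of an error stream on top of a pull-based sequence is sketched here: failures are collected and handed back instead of stopping the loop. ForEachCollectingErrors is a hypothetical name, and this is not how PowerShell implements its error stream.

```csharp
using System;
using System.Collections.Generic;

static class ErrorStreamSketch
{
    // Runs the action for every item; a failing item does not keep later items from running.
    // The collected exceptions play the role of the error stream.
    public static IList<Exception> ForEachCollectingErrors<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (action == null) throw new ArgumentNullException("action");

        var errors = new List<Exception>();
        foreach (T item in source)
        {
            try
            {
                action(item);
            }
            catch (Exception ex)
            {
                errors.Add(ex); // record the failure and keep going
            }
        }
        return errors;
    }
}

static class ErrorStreamDemo
{
    // Processes 1..5 where item 2 fails; returns { processed count, error count }.
    public static int[] Run()
    {
        int processed = 0;
        var errors = new[] { 1, 2, 3, 4, 5 }.ForEachCollectingErrors(n =>
        {
            if (n == 2)
                throw new InvalidOperationException("item " + n + " failed");
            processed++;
        });
        return new[] { processed, errors.Count };
    }
}
```

A caller that prefers a single failure at the end could wrap the returned list in an AggregateException; swallowing everything in a catch-all, as the sketch does, is only reasonable when the units of work really stand by themselves.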



# re: The missing operator - ForEach

Thursday, February 28, 2008 1:32 PM by Keith J. Farmer

Yeah, PS is a different sort of beast. And it's interesting when you try to wrap a push in with a pull, and vice versa. One would think something like that would end up somewhere in the framework (e.g., Stream.AsEnumerable, or Enumerable.AsEvent). Maybe I should have a talk with someone... I did that once: wrapped a serial port with a background worker that would emit the byte stream as an IEnumerable, which would feed into a deserialization engine which would fire events for new messages received, which would then get logged, or otherwise waited upon. I actually added some code to do that to an extensions assembly I'm writing at home, but haven't had a chance to test it. Worked amazingly well.

# Extended LINQ: additional operators for LINQ to objects | Igor Ostrovsky Blogging

Pingback from  Extended LINQ: additional operators for LINQ to objects | Igor Ostrovsky Blogging

# Why Would I Create A Custom LINQ Operator?

Thursday, February 12, 2009 6:23 PM by BusinessRx Reading List

Here are three different reasons: For an operation that doesn’t exist. For readability. For performance.