Wednesday, April 04, 2007 5:33 AM bart

Introducing PLINQ

On the last edition of Developer & IT Pro Days in Belgium last week, Raj Pai - program manager on the C# team at MS Corp - delivered the first public demo of PLINQ aka Parallel LINQ. Let's have a quick look...

By now, every blog reader should be familiar with the concept of LINQ, the Language INtegrated Query technology in .NET 3.5. This technology closes the gap between different querying languages by integrating these in one easy to use framework that's integrated tightly with the language (C# 3.0 and VB 9.0 amongst others in the future). By doing so, querying becomes a first class citizen of the language and dealing with various kinds of data (objects, relational, hierarchical, ... you name it) is really easy to do (no need to learn various querying languages anymore).

However, querying data typically consists of many operations that can be done (partially) in parallel, like querying multiple data sources to feed a join operation down the line (assume you're getting data from a relational DBMS, other data from an XML file, yet other data from a Web Service, etc and you need to join these together somewhere further by means of a join-expression in LINQ). Querying the original sources is an operation that can be done perfectly in parallel, speeding up the overall app's performance. This is exactly what PLINQ is capable of doing: without developers needing to worry about all the nasty details around parallel execution, PLINQ make it available right away. In the world of multi-core this (transparent) functionality is a must-have in order to benefit from today's processor capabilities at a maximum level.

Like all constructs in LINQ, PLINQ is based on extension methods, in this case the AsParallel method. Once you've built the expression tree representing the query, the AsParallel operation is added at the very end, which tells the "LINQ engine" to figure out parallel jobs and do all of the magic required to make the app benefit from multi-threaded execution. A conceptual example is shown below:

var result = (from p in db.Products join pd in xml.ProductDescriptions on p.ID equals pd.ID where p.Price > 100 select new { p.Name, p.Price, pd.Description }).AsParallel();

In such a query, both data sources can be retrieved in parallel, before performing the join operation. In order to set your mind on this concept, I've created a simple non-PLINQ-related sample illustrating the power of parallel execution when doing long-running operations. As an example, consider the calculation of prime numbers in a given range, using a very naive and simple algorithm. The contract to the calculation is "GetPrimes(long from, long to)" and we'll create both a single-threaded version and a multi-threaded version:

1 using System; 2 using System.Collections.Generic; 3 using System.Diagnostics; 4 using System.Threading; 5 6 class Parallel 7 { 8 static void Main() 9 { 10 long MAX = 2000000; 11 12 Stopwatch sw = new Stopwatch(); 13 sw.Start(); 14 foreach (long p in GetPrimes(0, MAX)) 15 ; 16 sw.Stop(); 17 Console.WriteLine("Single threaded: " + sw.Elapsed); 18 for (int i = 2; i <= 8; i++) 19 { 20 THREADS = i; 21 sw.Reset(); 22 sw.Start(); 23 foreach (long p in GetPrimesParallel(0, MAX)) 24 ; 25 sw.Stop(); 26 Console.WriteLine("Multi threaded ({0}): " + sw.Elapsed, THREADS); 27 } 28 } 29 30 static IEnumerable<long> GetPrimes(long from, long to) 31 { 32 bool found; 33 34 for (long i = Math.Max(from, 2); i <= to; i++) 35 { 36 found = false; 37 38 for (long j = 2; j <= Math.Sqrt(i) && !found; j++) 39 if (i % j == 0) 40 found = true; 41 if (!found) 42 yield return i; 43 } 44 } 45 46 static int THREADS = 4; 47 48 static IEnumerable<long> GetPrimesParallel(long from, long to) 49 { 50 long interval = (to - from) / THREADS; 51 52 ManualResetEvent[] whs = new ManualResetEvent[THREADS]; 53 List<long>[] results = new List<long>[THREADS]; 54 55 long b = from; 56 57 for (int i = 0; i < THREADS; i++) 58 { 59 PrimesThreadInfo info = new PrimesThreadInfo(); 60 61 info.evt = whs[i] = new ManualResetEvent(false); 62 info.result = results[i] = new List<long>(); 63 64 info.from = b; 65 info.to = (i == THREADS - 1 ? to : b + interval); 66 b += interval + 1; 67 68 new Thread(GetPrimesThread).Start(info); 69 } 70 71 WaitHandle.WaitAll(whs); 72 73 foreach (List<long> lst in results) 74 foreach (long p in lst) 75 yield return p; 76 } 77 78 static void GetPrimesThread(object obj) 79 { 80 PrimesThreadInfo info = obj as PrimesThreadInfo; 81 if (info != null) 82 { 83 List<long> res = info.result; 84 85 foreach (long p in GetPrimes(info.from, info.to)) 86 res.Add(p); 87 88 info.evt.Set(); 89 } 90 } 91 92 internal class PrimesThreadInfo 93 { 94 public long from, to; 95 public List<long> result; 96 public ManualResetEvent evt; 97 } 98 }

The code is far from optimal, since the partitioning of the original range in lines 64-66 doesn't take care of the fact that calculation times for large numbers are higher (because of the nature of our simple algorithm that tests all candidate divisors that are lower than the square root of the current number). However, this simple implementation shows the power of parallel execution:

Tests were conducted on a Intel Centrino Dual Core machine running at 2.16 GHz on Windows Vista.

You can see clearly the huge difference between single-threaded and dual-threaded in this case. Adding additional threads doesn't help that much, but going from one to two threads certainly does (a 34% performance gain).

Del.icio.us | Digg It | Technorati | Blinklist | Furl | reddit | DotNetKicks

Filed under: , ,

Comments

# LINQ ist ja schon fett...

Wednesday, April 04, 2007 3:19 PM by TheUndeadable entwickelt

... aber PLINQ setzt dem ganzen noch die Krone auf: http://community.bartdesmet.net/blogs/bart/archive/2007/04/04/introducing-plinq.aspx var result = (from p in db.Products join pd in xml.ProductDescriptions on p.ID equals pd.ID where p.Price > 100

# Sadek Drobi&#8217;s Blog &raquo; LinQing your processors&#8230; In Parallel!

# The IQueryable tales - LINQ to LDAP - Part 3: Why do we need entities?

Saturday, April 07, 2007 6:32 PM by B# .NET Blog

Introduction Welcome back to the LINQ-to-LDAP series. So far, we've been discussing: Part 0: Introduction

# Parallel LINQ

Monday, July 09, 2007 9:53 AM by Daniel Moth

Parallel LINQ

# Parallel LINQ

Friday, October 19, 2007 3:04 AM by Willem Meints

With dual core and even quad core processors around it&#39;s time to ramp up the workload so that software

# PLINQ

Tuesday, December 04, 2007 7:59 AM by Haresh

Hi Friends ! We have heard a lot about LINQ and PLINQ. PLINQ stands for Parallel Language Integrated

# plinq | bdotnet

Thursday, April 28, 2011 3:08 AM by plinq | bdotnet

Pingback from  plinq | bdotnet

# Tech-Ed EMEA 08 II/I. rész – Fejlesztői szemmel

Tuesday, August 14, 2012 11:36 AM by Csala Péter szakmai blogja

Fejlesztői szemszögből nézve a Tech-Ed -ről, illetve az ott látott különböző MS technológiáról/termékekről