Friday, June 06, 2008 11:41 PM bart

LINQ to MSI - Part 0 - Introduction

Introduction

Lately I've been delivering talks entitled "LINQ to Anything", to be repeated this summer at TechEd Africa. The goal of those talks is to focus on LINQ from the extensibility point of view, in other words: how to write query providers like LINQ to AD or LINQ to SharePoint (among many others that deserve credit, such as LINQ to Amazon). Obviously, I'm always looking to improve the content of the talk, and as a firm believer in the "small is beautiful" approach, I always welcome smaller samples that pinpoint the core of LINQ providers.

This time around, I decided to trade the IQueryable stuff for the manual implementation approach of the query pattern, something not employed in LINQ to AD or LINQ to SharePoint, although the former one will get a rewrite this summer that employs this methodology. But why this change? Well, although some providers definitely benefit from the whole range of query operators, I'd say there's some kind of break-even analysis to be made when considering the IQueryable path. IQueryable is a funny interface since it's not the interface you'd expect. How's that? Just take a glimpse of it:

[image: diagram of the IQueryable interface and its members]

This interface is somewhat unique because it offers an asymmetric world view. Interfaces always have this tension between implementers (who want just a handful of methods to implement) and users (who want as much flexibility as possible). Abstract base classes are often used as an escape valve, offering convenience overloads, but this conflicts with the single inheritance philosophy, which limits their usefulness. Interfaces with code, providing defaults for some methods (such as convenience overloads, all falling back to the most flexible variant), are one direction considered by language and runtime teams. However, back to IQueryable's asymmetric approach. In the diagram above, you see precisely five members to be implemented, but when you look at the consumer side you'll see more:

[image: the members available on an IQueryable from the consumer side]

What you're seeing here are all the extension methods brought in scope by importing the System.Linq namespace. In fact, IQueryable is one of the first interfaces that powers itself by relying on a paired set of extension methods being available. First, this avoids requiring people to implement all of the query operators manually; there are way too many of them and the chances of messing it up are way too high. Second, there's no single-inheritance trap of the kind you'd get by making, say, Queryable an abstract base class implementing IQueryable and leaving just 5 gaps to be filled in by developers. Finally, these extension methods allow IQueryable to grow over time without developers having to worry about it directly (although obviously new query operators will have to be recognized by the query parser written by implementers if they want to benefit from the new operator, say for example Zip, introduced in PLINQ).
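The pairing of a small interface with a static class of extension methods can be sketched in miniature. The names below (ICounter, CounterExtensions, ListCounter) are made up for illustration and have nothing to do with the real System.Linq types; the point is only that the implementer fills in one member while the consumer sees several.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface: a single member for the implementer to fill in.
interface ICounter
{
    int Count { get; }
}

// Paired extension methods: what the consumer sees on every ICounter.
static class CounterExtensions
{
    public static bool IsEmpty(this ICounter counter) { return counter.Count == 0; }
    public static bool HasAtLeast(this ICounter counter, int n) { return counter.Count >= n; }
}

// The implementer only writes Count; IsEmpty and HasAtLeast come for free.
class ListCounter : ICounter
{
    private readonly List<string> _items = new List<string>();
    public void Add(string item) { _items.Add(item); }
    public int Count { get { return _items.Count; } }
}

static class ExtensionDemo
{
    public static string Run()
    {
        var counter = new ListCounter();
        counter.Add("a");
        counter.Add("b");
        return counter.IsEmpty() + " " + counter.HasAtLeast(2);
    }

    static void Main() { Console.WriteLine(Run()); }
}
```

New extension methods can be added to CounterExtensions later without touching ListCounter, which is the same growth path the paragraph above describes for IQueryable.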

IQueryable is just one part of the picture; there's also IQueryProvider, but let's skip that one for now. The take-away of this discussion is actually on the consumer side of IQueryable, the thing users are faced with when writing queries. The flexibility of IQueryable lies in its endless "fluent" chaining capability (just like e.g. System.String has lots of methods that return a System.String, allowing you to build arbitrary operator chains), but if you're not going to support the lion's share of those, you're somewhat trapping developers by making them believe at compile time the whole thing will work. Say you don't have support for ordering; an IQueryable will still allow you to write things like:

var res = from product in products orderby product.Price descending select product;

while you'll throw a NotSupportedException at runtime pointing out that OrderByDescending (hidden in the fragment above) is a non-supported operator for your provider. Compile-time errors would be better in this case, as pointed out in my rant "Q: Is IQueryable the Right Choice for Me?". Since it's my belief that most of the providers to be written by individuals in the near future will most likely address "basic-queryable" stores, i.e. databases of some sort not supporting the whole set of relational operators, I want to keep IQueryable as the cherry on the cake at the end of the talk, while keeping the core of the talk available for the study of query patterns and their implementation (also allowing for more live coding, which I absolutely love to do).
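To make the compile-time versus runtime gap concrete, here is a minimal sketch of an IQueryable source whose provider rejects ordering. Table<T>, NoOrderingProvider and Product are hypothetical names invented for this illustration, execution of queries is left out entirely, and a real provider would of course do far more than inspect the method name.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

class Product { public string Name; public decimal Price; }

// Hypothetical provider: accepts any operator except OrderBy/OrderByDescending.
class NoOrderingProvider : IQueryProvider
{
    public IQueryable<T> CreateQuery<T>(Expression expression)
    {
        var call = expression as MethodCallExpression;
        if (call != null && call.Method.Name.StartsWith("OrderBy"))
            throw new NotSupportedException(call.Method.Name + " is not supported.");
        return new Table<T>(this, expression);
    }

    // Not needed for this sketch.
    public IQueryable CreateQuery(Expression expression) { throw new NotSupportedException(); }
    public T Execute<T>(Expression expression) { throw new NotSupportedException(); }
    public object Execute(Expression expression) { throw new NotSupportedException(); }
}

// The five members an IQueryable<T> implementer has to supply.
class Table<T> : IQueryable<T>
{
    public Table()
    {
        Provider = new NoOrderingProvider();
        Expression = Expression.Constant(this);
    }

    internal Table(IQueryProvider provider, Expression expression)
    {
        Provider = provider;
        Expression = expression;
    }

    public Type ElementType { get { return typeof(T); } }
    public Expression Expression { get; private set; }
    public IQueryProvider Provider { get; private set; }

    public IEnumerator<T> GetEnumerator()
    {
        throw new NotSupportedException("Execution is out of scope for this sketch.");
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

static class RuntimeFailureDemo
{
    public static string Run()
    {
        var products = new Table<Product>();
        try
        {
            // Compiles without complaint, because Queryable.OrderByDescending is in scope.
            var res = from product in products orderby product.Price descending select product;
            return "no error";
        }
        catch (NotSupportedException ex)
        {
            return ex.Message;
        }
    }

    static void Main() { Console.WriteLine(Run()); }
}
```

In this sketch the exception surfaces when the query is built, since Queryable.OrderByDescending hands the expression to the provider's CreateQuery; other providers might only complain when the query is executed. Either way, the compiler has no say in it.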

 

Ducks quack

As pointed out in my "Q: Is IQueryable the Right Choice for Me?" post, the C# 3.0 and VB 9.0 languages do not rely on interfaces being implemented in order to provide query capabilities. Any type that has the right methods available can participate in query comprehensions (read: language integrated query syntax - I vote for the lower-case abbreviation "linq" to be used when talking about the language feature and the upper-case abbreviation "LINQ" to be used when talking about everything including providers). This is actually some form of what I'd call "statically typed duck typing": as long as there are methods with the right signatures, queries will work. For example, in order for the "where" keyword to work, the compiler expects to find a method called Where that has as its first real parameter (real because the method could be an extension method or a regular instance method) something compatible with a boolean-producing lambda. Typical samples include a Func<T, bool> or an Expression<Func<T, bool>>.
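A minimal sketch of that idea: the ProductSource type below (a made-up name, as are Product and QueryPatternDemo) implements no LINQ interface at all and exposes only Where and Select, yet the query syntax compiles against it. It filters an in-memory list purely to keep the sample short; a real provider would translate the query instead.

```csharp
using System;
using System.Collections.Generic;

class Product { public string Name; public decimal Price; }

// Hypothetical source type: no IEnumerable, no IQueryable, just two methods
// whose names and signatures match what the compiler looks for.
class ProductSource
{
    private readonly List<Product> _items;

    public ProductSource(List<Product> items) { _items = items; }

    // Picked up by the "where" clause.
    public ProductSource Where(Func<Product, bool> predicate)
    {
        return new ProductSource(_items.FindAll(p => predicate(p)));
    }

    // Picked up by the "select" clause.
    public List<R> Select<R>(Func<Product, R> selector)
    {
        return _items.ConvertAll(p => selector(p));
    }
}

static class QueryPatternDemo
{
    public static string Run()
    {
        var source = new ProductSource(new List<Product>
        {
            new Product { Name = "Apple", Price = 5m },
            new Product { Name = "Laptop", Price = 900m },
            new Product { Name = "Phone", Price = 500m }
        });

        // Translated by the compiler into source.Where(...).Select(...).
        List<string> names = from p in source where p.Price > 100 select p.Name;

        // Adding an "orderby" clause here would be rejected at compile time,
        // because ProductSource has no OrderBy method.
        return string.Join(",", names.ToArray());
    }

    static void Main() { Console.WriteLine(Run()); }
}
```

Leaving out an operator is therefore a way of telling users, at compile time, that the store does not support it.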

Notice this is not something new for languages like C#. Patterns like "foreach" rely on a method called GetEnumerator being available on the type being enumerated over. The type doesn't necessarily need to be an IEnumerable of some sort. Some people believe in over-typing and think this isn't "quite right" for a currently statically typed language like C#, but honestly: if an object has a GetEnumerator method returning an IEnumerator (or any type with a suitable MoveNext method and Current property), what are the odds that object isn't enumerable? It's just like duck typing: if it walks like a duck and quacks like a duck, it ought to be a duck (or, you're right, a genetically manipulated dog - let's forget about AOP for a while :-)); it only has a more strongly typed signature requirement.
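The foreach pattern can be tried out with a type that implements neither IEnumerable nor IEnumerator. Countdown and CountdownEnumerator are hypothetical names used only for this sketch.

```csharp
using System;
using System.Collections.Generic;

// No IEnumerable here: only a public GetEnumerator method.
class Countdown
{
    private readonly int _start;

    public Countdown(int start) { _start = start; }

    public CountdownEnumerator GetEnumerator() { return new CountdownEnumerator(_start); }
}

// No IEnumerator here either: only MoveNext and Current.
class CountdownEnumerator
{
    private int _next;
    private int _current;

    public CountdownEnumerator(int start) { _next = start; }

    public int Current { get { return _current; } }

    public bool MoveNext()
    {
        if (_next <= 0) return false;
        _current = _next--;
        return true;
    }
}

static class ForeachPatternDemo
{
    public static string Run()
    {
        var seen = new List<string>();

        // The compiler binds foreach to GetEnumerator/MoveNext/Current by shape.
        foreach (int n in new Countdown(3))
            seen.Add(n.ToString());

        return string.Join(",", seen.ToArray());
    }

    static void Main() { Console.WriteLine(Run()); } // prints 3,2,1
}
```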

 

Goals and non-goals of LINQ to MSI

Just like LINQ to AD, the goal of LINQ to MSI is to act as a sample. Although many people have already mailed me to say they absolutely love LINQ to AD and are productizing it, I have to stress the fact it's sample-level quality (which nevertheless can be very useful to build upon). This being said, LINQ to AD and LINQ to SharePoint will get their promised updates some time in the near future (I'm keeping the schedule a little vague, I know, but both projects, which I'd categorize as "personal incubation projects", have spread their tentacles into other projects I'm working on from time to time).

Where the sample distinguishes itself from LINQ to AD is in its approach:

  • We won't go down the IQueryable route.
  • We'll focus more on the structure of and cross-relationships between a query provider, entity objects, data collections and query objects.
  • There'll be less focus on implementing the query parser.

So how can MSI be queried? Well, MSIs are just little databases. There's this tool called Orca that comes with the Windows SDK that allows you to inspect an MSI's internal structure. For example, in the picture below I opened up the Windows PowerShell 2.0 CTP MSI (which just happened to be the first one in my temp folder):

[image: Orca showing the internal structure of the Windows PowerShell 2.0 CTP MSI]

There are no secrets in there, so feel free to do the same :-). Obviously, changing an MSI is at your own risk. But why would you like to LINQify MSI? As I said, it's a sample in the first place, but the common theme in LINQ is about democratizing data access from .NET programming. The query language used by MSI is a lightweight SQL variant, but do you really want to learn another SQL dialect? Since the answer is no (allow me to answer on the reader's behalf), that's already one good reason. In addition, stores that weren't easily accessible in the past become more accessible now through unified query provider models and syntax, which is definitely a good thing. You can think of an endless number of samples with MSI, for example:

  • Create your own Orca tool.
  • Develop a setup builder tool that uses the LINQ to MSI entity model to write an MSI (this won't be supported directly through the sample implementation, but adding DML functionality shouldn't be too hard - maybe one day I'll show it to you).
  • Make an ASP.NET handler that allows users to "download this website as MSI", grabbing content dynamically and composing an MSI that will install an offline copy of the site to your local IIS (dreaming aloud, but definitely feasible for the braver readers out there).
  • Etc.

To convince you it actually works, here's the output of the following piece of LINQ code (omitted the straight-forward foreach loop):

var msi = new MyMsi(@"C:\temp\PowerShell_Setup_x86.msi");
var res = from prop in msi.Properties select new { prop.Name, prop.Value };

[image: output of the query above, listing the MSI's property names and values]

Next time, we'll start with the plumbing of MSI interop to allow querying an MSI database from managed code. You'll see classes like MsiConnection, MsiCommand and MsiDataReader appear on the surface of the blog. Stay tuned!

Comments

# re: LINQ to MSI - Part 0 - Introduction

Saturday, June 07, 2008 12:10 PM by Hal Rottenberg

Bart, I don't know enough about the dev side to know if this is a stupid question or not but here goes: Would I be able to take advantage of LINQ to MSI (or LINQ in general from a wider point-of-view) from within PowerShell? I know someone made an MSI snapin but I seem to recall it being a pretty simple thing. Having the ability for admins to query and work with MSI packages seems like it could be awfully useful, and the point of not learning yet another SQL variant rings true for everyone, not just developers. :) thanks Co-host, PowerScripting Podcast (http://powerscripting.net)

# LINQ to MSI

Friday, July 25, 2008 10:51 AM by InstallSite Blog

LINQ stands for Language-Integrated Query and enables you to directly query databases in .NET programming
