Wednesday, February 18, 2009 2:11 AM bart

The M Programming Language – Part 2 – Collections and Extents

Last time in this series, we looked at M’s structural type system, pointing out the differences compared to nominal type systems and why structural typing has its benefits when dealing with data. Obviously there’s more to data than types: we need containers to store that data in. That’s what collections and extents are for. Ready? Go!

 

Collections

In the last post, we’ve already seen a core operator that acts on collections: the in operator. But wait a minute, didn’t in operate on a value and a type, like “Hello” in Text? It turns out M’s type system is so unified that types and collections are very closely related. Indeed, a type describes the set of possible values that something declared to be of that type can take. Besides types, collections can also be used to group values together. Values contained in a collection are called elements, and collections themselves are treated as values.

Again, we’ll rely on MrEPL to teach us how M works in an interactive fashion. Let’s start by playing around a bit with the syntax to define collections of values:

image

Collections are built from comma-separated lists (“lists” used in an informal way here – see further) of values in between a pair of curly braces, much like collection initialization syntax in C# 3.0 and beyond (or arrays before that). A few notable things:

  • There’s an empty collection.
  • Collections can contain different types of values.
  • Nesting of collections is possible as collections themselves can be treated as values.

When talking about collections, different questions come to mind. The most important ones are “are duplicates allowed?” (a.k.a. “is it a set or not?”) and “does order matter?” (a.k.a. “is it a list or not?”). Let’s put those questions to the test and find out:

image

Clearly, duplicates are allowed but order is irrelevant. (You know a name for such a collection, don’t you?) It’s not hard to see why this design was chosen, as M is about modeling data in a general way, and typically maps to repositories based on database technologies. We’ll see later how uniqueness can be enforced when dealing with “real” data.
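For readers without MrEPL at hand, here is a minimal Python sketch of the same idea. It is an analogy rather than M code: the `bag` helper is a hypothetical name, and it assumes that equality compares how many times each element occurs, as the experiment above suggests.

```python
from collections import Counter


def bag(*values):
    """Model a collection as a multiset: each element maps to its multiplicity."""
    return Counter(values)


# Order is irrelevant: the same elements in a different order compare equal.
same_when_reordered = bag(1, 2, 3) == bag(3, 2, 1)

# Duplicates are kept: an extra occurrence of 2 makes the collections differ.
differs_with_duplicate = bag(1, 2, 2) != bag(1, 2)

print(same_when_reordered, differs_with_duplicate)
```

Python's built-in set would drop the duplicate and a list would care about order, so Counter is the closest standard type to the behavior described here.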

From the sample above, you can already infer a few operators that act on collections: == and != check for equality and inequality respectively. What about other operations? As M is inspired by set theory, it shouldn’t be too surprising that operators exist to check for subset (<=) and superset (>=) relationships and to define unions (|) and intersections (&):

image

Notice though how set operations always return sets (i.e. containing no duplicates). The choice of | and & as operators for union and intersection respectively comes from the relationship between set theory and logic. If a set is defined by a predicate (the membership condition that determines whether a value is part of the set or not), taking the union means finding all the elements for which either one (or both) of the sets’ predicates evaluates to true. Similarly, the intersection holds all elements for which both predicates are true. Think of it this way:

{ 1 } ~ (Number where value == 1)
{ 2 } ~ (Number where value == 2)

then

({ 1 } | { 2 }) ~ (Number where value == 1 || value == 2)
({ 1 } & { 2 }) ~ (Number where value == 1 && value == 2)
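The correspondence between set operators and logical operators can be sketched in Python. This is an illustration only, not M: the helper names (`is_one`, `is_two`, `union`, `intersection`) are made up for the example, and a set is represented by its membership predicate.

```python
def is_one(value):
    """Membership predicate for the set { 1 }."""
    return value == 1


def is_two(value):
    """Membership predicate for the set { 2 }."""
    return value == 2


def union(p, q):
    """A value is in the union when either predicate holds (logical or)."""
    return lambda value: p(value) or q(value)


def intersection(p, q):
    """A value is in the intersection when both predicates hold (logical and)."""
    return lambda value: p(value) and q(value)


candidates = [0, 1, 2, 3]
in_union = [v for v in candidates if union(is_one, is_two)(v)]
in_intersection = [v for v in candidates if intersection(is_one, is_two)(v)]

print(in_union)         # [1, 2]
print(in_intersection)  # []

# Set operations yield sets, so duplicates in the operands collapse.
merged = set([1, 1, 2]) | set([2, 3])
print(merged == {1, 2, 3})  # True
```

No value can equal 1 and 2 at the same time, which is why the intersection comes out empty.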

What other operations can be carried out on collections? What about checking whether a value belongs to a collection, and what about query operations?

image

Friends of LINQ should be immediately familiar with the query operators like filtering and projection. Next, let’s point out some other nice features such as checking for the number of elements (including duplicates) in a collection and turning a collection into a set by means of the Distinct operator:

image
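As a rough Python analogy for the operations just mentioned (membership, filtering, projection, counting and Distinct), consider the sketch below. The sample data is made up, and a Python list stands in for an M collection even though its ordering carries no meaning here.

```python
# A list stands in for a collection; the duplicate 2 is intentional.
numbers = [1, 2, 2, 3, 4]

# Membership test, comparable to the in operator.
contains_three = 3 in numbers

# Filtering keeps the elements that satisfy a condition.
evens = [n for n in numbers if n % 2 == 0]

# Projection maps every element to a new value.
squares = [n * n for n in numbers]

# Counting includes duplicates; converting to a set removes them.
count = len(numbers)
distinct = set(numbers)

print(contains_three, evens, squares, count, distinct)
```

LINQ users will recognize the filter and the projection as Where and Select.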

In the previous post, we’ve seen how to declare types. Collections obviously can take values of any type, so you can have collections of things like Products. Let’s show a sample based on entities, which consist of name/value pairs (notice the anonymous construction of values below):

image

 

Collection types

In the previous section we’ve been looking at collection values, which can be thought of as containers of elements that themselves are values. That makes sense, right? Now, we’ll take a look at the same concept from a different angle, using types. So far we haven’t spelled out the type of a particular collection value, but obviously there should be a way to do this. If not, how would we say things like “I have a Person type and each object of that type should contain a collection of numbers (whatever they represent)”?

The common base for collection types is called, no surprise, Collection. First of all, it’s important to know the difference between a singleton (which is a collection with one value) and a scalar (a single value, which could be based on a type that is compound by itself):

image

In the sample above you’ve seen two distinct “types” of types. Actually their “order” is different: Number (representing a “scalar” value) versus Collection (representing a collection of values). But how do we go from a “scalar” type (a Number, a Person, whatever) to a collection type based on that? The answer is by means of a type constructor. You already know type constructors from the world of the CLR. Given any type T you can build up a new type like T[] for an array of objects of that type (I’m not using the word “value” here as that would be ambiguous in CLR lingo). Notice that no-one had to declare a Person[] explicitly; the mere fact there is a Person type allows the [] type constructor to be applied to it, yielding a new type (constructed by the runtime) that represents an array of Person objects. If you read ECMA 335 cover-to-cover you’ll discover other constructs in the CLR that play the role of type constructors although there aren’t that many. M though has quite a few type constructors that allow you to define a collection:

image

The type constructors restrict the type of the collection elements and the cardinality bounds of the collection. Four constructs are available to limit the cardinality: the three Kleene operators (? = 0 or 1, * = 0 or more, + = 1 or more) and the #m..n operator specifying (inclusive) lower and upper bounds to the element count.
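The four cardinality constructs boil down to an inclusive lower and upper bound on the element count. A small Python sketch of that reading follows; `BOUNDS` and `within_bounds` are hypothetical names for this illustration and not part of M.

```python
import math

# Each Kleene-style operator maps to an inclusive (lower, upper) bound.
BOUNDS = {
    "?": (0, 1),         # zero or one element
    "*": (0, math.inf),  # zero or more elements
    "+": (1, math.inf),  # one or more elements
}


def within_bounds(items, spec):
    """Check the element count against an operator or an explicit (m, n) pair.

    A (m, n) tuple plays the role of #m..n, with both bounds inclusive.
    """
    lower, upper = BOUNDS[spec] if isinstance(spec, str) else spec
    return lower <= len(items) <= upper


print(within_bounds([], "?"))             # True
print(within_bounds([1, 2], "?"))         # False
print(within_bounds([], "+"))             # False
print(within_bounds([1, 2, 3], (2, 3)))   # True
```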

Based on this, we can define our own collection types. It’s important to note though that constraints in collection type definitions have two “pseudo”-variables available: value, referring to the collection itself, and item, referring to each individual item in the collection:

image
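The difference between the two pseudo-variables can be mimicked in Python with two separate checks: one applied to every element (item) and one applied to the collection as a whole (value). The names below are invented for this sketch, and the constraints themselves are arbitrary examples.

```python
def make_collection_type(item_check, value_check):
    """Build a membership test from a per-element and a whole-collection check."""
    def accepts(collection):
        every_item_ok = all(item_check(item) for item in collection)
        return every_item_ok and value_check(collection)
    return accepts


# Example constraint: every item is even, and the collection has at most 3 elements.
small_evens = make_collection_type(
    item_check=lambda item: item % 2 == 0,
    value_check=lambda value: len(value) <= 3,
)

print(small_evens([2, 4]))        # True
print(small_evens([2, 3]))        # False: 3 fails the item check
print(small_evens([2, 4, 6, 8]))  # False: too many elements for the value check
```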

I’ll leave it to the reader to play a bit more with collection types.

 

Extents

Values are one thing, but without storage for them they’re not really usable in modeling scenarios where you want to keep data around. So we need dynamic storage for those values (which typically means a collection of values), and that storage is what we call an extent. To show how this works, we’ll walk through the tool chain and create a table of Person values, using the following key steps in the declaration of the model:

  1. Define the type for the values, i.e. Person.
  2. Define an extent for the values, based on a collection type over our entity type (i.e. Person* becomes People).
  3. Wrap the whole thing in a module in order to make it deployable.

Here’s the basic sample:

image

Notice that in order to make this work, you’ll need to provide a concrete type for storage. E.g. for Age, you’ll use an Integer32 or so (well, if you expect the modeled people to get really old that is :-)).

image

Where’s the extent in the sample above? The last line in the Demo module is where the storage is allocated, concretized using a table definition in SQL. How to get this model into SQL Server was the subject of my introductory post: Getting Started with Oslo – Introducing “M”. In the next episodes we’ll dive a little deeper into things like SQL generation, computed values and queries, before tackling MGrammar.
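To make the split between a type and an extent a bit more tangible, here is a loose Python analogy using the standard sqlite3 module. It is not what the M tool chain generates: the table layout, the column types and the sample rows are assumptions made up for this sketch.

```python
import sqlite3
from dataclasses import dataclass, astuple


@dataclass
class Person:
    """Plays the role of the type: it describes a shape but stores nothing."""
    name: str
    age: int


# Plays the role of the extent: actual storage for a collection of Person values.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE People (Name TEXT NOT NULL, Age INTEGER NOT NULL)"
)

# Made-up sample data, inserted purely for illustration.
people = [Person("Bart", 25), Person("John", 60)]
connection.executemany(
    "INSERT INTO People (Name, Age) VALUES (?, ?)",
    [astuple(person) for person in people],
)

stored = connection.execute("SELECT COUNT(*) FROM People").fetchone()[0]
print(stored)  # 2
```

Defining the Person class alone creates no storage, which mirrors why a module with only a type definition has nothing to generate SQL for.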

Cheers!


Comments

# re: The M Programming Language – Part 2 – Collections and Extents

Wednesday, February 18, 2009 5:35 AM by Doug

Great posts on Oslo. Very helpful.

# re: The M Programming Language – Part 2 – Collections and Extents

Saturday, February 21, 2009 12:48 PM by Frank Quednau

Hi,

sadly I cannot comment on the story I was working through (Introducing "M"). The moment I want to look at the Repository SQL I get the statement "Unable to generate SQL because there are no computed values or extents defined.". However, even after just copying your example on the page I get that statement. Do you have any idea why that could be?

Thanks and kind regards

# re: The M Programming Language – Part 2 – Collections and Extents

Sunday, February 22, 2009 6:54 PM by bart

Hi Frank,

I've seen this message myself, but never in a context where it's inappropriate :-). For example, if you write:

module Demo {
   type Person {
       Name : Text;
       Age : Integer32;
   }
}

you should get it because this only defines a type but not associated (table) storage. To do that, you'd need to add the following to the module:

  People : Person*;

Then the error goes away. Please let me know if you experience further problems.

Thanks,

-Bart
