March 2005 - Posts

After all the Comega stuff (which will be continued for a while I guess), let's post some funny things (allowing you to take a breath after all this nasty new syntax and geeky IL stuff).

Have you ever wondered about those funny codenames such as Everett, Whidbey, Longhorn, BlackComb, Yukon, Whitehorse? Well, you can find all of these somewhere in North America using MapPoint :-). For example, the roadmap of the development tools goes from Everett over Whidbey to Orcas. Everett is a town near Redmond and Seattle. By crossing some water ("Possession Sound") you end up on Whidbey Island (likely they skipped Gedney, another island, as a codename). Going further in the northwestern direction, you'll end up on another island called Orcas. What will be next? San Juan, Saltspring, Lopez all sound pretty good in my personal opinion.

Other codenames also have funny stories associated with them, such as Longhorn, BlackComb and Whistler (see http://geekswithblogs.net/evjen/archive/2003/11/10/490.aspx). Yukon and Whitehorse can be found in Canada and Kodiak is in Alaska (other islands in that neighborhood such as Woody and Long were rejected for some reason, maybe you can think of some).

Feel free to complete this (rather useless) list :-)


Introduction

In the 4th episode of my "Adventures in Comega" series I'll be talking about a (smaller) language feature called "possibly-null values". As an example, consider a boolean, which can hold only two values (binary logic): true or false. However, what happens if you want to express that the logical value is possibly unknown? As a bool is a value type and its domain is only {0,1}, we can't express this. Possibly-null values solve this problem by allowing you to assign null to the variable (which is the same as not assigning anything to it at all) to indicate that the value is unknown, but only if the variable is marked as being possibly-null.

 

Reference types and null

Today, you can use the "null" value for reference types. Basically, null indicates that the variable has not been assigned an object. As a matter of fact, behind the scenes that's the same as having a NULL pointer in the variable, because of the nature of reference types. The memory location of a reference-type variable contains just an address (that is, a pointer) to another place in memory where the real data is stored (which is called dynamically allocated memory, cf. malloc in C). In C# today the null keyword is used for this purpose: indicating whether or not a variable refers to an object. Quite often you'll see code like this:

if (null != someVar)
  //do this
else
  //ow, there is some problem, handle it

C# programming style tip: as a side remark, allow me to explain the programming style of putting the constant first in an equality comparison (null != someVar). When you're comparing things in C#, you always need to type two characters: == for equality and != for inequality. It's easy to forget one of these characters (not through lack of language knowledge I hope, but because of a typo). When you write a == 5, there's no problem. But if you forget one of the two equality symbols, you get a = 5, an assignment. In C/C++ an assignment is an expression that evaluates to the assigned value, which is treated as true when it's not 0 and false when it is 0, so this code compiles too (often without warnings, depending on the compiler settings). C# is stricter: for an int it reports an error because an int can't be converted to bool implicitly, and only for a bool variable does the assignment slip through, with at most a warning about the risky construction. By reversing the constant and the variable, like 5 == a, it's still possible to make the same typo (5 = a), but now you'll always get an error because a constant cannot be assigned to.

Another place where null values are used is in databases (you probably know the DBNull value). Today, in O/R mapping you can't directly express that a boolean, an integer or another field of a basic type has null as its value, because null is not in the domain. Comega will help to solve this problem too.

 

NullReferenceException, casts and "as"

One of the other things related to the concept of null is the NullReferenceException. Take a look at the following code:

SomeClass c = null;
c.DoSomething();

Although this compiles, the CLR will throw a NullReferenceException at runtime because you can't perform an operation on a null-valued variable. Or, in C terms, you can't dereference a null pointer:

SomeClass *c = NULL; //or "SomeClass* c = NULL", anyway c is a pointer (indicated by the asterisk)
(*c).DoSomething(); //the same as c->DoSomething(), but the *c syntax tells a little more in this demo for C-newbies :-)

The way to solve this problem is to put the whole thing in a try...catch block or to test the value of c first. In the same way, the next piece of code with a property will fail:

SomeClass c = null;
string s = c.SomeStringValuedProperty;

Yet another place where the null value is present is when you're using the keyword "as" in C# to perform a cast that can possibly fail:

//assume you got some variable o of type System.Object (e.g. through a method parameter)
MyClass c = (MyClass) o; //will throw an exception if o is not a (subtype of) MyClass instance
MyClass cbis = o as MyClass; //won't throw an exception but will assign null to cbis if the type constraints are not fulfilled

 

Introducing possibly-null values

In Comega, this problem is solved using possibly-null values, as shown in the next example:

bool? b = null; //you can perform the test (null == b)

This piece of code declares a boolean variable that can possibly be null (indicated by the ?). So, you can assign null, true and false to it, and you can test it for a null value. In the case of a boolean value, this gives you a kind of three-valued logic. Now, if you're using such a variable, you can even cast a null-valued variable without encountering an exception:

MyClass? c = null;
MyClass d = (MyClass) c; //d is not possibly-null but as c is possibly null, this cast does not throw an exception

Notice that you can't do this kind of cast with a value type such as a bool. If you write this, it fails:

bool? b = null;
bool a = (bool) b; //will throw an exception, value types without a ? are never nullable
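If you don't have a Comega compiler at hand, the three states of a bool? can be imitated in Python with Optional[bool]. This is only a sketch in a different language, not Comega itself: the helper names to_plain_bool and describe are made up for illustration, and to_plain_bool merely stands in for the (bool) cast shown above.

```python
from typing import Optional


def to_plain_bool(b: Optional[bool]) -> bool:
    """Hypothetical stand-in for the (bool) cast: refuse to convert 'unknown'."""
    if b is None:
        raise ValueError("value is null (unknown), cannot convert to a plain bool")
    return b


def describe(b: Optional[bool]) -> str:
    """Name the three states a possibly-null boolean can be in."""
    if b is None:
        return "unknown"
    return "true" if b else "false"


states = [describe(v) for v in (None, True, False)]
print(states)
```

The point of the sketch is just that "unknown" is a third state next to true and false, and that converting it to a plain boolean has no sensible answer.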

 

Transitivity of null values

One of the things Comega wants to solve by introducing the concept of possibly-null values is the (infamous?) NullReferenceException. The idea is to make null transitive, so that a property getter call on a null-valued variable can return null too:

MyClass? c = null;
bool? b = c.SomeBooleanValuedProperty; //b will be null; no exception should be thrown
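The idea of null "flowing through" a member access can be sketched in Python as well. This is an illustration of the concept only, not of how Comega implements it; the class MyClass, its attribute and the helper get_member are all hypothetical names.

```python
from typing import Any, Optional


class MyClass:
    """Hypothetical class with one boolean-valued property."""

    def __init__(self, flag: bool) -> None:
        self.some_boolean_valued_property = flag


def get_member(target: Optional[Any], name: str) -> Optional[Any]:
    """Return None when the target is None instead of failing; otherwise read the member."""
    if target is None:
        return None
    return getattr(target, name)


c = None
b = get_member(c, "some_boolean_valued_property")  # stays None, no exception
d = get_member(MyClass(True), "some_boolean_valued_property")
print(b, d)
```

Reading a member of "nothing" simply yields "nothing" again, which is what the possibly-null result type on the left-hand side has to be able to hold.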

"Homework": what will be the result of the following code snippets?

MyClass? c = null;
string s = c.SomeStringValuedProperty;

and of this:

MyClass? c = null;
string? s = c.SomeStringValuedProperty;

and this:

MyClass? c = null;
MyClass child = c.SomeChildMyClassValue;

 

And once again ... we'll dive into the IL stuff

Let's keep it as simple as possible this time :-). Consider this piece of code:

public static void Main()
{
  bool? b;
  Console.WriteLine(b);
  b = true;
  Console.WriteLine(b);
}

In compiled format, the IL of Main is this:

.method public hidebysig static void  Main() cil managed
{
  .entrypoint
  // Code size       32 (0x20)
  .maxstack  5
  .locals init (valuetype StructuralTypes.'Boxed' V_0)
  IL_0000:  ldloca.s   V_0
  IL_0002:  call       instance object StructuralTypes.'Boxed'::ToObject()
  IL_0007:  call       void [mscorlib]System.Console::WriteLine(object)
  IL_000c:  ldc.i4.1
  IL_000d:  newobj     instance void StructuralTypes.'Boxed'::.ctor(bool)
  IL_0012:  stloc.0
  IL_0013:  ldloca.s   V_0
  IL_0015:  call       instance object StructuralTypes.'Boxed'::ToObject()
  IL_001a:  call       void [mscorlib]System.Console::WriteLine(object)
  IL_001f:  ret
} // end of method Test::Main

Clearly, bool? is translated into a structural type called Boxed, a generic type parameterized with the type of the target variable (in this case System.Boolean). We already saw the Boxed type in the previous post (also when using ?, but then to indicate the number of occurrences inside the struct of a content class definition). Now, you can take a closer look at the Boxed type.

One of the first things you'll see is the IsNull method:

.method public hidebysig instance bool  IsNull() cil managed
{
  // Code size       10 (0xa)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  ldfld      bool[] StructuralTypes.'Boxed'::'box'
  IL_0006:  ldnull
  IL_0007:  ceq
  IL_0009:  ret
} // end of method 'Boxed'::IsNull

This is the one being used to determine the "null-ness" of the variable. Furthermore, there is a getter (GetValue) and a setter (SetValue), which are both self-explanatory (the same statement holds for the constructor and the Equals method).
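To get a feeling for what such a wrapper does, here is a small Python sketch modeled on the members mentioned above (a box field, IsNull, GetValue, SetValue). It is not a translation of the real Boxed type: storing the value in a one-element list and raising an error when reading a null value are assumptions made for the sake of the example.

```python
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


class Boxed(Generic[T]):
    """Sketch of a nullable wrapper: 'box' is None for null, else a one-element list."""

    def __init__(self, value: Optional[T] = None) -> None:
        self.box: Optional[List[T]] = None if value is None else [value]

    def is_null(self) -> bool:
        # Same idea as the IL above: compare the 'box' field with null.
        return self.box is None

    def get_value(self) -> T:
        # Assumption: reading a null value is treated as an error here.
        if self.box is None:
            raise ValueError("Boxed value is null")
        return self.box[0]

    def set_value(self, value: T) -> None:
        self.box = [value]


b = Boxed()
was_null = b.is_null()
b.set_value(True)
print(was_null, b.is_null(), b.get_value())
```

The wrapper needs no extra "has a value" flag: the absence of the box itself encodes null.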

Also, you'll find a couple of static methods for the operator overloads for equality, inequality and casting (both explicit and implicit). These are pretty simple to understand too if you know the nature of the comparison overloads (three overloads for == and three for !=, each taking a different combination of operand types, thus 6 comparison operator static methods in total).

Notice you'll also find a class called BoxedEnumerator (generic too, constructed with System.Boolean in our example) which was not used directly in our sample.


Introduction

One of the targets of the Comega language is to build a bridge between semi-structured data (read: XML) and objects. In future posts I'll describe how Comega fills the gap between relational data (read: SQL) and objects. But in this post, let's concentrate on the former case.

 

About DTD and XSD

By itself, XML is nothing more than a large text string or text file containing semi-structured data, separated and ordered by means of a tagging mechanism. Although the different fields can be distinguished, there's a strong need to give fields a meaning by using types. There's another need too: being able to express certain constraints on the usage of fields (for example: has to occur, is optional, can occur one or more times, etc). That's where DTD and XSD, the XML schema languages, come into play. Of course, the .NET Framework supports this kind of stuff by default (using the System.Xml namespace) but Cw wants to integrate these things deeper into the language itself.

 

Content classes - a first view

Let's take a simple example of a book (library) collection. As you know, a book has a title, one or more authors, an ISBN code and optionally you can categorize it in one or more categories. In DTD, this looks as follows:

<!ELEMENT Book (Title, Authors, ISBN, Categories)>
<!ELEMENT Title (#PCDATA)>
<!ELEMENT Authors (Author+)>
<!ELEMENT ISBN (#PCDATA)>
<!ELEMENT Categories (Category*)>
<!ELEMENT Author (#PCDATA)>
<!ELEMENT Category (#PCDATA)>

As you can see, the symbols + and * are used to indicate respectively "one or more" and "zero or more". There's also the symbol ? that can be used to indicate "zero or one" (= optional). What we don't have here is strong typing.

As an alternative we can use XSD to describe the same structure:

<element name="Book">
  <complexType>
    <sequence>
      <element name="Title" type="string"/>
      <element name="Authors">
        <complexType>
          <sequence minOccurs="1" maxOccurs="unbounded">
            <element name="Author" type="string"/>
          </sequence>
        </complexType>
      </element>
      <element name="ISBN" type="string"/>
      <element name="Categories">
        <complexType>
          <sequence minOccurs="0" maxOccurs="unbounded">
            <element name="Category" type="string"/>
          </sequence>
        </complexType>
      </element>
    </sequence>
  </complexType>
</element>

Instead of *, + and ?, XSD uses the minOccurs and maxOccurs attributes on the tags. Functionally it's the same, and here we do have strong typing of the elements.

Both structures can be used to define a book like this:

<Book>
  <Title>Title goes here</Title>
  <Authors>
    <Author>First Author</Author>
  </Authors>
  <ISBN>0123456789</ISBN>
  <Categories>
    <Category>One</Category>
    <Category>Two</Category>
  </Categories>
</Book>

However, it's far from cool to use this kind of data definition inside code. Did you ever use XmlDocument (DOM) or other XML processing APIs like SAX? The construction of this kind of data object is far from easy and looks rather clumsy when viewed inside code. Luckily, there are a couple of ways to get around this, most notably the use of a strongly typed DataSet in the .NET Framework (created by using xsd.exe). But in the end, the internal representation of elements marked with ?, +, * is based on collection types and you get to see this directly, e.g. through the DataTable's Rows collection. Okay, you can iterate over it, but the translation battle going on to map both data representations is pretty visible.
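To make the "clumsy" point concrete, here is roughly what building the sample book looks like with a general-purpose tree API. The sketch uses Python's xml.etree.ElementTree rather than .NET's XmlDocument, purely as an illustration of the node-by-node style; the function name build_book is made up.

```python
import xml.etree.ElementTree as ET


def build_book(title, authors, isbn, categories):
    """Build the <Book> element node by node, the way a DOM-style API forces you to."""
    book = ET.Element("Book")
    ET.SubElement(book, "Title").text = title
    authors_node = ET.SubElement(book, "Authors")
    for author in authors:
        ET.SubElement(authors_node, "Author").text = author
    ET.SubElement(book, "ISBN").text = isbn
    categories_node = ET.SubElement(book, "Categories")
    for category in categories:
        ET.SubElement(categories_node, "Category").text = category
    return book


book = build_book("Title goes here", ["First Author"], "0123456789", ["One", "Two"])
print(ET.tostring(book, encoding="unicode"))
```

Every element needs its own call, and nothing in the code checks that a book has at least one author or that the elements appear in the right order.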

So, how can Cw help us to get a better model for coping with this semi-structured data in a more object-oriented fashion? The answer is content classes, which are based on the DTD syntax but have strong typing on board, using the type model of the language and runtime (therefore every object can be used in the structure). Here's the book sample as a content class:

public class Book {
  struct {
    string Title;
    struct {
      string Author;
    }+ Authors;
    string ISBN;
    struct {
      string Category;
    }* Categories;
  }
}

Optional fields can be declared in a similar fashion using the ? symbol. For example, a book can have an optional URL with additional information and/or errata:

public class Book {
  struct {
    string Title;
    struct {
      string Author;
    }+ Authors;
    string ISBN;
    struct {
      string Category;
    }* Categories;
    string? URL;
  }
}

Nice, isn't it? Now, how do you use this? The answer is again pretty simple and understandable: use XML inside the code, like this:

public Book GetSomeBook()
{
  return <Book>
           <Title>Title goes here</Title>
           <Authors>
             <Author>First Author</Author>
           </Authors>
           <ISBN>0123456789</ISBN>
           <Categories>
             <Category>One</Category>
             <Category>Two</Category>
           </Categories>
         </Book>;

}

In an analogous fashion one can declare and assign a variable using XML syntax, like this (you don't need to mention the type):

b = <Book>
      <Title>Title goes here</Title>
      <Authors>
        <Author>First Author</Author>
      </Authors>
      <ISBN>0123456789</ISBN>
      <Categories>
        <Category>One</Category>
        <Category>Two</Category>
      </Categories>
    </Book>;

Okay, this looks pretty static right now, doesn't it? How can we make it somewhat more dynamic, so that we can construct a book with a given title and ISBN for example?

public Book GetSomeSpecificBook(string title, string ISBN)
{
  return <Book>
           <Title>{title}</Title>
           <Authors>
             <Author>First Author</Author>
           </Authors>
           <ISBN>{ISBN}</ISBN>
           <Categories>
             <Category>One</Category>
             <Category>Two</Category>
           </Categories>
         </Book>;
}

This will construct a book using the given data. Notice that in between the curly braces one can specify a full expression too (e.g. to create a sum of certain values). Notice you can still use the default constructor approach too.

Now, assume you have a Book instance. How do you grab the data from it in order to display it, transfer it, or do something else with it? Look at the following example:

public void ProcessBook()
{
  Book b = GetSomeBook();

  foreach (it in b.Categories.Category) { Console.WriteLine(it); }
}

Notice the usage of the it iterator variable again (which is assigned the right type automatically). As you can see, b.Categories.Category is in fact equivalent to the XPath expression Categories/Category (relative to the Book element) that you'd use in classic XML processing in order to obtain the values. Queries (which will be explained later in another post) can be applied as well, including transitive queries (get all values associated with a certain "label" in nested structures, using the ... notation) and the use of member selection to obtain a stream of values (see previous post for more information about streams), which can be combined with the filter [...] syntax. So, as you can see, this technology covers a lot of ground already.
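For comparison, this is what the "classic XML processing" route looks like. The sketch uses Python's xml.etree.ElementTree as a stand-in for the .NET XML APIs: findall with a relative path plays the role of member access, and iter plays a role similar in spirit to a transitive query. The variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """
<Book>
  <Title>Title goes here</Title>
  <Authors><Author>First Author</Author></Authors>
  <ISBN>0123456789</ISBN>
  <Categories>
    <Category>One</Category>
    <Category>Two</Category>
  </Categories>
</Book>
"""

book = ET.fromstring(BOOK_XML)

# Path relative to the Book element, comparable to b.Categories.Category.
by_path = [node.text for node in book.findall("Categories/Category")]

# Every Category element at any depth, comparable in spirit to a transitive query.
by_descent = [node.text for node in book.iter("Category")]

print(by_path, by_descent)
```

Note that the paths are plain strings here, so a typo in an element name only shows up at runtime as an empty result, whereas the content class version is checked by the compiler.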

 

Extending the content class

As a content class is a class, it can also contain other members, such as methods. In fact, the struct defines the data structure that the class represents, as an alternative to standard private fields. Logically, these methods have access to the data "attributes" of the class too, in order to manipulate or query the data. To do this, just declare a method inside the class definition. Now, assume that categories have a structure like "maincat-subcat-subcat" and you want to determine whether a book is in a certain main category. However, we have multiple categories associated with a book. So, one approach would be to use the foreach(it in ...) syntax to iterate over all the categories associated with the book instance. As an alternative, let's use a so-called transitive query. By using this...Category we obtain a stream of all the categories associated with the current book instance. Then, we can use the :: operator to refine our result by applying a filter, which in turn returns a filtered stream. Together, this looks like this:

this...Category::*[SomeFilter(it)]

So, inside the filter we're using a method that gets the current value of the iterator that is doing the filtering (called it, as explained earlier). Last but not least, you need to define the "SomeFilter" method. As we only want to use it locally in our "main category boolean method", we can use something called nested methods in Cw. The total implementation is this:

public virtual bool HasMainCategory(string category)
{
  bool IsOfMainCategory(string category, string sel)
  {
    return category.StartsWith(sel + "-");
  };

  return this...Category::*[IsOfMainCategory(it, category)] != null;
}

So, if we find a category in the list of categories with the given main category, we'll return true, otherwise false.
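The same decision logic, written as a Python sketch to make the filtering step explicit. It only mirrors the logic of HasMainCategory; the category strings are made-up sample data and the function names are my own.

```python
from typing import Iterable


def is_of_main_category(category: str, main: str) -> bool:
    """True when 'category' has the form main-subcat-..."""
    return category.startswith(main + "-")


def has_main_category(categories: Iterable[str], main: str) -> bool:
    """True when at least one category falls under the given main category."""
    return any(is_of_main_category(c, main) for c in categories)


cats = ["net-languages-comega", "xml-schemas"]
print(has_main_category(cats, "net"), has_main_category(cats, "sql"))
```

Appending the "-" before comparing matters: without it, a main category "net" would also match an unrelated category such as "network-protocols".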

 

What's the IL :-)

Time for the nerdy stuff, what's a content class translated to upon compilation? Again, let's investigate this incrementally. We'll kick off with a very simple sample:

class Test
{
  struct {
    string val;
  }
}

This is likely not that useful, but fairly interesting for demonstration purposes. Compile it, and ildasm will give you this:

.field public valuetype StructuralTypes.Tuple_String_val sequence
.custom instance void [System.Compiler.Runtime]System.Compiler.AnonymousAttribute::.ctor() = ( 01 00 00 00 )

So, the compiler defines a "structural type" called Tuple_String_val, also declared as a sequence. Further examination of that helper class results in this:

.class public auto ansi sealed Tuple_String_val
       extends [mscorlib]System.ValueType
       implements [System.Compiler.Runtime]StructuralTypes.ITupleType,
                  System.Collections.Generic.'IEnumerable'

{
} // end of class Tuple_String_val

As you can see, the type implements ITupleType (an interface) and is a generic IEnumerable collection of strings too. Furthermore, there is a public field val (that we declared explicitly):

.field public string val

And the expected method GetEnumerator to get the enumerator:

.method public virtual instance class System.Collections.Generic.'IEnumerator'
        GetEnumerator() cil managed
{
  // Code size       12 (0xc)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  ldobj      StructuralTypes.Tuple_String_val
  IL_0006:  newobj     instance void System.Collections.Generic.EnumeratorTuple_String_val::.ctor(valuetype StructuralTypes.Tuple_String_val)
  IL_000b:  ret
} // end of method Tuple_String_val::GetEnumerator

This explains the possibility to use the foreach construct to iterate over the object.

Okay, time for something more. What about the ?, + and * symbols? Consider the following sample:

class Test
{
  struct {
    string* val1;
    string+ val2;
    string? val3;
  }
}

This is far heavier when you look at the IL. For the *, not that much changes. The basic difference in the StructuralType is the fact that you end up with a collection instead of a simple string as the field:

.field public class System.Collections.Generic.'IEnumerable' val1

For the +, the situation is far more complex. First of all, there should be a val2 field in the Tuple_IEnumerable that looks like this:

.field public valuetype StructuralTypes.'NonEmptyIEnumerable' val2

Again it's a generic type created using the System.String type, but this time it's of the type "NonEmptyIEnumerable". That is exactly what + is supposed to express ("one or more"). So inside the StructuralTypes section you'll find this type declared. Inside it, you'll find mainly enumerator logic and quite a few conversion functions (implicit/explicit) to convert to various helper types. The helper types (also in StructuralTypes) include NonNull and Boxed, both generic (in our case, typed with the System.String type). I won't cover these in much more detail right now.
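The "one or more" guarantee itself is easy to picture. The following Python sketch shows one way to enforce it; it is not how the generated NonEmptyIEnumerable works internally (I didn't dig into that), and the class name NonEmptySequence is made up. Unlike a lazy enumerable, this sketch copies the items up front so that it can check the constraint in the constructor.

```python
from typing import Generic, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


class NonEmptySequence(Generic[T]):
    """Sketch of a collection that refuses to exist without at least one element."""

    def __init__(self, items: Iterable[T]) -> None:
        # Assumption: the items are copied eagerly so the check can happen right here.
        self._items: List[T] = list(items)
        if not self._items:
            raise ValueError("at least one element is required")

    def __iter__(self) -> Iterator[T]:
        return iter(self._items)


authors = NonEmptySequence(["First Author"])
print(list(authors))
```

Once such an object exists, code that consumes it never has to handle the "no authors at all" case.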

And finally we have the ? operator that leads by itself to a Boxed type:

.field public valuetype StructuralTypes.'Boxed' val3

This type again implements the generic IEnumerable for System.String.

All together, you'll see a fairly complicated set of helper types popping up after compilation. Our Book sample, for instance, results in 10 helper types being created. The nesting of the structs in our content class can be examined in that case and looks as follows:

.field public valuetype StructuralTypes.'NonEmptyIEnumerable' Authors
.field public class System.Collections.Generic.'IEnumerable' Categories
.field public string ISBN
.field public string Title

So there are two other Tuple types for the nested structs. And on the class level the following declaration can be found:

.field public valuetype StructuralTypes.'Tuple_String_Title_NonEmptyIEnumerable_Authors_String_ISBN_IEnumerable_Categories' sequence

So, in the end two StructuralTypes are referred to in the declaration of the type: one for the authors and one for the categories.

Time to examine the constructor logic that is spit out by the compiler when it finds the XML declaration. In order to keep things (a bit) simple, let's use the following content class:

class Test
{
  struct {
    string* val;
  }

  public static Test GetTest()
  {
    return <Test><val>blah</val></Test>;
  }
}

This is the result:

.method public hidebysig static class Test
        GetTest() cil managed
{
  // Code size       56 (0x38)
  .maxstack  5
  .locals init (class Test V_0,
           string V_1,
           class System.Collections.Generic.'List' V_2,
           valuetype StructuralTypes.'Tuple_IEnumerable_val' V_3,
           class Test V_4,
           class Test V_5)
  IL_0000:  newobj     instance void Test::.ctor()
  IL_0005:  stloc.0
  IL_0006:  ldstr      "blah"
  IL_000b:  stloc.1
  IL_000c:  newobj     instance void System.Collections.Generic.'List'::.ctor()
  IL_0011:  stloc.2
  IL_0012:  ldloc.2
  IL_0013:  ldloc.1
  IL_0014:  call       instance int32 System.Collections.Generic.'List'::Add(string)
  IL_0019:  pop
  IL_001a:  ldloca.s   V_3
  IL_001c:  ldloc.2
  IL_001d:  stfld      class System.Collections.Generic.'IEnumerable' StructuralTypes.'Tuple_IEnumerable_val'::val
  IL_0022:  ldloc.0
  IL_0023:  ldloc.3
  IL_0024:  stfld      valuetype StructuralTypes.'Tuple_IEnumerable_val' Test::sequence
  IL_0029:  ldloc.0
  IL_002a:  stloc.s    V_4
  IL_002c:  br         IL_0031
  IL_0031:  ldloc.s    V_4
  IL_0033:  stloc.s    V_5
  IL_0035:  ldloc.s    V_4
  IL_0037:  ret
} // end of method Test::GetTest

So, there's a call to add the "blah" string to the collection which is returned further on, after it has been wrapped into a Tuple_IEnumerable.

 

Question for you guys

There is some mistake in the previous sample. When you try to do this:

  public static Test GetTest()
  {
    return <Test><val>blah</val><val>bla</val></Test>;
  }

you'll end up with this error message from the compiler:

test.cw(12,34): error CS2518: Invalid content 'val' in element 'Test', the content for this element is already complete.

Make a fix to the code in order to get rid of this problem. Tip: it's just a one-character fix. In the end, I want to be able to write this:

  public static void Main()
  {
    Test t = GetTest();
    foreach(it in t.val)
      Console.WriteLine(it);
  }

which should print

blah
bla

on the screen. Enjoy!


Introduction

Streams in Cw are a kind of array (consisting of elements of a certain defined type) whose elements are only created when they are needed (we call this lazy construction). As a matter of fact, streams are nothing more than autogenerated classes that are spit out by the compiler upon compilation. However, the concept of streams makes their usage completely transparent because of the automatic implementation of IEnumerable, which provides support for the "foreach iterator" usage, and (as explained further on) even more mechanisms to iterate over the elements.

 

Declaration

The first thing to do is to declare a stream in Cw. As I mentioned before, a stream is kind of an array with elements of a certain type. Therefore, we need to declare that type of course. To indicate you want to construct a stream, you use the * operator. As an example, consider a stream of integers:

int* a;

C/C++ folks will recognize pointer notation in this. It might help to think of it as one of the notations to declare an array in C/C++, which makes sense as the concept of a stream is based on the concept of arrays.

 

Yield

Now it's time to populate the stream. As a stream is a "lazily built array", the system will build it dynamically by "yielding" the values in it. A simple approach looks like this:

int* GetStream()
{
     yield return 0;
}

By calling the GetStream method, you'll end up with a stream that contains the value 0. Not that exciting, but enough to start explaining the concepts a little further. The usage of the stream looks now as follows:

void UseIt()
{
     int* a;
     a = GetStream();
     foreach(int i in a)
          Console.WriteLine(i);
}

By executing this code, you'll see ... 0 on the screen. Predictable I guess. Now the point is that the GetStream method can do more than just one yield to build the stream. Even more, you can populate the stream based on decision logic, loops, and so on, like this:

int* GetStream(int s, int e)
{
     while(s <= e)
          yield return s++;
}

By calling GetStream(1,5), you'll get a stream that contains 1,2,3,4,5.
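If you want to play with the idea without a Cw compiler, Python generators behave in a comparable way. The following is a Python counterpart of the method above, not Cw code; the name get_stream simply mirrors GetStream.

```python
from typing import Iterator


def get_stream(s: int, e: int) -> Iterator[int]:
    """Lazily yield s, s+1, ..., e, one value per request."""
    while s <= e:
        yield s
        s += 1


values = list(get_stream(1, 5))
print(values)
```

When the start lies beyond the end, the loop body never runs and the stream is simply empty.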

 

How does it work?

Okay, you've now seen the basic principles of streams and yield. Let's take a look at how this gets constructed internally. Because Cw runs on the .NET Framework v1.1, the cwc.exe compiler simply generates MSIL code. When you inspect the generated assembly through ildasm, you'll see your method GetStream in the IL code looking like this:

.method private hidebysig static class System.Collections.Generic.'IEnumerable'
        GetStream(int32 s,
                  int32 e) cil managed
{
  // Code size       31 (0x1f)
  .maxstack  2
  .locals init (class Streams/'closure:765' V_0,
           class System.Collections.Generic.'IEnumerable' V_1,
           class System.Collections.Generic.'IEnumerable' V_2)
  IL_0000:  newobj     instance void Streams/'closure:765'::'.ctor$PST06000007'()
  IL_0005:  stloc.0
  IL_0006:  ldloc.0
  IL_0007:  ldarg.0
  IL_0008:  stfld      int32 Streams/'closure:765'::s$PST04000001
  IL_000d:  ldloc.0
  IL_000e:  ldarg.1
  IL_000f:  stfld      int32 Streams/'closure:765'::e$PST04000002
  IL_0014:  ldloc.0
  IL_0015:  stloc.1
  IL_0016:  br         IL_001b
  IL_001b:  ldloc.1
  IL_001c:  stloc.2
  IL_001d:  ldloc.1
  IL_001e:  ret
} // end of method Streams::GetStream

What's going on here? Quite a lot, but the most interesting part is actually the fact that the GetStream method is creating an instance of some "closure:765" class, which was generated during the compilation and has the following signature:

.class auto ansi sealed nested private specialname 'closure:765'
       extends [mscorlib]System.Object
       implements [mscorlib]System.Collections.IEnumerable,
                  System.Collections.Generic.'IEnumerator',
                  [mscorlib]System.Collections.IEnumerator,
                  [mscorlib]System.IDisposable,
                  System.Collections.Generic.'IEnumerable'
{
} // end of class 'closure:765'

As you can see, the class is nested and implements a bunch of IEnumera* interfaces, both generic and "classic" (notice that the System.Collections.Generic namespace is present in Cw on .NET v1.1 too, whereas this is one of the big features in C# 2.0 today).

Secondly, this class has two privatescope-d variables s and e that are used by the GetStream method to pass through the parameters to the nested class:

.field privatescope int32 s$PST04000001
.field privatescope int32 e$PST04000002

Besides this, there's also the field 'current Value' that's being used to report the current value of the stream to the caller (via the enumerator):

.field private int32 'current Value'

The real "magic" is going on in the MoveNext method that is called every time the next element has to be retrieved. The contents of this method are quite predictable: it makes decisions based on the current state together with s and e to return the desired value in the stream:

.method public virtual instance bool  MoveNext() cil managed
{
  // Code size       74 (0x4a)
  .maxstack  5
  .locals init (class Streams/'closure:765' V_0,
           int32 V_1)
  IL_0000:  ldarg.0
  IL_0001:  stloc.0
  IL_0002:  ldarg.0
  IL_0003:  ldfld      int32 Streams/'closure:765'::'current Entry Point: '
  IL_0008:  switch     (
                        IL_0015,
                        IL_0046)
  IL_0015:  ldloc.0
  IL_0016:  ldfld      int32 Streams/'closure:765'::s$PST04000001
  IL_001b:  ldloc.0
  IL_001c:  ldfld      int32 Streams/'closure:765'::e$PST04000002
  IL_0021:  bgt        IL_0048
  IL_0026:  ldarg.0
  IL_0027:  ldloc.0
  IL_0028:  ldfld      int32 Streams/'closure:765'::s$PST04000001
  IL_002d:  stloc.1
  IL_002e:  ldloc.0
  IL_002f:  ldloc.1
  IL_0030:  ldc.i4.1
  IL_0031:  add
  IL_0032:  stfld      int32 Streams/'closure:765'::s$PST04000001
  IL_0037:  ldloc.1
  IL_0038:  stfld      int32 Streams/'closure:765'::'current Value'
  IL_003d:  ldarg.0
  IL_003e:  ldc.i4.1
  IL_003f:  stfld      int32 Streams/'closure:765'::'current Entry Point: '
  IL_0044:  ldc.i4.1
  IL_0045:  ret
  IL_0046:  br.s       IL_0015
  IL_0048:  ldc.i4.0
  IL_0049:  ret
} // end of method 'closure:765'::MoveNext

First, there is some branching going on based on the current values of s and e. If s has not passed e yet, the old value of s is stored as the current value, s itself is incremented (add), and the method returns true; otherwise the method returns false.
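Written out by hand, the logic of that MoveNext method boils down to very little. Here's a simplified Python sketch of such a state object; it leaves out the 'current Entry Point: ' bookkeeping from the IL, and the class and member names are my own, not the generated ones.

```python
class RangeClosure:
    """Hand-written counterpart of the generated closure class (a simplified sketch)."""

    def __init__(self, s: int, e: int) -> None:
        self.s = s
        self.e = e
        self.current_value = 0

    def move_next(self) -> bool:
        if self.s > self.e:  # corresponds to the 'bgt' branch: nothing left
            return False
        self.current_value = self.s  # publish the old value of s
        self.s += 1  # then increment s, as in 'yield return s++'
        return True


closure = RangeClosure(1, 5)
seen = []
while closure.move_next():
    seen.append(closure.current_value)
print(seen)
```

All the state of the original while loop (s and e) lives in fields, which is what allows the loop to be "paused" between two calls.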

Finally, Main calls the GetStream method and uses the enumerator to iterate over the collection in order to Console.WriteLine the values to the screen:

.method public hidebysig static void  Main() cil managed
{
  .entrypoint
  // Code size       66 (0x42)
  .maxstack  7
  .locals init (class System.Collections.Generic.'IEnumerable' V_0,
           class System.Collections.Generic.'IEnumerable' V_1,
           class System.Collections.Generic.'IEnumerator' V_2,
           int32 V_3,
           int32 V_4)
  IL_0000:  ldc.i4.1
  IL_0001:  ldc.i4.5
  IL_0002:  call       class System.Collections.Generic.'IEnumerable' Streams::GetStream(int32,
                                                                                                       int32)
  IL_0007:  stloc.0
  IL_0008:  ldloc.0
  IL_0009:  stloc.1
  IL_000a:  ldloc.1
  IL_000b:  brfalse    IL_003b
  IL_0010:  ldloc.1
  IL_0011:  callvirt   instance class System.Collections.Generic.'IEnumerator' System.Collections.Generic.'IEnumerable'::GetEnumerator()
  IL_0016:  stloc.2
  IL_0017:  ldloc.2
  IL_0018:  brfalse    IL_003b
  IL_001d:  ldloc.2
  IL_001e:  callvirt   instance bool System.Collections.Generic.'IEnumerator'::MoveNext()
  IL_0023:  brfalse    IL_003b
  IL_0028:  ldloc.2
  IL_0029:  callvirt   instance int32 System.Collections.Generic.'IEnumerator'::get_Current()
  IL_002e:  stloc.3
  IL_002f:  ldloc.3
  IL_0030:  stloc.s    V_4
  IL_0032:  ldloc.s    V_4
  IL_0034:  call       void [mscorlib]System.Console::WriteLine(int32)
  IL_0039:  br.s       IL_001d
  IL_003b:  call       string [mscorlib]System.Console::ReadLine()
  IL_0040:  pop
  IL_0041:  ret
} // end of method Streams::Main
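Read back into C#, the IL of Main above is roughly the expansion of a foreach loop. The sketch below spells that out by hand. GetStream is a stand-in written as a C# v2.0 iterator, and the class name ForeachExpansion and the returned list are assumptions added only to make the sample self-contained. Note that, unlike the IL shown, a real C# foreach would also dispose the enumerator.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansion
{
    // Stand-in for Streams::GetStream(int32, int32), written as a C# iterator.
    public static IEnumerable<int> GetStream(int s, int e)
    {
        for (int i = s; i <= e; i++)
            yield return i;
    }

    // Hand-expanded version of: foreach (int i in GetStream(1, 5)) Console.WriteLine(i);
    public static List<int> Run()
    {
        List<int> printed = new List<int>();
        IEnumerable<int> stream = GetStream(1, 5);           // call Streams::GetStream
        if (stream != null)                                   // brfalse IL_003b
        {
            IEnumerator<int> enumerator = stream.GetEnumerator();
            if (enumerator != null)                           // second brfalse IL_003b
            {
                while (enumerator.MoveNext())                 // loop between IL_001d and IL_0039
                {
                    int value = enumerator.Current;           // get_Current
                    Console.WriteLine(value);
                    printed.Add(value);                       // kept only so the result can be inspected
                }
            }
        }
        return printed;
    }
}
```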

Notice the return type for the int*; it's just a generic enumerable of Int32 values. For the geeks, take a look at the closure:765 nested class's get_Current method. You'll notice that it uses boxing, which is a consequence of going through the non-generic interface, where the current value is typed as object. For more information about boxing/unboxing and how this evolves in .NET v2.0, consult the documentation about generics in C# v2.0.
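The boxing remark can be illustrated outside of Comega with a small C# v2.0 sample. This is only a sketch on top of List<int> rather than the generated closure class, and BoxingDemo is a made-up name: the non-generic IEnumerator exposes Current as object, so every int has to be boxed, while the generic IEnumerator<int> returns the int directly.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    public static bool Demo()
    {
        List<int> numbers = new List<int>(new int[] { 1, 2, 3 });

        // Non-generic view: Current is typed as object, so the int gets boxed.
        IEnumerator untyped = ((IEnumerable)numbers).GetEnumerator();
        untyped.MoveNext();
        object boxed = untyped.Current;      // a box holding the value 1
        int unboxed = (int)boxed;            // explicit unboxing on the way back

        // Generic view: Current is typed as int, no box involved.
        IEnumerator<int> typed = ((IEnumerable<int>)numbers).GetEnumerator();
        typed.MoveNext();
        int direct = typed.Current;

        return boxed is int && unboxed == direct;
    }
}
```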

 

Intermediate wrap-up

So, what have we seen so far? By declaring a stream, you're in fact declaring a class that is IEnumerable and builds its content at runtime by executing a yield statement, which was translated to code inside the MoveNext method of the stream type's IEnumerator implementation. Thus, a stream builds its contents incrementally while the program is executing, whereas classic collections (arrays or System.Collections objects) are typically built up front and then iterated over by means of the enumerator code (e.g. by using foreach).
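The difference between the incremental and the up-front approach shows up nicely when you log what happens. The following is a small C# v2.0 sketch, not Comega: it uses a C# iterator as the closest equivalent of a stream, and the names LazyVersusEager, Lazy, Eager, Trace and Log are all made up for the occasion.

```csharp
using System;
using System.Collections.Generic;

public static class LazyVersusEager
{
    // Records the order in which values are produced and consumed.
    public static List<string> Log = new List<string>();

    // Stream-like: each value is produced only when the consumer asks for it.
    public static IEnumerable<int> Lazy(int s, int e)
    {
        for (int i = s; i <= e; i++)
        {
            Log.Add("produce " + i);
            yield return i;
        }
    }

    // Classic collection: everything is produced before iteration starts.
    public static int[] Eager(int s, int e)
    {
        int[] result = new int[e - s + 1];
        for (int i = s; i <= e; i++)
        {
            Log.Add("produce " + i);
            result[i - s] = i;
        }
        return result;
    }

    // Consumes the source with foreach and returns the log.
    public static List<string> Trace(IEnumerable<int> source)
    {
        foreach (int i in source)
            Log.Add("consume " + i);
        return Log;
    }
}
```

With an empty log, Trace(Lazy(1, 2)) records produce 1, consume 1, produce 2, consume 2, whereas Trace(Eager(1, 2)) records produce 1, produce 2, consume 1, consume 2.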

 

Even more stuff ... apply-to-all-expressions

But there is more, something we call "apply-to-all-expressions". In my code samples you saw the typical usage of the foreach loop construct to iterate over the values in the collection (in this case, in the stream). Cw supports another construct that, by means of the keyword "it", doesn't require you to declare another variable to hold the elements of the collection. Basically, you attach a code block to an instance of a stream, and inside that code block the "it" keyword has the right type (that is, the type of the elements in the stream) and holds the value for the current iteration. Let's show you:

void UseIt()
{
     int* a;
     a = GetStream();
     a.{ Console.WriteLine(it); };
}

This code can of course be abbreviated to:

void UseIt()
{
     GetStream().{ Console.WriteLine(it); };
}

Or the code block can contain multiple statements. When you go back to the IL code for this program, you'll notice two things:

  • The nested closure class has a different identifier.
  • There is another nested closure class in the class.

The second remark is the most interesting one. So, locate the original closure and the new one and open up the new one to look at more details. In my case, the new one is called closure:561 and contains a function called "Function:544" that has the following IL code inside it:

.method privatescope instance void  'Function:544$PST06000010'(int32 it) cil managed
{
  .param [0]
  .custom instance void [mscorlib]System.Diagnostics.DebuggerHiddenAttribute::.ctor() = ( 01 00 00 00 )
  .custom instance void [mscorlib]System.Diagnostics.DebuggerStepThroughAttribute::.ctor() = ( 01 00 00 00 )
  // Code size       12 (0xc)
  .maxstack  8
  IL_0000:  ldarg.1
  IL_0001:  call       void [mscorlib]System.Console::WriteLine(int32)
  IL_0006:  br         IL_000b
  IL_000b:  ret
} // end of method 'closure:561'::'Function:544'

This is what the body of the apply-to-all-expression compiles to. One parameter is passed to the method, containing the strongly typed "it" value. The calling function has changed a little too, in order to call this function:

.method public hidebysig static void  Main() cil managed
{
  .entrypoint
  // Code size       99 (0x63)
  .maxstack  9
  .locals init (class Streams/'closure:561' V_0,
           class System.Collections.Generic.'IEnumerable' V_1,
           int32 V_2)
  IL_0000:  newobj     instance void Streams/'closure:561'::'.ctor$PST0600000F'()
  IL_0005:  stloc.0
  IL_0006:  ldloc.0
  IL_0007:  ldc.i4.1
  IL_0008:  ldc.i4.5
  IL_0009:  call       class System.Collections.Generic.'IEnumerable' Streams::GetStream(int32,
                                                                                                       int32)
  IL_000e:  stfld      class System.Collections.Generic.'IEnumerable' Streams/'closure:561'::p$PST04000005
  IL_0013:  ldloc.0
  IL_0014:  ldfld      class System.Collections.Generic.'IEnumerable' Streams/'closure:561'::p$PST04000005
  IL_0019:  stloc.1
  IL_001a:  ldloc.1
  IL_001b:  brfalse    IL_005c
  IL_0020:  ldloc.0
  IL_0021:  ldloc.1
  IL_0022:  callvirt   instance class System.Collections.Generic.'IEnumerator' System.Collections.Generic.'IEnumerable'::GetEnumerator()
  IL_0027:  stfld      class System.Collections.Generic.'IEnumerator' Streams/'closure:561'::'foreachEnumerator: 2$PST04000006'
  IL_002c:  ldloc.0
  IL_002d:  ldfld      class System.Collections.Generic.'IEnumerator' Streams/'closure:561'::'foreachEnumerator: 2$PST04000006'
  IL_0032:  brfalse    IL_005c
  IL_0037:  ldloc.0
  IL_0038:  ldfld      class System.Collections.Generic.'IEnumerator' Streams/'closure:561'::'foreachEnumerator: 2$PST04000006'
  IL_003d:  callvirt   instance bool System.Collections.Generic.'IEnumerator'::MoveNext()
  IL_0042:  brfalse    IL_005c
  IL_0047:  ldloc.0
  IL_0048:  ldfld      class System.Collections.Generic.'IEnumerator' Streams/'closure:561'::'foreachEnumerator: 2$PST04000006'
  IL_004d:  callvirt   instance int32 System.Collections.Generic.'IEnumerator'::get_Current()
  IL_0052:  stloc.2
  IL_0053:  ldloc.0
  IL_0054:  ldloc.2
  IL_0055:  call       instance void Streams/'closure:561'::'Function:544$PST06000010'(int32)
  IL_005a:  br.s       IL_0037
  IL_005c:  call       string [mscorlib]System.Console::ReadLine()
  IL_0061:  pop
  IL_0062:  ret
} // end of method Streams::Main
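In plain C# v2.0 terms, the IL above does something very close to "loop over the stream and call a method with each element as it". The sketch below mimics that with a delegate and an anonymous method. It is not what the Comega compiler emits (that is the closure class and Function method shown above), and the names ApplyToAllSketch, Body and ApplyToAll are assumptions for this illustration only.

```csharp
using System;
using System.Collections.Generic;

public static class ApplyToAllSketch
{
    // Plays the role of the generated 'Function:544' method: one parameter called "it".
    public delegate void Body<T>(T it);

    // Rough equivalent of stream.{ ... }: run the body once per element.
    public static void ApplyToAll<T>(IEnumerable<T> stream, Body<T> body)
    {
        if (stream == null)
            return;                      // mirrors the brfalse null check in the IL
        foreach (T it in stream)
            body(it);
    }

    public static List<int> Demo()
    {
        List<int> seen = new List<int>();
        ApplyToAll<int>(new int[] { 1, 2, 3, 4, 5 },
            delegate(int it)
            {
                Console.WriteLine(it);   // the statement from the Cw sample
                seen.Add(it);            // kept only so the result can be inspected
            });
        return seen;
    }
}
```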

 

Constructing new streams based on existing streams

An apply-to-all-expression can also be used to build a new stream, whose elements are obtained by converting the type of the original elements or by calling some method on them. A basic sample looks like this:

string* newStream = GetStream().{ return it.ToString(); };

This is compiled in a similar fashion as the previous example. This time another function is created for the apply-to-all-expression, performing the return it.ToString(); code. But there is more going on, because we are declaring another stream type, based on a string this time. This results in another stream class being created, nested inside the closure class of the apply-to-all-expression:

.class auto ansi sealed nested private specialname 'closure:1241'
       extends [mscorlib]System.Object
       implements [mscorlib]System.Collections.IEnumerable,
                  System.Collections.Generic.'IEnumerator<System.String>',
                  [mscorlib]System.Collections.IEnumerator,
                  [mscorlib]System.IDisposable,
                  System.Collections.Generic.'IEnumerable<System.String>'
{
} // end of class 'closure:1241'

Inside the MoveNext method you'll find code that calls the conversion function this time:

.method public virtual instance bool  MoveNext() cil managed
{
  // Code size       118 (0x76)
  .maxstack  10
  .locals init (class Streams/'closure:561'/'closure:1241' V_0,
           class System.Collections.Generic.'IEnumerable' V_1,
           int32 V_2,
           int32 V_3)
  IL_0000:  ldarg.0
  IL_0001:  stloc.0
  IL_0002:  ldarg.0
  IL_0003:  ldfld      int32 Streams/'closure:561'/'closure:1241'::'current Entry Point: '
  IL_0008:  switch     (
                        IL_0015,
                        IL_0072)
  IL_0015:  ldloc.0
  IL_0016:  ldfld      class System.Collections.Generic.'IEnumerable' Streams/'closure:561'/'closure:1241'::Collection$PST04000009
  IL_001b:  stloc.1
  IL_001c:  ldloc.1
  IL_001d:  brfalse    IL_0074
  IL_0022:  ldloc.0
  IL_0023:  ldloc.1
  IL_0024:  callvirt   instance class System.Collections.Generic.'IEnumerator' System.Collections.Generic.'IEnumerable'::GetEnumerator()
  IL_0029:  stfld      class System.Collections.Generic.'IEnumerator' Streams/'closure:561'/'closure:1241'::'foreachEnumerator: 3$PST0400000D'
  IL_002e:  ldloc.0
  IL_002f:  ldfld      class System.Collections.Generic.'IEnumerator' Streams/'closure:561'/'closure:1241'::'foreachEnumerator: 3$PST0400000D'
  IL_0034:  brfalse    IL_0074
  IL_0039:  ldloc.0
  IL_003a:  ldfld      class System.Collections.Generic.'IEnumerator' Streams/'closure:561'/'closure:1241'::'foreachEnumerator: 3$PST0400000D'
  IL_003f:  callvirt   instance bool System.Collections.Generic.'IEnumerator'::MoveNext()
  IL_0044:  brfalse    IL_0074
  IL_0049:  ldloc.0
  IL_004a:  ldfld      class System.Collections.Generic.'IEnumerator' Streams/'closure:561'/'closure:1241'::'foreachEnumerator: 3$PST0400000D'
  IL_004f:  callvirt   instance int32 System.Collections.Generic.'IEnumerator'::get_Current()
  IL_0054:  stloc.2
  IL_0055:  ldloc.2
  IL_0056:  stloc.3
  IL_0057:  ldarg.0
  IL_0058:  ldloc.0
  IL_0059:  ldfld      class Streams/'closure:561' Streams/'closure:561'/'closure:1241'::Closure$PST0400000A
  IL_005e:  ldloc.3
  IL_005f:  call       instance string Streams/'closure:561'::'Function:595$PST06000014'(int32)
  IL_0064:  stfld      string Streams/'closure:561'/'closure:1241'::'current Value'
  IL_0069:  ldarg.0
  IL_006a:  ldc.i4.1
  IL_006b:  stfld      int32 Streams/'closure:561'/'closure:1241'::'current Entry Point: '
  IL_0070:  ldc.i4.1
  IL_0071:  ret
  IL_0072:  br.s       IL_0039
  IL_0074:  ldc.i4.0
  IL_0075:  ret
} // end of method 'closure:1241'::MoveNext
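If you squint a little, this MoveNext is what a C# v2.0 iterator that converts each element would look like after compilation. The following is a hand-written approximation in C#, not the Comega compiler's output. Map and StreamMapSketch are hypothetical names, and Converter<TInput, TOutput> is the standard delegate type from the System namespace.

```csharp
using System;
using System.Collections.Generic;

public static class StreamMapSketch
{
    // Rough equivalent of source.{ return convert(it); }: a new stream whose
    // elements are produced on demand from the elements of the source stream.
    public static IEnumerable<TOut> Map<TIn, TOut>(IEnumerable<TIn> source, Converter<TIn, TOut> convert)
    {
        if (source == null)
            yield break;                 // mirrors the brfalse IL_0074 null check
        foreach (TIn it in source)
            yield return convert(it);    // call Function:595, then store 'current Value'
    }

    public static List<string> Demo()
    {
        IEnumerable<string> newStream = Map<int, string>(
            new int[] { 1, 2, 3 },
            delegate(int it) { return it.ToString(); });
        return new List<string>(newStream);
    }
}
```

Nothing is converted until the new stream is actually iterated, which here happens when the List<string> is constructed.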

Note the nesting depth and the call to the function that performs the conversion, which looks pretty simple:

.method privatescope instance string  'Function:595$PST06000014'(int32 it) cil managed
{
  // Code size       17 (0x11)
  .maxstack  3
  .locals init (string V_0,
           string V_1)
  IL_0000:  ldarga.s   it
  IL_0002:  call       instance string [mscorlib]System.Int32::ToString()
  IL_0007:  stloc.0
  IL_0008:  br         IL_000d
  IL_000d:  ldloc.0
  IL_000e:  stloc.1
  IL_000f:  ldloc.0
  IL_0010:  ret
} // end of method 'closure:561'::'Function:595'

I'd recommend messing around in the IL a little more to see what's going on if you're really interested in this stuff. Once you understand the basic tricks, it's fairly easy to understand what the magic is all about.

 

More samples?

Comega comes with a bunch of stream examples that are interesting to check out further. I strongly recommend running ildasm on the generated code to get a better picture of the overall structure and ideas.


In the upcoming days/weeks I'll be posting more about Comega on an (ir)regular basis. This first post is meant as a general introduction to the Comega project.

What is it?

Comega (abbreviated as Cw, w standing for the Greek letter "omega") is a research programming language that is created by Microsoft Research and contains a bunch of new language features that will likely (partially) make it into C# v3.0. The homepage of Cw can be found on http://research.microsoft.com/Comega/.

How to get it?

You can get the "Comega compiler preview 1.0.2" on the website mentioned above. It will integrate with Visual Studio .NET 2003 and it will install some samples too to introduce the language.

Why a new language?

Well, it's not really a new language, it's rather a collection of new language features. Summarized in one sentence, Cw is focusing mainly on bridging the gap between various data models (formerly known as X# or Xen), including the relational model, the object model and XML. The overall idea is to extend C# with language constructs that make it easier to program against structured relational data and semi-structured XML data. But there is more than just this:

  • Streams, iterator functions and apply-to-all-expressions allow you to do similar things as with sequences in XPath and XQuery. In fact, part of this has already been realized in C# v2.0 with generic collections. As a matter of fact, one of the differences is that the language is now aware of the concept of streams (which are declared using *, which looks somewhat like C/C++ pointers at first sight). Streams cannot be nested; a stream of streams is flattened into one stream.
  • Content classes are in fact declared as structs and are the C#-language equivalent of DTDs in the XML space. The idea is to declare "content" that is semi-structured based on a struct definition that is extended with *, + and ? "flags" that work in a similar fashion as these operators in regular expressions (zero or more, one or more, zero or one). By declaring such a type, you can use XML directly in the language to declare an instance of that type and to perform various operations on it (this feature of embedding XML in code is also referred to as XML literals in Cw).
  • Nullable types introduce a kind of ternary logic into the language, allowing value types to be null too (I call it ternary because of the example of a bool? type, and no, the ? is not a typo, which can have the values true, false and null). This also means that a null reference is propagated through member access (e.g. retrieving the length of a string which is set to null returns null too).
  • Choice types are the equivalent of the XSD xs:choice element in the XML world. Basically it's very similar to union types in C/C++ where the start address of more than one variable (type) is the same (thus values that are projected on the same memory location are sitting "on top of each other"). Because of the static checks of the compiler, you can assign to the choice type without referencing the internal type (e.g. choice{int; string;} x = 1; will set the int value whereas choice{int; string;} y = "bart"; will set the string value).
  • Anonymous structs are the same as xs:sequence declarations in XSD. Basically it just means that you can declare a struct that has various types in it and you can just assign to it without having to use field names or whatsoever (e.g. struct {string; string; double price} allows you to instantiate an instance using new {"Duvel", "Beer", 2.50 } and to take advantage of it by using indexers and properties for the named fields).
  • Built-in query operators for XPath. As you can use XML inside the Cw language directly, you need a way to query the "variables" too. In order to do so, the 'dot' operator plays the same role as XPath queries (Products.Product.Price maps to '/Products/Product/Price') and of course this is strongly-typed. However, there is more. When using the * operator you retrieve a stream (same operator to declare and use streams). This way you can get all the products by using Products.Product.*, which is a stream container that you can iterate over and so on. Last but not least the [...] notation allows you to perform selections on the objects, returning a stream with the found elements.
  • Built-in query operators for SQL. To bridge the gap between object-oriented design and relational databases, Cw supports database objects that represent a relational database on some server somewhere. Queries can then be done (that is, after importing the schema) by using built-in operators such as select. The big advantage of this is the static typing at compile time and the fact that you can avoid classic O/R mapping tools (cf. O/R tools, ObjectSpaces, etc). The syntax is the same as SQL (select ... from ... where ... order by ... group by ... having ..., support for joins etc) but it's using classic operators such as &&, ||, ! to create logical expressions.
  • SQL DML statements in Cw. As you have query statements, you also have DML statements (i.e. update, insert, delete) in Cw, also in a strongly typed fashion. There's also built-in transaction support using the keywords transact, commit and rollback associated with a database object (looks somewhat like try...catch...finally or using block structures).
  • Concurrency is also tightly integrated in Cw. That means you'll find the notion of synchronous and asynchronous methods (the async keyword allows you to create threads without using a thread library or the async constructions from the .NET Framework directly). Besides that, there's the notion of a chord. Explained very briefly (more detail will follow in a future post), a chord is a list of methods that is associated with a method declaration. Semantically, a chord tells the system that the method in question can only be executed if all the methods in the chord have been called previously. By using this mechanism, synchronisation issues can be solved declaratively rather than by using classic constructs like semaphores. In fact, this is the field of a subset of Cw (or rather the other way around, as Cw is a merge of various projects) called Polyphonic C# (see http://www.research.microsoft.com/%7Enick/polyphony/).
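To get a feeling for the "ternary logic" idea from the nullable types bullet without installing Cw, you can play with bool? in C# v2.0. Mind that this is a C# sketch and not Comega: C# nullable types do not give you Cw's propagation of null through member access, and the names TernaryLogic, Show and Demo are made up for this sample.

```csharp
using System;

public static class TernaryLogic
{
    // Renders a three-valued boolean as text.
    public static string Show(bool? value)
    {
        return value.HasValue ? value.Value.ToString() : "null";
    }

    public static string[] Demo()
    {
        bool? unknown = null;
        bool? yes = true;
        bool? no = false;

        return new string[]
        {
            Show(unknown & no),   // false wins: the result is known whatever "unknown" is
            Show(unknown | yes),  // true wins for the same reason
            Show(unknown & yes),  // depends on the unknown value, so it stays null
            Show(unknown | no)    // same here
        };
    }
}
```

Demo() returns "False", "True", "null" and "null", in that order.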

Interesting kick-off readings include http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnexxml/html/xml01142005.asp and http://www.infoworld.com/article/05/03/22/HNcomega_1.html which was released to the web very recently.

Check out my blog in the upcoming days/weeks for more information and some Comega samples.

UPDATE: All my Comega posts will be listed on http://community.bartdesmet.net/blogs/bart/category/40.aspx too.


Posted Friday, March 25, 2005 8:52 PM by bart | with no comments

Today I took the time to explore some exciting projects on future language features and alternative languages on the CLR. It's pretty exciting to see where we're heading in the longer term and how new language features will help developers to write better code faster (yeah, I know this sounds like marketing, but it's just what it is).

  • First of all I kicked off by taking a look at dynamic languages due to the recent post of Jason Zander on the availability of Jim Hugunin's IronPython 0.7 release (more on http://blogs.msdn.com/jasonz/archive/2005/03/23/401095.aspx). I was following IronPython for quite a while (messed around with the 0.6 release earlier on). You can find IronPython (which is a pretty fast implementation of Python on the CLR) via the GotDotNet workspace on http://www.gotdotnet.com/workspaces/workspace.aspx?id=ad7acff7-ab1e-4bcb-99c0-57ac5a3a9742. The idea of dynamic languages is to have dynamic compilation. That means - simply stated - that you can write code as you go (e.g. instantiate a certain class and perform actions on it) without having to write the code first and compiling it (therefore it's kind of an interpreter but with compilation aboard). Jim Hugunin (http://blogs.msdn.com/hugunin) is the guy who worked on AspectJ but works now on the CLR team at Microsoft on the field of dynamic languages. I'm expecting more exciting stuff coming up from that side.
  • Secondly, it's worth taking a look at F# (together with Abstract IL and ILX) on http://research.microsoft.com/projects/ilx/. F# belongs to the ML family of functional programming languages, but it's an implementation that runs on top of the .NET Framework. The author/inventor of F# is Don Syme (visit his blog on http://blogs.msdn.com/dsyme). One of the cool things about the MS Research sites is the endless number of links to other projects, such as "Generics for .NET", which has made it into .NET v2.0. F# can be downloaded and will integrate with the Visual Studio .NET environment.
  • The next one is my favorite of today: Comega (http://research.microsoft.com/Comega/). This language brings a lot of powerful stuff to the C# language, including abstractions for asynchronous concurrency (also for event-based apps over networks) and data-orientation. Where you have on the one hand C# as a general-purpose language today and on the other hand various languages for data querying such as SQL and XQuery, Comega has all these things inside one single (research) language. Nevertheless, C# 3.0 will bring some of this stuff to you as mentioned by Anders Hejlsberg on Channel 9 a couple of months ago (see http://channel9.msdn.com/ShowPost.aspx?PostID=10276). The idea of having query capabilities in C# in such a way is a big one in my opinion. This kind of technology will make the object/relational mapping stuff we have today useless to a big extent and in my private opinion this is one of the reasons why the ObjectSpaces technology might have slipped out of .NET v2.0 (although some of the stuff will appear in the WinFS timeframe in the Longhorn wave).
  • C# Blue is a project by Mike Stall that shows you the implementation of a C# compiler in ... C# itself, based on the Reflection APIs in the .NET Framework. Although it's not complete, it's a great sample of reflection-emit stuff. More information on http://blogs.msdn.com/jmstall/archive/2005/02/06/368192.aspx.
  • Singularity (http://research.microsoft.com/os/singularity/) is the implementation of an experimental OS prototype for "dependable systems" (a dependable system being one that behaves as expected by its creators, users and owners). In order to create this kind of system, configuration becomes a central concept in the OS by means of several abstractions, with a "self-describing artifact" as the end result. One of the cool things about it is the use of MSIL code to build the thing and the presence of garbage collection at the heart of it.
  • Spec# is yet another programming language from Microsoft Research to help developing "more cost effective and high-quality software". I haven't played with it yet, but will do soon. It came to my attention since some KU Leuven folks are involved in this project too (see http://research.microsoft.com/SpecSharp/).
  • Last but not least, another interesting Microsoft Research project is the Advanced Compiler Technology project on http://research.microsoft.com/act/. As they have a good description of what they are doing, I don't need to copy it over here.

So, if you ever have time to play around with these things... :-) For more .NET language projects, check out http://www.dotnetlanguages.net/DNL/Resources.aspx.


Came across this nice project (nice because it has to do with .NET of course): a PHP compiler written in .NET. No, I'm not a PHP fan. Yes, I'm a .NET fan :-) More info on http://www.php-compiler.net/. Especially the benchmarks are interesting: the fastest PHP runtime is running on IIS, thanks to the underlying framework of course. And another nice effort, develop PHP inside Visual Studio .NET with full debugging support.

Anyway, counting down for ASP.NET v2.0 to come :-))))


Posted Friday, March 18, 2005 6:51 PM by bart | with no comments

Question: How to make a page that can show itself (that is, the source code) when creating samples?

Answer: Take a look at http://www.bartdesmet.net/download/hash.aspx. Basically the code is just as simple as this:

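// Map the virtual path of the current request to the physical file on disk.
string f = Server.MapPath(Request.Path);
// StreamReader lives in System.IO; "code" is a control on the page with a Text property.
using (StreamReader r = new StreamReader(f))
{
   // HTML-encode the source so the markup is displayed rather than rendered.
   code.Text = Server.HtmlEncode(r.ReadToEnd());
}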


Which you can put in an .ascx too, in order to display the .aspx that contains it (cf. Request.Path). So, it becomes as easy as doing a <%@ Register Tagprefix="src" Tagname="view" Src="showcode.ascx" %> together with <src:view runat="server" />.


Posted Friday, March 18, 2005 2:54 AM by bart | with no comments
Bink reports the release dates of these products and the service pack on http://bink.nu/Article3605.bink: March 28, 2005. Big releases coming up once again...

Let's try to bypass the hype of the VS2005 tools and show you guys some of the enhancements to csc.exe, the C# compiler, in .NET v2.0.

  1. The version number

    Yeah, right. Just to mention the build I'm describing is v2.0.40607.42, which is beta 1.
  2. Strong naming of assemblies

    Remember the assemblyinfo.cs file and the attributes to put a strong name on an assembly? Basically, you needed to create a strong name using sn.exe and to alter the assemblyinfo.cs file to point to the key pair you created in order to create a strong-named assembly. In C# v2.0 you have an additional way to do this kind of stuff, without using the [assembly: ...] attributes:

    csc -keyfile:thekey.snk ...

    As you'll see, registering the assembly in the GAC (one of the reasons to strong name an assembly) using gacutil.exe will work directly. Besides the -keyfile flag, there are other flags for this purpose too: -keycontainer and -delaysign
  3. Specification of the target platform

    Maybe you've seen it in the toolbars inside VS2005 already, but now it's possible to target a certain platform (x86, x64 and "AnyCPU") when compiling to IL. In doing so, the assembly's manifest will contain a .corflags flag, set according to the target platform (0x00000001 being AnyCPU and 0x00000003 being x86 for example). Notice that compilation to x64 currently does not work, since mscorlib.dll is currently not built to target that platform (this will change of course) and this file is referenced during compilation.
  4. Dropped - Incremental compilation

    The /incremental switch has disappeared. As I rarely used it, I won't miss it (I suppose).
  5. warnaserror extension

    In csc 7.x targeting .NET Framework v1.x, there was already the option to do /warnaserror+. However, by doing so, all warnings were treated as errors (therefore causing termination of the compilation when warnings are found). Now it's possible to specify a list of warnings that need to be treated as errors, while the others remain just "warnings".
  6. Errorreport - hopefully you don't need it

    Used to send error reports about compiler errors (that is, errors in csc.exe itself) to Microsoft, specifying the mode to do this (none, send, prompt).
  7. Aliasing when using references

    Using this

    csc /r:SomeAlias=MyAssembly.dll Hello.cs

    allows you to write this code

    extern alias SomeAlias;
    public class Hello
    {
         public static void Main()
         {
              SomeAlias::Fully.Qualified.ClassName.StaticMethodInClassName("blabla");
         }
    }

    Right, two C++ "things" (extern and ::) have come to C# :-).

Another nice one - Conditional compile symbols (works in C# 1.x too, but I'd like to mention it because I like it so much :-))

You know #if and #endif, the preprocessor directives that can be used to include code (or not) based on some preproc "variable"? People who've done C and C++ certainly know this (#include, #define, #ifdef, etc), and it's available in C# too. However, in order to set such a preproc variable, you needed to jump into your code file and define the variable using the #define directive. With the /define compiler switch you can set it from the command line instead. Let's give an example:

//compile with csc /define:WORLD Hello.cs
//or add #define WORLD in the code

using System;
class Hello
{
 public static void Main()
 {
#if WORLD
  Console.WriteLine("Hello World");
#endif
 }
}
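As a small variation on the sample above, the sketch below adds an #else branch, so the effect of the symbol is visible in both cases. The class name Greeter and the fallback text are made up for this illustration; the WORLD symbol is the same one as in the sample and is set with csc /define:WORLD or a #define WORLD at the top of the file.

```csharp
using System;

public static class Greeter
{
    // Returns a different greeting depending on whether the WORLD symbol
    // was defined at compile time.
    public static string Greeting()
    {
#if WORLD
        return "Hello World";
#else
        return "Hello";
#endif
    }
}
```

Only one of the two return statements ends up in the compiled IL, which is easy to verify with ildasm.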

Use of targets file (MSBUILD)

In the .NET Framework installation folder (Windows\Microsoft.NET) you'll now find a file called Microsoft.CSharp.targets which contains the XML description of the build process and everything around it, as used by MSBUILD and the tools. In this file, you'll find a tag, setting various variables on the "attributes of the compiler" (which are mapped to flags when using the command-line compiler directly of course). If you want to compare this system to existing similar technologies, you can compare it with makefiles and Ant, but much smarter :-). Note there is a Microsoft.Common.targets file too, that is included in the CSharp targets file. A full elaboration of these files would be outside the scope of this post, but be sure to check out these files (also when you're interested in ClickOnce, as there are targets such as "ComputeClickOnceManifestInfo" included in this file to support ClickOnce, which means once again that you can do ClickOnce deployment without using the VS2005 tools directly).

ILdasm - No more tricks

Few people knew about the /adv flag of ildasm.exe v1.x. It's not in the help, and it's not in the /? information of the tool. The idea was pretty simple: use ildasm /adv and you get more menus, to view the PE (portable executable) and CLR headers of a file, statistics, etc. Now, it's there by default, no more /adv flag needed.

ILasm would bring us too far in the context of this post, but if I find the time, I'll write something about it. One of the nice new additions is the generation of Edit-and-Continue deltas for E&C support. This proves that the Edit & Continue support is on the level of the .NET Framework (IL-code level) and thus it can be supported for any language (E&C support is coming indeed for both VB and C#).

Ngen - extra options

Ngen (native image generator) has some new options, for example, to defer the creation of native images till the computer is idle (especially interesting for large assemblies). Additionally, the various actions are better described right now (install, update, finish, uninstall, display).

What about VB?

Well, one of the remarkable new features is the support for documentation generation using the vbc.exe compiler (and in the VB language generally speaking). This was already supported on csc (using the /doc flag), and now it's there on VB too. As I'm not a huge VB user (anymore), I can't really tell you everything about the new VB features, but there are various sources around the net to tell you more about this.

Let's stop for now; I'll come back to this topic later on.

