March 2008 - Posts

A while ago I posted a functional way of exception handling, introducing functionality similar to exception filters (VB's When keyword). I admitted it was a crazy Sunday afternoon idea; maybe I should create a category entitled "Crazy Sundays", since this post very much belongs to that same category (update: I did create the category).

It all started a few weeks ago when I was explaining the way LINQ works, starting by focusing on the concept of extension methods and mentioning sexy words like monads, continuations, etc. After the session somebody came to me and wondered what other ideas could be expressed in a way similar to LINQ's query operator chaining. I came up with a couple of uses, and this post makes one of those concrete.

 

The rules

Okay, let's get started. What's up? Cloning the switch functionality as is? Well, almost, but adding some other stuff to it. First, take a look at what we have today, baked in by the language (section 15.7.2 of the C# Language Specification):

switch-statement:
     switch ( expression ) switch-block

switch-block:
     { [switch-sections] }

switch-sections:
     switch-section
     switch-sections switch-section

switch-section:
    switch-labels statement-list

switch-labels:
     switch-label
     switch-labels switch-label

switch-label:
     case constant-expression :
     default :

What's interesting about the above are the limitations imposed by the switch statement. First of all there's the governing type established by the switch expression, which needs to be (or be convertible to, using a single user-defined implicit conversion) a built-in integral type or a string (the exact list of types is specified in 15.7.2). You can think of it as a big if-else if-else statement, although the implementation might be quite different, using jump tables (the IL switch instruction) and, for switches over more than a handful of strings (more than 5, I believe), a Dictionary. I won't go into further IL detail nor dive into the subtleties of nullables (maybe another time).

Another thing to know is that the language has a "no fall through" rule (read: you can't forget a break statement), which eliminates a series of common problems found in other curly brace languages that shall remain unnamed. In addition, one can reorder the switch sections at will without affecting the semantics of the switch. And last but not least, all of the (values of the) labels should be unique.
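
As a baseline, here's a plain switch statement that plays by those rules; nothing new, but it's exactly the construct we're about to mimic:

void Describe(string name)
{
     switch (name)                           // governing type: string
     {
          case "Bart":                       // labels are unique constants
               Console.WriteLine("That's me.");
               break;                        // no fall-through: break (or return/throw/goto case) is required
          case "John":
               Console.WriteLine("Hi John.");
               break;
          default:                           // sections can be reordered freely
               Console.WriteLine("Who are you?");
               break;
     }
}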

All of this being said, we're going to break certain rules over the course of this fun activity. Beware of this, especially when you'd be tempted (I doubt) to embrace this idea.

 

Simple switch

We'll start by defining simple switch logic. How could we mimic a switch statement by means of method calls? Right, extension methods. Before we go there, it's quite important to pick a target for those extensions (the 'this' parameter). Sure, we could go for System.Object, but do we want to spoil such a fundamental type with (seemingly) additional methods? I'm tempted to say no (feel free to have another opinion), so we'll define a wrapper. Quick-n-dirty, here it is:

class Switch
{
    public Switch(object o)
    {
        Object = o;
    }

    public object Object { get; private set; }
}

Exercise: Adding a special class for switching logic has some drawbacks. What about a struct? Predict what would happen if you trade the class keyword for the struct keyword above. Will the fragment still compile? If not, what needs to change? Try to push forward the choice of a struct in the rest of this post if you're convinced about the alternative.

In fact, we could even forget about extension methods at this point in time, since we own the Switch class. You can choose either way, but to keep myself honest on the goal of Crazy Sunday posts I'll stick with extension methods.

Exercise: Abandon my idea of using extension methods and go with instance methods from here on.

A first difference is apparent already: we'll support any type in our switch. Next, we have to pick the syntax we're aiming for. It should go along these lines:

void Do(int age)
{
     new Switch(age)
          .Case(a => (int)a < 18, a =>
          {
               Console.WriteLine("Young");
          })
          .Case(18, a =>
          {
               Console.WriteLine("Middle-age");
          })
          .Default(a =>
          {
               Console.WriteLine("Old");
          });
}

There are a couple of remarkable things in here. Let's analyze case by case:

  • We use System.Object as our base type, so the first case needs a cast. Further on, we'll do something about this.
  • Again in the first case, notice we use a Func<object, bool> as the switching condition. This goes beyond the simple constant-based comparison of the typical switch.
  • The second case is a typical one, comparing just a value for equality.
  • Finally, we have the familiar default base case.

The whole thing 'returns' void, but to allow for chaining we obviously need to pass objects between the Case 'labels'. We could go further and make the whole thing a valued expression, but let's not go there for now.

Another notable (but obvious) thing is the lack of break keywords. There's nothing to break after all, so we need to bake the semantics into the method calls. We'll stick with "no fall-through by default" but will provide an overload:

void Do(string name)
{
     new Switch(name)
          .Case(s => ((string)s).StartsWith("B"), s =>
          {
               Console.WriteLine(((string)s) + " starts with B.");
          }, true)
          .Case(s => ((string)s).StartsWith("Ba"), s =>
          {
               Console.WriteLine(((string)s) + " starts with Ba.");
          })
          .Default(s =>
          {
               Console.WriteLine(((string)s) + " starts with who knows what.");
          });
}

The true parameter to the first Case call indicates to fall through. Time for some implementation work. Here's a first set of (extension) methods:

static class SwitchExtensions
{
     public static Switch Case(this Switch s, object o, Action<object> a)
     {
          return Case(s, o, a, false);
     }

     public static Switch Case(this Switch s, object o, Action<object> a, bool fallThrough)
     {
          return Case(s, x => object.Equals(x, o), a, fallThrough);
     }

     public static Switch Case(this Switch s, Func<object, bool> c, Action<object> a)
     {
          return Case(s, c, a, false);
     }

     public static Switch Case(this Switch s, Func<object, bool> c, Action<object> a, bool fallThrough)
     {
          if (s == null)
          {
               return null;
          }
          else if (c(s.Object))
          {
               a(s.Object);
               return fallThrough ? s : null;
          }

          return s;
     }
}

Notice the way chaining works, by returning null to break the chain. Extension methods and classes make sense after all, although (exercise) you can still (?) work around it (what about a Switch.Break thingy?). Let's bring Default on the scene too:

     public static void Default(this Switch s, Action<object> a)
     {
          if (s != null)
          {
               a(s.Object);
          }
     }

This is where we close the loop by returning void, so that no subsequent Case or Default calls can be made (which really wouldn't make sense).

Exercise: What would it take to turn the whole thing into a valued expression?

 

Generic switch

Remember the first case 'label' of our first sample? A reminder:

           .Case(a => (int)a < 18, a =>
          {
               Console.WriteLine("Young");
          })

The cast is ugly, and this became even more apparent in the second sample, where we had to cast to string multiple times. Not only is this inefficient, it's also a bummer for IntelliSense. Let's fix it by introducing a generic Switch<T> class:

class Switch<T>
{
    public Switch(T o)
    {
        Object = o;
    }

    public T Object { get; private set; }
}

The extensions are simple once more:

     public static Switch<T> Case<T>(this Switch<T> s, T t, Action<T> a)
     {
          return Case(s, t, a, false);
     }

     public static Switch<T> Case<T>(this Switch<T> s, T t, Action<T> a, bool fallThrough)
     {
          return Case(s, x => object.Equals(x, t), a, fallThrough);
     }

     public static Switch<T> Case<T>(this Switch<T> s, Func<T, bool> c, Action<T> a)
     {
          return Case(s, c, a, false);
     }

     public static Switch<T> Case<T>(this Switch<T> s, Func<T, bool> c, Action<T> a, bool fallThrough)
     {
          if (s == null)
          {
               return null;
          }
          else if (c(s.Object))
          {
               a(s.Object);
               return fallThrough ? s : null;
          }

          return s;
     }

     public static void Default<T>(this Switch<T> s, Action<T> a)
     {
          if (s != null)
          {
               a(s.Object);
          }
     }

This allows us to write our previous samples more concisely:

void Do(string name)
{
     new Switch<string>(name)
          .Case(s => s.StartsWith("B"), s =>
          {
               Console.WriteLine(s + " starts with B.");
          }, true)
          .Case(s => s.StartsWith("Ba"), s =>
          {
               Console.WriteLine(s + " starts with Ba.");
          })
          .Default(s =>
          {
               Console.WriteLine(s + " starts with who knows what.");
          });
}

Much cleaner.

 

Type switch

Crazy or not, most of the time there's something useful to it. What about capturing the following pattern?

void Do(Control c)
{
     if (c is Label)
     {
          Label l = (Label)c;
          // ...
     }
     else if (c is Button)
     {
          Button b = (Button)c;
          // ...
     }
     else
     {
          // ...
     }
}

This is a common pattern when dealing with extensions to UI code that need to process all sorts of controls, or when writing parsers, as with System.Linq.Expressions, where you have to switch on the type of an expression. Unfortunately, the code above isn't the most efficient: first we do a type check, followed by a raw cast. Use of the as keyword is better (even FxCop will tell you):

void Do(Control c)
{
     Label l;
     Button b;
     if ((l = c as Label) != null)
     {
          // ...
     }
     else if ((b = c as Button) != null)
     {
          // ...
     }
     else
     {
          // ...
     }
}

But it soon starts to get uglier. I'm not claiming to improve readability or efficiency in this post; suffice it to say I'm capturing a pattern. Enter our type switch. We'd like to be able to rewrite the code above as:

void Do(Control c)
{
     new Switch(c)
          .Case<Label>(l =>
          {
               // ...
          })
          .Case<Button>(b =>
          {
               // ...
          })
          .Default(cc =>
          {
               // ...
          });
}

First of all, notice we piggyback on the non-generic switch. Every Case 'label' already carries type information and can only be entered if the switch expression is of the specified type; therefore each label's action body receives the original expression cast to that type. E.g. when typing b. in the second label, you'll see the IntelliSense list for a Button variable. The only drawback is the Default block, where cc won't be of a more specific type. Obviously you could make it Default<T>, passing in Control in the sample above.

Exercise: Think about the reason not to use the generic Switch<T> in here (tip: see the implementation below).

On to the implementation. Almost trivial again:

public static Switch Case<T>(this Switch s, Action<T> a) where T : class
{
    return Case<T>(s, o => true, a, false);
}

public static Switch Case<T>(this Switch s, Action<T> a, bool fallThrough) where T : class
{
    return Case<T>(s, o => true, a, fallThrough);
}

public static Switch Case<T>(this Switch s, Func<T, bool> c, Action<T> a) where T : class
{
    return Case<T>(s, c, a, false);
}

public static Switch Case<T>(this Switch s, Func<T, bool> c, Action<T> a, bool fallThrough) where T : class
{
    if (s == null)
    {
        return null;
    }
    else
    {
        T t = s.Object as T;
        if (t != null)
        {
            if (c(t))
            {
                a(t);
                return fallThrough ? s : null;
            }
        }
    }

    return s;
}

Default has been specified already although you could have a Default<T> as well (as outlined previously).
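
For reference, such a Default<T> for the non-generic type switch could look like this. It's just a sketch; note that it silently skips the action when the remaining object isn't a T, which is one possible design choice:

public static void Default<T>(this Switch s, Action<T> a) where T : class
{
    if (s != null)
    {
        // Only run the default action when the remaining object can be treated as a T.
        T t = s.Object as T;
        if (t != null)
        {
            a(t);
        }
    }
}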

Exercise: Why the generic constraint in the code above? Any way around it (without using plain casts and exception handling obviously...)? What about nullables?

Ultimately the same chaining is made possible with the above, but this time by switching on types. Notice fall-through is still relevant, not just because we allow arbitrary conditions through Func<T, bool>, but also because of the type hierarchy we're dealing with. That is (one of the rules we're breaking): order matters.

To show you the above works like a charm:

image

Yes, there are ways around this with a classic switch on the Expression.NodeType enum value, but sometimes you want more, or other (sealed) object hierarchies lack such infrastructure.
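
To give a feel for what such a type switch over an expression tree might look like, here's a sketch using System.Linq.Expressions (not necessarily the exact demo shown above):

Expression<Func<int, int>> f = x => x + 1;

new Switch(f.Body)
     .Case<BinaryExpression>(b =>
     {
          Console.WriteLine("Binary " + b.NodeType + " node");
     })
     .Case<ParameterExpression>(p =>
     {
          Console.WriteLine("Parameter " + p.Name);
     })
     .Default(e =>
     {
          Console.WriteLine("Some other node: " + e.GetType().Name);
     });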

 

Valued switches

So, what would it take to make the switch valued, meaning it doesn't return void but any "projection" you want? In fact, this is much like functional languages, where we have if-expressions (instead of statements), comparable to the ternary operator in curly brace languages (and the new If in VB 9.0). I won't nag too much about this, but such a construct is far from rare. Take LISP for example, with:

(cond (e1 e1') (e2 e2') ... (en en'))

so that

(if e1 e2 e3) = (cond (e1 e2) ('T e3))

where if is redefined in terms of cond: if e1 evaluates to true, e2 is returned; otherwise the car (the old name for head, standing for "contents of the address register", a historical name) of the second clause is evaluated, i.e. 'T = true, and since that always evaluates to true, e3 is returned.

In order to enable this, we'll need to create a new generic Switch type that not only takes a type parameter specifying the source type but also one for the target type. This is our definition:

class Switch<T, R>
{
    public Switch(T o)
    {
        Object = o;
    }

    public T Object { get; private set; }
    public bool HasValue { get; private set; }
    public R Value { get; private set; }

    public void Set(R value)
    {
        Value = value;
        HasValue = true;
    }
}

It looks a bit like Nullable<R> with the HasValue and Value properties. Essentially, once a value has been assigned (through Set), HasValue flips to true, which indicates we've found a match. The semantics are that the first match in a Switch-expression wins, although one could easily adapt this. However, notice this is less efficient than an early return from a function, since we'll have to forward the result till the end of the method call chain that makes up the Switch-expression. Let's make it concrete with just three functions (it only gets easier, it seems):

public static Switch<T, R> Case<T, R>(this Switch<T, R> s, T t, Func<T, R> f)
{
    return Case<T, R>(s, x => object.Equals(x, t), f);
}

public static Switch<T, R> Case<T, R>(this Switch<T, R> s, Func<T, bool> c, Func<T, R> f)
{
    if (!s.HasValue && c(s.Object))
    {
        s.Set(f(s.Object));
    }

    return s;
}

public static R Default<T, R>(this Switch<T, R> s, Func<T, R> f)
{
    if (!s.HasValue)
    {
        s.Set(f(s.Object));
    }

    return s.Value;
}

Actually this starts to look a little LINQ-familiar, with Func<T, bool> being a predicate (as in Where) and Func<T, R> being a projection (as in Select*). The idea is simple: a case evaluates its condition only if the switch doesn't have a final value yet. If the test (c) passes, the projection is carried out (f) and the value is set (Set). Default is unconditional and has to be supplied as the final 'projection', but it could well be trivial (especially when case labels are present for all cases that can occur, e.g. when switching on an enumeration or a fixed object hierarchy). Notice there's no type switch functionality (exercise). Here's a trivial sample of this switch at work:

var res =
     from x in typeof(string).GetMembers()
     select new Switch<MemberInfo, string>(x)
            .Case(m => m is MethodInfo, m => m.Name + " is a method")
            .Case(m => m is PropertyInfo, m => m.Name + " is a property")
            .Default(m => m.Name + " is something else");

foreach (var s in res)
    Console.WriteLine(s);

producing the following result:

image

 

Conclusion

Crazy but lots of fun. And there's much room for follow-up. Just a few ideas: Expression<T>, Reflection.Emit. Anyway, enough for now. Have a nice week!


Two weeks ago I did a little tour through Europe spreading the word on a couple of our technologies including Windows PowerShell 2.0. In this blog series I'll dive into a few features of Windows PowerShell 2.0. Keep in mind though it's still very early and things might change towards RTW - all samples presented in this series are based on the CTP which is available over here.

 

Introduction

Previously in this series we covered new scripting capabilities with script cmdlets, script internationalization and a few language enhancements in Windows PowerShell 2.0. But writing scripts is just one piece of the puzzle: how do you debug them when something isn't quite right? To answer that question, Windows PowerShell 2.0 introduces script debugging capabilities.

 

Set-PsDebug

The Windows PowerShell debugging story embodies a series of features that cooperate with each other. First there's Set-PsDebug, the cmdlet you'll use to configure the debugging options of the system. There are a few of them:

  • -off: turns script debugging off
  • -trace: specifies a trace level, where 0 is like -off, 1 traces script execution on a line-per-line basis, 2 does the same but also traces variable assignment and function calls
  • -strict: like VB's Option Strict, makes the debugger throw an exception if a variable is used before being assigned to

Below is a run showing some Set-PsDebug options:

image
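
A quick experiment with those options might look like this (a sketch, not necessarily the exact run shown in the screenshot):

Set-PsDebug -trace 2     # trace lines, variable assignments and function calls
$x = 1 + 1               # this assignment now shows up as DEBUG output
Set-PsDebug -strict      # error when a variable is used before assignment
$y                       # raises an error under -strict: $y was never assigned
Set-PsDebug -off         # turn script debugging off again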

Notice all debugging output triggered by the trace level set through Set-PsDebug is prefixed with DEBUG. In order to write to the debug output yourself, there's Write-Debug which I'll leave as an exploration for the reader.

 

Working with breakpoints

Where it really gets interesting is the concept of breakpoints which are "points" where execution is "broken". In PowerShell that corresponds to the following:

  • A line (and column) number in a certain script;
  • Calls to a specified function;
  • Invocations of a specified command;
  • Variable access.

Once we have specified where to break, we need to focus on what to do when the breakpoint is hit. When no action is specified, the shell will spawn a sub-shell that has access to the current state of the execution, so that variables can be inspected and other actions can be taken during debugging. Alternatively, one can specify a script-block as an action.

Enough theory, let's map those concepts onto cmdlets. Breakpoints in PowerShell are called PSBreakpoints, so let get-command be our guide:

image

It obviously all starts with New-PSBreakpoint and all other cmdlets are self-explanatory. Time to show a few uses of breakpoints. First, create a simple script called debug.ps1:

function bar {
   $a = 123
   if ($a -gt 100)
   {
      $a
      foo
   }
}

function foo {
   $a = 321
   Write-Host $a
}

"Welcome to PowerShell 2.0 Script Debugging!"
bar

Invoking it should produce no surprises:

image

First we'll set a breakpoint on a specific line of the script using New-PSBreakpoint -Script debug.ps1 -Line 16 and re-run the script. Notice - with tracing on to show line info of the script executing - we're breaking on the call to bar:

image

Also notice the two additional > signs added to the prompt below. This indicates we've entered a nested debugging prompt. Now we need to control the debugger to indicate what we want to do. For that purpose there are a few Step-* cmdlets as shown below:

image

With Step-Into you simply go to the next statement, possibly entering a function call. With Step-Over you do the same, but you "step over" function calls, straight to the line below the call. Step-Out is used to exit from a breakpoint and let the script continue to run till the next breakpoint is hit (or till it completes). A quick run:

image

So far we've been stepping through the code line-by-line. Notice the line numbers shown next to the DEBUG: word when tracing is enabled. The second DEBUG: line shows the output of the Step-Into command, previewing the next line we'd end up on. Now we're inside the foo function call, but you might wonder how we got there and which functions have been called before: enter Get-PsCallstack:

image

From the original prompt (0), we executed debug.ps1, which called into bar and subsequently foo, to end up in the nested debugger prompt. While debugging you'll obviously want to investigate the system, for example to see what $a contains, so you can simply print the variable. Finally, we continue to step and exit the nested prompt because the script has stopped:

image

Time for some bookkeeping: let's get rid of this breakpoint. Easy once more, using Remove-PSBreakpoint:

image

To illustrate a few other concepts, we'll set a breakpoint on a function, on a command invocation and on variable access:

image
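
The commands were of this flavor; only the -Script and -Line parameters appeared literally earlier in this post, so treat the -Function, -Command and -Variable parameter names below as assumptions about the CTP rather than verified syntax:

# Hypothetical sketch - the -Function, -Command and -Variable parameter names are assumptions.
New-PSBreakpoint -Script debug.ps1 -Function foo        # break whenever the foo function is called
New-PSBreakpoint -Script debug.ps1 -Command Write-Host  # break on invocations of Write-Host
New-PSBreakpoint -Script debug.ps1 -Variable a          # break when $a is accessed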

Re-run the script and watch out. Here's the output - we break four times: two variable $a assignments, one foo call and one call to Write-Host:

image

Notice the use of exit to escape from the nested prompt and to make the script execution continue to the next breakpoint. An alternative would be to use Step-Out. The variable-assignment breakpoint in particular is very attractive, because in lots of cases you see the state of a variable changing and you simply want to trace back where those changes are happening.

Other stuff you might want to take a look into includes the -Action parameter to New-PSBreakpoint, the ability to clone breakpoints using -Clone, enabling/disabling breakpoints and the HitCount property of breakpoints.

For more information on debugging, simply take a look at get-help about_debug.

Happy debugging!


Two weeks ago I did a little tour through Europe spreading the word on a couple of our technologies including Windows PowerShell 2.0. In this blog series I'll dive into a few features of Windows PowerShell 2.0. Keep in mind though it's still very early and things might change towards RTW - all samples presented in this series are based on the CTP which is available over here.

 

Introduction

After a couple of language-related features, we move into infrastructure in this post. Enter the world of remoting in Windows PowerShell 2.0, at least partially. So what's up? The Universal Code Execution Model, to which the Remoting features belong, is one of the core enhancements of Windows PowerShell 2.0 allowing script to be run either locally or remotely in different modes:

  • 1-to-many "fan-out" - execute script on a (large) number of machines (e.g. web farms)
  • Many-to-1 "fan-in" - delegating administration of a server by hosting PowerShell inside the service
  • 1-on-1 "interactive" - managing a server remotely much like "secure telnet"

In addition to this, other features are part of the Universal Code Execution Model:

  • Restricted Runspaces - the idea of having a runspace that's restricted in what it can do (e.g. concerning the operations and the language used)
  • Mobile Object Model - the plumbing that makes it possible to have objects travel around the network (i.e. serialization and deserialization infrastructure)
  • Eventing - adding the concept of events to PowerShell manageability, allowing actions to be taken when certain events occur
  • Background jobs - running commands (cmdlets, scripts) in the background asynchronously

For remote execution (remote meaning across runspace boundaries), Windows PowerShell uses the WS-Management (WS-MAN) protocol, which is enabled by the WinRM service:

image

The nice thing about using web services is their firewall friendliness. However, in order to enable PowerShell to work with it, one needs to run a script first: $pshome\Configure-WSMAN.ps1. It opens the required ports, checks that the service is installed and executes a set of winrm configuration commands that enable endpoints.

 

Background jobs

We'll stick with background jobs for now. There are 6 cmdlets to manage background jobs, known by the PSJob noun in PowerShell 2.0 speak:

image

What's better to start with than Start-PSJob? Here's the syntax:

PS C:\temp> start-psjob -?

NAME
    Start-PSJob

SYNOPSIS
    Creates and starts a Windows PowerShell background job (PsJob) on a local or remote computer.

SYNTAX
    Start-PSJob [-Command] <String> [[-ComputerName] <String[]>] [-Credential <PSCredential>] [-Port <Int32>] [-UseSSL] [-ShellName <String>] [-ThrottleLimit <Int32>] [-InputObject <PSObject>] [-Name <String>] [<CommonParameters>]

    Start-PSJob [-Command] <String> [[-Runspace] <RemoteRunspaceInfo[]>] [-ThrottleLimit <Int32>] [-InputObject <PSObject>] [-Name <String>] [<CommonParameters>]

Notice the synopsis: on a local or remote computer. This is where remoting enters the picture, with the concept of a remote runspace. We won't go there though; let's stick with local execution and start a command:

start-psjob "start-sleep 30"

This will show the following:

image

Normally, "start-sleep 30" would block the interactive console for 30 seconds (feel free to try). However, now we have sent off the command to the background, in a session with Id 1. The way this works roughly is by having runspaces and communication channels between them to send commands and receive data. The fact data is available is indicated by the HasMoreData property on the job. Without going in too much details, running commands remotely follows the same idea and results can be streamed back from the server to the client so that you can retrieve results piece-by-piece.

Back to our sample now. Of course we can stop a background job by using stop-psjob:

image

What the sample above shows as well is waiting for a PSJob to complete when you need to get the results at that particular point in time, e.g. after having done some more foreground work. Notice the wait-psjob above is blocked for the remainder of the 30 seconds while the background job is completing. Ultimately it returns like this:

image

 

Where's my data, dude?

(Added a comma to disambiguate with some other Microsoft product :-)). Having background jobs is one thing, but getting results back is another. For Parallel Extensions folks, it's like drawing the line between a Task and a Future<T>. So dude, where's my cup of T? The answer lies in the difference between wait-psjob and receive-psjob. While wait-psjob simply waits for a job to finish, receive-psjob receives data from it. What's really happening is that the foreground runspace talks to the background session to get data back, which travels across boundaries (whether that boundary is on the local machine or across the network), cf. the Mobile Object Model.

image
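
So the typical flow looks roughly like this (a sketch using the CTP cmdlet names; it assumes start-psjob returns the job object it creates and that the job can be passed positionally to the other cmdlets, as the pseudo-algorithm further on suggests):

$job = start-psjob "get-process | select -f 5"    # run the pipeline in the background
wait-psjob $job                                   # block until the job has completed
receive-psjob $job                                # pull the deserialized results into the foreground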

An interesting thing to look at is the get-member output for these objects, more specifically the NoteProperty members on it:

image

This is where you see the objects are really deserialized across boundaries, which is one of the tasks of the Mobile Object Model. For example, the PSIPHostEntry shows the origin of the object, which is particularly useful when working with remoting. In this context, notice that a background job can spawn other background jobs by itself, meaning objects might be aggregated from various sources before they travel your way.

Another thing to realize is that data is streaming in. Assume you're asking for a bunch of results to come in from a remote server. These results are typically emitted by the pipeline object-by-object (unless there's a cmdlet that returns an array of objects or so, which can look to the pipeline - depending on how the result is returned - as one big object) so it makes sense to get the current results, wait for new ones to be produced and get subsequent results. Essentially the pseudo-algorithm is:

while ($job.HasMoreData)
{
    receive-psjob $job
    # do some other stuff
}

Here's a concrete sample:

image

The first time I called receive-psjob only the "get-process | select -f 5" pipeline would have yielded results, so I receive that data while the HasMoreData flag is still set to true. About 30 seconds later, I call receive-psjob $bar again. By then the results of "get-service | select -f 5" have come in too, and HasMoreData indicates there's nothing more to come (the State indicates the background job has completed).

Enjoy your dream-PSJob!


Two weeks ago I did a little tour through Europe spreading the word on a couple of our technologies including Windows PowerShell 2.0. In this blog series I'll dive into a few features of Windows PowerShell 2.0. Keep in mind though it's still very early and things might change towards RTW - all samples presented in this series are based on the CTP which is available over here.

 

Introduction

This time we'll take a brief look at a few language enhancements in Windows PowerShell 2.0. There are three such enhancements that deserve a little elaboration at the time of writing:

  • Splat - 'splatting' of a hashtable as input to a cmdlet invocation
  • Split - splitting strings
  • Join - the reverse of split

 

Splat

Splatting allows the entries of a hash-table to be used in the invocation of a cmdlet - more specifically, keys become named parameters and values become input to those parameters. Here's a sample:

$procs = @{name="notepad","iexplore"}
get-process @procs

And the result looks like this:

image

Of course multiple parameters can be specified at once (that's the whole point of the hashtable anyhow):

$gm = @{memberType="ScriptProperty","Property";name="[a-d]*"}
get-process @gm

image

In other words, invocation parameterization information can now be kept and passed around as data.
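
For example, nothing stops you from building that hashtable up programmatically before splatting it (a small sketch; $filter is a hypothetical variable):

$p = @{name="notepad"}
if ($filter -ne $null)
{
    $p.name = $filter     # swap in a caller-supplied process name
}
get-process @p            # splat whatever ended up in the hashtable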

 

Split and join

Split and join are fairly trivial in fact. These are the equivalents of System.String's split and join operations but now exposed as language-integrated operators.

"bart","john" -join ","
"bart,john" -split ","

image

Simple but oh so handy :-). Have fun!


Two weeks ago I did a little tour through Europe spreading the word on a couple of our technologies including Windows PowerShell 2.0. In this blog series I'll dive into a few features of Windows PowerShell 2.0. Keep in mind though it's still very early and things might change towards RTW - all samples presented in this series are based on the CTP which is available over here.

 

Introduction

Imagine you're working for a company that operates in different countries or regions with different languages. Or you're creating a product that will be used by customers around the globe. Hard-coding messages in one language isn't likely going to be the way forward in such a case. Unfortunately, with the first release of Windows PowerShell, copy-paste localization of scripts was all too common. In this post we'll take a look at the Windows PowerShell 2.0 Script Internationalization feature.

 

String tables

In order to allow for localization, string tables are used quite often. The idea of a string table is to have key-value pairs that contain the (to-be) localized strings in order to separate the logic from the real string messages. Windows PowerShell 2.0 has this new cmdlet called ConvertFrom-StringData which is described as: "Converts a string containing one or more "name=value" pairs to a hash table (associative array)." If you read a little further in the get-help output you'll see the following:

The ConvertFrom-StringData cmdlet is considered to be a safe cmdlet that can be used in the DATA section of a script or function. When used in a DATA section, the contents of the string must conform to the rules for a DATA section. For details, see about_data_section.

Data sections are new to Windows PowerShell 2.0 and deserve a post on their own. For the purpose of this post, it suffices to say that a data section is a section used in script that can only contain data-operations and therefore it only supports a subset of the PowerShell language.

Let's use ConvertFrom-StringData on its own for now:

image

In here I'm using a so-called "here-string" that spans multiple lines, each line containing a key = value pair.
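
In script form, that experiment looks roughly like this (a sketch of the run shown above):

$table = ConvertFrom-StringData @'
    helloWorld = Hello, World :-).
    errorMsg = Something went horribly wrong :-(.
'@
$table.helloWorld         # prints the first message from the resulting hashtable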

 

Localizable scripts

Time to put the pieces together and create a localizable script:

Data msgTable
{
ConvertFrom-StringData @'
    helloWorld = Hello, World :-).
    errorMsg = Something went horribly wrong :-(.
'@
}

Write-Host $msgTable.helloWorld
Throw $msgTable.errorMsg

Here's the result:

image

In the fragment above, notice the use of the Data keyword to denote a data section. In the remainder of the script, $msgTable is used as the variable to denote the hash table created by the ConvertFrom-StringData invocation in the data section.

 

Localized string tables

We already achieved some decoupling between the code and the messages, simply by putting the messages in a separate table. Now we have to blend in the actual culture of the system, which is exposed as $UICulture:

image

We don't need to use this variable directly though. Using the new Import-LocalizedData cmdlet we can make PowerShell search for the right string table by investigating the directory structure. The idea is to have .psd1 files (a new extension) that contain localized string tables in subdirectories that denote the culture specified by language code and region code:

C:\temp\PS2\I18N\demo.ps1
C:\temp\PS2\I18N\nl-BE\demo.psd1
C:\temp\PS2\I18N\fr-FR\demo.psd1

Let's create the nl-BE\demo.psd1 file:

ConvertFrom-StringData @'
    helloWorld = Hallo, Wereld!
    errorMsg = Oeps! Dit ging vreselijk fout.
'@

Just copy the contents of the data section to a separate .psd1 file and translate it. Such files are "data files" (hence the d in the name) and substitute the contents of a data section. This doesn't happen magically of course; we need to call Import-LocalizedData in our script:

Data msgTable
{
ConvertFrom-StringData @'
    helloWorld = Hello, World :-).
    errorMsg = Something went horribly wrong :-(.
'@
}

Import-LocalizedData -bindingVariable msgTable

Write-Host $msgTable.helloWorld
Throw $msgTable.errorMsg

Import-LocalizedData extracts the $UICulture and tries to find the right .psd1 file. When found, the contents of the file are assigned to the binding variable which points at a data section.

Now when we run on an nl-BE machine, we'll see the following:

image

Enjoy!


Two weeks ago I did a little tour through Europe spreading the word on a couple of our technologies including Windows PowerShell 2.0. In this blog series I'll dive into a few features of Windows PowerShell 2.0. Keep in mind though it's still very early and things might change towards RTW - all samples presented in this series are based on the CTP which is available over here.

 

Introduction

In this first post we'll take a look at script cmdlets. Previously, in v1.0, the creation of cmdlets was an exclusive right for developers using a managed language (typically VB.NET or C#). I've been blogging about this quite a bit in the past, all the way back to May 2006.

To work around this limitation, lots of IT pros have been writing PowerShell scripts that follow the naming pattern of cmdlets, but the invocation syntax of those scripts is completely different from that of real cmdlets. For example, there's no built-in notion of mandatory parameters for scripts unless you write your own validation. Similarly, things such as -whatif and -confirm are not supported by these scripts.

Starting with PowerShell 2.0, the creation of cmdlets is now possible using script as well. In this post, I'll port my file hasher cmdlet to a script cmdlet.

 

The basics

Creating a script cmdlet starts by creating a script file, e.g. get-greeting.ps1. Below is the skeleton of a typical script cmdlet:

Cmdlet Verb-Noun
{
   Param(...)
   Begin
   {
   }
   Process
   {
   }
   End
   {
   }
}

A minimal script cmdlet would simply consist of a Process section, like this:

Cmdlet Get-Greeting
{
   Process
   {
      "Hello PowerShell 2.0!"
   }
}

In order to execute, save the file (e.g. get-greeting.ps1) and load it using . .\get-greeting.ps1. Now the get-greeting cmdlet is in scope and can be executed:

image

If the cmdlet is executed as part of a pipeline, meaning that (possibly) multiple records flowing through the pipeline have to be processed, the Process block will be executed for each of those. However, the Begin and End blocks will be triggered only once. Before we can go there, let's take a look at parameterization.

 

Parameterization

Parameterization is maybe the most powerful thing about script cmdlets. It all happens in the Param section. Let's extend our greeting cmdlet with a parameter:

Cmdlet Get-Greeting
{
   Param([string]$name)
   Process
   {
      "Hello " + $name + "!"
   }
}

Perform the same steps to load the cmdlet and execute it, first without arguments, then with an argument:

image

The first invocation is not really what we had in mind. The parameter needs to be mandatory instead. In script cmdlets, this is easy to do, simply by adding an attribute to the parameter:

Cmdlet Get-Greeting
{
   Param([Mandatory][string]$name)
   Process
   {
      "Hello " + $name + "!"
   }
}

Now, PowerShell will enforce this declaration and require the parameter to be supplied:

image

Here you see how the PowerShell engine takes over from the script author. Beyond simple mandatory parameters, one can specify validation attributes as well, such as AllowNull, AllowEmptyString, AllowEmptyCollection, ValidateNotNull, ValidateNotNullOrEmpty, ValidateRange, ValidateLength, ValidatePattern, ValidateSet, ValidateCount and ValidateScript. The latter is interesting in that it is not available to managed code cmdlets for the time being: it allows a script function to be specified to carry out validation of the parameter's value (e.g. a script that validates ZIP codes or SSNs, which can be reused across multiple script cmdlets).
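
As a sketch of what such declarative validation could look like (Test-Zip is a made-up cmdlet; the attribute names come from the list above and the ZIP pattern is just an example):

Cmdlet Test-Zip
{
   # Reject anything that isn't five digits, and additionally reject "00000" via a script check.
   Param([Mandatory][ValidatePattern("^\d{5}$")][ValidateScript({$_ -ne "00000"})][string]$zip)
   Process
   {
      "Looks like a valid ZIP code: " + $zip
   }
}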

 

The pipeline

Let's make our cmdlet play together with the pipeline now. We're already emitting data to the pipeline, simply by our "Hello" ... expression that produces a string. However, we'd like to grab data from the pipeline too. This can be done by binding a parameter to the pipeline:

Cmdlet Get-Greeting
{
   Param([ValueFromPipeline][Mandatory][string]$name)
   Process
   {
      "Hello " + $name + "!"
   }
}

image

Here the strings "Bart" and "John" are grabbed from the pipeline to be bound to the $name parameter. To show that Begin and End are only processed once, change the cmdlet as follows:

Cmdlet Get-Greeting
{
   Param([ValueFromPipeline][Mandatory][string]$name)
   Begin
   {
      Write-Host "People can come in through the pipeline"
   }
   Process
   {
      "Hello " + $name + "!"
   }
   End
   {
      Write-Host "Goodbye!"
   }
}

and the result is:

image

Typically Begin and End are used to allocate and free shared resources for reuse during record processing.

 

Interacting with the pipeline processor

There's still more goodness. Using the $cmdlet variable inside the script cmdlet, one can extend the capabilities even more. To see what this can do, create a simple script cmdlet:

Cmdlet Get-Cmdlet
{
   Process
   {
      $cmdlet | get-member
   }
}

This is the result:

image

We won't be able to take a look at each of those, but let's play with a couple of them: ShouldProcess and WriteVerbose.

Cmdlet Get-Greeting -SupportsShouldProcess
{
   Param([ValueFromPipeline][Mandatory][string]$name)
   Begin
   {
      #Write-Host "People can come in through the pipeline"
   }
   Process
   {
      if ($cmdlet.ShouldProcess("Say hello", $name))
      {
         $cmdlet.WriteVerbose("Preparing to say hello to " + $name)
         "Hello " + $name + "!"
         $cmdlet.WriteVerbose("Said hello to " + $name)
      }
   }
   End
   {
      #Write-Host "Goodbye!"
   }
}

Notice the addition of -SupportsShouldProcess in the Cmdlet declaration. This tells the engine our cmdlet is capable of supporting -whatif and -confirm switches. Inside the implementation we add an if-statement that invokes ShouldProcess specifying the action description and the target ($name). The result is this:

image

Essentially, -whatif answers that ShouldProcess call with false, skipping the real invocation but still printing the actions and targets the operation would have triggered. When using -confirm, the user is prompted each time a ShouldProcess call is made (unless [Yes|No] to All is answered, obviously).

When using the -verbose switch, the WriteVerbose calls are emitted to the console as well:

image
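
Putting it all together, invocations along these lines exercise the various switches (a sketch; the exact output will differ):

. .\get-greeting.ps1                    # bring the script cmdlet into scope
"Bart","John" | get-greeting -whatif    # show the "Say hello" actions without executing them
"Bart","John" | get-greeting -confirm   # prompt before each ShouldProcess call
"Bart" | get-greeting -verbose          # also emit the WriteVerbose messages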

 

Porting the File Hasher cmdlet

Enough introductory information, let's do something real. Here's the script for my old file hasher cmdlet ported as a script cmdlet:

Cmdlet Get-Hash
{
   Param
   (
      [Mandatory][ValidateSet("SHA1","MD5")][string]$algo,
      [Mandatory][ValueFromPipelineByPropertyName][string]$FullName
   )
   Begin
   {
      $hasher = [System.Security.Cryptography.HashAlgorithm]::Create($algo)
   }
   Process
   {
      $fs = new-object System.IO.FileStream($FullName, [System.IO.FileMode]::Open)
      $bytes = $hasher.ComputeHash($fs)
      $fs.Close()

      $sb = new-object System.Text.StringBuilder
      foreach ($b in $bytes) {
         $sb.Append($b.ToString("x2")) | out-null
      }

      $sb.ToString()
   }
}

Pretty simple, isn't it? A few implementation highlights:

  • I have two parameters, comma-separated in the Param(...) section.
  • The first parameter should either be MD5 or SHA1 (case-insensitive), which I'm validating using ValidateSet. Anything but those two will fail execution of the cmdlet.
  • The second parameter is taken from the pipeline by property name. Notice FullName is a property on file objects, so this allows piping the output of get-childitem (dir) on a file system folder into the get-hash cmdlet (see the usage sketch after this list).
  • Creation of the hasher algorithm is straight-forward but is done in the Begin section to allow reuse across multiple processed records.
  • The core of the implementation is simple: it opens the file as specified in the $FullName parameter, feeds the stream into the hasher and turns the bytes into their string representation. Notice the use of out-null to suppress the output of the $sb.Append call from bubbling up to the pipeline; only the $sb.ToString() result is reported.
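
A typical invocation simply pipes a directory listing into the cmdlet (a sketch; get-hash.ps1 is a hypothetical file name and the script is assumed to have been dot-sourced first):

. .\get-hash.ps1                # load the script cmdlet
dir *.cs | get-hash -algo MD5   # FullName binds from the pipeline by property name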

Here's the result:

image

Hashes are calculated for all *.cs files. I didn't extend the sample to print the file name (which would be simple to do) or to report it as part of the output (wrapping a file name and the hash result in an object, which is harder to do), but if you go back to my original file hasher cmdlet post, you'll see there's another option using the Extended Type System.

Enough for now. As you saw in this post, script cmdlets unlock an enormous potential to extend PowerShell with first-class citizen cmdlets simply by leveraging your scripting knowledge in PowerShell. Together with some other features such as script internationalization (coming up in this series) and packages and modules (not in the current CTP) this is just the tip of the iceberg of PS 2.0 Production Scripting.

Happy script-cmdlet-ing!

