In the previous posts on CLR Hosting with .NET v2.0 I explained the basic principles of - guess what - hosting the CLR inside a process by using the mscoree library. This post will explain what the various options are for starting the CLR and what the basic lifecycle of a hosted CLR looks like. In the next episodes we'll dive into customizing the hosted CLR in much more detail.
Basic code skeleton for CLR hosting
In part 1 of this episode, I showed the following code fragment to launch the CLR:
ICLRRuntimeHost *pHost = NULL;
HRESULT hr = CorBindToRuntimeEx(L"v2.0.40607", L"wks", STARTUP_CONCURRENT_GC,
                                CLSID_CLRRuntimeHost, IID_ICLRRuntimeHost, (PVOID *)&pHost);
if (SUCCEEDED(hr))
    hr = pHost->Start();
Let's now explain in somewhat more detail what the various parameters of the CorBindToRuntimeEx function mean and how you can use these parameters to customize the behavior of the CLR.
Being at the heart of the .NET Framework, the CLR is upgraded in every release of the .NET Framework, so there has to be a versioning concept for the CLR as such. In order not to break existing applications, the .NET Framework supports side-by-side installation of different versions of the CLR (and of the .NET Framework BCL assemblies in the GAC). The various versions of the CLR on your machine can be found in the Windows\Microsoft.NET\Framework folder on the system, where every subfolder contains a specific version of the .NET Framework (e.g. 1.0.3705, 1.1.4322). Furthermore, there's a key in the registry that enumerates all of the installed versions of the .NET Framework. This key can be found under HKLM\SOFTWARE\Microsoft\.NETFramework\Policy. However, although you can have multiple versions of the .NET Framework and CLR on the same machine, there is only one startup shim on the machine, named mscoree.dll. This DLL uses the registry to locate the .NET Framework installation files for the requested CLR version and to hand over execution control to the requested CLR engine, which is installed in one of the Windows\Microsoft.NET\Framework subfolders.
When loading a specific version of the CLR, the shim still has a choice to make: whether to load the server build or the workstation build, as explained in the next paragraph. However, as the implementer of a CLR host you have to make a decision too: which version loading strategy you want to follow.
- A first choice is to go hand in hand with a certain version of the CLR, which is in my opinion the best possible choice. SQL Server 2005 uses this approach and will always load the version of the CLR that it was originally built with. I can't give a version number yet, as the final builds of SQL Server 2005 and the .NET Framework v2.0 are not available yet. But if you ever wondered why the SQL Server 2005 and .NET Framework v2.0/Visual Studio 2005 releases are so tightly bound to each other, this is the reason why: in order to finalize the SQL Server 2005 development and make the RTM build, the v2.0 build of the CLR and .NET Framework has to be frozen first.
- The other option is to load the latest version of the CLR that's available on the system. I won't cover this strategy because I don't recommend it: it can lead to compatibility problems. Although these should be rare, it's better to stick with a specific version of the CLR and avoid this kind of problem proactively.
As a result, the first parameter of the CorBindToRuntimeEx function contains the version number in the format "v<major>.<minor>.<build>". Tip: also look at CorBindToCurrentRuntime which uses a configuration file to retrieve information about which version of the CLR to load.
Warning: The version which will be loaded does not have to match the one you specify, because of possible policy configuration on the machine. You can force the shim to load the specified version "literally" by using the STARTUP_LOADER_SAFEMODE flag in the third parameter of CorBindToRuntimeEx.
If you're not implementing a custom CLR host, you can use the app.exe.config file to configure a specific version to be loaded. If multiple supportedRuntime elements are specified, evaluation occurs from top to bottom. A requiredRuntime element is only needed for backward compatibility on machines that only have version 1.0.3705 of the .NET Framework, so I don't mention it in the configuration file below.
<supportedRuntime version="v<major>.<minor>.<build>" /> <!-- specify version that is supported by app-->
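For completeness, the supportedRuntime element lives inside the startup section of the configuration file; a minimal app.exe.config could look like this (the version values are just placeholders):

```xml
<configuration>
  <startup>
    <supportedRuntime version="v2.0.40607" />
    <supportedRuntime version="v1.1.4322" /> <!-- fallback; evaluation is top to bottom -->
  </startup>
</configuration>
```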
The CLR always ships with two different builds of the core execution engine: the workstation build (mscorwks.dll) and the server build (mscorsvr.dll). Now, what's the difference between the two? Let's start at the beginning: one of the key architectural elements of the CLR is the garbage collector for automatic memory management (in contrast to unmanaged code, e.g. C with malloc/free or C++ with new/delete). Now, how does garbage collection work? An intermezzo...
<INTERMEZZO Title="Garbage collection in the .NET Framework">
The lifecycle of an object in the CLR is pretty straightforward. First of all, memory has to be allocated, which is done by the newobj instruction in MSIL code, accessible through the new keyword in C# and similar keywords in other managed languages. This instruction is mapped onto low-level calls to allocate memory, just as the malloc function in C has to obtain memory. Next, the object has to be initialized, which is done by calling the constructor of the type. After these two initial steps, the object is ready to be used and starts its lifetime in the hard .NET world :-). Now, at a certain point in time the object is no longer needed. When this is the case, the developer can optionally (!) indicate, explicitly or implicitly, that the object is no longer needed by calling the (optional) Dispose method on the object (cf. the IDisposable pattern and the using keyword in C#). This phase allows clean-up of the allocated resources and the state embedded in the object itself. Once this phase (if needed) has completed, it's time to free the memory, which is done by the garbage collector.
This leaves us with some questions. The first one is how the CLR allocates memory when the newobj instruction is executed. First of all, remark that newobj is kind of an object-oriented virtual machine instruction, so in the end it has to be mapped onto low-level processor instructions asking for n bytes of memory. Therefore, the CLR starts by calculating how much space is needed on the managed heap to hold the object plus the additional information that the CLR uses for housekeeping. Basically, the managed heap looks like an array of objects sitting next to each other, together with a pointer (NextObjPtr) that indicates the next position where an object can be allocated on the heap. This mechanism allows fast memory allocation, as there is no need for a list traversal to find free memory. Memory in the managed environment thus has a contiguous look-n-feel. However, what I told you so far is pretty wishful thinking: the available memory isn't infinite, so at a certain point in time newobj will come to the conclusion that there is no address space left to allocate the object because the heap is full (NextObjPtr + n bytes > end of address space). So far for the easy stuff :-).
The CLR has a so-called ephemeral garbage collector, which means that objects are grouped in generations that are related to an object's lifetime. Basically this means that newly allocated objects will live in generation 0. Assume the CLR has been running for a while and 10 objects have been created, of which 4 are no longer needed. In comes a request for memory allocation on the managed heap through a newobj call. The CLR however has a built-in threshold value for the size of generation 0. Assume this threshold has been reached, so it's not possible to allocate memory directly for the new-to-be-created object in generation 0. At this point in time, the garbage collector comes into play. It takes a look at the objects in the managed heap and concludes that 4 of these objects are no longer needed. Assume these are objects 3, 6, 7 and 8; the managed heap will then be compacted to 1, 2, 4, 5, 9, 10. When the garbage collector has finished its job, these objects no longer belong to generation 0 but are moved to generation 1. As a result, generation 0 is now empty (NextObjPtr is reset to the initial position in generation 0) and the newobj call can continue (the aforementioned condition for available address space is met). Collecting generation 0 generally reclaims enough memory to continue and is quite effective, both because generation 0 is rather small (so analysis and collection go fast) and because a lot of objects only live for a short time. Objects that survive this collection end up in generation 1, which consists of objects with a longer lifetime. It's not difficult to see that generation 1 will grow too, so it has a threshold as well. When that threshold is reached (triggered by a full generation 0 whose surviving objects can't be moved into generation 1), generation 1 (which is larger than generation 0) will be collected too.
Objects that survive this garbage collection process will move to generation 2 (the last generation in the CLR). When this is done, generation 0 is analyzed and collected, promoting the survivor objects to generation 1. It's clear that a generation 1 level collection takes more time than a generation 0 level collection, but also that collections of generation 1 will occur less frequently than the collections of generation 0. In the end, the characteristics of an ephemeral garbage collector can be explained by the very basic assumption that newly created objects are likely to have a short lifetime and old (surviving) objects are likely to live longer. When garbage collection can't free any memory, the CLR will throw an OutOfMemoryException.
Note that the whole garbage collection mechanism is far more complex than what I described above. A few points worth mentioning:
- The garbage collector changes the addresses of objects in memory. Therefore, thread safety is a must, so that other parties do not access wrong memory locations while garbage collection is being performed. As a matter of fact, all managed code threads have to be suspended before the garbage collector can start its mission. In order to do this safely, the CLR has to keep track of a lot of things, to make sure the suspension does not hurt a thread when it is resumed after the garbage collection has taken place. This is based on so-called safe points. If a thread does not reach a safe point in a timely fashion, the CLR performs a trick called thread hijacking to modify the thread's stack. To make things even more complicated, unmanaged code needs special treatment in some cases too. This would bring us too far, so I refer to the book "Applied .NET Framework Programming" for more information about this.
- Large objects (larger than about 85 KB) are allocated on a separate large object heap and start their lifecycle in generation 2. The reason for this special treatment is to reduce shifting of large memory blocks when performing garbage collection. As a result, it's recommended to use large objects only when these are long-lived.
- Objects that are collected during a garbage collector run can have a finalizer (in C# defined by a destructor-like syntax; e.g. ~MyObject). Such an object has a Finalize method that should be called when the object is deleted and ends its lifecycle. The garbage collector uses a finalization list to determine whether an object needs finalization first. If that's the case, the object's pointer is put on a freachable queue (f stands for finalization). Such an object is not (!) considered to be garbage (yet) because it's still reachable (as the name tells us). A special thread in the CLR checks this queue on a regular basis to finalize the objects that are listed in there. During the next run of the garbage collector it can be determined which objects have now become real garbage because the finalization took place and the objects have disappeared from the freachable queue.
In this discussion I made one big assumption: the CLR knows which objects are not needed anymore. How does it do that? The mechanism that allows the CLR to do this is based on the concept of a root, which is "some location" that contains a memory pointer to an object. Examples are global variables, static variables, variables on the current thread stack and CPU registers that point to a reference type. When the Just-In-Time compiler does its work, it maintains an internal table which maps begin and end code offsets (by code I mean native code, the result of jitting) to the root(s) of the method that's executing. During the execution of the native code, the garbage collector can be called (because of a running-out-of-memory condition as explained above). It's clear that this happens at a certain code offset, and each offset is embedded in a region between a begin offset and an end offset. By looking in the table created by the JIT compiler, a set of roots can be found. Besides this table information, we also have thread stack information when the execution is interrupted by the garbage collection process. Using the thread's call stack, the garbage collector can perform a stack walk to find the roots for all of the calling methods, again by using the internal tables constructed by the JIT compiler at runtime for each method. Once this information is known, the garbage collector can create a graph of reachable objects that is used to find out which objects are still needed. While recursing through this graph, objects that are still in use are marked. Any objects left unmarked after this phase are considered to be garbage (as they are no longer reachable from a root) and therefore can be collected. In the last phase, the garbage collector walks over the managed heap and looks for large contiguous regions of free space (i.e. unmarked garbage objects); small regions are skipped, because shifting memory for them wouldn't gain much.
When such a region is found, the garbage collector compacts the heap by shifting the objects in memory. When doing this, the roots are updated because memory addresses of the moved objects have changed of course.
I'll tell more about the garbage collector in a later post, when talking about memory management in CLR Hosting.
Okay, now you should have some picture of how garbage collection works. Back to the difference between the server and workstation builds of the CLR. In the intermezzo I explained how threads have to be suspended so that the garbage collector (thread) can kick in to do its job. On server machines (I'll define the term "server" in a moment) we like to minimize the overhead of this garbage collection as much as we can. Assume you have a machine with multiple processors. In that case it's possible to run garbage collections in parallel on the machine. This is exactly what the server build supports. By default the workstation build will be loaded, and the server build can't even be loaded when you don't have a multiprocessor machine. This explains the second parameter of the CorBindToRuntimeEx function. It can take the following two values:
- L"wks" to load the workstation build (mscorwks.dll), which is the default;
- L"svr" to load the server build (mscorsvr.dll), which requires a multiprocessor machine.
Now, the third parameter of the CorBindToRuntimeEx function is related to the garbage collector too. For the workstation build, there is support for two different modes to run the garbage collector in. The first is the concurrent mode (STARTUP_CONCURRENT_GC), which pays off on multiprocessor machines (remember, we're talking about the workstation build here, not the server build). In this mode, collections happen concurrently on a background thread while the foreground threads keep working. The second is the nonconcurrent mode, which performs collections on the same threads that run the foreground code (on a uniprocessor machine, concurrent mode effectively behaves this way too). The server build always uses nonconcurrent mode, and nonconcurrent mode on the workstation build is the recommended setting for non-UI-intensive apps (e.g. SQL Server 2005 uses nonconcurrent mode).
Warning: there is a trick to load the server build on unsupported configurations (see above) by specifying the "svr" parameter in combination with concurrent mode.
Concurrent collection is a sensible default for the workstation build. If you want to disable it without going through the process of creating a full CLR host, you can use a configuration file (<app.exe>.config) to specify the garbage collector's behavior:
<gcConcurrent enabled="..." /> <!-- true is the default; useful when running in workstation build (default) to turn concurrent collection off, e.g. in batch processing apps -->
<gcServer enabled="..." /> <!-- set to true to load the server build (not the default); however, if not on a multiprocessor machine, the workstation build will be loaded instead -->
Domain-neutral code introduced
A last concept for now is domain-neutral code, whose behavior can be set through the third parameter of CorBindToRuntimeEx too. The three possible options are:
STARTUP_LOADER_OPTIMIZATION_SINGLE_DOMAIN // no domain neutral loading
STARTUP_LOADER_OPTIMIZATION_MULTI_DOMAIN // all domain neutral loading
STARTUP_LOADER_OPTIMIZATION_MULTI_DOMAIN_HOST // strong name domain neutral loading
Now, what is domain-neutral code? Let's start with a refresher on the mechanism of DLLs in Windows. DLL stands for Dynamic Link Library: a library of code that can be used by various applications on the machine. The good is the idea; the bad and the ugly is the DLL Hell that results from versioning troubles. However, the concept is fine in the sense that when multiple applications use the same DLL at the same time, the OS can keep the instructions of the DLL in memory only once, thereby reducing the working set of the applications that use it. One copy of the code (which does not change) is sufficient for all apps that depend on it to execute.
Domain-neutral code is the .NET equivalent of this kind of sharing of common code across multiple dependent applications. In managed code, things are a little more complicated however, due to the JIT compiler. CLR Hosting allows you to customize the behavior of domain-neutral assembly loading through the startup parameters and through the implementation of the IHostControl interface, which I'll explain later on in this episode. The three possible startup parameter values tell the CLR to disable all domain-neutral loading (except for mscorlib, the core of the class library, which is always loaded domain-neutrally), to enable domain-neutral loading for all assemblies, or to take a middle road based on strong-named assemblies. In fact, all assemblies that are to be loaded domain-neutrally have to form a "closure": all referenced assemblies of a given domain-neutrally loaded assembly have to be loaded domain-neutrally too. These three default startup parameter values follow this rule.
Q&A: Why aren't all assemblies loaded domain-neutrally, to reduce the overall working set as much as possible? The answer is that once an assembly is loaded domain-neutrally, it can't be unloaded anymore without shutting down the entire process. It's clear that this behavior is not desirable for CLR hosts such as ASP.NET or SQL Server 2005, where it should be possible to replace an assembly without restarting the server (service) to free resources.
Starting and stopping the CLR
The call to CorBindToRuntimeEx actually initializes the CLR. The result is an out parameter of type ICLRRuntimeHost that can be used for further interaction with the CLR. The first call you'll make is a call to the Start method to start the mission of the CLR in your process. Once this is done, there is no real way back. Although you can stop the CLR by calling the Stop method, it's not possible to restart it, nor is it possible to completely unload the CLR from the process. Once the CLR has been loaded in a process, it can't be reinitialized or restarted without creating a new process.
To finish this post on CLR Hosting basics, I want to tell you something about delay loading. It's clear that loading the CLR takes some time to complete, and maybe you come to the conclusion that you have been doing this work for nothing, because during the lifetime of the process no managed code has to be executed. Suppose for example you want to offer managed code support in a database engine. It would be a waste of resources to always load the CLR by default when the database engine starts if nobody needs it later on. Instead, it would be nicer to load the CLR only when it's actually needed (e.g. because a COM component calls into managed code). For that purpose the CLR Hosting API provides some mechanisms to defer loading.
The first (easy) way to defer loading is to prepare the loading but wait until it's actually needed. This is called deferred startup and can be initiated by using the STARTUP_LOADER_SETPREFERENCE flag as the third parameter of the CorBindToRuntimeEx function. The only thing this does is save the value passed for the version parameter. When the CLR needs to be loaded, that version will be used (e.g. by calling CorBindToRuntimeEx a second time). This is however rather limited, in the sense that you can't control other startup parameters for the CLR.
To support that, a function called LockClrVersion is available in the startup shim (see mscoree.IDL) that takes a callback to a function that needs to be called when the real initialization takes place, giving you as a host a chance to manipulate various settings. The signature looks as follows:
STDAPI LockClrVersion(FLockClrVersionCallback hostCallback,
                      FLockClrVersionCallback *pBeginHostSetup,
                      FLockClrVersionCallback *pEndHostSetup);
The first parameter is very straightforward. The next two parameters receive pointers to callback functions (provided to us by the shim) that are to be called before and after initialization. These allow the CLR to do its housekeeping: they tell the shim which thread owns the initialization (the loading process is granted exclusively to one thread) and block any managed code requests that could interfere with it. Skeleton code for lazy CLR loading takes the following form:
FLockClrVersionCallback begin_init, end_init;

// callback invoked by the shim when the CLR actually needs to be loaded
HRESULT STDAPICALLTYPE init()
{
    begin_init();  // we're in control; notify the shim to grant us the exclusive initialization right
    // CorBindToRuntimeEx stuff goes here
    end_init();    // mission completed; tell the shim we're ready
    return S_OK;
}

// tell the shim we want to take control when needed
LockClrVersion(init, &begin_init, &end_init);
// wait till something happens that causes the CLR to be loaded
Initialization and startup of the CLR is fully customizable as you saw in this post, even in a delayed loading scenario. In the next posts I'll show how to control the behavior of the CLR even further using self-written CLR Hosting API implementations.