Sunday, July 24, 2005 12:54 AM bart

CLR Hosting - part 3 (memory management)

Introduction

The first specific aspect of CLR Hosting I want to cover in this blog post series is the memory management part of the CLR Hosting APIs. It all started some years ago when the SQL Server team began looking at ways to integrate the CLR into the SQL Server database engine, so that developers could use their knowledge of managed code to extend the functionality of the database by writing stored procedures, triggers, user-defined functions and user-defined types in their favorite languages. One of the reasons for considering this integration was undoubtedly the gain in developer productivity and the possibility of integrating database development more tightly with the Visual Studio IDE tools, but the characteristics of the CLR and the .NET Framework as such also played an important role in this decision. For example, extending the functionality of the database using an extended stored procedure written in C++ is a very dangerous thing, whereas doing the same in a managed environment using the CLR eliminates a lot of the risks introduced by running custom code in the core of the database engine. Things such as memory management, type safety and exception handling mechanisms are great aspects of the CLR and were considered a welcome gift in the SQL Server developer world.

However, integrating the CLR into a product like SQL Server, which has very high demands in terms of performance, scalability, reliability and data integrity (transactions, you know), is not as easy as it might look at first. One example of the difficulties that kick in is the management of memory allocation and everything around it. Why is this? Well, as I said, SQL Server has a high need for performance and wants to do everything it can to keep performance high and to avoid conditions that can undermine the runtime quality attributes of the product. The component that does this in SQL Server is called the SQL Server OS, which can be seen as the heart of the product. As the name suggests, this core part of the database engine acts like a mini OS: it's responsible for all threading, locking and scheduling (User Mode Scheduler, fibers, cooperative scheduling) as well as for memory management. The latter is what I'll be focusing on in this post.

 

Case study: How SQL Server manages memory

In order to ensure the performance and scalability of a SQL Server instance, the database engine needs a way to control and track all of the memory allocations that are made. It does so to ensure that the amount of allocated memory stays within bounds (by default, SQL Server will take as much memory as it can get) and to prevent pages from being swapped to disk, because the speed gap between primary memory (RAM) and secondary memory (disk) can lead to a serious performance penalty. Generally speaking, Windows offers applications three ways of allocating memory: virtual memory, shared memory and heaps. The first one is the primary means by which SQL Server allocates memory when needed. The virtual memory functionality in the Windows OS can be summarized by listing some key functions of the Win32 API for working with this mechanism:

  • VirtualAlloc - allocates memory in the address space of the process that requests the memory (there is an extension to this - suffixed with Ex - that allows more complex memory allocation)
  • VirtualFree - used to free the memory that was allocated previously using VirtualAlloc
  • VirtualLock and VirtualUnlock are used to lock virtual memory pages in the physical memory
  • VirtualProtect is used to change the protection attributes of a virtual memory page; examples of these attributes are PAGE_EXECUTE, PAGE_EXECUTE_READ, PAGE_EXECUTE_READWRITE, etc., including support for copy-on-write mechanisms and so on
  • VirtualQuery - obtains information about a range of pages in the virtual address space of the calling process

So, when we do want to embed the CLR inside SQL Server, we need to make sure SQL Server still can track all of the memory that's allocated to provide database functionality (e.g. in a managed stored procedure). The CLR Hosting API allows us to do this, by providing a memory management provider that the CLR will call every time it needs memory to do its job. Instead of sending these calls to the Win32 functions directly, these are sent to the SQL Server OS (more specifically the Memory Management Subsystem in there) to handle the request for memory, given the constraints of the SQL Server configuration concerning memory (e.g. min/max mem settings).

I won't cover heaps and shared memory in much detail here. Instead I'll just mention their most important aspects. Let's start with the heap. A heap is controlled by a heap manager and consists of a region of memory divided into pages of reserved space. The heap manager can be called to get a piece of memory to work with. Typically heaps are used when objects or structures of similar size need to be allocated in memory. Examples of heap usage are the new operator in C++ and the malloc function in C. The basic functions for heap management are:

  • HeapCreate and HeapDestroy are used to create and destroy so-called private heaps.
  • HeapAlloc allocates memory from the heap (cf. C's malloc)
  • HeapFree releases memory from the heap (cf. C's free)

Shared memory is the last mechanism that Windows offers to work with memory. The idea of shared memory is to allocate a memory region and to give multiple processes shared access to it, so it can be used for interprocess data exchange. SQL Server uses shared memory as a fast way to communicate with client applications on the same machine, bypassing protocols such as TCP/IP (and therefore the whole network stack) or named pipes, through the Net-Library related functionality. The basic functions include:

  • CreateFileMapping to create a so-called section object to be used with either shared memory or a memory-mapped file
  • MapViewOfFile maps a view of a file mapping into the address space of the calling process
  • FlushViewOfFile writes modified pages in a mapped view to disk

Details about all of these functions can be found in the Platform SDK of Windows XP and Windows Server 2003.

 

Controlling Virtual Memory

In the section above, I presented a list of functions in the Win32 API that are used to manage virtual memory, including VirtualAlloc, VirtualFree, VirtualQuery and VirtualProtect. The CLR Hosting API provides an interface called IHostMemoryManager that exposes similar functionality to perform this kind of work. So, when the CLR needs to do something related to virtual memory, it will check whether a custom memory manager was hooked in by the host. If that's the case, the CLR will call that manager instead of calling the Win32 API directly to obtain, free, ... virtual memory.

The full interface of IHostMemoryManager is shown below:

interface IHostMemoryManager : IUnknown
{
    HRESULT CreateMalloc([in] BOOL fThreadSafe,
                         [out] IHostMalloc **ppMalloc);   
   
    HRESULT VirtualAlloc([in] void*       pAddress,
                         [in] SIZE_T      dwSize,
                         [in] DWORD       flAllocationType,
                         [in] DWORD       flProtect,
                         [in] EMemoryCriticalLevel eCriticalLevel,
                         [out] void**     ppMem);
   
    HRESULT VirtualFree([in] LPVOID      lpAddress,
                        [in] SIZE_T      dwSize,
                        [in] DWORD       dwFreeType);
   
    HRESULT VirtualQuery([in] void *     lpAddress,
                         [out] void*     lpBuffer,
                         [in] SIZE_T     dwLength,
                         [out] SIZE_T *  pResult);
   
    HRESULT VirtualProtect([in] void *       lpAddress,
                           [in] SIZE_T       dwSize,
                           [in] DWORD        flNewProtect,
                           [out] DWORD *     pflOldProtect);
   
    HRESULT GetMemoryLoad([out] DWORD* pMemoryLoad,
                          [out] SIZE_T *pAvailableBytes);
   
    HRESULT RegisterMemoryNotificationCallback([in] ICLRMemoryNotificationCallback * pCallback);
}

The most interesting functions here are the ones that start with Virtual, as these are the equivalents of the well-known Win32 API functions. In fact, the Platform SDK documentation for those Win32 functions tells you most of what you need to know to understand the corresponding functions in the CLR Hosting API.

One significant difference between the Win32 and CLR functions is in the parameters of VirtualAlloc. VirtualAlloc in the CLR Hosting API has an additional input parameter of the enumeration type EMemoryCriticalLevel:

typedef enum
{
    eTaskCritical = 0,
    eAppDomainCritical = 1,
    eProcessCritical = 2
} EMemoryCriticalLevel;

The CLR uses this parameter to inform the host about the consequences of denying the memory request. For example, if the eProcessCritical value is supplied, denying the allocation can result in termination of the process. The numeric values of the enumeration increase with the severity of a failed allocation (process > app domain > task).

The VirtualAlloc function can return several values, including S_OK to indicate that the memory allocation succeeded, E_FAIL to report a catastrophic event (causing the CLR in the process to become unavailable) and E_OUTOFMEMORY (the host has decided the CLR can't get any memory right now). In the Win32 API the return value is the pointer to the allocated memory itself; the hosting version returns an HRESULT and hands back the pointer through the ppMem output parameter. In a similar fashion, the other Virtual* functions provide logical wrappers around the equivalent Win32 functions.
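To make the role of the critical level a bit more concrete, here is a minimal sketch of the kind of decision a host could take inside its VirtualAlloc implementation. Everything in it is an assumption for illustration: HostBudget, its reserve policy and the HResult stand-ins are hypothetical names (only the two HRESULT values and the enumeration mirror the real definitions), and the types are redeclared so the sketch compiles without the Windows SDK or mscoree.h. A real host would go on to call the Win32 VirtualAlloc after a positive decision.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-ins so the sketch compiles without the Windows SDK or mscoree.h.
using HResult = std::int32_t;
constexpr HResult kOk = 0;                                           // S_OK
constexpr HResult kOutOfMemory = static_cast<HResult>(0x8007000Eu);  // E_OUTOFMEMORY

enum EMemoryCriticalLevel { eTaskCritical = 0, eAppDomainCritical = 1, eProcessCritical = 2 };

// Hypothetical host-side budget: 'reserve' bytes are held back for requests
// that are more severe than task-critical.
class HostBudget {
public:
    HostBudget(std::size_t limit, std::size_t reserve)
        : limit_(limit), reserve_(reserve), used_(0) {
        assert(reserve <= limit);
    }

    // The decision a host's VirtualAlloc could take before touching the OS.
    HResult Request(std::size_t size, EMemoryCriticalLevel level) {
        const std::size_t ceiling =
            (level == eTaskCritical) ? limit_ - reserve_ : limit_;
        if (used_ > ceiling || size > ceiling - used_) return kOutOfMemory;
        used_ += size;
        return kOk;
    }

    // Called from the host's VirtualFree counterpart.
    void Release(std::size_t size) {
        assert(size <= used_);
        used_ -= size;
    }

    std::size_t Used() const { return used_; }

private:
    std::size_t limit_;
    std::size_t reserve_;
    std::size_t used_;
};
```

With a limit of 100 bytes and a reserve of 20, a task-critical request is refused once 80 bytes are in use, while a process-critical request can still draw on the last 20. Whether a reserve like this is a sensible policy depends entirely on the host; the point is only that the critical level gives the host something to base the decision on.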

 

Heap management

IHostMemoryManager isn't responsible for heap management itself. Instead, it provides a function, CreateMalloc, to obtain an IHostMalloc object that controls heap memory:

    HRESULT CreateMalloc([in] BOOL fThreadSafe,
                         [out] IHostMalloc **ppMalloc);

The first parameter indicates whether thread safety is needed. The .NET Framework 2.0 build I'm using is actually a little outdated, as newer releases (> 2.0.40607) have replaced this with a MALLOC_TYPE enumeration consisting of only two values. I won't cover this in detail now. The IHostMalloc interface looks as follows, offering three functions:

interface IHostMalloc : IUnknown
{
    HRESULT Alloc([in] SIZE_T  cbSize,
                  [in] EMemoryCriticalLevel eCriticalLevel,
                  [out] void** ppMem);
   
    HRESULT DebugAlloc([in] SIZE_T      cbSize,
                      [in] EMemoryCriticalLevel       eCriticalLevel,
                      [in] char*       pszFileName,
                      [in] int         iLineNo,
                      [out] void**     ppMem);

    HRESULT Free([in] void* pMem);
}

Alloc should be self-explanatory, I guess, given the explanation of the EMemoryCriticalLevel type. DebugAlloc is used in debugging scenarios and lets the caller tie the allocation to a source code file and a line number in that file. The Free function should ring a bell too for everyone who has ever done malloc/free stuff in plain old C.
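As an illustration of what sits behind Alloc and Free, the sketch below tracks heap allocations against a byte limit. It is not an IHostMalloc implementation: TrackingMalloc is a hypothetical class, the COM plumbing and the critical level are left out, it returns bool instead of an HRESULT, and it is not thread safe (which is exactly what the fThreadSafe parameter of CreateMalloc is about).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_map>

// Hypothetical tracking heap in the spirit of IHostMalloc::Alloc/Free.
class TrackingMalloc {
public:
    explicit TrackingMalloc(std::size_t limit) : limit_(limit), inUse_(0) {}

    // Returns true and stores the block in *ppMem, or returns false
    // (leaving *ppMem null) when the limit would be exceeded.
    bool Alloc(std::size_t cbSize, void** ppMem) {
        assert(ppMem != nullptr);
        *ppMem = nullptr;
        if (cbSize > limit_ - inUse_) return false;
        void* p = std::malloc(cbSize == 0 ? 1 : cbSize);
        if (p == nullptr) return false;
        sizes_[p] = cbSize;   // remember the size so Free can give it back
        inUse_ += cbSize;
        *ppMem = p;
        return true;
    }

    // Returns false when the pointer was not handed out by this object.
    bool Free(void* pMem) {
        auto it = sizes_.find(pMem);
        if (it == sizes_.end()) return false;
        inUse_ -= it->second;
        sizes_.erase(it);
        std::free(pMem);
        return true;
    }

    std::size_t BytesInUse() const { return inUse_; }

private:
    std::size_t limit_;
    std::size_t inUse_;
    std::unordered_map<void*, std::size_t> sizes_;
};
```

The size map is there because Free, just like its IHostMalloc counterpart, only receives the pointer; a host that wants accurate totals has to remember the size of each block itself.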

 

File mapping

In the interface listing of IHostMemoryManager above, this part is not present yet (again because of the use of an early build). However, the second release of the .NET Framework CLR Hosting APIs will allow you as a runtime host to obtain information about the needs of the CLR concerning file mappings. To load and execute assemblies, the CLR uses the MapViewOfFile Win32 API function. It's clear that loading an assembly this way (I'll cover assembly loading in a later post in this CLR Hosting series on my blog) requires memory. As we want the host (think about SQL Server in particular if it helps to clarify things) to know about every memory allocation and release, the CLR needs to report the allocation (and release) of virtual address space, e.g. so that the host can count the total amount of address space used by the hosted CLR inside the host process. The CLR does this through the following functions:

    HRESULT NeedsVirtualAddressSpace([in] LPVOID       startAddress,
                                     [in] SIZE_T       size);


    HRESULT AcquiredVirtualAddressSpace([in] LPVOID       startAddress,
                                        [in] SIZE_T       size);


    HRESULT ReleasedVirtualAddressSpace([in] LPVOID       startAddress);

So, when the CLR uses MapViewOfFile, it calls AcquiredVirtualAddressSpace to report this to the host, and when the UnmapViewOfFile Win32 function has been called, this is reported through ReleasedVirtualAddressSpace. The NeedsVirtualAddressSpace function gets called when a call to MapViewOfFile failed because of a low-memory condition. This gives the CLR host a chance to make memory available so that the CLR can retry the operation.
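The bookkeeping a host might do in response to these notifications can be sketched as follows. AddressSpaceLedger is a hypothetical name, addresses are modelled as plain integers, and the HRESULT return values and COM plumbing of the real functions are left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical ledger of address ranges the CLR reported as mapped.
class AddressSpaceLedger {
public:
    // Counterpart of AcquiredVirtualAddressSpace(startAddress, size).
    void Acquired(std::uintptr_t start, std::size_t size) {
        auto inserted = regions_.emplace(start, size);
        if (inserted.second) total_ += size;  // ignore duplicate reports
    }

    // Counterpart of ReleasedVirtualAddressSpace(startAddress).
    void Released(std::uintptr_t start) {
        auto it = regions_.find(start);
        if (it == regions_.end()) return;     // unknown region, nothing to do
        total_ -= it->second;
        regions_.erase(it);
    }

    std::size_t TotalBytes() const { return total_; }
    std::size_t RegionCount() const { return regions_.size(); }

private:
    std::map<std::uintptr_t, std::size_t> regions_;
    std::size_t total_ = 0;
};
```

Note that the release notification carries only the start address, so the host has to remember the size that was reported when the region was acquired; that is why the ledger keeps a map rather than a single running counter.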

 

About garbage collection and hints to the CLR

This leaves us with two functions in the IHostMemoryManager interface that we haven't explained yet:

    HRESULT GetMemoryLoad([out] DWORD* pMemoryLoad,
                          [out] SIZE_T *pAvailableBytes);
   
    HRESULT RegisterMemoryNotificationCallback([in] ICLRMemoryNotificationCallback * pCallback);

The first function, GetMemoryLoad, is the CLR Hosting API equivalent of the Win32 function GlobalMemoryStatus. Both parameters of the CLR function are output parameters, which means the host is the one telling the CLR something. The first parameter has to report the memory load as a percentage, whereas the second one has to give the number of bytes that the CLR will still be able to allocate through the memory manager before an allocation error occurs. The CLR calls this function regularly to get a picture of the status of the memory managed by the host. It uses these values to determine when the next round of garbage collection should be launched.
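For a host that enforces a fixed memory limit, the two output values could be derived along these lines. This is only a sketch under that assumption: ComputeMemoryLoad and the MemoryLoad struct are hypothetical helpers, not part of the hosting API, and a real GetMemoryLoad would copy the results into pMemoryLoad and pAvailableBytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical pair of values a host would report through GetMemoryLoad.
struct MemoryLoad {
    std::uint32_t percent;        // memory load, 0..100
    std::size_t availableBytes;   // bytes that can still be allocated
};

// Derives both values from the bytes in use and the host's configured limit.
MemoryLoad ComputeMemoryLoad(std::size_t usedBytes, std::size_t limitBytes) {
    assert(limitBytes > 0);
    if (usedBytes >= limitBytes) return MemoryLoad{100, 0};
    // Floating point avoids overflow of usedBytes * 100 for very large sizes.
    const double ratio =
        static_cast<double>(usedBytes) * 100.0 / static_cast<double>(limitBytes);
    return MemoryLoad{static_cast<std::uint32_t>(ratio), limitBytes - usedBytes};
}
```

A host that is over its limit reports a load of 100 percent and zero available bytes, which is the clearest signal it can give the CLR that a collection would be welcome.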

The second function is used by the CLR to pass a pointer to an ICLRMemoryNotificationCallback object (remember the convention that interfaces starting with ICLR are provided by the CLR, whereas the ones starting with IHost are provided by the host developer, that is, you). The implementation of the function will be pretty straightforward: "remember" the passed-in pointer in some global variable for later use. The functionality in the ICLRMemoryNotificationCallback interface is very limited:

interface ICLRMemoryNotificationCallback : IUnknown
{
    // Callback by Host on out of memory to request runtime to free memory.
    // Runtime will do a GC and Wait for PendingFinalizer.
    HRESULT OnMemoryNotification([in] EMemoryAvailable eMemoryAvailable);
}

Basically, the OnMemoryNotification function allows the host to tell the CLR about an out-of-memory condition (one that is expected to happen "soon"). In reaction to this reported status, the CLR can invoke the garbage collector to free memory. The possible values are declared in the EMemoryAvailable enumeration:

typedef enum
{
    eMemoryAvailableLow = 1,
    eMemoryAvailableNeutral = 2,
    eMemoryAvailableHigh = 3
} EMemoryAvailable;

Currently, only the first value (a low memory condition) will trigger the GC, but in the future the CLR may use the other values to drive the scheduling of the next GC round or so.
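The register-then-notify flow can be sketched like this. The interface is replaced by a plain abstract class without COM plumbing, MemoryPressureMonitor and RecordingCallback are hypothetical names, and the 90 and 50 percent thresholds are made up for the example; only the EMemoryAvailable values come from the listing above.

```cpp
#include <cstdint>

enum EMemoryAvailable {
    eMemoryAvailableLow = 1,
    eMemoryAvailableNeutral = 2,
    eMemoryAvailableHigh = 3
};

// Stand-in for ICLRMemoryNotificationCallback (no IUnknown, no HRESULT).
struct MemoryNotificationCallback {
    virtual ~MemoryNotificationCallback() = default;
    virtual void OnMemoryNotification(EMemoryAvailable level) = 0;
};

// Hypothetical host-side helper that remembers the callback and uses it.
class MemoryPressureMonitor {
public:
    // What RegisterMemoryNotificationCallback would do: keep the pointer.
    void RegisterCallback(MemoryNotificationCallback* callback) { callback_ = callback; }

    // Illustrative thresholds, not taken from the CLR or SQL Server.
    static EMemoryAvailable Classify(std::uint32_t loadPercent) {
        if (loadPercent >= 90) return eMemoryAvailableLow;
        if (loadPercent <= 50) return eMemoryAvailableHigh;
        return eMemoryAvailableNeutral;
    }

    // Returns true when a notification was delivered to the runtime.
    bool Report(std::uint32_t loadPercent) {
        if (callback_ == nullptr) return false;
        callback_->OnMemoryNotification(Classify(loadPercent));
        return true;
    }

private:
    MemoryNotificationCallback* callback_ = nullptr;
};

// Test double playing the role of the CLR side of the conversation.
struct RecordingCallback : MemoryNotificationCallback {
    int calls = 0;
    EMemoryAvailable last = eMemoryAvailableNeutral;
    void OnMemoryNotification(EMemoryAvailable level) override {
        ++calls;
        last = level;
    }
};
```

A real host would hold a reference-counted pointer (AddRef on registration, Release on shutdown) and would typically call Report from whatever component watches its memory budget.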

 

Hook in your memory manager

Assume you've written your memory manager in a class called MyHostMemoryManager (implementing the IHostMemoryManager interface). The next step is to tell the CLR about this during the initialization phase. To do this, you have to implement the IHostControl interface, for example in a class called MyHostControl. The function you need to implement is called GetHostManager.

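HRESULT __stdcall MyHostControl::GetHostManager(REFIID id, void **ppHostManager)
{
   if (ppHostManager == NULL)
      return E_POINTER;

   if (id == IID_IHostMemoryManager)
   {
      // Hand out the memory manager. A production host would create it once,
      // keep it in a member and AddRef it here instead of allocating per call.
      MyHostMemoryManager *pMemoryManager = new MyHostMemoryManager();
      *ppHostManager = static_cast<IHostMemoryManager*>(pMemoryManager);
      return S_OK;
   }

   // Tell the CLR we don't provide the requested manager.
   *ppHostManager = NULL;
   return E_NOINTERFACE;
}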

Now, the CLR is able to obtain an instance of the memory manager you've implemented. The startup code will have the following format:

ICLRRuntimeHost *pHost = NULL;
HRESULT res = CorBindToRuntimeEx(L"v2.0.40607", L"wks", STARTUP_CONCURRENT_GC, CLSID_CLRRuntimeHost, IID_ICLRRuntimeHost, (PVOID*) &pHost);

assert(SUCCEEDED(res));

MyHostControl *pHostControl = new MyHostControl();
pHost->SetHostControl((IHostControl*) pHostControl);

This should do the trick. Once SetHostControl has been called, the CLR will start asking your CLR host which responsibilities it wants to take on. When asked for a memory manager, you return your memory manager object as explained above. In reaction to this, the CLR will call the RegisterMemoryNotificationCallback function to hand you a pointer to the callback object you can use to report memory status to the CLR (through OnMemoryNotification). When pHost->Start() is called later on, the memory manager will be used to handle memory requests made by the CLR.

 

Controlling the garbage collector

Controlling the garbage collector of the CLR involves two parties: the CLR and the host. For this reason, two interfaces are present in mscoree:

interface ICLRGCManager : IUnknown
{
    /*
     * Forces a collection to occur for the given generation, regardless of
     * current GC statistics.  A value of -1 means collect all generations.
     */
    HRESULT Collect([in] LONG Generation);
   
    /*
     * Returns a set of current statistics about the state of the GC system.
     * These values can then be used by a smart allocation system to help the
     * GC run, by say adding more memory or forcing a collection.
     */
    HRESULT GetStats([in][out] COR_GC_STATS *pStats);
   
    /*
     * Sets the segment size and gen 0 maximum size.  This value may only be
     * specified once and will not change if called later.
     */
    HRESULT SetGCStartupLimits([in] DWORD SegmentSize, [in] DWORD MaxGen0Size);
}

interface IHostGCManager : IUnknown
{
    // Notification that the thread making the call is about to block, perhaps for
    // a GC or other suspension.  This gives the host an opportunity to re-schedule
    // the thread for unmanaged tasks.
    HRESULT ThreadIsBlockingForSuspension();

    // Notification that the runtime is beginning a thread suspension for a GC or
    // other suspension.  Do not reschedule this thread!
    HRESULT SuspensionStarting();

    // Notification that the runtime is resuming threads after a GC or other
    // suspension.      Do not reschedule this thread!
    HRESULT SuspensionEnding(DWORD Generation);
}

ICLRGCManager is the easier of the two, because you don't have to implement it yourself. Instead, you can ask the ICLRControl object for the CLR-provided garbage collector manager. The mechanism to do this is pretty straightforward:

ICLRControl *pCLRControl = NULL;
res = pHost->GetCLRControl(&pCLRControl);   // the runtime host hands out ICLRControl
ICLRGCManager *pCLRGCManager = NULL;
if (SUCCEEDED(res)) res = pCLRControl->GetCLRManager(IID_ICLRGCManager, (void**) &pCLRGCManager);

Once you have a reference to the ICLRGCManager object, you can use it to perform the following actions:

  • Collect - ask the GC to collect a certain generation (see the part 2 post of the CLR Hosting series for more information about the garbage collector generations model); use -1 to collect all of the generations. Note: in managed code you can call System.GC.Collect();
  • GetStats - pass in an object of type COR_GC_STATS telling the CLR which statistics - of the COR_GC_STAT_TYPES enumeration - you want (COR_GC_COUNTS and/or COR_GC_MEMORYUSAGE) and obtain the stats through the same object;
  • SetGCStartupLimits - control the segment size (>= 4 MB; multiple of 1 MB) and the size of generation 0 (>= 64 KB).
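Because SetGCStartupLimits can only be specified once, it can be worth checking the values before handing them to the CLR. The helper below encodes the constraints listed above; AreGCStartupLimitsValid and kOneMB are hypothetical names for this sketch, and the CLR itself remains the final judge of what it accepts.

```cpp
#include <cstdint>

constexpr std::uint32_t kOneMB = 1024u * 1024u;

// Checks the constraints from the list above: the segment size must be at
// least 4 MB and a multiple of 1 MB, and generation 0 must be at least 64 KB.
bool AreGCStartupLimitsValid(std::uint32_t segmentSize, std::uint32_t maxGen0Size) {
    const bool segmentOk = segmentSize >= 4 * kOneMB && segmentSize % kOneMB == 0;
    const bool gen0Ok = maxGen0Size >= 64u * 1024u;
    return segmentOk && gen0Ok;
}
```

A host could call this right before pCLRGCManager->SetGCStartupLimits(...) and fall back to the CLR defaults (by not calling the function at all) when the configured values don't pass.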

In part 2 of the CLR Hosting episode on my blog I told you a bit about the GC's need to suspend threads to reach a safe point and to kick in the garbage collection in a safe way. The CLI implementation of the garbage collector mentions the following in the gcee.cpp file:

// The contract between GC and the EE, for starting and finishing a GC is as follows:
//
//      LockThreadStore
//      SetGCInProgress
//      SuspendEE
//
//      ... perform the GC ...
//
//      SetGCDone
//      RestartEE
//      UnlockThreadStore

We're now talking about the SuspendEE (execution engine) and RestartEE steps:

void GCHeap::SuspendEE(SUSPEND_REASON reason)
{
    //...
    ThreadStore::TrapReturningThreads(TRUE);
    //...
    hr = Thread::SysSuspendForGC(reason);
}

void GCHeap::RestartEE(BOOL bFinishedGC, BOOL SuspendSucceded)
{
    //...
    ThreadStore::TrapReturningThreads(FALSE);
    //...
    Thread::SysResumeFromGC(bFinishedGC, SuspendSucceded);
    //...
}

In the introduction of this post I explained SQL Server's need to control almost everything through its own "SQL OS" layer. This "everything" includes thread and synchronization management. If a host (such as SQL Server 2005) has this need, it likely wants to know about the thread manipulation that the CLR's GC performs in order to reach a safe point and run. For that purpose, a host can provide an IHostGCManager implementation, which offers three functions:

  • ThreadIsBlockingForSuspension - Notifies the host that the thread calling this method is about to block, e.g. for a garbage collection. This relates to the SysSuspendForGC call shown above. As explained in the comments for the function, this gives the host the opportunity to reschedule the thread for other work and so use resources as effectively as possible (which certainly matters to SQL Server 2005 because of the performance and scalability needs explained earlier).
  • SuspensionStarting - The thread suspension is starting now.
  • SuspensionEnding - Tells the host that the thread suspension is ending, and which generation was garbage collected during the suspension.
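To give an idea of what a host can do with these three notifications, here is a minimal sketch that just records them. SuspensionTracker is a hypothetical class, not an IHostGCManager implementation: the COM plumbing and HRESULT return values are omitted, and a real host would need to make the state thread safe since the calls arrive on runtime threads.

```cpp
#include <cstdint>

// Hypothetical recorder for the three IHostGCManager notifications.
class SuspensionTracker {
public:
    // A thread announced that it is about to block for a suspension.
    void ThreadIsBlockingForSuspension() { ++blockingNotifications_; }

    // The runtime started suspending threads.
    void SuspensionStarting() { suspensionInProgress_ = true; }

    // The runtime resumed threads; 'generation' is the value it passed along.
    void SuspensionEnding(std::uint32_t generation) {
        suspensionInProgress_ = false;
        lastGeneration_ = generation;
        ++completedSuspensions_;
    }

    bool IsSuspensionInProgress() const { return suspensionInProgress_; }
    std::uint32_t LastGeneration() const { return lastGeneration_; }
    unsigned CompletedSuspensions() const { return completedSuspensions_; }
    unsigned BlockingNotifications() const { return blockingNotifications_; }

private:
    bool suspensionInProgress_ = false;
    std::uint32_t lastGeneration_ = 0;
    unsigned completedSuspensions_ = 0;
    unsigned blockingNotifications_ = 0;
};
```

A scheduler like the one in SQL Server could consult IsSuspensionInProgress before deciding what to run next, and the counters are the kind of data a host might expose for performance tuning.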

 

Conclusion

The CLR Hosting APIs allow you as a developer of a CLR host to control virtually any aspect of memory management in relation to the CLR. You can take advantage of this to control memory limits, to track memory usage for statistics and performance tuning, and so on.
