Saturday, December 18, 2004 2:25 AM bart

How to boost your managed code performance? Some tips...

Over the last few days, I've been evaluating a medium-sized project with respect to performance, more specifically at the level of CLR managed code. In this post, I'll share a collection of tips to make your app perform better.

1. Let garbage collection do its work

Garbage collection is a great thing for various reasons (of which automatic memory management is the most important one, of course). For the realtime software people, I know the GC is not a blessing for that kind of software, but it was never meant to be used for that class of scenarios. I won't cover the GC here in full detail, but what you should know is that objects are grouped according to their lifetime and volatility in what we call generations. Currently, there are three generations (Gen 0 - 2) and objects are promoted to higher generations as they live longer (0 --> 1 --> 2); higher generations are collected less frequently. Garbage collection and finalization go hand in hand. Finalization is the process that is executed on an object before the managed memory of the object is reclaimed and has - in C# - the same syntax as a destructor in C++ (although the semantics are completely different). More info on finalization, object disposal and the GC can be found on the MSDN website. The GC is smart enough to decide when to work, so there is no need to force it to start collecting objects using GC.Collect(). Consequently, the GC.WaitForPendingFinalizers() method should not be called either (unless you have a really, really serious reason - I can't think of one right now - to do so). There are two GC flavors on the system: one for workstations and one for servers with multiple processors (needed to split the heap over the various CPUs and to perform GC according to this model).

To Finalize or not to Finalize

Don't implement the Finalize method unless you have to: the GC needs to promote an object that supports finalization to an older generation in order to be able to execute its finalizer. This behavior has the undesired side effect that such objects become long-lived, something you don't want if your object's lifetime is short. When you do add a "destructor", always implement the IDisposable interface as well (see the sample further on). Only implement finalization when it is really needed (maybe disposal is sufficient) and keep the code simple and short! If your object is also used from multiple threads, take special care of synchronization (cleanup code should be threadsafe if the type is).

Dispose correctly

If you've used the using syntax in C#, you probably know that it only works with IDisposable classes. I don't mean "using" to import a namespace; rather, I mean this:

using (SqlConnection conn = new SqlConnection(dsn)) {
   //use conn

}

Disposable objects are used when external resources are held that need to be freed explicitly by the caller (in the case of "using", the disposing is done automagically and implicitly). You'll see this dispose pattern applied in various places, such as database connections. Why don't we use finalization rather than disposal on this kind of objects? Well, the reason is that finalization is a GC-initiated process that is executed asynchronously, whereas the dispose pattern gives you the control needed to release resources in a timely fashion. One thing to keep in mind is that the GC can still request finalization of the object, so this should be suppressed once the Dispose method has run. In order to do so, don't ever forget to call GC.SuppressFinalize in the Dispose method. As a side remark, database connections et al support a Close method as well, which is in most cases just a context-specific synonym for disposing the object (and therefore more self-explanatory to the developer). Another tip is to keep track of whether Dispose has already been called on an object; a second Dispose call should then simply be a no-op, while calls to the object's other members should fail with an ObjectDisposedException. Typically, the creation of an IDisposable object is more complex than just writing the Dispose method. A typical skeleton is this:

class MyDisposableClass : IDisposable //you can seal the class if you want
{
    private bool _hasDisposed = false;

    public void Dispose() {
       if (!_hasDisposed) {
          Dispose(true);
          GC.SuppressFinalize(this);
       }
    }

    //Special method for object disposal; param indicates the call source
    protected virtual void Dispose(bool disp) {
       if (disp) {
          //dispose only when Dispose was called; used for managed resources
       }

       //always do finalization; e.g. closing handles to unmanaged resources

       _hasDisposed = true;
    }

    ~MyDisposableClass() {
       Dispose(false);
    }
}

Weak references

Weak references (class WeakReference) allow non-critical objects to be released from memory when there is memory pressure and the GC comes into play. What I mean by non-critical can be illustrated using the example of a cache. Objects in a cache are pretty useful for the caller to improve performance, but strictly speaking these objects are not needed, since they can be recreated from the original source at any time. So, under memory pressure, these objects can be collected safely without affecting the functionality of the system. Weak references are typically used on non-trivial objects (objects that contain quite some data). Basically, the WeakReference class is a wrapper that is recognized during garbage collection and can be used to release the memory when needed. The usage is pretty easy:

//Some object is created
MyWeakReferencedObject obj = new MyWeakReferencedObject();

//Wrap the object ("ref" is a C# keyword, so use another variable name)
WeakReference weakRef = new WeakReference(obj);

//Store the reference somewhere, e.g. in a collection
//...
//Retrieve the reference somewhere, e.g. get it from a collection

//Unwrap the object
MyWeakReferencedObject o = null;
if (weakRef.IsAlive)
   o = (MyWeakReferencedObject) weakRef.Target;
if (o == null) {
   //GC has occurred; retrieve the object again from the original source
}

//Normal operation continues

 

2. Use threading with care

Use multithreading with care

Multithreading makes your applications more responsive (in the presentation layer) and can be used to execute tasks that don't interfere with each other in parallel. This sounds great, and indeed it is. However, threads are the basic units of work on a processor, and the OS is responsible for scheduling threads for execution, which is done through context switching (therefore, having a bunch of threads can have a negative impact on the overall performance of the application). This is the reason why SQL Server 7.0 and higher actually avoid scheduling at the kernel level by means of the UMS (User Mode Scheduler), in order to reduce the number of context switches.

Threads are cute, ThreadPools are cuter

ThreadPools are a queuing mechanism that uses a set of pre-instantiated threads living in a pool. When work needs to be done, no new thread needs to be created (thread creation takes time to allocate the thread's structures); a thread only needs to be grabbed from the pool when one is available. The way to use this mechanism is amazingly simple:

WaitCallback cb = new WaitCallback(myObject.MyTargetMethod);
ThreadPool.QueueUserWorkItem(cb); //calls myObject.MyTargetMethod as soon as a thread becomes available in the pool

.NET v2.0 BackgroundWorkers

In .NET Framework v2.0 there is support for a BackgroundWorker component that offloads you from the complexity of performing background work and communicating progress back to the calling thread (typically used in a WinForms app). Use this component if you can, instead of your own implementation of this mechanism.
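As a rough console-based sketch of the BackgroundWorker pattern (in a real WinForms app, the ProgressChanged and RunWorkerCompleted events are raised on the UI thread so you can update controls directly; the Sleep calls and values below are just for illustration):

```csharp
using System;
using System.ComponentModel;
using System.Threading;

class Program {
    static void Main() {
        BackgroundWorker worker = new BackgroundWorker();
        worker.WorkerReportsProgress = true;

        worker.DoWork += delegate(object sender, DoWorkEventArgs e) {
            // Runs on a thread-pool thread, keeping the caller responsive
            for (int i = 1; i <= 5; i++) {
                Thread.Sleep(100); // simulate a chunk of work
                ((BackgroundWorker) sender).ReportProgress(i * 20);
            }
            e.Result = 42; // hand a result back to the completion handler
        };

        worker.ProgressChanged += delegate(object sender, ProgressChangedEventArgs e) {
            // In a WinForms app this event is raised on the UI thread,
            // so a ProgressBar can be updated here without Invoke calls
            Console.WriteLine("Progress: {0}%", e.ProgressPercentage);
        };

        worker.RunWorkerCompleted += delegate(object sender, RunWorkerCompletedEventArgs e) {
            Console.WriteLine("Result: {0}", e.Result);
        };

        worker.RunWorkerAsync();  // returns immediately
        Thread.Sleep(1500);       // console demo only: give the worker time to finish
    }
}
```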

Timers versus threads

When tasks need to be performed on a regular basis, use one of the various Timer components to do the trick. It's threading, but it offloads you as a developer from the plumbing of timer logic and wait loops. You'll typically find timers in WinForms apps (Controls --> Timer), Windows Services (Components --> Timer), etc. If you don't use such a component, use the System.Threading.Timer class (straightforward usage).
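A minimal sketch of System.Threading.Timer usage (the intervals and the DoPeriodicWork method are made up for the example):

```csharp
using System;
using System.Threading;

class TimerDemo {
    static int _ticks = 0;

    static void Main() {
        // Fire the callback after 100 ms, then every 200 ms,
        // on a thread-pool thread - no manual wait loop needed
        TimerCallback cb = new TimerCallback(DoPeriodicWork);
        using (Timer timer = new Timer(cb, "cleanup", 100, 200)) {
            Thread.Sleep(1000); // keep the demo alive while the timer fires
        }
        Console.WriteLine("Timer fired {0} times", _ticks);
    }

    static void DoPeriodicWork(object state) {
        Interlocked.Increment(ref _ticks);
        Console.WriteLine("Running '{0}' task at {1:HH:mm:ss.fff}", state, DateTime.Now);
    }
}
```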

Let threads shut down gently, don't use Thread.Abort

Threads should commit suicide in order to stop. This allows resources to be cleaned up the right way, locks to be released, etc. Thread.Abort should not be used; instead, use a boolean flag that is checked periodically inside the thread to detect whether it has to stop its work (i.e. to commit suicide). When you need to stop the thread, set the boolean stop indicator to true.
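A small sketch of this "suicide" pattern, using a volatile boolean flag (the Worker class and its members are hypothetical):

```csharp
using System;
using System.Threading;

class Worker {
    // volatile ensures the worker thread sees the updated flag promptly
    private volatile bool _stopRequested = false;

    public void DoWork() {
        while (!_stopRequested) {
            // perform one unit of work, then check the flag again
            Thread.Sleep(100); // placeholder for real work
        }
        // clean up resources, release locks, etc. before returning
    }

    public void RequestStop() {
        _stopRequested = true;
    }
}

class Program {
    static void Main() {
        Worker w = new Worker();
        Thread t = new Thread(new ThreadStart(w.DoWork));
        t.Start();
        Thread.Sleep(500);
        w.RequestStop();
        t.Join(); // wait for the thread to exit on its own
        Console.WriteLine("Thread stopped cleanly");
    }
}
```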

Avoid deadlocks, avoid Thread.Resume and Thread.Suspend

Synchronization of work across threads should never be achieved using Suspend/Resume calls; rather, you should use lock in C# or the various classes in the System.Threading namespace, such as Mutex and Monitor, to do this kind of work. Personally, the lock keyword is my reliable partner in threading development.
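For illustration, a minimal sketch of protecting shared state with lock (the Counter class is made up; lock expands to Monitor.Enter/Monitor.Exit in a try/finally under the covers):

```csharp
using System;
using System.Threading;

class Counter {
    private int _count = 0;
    private readonly object _sync = new object(); // dedicated private lock object

    public void Increment() {
        lock (_sync) { // only one thread at a time enters this block
            _count++;
        }
    }

    public int Count {
        get { lock (_sync) { return _count; } }
    }
}

class Program {
    static void Main() {
        Counter c = new Counter();
        ThreadStart work = delegate {
            for (int i = 0; i < 100000; i++)
                c.Increment();
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.Start(); t2.Start();
        t1.Join(); t2.Join();
        Console.WriteLine(c.Count); // always 200000 thanks to the lock
    }
}
```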

 

3. Don't wait if you don't have to - Asynchronous invocation

Why wait for a call to return if we can do other operations in the meantime? Or why wait for a call to return, thereby blocking the active thread? There is a series of classes in the BCL that support BeginInvoke and EndInvoke to make an asynchronous call, as do delegates in general. When calling the BeginInvoke method, the method returns immediately. You supply the parameters of the call as well as a callback (delegate). When the work is done, the specified callback method is invoked. In there, you can call the EndInvoke method on the object to retrieve the return value (or any exception that was thrown). The IAsyncResult passed to the callback gives you access to the AsyncState object, which can be used to get more details about the original request and its status. A typical example is an asynchronous web service call.
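A minimal sketch using an asynchronous delegate invocation (the LengthyWork delegate and Square method are invented for the example; here EndInvoke is called directly instead of from a callback, which makes it block until the call completes):

```csharp
using System;
using System.Threading;

class AsyncDemo {
    // A delegate type matching the long-running method's signature
    delegate int LengthyWork(int input);

    static int Square(int input) {
        Thread.Sleep(500); // simulate a slow operation
        return input * input;
    }

    static void Main() {
        LengthyWork work = new LengthyWork(Square);

        // BeginInvoke returns immediately; the call runs on a pool thread
        IAsyncResult ar = work.BeginInvoke(5, null, null);

        Console.WriteLine("Doing other useful work in the meantime...");

        // EndInvoke blocks until the call completes and returns the result
        // (or rethrows any exception the asynchronous call produced)
        int result = work.EndInvoke(ar);
        Console.WriteLine("Result: {0}", result);
    }
}
```

Passing an AsyncCallback and a state object as the last two BeginInvoke arguments (instead of null, null) gives you the callback-driven variant described above.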

 

4. Why generics are great

Collections in .NET v1.x are based on the mother of all types, the type "Object". Thus, when working with a collection, you can add objects of any type to it. However, when you retrieve an object from the collection, you get it back typed as "Object", so you need to cast it back to the right type (and for value types such as int, this also involves boxing and unboxing). These operations are expensive. .NET v2.0 supports generics, which solve this problem (by letting you specify the type you want to use in the collection):

//v1.x
ArrayList lst = new ArrayList();
lst.Add(1); //store an int (boxed to an object)
int val = (int) lst[0]; //retrieving the int requires a cast (and unboxing)

//v2.0
List<int> lst2 = new List<int>();
lst2.Add(1); //no boxing needed
int val2 = lst2[0]; //no cast needed


Note that generics are enforced at compile time, and the IDE will recognize the targeted type(s) of the collection (e.g. through IntelliSense).

 

5. Be aware of remote objects or objects in other appdoms

When using .NET Remoting, be aware that using a remote object can cause a bunch of calls to be sent to another appdomain or even another machine. Think of this:

MyRemoteObject o = new MyRemoteObject();
o.Property1 = "Hello";
string msg = o.Property1;
object zzz = o.DoSomething("World");

This piece of code - when invoked on a transparent proxy that represents a remote object living somewhere else - causes a lot of messages to be passed over the wire: creating the object, calling the property setter, retrieving a value from the getter, calling a method and returning its value.

Therefore, if you can, think of using a stateless approach, e.g. using web services, or at least be aware of the negative impact remote objects can have. Thus, transparency comes at the cost of performance.

 

6. Exceptions - only when needed

Exceptions should not be the default way to return from a method. They are exceptions; what's in a word? Make sure you're using exceptions the right way.

Finally finally

Everyone knows and uses try ... catch, but please don't forget the finally block. All too often, this block is omitted. Think of this classic sample:

try {
    SqlConnection conn = new SqlConnection(dsn);
    //use conn
}
catch {}

What's wrong here? A lot! First of all, a catch-all block is something to avoid whenever you can. Secondly, the connection should be closed when it has been used:

try {
    SqlConnection conn = new SqlConnection(dsn);
    //use conn
    conn.Close();
}
catch (SqlException ex) {
    //catch it
}

Better isn't it? Yes, but not good enough yet. Although we have replaced the catch block to catch only the exceptions we can and should catch, we don't close the connection when an error occurs. Instead, use this:

SqlConnection conn = new SqlConnection(dsn);
try
{
    //use conn
}
catch (SqlException ex) {
    //catch it; don't attempt to close over here
}
finally {
    conn.Close(); //or better, check connection state first using conn.State
}

Finally always executes, even when you perform a return statement inside the try block. That's why it's just great. If you only have try ... finally (without catch), it's possible that you can replace the code with the C# using syntax, as explained earlier in the IDisposable coverage.

Don't rethrow but wrap

Rethrowing an exception is simple: just use throw inside the catch block. However, throwing is very expensive (stack unwinding etc.), so avoid doing it needlessly. Maybe it's better not to catch the exception at all (e.g. in a helper method) and to catch it higher up the stack. Or - to aid abstraction, at the cost of some performance - wrap the exception in another (self-written) exception type and throw that one. An example is catching a SqlException and throwing it as a DALException (derived from ApplicationException) to the higher layer. Set the InnerException if you want the original exception to be passed along to the higher level.
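A sketch of the wrapping idea (the DALException class and the simulated failure are hypothetical; a real DAL would catch SqlException instead):

```csharp
using System;

// Hypothetical DAL-level exception; names are illustrative only
public class DALException : ApplicationException {
    public DALException(string message, Exception inner)
        : base(message, inner) {}
}

public class CustomerDal {
    public void Save() {
        try {
            throw new InvalidOperationException("simulated database failure");
        }
        catch (InvalidOperationException ex) {
            // Wrap instead of rethrowing: callers see a DAL-level abstraction,
            // while InnerException preserves the original failure details
            throw new DALException("Could not save the customer.", ex);
        }
    }
}

class Program {
    static void Main() {
        try {
            new CustomerDal().Save();
        }
        catch (DALException ex) {
            Console.WriteLine(ex.Message);                // DAL-level message
            Console.WriteLine(ex.InnerException.Message); // original details
        }
    }
}
```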

 

7. Strings are immutable

Strings are immutable. Once allocated, their size can't be increased or decreased. So, what happens over here?

string s = "Bart";
s += " De Smet";

Indeed, a new string needs to be created for the concatenated result, and the original string "Bart" needs to be GCed. A better example:

string str = "This string is becoming ";
for (int i = 0; i < 1000000; i++)
   str = str + "longer and ";
str = str + "longer.";

In such a case, you don't want to hammer the CLR heap (strings are reference types) and make the GC clean up the 1M phantom objects. Even worse, the loop becomes slower and slower at each pass, since a lot of char copying needs to be performed to concatenate the ever-growing string with another string.

String operator overload '+' - use with care

Use the concatenation operator + if you know that the number of concatenations is limited. In fact, I like to use String.Format instead in quite some cases, to separate the layout from the content. E.g. (both okay):

//Solution 1:
string envelope = "Name: " + name + "\nAddress: " + address + "\nZIP: " + zip + "\tCity: " + city;

//Solution 2:
string envelope = String.Format("Name: {0}\nAddress: {1}\nZIP: {2}\tCity: {3}", name, address, zip, city);

Use StringBuilder inside loops etc

A StringBuilder uses an internal buffer of characters (a char array) to store the string and grows when the array runs out of space (by doubling the size of the array). An example:

System.Text.StringBuilder sb = new System.Text.StringBuilder();
sb.Append("This string is becoming ");
for (int i = 0; i < 1000000; i++)
   sb.Append("longer and ");
sb.Append("longer.");
string str = sb.ToString();

 

8. Reflection is slooooooooow

Reflection can be great if you use it with care. Late binding looks like a holy grail for software extensibility (and yes, it is, as I explained in an earlier post on my blog). If you use late binding (e.g. when loading "providers" from a configuration file using reflection), make sure you cache the retrieved object instance to avoid a second binding performance hit (caused by Activator.CreateInstance, for example).
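A rough sketch of such a provider cache (ProviderFactory and DefaultProvider are invented names; note that caching an instance like this assumes the provider is stateless and thread-safe):

```csharp
using System;
using System.Collections;

// Hypothetical provider loaded by name, e.g. from a configuration file
public class DefaultProvider {
    public string Describe() { return "default provider"; }
}

public class ProviderFactory {
    // Cache instances per type name so the reflection hit is paid only once
    private static Hashtable _cache = Hashtable.Synchronized(new Hashtable());

    public static object GetProvider(string typeName) {
        object provider = _cache[typeName];
        if (provider == null) {
            Type t = Type.GetType(typeName);        // late binding by name
            provider = Activator.CreateInstance(t); // the expensive call
            _cache[typeName] = provider;
        }
        return provider;
    }
}

class Program {
    static void Main() {
        // Both calls return the same cached instance
        object p1 = ProviderFactory.GetProvider("DefaultProvider");
        object p2 = ProviderFactory.GetProvider("DefaultProvider");
        Console.WriteLine(Object.ReferenceEquals(p1, p2));
        Console.WriteLine(((DefaultProvider) p1).Describe());
    }
}
```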

 

9. Ngen.exe

NGen.exe is a tool that comes with the .NET Framework SDK and allows you to precompile an assembly (containing MSIL code) to native code, targeting the hardware platform and processor instruction set of the machine on which you run ngen. This increases startup performance but comes at the price of portability of the resulting native image. In v1.x you can't use it with ASP.NET assemblies in the web folder's bin directory, but this will change with precompilation support for ASP.NET pages in .NET v2.0.

 

10. ASP.NET tips (sneak peek)

If I find some time later on, I'll cover ASP.NET-specific performance tips in more depth. For now, here's a fairly large list of tips you can use:

  • Cache, cache, cache whenever possible. A 1 second cache is better than no cache in a lot of cases.
  • Limit the number of HttpModules.
  • Reduce round trips to the client (use client-side script and validation).
  • Don't call long-running tasks when handling events server-side (use async calls, message queues, fire-and-forget, etc if possible).
  • Disable viewstate if not needed.
  • In-process state is faster than using the ASP.NET state server or a SQL Server db; use these options only in web farms.
  • Use output buffering and Response.Flush to reduce roundtrips (chunky versus chatty).
  • Server.Transfer avoids a roundtrip to the server compared to the use of Response.Redirect. However, Server.Transfer remains on the server all the time and can therefore bypass the HttpModules on the system; and you have to keep in the scope of the same app.
  • In ASP.NET v2.0 use database cache invalidation with SQL Server 7/2000/2005.
  • Tweak the machine.config performance settings on the field of connections and worker/io threads and the threading pools.
  • Avoid aggressive use of IIS 6 app pool recycling.
  • Perfmon should be your guide to improve performance.
  • Never ever forget to call Dispose on database connections etc since these are typically used very very much inside web apps.
  • Avoid opening a bunch of connections to the db, e.g. inside an OnItemDataBound event handler of a DataList, DataGrid or Repeater control. Chatty is bad, chunky is good.
  • Don't bypass SQL Server connection pooling; use the same connection string all the time to enable pooling to do its job.
  • Gzip/deflate compression can increase network speed, but can put additional pressure on the processor on both the client and the server; use it with care.
  • Page.IsPostBack should be used to avoid rebinding data on postbacks, etc.
  • Impersonation is slow; avoid doing this on a per-request basis; use a trusted subsystem for the database when you can.
  • Use user controls to apply different caching on various webcontrols individually.
  • In production, debugging and tracing should be disabled (e.g. trace.axd should not be reachable); use customErrors.
  • Use one dev language per folder; multiple languages (C#, VB.NET) cause different assemblies to be created in the bin folder.
  • Databind on the level of controls, not on the page level (e.g. grid.DataBind() instead of this.DataBind()).
  • Avoid <% ... %> regions as well as <%# ... %> regions containing embedded script and DataBinder.Eval calls. This makes the separation of code and content vague (remember plain old ASP spaghetti code?). Use the OnItemDataBound event of the controls instead.
  • Windows Server 2003 kernel caching (http.sys) should be used whenever possible.
  • @OutputCache is more than "Duration" and "VaryByParam". Take a look at the Location attribute as well (Client <-> Server <-> ClientAndServer) for use with proxy servers etc.
  • Avoid Application[] and Session[] collections when you can; e.g. instead of using Application[] you can use static properties on the application object level as well.
  • State is using serialization to store objects; use basic types to make serialization simpler. This is only applicable when using StateServer or SQL as state containers.
  • Session state, avoid if not needed and make it read-only on pages where you don't need write-access (<%@ Page EnableSessionState="ReadOnly" %>).
  • When creating server controls, don't use string concatenation to generate HTML, but use the HtmlTextWriter instead.
  • Use Server.GetLastError() as catch-it-all inside the global.asax's Application_Error event handler.
  • ASP.NET uses MTA (multithreaded apartment) for its threading; STA COM objects are discouraged if not needed (cf. AspCompat flag).
  • Limit what you render to the client; use paging on DataGrids, turn off viewstate if not needed, make repeating regions lightweight.
  • Remote middletiers offer a great benefit on the field of encapsulation and abstraction, but come at the cost of performance.


Comments

# re: How to boost your managed code performance? Some tips...

Monday, January 17, 2005 11:54 PM by bart

I'm curious if there is another approach to try..catch..finally for cleaning up SQL connections.

From the blog entry....

try {
//use conn
}
catch (SqlException ex) {
//catch it; don't attempt to close over here
}
finally {
conn.Close(); //or better, check connection state first using conn.State
}

I'm actually using a "using" block to ensure that the SQL Connection is always disposed of correctly.:

try
{
using (SqlConnection con = new SqlConnection(...))
{
// do sql calls
}
}
catch (SqlException ex)
{
//catch it; don't attempt to close over here
}

# re: How to boost your managed code performance? Some tips...

Wednesday, January 19, 2005 1:24 AM by bart

Hi Boris,

The using pattern (which I blogged about before) is definitely a good choice as well. You could actually discuss about the order of a try and using block however when used together. If you're really curious about this, I'd invite you to take a look at the IL-code being generated when you use the using pattern and try...catch pattern to see what's really happening behind the scenes. If I find some time, I'll post some info on that too.
