CLR 4.5: .Net Framework Kernel Improvements

In this post I’ll go through some of the enhancements the CLR team made as part of the performance work in .NET 4.5. In most cases developers will not have to do anything different to take advantage of the new stuff; it just works whenever the new framework libraries are used.

Improved Large Object Heap Allocator

I’ll start with the feature the community asks for most: compaction on the LOH. As you may know, the .NET CLR has a very tough classification regime for its citizens (objects): any object that is greater than or equal to 85,000 bytes is considered special (a large object) and needs different treatment and placement. That’s why the CLR allocates a special heap for those guys, the Large Object Heap (LOH), while the poor guys live in a different heap, frankly called the Small Object Heap (SOH).
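
To see the classification in action, here is a minimal sketch that asks the GC which generation it reports for a freshly allocated array. The class and method names are mine, made up for illustration. It relies on the fact that the GC reports LOH objects as belonging to Gen 2, while a new small object starts out in Gen 0.

```csharp
using System;

public static class LohThresholdDemo
{
    // Allocates a byte array of the given length and returns the
    // generation the GC reports for it right after allocation.
    public static int GenerationOfNewArray(int length)
    {
        byte[] buffer = new byte[length];
        int generation = GC.GetGeneration(buffer);
        GC.KeepAlive(buffer); // keep the array reachable until we have asked
        return generation;
    }

    public static void Main()
    {
        // 1,000 bytes: far below the threshold, so this lives on the SOH.
        Console.WriteLine("small array generation: " + GenerationOfNewArray(1000));

        // 85,000 bytes of payload: at the threshold, so this lives on the LOH,
        // which the GC reports as part of Gen 2.
        Console.WriteLine("large array generation: " + GenerationOfNewArray(85000));
    }
}
```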

SOH Allocations and Garbage Collections

The main difference between these two communities is how much disturbance the CLR imposes on each society. When a collection happens on the SOH, the CLR clears the unreachable objects and then moves the survivors (all citizens of the SOH) next to each other, compacting the space so that the free memory ends up in one contiguous region. This decreases the chances of fragmentation in the heap, and it can only happen because the citizens of the SOH are lighter and easier to move around. On the other hand, when a collection occurs on the LOH (which only happens during a Gen 2 collection), the CLR doesn’t compact the memory, because moving these large objects around is a very expensive process. You see where this is leading: now we have to deal with fragmentation in the LOH and, from time to time, an OutOfMemoryException.

LOH Allocations and Garbage Collections

Because the LOH is not compacted, memory management happens in a classical way: the CLR keeps a free list of available blocks of memory. When allocating a large object, the runtime first looks at the free list to see if it will satisfy the allocation request. When the GC discovers adjacent objects that died, it combines the space they used into one free block which can be used for allocation.
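
The free-list idea is easier to follow with a toy model. The sketch below is not the CLR's allocator; it is a simplified first-fit free list I wrote for illustration (the `ToyFreeList` type and its members are hypothetical), showing how fragmentation can fail an allocation even when enough total memory is free, and how merging adjacent dead blocks fixes that.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a free list over one heap segment. NOT the CLR's implementation.
public sealed class ToyFreeList
{
    private sealed class Block
    {
        public int Offset;
        public int Size;
    }

    // Free blocks, kept sorted by offset.
    private readonly List<Block> blocks = new List<Block>();

    public ToyFreeList(int heapSize)
    {
        blocks.Add(new Block { Offset = 0, Size = heapSize });
    }

    public int FreeBlockCount { get { return blocks.Count; } }

    // First fit: returns the offset of the allocation, or -1 if no single
    // free block is large enough.
    public int Allocate(int size)
    {
        for (int i = 0; i < blocks.Count; i++)
        {
            Block b = blocks[i];
            if (b.Size < size) continue;
            int offset = b.Offset;
            b.Offset += size;
            b.Size -= size;
            if (b.Size == 0) blocks.RemoveAt(i);
            return offset;
        }
        return -1;
    }

    // Returns a dead object's space to the list and merges it with any
    // free block that is directly adjacent to it.
    public void Free(int offset, int size)
    {
        int i = 0;
        while (i < blocks.Count && blocks[i].Offset < offset) i++;
        blocks.Insert(i, new Block { Offset = offset, Size = size });

        if (i + 1 < blocks.Count && blocks[i].Offset + blocks[i].Size == blocks[i + 1].Offset)
        {
            blocks[i].Size += blocks[i + 1].Size; // merge with the next block
            blocks.RemoveAt(i + 1);
        }
        if (i > 0 && blocks[i - 1].Offset + blocks[i - 1].Size == blocks[i].Offset)
        {
            blocks[i - 1].Size += blocks[i].Size; // merge with the previous block
            blocks.RemoveAt(i);
        }
    }

    public static void Main()
    {
        ToyFreeList heap = new ToyFreeList(300);
        int a = heap.Allocate(100), b = heap.Allocate(100), c = heap.Allocate(100);

        heap.Free(a, 100);
        heap.Free(c, 100);
        // 200 units are free, but split into two fragments of 100 each.
        Console.WriteLine(heap.Allocate(150)); // -1

        heap.Free(b, 100);
        // All three neighbors are merged into one block of 300.
        Console.WriteLine(heap.Allocate(150)); // 0
    }
}
```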

Well, the CLR team hasn’t made the decision to compact the LOH yet, which is understandable given the cost of that operation. What they did instead is improve the way the CLR manages the free list, making more effective use of fragments. So now the CLR will revisit the memory fragments (“free slots”) that earlier allocations didn’t use.

Also, in Server GC mode the runtime now balances LOH allocations across the heaps. Prior to .NET 4.5, only the SOH allocations were balanced.

Many of the citizens of the LOH are similar in nature, which lends itself to the idea of object pools: reusing large objects instead of reallocating them can substantially reduce LOH fragmentation.
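
As a rough illustration of the pooling idea, here is a minimal pool of fixed-size large buffers. Everything in it (`LargeBufferPool`, `Rent`, `Return`) is a hypothetical sketch of mine, not a framework API, and it is deliberately not thread-safe to keep it short.

```csharp
using System;
using System.Collections.Generic;

// Minimal pool of equally sized large buffers. A sketch: single-threaded only.
public sealed class LargeBufferPool
{
    private readonly Stack<byte[]> idle = new Stack<byte[]>();
    private readonly int bufferSize;

    public LargeBufferPool(int bufferSize)
    {
        this.bufferSize = bufferSize;
    }

    public int IdleCount { get { return idle.Count; } }

    // Hands out an idle buffer if there is one; allocates only when the pool is empty.
    public byte[] Rent()
    {
        return idle.Count > 0 ? idle.Pop() : new byte[bufferSize];
    }

    // Takes a buffer back so the next Rent can reuse it instead of
    // putting another large object on the LOH.
    public void Return(byte[] buffer)
    {
        if (buffer == null || buffer.Length != bufferSize)
            throw new ArgumentException("Buffer does not belong to this pool.", "buffer");
        Array.Clear(buffer, 0, buffer.Length); // do not leak old data to the next user
        idle.Push(buffer);
    }

    public static void Main()
    {
        LargeBufferPool pool = new LargeBufferPool(128 * 1024); // well above 85,000 bytes
        byte[] first = pool.Rent();
        pool.Return(first);
        byte[] second = pool.Rent();
        Console.WriteLine(ReferenceEquals(first, second)); // True: the same array was reused
    }
}
```

Because every buffer has the same size, a returned buffer always fits the next request, which is exactly the property that keeps the LOH from fragmenting.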

Background mode for Server GC

A couple of years ago I blogged about the new Background GC for Workstation mode in CLR 4.0. The main idea is that while executing a Gen 2 collection, the CLR checks at well-defined points whether a Gen 0/Gen 1 collection has been requested; if so, the Gen 2 collection is paused until the lower collection runs to completion, and then it resumes. Read more about Background mode for Workstation GC.

(.NET 4.5) Gen0/Gen1 collections can proceed during Gen2 GC

The new CLR supports Background mode for Server GC, which brings concurrent collections to the server collector. It minimizes long blocking collections while continuing to maintain high application throughput.

In the schematic diagram above, notice the following:

  • As in .NET 4.0, whenever a Gen 0 or Gen 1 collection happens, the managed threads allocating objects on the heap are paused until the collection is done.
  • The Gen 2 collection now pauses in Server mode too (as in Client/Workstation mode) while a Gen 0 or Gen 1 collection runs, giving it priority to finish first.
  • The Gen 2 collection, as usual, runs in the background while the managed threads are still allocating objects on the SOH.
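
If you want to check which flavor of the GC your process actually got, the `GCSettings` class in `System.Runtime` exposes it. The sketch below only reads the settings; the `GcModeReport` wrapper is a name I made up. The mode itself is normally chosen in the application configuration file through the `<gcServer>` and `<gcConcurrent>` elements.

```csharp
using System;
using System.Runtime;

public static class GcModeReport
{
    // Builds a one-line description of the GC flavor of the current process.
    public static string Describe()
    {
        string flavor = GCSettings.IsServerGC ? "Server" : "Workstation";

        // GCLatencyMode.Interactive means concurrent (background) collection
        // is enabled; GCLatencyMode.Batch means it is disabled.
        return flavor + " GC, latency mode: " + GCSettings.LatencyMode;
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```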

Auto NGEN

Prior to v4.5, the .NET Framework installed both the IL and the NGened images for all of its assemblies on the dev machines, which took double the space or more just for the framework. The CLR team did some research to find out which framework assemblies are most commonly used and named those the “Frequent Set”. Only this set of assemblies is NGened now, which saves a lot of space and dramatically decreases the framework’s disk footprint.

Now the perfectly good question that presents itself is: what about performance when non-NGened assemblies are used?

The CLR team introduced a new addition to the NGen engine called the Auto NGen Maintenance Task. When you create a new application that uses non-NGened assemblies, or that is not NGened itself, here is what happens:

  1. The user runs the application.
  2. Every time the application runs, it creates a new type of log called “Assembly Usage Logs” in the Windows AppData directory.
  3. The new Auto NGen Maintenance Task takes advantage of a new Windows 8 feature called “Automatic Maintenance”, where efficient performance enhancement jobs can run in the background while the user is not heavily using the machine.
  4. The Auto NGen Maintenance Task goes through the logs from all the managed apps the user ran and builds a picture of which assemblies are frequently used but not NGened, and which ones are NGened but not used frequently.
  5. The task then NGens the former on the fly and removes the unused NGen images (this is known as the Reclaiming Native Images process).
  6. The next time the app runs, it gets a boost from using the NGened images.
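
Steps 4 and 5 boil down to a simple classification, which can be sketched as below. This is purely my own illustration of the decision, not the real maintenance task: the log format, the `frequentThreshold` cut-off, the assembly names and every type in the sketch are hypothetical, and it ignores the rooted/non-rooted distinction covered in the notes that follow.

```csharp
using System;
using System.Collections.Generic;

public static class AutoNGenSketch
{
    public sealed class NGenPlan
    {
        public readonly List<string> ToCompile = new List<string>();
        public readonly List<string> ToReclaim = new List<string>();
    }

    // runsPerAssembly stands in for the assembly usage logs (assembly -> recorded runs).
    // nativeImages lists the assemblies that currently have a native image.
    // frequentThreshold is a made-up cut-off for "frequently used".
    public static NGenPlan Plan(IDictionary<string, int> runsPerAssembly,
                                ICollection<string> nativeImages,
                                int frequentThreshold)
    {
        NGenPlan plan = new NGenPlan();

        // Frequently used but not NGened: candidates for compilation.
        foreach (KeyValuePair<string, int> entry in runsPerAssembly)
        {
            if (entry.Value >= frequentThreshold && !nativeImages.Contains(entry.Key))
                plan.ToCompile.Add(entry.Key);
        }

        // NGened but rarely (or never) used: candidates for reclaiming.
        foreach (string image in nativeImages)
        {
            int runs;
            runsPerAssembly.TryGetValue(image, out runs); // runs is 0 when never logged
            if (runs < frequentThreshold)
                plan.ToReclaim.Add(image);
        }

        plan.ToCompile.Sort(StringComparer.Ordinal);
        plan.ToReclaim.Sort(StringComparer.Ordinal);
        return plan;
    }

    public static void Main()
    {
        Dictionary<string, int> usage = new Dictionary<string, int>();
        usage["Contoso.Core.dll"] = 12;   // hypothetical assemblies
        usage["Contoso.Reports.dll"] = 1;
        List<string> images = new List<string> { "Contoso.Legacy.dll" };

        NGenPlan plan = Plan(usage, images, 5);
        Console.WriteLine("NGen:    " + string.Join(", ", plan.ToCompile));
        Console.WriteLine("Reclaim: " + string.Join(", ", plan.ToReclaim));
    }
}
```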

Some notes about the Auto NGen process

  • The assembly must target the .NET Framework 4.5 Beta or later.
  • Auto NGen runs only on Windows 8.
  • For desktop apps, Auto NGen applies only to GAC assemblies.
  • For Metro style apps, Auto NGen applies to all assemblies.
  • Auto NGen will not remove unused rooted native images (images NGened by developers). Basically, Auto NGen removes only the images it created itself. Read more about reclaiming native images here.

Hope this Helps,

Ahmed

CLR 4.0: New Enhancements in the Garbage Collection

Note: This blog post was transferred from my old blog and was originally posted in 2008.

The current garbage collector does a pretty good job of reclaiming the memory of Gen 0 and Gen 1. Those generations’ objects live in the ephemeral segment, which is small, so the GC reclaims their memory very fast. Most Gen 2 objects, on the contrary, live in other, larger segments, which makes Gen 2 collections slower than the others.
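
If the generations are new to you, this small sketch shows an object being promoted as it survives collections. The `GenerationDemo` class is my own illustration. On a default setup a surviving object typically moves from Gen 0 to Gen 1 to Gen 2, but the exact numbers depend on the runtime and GC configuration, so treat the output as indicative rather than guaranteed.

```csharp
using System;

public static class GenerationDemo
{
    // Records the generation of one long-lived object before any
    // collection, after one forced collection, and after two.
    public static int[] TrackGenerations()
    {
        object survivor = new object();
        int[] seen = new int[3];

        seen[0] = GC.GetGeneration(survivor);
        GC.Collect();
        seen[1] = GC.GetGeneration(survivor);
        GC.Collect();
        seen[2] = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor); // make sure the object stays reachable throughout
        return seen;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" -> ", TrackGenerations()));
    }
}
```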

The GC team made great improvements to the collection algorithms on both the server and the workstation to make them faster and reduce latency.

Enhancements in Server Garbage Collection

The current server GC is very efficient in terms of maximizing overall throughput; this is because a Gen 2 collection pauses all the managed code currently running on the server while it runs. It turns out that this makes the GC as fast as all of us need, BUT the cost is long pauses in the server’s managed code execution and, of course, increased latency!

What the CLR team did in v4.0 is allow you to be notified before a full (Gen 2) collection, which also collects the Large Object Heap, happens. You might ask: how could this help me reduce the latency on my server? In fact there is good news and bad news. The good news is that yes, this will help you reduce the latency and the long pauses on your server. The bad news is that it will not help everyone; in reality it helps only if you use some load balancing technique.

What the CLR now offers is a notification model you can use to know when the GC is about to start a Gen 2 collection on the current server. You can then switch the user traffic through your load balancer to another application server, let the first server finish its in-flight requests, and run the Gen 2 collection there; your users will not feel the same latency and long pauses as before.

I’m going to walk you through sample code to show you how to benefit from this enhancement in your server applications.

using System;
using System.Threading;

public class Program
{
    public static void Main(string[] args)
    {
        try
        {
            // Register for full GC notifications.
            // The arguments are the maximum generation threshold and
            // the large object heap threshold (each between 1 and 99).
            // Note: this call throws InvalidOperationException when
            // concurrent GC is enabled, so disable <gcConcurrent> first.
            GC.RegisterForFullGCNotification(10, 10);

            // Wait for the notifications on a new thread.
            Thread fullGCThread = new Thread(new ThreadStart(WaitForFullGC));
            fullGCThread.Start();
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }

    public static void WaitForFullGC()
    {
        while (true)
        {
            // This is a blocking call. Once it returns with the Succeeded
            // status, a Gen 2 collection is about to happen.
            GCNotificationStatus status = GC.WaitForFullGCApproach();

            if (status == GCNotificationStatus.Succeeded)
            {
                // Call your custom procedure to switch
                // the traffic to another server.
                OnFullGCApproachNotify();
            }
            else
            {
                // Canceled, failed or not applicable: stop listening.
                break;
            }

            // Now wait for the GC to complete the Gen 2 collection.
            status = GC.WaitForFullGCComplete();

            // Once it finishes, call your custom procedure to switch
            // the traffic back to this server.
            if (status == GCNotificationStatus.Succeeded)
            {
                OnFullGCCompleteNotify();
            }
            else
            {
                break;
            }
        }
    }

    private static void OnFullGCApproachNotify()
    {
        // 1. Direct the new traffic away from this server.
        // 2. Wait for the old traffic to finish.
        // 3. Call GC.Collect. This is the interesting part, because
        //    Microsoft always tells you not to call GC.Collect yourself.
        //    Here you need to, because no more traffic is redirected to
        //    this server, so you might wait forever before the GC starts.
        GC.Collect();
    }

    private static void OnFullGCCompleteNotify()
    {
        // Direct the traffic back to this server.
    }
}
 
Enhancements in Workstation Garbage Collection
 
Today’s CLR has a Concurrent Collection algorithm for the workstation GC. This algorithm can do most of a Gen 2 collection without pausing the running managed code too much, at least not as much as the server GC algorithm does.
The problem occurs when the ephemeral segment fills up while the GC is busy doing a Gen 2 collection on other segments and new objects need to be allocated from the ephemeral segment. That’s when the pauses happen on the workstation and the user feels the latency: the Concurrent Collection algorithm can’t run Gen 0 and Gen 1 collections while a Gen 2 collection is in progress.
 
New in CLR 4.0, the workstation GC uses a new collection algorithm instead of the Concurrent Collection algorithm: the Background Collection algorithm. One key thing about it is that it can do Gen 0 and Gen 1 collections while a Gen 2 collection is in progress, so you will no longer see the long pauses in your client application, except in very unusual circumstances.
[Chart: pauses with the Concurrent algorithm vs. the Background algorithm]
In the chart above you can see a statistical comparison between the performance of the old Concurrent algorithm and the Background algorithm. As you can see, at the start of the application there is one pause with both algorithms, but it takes half the time with the Background algorithm. Once the application continues working, Concurrent GC shows multiple long pauses; with Background GC, on the contrary, you see far fewer long pauses than before.
 
With the new GC notification model for server applications and the new Background Collection algorithm for workstation applications, the CLR team delivers a new performance experience with fewer long pauses and lower latency.
 
Related Stuff:
  1. PDC 2008: PC49, Microsoft .NET Framework – CLR Futures.
  2. Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework (Jeffrey Richter).
Hope this Helps,
Ahmed