CLR

Memory control: Use GCHandle to pin down objects

In the .NET Framework, memory management is mostly automatic and is handled by the CLR. In managed languages we rarely need to care about controlling memory, but when we interact with unmanaged code we have to be aware of the memory implications. When we create an object, there is no guarantee that it will stay at the location where it was created, because the GC moves objects around whenever it needs to.

Variable movement in memory

I am sure that everyone reading this has already read about how the GC collects memory and about generations. As you know, the GC allocates memory in bulk. When needed, the GC collects unused memory and moves the surviving objects. For example, when a collection happens, the GC can move the object you are using to a different block of memory to allow better allocation, and free the block it was in. Let's say we have a variable called "x" that is allocated at the position shown below.


Figure 1: Before GC collection

In the figure above, 1 marks the spaces that are allocated and 0 marks the ones that need to be garbage collected because nothing references them. When the GC collects, it might decide to compact the heap by moving the occupied items to the top block and leaving the second block empty. If so, our memory could look like the diagram below.

Figure 2: After compacting the variable moved to different virtual address

As you can see, the GC has moved all items to the first block of memory and the second block is empty, which results in faster allocation for new variables, so compacting makes sense. Our variable x has moved to a different location, which means the address where the variable is kept has changed. (Please keep in mind that the address we are referring to is a virtual address; it has nothing to do with physical RAM, as the OS may change the physical address at any time.) A collection is usually triggered after a certain amount of memory has been allocated (around 256 KB), although the exact threshold depends on the version of the GC, the OS and the mode it runs in. As the GC is independent of the thread you are running, it can collect at any time.
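Addresses are hard to observe from managed code, but the generation of an object is easy to watch. The following is a minimal sketch, assuming a runtime that has GC.GetGeneration (the class and method names are made up for illustration). It forces two collections and reports which generation the object x is in each time. Promotion to an older generation is often accompanied by the kind of move described above, although the runtime makes no promise about when it compacts.

```csharp
using System;

static class GenerationDemo
{
    // Returns the generation of a fresh object before and after two forced collections.
    public static int[] Observe()
    {
        object x = new object();
        int before = GC.GetGeneration(x);

        GC.Collect();                       // force a collection; x survives because we still reference it
        int afterFirst = GC.GetGeneration(x);

        GC.Collect();
        int afterSecond = GC.GetGeneration(x);

        GC.KeepAlive(x);                    // make sure x is considered live up to this point
        return new[] { before, afterFirst, afterSecond };
    }

    static void Main()
    {
        // The exact numbers depend on the runtime and GC mode, so none are promised here.
        Console.WriteLine(string.Join(" -> ", Observe()));
    }
}
```

The only thing the sketch relies on is that an object never goes back to a younger generation.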

GC Thread

If you are doing unmanaged memory operations with a raw memory address, the GC can come in and move the object away while you are still using that address.

Pin down: API Calls to PInvoke

So can this be a problem when we call Windows APIs? For example, what happens if an argument that I pass in a P/Invoke call, such as a buffer, is moved to a different memory address during the call? It is not a problem: the variable is pinned in memory before the P/Invoke call and the pinned status is released just after the call. This is done automatically by the CLR. So what happens if the GC comes in while the unmanaged P/Invoke call is running? See the figure below.

GC Generation 3

As you can see in the figure above, the variable x did not move to another virtual address in memory. This is useful when I need some unmanaged code to access a memory address that stays constant. For example, think of a scenario where I pass an integer array to an unmanaged function, change the values in the array from time to time, and the unmanaged function reads the changed values and does some work with them. In such a scenario I need the array to remain at one constant address, so I need the GCHandle class to pin it down in memory.

GCHandle class

We find the class in the System.Runtime.InteropServices namespace. In order to use this class we will need SecurityPermission with the UnmanagedCode flag most of the time. Each application domain has a table for GC handles, and with the GCHandle class we can control which items are in that table. The Garbage Collector reads this table and abides by it. Each entry in the table points to an object in the managed heap and says how the GC should treat it. There are four behavior types, as described in the GCHandleType enumeration. They are as follows:
    1. Weak
    2. WeakTrackResurrection
    3. Normal
    4. Pinned

Weak and WeakTrackResurrection are for tracking weakly referenced objects (see the earlier post WeakReference: GC knows the Best for more on weak references). The Normal type is used to keep an object alive even if there are no references to it. We are interested in the last type, called Pinned. This tells the Garbage Collector to keep the object in memory even if there are no references to it, and never to move it around in memory. See the example below on how to use the GCHandle class.

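using System;
using System.Runtime.InteropServices;
using System.Text;
string name = "My Name";
byte[] nameInBytes = Encoding.ASCII.GetBytes(name);
// Pin down the byte array so the GC cannot move it
GCHandle handle = GCHandle.Alloc(nameInBytes, GCHandleType.Pinned);
try
{
    IntPtr address = handle.AddrOfPinnedObject();
    // Do stuff ... with the pinned object address
}
finally
{
    handle.Free(); // always release the handle, or the array stays pinned
}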
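
The integer array scenario described earlier can be sketched in the same way. This is only an illustration under one assumption: Marshal.ReadInt32 stands in for the unmanaged function that reads the array, since no real native library is involved here. The class and method names are made up for the example.

```csharp
using System;
using System.Runtime.InteropServices;

static class PinnedArrayDemo
{
    // Pins an int array, changes it from managed code, and reads the change back
    // through the raw address, the way unmanaged code would.
    public static int ReadThroughPointer()
    {
        int[] values = { 1, 2, 3 };
        GCHandle handle = GCHandle.Alloc(values, GCHandleType.Pinned);
        try
        {
            IntPtr address = handle.AddrOfPinnedObject();

            values[1] = 42; // managed code updates the array

            // Stand-in for the unmanaged reader: fetch element 1 via the pinned address.
            return Marshal.ReadInt32(address, 1 * sizeof(int));
        }
        finally
        {
            handle.Free(); // unpin so the GC may move the array again
        }
    }

    static void Main()
    {
        Console.WriteLine(ReadThroughPointer()); // prints 42
    }
}
```

Because the array is pinned, the address obtained once stays valid for every later read until Free is called.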

Things to Remember

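Please note that too many pinned objects will slow the GC down, because pinned objects get in the way of compaction. Also, only blittable types and arrays of blittable types can be pinned. A custom type can be pinned only if all of its fields are blittable. For a non-blittable custom type, pinning will not work; it has to be marshaled instead, for example with a custom marshaler that implements the ICustomMarshaler interface.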
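
The blittable restriction is easy to check. In the sketch below (CanPin is a hypothetical helper, not a framework method), GCHandle.Alloc is asked to pin two arrays and the ArgumentException thrown for the non-blittable one is turned into a boolean.

```csharp
using System;
using System.Runtime.InteropServices;

static class BlittableDemo
{
    // Returns true if the object can be pinned, false if the runtime rejects it.
    public static bool CanPin(object candidate)
    {
        try
        {
            GCHandle handle = GCHandle.Alloc(candidate, GCHandleType.Pinned);
            handle.Free();
            return true;
        }
        catch (ArgumentException)
        {
            // Thrown when the object contains non-primitive or non-blittable data.
            return false;
        }
    }

    static void Main()
    {
        Console.WriteLine(CanPin(new int[4]));             // True: array of a blittable type
        Console.WriteLine(CanPin(new string[] { "a" }));   // False: array of object references
    }
}
```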


Selected posts on CLR, threading, Internationalization and WPF

I have been working professionally in the software industry for over 9 years, but I am relatively new to the blogging community. I only started blogging around 5 months ago. I have selected some of my previous posts which I liked. Please find the links below.

CLR Fun Stuff: How is my C# code converted into machine instructions?
This post describes the process of MSIL being converted to binary instructions by the JIT compiler.

WeakReference: GC knows the Best
A basic look at the WeakReference class, which helps to re-acquire data left for garbage collection.

Multi-core CPU Caching: What is volatile?
What is the impact of the volatile keyword in the world of multi-core CPUs?

Dll heaven: Executing multiple versions of the same assembly
How can you execute code from different versions of an assembly in a single process.

Spider's Web: Spawning threads in .NET, The Basics
A basic look at threading. It was going to be 1st part of a 5 part series, but it was discontinued as there was a good free e-book available on .NET threading.

Basic WPF: Stylize standard tabs into Aqua Gel
How can you change the look of the standard controls with styles in WPF

Fun with Generics: Using constraints to limit Generics applicability using the 'where' keyword
A less used but powerful architecture tool, type limiting in generics

Internationalization: Creating a custom culture
How can you add support for your language/culture if .NET does not have built-in support for it, like my language 'Bangla'?


Dynamic Language Runtime: What is it?

In many blog entries around the world, especially those by bloggers from Microsoft, one finds numerous mentions of the Dynamic Language Runtime (also known as the DLR). Since many MS bloggers are writing a lot about IronPython, the DLR shows up in their topics. So a few days ago I asked myself what the DLR is and how it differs from the CLR. Here are my findings.

Dynamic Programming Languages

The first and most common example of a dynamic programming language is JavaScript, where one can define types and their methods at runtime. By definition, a dynamic language is a high-level programming language that performs at runtime many of the behaviors that other languages perform at compile time. For example, in JavaScript a class (type) or its methods can be defined at runtime, so a type's behavior can be extended while the program runs. The type or object system may be defined or changed at runtime, and even the inheritance tree may be modifiable.


These behaviors can be emulated on a standard runtime like the .NET CLR, but it takes a lot of work to do that. Thus the DLR was born.

Dynamic Language Runtime

First of all, let me state that the CLR already has support for dynamic languages. However, adding the DLR on top of the CLR makes it much easier to implement a dynamic language, and it makes communication between multiple dynamic languages a breeze, just as the CLR did for multiple .NET languages. This added layer on top of the existing CLR provides the following services.

Shared Dynamic Type System (I am calling this DTS)
Unlike the CTS, where type safety is a crucial requirement, the DTS allows us to morph types at runtime, so we can add methods to a type or modify it. Also, two dynamic languages can talk to each other using the same vocabulary.
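
To give a feel for morphing an object at runtime, here is a small sketch in C#. It assumes a later .NET version than the one this post was written against: the dynamic keyword and System.Dynamic.ExpandoObject shipped with the DLR in .NET 4. The member names Name and Greet are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

static class ExpandoDemo
{
    // Builds an object whose members are added at runtime, then calls one of them.
    public static string Describe()
    {
        dynamic person = new ExpandoObject();

        person.Name = "Ada";                                          // add a property at runtime
        person.Greet = (Func<string>)(() => "Hello, " + person.Name); // add a "method" at runtime

        return person.Greet();
    }

    static void Main()
    {
        Console.WriteLine(Describe()); // prints "Hello, Ada"

        // The members live in a dictionary, so they can be listed or removed as well.
        dynamic bag = new ExpandoObject();
        bag.Answer = 42;
        foreach (string key in ((IDictionary<string, object>)bag).Keys)
            Console.WriteLine(key); // prints "Answer"
    }
}
```

None of these members exist at compile time; the binding happens when each line runs.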


Dynamic Method Dispatching
Ever heard of C++ virtual function tables? Dynamic dispatch is the ability to change at runtime which code executes for a method. There is also another form, which is per-instance dynamic method dispatch. If at runtime we define which code executes for a method of a class, that is simple (type-level) dynamic dispatch. If we can define which code executes for one particular instance of that class, that is instance-based dynamic dispatch.
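
The difference between the two forms can be imitated in plain C# with delegates. This is only a hand-rolled sketch of the idea, not how the DLR implements dispatch; Greeter, SharedImpl and InstanceImpl are hypothetical names.

```csharp
using System;

class Greeter
{
    // Type-level slot: replacing it changes the behavior of every instance.
    public static Func<string> SharedImpl = () => "hello";

    // Instance-level slot: when set, it overrides the shared code for this object only.
    public Func<string> InstanceImpl;

    public string Greet()
    {
        return (InstanceImpl ?? SharedImpl)();
    }
}

static class DispatchDemo
{
    public static string Run()
    {
        var a = new Greeter();
        var b = new Greeter();

        Greeter.SharedImpl = () => "hi";   // type-level dispatch changed at runtime
        b.InstanceImpl = () => "hey";      // per-instance dispatch changed at runtime

        return a.Greet() + " " + b.Greet();
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints "hi hey"
    }
}
```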

Dynamic Code Generation
The ability to generate code at runtime is called dynamic code generation. The generated code may even be able to modify itself. For example, the JavaScript eval() function is very powerful: it compiles and runs code that is supplied as a string at runtime.
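
On .NET, one way to generate code while the program is running is through expression trees in System.Linq.Expressions. The sketch below is an illustration of the idea rather than a description of the DLR internals; BuildAdder is a made-up name. It builds an addition function as data and compiles it into a delegate.

```csharp
using System;
using System.Linq.Expressions;

static class CodeGenDemo
{
    // Builds (left, right) => left + right as an expression tree and compiles it.
    public static Func<int, int, int> BuildAdder()
    {
        ParameterExpression left = Expression.Parameter(typeof(int), "left");
        ParameterExpression right = Expression.Parameter(typeof(int), "right");

        Expression<Func<int, int, int>> lambda =
            Expression.Lambda<Func<int, int, int>>(Expression.Add(left, right), left, right);

        return lambda.Compile(); // code is emitted here, while the program is running
    }

    static void Main()
    {
        Func<int, int, int> add = BuildAdder();
        Console.WriteLine(add(2, 3)); // prints 5
    }
}
```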

Hosting Capabilities
Just like the CLR, the DLR also has a hosting API.

Summary

So the DLR makes it a whole lot easier for two dynamic languages to share code. The keyword here is sharing. Also, the DLR is built on the existing CLR. Microsoft has made the DLR source code available on CodePlex with the IronPython project. IronPython is the first sample DLR language, and MS plans to use the DLR in VB.NET 10 and the next JScript.



CLR Fun Stuff: How is my C# code converted into machine instructions?

As we all know, our .NET code (C#, VB.NET etc.) is converted into MSIL instructions, which in turn are put into assemblies. MSIL is a higher-level language than machine instructions, so it needs to be converted into machine-specific binary code or interpreted somehow. Since interpretation would make execution significantly slower, the code is converted into machine code on access. This is done via a procedure called Just in time compilation (JIT). Just in time is originally a management concept from manufacturing, usually associated with Toyota and with earlier roots at Ford Motors. It is a process where inventory is brought in just before it is needed, which saves warehousing or storage costs.

How does JIT compilation happen?

In programming, JIT works like this. Since the executable or library is made of MSIL (bytecode or any other intermediate form) instructions, it needs to be compiled into machine code, but converting all of the code up front takes time. For example, if we have an application that has 50 functions and we use 3 of them regularly, then compiling all 50 functions every time we load the program would be a waste and would make loading slow. What JIT does is convert a function's MSIL into machine code just before the function executes. See the figure to see how code is compiled via JIT. Once code has been transformed into native machine code, it stays in memory and later calls to the function are pointed to that same memory, so the conversion to machine code is done only once per run of the executable.

A little deeper look: The secret undocumented CLR function that does it all

When a .NET application loads, it loads MsCorEE.dll, which loads the correct version of MsCorWks.dll (for the version of .NET we are running), which contains all the core functions of the .NET runtime. For more detail on loading see the web; one good resource is the post NET Foundations - .NET execution model. There is a function called _CorExeMain which actually loads the CLR and all the types that are required into memory. There is a memory table for types, and there are tables for the functions and properties of each type.

Let's say we have a class that looks like this:

using System;

class TestClass
{
    // Public so that it can be called from the Main method of another class
    public static void ConsoleAdd(int value1, int value2)
    {
        Console.WriteLine(value1 + value2);
    }
}

A careful look tells us that only 2 types are used here: TestClass and Console. If we call the TestClass.ConsoleAdd function from the Main method, this is what the memory looks like before the function is called.

Figure: Function table before JIT compilation

Before the call, ConsoleAdd, WriteLine and the other functions all point to the secret JITC function. Each time a function is called for the first time, the JITC function compiles its code into native code and replaces the function pointer in the function table for the type.
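
The pointer replacement can be imitated in C# with a table of delegates. This is a toy model of the idea only: the real function table and JITC routine live inside the runtime and are not accessible like this, and every name below is invented for the sketch. Each table entry starts out pointing at a stub, and the stub "compiles" the method and patches the table on the first call.

```csharp
using System;
using System.Collections.Generic;

static class JitStubDemo
{
    // Toy "function table": method name -> code to run.
    static readonly Dictionary<string, Func<int, int, int>> MethodTable =
        new Dictionary<string, Func<int, int, int>>();

    // Counts how many times the pretend compilation happened.
    public static int CompileCount;

    static JitStubDemo()
    {
        // Initially the entry points at a stub that plays the role of JITC.
        MethodTable["Add"] = (a, b) =>
        {
            CompileCount++;                                  // pretend to compile MSIL
            Func<int, int, int> compiled = (x, y) => x + y;  // the "native" code
            MethodTable["Add"] = compiled;                   // patch the table entry
            return compiled(a, b);                           // run the freshly compiled code
        };
    }

    public static int Call(string name, int a, int b)
    {
        return MethodTable[name](a, b);
    }

    static void Main()
    {
        Console.WriteLine(Call("Add", 1, 2));  // first call goes through the stub, prints 3
        Console.WriteLine(Call("Add", 3, 4));  // second call hits the patched entry, prints 7
        Console.WriteLine(CompileCount);       // prints 1
    }
}
```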

Now let's look at the memory after it has been JITed.

Figure: Function table after JIT compilation

Last words

So now we know how our C# code is compiled into native code. It may be hard to believe, but at certain times JITed code runs faster than natively compiled code. When code is JIT compiled, it can take advantage of the exact CPU instructions present on the machine, whereas natively compiled code targets a more generic class of machine instructions. We will read about the advantages and disadvantages of JIT another day.



New* .NET 3.5 Feature: AddIn Framework Resources (Part 2 of 2)

I was starting to write a 3 part series on the new AddIn Framework in .NET 3.5. However, I came across some resources which make me end this series and point to those posts and articles instead, since they have already described it so nicely. Here are the articles and blog posts.

MSDN Magazine : CLR Inside Out : .NET Application Extensibility by Jack Gudenkauf and Jesse Kaplan

This MSDN article introduces the concept of System.Addin namespace concepts and how it is structured. This actually provides

MSDN Magazine : CLR Inside Out : .NET Application Extensibility Part 2  by Jack Gudenkauf and Jesse Kaplan

This is the second article of the extensibility series. It describes the pipeline architecture that was used, each of the components and how it is implemented. A must read for anyone who wants to implement the addin model.

How To: Build an Add-In using System.AddIn by Guy Burstein

This is the practical, hands-on example with code samples and many screenshots. A tutorial cannot get any better than this. A must visit for the addin implementer. Make sure you visit this article; it is written by a tech evangelist at Microsoft.

CLR Add-In Team Blog

This is the blog of the people who made the framework, and it contains a lot of inside information you may not find in the documentation. It is a nice place to visit every once in a while.

Everything you need to know to get started with System.AddIn by James Green

This is another good start point for addin examples with visual diagrams.



New* .NET 3.5 Feature: AddIn Framework ( Part 1 of 2 )

One of the new items included in the .NET 3.5 framework is a built-in way to add extensibility to your application using add-ins, also known as plug-ins. Many of us have already added extensibility to our own applications using interfaces. However, this framework comes with a few built-in features for addin lifetime management, security isolation etc.

Sandbox Isolation

If we want to be able to terminate an addin, we need to load it into a different AppDomain, since we may want to unload or reload it when an error happens. Also, when the addin runs in a separate domain it is less likely to corrupt any part of the application.


Please also note that when an addin is unloaded, the assemblies that the addin depends on are unloaded as well. This happens because the whole AppDomain is unloaded.

Discovery

The new framework also supports discovery of addins within a folder. You can also search for a particular addin. According to the documentation, each addin has its own folder and its own set of assemblies.

Security

The security of the sandboxed addin can be set when we create the application domain for the addin to run in, or we can use policy level security to control the addin's behavior.

Versioning

Versioning is provided via contract isolation: the addin host and the addin itself can version independently of each other. There is an adapter assembly for both the addin and the host, so that each implementation can change independently.

Termination

Because the addin runs inside its own AppDomain boundary, terminating the AppDomain automatically clears its memory and all other resources. If we had to do this manually, we would have to find the application domain that hosts the assembly and unload it. The framework provides a nifty class to unload the addin and its AppDomain:

AddInController.GetAddInController(addin).Shutdown();

Next

Related Post:

Dll heaven: Executing multiple versions of the same assembly



Dll heaven: Executing multiple versions of the same assembly

Everyone wants their application to be backward compatible, and if the application has a plug-in based architecture then such a feature can be a nice addition. When can it be useful? Suppose we have upgraded our plug-in subsystem to a newer version and we still want to support the older one.

So if we want to load 2 different versions of the same assembly, what do we need to do? First, each version needs to be loaded into its own AppDomain.

Why ?

Because once an assembly is loaded into an AppDomain it cannot be unloaded on its own, and once a type has been loaded from one of the two versions, that same type is reused every time we instantiate it. But an assembly loaded into a separate AppDomain can be unloaded by unloading the AppDomain itself. I am going to demonstrate how to load 2 versions of the same assembly into the same executable and run them simultaneously.

Example Step 1: To keep it simple and avoid the trouble of reflection, I am going to create an interface, put it into its own dll, and reference that dll from the concrete dll. Here is the code for the interface, in a simple dll, that will be used to invoke methods.

namespace InterfaceLib
{
    public interface IExecute
    {
        string Execute();
    }
}

Example Step 2: I will build the concrete class into a separate dll that references the interface dll. We need to make the class inherit from MarshalByRefObject since we want to communicate across AppDomains.

namespace AssemblyVersionTest
{
    public class SimpleClass : MarshalByRefObject,
        InterfaceLib.IExecute
    {
        public string Execute()
        {
            return "This is executed from version 1";
        }
    }
}

So we have a class that derives from MarshalByRefObject and implements our interface. Let's compile this dll and, after doing so, rename it to AssemblyVersionTest1.dll

Let's now change this line

return "This is executed from version 1";

to

return "This is executed from version 2";

Then compile the dll again and rename it to AssemblyVersionTest2.dll

Example Step 3: Now we come to our third project, the console executable. Let's put the 2 versions of the dll in the bin path of the console executable and use the following code

using System;
using System.IO;
using InterfaceLib;

// Create an app domain for the first version
AppDomain appDomain1 = AppDomain.CreateDomain("Version1");
// Instantiate the object in that app domain from the first version file
object obj = appDomain1.CreateInstanceFromAndUnwrap(
    Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "AssemblyVersionTest1.dll"),
    "AssemblyVersionTest.SimpleClass");
IExecute iex = (IExecute)obj;

// Create a second app domain and instantiate the object from the second version file
AppDomain appDomain2 = AppDomain.CreateDomain("Version2");
object obj2 = appDomain2.CreateInstanceFromAndUnwrap(
    Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "AssemblyVersionTest2.dll"),
    "AssemblyVersionTest.SimpleClass");
IExecute iex2 = (IExecute)obj2;

// Each call is marshaled into the domain that owns the object
Console.WriteLine(iex.Execute());
Console.WriteLine(iex2.Execute());

Observe that we renamed the different versions of the dll and loaded them explicitly. The output of the above code should be ...

This is executed from version 1
This is executed from version 2

Now we can see that the two domains have loaded 2 different versions of the same class in the same executable and executed them.
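
When a version is no longer needed, its AppDomain can be unloaded, which is the whole reason for loading each version into its own domain. The following sketch assumes the .NET Framework, since AppDomain.CreateDomain is not supported on newer runtimes. It does not use the two dlls above: a callback stores the friendly name of the domain it runs in, so that the example is self-contained. UnloadDemo and the "name" key are made up for the illustration.

```csharp
using System;

static class UnloadDemo
{
    // Runs inside the new domain and records that domain's friendly name.
    static void RecordDomainName()
    {
        AppDomain.CurrentDomain.SetData("name", AppDomain.CurrentDomain.FriendlyName);
    }

    // Creates a domain, runs code in it, reads the result back, then unloads the domain.
    public static string RunInNewDomain()
    {
        AppDomain domain = AppDomain.CreateDomain("Version1");
        try
        {
            domain.DoCallBack(RecordDomainName);
            return (string)domain.GetData("name");
        }
        finally
        {
            // Unloading the domain also unloads every assembly that was loaded into it.
            AppDomain.Unload(domain);
        }
    }

    static void Main()
    {
        Console.WriteLine(RunInNewDomain()); // prints "Version1"
    }
}
```

After AppDomain.Unload, any proxy that still points into the domain (such as iex in the code above) can no longer be used.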



WeakReference: GC knows the Best

When I first read about weak references in .NET more than 5 years ago, my first thought was to use them for caching. The concept was already present in Java before .NET, since Java had garbage collection earlier. Even today I don't see many developers using this awesome class.

What is a Weak Reference?

We all know that the garbage collector cleans up the memory of objects that no longer have any references. A weak reference is a way to keep a pointer to an object that does not have any strong reference. When we need to access a weakly referenced object, we check whether the object is still alive and access it only if it is. Since .NET is a garbage collected runtime environment, like all GC based runtimes it does not immediately clean up the memory allocated for instantiated objects.

Why should we use it?

Holding a pointer to an object without holding a strong reference to it makes this class well suited for caching.

When to use Weak Reference?

The GC runs when there is memory pressure and cleans up unreferenced objects. If both memory and processing are expensive for your application, weak references let you reduce the pressure on both at the same time. Let's try an example ...

Let's assume that we have an object that contains 500KB of data, and that it is quite expensive to fetch because of the IO operation required to get it from the database and because we need to validate it against some rules. At the same time we need to have 1000 of these objects instantiated with different sets of data.

We can use a traditional cache, or we can fetch the instance from the database each time. Both solutions have their own flaw: the first uses too much memory and the second uses too much processing. The best solution here is to use weak references.

There are 2 cases possible when we need to access any instance of the object in question 

1. It may not have been garbage collected: The object is still in memory, so we can take a strong reference to it and use it. This saves processing, and the memory it uses adds no extra pressure, since the GC makes the best decision on when to collect.

2. It may have been collected and does not exist anymore: In this scenario we fetch the object again, so we use processing power. The memory pressure was high, the GC decided to collect, and our object went with that. Here again we are letting the GC decide when the memory pressure is high enough to justify an expensive processing action.

So the basic belief behind WeakReference is that "GC knows best". It will clean up memory when needed, and we puny humans should respect its decisions on memory management.
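
The two cases above map directly onto a small cache. The following is a minimal sketch under stated assumptions: WeakCache, its loader delegate and LoadCount are hypothetical names, the loader stands in for the expensive database fetch, and the class is not thread safe.

```csharp
using System;
using System.Collections.Generic;

class WeakCache<TValue> where TValue : class
{
    private readonly Dictionary<string, WeakReference> entries =
        new Dictionary<string, WeakReference>();
    private readonly Func<string, TValue> loader;

    // Number of times the expensive loader had to run.
    public int LoadCount { get; private set; }

    public WeakCache(Func<string, TValue> loader)
    {
        this.loader = loader;
    }

    public TValue Get(string key)
    {
        WeakReference weak;
        if (entries.TryGetValue(key, out weak))
        {
            // Case 1: take a strong reference first; Target is null if the GC collected it.
            TValue cached = weak.Target as TValue;
            if (cached != null)
                return cached;
        }

        // Case 2: never loaded, or collected under memory pressure, so fetch it again.
        TValue loaded = loader(key);
        LoadCount++;
        entries[key] = new WeakReference(loaded);
        return loaded;
    }
}

static class WeakCacheDemo
{
    static void Main()
    {
        // The loader here just allocates 500KB; a real one would hit the database.
        var cache = new WeakCache<byte[]>(key => new byte[500 * 1024]);

        byte[] first = cache.Get("record-1");
        byte[] second = cache.Get("record-1");

        // While 'first' is strongly referenced the entry cannot be collected.
        Console.WriteLine(ReferenceEquals(first, second)); // prints True
        Console.WriteLine(cache.LoadCount);                // prints 1
    }
}
```

Reading Target into a local variable before testing it avoids the gap between checking that an object is alive and actually using it.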

How to use WeakReferences?

All you need to do is create a WeakReference instance with the object in question passed into the constructor. Keep a strong reference to the WeakReference object itself. When the object is needed later, check whether it is alive through the 'IsAlive' property, then take a strong reference from the 'Target' property (which may still be null) and use it. The code sample below shows the lifetime of an object when using a weak reference ...

// Create the object
Book book = new Book("My first book", "Me");
// Set weak reference
WeakReference wr = new WeakReference(book);
// Remove the only strong reference to the book by making it null
book = null;

if (wr.IsAlive)
{
    Console.WriteLine("Book is alive");
    // Target can still become null if a collection happens after the IsAlive check
    Book book2 = wr.Target as Book;
    if (book2 != null)
        Console.WriteLine(book2.Title);
    book2 = null;
}
else
    Console.WriteLine("Book is dead");

// Let's see what happens after GC
GC.Collect();
// Should not be alive
if (wr.IsAlive)
    Console.WriteLine("Book is alive");
else
    Console.WriteLine("Book is dead");

The output should be

Book is alive
My first book
Book is dead

So folks ... that's all about weak references.
