Tuesday, September 30, 2008

What Happens to the .NET Code You Write

Ever wondered what happens to your code on compilation and execution?

Suppose you write some code like this:

static void Main(string[] args)
{
    TestClass testClass = new TestClass();
}



The managed language compiler (the C# compiler in this case) compiles it into MSIL (Microsoft Intermediate Language) like this:

.method private hidebysig static void Main(string[] args) cil managed
{
    .entrypoint
    .maxstack 1
    .locals init (
        [0] class DotNetCodeToMachineCode.TestClass testClass)
    L_0000: nop
    L_0001: newobj instance void DotNetCodeToMachineCode.TestClass::.ctor()
    L_0006: stloc.0
    L_0007: ret
}

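
If you want to look at the IL from inside a running program, reflection can hand you the raw bytes of a method body. The following is only a minimal sketch: the names IlDump, Sample and GetSampleIl are made up for illustration, and the exact bytes you get depend on the compiler and on Debug/Release settings.

```csharp
using System;
using System.Reflection;

namespace DotNetCodeToMachineCode
{
    class TestClass { }

    // Hypothetical helper, not part of the original post.
    public static class IlDump
    {
        // Same body as the Main() method shown in the post.
        static void Sample()
        {
            TestClass testClass = new TestClass();
        }

        // Returns the raw IL bytes of Sample() as emitted by the compiler.
        public static byte[] GetSampleIl()
        {
            MethodInfo method = typeof(IlDump).GetMethod(
                "Sample", BindingFlags.NonPublic | BindingFlags.Static);
            MethodBody body = method.GetMethodBody();
            return body.GetILAsByteArray();
        }

        public static void Main()
        {
            byte[] il = GetSampleIl();
            Console.WriteLine("IL size: " + il.Length + " bytes");
            // Hex dump of the IL; 73 is the newobj opcode and 2A is ret.
            Console.WriteLine(BitConverter.ToString(il));
        }
    }
}
```

In the dump, the four bytes after 73 are the metadata token of the TestClass constructor, which is why they will differ from one assembly to another.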
On execution, the CLR's JIT compiler translates this IL into native machine code. Written out as assembly language, it looks like this (for a 32-bit x86 processor):



//static void Main(string[] args)
//{
00000000 push ebp
00000001 mov ebp,esp
00000003 push edi
00000004 push esi
00000005 push ebx
00000006 sub esp,34h
00000009 xor eax,eax
0000000b mov dword ptr [ebp-10h],eax
0000000e xor eax,eax
00000010 mov dword ptr [ebp-1Ch],eax
00000013 mov dword ptr [ebp-3Ch],ecx
00000016 cmp dword ptr ds:[0091856Ch],0
0000001d je 00000024
0000001f call 794C717F
00000024 xor edi,edi
00000026 nop
//TestClass testClass = new TestClass();
00000027 mov ecx,3450248h
0000002c call FFCA0E54
00000031 mov esi,eax
00000033 mov ecx,esi
00000035 call FFCBB3C0
0000003a mov edi,esi
//}

Note: Here ebp, esp, eax, esi, edi, etc. are the general-purpose registers of the x86 processor.
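
You can also ask the runtime to JIT a method ahead of its first call and tell you where the native code starts. This is a rough sketch under a few assumptions: JitProbe, Sample and GetSampleEntryPoint are hypothetical names, and on some runtimes the returned address may point to a small stub rather than to the final machine code.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical helper, not part of the original post.
public static class JitProbe
{
    class TestClass { }

    // Same body as the Main() method shown in the post.
    static void Sample()
    {
        TestClass testClass = new TestClass();
    }

    // Asks the CLR to compile Sample() now and returns an entry address for it.
    public static IntPtr GetSampleEntryPoint()
    {
        MethodInfo method = typeof(JitProbe).GetMethod(
            "Sample", BindingFlags.NonPublic | BindingFlags.Static);
        RuntimeMethodHandle handle = method.MethodHandle;
        RuntimeHelpers.PrepareMethod(handle);   // trigger JIT compilation
        return handle.GetFunctionPointer();     // may be a stub on some runtimes
    }

    public static void Main()
    {
        IntPtr entry = GetSampleEntryPoint();
        Console.WriteLine("Entry point: 0x" + entry.ToInt64().ToString("X"));
    }
}
```

The address changes from run to run, just as the absolute addresses in the listing above are specific to the run in which they were captured.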



Assembly is only the human-readable form: what the JIT compiler actually places in the instruction area of the RAM is machine language (binary). The hex representation of the machine code is given below:

//static void Main(string[] args)
//{
00000000 55 //push ebp
00000001 8B EC //mov ebp,esp
00000003 57 //push edi
00000004 56 //push esi
00000005 53 //push ebx
00000006 83 EC 34 //sub esp,34h
00000009 33 C0 //xor eax,eax
0000000b 89 45 F0 //mov dword ptr [ebp-10h],eax
0000000e 33 C0 //xor eax,eax
00000010 89 45 E4 //mov dword ptr [ebp-1Ch],eax
00000013 89 4D C4 //mov dword ptr [ebp-3Ch],ecx
00000016 83 3D 6C 85 91 00 00 //cmp dword ptr ds:[0091856Ch],0
0000001d 74 05 //je 00000024
0000001f E8 5B 71 4C 79 //call 794C717F
00000024 33 FF //xor edi,edi
00000026 90 //nop
//TestClass testClass = new TestClass();
00000027 B9 48 02 45 03 //mov ecx,3450248h
0000002c E8 23 0E CA FF //call FFCA0E54
00000031 8B F0 //mov esi,eax
00000033 8B CE //mov ecx,esi
00000035 E8 86 B3 CB FF //call FFCBB3C0
0000003a 8B FE //mov edi,esi
//}
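
Notice that the bytes after E8 do not spell out the call target directly. The x86 "call rel32" instruction stores a little-endian displacement that is added to the address of the next instruction. Here is a small sketch that redoes that arithmetic for the three calls in the listing; CallDecoder and DecodeCallTarget are names invented for this example.

```csharp
using System;

// Hypothetical helper, not part of the original post.
public static class CallDecoder
{
    // Decodes a 5-byte "call rel32" (opcode E8) found at the given offset
    // and returns the target address the disassembler would show.
    public static uint DecodeCallTarget(uint offset, byte[] code)
    {
        if (code.Length != 5 || code[0] != 0xE8)
            throw new ArgumentException("expected a 5-byte E8 call");

        // The operand is stored little-endian: lowest byte first.
        uint rel32 = unchecked((uint)(code[1] | code[2] << 8 | code[3] << 16 | code[4] << 24));

        // The displacement is relative to the instruction after the call.
        uint next = offset + 5;
        return unchecked(next + rel32);   // wraps around for backward calls
    }

    public static void Main()
    {
        uint target = DecodeCallTarget(0x1F, new byte[] { 0xE8, 0x5B, 0x71, 0x4C, 0x79 });
        Console.WriteLine(target.ToString("X8"));   // prints 794C717F
    }
}
```

The same calculation gives FFCA0E54 for the call at offset 2c and FFCBB3C0 for the call at offset 35, matching the comments in the listing.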



But remember that in RAM it is stored as pure binary, like this:

//static void Main(string[] args)
//{
00000000 01010101 //push ebp
00000001 10001011 11101100 //mov ebp,esp
00000003 01010111 //push edi
00000004 01010110 //push esi
00000005 01010011 //push ebx
00000006 10000011 11101100 00110100 //sub esp,34h
00000009 00110011 11000000 //xor eax,eax
0000000b 10001001 01000101 11110000 //mov dword ptr [ebp-10h],eax
0000000e 00110011 11000000 //xor eax,eax
00000010 10001001 01000101 11100100 //mov dword ptr [ebp-1Ch],eax
00000013 10001001 01001101 11000100 //mov dword ptr [ebp-3Ch],ecx
00000016 10000011 00111101 01101100 10000101 10010001 00000000 00000000 //cmp dword ptr ds:[0091856Ch],0
0000001d 01110100 00000101 //je 00000024
0000001f 11101000 01011011 01110001 01001100 01111001 //call 794C717F
00000024 00110011 11111111 //xor edi,edi
00000026 10010000 //nop
//TestClass testClass = new TestClass();
00000027 10111001 01001000 00000010 01000101 00000011 //mov ecx,3450248h
0000002c 11101000 00100011 00001110 11001010 11111111 //call FFCA0E54
00000031 10001011 11110000 //mov esi,eax
00000033 10001011 11001110 //mov ecx,esi
00000035 11101000 10000110 10110011 11001011 11111111 //call FFCBB3C0
0000003a 10001011 11111110 //mov edi,esi
//}
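
Going from the hex listing to the binary one is a mechanical step: every byte becomes eight bits. A tiny sketch of that conversion follows; BitView and ToBits are names chosen just for this example.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the original post.
public static class BitView
{
    // Renders each byte as an 8-digit binary string, separated by spaces.
    public static string ToBits(params byte[] bytes)
    {
        return string.Join(" ",
            bytes.Select(b => Convert.ToString(b, 2).PadLeft(8, '0')));
    }

    public static void Main()
    {
        Console.WriteLine(ToBits(0x55));          // push ebp    -> 01010101
        Console.WriteLine(ToBits(0x8B, 0xEC));    // mov ebp,esp -> 10001011 11101100
    }
}
```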



When a method [“Main()” in this case] is called, the return address (the place where the caller resumes afterwards) is pushed onto the “Call Stack”, and the method's frame there holds its parameters and local variables. The reference variables (object pointers) are also kept in that frame, or in registers. These references point to their objects, which reside in the heap area of the RAM.
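
The split between a reference and the object it points to is easy to see from C# itself. In this short sketch (StackAndHeap, SameObject and the Value field are invented for illustration), copying a reference variable does not copy the object on the heap:

```csharp
using System;

// Hypothetical helper, not part of the original post.
public static class StackAndHeap
{
    class TestClass { public int Value; }

    // Returns true when both local references point to one heap object.
    public static bool SameObject()
    {
        TestClass first = new TestClass();   // the object itself lives on the heap
        TestClass second = first;            // copies the reference, not the object
        second.Value = 42;                   // the change is visible through both
        return object.ReferenceEquals(first, second) && first.Value == 42;
    }

    public static void Main()
    {
        Console.WriteLine(SameObject());     // prints True
    }
}
```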



The instruction bytes are fetched from the RAM into the processor (normally chunks of them are buffered in the processor's L1/L2 cache for speedy access) and executed one by one. Intermediate results, flags and certain pointers (stack pointer, program counter, etc.) are kept in the processor registers. Results are written back to the RAM. When the method completes, the items that were pushed for its execution (the topmost items) are popped and removed from the stack. This cycle repeats for the rest of the methods as well.
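
You can watch the stack grow during a call with the StackTrace class. Treat the following as a rough sketch: FrameDepth, Inner and ExtraFramesInCallee are hypothetical names, and the NoInlining attributes are there because the JIT may otherwise merge small methods and hide the extra frame.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

// Hypothetical helper, not part of the original post.
public static class FrameDepth
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    static int Inner()
    {
        // Number of frames on the call stack while Inner() is running.
        return new StackTrace().FrameCount;
    }

    // Returns how many more frames exist inside the callee than in the caller.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int ExtraFramesInCallee()
    {
        int here = new StackTrace().FrameCount;   // depth before the call
        int inCallee = Inner();                   // depth during the call
        return inCallee - here;
    }

    public static void Main()
    {
        Console.WriteLine("Extra frames while in callee: " + ExtraFramesInCallee());
    }
}
```

Once Inner() returns, its frame is gone again, which is the "pop" described above.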



You don’t have to worry about all these steps while you “JIT and Run”. But it is interesting to think that you are juggling thousands of binary digits while writing a few lines of code! Hope you enjoyed it!
