Saturday, July 10, 2010

Just-In-Time Compiler for Managed Platform - Part 1: Code Generation

First, we need to generate an executable block of code.

So let us write some code in C++:
#include <stdio.h>

int add(int x, int y)
{
    int r;
    r = x + y;
    return r;
}

int main()
{
    int r1 = add(13, 23);
    printf("Returned value = %d", r1);
    return 0;
}

Great! It returns the right value. But that is very basic. What we want is to generate the function from data in a simple buffer.

First we need the machine code equivalent of the function above:

unsigned char addcode[] = {  
0x55, //push ebp
0x8B, 0xEC, //mov ebp,esp
0x81, 0xEC, 0xC0, 0x00, 0x00, 0x00, //sub esp,0C0h
0x53, //push ebx
0x56, //push esi
0x57, //push edi

//r=x+y;
0x8B, 0x45, 0x08, //mov eax,dword ptr [x]
0x03, 0x45, 0x0C, //add eax,dword ptr [y]
0x89, 0x45, 0xF8, //mov dword ptr [r],eax

//return r;
0x8B, 0x45, 0xF8, //mov eax,dword ptr [r]

0x5F, //pop edi
0x5E, //pop esi
0x5B, //pop ebx
0x8B, 0xE5, //mov esp,ebp
0x5D, //pop ebp

0xC3 //ret
};
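The comments in the listing name the operands x, y and r, but the bytes themselves encode offsets from ebp. In each of those three-byte instructions the middle byte 0x45 selects "eax, [ebp + 8-bit displacement]", and the last byte is the displacement as a signed value. The sketch below decodes those displacements; the helper names `ebp_offset` and `print_frame_layout` are made up for this illustration and are not part of the original code.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper (not from the original post): given one of the
// three-byte instructions in addcode whose second byte is 0x45, returns
// the signed offset from ebp encoded in its last byte.
int ebp_offset(const unsigned char insn[3])
{
    // 0x45 = mod 01 (8-bit displacement), reg 000 (eax), r/m 101 (ebp).
    assert(insn[1] == 0x45);
    const int disp = insn[2];
    // The displacement is a two's-complement signed byte.
    return disp < 0x80 ? disp : disp - 0x100;
}

// Prints where each variable of add() lives relative to ebp.
void print_frame_layout()
{
    const unsigned char load_x[3]  = {0x8B, 0x45, 0x08}; // mov eax,dword ptr [x]
    const unsigned char add_y[3]   = {0x03, 0x45, 0x0C}; // add eax,dword ptr [y]
    const unsigned char store_r[3] = {0x89, 0x45, 0xF8}; // mov dword ptr [r],eax

    std::printf("x: [ebp%+d]\n", ebp_offset(load_x));
    std::printf("y: [ebp%+d]\n", ebp_offset(add_y));
    std::printf("r: [ebp%+d]\n", ebp_offset(store_r));
}
```

Calling `print_frame_layout()` shows x at [ebp+8], y at [ebp+12] and r at [ebp-8]: the two arguments sit above the saved ebp and the return address, and the local variable sits below.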

This code was generated by the Visual Studio compiler for 32-bit x86, so the examples here must be built as a 32-bit program. We use it to build our own code block in memory.

To get a block of memory that can hold executable code, we use the following Windows API:

LPVOID WINAPI VirtualAlloc(
__in_opt LPVOID lpAddress,
__in SIZE_T dwSize,
__in DWORD flAllocationType,
__in DWORD flProtect
);

First we allocate a 4096-byte executable memory block and store its address in a function pointer:

int (*addfn)(int, int) = (int (*)(int, int)) VirtualAlloc(NULL, 4096,  MEM_COMMIT, PAGE_EXECUTE_READWRITE);

Then we copy our executable code to this memory block:

memcpy(addfn, addcode, sizeof(addcode)); 

Now the magic: we call the function and print the return value:

int r1 = (*addfn)(13,23); 
printf("Returned value = %d", r1);

That's easy, right?

Now we release the memory, since we are good citizens:

VirtualFree(addfn, 0, MEM_RELEASE);
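The allocate, copy and release steps above do not check for failure, and they keep the block writable and executable at the same time. A more defensive variant might look like the sketch below. It is only an illustration under a few assumptions: the helper names `load_code` and `free_code` are hypothetical, the block is written while it is read/write and only then switched to read/execute with `VirtualProtect`, and the non-Windows branch (`mmap`/`mprotect`) is an addition so the sketch also compiles elsewhere. It loads the bytes but does not call them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#ifdef _WIN32
#include <windows.h>
#else
#include <sys/mman.h> // POSIX fallback, not part of the original post
#endif

// Hypothetical helper: copies `size` bytes of machine code into a fresh
// region of `region` bytes, then makes the region read + execute.
// Returns nullptr on any failure.
void* load_code(const unsigned char* code, std::size_t size,
                std::size_t region = 4096)
{
    if (code == nullptr || size == 0 || size > region) return nullptr;
#ifdef _WIN32
    void* mem = VirtualAlloc(NULL, region, MEM_COMMIT | MEM_RESERVE,
                             PAGE_READWRITE);
    if (mem == NULL) return nullptr;
    std::memcpy(mem, code, size);
    DWORD oldProtect = 0;
    if (!VirtualProtect(mem, region, PAGE_EXECUTE_READ, &oldProtect)) {
        VirtualFree(mem, 0, MEM_RELEASE);
        return nullptr;
    }
    // Make sure the CPU does not run stale bytes from this range.
    FlushInstructionCache(GetCurrentProcess(), mem, size);
#else
    void* mem = mmap(nullptr, region, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return nullptr;
    std::memcpy(mem, code, size);
    if (mprotect(mem, region, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, region);
        return nullptr;
    }
#endif
    return mem;
}

// Hypothetical helper: releases a region obtained from load_code.
void free_code(void* mem, std::size_t region = 4096)
{
    if (mem == nullptr) return;
#ifdef _WIN32
    (void)region; // MEM_RELEASE requires a size of 0
    VirtualFree(mem, 0, MEM_RELEASE);
#else
    const int rc = munmap(mem, region);
    assert(rc == 0);
    (void)rc;
#endif
}
```

With these helpers, the call site would cast the result of `load_code(addcode, sizeof(addcode))` to `int (*)(int, int)`, check it against NULL before calling it, and finish with `free_code((void*)addfn)`.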

That's it. Here is the full code:
#include <windows.h>
#include <stdio.h>
// ...

unsigned char addcode[] = {
0x55, //push ebp
0x8B, 0xEC, //mov ebp,esp
0x81, 0xEC, 0xC0, 0x00, 0x00, 0x00, //sub esp,0C0h
0x53, //push ebx
0x56, //push esi
0x57, //push edi

//r=x+y;
0x8B, 0x45, 0x08, //mov eax,dword ptr [x]
0x03, 0x45, 0x0C, //add eax,dword ptr [y]
0x89, 0x45, 0xF8, //mov dword ptr [r],eax

//return r;
0x8B, 0x45, 0xF8, //mov eax,dword ptr [r]

0x5F, //pop edi
0x5E, //pop esi
0x5B, //pop ebx
0x8B, 0xE5, //mov esp,ebp
0x5D, //pop ebp

0xC3 //ret
};

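int main()
{
    // Allocate one executable page and copy the machine code into it.
    int (*addfn)(int, int) = (int (*)(int, int)) VirtualAlloc(NULL, 4096, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
    memcpy(addfn, addcode, sizeof(addcode));

    // Call the copied bytes as a function.
    int r1 = (*addfn)(13, 23);
    printf("Returned value = %d", r1);

    // The size argument must be 0 when releasing with MEM_RELEASE.
    VirtualFree(addfn, 0, MEM_RELEASE);

    return 0;
}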

That's all for now. We can now generate code in memory and execute it, which is the first step toward designing a JIT compiler.
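As a small step beyond a hard-coded array, the same bytes can be produced at run time by appending them to a growable buffer. The sketch below is one possible shape for that, not a design taken from this post: the `Emitter` struct, its methods and `emit_add` are hypothetical names, and it only builds the 32-bit x86 byte sequence without executing it.

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

// Hypothetical emitter: appends raw 32-bit x86 bytes to a growable buffer.
struct Emitter {
    std::vector<unsigned char> bytes;

    // Appends the given bytes verbatim.
    void emit(std::initializer_list<unsigned char> bs)
    {
        bytes.insert(bytes.end(), bs);
    }

    // Emits "op eax, [ebp+disp]" (or the store form for 0x89).
    // opcode: 0x8B = load, 0x03 = add, 0x89 = store.
    void eax_ebp(unsigned char opcode, int disp)
    {
        assert(disp >= -128 && disp <= 127); // must fit in one signed byte
        emit({opcode, 0x45, static_cast<unsigned char>(disp & 0xFF)});
    }
};

// Builds the same byte sequence as the addcode array.
std::vector<unsigned char> emit_add()
{
    Emitter e;
    e.emit({0x55});                               // push ebp
    e.emit({0x8B, 0xEC});                         // mov ebp,esp
    e.emit({0x81, 0xEC, 0xC0, 0x00, 0x00, 0x00}); // sub esp,0C0h
    e.emit({0x53, 0x56, 0x57});                   // push ebx / esi / edi
    e.eax_ebp(0x8B, 8);                           // mov eax,dword ptr [x]
    e.eax_ebp(0x03, 12);                          // add eax,dword ptr [y]
    e.eax_ebp(0x89, -8);                          // mov dword ptr [r],eax
    e.eax_ebp(0x8B, -8);                          // mov eax,dword ptr [r]
    e.emit({0x5F, 0x5E, 0x5B});                   // pop edi / esi / ebx
    e.emit({0x8B, 0xE5});                         // mov esp,ebp
    e.emit({0x5D});                               // pop ebp
    e.emit({0xC3});                               // ret
    return e.bytes;
}
```

The vector returned by `emit_add()` holds the same 31 bytes as `addcode`, so its `data()` and `size()` could be passed to `memcpy` in place of the array.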