I've been experimenting with building a JIT in Python, and came across Keystone Engine.
If you only plan on using it from within Python, you can install the entire Keystone module using pip,
pip install keystone-engine
To write our experimental JIT, we'll need to use a few functions from the Windows API. The basic idea is that we need to allocate some memory, copy the assembled code into that memory, and then make it executable.
The functions we need are VirtualProtect, VirtualAlloc, and VirtualFree.
We can use the ctypes module to interface with these Windows API functions. It's just a matter of translating each function's respective arguments and return types into ctypes types.
import ctypes
We'll first write an error check function, which ctypes calls implicitly after each API call to verify the result. If there is an error, we know that these API functions all call SetLastError, and ctypes.WinError will call GetLastError to retrieve the proper error code when it builds the exception.
def win32_bool_ptr_errcheck(result, func, args):
    if not result:
        raise ctypes.WinError()
    return result
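To see the errcheck protocol in isolation, here is a minimal sketch that runs on any platform. It is only an illustration: fake_api is a hypothetical stand-in built from a Python callback rather than a real Windows function, and raise_on_zero raises a plain OSError where the function above uses ctypes.WinError.

```python
import ctypes


def raise_on_zero(result, func, args):
    # ctypes passes the raw result, the function object, and the argument tuple.
    if not result:
        raise OSError(f"call failed with arguments {args!r}")
    return result


# Hypothetical stand-in for a BOOL-returning API: succeeds only for positive sizes.
PROTOTYPE = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)
fake_api = PROTOTYPE(lambda size: 1 if size > 0 else 0)
fake_api.errcheck = raise_on_zero

print(fake_api(16))  # the check passes, so the result is returned unchanged

try:
    fake_api(0)  # a zero result makes the errcheck hook raise
except OSError as exc:
    print("caught:", exc)
```

The caller never checks the return value itself. The hook attached through errcheck does it on every call.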
For VirtualProtect the signature is,
BOOL VirtualProtect(
    LPVOID lpAddress,
    SIZE_T dwSize,
    DWORD  flNewProtect,
    PDWORD lpflOldProtect
);
And we can write that in Python as,
VirtualProtect = ctypes.windll.kernel32.VirtualProtect
VirtualProtect.restype = bool
VirtualProtect.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                           ctypes.c_int, ctypes.POINTER(ctypes.c_int)]
VirtualProtect.errcheck = win32_bool_ptr_errcheck
For VirtualAlloc,
LPVOID VirtualAlloc(
    LPVOID lpAddress,
    SIZE_T dwSize,
    DWORD  flAllocationType,
    DWORD  flProtect
);
the respective Python,
VirtualAlloc = ctypes.windll.kernel32.VirtualAlloc
VirtualAlloc.restype = ctypes.c_void_p
VirtualAlloc.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                         ctypes.c_int, ctypes.c_int]
VirtualAlloc.errcheck = win32_bool_ptr_errcheck
and finally VirtualFree,
BOOL VirtualFree(
    LPVOID lpAddress,
    SIZE_T dwSize,
    DWORD  dwFreeType
);
the respective Python,
VirtualFree = ctypes.windll.kernel32.VirtualFree
VirtualFree.restype = bool
VirtualFree.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
VirtualFree.errcheck = win32_bool_ptr_errcheck
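All three bindings follow the same pattern: look up the function, then set restype, argtypes, and errcheck. As a sketch only, a small helper could fold that into one call. The name bind is hypothetical, and the demonstration uses Py_IsInitialized from the Python C API purely because ctypes.pythonapi is available on every platform.

```python
import ctypes


def bind(library, name, restype, argtypes, errcheck=None):
    """Look up `name` in `library` and attach its ctypes signature."""
    function = getattr(library, name)
    function.restype = restype
    function.argtypes = argtypes
    if errcheck is not None:
        function.errcheck = errcheck
    return function


# Portable demonstration. The C signature is: int Py_IsInitialized(void);
is_initialized = bind(ctypes.pythonapi, "Py_IsInitialized", ctypes.c_int, [])
print(is_initialized() != 0)

# On Windows, the VirtualFree binding from this post would then read:
# VirtualFree = bind(ctypes.windll.kernel32, "VirtualFree", bool,
#                    [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int],
#                    win32_bool_ptr_errcheck)
```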
VirtualAlloc and VirtualProtect both take a memory protection flag (flProtect and flNewProtect respectively), and VirtualAlloc takes an additional allocation type flag, flAllocationType. We can represent these in Python using the enum module,
import enum

class Page(enum.IntEnum):
    EXECUTE = 0x10
    EXECUTE_READ = 0x20
    EXECUTE_READWRITE = 0x40
    EXECUTE_WRITECOPY = 0x80
    NOACCESS = 0x01
    READONLY = 0x02
    READWRITE = 0x04
    WRITECOPY = 0x08

class Memory(enum.IntFlag):
    COMMIT = 0x00001000
    RESERVE = 0x00002000
    RESET = 0x00080000
    RESET_UNDO = 0x1000000
    DECOMMIT = 0x00004000
    RELEASE = 0x00008000
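As a quick illustration of why enums are convenient here, the sketch below repeats a small subset of the two classes so that it runs on its own. It shows IntFlag members being combined, an enum member being handed to ctypes as a plain integer, and a raw protection value being turned back into a named member. The old_protect value is made up for the example, not read from a real API call.

```python
import ctypes
import enum


class Page(enum.IntEnum):
    READWRITE = 0x04
    EXECUTE = 0x10


class Memory(enum.IntFlag):
    COMMIT = 0x00001000
    RESERVE = 0x00002000


# IntFlag members combine with | and still behave as integers.
flags = Memory.COMMIT | Memory.RESERVE
print(hex(flags))
print(Memory.COMMIT in flags)

# ctypes accepts enum members wherever it expects an integer.
print(ctypes.c_int(flags).value == 0x3000)

# A raw value written by an API call can be mapped back to a named member.
old_protect = ctypes.c_int(0x04)  # illustrative value only
print(Page(old_protect.value).name)
```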
We'll be implementing a simple square function,
int square(int a) {
    return a * a;
}
We know that under the Windows x64 calling convention the first integer argument is passed in rcx and the return value is placed in rax, so a function like this would assemble to,
square:
    mov rax, rcx
    imul rax, rax
    ret 0
We'll define the signature in Python for our square function like so, taking an integer and returning an integer,
SQUARE_PROC = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)
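A CFUNCTYPE prototype can also wrap an ordinary Python callable, which gives us a reference implementation with exactly the same signature to compare the generated code against later. This is just a sketch, and the names square_in_python and reference_square are made up for the illustration.

```python
import ctypes

SQUARE_PROC = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)


def square_in_python(a):
    # Plain Python version of the C function we want to generate.
    return a * a


# Wrapping the callable yields an object that is called like a C function pointer.
reference_square = SQUARE_PROC(square_in_python)

print(reference_square(2))   # 4
print(reference_square(-3))  # 9
```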
Using Keystone is quite simple in Python. We just create an instance of Ks with the hardware architecture and the mode, and then encode our assembly.
import keystone

CODE = b"""
mov rax, rcx
imul rax, rax
ret 0
"""
# asm() returns the encoding as a list of byte values plus the statement count.
ks = keystone.Ks(keystone.KS_ARCH_X86, keystone.KS_MODE_64)
encoding, _ = ks.asm(CODE)
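Since the encoding comes back as a list of integer byte values, it can help to print it as hex before copying it anywhere. The sketch below uses a hypothetical helper, format_encoding, and a hand-assembled sample of the same three instructions (with a plain ret) so that it runs without Keystone installed. The sample bytes are an assumption for illustration, and Keystone may well pick a different but equivalent encoding.

```python
def format_encoding(encoding):
    """Render a list of byte values as space-separated hex."""
    return " ".join(f"{byte:02x}" for byte in encoding)


# Hand-assembled sample, not Keystone output:
#   48 89 c8     mov  rax, rcx
#   48 0f af c0  imul rax, rax
#   c3           ret
SAMPLE_ENCODING = [0x48, 0x89, 0xC8, 0x48, 0x0F, 0xAF, 0xC0, 0xC3]

print(format_encoding(SAMPLE_ENCODING))  # 48 89 c8 48 0f af c0 c3

# bytes() turns the list into the buffer that gets copied into memory.
machine_code = bytes(SAMPLE_ENCODING)
print(len(machine_code))  # 8
```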
Now we just need to allocate some memory for the encoding, copy over the bytes from the Keystone encoding, change the permissions to allow execution, and cast the memory to a function pointer so that we can call it.
# Allocate space for the encoding.
memory = VirtualAlloc(None, len(encoding), Memory.COMMIT, Page.READWRITE)

# Copy over the bytes from the Keystone encoding.
ctypes.memmove(memory, bytes(encoding), len(encoding))

# Modify the permissions to allow it to be executed.
old_protect = ctypes.c_int(0)
VirtualProtect(memory, len(encoding), Page.EXECUTE, ctypes.byref(old_protect))

# Create a callable function from our signature.
square = SQUARE_PROC(memory)

# And finally, call our function.
print(square(2))

# Release the memory once we are done with it.
VirtualFree(memory, 0, Memory.RELEASE)
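One thing the straight-line version does not handle is a failure between the allocation and the final VirtualFree, which would leak the memory. A possible way around that, sketched here under assumptions, is a small context manager. The name allocated_region and the fake allocator are hypothetical, and they stand in for the Windows calls so that the example runs anywhere.

```python
import contextlib


@contextlib.contextmanager
def allocated_region(allocate, release, size):
    """Yield an address from allocate(size) and always hand it back to release()."""
    address = allocate(size)
    try:
        yield address
    finally:
        release(address)


# Fake allocator and release function, for illustration only.
released = []


def fake_allocate(size):
    return 0x1000


def fake_release(address):
    released.append(address)


try:
    with allocated_region(fake_allocate, fake_release, 16) as memory:
        raise RuntimeError("simulated failure while running generated code")
except RuntimeError:
    pass

print(released == [0x1000])  # the region was released despite the error

# With the bindings from this post, the two callables could be:
#   lambda size: VirtualAlloc(None, size, Memory.COMMIT, Page.READWRITE)
#   lambda address: VirtualFree(address, 0, Memory.RELEASE)
```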
And we've created our square function at run time and can execute it.
You can view the source for this experiment here.