This is part of my series on EvilVM, a project to create a tiny injectable VM with language support for EvilTM agent features for pen-testers like me. In other words, a programming language and agent framework specifically designed for post-exploitation. The goal is to allow ad-hoc, reliable adaptation of malicious agents to situational needs during a pen-test. And to build it from scratch for fun (of course). The index and overview for this project can be found here.

Executing Instructions

Last time I got my basic garbage collector up and running. In the future I’ll make some enhancements to it – I’ve been reading more about GC design, which is a topic that goes as deep as you want. But for now, I’d like to start getting some actual code running. Code running means my GC needs a bytecode virtual machine wrapped around it. I’ve looked over the design of a few other VMs – notably, the JVM, the Squeak Smalltalk VM, Lua, and a few toy VMs on github. I believe I’m doing this in a relatively normal way, though this is also a topic that goes as deep as you like.

The first thing that came to mind was, “What instructions do I need?” I.e., what does my VM actually need to do? Most language interpreter / compiler tutorials walk through doing basic arithmetic first. In fact, what I find most frustrating about most language tutorials is that they often stop there, leaving me hanging on how to handle a more complex (and actually useful) language.

So I think I’m going to take a slightly different approach to building and presenting my bytecode VM. The first order of business is the ability to run instructions. What are instructions? What is code? Executable code may as well be the same as in a physical machine – a sequence of bytes. In hardware architecture, instruction sets are very complicated, but I’m targeting something small and simple. So code looks like this:

uchar *code;

An instruction is a single byte. It either takes no argument or one argument, and the argument can’t be anything bigger than a 32-bit number. To execute code, the interpreter needs a function for each instruction, and some process for walking through the sequence of instructions. It turns out that C has some mechanics that work perfectly for this; I’ll use a loop and a table of function pointers. The functions look like this:

typedef uchar* (*funptr)(VM *vm, uchar *code);
uchar* (*vm_funtable[256])(VM *vm, uchar *code);

If you’re not used to function pointers in C, this says that a “funptr” is a pointer to a function that takes a “VM” struct and a pointer to an unsigned character, returning another pointer to an unsigned character. In other words, the instruction receives the current configuration of the VM and the current address being executed. It has to return another code pointer because some instructions may redirect execution (think of jumps, branches, function calls, etc.). In the average case, the function will just return the address of the next instruction. In any case, if the instructions work this way, then the VM executes code with a loop like this:

EVILAPI void *vm_exec(VM *vm, uchar *code) {
  uchar *ip   = code;
  
  while (ip && !(*ip == 'h')) {
    ip = (*vm_funtable[*ip])(vm, ip);
  }
}

I’ll be adding some more stuff in there eventually (error handling, performance monitoring, etc.). But that’s basically it – one whole interpreter. And if you didn’t notice, I slipped in the definition of our first instruction. It’s the HALT instruction, designated by a 0x68 byte (or ‘h’ in C). This means that our interpreter can already evaluate a very simple program:

vm_exec(vm, "h");

This program isn’t very useful, but at least we know we can successfully halt execution! Halting is about as basic as it gets. But there’s something else that’s a little less basic, but just as important – NOPs! A NOP is a “no operation” instruction. I want to define a default NOP so that any undefined instruction is just treated as such. So this will look something like this:

uchar *vm_instr_nop(VM *vm, uchar *code) {
      return code + 1;
    }

A NOP just does nothing, so all our interpreter function for this instruction has to do is increment the pointer to the next instruction’s address. It seems silly, but we want to at least be able to handle erroneous code somewhat gracefully, as well as giving the programmer and/or compiler flexibility in putting code together. To make this functional, we add NOPs to the vm_funtable:

void vm_init_funtable() {
  for (int i = 0; i < 256; i++) {
    vm_funtable[i] = &vm_instr_nop;
  }
}

So now our slightly more useful VM can run these exciting programs, which will run a bunch of NOPs and then halt:

vm_exec(vm, "nnnnnnnnh");
vm_exec(vm, "Hello worldh");
vm_exec(vm, "abcdefgh");

We want to do interesting things, though, so we need some powerful instructions. Let’s start with an instruction that interacts with the stack. The PUSHC instruction puts a literal character onto the stack. We need to define a function to handle this instruction:

uchar *vm_instr_pushc(VM *vm, uchar *code) {
  object *obj = vm_object(vm, TYPE_CHAR);
  obj->i8_value = *(code + 1);
  vm_push(obj, vm);
  return code + 2;
}

So what’s going on here? As I said above, the function starts with the current VM state and the address of the current instruction (which should be a PUSHC char – you’ll see in a minute that it’s denoted by the character ‘c’). The immediate value in this case is a single character, which is just the character literal we want to put on the stack. We allocate space with the GC for an object, make sure it’s the character type, and set its value. Once it’s pushed on the stack, we increment the instruction pointer by 2 and return the address of the next instruction. Why 2? PUSHC is an instruction with a 1-byte argument, so it takes up 2 bytes of code space.

Now we can add this definition to the vm_funtable array, and our interpreter should be able to understand this new instruction:

void vm_init_funtable() {
  for (int i = 0; i < 256; i++) {
    vm_funtable[i] = &vm_instr_nop;
  }
  vm_funtable['c'] = &vm_instr_pushc;
}

So, as before, now we can make an interesting program that actually does something:

vm_exec(vm, "cHceclclcoc cwcocrclcdc!h");
// stack: 'H' 'e' 'l' 'l' 'o' ' ' 'w' 'o' 'r' 'l' 'd' '!'

In a similar manner, we can create functions that push other objects onto the stack. For example, here’s one way we can push a 32-bit integer onto the stack:

uchar *vm_instr_pushi(VM *vm, uchar *code) {
  object *obj = vm_object(vm, TYPE_INT);
  obj->i32_value = *((int*)(code + 1));
  vm_push(obj, vm);
  return code + 5;
}

This code just takes the next location in the code region, casts it to an integer pointer, and reads it in to an object to push on the stack. Note that this code’s exact behavior will possibly vary depending on the endianness of the machine where it’s compiled. It does make it efficient though, not to need to do a bunch of bit shifting to get the value out. Also notice that since a 32-bit integer takes up 4 bytes, that once this instruction is done, the instruction pointer must increase by 5. Here’s an example using it:

vm_exec(vm, "i\x01\x00\x00\x00h";

This pushes the number 1 onto the stack (watch that endianness!).

Alright, that’s it for this installment – next time I’ll go into booleans, tests, and jumps. That’s where this bytecode VM will actually get useful for writing interesting code.