I’m not using this blog space for much right now, so I thought I’d start a project and document it over time here. This will be the main post for the EvilVM project, and I’ll try to link subsequent posts into this article over time so it’ll become a sort of index page for it.

  1. EvilVM: Garbage Collector (2016-06-25)
  2. EvilVM: Executing Instructions (2016-07-08)
  3. EvilVM: Choosing Useful Instructions (2016-07-23)
  4. EvilVM: Building an Assembler (2016-07-31)
  5. EvilVM: Concatenative Language – coming soon

Introduction & Motivation

I’m a pen-tester for work, but in real life I’m a hacker. I don’t mean hacker like Jonny Lee Miller and Angelina Jolie. I’m also not a hacker like you see at Defcon. I mean I’m a hacker like what ESR talks about in the Jargon File. Before I was in infosec, I was a programmer, computer science student, and UNIX aficionado. My tech heroes were the guys in the Story of Mel and Steven Levy’s book about the MIT hacker subculture.

A hacker, properly defined by ESR, is:

A person who enjoys exploring the details of programmable systems and how to stretch their capabilities, as opposed to most users, who prefer to learn only the minimum necessary. RFC1392, the Internet Users’ Glossary, usefully amplifies this as: A person who delights in having an intimate understanding of the internal workings of a system, computers and computer networks in particular.

Probably my favorite piece of tech is programming languages. I’m a language hobbyist, picking up new ones all the time, hoping to make my toolbox bigger and broader so I can get stuff done quickly and powerfully when required. In the pen-testing world, I have to make some concessions in this area, because for some reason most infosec researchers stick with just a couple of languages.

But I end up writing code all the time. In the PCI world, penetration testing comes with a challenging “boss level” for every assessment. Once you work your way around a network and have an escalated position, your final stage is to compromise the cardholder data environment. If they did their job right, it’s pretty hard. And so I end up doing a lot of post-exploitation, trying to mount a reasonable attack on the segmentation boundary. This often means custom malicious agents that operate on end-user systems, and the specifics of their operation change with every environment and segmentation design.

I’ve experimented with augmenting Meterpreter, building agents that embed other languages (like Lua, which really worked out pretty well, to be honest), and just embracing the pain and writing things in C (like my RSA-pwning keylogger I presented at DerbyCon last year). But none of these solutions has been ideal.

Cue EvilVM

I think it’s time to build a malicious programming language. There are unique programming challenges that you find in writing malicious agents, and no matter what existing environment you choose, you’re going to be making compromises and choosing which part you’re OK with still being “hard”. You can use a fun, fuzzy language, but then low-level stuff is hard. You can use your favorite scripting language, but then you’re bundling a big interpreter and a bunch of files. You can use PowerShell, but then you’re fighting all the insane restrictions Microsoft seems to think should apply to scripts but not to binaries. Etc.

In this project, I’m going to build EvilVM – a runtime environment that implants a super-small bytecode VM on a target system. On top of this, I will build a companion language (or languages, why settle for one?) for compiling agent behavior for this VM, and the infrastructure needed to sustain persistent communication and interaction with the VM. Along the way, I have a few ideas about core language and runtime features that I think might make a good platform for remote-managed agents.
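To make “bytecode VM” concrete, here is a minimal sketch of a stack-based interpreter loop in Python. It is not EvilVM’s actual design: the opcode names (PUSH, ADD, MUL, PRINT, HALT), the one-byte encoding, and the `run` function are all assumptions made up for illustration. It only shows the general fetch-decode-execute shape that a very small VM tends to have.

```python
from typing import Callable, List

# Hypothetical one-byte opcodes; a real instruction set would differ.
PUSH, ADD, MUL, PRINT, HALT = range(5)


def run(code: bytes, out: Callable[[int], None] = print) -> List[int]:
    """Interpret `code` and return whatever is left on the stack."""
    stack: List[int] = []
    pc = 0  # program counter: index of the next byte to decode
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == PUSH:
            # PUSH takes a one-byte immediate operand.
            stack.append(code[pc])
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == PRINT:
            out(stack.pop())
        elif op == HALT:
            break
        else:
            raise ValueError(f"unknown opcode {op} at offset {pc - 1}")
    return stack


# (2 + 3) * 4, written as bytecode for the toy instruction set above.
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
# run(program) returns [20]
```

The whole interpreter is one loop and a handful of branches, which is why a VM core can stay tiny: the size budget goes to whatever primitives the instruction set exposes, not to the dispatch logic.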

A few basic thoughts about my initial requirements:

  • The EvilVM agent must be very small: tens of kilobytes at most
    • Single file
    • EXE or injectable DLL
  • I/O streams should be heavily abstracted to allow even very strange communications strategies
  • I may borrow some Erlang concepts for multi-processing and reliability
  • Easy interoperability with the native environment
    • Use native DLLs
    • Call into C code
    • Manipulate native structures
  • Delivery of additional code at runtime
    • Compiled bytecode for the VM
    • Delivery of DLLs or C object code
  • High level, expressive language
    • First class functions
    • Dynamic typing
    • Metaprogramming
    • Very small core language
  • High availability
    • “Hypervisor” layer control of deployed VM
    • Reflection and discoverability
  • Native support for “malicious” actions
    • DLL injection
    • Manipulating security contexts / tokens
    • Monitoring for ‘target’ behavior
    • Keylogging

There are also some anti-requirements – things that are definitely not the objective:

  • Performance – if it’s within 100x of C, I’m good
  • Large standard library
  • Graphics, databases, etc.

So far, this has been a really fun project, and I hope to keep it up to completion. I’ve been a language nerd forever, but this’ll be the first full-spectrum language project I’ve done – from core language to compiler for a custom bytecode VM. Can’t wait!