Well, I guess I settled on that last addressing mode. It's two more modes, which pretty much just copy the existing immediate-memory-location and something-from-the-stack addressing modes.

Again, my terminology might be way off, but I'm going to call it indirect-whatever mode.

The assembler syntax for it is **address or *$stack_offset.

Here's what it's doing in code for the "fetch" part. (Excuse the crappy temporary error handling for now.)

inline LilyVM::Word LilyVM::fetchParameterByType(LilyVM::ParamType type)
{
    // Each mode consumes at most one operand Word from the instruction stream.
    switch(type) {
        case PARAM_NULL:             return 0;
        case PARAM_IMMEDIATE:        return fetchInstruction();
        case PARAM_ADDRESS:          return fetchRaw(fetchInstruction());
        case PARAM_STACK:            return fetchRaw(stackPointer + fetchInstruction());
        case PARAM_INDIRECT_ADDRESS: return fetchRaw(fetchRaw(fetchInstruction()));
        case PARAM_INDIRECT_STACK:   return fetchRaw(fetchRaw(stackPointer + fetchInstruction()));
    }

    // TODO: Throw error (bad instruction).
    cout << "Bad addressing mode for read: " << type << endl;

    return 0;
}

And here's a little example snippet of assembly demonstrating it.

    mov 123 **someAddress
    mov 123 *$somestackOffset
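
To make the difference between the modes concrete, here's a tiny standalone sketch that reads the same operand Word through each mode. This is not the LilyVM source: `MiniVM`, `readOperandTwo`, the flat `memory` vector, and the memory contents are all made up for illustration, and there's no bounds checking. Only the switch mirrors the fetch function above.

```cpp
#include <cstdint>
#include <vector>

// Standalone toy, NOT the real LilyVM: names and layout are assumptions.
using Word = std::uint32_t;

enum ParamType {
    PARAM_NULL, PARAM_IMMEDIATE, PARAM_ADDRESS, PARAM_STACK,
    PARAM_INDIRECT_ADDRESS, PARAM_INDIRECT_STACK
};

struct MiniVM {
    std::vector<Word> memory;
    Word programCounter = 0;
    Word stackPointer = 0;

    // Raw memory read; no bounds checking in this sketch.
    Word fetchRaw(Word address) const { return memory[address]; }

    // Read the Word at the program counter and advance past it.
    Word fetchInstruction() { return memory[programCounter++]; }

    Word fetchParameterByType(ParamType type) {
        switch (type) {
            case PARAM_NULL:             return 0;
            case PARAM_IMMEDIATE:        return fetchInstruction();
            case PARAM_ADDRESS:          return fetchRaw(fetchInstruction());
            case PARAM_STACK:            return fetchRaw(stackPointer + fetchInstruction());
            case PARAM_INDIRECT_ADDRESS: return fetchRaw(fetchRaw(fetchInstruction()));
            case PARAM_INDIRECT_STACK:   return fetchRaw(fetchRaw(stackPointer + fetchInstruction()));
        }
        return 0; // unknown mode
    }
};

// Build a toy memory image and read the operand Word (value 2) in the given mode.
Word readOperandTwo(ParamType type) {
    MiniVM vm;
    //         address: 0  1  2  3  4   5  6   7
    vm.memory         = {2, 0, 4, 0, 77, 6, 99, 0};
    vm.programCounter = 0; // the operand Word lives at address 0 and holds 2
    vm.stackPointer   = 3; // base for the stack-relative modes
    return vm.fetchParameterByType(type);
}
```

With that image, the operand 2 resolves to 2 (immediate), 4 (the Word at address 2), 6 (the Word at stack pointer + 2), 77 (the Word that address 2 points to), and 99 (the Word that the stack slot points to).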

The ability to add in-line, immediate data was also added in the last couple of revisions. There are assembly directives to have the next few Words be some explicit values, or to fill the next 'n' Words with some specific value.
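
On the assembler side, the two kinds of data directive boil down to appending Words to the output image. Here's a rough sketch of that idea. The function names (`emitWords`, `emitFill`, `buildExampleImage`) are hypothetical, and the image being a plain `std::vector` of Words is an assumption for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <vector>

using Word = std::uint32_t;

// Hypothetical "explicit values" directive: append the given Words as-is.
void emitWords(std::vector<Word> &image, std::initializer_list<Word> values) {
    image.insert(image.end(), values.begin(), values.end());
}

// Hypothetical "fill" directive: append `count` copies of `value`.
void emitFill(std::vector<Word> &image, std::size_t count, Word value) {
    image.insert(image.end(), count, value);
}

// Example: three literal Words followed by four zeroed Words.
std::vector<Word> buildExampleImage() {
    std::vector<Word> image;
    emitWords(image, {1, 2, 3});
    emitFill(image, 4, 0);
    return image;
}
```

The result of `buildExampleImage()` is seven Words long: 1, 2, 3, then four zeros.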

I think the incomplete VM code might almost be not-embarrassing enough to toss up on GitHub soon.

Next up: Profiling and comparisons against Lua and native code. Then taking out some of the modulo crap and trying it again.

[ permalink / comments ]

To test the existing functionality of LilyVM and its assembler, I decided to implement a simple, well-known cryptography algorithm, RC4. It's not the kind that anyone should use today, because it's got a lot of known effective attacks against it, but it's an easy real-world example of a simple algorithm to play with.

I also like to use it as a random number generator.
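
For anyone who hasn't seen it, RC4 is small enough to sketch in a few lines of plain C++. This is a generic textbook-style version, not the LilyVM assembly and not the older implementation mentioned below. The `RC4` struct and `rc4Crypt` helper are names picked just for this example. Using 8-bit indices means the "mod 256" steps happen through wraparound.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Plain C++ reference sketch of RC4 (illustrative, not the VM version).
struct RC4 {
    std::uint8_t s[256];
    std::uint8_t i = 0;
    std::uint8_t j = 0;

    // Key-scheduling algorithm (KSA). The key must be non-empty.
    explicit RC4(const std::string &key) {
        for (int n = 0; n < 256; ++n) s[n] = static_cast<std::uint8_t>(n);
        std::uint8_t k = 0;
        for (int n = 0; n < 256; ++n) {
            const std::uint8_t keyByte =
                static_cast<std::uint8_t>(key[n % key.size()]);
            k = static_cast<std::uint8_t>(k + s[n] + keyByte);
            std::swap(s[n], s[k]); // first swap
        }
    }

    // Pseudo-random generation algorithm (PRGA): produce one keystream byte.
    std::uint8_t next() {
        i = static_cast<std::uint8_t>(i + 1);
        j = static_cast<std::uint8_t>(j + s[i]);
        std::swap(s[i], s[j]); // second swap
        return s[static_cast<std::uint8_t>(s[i] + s[j])];
    }
};

// XOR the data with the keystream. Encrypting and decrypting are the same operation.
std::vector<std::uint8_t> rc4Crypt(const std::string &key, const std::string &data) {
    RC4 rc4(key);
    std::vector<std::uint8_t> out;
    out.reserve(data.size());
    for (unsigned char c : data) {
        out.push_back(static_cast<std::uint8_t>(c ^ rc4.next()));
    }
    return out;
}
```

A quick sanity check is the commonly cited test vector: key "Key" with plaintext "Plaintext" should give the bytes BB F3 16 E8 D9 40 AF 0A D3. Calling `next()` by itself is the random-number-generator use.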

Lessons learned tonight:

  • Some more addressing modes to handle things like dereferencing pointers without separate load/store instructions would make things MUCH more concise. I think I can add more addressing modes without any overhead.
  • I thought about adding a "swap" instruction because it's done twice in this example. Still not sure about this one.
  • I need better debugging tools! (You'd think inspecting the state of your own VM would be easy.)
  • The code is getting a bit messy and needs a cleanup pass.
  • I really really really need a way to define arbitrary data. In this example I used instructions in place of data just so I had data to work with on the algorithm. Something to just occupy a block of some number of Words, literal numbers, and possibly literal strings would go a long way here.
  • Emacs's asm-mode is barely suitable for this. Maybe I just need to get used to it.
  • I ended up using the stack as though it was just a big series of registers. I guess it's how local variables really would be used normally, so maybe it's not so bad.
  • Being able to create labels that map to arbitrary values instead of just memory locations could have replaced a lot of the arbitrarily-numbered stack positions.
  • I lack sufficient error handling. I want to avoid excessive error checking, and I want to run in places where exceptions are disabled. (I'm looking at you, Unreal 4.) I might have to resort to the evil black magic that is setjmp()/longjmp().
  • I need function calls. I have no kind of calling convention or anything. There was only a little bit of duplicated code here, though.
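
On the setjmp()/longjmp() idea from the list above, here's a minimal sketch of what exception-free error handling could look like. Everything in it (`TinyVM`, `VmError`, `fail`, `run`) is hypothetical and only meant to show the shape of the technique. The big caveat: longjmp() skips destructors, so nothing with a non-trivial destructor can be live between the setjmp() and the longjmp().

```cpp
#include <csetjmp>
#include <cstdint>

using Word = std::uint32_t;

// Hypothetical error codes for this sketch.
enum VmError { VM_OK = 0, VM_BAD_ADDRESSING_MODE = 1 };

struct TinyVM {
    std::jmp_buf errorJump;     // where to land when something goes wrong
    VmError lastError = VM_OK;  // stored in the VM, not in a local variable

    // Bail out of arbitrarily deep VM code without using exceptions.
    [[noreturn]] void fail(VmError error) {
        lastError = error;
        std::longjmp(errorJump, 1);
    }

    // Stand-in for a real fetch: only mode 0 is considered valid here.
    Word fetchParameterByType(int type) {
        if (type != 0) fail(VM_BAD_ADDRESSING_MODE);
        return 0;
    }

    // Run one fake "instruction" and report what happened.
    VmError run(int type) {
        lastError = VM_OK;
        if (setjmp(errorJump) != 0) {
            // We arrive here via longjmp() from fail().
            return lastError;
        }
        fetchParameterByType(type);
        return VM_OK;
    }
};
```

With this, `TinyVM vm; vm.run(0)` returns `VM_OK` and `vm.run(7)` returns `VM_BAD_ADDRESSING_MODE`, with no checks needed on the normal path between `run` and the point of failure.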

Here's the RC4 implementation I made a long time ago that I tested it against.

The actual code follows below. Excuse the lack of syntax highlighting. I guess I need to make the syntax a little more like some common assembler syntax to use an off-the-shelf syntax highlighter.

View full article

[ permalink / comments ]

Disclaimer: I am not experienced at this, so I might use the wrong terminology, or have some totally stupid ideas. The point is to learn by doing, and sometimes that involves failing and looking really stupid. Now having said that...

Last night I started working on a project that I've been toying with the idea of for a while.

I want to make a virtual machine that meets all the following criteria:

  • Portable, with no weird dependencies. Should just compile with my utilities library and standard C++. Should run on every platform supported by C++.
  • Sandboxable. Should be able to run untrusted code. (Lua is not built for this.)
  • Really fast. As fast as possible without just going and writing a JIT. (Don't want to deal with individual platform weirdness just yet.)
  • Pause/resume capable. (Something DerpScript is not.)
  • Can save and restore VM state. (Something Lua cannot do with any level of sanity.)

And optionally:

  • A new LLVM backend targeted to it, or an assembler that can read LCC's intermediate assembly representation in a similar manner to q3asm.

And for these goals:

  • Scripting for games, allowing fast iteration time and dynamic reloading of game logic. The execution time must be fast to be suitable for this.
  • User-programmable game scripts that can safely be run on a multiplayer server, even though they're written by potentially hostile players. We need sandboxing, suspend/resume support, and the ability to take and restore state snapshots for this to work out.
  • General purpose scripting. Maybe get it integrated with that texture generator tool I was working on. For this, it needs a sane API and the ability to hold references to data outside of the VM. I don't know how I'm going to handle the latter part, and the way I did it in DerpScript isn't going to cut it.

So after one night of work I have the start of the VM, a mostly complete assembler (not using LCC's assembly), and a disassembler. At the moment it's possible to write some very simple assembly programs and do basic flow control and memory access. Only the "add" instruction has been implemented on the arithmetic side.

The specs of the VM are (right now):

  • No addressable general purpose registers. Everything is just direct memory access. There's a program counter and stack pointer, and that's it at the moment. So far there is no way to modify them directly. Push/pop instructions exist as a way to modify the stack pointer, with no direct access (yet). Jump and branch instructions modify the program counter, but there's no way to read it (yet).
  • Three addressing modes: Immediate, direct memory, and indirect with an immediate value as an offset from the stack pointer. There are additional instructions for dereferencing pointers in memory. (This could all change, because I'd rather have the dereference-pointer ops be replaced with addressing modes built into the opcodes.)
  • Memory is divided into 32-bit Words. Each instruction with encoded addressing modes takes up exactly one Word, plus one Word per parameter. This can change with a modification to a typedef to change the meaning of Word, but there's a minimum usable size. 8-bit Words won't work just because the encoded instructions won't fit in them.
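
To illustrate the "one Word per instruction, with the addressing modes encoded in it" idea, here's a sketch of packing an opcode and two parameter modes into a single Word. The bit layout is purely an assumption for the example, not the encoding the VM actually uses, and the function names are made up.

```cpp
#include <cstdint>

using Word = std::uint32_t;

// Assumed layout for illustration only (NOT the real encoding):
//   bits 0-7   opcode
//   bits 8-11  addressing mode of parameter 1
//   bits 12-15 addressing mode of parameter 2
constexpr Word encodeInstruction(Word opcode, Word mode1, Word mode2) {
    return (opcode & 0xFFu) | ((mode1 & 0xFu) << 8) | ((mode2 & 0xFu) << 12);
}

// Extract the opcode from an encoded instruction Word.
constexpr Word decodeOpcode(Word instruction) {
    return instruction & 0xFFu;
}

// Extract the addressing mode of parameter `index` (0 or 1).
constexpr Word decodeMode(Word instruction, unsigned index) {
    return (instruction >> (8u + 4u * index)) & 0xFu;
}

// Total size in Words: one for the instruction itself plus one per parameter.
constexpr Word instructionLength(Word parameterCount) {
    return 1u + parameterCount;
}
```

Under this made-up layout, `encodeInstruction(0x12, 1, 3)` gives 0x3112, and it needs 16 bits, which shows why an 8-bit Word would be too small to hold an encoded instruction.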

That's all I have right now. I'll add more information as I piece it together. I haven't done a good job of making the code presentable, so no source code or example code yet. But maybe soon.

[ permalink / comments ]

Copyright © 2015 Kiri Jolly
All rights reserved