2016-11-16 Virtual Machine Memory Layout for a Functional Language for Embedded Systems

I am currently designing a virtual machine for new language, with the following goals:

  • Memory constrained – should be able to run on a Pebble watch, which has 22KB available for user applications, as well as on larger systems.
  • Purely Functional Language on top, means that pointers always point back in time/space, – which has nice features that can be exploited in the garbage collector.
  • Clojure-like Persistent Data Structures means we will have tons of short-lived objects.

The best approach for a garbage collector, is probably a generational mark-and-compact. The mark-and-compact has the benefit of no fragmentation, and thus saving memory. Due to pointers always point backwards, it is possible to use the same algorithm in generations.

In order to be compact, the memory word-size of the virtual machine depends on the amount of memory, i.e. 16 bit if less than 64K.

Every chunk of memory has the following layout:

  • one word with the size of the memory chunk size
  • one word containing a type tag, and a count of pointers ptr (which are always placed in the beginning of the chunk).
  • one word used for garbage collecting and compacting gc.
  • ptr words of pointers or tagged integers (if uneven, shift one right)
  • sizeptr * bytes-per-word bytes of data

Allocation is just to create a new chunk after the most recent chunk. The tag/sizes are set to their proper value, the gc-word and pointers are zeroed out.

Garbage collection happens through 4 phases:

  1. Marking: gc-word is used to keep a -1 -terminated linked lists of memory chunks yet to visit.
  2. Address calculation. Memory is traversed, a counter keeps track of used memory, and the gc-word of every live chunk, is set to its address after movement. (a possible optimisation to add here later, is to merge free memory to make the next two travesals faster, – needs to be benchmarked to see if it has effect)
  3. Address update. Memory is traversed, and pointers in live objects are updated to be the gc-word of the chunk pointed to.
  4. Compaction: the live chunks are moved to their new location, and the gc-word is zeroed out. All the empty space is now in the end of the heap.

Because data is immutable, we do not have pointers from old to young generations, and we can just garbage a younger part of the heap, or all of it, with the same algorithm.

The execution stack is growing downwards, and the heap is growing upwards.