Memory Model
This section covers the 32-byte ray_t block header, the buddy allocator, the slab cache, thread-local heaps, the arena allocator, and copy-on-write reference counting.
Rayforce never calls malloc, calloc, realloc, or free directly. All allocation goes through ray_alloc / ray_free (general) or ray_arena_alloc (bulk short-lived). Only allocator internals may use ray_sys_alloc / ray_sys_free.
The ray_t Block Header
Every object in Rayforce — atoms, vectors, lists, tables — begins with a 32-byte ray_t header. Data follows immediately at byte 32 via a flexible array member (data[]).
```c
typedef union ray_t {
    /* Allocated: object header */
    struct {
        /* Bytes 0-15: nullable bitmask / slice / ext nullmap */
        union {
            uint8_t nullmap[16];
            struct { ray_t *slice_parent; int64_t slice_offset; };
            struct { ray_t *ext_nullmap;  ray_t *sym_dict; };
            struct { ray_t *str_ext_null; ray_t *str_pool; };
        };
        /* Bytes 16-31: metadata + value */
        uint8_t  mmod;   /* 0=heap, 1=file-mmap */
        uint8_t  order;  /* block order (size = 2^order) */
        int8_t   type;   /* negative=atom, positive=vector */
        uint8_t  attrs;  /* attribute flags */
        uint32_t rc;     /* reference count (0=free) */
        union {
            int64_t i64;  /* I64/SYMBOL/DATE/TIME atom */
            double  f64;  /* F64 atom */
            ray_t  *obj;  /* pointer to child */
            struct { uint8_t slen; char sdata[7]; }; /* SSO string */
            int64_t len;  /* vector element count */
        };
        uint8_t data[];  /* element data (flexible array) */
    };
    /* Free: buddy allocator block */
    struct {
        ray_t *fl_prev;
        ray_t *fl_next;
    };
} ray_t;
```
Field Details
| Bytes | Field | Purpose |
|---|---|---|
| 0-15 | nullmap[16] | Inline null bitmap for vectors with up to 128 elements. For longer vectors, ext_nullmap points to a separate bitmap vector. For RAY_SYM columns, sym_dict points to the dictionary. For RAY_STR columns, str_pool points to the string pool. For slices, stores parent pointer and offset. |
| 16 | mmod | Memory mode: 0 = heap-allocated, 1 = file memory-mapped |
| 17 | order | Buddy allocator block order. Block size = 2^order bytes. Range: 6..30 (64 bytes to 1 GB). |
| 18 | type | Signed byte: negative = atom, 0 = LIST, 1-13 = vector types (BOOL through STR), 98 = TABLE, 99 = DICT, 100-103 = function types, 127 = ERROR |
| 19 | attrs | Attribute flags: RAY_ATTR_ARENA (arena-allocated, retain/release are no-ops), slice flag, sorted flag |
| 20-23 | rc | Reference count. 0 = free block. Incremented by ray_retain, decremented by ray_release. |
| 24-31 | i64 / f64 / len | For atoms: the value itself (up to 8 bytes, or 7-byte SSO string). For vectors: element count. For lists/tables: child count. |
| 32+ | data[] | Element array. Layout depends on type: contiguous typed elements for vectors, ray_t* pointers for lists/tables. |
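To make the order/type encoding concrete, here is a small sketch that decodes the header fields described above (the helper names are illustrative, not part of the Rayforce API):

```c
#include <stdint.h>

/* Illustrative helpers (not part of the Rayforce API): decode the
 * order and type fields described in the table above. */
static int64_t ray_block_size(uint8_t order)   { return (int64_t)1 << order; }
static int64_t ray_payload_size(uint8_t order) { return ray_block_size(order) - 32; } /* minus header */
static int     ray_is_atom(int8_t type)        { return type < 0; }
```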
When a block is free, the header is reinterpreted as the fl_prev / fl_next free-list pointers. The buddy allocator chains free blocks through these fields without any additional bookkeeping memory.
Buddy Allocator
The primary allocator uses a classic buddy system with order-based free lists. Block sizes are powers of two, ranging from order 6 (64 bytes, enough for a ray_t header + 32 bytes of data) to order 30 (1 GB).
Allocation
- Compute the minimum order that fits the requested size + 32-byte header
- Search the free list for that order
- If empty, try the next larger order and split the block, placing the unused half on the lower-order free list
- Repeat until a block is found or allocation fails
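The order computation in the first step can be sketched as follows (a simplified version, assuming the 32-byte header and RAY_ORDER_MIN = 6 from this section):

```c
#include <stddef.h>
#include <stdint.h>

#define RAY_ORDER_MIN 6

/* Smallest order whose block (2^order bytes) fits the requested
 * payload plus the 32-byte ray_t header. */
static uint8_t order_for(size_t payload) {
    size_t  need  = payload + 32;
    uint8_t order = RAY_ORDER_MIN;
    while (((size_t)1 << order) < need)
        order++;
    return order;
}
```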
Deallocation
- Check if the buddy block (the other half of the parent) is also free
- If so, remove the buddy from its free list and coalesce both halves into a single block of order+1
- Repeat coalescing until the buddy is not free or maximum order is reached
- Place the final block on the appropriate free list
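Finding the buddy in the first step reduces to flipping one bit of the block's offset, since buddies of order k are the two halves of an order-(k+1) block. A sketch (helper names are mine, assuming blocks are addressed by byte offset from the heap region base):

```c
#include <stddef.h>
#include <stdint.h>

/* Buddies of order k are the two halves of an order-(k+1) block,
 * so their offsets differ only in bit k. */
static size_t buddy_offset(size_t off, uint8_t order) {
    return off ^ ((size_t)1 << order);
}

/* Offset of the coalesced order-(k+1) parent block. */
static size_t parent_offset(size_t off, uint8_t order) {
    return off & ~((size_t)1 << order);
}
```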
Constants governing the allocator:
```c
#define RAY_ORDER_MIN 6  /* minimum block: 64 bytes */
#define RAY_ORDER_MAX 30 /* maximum block: 1 GB */
```
Slab Cache
For the most common allocation sizes, a slab cache provides O(1) allocation and deallocation by maintaining per-order stacks of pre-split blocks.
```c
#define RAY_SLAB_CACHE_SIZE 64 /* max blocks cached per order */
#define RAY_SLAB_ORDERS     5  /* orders 6..10 (64B to 1KB) */
```
The slab cache covers orders 6 through 10 (64 bytes to 1 KB), which account for the vast majority of allocations: scalar atoms, short vectors, list nodes, and string pool chunks. When a slab is empty, it refills from the buddy allocator. When full, freed blocks fall through to buddy deallocation.
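The per-order stacks can be sketched like this (struct and function names are illustrative, not the real internals):

```c
#include <stddef.h>
#include <stdint.h>

#define RAY_SLAB_CACHE_SIZE 64 /* max blocks cached per order */
#define RAY_SLAB_ORDERS     5  /* orders 6..10 */

typedef struct {
    void *blocks[RAY_SLAB_ORDERS][RAY_SLAB_CACHE_SIZE];
    int   top[RAY_SLAB_ORDERS]; /* stack depth per order */
} slab_cache_t;

/* O(1) pop; NULL means the slab is empty and the caller refills
 * from the buddy allocator. */
static void *slab_pop(slab_cache_t *c, uint8_t order) {
    int i = order - 6;
    return c->top[i] > 0 ? c->blocks[i][--c->top[i]] : NULL;
}

/* O(1) push; returns 0 when the slab is full, in which case the
 * block falls through to buddy deallocation. */
static int slab_push(slab_cache_t *c, uint8_t order, void *blk) {
    int i = order - 6;
    if (c->top[i] == RAY_SLAB_CACHE_SIZE) return 0;
    c->blocks[i][c->top[i]++] = blk;
    return 1;
}
```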
Thread-Local Heaps
Each thread has its own heap, identified by a heap_id (u16). Heap IDs are allocated from an atomic bitmap, ensuring no two threads share a heap.
Why Per-Thread Heaps?
The buddy allocator's free lists are not thread-safe by design. Rather than adding locks (which would destroy performance for the most common allocation path), each thread operates on its own set of free lists. This eliminates all contention on the fast path.
Cross-Heap Free
When thread A allocates a block and thread B needs to free it (common in parallel query execution), the free is deferred:
- Thread B pushes the block onto a lock-free LIFO (compare-and-swap linked list) associated with thread A's heap
- Thread A periodically calls ray_heap_flush_foreign() to drain this LIFO and coalesce the blocks into its own free lists
The handoff API:
```c
ray_heap_push_pending(block);   /* enqueue to owning heap's LIFO */
ray_heap_drain_pending();       /* drain LIFO into local free lists */
ray_heap_flush_foreign();       /* reclaim all foreign-freed blocks */
```
This design means the common case (thread-local alloc/free) is completely lock-free, and the uncommon case (cross-thread free) uses a single atomic CAS per block.
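The LIFO itself can be sketched with C11 atomics (names here are illustrative; the real implementation presumably threads the list through the free blocks' fl_next fields):

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct pending_node {
    struct pending_node *next;
} pending_node_t;

typedef struct {
    _Atomic(pending_node_t *) head;
} pending_lifo_t;

/* Cross-thread free: a single CAS pushes the block onto the owning
 * heap's pending list. */
static void pending_push(pending_lifo_t *q, pending_node_t *n) {
    pending_node_t *h = atomic_load(&q->head);
    do {
        n->next = h;
    } while (!atomic_compare_exchange_weak(&q->head, &h, n));
}

/* Owner side: one atomic exchange detaches the whole list, which is
 * then walked without locks into the local free lists. */
static pending_node_t *pending_drain(pending_lifo_t *q) {
    return atomic_exchange(&q->head, NULL);
}
```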
Arena Allocator
For bulk short-lived allocations (e.g., the symbol intern table, temporary parse trees), Rayforce provides an arena (bump) allocator.
```c
/* Create an arena */
ray_arena_t arena;
ray_arena_init(&arena);

/* Allocate from the arena (bump pointer, no individual free) */
void *ptr = ray_arena_alloc(&arena, size);

/* Reset (reuse memory without returning to OS) */
ray_arena_reset(&arena);

/* Destroy (return all memory) */
ray_arena_destroy(&arena);
```
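Internally, a bump allocator is just a cursor advance. A simplified sketch over one fixed chunk (the real arena presumably chains chunks on overflow; struct and function names are mine, and the 16-byte alignment is an assumption):

```c
#include <stddef.h>

typedef struct {
    char  *base;  /* current chunk */
    size_t used;
    size_t cap;
} bump_arena_t;

/* Bump allocation: align up to 16 bytes and advance the cursor.
 * Returns NULL on overflow (a real arena would grab a new chunk). */
static void *bump_alloc(bump_arena_t *a, size_t size) {
    size_t aligned = (size + 15) & ~(size_t)15;
    if (a->used + aligned > a->cap)
        return NULL;
    void *p = a->base + a->used;
    a->used += aligned;
    return p;
}

/* Reset: one store frees everything at once. */
static void bump_reset(bump_arena_t *a) { a->used = 0; }
```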
Blocks allocated from an arena carry the RAY_ATTR_ARENA flag in their attrs field. This flag makes ray_retain and ray_release no-ops, since arena blocks are freed all at once when the arena is destroyed. This eliminates reference counting overhead for temporary objects.
Use Cases
- Symbol intern table — String atoms are arena-allocated since they live for the duration of the process
- Parse trees — Rayfall parser allocates AST nodes from a per-parse arena, freed after compilation
- Temporary buffers — Intermediate hash tables, sort buffers, and other transient structures
COW Ref Counting
Rayforce uses copy-on-write (COW) semantics to allow safe sharing of data between multiple consumers without copying.
API
| Function | Behavior |
|---|---|
| ray_retain(v) | Increment reference count. If RAY_ATTR_ARENA is set, this is a no-op. |
| ray_release(v) | Decrement reference count. If it reaches zero, free the block (and recursively release children for lists/tables). No-op for arena blocks. |
| ray_cow(v) | If rc > 1, allocate a new block and copy the data, returning the new block (caller owns it). If rc == 1, return the same block (caller already has exclusive ownership). |
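The rc == 1 fast path is what makes COW cheap: exclusively owned data is mutated in place. A toy model of the semantics (a simplified struct, not ray_t; I assume the caller's reference to a shared original transfers to the copy):

```c
#include <stdlib.h>

typedef struct {
    unsigned rc;
    long     val;
} obj_t;

/* COW sketch: rc == 1 means exclusive ownership, so return the same
 * object; rc > 1 means shared, so hand back a private copy and move
 * the caller's reference off the original. */
static obj_t *obj_cow(obj_t *o) {
    if (o->rc == 1)
        return o;
    obj_t *copy = malloc(sizeof *copy);
    copy->rc  = 1;
    copy->val = o->val;
    o->rc--;  /* caller's reference transfers to the copy */
    return copy;
}
```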
COW Cleanup Pattern
After ray_cow returns a new copy, all error paths must release it to prevent leaks. The standard pattern uses goto fail:
```c
ray_t *original = vec;
vec = ray_cow(vec);
if (RAY_IS_ERR(vec)) return vec;

/* ... mutate vec ... */
if (error_condition) goto fail;
return vec;

fail:
if (vec != original) ray_release(vec);
return RAY_ERR_PTR(err);
```
How COW Enables Zero-Copy Operations
- Slices — ray_vec_slice creates a new header pointing into the parent's data, incrementing the parent's refcount. No data is copied.
- Table projections — Selecting a subset of columns creates new table headers that share the underlying column vectors via retain.
- CSR sharing — Multiple graph queries can share the same CSR vectors. Mutations (e.g., adding edges) trigger COW only on the modified vector.
- DAG intermediates — When a DAG node's output feeds multiple consumers, the vector is shared via retain rather than copied.
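The slice mechanism can be sketched with simplified structs (not the real ray_t layout, where the parent pointer and offset live in the first 16 header bytes):

```c
#include <stdint.h>

typedef struct {
    int64_t *data;
    int64_t  len;
    unsigned rc;
} vec_sketch_t;

typedef struct {
    vec_sketch_t *parent;  /* retained: keeps the parent alive */
    int64_t       offset;
    int64_t       len;
} slice_sketch_t;

/* Slicing copies no elements; it only bumps the parent's refcount. */
static slice_sketch_t slice_make(vec_sketch_t *v, int64_t off, int64_t n) {
    v->rc++;
    return (slice_sketch_t){ v, off, n };
}

/* Reads resolve through the parent's data at offset + i. */
static int64_t slice_at(const slice_sketch_t *s, int64_t i) {
    return s->parent->data[s->offset + i];
}
```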
Allocation API Summary
| Function | Purpose | Thread Safety |
|---|---|---|
| ray_alloc(size) | General allocation (buddy + slab) | Thread-local (no locks) |
| ray_free(v) | General deallocation (may defer cross-heap) | Lock-free CAS for cross-heap |
| ray_retain(v) | Increment ref count | Atomic increment |
| ray_release(v) | Decrement ref count, free at zero | Atomic decrement + free |
| ray_cow(v) | Copy-on-write (copy if shared) | Safe (reads rc atomically) |
| ray_arena_alloc(a, size) | Bump allocation from arena | Not thread-safe (per-arena) |
| ray_arena_reset(a) | Reset arena (reuse memory) | Not thread-safe |
| ray_arena_destroy(a) | Free all arena memory | Not thread-safe |
| ray_sys_alloc(size) | System allocator (mmap/VirtualAlloc) | Thread-safe (OS-level) |
| ray_sys_free(ptr, size) | System deallocator (munmap/VirtualFree) | Thread-safe (OS-level) |
Memory Layout Visualization
A complete picture of how a vector looks in memory:
```c
/* ray_t *vec = ray_vec_new(RAY_I64, 4); with values [10, 20, 30, 40] */

Offset  Field        Value
------  -----        -----
0       nullmap[16]  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
16      mmod         0 (heap)
17      order        7 (block = 128 bytes = 32 header + 96 data)
18      type         5 (RAY_I64)
19      attrs        0
20      rc           1 (one reference)
24      len          4 (4 elements)
32      data[0]      10 (int64_t)
40      data[1]      20
48      data[2]      30
56      data[3]      40
```
For a vector with nulls, the nullmap bits at offset 0 indicate which elements are null (bit 0 = element 0, etc.). Vectors with more than 128 elements use ext_nullmap to point to a separate bitmap vector.
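The bit addressing works out as follows (a small sketch of the bit-0-is-element-0 convention described above; the helper name is mine):

```c
#include <stdint.h>

/* Element i is null when bit (i & 7) of nullmap[i >> 3] is set,
 * so the inline 16-byte map covers 128 elements. */
static int is_null(const uint8_t nullmap[16], int64_t i) {
    return (nullmap[i >> 3] >> (i & 7)) & 1;
}
```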