Documentation

Everything you need to build, query, and analyze with Rayforce — the pure C17 zero-dependency columnar dataframe library with a native graph engine.

Getting Started

Quick Start — Install prerequisites, build from source, start the REPL, create your first table, and run your first query in under 5 minutes.

Rayfall Language

Syntax & Types — Complete reference for the Rayfall language: atoms (integers, floats, booleans, symbols, strings, dates, times, timestamps, GUIDs), vectors, lists, tables, dictionaries, function calls, quoting, and comments.

Functions Reference — All 170+ built-in functions organized by category: arithmetic, comparison, logic, aggregation, higher-order, collection, sorting, table, query, join, I/O, string, temporal, type, and system operations.

Queries

Select & Update — The select and update query verbs with filtering, grouping, column expressions, and aggregation. Bridges directly to the Rayforce DAG executor.

Joins — Left joins, inner joins, window joins, and as-of joins. Time-windowed matching for event data.

Linked Columns — Integer columns that store row indices into another table for O(1) field access. Replaces hash-probe foreign keys with a single array indirection per row.

Pivot & Window — Pivot tables for reshaping data, xbar for bucketing, and xrank for ranking within groups.

Vector Search — Embeddings, cosine/L2/inner-product metrics, brute-force KNN, HNSW-accelerated ANN, and filter-aware retrieval via select ... nearest ... take.

Indexes & Acceleration

Indexes Overview — The full landscape of index-like structures Rayforce ships: per-column accelerators, HNSW vector indexes, linked columns, partition pruning, CSR graph indices. Mental model plus a decision matrix that points you at the right tool.

Accelerator Indexes — Per-column .idx.zone / .idx.hash / .idx.sort / .idx.bloom structures. Build once at user request, ride alongside the column through the pipeline, drop on mutation.

Indexes Guide — When to bother building one, how to choose between the kinds, worked workflows for analytics / ANN / cross-table / parted-table queries, and the lifecycle traps.

Operations

Math & Aggregation — Arithmetic operators (+ - * / %), rounding, aggregates (sum, avg, min, max, med, dev, count), and statistical functions.

String Operations — String splitting, pattern matching with like, concatenation, and format.

C API

Core API — The ray_t abstraction, vector operations, table construction, and the single public header include/rayforce.h.

DAG & Execution — Build lazy DAGs with ray_graph_new(), chain operations, and execute with the fused morsel-driven executor.

Graph Engine

CSR Storage — Double-indexed CSR (forward + reverse) edge indices, columnar file persistence, and mmap support.

Algorithms — 1-hop expand, BFS variable-length paths, shortest paths, A*, Yen's K-shortest, betweenness/closeness centrality, clustering coefficients, random walks, MST, and worst-case optimal joins (LFTJ).

Architecture

Pipeline & Optimizer — The lazy DAG execution pipeline: type inference, constant folding, sideways information passing, factorize, predicate pushdown, filter reorder, fusion, and dead code elimination.

Memory Model — Buddy allocator with thread-local arenas, slab cache, COW ref counting, arena (bump) allocator, per-VM heaps with lock-free cross-heap reclamation.

Storage

Files & Partitions — Cross-platform file I/O with locking, atomic rename, sym intern table persistence, and CSV loading with parallel parse.

Guides

Error Handling — try/raise patterns, null propagation, defensive programming, and error recovery in pipelines.

Memory & Monitoring — .sys.mem, .sys.gc, .sys.info, progress bar, timeit profiling, and memory budget management.

Type Casting — The as function, implicit coercion rules, null semantics, symbol vs string, date/time arithmetic, and GUID generation.

Reference

All Functions — Complete alphabetical reference for all 170+ Rayfall built-in functions with signatures, descriptions, and examples.