DAG & Execution C API
Build lazy operation DAGs, optimize them with 10 rewrite passes, and execute them as fused, morsel-driven bytecode. This page also covers graph traversal, CSR storage, and columnar I/O.
ray_* DAG functions return ray_op_t* nodes. Nothing executes until you call ray_execute(g, root). This allows the optimizer to rewrite the entire plan before any data flows.
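To make the deferred-execution idea concrete, here is a minimal, library-independent sketch. The `node_t` type, the `make_*` constructors, and `node_eval` are hypothetical names invented for this illustration and are not part of the ray_* API; they only show how building a plan can be separated from running it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds for illustration; not the library's opcodes. */
typedef enum { NODE_INPUT, NODE_CONST, NODE_ADD, NODE_MUL } node_kind_t;

typedef struct node {
    node_kind_t kind;
    double value;            /* payload for NODE_CONST */
    const struct node *lhs;  /* operands for NODE_ADD / NODE_MUL */
    const struct node *rhs;
} node_t;

/* Counts operator evaluations so laziness is observable. */
static int g_evals = 0;

/* Constructors only record structure; no arithmetic happens here. */
static node_t make_input(void) {
    node_t n = { NODE_INPUT, 0.0, NULL, NULL };
    return n;
}

static node_t make_const(double v) {
    node_t n = { NODE_CONST, v, NULL, NULL };
    return n;
}

static node_t make_binary(node_kind_t kind, const node_t *lhs, const node_t *rhs) {
    assert(kind == NODE_ADD || kind == NODE_MUL);
    assert(lhs != NULL && rhs != NULL);
    node_t n = { kind, 0.0, lhs, rhs };
    return n;
}

/* Evaluation walks the recorded plan for one input value x. */
static double node_eval(const node_t *n, double x) {
    switch (n->kind) {
    case NODE_INPUT: return x;
    case NODE_CONST: return n->value;
    case NODE_ADD:   g_evals++; return node_eval(n->lhs, x) + node_eval(n->rhs, x);
    case NODE_MUL:   g_evals++; return node_eval(n->lhs, x) * node_eval(n->rhs, x);
    }
    return 0.0;
}
```

Building `x * 2 + 1` with `make_binary` leaves `g_evals` at 0; only the later `node_eval` call performs the two operations. A real optimizer would use that gap to rewrite the node graph before evaluation.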
DAG Construction
ray_graph_new
ray_scan) resolve column names against this table. The table is retained.
ray_graph_free
Source Operations
Source operations produce data from tables, constants, or external vectors.
OP_SCAN node.
Unary Operations
Element-wise unary operations. These are fuseable — the optimizer merges chains of unary/binary ops into single morsel passes.
| Function | Opcode | Description |
|---|---|---|
| ray_neg(g, a) | OP_NEG | Arithmetic negation |
| ray_abs(g, a) | OP_ABS | Absolute value |
| ray_not(g, a) | OP_NOT | Logical NOT |
| ray_sqrt_op(g, a) | OP_SQRT | Square root |
| ray_log_op(g, a) | OP_LOG | Natural logarithm |
| ray_exp_op(g, a) | OP_EXP | Exponential (e^x) |
| ray_ceil_op(g, a) | OP_CEIL | Ceiling |
| ray_floor_op(g, a) | OP_FLOOR | Floor |
| ray_isnull(g, a) | OP_ISNULL | Returns BOOL: true if null |
| ray_upper(g, a) | OP_UPPER | Uppercase string |
| ray_lower(g, a) | OP_LOWER | Lowercase string |
| ray_strlen(g, a) | OP_STRLEN | String byte length |
| ray_trim_op(g, a) | OP_TRIM | Strip whitespace |
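The fusion of element-wise chains into single morsel passes can be pictured with a small standalone sketch. It compares an unfused evaluation of |a - b| (one full pass and one temporary buffer per operator) with a fused loop that handles one morsel at a time. The function names and the `MORSEL_ROWS` value are assumptions made for illustration; the library's actual morsel size and bytecode are not described here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MORSEL_ROWS 1024  /* hypothetical morsel size, chosen for illustration */

/* Unfused: each operator makes a full pass and writes a full-size buffer. */
static void sub_pass(const int64_t *a, const int64_t *b, int64_t *out, size_t n) {
    for (size_t i = 0; i < n; i++) out[i] = a[i] - b[i];
}

static void abs_pass(const int64_t *in, int64_t *out, size_t n) {
    for (size_t i = 0; i < n; i++) out[i] = in[i] < 0 ? -in[i] : in[i];
}

/* Fused: both operators run back to back on each element of a morsel,
 * so the intermediate difference never leaves a local variable. */
static void fused_abs_sub(const int64_t *a, const int64_t *b, int64_t *out, size_t n) {
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t start = 0; start < n; start += MORSEL_ROWS) {
        size_t end = (n - start < MORSEL_ROWS) ? n : start + MORSEL_ROWS;
        for (size_t i = start; i < end; i++) {
            int64_t d = a[i] - b[i];
            out[i] = d < 0 ? -d : d;
        }
    }
}
```

Both paths produce the same values. The fused version avoids the temporary column and touches each input once, which is the kind of saving a fusion pass aims for.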
ray_cast
target_type (e.g., RAY_F64, RAY_I64).
Binary Operations
Element-wise binary operations. All are fuseable into morsel passes.
| Function | Opcode | Description |
|---|---|---|
| ray_add(g, a, b) | OP_ADD | Addition |
| ray_sub(g, a, b) | OP_SUB | Subtraction |
| ray_mul(g, a, b) | OP_MUL | Multiplication |
| ray_div(g, a, b) | OP_DIV | Division |
| ray_mod(g, a, b) | OP_MOD | Modulo |
| ray_eq(g, a, b) | OP_EQ | Equal |
| ray_ne(g, a, b) | OP_NE | Not equal |
| ray_lt(g, a, b) | OP_LT | Less than |
| ray_le(g, a, b) | OP_LE | Less than or equal |
| ray_gt(g, a, b) | OP_GT | Greater than |
| ray_ge(g, a, b) | OP_GE | Greater than or equal |
| ray_and(g, a, b) | OP_AND | Logical AND |
| ray_or(g, a, b) | OP_OR | Logical OR |
| ray_like(g, a, b) | OP_LIKE | SQL LIKE (case-sensitive) |
| ray_ilike(g, a, b) | OP_ILIKE | SQL LIKE (case-insensitive) |
| ray_min2(g, a, b) | OP_MIN2 | Element-wise minimum |
| ray_max2(g, a, b) | OP_MAX2 | Element-wise maximum |
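For readers unfamiliar with SQL LIKE semantics, the following standalone sketch shows conventional pattern matching, where `%` matches any run of characters (including none) and `_` matches exactly one. The `like_match` function is a hypothetical helper written for this page. It assumes the standard wildcards, works byte-wise with ASCII case folding, and omits escape characters, so the library's behavior may differ in those details.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Match text against a SQL LIKE pattern.
 * fold_case = false mimics LIKE, fold_case = true mimics ILIKE. */
static bool like_match(const char *text, const char *pat, bool fold_case) {
    assert(text != NULL && pat != NULL);
    while (*pat) {
        if (*pat == '%') {
            while (*pat == '%') pat++;        /* collapse runs of '%' */
            if (!*pat) return true;           /* trailing '%' matches the rest */
            for (; *text; text++)             /* try every possible split point */
                if (like_match(text, pat, fold_case)) return true;
            return false;
        }
        if (!*text) return false;             /* pattern needs one more character */
        if (*pat != '_') {
            unsigned char t = (unsigned char)*text;
            unsigned char p = (unsigned char)*pat;
            if (fold_case) {
                t = (unsigned char)tolower(t);
                p = (unsigned char)tolower(p);
            }
            if (t != p) return false;
        }
        text++;
        pat++;
    }
    return *text == '\0';                     /* both strings must be exhausted */
}
```

With this sketch, `like_match("Rayforce", "ray%", false)` is false while the same call with `fold_case` set to true succeeds, which mirrors the case-sensitive and case-insensitive rows of the table.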
ray_if
then_val where cond is true, else_val otherwise.
String binary/ternary ops
ray_concat accepts an array of N inputs.
Aggregation Operations
Reduction operations that collapse a column to a single value (or per-group values when combined with ray_group).
| Function | Opcode | Description |
|---|---|---|
| ray_sum(g, a) | OP_SUM | Sum of values |
| ray_prod(g, a) | OP_PROD | Product of values |
| ray_count(g, a) | OP_COUNT | Count of non-null values |
| ray_avg(g, a) | OP_AVG | Average (mean) |
| ray_min_op(g, a) | OP_MIN | Minimum value |
| ray_max_op(g, a) | OP_MAX | Maximum value |
| ray_first(g, a) | OP_FIRST | First non-null value |
| ray_last(g, a) | OP_LAST | Last non-null value |
| ray_count_distinct(g, a) | OP_COUNT_DISTINCT | Count of distinct values |
| ray_stddev(g, a) | OP_STDDEV | Sample standard deviation |
| ray_stddev_pop(g, a) | OP_STDDEV_POP | Population standard deviation |
| ray_var(g, a) | OP_VAR | Sample variance |
| ray_var_pop(g, a) | OP_VAR_POP | Population variance |
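The difference between the sample and population variants in the table comes down to the divisor: n - 1 for the sample variance, n for the population variance. The sketch below uses Welford's single-pass update to show both. The `var_acc_t` type and its functions are hypothetical and say nothing about how the library computes these aggregates; returning 0.0 for too-small inputs is a placeholder choice of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Running state for Welford's algorithm. */
typedef struct {
    size_t n;     /* number of values seen */
    double mean;  /* running mean */
    double m2;    /* running sum of squared deviations from the mean */
} var_acc_t;

/* Fold one value into the accumulator. */
static void var_add(var_acc_t *acc, double x) {
    assert(acc != NULL);
    acc->n += 1;
    double delta = x - acc->mean;
    acc->mean += delta / (double)acc->n;
    acc->m2 += delta * (x - acc->mean);
}

/* Population variance: divide by n. Returns 0.0 for an empty input (placeholder). */
static double var_pop(const var_acc_t *acc) {
    return acc->n > 0 ? acc->m2 / (double)acc->n : 0.0;
}

/* Sample variance: divide by n - 1. Returns 0.0 when n < 2 (placeholder). */
static double var_sample(const var_acc_t *acc) {
    return acc->n > 1 ? acc->m2 / (double)(acc->n - 1) : 0.0;
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the squared deviations sum to 32, so the population variance is 32 / 8 = 4 and the sample variance is 32 / 7. The corresponding standard deviations are the square roots of these values.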
Structural Operations
Pipeline breakers that reshape data: filtering, sorting, grouping, joining, and projecting.
ray_filter
ray_sort_op
ray_group
n_keys key columns, applying n_aggs aggregate operations (specified as OP_SUM, OP_COUNT, etc.) to the corresponding input columns.
ray_distinct
ray_join
ray_asof_join
ray_window_op
ray_head / ray_tail / ray_select
ray_head returns the first N rows, ray_tail returns the last N rows. ray_select projects specific columns from a table node.
Graph Operations
Graph traversal operations work on CSR edge indices (ray_rel_t). See CSR / Relationship API below to build the index.
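As a rough picture of what a depth-bounded traversal over a CSR index involves, here is a standalone breadth-first sketch. The `csr_t` layout and `bfs_expand` are hypothetical and forward-only. The sketch assumes each node is reported once, at the depth where it is first reached, which may not match every detail of the library's variable-length expansion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical forward CSR: neighbors of v are targets[offsets[v] .. offsets[v+1]). */
typedef struct {
    size_t n_nodes;
    const size_t *offsets;   /* n_nodes + 1 entries */
    const int64_t *targets;  /* one entry per edge */
} csr_t;

/* Breadth-first expansion from start. depth_out[v] receives the hop count for
 * every node first reached within [min_depth, max_depth], and -1 otherwise.
 * Returns the number of nodes reported, or -1 on allocation failure. */
static int bfs_expand(const csr_t *g, int64_t start, int min_depth,
                      int max_depth, int *depth_out) {
    assert(g != NULL && depth_out != NULL);
    assert(start >= 0 && (size_t)start < g->n_nodes);
    assert(min_depth >= 0 && max_depth >= min_depth);

    int64_t *queue = malloc(g->n_nodes * sizeof *queue);
    int *dist = malloc(g->n_nodes * sizeof *dist);
    if (!queue || !dist) { free(queue); free(dist); return -1; }
    for (size_t v = 0; v < g->n_nodes; v++) { dist[v] = -1; depth_out[v] = -1; }

    size_t head = 0, tail = 0;
    int reported = 0;
    dist[start] = 0;
    queue[tail++] = start;
    while (head < tail) {
        int64_t u = queue[head++];
        if (dist[u] >= min_depth) { depth_out[u] = dist[u]; reported++; }
        if (dist[u] == max_depth) continue;   /* do not expand past the bound */
        for (size_t e = g->offsets[u]; e < g->offsets[u + 1]; e++) {
            int64_t v = g->targets[e];
            if (dist[v] < 0) { dist[v] = dist[u] + 1; queue[tail++] = v; }
        }
    }
    free(queue);
    free(dist);
    return reported;
}
```

On the edge list 0->1, 0->2, 1->2, 1->3, 2->3, 3->0 with a start node of 0 and bounds 1..3, this sketch reports nodes 1 and 2 at depth 1 and node 3 at depth 2.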
ray_expand
ray_var_expand
min_depth to max_depth hops. With track_path=true, outputs the full path for each reached node.
ray_shortest_path
max_depth hops.
ray_wco_join
Additional Graph Algorithms
| Function | Algorithm |
|---|---|
| ray_pagerank(g, rel, max_iter, damping) | Iterative PageRank |
| ray_connected_comp(g, rel) | Connected components (label propagation) |
| ray_dijkstra(g, src, dst, rel, weight_col, max_depth) | Weighted shortest path |
| ray_louvain(g, rel, max_iter) | Louvain community detection |
| ray_degree_cent(g, rel) | Degree centrality |
| ray_topsort(g, rel) | Topological sort (Kahn's) |
| ray_dfs(g, src, rel, max_depth) | Depth-first search |
| ray_astar(g, src, dst, rel, weight, lat, lon, props, max_depth) | A* shortest path |
| ray_k_shortest(g, src, dst, rel, weight_col, k) | Yen's k-shortest paths |
| ray_cluster_coeff(g, rel) | Clustering coefficients |
| ray_random_walk(g, src, rel, walk_length) | Random walk traversal |
| ray_betweenness(g, rel, sample_size) | Betweenness centrality (Brandes) |
| ray_closeness(g, rel, sample_size) | Closeness centrality |
| ray_mst(g, rel, weight_col) | Minimum spanning forest (Kruskal) |
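As one worked instance from the table, Kahn's algorithm for topological sorting can be sketched over plain CSR arrays. The `topo_sort` function below is a hypothetical standalone helper, not the implementation behind ray_topsort. It assumes forward adjacency stored as offsets and targets arrays, and signals a cycle by emitting fewer nodes than the graph contains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Kahn's algorithm. Writes a topological order into order_out (n_nodes slots)
 * and returns how many nodes were emitted. A result smaller than n_nodes means
 * the graph contains a cycle. Returns -1 on allocation failure. */
static long topo_sort(size_t n_nodes, const size_t *offsets,
                      const int64_t *targets, int64_t *order_out) {
    assert(offsets != NULL && targets != NULL && order_out != NULL);
    size_t *indeg = calloc(n_nodes ? n_nodes : 1, sizeof *indeg);
    if (!indeg) return -1;

    /* Count incoming edges for every node. */
    for (size_t e = 0; e < offsets[n_nodes]; e++) indeg[targets[e]]++;

    /* order_out doubles as the work queue: [head, tail) is still pending. */
    size_t head = 0, tail = 0;
    for (size_t v = 0; v < n_nodes; v++)
        if (indeg[v] == 0) order_out[tail++] = (int64_t)v;

    while (head < tail) {
        int64_t u = order_out[head++];
        for (size_t e = offsets[u]; e < offsets[u + 1]; e++) {
            int64_t v = targets[e];
            if (--indeg[v] == 0) order_out[tail++] = v;  /* all predecessors done */
        }
    }
    free(indeg);
    return (long)tail;
}
```

For the acyclic graph 0->1, 0->2, 1->3, 2->3 the sketch emits all four nodes starting with 0 and ending with 3. Adding an edge 3->0 leaves no node with zero in-degree, so nothing is emitted and the cycle is detected.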
Optimizer & Executor
ray_optimize
root: type inference, constant folding, sideways information passing, factorize, predicate pushdown, filter reorder, projection pushdown, partition pruning, fusion, dead code elimination. Returns the optimized root node (may differ from input).
ray_execute
ray_t* result (vector or table). Returns an error object on failure; check with RAY_IS_ERR(). The caller owns the result and must release it.
CSR / Relationship API
Build, save, load, and query double-indexed CSR edge indices for graph traversal.
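To illustrate what a double-indexed CSR looks like in memory, the sketch below builds one direction of the index from an edge list with a counting sort (count degrees, prefix-sum, scatter). Calling it a second time with the columns swapped yields the reverse index. The `csr_index_t` type and the `csr_build` and `csr_free` helpers are hypothetical, and the sketch assumes all node ids lie in [0, n_nodes). It does not sort adjacency lists, which a sort_targets-style option would add as an extra per-node step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One direction of the index: neighbors of v are targets[offsets[v] .. offsets[v+1]). */
typedef struct {
    size_t n_nodes;
    size_t n_edges;
    size_t *offsets;   /* n_nodes + 1 entries */
    int64_t *targets;  /* n_edges entries */
} csr_index_t;

/* Build the adjacency of `from` nodes. Returns 0 on success, -1 on allocation failure. */
static int csr_build(csr_index_t *out, const int64_t *from, const int64_t *to,
                     size_t n_edges, size_t n_nodes) {
    assert(out != NULL && from != NULL && to != NULL);
    out->n_nodes = n_nodes;
    out->n_edges = n_edges;
    out->offsets = calloc(n_nodes + 1, sizeof *out->offsets);
    out->targets = malloc((n_edges ? n_edges : 1) * sizeof *out->targets);
    size_t *cursor = malloc((n_nodes ? n_nodes : 1) * sizeof *cursor);
    if (!out->offsets || !out->targets || !cursor) {
        free(out->offsets); free(out->targets); free(cursor);
        out->offsets = NULL; out->targets = NULL;
        return -1;
    }
    for (size_t e = 0; e < n_edges; e++)          /* 1. count out-degrees */
        out->offsets[from[e] + 1]++;
    for (size_t v = 0; v < n_nodes; v++)          /* 2. prefix sum gives row starts */
        out->offsets[v + 1] += out->offsets[v];
    for (size_t v = 0; v < n_nodes; v++)
        cursor[v] = out->offsets[v];
    for (size_t e = 0; e < n_edges; e++)          /* 3. scatter targets into rows */
        out->targets[cursor[from[e]]++] = to[e];
    free(cursor);
    return 0;
}

static void csr_free(csr_index_t *c) {
    free(c->offsets);
    free(c->targets);
    c->offsets = NULL;
    c->targets = NULL;
}
```

A forward index comes from `csr_build(&fwd, src, dst, n_edges, n_nodes)` and the reverse index from `csr_build(&rev, dst, src, n_edges, n_nodes)`. Keeping both lets a traversal follow edges in either direction without scanning the edge list.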
ray_rel_build
sort_targets=true for sorted adjacency lists (required for WCO join).
ray_rel_from_edges
src_col and dst_col. Specify the number of source and destination nodes explicitly.
ray_rel_save / ray_rel_load / ray_rel_mmap
.col files in a directory. ray_rel_mmap memory-maps the files for zero-copy access.
ray_rel_free
Storage API
Columnar file I/O for vectors, splayed tables, and CSV.
Column I/O
.col file, load it back, or memory-map it for zero-copy reads. The file format includes type, length, null bitmap, and element data.
Splayed Tables
.col files (one per column) plus a symbol table. Load reconstructs the table from the directory.
CSV I/O
ray_read_csv loads a CSV file with automatic type inference, parallel parsing, and null handling. ray_read_csv_opts allows custom delimiter, header flag, and null string. ray_write_csv writes a table to CSV.
Complete Examples
Example 1: Filter + Group + Sum
#include <rayforce.h>

int main(void) {
    ray_heap_init();
    ray_sym_init();

    ray_t* trades = ray_read_csv("trades.csv");

    /* Build the operation DAG; nothing executes yet */
    ray_graph_t* g = ray_graph_new(trades);

    /* Filter: keep only rows where flag == 0 */
    ray_op_t* flag = ray_scan(g, "flag");
    ray_op_t* pred = ray_eq(g, flag, ray_const_i64(g, 0));
    ray_op_t* region = ray_filter(g, ray_scan(g, "region"), pred);
    ray_op_t* amount = ray_filter(g, ray_scan(g, "amount"), pred);

    /* Group by region, sum amounts */
    ray_op_t* keys[] = { region };
    uint16_t agg_ops[] = { OP_SUM };
    ray_op_t* agg_ins[] = { amount };
    ray_op_t* grp = ray_group(g, keys, 1, agg_ops, agg_ins, 1);

    /* Optimize (10 passes) and execute */
    ray_t* result = ray_execute(g, ray_optimize(g, grp));

    if (result && !RAY_IS_ERR(result)) ray_release(result);
    ray_graph_free(g);
    ray_release(trades);
    ray_sym_destroy();
    ray_heap_destroy();
    return 0;
}
Example 2: Graph BFS Traversal
#include <rayforce.h>

int main(void) {
    ray_heap_init();
    ray_sym_init();

    /* Build a directed graph: 0->1, 0->2, 1->2, 1->3, 2->3, 3->0 */
    ray_t* src = ray_vec_from_raw(RAY_I64,
        (int64_t[]){0,0,1,1,2,3}, 6);
    ray_t* dst = ray_vec_from_raw(RAY_I64,
        (int64_t[]){1,2,2,3,3,0}, 6);
    ray_t* edges = ray_table_new(2);
    edges = ray_table_add_col(edges,
        ray_sym_intern("src", 3), src);
    edges = ray_table_add_col(edges,
        ray_sym_intern("dst", 3), dst);
    ray_release(src);
    ray_release(dst);

    /* Double-indexed CSR (forward + reverse) */
    ray_rel_t* rel = ray_rel_from_edges(edges,
        "src", "dst", 4, 4, true);

    /* Start at node 0, BFS 1..3 hops forward */
    ray_t* start = ray_vec_from_raw(RAY_I64,
        (int64_t[]){0}, 1);
    ray_t* nodes = ray_table_new(1);
    nodes = ray_table_add_col(nodes,
        ray_sym_intern("id", 2), start);
    ray_release(start);

    ray_graph_t* g = ray_graph_new(nodes);
    ray_op_t* reach = ray_var_expand(g,
        ray_scan(g, "id"), rel, 0, 1, 3, false);
    ray_t* result = ray_execute(g, ray_optimize(g, reach));
    /* src | dst | depth
     * ----|-----|------
     *   0 |   1 |     1
     *   0 |   2 |     1
     *   0 |   3 |     2 */

    if (result && !RAY_IS_ERR(result)) ray_release(result);
    ray_graph_free(g);
    ray_rel_free(rel);
    ray_release(edges);
    ray_release(nodes);
    ray_sym_destroy();
    ray_heap_destroy();
    return 0;
}
Example 3: Join Two Tables
#include <rayforce.h>

int main(void) {
    ray_heap_init();
    ray_sym_init();

    ray_t* orders = ray_read_csv("orders.csv");
    ray_t* custs = ray_read_csv("customers.csv");
    ray_graph_t* g = ray_graph_new(orders);

    /* Inject both tables into the DAG */
    ray_op_t* lo = ray_const_table(g, orders);
    ray_op_t* ro = ray_const_table(g, custs);

    /* Inner join on customer_id */
    ray_op_t* lk[] = { ray_scan(g, "customer_id") };
    ray_op_t* rk[] = { ray_scan(g, "customer_id") };
    ray_op_t* joined = ray_join(g, lo, lk, ro, rk, 1, 0);
    ray_t* result = ray_execute(g, ray_optimize(g, joined));
    /* customer_id | amount | name
     * ------------|--------|--------
     *           1 |    250 | Alice
     *           2 |    180 | Bob
     *           2 |    340 | Bob
     *           3 |    120 | Charlie */

    if (result && !RAY_IS_ERR(result)) ray_release(result);
    ray_graph_free(g);
    ray_release(orders);
    ray_release(custs);
    ray_sym_destroy();
    ray_heap_destroy();
    return 0;
}