Agent work is monolithic. A session produces a result, but nothing inside it can be verified, reused, or built on. A human reviews the output but can’t reproduce it. The problem isn’t a shortage of better tools; it’s a missing primitive.
## THE PRIMITIVE
EXPRESSIONS AS EVIDENCE AND MATERIAL
An expression is the unit of work — git-distributed, arrow-native, composable, with lineage by default. A declarative object that carries its own identity, its own history, its own cache.
Every expression has a dual nature. Looking backward, it’s evidence: what was computed, from what inputs, with what hash. Same expression, same hash — agents that arrive at the same answer converge automatically, without coordination.
Looking forward, it’s material: a composable part that snaps into larger computations without knowing who made it or when. The catalog isn’t a document store agents read. It’s a surface agents build on.
Same object. Two faces. That’s the primitive.
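A minimal sketch of both faces, using only the Python standard library. It assumes an expression can be written down as a plain dictionary and hashed through canonical JSON; `expr_hash` and the field names are hypothetical illustrations, not Xorq's actual API or hashing scheme.

```python
import hashlib
import json


def expr_hash(spec: dict) -> str:
    """Hash a declarative spec via canonical JSON (sorted keys, fixed separators)."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two agents describe the same computation, writing the keys in a different order.
agent_a = {"op": "filter", "source": "events", "predicate": "channel == 'web'"}
agent_b = {"predicate": "channel == 'web'", "source": "events", "op": "filter"}

# Evidence: identical specifications converge on one identity, no coordination needed.
assert expr_hash(agent_a) == expr_hash(agent_b)

# Material: a larger expression refers to its part by hash alone,
# without knowing who produced that part or when.
churn = {"op": "aggregate", "input": expr_hash(agent_a), "metric": "churn_rate"}
print(expr_hash(churn)[:12])
```

Canonicalizing before hashing is what makes the convergence automatic: key order and whitespace are incidental, so they are normalized away before the identity is computed.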
## OPERATIONS
SEARCH, COMPOSE, VERIFY
- `xorq catalog list`: search the catalog. Find what exists before computing anything new.
- `xorq run-unbound`: execute an expression with inputs resolved at runtime.
- `xorq lineage`: trace any result to its source. Verify provenance through the graph.
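To make "trace any result to its source" concrete, here is a rough sketch of a lineage walk over a small expression graph. The `Node` class and `lineage` function are hypothetical stand-ins written for this illustration; they say nothing about how `xorq lineage` is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One expression in the graph, with links to the expressions it was built from."""
    name: str
    parents: tuple = ()


def lineage(node: Node) -> list:
    """Return names from the node back to its sources, depth-first, without repeats."""
    seen = []
    stack = [node]
    while stack:
        current = stack.pop()
        if current.name in seen:
            continue
        seen.append(current.name)
        # Reverse so parents are visited in their declared order.
        stack.extend(reversed(current.parents))
    return seen


events = Node("events")
customers = Node("customers")
joined = Node("joined", (events, customers))
churn_by_channel = Node("churn_by_channel", (joined,))

print(lineage(churn_by_channel))
# ['churn_by_channel', 'joined', 'events', 'customers']
```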
| Layer | Role | Protocol |
| --- | --- | --- |
| Expression | Specification | Ibis / xorq |
| Catalog | Identity + lineage | Git |
| Cache | Input-addressed storage | Parquet |
| Execution | Pluggable engines | Arrow |
## IN PRACTICE
THE AGENT LOOP
Agents share context between sessions. They can’t share data. The catalog does it for them.
Agent receives a question. Searches the catalog for existing entries. Finds partial coverage — some expressions already computed, cached, verified. Composes a new expression from existing parts plus new logic. Executes. The result becomes a new entry.
Second session. Same question. Cache hit. No recomputation. The work compounded.
Different agent. Related question. Finds the entry, extends it. Doesn’t need to know who computed it or when. The hash is the handshake.
Third case. Upstream data changed. Same expression, new input fingerprint. The cache misses, the expression re-executes, and a fresh result is stored. Lineage stays intact: same recipe, different run.
```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'actorBkg': '#0a2a2e', 'actorBorder': '#C1F0FF', 'actorTextColor': '#C1F0FF', 'actorLineColor': '#1a4a50', 'signalColor': '#C1F0FF', 'signalTextColor': '#C1F0FF', 'noteBkgColor': '#0f3538', 'noteBorderColor': '#1a4a50', 'noteTextColor': '#8fd4e8', 'activationBkgColor': '#0f3538', 'activationBorderColor': '#1a4a50', 'labelBoxBkgColor': '#05181A', 'labelBoxBorderColor': '#1a4a50', 'labelTextColor': '#5ab0c8', 'loopTextColor': '#5ab0c8', 'background': '#05181A', 'mainBkg': '#0a2a2e', 'lineColor': '#1a4a50', 'textColor': '#C1F0FF', 'primaryColor': '#0a2a2e', 'primaryBorderColor': '#C1F0FF', 'primaryTextColor': '#C1F0FF', 'secondaryColor': '#0f3538', 'tertiaryColor': '#0f3538', 'fontFamily': 'FK Grotesk Mono, monospace', 'fontSize': '13px'}}}%%
sequenceDiagram
    participant User
    participant Agent
    participant Catalog
    participant Engine
    User->>Agent: churn by channel?
    activate Agent
    Agent->>Catalog: search / compose / lineage
    Agent->>Catalog: execute
    Catalog->>Engine: cache miss → run
    activate Engine
    Engine-->>Agent: result
    deactivate Engine
    Agent->>Catalog: add
    Agent->>User: result
    deactivate Agent
    rect rgba(15, 53, 56, 0.3)
    Note over Agent,Catalog: NEW SESSION
    User->>Agent: churn by channel?
    activate Agent
    Agent->>Catalog: execute
    Catalog-->>Agent: cache hit
    Agent->>User: result
    deactivate Agent
    end
```
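The loop above can be approximated in a few lines of Python. This is a toy model under stated assumptions: the `Catalog` class, the `fake_engine` function, and the `snapshot-1` / `snapshot-2` fingerprints are all hypothetical, and the real catalog is git-backed rather than an in-memory dictionary.

```python
import hashlib
import json


class Catalog:
    """Toy in-memory catalog: entries keyed by a hash of (expression, input fingerprint)."""

    def __init__(self):
        self.entries = {}
        self.engine_runs = 0

    @staticmethod
    def key(expression: dict, input_fingerprint: str) -> str:
        payload = json.dumps({"expr": expression, "inputs": input_fingerprint}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def execute(self, expression: dict, input_fingerprint: str, engine):
        k = self.key(expression, input_fingerprint)
        if k in self.entries:
            return self.entries[k], "hit"
        # Cache miss: run the engine once and record the result as a new entry.
        self.engine_runs += 1
        result = engine(expression)
        self.entries[k] = result
        return result, "miss"


def fake_engine(expression: dict) -> dict:
    # Stand-in for a real execution engine; returns a canned answer.
    return {"web": 0.12, "store": 0.08}


catalog = Catalog()
question = {"op": "churn_by_channel", "source": "events"}

first = catalog.execute(question, "snapshot-1", fake_engine)   # first session
second = catalog.execute(question, "snapshot-1", fake_engine)  # second session, same inputs
third = catalog.execute(question, "snapshot-2", fake_engine)   # upstream data changed

print(first[1], second[1], third[1])  # miss hit miss
```

The expression dictionary never changes across the three calls; only the input fingerprint does, which is why the third call misses while keeping the same recipe.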
MCP Server — the catalog appears as a tool in Claude Desktop or any MCP-native framework. Agents search, compose, and execute expressions without SDK integration.
Hooks — lifecycle callbacks (PreToolUse, PostToolUse, ToolFailure) that validate inputs and verify outputs at each step. Guardrails set by humans, enforced on every session.
Plugins — slash commands inside agent UIs. Type /xorq to search the catalog or run an expression inline.
Skills — reusable capabilities that coding agents like Claude Code invoke directly to read from or write to the catalog.
## WHAT CONNECTS
INTEGRATIONS
Xorq sits between agents and the infrastructure you already run. No migrations.
- Code Agents: Claude Code, Codex, Cortex Code (Snowflake)
- Data Infrastructure: Snowflake, Databricks, S3
- Libraries and Frameworks: Scikit-learn, Feast, LangChain
## USAGE-BASED PLANS
PRICING
Pay for catalog storage and bytes transferred over Arrow Flight. Compute stays pluggable.
» You write a declarative expression, and Xorq saves it as an immutable catalog entry (expression + metadata + cached results) that can be executed, diffed, shared, and served.
How is this different from a data catalog?
» Traditional catalogs document tables for humans — name-addressed, passively consumed. Xorq catalogs executable expressions for agents — input-addressed, actively composed. A traditional catalog answers “what tables exist?” Xorq answers “has this exact computation been done before?” That’s the difference between documentation and coordination.
Do I need to migrate to Xorq?
» No. Xorq connects directly to your existing infrastructure (Snowflake, Databricks, S3). No migrations required.
What does “input-addressed” caching mean?
» The identity of a cached result is determined by hashing the expression and all of its inputs, not the output content. If the inputs haven’t changed, the result is reused without re-execution. This is input-addressed, not content-addressed: trust comes from knowing the recipe is identical, not from inspecting the output. Traditional caching asks “is this expired?” Input-addressed caching asks “are the inputs the same?”
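A small sketch of the distinction, assuming recipes and outputs can be serialized as JSON. The function names `input_address` and `content_address` are hypothetical and chosen only for this illustration.

```python
import hashlib
import json


def digest(obj) -> str:
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def input_address(expression: dict, inputs: list) -> str:
    """Key known before execution: hash of the recipe plus its input fingerprints."""
    return digest({"expr": expression, "inputs": inputs})


def content_address(output: dict) -> str:
    """Key known only after execution: hash of what was produced."""
    return digest(output)


# Two different recipes that happen to produce the same output on this data.
recipe_a = {"op": "sum", "column": "x"}
recipe_b = {"op": "count", "column": "x"}
inputs = ["events@snapshot-1"]
output_a = {"value": 3}
output_b = {"value": 3}

# Content addressing cannot tell the two computations apart...
assert content_address(output_a) == content_address(output_b)
# ...while input addressing identifies each recipe, and does so before anything runs.
assert input_address(recipe_a, inputs) != input_address(recipe_b, inputs)
```

Because the input-addressed key is available before execution, a lookup can decide whether to run at all; a content-addressed key only exists once the work is already done.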
What does it mean to “serve an expression”?
» Serving exposes a compiled expression as a remote endpoint over Arrow Flight so other services or agents can send inputs and receive results without re-implementing pipeline logic.
How does Xorq compare to dbt Fusion?
» dbt Fusion gives you a faster, Rust-powered dbt with SQL-aware validation. Xorq gives you an expression graph that spans languages and engines, with a catalog that versions, caches, and governs the full pipeline. If your world is SQL models, Fusion is a meaningful upgrade. If your world is SQL + Python + ML across multiple engines, Xorq is built for that from the ground up.
What if I need imperative code or custom logic?
» Expression-first does not eliminate flexibility. Escape hatches exist via UDFs and Arrow-based interfaces — opaque stages can conform to the same contract. Declarative at the top level, imperative underneath when necessary. Like unsafe in Rust — controlled, explicit, bounded.
What about non-deterministic steps like model training?
» The expression captures the full specification including the seed. If nothing changes, the cache avoids re-execution entirely. If you do re-run, reproducibility depends on the algorithm — that’s a property of the math, not the system. Xorq makes the specification explicit so you can tell the difference.
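As a hedged illustration of the seed being part of the specification, the toy below uses Python's `random.Random` in place of a real training routine. The `spec_hash` and `train` names and the spec fields are assumptions made for this sketch.

```python
import hashlib
import json
import random


def spec_hash(spec: dict) -> str:
    return hashlib.sha256(json.dumps(spec, sort_keys=True).encode("utf-8")).hexdigest()


def train(spec: dict) -> list:
    """Stand-in for model training: draws 'weights' from a seeded generator."""
    rng = random.Random(spec["seed"])
    return [round(rng.random(), 6) for _ in range(3)]


spec = {"model": "toy", "seed": 42}

# Same specification, same seed: this particular algorithm reproduces exactly.
assert train(spec) == train(spec)

# Changing the seed changes the specification, so it is a different expression.
assert spec_hash(spec) != spec_hash({**spec, "seed": 43})
```

A seeded pseudo-random generator is the easy case. Algorithms with nondeterministic parallel reductions or hardware-dependent arithmetic may not reproduce bit for bit even with a fixed seed, which is the distinction the answer above draws.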
What happens when upstream data changes between sessions?
» You choose the invalidation strategy: modification-time (re-run if the source has been touched) or snapshot with TTL (trust the cache for a window, then re-validate). The expression stays the same. The cache policy is separate. Staleness tolerance is a business decision, not something the system should hide from you.
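The two strategies can be sketched as small predicates. These are illustrative assumptions, not Xorq's configuration surface: `stale_by_mtime` and `stale_by_ttl` are hypothetical names, and a local temporary file stands in for the upstream source.

```python
import os
import tempfile


def stale_by_mtime(source_path: str, cached_mtime: float) -> bool:
    """Modification-time policy: re-run if the source was touched after caching."""
    return os.path.getmtime(source_path) > cached_mtime


def stale_by_ttl(cached_at: float, ttl_seconds: float, now: float) -> bool:
    """Snapshot-with-TTL policy: trust the cache for a window, then re-validate."""
    return now - cached_at >= ttl_seconds


# Modification-time: a temporary file plays the role of the upstream source.
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
cached_mtime = os.path.getmtime(path)
fresh = stale_by_mtime(path, cached_mtime)            # untouched since caching
os.utime(path, (cached_mtime + 10, cached_mtime + 10))  # simulate an upstream write
touched = stale_by_mtime(path, cached_mtime)
os.remove(path)

# TTL: a one-hour window, checked at 30 minutes and at 60 minutes.
within_window = stale_by_ttl(cached_at=0.0, ttl_seconds=3600.0, now=1800.0)
after_window = stale_by_ttl(cached_at=0.0, ttl_seconds=3600.0, now=3600.0)

print(fresh, touched, within_window, after_window)  # False True False True
```

Neither predicate looks at the expression itself, which mirrors the separation described above: the recipe is fixed, and the staleness policy is a separate choice.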
How does Xorq handle merge conflicts?
» It doesn’t, because there aren’t any. Same expression, same hash, automatic dedup. Different expression, different hash, no collision. In code, parallel work creates merge conflicts. In an input-addressed catalog, parallel work either converges or coexists. No locking, no conflicts, no hand-written deduplication logic.
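One way to picture "converges or coexists" is a plain union of two catalogs keyed by hash. This sketch assumes that equal keys always refer to equal entries (the premise of input addressing) and uses short placeholder keys instead of real hashes; `merge_catalogs` is a hypothetical helper.

```python
def merge_catalogs(left: dict, right: dict) -> dict:
    """Union of two catalogs keyed by expression hash.

    Assumes equal keys imply equal entries, so there is never a conflict to resolve.
    """
    merged = dict(left)
    for key, entry in right.items():
        merged.setdefault(key, entry)
    return merged


# Two agents worked in parallel; "h1" is the same expression computed by both.
agent_1 = {"h1": "churn_by_channel", "h2": "lifetime_value"}
agent_2 = {"h1": "churn_by_channel", "h3": "retention_curve"}

merged = merge_catalogs(agent_1, agent_2)
print(sorted(merged))  # ['h1', 'h2', 'h3']
```

The shared key collapses into one entry and the distinct keys sit side by side, so the merge is order-independent for catalogs that satisfy the assumption.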
## THE BODY OF WORK
WHAT COMPOUNDS
Agents are ephemeral. Sessions end. Models get swapped. Prompts change.
What compounds is the catalog — each entry verified evidence that the next agent builds on. Not accumulated output. Composable, reproducible work.