Xorq: The Open Compute Format for AI Data Engineering

July 22, 2025

Today, we’re excited to announce a major upgrade to Xorq (https://github.com/xorq-labs/xorq), an open compute catalog that helps teams compose, reuse, ship, and observe AI compute such as transformations, features, and models.

Data has Apache Iceberg. Compute has Xorq

Data has standards like Apache Iceberg, which make it easier to share and reuse data. But compute is still a mess: trapped in notebooks, duplicated across teams, or baked into custom Airflow DAGs. This complexity dramatically slows down AI innovation.

We founded and built Xorq to eliminate these frustrations. We think of Xorq as the missing analog to Apache Iceberg—an open format that makes compute modular and shareable so that teams can create new AI applications much faster and with greater trust.

Xorq makes AI compute modular and shareable

The latest release of the open source Xorq library for Python features the following:

The declarative Python syntax simplifies the definition of AI compute–especially processing across multiple engines. As you compile compute expressions, Xorq automatically catalogs them and enhances them with caching and observability.

The Xorq compute catalog promotes reuse by recording and securely sharing authorship and lineage details.

This demo walks you through how Xorq takes plain Python and turns it into reusable, optimizable, and cached compute artifacts:

Sneak Preview: Visualizing Compute Catalogs

A compute catalog is a powerful asset. The interactive demo below gives you a sneak preview of a console we are working on that leverages the compute catalog to help teams share and discover reusable expressions, combine them into new composite expressions, and observe and troubleshoot their behavior. 

Request early access to the Xorq Cloud

Xorq Use Cases

It’s been exciting to see the first Xorq expressions show up in the wild since our last major release in March–all with Python simplicity and SQL-scale performance.

Here’s what we’ve seen:

  • Fraud, marketing, and risk modeling in financial services (e.g., with XGBoost)
  • Semantic Layers
  • Feature Stores
  • MCP + ML Integration
  • ML Pipelines
  • RAG/LLM pipelines

What’s New in V0.3

In this release we introduce the following major enhancements:

User-Defined Exchange Functions (UDXFs)

UDXFs are a specialized type of user-defined function in Xorq that enables distributed data processing over the Apache Arrow Flight protocol. Unlike traditional UDFs, which operate within a single process, UDXFs execute custom Python logic in separate processes or even remote services, making them ideal for:

  • External API integrations (calling REST APIs, databases, or third-party services)
  • Resource-intensive computations (ML model inference, heavy transformations)
  • Microservice architectures (deploying models as standalone services)
  • Process isolation (running untrusted or memory-intensive code safely)
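
The process-isolation idea behind the list above can be sketched with the Python standard library alone. This is not Xorq's UDXF API and it does not use Arrow Flight: the `exchange` helper, the worker script, and the column names are hypothetical, and JSON over stdin/stdout stands in for the Arrow record batches a real exchange would stream.

```python
import json
import subprocess
import sys

# Source for the isolated worker (hypothetical): read a batch of rows from
# stdin, add a derived column, and write the enriched batch to stdout.
WORKER_SOURCE = """
import json, sys
rows = json.load(sys.stdin)
for row in rows:
    row["amount_usd"] = round(row["amount_cents"] / 100, 2)
json.dump(rows, sys.stdout)
"""


def exchange(rows):
    """Send a batch to a separate Python process and return its output batch."""
    result = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=json.dumps(rows),
        capture_output=True,
        text=True,
        check=True,  # raise if the worker process fails
    )
    return json.loads(result.stdout)


batch = [{"id": 1, "amount_cents": 1250}, {"id": 2, "amount_cents": 399}]
enriched = exchange(batch)
print(enriched)
```

Because the custom logic runs in its own process, a crash or memory spike in the worker surfaces as an error in the caller rather than taking the caller down with it. The same batch-in, batch-out shape carries over when the worker is a remote service instead of a local subprocess.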

Expression lineage

One of Xorq’s most powerful new features is automatic lineage tracking. You can now visualize the complete computational graph:
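
To make the idea of lineage concrete, here is a toy sketch in plain Python. It is not Xorq's lineage API: the `UPSTREAM` graph, the node names, and the `lineage` helper are all hypothetical, chosen only to show what walking a computational graph from an output back to its sources looks like.

```python
# Hypothetical expression graph: each node maps to the upstream nodes it reads from.
# The node names are illustrative only and do not come from Xorq.
UPSTREAM = {
    "fraud_score": ["features"],
    "features": ["transactions", "customers"],
    "transactions": [],
    "customers": [],
}


def lineage(node, graph, depth=0):
    """Return the upstream lineage of `node` as indented text lines."""
    lines = ["  " * depth + node]
    for parent in graph[node]:
        lines.extend(lineage(parent, graph, depth + 1))
    return lines


tree = lineage("fraud_score", UPSTREAM)
print("\n".join(tree))
```

Each level of indentation is one step upstream, so the sources an output depends on appear at the leaves of the printed tree.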

What’s Next

Xorq Cloud 

Xorq expressions can run anywhere as UDXFs. But if you’re seeking a platform on which to run them, Xorq Cloud will be the place. Xorq Cloud will provide functionality such as:

  • Serverless UDXF and Catalog Server hosting
  • Bundled compute and storage (tiered)
  • Compute Catalog console
    • Visualization of Xorq expression reuse and lineage relationships
    • Catalog of expressions: what’s available for reuse
    • Access control
    • Build control


Xorq Library

  • Hardening the YAML specification for serializing pipelines
  • Integrations:
    • KuzuDB?
    • LanceDB
  • Observability (OpenTelemetry)

How to Get Started

Install xorq easily with:

pip install xorq

or

nix run github:xorq-labs/xorq

Writing your first Xorq compute expression

Let us know what you think

We really look forward to your feedback on the new release of Xorq. Here are some other resources to help you get started:
