Compared honestly · against frameworks we respect

You already have options.
Here's where Cleat fits.

LangGraph, Temporal, and plain Python with Pydantic are all reasonable choices for building stateful AI backends. Each makes different bets. Cleat makes a different bet too—that auditability, effect tracking, and orchestration belong in the type system, not in your prayers.

This page shows the same problem solved four ways, then where each tool wins. No strawmen. No marketing math.

[ 01 / VS LANGGRAPH ]

vs LangGraph

LangGraph is the dominant choice for stateful agent workflows—graph nodes, shared state, checkpoints, human-in-the-loop. It's a Python library running inside your interpreter, and effects are whatever your nodes happen to do.

Library
LangGraph
Python framework, runtime-discovered effects, observability via LangSmith.
Language
Cleat
Compiled, declared effects, provenance and timeouts as syntax.
analyze_graph.py python
from typing import TypedDict
from langgraph.graph import StateGraph, END
import httpx

class State(TypedDict):
    repo: str
    pr: int
    result: dict | None

async def analyze(state: State) -> dict:
    # network call — but who knows? caller doesn't.
    url = f"https://api/{state['repo']}/pr/{state['pr']}"
    try:
        async with httpx.AsyncClient() as c:
            r = await c.get(url, timeout=30)
            r.raise_for_status()
            return {"result": r.json()}
    except Exception:
        # retry? you're writing it.
        return {"result": None}

graph = StateGraph(State)
graph.add_node("analyze", analyze)
graph.set_entry_point("analyze")
graph.add_edge("analyze", END)
app = graph.compile()

# mocking? monkeypatch httpx, hope nothing else uses it.
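That last comment is the whole problem in miniature. A monkeypatch replaces a *name* on a module, not every reference captured before the patch was applied. Here is a stdlib-only sketch of the failure mode; `fake_httpx` is a stand-in module so the sketch runs without httpx, but the dynamics are identical:

```python
# Why "monkeypatch and hope" is fragile: patch.object replaces a name on a
# module, not every reference captured before the patch was applied.
from unittest.mock import patch
import types

# Stand-in module so this sketch runs without httpx; same dynamics apply.
fake_httpx = types.ModuleType("fake_httpx")
fake_httpx.get = lambda url: "REAL NETWORK CALL"

def node_late_bound(url):
    # Looks the function up on the module at call time -- patchable.
    return fake_httpx.get(url)

early_ref = fake_httpx.get  # somewhere, someone captured a direct reference...

def node_early_bound(url):
    # ...and that reference bypasses any later patch entirely.
    return early_ref(url)

with patch.object(fake_httpx, "get", lambda url: "MOCKED"):
    patched = node_late_bound("x")
    bypassed = node_early_bound("x")

print(patched)   # MOCKED
print(bypassed)  # REAL NETWORK CALL
```

Any helper that did `from httpx import get`-style binding, or cached a client at import time, silently keeps hitting the network under your mock.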
analyze.cleat cleat
import "std/http"
import "std/json"

type Analysis = struct {
    files_changed: int,
    summary: string,
}

tool analyze(repo: string, pr: int)
    -> Result[Analysis, string]
    needs { net }      // declared, enforced
    timeout: 30s         // syntax, not config
    retry: 2             // syntax, not glue
{
    let url = "https://api/${repo}/pr/${pr}"
    let resp = http.get(url)?
    let data: Analysis = json.decode(resp.body)?
    return Ok(data)
}

test "analyze can be mocked" {
    using analyze = fn(r, p) {
        return Ok(Analysis { ... })
    }      // scoped, typed, no monkeypatch
}
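For a sense of what `needs { net }` buys, here is a hypothetical runtime approximation in plain Python: a decorator that whitelists effects per tool. Everything here (`needs`, `perform`) is invented for illustration. Note the limitation: violations only surface when the offending code path actually runs, whereas Cleat's claim is that the compiler rejects them before deployment.

```python
# A hypothetical runtime approximation of Cleat's `needs { net }`.
# Violations surface at runtime, on the paths you happen to exercise.
import functools

_allowed: list[set[str]] = []  # stack of effect sets for tools in progress

def needs(*effects: str):
    """Declare the effects a tool is allowed to perform."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            _allowed.append(set(effects))
            try:
                return fn(*args, **kwargs)
            finally:
                _allowed.pop()
        return inner
    return wrap

def perform(effect: str) -> None:
    """Called by effectful helpers (HTTP clients, file writers, ...)."""
    if not _allowed or effect not in _allowed[-1]:
        raise PermissionError(f"undeclared effect: {effect}")

@needs("net")
def analyze(repo: str, pr: int) -> str:
    perform("net")          # fine: declared
    return f"analyzed {repo}#{pr}"

@needs("net")
def sneaky(path: str) -> None:
    perform("fs")           # undeclared -- raises, but only at runtime

print(analyze("a", 1))      # analyzed a#1
try:
    sneaky("/tmp/x")
except PermissionError as e:
    print(e)                # undeclared effect: fs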
LangGraph wins when
  • You're already deep in the LangChain ecosystem and want LangSmith tracing
  • You need cyclic graph execution with rich human-in-the-loop checkpoints
  • Your team thinks in Python and the orchestration layer should match
  • You want a vast pre-built library of agent patterns and integrations
→ Cleat wins when
  • Effects matter—you want the compiler to refuse a tool that lies about side effects
  • Provenance is a requirement, not a feature you'll build later
  • You'd rather ship a single binary than manage a Python deployment
  • Mocking should be a language feature, not a third-party framework
[ 02 / VS TEMPORAL ]

vs Temporal

Temporal is the gold standard for durable execution—workflows that survive crashes and run for months. It's also a separate service you operate, with deterministic constraints that make general-purpose languages awkward to use correctly.

Platform
Temporal
Durable execution service · multi-language SDKs · operate the cluster.
Language
Cleat
Compile to native binary · zero infrastructure · auditability built-in.
workflow.py · Temporal python
from datetime import timedelta
from temporalio import workflow, activity
from temporalio.common import RetryPolicy

@activity.defn
async def fetch_pr(repo: str, pr: int) -> dict:
    # effect is here — but workflow can't call it directly,
    # must use execute_activity. determinism rule.
    ...

@workflow.defn
class AnalyzeWorkflow:
    @workflow.run
    async def run(self, repo: str, pr: int):
        return await workflow.execute_activity(
            fetch_pr, args=[repo, pr],
            start_to_close_timeout=timedelta(seconds=30),
            retry_policy=RetryPolicy(maximum_attempts=3),
        )

# plus: a Worker process, a Temporal Service running
# somewhere, and the discipline to never call random()
# or datetime.now() directly inside a workflow.
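That discipline exists because Temporal recovers a workflow by re-executing its code against recorded event history, so every decision must replay identically. A toy, stdlib-only sketch of the replay model (not the Temporal SDK, just the idea behind it):

```python
# Durable execution in miniature: on recovery, workflow code is re-executed
# against recorded history, and completed activities are not run again.
def run_workflow(call_activity):
    a = call_activity("fetch_pr", 42)
    b = call_activity("summarize", a)
    return b

def execute(history: list):
    """Run the workflow; replay recorded results, execute only what's new."""
    cursor = 0
    def call_activity(name, arg):
        nonlocal cursor
        if cursor < len(history):     # replaying: reuse the recorded result
            result = history[cursor]
        else:                         # live: actually do the work, record it
            result = f"{name}({arg})"
            history.append(result)
        cursor += 1
        return result
    return run_workflow(call_activity)

history: list = []
first = execute(history)    # live run, history now filled
second = execute(history)   # simulated crash + replay: nothing re-executes
print(first == second)      # True -- only because run_workflow is deterministic
```

Sprinkle a `random()` into `run_workflow` and replay diverges from history, which is exactly the bug class the Temporal SDK rules guard against.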
analyze.cleat cleat
import "std/http"

tool fetch_pr(repo: string, pr: int)
    -> Result[http.Response, string]
    needs { net }
    timeout: 30s
    retry: 3
{
    let url = "https://api/${repo}/pr/${pr}"
    return http.get(url)
}

// for long-running, durable state, attach a chain:
chain PRAuditTrail {
    signing: ed25519
    @retention(7y)
    record Step {
        action: string,
        result: string,
        timestamp: time,
    }
}

// no Workers. no Service. one binary.
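For intuition, a chain like `PRAuditTrail` is an append-only log in which each record is bound to its predecessor and signed, so any edit anywhere breaks verification from that point on. Cleat declares ed25519; this stdlib sketch substitutes HMAC-SHA256 so it runs anywhere, and is illustrative only, not Cleat's implementation:

```python
# An append-only, tamper-evident log: each record binds to the previous
# record's signature. HMAC-SHA256 stands in for a real asymmetric signature.
import hashlib, hmac, json

KEY = b"demo-signing-key"  # hypothetical; a real chain would use a keypair

def append(chain: list[dict], action: str, result: str) -> None:
    prev = chain[-1]["sig"] if chain else "genesis"
    record = {"action": action, "result": result, "prev": prev}
    payload = json.dumps(record, sort_keys=True).encode()
    record["sig"] = hmac.new(KEY, payload, hashlib.sha256).hexdigest()
    chain.append(record)

def verify(chain: list[dict]) -> bool:
    prev = "genesis"
    for rec in chain:
        body = {k: v for k, v in rec.items() if k != "sig"}
        if body["prev"] != prev:
            return False
        payload = json.dumps(body, sort_keys=True).encode()
        expected = hmac.new(KEY, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, rec["sig"]):
            return False
        prev = rec["sig"]
    return True

trail: list[dict] = []
append(trail, "fetch_pr", "ok")
append(trail, "summarize", "ok")
print(verify(trail))                 # True
trail[0]["result"] = "tampered"      # any edit breaks the chain
print(verify(trail))                 # False
```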
Temporal wins when
  • You need true durable execution—workflows that resume after a server reboot weeks later
  • Polyglot teams where workflows span Go, Java, TypeScript, Python, and .NET
  • You already operate distributed infrastructure and a Temporal Service is no extra burden
  • Mission-critical orchestration where automatic state replay justifies the operational cost
→ Cleat wins when
  • Workflows are seconds-to-minutes long, not weeks—you don't need a workflow service
  • Auditability via signed provenance chains is a first-class need
  • You want determinism enforced by the language, not by SDK conventions you can violate
  • Single-binary deployment matters more than multi-language interop
[ 03 / VS PYTHON + PYDANTIC ]

vs Python + Pydantic

The default. You reach for FastAPI, Pydantic, httpx, and write the orchestration yourself. Maximum flexibility, maximum surface area, runtime everything.

Stack
Python + Pydantic
Glue your own. Runtime validation. Mypy if you remember to run it.
Language
Cleat
Static types. Compile-time effects. Built-in test runner. One binary.
analyze.py python
from pydantic import BaseModel
import httpx
from tenacity import retry, stop_after_attempt

class Analysis(BaseModel):
    files_changed: int
    summary: str

@retry(stop=stop_after_attempt(3))
async def analyze(repo: str, pr: int) -> Analysis:
    # effects: ¯\_(ツ)_/¯
    url = f"https://api/{repo}/pr/{pr}"
    async with httpx.AsyncClient(timeout=30) as c:
        r = await c.get(url)
        r.raise_for_status()
        return Analysis.model_validate(r.json())

# tests: pytest + pytest-asyncio + AsyncMock + a prayer
import pytest
from unittest.mock import AsyncMock, patch

@pytest.mark.asyncio
async def test_analyze():
    with patch("httpx.AsyncClient.get",
               new_callable=AsyncMock) as m:
        m.return_value.json.return_value = {...}
        result = await analyze("a", 1)
        assert result.files_changed == 3
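"Glue your own" is concrete: the behaviour Cleat spells as `retry: 3` becomes, minus tenacity, a hand-rolled helper like this (hypothetical, stdlib-only, and the kind of code every team rewrites slightly differently):

```python
# DIY retry with exponential backoff -- the glue tenacity replaces.
import time

def with_retry(fn, attempts=3, base_delay=0.0):
    last_err = None
    for i in range(attempts):
        try:
            return fn()
        except Exception as e:
            last_err = e
            time.sleep(base_delay * (2 ** i))  # exponential backoff
    raise last_err

# A flaky callable that succeeds on the third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

print(with_retry(flaky))  # ok
print(calls["n"])         # 3
```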
analyze.cleat cleat
import "std/http"
import "std/json"

type Analysis = struct {
    files_changed: int,
    summary: string,
}

tool analyze(repo: string, pr: int)
    -> Result[Analysis, string]
    needs { net }
    timeout: 30s
    retry: 3
{
    let url = "https://api/${repo}/pr/${pr}"
    let resp = http.get(url)?
    let data: Analysis = json.decode(resp.body)?
    return Ok(data)
}

test "analyze can be mocked" {
    using analyze = fn(r, p) {
        return Ok(Analysis {
            files_changed: 3, summary: "ok"
        })
    }
    assert(analyze("a", 1).IsOk())
}

// no pytest. no tenacity. no AsyncMock.
Python wins when
  • You need the entire Python ML and data ecosystem in the same process
  • Your team's existing skills are Python and switching costs outweigh language wins
  • You're prototyping fast and orchestration discipline can come later (or never)
  • You need a library that exists in PyPI and nowhere else
→ Cleat wins when
  • You want a single binary, no virtualenv, no dependency hell
  • Effect tracking and mocking should be free, not 4 dependencies
  • Compile-time errors beat runtime AttributeErrors at 3am
  • You're building a backend whose correctness you'll have to defend in an audit
[ 04 / FEATURE MATRIX ]

The full matrix.

No tool wins on every row. This is what each one actually ships.

Feature                                                        LangGraph      Temporal       Python+Pydantic  Cleat
Static type checking (errors at compile, not runtime)          via mypy       SDK-level      via mypy         native
Compile-time effect tracking (needs { net, fs } enforced)      no             no             no               enforced
First-class tools (timeout, retry as syntax)                   decorators     activities     DIY              keyword
First-class streams (backpressure, typed yields)               yes            via signals    DIY              keyword
State machines with reachability checks                        graph nodes    SDK pattern    DIY              keyword
Cryptographic provenance (signed audit trail, native)          via LangSmith  event history  DIY              keyword
Durable execution across crashes (multi-day workflows resume)  checkpoints    flagship       no               roadmap
Built-in test runner (no external framework)                   pytest         SDK testkits   pytest           cleat test
Built-in formatter                                             ruff/black     per-language   ruff/black       cleat fmt
Compiles to native binary                                      interpreter    per SDK        interpreter      via Go
External infrastructure required                               optional       cluster        none             none
Ecosystem maturity                                             large          large          vast             alpha

Last updated April 2026 · corrections welcome on GitHub

[ 05 / HONESTY ]

When not to use Cleat.

A new language is a real cost. If your problem is shaped like one of these, pick something else and we won't be offended.

Reach for Cleat when…

Your backend coordinates AI tools and you care about what they actually did. Especially if "what they did" might one day need to be defended in an audit, a postmortem, or a regulatory review.

  • You're building a multi-step AI pipeline and side effects keep biting you in production: tools that hit the wrong API, retries that didn't fire, mocks left in by accident
  • Provenance, signing, and audit trails are requirements, not "nice to have": regulated industries, security tooling, financial workflows, healthcare
  • You want one statically-typed binary, not a Python service plus a workflow cluster plus an observability stack: small teams, edge deployment, anyone tired of YAML
  • You're starting a new project and the team has appetite for a new language: greenfield is much easier than retrofitting; go all-in or don't bother

Stay where you are when…

Cleat is alpha software. The compiler works, the stdlib ships, real programs run—but it's a young language with a small ecosystem. Be honest about your constraints.

  • You need a specific Python or npm library that has no equivalent: scientific computing, ML training loops, niche SDKs. Use the language with the library
  • Your workflows must survive multi-day server crashes: Temporal solved this. Use Temporal. Cleat's durable execution story is on the roadmap, not in the box
  • You're shipping next quarter and risk-averse stakeholders need a 1.0 release: Cleat is alpha. Wait for a stable tag, or pilot it on a non-critical service first
  • Your team has zero appetite to learn a new language: that's a real constraint. Forcing a language on an unwilling team produces worse code than the language you replaced

Still curious?

Read the intro, browse the 29 example programs, or clone the repo and compile something in five minutes.