LangGraph, Temporal, and plain Python with Pydantic are all reasonable choices for building stateful AI backends. Each makes different bets. Cleat makes a different bet too—that auditability, effect tracking, and orchestration belong in the type system, not in your prayers.
This page shows the same problem solved four ways, then where each tool wins. No strawmen. No marketing math.
LangGraph is the dominant choice for stateful agent workflows—graph nodes, shared state, checkpoints, human-in-the-loop. It's a Python library running inside your interpreter, and effects are whatever your nodes happen to do.
```python
from typing import TypedDict

import httpx
from langgraph.graph import StateGraph, END

class State(TypedDict):
    repo: str
    pr: int
    result: dict | None

async def analyze(state: State) -> dict:
    # network call — but who knows? caller doesn't.
    url = f"https://api/{state['repo']}/pr/{state['pr']}"
    try:
        async with httpx.AsyncClient() as c:
            r = await c.get(url, timeout=30)
            r.raise_for_status()
            return {"result": r.json()}
    except Exception:
        # retry? you're writing it.
        return {"result": None}

graph = StateGraph(State)
graph.add_node("analyze", analyze)
graph.set_entry_point("analyze")
graph.add_edge("analyze", END)
app = graph.compile()

# mocking? monkeypatch httpx, hope nothing else uses it.
```
```cleat
import "std/http"
import "std/json"

type Analysis = struct {
    files_changed: int,
    summary: string,
}

tool analyze(repo: string, pr: int) -> Result[Analysis, string]
    needs { net }    // declared, enforced
    timeout: 30s     // syntax, not config
    retry: 2         // syntax, not glue
{
    let url = "https://api/${repo}/pr/${pr}"
    let resp = http.get(url)?
    let data: Analysis = json.decode(resp.body)?
    return Ok(data)
}

test "analyze can be mocked" {
    using analyze = fn(r, p) { return Ok(Analysis { ... }) }
    // scoped, typed, no monkeypatch
}
```
Temporal is the gold standard for durable execution—workflows that survive crashes and run for months. It's also a separate service you operate, with deterministic constraints that make general-purpose languages awkward to use correctly.
```python
from datetime import timedelta

from temporalio import workflow, activity
from temporalio.common import RetryPolicy

@activity.defn
async def fetch_pr(repo: str, pr: int) -> dict:
    # effect is here — but workflow can't call it directly,
    # must use execute_activity. determinism rule.
    ...

@workflow.defn
class AnalyzeWorkflow:
    @workflow.run
    async def run(self, repo: str, pr: int):
        return await workflow.execute_activity(
            fetch_pr,
            args=[repo, pr],
            start_to_close_timeout=timedelta(seconds=30),
            retry_policy=RetryPolicy(maximum_attempts=3),
        )

# plus: a Worker process, a Temporal Service running
# somewhere, and the discipline to never call random()
# or datetime.now() directly inside a workflow.
```
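The determinism rule isn't bureaucracy: durable execution replays your workflow code against recorded history, so every nondeterministic value must be captured on the first run and substituted on replay. Here's a toy sketch of that mechanism in plain Python — an illustration of the replay idea, not the Temporal SDK; the `Recorder` class is invented for this example:

```python
import random

# Toy replay model: the first run records nondeterministic values;
# a replay substitutes the recorded value instead of re-executing.
class Recorder:
    def __init__(self, history=None):
        self.history = list(history) if history is not None else []
        self.replaying = history is not None
        self.pos = 0

    def side_effect(self, fn):
        if self.replaying:
            value = self.history[self.pos]  # replay: reuse recorded value
            self.pos += 1
            return value
        value = fn()                        # first run: execute and record
        self.history.append(value)
        return value

def workflow_run(rec: Recorder) -> int:
    # calling random.randint directly would diverge on replay;
    # routing it through the recorder keeps the replay deterministic
    roll = rec.side_effect(lambda: random.randint(1, 1_000_000))
    return roll * 2

first = Recorder()
result1 = workflow_run(first)

replay = Recorder(history=first.history)
result2 = workflow_run(replay)
assert result1 == result2  # replay reproduces the original result
```

Temporal's activities and event history are the production-grade version of this recorder, which is why effects live in activities and never inline in workflow code.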
```cleat
import "std/http"

tool fetch_pr(repo: string, pr: int) -> Result[http.Response, string]
    needs { net }
    timeout: 30s
    retry: 3
{
    let url = "https://api/${repo}/pr/${pr}"
    return http.get(url)
}

// for long-running, durable state, attach a chain:
chain PRAuditTrail {
    signing: ed25519
    @retention(7y)
    record Step {
        action: string,
        result: string,
        timestamp: time,
    }
}

// no Workers. no Service. one binary.
```
The default. You reach for FastAPI, Pydantic, httpx, and write the orchestration yourself. Maximum flexibility, maximum surface area, runtime everything.
```python
import httpx
from pydantic import BaseModel
from tenacity import retry, stop_after_attempt

class Analysis(BaseModel):
    files_changed: int
    summary: str

@retry(stop=stop_after_attempt(3))
async def analyze(repo: str, pr: int) -> Analysis:
    # effects: ¯\_(ツ)_/¯
    url = f"https://api/{repo}/pr/{pr}"
    async with httpx.AsyncClient(timeout=30) as c:
        r = await c.get(url)
        r.raise_for_status()
        return Analysis.model_validate(r.json())

# tests: pytest + pytest-mock + AsyncMock + a prayer
import pytest
from unittest.mock import AsyncMock, patch

@pytest.mark.asyncio
async def test_analyze():
    with patch("httpx.AsyncClient.get", new_callable=AsyncMock) as m:
        m.return_value.json.return_value = {...}
        result = await analyze("a", 1)
        assert result.files_changed == 3
```
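If you stay in plain Python, the standard escape from monkeypatching is dependency injection: pass the effectful call in as a parameter, so tests substitute it locally instead of patching module internals. A minimal sketch — the `fetch` parameter and `fake_fetch` helper are invented for illustration:

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable

@dataclass
class Analysis:
    files_changed: int
    summary: str

# the effect is an explicit parameter, so every caller sees it
async def analyze(
    repo: str,
    pr: int,
    fetch: Callable[[str], Awaitable[dict]],
) -> Analysis:
    data = await fetch(f"https://api/{repo}/pr/{pr}")
    return Analysis(**data)

# test: swap the effect locally — no patch(), no AsyncMock
async def fake_fetch(url: str) -> dict:
    return {"files_changed": 3, "summary": "ok"}

result = asyncio.run(analyze("a", 1, fetch=fake_fetch))
assert result.files_changed == 3
```

This works, but the effect boundary is now a convention you maintain by hand across every call site — which is exactly the gap a checked `needs { net }` annotation is meant to close at compile time.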
```cleat
import "std/http"
import "std/json"

type Analysis = struct {
    files_changed: int,
    summary: string,
}

tool analyze(repo: string, pr: int) -> Result[Analysis, string]
    needs { net }
    timeout: 30s
    retry: 3
{
    let url = "https://api/${repo}/pr/${pr}"
    let resp = http.get(url)?
    let data: Analysis = json.decode(resp.body)?
    return Ok(data)
}

test "analyze can be mocked" {
    using analyze = fn(r, p) {
        return Ok(Analysis { files_changed: 3, summary: "ok" })
    }
    assert(analyze("a", 1).IsOk())
}

// no pytest. no tenacity. no AsyncMock.
```
No tool wins on every row. This is what each one actually ships.
| Feature | LangGraph | Temporal | Python+Pydantic | Cleat |
|---|---|---|---|---|
| Static type checking (errors at compile, not runtime) | via mypy | SDK-level | via mypy | native |
| Compile-time effect tracking (`needs { net, fs }` enforced) | no | no | no | enforced |
| First-class tools (timeout, retry as syntax) | decorators | activities | DIY | keyword |
| First-class streams (backpressure, typed yields) | yes | via signals | DIY | keyword |
| State machines with reachability checks | graph nodes | SDK pattern | DIY | keyword |
| Cryptographic provenance (signed audit trail, native) | via LangSmith | event history | DIY | keyword |
| Durable execution across crashes (multi-day workflows that resume) | checkpoints | flagship | no | roadmap |
| Built-in test runner (no external framework) | pytest | SDK testkits | pytest | `cleat test` |
| Built-in formatter | ruff/black | per-language | ruff/black | `cleat fmt` |
| Compiles to native binary | interpreter | per SDK | interpreter | via Go |
| External infrastructure required | optional | cluster | none | none |
| Ecosystem maturity | large | large | vast | alpha |
Last updated April 2026 · corrections welcome on GitHub
A new language is a real cost. If your problem is shaped like one of these, pick something else and we won't be offended.
Your backend coordinates AI tools and you care about what they actually did. Especially if "what they did" might one day need to be defended in an audit, a postmortem, or a regulatory review.
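Concretely, that audit story is the `chain` construct from the Temporal comparison applied to tool calls. The field names below are illustrative, not a prescribed schema:

```cleat
chain ToolAudit {
    signing: ed25519        // every record is signed
    @retention(7y)
    record ToolCall {
        tool: string,       // which tool ran
        args: string,
        result: string,
        timestamp: time,
    }
}
```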
Cleat is alpha software. The compiler works, the stdlib ships, real programs run—but it's a young language with a small ecosystem. Be honest about your constraints.
Read the intro, browse the 29 example programs, or clone the repo and compile something in five minutes.