Compiler internals
A brief overview for contributors. Not exhaustive — enough to orient someone reading the source.
Pipeline
text
.cleat source → Lexer → Parser → AST → Checker → Codegen → go build → native binary
Each stage is a Go package under internal/.
Key directories
| Directory | Purpose |
|---|---|
cmd/cleat/ | CLI entry point — dispatches to build, test, fmt, lsp, new |
internal/lexer/ | Tokenizer — produces token stream from source bytes |
internal/parser/ | Recursive-descent parser — produces AST from tokens |
internal/ast/ | AST node definitions (expressions, statements, declarations, types) |
internal/checker/ | Type checker, effect validation, symbol resolution, struct analysis |
internal/codegen/ | Go code generator — emits .go files from checked AST |
internal/loader/ | Multi-package import resolver — builds the package DAG |
internal/diag/ | Rustc-style diagnostic renderer with ANSI colors |
internal/format/ | Canonical source formatter (printer + comment preservation) |
internal/lsp/ | Language server — diagnostics, hover, go-to-definition, autocomplete |
internal/config/ | Minimal TOML parser for cleat.toml |
internal/stdlib/ | Embedded stdlib .cleat sources (via //go:embed std) |
runtime/cleat/ | Go runtime library (Result, Effects, Stream, Tool, Agent, Server, ...) |
editors/vscode/ | VS Code extension + TextMate grammar |
Codegen specializations
The codegen has special handling for several call patterns:
| Pattern | What the codegen does |
|---|---|
json.decode with type annotation | Emits cleat.JsonDecode[T] instead of the generic wrapper |
llm.generate with type annotation | Emits cleat.LlmGenerate[T] with JSON Schema as string constant |
Agent.run with type annotation | Two-step: text call + cleat.ExtractStructured[T] |
| HTTP/FS/Env native functions | Conversion wrappers between flat runtime types and package-local structs |
| Server handler wrappers | Convert ServerRawRequest ↔ package-local Request/Response |
| Guard interception | IIFE wrapping at guarded call sites |
Flat type boundary
The runtime uses flat types (e.g., HttpRawResponse, ServerRawRequest) at the runtime ↔ codegen boundary to avoid import cycles between the runtime package and generated code. Codegen wrappers convert between flat types and package-local struct types.
Symbol table
The checker uses scope chains for symbol resolution:
- Universe scope — built-in types (
int,string,bool, ...) - Package scope — top-level declarations (functions, types, tools, agents, ...)
- File scope — imports
- Function/block scope — local variables and parameters
Symbols carry kind (SymFunc, SymTool, SymAgent, SymGuard, SymServer, ...), type, position, and effects.
Effect enforcement
- Each function's
needsclause populates a set of allowed effects - At every call site, the checker compares the callee's required effects against the caller's allowed set
- Missing effects produce a diagnostic with the exact effect name and call site
- Effects are validated transitively — if A calls B which needs net, A must also declare net
Testing
826+ tests across 11 packages. Run with go test ./.... Tests include:
- Lexer: token recognition, string escapes, interpolation
- Parser: every declaration form, expression precedence, error recovery
- Checker: type mismatches, effect violations, primitive validations
- Codegen: golden output tests, integration tests (compile + run + check exit code)
- Runtime: Result, Effects, Stream, Tool, Agent, Server, Memory, Supervisor, Anthropic API
- LSP: JSON-RPC transport, diagnostics, hover, go-to-definition, completions
- Format: round-trip preservation, comment handling