v0.3 · open source · Apache-2.0

Agent runs you can prove, not just trust

loom drives multi-step LLM agent work as a replay-deterministic state machine — safety invariants enforced at commit time, a human gate where it matters, and a complete audit trail you can replay.

npm i -g @loomfsm/pipeline
Screenshot: the loom dashboard with a task parked on its final gate, showing one open security blocker with file and line, Accept / Reject controls, and the exact agent output the operator is approving.
A real gate: the security reviewer found a blocker — the run is parked until you decide.

Durable execution became table stakes — every framework checkpoints now. loom is built for the layer above: structural safety and a provable process, in one SQLite file you own.

Why loom

Guarantees, not vibes

Frameworks help you write agent graphs. loom makes the run itself provable.

🛡️ Safety at commit time

Invariants run inside the transaction — an agent can’t approve its own work over an open blocker. The violating step rolls back.
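To make the idea concrete, here is a minimal sketch of a commit-time invariant using Python's standard `sqlite3` module. It is not loom's code: the `blockers` and `approvals` tables, the `approve` function, and the `InvariantViolation` exception are all hypothetical names chosen for illustration. The point it shows is that the check runs after the write but before the commit, so a violation undoes the write.

```python
import sqlite3


class InvariantViolation(Exception):
    """Raised when a step would leave the task in a forbidden state."""


def open_db() -> sqlite3.Connection:
    # Hypothetical schema, for illustration only; not loom's real tables.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE blockers (id INTEGER PRIMARY KEY, task TEXT, status TEXT);
        CREATE TABLE approvals (id INTEGER PRIMARY KEY, task TEXT, approver TEXT);
        """
    )
    return conn


def approve(conn: sqlite3.Connection, task: str, approver: str) -> bool:
    """Write an approval, then check the invariant before the commit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO approvals (task, approver) VALUES (?, ?)",
                (task, approver),
            )
            open_count = conn.execute(
                "SELECT COUNT(*) FROM blockers WHERE task = ? AND status = 'open'",
                (task,),
            ).fetchone()[0]
            if open_count:
                raise InvariantViolation(f"{task}: {open_count} open blocker(s)")
        return True
    except InvariantViolation:
        return False


conn = open_db()
with conn:
    conn.execute("INSERT INTO blockers (task, status) VALUES ('T1', 'open')")

blocked = approve(conn, "T1", "reviewer-agent")  # invariant fails, insert rolls back

with conn:
    conn.execute("UPDATE blockers SET status = 'resolved' WHERE task = 'T1'")

accepted = approve(conn, "T1", "reviewer-agent")  # no open blocker, commit goes through
approval_rows = conn.execute("SELECT COUNT(*) FROM approvals").fetchone()[0]
```

After both calls only one approval row exists: the first attempt left no trace because its transaction never committed.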

🔁 Replay-deterministic

Replay a recorded run against a changed rule: “would it have caught last week’s incident?”
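A rough sketch of that question in Python, assuming a recorded run can be read back as an ordered list of steps. The `Step` fields, the `replay` function, and both rules are hypothetical; loom's actual record format is not shown here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Step:
    seq: int
    agent: str
    action: str
    open_blockers: int  # blockers open at the moment the step was recorded


Rule = Callable[[Step], bool]  # returns True when the step is allowed


def replay(log: list[Step], rule: Rule) -> Optional[int]:
    """Return the seq of the first step the rule rejects, or None if all pass."""
    for step in log:
        if not rule(step):
            return step.seq
    return None


# A hypothetical recorded run: the implementer approved over an open blocker.
recorded = [
    Step(1, "implementer", "edit", 0),
    Step(2, "security-reviewer", "report", 1),
    Step(3, "implementer", "approve", 1),
]


def old_rule(step: Step) -> bool:
    return True  # the rule in force last week allowed everything


def new_rule(step: Step) -> bool:
    return not (step.action == "approve" and step.open_blockers > 0)


caught_by_old = replay(recorded, old_rule)
caught_by_new = replay(recorded, new_rule)
```

Because the log is fixed input and the rule is a pure function, running the same replay twice gives the same answer, which is what makes the comparison between rules meaningful.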

🎚️ Human-in-the-loop, on a dial

human · on-blockers (default) · auto — full autonomy above a deterministic safety floor.
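One way to read the dial, sketched in Python. The three mode names come from the line above; the `gate_outcome` function, its return values, and the interpretation of the safety floor (auto mode never passes a gate while a blocker is open) are assumptions made for this example, not loom's documented behavior.

```python
from enum import Enum


class Mode(Enum):
    HUMAN = "human"
    ON_BLOCKERS = "on-blockers"
    AUTO = "auto"


def gate_outcome(mode: Mode, open_blockers: int) -> str:
    """Return 'park' (wait for a human), 'pass', or 'reject'."""
    if mode is Mode.HUMAN:
        return "park"  # every gate waits for a person
    if open_blockers == 0:
        return "pass"  # nothing to decide, policy lets the run continue
    # Open blockers remain. Assumed floor: autonomy never approves over them.
    return "park" if mode is Mode.ON_BLOCKERS else "reject"
```

Under this reading, `gate_outcome(Mode.ON_BLOCKERS, 1)` parks the run for a human, while `gate_outcome(Mode.AUTO, 1)` refuses the step outright.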

💥 Crash-safe

Restart, and the idempotency ledger dedups — no half-applied steps, no double spend.
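The general technique can be sketched with Python's `sqlite3`: record a step key and apply the step's effect in the same transaction, so a retried step hits the primary-key constraint and changes nothing. The `ledger` and `budget` tables and the `apply_step` function are hypothetical; loom's real ledger may look quite different.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    # Hypothetical schema: one ledger of applied steps, one running total.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ledger (step_key TEXT PRIMARY KEY);
        CREATE TABLE budget (id INTEGER PRIMARY KEY CHECK (id = 1), spent INTEGER NOT NULL);
        INSERT INTO budget (id, spent) VALUES (1, 0);
        """
    )
    return conn


def apply_step(conn: sqlite3.Connection, step_key: str, cost: int) -> bool:
    """Apply a step once. Returns False if the key was already recorded."""
    try:
        with conn:  # ledger entry and effect commit together, or not at all
            conn.execute("INSERT INTO ledger (step_key) VALUES (?)", (step_key,))
            conn.execute("UPDATE budget SET spent = spent + ? WHERE id = 1", (cost,))
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate key: the step ran before, so skip it


conn = open_db()
first = apply_step(conn, "task-42/step-3", 5)
retry = apply_step(conn, "task-42/step-3", 5)  # simulates a replay after a restart
spent = conn.execute("SELECT spent FROM budget WHERE id = 1").fetchone()[0]
```

The retry is rejected and `spent` stays at the value from the first application, which is the "no double spend" property in miniature.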

🔌 Pluggable on three axes

Bundles · providers · transports. The kernel: zero dependencies, no vendor names.

🔑 No API key required

Runs on your Claude Code login, or on OpenRouter, local Ollama, or the Anthropic API, with per-agent fallbacks.

Five ways to run

One state machine, five front-ends

Same engine, same gates, same invariants — pick the surface that fits the moment.

loom up

🖥️ Web dashboard

A browser console for the whole fleet — submit, watch the live agent chain, approve gates, configure backends.

loom bot telegram

📱 Telegram bot

Drive the fleet from your phone — submit tasks, approve gates with inline buttons, ship a finished branch. Outbound-only, default-deny.

/task …

💬 Inside Claude Code

Zero setup: your agent host executes each step, and gates surface inline. No API key, no network.

loom run "…"

⚡ Headless one-shot

Drive one task to the end from a terminal, in an isolated git worktree. Your working tree is never touched.

loom daemon

🤖 Autonomous daemon

Set-and-forget: parks on your gates, wakes when you answer, retries with backoff, recovers on restart, commits to a reviewable branch.

--docker

📦 Container isolation

For unattended autonomy: each spawn runs in a container mounting only a dedicated clone — a real blast-radius bound.

A platform, not a single tool

Code review is the first bundle,
not the point

The kernel is domain-blind. Everything domain-specific lives in a bundle: a new domain is a new plugin, and the kernel never changes.

Screenshot: the loom agent chain of classifier, analyzer, planner, implementer, deterministic checks and reviewers, each with its model, duration and token usage; gates are decided by policy and by a human.
Every run is a chain you can read: models, tokens, durations — and who decided each gate.

A bundle declares

  • Phases & steps — the shape of the work
  • Gates & roles — where a human (or policy) decides
  • Safety invariants — rules enforced at commit time
  • Typed prompts — templates, validated, per agent
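
As a thought experiment, the four items above could be modeled as plain data. The sketch below is Python and entirely hypothetical: the `Bundle` and `Gate` classes, their fields, and the `problems` check are invented for illustration and are not loom's bundle format. The example values follow the content workflow mentioned elsewhere on this page (draft, edit, legal gate, publish).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Gate:
    after_phase: str  # the gate sits after this phase
    role: str         # who decides: a human role or a policy name


@dataclass(frozen=True)
class Bundle:
    name: str
    phases: tuple[str, ...]
    gates: tuple[Gate, ...] = ()
    invariants: tuple[str, ...] = ()
    prompts: dict[str, str] = field(default_factory=dict)  # agent name -> template

    def problems(self) -> list[str]:
        """Return declaration errors; an empty list means the bundle is coherent."""
        issues = []
        if len(set(self.phases)) != len(self.phases):
            issues.append("duplicate phase name")
        for gate in self.gates:
            if gate.after_phase not in self.phases:
                issues.append(
                    f"gate for role {gate.role!r} follows unknown phase {gate.after_phase!r}"
                )
        for agent, template in self.prompts.items():
            if not template.strip():
                issues.append(f"empty prompt for agent {agent!r}")
        return issues


content = Bundle(
    name="content",
    phases=("draft", "edit", "publish"),
    gates=(Gate(after_phase="edit", role="legal"),),
    invariants=("no publish without legal approval",),
    prompts={"drafter": "Write a first draft about $topic."},
)
broken = Bundle(name="broken", phases=("draft",), gates=(Gate("review", "legal"),))
```

Validating the declaration up front means a gate that points at a phase that does not exist is caught before any agent runs.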

The kernel provides

  • Atomic state — every step a SQLite transaction
  • The idempotency ledger — crash-safe, no double work
  • Replay — deterministic, auditable runs
  • Gate machinery — park, wake, policies, audit trail

What a bundle could be

  • Incident-response runbooks with human sign-off
  • Research pipelines: gather → synthesize → verify
  • Content workflows: draft → edit → legal gate → publish
  • Any review-gated, multi-step LLM process

The code bundle (review-gated implementation) ships today. The kernel’s domain-blindness isn’t a slogan — it’s enforced: zero runtime dependencies, no vendor or domain names in the kernel, checked by CI greps. Read how bundles plug in ↗

What it guarantees — honestly

Prove the process,
not the model

loom guarantees the process: the declared review ran, nothing was bypassed, irreversible steps got a human. The quality of the output remains the agents’ job; what you get is proof of which process ran.

Where it stands today

v0.3 — early, built in the open, used daily by its author on real repos. The core (state machine, recovery, audit trail) is stable and heavily tested; APIs may move before 1.0.

Follow the repo ↗

Bring it to your team

Want auditable AI agents
in your company?

I’m the author of loom. If your team is putting AI agents to real work — and needs to prove what they did, for engineering discipline or for compliance — I can set that up with you.

  • Pilot in a week — a review-gated agent pipeline running on one of your repositories, with gates where your process needs them.
  • Custom bundles — your domain encoded as phases, gates, and commit-time invariants: incident response, content pipelines, compliance workflows.
  • On-prem & audit-ready — local-first deployment, no data leaves your infrastructure, a replayable audit trail for every agent decision.

Prefer email? teaarte@gmail.com

Tell me about your use case

I usually reply within a day. No newsletter, no spam.

FAQ

Questions, answered

How is this different from LangGraph / agent frameworks?

Frameworks help you author agent graphs; loom makes the run itself provable. Replay-determinism (one timestamp token, atomic commits), an idempotency ledger (crash → restart → exact dedup), and invariants enforced inside the database transaction are the difference between “my graph usually works” and “I can prove what happened”.

What does a run cost?

The default backend is your Claude Code subscription — no extra API spend. With API backends, loom records tokens and real cost per spawn, and a hard total-spawn cap bounds runaway runs.

Is my code safe while an agent works?

Steps run in an isolated git worktree — your working tree is never touched. For unattended runs, --docker puts each spawn in a container that mounts only a dedicated clone. Finished work lands on a loom/<task> branch, reviewable, never auto-merged.

Can I use models other than Claude?

Yes. Bind any agent to OpenRouter, local Ollama, or the Anthropic API (loom models set implementer openrouter:deepseek/deepseek-chat), with per-agent fallback chains. File-editing agents run through Aider or opencode harnesses behind the same isolation seam.
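
A fallback chain, in its simplest form, tries each backend in order and returns the first answer. The Python sketch below shows only that control flow. The `call_with_fallback` function and both stub providers are hypothetical, and `ollama:example-model` is a placeholder name, not a real binding; loom's own provider interface is not shown.

```python
from typing import Callable, Sequence

Provider = Callable[[str], str]  # takes a prompt, returns a reply or raises


def call_with_fallback(
    chain: Sequence[tuple[str, Provider]], prompt: str
) -> tuple[str, str]:
    """Try providers in order; return (provider name, reply) from the first success."""
    errors = []
    for name, provider in chain:
        try:
            return name, provider(prompt)
        except Exception as exc:  # any failure moves on to the next backend
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def flaky_remote(prompt: str) -> str:
    raise TimeoutError("upstream timed out")  # stub standing in for a failed API call


def local_stub(prompt: str) -> str:
    return f"echo: {prompt}"  # stub standing in for a local model


chain = [
    ("openrouter:deepseek/deepseek-chat", flaky_remote),
    ("ollama:example-model", local_stub),  # placeholder model name
]
used, reply = call_with_fallback(chain, "summarize the diff")
```

Returning the provider name alongside the reply keeps the audit trail honest about which backend actually answered.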

What's the data story?

Everything lives in <project>/.loom/state.db — a plain SQLite file you own. No cloud, no telemetry. Open it with any SQLite client and read the full audit trail.
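
For instance, a few lines of Python can open the file read-only and list its tables. The sketch assumes nothing about loom's schema: it queries SQLite's built-in `sqlite_master` catalog, and `list_tables` is a helper name invented for this example.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: str) -> list[str]:
    """Return the table names in a SQLite file, opened read-only."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"  # read-only: never modify the audit trail
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]
```

Calling `list_tables(".loom/state.db")` from a project directory would show what there is to query; from there, ordinary `SELECT` statements work as in any SQLite client.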

Hand it a task.
Approve at the gates that matter.

npm i -g @loomfsm/pipeline && loom up