Choosing a runtime when model portability is the constraint.

Strategic constraint: agents must support model choices from open weights (Ollama, vLLM) through frontier (Claude, GPT, Gemini), and the architecture must never be tied to one provider. That requirement immediately narrows the runtime layer to options that are multi-provider native. This leaf works the decision: which runtime, what it gives you, what you give up, and when the calculus flips toward a first-party SDK like Claude Agent SDK instead.

Members · gated Runtime layer Decision doc Hot · quarterly review

01 · TL;DR

What the SDK is, in one paragraph.

The recommendation: LangGraph as the runtime, paired with a model gateway underneath (Cloudflare AI Gateway, OpenRouter, LiteLLM — see the model gateway category). LangGraph is multi-provider native (provider switching is a config call to init_chat_model), mature (v1.0 late 2025, 90k+ stars), MCP-aware, and pairs cleanly with both LangSmith and OTel-based audit. The second pick is Vercel AI SDK for TypeScript-first shops; Pydantic AI for type-first Python. The Claude Agent SDK is the right choice only when you have committed to Claude as the single model — which, by the architecture's stated constraint, this design hasn't.

About this leaf

This is a member-tier decision leaf on know.2nth.ai. Opinionated content — "the runtime layer choice and what it costs" — goes behind the join wall. Reference content (the landscape, categories, options per category) stays open at /tools/. Sign up via the join form to unlock the rest.

02 · Why LangGraph, what we considered 🔒

Six runtimes weighed, ranked by model-portability fit.

The model-portability constraint cuts the candidate list immediately: any first-party SDK that locks to one provider is out. Among multi-provider options, the deciding factor is maturity at scale (state, multi-agent orchestration, MCP support, ecosystem) versus ergonomics (DX, type safety, smaller mental model).

Runtime	Verdict
LangGraph — pick	Multi-provider native via `init_chat_model`. Mature (v1.0 late 2025, 90k+ stars). MCP-aware as of late 2025. Best-in-class for stateful multi-agent orchestration when you grow into it. Strong observability story (LangSmith native; Langfuse via OTel exporter). Pairs cleanly with LiteLLM as the model gateway. Python ecosystem is the broader hireable pool.
Vercel AI SDK v6 — alt for TS shops	TypeScript-first, native multi-provider, lovely DX, ToolLoopAgent for production. Narrower scope than LangGraph for complex orchestration. Right pick if the team is TS-only.
Pydantic AI — alt for type-first Python	Multi-provider; Pydantic-typed everywhere. Smaller community; less mature for complex multi-agent. Right pick if your stance is "types over flexibility."
CrewAI	Role-based, fast adoption (60%+ Fortune 500 by Jan 2026). Multi-provider. Coarser permission model than LangGraph; less granular hook surface for layering Cerbos / Langfuse. Outgrown by teams that need fine-grained control.
Claude Agent SDK — rejected here	First-party for Claude. Beautiful primitives (`allowed_tools`, `PreToolUse` hooks, MCP-native, Skills, prompt caching). Claude-only. The right pick when the team has committed to Claude as the single model; ruled out by this architecture's model-portability constraint.
OpenAI Agents SDK — rejected here	OpenAI-only. Same shape of disqualification as Claude Agent SDK, opposite direction.
Raw API + DIY loop	Maximum control; you rebuild the loop, context compaction, tool-result feeding, MCP plumbing. Worth it if you have unusual constraints; otherwise the LangGraph-style runtime saves months of work.

The deciding factor. If model portability is a stated requirement, any first-party SDK that locks to one provider is disqualified before the analysis starts. That leaves multi-provider runtimes. Among those, LangGraph's combination of maturity, multi-agent depth, and MCP support wins for production workloads. Vercel AI SDK wins for TypeScript-only shops. Pydantic AI wins for teams that prize type safety over flexibility.

The architecture lets you reverse this later: swapping LangGraph for Vercel AI SDK is roughly one day per agent, since MCP, Cerbos, Langfuse, and Inngest all sit downstream of the runtime and don't care which one is on top.

03 · What LangGraph does, what you layer on top 🔒

Five concrete jobs — and what the architecture wires around it.

A runtime is narrow by design. Knowing exactly what LangGraph gives you (and what it doesn't) is the difference between using it cleanly and fighting it.

1 · Hosts the agent loop

Model call, tool-call parsing, tool execution, result feedback, state transitions. Multi-step graphs as a first-class primitive.

2 · Multi-provider model calls

init_chat_model("claude-opus-4-7") or "gpt-5" or "ollama:llama-4". Switch with a config change, not a refactor.

3 · MCP-aware (as of late 2025)

Connects to MCP servers directly. Points at the mcp-gateway Worker URL like any other MCP source.

4 · Interrupts + state checkpoints

Built-in human-in-the-loop interrupts. State checkpointing for replay + persistence. Pair with Inngest for cross-process pauses.

5 · Native LangSmith observability

Traces flow into LangSmith out of the box; OTel exporter sends them to Langfuse instead if that's the audit backend.

What LangGraph does NOT do — and what the architecture layers in

Model routing across providers + cost / cache. Layered in via the model gateway (Cloudflare AI Gateway + LiteLLM) below LangGraph.

Audit / proof of work. LangGraph emits traces; Langfuse stores them via the OTel GenAI exporter.

Permission policy beyond the framework allowlist. Layered in via Cerbos, called from a custom node or pre-tool hook.

Cross-process human-in-the-loop approval. LangGraph's interrupts are in-process; for durable approvals (HTML form, hours-long wait, retries), Inngest sits alongside.

Human-facing surface for the same actions. An HTML form on Cloudflare Pages that fires the same Inngest event — identical code path.

04 · Implications to plan for 🔒

The honest cost-of-being-right.

A multi-provider runtime keeps the model itself optional, which is the whole point. But the layer isn't free. Six implications worth saying out loud.

1 · Provider-portable, by construction

The whole reason for picking LangGraph (or Vercel AI SDK / Pydantic AI) is that init_chat_model("<provider>:<model>") swaps the model. Going from Claude to GPT to a self-hosted Llama via Ollama is a config change, not a code change. The architecture's downstream layers (MCP, Cerbos, Langfuse, Inngest) are all provider-neutral too.

2 · Multi-provider testing is a tax

If you actually use the portability — routing one task to Claude, another to GPT, another to Llama — you need an eval harness that runs across providers. Different models have different tool-calling reliability, different context-window costs, different refusal patterns. Budget time for an eval suite from day one.

3 · State-graph mental model

LangGraph models agents as state graphs — nodes for steps, edges for transitions, interrupts for human gates. Worth the learning curve at production scale (multi-agent orchestration, replay-debuggable workflows) but slower to start than a one-loop SDK. Expect a week of ramp-up per engineer.

4 · Observability backend choice is downstream

LangGraph emits traces to LangSmith by default; that's the slick path. For Langfuse (self-hosted, OSS), use the OpenTelemetry GenAI exporter and accept slightly less polish in the UI in exchange for owning the trace store. The reversal cost is low — pick later if you need to.

5 · Model gateway sits underneath, not inside

LangGraph's init_chat_model can route to provider-direct (e.g. Anthropic SDK, OpenAI SDK) or to a unified endpoint (LiteLLM, OpenRouter, Cloudflare AI Gateway). Use the unified endpoint — centralised auth + caching + rate limits + cost metering matter more than the marginal latency of a hop. The model gateway is its own layer in the architecture, deliberately.

6 · If you commit to Claude later, switch back

The Claude Agent SDK is genuinely the best fit when (and only when) Claude is the only model. If a pilot proves that's the right call — Claude wins on every task that matters — switching to Claude Agent SDK is ~1 day per agent. The lock-in becomes acceptable because it bought first-party features (Skills, MCP, prompt caching, sub-agents). Don't pre-commit; reserve the option.

05 · Connections

Where this leaf links into the rest of the tree.

→ Tools catalog → Architecture → Claude Agent SDK (general explainer) → MCP → Agent Skills → Claude (the model) → LangGraph (alternative we passed on) → CrewAI (alternative we passed on) → OpenAI Agents SDK (alternative)

06 · Resources

Primary sources.

Linked tersely. The SDK moves fast — verify the version against the date on this leaf.