LangGraph is the graph-based orchestration framework from LangChain Inc., built around a StateGraph primitive with explicit nodes, edges, and shared state: you draw the agent flow, the framework runs it. It is MIT-licensed, with Python and TypeScript SDKs, native checkpointing, time-travel debugging, human-in-the-loop interrupts, and deep LangSmith observability. In 2026 it is the de facto choice when you want full control over agent behaviour rather than letting an LLM decide what happens next, and it runs in production at Klarna, Replit, Elastic, AppFolio, and across the broader LangChain ecosystem.
LangGraph is a low-level orchestration framework for building stateful, multi-step agent workflows. Released in early 2024 by LangChain Inc. and shipped as a sibling library to LangChain itself, it took a deliberately different design from the "chain" abstraction that made LangChain famous.
The thesis: most production agent failures are control-flow failures, not reasoning failures. The LLM picks the wrong tool, calls it twice, loops forever, or misses a step. LangGraph's answer is to lift the flow out of the prompt and into the framework. You define a StateGraph — nodes, edges, shared state — and the graph runs deterministically. The LLM still reasons inside nodes; the framework decides which node runs next.
That single architectural choice cascades into the things LangGraph does well: checkpointing (the state at every node is persisted, so you can resume, inspect, or fork from any step), time-travel debugging (rewind state to a previous node and re-run), human-in-the-loop interrupts (pause the graph, ask a human, resume), and fan-out / fan-in patterns (parallel branches that merge cleanly). All of which are awkward to bolt onto a "let the agent decide" framework.
LangGraph is MIT-licensed and lives across langchain-ai/langgraph (Python) and langchain-ai/langgraphjs (TypeScript). The Python repo crossed 17,000 GitHub stars in 2026 and ships ~weekly releases. The framework reached its 1.0 stable release in late 2025, signalling production-readiness; v1.x has been the recommended track since.
LangChain's original abstraction was the chain — linear pipelines of LLM calls. The agent abstraction (ReAct loops) added autonomy but lost predictability. LangGraph's graph sits between: as flexible as an agent (cycles, branches, dynamic routing) but as predictable as a chain (you can read the graph and know what it does). For most production workloads, that middle position is structurally where the work lives.
State, Nodes, Edges, Checkpointer, and Interrupts. Master those five and 90% of LangGraph clicks. The rest is integration: which LLM, which tool layer, which deployment surface.
State is the single shared object every node reads from and writes to. Defined as a TypedDict in Python (or a Zod schema in TS), it's the contract between nodes. Nodes are plain functions that take state and return a partial state update. Edges connect nodes — either statically (always go from A to B) or conditionally (a function inspects state and returns the next node name).
The minimal LangGraph in Python:
```python
from typing import Annotated, TypedDict

from langchain_anthropic import ChatAnthropic
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages

class State(TypedDict):
    messages: Annotated[list, add_messages]

def chatbot(state: State):
    llm = ChatAnthropic(model="claude-sonnet-4-6")
    return {"messages": [llm.invoke(state["messages"])]}

graph = StateGraph(State)
graph.add_node("chatbot", chatbot)
graph.add_edge(START, "chatbot")
graph.add_edge("chatbot", END)
app = graph.compile()

# Run it
result = app.invoke({"messages": [("user", "Hello")]})
```
Conditional edges are where LangGraph earns its keep. A function reads state and decides which node runs next. This is how you build routing, retries, and tool-use loops without the LLM picking from a freeform menu:
```python
def route(state: State) -> str:
    last = state["messages"][-1]
    if last.tool_calls:
        return "tools"
    return END

graph.add_conditional_edges("chatbot", route, {
    "tools": "tool_node",
    END: END,
})
graph.add_edge("tool_node", "chatbot")  # back to the LLM
```
Checkpointing. Every node executes inside a transaction. If you compile the graph with a Checkpointer (in-memory, SQLite, Postgres, or Redis backends), the state at every step is persisted. That gives you free wins:
Conversation memory falls out for free: the same thread_id keeps history across calls.

```python
from langgraph.checkpoint.postgres import PostgresSaver

checkpointer = PostgresSaver.from_conn_string("postgresql://...")
app = graph.compile(checkpointer=checkpointer)

# Same thread_id keeps state across calls
config = {"configurable": {"thread_id": "user-42"}}
app.invoke({"messages": [("user", "Hi")]}, config)
app.invoke({"messages": [("user", "What did I just say?")]}, config)
```
Human-in-the-loop interrupts. Pause the graph mid-execution, let a human review or edit state, then resume. This is the load-bearing feature for any agent that touches money, sends external messages, or has compliance constraints:
```python
from langgraph.types import interrupt, Command

def approve_payment(state: State):
    decision = interrupt({"amount": state["amount"]})
    if decision["approved"]:
        return {"status": "approved"}
    return {"status": "rejected"}

# Run; graph pauses at interrupt(). Resume with a Command:
app.invoke(Command(resume={"approved": True}), config)
```
In a freeform agent, "did the model approve a $50,000 transfer?" is a prompt-engineering question. In LangGraph, it's an architectural fact — the approval node either ran or it didn't, the interrupt either fired or it didn't, and you can read the graph to know which is true. Predictable control flow makes audit and compliance tractable. That's the productivity argument for the extra boilerplate.
LangGraph alone is a graph runtime. Plug it into the LangChain ecosystem and you get the largest tool integration library in agent-land. Plug LangSmith on top and you get the most mature observability story in the field. Wire in LangGraph Platform (the commercial managed runtime) and you get hosted deployment with checkpointing, scaling, and monitoring as managed services.
LangGraph: the MIT-licensed graph runtime itself. Python and TypeScript SDKs. Self-host anywhere: Docker, Cloud Run, Lambda, Hetzner droplets, your laptop.
LangChain: the sibling library. 700+ tool integrations, retrievers, document loaders, output parsers, agents. LangGraph nodes typically use LangChain tools internally.
LangSmith: commercial observability for LangChain and LangGraph runs. Traces, evaluations, prompt management, dataset curation. Free tier available; paid tiers for production volume.
LangGraph Platform: managed runtime for LangGraph apps. Hosted checkpointing, autoscaling, threads API, Studio (visual graph debugger). Self-host or cloud-host options.
The honest commercial picture. LangGraph the library is genuinely free MIT. The free tier of LangSmith covers small teams. At production scale — high throughput, long retention, multi-seat — LangSmith and LangGraph Platform are the revenue model. Most teams that adopt LangGraph end up paying LangChain Inc. for one or both. That's the trade compared with Google ADK + Vertex AI Agent Engine (different vendor, similar pattern) or self-hosting LangGraph + a roll-your-own observability layer (more work, no SaaS bill).
LangGraph and Google ADK are the two strongest "build production agents" frameworks in 2026. They make different bets. The right choice depends less on quality — both are excellent — and more on ecosystem alignment, deployment surface, and how much explicit control you want.
| Dimension | LangGraph | Google ADK |
|---|---|---|
| Default control style | Explicit graph — you draw it | Hierarchical agents — LLM routes |
| Cloud alignment | Vendor-agnostic; runs anywhere | GCP-first; managed surface on Vertex AI |
| Tool ecosystem | LangChain — the largest in agent-land | MCP + LangchainTool / CrewaiTool adapters |
| Observability | LangSmith (paid) — the most mature | OpenTelemetry GenAI + Cloud Trace |
| Voice / multimodal | Possible, not native | Native via Gemini Live API |
| Checkpointing & time-travel | Native, with multiple backends | Sessions + Memory Bank (managed) |
| Boilerplate | Higher — you write the graph | Lower — agents wire themselves up |
| Languages | Python, TypeScript | Python, TypeScript, Go, Java |
| A2A interop | Community wrapper | First-party (to_a2a()) |
| Best fit | Predictable flows, compliance-heavy work, multi-vendor stacks | GCP deployments, Gemini-first, voice agents, faster prototypes |
LangGraph and ADK aren't mutually exclusive. The pattern that's emerging in 2026: LangGraph for the orchestration "control plane" (where you want explicit edges and audit), ADK or specialist frameworks for the "data plane" (Gemini-Live voice agents, Vertex-managed sub-agents). The two talk via A2A. LangGraph's strength as the orchestrator is the explicit graph; ADK's strength as a sub-agent is the managed runtime. Use both for what each is best at.
Six patterns where LangGraph has shipped at scale, drawn from the LangChain Inc. case studies, public reference architectures, and production engineering blogs.
ParallelAgent-style patterns are natively expressed as graph branches.

LangGraph is opinionated. The opinion is that explicit control beats implicit reasoning for production work, and the cost is more boilerplate. That trade is right for some teams and wrong for others. Honest two-sided guidance follows.
LangGraph is on a stable v1.x track. Pin to the major version in production (langgraph~=1.0, which allows 1.x updates but not 2.0) but expect to update monthly — the API is stable but the ecosystem moves. LangChain itself has a faster cadence; if you depend on specific LangChain integrations, pin those more conservatively. The graph definition itself is the durable artefact — you can usually upgrade the runtime without rewriting the graph.
LangGraph's vendor-agnostic shape and explicit control flow have practical leverage in SA delivery contexts. POPIA compliance is easier when the audit trail is a checkpointer table, not a vendor's proprietary log. Multi-cloud is easier when the framework runs anywhere. Cost predictability is easier when you can swap an expensive cloud LLM for a local Ollama call by changing one node.
For SA banks, insurers, and telcos with mixed-cloud or AWS-first estates, LangGraph is the right framework choice if ADK's GCP-bias is a problem. Run on EKS in Cape Town (AWS af-south-1), checkpoint to RDS Postgres in the same region, observe via LangSmith or self-hosted Phoenix. The audit trail stays on-region; the framework doesn't care which LLM you call. POPIA Section 72 cross-border-transfer concerns become a per-node decision: which nodes call Claude (US), which call Vertex (JHB), which call local Ollama. The graph itself documents the policy.
For mid-market builds, LangGraph + Cloud Run + a small Postgres instance for checkpointing is a productive stack. LangSmith free tier covers most pilots; you can defer the paid SKU until volume actually justifies it. The pragmatic path for SA studios: start in LangGraph dev mode on a developer laptop, ship to Cloud Run for staging, only move to LangGraph Platform if a client wants the managed runtime + Studio visual debugger.
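As a sketch of that Cloud Run shape — all names here are illustrative assumptions, including the app.py ASGI wrapper that would expose the compiled graph over HTTP:

```dockerfile
FROM python:3.12-slim
WORKDIR /srv
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Cloud Run injects PORT; a DATABASE_URL env var would carry the Postgres checkpointer DSN
CMD exec uvicorn app:api --host 0.0.0.0 --port ${PORT:-8080}
```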
LangGraph is one of the better frameworks to learn agent patterns on, precisely because the abstractions are explicit. You can see the nodes, the edges, the state. The official LangChain Academy course is free and SA-friendly (no Google Cloud account or paid model needed — pair with Ollama-hosted Gemma 3 for the model layer). For SA developers learning agentic patterns in 2026, LangGraph + Ollama is a credible zero-cost learning stack.
FX cost note. LangSmith is USD-billed and the paid tiers add up at production volume. If FX exposure is a constraint, plan for one of: (a) self-hosting an OpenTelemetry-based observability stack (more engineering work upfront), (b) using LangSmith's free tier and accepting the 14-day retention limit, or (c) treating LangSmith as a strategic line item and locking in annual pricing.
ChatOllama from langchain-ollama covers the local-model path; the same node code works against frontier APIs and local models. Claude comes in via langchain-anthropic.

Authored from the canonical langchain-ai/langgraph repo, the LangGraph documentation, the LangChain Academy course materials, and LangChain Inc.'s public case studies and engineering blog posts. Last reviewed 2026-05-10.