How to Build LLM Agent Workflows with n8n

n8n LLM automation covers agents, RAG pipelines, and tool-calling workflows you can run on self-hosted infrastructure. Explore templates and get started.

n8n Marketplace Team · May 28, 2026 · 10 min read

Most n8n tutorials stop at the zero-shot OpenAI call: send text in, get text out, route on the result. That's a useful pattern and it covers a lot of ground. But it's one layer of what n8n LLM automation can actually do.

The more interesting territory is agents: workflows that can search a knowledge base before answering, decide which tool to call based on the query, maintain context across multiple messages, or A/B test two models on the same prompt set. These patterns exist in n8n today through the LangChain-based nodes that shipped in v1.x. They're just underexplained.

This guide covers how to build them from scratch.

What LLM Agent Workflows Can Do

Once you move past single-shot OpenAI calls, the use cases expand considerably:

  • Search a Pinecone or Qdrant vector store and answer questions grounded in your own documents
  • Call external APIs as tools mid-execution (read a CRM record, check an inventory count, pull a Slack thread) and fold the result into the next prompt
  • Maintain a sliding window of conversation memory across executions using the Buffer Window Memory node
  • Score and classify outputs automatically, then route based on numeric fields or categories the model returned
  • Run blind A/B comparisons between two models and aggregate per-dimension scores into a Google Sheets report

Not every workflow needs an agent. A single call to score sentiment doesn't need tools or memory. The n8n OpenAI workflows guide covers that tier well. The agent architecture earns its complexity when the task genuinely requires multi-step reasoning or context that doesn't fit in the prompt.

The LLM Agent Pipeline

The backbone of any agent-based n8n workflow looks like this:

Trigger → Code Node (normalize) → AI Agent Node
                                      ↳ Tool: HTTP Request (CRM, calendar, inventory)
                                      ↳ Tool: Vector Store Retrieval
                                      ↳ Memory: Buffer Window Memory
                                  → Code Node (parse output) → Switch Node → Actions

The Agent node runs the core reasoning loop. The model sees the system prompt and the available tools, picks one, gets a result, and continues until it produces a final answer or hits the iteration ceiling. Set that ceiling explicitly in the node settings. The default of 10 iterations sounds conservative, but a misconfigured tool that returns an empty result on every call will drive 10 API calls before failing. Costly if you let it run.
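To make the iteration ceiling concrete, here is a simplified sketch of a tool-calling loop in plain JavaScript. It is not n8n's or LangChain's actual implementation: `runAgentLoop`, `decide`, and the `lookup` tool are hypothetical stand-ins, and `decide` plays the role of the model call.

```javascript
// Simplified tool-calling loop. `decide` stands in for the model: given the
// history so far, it returns either { tool, input } or { final }.
function runAgentLoop(decide, tools, maxIterations = 10) {
  const history = [];
  for (let i = 0; i < maxIterations; i++) {
    const step = decide(history); // one "model call" per iteration
    if (step.final !== undefined) {
      return { status: "done", answer: step.final, modelCalls: i + 1 };
    }
    const tool = tools[step.tool];
    const result = tool ? tool(step.input) : `unknown tool: ${step.tool}`;
    history.push({ tool: step.tool, input: step.input, result });
  }
  // Ceiling reached without a final answer.
  return { status: "max_iterations", answer: null, modelCalls: maxIterations };
}

// A "model" that keeps calling the tool until it sees a non-empty result.
const stubbornModel = (history) => {
  const last = history[history.length - 1];
  return last && last.result ? { final: last.result } : { tool: "lookup", input: "q" };
};

// A misconfigured tool that always returns an empty result.
const brokenTools = { lookup: () => "" };

const outcome = runAgentLoop(stubbornModel, brokenTools, 10);
console.log(outcome.status, outcome.modelCalls);
```

With the broken tool, every one of the 10 allowed iterations is spent before the loop gives up, which is the cost pattern described above. Lowering `maxIterations` caps that spend.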

Step by Step: From Trigger to Agent Output

1. Collect: Normalize the Input First

A Webhook trigger or Form Trigger delivers raw input. Run it through a Code node (v2, using the jsCode parameter) before it reaches the Agent node. The Agent node doesn't surface clean parse errors when the prompt assembly breaks — it either produces a hallucinated output or silently times out. Stripping null fields, trimming whitespace, and enforcing a known schema before the model sees the input removes an entire class of failure that's hard to diagnose from the execution log.

Not optional. A five-line Code node buys you a lot of debugging time later.
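As a rough illustration of what that Code node might contain, the sketch below trims strings, drops empty fields, and keeps only an allow-list of keys. The field names (`email`, `name`, `message`) are assumptions for the example, not part of any n8n schema, and it assumes the fields sit at the top level of the item (Webhook payloads usually nest them under `body`, so adjust accordingly).

```javascript
// Hypothetical allow-list: only these fields reach the prompt template.
const ALLOWED_FIELDS = ["email", "name", "message"];

// Normalize one raw payload: trim strings, drop null/empty fields,
// and discard anything outside the allow-list.
function normalizeInput(raw) {
  const clean = {};
  for (const key of ALLOWED_FIELDS) {
    let value = raw == null ? undefined : raw[key];
    if (typeof value === "string") value = value.trim();
    if (value === null || value === undefined || value === "") continue;
    clean[key] = value;
  }
  return clean;
}

// Inside an n8n Code node ("Run Once for All Items") the call would look like:
//   return $input.all().map((item) => ({ json: normalizeInput(item.json) }));

const sample = normalizeInput({
  email: "  a@example.com ",
  name: null,
  message: " hi ",
  extra: 1,
});
console.log(sample);
```

The allow-list is the design choice that matters here: unknown fields never reach prompt assembly, so a changed form or a stray header cannot silently alter what the model sees.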

2. Process: Configure the Agent Node

The AI Agent node (@n8n/n8n-nodes-langchain.agent) takes three inputs: the model connection, the tools list, and optionally a memory connection. Each tool is a sub-workflow or an HTTP Request node that the agent can call during execution.

System prompt quality matters more than most documentation suggests. A prompt like "Use the knowledge base tool first. Only call the CRM tool if the knowledge base doesn't answer the question" consistently outperforms "You are a helpful assistant" in tool-selection accuracy — the model uses the system prompt as its routing logic. The more specific the instructions, the fewer wasted tool calls.

3. Route: The Agent Decides

The Agent node chooses which tool to call at each step based on model reasoning. That's the architectural shift from a standard n8n workflow: the routing isn't a Switch node you control, it's inference. What you can control is the tool descriptions, because those become part of the prompt.

Short, verb-led descriptions work best. "Look up customer record by email" beats "This tool retrieves customer information from the CRM database when provided with an email address." The model parses intent from natural language; verbose descriptions don't add accuracy, they add noise.
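One way to keep descriptions short is to check them before wiring tools to the agent. The sketch below is illustrative only: the tool objects, their names, and the 10-word budget are assumptions, not an n8n feature.

```javascript
// Hypothetical tool definitions; the description is the text the model reads.
const toolDefs = [
  { name: "crm_lookup", description: "Look up customer record by email" },
  {
    name: "crm_lookup_verbose",
    description:
      "This tool retrieves customer information from the CRM database when provided with an email address.",
  },
];

// Return the names of tools whose description exceeds a word budget.
// The default budget of 10 words is arbitrary.
function flagVerboseDescriptions(tools, maxWords = 10) {
  return tools
    .filter((t) => t.description.trim().split(/\s+/).length > maxWords)
    .map((t) => t.name);
}

const verbose = flagVerboseDescriptions(toolDefs);
console.log(verbose);
```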

4. Act: Parse the Final Output

The agent's final output arrives as text. A Code node parses any structured fields the model embedded in the response. Skip this step and you'll eventually get a JSON object wrapped in a markdown fence, or a leading sentence before the object, and your downstream Switch node will fail in a way that's annoying to trace.

Even when the prompt asks for clean JSON output, the model doesn't guarantee it 100% of the time. The parse step handles the edge cases.
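A minimal sketch of such a parse step follows. It assumes the agent's text contains a single JSON object, possibly wrapped in a markdown fence or surrounded by prose; `extractJson` is a hypothetical helper name, and the approach (take the span from the first `{` to the last `}`) is deliberately simple rather than robust against every output shape.

```javascript
// Three backticks, built programmatically so this example stays fence-safe.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i");

// Pull one JSON object out of model text. Returns null when nothing parses,
// so the caller can route to an error branch instead of throwing.
function extractJson(text) {
  if (typeof text !== "string") return null;
  // Prefer the contents of a fenced block when one is present.
  const fenced = text.match(FENCED_BLOCK);
  const candidate = fenced ? fenced[1] : text;
  const start = candidate.indexOf("{");
  const end = candidate.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    return JSON.parse(candidate.slice(start, end + 1));
  } catch (err) {
    return null;
  }
}

const reply =
  "Here is the result:\n" + FENCE + 'json\n{"score": 82, "label": "hot"}\n' + FENCE;
console.log(extractJson(reply));
console.log(extractJson('Sure! {"a": 1} Hope that helps.'));
console.log(extractJson("no json here"));
```

In an n8n Code node this would typically run against whichever field carries the agent's text (often `output`, though that is an assumption to verify in your own execution data). Returning `null` instead of throwing lets a downstream Switch or IF node send unparseable replies to a review path.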

Where agent workflows fail in production

The two most common failure modes are: a tool that returns an empty or error response on every call (which drives the agent to max iterations before failing), and an output parser that assumes clean JSON when the model occasionally wraps it in prose. Add an Error Trigger workflow that catches failed executions and logs the tool call history — without that, debugging is guesswork.

5. Follow Up: Log Token Usage

Every agent execution should write a record: the input, which tools were called, and the token count. The basic @n8n/n8n-nodes-langchain.openAi v2.1 node exposes token usage at $json.output[0].usage.total_tokens when called directly. For the Agent node, token counts per tool call show in n8n's execution log in the UI. Without logging, runaway executions only surface when the API bill arrives at the end of the month.
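A logging step could be sketched as below. The usage path mirrors the one quoted above for the basic OpenAI node; the record shape and the `buildUsageRecord` name are assumptions, and optional chaining is used so the logger returns `null` rather than throwing if the node's output shape differs in your version.

```javascript
// Build one log row per execution.
function buildUsageRecord(nodeJson, input, toolsCalled) {
  // Path taken from the text above; falls back to null if any level is missing.
  const totalTokens = nodeJson?.output?.[0]?.usage?.total_tokens ?? null;
  return {
    loggedAt: new Date().toISOString(),
    input,
    toolsCalled,
    totalTokens,
  };
}

const record = buildUsageRecord(
  { output: [{ usage: { total_tokens: 321 } }] },
  "Where is my order?",
  ["crm_lookup"]
);
console.log(record);
```

A row like this can be appended to a sheet or database table; summing `totalTokens` per day gives an early warning well before the invoice does.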

Implementation Patterns

Pattern 1: RAG with Vector Store

For smaller knowledge bases (under 10,000 chunks), n8n's in-memory Vector Store works without an external vector database:

Document Loader → OpenAI Embeddings Node → In-Memory Vector Store
                                               ↑
Webhook → OpenAI Embeddings Node → Similarity Search → Context Builder → OpenAI Node → Output

The in-memory store keeps embeddings in the n8n process's memory only and does not persist them to disk, so they are lost when the instance restarts. It suits small document sets that you can re-load when needed. For anything that needs persistent embeddings across sessions, swap in the Pinecone or Qdrant Vector Store node. It connects to the rest of the workflow the same way: swap the node type and update credentials.
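The Similarity Search and Context Builder steps in the diagram can be sketched in plain JavaScript as follows. This is an illustration of the technique (cosine similarity plus top-k selection), not the code n8n's vector store nodes run; the function names and the toy two-dimensional embeddings are assumptions.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding.
function topK(queryEmbedding, chunks, k = 3) {
  return chunks
    .map((c) => ({ ...c, score: cosineSimilarity(queryEmbedding, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Join retrieved chunks into a numbered context block for the prompt.
function buildContext(matches) {
  return matches.map((m, i) => `[${i + 1}] ${m.text}`).join("\n");
}

// Toy 2-dimensional embeddings; real embeddings are far larger.
const chunks = [
  { text: "Refunds take 5 days.", embedding: [1, 0] },
  { text: "Shipping is free over 50 dollars.", embedding: [0, 1] },
  { text: "Refunds require a receipt.", embedding: [0.9, 0.1] },
];

const context = buildContext(topK([1, 0], chunks, 2));
console.log(context);
```

The resulting `context` string is what gets prepended to the user's question in the final model call.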

Test models before you commit to one

The RAG Eval: Blind A/B Agent Tester template handles a production need that's easy to skip: objectively comparing two models or prompts before choosing one. It runs both configurations against a configurable test suite, anonymizes the outputs so the judge model scores blind, aggregates per-dimension scores (accuracy, clarity, completeness, conciseness on a 1–10 scale), and writes a structured findings report to Google Sheets. It's how you make model-selection decisions with data instead of guessing.
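The aggregation step alone can be sketched as below. This is not the template's implementation: the `judgments` shape, the neutral labels `A` and `B`, and the `aggregateScores` name are assumptions chosen to show how per-dimension averages might be computed from a judge model's scores.

```javascript
// Average each scoring dimension per configuration.
// Each judgment: { config: "A" | "B", scores: { dimension: number } }.
function aggregateScores(judgments) {
  const totals = {};
  for (const { config, scores } of judgments) {
    totals[config] = totals[config] || { n: 0, sums: {} };
    totals[config].n += 1;
    for (const [dim, value] of Object.entries(scores)) {
      totals[config].sums[dim] = (totals[config].sums[dim] || 0) + value;
    }
  }
  const report = {};
  for (const [config, { n, sums }] of Object.entries(totals)) {
    report[config] = {};
    for (const [dim, sum] of Object.entries(sums)) {
      report[config][dim] = sum / n;
    }
  }
  return report;
}

// Neutral labels keep the judge from seeing which model produced which output.
const judgments = [
  { config: "A", scores: { accuracy: 8, clarity: 6 } },
  { config: "A", scores: { accuracy: 6, clarity: 8 } },
  { config: "B", scores: { accuracy: 9, clarity: 9 } },
];

const report = aggregateScores(judgments);
console.log(report);
```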

Pattern 2: Tool-Calling Agent for Lead Qualification

The AI Agent node connects a model to live CRM data during execution:

Webhook (form submission) → Code Node (normalize) → AI Agent Node
    ↳ Tool: HTTP Request to CRM (get contact history)
    ↳ Tool: HTTP Request to Sheets (check existing score)
→ Code Node (parse score + reasoning) → Switch (hot/warm/cold) → SendGrid (personalized email)

The model scores the lead 0–100 and outputs a reasoning field explaining the score. The Code node extracts both values. The Switch node routes based on the score. Cold leads drop into a nurture sequence; hot leads go straight to the priority sales queue.
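The parse-and-route step might look like the sketch below. The threshold values (70 and 40) and the extra `review` route for unusable scores are assumptions added for illustration; the source only names the hot, warm, and cold paths.

```javascript
// Illustrative cut-offs; tune these to your own pipeline.
const THRESHOLDS = { hot: 70, warm: 40 };

// Turn the parsed model output into a route label for the Switch node.
function routeLead(parsed) {
  const raw = parsed?.score;
  // Models sometimes return the score as a string such as "85".
  const score = typeof raw === "string" && raw.trim() !== "" ? Number(raw) : raw;
  const reasoning = parsed?.reasoning ?? "";
  if (typeof score !== "number" || !Number.isFinite(score) || score < 0 || score > 100) {
    // Hypothetical fallback: send unusable scores to manual review.
    return { route: "review", score: null, reasoning };
  }
  const route =
    score >= THRESHOLDS.hot ? "hot" : score >= THRESHOLDS.warm ? "warm" : "cold";
  return { route, score, reasoning };
}

console.log(routeLead({ score: "85", reasoning: "Senior buyer, active trial" }));
console.log(routeLead({ score: 12, reasoning: "Student email, no company" }));
console.log(routeLead({ score: "high" }));
```

Coercing the score to a number here means the Switch node can compare a plain `route` string instead of depending on the model's formatting.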

Pre-scored leads on day one

The AI Lead Scoring and Email Routing template implements this exact pattern using GPT-4o-mini, SendGrid, and Google Sheets. It handles the credential injection automatically and takes about 10 minutes to configure. The scoring prompt and threshold values live in a single Config node, so you don't have to touch the workflow logic to adjust them.

Pattern 3: Hybrid Rule Engine and LLM Fallback

Not every decision benefits from LLM reasoning. Predictable approval workflows (expense requests under a threshold, standard content publishing, low-risk access grants) are faster, cheaper, and auditable with a rule-based Switch node. Reserve the model for the edge cases rules can't handle.

Webhook → Switch Node (rule check) → Auto-approve path
                                   OR AI Agent (edge case reasoning) → Slack + Sheets log

This pattern keeps token costs low and predictable. If rules handle roughly 80% of request volume, the model only sees the remaining 20%. A hybrid approach is often the right call for high-throughput workflows where every execution has a cost.
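The rule check at the front of this pattern can be sketched as a small pure function. The rule values and field names (`amount`, `category`) are hypothetical; in n8n the same logic could live in a Switch node's conditions or in a Code node feeding it.

```javascript
// Hypothetical rules for "predictable" approvals.
const RULES = {
  maxAutoApproveAmount: 100,
  autoApproveCategories: ["travel", "software"],
};

// Decide which branch a request takes: deterministic rules first,
// with the AI Agent reserved for everything the rules do not cover.
function decidePath(request) {
  const amountOk =
    typeof request.amount === "number" && request.amount < RULES.maxAutoApproveAmount;
  const categoryOk = RULES.autoApproveCategories.includes(request.category);
  if (amountOk && categoryOk) return "auto_approve";
  return "agent_review"; // edge case: hand off to the AI Agent branch
}

console.log(decidePath({ amount: 40, category: "travel" }));
console.log(decidePath({ amount: 400, category: "travel" }));
console.log(decidePath({ amount: 40, category: "hardware" }));
```

Because the rule branch never calls a model, its cost and its audit trail stay fixed no matter how much volume passes through it.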

n8n Nodes You'll Use Most

Node                                          Purpose
@n8n/n8n-nodes-langchain.agent                AI Agent with tool-calling loop and multi-step reasoning
@n8n/n8n-nodes-langchain.openAi v2.1          Single model call; output at $json.output[0].content[0].text
@n8n/n8n-nodes-langchain.vectorStorePinecone  Pinecone vector store for persistent embeddings
@n8n/n8n-nodes-langchain.memoryBufferWindow   Sliding window memory for conversational context
@n8n/n8n-nodes-langchain.embeddingsOpenAi     Converts text to embedding vectors for similarity search
n8n-nodes-base.code v2                        Input normalization and output parsing (param: jsCode)
n8n-nodes-base.httpRequest v4.2               Agent tools that reach external APIs mid-execution
n8n-nodes-base.switch                         Routes agent output based on parsed numeric fields or labels

Getting Started

  1. Check your n8n version — LangChain nodes require n8n v1.15 or higher. Older instances don't show the Agent node in the node picker.
  2. Add OpenAI credentials — Create an API key at platform.openai.com, add it to n8n under Credentials. GPT-4o-mini handles most classification and scoring tasks at a fraction of GPT-4o cost. Worth starting there before scaling up.
  3. Build a minimal agent first — Before adding tools and memory, wire up a single Agent node with no tools and confirm the model responds as expected. Add complexity one piece at a time.
  4. Add one tool at a time — Each tool is an HTTP Request node with an exact URL, method, and defined output schema. The tool description is prompt text; write it like you're explaining the tool to a human who's never seen it. Test each tool in isolation before connecting it to the agent.
  5. Always add a parse step after the agent — Put a Code node immediately downstream from the Agent node. Even when the model returns clean JSON 95% of the time, the remaining 5% will surface in production at the worst moment.
  6. Import a starting template — The RAG Eval: Blind A/B Agent Tester and AI Lead Scoring and Email Routing templates are ready-to-run starting points for the two most common agent patterns on n8n.

The output parser is the step that gets skipped most often. Don't skip it. A model that appends a reasoning paragraph before its JSON object is indistinguishable from clean output at the prompt-design stage. The failure only appears when the Switch node downstream gets a string where it expected a number.

For the foundational AI patterns (zero-shot classification, structured extraction, scored evaluation) without the agent complexity, the n8n AI automation guide covers those in detail. If you're deploying agent workflows to a self-hosted instance and haven't locked down your production environment yet, the n8n self-hosting guide is worth reading before you expose a webhook to the public internet.

FAQ

Common questions

Can n8n run LLM agent workflows without relying on n8n Cloud?

Yes. All of n8n's LangChain-based nodes (the AI Agent node, Vector Store nodes, Memory nodes, and the OpenAI node v2.1) run on self-hosted n8n exactly as they do on n8n Cloud. The only requirement is that the n8n instance can reach the APIs you've connected (OpenAI, Pinecone, Qdrant, etc.).

What is the difference between the basic OpenAI node and the AI Agent node in n8n?

The basic OpenAI node (@n8n/n8n-nodes-langchain.openAi v2.1) makes a single call and returns output at $json.output[0].content[0].text. The AI Agent node wraps a model with tool definitions, a memory component, and a loop that lets the agent decide which tool to call next. Use the basic node for one-shot tasks; use the Agent node when the workflow needs to reason over multiple steps or retrieve external context.

How does RAG work inside an n8n workflow?

A RAG pattern in n8n uses three stages: a Vector Store node (Pinecone or in-memory) stores embeddings of your documents, an Embeddings node converts the user query into a vector, and a retrieval step returns the top-k matching chunks. Those chunks feed into an OpenAI prompt as context before the model generates its answer. The RAG Eval: Blind A/B Agent Tester template automates this pattern and adds blind A/B scoring to compare model responses against each other.