n8n workflowsdata pipelinesETL automationdata sync

How to Build n8n Data Pipelines (Sync, Transform, and Route Data Between Any App)

Build n8n data pipelines that extract from APIs, transform with Code nodes, and load into any target system: schedule-driven or real-time, no ETL tool needed.

Nn8n Marketplace Team·May 22, 2026·9 min read

Manual Data Syncing Breaks. Pipelines Don't.

Copy-pasting records between systems. Scheduled CSV exports that nobody checks. Spreadsheet formulas that break when a column moves. Every growing team has a version of this problem: data that lives in one system and needs to be somewhere else, cleaned, every day.

n8n lets you build proper data pipelines: extract from any API or database, transform with JavaScript, validate against business rules, and load into any target, on a schedule or triggered by an event. No external ETL service required.

This guide covers how to build reliable extract-transform-load workflows in n8n, from a simple API-to-Sheets sync to a paginated multi-source aggregation that combines data from several platforms into one unified record.

What You Can Automate With n8n Data Pipelines

Any recurring data movement task is a candidate:

API-to-spreadsheet sync: pull records from Stripe, HubSpot, or your product API into Google Sheets every hour
Database mirroring: replicate rows from a PostgreSQL table to a secondary database or analytics store
Multi-source aggregation: combine Shopify orders, Stripe revenue, and GA4 sessions into one unified daily record
Incremental sync: track the last synced timestamp and only fetch new or updated records on each run
Data cleansing: normalize phone numbers, deduplicate contacts, and reject records missing required fields before they land in your CRM
Cross-app data bridge: move records from one platform to another when there's no native integration (e.g., Jotform submissions to PostgreSQL)

Get the Data Entry Hub template: a pre-built n8n data pipeline →

The Data Pipeline

Every n8n data pipeline follows the same shape:

Schedule (or Webhook)
  → HTTP Request (Extract: fetch records from source API)
  → Code (Transform: normalize fields, compute derived values)
  → IF (Validate: reject records missing required fields)
  → Split in Batches (chunk into groups of 100)
    → HTTP Request / Sheets / DB (Load: write to target)
  → Slack (Report: send run summary with counts and errors)

This structure is repeatable and debuggable regardless of which source and target you're using.

1. Trigger: Define When the Pipeline Runs

Use a Schedule Trigger for batch pipelines:

Hourly: 0 * * * *
Nightly at 2am: 0 2 * * *
Every 15 minutes: */15 * * * *

Use a Webhook Trigger when the pipeline should fire in response to an event: a new row added, a product backend event, or a manual trigger from Slack.

For real-time event-driven pipelines that need to respond within seconds, see the n8n webhook automation guide.

2. Extract: Pull Data From the Source

The HTTP Request node handles any REST API. Configure:

Method: GET
URL: your API endpoint (e.g., https://api.stripe.com/v1/charges)
Authentication: Header Auth with Authorization: Bearer YOUR_API_KEY
Query parameters: limit=100, date filters using created[gte]

For paginated APIs, use a loop pattern: fetch a page, load it, fetch the next until all pages are consumed.

Check the exact output path before writing transform logic

Every node exposes results at $json.fieldName. For paginated APIs, records are usually nested at $json.data or $json.items. Always inspect the HTTP Request node's output panel after a test run to find the exact field path before writing downstream Code node logic that depends on it.

3. Transform: Normalize and Reshape

A Code node transforms raw API output into the shape your target expects:

return $input.all().map(item => {
  const src = item.json;
  return {
    json: {
      id:           src.id,
      email:        (src.email || "").toLowerCase().trim(),
      amount_usd:   (src.amount / 100).toFixed(2),   // Stripe sends cents
      created_date: new Date(src.created * 1000).toISOString().split("T")[0],
      status:       src.status,
      name:         [src.first_name, src.last_name].filter(Boolean).join(" "),
    }
  };
});

Keep all field mapping here, not scattered across downstream nodes. One Code node that produces clean, predictable output is far easier to debug than a chain of Set nodes doing partial transforms.

4. Validate: Filter Out Bad Records

An IF node checks each record against business rules before it reaches your target:

Email must not be empty
Amount must be greater than 0
Status must be one of the expected values

Records that fail route to the false branch. Write them to an "Error Log" sheet tab or trigger a Slack alert with the record ID and reason. Records that pass continue to loading.

Log rejected records, do not silently drop them

An IF node's false branch should write failed records somewhere visible. A Google Sheet "Error Log" tab with the record ID and failure reason makes debugging pipelines possible. Silent drops mean you discover bad data days later, not minutes later.

5. Load: Write to the Target

Depending on your target:

Google Sheets node: appendOrUpdate operation to upsert by a key column
HTTP Request: POST the transformed record body to a target REST API
PostgreSQL or MySQL node: INSERT ... ON CONFLICT DO UPDATE to upsert by primary key
Airtable node: create or update records in a base table

Use Split in Batches (batch size: 50–200) before the load step when processing more than a few hundred records. This prevents memory pressure and makes partial failures easy to identify. You know exactly which batch failed.

6. Report: Send a Run Summary

After loading completes, a Slack or Gmail node sends a summary:

Pipeline run complete: 2026-05-22 02:00 UTC
Records processed: 1,247
Records loaded:    1,231
Records rejected:  16 (see Error Log tab)
Duration: 43s

This turns a background job into an auditable process. A failed run becomes visible immediately instead of surfacing as corrupted data days later.

Implementation Patterns

Pattern 1: Scheduled API-to-Sheets Sync

The simplest pattern: fetch from an API every hour and upsert rows into a Google Sheet.

Schedule (every hour)
  → HTTP Request (GET /api/events with date filter)
  → Code (normalize fields: remap, cast types, clean strings)
  → Google Sheets (appendOrUpdate, key column: id)
  → Slack (post count: N records synced)

Works for: Stripe charges, HubSpot contacts, product event logs, support tickets.

Use the Market Trend Analyzer template as a reference for pulling API data into a structured sheet on a schedule. It handles pagination and deduplication out of the box.

Explore the Market Trend Analyzer: an API-to-dashboard pipeline template →

Pattern 2: Multi-Source Aggregation

Merge data from several APIs into one unified record per entity; for example, one row per customer combining billing, CRM, and product data.

Schedule (nightly)
  → HTTP Request (Stripe: customer list)
  → HTTP Request (HubSpot: matching contact by email)
  → HTTP Request (product DB: last login timestamp)
  → Code (merge all three into one object per customer)
  → Google Sheets (write unified customer row)

The Code node after the last fetch assembles the merged object:

const stripeData  = $("Stripe").all();
const hubspotData = $("HubSpot").all();
const productData = $("Product DB").all();

return stripeData.map(s => {
  const hs = hubspotData.find(h => h.json.email === s.json.email);
  const pd = productData.find(p => p.json.email === s.json.email);
  return {
    json: {
      customer_id:   s.json.id,
      email:         s.json.email,
      plan:          s.json.plan,
      hs_deal_stage: hs?.json.dealstage ?? "unknown",
      last_login:    pd?.json.last_login ?? null,
    }
  };
});

The Decision Makers Dashboard template uses this exact pattern: pulling signals from multiple sources and presenting them as one unified view.

Pattern 3: Incremental Sync With a Cursor

For large datasets, fetch only new or updated records since the last run by persisting a cursor in a config sheet.

Schedule (every 15 minutes)
  → Google Sheets (read cursor from config row)
  → HTTP Request (GET /records?updated_since=<cursor>)
  → Code (transform)
  → [load to target]
  → Google Sheets (write new cursor = max updated_at from this batch)

The cursor prevents double-loading already-processed records and keeps each pipeline run fast even as the total dataset grows.

Store cursor state outside the workflow

n8n workflows have no persistent state between executions. Store the cursor (last synced timestamp, last processed ID) in a Google Sheet config row, a database table, or n8n's built-in static data via $workflow.staticData. Read it at the start of every run, write it at the end.

n8n Nodes You'll Use Most

Node	Purpose
Schedule Trigger	Run the pipeline on a cron schedule
HTTP Request	Extract from any REST API or load to any REST target
Code	Transform, normalize, merge, and compute derived fields
IF	Validate records; route failures to an error branch
Split in Batches	Process large record sets in chunks to avoid memory pressure
Google Sheets	Append, upsert, or read cursor state from a spreadsheet
Merge	Combine outputs from multiple upstream extraction branches
PostgreSQL / MySQL	Read from or write to a relational database
Slack / Gmail	Send pipeline run summaries and error alerts
Loop Over Items	Paginate through multi-page API responses

Getting Started

Identify source and target: write down the API endpoint or database table you're extracting from, and exactly where the data needs to land.
Test the extract step alone: add just the Schedule and HTTP Request nodes, run manually, and inspect the raw output. Know the exact field paths before writing any transform logic.
Write the Code node transform: map source fields to target fields, normalize types, strip whitespace, compute derived values. Test against real output from step 2.
Add the IF validation node: define what a valid record looks like and route failures to an error log sheet.
Add Split in Batches before the load: even with 10 records today, the batch node makes scaling later painless.
Add the load node: connect to Google Sheets, a database, or the target API. Test with a single record first, then a full batch.
Add the Slack or Gmail summary node: log record counts and any errors. Activate the workflow only after the full pipeline passes a real test run.

For pipelines that need to classify or enrich records during transformation (scoring leads, detecting intent, categorizing support tickets), see the n8n AI-powered automation guide. For turning pipeline output into live dashboards and scheduled reports, see the n8n data reporting guide.

The App Analytics Dashboard template shows how to close the loop: pipeline output feeding directly into a structured reporting view.

Browse n8n data automation templates →

FAQ

Common questions

What's the difference between an n8n data pipeline and an n8n data report?

A data pipeline moves and transforms data: it extracts from a source, reshapes it, and loads it into a target. A data report reads already-stored data and presents it. In n8n, pipelines use nodes like HTTP Request, Code, and Split in Batches to process records; reports use nodes like Google Sheets and Gmail to surface summaries. You often build both: the pipeline feeds the warehouse, the report reads from it.

Can n8n replace a dedicated ETL tool like Fivetran or Airbyte?

For most small and mid-size data volumes (under a few million records per run), yes. n8n handles extraction via HTTP Request or database nodes, transformation via Code nodes, and loading via any integration. It lacks the pre-built connectors and schema management of enterprise ETL tools, but for custom pipelines between your own systems, it is faster to build and cheaper to run.

How do I handle large datasets in n8n without hitting memory limits?

Use the Split in Batches node to chunk large result sets into groups of 50–200 records and process them sequentially. For paginated APIs, loop over pages with a While or recursive subworkflow pattern, loading each page before fetching the next. Avoid accumulating millions of records in a single workflow execution. Process and discard as you go.

Stop reading. Start running.

Get the workflow templates this guide is built on

Import-ready n8n JSON, step-by-step setup, and tested end-to-end. One-time payment, own it forever.

Browse templates ← All articles

Free — $40 value

Get 3 tested n8n templates, free

The full customer package for three real catalog templates — workflow JSON, step-by-step setup guide, credential checklist. Built through the same live-instance release process as everything we sell. Plus new templates and automation guides in your inbox. No spam, unsubscribe anytime.

01Smart To-Do List ManagerPre-built n8n workflow template that automates productivity with OpenAI. Live in about 10 minutes.$14
02Email Follow-Up AutomatorPre-built n8n workflow template that automates crm with OpenAI. Live in about 15 minutes.$12
03Market Trend AnalyzerPre-built n8n workflow template that automates data processing with OpenAI. Live in about 10 minutes.$14

More automation guides

All posts

n8n Backup Automation That Actually Verifies the Backup

7 min read

n8n workflowsbackup automationDevOps automation

n8n Backup Automation That Actually Verifies the Backup

A Backup You Never Verified Is a Coin Flip Every n8n backup automation tutorial that ranks does the same thing: schedule an export, push the JSON to Google Drive or S3, done. They back up the data and…

Jul 26, 2026Read more →

How to Build an n8n Log Alerting Workflow (No Grafana Required)

7 min read

n8n workflowslog alertingerror handling

How to Build an n8n Log Alerting Workflow (No Grafana Required)

Most n8n Error Setups Tell You Everything, Which Means They Tell You Nothing An n8n log alerting workflow has one job: surface the failures that need a human and quietly file the ones that don't. The…

Jul 25, 2026Read more →

n8n Uptime Monitoring with Slack Alerts (Without the Alert Storms)

7 min read

n8n workflowsuptime monitoringSlack alerts

n8n Uptime Monitoring with Slack Alerts (Without the Alert Storms)

An Uptime Check That Cries Wolf Gets Muted in a Week The point of n8n uptime monitoring with Slack alerts isn't to detect a single outage. It's to be trusted the next hundred times it fires. Most of t…

Jul 24, 2026Read more →