AI Workflows with Ollama
Connect n8n to a local LLM for intelligent automation — zero API costs, zero data leaving your server.
Prerequisites: Completed Part 2, a VPS with 4 GB+ RAM
Time: ~45 minutes
Outcome: 4 AI-powered workflows with zero API costs
New to Ollama? Check out Part 1 of our AI Stack series for a deep dive on model selection and CPU performance tuning.
AI Automation Without the API Bill
Every major automation platform is pushing AI features right now. Zapier has AI actions. Make has AI modules. They all charge extra for them, and they all route your data through third-party LLM APIs — typically OpenAI — at $0.01 to $0.06 per 1K tokens.
Run that across thousands of automated tasks per month and you're looking at real money on top of your already-expensive automation subscription.
Here's the alternative: Ollama running locally on your RamNode VPS, connected to n8n. You get LLM-powered workflows with zero API costs, zero token metering, and zero data leaving your server.
What You Need
Ollama runs language models in memory. The model size determines your RAM requirements:
| Model | Params | RAM | Best For |
|---|---|---|---|
| Phi-3 Mini | 3.8B | 2.5 GB | Fast classification, simple tasks |
| Llama 3.1 | 8B | 5 GB | General purpose, good quality |
| Mistral | 7B | 4.5 GB | Instruction following, code |
| Gemma 2 | 9B | 6 GB | Summarization, analysis |
| Llama 3.1 | 70B | 40 GB | Complex reasoning (GPU VPS) |
For most workflows, an 8B parameter model on a 4 GB+ VPS hits the sweet spot between quality and performance.
Step 1: Install Ollama
```bash
curl -fsSL https://ollama.com/install.sh | sh
```

Pull a model — we'll start with Llama 3.1 8B:

```bash
ollama pull llama3.1:8b
```

Verify it's working:

```bash
ollama run llama3.1:8b "Respond with just 'OK' if you're working."
```

By default, Ollama listens on localhost:11434. Since n8n runs in Docker, we need Ollama accessible from inside the Docker network.
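n8n will talk to Ollama over its REST API rather than the CLI, so it's worth confirming the HTTP endpoint answers as well. A quick check, assuming Ollama is running on the default port (the fallback branch just reports if the server isn't reachable):

```shell
# Query Ollama's REST API directly -- the same endpoint n8n will use.
RESPONSE=$(curl -s http://localhost:11434/api/generate \
  -d '{"model": "llama3.1:8b", "prompt": "Respond with just OK.", "stream": false}' \
  || echo '{"error": "Ollama not reachable on port 11434"}')
echo "$RESPONSE"
```

A healthy server returns a JSON object whose `response` field contains the model's reply.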
Step 2: Configure Ollama for Docker Access
Edit the Ollama service to bind to all interfaces:
```bash
sudo systemctl edit ollama
```

Add the following override:

```ini
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
```

Restart Ollama:

```bash
sudo systemctl restart ollama
```

Binding to 0.0.0.0 exposes port 11434 on every interface, so make sure your firewall blocks it from the public internet. Now update your docker-compose.yml so n8n can reach services running on the host:
```yaml
n8n:
  image: docker.n8n.io/n8nio/n8n
  restart: unless-stopped
  extra_hosts:
    - "host.docker.internal:host-gateway"
  environment:
    - N8N_HOST=n8n.yourdomain.com
    # ... rest of your environment variables
```

Apply the change:

```bash
cd ~/n8n-docker
docker compose up -d
```

n8n can now reach Ollama at http://host.docker.internal:11434.
Step 3: Configure Ollama Credentials in n8n
- Go to Credentials in the left sidebar
- Click "Add Credential"
- Search for "Ollama"
- Set the Base URL to `http://host.docker.internal:11434`
- Save the credential
Workflow 1: Intelligent Email Classifier
Receives emails via IMAP, classifies them using your local LLM, and routes them to the appropriate handler.
Trigger: Email Trigger (IMAP)
- Add an Email Trigger (IMAP) node
- Configure your mailbox credentials
- Set it to check every 2 minutes
Classify with Ollama
Add an Ollama node with this prompt:
```
Classify the following email into exactly one category:

- SUPPORT: Technical issues, bug reports, help requests
- SALES: Pricing inquiries, demo requests, purchase interest
- BILLING: Invoice questions, payment issues, refund requests
- SPAM: Marketing, unsolicited offers, irrelevant content
- OTHER: Anything that doesn't fit the above categories

Respond with ONLY the category name, nothing else.

Subject: {{ $json.subject }}
Body: {{ $json.text.substring(0, 1000) }}
```

Route with Switch
- SUPPORT → Create ticket in your support system via HTTP Request
- SALES → Forward to sales team, log in CRM
- BILLING → Forward to billing, flag in accounting system
- SPAM → Move to spam folder, no notification
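The Switch node matches on exact strings, but small models occasionally wrap the label in punctuation or extra words ("Category: SUPPORT."). A sketch of a normalization step you could place between the Ollama node and the Switch; the `text` field name follows the Ollama output field used elsewhere in this guide, so adjust it if yours differs:

```javascript
// Known labels from the classification prompt; anything unrecognized
// falls back to OTHER so the Switch node always has a matching branch.
const CATEGORIES = ['SUPPORT', 'SALES', 'BILLING', 'SPAM', 'OTHER'];

function normalizeCategory(raw) {
  const upper = String(raw).toUpperCase();
  // The first known label found anywhere in the response wins.
  return CATEGORIES.find((c) => upper.includes(c)) || 'OTHER';
}

// In an n8n Code node you would return the normalized item, e.g.:
// return [{ json: { ...$input.first().json,
//                   category: normalizeCategory($input.first().json.text) } }];
```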
This entire pipeline runs on your VPS at zero marginal cost. On Zapier with AI actions, you'd pay per classification and per downstream action.
Workflow 2: Document Summarizer
Watches for new documents, summarizes them with your local LLM, and posts summaries to Slack.
Summarize with Ollama
```
Summarize the following document in 3-5 bullet points.
Focus on key decisions, action items, and important facts.
Keep each bullet point to one sentence.

Document:
{{ $json.content.substring(0, 4000) }}
```

Post to Slack
```
📄 *New Document Summary*

Source: {{ $('Webhook').first().json.filename }}
Processed: {{ $now.format('yyyy-MM-dd HH:mm') }}

{{ $json.text }}
```

Workflow 3: Support Ticket Auto-Responder
Instead of paying per-ticket for AI triage, your VPS handles it all.
Step 1 — Analyze the Ticket
```
Analyze this support ticket and provide:

1. SEVERITY: low, medium, high, critical
2. CATEGORY: billing, technical, account, feature-request
3. SUGGESTED_RESPONSE: A brief, professional response addressing the customer's issue.

Ticket:
Subject: {{ $json.subject }}
Message: {{ $json.body.substring(0, 2000) }}
```

Step 2 — Parse the Response
```javascript
// Pull the raw model output from the Ollama node.
const response = $input.first().json.text;

// The model doesn't always match case exactly, so normalize to lowercase
// before routing. Fall back to safe defaults when a field is missing.
const severity = (response.match(/SEVERITY:\s*(low|medium|high|critical)/i)?.[1] || 'medium').toLowerCase();
const category = (response.match(/CATEGORY:\s*(\w[\w-]*)/i)?.[1] || 'general').toLowerCase();
const suggestedResponse = response.match(/SUGGESTED_RESPONSE:\s*([\s\S]+)/i)?.[1]?.trim() || '';

return [{
  json: {
    severity,
    category,
    suggestedResponse,
    originalSubject: $input.first().json.subject,
    originalBody: $input.first().json.body
  }
}];
```

Step 3 — Route by Severity
- Critical: Immediately page on-call engineer + auto-respond
- High: Post to team Slack channel + auto-respond
- Medium/Low: Auto-respond with suggested response, queue for human review
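If you prefer a single Code node over chained IF nodes, the routing rules above can be expressed as a lookup table. A sketch; the flag names and the downstream actions they imply are illustrative placeholders, not n8n built-ins:

```javascript
// Map parsed severity to routing decisions, mirroring the rules above.
const ROUTES = {
  critical: { pageOnCall: true,  notifySlack: true,  autoRespond: true },
  high:     { pageOnCall: false, notifySlack: true,  autoRespond: true },
  medium:   { pageOnCall: false, notifySlack: false, autoRespond: true },
  low:      { pageOnCall: false, notifySlack: false, autoRespond: true },
};

function routeTicket(severity) {
  // Unknown severities get the conservative medium treatment.
  return ROUTES[severity] || ROUTES.medium;
}
```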
Workflow 4: Log Analyzer
Feed server logs through your local LLM to detect patterns and anomalies. Triggered every hour on a schedule.
Collect Logs
```bash
journalctl --since "1 hour ago" --no-pager -q | tail -100
```

Analyze with Ollama
```
Analyze these server logs from the last hour. Report:

1. Any errors or warnings (with counts)
2. Unusual patterns or anomalies
3. Suggested actions if any issues are detected

If everything looks normal, respond with "STATUS: NORMAL" and nothing else.

Logs:
{{ $json.stdout }}
```

Conditional Alert
- IF response does NOT contain "STATUS: NORMAL" → Send Slack alert with the analysis
- IF response contains "STATUS: NORMAL" → No action (save on notification noise)
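The IF node's condition boils down to a single check. As a Code-node sketch, using a regex tolerant of case and whitespace, since small models don't always reproduce strings exactly:

```javascript
// Returns true when the analysis should trigger a Slack alert.
function needsAlert(analysis) {
  return !/STATUS:\s*NORMAL/i.test(analysis);
}
```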
Performance Tips for Ollama on a VPS
Model Selection Matters
Don't default to the largest model. For classification tasks (yes/no, category routing, sentiment), a 3B parameter model responds in under a second and is usually accurate enough. Reserve larger models for tasks that require reasoning or generation.
Keep Models Loaded
Ollama keeps a model loaded in memory after its last use (5 minutes by default, tunable with the OLLAMA_KEEP_ALIVE environment variable). If your workflows call the same model frequently, this means near-instant response times after the first call. Avoid switching between multiple models in rapid succession, since each switch evicts the loaded model and forces a slow reload.
Set Timeouts Appropriately
LLM inference on CPU takes longer than on GPU, so raise the timeout on any node that calls Ollama. Rough expectations for an 8B model on a VPS:
- Classification tasks: 5–15 seconds
- Summarization: 15–45 seconds
- Long-form generation: 30–120 seconds
Batch When Possible
Instead of processing 50 emails one at a time through the LLM, batch them. Use the Code node to combine multiple items into a single prompt with numbered entries, then parse the batched response.
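Under stated assumptions (each incoming item carries a `text` field, and the model follows the numbered-reply format), the batch step might look like this pair of helpers; in n8n the first would run in a Code node set to "Run Once for All Items":

```javascript
// Build one numbered prompt covering every input item.
function buildBatchPrompt(texts) {
  const entries = texts
    .map((t, i) => `${i + 1}. ${t.substring(0, 500)}`)
    .join('\n');
  return `Classify each numbered email. Reply with one line per entry,\n` +
         `formatted as "<number>. <CATEGORY>".\n\n${entries}`;
}

// Parse the model's numbered reply back into an array of labels,
// defaulting any missing or malformed entries to OTHER.
function parseBatchResponse(response, count) {
  const labels = new Array(count).fill('OTHER');
  for (const line of response.split('\n')) {
    const m = line.match(/^\s*(\d+)\.\s*(\w+)/);
    if (m && m[1] >= 1 && m[1] <= count) labels[m[1] - 1] = m[2].toUpperCase();
  }
  return labels;
}
```

One LLM call for 50 items instead of 50 calls is the difference between a batch finishing in seconds versus minutes on CPU.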
A Note on Data Privacy
Every piece of data these workflows process stays on your VPS. The emails being classified, the documents being summarized, the support tickets being analyzed — none of it leaves your server. No data is sent to OpenAI, Anthropic, Google, or any other third party. For businesses handling sensitive customer data, regulated industries, or anyone who simply prefers not to pipe their internal data through external APIs, this architecture is a significant advantage.
What's Next?
In Part 4, we're building DevOps automation workflows — server monitoring alerts with intelligent thresholds, deployment triggers that run on git push, automated log analysis with escalation, and infrastructure health dashboards.
Your VPS starts managing itself.
