ADR-037: n8n Workflow Automation Integration¶
Status: DRAFT Date: 2025-12-30 Context: Enabling LLM Council as an "agent jury" in workflow automation Depends On: ADR-009 (HTTP API), ADR-025 (Future Integration Capabilities) Author: @amiable-dev Council Review: 2025-12-30 (High Tier, 4/4 models)
Context¶
LLM Council's HTTP API enables integration with workflow automation platforms. Among these, n8n stands out as a popular open-source alternative to Zapier with strong LLM integration capabilities and a visual workflow editor.
Problem Statement¶
Engineers building automation workflows face a fundamental limitation: single-model AI decisions are binary and prone to hallucination. When automating high-stakes decisions (code reviews, support triage, design approvals), teams need:
- Consensus-based decisions - Multiple perspectives reduce single-model bias
- Confidence signals - Understanding when the AI is uncertain
- Auditability - Transparent reasoning for compliance and debugging
- Flexibility - Both binary (go/no-go) and synthesized (detailed analysis) outputs
n8n Platform Analysis¶
Why n8n as the first integration target:
| Factor | n8n | Zapier | Make |
|---|---|---|---|
| Open Source | Yes (Fair-code) | No | No |
| Self-hosted | Yes | No | No |
| LLM Integration | Native AI nodes | Add-ons | Add-ons |
| HTTP Flexibility | Full control | Limited | Limited |
| Community Templates | 7600+ | Larger | Medium |
| Target Audience | Engineers/DevOps | Business users | Mixed |
n8n's engineer-focused design, HTTP Request node flexibility, and self-hosting capability align with LLM Council's technical audience.
Current State (Pre-ADR)¶
ADR-025 identified n8n as P2 priority:
"Create n8n integration example/template"
HTTP API availability (ADR-009):
- POST /v1/council/run - Synchronous deliberation
- GET /v1/council/stream - SSE streaming
- GET /v1/health - Health check
Gap: No documented patterns, example workflows, or security guidance for n8n integration.
Decision¶
Implement comprehensive n8n integration with three workflow templates, security patterns, and documentation targeting engineer audiences.
1. Integration Architecture¶
┌─────────────────────────────────────────────────────────────┐
│ n8n Workflow │
├─────────────────────────────────────────────────────────────┤
│ ┌──────────┐ ┌───────────────┐ ┌─────────────────┐ │
│ │ Trigger │───▶│ HTTP Request │───▶│ Parse Response │ │
│ │ (Webhook)│ │ POST /v1/ │ │ (Set Node) │ │
│ └──────────┘ │ council/run │ └────────┬────────┘ │
│ └───────────────┘ │ │
│ ▼ │
│ ┌───────────────────┐ │
│ │ Conditional Logic │ │
│ │ (IF verdict=...) │ │
│ └─────────┬─────────┘ │
│ │ │
│ ┌───────────────┼───────────┐ │
│ ▼ ▼ ▼ │
│ ┌─────────┐ ┌─────────┐ ┌──────┐│
│ │ Approve │ │ Reject │ │ Flag ││
│ │ Action │ │ Action │ │Human ││
│ └─────────┘ └─────────┘ └──────┘│
└─────────────────────────────────────────────────────────────┘
2. API Contract¶
Request Schema¶
{
"prompt": "Review this code for security issues:\n{{ $json.diff }}",
"verdict_type": "synthesis | binary | tie_breaker",
"confidence": "quick | balanced | high",
"include_dissent": false,
"metadata": {
"correlation_id": "{{ $execution.id }}",
"workflow_id": "code-review-v1",
"idempotency_key": "{{ $json.event_id }}"
}
}
Response Schema¶
{
"request_id": "council_abc123",
"stage1": [
{"model": "gpt-4", "content": "...", "tokens": 450}
],
"stage2": [
{"reviewer": "claude-3", "rankings": [...], "rationale": "..."}
],
"stage3": {
"synthesis": "The council identified three concerns...",
"verdict": "approved | rejected",
"confidence": 0.85,
"dissent": [{"model": "gemini-pro", "opinion": "..."}]
},
"metadata": {
"correlation_id": "exec_xyz789",
"models_consulted": ["gpt-4", "claude-3", "gemini-pro"],
"aggregate_rankings": {"Response A": 1.5, "Response B": 2.3},
"timing": {
"total_ms": 18500,
"stage1_ms": 8200,
"stage2_ms": 7100,
"stage3_ms": 3200
},
"token_usage": {
"input": 2400,
"output": 1850,
"total": 4250
}
}
}
Error Response Schema¶
{
"error": {
"code": "COUNCIL_TIMEOUT | RATE_LIMITED | PARTIAL_FAILURE | VALIDATION_ERROR",
"message": "Human-readable error description",
"details": {
"models_succeeded": ["gpt-4"],
"models_failed": ["claude-3", "gemini-pro"],
"retry_after_seconds": 30
}
}
}
3. Verdict Types for Automation¶
| Verdict Type | Use Case | Response Structure | Confidence Interpretation |
|---|---|---|---|
synthesis |
Detailed analysis (code review, design feedback) | stage3.synthesis (string) |
Agreement level among models |
binary |
Go/no-go gates (triage, approval) | stage3.verdict (approved/rejected), confidence (0-1) |
Proportion of models agreeing |
tie_breaker |
Deadlocked decisions | Chairman resolves split votes | N/A (chairman decides) |
Confidence Score Semantics:
- 0.0-0.5: Strong disagreement, recommend human review
- 0.5-0.7: Mixed consensus, proceed with caution
- 0.7-0.9: Moderate agreement, generally reliable
- 0.9-1.0: Strong consensus, high reliability
4. Workflow Templates¶
4.1 Code Review Automation¶
- Trigger: GitHub PR webhook
- Verdict:
synthesis - Output: Detailed security, performance, and style analysis
- Action: Post review comment to PR
4.2 Support Ticket Triage¶
- Trigger: Ticket creation webhook
- Verdict:
binary - Output:
approved(URGENT) orrejected(STANDARD) - Action: Route to appropriate queue based on verdict + confidence
4.3 Technical Design Decision¶
- Trigger: Design doc submission
- Verdict:
synthesiswithinclude_dissent: true - Output: Council recommendation + minority opinions
- Action: Send to Slack/email for human review
5. Security Model¶
5.1 Trust Boundaries¶
┌─────────────────────────────────────────────────────────────────┐
│ UNTRUSTED ZONE │
│ ┌─────────────┐ │
│ │ GitHub/Jira │──webhook──┐ │
│ │ (PR, Ticket)│ │ │
│ └─────────────┘ │ │
└────────────────────────────┼────────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────────┐
│ n8n WORKFLOW (SEMI-TRUSTED) │
│ ┌──────────┐ ┌──────────────┐ ┌───────────────────────┐ │
│ │ Validate │───▶│ Sanitize │───▶│ Call LLM Council │ │
│ │ Webhook │ │ Input │ │ (API Key + TLS) │ │
│ └──────────┘ └──────────────┘ └───────────────────────┘ │
└────────────────────────────────────────────────────────────────┘
│
▼ HTTPS + API Key
┌────────────────────────────────────────────────────────────────┐
│ LLM COUNCIL API (TRUSTED) │
│ - Input validation │
│ - Rate limiting │
│ - Audit logging │
└────────────────────────────────────────────────────────────────┘
5.2 Authentication (Outbound to LLM Council)¶
API Key Authentication:
// n8n HTTP Request node - Header Auth
{
"headerAuth": {
"name": "Authorization",
"value": "Bearer {{ $credentials.llmCouncilApiKey }}"
}
}
Credential Storage Requirements: - Store API keys in n8n Credential Objects (encrypted at rest) - Never hardcode in workflow JSON - Use environment-specific credentials (dev/staging/prod) - Rotate keys quarterly or on suspected compromise
5.3 HMAC Signature Verification (Inbound Webhooks)¶
For webhook callbacks from LLM Council:
// n8n Function node - HMAC verification
const crypto = require('crypto');
const payload = JSON.stringify($input.first().json);
const secret = $credentials.llmCouncilWebhookSecret;
const receivedSig = $input.first().headers['x-council-signature'];
const receivedNonce = $input.first().headers['x-council-nonce'];
const receivedTimestamp = parseInt($input.first().headers['x-council-timestamp']);
// 1. Timestamp validation (±5 minutes)
const now = Math.floor(Date.now() / 1000);
if (Math.abs(now - receivedTimestamp) > 300) {
throw new Error('Timestamp too old or in future');
}
// 2. Nonce replay protection (requires external store)
// Store nonces in Redis/DB with TTL matching timestamp window
const nonceKey = `nonce:${receivedNonce}`;
if (await $env.cache.exists(nonceKey)) {
throw new Error('Nonce already used - replay attack detected');
}
await $env.cache.set(nonceKey, '1', { EX: 600 }); // 10 min TTL
// 3. HMAC verification with timing-safe comparison
const expectedSig = 'sha256=' + crypto
.createHmac('sha256', secret)
.update(`${receivedTimestamp}.${receivedNonce}.${payload}`)
.digest('hex');
if (!crypto.timingSafeEqual(
Buffer.from(expectedSig),
Buffer.from(receivedSig)
)) {
throw new Error('Invalid signature');
}
return $input.first();
5.4 Input Validation & Prompt Injection Mitigation¶
Input Sanitization:
// n8n Function node - Input sanitization
const input = $input.first().json;
// 1. Size limits
const MAX_DIFF_SIZE = 50000; // 50KB
const diff = input.diff?.substring(0, MAX_DIFF_SIZE) || '';
// 2. Remove potential prompt injection patterns
const sanitized = diff
.replace(/```system/gi, '```code') // Prevent system prompt injection
.replace(/\[INST\]/gi, '[CODE]') // Prevent instruction injection
.replace(/<<SYS>>/gi, '<<CODE>>'); // Prevent Llama-style injection
// 3. Escape special characters in structured fields
const safeSubject = input.subject?.replace(/[<>]/g, '') || '';
return {
json: {
diff: sanitized,
subject: safeSubject,
// Preserve original for audit
_original_size: input.diff?.length || 0,
_was_truncated: (input.diff?.length || 0) > MAX_DIFF_SIZE
}
};
LLM Council Server-Side Protections: - Input length validation (reject oversized payloads) - Rate limiting per API key - Audit logging of all requests - System prompt isolation from user content
5.5 Secret Rotation Strategy¶
| Secret | Rotation Frequency | Procedure |
|---|---|---|
| API Key | 90 days or on compromise | Generate new key → update n8n credential → revoke old key |
| Webhook Secret | 90 days | Support dual secrets during rotation window |
| TLS Certificates | Auto-renew (Let's Encrypt) | Managed by reverse proxy |
6. Performance Configuration¶
| Confidence Tier | Models | Typical Latency | Timeout Setting | Est. Cost/Call |
|---|---|---|---|---|
quick |
2 | 5-10s | 60000ms | ~$0.02 |
balanced |
3 | 10-20s | 90000ms | ~$0.04 |
high |
4 | 20-40s | 120000ms | ~$0.08 |
Cost Estimation Formula:
Assumes GPT-4 class models. Actual cost varies by model mix.HTTP Request Node Settings:
{
"options": {
"timeout": 120000,
"retry": {
"maxRetries": 3,
"retryOn": [429, 500, 502, 503, 504],
"retryDelay": 5000
}
}
}
7. Partial Failure Handling¶
Scenario: 2 of 3 models respond successfully.
| Policy | Behavior | Use Case |
|---|---|---|
require_all |
Fail entire request | High-stakes decisions |
require_majority |
Proceed with 2/3 | Default behavior |
best_effort |
Proceed with any response | Low-stakes, speed-critical |
Default Behavior (require_majority):
- If ≥ 50% of models respond, proceed with available responses
- Flag partial failure in response metadata
- Include models_failed array for debugging
8. Error Handling Strategy¶
┌─────────────┐ ┌──────────────┐ ┌─────────────────┐
│ Council │────▶│ IF Error? │────▶│ Retry w/ Backoff│
│ Request │ └──────┬───────┘ └────────┬────────┘
└─────────────┘ │ │
│ ▼
┌──────┴──────┐ ┌─────────────┐
│ Process │ │ Error Type? │
│ Success │ └──────┬──────┘
└─────────────┘ │
┌─────────────┼─────────────┐
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ 429: Wait│ │ 5xx: │ │ Timeout: │
│ & Retry │ │ Fallback │ │ Human │
└──────────┘ └──────────┘ └──────────┘
n8n Implementation:
// Error handling with n8n Error Trigger
{
"nodes": [
{
"name": "Council Request",
"type": "n8n-nodes-base.httpRequest",
"continueOnFail": true // Don't stop workflow on error
},
{
"name": "Check Error",
"type": "n8n-nodes-base.if",
"parameters": {
"conditions": {
"string": [{
"value1": "={{ $json.error?.code }}",
"operation": "isNotEmpty"
}]
}
}
},
{
"name": "Handle 429",
"type": "n8n-nodes-base.wait",
"parameters": {
"amount": "={{ $json.error?.details?.retry_after_seconds || 30 }}"
}
}
]
}
Fallback Chain: 1. Retry (3x with exponential backoff) - For transient errors 2. Degrade (single model) - For persistent council failures (configurable per workflow) 3. Human escalation - For critical decisions, never auto-approve
Policy Configuration (per workflow):
const FALLBACK_POLICY = {
"code-review": {
allow_single_model_fallback: true, // Code review can degrade
human_escalation_threshold: 0.5 // Escalate if confidence < 0.5
},
"security-approval": {
allow_single_model_fallback: false, // Security requires council
human_escalation_threshold: 0.8 // Higher bar for auto-approve
}
};
9. Observability & Debugging¶
9.1 Correlation ID Propagation¶
// Pass correlation ID through entire workflow
const correlationId = $json.correlation_id || `n8n_${$execution.id}`;
// Include in LLM Council request
{
"metadata": {
"correlation_id": correlationId,
"n8n_execution_id": $execution.id,
"n8n_workflow_id": $workflow.id
}
}
// Log for tracing
console.log(`[${correlationId}] Council request initiated`);
9.2 Metrics to Monitor¶
| Metric | Description | Alert Threshold |
|---|---|---|
council_latency_p95 |
95th percentile response time | > 60s |
council_error_rate |
Percentage of failed requests | > 5% |
council_consensus_rate |
Requests with confidence > 0.7 | < 80% |
council_fallback_rate |
Requests using single-model fallback | > 10% |
9.3 Troubleshooting Guide¶
| Symptom | Likely Cause | Resolution |
|---|---|---|
COUNCIL_TIMEOUT |
High tier + large input | Reduce tier or truncate input |
RATE_LIMITED (429) |
Too many concurrent requests | Implement request queueing |
PARTIAL_FAILURE |
Model API issues | Check models_failed, retry later |
| Inconsistent verdicts | Ambiguous prompt | Improve prompt specificity |
| Low confidence scores | Models disagree | Review dissent, consider human review |
| HMAC validation failed | Clock drift or secret mismatch | Sync NTP, verify secret |
10. Versioning Strategy¶
API Versioning:
- Templates target LLM Council API v1 (/v1/council/*)
- v1 will remain supported for 18 months after v2 GA
- Breaking changes require major version bump
- Additive changes (new fields) are non-breaking
Template Versioning:
{
"name": "LLM Council - Code Review",
"meta": {
"llm_council_template_version": "1.0.0",
"llm_council_api_version": "v1",
"n8n_min_version": "1.20.0"
}
}
Compatibility Matrix:
| Template Version | API Version | n8n Version |
|---|---|---|
| 1.0.x | v1 | ≥1.20.0 |
| 2.0.x (future) | v1, v2 | ≥1.30.0 |
Consequences¶
Positive¶
- Lower integration barrier - Engineers can import ready-to-use templates
- Best practices encoded - Security patterns (HMAC, input validation) built into examples
- Discoverability - n8n Creator Hub expands reach to automation engineers
- Reference architecture - Patterns applicable to other platforms (Make, Zapier)
Negative¶
- Maintenance burden - Templates need updates when API changes
- Version coupling - n8n workflow format may evolve
- Limited scope - Only covers HTTP integration, not MCP
- Sync limitations - Long deliberations may exceed timeouts
Trade-offs¶
| Choice | Alternative | Rationale |
|---|---|---|
| n8n first | Zapier, Make | Open source, engineer audience, self-hosted |
| HTTP API | MCP Server | HTTP is universal, n8n lacks MCP support |
| 3 templates | More use cases | Quality over quantity; community can extend |
| Sync-first | Async-first | Simpler to document; covers 95% of use cases |
| HTTP Request node | Custom n8n node | Faster to ship; custom node for future roadmap |
Deferred: Async Webhook Callbacks - Why not now: Adds complexity (callback URL registration, delivery guarantees, state management) - When to revisit: If users report timeout failures > 5% of calls
Implementation¶
Files Created¶
| File | Purpose |
|---|---|
docs/integrations/index.md |
Integration landing page |
docs/integrations/n8n.md |
Comprehensive n8n guide |
docs/examples/n8n/code-review-workflow.json |
PR review template |
docs/examples/n8n/support-triage-workflow.json |
Triage template |
docs/examples/n8n/design-decision-workflow.json |
Design review template |
docs/blog/08-n8n-workflow-automation.md |
Blog post |
tests/test_n8n_examples.py |
TDD test suite (28 tests) |
Files Modified¶
| File | Change |
|---|---|
mkdocs.yml |
Added Integrations section, blog entry |
Future Work¶
- Async webhook callbacks - When timeout failures exceed 5%
- Custom n8n node - Better UX, typed fields, versioned behavior
- n8n Creator Hub - Submit templates for official listing
- Additional platforms - Make, Zapier adapters
- MCP integration - When n8n adds MCP support
Validation¶
Acceptance Criteria¶
- [x] All workflow JSON files are valid and importable
- [x] Documentation covers security (HMAC, auth, input validation)
- [x] Documentation covers timeouts and error handling
- [x] Blog post reviewed by LLM Council for engineer audience
- [x] 28 TDD tests pass
- [x] mkdocs builds without warnings
Test Results¶
tests/test_n8n_examples.py::TestN8nWorkflowExamples - 19 passed
tests/test_n8n_examples.py::TestIntegrationDocs - 5 passed
tests/test_n8n_examples.py::TestBlogPost - 4 passed
Total: 28 passed
References¶
Related ADRs¶
- ADR-009: HTTP API Open Core Boundary
- ADR-025: Future Integration Capabilities
- ADR-038: One-Click Deployment Strategy - Enables easy deployment for n8n integration
External Resources¶
Appendix A: Prompt Engineering Patterns¶
Code Review Prompt¶
You are a code review expert. Review this code change for:
1. Security vulnerabilities (injection, XSS, etc.)
2. Performance issues
3. Code style and best practices
4. Potential bugs
Diff:
{{ $json.diff }}
Provide specific, actionable feedback with line references.
Triage Prompt¶
Should this ticket be escalated to URGENT priority?
Criteria for URGENT:
- Production system down
- Data loss or corruption
- Security incident
- Multiple users affected
Ticket Subject: {{ $json.subject }}
Ticket Body: {{ $json.body }}
Respond approved (URGENT) or rejected (STANDARD).
Design Review Prompt¶
As a council of senior architects, evaluate this design:
{{ $json.design_doc }}
Consider: Scalability, Maintainability, Security, Cost, Complexity.
Provide:
- Recommendation (proceed/revise/reject)
- Critical concerns
- Suggested improvements
Appendix B: Council Review Summary¶
Review Date: 2025-12-30 Tier: High (4 models) Models: grok-4.1-fast, gemini-3-pro-preview, gpt-5.2, claude-opus-4.5
Key Feedback Incorporated¶
- API Schemas - Added complete request/response/error schemas (Section 2)
- Security Depth - Expanded to include authentication, input validation, prompt injection mitigation, nonce-based replay protection, secret rotation (Section 5)
- Partial Failure Semantics - Documented require_all/require_majority/best_effort policies (Section 7)
- Error Handling - Added n8n-specific implementation patterns (Section 8)
- Observability - Added correlation ID propagation, metrics, troubleshooting guide (Section 9)
- Versioning - Added API and template versioning strategy (Section 10)
- Cost Estimation - Added per-tier cost estimates and formula (Section 6)
- Async Trade-off - Documented why sync-first, when to revisit (Trade-offs section)
Deferred Recommendations¶
- Custom n8n node (prioritized for future roadmap)
- Extension points for custom verdict types (future work)
- mTLS for enterprise deployments (not yet required)