Pillar 04 — Cognitive Production System

AI Factory System

A multi-agent Cognitive Production System — infrastructure that builds governed AI products at scale.

Architecture: 4-Role Agents
Framework: DGVR Cycle
MCP Servers: 8 Registry
Margin: >70% gross
4-Role Agent Architecture

Cognitive Agents

Each agent has a single defined role, a PEMS-registered prompt, and a strict stop condition.

R1 — Active
Orchestrator
System Coordinator
mistral:7b-q4 · Ollama local
Decomposes tasks into agent assignments. Manages DGVR cycle state and phase gates.
Phase gate enforcement
Agent task delegation via Paperclip
DGVR state machine
R1 · Production
R1 — Active
Architect
System Designer
mistral:7b-q4 · Ollama local
Produces ADRs, schema definitions, API contracts. RAG-enabled. Output feeds Critic.
Architecture Decision Record generation
Schema and API contract drafting
RAG retrieval from project knowledge base
R1 · Production
R1 — Active
Critic
Quality Enforcer
phi3:mini · Ollama local
Validates every Architect output against PEMS criteria. Blocks below-threshold outputs.
PEMS vector scoring V01–V11
DEPLOY_BLOCKED trigger
Contradiction detection
R1 · Production
R1 — Active
Coach
Quality Elevator
mistral:7b-q4 · Ollama local
When Critic blocks, Coach generates targeted refinements. Closes the DGVR refinement loop.
Vector-specific refinement prompts
PEMS-registered coaching prompt library
Iterative improvement until gate cleared
R1 · Production
R3 — Planned
AdminAgent
Dashboard Operator
Model TBD · Post-contract
Dashboard ops, booking management, availability editing. JWT auth. Post-contract.
R3 · Planned
R4 — Planned
ContentAgent
Content Producer
Model TBD · Post-contract
SEO blog, multilingual content, guide expansion. PEMS-governed. R4 scope.
R4 · Planned
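The Critic and Coach cards above describe a refinement loop: the Critic scores an Architect output on vectors V01–V11, blocks anything below threshold, and the Coach refines until the gate clears. A minimal sketch of that loop follows. The threshold (0.8), the round limit (3), and the `critic` and `coach` callables are assumptions made for illustration; the page does not state them, and this is not presented as the actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Vector IDs V01..V11, matching the range named in the Critic card.
VECTORS = [f"V{i:02d}" for i in range(1, 12)]

# Assumed values: the source states neither a pass threshold nor a round limit.
PASS_THRESHOLD = 0.8
MAX_ROUNDS = 3

Scores = Dict[str, float]


@dataclass
class GateResult:
    status: str      # "PASSED" or "DEPLOY_BLOCKED"
    rounds: int      # Coach refinement rounds used
    output: str      # last version of the Architect output
    failing: Scores  # vectors still below threshold (empty when passed)


def failing_vectors(scores: Scores) -> Scores:
    """Return only the vectors that scored below the threshold."""
    return {v: s for v, s in scores.items() if s < PASS_THRESHOLD}


def run_gate(
    output: str,
    critic: Callable[[str], Scores],       # hypothetical Critic call
    coach: Callable[[str, Scores], str],   # hypothetical Coach call
) -> GateResult:
    """Score, refine on failure, and block once the round limit is reached."""
    failing: Scores = {}
    for rounds in range(MAX_ROUNDS + 1):
        failing = failing_vectors(critic(output))
        if not failing:
            return GateResult("PASSED", rounds, output, {})
        if rounds == MAX_ROUNDS:
            break
        # The Coach only sees the vectors that failed, so refinements stay targeted.
        output = coach(output, failing)
    return GateResult("DEPLOY_BLOCKED", MAX_ROUNDS, output, failing)


def stub_critic(output: str) -> Scores:
    """Toy scorer: V03 improves by 0.2 with each applied fix."""
    scores = {v: 1.0 for v in VECTORS}
    scores["V03"] = 0.5 + 0.2 * output.count("+fix")
    return scores


def stub_coach(output: str, failing: Scores) -> str:
    """Toy refinement: append a marker for each round."""
    return output + "+fix"


result = run_gate("draft", stub_critic, stub_coach)
print(result.status, result.rounds)
```

With the stub scorer, the draft clears the gate after two Coach rounds. A Critic that never passes would end in DEPLOY_BLOCKED once the assumed round limit is reached.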
Technology Stack

Production Stack

Orchestration
LangGraph + CrewAI
Agent graph execution, state machines, phase gate routing
API Layer
FastAPI + Hono
FastAPI (Python) + Hono (Node edge). Railway deploy.
Database
Postgres + Redis
Postgres persistent state. Redis cache + queue.
Inference — Primary
Ollama local
mistral:7b-q4 + phi3:mini. M1 Pro 16GB. 10GB ceiling enforced.
Inference — Fallback
DeepSeek → Groq
DeepSeek (12s timeout) → Groq (10s timeout). Silent failover.
Tool Protocol
Paperclip + MCP
All tool calls via Paperclip. 8-server MCP registry R1–R4.
RAG
ChromaDB + nomic-embed-text
512 chunks · top-3 cosine > 0.72 · ChromaDB
Governance
PEMS + DGVR
All prompts PEMS-registered · RC versioned · SHA-256 locked
Deploy
Railway + Cloudflare Pages
Hono on Railway · Pages on Cloudflare · FORGE gate
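The RAG entry above keeps the top 3 chunks whose cosine similarity exceeds 0.72. The sketch below shows that selection rule on plain Python vectors. It deliberately does not call ChromaDB or nomic-embed-text; the `retrieve` function and the toy two-dimensional vectors are illustrative assumptions, standing in for real embeddings and a real vector store.

```python
import math
from typing import List, Sequence, Tuple

TOP_K = 3          # from the stack description: top-3
MIN_COSINE = 0.72  # from the stack description: cosine > 0.72

Chunk = Tuple[str, Sequence[float]]  # (chunk text, embedding vector)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: Sequence[float], chunks: Sequence[Chunk]) -> List[Tuple[str, float]]:
    """Return up to TOP_K chunks scoring above MIN_COSINE, best first."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in chunks]
    kept = [(text, score) for text, score in scored if score > MIN_COSINE]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:TOP_K]


# Toy data: "b" and "c" fall below the similarity floor, "f" is cut by the top-3 limit.
chunks = [
    ("a", [1.0, 0.0]),
    ("b", [1.0, 1.0]),
    ("c", [0.0, 1.0]),
    ("d", [2.0, 0.5]),
    ("e", [3.0, 1.0]),
    ("f", [1.0, 0.5]),
]
hits = retrieve([1.0, 0.0], chunks)
print([text for text, _ in hits])
```

Filtering before truncating means a query with few good matches returns fewer than three chunks rather than padding the context with weak ones.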
Inference Fallback Chain

Every Call. Every Agent.

This chain runs on every inference call, across every agent. The user never sees a failover.

Step 1
Ollama Local
mistral:7b-q4 or phi3:mini. Skipped when free memory is below 6GB.
Timeout: 8s
Step 2
DeepSeek API
deepseek-chat. Silent switch. Full logging.
Timeout: 12s
Step 3
Groq
llama3-8b-8192. Final fallback.
Timeout: 10s
On Failure
Structured Error
Typed error returned. Full context logged. Never unhandled.
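The chain above can be sketched as an ordered list of steps, each with its own timeout, ending in a typed error instead of an unhandled exception. This is a minimal sketch using `asyncio.wait_for`. The `Step` and `InferenceError` types, the `free_gb` probe, and the provider callables are hypothetical names introduced for illustration; real Ollama, DeepSeek, and Groq clients would be plugged in where the callables are passed.

```python
import asyncio
import logging
from dataclasses import dataclass, field
from typing import Awaitable, Callable, List, Sequence, Union

log = logging.getLogger("inference")

MIN_FREE_GB = 6.0  # from the chain description: skip local inference below 6GB free

Provider = Callable[[str], Awaitable[str]]  # hypothetical async client call


@dataclass
class Step:
    name: str
    timeout_s: float
    call: Provider
    local: bool = False  # only local steps are subject to the memory check


@dataclass
class InferenceError:
    """Typed error returned when every step has failed or been skipped."""
    code: str
    attempts: List[str] = field(default_factory=list)


def build_chain(ollama: Provider, deepseek: Provider, groq: Provider) -> List[Step]:
    """Order and timeouts as listed in the chain: 8s, 12s, 10s."""
    return [
        Step("ollama", 8.0, ollama, local=True),
        Step("deepseek", 12.0, deepseek),
        Step("groq", 10.0, groq),
    ]


async def infer(
    prompt: str,
    steps: Sequence[Step],
    free_gb: Callable[[], float],  # hypothetical free-memory probe, in GB
) -> Union[str, InferenceError]:
    attempts: List[str] = []
    for step in steps:
        if step.local and free_gb() < MIN_FREE_GB:
            attempts.append(f"{step.name}: skipped (low memory)")
            continue
        try:
            return await asyncio.wait_for(step.call(prompt), timeout=step.timeout_s)
        except asyncio.TimeoutError:
            attempts.append(f"{step.name}: timeout after {step.timeout_s}s")
        except Exception as exc:  # any provider failure moves on to the next step
            attempts.append(f"{step.name}: {type(exc).__name__}")
        log.warning("inference step failed: %s", attempts[-1])
    return InferenceError("ALL_PROVIDERS_FAILED", attempts)
```

The caller receives either the completion text or an `InferenceError` carrying the per-step history, so a failover stays invisible to the user while remaining visible in the logs.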
Unit Economics

Factory Economics

>70%
Gross Margin
Local inference eliminates primary cloud compute cost
€8K+
Monthly Potential
Factory-as-a-Service at low marginal cost per client
R1→R4
Revenue Expansion
Each R phase adds billable modules. R3/R4 unlock AdminAgent + ContentAgent.
Build with AI Factory

Production AI. Not Prototypes.

Every project starts with a defined scope and a clear deployment path. We build systems, not demos.