AI Factory — Quantum Shift ORG

4-Role Agent Architecture

Cognitive Agents

Each agent has a single defined role, a PEMS-registered prompt, and a strict stop condition.

R1 — Active

Orchestrator

System Coordinator

mistral:7b-q4 · Ollama local

Decomposes tasks into agent assignments. Manages DGVR cycle state and phase gates.

Phase gate enforcement

Agent task delegation via Paperclip

DGVR state machine

R1 · Production

R1 — Active

Architect

System Designer

mistral:7b-q4 · Ollama local

Produces ADRs, schema definitions, API contracts. RAG-enabled. Output feeds Critic.

Architecture Decision Record generation

Schema and API contract drafting

RAG retrieval from project knowledge base

R1 · Production

R1 — Active

Critic

Quality Enforcer

phi3:mini · Ollama local

Validates every Architect output against PEMS criteria. Blocks below-threshold outputs.

PEMS vector scoring V01–V11

DEPLOY_BLOCKED trigger

Contradiction detection

R1 · Production
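
A minimal sketch of how a gate like the Critic's might be expressed, assuming each of the eleven PEMS vectors V01–V11 receives a score between 0 and 1 and that one shared threshold applies. The threshold value (0.8), the `GateResult` type, and the `critic_gate` function are hypothetical; the page does not state the scoring scale or the cut-off.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The eleven PEMS vector IDs named on the card: V01 ... V11.
VECTOR_IDS = [f"V{i:02d}" for i in range(1, 12)]

# Hypothetical threshold; the real cut-off is not given in the source.
THRESHOLD = 0.8


@dataclass
class GateResult:
    status: str                                   # "PASS" or "DEPLOY_BLOCKED"
    failing: List[str] = field(default_factory=list)


def critic_gate(scores: Dict[str, float], threshold: float = THRESHOLD) -> GateResult:
    """Block the output if any vector is missing or scores below the threshold."""
    failing = [v for v in VECTOR_IDS if scores.get(v, 0.0) < threshold]
    status = "DEPLOY_BLOCKED" if failing else "PASS"
    return GateResult(status=status, failing=failing)


if __name__ == "__main__":
    scores = {v: 0.9 for v in VECTOR_IDS}
    scores["V07"] = 0.55          # one weak vector is enough to block
    print(critic_gate(scores))
```

Treating a missing vector as a failure is a design choice in this sketch: an output that was never scored on a criterion should not pass the gate by default.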

R1 — Active

Coach

Quality Elevator

mistral:7b-q4 · Ollama local

When Critic blocks, Coach generates targeted refinements. Closes the DGVR refinement loop.

Vector-specific refinement prompts

PEMS-registered coaching prompt library

Iterative improvement until gate cleared

R1 · Production
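
The Critic-to-Coach loop described on this card can be sketched roughly as follows. Everything here is illustrative: `score` and `refine` are hypothetical callables standing in for the Critic and Coach agents, the 0.8 threshold is assumed, and the `max_rounds` cap is an addition of this sketch (the page only says the loop runs until the gate is cleared).

```python
from typing import Callable, Dict, List, Tuple

Scores = Dict[str, float]


def refine_until_cleared(
    draft: str,
    score: Callable[[str], Scores],            # stands in for the Critic
    refine: Callable[[str, List[str]], str],   # stands in for the Coach
    threshold: float = 0.8,                    # hypothetical cut-off
    max_rounds: int = 3,                       # hypothetical safety cap
) -> Tuple[str, bool, int]:
    """Return (final_draft, cleared, refinement_rounds_used)."""
    for round_no in range(max_rounds + 1):
        scores = score(draft)
        failing = sorted(v for v, s in scores.items() if s < threshold)
        if not failing:
            return draft, True, round_no
        if round_no == max_rounds:
            break
        # Refinement targets only the vectors that failed.
        draft = refine(draft, failing)
    return draft, False, max_rounds


# Stub agents used only to exercise the loop.
def stub_score(draft: str) -> Scores:
    return {"V01": 0.9, "V03": 0.9 if "[V03 refined]" in draft else 0.4}


def stub_refine(draft: str, failing: List[str]) -> str:
    return draft + "".join(f" [{v} refined]" for v in failing)


if __name__ == "__main__":
    print(refine_until_cleared("ADR draft", stub_score, stub_refine))
```

Passing the list of failing vectors to `refine` mirrors the "vector-specific refinement prompts" idea: the Coach only needs to address the criteria that caused the block.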

R3 — Planned

AdminAgent

Dashboard Operator

Model TBD · Post-contract

Dashboard ops, booking management, availability editing. JWT auth. Post-contract.

R3 · Planned

R4 — Planned

ContentAgent

Content Producer

Model TBD · Post-contract

SEO blog, multilingual content, guide expansion. PEMS-governed. R4 scope.

R4 · Planned

Technology Stack

Production Stack

Orchestration

LangGraph + CrewAI

Agent graph execution, state machines, phase gate routing

API Layer

FastAPI + Hono

FastAPI (Python) + Hono (Node edge). Railway deploy.

Database

Postgres + Redis

Postgres persistent state. Redis cache + queue.

Inference — Primary

Ollama local

mistral:7b-q4 + phi3:mini. M1 Pro 16GB. 10GB ceiling enforced.

Inference — Fallback

DeepSeek → Groq

DeepSeek (12s) → Groq (10s). Silent failover.

Tool Protocol

Paperclip + MCP

All tool calls via Paperclip. 8-server MCP registry R1–R4.

RAG

ChromaDB + nomic-embed-text

512 chunks · top-3 cosine > 0.72 · ChromaDB
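
One way to read "top-3 cosine > 0.72" is: keep at most three chunks, and only those whose cosine similarity to the query exceeds 0.72. The sketch below implements that reading in plain Python over in-memory vectors; it does not call ChromaDB or nomic-embed-text, and the `retrieve` function and its input shape are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

TOP_K = 3               # "top-3" from the stack description
MIN_SIMILARITY = 0.72   # "cosine > 0.72" from the stack description


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
) -> List[Tuple[str, float]]:
    """Return up to TOP_K (text, similarity) pairs above MIN_SIMILARITY, best first."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in chunks]
    kept = [(text, sim) for text, sim in scored if sim > MIN_SIMILARITY]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:TOP_K]
```

Applying the threshold before the top-k cut means a query with no sufficiently similar chunk returns an empty list rather than three weak matches.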

Governance

PEMS + DGVR

All prompts PEMS-registered · RC versioned · SHA-256 locked
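
As an illustration of what "SHA-256 locked" could mean in practice, the sketch below stores a digest per prompt and refuses to load a prompt whose text no longer matches it. The registry layout, the `PromptLockError` type, and the prompt ID are assumptions made for this example; the page does not describe how PEMS stores or checks hashes.

```python
import hashlib
from typing import Dict


class PromptLockError(Exception):
    """Raised when a prompt's text does not match its registered digest."""


def prompt_digest(text: str) -> str:
    """Hex SHA-256 digest of the prompt text, encoded as UTF-8."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def load_locked_prompt(registry: Dict[str, str], prompt_id: str, text: str) -> str:
    """Return the prompt text only if its digest matches the registry entry."""
    expected = registry[prompt_id]          # KeyError if the prompt is unregistered
    actual = prompt_digest(text)
    if actual != expected:
        raise PromptLockError(f"{prompt_id}: digest mismatch")
    return text


if __name__ == "__main__":
    original = "You are the Critic. Score the output against each vector."
    registry = {"critic-prompt": prompt_digest(original)}   # hypothetical ID
    print(load_locked_prompt(registry, "critic-prompt", original))
```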

Deploy

Railway + Cloudflare Pages

Hono on Railway · Pages on Cloudflare · FORGE gate

Inference Fallback Chain

Every Call. Every Agent.

This chain runs on every inference call, across every agent. The user never sees a failover.

Step 1

Ollama Local

mistral:7b-q4 or phi3:mini. Skipped when free memory is below 6GB.

Timeout: 8s

Step 2

DeepSeek API

deepseek-chat. Silent switch. Full logging.

Timeout: 12s

Step 3

Groq

llama3-8b-8192. Final fallback.

Timeout: 10s

On Failure

Structured Error

Typed error returned. Full context logged. Never unhandled.
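
The chain above can be sketched as an ordered list of providers, each with its own timeout, ending in a typed error. This is a minimal illustration under stated assumptions: the `Provider`, `InferenceError`, `run_chain`, and `build_chain` names are hypothetical, the provider callables are stubs rather than real Ollama, DeepSeek, or Groq clients, and the `attempts` list stands in for the logged context. Only the order, the timeouts (8s, 12s, 10s), and the 6GB free-memory check come from the page.

```python
import concurrent.futures as cf
import time
from dataclasses import dataclass
from typing import Callable, List


def always_available() -> bool:
    return True


@dataclass
class Provider:
    name: str
    timeout_s: float
    call: Callable[[str], str]
    available: Callable[[], bool] = always_available


class InferenceError(Exception):
    """Typed error raised when every step in the chain has failed."""

    def __init__(self, attempts: List[str]):
        super().__init__("all inference providers failed")
        self.attempts = attempts


def run_chain(prompt: str, providers: List[Provider]) -> str:
    """Try each provider in order; return the first result that arrives in time."""
    attempts: List[str] = []
    for p in providers:
        if not p.available():
            attempts.append(f"{p.name}: skipped")
            continue
        pool = cf.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(p.call, prompt)
        try:
            return future.result(timeout=p.timeout_s)
        except cf.TimeoutError:
            # The worker thread is not killed; we only stop waiting for it.
            attempts.append(f"{p.name}: timeout after {p.timeout_s}s")
        except Exception as exc:
            attempts.append(f"{p.name}: {exc!r}")
        finally:
            pool.shutdown(wait=False)
    raise InferenceError(attempts)


def build_chain(free_mem_gb, local, deepseek, groq) -> List[Provider]:
    """Order and timeouts as listed above; free_mem_gb is a hypothetical probe."""
    return [
        Provider("ollama", 8.0, local, available=lambda: free_mem_gb() >= 6.0),
        Provider("deepseek", 12.0, deepseek),
        Provider("groq", 10.0, groq),
    ]


# Stub callables standing in for real clients, used only to exercise the chain.
def stub_ok(prompt: str) -> str:
    return "ok:" + prompt


def stub_down(prompt: str) -> str:
    raise RuntimeError("provider unavailable")


def stub_slow(prompt: str) -> str:
    time.sleep(0.3)
    return "late"
```

A caller would wrap `run_chain` in a single `try`/`except InferenceError` so that a total failure surfaces as one structured error with the per-step history attached, rather than as an unhandled exception from whichever provider failed last.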

Production AI.
Not Prototypes.