Architecture
Audience: Engineers orienting to the monorepo for design or review work.

Outcome: After reading, you know which package owns which concern, how they communicate, and where to dive deeper.

Last verified: 2026-04-22
This document is the umbrella. It does not re-derive system internals — it points at the existing per-package architecture docs and the ADRs that ratify each decision. If you are new to the repo, read this in full first; if you are in a specific area, jump to the per-package link.
1. System at a glance
graph TB
  subgraph Clients["Clients"]
    CLI["agentic CLI<br/>Typer"]
    UI["React 19 Dashboard<br/>Vite 6 · @xyflow/react"]
    EXT["External callers<br/>REST · WS · SSE"]
  end
  subgraph Runtime["agentic-workflows-v2/"]
    API["FastAPI Server<br/>REST · WebSocket · SSE · 500-event replay"]
    ADR["AdapterRegistry<br/>singleton"]
    NATIVE["Native DAG Engine<br/>Kahn's algorithm · asyncio.wait FIRST_COMPLETED"]
    LG["LangGraph Engine<br/>StateGraph · checkpointing"]
    ROUTER["SmartModelRouter<br/>tier routing · circuit breakers · stats persistence"]
    RAG["RAG Pipeline<br/>chunk · embed · index · retrieve · assemble"]
    AGENTS["Agents<br/>Base + Coder + Reviewer + Architect + Orchestrator"]
  end
  subgraph Eval["agentic-v2-eval/"]
    SCORER["Scorer<br/>rubric-based"]
    RUNNERS["Batch · Streaming · AsyncStreaming Runners"]
    REPORTER["Reporters<br/>json · markdown · html"]
  end
  subgraph Shared["tools/ (agentic-tools)"]
    LLM["LLMClient<br/>10 providers"]
    BENCH["Benchmarks"]
    CACHE["Response cache"]
  end
  CLI --> ADR
  UI -->|REST · WS| API
  EXT -->|REST · WS · SSE| API
  API --> ADR
  ADR --> NATIVE
  ADR --> LG
  NATIVE --> AGENTS
  LG --> AGENTS
  AGENTS --> ROUTER
  AGENTS --> RAG
  ROUTER -->|provider calls| LLM
  API -->|scores| SCORER
  SCORER --> RUNNERS
  RUNNERS --> REPORTER
  classDef rt fill:#4a90d9,stroke:#2c5f8a,color:#fff
  classDef ev fill:#00b894,stroke:#008060,color:#fff
  classDef sh fill:#fdcb6e,stroke:#c8a034,color:#333
  class API,ADR,NATIVE,LG,ROUTER,RAG,AGENTS rt
  class SCORER,RUNNERS,REPORTER ev
  class LLM,BENCH,CACHE sh
The three Python packages have zero cross-package imports. They communicate via:
- tools/ is published as a wheel (agentic-tools); the runtime and eval packages consume it like any other library.
- The runtime exposes agentic-v2-eval through its REST API (POST /runs/:id/evaluation, GET /runs/:id/evaluation); the eval framework does not import runtime internals.
2. Per-package deep dives
| Package | Entry point | Deep dive |
|---|---|---|
| Runtime | agentic-workflows-v2/agentic_v2/ | architecture-runtime.md |
| UI | agentic-workflows-v2/ui/src/ | architecture-ui.md |
| Evaluation | agentic-v2-eval/src/agentic_v2_eval/ | architecture-eval.md |
| Shared tools | tools/ | architecture-tools.md |
| Cross-package integration | — | integration-architecture.md |
Additional supporting documents:
- api-contracts-runtime.md: 16 REST endpoints + WebSocket + SSE schemas.
- data-models-runtime.md: 38+ Pydantic v2 models across server, contracts, core.
- component-inventory-ui.md: 17 UI components across 6 categories.
- source-tree-analysis.md: full annotated directory tree.
- development-guide.md: dev environments, CLI, tests.
- deployment-guide.md: CI/CD, environment variables, production checklist.
3. The five load-bearing mechanisms
These are the places where a change ripples across the system. Understand these before proposing architectural work.
3.1 Adapter registry
AdapterRegistry is a process-wide singleton in agentic_v2/adapters/registry.py. Engines register with a name (native, langchain), the CLI resolves --adapter <name> at runtime, and tests reset the singleton via an autouse fixture to prevent cross-test leakage.
- Why it exists: ADR-001 — dual execution engine.
- Current default: langchain (configurable per run).
- Direction of travel: ADR-013, native DAG as the single long-term engine.
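The registration, lookup, and test-reset behaviour described above can be sketched as follows. This is a minimal illustration, not the code in agentic_v2/adapters/registry.py: the class name AdapterRegistrySketch and the methods instance, register, resolve, and reset are hypothetical names chosen for this example.

```python
from typing import Callable, Dict, Optional


class AdapterRegistrySketch:
    """Hypothetical process-wide registry mapping adapter names to engine factories."""

    _instance: Optional["AdapterRegistrySketch"] = None

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], object]] = {}

    @classmethod
    def instance(cls) -> "AdapterRegistrySketch":
        # Lazily create the single shared instance on first access.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    @classmethod
    def reset(cls) -> None:
        # Drop the shared instance so the next caller starts from an empty registry.
        cls._instance = None

    def register(self, name: str, factory: Callable[[], object]) -> None:
        if name in self._factories:
            raise ValueError(f"adapter already registered: {name}")
        self._factories[name] = factory

    def resolve(self, name: str) -> object:
        # Equivalent in spirit to resolving `--adapter <name>` at runtime.
        try:
            return self._factories[name]()
        except KeyError:
            known = ", ".join(sorted(self._factories)) or "none"
            raise KeyError(f"unknown adapter {name!r} (registered: {known})") from None
```

In a pytest suite, an autouse fixture could call `AdapterRegistrySketch.reset()` before and after each test, so that registrations made by one test are not visible to the next.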
3.2 Typed execution-event wire format
contracts/events.py defines a Pydantic discriminated union covering workflow_start, step_start, step_end, step_complete, step_error, workflow_end, evaluation_start, evaluation_complete. WebSocket and SSE broadcasts validate before emit. TypeScript interfaces in ui/src/api/types.ts mirror this union by hand — drift is detected by convention, not yet by automation.
- Ratifies: ADR-014.
- Related: the 500-event replay buffer in server/websocket.py; clients reconnecting mid-run receive missed events.
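A bounded replay buffer of this kind can be sketched with the standard library alone. The sketch below is an assumption about the general shape, not the code in server/websocket.py: the class name ReplayBufferSketch, the "type" discriminator key, and the per-event sequence number are all hypothetical. It checks the event type against the eight names listed above as a stand-in for the Pydantic validation step.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

# Event names taken from the union described above.
EVENT_TYPES = frozenset({
    "workflow_start", "step_start", "step_end", "step_complete",
    "step_error", "workflow_end", "evaluation_start", "evaluation_complete",
})


class ReplayBufferSketch:
    """Keeps the most recent `maxlen` events so reconnecting clients can catch up."""

    def __init__(self, maxlen: int = 500) -> None:
        # A deque with maxlen silently discards the oldest entry when full.
        self._events: Deque[Tuple[int, Dict]] = deque(maxlen=maxlen)
        self._next_seq = 0

    def publish(self, event: Dict) -> int:
        # Validate before emit; "type" as the discriminator key is an assumption.
        if event.get("type") not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {event.get('type')!r}")
        seq = self._next_seq
        self._events.append((seq, event))
        self._next_seq += 1
        return seq

    def replay_after(self, last_seen_seq: int) -> List[Dict]:
        # Return buffered events the client has not yet seen, oldest first.
        return [event for seq, event in self._events if seq > last_seen_seq]
```

A client that reconnects with the last sequence number it saw would receive only the later events, provided they are still inside the buffer; anything older than the retained window is lost.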
3.3 SLO gates in git
Time-to-first-span p95 and nightly flake rate are stored as rolling windows in git — measurements are appended to JSON artifacts committed on each CI run, and the gate reads the window, not a fresh sample. This keeps the signal stable across single bad runs.
- Ratifies: ADR-015.
- Known limitation: the p95 gate passes trivially when the window is empty; see KNOWN_LIMITATIONS.md.
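The rolling-window idea can be illustrated with a small sketch. Everything here is hypothetical: the function names, the nearest-rank p95 definition, and the fail_on_empty flag are assumptions for illustration, and the real artifact format and gate logic are defined by ADR-015. The default behaviour shows one way an empty window can pass trivially.

```python
from typing import List, Optional


def append_sample(window: List[float], value: float, max_size: int) -> List[float]:
    """Return a new window with `value` appended, keeping only the newest `max_size` samples."""
    return (window + [value])[-max_size:]


def p95(samples: List[float]) -> Optional[float]:
    """Nearest-rank 95th percentile; None when there are no samples."""
    if not samples:
        return None
    ordered = sorted(samples)
    # ceil(0.95 * n) using integer arithmetic to avoid floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def gate_passes(window: List[float], threshold: float, fail_on_empty: bool = False) -> bool:
    """Compare the window's p95 against the threshold."""
    value = p95(window)
    if value is None:
        # With no data there is nothing to compare; by default this passes.
        return not fail_on_empty
    return value <= threshold
```

Because the gate reads the whole window, a single slow run shifts the p95 only slightly. Setting `fail_on_empty=True` is one possible way to close the empty-window gap, at the cost of failing the first run on a fresh branch.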
3.4 SmartModelRouter
Maps a tier (for example tier3_analyst) → capability → best available model at runtime. It uses health-weighted selection, exponential cooldowns, circuit breakers, stats persisted across restarts, and Retry-After header awareness.
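To make the combination of circuit breaking, exponential cooldown, and Retry-After handling concrete, here is a minimal sketch. It is not the SmartModelRouter implementation: the class BreakerSketch, its field names, and its default values are hypothetical, and the caller passes the current time explicitly so the behaviour is deterministic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BreakerSketch:
    """Hypothetical per-model breaker: opens after repeated failures, cooldown doubles per trip."""

    failure_threshold: int = 3
    base_cooldown: float = 1.0   # seconds for the first trip
    max_cooldown: float = 60.0   # upper bound on the exponential cooldown
    failures: int = 0
    trips: int = 0
    open_until: float = 0.0

    def allow(self, now: float) -> bool:
        # The model is eligible again once the cooldown has elapsed.
        return now >= self.open_until

    def record_failure(self, now: float, retry_after: Optional[float] = None) -> None:
        self.failures += 1
        cooldown = 0.0
        if self.failures >= self.failure_threshold:
            # Exponential cooldown: base * 2^trips, capped at max_cooldown.
            cooldown = min(self.base_cooldown * 2 ** self.trips, self.max_cooldown)
            self.trips += 1
            self.failures = 0
        if retry_after is not None:
            # Honour the provider's Retry-After hint if it is longer.
            cooldown = max(cooldown, retry_after)
        if cooldown > 0:
            self.open_until = max(self.open_until, now + cooldown)

    def record_success(self) -> None:
        # A healthy call resets both the failure count and the backoff level.
        self.failures = 0
        self.trips = 0
```

A router built on this idea would skip any model whose breaker reports `allow(now)` as false and choose among the remaining candidates, for instance weighted by recent success rate.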
3.5 RAG pipeline
Thirteen modules in agentic_v2/rag/: loader → recursive chunker → embedder (content-hash dedup) → cosine vectorstore + BM25 keyword index → RRF hybrid retriever → token-budget assembler. Full OTEL tracing. Memory backed by MemoryStoreProtocol (InMemoryStore or RAGMemoryStore).
- Blueprint: adr/RAG-pipeline-blueprint.md.
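The hybrid retrieval step merges the vector and BM25 rankings with reciprocal rank fusion (RRF). The sketch below shows the standard RRF formula, score(d) = Σ 1 / (k + rank), under stated assumptions: the function name rrf_fuse, the constant k = 60 (a commonly used default), and the alphabetical tie-break are illustrative choices, not taken from agentic_v2/rag/.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document ids into one list via reciprocal rank fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based; a document absent from a ranking contributes nothing.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]   # e.g. cosine-similarity order
keyword_hits = ["b", "d", "a"]  # e.g. BM25 order
fused = rrf_fuse([vector_hits, keyword_hits])
```

RRF uses only rank positions, so the cosine and BM25 scores never need to be normalised onto a common scale. Documents that appear in both lists ("a" and "b" here) rise above those found by only one retriever.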
4. The decision record
| ADR | Domain | Status |
|---|---|---|
| 001 | Dual execution engine | Accepted (superseded by 013) |
| 002 | SmartModelRouter circuit breakers | Accepted |
| 003 | Deep research supervisor | Superseded → 007 |
| 007 | Multidimensional classification + stop policy | Proposed |
| 008 | Test value taxonomy | Accepted |
| 009 | Scoring enhancements | Accepted |
| 010 | Commit-driven A/B eval harness | Proposed |
| 011 | Eval harness API design | Proposed |
| 012 | UI evaluation hub | Proposed |
| 013 | Native DAG as single engine | Accepted |
| 014 | Pydantic wire format for execution events | Accepted |
| 015 | SLO rolling window in git | Accepted |
| 016 | GitHub Models as default E2E provider | Accepted |
ADRs 004–006 are intentionally unused — the gap is documented in adr/ADR-INDEX.md and should not be reclaimed.
5. What this document is not
- Not a replacement for per-package docs; it is a map.
- Not a roadmap: see ROADMAP.md.
- Not a limitations list: see KNOWN_LIMITATIONS.md.
- Not a migration guide: see MIGRATIONS.md.