Runtime Architecture — agentic-workflows-v2¶
Package: agentic-workflows-v2 | Python: 3.11+ | Build: hatchling
Table of Contents¶
- Executive Summary
- Technology Stack
- High-Level Structure
- Architectural Layers
- Server Layer
- Agents Layer
- Engine Layer
- Models Layer
- Tools Layer
- RAG Pipeline
- Core Protocols
- Async Architecture
- Security Architecture
- Configuration System
- CLI
- Source Map
- Key Design Decisions
Executive Summary¶
The agentic-workflows-v2 package is a production-grade multi-agent workflow runtime built for enterprise environments. It provides:
- Dual execution engines: a native DAG executor (Kahn's topological sort algorithm) and a LangGraph state machine engine, both running behind a common adapter interface.
- 8+ LLM provider support with tier-based smart routing, circuit breakers, and fallback chains.
- FastAPI server with full WebSocket and SSE streaming, a 500-event replay buffer, and a React 19 dashboard.
- Full RAG pipeline including recursive chunking, content-hash deduplication, cosine similarity vector search, BM25 keyword indexing, RRF hybrid fusion, and token-budget assembly.
- OpenTelemetry tracing integrated throughout the execution pipeline.
- 3-layer security middleware with a 5-detector input sanitization pipeline.
- 11 built-in tool modules with DENY-by-default safety and per-step allowlisting.
The system serves dual purpose: an operational agentic AI platform and an educational portfolio for team onboarding in cleared federal environments.
Technology Stack¶
| Category | Technology | Notes |
|---|---|---|
| Runtime | Python 3.11+ | Async-first via asyncio |
| Web framework | FastAPI | ASGI, async route handlers |
| Data validation | Pydantic v2 | All models use model_dump() / model_validate() |
| CLI | Typer | 7 commands |
| HTTP client | httpx / aiohttp | Async outbound requests from tools |
| Templating | Jinja2 | Prompt template rendering |
| LLM orchestration | LangChain / LangGraph | Optional; guarded by try/except ImportError |
| Tracing | OpenTelemetry | OTEL SDK; exporter configurable |
| Vector store | LanceDB | RAG persistent index |
| LLM routing | LiteLLM | Unified provider interface for 8+ backends |
| Frontend | React 19 + Vite 6 | Served from ui/dist/ |
| Graph UI | @xyflow/react 12 | DAG visualization canvas |
| Data fetching | TanStack Query | Frontend cache and server state |
| Styling | Tailwind CSS | Utility-first |
| Pre-commit | black, isort, ruff, mypy, detect-secrets | Enforced on every commit |
High-Level Structure¶
agentic-workflows-v2/
├── agentic_v2/
│ ├── server/ # FastAPI application, routes, auth, streaming
│ ├── agents/ # BaseAgent + 4 specialized agents + implementations/
│ ├── adapters/ # AdapterRegistry, ExecutionEngine backends
│ ├── core/ # Protocols, memory, context, contracts, errors
│ ├── engine/ # Native DAG executor (Kahn's algorithm)
│ ├── langchain/ # LangGraph execution engine (optional)
│ ├── models/ # LLM tier routing, provider backends, smart router
│ ├── rag/ # Full RAG pipeline
│ ├── contracts/ # Pydantic I/O models (additive-only)
│ ├── prompts/ # 7 agent persona definitions (.md)
│ ├── tools/builtin/ # 11 built-in tool modules
│ ├── workflows/definitions/ # 6 YAML workflow definitions
│ ├── integrations/ # OpenTelemetry integration
│ └── middleware/ # Sanitization detectors
├── tests/ # 100+ test files, pytest-asyncio auto mode
└── ui/ # React 19 frontend
Architectural Layers¶
Server Layer¶
Source: agentic_v2/server/
The server layer is a FastAPI ASGI application. It owns the HTTP interface, authentication, streaming infrastructure, and background task dispatch.
Middleware Stack¶
Middleware is applied in the following order (outermost to innermost):
- CORS middleware — Configures allowed origins, methods, and headers. Defaults permissive for local development; locked down via
AGENTIC_CORS_ORIGINSin production. - API key middleware — Validates
Authorization: BearerorX-API-Keyheaders usingsecrets.compare_digest(). Passes through configured public paths without authentication. - Sanitization middleware — Runs all inbound request bodies through the 5-detector sanitization pipeline. Requests classified
BLOCKEDare rejected with400 Bad Requestbefore reaching any route handler.
Route Modules¶
| Module | Responsibility |
|---|---|
server/routes/health.py |
/api/health liveness endpoint |
server/routes/agents.py |
Agent listing |
server/routes/workflows.py |
Workflow CRUD, DAG, capabilities, editor |
server/routes/runs.py |
Run dispatch, list, summary, log retrieval |
server/routes/stream.py |
SSE streaming for live run events |
server/routes/eval.py |
Evaluation dataset listing and dataset preview |
server/websocket.py |
WebSocket handler with 500-event replay buffer |
server/execution.py |
Background task coordination, event publication |
server/auth.py |
API key dependency, secrets.compare_digest |
server/models.py |
All server-layer Pydantic request/response models |
Execution Dispatch¶
POST /api/run uses FastAPI BackgroundTasks to dispatch workflow execution without blocking the HTTP response. The execution coroutine publishes events to an asyncio.Queue which is consumed by both the SSE stream handler and the WebSocket handler.
SPA Integration¶
When ui/dist/index.html is present, a catch-all route (GET /{path:path}) serves the compiled React application, enabling client-side routing without server configuration per route.
Agents Layer¶
Source: agentic_v2/agents/
BaseAgent¶
All agents inherit from BaseAgent. It provides:
- LLM client lifecycle management
- Message history management (
list[AgentMessage]) - Tool execution dispatch
- Structured logging via
loguru - OpenTelemetry span creation
- Retry logic with configurable backoff
Specialized Agents¶
| Agent | Class | Role |
|---|---|---|
| Coder | CoderAgent |
Code generation, refactoring, debugging |
| Reviewer | ReviewerAgent |
Code review, quality analysis, finding generation |
| Orchestrator | OrchestratorAgent |
Workflow coordination, sub-task delegation |
| Architect | ArchitectAgent |
System design decisions, ADR generation |
Capability Mixins¶
Agents can compose optional capabilities via mixins:
| Mixin | Capability |
|---|---|
SupportsRAGMixin |
Augments prompts with RAG-retrieved context |
SupportsVerificationMixin |
Enables output verification and self-correction cycles |
SupportsStreamingMixin |
Emits token-level streaming events |
Persona Definitions¶
Each agent has a corresponding Markdown persona file in agentic_v2/prompts/. Persona files define: Expertise, Boundaries, Critical rules, and Output format. These are loaded at agent instantiation and injected as system prompt context.
Agent Implementations¶
Extended or domain-specific agent implementations live in agents/implementations/ and inherit from one of the four base specializations.
Engine Layer¶
Source: agentic_v2/engine/, agentic_v2/langchain/, agentic_v2/adapters/
Adapter Pattern¶
The AdapterRegistry singleton maps string adapter names to ExecutionEngine protocol implementations. Workflow execution always goes through the registry:
Adapters are registered at startup. The langchain adapter is registered only when langchain and langgraph are importable (guarded by try/except ImportError).
Native DAG Executor¶
Source: agentic_v2/engine/
The native executor implements topological step ordering via Kahn's algorithm:
- Parse the YAML workflow definition into a DAG.
- Detect cycles; raise
WorkflowErrorif any are found. - Compute in-degree for each step node.
- Maintain a ready queue of nodes with in-degree zero.
- Execute ready steps (respecting
depends_onconstraints) concurrently usingasyncio.gather. - Decrement in-degrees of dependent nodes as steps complete; enqueue newly unblocked steps.
- Collect
StepResultobjects and assemble the finalWorkflowResult.
This approach achieves maximum step-level parallelism while respecting explicit dependencies. It has no external dependencies beyond the Python standard library and Pydantic.
LangGraph Engine¶
Source: agentic_v2/langchain/
The LangGraph adapter wraps workflow definitions as LangGraph StateGraph state machines. Each step becomes a graph node; depends_on relationships become graph edges. This engine is used when LangChain-specific features are required (e.g., built-in memory, tool-calling with LangChain tool wrappers, or LangSmith tracing).
The LangGraph adapter satisfies the same ExecutionEngine protocol as the native adapter, so the rest of the system is engine-agnostic.
Models Layer¶
Source: agentic_v2/models/
Smart Router¶
smart_router.py is the central dispatch point for all LLM calls. It selects the appropriate provider and model based on a tier system:
| Tier | Intended Use | Example Models |
|---|---|---|
fast |
High-throughput, latency-sensitive tasks | GPT-4o-mini, Claude Haiku, Gemini Flash |
standard |
General-purpose agent tasks | GPT-4o, Claude Sonnet, Gemini Pro |
powerful |
Complex reasoning, architecture decisions | o3, Claude Opus, Gemini Ultra |
Provider Backends¶
8+ provider backends are supported:
| Provider | Config Key | Notes |
|---|---|---|
| OpenAI | OPENAI_API_KEY |
Direct API |
| Anthropic | ANTHROPIC_API_KEY |
Direct API |
| Google Gemini | GEMINI_API_KEY |
Direct API |
| Azure OpenAI | AZURE_OPENAI_API_KEY_0..n |
Supports _0 through _n suffix for multiple deployments and failover |
| Azure AI Foundry | AZURE_FOUNDRY_* |
Foundry model catalog |
| GitHub Models | GITHUB_TOKEN |
Models API |
| Ollama | OLLAMA_BASE_URL |
Local inference |
| Local ONNX | LOCAL_MODEL_PATH |
Auto-detected from ~/.cache/aigallery |
Circuit Breaker and Fallback¶
The smart router implements:
- Circuit breaker: Each provider backend tracks consecutive failure counts. Backends that exceed the threshold are marked unavailable for a configurable cool-down window.
- Fallback chains: Each tier has an ordered fallback chain. If the primary provider is unavailable or returns an error, the router automatically retries with the next provider in the chain.
- Retry with backoff: Individual LLM calls retry on transient errors (rate limits, timeouts) with exponential backoff before the circuit breaker engages.
Tools Layer¶
Source: agentic_v2/tools/builtin/
Safety Model¶
The tools layer enforces a DENY-by-default safety policy. Every tool operation has an associated risk classification:
| Risk Level | Examples | Default Policy |
|---|---|---|
| Low | file_read, web_search |
ALLOW |
| Medium | file_write, http_request |
ALLOW with path/URL constraints |
| High | shell, git, file_delete, code_exec |
DENY unless explicitly allowlisted |
Workflow YAML definitions must explicitly allowlist high-risk operations per step. An agent cannot perform a high-risk operation unless the step's tools block includes the relevant permission.
Built-in Tool Modules (11 total)¶
| Module | Description |
|---|---|
file_read |
Read files from the filesystem; path containment enforced |
file_write |
Write files; path containment enforced; DENY by default for paths outside working dir |
file_delete |
Delete files; DENY by default |
shell |
Execute shell commands; DENY by default; allowlist per command |
web_search |
Web search via configured search API |
http_request |
Outbound HTTP; blocks private IP ranges; timeout enforced |
git |
Git operations (status, diff, log, commit); DENY for writes by default |
code_exec |
Execute code in sandbox; DENY by default |
rag_search |
Query the RAG index |
memory_read |
Read from the active memory store |
memory_write |
Write to the active memory store |
RAG Pipeline¶
Source: agentic_v2/rag/
The RAG pipeline provides document ingestion, indexing, and retrieval for context augmentation. It is used directly by the RAGMemoryStore and the SupportsRAGMixin.
Pipeline Stages¶
Document Loading
↓
Recursive Chunking
↓
Content-Hash Deduplication
↓
Embedding (with hash-based cache)
↓
┌─────────────────────────────┐
│ LanceDB Vector Index │ ← cosine similarity
│ BM25 Keyword Index │ ← lexical matching
└─────────────────────────────┘
↓
Hybrid Retrieval (RRF fusion)
↓
Token-Budget Assembly
↓
OTEL Trace Spans
Stage Descriptions¶
| Stage | Detail |
|---|---|
| Document Loading | Supports plain text, Markdown, PDF (via pdfplumber), and HTML inputs |
| Recursive Chunking | Splits documents by semantic boundaries (headings, paragraphs, sentences) before falling back to token-count limits |
| Content-Hash Deduplication | Each chunk is hashed (SHA-256 of normalised content). Duplicate chunks are skipped during embedding, preventing redundant index entries |
| Embedding | Embedding vectors are computed lazily and cached by content hash, avoiding re-embedding unchanged content across ingestion runs |
| LanceDB Vector Index | Persistent on-disk vector store. Cosine similarity search returns top-K candidates |
| BM25 Keyword Index | In-memory BM25 index over chunk text. Captures exact-match and term-frequency signals not captured by dense vectors |
| Hybrid Retrieval (RRF) | Reciprocal Rank Fusion merges vector and keyword result lists. Balances semantic and lexical relevance |
| Token-Budget Assembly | Assembles the final context string by greedily appending the highest-ranked chunks until a configured token budget is reached |
| OTEL Tracing | Each pipeline stage emits OpenTelemetry spans, enabling distributed trace visualisation of retrieval quality |
Core Protocols¶
Source: agentic_v2/core/protocols.py
All core abstractions are defined as Python Protocol classes decorated with @runtime_checkable. This enables isinstance() checks without requiring inheritance, keeping implementations loosely coupled.
| Protocol | Description |
|---|---|
ExecutionEngine |
Interface for workflow execution engines; execute(definition, inputs) -> WorkflowResult |
AgentProtocol |
Interface for all agents; run(task_input) -> TaskOutput |
ToolProtocol |
Interface for tool modules; invoke(operation, params) -> Any |
MemoryStore |
Async key-value + search interface (see Memory Abstractions) |
SupportsStreaming |
Marks components that emit streaming events; stream() -> AsyncIterator[AgentMessage] |
SupportsCheckpointing |
Marks components that can save and restore state; checkpoint() -> bytes, restore(data: bytes) -> None |
All six protocols are @runtime_checkable.
Async Architecture¶
The runtime is async-first. The following design decisions govern concurrency:
Background Task Dispatch¶
POST /api/run dispatches execution via FastAPI BackgroundTasks. The HTTP response returns immediately (202 Accepted) with the run_id. Execution runs in the same event loop as the server but does not block the response thread.
Event Publication via asyncio.Queue¶
Each active run owns an asyncio.Queue[dict]. The execution coroutine put()s event dictionaries into the queue as execution proceeds. The SSE stream handler and WebSocket handler both consume from this queue via asyncio.Queue.get(). This is a pure in-process pub/sub mechanism with no external broker dependency.
SSE Streaming¶
The SSE endpoint uses an async generator that yields text/event-stream formatted strings from the run's event queue. FastAPI's streaming response support delivers these chunks via HTTP chunked transfer encoding.
WebSocket with Replay Buffer¶
The WebSocket handler maintains a per-run circular buffer (collections.deque(maxlen=500)) of serialised event dictionaries. On new WebSocket connections, all buffered events are replayed in order before live events begin. This allows late-joining clients to recover full run history up to the buffer limit.
Native DAG Concurrency¶
Within the native engine, steps with no unresolved dependencies are executed concurrently using asyncio.gather. This maximises throughput in workflows with parallel branches while respecting the dependency graph.
No External Broker¶
There is no Redis, Kafka, or any external message broker. All event state is in-process. This simplifies deployment but means events are lost on server restart. Durable history is available via the persisted JSON run-log files.
Security Architecture¶
Security controls are layered across three tiers:
Tier 1: Transport and Authentication¶
- HTTPS is enforced by the deployment infrastructure (reverse proxy / load balancer). The application layer does not terminate TLS directly.
- API key authentication via
secrets.compare_digest()prevents timing-based key enumeration. - CORS is configurable via
AGENTIC_CORS_ORIGINS. Defaults to permissive for local development; must be locked down for production.
Tier 2: Input Sanitization Middleware¶
All inbound request bodies pass through a 5-detector pipeline before reaching route handlers:
| Detector | What It Catches |
|---|---|
| Secret detector | API keys, tokens, private keys, connection strings |
| PII detector | Email addresses, phone numbers, national identifiers |
| Prompt injection detector | Instruction-override patterns (e.g., "ignore previous instructions") |
| Unicode anomaly detector | Zero-width characters, directional overrides, BOM markers |
| Classification engine | Assigns CLEAN, REDACTED, BLOCKED, or REQUIRES_APPROVAL |
Requests classified BLOCKED (e.g., containing private key material) are rejected with 400 Bad Request before any business logic executes. Requests classified REDACTED proceed with sensitive values replaced by [REDACTED] markers.
Tier 3: Runtime Safety Controls¶
| Control | Mechanism |
|---|---|
| Path containment | File-access tools validate that resolved paths remain within the configured working directory before any I/O |
| Private IP blocking | Outbound HTTP tool requests check destination against RFC 1918 ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) and loopback (127.0.0.0/8) and block matches |
| Tool safety defaults | All 11 built-in tool modules default to DENY for high-risk operations; per-step YAML allowlisting required |
| Secret provider abstraction | The SecretProvider abstraction centralises secret access; secrets are never passed directly in model configs or log output |
Configuration System¶
Source: agentic_v2/core/context.py, environment variables
Environment Variables¶
Approximately 25 environment variables govern runtime behaviour. Key variables:
| Variable | Description |
|---|---|
AGENTIC_API_KEY |
Server API key; unset enables open mode |
AGENTIC_FILE_BASE_DIR |
Base directory for file tool path containment; path-traversal safety anchor |
AGENTIC_CORS_ORIGINS |
Comma-separated list of allowed CORS origins |
AGENTIC_DEFAULT_ADAPTER |
Default execution engine (native or langchain) |
AGENTIC_LOG_LEVEL |
Logging level (DEBUG, INFO, WARNING, ERROR) |
OPENAI_API_KEY |
OpenAI API key |
ANTHROPIC_API_KEY |
Anthropic Claude API key |
GEMINI_API_KEY |
Google Gemini API key |
AZURE_OPENAI_API_KEY_0..n |
Azure OpenAI keys (supports _0 through _n suffix) |
AZURE_OPENAI_ENDPOINT_0..n |
Azure OpenAI endpoints (matching _0 through _n index) |
GITHUB_TOKEN |
GitHub Models API token |
OLLAMA_BASE_URL |
Ollama local inference base URL |
LOCAL_MODEL_PATH |
Local ONNX model path; auto-detected from ~/.cache/aigallery when unset |
OTEL_EXPORTER_OTLP_ENDPOINT |
OpenTelemetry collector endpoint |
See .env.example for the full variable list with documentation.
YAML-Based Configuration¶
Model routing, agent parameters, and workflow definitions are configured in YAML:
| Config Type | Location | Description |
|---|---|---|
| Workflow definitions | workflows/definitions/*.yaml |
6 built-in workflow definitions |
| Agent configs | Runtime-loaded | Agent names, tiers, tool allowlists |
| Model tier config | models/tiers.yaml |
Provider priority order per tier |
SecretProvider Abstraction¶
A SecretProvider class centralises access to secrets at runtime, preventing direct os.environ access scattered through the codebase. It supports:
- Environment variable backend (default)
- File-based backend (for mounted secrets in containerised environments)
- Custom backend (implement the protocol)
CLI¶
Source: agentic_v2/cli/
The CLI is implemented with Typer and provides 7 top-level commands:
| Command | Description |
|---|---|
agentic run <workflow> --input <file.json> |
Execute a workflow with inputs from a JSON file |
agentic compare <workflow> --input <file> |
Run the same workflow on both engines and compare outputs |
agentic list workflows\|agents\|tools |
List available workflows, agents, or tools |
agentic validate <workflow> |
Validate a workflow YAML without executing it |
agentic serve |
Start the FastAPI development server |
agentic version |
Print version information |
agentic rag ingest --source <path> |
Ingest documents into the RAG index |
agentic rag search <query> |
Run a search query against the RAG index |
The compare command is useful for verifying that the native and LangGraph engines produce equivalent outputs for a given workflow, which is important when migrating workflows between engines.
Source Map¶
| Path | Contents |
|---|---|
agentic_v2/server/ |
FastAPI app factory, route modules, auth, execution coordinator, WebSocket handler, SSE streaming, server-layer models |
agentic_v2/contracts/ |
Pydantic contracts: messages, schemas, sanitization, verification (additive-only policy) |
agentic_v2/core/ |
Protocols, memory implementations, context, errors, secret provider |
agentic_v2/engine/ |
Native DAG executor (Kahn's algorithm), step scheduler, result collector |
agentic_v2/langchain/ |
LangGraph execution engine, state graph builder, LangChain tool wrappers |
agentic_v2/adapters/ |
AdapterRegistry singleton, ExecutionEngine adapter base class |
agentic_v2/models/ |
SmartRouter, provider backends, tier config, circuit breaker, LiteLLM integration |
agentic_v2/agents/ |
BaseAgent, CoderAgent, ReviewerAgent, OrchestratorAgent, ArchitectAgent, capability mixins, implementations/ |
agentic_v2/tools/builtin/ |
11 built-in tool modules: file_read, file_write, file_delete, shell, web_search, http_request, git, code_exec, rag_search, memory_read, memory_write |
agentic_v2/rag/ |
Document loader, chunker, embedder, LanceDB vector store, BM25 index, hybrid retriever, RRF fusion, token-budget assembler, OTEL spans |
agentic_v2/prompts/ |
7 agent persona Markdown files |
agentic_v2/workflows/definitions/ |
6 YAML workflow definitions |
agentic_v2/integrations/ |
OpenTelemetry integration, tracer provider setup, span helpers |
agentic_v2/middleware/ |
Sanitization detector implementations (secret, PII, prompt injection, Unicode, classifier) |
agentic_v2/cli/ |
Typer CLI entry points |
tests/ |
100+ test files; pytest-asyncio auto mode; markers: integration, slow, security |
ui/ |
React 19 frontend; @xyflow/react DAG canvas; TanStack Query; Tailwind CSS; Vitest |
Key Design Decisions¶
Dual Execution Engine¶
The system supports two execution engines behind a shared ExecutionEngine protocol. The native engine has no optional dependencies and is the default; the LangGraph engine is opt-in and provides richer LangChain ecosystem integration. This allows teams to migrate workflows incrementally and compare outputs using agentic compare.
Additive-Only Contracts¶
All Pydantic models in contracts/ follow an additive-only policy. Fields are never removed or renamed in ways that break existing serialised data (JSON run logs) or running clients. New fields are added as Optional with defaults. This policy protects filesystem-persisted run logs from becoming unreadable after upgrades.
DENY-by-Default Tool Safety¶
High-risk tool operations are denied unless explicitly enabled per workflow step. This prevents accidental privilege escalation when new tools are added and ensures that security review of a workflow can be done by reading the YAML allowlist rather than auditing all agent code.
No External Message Broker¶
The event streaming system uses in-process asyncio.Queue with a 500-event circular buffer. This eliminates infrastructure dependencies (Redis, RabbitMQ) for the core streaming path, simplifying deployment and reducing operational surface. The trade-off is that events are not durable across server restarts; this is acceptable because all run results are persisted to JSON log files.
Filesystem Persistence¶
There is no database or ORM. All run results are serialised as JSON files. This keeps the deployment footprint minimal (no database server required) and makes run logs directly inspectable with standard tools. The trade-off is that querying run history at scale requires reading multiple files; the GET /api/runs endpoint applies in-memory filtering.
Protocol-Driven Architecture¶
All major system interfaces are defined as @runtime_checkable Protocol classes. This decouples implementations from the interface definitions, enabling testing with pure mock implementations and preventing tight coupling between layers. It also allows third-party adapters, tools, and memory stores to be registered without modifying core code.
Architectural Critique & Known Gaps¶
Merged from the 2026-03-03 architecture review. Captures weaknesses and recommendations that complement the strengths above.
Protocol Type-Safety Weaknesses¶
ExecutionEngine.execute()usesAnyforworkflow,ctx, and return type — loses compile-time type safety and pushes validation to runtime.AgentProtocol.run()usesAnyfor both input and output.BaseAgentis generic overTInput/TOutput, but the protocol itself does not enforce it.- No protocol exists for
WorkflowLoaderor config validation — these remain concrete classes.
Adapter Gaps¶
- Context bridging missing.
LangChainEngineacceptsctx: Anybut does not forward theExecutionContextto the underlyingWorkflowRunner. Shared state (variables, services, step tracking) from the native context system is not available in LangGraph executions.ctxis currently reserved for "future use." - Instance caching.
AdapterRegistrycaches adapter instances. Configuration changes after first access require a registry reset — production-safe but forcesobject.__new__()workarounds in tests.
Workflow DSL Limits¶
deep_research.yamlis 619 lines with four near-identical rounds. YAML anchors help, but the round-based structure does not support dynamic round counts; adding more rounds requires duplication.server/routes/workflows.pyis ~1,200 lines — the largest single file. Evaluation, dataset, and run-history concerns should be extracted as verticals grow.
Code-Quality Configuration Drift¶
agentic-workflows-v2/pyproject.tomlhas no[tool.ruff]section. Pre-commit runsruff --fixwith no--select, falling back to defaults (E + F only). CLAUDE.md prescribes 13 rule categories (E, F, W, I, N, UP, S, B, A, C4, SIM, TCH, RUF). Onlyagentic-v2-evalhas an explicit ruff config. Documented standards are aspirational rather than tool-enforced for the main package.
Production Readiness Gaps¶
- All vector store / memory implementations are in-memory. Production needs a persistent
VectorStoreProtocolimplementation (LanceDB optional dep exists but no adapter). - No cross-package integration tests exercising
tools/→agentic-workflows-v2→agentic-v2-evalend-to-end. - No adapter/tool plugin discovery — registration is import-time only (no
entry_pointsor directory scan). - RAG prompt-injection hardening (system-prompt-level delimiter framing for retrieved documents) is noted as architectural gap.
Prioritized Recommendations¶
| # | Recommendation | Impact | Effort | Priority |
|---|---|---|---|---|
| 1 | Tighten protocol signatures — replace Any in ExecutionEngine.execute() / AgentProtocol.run() with bounded TypeVars or Union types |
4 | M | High |
| 2 | Bridge ExecutionContext into LangChainEngine so both engines share state during adapter-routed execution |
4 | M | High |
| 3 | Add cross-package integration tests covering the LLM client → engine → eval scoring path | 4 | M | High |
| 4 | Add RAG prompt-injection hardening (delimiter framing in system prompts for retrieved docs) | 4 | M | High |
| 5 | Add persistent VectorStoreProtocol adapter (LanceDB) to bridge dev and production |
4 | M | Medium |
| 6 | Extract deep_research round template into a loader-level loop construct — ~619 lines → ~200 |
3 | M | Medium |
| 7 | Split server/routes/workflows.py into evaluation, dataset, and run-history route modules |
3 | S | Medium |
| 8 | Add a standalone quickstart.py / CLI command running a simple workflow end-to-end |
3 | S | Medium |
| 9 | Document "How to implement ExecutionEngine / VectorStoreProtocol" with test templates | 3 | S | Medium |
| 10 | Add adapter/tool plugin discovery via entry_points or directory scan |
3 | M | Low |