Core API¶
This page is auto-generated from docstrings via mkdocstrings. The source of truth lives in the Python files.
Patterns¶
consensus¶
executionkit.patterns.consensus.consensus
async
¶
consensus(provider: LLMProvider, prompt: str, *, num_samples: int = 5, strategy: VotingStrategy | str = 'majority', temperature: float = 0.9, max_tokens: int = 4096, max_concurrency: int = 5, retry: RetryConfig | None = None, max_cost: TokenUsage | None = None) -> PatternResult[str]
Run parallel LLM samples and aggregate via voting.
Fires num_samples concurrent completions and applies the chosen
voting strategy to determine the winning response.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `provider` | `LLMProvider` | LLM provider to call. | *required* |
| `prompt` | `str` | User prompt sent identically to every sample. | *required* |
| `num_samples` | `int` | Number of parallel completions to request. Must be >= 1. | `5` |
| `strategy` | `VotingStrategy \| str` | Voting strategy used to pick the winning response. | `'majority'` |
| `temperature` | `float` | Sampling temperature (higher = more diverse). | `0.9` |
| `max_tokens` | `int` | Maximum tokens per completion. | `4096` |
| `max_concurrency` | `int` | Semaphore limit for parallel calls. | `5` |
| `retry` | `RetryConfig \| None` | Optional retry configuration per call. | `None` |
| `max_cost` | `TokenUsage \| None` | Optional token/call budget, passed to each individual call. | `None` |
Returns:

| Name | Type | Description |
|---|---|---|
| | `PatternResult[str]` | A `PatternResult[str]` with the winning response as its `value`. |
Raises:

| Type | Description |
|---|---|
| `ConsensusFailedError` | When the unanimous strategy cannot reach agreement. |
| `ValueError` | If `num_samples` is less than 1. |
Source code in `executionkit/patterns/consensus.py`
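The `'majority'` voting step can be sketched as a frequency count over normalized samples. This is an illustrative stand-in, not the library's implementation; the whitespace/case normalization rule here is an assumption:

```python
from collections import Counter

def majority_vote(samples: list[str]) -> str:
    """Pick the most frequent response; ties resolve to the earliest sample.

    Normalizes whitespace and case so trivially different completions
    still count as the same answer (an assumption for illustration).
    """
    normalized = [s.strip().lower() for s in samples]
    counts = Counter(normalized)
    top, _ = counts.most_common(1)[0]
    # Return the first original sample matching the winning normalized form.
    return next(s for s, n in zip(samples, normalized) if n == top)

print(majority_vote(["Paris", "paris ", "London"]))  # Paris
```

The real pattern runs its samples concurrently under the `max_concurrency` semaphore before this aggregation step.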
refine_loop¶
executionkit.patterns.refine_loop.refine_loop
async
¶
refine_loop(provider: LLMProvider, prompt: str, *, evaluator: Evaluator | None = None, max_eval_chars: int = 32768, target_score: float = 0.9, max_iterations: int = 5, patience: int = 3, delta_threshold: float = 0.01, temperature: float = 0.7, max_tokens: int = 4096, max_cost: TokenUsage | None = None, retry: RetryConfig | None = None) -> PatternResult[str]
Iteratively refine an LLM response until convergence or budget exhaustion.
Generates an initial response, evaluates it, then enters a refinement loop. Each iteration asks the LLM to improve upon the previous output given its score. The loop terminates when the `ConvergenceDetector` signals convergence (target score reached or score deltas stall beyond `patience`) or `max_iterations` is hit.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `provider` | `LLMProvider` | LLM provider to call. | *required* |
| `prompt` | `str` | The original user prompt. | *required* |
| `evaluator` | `Evaluator \| None` | Async callable that scores a response string on [0.0, 1.0]. | `None` |
| `max_eval_chars` | `int` | Maximum response characters passed to the evaluator. | `32768` |
| `target_score` | `float` | Convergence target in [0.0, 1.0]. | `0.9` |
| `max_iterations` | `int` | Maximum refinement iterations (excluding the initial generation). | `5` |
| `patience` | `int` | Stale-delta iterations before convergence is declared. | `3` |
| `delta_threshold` | `float` | Minimum meaningful score improvement. | `0.01` |
| `temperature` | `float` | Sampling temperature for generation calls. | `0.7` |
| `max_tokens` | `int` | Maximum tokens per completion. | `4096` |
| `max_cost` | `TokenUsage \| None` | Optional token/call budget. | `None` |
| `retry` | `RetryConfig \| None` | Optional retry configuration per call. | `None` |
Returns:

| Name | Type | Description |
|---|---|---|
| | `PatternResult[str]` | A `PatternResult[str]` whose `value` is the final refined response. |
Source code in `executionkit/patterns/refine_loop.py`
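The generate-evaluate-refine cycle described above can be sketched with stub `generate`/`evaluate` callables. This is illustrative only; the feedback prompt wording and the stubs are assumptions, not the library's code:

```python
import asyncio

async def refine_sketch(prompt, generate, evaluate, *, target_score=0.9, max_iterations=5):
    # Initial generation, then score-driven refinement rounds.
    response = await generate(prompt)
    score = await evaluate(response)
    for _ in range(max_iterations):
        if score >= target_score:
            break                                   # converged: target reached
        response = await generate(f"Improve (current score {score:.2f}): {response}")
        score = await evaluate(response)
    return response, score

async def demo():
    async def generate(p):   # stub LLM: grows the answer by one char per round
        return p[p.rfind(": ") + 2:] + "+" if p.startswith("Improve") else "draft"
    async def evaluate(r):   # stub evaluator: longer answers score higher
        return min(1.0, len(r) / 8)
    return await refine_sketch("task", generate, evaluate)

resp, score = asyncio.run(demo())
```

The real pattern additionally uses the `ConvergenceDetector` to stop early when score deltas stall, rather than only checking the absolute target.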
react_loop¶
executionkit.patterns.react_loop.react_loop
async
¶
react_loop(provider: ToolCallingProvider, prompt: str, tools: Sequence[Tool], *, max_rounds: int = 8, max_observation_chars: int = 12000, tool_timeout: float | None = None, temperature: float = 0.3, max_tokens: int = 4096, max_cost: TokenUsage | None = None, retry: RetryConfig | None = None, max_history_messages: int | None = None, **_: Any) -> PatternResult[str]
Execute a think-act-observe tool-calling loop.
The LLM is called repeatedly with the conversation history and
available tool schemas. When the LLM returns tool calls, each tool
is executed and its result appended as a tool-role message. The loop
ends when the LLM responds without tool calls (final answer) or
max_rounds is exhausted.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `provider` | `ToolCallingProvider` | LLM provider to call. | *required* |
| `prompt` | `str` | Initial user prompt. | *required* |
| `tools` | `Sequence[Tool]` | Sequence of `Tool` definitions the LLM may call. | *required* |
| `max_rounds` | `int` | Maximum think-act-observe cycles. | `8` |
| `max_observation_chars` | `int` | Truncation limit for each tool result. | `12000` |
| `tool_timeout` | `float \| None` | Per-call timeout override. Falls back to each tool's own `timeout`. | `None` |
| `temperature` | `float` | Sampling temperature (lower = more deterministic). | `0.3` |
| `max_tokens` | `int` | Maximum tokens per LLM completion. | `4096` |
| `max_cost` | `TokenUsage \| None` | Optional token/call budget. | `None` |
| `retry` | `RetryConfig \| None` | Optional retry configuration per LLM call. | `None` |
| `max_history_messages` | `int \| None` | When set, trim the message history to at most this many entries before each LLM call. Always keeps the first message (the original prompt). | `None` |
Returns:

| Name | Type | Description |
|---|---|---|
| | `PatternResult[str]` | A `PatternResult[str]` with the final response as its `value`. |
Source code in `executionkit/patterns/react_loop.py`
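The think-act-observe loop can be sketched with a plain-function stand-in for the provider and a dict of tool callables. All names here are illustrative, not the library's API; real tools are `Tool` dataclasses and the loop raises `MaxIterationsError` on exhaustion:

```python
def react_sketch(llm, tools, prompt, max_rounds=8, max_observation_chars=12000):
    """Illustrative think-act-observe loop."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_rounds):
        reply = llm(messages)                      # think: text or tool calls
        calls = reply.get("tool_calls")
        if not calls:                              # no tool calls: final answer
            return reply["content"]
        messages.append({"role": "assistant", "tool_calls": calls})
        for call in calls:                         # act: run each requested tool
            result = tools[call["name"]](**call["args"])
            messages.append({"role": "tool",       # observe: feed result back
                             "content": result[:max_observation_chars]})
    raise RuntimeError("max_rounds exhausted")     # stands in for MaxIterationsError

# Stub: the first turn requests a calculator tool, the second turn answers.
def fake_llm(messages):
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_calls": [{"name": "add", "args": {"a": 2, "b": 3}}]}
    return {"content": f"The answer is {messages[-1]['content']}", "tool_calls": None}

print(react_sketch(fake_llm, {"add": lambda a, b: str(a + b)}, "What is 2+3?"))
```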
pipe¶
executionkit.compose.pipe
async
¶
pipe(provider: LLMProvider, prompt: str, *steps: PatternStep, max_budget: TokenUsage | None = None, **shared_kwargs: Any) -> PatternResult[Any]
Chain reasoning patterns, threading output as the next prompt.
Each step must be an async callable with the signature `async def step(provider, prompt, **kwargs) -> PatternResult[Any]`.
The value of each result is converted to a string and passed as the
prompt to the following step. Costs are accumulated and, when
max_budget is given, the remaining budget is forwarded to each step
via the max_cost keyword argument.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `provider` | `LLMProvider` | LLM provider passed unchanged to every step. | *required* |
| `prompt` | `str` | Initial input prompt. | *required* |
| `*steps` | `PatternStep` | Async pattern callables to chain in order. | `()` |
| `max_budget` | `TokenUsage \| None` | Optional shared token/call budget across all steps. | `None` |
| `**shared_kwargs` | `Any` | Extra keyword arguments forwarded to every step. | `{}` |
Returns:

| Name | Type | Description |
|---|---|---|
| | `PatternResult[Any]` | The final step's `PatternResult[Any]` with its accumulated cost. If `steps` is empty the prompt is returned as-is with zero cost. |
Source code in executionkit/compose.py
executionkit.compose.PatternStep ¶
Bases: Protocol
Callable protocol for a single step in a `pipe` chain.
Each step must accept `(provider, prompt, **kwargs)` and return an awaitable `PatternResult` (see `executionkit.types.PatternResult`). Extra keyword arguments (e.g. `max_cost`) are filtered to only those the step actually accepts, so steps that do not declare `**kwargs` will not receive unsupported arguments.
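The output-threading and kwargs-filtering behavior can be sketched as follows. This is an illustrative reduction: real steps return `PatternResult` objects and budgets are forwarded via `max_cost`; here plain strings stand in:

```python
import asyncio
import inspect

async def pipe_sketch(provider, prompt, *steps, **shared_kwargs):
    """Illustrative chain: each step's output becomes the next step's prompt."""
    value = prompt
    for step in steps:
        params = inspect.signature(step).parameters
        # Filter kwargs to those the step accepts (mirrors the PatternStep note).
        accepts_var = any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())
        kwargs = shared_kwargs if accepts_var else {
            k: v for k, v in shared_kwargs.items() if k in params
        }
        value = await step(provider, str(value), **kwargs)
    return value

async def demo():
    async def upper(provider, prompt):                # toy "pattern": uppercase
        return prompt.upper()
    async def exclaim(provider, prompt, suffix="!"):  # toy "pattern" with a kwarg
        return prompt + suffix
    # 'unused' is silently dropped for both steps; 'suffix' reaches only exclaim.
    return await pipe_sketch(None, "hello", upper, exclaim, suffix="!!", unused=1)

print(asyncio.run(demo()))  # HELLO!!
```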
Sync wrappers¶
Each sync wrapper takes the same arguments as its async counterpart and runs it via asyncio.run. They raise RuntimeError when called inside a running event loop — use await directly there.
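That contract can be sketched as a generic wrapper (illustrative, not the library's code):

```python
import asyncio

def run_sync(coro_fn, *args, **kwargs):
    """Illustrative sync wrapper, mirroring the behavior described above."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return asyncio.run(coro_fn(*args, **kwargs))   # no loop running: safe
    raise RuntimeError("called inside a running event loop; use 'await' instead")

async def double(x):
    return x * 2

result = run_sync(double, 21)          # fine: no loop is running here

async def misuse():
    try:
        run_sync(double, 21)           # inside a running loop: refused
    except RuntimeError:
        return True
    return False

raised_inside_loop = asyncio.run(misuse())
```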
Value types¶
executionkit.types.PatternResult
dataclass
¶
PatternResult(value: T, score: float | None = None, cost: TokenUsage = TokenUsage(), metadata: MappingProxyType[str, Any] = (lambda: MappingProxyType({}))())
Bases: Generic[T]
Result returned by every reasoning pattern.
metadata keys vary by pattern. Each pattern documents its own keys in its
function docstring. Do not rely on undocumented keys — they are private.
executionkit.types.TokenUsage
dataclass
¶
Accumulated token and call counts.
executionkit.types.Tool
dataclass
¶
Tool(name: str, description: str, parameters: Mapping[str, Any], execute: Callable[..., Awaitable[str]], timeout: float = 30.0)
Describes a tool available for LLM tool-calling.
`parameters` is a JSON Schema mapping describing the function arguments; it is automatically wrapped in a read-only proxy.
`execute` is the async callable invoked when the LLM requests this tool.
executionkit.types.VotingStrategy ¶
Bases: StrEnum
Strategy for consensus voting.
executionkit.types.Evaluator
module-attribute
¶
Async callable that scores a response string on [0.0, 1.0].
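A conforming evaluator is just an async callable returning a float in [0.0, 1.0]. A toy example (the keyword rubric is invented for illustration):

```python
import asyncio

async def keyword_evaluator(response: str) -> float:
    """Toy Evaluator: fraction of required keywords present, in [0.0, 1.0]."""
    required = {"summary", "sources"}        # assumed rubric for illustration
    hits = sum(1 for k in required if k in response.lower())
    return hits / len(required)

score_full = asyncio.run(keyword_evaluator("Summary with sources attached."))
```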
Session¶
executionkit.kit.Kit ¶
Session that holds a `Provider` (see `executionkit.provider`) and tracks cumulative token usage across all pattern calls.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `provider` | `LLMProvider` | The LLM provider to use for all calls. | *required* |
| `track_cost` | `bool` | When `True`, cumulative token usage is tracked across all calls. | `True` |
Source code in executionkit/kit.py
consensus
async
¶
Run the `executionkit.patterns.consensus.consensus` pattern.
All keyword arguments are forwarded unchanged to `consensus`.
Source code in executionkit/kit.py
pipe
async
¶
Run `executionkit.compose.pipe` with this Kit's provider.
All keyword arguments are forwarded unchanged to `pipe`.
Source code in executionkit/kit.py
react
async
¶
Run the `executionkit.patterns.react_loop.react_loop` pattern.
All keyword arguments are forwarded unchanged to `react_loop`.
The provider must satisfy `executionkit.provider.ToolCallingProvider`; `react_loop` will raise `TypeError` if it does not.
Source code in executionkit/kit.py
refine
async
¶
Run the `executionkit.patterns.refine_loop.refine_loop` pattern.
All keyword arguments are forwarded unchanged to `refine_loop`.
Source code in executionkit/kit.py
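The session's bookkeeping can be sketched as below. This is illustrative only; `run` and the dict-shaped results are assumptions standing in for the Kit methods and `PatternResult`:

```python
import asyncio

class KitSketch:
    """Illustrative session: holds a provider and accumulates usage across calls."""
    def __init__(self, provider, track_cost=True):
        self.provider = provider
        self._track_cost = track_cost
        self.total_tokens = 0
        self.total_calls = 0

    async def run(self, pattern, prompt, **kwargs):
        # Forward this session's provider to the pattern, then record its cost.
        result = await pattern(self.provider, prompt, **kwargs)
        if self._track_cost:
            self.total_tokens += result["tokens"]
            self.total_calls += 1
        return result["value"]

async def demo():
    async def fake_pattern(provider, prompt, **kwargs):
        # Stand-in for consensus/refine/react: a value plus its token cost.
        return {"value": prompt[::-1], "tokens": 10}
    kit = KitSketch(provider=None)
    first = await kit.run(fake_pattern, "abc")
    second = await kit.run(fake_pattern, "xy")
    return first, second, kit.total_tokens, kit.total_calls

summary = asyncio.run(demo())
```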
Cost tracking¶
executionkit.cost.CostTracker ¶
Mutable accumulator for token and call counts.
Source code in executionkit/cost.py
add_usage ¶
Add pre-computed usage to the tracker (e.g. from a pattern result).
Use this instead of accessing private fields directly.
Source code in executionkit/cost.py
record ¶
record_without_call ¶
Record token usage from a response without incrementing the call counter.
Used by `checked_complete`, which pre-increments `_calls` before the await to prevent TOCTOU races in concurrent budget checks.
Source code in executionkit/cost.py
release_call ¶
Release a reserved call slot (after a failed call).
Only call this after `reserve_call` if the provider call raised an exception and did not complete successfully.
reserve_call ¶
Reserve a call slot before dispatching (for TOCTOU-safe budget checks).
Called by `checked_complete` before awaiting the provider call. If the call fails, use `release_call` to undo the reservation.
Source code in executionkit/cost.py
to_usage ¶
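The reserve/record/release protocol can be sketched as below (illustrative; a threading lock stands in for whatever synchronization the library actually uses):

```python
import threading

class CostTrackerSketch:
    """Illustrative TOCTOU-safe call accounting: reserve before the await,
    release on failure, record tokens without double-counting the call."""
    def __init__(self, max_calls):
        self._lock = threading.Lock()
        self._calls = 0
        self._tokens = 0
        self._max_calls = max_calls

    def reserve_call(self):
        # Pre-increment so concurrent callers cannot all pass the budget check.
        with self._lock:
            if self._calls >= self._max_calls:
                raise RuntimeError("budget exhausted")  # stands in for BudgetExhaustedError
            self._calls += 1

    def release_call(self):
        # Undo a reservation after a failed provider call.
        with self._lock:
            self._calls -= 1

    def record_without_call(self, tokens):
        # The call was already counted by reserve_call; only add tokens.
        with self._lock:
            self._tokens += tokens

tracker = CostTrackerSketch(max_calls=1)
tracker.reserve_call()                 # counted before the provider call
tracker.record_without_call(120)       # tokens only; call already counted
```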
Engine helpers¶
executionkit.engine.convergence.ConvergenceDetector
dataclass
¶
ConvergenceDetector(delta_threshold: float = 0.01, patience: int = 3, score_threshold: float | None = None)
Tracks score history and detects convergence via delta + patience.
Convergence is declared when either:
- score_threshold is set and the current score meets or exceeds it, or
- The score delta has been below delta_threshold for patience
consecutive iterations.
Attributes:

| Name | Type | Description |
|---|---|---|
| `delta_threshold` | `float` | Minimum meaningful score improvement. |
| `patience` | `int` | How many consecutive stale iterations before stopping. |
| `score_threshold` | `float \| None` | Optional absolute score target for early exit. |
reset ¶
should_stop ¶
Record a score and return whether convergence is reached.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `score` | `float` | Evaluator score, must be in [0.0, 1.0] and not NaN. | *required* |

Returns:

| Type | Description |
|---|---|
| `bool` | True if the loop should stop (converged or threshold met). |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If score is NaN or outside [0.0, 1.0]. |
Source code in executionkit/engine/convergence.py
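The delta-plus-patience rule can be sketched as an illustrative reimplementation (not the library's code):

```python
class ConvergenceSketch:
    """Illustrative delta + patience convergence check."""
    def __init__(self, delta_threshold=0.01, patience=3, score_threshold=None):
        self.delta_threshold = delta_threshold
        self.patience = patience
        self.score_threshold = score_threshold
        self._last = None
        self._stale = 0

    def should_stop(self, score: float) -> bool:
        if score != score or not (0.0 <= score <= 1.0):   # NaN or out of range
            raise ValueError("score must be in [0.0, 1.0] and not NaN")
        if self.score_threshold is not None and score >= self.score_threshold:
            return True                      # absolute target reached
        if self._last is not None and abs(score - self._last) < self.delta_threshold:
            self._stale += 1                 # another stale iteration
        else:
            self._stale = 0                  # meaningful improvement resets patience
        self._last = score
        return self._stale >= self.patience

d = ConvergenceSketch(patience=2)
print([d.should_stop(s) for s in [0.5, 0.505, 0.506]])  # [False, False, True]
```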
executionkit.engine.retry.RetryConfig
dataclass
¶
RetryConfig(max_retries: int = 3, base_delay: float = 1.0, max_delay: float = 60.0, exponential_base: float = 2.0, retryable: tuple[type[Exception], ...] = (RateLimitError, ProviderError))
Immutable retry configuration with exponential backoff.
Attributes:

| Name | Type | Description |
|---|---|---|
| `max_retries` | `int` | Maximum number of retry attempts. 0 means no retries. |
| `base_delay` | `float` | Base delay in seconds before first retry. |
| `max_delay` | `float` | Maximum delay cap in seconds. |
| `exponential_base` | `float` | Multiplier for exponential backoff. |
| `retryable` | `tuple[type[Exception], ...]` | Tuple of exception types that trigger retries. |
get_delay ¶
Calculate jittered backoff delay for the given attempt (1-indexed).
Uses full jitter (random value in [0, capped_exponential]) to prevent thundering-herd effects when multiple coroutines retry simultaneously.
Source code in executionkit/engine/retry.py
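The full-jitter computation described above can be sketched as follows (illustrative; the exact cap formula is an assumption consistent with the attributes listed):

```python
import random

def full_jitter_delay(attempt, base_delay=1.0, max_delay=60.0, exponential_base=2.0):
    """Illustrative full-jitter backoff for a 1-indexed attempt:
    a uniform draw in [0, min(max_delay, base * exp_base ** (attempt - 1))]."""
    cap = min(max_delay, base_delay * exponential_base ** (attempt - 1))
    return random.uniform(0.0, cap)
```

Full jitter trades deterministic spacing for decorrelation: concurrent coroutines that failed together retry at scattered times instead of all at once.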
executionkit.engine.json_extraction.extract_json ¶
Extract JSON from LLM output using multiple strategies.
Strategies (in order):
1. Raw `json.loads(text.strip())`
2. Strip markdown fences (a `json` fence or a generic code fence)
3. Balanced-brace extraction: find the first `{` or `[`, track nesting depth while respecting string boundaries, and find the matching closer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `text` | `str` | Raw LLM response text that may contain JSON. | *required* |

Returns:

| Type | Description |
|---|---|
| `dict[str, Any] \| list[Any]` | Parsed JSON as a dict or list. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If no valid JSON can be extracted. |
Source code in executionkit/engine/json_extraction.py
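The three-strategy cascade can be sketched as below (illustrative; the real implementation lives in `executionkit/engine/json_extraction.py`):

```python
import json
import re

def extract_json_sketch(text):
    """Illustrative three-strategy JSON extraction."""
    stripped = text.strip()
    try:
        return json.loads(stripped)                      # 1. raw parse
    except ValueError:
        pass
    fenced = re.search(r"```(?:json)?\s*(.*?)```", stripped, re.DOTALL)
    if fenced:                                           # 2. markdown fences
        try:
            return json.loads(fenced.group(1).strip())
        except ValueError:
            pass
    start = min((i for i in (stripped.find("{"), stripped.find("[")) if i != -1),
                default=-1)                              # 3. balanced braces
    if start != -1:
        open_ch = stripped[start]
        close_ch = "}" if open_ch == "{" else "]"
        depth, in_str, esc = 0, False, False
        for i, ch in enumerate(stripped[start:], start):
            if esc:
                esc = False                  # previous char was a backslash
            elif ch == "\\":
                esc = True
            elif ch == '"':
                in_str = not in_str          # ignore braces inside strings
            elif not in_str:
                if ch == open_ch:
                    depth += 1
                elif ch == close_ch:
                    depth -= 1
                    if depth == 0:           # matching closer found
                        return json.loads(stripped[start : i + 1])
    raise ValueError("no valid JSON found")
```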
Errors¶
All exceptions inherit from ExecutionKitError and carry .cost (TokenUsage accumulated up to the failure) and .metadata (dict).
| Exception | Cause |
|---|---|
| `ExecutionKitError` | Base for all errors. |
| `LLMError` | Base for provider communication errors. |
| `RateLimitError` | HTTP 429 — retryable. Carries `retry_after`. |
| `PermanentError` | HTTP 401/403/404 — not retryable. |
| `ProviderError` | Unexpected HTTP failure — retryable. |
| `PatternError` | Base for pattern logic errors. |
| `BudgetExhaustedError` | Token or call budget exceeded. |
| `ConsensusFailedError` | Unanimous strategy could not agree. |
| `MaxIterationsError` | Loop hit `max_rounds` / `max_iterations`. |
executionkit.provider.ExecutionKitError ¶
ExecutionKitError(message: str, *, cost: TokenUsage | None = None, metadata: dict[str, Any] | None = None)
Bases: Exception
Base exception for all ExecutionKit errors.
Source code in executionkit/errors.py
executionkit.provider.RateLimitError ¶
RateLimitError(message: str, *, retry_after: float = 1.0, cost: TokenUsage | None = None, metadata: dict[str, Any] | None = None)
Bases: LLMError
Provider returned HTTP 429 — retryable after retry_after seconds.
Source code in executionkit/errors.py
executionkit.provider.BudgetExhaustedError ¶
BudgetExhaustedError(message: str, *, cost: TokenUsage | None = None, metadata: dict[str, Any] | None = None)
Bases: PatternError
Token or call budget exceeded.
Source code in executionkit/errors.py
executionkit.provider.ConsensusFailedError ¶
ConsensusFailedError(message: str, *, cost: TokenUsage | None = None, metadata: dict[str, Any] | None = None)
Bases: PatternError
Consensus pattern could not reach agreement.
Source code in executionkit/errors.py
executionkit.provider.MaxIterationsError ¶
MaxIterationsError(message: str, *, cost: TokenUsage | None = None, metadata: dict[str, Any] | None = None)
Bases: PatternError
Loop pattern exceeded its iteration limit.