
Middleware Lifecycle

Middleware participates in an agent run through lifecycle hooks, wrappers, state updates, and events. Use middleware when behavior must sit around the agent loop instead of inside one tool.

Common uses include permission gates, retry policies, error formatting, tool-call timeouts, logging, compaction, content handling, and UI-facing request/response events.

Execution Order

Hooks follow a stack-shaped model:

| Surface | Order |
| --- | --- |
| Before* hooks | registration order |
| After* hooks | reverse registration order |
| OnErrorAsync | reverse registration order |
| model/function wrappers | first registered is outermost |

If middleware is registered as A, then B, then C, wrapper execution is:

```text
A(B(C(core)))
```

This ordering matters for error handling. A retry middleware registered before a timeout middleware wraps the timeout behavior. A formatter registered later is closer to the core operation.
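The nesting can be reproduced in a few lines of plain C#. This is a minimal sketch and not the framework's pipeline builder: the `Wrapper` delegate, `Compose`, and `Named` are hypothetical names invented for illustration, and handlers are simplified to string functions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WrapperOrderSketch
{
    // Hypothetical wrapper shape: takes the next handler, returns a wrapped handler.
    public delegate Func<string, string> Wrapper(Func<string, string> next);

    // Folds the wrappers so that the first registered one ends up outermost.
    public static Func<string, string> Compose(
        IEnumerable<Wrapper> registered,
        Func<string, string> core)
    {
        var handler = core;

        // Walk the list backwards: the last registered wrapper is applied first,
        // which makes it the innermost layer around the core.
        foreach (var wrapper in registered.Reverse())
        {
            handler = wrapper(handler);
        }

        return handler;
    }

    // Builds a wrapper that surrounds the inner result with its own name.
    public static Wrapper Named(string name) =>
        next => input => $"{name}({next(input)})";

    public static string Demo()
    {
        var pipeline = Compose(
            new[] { Named("A"), Named("B"), Named("C") },
            _ => "core");

        return pipeline("ignored");
    }
}
```

`WrapperOrderSketch.Demo()` returns `A(B(C(core)))`: A sees the call first and the result last, while C sits directly around the core.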

Main Hook Families

Message-turn hooks run around a user message turn. Iteration hooks run around an agent loop iteration. Function hooks run around tool/function execution. Branch hooks run before a branch fork is committed. Runtime hooks run around agent runtime start and stop.

Wrapper hooks are different from Before* and After* hooks. They receive a handler and can decide whether to call it, call it more than once, transform inputs, transform outputs, or catch errors. Function wrappers always participate. Streaming model wrappers opt in by returning a non-null stream.
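To illustrate those options, the sketch below shows a wrapper that transforms its input, skips the handler on a cache hit, and converts one exception type into a result. The `WrapAsync` signature is a simplified, hypothetical stand-in and should not be read as the real `WrapFunctionCallAsync` contract.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class CachingWrapperSketch
{
    private readonly Dictionary<string, string> _cache = new();

    // Hypothetical wrapper: receives the call argument and the next handler.
    public async Task<string> WrapAsync(string argument, Func<string, Task<string>> next)
    {
        var key = argument.Trim(); // Transform the input before it goes further.

        if (_cache.TryGetValue(key, out var cached))
        {
            return cached; // Short-circuit: the handler is not called at all.
        }

        try
        {
            var result = await next(key); // Call the handler once.
            _cache[key] = result;
            return result;
        }
        catch (InvalidOperationException ex)
        {
            return $"error: {ex.Message}"; // Catch the error and turn it into a result.
        }
    }

    public static async Task<(string First, string Second, int Calls)> DemoAsync()
    {
        var calls = 0;
        var wrapper = new CachingWrapperSketch();

        Func<string, Task<string>> core = arg =>
        {
            calls++;
            return Task.FromResult($"result for {arg}");
        };

        var first = await wrapper.WrapAsync(" weather ", core);
        var second = await wrapper.WrapAsync("weather", core); // Served from the cache.

        return (first, second, calls);
    }
}
```

In `DemoAsync`, both calls return `result for weather`, but the core handler runs only once because the second call is answered by the wrapper itself.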

Runtime start/stop hooks run when an agent is used as a started runtime, such as hosted SSE/WebSocket, bot, TUI, client-tool, or other long-lived input-loop scenarios. Direct one-shot RunAsync(...) calls still use the message-turn, iteration, function, branch, and error hooks. See Agent Runtime And Capabilities for the distinction.

Hook Reference

IAgentMiddleware has hooks at each layer of the runtime. Override only the hooks your middleware needs.

| Layer | Hook | When it runs | Common uses |
| --- | --- | --- | --- |
| Runtime | BeforeStartAsync | Before the agent runtime input loop starts | Allocate runtime resources, validate startup, configure audio/realtime resources |
| Runtime | AfterStartedAsync | After the runtime input loop has started | Availability diagnostics, background work that depends on the running loop |
| Runtime | BeforeStopAsync | Before the runtime input loop stops | Graceful drain decisions, buffer flushing, shutdown diagnostics |
| Runtime | AfterStoppedAsync | After the runtime input loop has stopped and registered resources were disposed | Final telemetry, cleanup confirmation |
| Message turn | BeforeMessageTurnAsync | Before processing one user message turn | RAG injection, memory retrieval, context augmentation, run-config inspection |
| Message turn | AfterMessageTurnAsync | After a message turn completes | Memory extraction, analytics, turn-level logging |
| Iteration | BeforeIterationAsync | Before each model iteration | Prompt/message modification, chat option tuning, per-iteration policy |
| Model wrapper | WrapModelTurnStreamingAsync | Around the streaming model turn when the middleware opts in | Retry, caching, request modification, streaming transformation, progressive metrics |
| Tool iteration | BeforeToolExecutionAsync | After the model returns tool calls but before tools execute | Whole-iteration tool validation, permission checks, tool filtering |
| Function batch | BeforeParallelBatchAsync | Before a parallel batch of functions executes | Batch-level permissions, rate limiting, batch approval |
| Function | BeforeFunctionAsync | Before each individual function executes | Argument validation, per-function permission checks, logging, overrides |
| Function wrapper | WrapFunctionCallAsync | Around the actual function body | Retry, caching, timeout, result transformation |
| Function | AfterFunctionAsync | After a function completes or throws | Result formatting, function telemetry, exception observation |
| Iteration | AfterIterationAsync | After tool results are collected for an iteration | Result aggregation, error recovery, state updates |
| Branch lifecycle | BeforeBranchForkCommitAsync | After a target branch has been materialized for a fork, before it is persisted | Compact copied history, stamp branch metadata, rewrite branch-local middleware state |
| Error | OnErrorAsync | During the tool/function error path | Function error logging, circuit breakers, graceful degradation |

BeforeBranchForkCommitAsync is the fork hook. It sees both the source branch and the not-yet-persisted target branch, plus the fork point and BranchForkOptions. Use it when the target branch should start differently from a raw copy, such as compacting copied history, adding branch-local metadata, or adjusting copied middleware state before the new branch becomes durable.

Fork events are a different surface. After a fork is committed, branch event projection can include durable events such as BRANCH_FORKED and BRANCH_MIDDLEWARE_STATE_COMMITTED. Those events describe what was committed; they are not pre-commit mutation hooks.

Streaming Wrapper Probe

WrapModelTurnStreamingAsync(...) uses nullable return semantics. Returning null means the middleware does not intercept the streaming model turn.

During pipeline construction, the framework probes each middleware once to see whether it returns a stream. If it opts in, the middleware is called again when the stream actually executes. Avoid side effects before returning the IAsyncEnumerable.

Prefer this shape:

```csharp
public IAsyncEnumerable<AgentModelUpdate>? WrapModelTurnStreamingAsync(
    AgentModelTurnRequest request,
    Func<AgentModelTurnRequest, IAsyncEnumerable<AgentModelUpdate>> next,
    CancellationToken cancellationToken)
{
    return RunAsync(request, next, cancellationToken);
}

private async IAsyncEnumerable<AgentModelUpdate> RunAsync(
    AgentModelTurnRequest request,
    Func<AgentModelTurnRequest, IAsyncEnumerable<AgentModelUpdate>> next,
    [System.Runtime.CompilerServices.EnumeratorCancellation] CancellationToken cancellationToken)
{
    // Side effects here run during enumeration, not during the probe.
    Console.WriteLine("Model stream started.");

    await foreach (var update in next(request).WithCancellation(cancellationToken))
    {
        yield return update;
    }
}
```

Do not increment counters, emit events, mutate request-related state, or start background work before returning the stream. That work can happen during the probe even if no tokens have been enumerated yet.
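The difference comes from how C# runs async iterators: a regular method body executes when it is called, while an async iterator body does not start until the stream is enumerated. The sketch below demonstrates this with standard C# only. `ProbeSketch`, its counters, and the simulated probe are hypothetical and do not use the framework's types.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ProbeSketch
{
    public static int EagerCount;
    public static int DeferredCount;

    // Risky shape: a regular method body runs as soon as it is called,
    // so a probe call alone increments the counter.
    public static IAsyncEnumerable<string> Eager(IAsyncEnumerable<string> source)
    {
        EagerCount++;
        return source;
    }

    // Safer shape: an async iterator body does not run until enumeration starts.
    public static async IAsyncEnumerable<string> Deferred(IAsyncEnumerable<string> source)
    {
        DeferredCount++;
        await foreach (var item in source)
        {
            yield return item;
        }
    }

    // Stand-in for the model stream.
    public static async IAsyncEnumerable<string> Tokens()
    {
        yield return "hello";
        await Task.Yield();
        yield return "world";
    }

    public static async Task<string> DemoAsync()
    {
        EagerCount = 0;
        DeferredCount = 0;

        // Simulated probe: obtain both streams without enumerating them.
        var probedEager = Eager(Tokens());
        var probedDeferred = Deferred(Tokens());
        var afterProbe = $"{EagerCount},{DeferredCount}";

        // Real execution: enumerate the deferred stream.
        await foreach (var token in probedDeferred)
        {
        }

        return $"{afterProbe} -> {EagerCount},{DeferredCount}";
    }
}
```

`DemoAsync` returns `1,0 -> 1,1`: the eager counter moves during the simulated probe, while the deferred counter moves only once tokens are actually consumed.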

State Updates

Middleware state is immutable: an update produces a new state record instead of mutating the existing one. Use UpdateState(...) for core state or for atomic updates that span several states. Use UpdateMiddlewareState<TState>(...) for simple updates to a single middleware state record.

```csharp
public sealed class TurnCountingMiddleware : IAgentMiddleware
{
    public Task BeforeIterationAsync(
        BeforeIterationContext context,
        CancellationToken cancellationToken)
    {
        context.UpdateMiddlewareState<TurnCounterState>(state => state with
        {
            Count = state.Count + 1
        });

        return Task.CompletedTask;
    }
}

[MiddlewareState(Persistent = true, Scope = StateScope.Branch)]
public sealed record TurnCounterState
{
    public int Count { get; init; }
}
```

Do not read state, await unrelated work, and then write a derived value; another update can land during the await and be silently overwritten. Put the read inside the update lambda so the update is based on the current state at the time of mutation.
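The following sketch shows why the read belongs inside the lambda. It does not reproduce how `UpdateMiddlewareState<TState>(...)` is implemented, which this page does not describe. `StateCellSketch`, `CounterState`, and `Overwrite` are hypothetical, and the compare-and-swap loop is one common way to apply a transform to the latest value of an immutable record.

```csharp
using System;
using System.Threading;

public sealed record CounterState(int Count);

public sealed class StateCellSketch
{
    private CounterState _state = new(0);

    public CounterState Current => Volatile.Read(ref _state);

    // Applies the transform to whatever state is current at the moment of the write.
    public void Update(Func<CounterState, CounterState> transform)
    {
        while (true)
        {
            var snapshot = Volatile.Read(ref _state);
            var next = transform(snapshot);

            // Publish only if nobody replaced the state in the meantime; otherwise retry.
            if (ReferenceEquals(Interlocked.CompareExchange(ref _state, next, snapshot), snapshot))
            {
                return;
            }
        }
    }

    // Blind write, used here to model "read, await, then write a derived value".
    public void Overwrite(CounterState value) => Volatile.Write(ref _state, value);
}

public static class StateUpdateDemo
{
    public static (int Stale, int Atomic) Run()
    {
        // Stale pattern: two callers read the same snapshot, then each writes snapshot + 1.
        var stale = new StateCellSketch();
        var readA = stale.Current;
        var readB = stale.Current; // For example, another hook ran while A was awaiting.
        stale.Overwrite(readA with { Count = readA.Count + 1 });
        stale.Overwrite(readB with { Count = readB.Count + 1 });

        // Lambda pattern: each update reads the state at mutation time.
        var atomic = new StateCellSketch();
        atomic.Update(s => s with { Count = s.Count + 1 });
        atomic.Update(s => s with { Count = s.Count + 1 });

        return (stale.Current.Count, atomic.Current.Count);
    }
}
```

`StateUpdateDemo.Run()` returns `(1, 2)`: the stale pattern loses one increment, while the lambda pattern keeps both.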

Use UpdateState(...) when you need to update loop state directly:

```csharp
context.UpdateState(state => state with
{
    IsTerminated = true,
    TerminationReason = "Stopped by middleware policy"
});
```

Error Scope

OnErrorAsync currently belongs to the tool/function error path. A function body exception is routed through OnErrorAsync and then through AfterFunctionAsync.

Do not assume OnErrorAsync catches every provider, model-call, streaming, or whole message-turn exception. Model-call errors are handled by model streaming wrappers such as retry and error formatting. Message-turn failures are surfaced through message-turn error events.
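The function error path can be traced with a toy dispatcher that combines the ordering rules from Execution Order with the routing described here. Everything in it is hypothetical scaffolding: hooks are reduced to trace strings, and the exception is swallowed after the error hooks only so that the trace can continue. The real runtime may propagate or transform it differently.

```csharp
using System;
using System.Collections.Generic;

public static class ErrorPathSketch
{
    // Records the order in which hypothetical hooks would fire for one function call.
    public static List<string> Trace(IReadOnlyList<string> registered, Action functionBody)
    {
        var trace = new List<string>();

        // Before* hooks: registration order.
        foreach (var name in registered)
        {
            trace.Add($"{name}.BeforeFunction");
        }

        try
        {
            functionBody();
        }
        catch (Exception)
        {
            // Error hooks: reverse registration order.
            for (var i = registered.Count - 1; i >= 0; i--)
            {
                trace.Add($"{registered[i]}.OnError");
            }
        }

        // After* hooks: reverse registration order, whether the body completed or threw.
        for (var i = registered.Count - 1; i >= 0; i--)
        {
            trace.Add($"{registered[i]}.AfterFunction");
        }

        return trace;
    }

    public static string Demo() =>
        string.Join(
            " > ",
            Trace(new[] { "A", "B" }, () => throw new InvalidOperationException("tool failed")));
}
```

For middleware registered as A, then B, `ErrorPathSketch.Demo()` returns `A.BeforeFunction > B.BeforeFunction > B.OnError > A.OnError > B.AfterFunction > A.AfterFunction`.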

Built-In Middleware Order

The builder also registers built-in middleware for configured features. Source-checked built-ins include content upload/reference/image middleware, retry, function timeout, error formatting, container/collapsing, client tools, and logging.

Because wrappers are first-registered-is-outermost, the recommended error-handling stack is:

```text
RetryMiddleware(FunctionTimeoutMiddleware(ErrorFormattingMiddleware(core)))
```

See Error Handling for details.
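One consequence of placing retry outside timeout can be sketched with plain delegates. This is an illustration under stated assumptions, not the built-in middleware: `Handler`, `WithTimeout`, and `WithRetry` are hypothetical, the error-formatting layer is left out because its behavior is not described on this page, and the sketch assumes the timeout layer starts a fresh timer on every invocation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class RetryTimeoutSketch
{
    // Hypothetical handler shape for one function call.
    public delegate Task<string> Handler(CancellationToken ct);

    // Inner layer: cancels the wrapped call after the given duration.
    public static Handler WithTimeout(Handler next, TimeSpan limit) => async ct =>
    {
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
        cts.CancelAfter(limit);
        try
        {
            return await next(cts.Token);
        }
        catch (OperationCanceledException) when (!ct.IsCancellationRequested)
        {
            // The caller did not cancel, so the timer did: report it as a timeout.
            throw new TimeoutException($"Call exceeded {limit.TotalMilliseconds} ms.");
        }
    };

    // Outer layer: invokes the wrapped handler again after a timeout.
    public static Handler WithRetry(Handler next, int maxAttempts) => async ct =>
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await next(ct);
            }
            catch (TimeoutException) when (attempt < maxAttempts)
            {
                // Loop again; the next attempt re-enters the timeout layer.
            }
        }
    };

    public static async Task<(string Result, int Attempts)> DemoAsync()
    {
        var attempts = 0;
        Handler core = async ct =>
        {
            attempts++;
            // Only the first attempt is slow enough to hit the timeout.
            await Task.Delay(attempts == 1 ? TimeSpan.FromSeconds(5) : TimeSpan.Zero, ct);
            return "ok";
        };

        var pipeline = WithRetry(WithTimeout(core, TimeSpan.FromMilliseconds(50)), maxAttempts: 3);
        var result = await pipeline(CancellationToken.None);
        return (result, attempts);
    }
}
```

In `DemoAsync`, the first attempt times out, the retry layer calls the timeout layer again, and the second attempt returns `ok` after two attempts in total. With the order reversed, a single timer would cover all attempts.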
