Skip to content

Backend Abstraction Layer

The backend abstraction layer provides a provider-agnostic interface for agent execution. Downstream consumers (test runner, chat session, serve endpoint) depend only on the protocols defined in holodeck.lib.backends.base -- no provider-specific types leak through.

Routing

BackendSelector inspects model.provider and instantiates the correct backend automatically:

Provider Backend
openai, azure_openai OpenAIAgentsBackend
anthropic, ollama ClaudeBackend

holodeck.lib.backends.base -- Core Protocols & Data Classes

Defines the provider-agnostic contracts that every backend must satisfy and the unified result types returned to callers.

ExecutionResult

The unified result type returned by every backend. Fields: response, tool_calls, tool_results, token_usage, structured_output, num_turns, is_error, error_reason, and thinking (extended-thinking text, empty when disabled or unsupported by the active backend).

ExecutionResult(response, tool_calls=list(), tool_results=list(), token_usage=TokenUsage.zero(), structured_output=None, num_turns=1, is_error=False, error_reason=None, thinking='') dataclass

Provider-agnostic result of a single agent turn.

Attributes:

Name Type Description
response str

The text response from the agent.

tool_calls list[dict[str, Any]]

List of tool call records made during execution.

tool_results list[dict[str, Any]]

List of tool result records returned during execution.

token_usage TokenUsage

Token consumption metadata for this turn.

structured_output Any | None

Optional structured output from the agent.

num_turns int

Number of turns taken to produce this result.

is_error bool

Whether the execution ended in an error state.

error_reason str | None

Human-readable reason for the error, if any.

thinking str

Extended-thinking text emitted by the model, concatenated in arrival order. Empty when extended thinking is disabled or unsupported by the backend.

ToolEvent

ToolEvent(kind, tool_name, tool_use_id, tool_input=None, tool_response=None, error=None, parent_tool_use_id=None, text=None) dataclass

Real-time tool execution event from the backend.

Emitted by backends that support hook-based tool observation (e.g. Claude Agent SDK). Events are pushed onto an asyncio.Queue that consumers can drain concurrently during agent execution.

Attributes:

Name Type Description
kind Literal['start', 'end', 'error', 'subagent_message', 'parent_link', 'thinking']

Event type — "start" before execution, "end" after success, "error" after failure, "subagent_message" for an assistant text snapshot streamed by an in-flight subagent (Task tool), "parent_link" to declare that a tool invocation was spawned by a subagent (so the panel can annotate it), "thinking" for an extended-thinking block streamed mid-turn (one event per ThinkingBlock, ahead of any tool call the block precedes).

tool_name str

Name of the tool being invoked. Empty for "thinking" (no tool involved).

tool_use_id str

Unique identifier correlating start/end/error for the same invocation. For "subagent_message" events this is the parent Task's tool_use_id so consumers can attach the snapshot to the corresponding active entry. For "parent_link" this is the child tool's id. For "thinking" this is a per-block id callers can thread through any downstream protocol (e.g. AG-UI REASONING_* message_id).

tool_input dict[str, Any] | None

Tool input parameters (present on "start").

tool_response str | None

Tool output (present on "end").

error str | None

Error description (present on "error").

parent_tool_use_id str | None

For nested events, the parent Task's tool_use_id. Present on "subagent_message" and "parent_link".

text str | None

Latest assistant text snapshot from a subagent (present on "subagent_message"), or the thinking-block text (present on "thinking").

AgentSession

AgentSession

Bases: Protocol

Stateful multi-turn conversation session.

Implementations maintain conversation history across multiple send calls. Callers must invoke close when the session is no longer needed to release any held resources (connections, subprocesses, etc.).

close() async

Release session resources (connections, subprocesses, etc.).

Source code in src/holodeck/lib/backends/base.py
147
148
149
async def close(self) -> None:
    """Release session resources (connections, subprocesses, etc.)."""
    ...

prepare() async

Connect or otherwise ready the session before the first send.

Called by callers that need the session's underlying transport opened in a specific async task context (e.g. _TaskBoundSession requires the SDK's connect() to run in the actor task so the anyio task group binds correctly).

Default behavior is a no-op. Implementations that have lazy connect semantics — like ClaudeSession — should override this to perform the connect explicitly. Implementations that connect at construction time can leave this as a no-op.

Source code in src/holodeck/lib/backends/base.py
109
110
111
112
113
114
115
116
117
118
119
120
121
122
async def prepare(self) -> None:
    """Connect or otherwise ready the session before the first ``send``.

    Called by callers that need the session's underlying transport
    opened in a specific async task context (e.g.
    ``_TaskBoundSession`` requires the SDK's ``connect()`` to run in
    the actor task so the anyio task group binds correctly).

    Default behavior is a no-op. Implementations that have lazy
    connect semantics — like ``ClaudeSession`` — should override
    this to perform the connect explicitly. Implementations that
    connect at construction time can leave this as a no-op.
    """
    return None

send(message) async

Send a message and receive a single-turn result.

Parameters:

Name Type Description Default
message str

The user message to send to the agent.

required

Returns:

Type Description
ExecutionResult

ExecutionResult containing the agent response and metadata.

Source code in src/holodeck/lib/backends/base.py
124
125
126
127
128
129
130
131
132
133
async def send(self, message: str) -> ExecutionResult:
    """Send a message and receive a single-turn result.

    Args:
        message: The user message to send to the agent.

    Returns:
        ExecutionResult containing the agent response and metadata.
    """
    ...

send_streaming(message) async

Send a message and stream the agent response token by token.

Parameters:

Name Type Description Default
message str

The user message to send to the agent.

required

Yields:

Type Description
AsyncGenerator[str, None]

Successive string chunks of the agent response.

Source code in src/holodeck/lib/backends/base.py
135
136
137
138
139
140
141
142
143
144
145
async def send_streaming(self, message: str) -> AsyncGenerator[str, None]:
    """Send a message and stream the agent response token by token.

    Args:
        message: The user message to send to the agent.

    Yields:
        Successive string chunks of the agent response.
    """
    # Protocol stub — concrete implementations use `yield`
    yield ""  # pragma: no cover

AgentBackend

AgentBackend

Bases: Protocol

Provider backend factory.

Each backend encapsulates provider-specific initialisation logic and exposes a uniform surface for single-turn invocations (invoke_once) and stateful sessions (create_session). Callers must call initialize before any other method and teardown when done.

create_session(*, eager_connect=True) async

Create a new stateful multi-turn session.

Parameters:

Name Type Description Default
eager_connect bool

When True (default), the backend may open its underlying transport before returning. When False, the backend must return a session whose transport is opened lazily — typically by AgentSession.prepare() invoked from a specific async task. Used by callers that need connection lifecycle bound to a different task than the one calling create_session (e.g. _TaskBoundSession actor). Backends without lazy connect semantics may accept and ignore this argument.

True

Returns:

Type Description
AgentSession

A fresh AgentSession instance bound to this backend.

Raises:

Type Description
BackendInitError

If the backend was not initialised before calling.

BackendSessionError

If the session cannot be created.

Source code in src/holodeck/lib/backends/base.py
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
async def create_session(self, *, eager_connect: bool = True) -> AgentSession:
    """Create a new stateful multi-turn session.

    Args:
        eager_connect: When True (default), the backend may open its
            underlying transport before returning. When False, the
            backend must return a session whose transport is opened
            lazily — typically by ``AgentSession.prepare()`` invoked
            from a specific async task. Used by callers that need
            connection lifecycle bound to a different task than the
            one calling ``create_session`` (e.g.
            ``_TaskBoundSession`` actor). Backends without lazy
            connect semantics may accept and ignore this argument.

    Returns:
        A fresh AgentSession instance bound to this backend.

    Raises:
        BackendInitError: If the backend was not initialised before calling.
        BackendSessionError: If the session cannot be created.
    """
    ...

initialize() async

Prepare the backend for use.

Raises:

Type Description
BackendInitError

If the backend cannot be initialised (e.g. missing API key, unavailable subprocess).

Source code in src/holodeck/lib/backends/base.py
162
163
164
165
166
167
168
169
async def initialize(self) -> None:
    """Prepare the backend for use.

    Raises:
        BackendInitError: If the backend cannot be initialised (e.g.
            missing API key, unavailable subprocess).
    """
    ...

invoke_once(message, context=None) async

Execute a single stateless agent turn.

Parameters:

Name Type Description Default
message str

The user message to send to the agent.

required
context list[dict[str, Any]] | None

Optional list of prior conversation turns.

None

Returns:

Type Description
ExecutionResult

ExecutionResult containing the agent response and metadata.

Raises:

Type Description
BackendSessionError

If the invocation fails at runtime.

BackendTimeoutError

If the invocation exceeds configured timeout.

Source code in src/holodeck/lib/backends/base.py
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
async def invoke_once(
    self,
    message: str,
    context: list[dict[str, Any]] | None = None,
) -> ExecutionResult:
    """Execute a single stateless agent turn.

    Args:
        message: The user message to send to the agent.
        context: Optional list of prior conversation turns.

    Returns:
        ExecutionResult containing the agent response and metadata.

    Raises:
        BackendSessionError: If the invocation fails at runtime.
        BackendTimeoutError: If the invocation exceeds configured timeout.
    """
    ...

teardown() async

Release all backend resources.

Source code in src/holodeck/lib/backends/base.py
214
215
216
async def teardown(self) -> None:
    """Release all backend resources."""
    ...

ContextGenerator

ContextGenerator

Bases: Protocol

Backend-agnostic contextual embedding generation.

Implementations produce situating context for document chunks by summarising each chunk's role within the larger document. Both the existing Semantic Kernel generator and future Claude SDK generator should satisfy this protocol.

contextualize_batch(chunks, document_text, concurrency=None) async

Generate contextual descriptions for a batch of chunks.

Parameters:

Name Type Description Default
chunks list[DocumentChunk]

Document chunks to contextualize.

required
document_text str

Full text of the source document.

required
concurrency int | None

Maximum number of concurrent LLM calls.

None

Returns:

Type Description
list[str]

A list of contextual description strings, one per chunk.

Source code in src/holodeck/lib/backends/base.py
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
async def contextualize_batch(
    self,
    chunks: list[DocumentChunk],
    document_text: str,
    concurrency: int | None = None,
) -> list[str]:
    """Generate contextual descriptions for a batch of chunks.

    Args:
        chunks: Document chunks to contextualize.
        document_text: Full text of the source document.
        concurrency: Maximum number of concurrent LLM calls.

    Returns:
        A list of contextual description strings, one per chunk.
    """
    ...

Exceptions

BackendError

Bases: HoloDeckError

Base exception for all backend errors.

Catch this to handle any backend-related failure without needing to know the specific subtype.

BackendInitError

Bases: BackendError

Raised during initialize() — startup validation failures.

Examples include a missing API key, an unreachable subprocess, or an incompatible runtime environment.

BackendSessionError

Bases: BackendError

Raised during send() — session-level failures.

Examples include unexpected disconnections, malformed responses, or provider-reported errors during an active session.

BackendTimeoutError

Bases: BackendError

Raised when a single invocation exceeds the configured timeout.

Callers may choose to retry with a longer timeout or surface this as a user-visible error.


holodeck.lib.backends.selector -- Backend Routing

Routes an Agent configuration to the correct backend based on model.provider.

BackendSelector

BackendSelector

Selects and initializes the appropriate backend for an agent configuration.

select(agent, tool_instances=None, mode='test') async staticmethod

Select and initialize the appropriate backend for the given agent.

Parameters:

Name Type Description Default
agent Agent

Agent configuration with model provider information.

required
tool_instances dict[str, Any] | None

Initialized tool instances for Claude backend.

None
mode str

Execution mode ("test" or "chat").

'test'

Returns:

Type Description
AgentBackend

An initialized AgentBackend instance ready for use.

Raises:

Type Description
BackendInitError

If the provider is not supported or initialization fails.

Source code in src/holodeck/lib/backends/selector.py
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
@staticmethod
async def select(
    agent: Agent,
    tool_instances: dict[str, Any] | None = None,
    mode: str = "test",
) -> AgentBackend:
    """Select and initialize the appropriate backend for the given agent.

    Args:
        agent: Agent configuration with model provider information.
        tool_instances: Initialized tool instances for Claude backend.
        mode: Execution mode (``"test"`` or ``"chat"``).

    Returns:
        An initialized AgentBackend instance ready for use.

    Raises:
        BackendInitError: If the provider is not supported or initialization fails.
    """
    provider = agent.model.provider

    if provider in (ProviderEnum.OPENAI, ProviderEnum.AZURE_OPENAI):
        # Lazy import keeps the optional openai-agents SDK off the import
        # path for non-OpenAI providers (SC-005).
        from holodeck.lib.backends.openai_agents_backend import OpenAIAgentsBackend

        openai_backend = OpenAIAgentsBackend(agent=agent)
        await openai_backend.initialize()
        return openai_backend

    if provider in (ProviderEnum.ANTHROPIC, ProviderEnum.OLLAMA):
        claude_backend = ClaudeBackend(
            agent=agent,
            tool_instances=tool_instances,
            mode=mode,
        )
        await claude_backend.initialize()
        return claude_backend

    raise BackendInitError(f"Unsupported provider: {provider}")

holodeck.lib.backends.openai_agents_backend -- OpenAI Agents Backend

Implements the backend for provider: openai and provider: azure_openai natively on the OpenAI Agents SDK, behind the provider-agnostic backend interfaces.

OpenAIAgentsBackend

OpenAIAgentsBackend(agent, base_dir=None)

OpenAI Agents SDK backend implementing the AgentBackend protocol.

Wraps an SDK Agent (built from the HoloDeck agent config) and drives it through Runner.run for single-turn invocations and SQLiteSession- backed multi-turn sessions.

Initialize the backend with agent configuration.

Parameters:

Name Type Description Default
agent Agent

The HoloDeck agent configuration.

required
base_dir Path | None

Directory for resolving relative tool/instruction paths. Falls back to the agent_base_dir context variable.

None
Source code in src/holodeck/lib/backends/openai_agents_backend.py
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
def __init__(self, agent: Agent, base_dir: Path | None = None) -> None:
    """Initialize the backend with agent configuration.

    Args:
        agent: The HoloDeck agent configuration.
        base_dir: Directory for resolving relative tool/instruction paths.
            Falls back to the ``agent_base_dir`` context variable.
    """
    self._agent_config = agent
    self._base_dir = base_dir
    self._sdk_agent: Any | None = None
    self._has_structured_output = False
    self._tool_instances: dict[str, Any] = {}
    self._owned_tools: list[Any] = []
    self._mcp_servers: list[Any] = []

create_session(*, eager_connect=True) async

Create a stateful multi-turn session backed by a fresh SQLiteSession.

Parameters:

Name Type Description Default
eager_connect bool

Accepted for protocol compatibility; the SQLite session is created synchronously regardless.

True

Returns:

Type Description
AgentSession

An OpenAIAgentsSession bound to this backend's SDK agent.

Source code in src/holodeck/lib/backends/openai_agents_backend.py
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
async def create_session(self, *, eager_connect: bool = True) -> AgentSession:
    """Create a stateful multi-turn session backed by a fresh SQLiteSession.

    Args:
        eager_connect: Accepted for protocol compatibility; the SQLite
            session is created synchronously regardless.

    Returns:
        An ``OpenAIAgentsSession`` bound to this backend's SDK agent.
    """
    del eager_connect  # no lazy-connect transport for this backend
    sdk_agent = self._require_agent()
    from agents import SQLiteSession

    session_id = f"holodeck-{uuid.uuid4().hex}"
    session = SQLiteSession(session_id)
    return OpenAIAgentsSession(
        sdk_agent=sdk_agent,
        sqlite_session=session,
        agent_config=self._agent_config,
        group_id=session_id,
        max_turns=_max_turns(self._agent_config),
        budget_usd=_max_budget_usd(self._agent_config),
        structured_output=self._has_structured_output,
    )

initialize() async

Build the SDK Agent — validating credentials and tools.

Raises:

Type Description
BackendInitError

If credentials are missing or the provider is unsupported.

ConfigError

If a tool config is unsupported or fails to load.

Source code in src/holodeck/lib/backends/openai_agents_backend.py
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
async def initialize(self) -> None:
    """Build the SDK ``Agent`` — validating credentials and tools.

    Raises:
        BackendInitError: If credentials are missing or the provider is
            unsupported.
        ConfigError: If a tool config is unsupported or fails to load.
    """
    from agents import Agent as SDKAgent

    from holodeck.lib.backends.openai_agents_output import (
        build_output_schema,
        load_response_format_schema,
    )
    from holodeck.lib.backends.openai_agents_tool_adapters import build_sdk_tools
    from holodeck.lib.backends.validators import validate_openai_agents
    from holodeck.lib.instruction_resolver import resolve_instructions

    # FR-034 / FR-110: fail load on credential gaps and allow ∩ disallow
    # conflicts (all problems surfaced together) before any SDK side effects.
    validate_openai_agents(self._agent_config)

    # Install the OTel-mirroring TracingProcessor once, before any run emits
    # spans (H1). Gated on the agent's observability/tracing being enabled.
    _install_tracing_mirror(self._agent_config)

    base_dir = self._resolve_base_dir()
    disallowed = _disallowed_tool_names(self._agent_config)
    model = _build_model(self._agent_config)
    instructions = resolve_instructions(
        self._agent_config.instructions, base_dir=base_dir
    )
    await self._initialize_tool_instances()
    tools = build_sdk_tools(
        self._agent_config.tools,
        base_dir,
        tool_instances=self._tool_instances,
        disallowed=disallowed,
    )
    mcp_servers = await self._initialize_mcp_servers(base_dir, disallowed)

    model_settings = _build_model_settings(
        self._agent_config.model, self._agent_config.openai
    )

    # FR-004: wire response_format (dict | str path | None) to output_type.
    schema = load_response_format_schema(
        self._agent_config.response_format, base_dir
    )
    output_type = build_output_schema(schema) if schema is not None else None
    self._has_structured_output = output_type is not None

    self._sdk_agent = SDKAgent(
        name=self._agent_config.name,
        instructions=instructions,
        model=model,
        tools=tools,
        mcp_servers=mcp_servers,
        model_settings=model_settings,
        output_type=output_type,
    )

invoke_once(message, context=None) async

Execute a single stateless agent turn.

Parameters:

Name Type Description Default
message str

The user message to send to the agent.

required
context list[dict[str, Any]] | None

Optional prior turns (unused in the MVP).

None

Returns:

Type Description
ExecutionResult

ExecutionResult for the turn.

Raises:

Type Description
BackendSessionError

If the SDK run fails at runtime.

Source code in src/holodeck/lib/backends/openai_agents_backend.py
 994
 995
 996
 997
 998
 999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
async def invoke_once(
    self,
    message: str,
    context: list[dict[str, Any]] | None = None,
) -> ExecutionResult:
    """Execute a single stateless agent turn.

    Args:
        message: The user message to send to the agent.
        context: Optional prior turns (unused in the MVP).

    Returns:
        ExecutionResult for the turn.

    Raises:
        BackendSessionError: If the SDK run fails at runtime.
    """
    del context  # not threaded into the SDK loop for the MVP
    sdk_agent = self._require_agent()
    from agents import Runner

    try:
        result = await Runner.run(
            sdk_agent,
            message,
            max_turns=_max_turns(self._agent_config),
            run_config=_build_run_config(self._agent_config),
            hooks=self._invoke_hooks(),
        )
    except BackendBudgetExceededError as exc:
        # Surface the budget abort as an error result so the partial response
        # and accumulated cost are preserved (FR-032), not lost to a raise.
        return _budget_error_result(exc)
    except Exception as exc:  # noqa: BLE001 - re-raised as backend error
        raise BackendSessionError(
            f"OpenAI Agents run failed: {type(exc).__name__}: {exc}"
        ) from exc
    return _to_execution_result(result, structured=self._has_structured_output)

teardown() async

Release backend resources, cleaning up RAG tools and MCP servers.

Source code in src/holodeck/lib/backends/openai_agents_backend.py
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
async def teardown(self) -> None:
    """Release backend resources, cleaning up RAG tools and MCP servers."""
    for tool_inst in self._owned_tools:
        cleanup = getattr(tool_inst, "cleanup", None)
        if callable(cleanup):
            try:
                await cleanup()
            except Exception as exc:  # noqa: BLE001 - best-effort cleanup
                logger.warning("Error cleaning up tool: %s", exc)
    self._owned_tools = []
    self._tool_instances = {}

    for server in self._mcp_servers:
        try:
            await server.cleanup()
        except Exception as exc:  # noqa: BLE001 - best-effort cleanup
            logger.warning("Error cleaning up MCP server: %s", exc)
    self._mcp_servers = []

OpenAIAgentsSession

OpenAIAgentsSession(sdk_agent, sqlite_session, *, agent_config=None, group_id=None, max_turns=20, budget_usd=None, structured_output=False)

Stateful multi-turn session backed by an SDK SQLiteSession.

Each send runs the SDK agent loop with the shared SQLiteSession so the SDK persists turn history. Idle sessions are SQLite rows, not held processes.

Bind the session to an SDK agent and its SQLite-backed history.

Parameters:

Name Type Description Default
sdk_agent Any

The built SDK Agent.

required
sqlite_session Any

The SDK SQLiteSession persisting turn history.

required
agent_config Agent | None

The HoloDeck agent config used to build the per-run RunConfig. When None no RunConfig is attached.

None
group_id str | None

The session id, carried as RunConfig.group_id so the session's turns correlate in the trace.

None
max_turns int

The agent-loop cap passed to Runner.run.

20
budget_usd float | None

The configured max_budget_usd. When set, a single cost accountant is shared across every turn so the budget covers the whole session; when None no cost hooks are attached.

None
structured_output bool

Whether the agent has an output schema, so each turn's final_output is parsed into structured_output.

False
Source code in src/holodeck/lib/backends/openai_agents_backend.py
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
def __init__(
    self,
    sdk_agent: Any,
    sqlite_session: Any,
    *,
    agent_config: Agent | None = None,
    group_id: str | None = None,
    max_turns: int = 20,
    budget_usd: float | None = None,
    structured_output: bool = False,
) -> None:
    """Bind the session to an SDK agent and its SQLite-backed history.

    Args:
        sdk_agent: The built SDK ``Agent``.
        sqlite_session: The SDK ``SQLiteSession`` persisting turn history.
        agent_config: The HoloDeck agent config used to build the per-run
            ``RunConfig``. When ``None`` no ``RunConfig`` is attached.
        group_id: The session id, carried as ``RunConfig.group_id`` so the
            session's turns correlate in the trace.
        max_turns: The agent-loop cap passed to ``Runner.run``.
        budget_usd: The configured ``max_budget_usd``. When set, a single
            cost accountant is shared across every turn so the budget covers
            the whole session; when ``None`` no cost hooks are attached.
        structured_output: Whether the agent has an output schema, so each
            turn's ``final_output`` is parsed into ``structured_output``.
    """
    self._sdk_agent = sdk_agent
    self._session = sqlite_session
    self._agent_config = agent_config
    self._group_id = group_id
    self._max_turns = max_turns
    self._budget_usd = budget_usd
    self._structured_output = structured_output
    # One accountant shared across the session's turns (FR-032).
    self._accountant: Any | None = None

close() async

Release the SQLite session connection, if any.

Source code in src/holodeck/lib/backends/openai_agents_backend.py
792
793
794
795
796
797
798
async def close(self) -> None:
    """Release the SQLite session connection, if any."""
    close = getattr(self._session, "close", None)
    if callable(close):
        maybe = close()
        if hasattr(maybe, "__await__"):
            await maybe

prepare() async

No-op. The SQLite session is ready at construction time.

Source code in src/holodeck/lib/backends/openai_agents_backend.py
719
720
721
async def prepare(self) -> None:
    """No-op. The SQLite session is ready at construction time."""
    return None

send(message) async

Run one turn against the persistent session.

Parameters:

Name Type Description Default
message str

The user message to send to the agent.

required

Returns:

Type Description
ExecutionResult

ExecutionResult for this turn. Runtime failures are returned as an

ExecutionResult

error result (is_error=True) rather than raised, so the multi-turn

ExecutionResult

executor can record per-turn failures.

Source code in src/holodeck/lib/backends/openai_agents_backend.py
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
async def send(self, message: str) -> ExecutionResult:
    """Run one turn against the persistent session.

    Args:
        message: The user message to send to the agent.

    Returns:
        ExecutionResult for this turn. Runtime failures are returned as an
        error result (``is_error=True``) rather than raised, so the multi-turn
        executor can record per-turn failures.
    """
    from agents import Runner

    try:
        result = await Runner.run(
            self._sdk_agent,
            message,
            session=self._session,
            max_turns=self._max_turns,
            run_config=self._run_config(),
            hooks=self._hooks(),
        )
    except BackendBudgetExceededError as exc:
        return _budget_error_result(exc)
    except Exception as exc:  # noqa: BLE001 - surfaced via ExecutionResult
        return ExecutionResult(
            response="",
            is_error=True,
            error_reason=f"{type(exc).__name__}: {exc}",
        )
    return _to_execution_result(result, structured=self._structured_output)

send_streaming(message) async

Stream the agent response token by token.

Runs the SDK agent loop via Runner.run_streamed and forwards each model text delta as it arrives. Text deltas surface as raw-response events carrying a ResponseTextDeltaEvent; tool-call and lifecycle events are ignored for the streamed text channel.

Parameters:

Name Type Description Default
message str

The user message to send to the agent.

required

Yields:

Type Description
AsyncGenerator[str, None]

String chunks of the agent response as the model produces them.

Source code in src/holodeck/lib/backends/openai_agents_backend.py
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
async def send_streaming(self, message: str) -> AsyncGenerator[str, None]:
    """Stream the agent response token by token.

    Runs the SDK agent loop via ``Runner.run_streamed`` and forwards each
    model text delta as it arrives. Text deltas surface as raw-response
    events carrying a ``ResponseTextDeltaEvent``; tool-call and lifecycle
    events are ignored for the streamed text channel.

    Args:
        message: The user message to send to the agent.

    Yields:
        String chunks of the agent response as the model produces them.
    """
    from agents import Runner
    from openai.types.responses import ResponseTextDeltaEvent

    result = Runner.run_streamed(
        self._sdk_agent,
        message,
        session=self._session,
        max_turns=self._max_turns,
        run_config=self._run_config(),
        hooks=self._hooks(),
    )
    try:
        async for event in result.stream_events():
            if event.type != "raw_response_event":
                continue
            data = event.data
            if isinstance(data, ResponseTextDeltaEvent) and data.delta:
                yield data.delta
    except BackendBudgetExceededError:
        # The budget tripped mid-stream; the deltas produced so far have
        # already been yielded, so end the stream gracefully (FR-032).
        return

holodeck.lib.backends.claude_backend -- Claude Agent SDK Backend

Implements the backend for provider: anthropic (and local provider: ollama models). Single-turn invocations use the top-level query() SDK function; multi-turn chat sessions use ClaudeSDKClient.

ClaudeBackend

ClaudeBackend(agent, tool_instances=None, mode='test')

Backend implementation for the Claude Agent SDK.

Implements the AgentBackend protocol. Both single-turn invocations and multi-turn sessions are built on the top-level query() function; multi-turn state is threaded via resume=<sdk_session_id> inside ClaudeSession.

The constructor stores config only — no I/O, no subprocess spawned. Initialization is deferred to initialize() (called lazily on first use).

Store configuration without performing any I/O.

Parameters:

Name Type Description Default
agent Agent

Agent configuration.

required
tool_instances dict[str, Any] | None

Initialized vectorstore/hierarchical-doc tool instances.

None
mode str

Execution mode ("test" or "chat").

'test'
Source code in src/holodeck/lib/backends/claude_backend.py
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
2001
def __init__(
    self,
    agent: Agent,
    tool_instances: dict[str, Any] | None = None,
    mode: str = "test",
) -> None:
    """Store configuration without performing any I/O.

    Args:
        agent: Agent configuration.
        tool_instances: Initialized vectorstore/hierarchical-doc tool instances.
        mode: Execution mode (``"test"`` or ``"chat"``).
    """
    self._agent = agent
    self._tool_instances = tool_instances or {}
    self._mode = mode
    self._initialized = False
    self._options: ClaudeAgentOptions | None = None
    self._owned_tools: list[Any] = []  # Tools created during initialize()
    self._instrumentor: Any = None

create_session(*, eager_connect=True) async

Create a new multi-turn session.

Automatically initializes if not yet done. Under spec 034 P4 the session no longer holds a persistent ClaudeSDKClient; each turn opens its own subprocess via query(resume=session_id). eager_connect is retained as a no-op for API compatibility — ClaudeSession.prepare() is itself a no-op under P4.

Parameters:

Name Type Description Default
eager_connect bool

Retained for backwards compatibility; has no effect under spec 034 P4.

True

Returns:

Type Description
ClaudeSession

A new ClaudeSession instance.

Source code in src/holodeck/lib/backends/claude_backend.py
2257
2258
2259
2260
2261
2262
2263
2264
2265
2266
2267
2268
2269
2270
2271
2272
2273
2274
2275
2276
2277
2278
2279
async def create_session(self, *, eager_connect: bool = True) -> ClaudeSession:
    """Create a new multi-turn session.

    Automatically initializes if not yet done. Under spec 034 P4 the
    session no longer holds a persistent ``ClaudeSDKClient``; each
    turn opens its own subprocess via ``query(resume=session_id)``.
    ``eager_connect`` is retained as a no-op for API compatibility —
    ``ClaudeSession.prepare()`` is itself a no-op under P4.

    Args:
        eager_connect: Retained for backwards compatibility; has no
            effect under spec 034 P4.

    Returns:
        A new ``ClaudeSession`` instance.
    """
    await self._ensure_initialized()
    if self._options is None:
        raise BackendInitError("Backend options not set after initialization")
    session = ClaudeSession(options=self._options)
    if eager_connect:
        await session.prepare()
    return session

initialize() async

Initialize the backend — validate config, build options.

Idempotent: calling multiple times is a no-op after the first.

Raises:

Type Description
BackendInitError

On validation or configuration failure.

Source code in src/holodeck/lib/backends/claude_backend.py
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
2027
2028
2029
2030
2031
2032
2033
2034
2035
2036
2037
2038
2039
2040
2041
2042
2043
2044
2045
2046
2047
2048
2049
2050
2051
2052
2053
2054
2055
2056
2057
2058
2059
2060
2061
2062
2063
2064
2065
2066
2067
2068
2069
2070
2071
2072
2073
2074
2075
2076
2077
2078
2079
2080
2081
2082
2083
2084
2085
2086
async def initialize(self) -> None:
    """Initialize the backend — validate config, build options.

    Idempotent: calling multiple times is a no-op after the first.

    Raises:
        BackendInitError: On validation or configuration failure.
    """
    if self._initialized:
        return

    try:
        agent = self._agent
        claude = agent.claude

        # 1. Node.js prerequisite
        validate_nodejs(agent)

        # 2. Credentials
        auth_env = validate_credentials(agent.model)

        # 3. Embedding provider (vectorstore tools)
        validate_embedding_provider(agent)

        # 4. Auto-initialize vectorstore/hierarchical-doc tools if needed
        await self._initialize_tools()

        # 5. Tool adapters
        from pathlib import Path

        from holodeck.config.context import agent_base_dir as _agent_base_dir

        _base_dir_str = _agent_base_dir.get()
        _base_dir = Path(_base_dir_str) if _base_dir_str else None
        adapters = create_tool_adapters(
            tool_configs=agent.tools or [],
            tool_instances=self._tool_instances,
            base_dir=_base_dir,
        )
        tool_server, tool_names = build_holodeck_sdk_server(adapters)

        # 6. External MCP configs
        mcp_tools = [t for t in (agent.tools or []) if isinstance(t, MCPTool)]
        mcp_configs = build_claude_mcp_configs(mcp_tools)

        # 7. OTel env vars
        otel_env: dict[str, str] = {}
        if agent.observability:
            otel_env = translate_observability(agent.observability)

        # 8. Build options
        self._options = build_options(
            agent=agent,
            tool_server=tool_server if adapters else None,
            tool_names=tool_names,
            mcp_configs=mcp_configs,
            auth_env=auth_env,
            otel_env=otel_env,
            mode=self._mode,
        )

        # 9. Working directory collision check
        wd = claude.working_directory if claude else None
        validate_working_directory(wd)

        # 10. Response format validation
        validate_response_format(agent.response_format)

        # 11. GenAI instrumentation (optional, non-blocking)
        self._activate_instrumentation()

        self._initialized = True

    except Exception as exc:
        if not isinstance(exc, BackendInitError):
            raise BackendInitError(
                f"Claude backend initialization failed: {exc}"
            ) from exc
        raise

invoke_once(message, context=None) async

Invoke the agent for a single turn.

Automatically initializes if not yet done. Retries on ProcessError (subprocess crash) with exponential backoff.

Parameters:

Name Type Description Default
message str

User message text.

required
context list[dict[str, Any]] | None

Optional conversation context (unused for Claude backend).

None

Returns:

Type Description
ExecutionResult

ExecutionResult with the agent's response.

Raises:

Type Description
BackendSessionError

After max retries exhausted.

Source code in src/holodeck/lib/backends/claude_backend.py
2088
2089
2090
2091
2092
2093
2094
2095
2096
2097
2098
2099
2100
2101
2102
2103
2104
2105
2106
2107
2108
2109
2110
2111
2112
2113
2114
2115
2116
2117
2118
2119
2120
2121
2122
2123
2124
2125
2126
2127
2128
2129
2130
2131
async def invoke_once(
    self, message: str, context: list[dict[str, Any]] | None = None
) -> ExecutionResult:
    """Invoke the agent for a single turn.

    Automatically initializes if not yet done. Retries on ``ProcessError``
    (subprocess crash) with exponential backoff.

    Args:
        message: User message text.
        context: Optional conversation context (unused for Claude backend).

    Returns:
        ``ExecutionResult`` with the agent's response.

    Raises:
        BackendSessionError: After max retries exhausted.
    """
    await self._ensure_initialized()

    last_error: BaseException | None = None
    for attempt in range(_MAX_RETRIES):
        try:
            return await self._invoke_query(message)
        except (ProcessError, CLIConnectionError, MessageParseError) as exc:
            last_error = exc
        except BaseExceptionGroup as exc:
            # anyio TaskGroup wraps subprocess errors in ExceptionGroup
            last_error = exc

        if attempt < _MAX_RETRIES - 1:
            backoff = _BACKOFF_BASE_SECONDS * (2**attempt)
            logger.warning(
                "Claude subprocess error (attempt %d/%d), retrying in %ds: %s",
                attempt + 1,
                _MAX_RETRIES,
                backoff,
                last_error,
            )
            await asyncio.sleep(backoff)

    raise BackendSessionError(
        f"Claude subprocess failed after {_MAX_RETRIES} retries: {last_error}"
    )

teardown() async

Reset backend state, releasing any built options.

Source code in src/holodeck/lib/backends/claude_backend.py
2376
2377
2378
2379
2380
2381
2382
2383
2384
2385
2386
2387
2388
2389
2390
2391
2392
2393
2394
2395
2396
async def teardown(self) -> None:
    """Reset backend state, releasing any built options."""
    # Deactivate GenAI instrumentation
    if self._instrumentor is not None:
        try:
            self._instrumentor.uninstrument()
        except Exception as exc:
            logger.warning("Error deactivating GenAI instrumentation: %s", exc)
        self._instrumentor = None

    # Cleanup owned tools
    for tool_inst in self._owned_tools:
        if hasattr(tool_inst, "cleanup"):
            try:
                await tool_inst.cleanup()
            except Exception as exc:
                logger.warning("Error cleaning up tool: %s", exc)
    self._owned_tools = []
    self._tool_instances = {}
    self._initialized = False
    self._options = None

ClaudeSession

ClaudeSession(options)

Stateful multi-turn session backed by query(resume=...) (spec 034 P4).

Each send() / send_streaming() opens a fresh CLI subprocess via the top-level query() function. Turn 1 has no resume; the CLI assigns a session id which is captured from ResultMessage and stored on _sdk_session_id. Subsequent turns pass that id via options.resume so the CLI rehydrates the JSONL transcript at ~/.claude/projects/<encoded-cwd>/<sdk_session_id>.jsonl.

The _base_options reference is never mutated. Turn-specific options are created as new ClaudeAgentOptions instances.

Initialize session with base options.

Parameters:

Name Type Description Default
options ClaudeAgentOptions

Base options (immutable reference for the session lifetime).

required
Source code in src/holodeck/lib/backends/claude_backend.py
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
def __init__(self, options: ClaudeAgentOptions) -> None:
    """Initialize session with base options.

    Args:
        options: Base options (immutable reference for the session lifetime).
    """
    self._base_options = options
    self._turn_count: int = 0
    self._tool_event_queue: asyncio.Queue[ToolEvent] = asyncio.Queue()
    # spec 034 P4 — hybrid-session state.
    # CLI-assigned conversation id; captured from ResultMessage on turn 1
    # and fed into ``options.resume`` on turn 2+ so each fresh subprocess
    # rehydrates the JSONL transcript at
    # ``~/.claude/projects/<encoded-cwd>/<sdk_session_id>.jsonl``.
    self._sdk_session_id: str | None = None
    # Serialises concurrent send() / send_streaming() on the same session.
    # Two concurrent turns with the same resume= would race the transcript.
    self._send_lock: asyncio.Lock = asyncio.Lock()

tool_events property

Queue of real-time tool events emitted via SDK hooks.

close() async

Delete the on-disk JSONL transcript and clear session state.

Under spec 034 P4 the session has no persistent subprocess to disconnect. Conversation state lives on disk at ~/.claude/projects/<encoded-cwd>/<sdk_session_id>.jsonl. Closing the session permanently discards that transcript so the next open of the same threadId starts fresh.

Source code in src/holodeck/lib/backends/claude_backend.py
1944
1945
1946
1947
1948
1949
1950
1951
1952
1953
1954
1955
1956
1957
1958
1959
1960
1961
1962
async def close(self) -> None:
    """Delete the on-disk JSONL transcript and clear session state.

    Under spec 034 P4 the session has no persistent subprocess to
    disconnect. Conversation state lives on disk at
    ``~/.claude/projects/<encoded-cwd>/<sdk_session_id>.jsonl``. Closing
    the session permanently discards that transcript so the next
    open of the same threadId starts fresh.
    """
    if self._sdk_session_id is not None:
        cwd_value = getattr(self._base_options, "cwd", None)
        path = _transcript_path(self._sdk_session_id, cwd=cwd_value)
        try:
            path.unlink()
        except FileNotFoundError:
            pass
        except OSError as exc:
            logger.warning("Failed to delete transcript %s: %s", path, exc)
        self._sdk_session_id = None

prepare() async

No-op under spec 034 P4.

Retained for backwards compatibility with the chat executor's _TaskBoundSession. Under the hybrid-session model the SDK's anyio task group is created inside each query() call frame, so there is no task-binding to do up front.

Source code in src/holodeck/lib/backends/claude_backend.py
1327
1328
1329
1330
1331
1332
1333
1334
1335
async def prepare(self) -> None:
    """No-op under spec 034 P4.

    Retained for backwards compatibility with the chat executor's
    ``_TaskBoundSession``. Under the hybrid-session model the SDK's
    anyio task group is created inside each ``query()`` call frame,
    so there is no task-binding to do up front.
    """
    return None

release_transport() async

No-op under spec 034 P4.

Retained for backwards compatibility with the chat executor's _TaskBoundSession. Under the hybrid-session model each turn's subprocess is created and torn down inside query(); there is no persistent transport to release between turns.

Source code in src/holodeck/lib/backends/claude_backend.py
1934
1935
1936
1937
1938
1939
1940
1941
1942
async def release_transport(self) -> None:
    """No-op under spec 034 P4.

    Retained for backwards compatibility with the chat executor's
    ``_TaskBoundSession``. Under the hybrid-session model each turn's
    subprocess is created and torn down inside ``query()``; there is no
    persistent transport to release between turns.
    """
    return None

send(message) async

Send a message and collect the full response.

Parameters:

Name Type Description Default
message str

User message text.

required

Returns:

Type Description
ExecutionResult

ExecutionResult with the agent's response.

Raises:

Type Description
BackendSessionError

On subprocess or SDK error.

Source code in src/holodeck/lib/backends/claude_backend.py
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
1435
1436
1437
1438
1439
1440
1441
1442
1443
1444
async def send(self, message: str) -> ExecutionResult:
    """Send a message and collect the full response.

    Args:
        message: User message text.

    Returns:
        ``ExecutionResult`` with the agent's response.

    Raises:
        BackendSessionError: On subprocess or SDK error.
    """
    async with self._send_lock:
        turn_no = self._turn_count + 1
        logger.debug(
            "[trace] ClaudeSession.send turn=%d: entering, resume=%s",
            turn_no,
            self._sdk_session_id,
        )
        try:
            options = self._options_with_hooks()
            if self._sdk_session_id is not None:
                import dataclasses

                try:
                    options = dataclasses.replace(
                        options, resume=self._sdk_session_id
                    )
                except TypeError:
                    # Fallback for non-dataclass options (test mocks).
                    options.resume = self._sdk_session_id

            text_parts: list[str] = []
            tool_calls: list[dict[str, Any]] = []
            tool_results: list[dict[str, Any]] = []
            thinking_parts: list[str] = []
            token_usage = TokenUsage.zero()
            num_turns = 1
            structured_output: Any = None

            msg_count = 0
            async for msg in claude_agent_sdk.query(
                prompt=_streaming_user_envelope(message), options=options
            ):
                msg_count += 1
                logger.debug(
                    "[trace] ClaudeSession.send turn=%d: msg #%d type=%s",
                    turn_no,
                    msg_count,
                    msg.__class__.__name__,
                )
                _maybe_emit_thinking_blocks(msg, self._tool_event_queue)
                _maybe_emit_subagent_message(msg, self._tool_event_queue)
                text_parts, tool_calls, tool_results, thinking_parts = (
                    _process_message(
                        msg,
                        text_parts,
                        tool_calls,
                        tool_results,
                        thinking_parts,
                    )
                )
                if msg.__class__.__name__ == "ResultMessage":
                    rm = cast(Any, msg)
                    usage = rm.usage or {}
                    prompt_tokens = usage.get("input_tokens", 0)
                    completion = usage.get("output_tokens", 0)
                    token_usage = TokenUsage(
                        prompt_tokens=prompt_tokens,
                        completion_tokens=completion,
                        total_tokens=prompt_tokens + completion,
                    )
                    num_turns = rm.num_turns
                    structured_output = rm.structured_output
                    if self._sdk_session_id is None:
                        captured = getattr(rm, "session_id", None)
                        if isinstance(captured, str) and captured:
                            self._sdk_session_id = captured

            logger.debug(
                "[trace] ClaudeSession.send turn=%d: exited, "
                "msg_count=%d, num_turns=%d, sdk_session_id=%s",
                turn_no,
                msg_count,
                num_turns,
                self._sdk_session_id,
            )
            self._turn_count += 1

            _enrich_tool_results(tool_calls, tool_results)

            response_text = "".join(text_parts)
            if structured_output is not None:
                response_text = json.dumps(structured_output)

            return ExecutionResult(
                response=response_text,
                tool_calls=tool_calls,
                tool_results=tool_results,
                token_usage=token_usage,
                structured_output=structured_output,
                num_turns=num_turns,
                thinking="".join(thinking_parts),
            )
        except (ProcessError, CLIConnectionError) as exc:
            raise BackendSessionError(
                f"subprocess terminated unexpectedly: {exc}"
            ) from exc

send_agui(input_data, message_override=None) async

Send an AG-UI request through this Claude session.

Yields AG-UI events translated directly from Claude SDK stream messages.

Source code in src/holodeck/lib/backends/claude_backend.py
1490
1491
1492
1493
1494
1495
1496
1497
1498
1499
1500
1501
1502
1503
1504
1505
1506
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
1524
1525
1526
1527
1528
1529
1530
1531
1532
1533
1534
1535
1536
1537
1538
1539
1540
1541
1542
1543
1544
1545
1546
1547
1548
1549
1550
1551
1552
async def send_agui(
    self,
    input_data: RunAgentInput,
    message_override: str | None = None,
) -> AsyncGenerator[BaseEvent, None]:
    """Send an AG-UI request through this Claude session.

    Yields AG-UI events translated directly from Claude SDK stream messages.
    """
    async with self._send_lock:
        thread_id = input_data.thread_id
        run_id = input_data.run_id
        options = self._options_for_agui(input_data)
        prompt = _agui_prompt_from_input(input_data, message_override)
        frontend_tool_names = set(_agui_tool_names(input_data.tools))
        current_state = input_data.state
        result_data: dict[str, Any] | None = None

        yield RunStartedEvent(
            type=EventType.RUN_STARTED,
            thread_id=thread_id,
            run_id=run_id,
            parent_run_id=input_data.parent_run_id,
            input=input_data,
        )
        if input_data.state is not None:
            yield StateSnapshotEvent(
                type=EventType.STATE_SNAPSHOT,
                snapshot=input_data.state,
            )

        try:
            message_stream = claude_agent_sdk.query(
                prompt=_streaming_user_envelope(prompt),
                options=options,
            )
            async for event in self._stream_agui_messages(
                message_stream=message_stream,
                thread_id=thread_id,
                run_id=run_id,
                input_data=input_data,
                frontend_tool_names=frontend_tool_names,
                current_state=current_state,
            ):
                if isinstance(event, _AguiStreamStateUpdate):
                    current_state = event.state
                    continue
                if isinstance(event, _AguiStreamResult):
                    result_data = event.result
                    continue
                yield event

            self._turn_count += 1
            yield RunFinishedEvent(
                type=EventType.RUN_FINISHED,
                thread_id=thread_id,
                run_id=run_id,
                result=result_data,
            )
        except (ProcessError, CLIConnectionError) as exc:
            raise BackendSessionError(
                f"subprocess terminated unexpectedly: {exc}"
            ) from exc

send_streaming(message) async

Send a message and yield text chunks progressively.

Parameters:

Name Type Description Default
message str

User message text.

required

Yields:

Type Description
AsyncGenerator[str, None]

Text chunks as they arrive from the SDK.

Raises:

Type Description
BackendSessionError

On subprocess or SDK error.

Source code in src/holodeck/lib/backends/claude_backend.py
1446
1447
1448
1449
1450
1451
1452
1453
1454
1455
1456
1457
1458
1459
1460
1461
1462
1463
1464
1465
1466
1467
1468
1469
1470
1471
1472
1473
1474
1475
1476
1477
1478
1479
1480
1481
1482
1483
1484
1485
1486
1487
1488
async def send_streaming(self, message: str) -> AsyncGenerator[str, None]:
    """Send a message and yield text chunks progressively.

    Args:
        message: User message text.

    Yields:
        Text chunks as they arrive from the SDK.

    Raises:
        BackendSessionError: On subprocess or SDK error.
    """
    async with self._send_lock:
        options = self._options_with_hooks()
        if self._sdk_session_id is not None:
            import dataclasses

            try:
                options = dataclasses.replace(options, resume=self._sdk_session_id)
            except TypeError:
                # Fallback for non-dataclass options (test mocks).
                options.resume = self._sdk_session_id

        try:
            async for msg in claude_agent_sdk.query(
                prompt=_streaming_user_envelope(message), options=options
            ):
                _maybe_emit_thinking_blocks(msg, self._tool_event_queue)
                _maybe_emit_subagent_message(msg, self._tool_event_queue)
                if msg.__class__.__name__ == "AssistantMessage":
                    for block in cast(Any, msg).content:
                        if block.__class__.__name__ == "TextBlock" and block.text:
                            yield block.text
                elif msg.__class__.__name__ == "ResultMessage":
                    self._turn_count += 1
                    if self._sdk_session_id is None:
                        captured = getattr(msg, "session_id", None)
                        if isinstance(captured, str) and captured:
                            self._sdk_session_id = captured
        except (ProcessError, CLIConnectionError) as exc:
            raise BackendSessionError(
                f"subprocess terminated unexpectedly: {exc}"
            ) from exc

build_options

build_options(*, agent, tool_server, tool_names, mcp_configs, auth_env, otel_env, mode)

Assemble ClaudeAgentOptions from agent config and bridge outputs.

Parameters:

Name Type Description Default
agent Agent

The agent configuration.

required
tool_server McpSdkServerConfig | None

In-process MCP server for vectorstore/hierarchical-doc tools.

required
tool_names list[str]

Allowed tool names from the in-process server.

required
mcp_configs dict[str, Any]

External MCP server configs from build_claude_mcp_configs().

required
auth_env dict[str, str]

Auth env vars from validate_credentials().

required
otel_env dict[str, str]

OTel env vars from translate_observability().

required
mode str

Execution mode ("test" or "chat").

required

Returns:

Type Description
ClaudeAgentOptions

Configured ClaudeAgentOptions ready for query().

Source code in src/holodeck/lib/backends/claude_backend.py
 805
 806
 807
 808
 809
 810
 811
 812
 813
 814
 815
 816
 817
 818
 819
 820
 821
 822
 823
 824
 825
 826
 827
 828
 829
 830
 831
 832
 833
 834
 835
 836
 837
 838
 839
 840
 841
 842
 843
 844
 845
 846
 847
 848
 849
 850
 851
 852
 853
 854
 855
 856
 857
 858
 859
 860
 861
 862
 863
 864
 865
 866
 867
 868
 869
 870
 871
 872
 873
 874
 875
 876
 877
 878
 879
 880
 881
 882
 883
 884
 885
 886
 887
 888
 889
 890
 891
 892
 893
 894
 895
 896
 897
 898
 899
 900
 901
 902
 903
 904
 905
 906
 907
 908
 909
 910
 911
 912
 913
 914
 915
 916
 917
 918
 919
 920
 921
 922
 923
 924
 925
 926
 927
 928
 929
 930
 931
 932
 933
 934
 935
 936
 937
 938
 939
 940
 941
 942
 943
 944
 945
 946
 947
 948
 949
 950
 951
 952
 953
 954
 955
 956
 957
 958
 959
 960
 961
 962
 963
 964
 965
 966
 967
 968
 969
 970
 971
 972
 973
 974
 975
 976
 977
 978
 979
 980
 981
 982
 983
 984
 985
 986
 987
 988
 989
 990
 991
 992
 993
 994
 995
 996
 997
 998
 999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
def build_options(
    *,
    agent: Agent,
    tool_server: McpSdkServerConfig | None,
    tool_names: list[str],
    mcp_configs: dict[str, Any],
    auth_env: dict[str, str],
    otel_env: dict[str, str],
    mode: str,
) -> ClaudeAgentOptions:
    """Assemble ``ClaudeAgentOptions`` from agent config and bridge outputs.

    Args:
        agent: The agent configuration.
        tool_server: In-process MCP server for vectorstore/hierarchical-doc tools.
        tool_names: Allowed tool names from the in-process server.
        mcp_configs: External MCP server configs from ``build_claude_mcp_configs()``.
        auth_env: Auth env vars from ``validate_credentials()``.
        otel_env: OTel env vars from ``translate_observability()``.
        mode: Execution mode (``"test"`` or ``"chat"``).

    Returns:
        Configured ``ClaudeAgentOptions`` ready for ``query()``.
    """
    claude = agent.claude
    system_prompt = resolve_instructions(agent.instructions)

    # MCP servers
    mcp_servers: dict[str, Any] = dict(mcp_configs)
    if tool_server is not None:
        mcp_servers["holodeck_tools"] = tool_server

    # Env vars — unset CLAUDECODE to prevent the "nested session" guard
    # when HoloDeck runs inside a terminal with Claude Code active.
    # SDK merges: {**os.environ, **options.env}, so "" overrides "1".
    env: dict[str, str] = {"CLAUDECODE": "", **auth_env, **otel_env}

    # Custom base URL — when a custom auth provider is used with an explicit
    # endpoint, forward it as ANTHROPIC_BASE_URL so the SDK subprocess targets
    # the third-party endpoint.  Only inject for auth_provider=custom to avoid
    # accidentally overriding the default Anthropic URL for proxy/rate-limit
    # setups that use endpoint with a standard auth provider.
    if agent.model.endpoint and agent.model.auth_provider == AuthProvider.custom:
        env["ANTHROPIC_BASE_URL"] = agent.model.endpoint

    # Permission mode (raises ConfigError for acceptAll without unsafe opt-in)
    perm_mode = _build_permission_mode(claude, mode)

    # Extended thinking
    max_thinking_tokens = None
    if claude and claude.extended_thinking and claude.extended_thinking.enabled:
        max_thinking_tokens = claude.extended_thinking.budget_tokens

    # Allowed tools
    allowed_tools: list[str] = list(tool_names)
    if claude and claude.allowed_tools:
        allowed_tools.extend(claude.allowed_tools)

    # Built-in capabilities
    if claude and claude.web_search:
        allowed_tools.append("WebSearch")

    # Output format
    output_format = _build_output_format(agent.response_format)

    # Working directory — fall back to agent.yaml's directory so that
    # relative paths in MCP args (e.g. "./data") resolve correctly.
    from holodeck.config.context import agent_base_dir

    cwd = claude.working_directory if claude else None
    if cwd is None:
        cwd = agent_base_dir.get()

    # Max turns — default to _DEFAULT_MAX_TURNS when unset (spec 034 P1a)
    max_turns = (
        claude.max_turns
        if claude and claude.max_turns is not None
        else _DEFAULT_MAX_TURNS
    )

    # Disallowed tools (P1b — spec 034). Auto-disallow risky built-in SDK
    # tools (Bash, Write, Edit, WebFetch) that the operator has not declared
    # via the dedicated schema fields or claude.allowed_tools. This is the
    # real fail-closed lever — permission_mode is only a tiebreaker for tools
    # not covered by either list.
    declared = _declared_builtin_tools(claude)
    auto_disallow = sorted(_RISKY_BUILTIN_TOOLS - declared)
    explicit_disallow = (
        list(claude.disallowed_tools) if claude and claude.disallowed_tools else []
    )
    disallowed_tools: list[str] | None = (
        sorted(set(explicit_disallow) | set(auto_disallow))
        if (explicit_disallow or auto_disallow)
        else None
    )
    if auto_disallow:
        logger.info(
            "Auto-disallowed risky built-in SDK tools not declared in agent "
            "config: %s. Declare them via claude.bash.enabled, "
            "claude.file_system.*, or claude.allowed_tools to opt in.",
            ", ".join(auto_disallow),
        )

    # Default [] → CLI subprocess never inherits ~/.claude plugins/skills/hooks.
    setting_sources: list[str] = (
        list(claude.setting_sources)
        if claude is not None and claude.setting_sources is not None
        else []
    )

    # Build the options dict
    opts_kwargs: dict[str, Any] = {
        "model": agent.model.name,
        "system_prompt": system_prompt,
        "permission_mode": perm_mode,
        "max_turns": max_turns,
        "mcp_servers": mcp_servers,
        "allowed_tools": allowed_tools,
        "env": env,
        "cwd": cwd,
        "output_format": output_format,
        "setting_sources": setting_sources,
    }
    if disallowed_tools is not None:
        opts_kwargs["disallowed_tools"] = disallowed_tools

    if max_thinking_tokens is not None:
        opts_kwargs["max_thinking_tokens"] = max_thinking_tokens

    if claude is not None:
        if claude.effort is not None:
            opts_kwargs["effort"] = claude.effort
        if claude.max_budget_usd is not None:
            opts_kwargs["max_budget_usd"] = claude.max_budget_usd
        if claude.fallback_model is not None:
            opts_kwargs["fallback_model"] = claude.fallback_model
        if claude.agents:
            opts_kwargs["agents"] = {
                name: AgentDefinition(
                    description=spec.description,
                    prompt=spec.prompt or "",
                    tools=spec.tools,
                    model=spec.model,
                )
                for name, spec in claude.agents.items()
            }

    # Spec 034 P2b — merge HoloDeck default hooks (credential-redaction
    # PostToolUse) ahead of any user-supplied hooks. Opt-out by setting
    # `claude.disable_default_hooks: true` in agent.yaml. Bash hardening
    # is handled by SDK permission rules + P1b auto-disallow.
    from holodeck.lib.backends.claude_hooks import build_default_hooks

    user_hooks: dict[Any, list[Any]] = opts_kwargs.get("hooks") or {}
    if claude is not None and claude.disable_default_hooks:
        logger.warning(
            "HoloDeck default hooks DISABLED for agent '%s' (credential "
            "redaction PostToolUse hook will NOT run). OTel attribute "
            "redaction is unaffected. To re-enable, remove "
            "`claude.disable_default_hooks: true` from agent.yaml.",
            agent.name,
        )
        if user_hooks:
            opts_kwargs["hooks"] = user_hooks
    else:
        default_hooks = build_default_hooks()
        merged: dict[Any, list[Any]] = {}
        # Default hooks first so they short-circuit when they deny.
        for event, matchers in default_hooks.items():
            merged[event] = list(matchers)
        for event, matchers in user_hooks.items():
            if event in merged:
                merged[event] = merged[event] + list(matchers)
            else:
                merged[event] = list(matchers)
        if merged:
            opts_kwargs["hooks"] = merged

    # Spec 034 P2b — default-on subprocess env scrubbing. These two flags
    # tell the Claude CLI to strip Anthropic/cloud creds from tool
    # subprocesses (Bash, etc.) and to spawn stdio MCP servers with only
    # their declared env (not the full inherited shell env). They don't
    # affect the SDK subprocess's own env (which inherits everything from
    # this serve process — see P3 for the structural fix). Opt out via
    # `claude.disable_subprocess_env_scrub: true` in agent.yaml.
    env_overrides = dict(opts_kwargs.get("env") or {})
    if claude is not None and claude.disable_subprocess_env_scrub:
        logger.warning(
            "Subprocess env scrubbing DISABLED for agent '%s' "
            "(tool subprocesses and stdio MCP servers will inherit the "
            "full agent container env including credentials). To re-enable, "
            "remove `claude.disable_subprocess_env_scrub: true` from "
            "agent.yaml.",
            agent.name,
        )
    else:
        env_overrides.setdefault("CLAUDE_CODE_SUBPROCESS_ENV_SCRUB", "1")
        # Note: this scrubs the SDK-spawned subprocess env for stdio MCP servers.
        # Operators must declare any inherited env vars (HOME, PATH, provider creds)
        # on the MCP tool's `env` block — see docs/security/prompt-injection-defenses.md
        # §"Operator footgun".
        env_overrides.setdefault("CLAUDE_CODE_MCP_ALLOWLIST_ENV", "1")
    if env_overrides:
        opts_kwargs["env"] = env_overrides

    return ClaudeAgentOptions(**opts_kwargs)

holodeck.lib.backends.tool_adapters -- Claude SDK Tool Adapters

Wraps HoloDeck vectorstore and hierarchical-document tools as @tool-decorated functions, bundles them into an in-process MCP server, and provides a factory for ClaudeBackend to call during initialization.

VectorStoreToolAdapter

VectorStoreToolAdapter(config, instance)

Wraps a VectorStoreTool for use with the Claude Agent SDK.

Parameters:

Name Type Description Default
config VectorstoreTool

The vectorstore tool configuration from the agent YAML.

required
instance VectorStoreTool

An initialized VectorStoreTool instance.

required
Source code in src/holodeck/lib/backends/tool_adapters.py
161
162
163
164
165
166
167
def __init__(
    self,
    config: VectorstoreToolConfig,
    instance: VectorStoreTool,
) -> None:
    self.config = config
    self.instance = instance

to_sdk_tool()

Return an SdkMcpTool backed by this adapter's search method.

Source code in src/holodeck/lib/backends/tool_adapters.py
169
170
171
172
173
174
175
def to_sdk_tool(self) -> SdkMcpTool[Any]:
    """Return an ``SdkMcpTool`` backed by this adapter's search method."""
    name = f"{self.config.name}_search"
    desc = _truncate_description(
        f"Search {self.config.name}: {self.config.description}"
    )
    return _make_vectorstore_search_fn(self.instance, name, desc)

HierarchicalDocToolAdapter

HierarchicalDocToolAdapter(config, instance)

Wraps a HierarchicalDocumentTool for use with the Claude Agent SDK.

Parameters:

Name Type Description Default
config HierarchicalDocumentToolConfig

The hierarchical document tool configuration from the agent YAML.

required
instance HierarchicalDocumentTool

An initialized HierarchicalDocumentTool instance.

required
Source code in src/holodeck/lib/backends/tool_adapters.py
215
216
217
218
219
220
221
def __init__(
    self,
    config: HierarchicalDocumentToolConfig,
    instance: HierarchicalDocumentTool,
) -> None:
    self.config = config
    self.instance = instance

to_sdk_tool()

Return an SdkMcpTool backed by this adapter's search method.

Source code in src/holodeck/lib/backends/tool_adapters.py
223
224
225
226
227
228
229
def to_sdk_tool(self) -> SdkMcpTool[Any]:
    """Return an ``SdkMcpTool`` backed by this adapter's search method."""
    name = f"{self.config.name}_search"
    desc = _truncate_description(
        f"Search {self.config.name}: {self.config.description}"
    )
    return _make_hierarchical_search_fn(self.instance, name, desc)

create_tool_adapters

create_tool_adapters(tool_configs, tool_instances, base_dir=None)

Build adapters for vectorstore, hierarchical-document, and function tools.

Filters tool_configs for supported types, matches each to its initialized instance (vectorstore / hierarchical) or loads the Python callable (function tools) and returns adapter objects.

Parameters:

Name Type Description Default
tool_configs list[ToolUnion]

All tool configurations from the agent YAML.

required
tool_instances dict[str, VectorStoreTool | HierarchicalDocumentTool]

Initialized tool instances keyed by config name.

required
base_dir Path | None

Directory used to resolve relative FunctionTool.file paths. Typically the agent project root. Unused by vectorstore / hierarchical-document adapters.

None

Returns:

Type Description
list[VectorStoreToolAdapter | HierarchicalDocToolAdapter | FunctionToolAdapter]

List of adapter objects ready for build_holodeck_sdk_server().

Raises:

Type Description
BackendInitError

If a supported tool config has no matching instance.

ConfigError

If a function tool fails to load.

Source code in src/holodeck/lib/backends/tool_adapters.py
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
def create_tool_adapters(
    tool_configs: list[ToolUnion],
    tool_instances: dict[str, VectorStoreTool | HierarchicalDocumentTool],
    base_dir: Path | None = None,
) -> list[VectorStoreToolAdapter | HierarchicalDocToolAdapter | FunctionToolAdapter]:
    """Build adapters for vectorstore, hierarchical-document, and function tools.

    Filters *tool_configs* for supported types, matches each to its
    initialized instance (vectorstore / hierarchical) or loads the Python
    callable (function tools) and returns adapter objects.

    Args:
        tool_configs: All tool configurations from the agent YAML.
        tool_instances: Initialized tool instances keyed by config name.
        base_dir: Directory used to resolve relative ``FunctionTool.file``
            paths. Typically the agent project root. Unused by vectorstore /
            hierarchical-document adapters.

    Returns:
        List of adapter objects ready for ``build_holodeck_sdk_server()``.

    Raises:
        BackendInitError: If a supported tool config has no matching instance.
        ConfigError: If a function tool fails to load.
    """
    adapters: list[
        VectorStoreToolAdapter | HierarchicalDocToolAdapter | FunctionToolAdapter
    ] = []

    for cfg in tool_configs:
        if isinstance(cfg, VectorstoreToolConfig):
            instance = tool_instances.get(cfg.name)
            if instance is None:
                raise BackendInitError(
                    f"No initialized instance found for tool '{cfg.name}' "
                    f"(type: {cfg.type}). Ensure tool initialization "
                    "completed before creating adapters."
                )
            adapters.append(
                VectorStoreToolAdapter(
                    config=cfg,
                    instance=instance,  # type: ignore[arg-type]
                )
            )
        elif isinstance(cfg, HierarchicalDocumentToolConfig):
            instance = tool_instances.get(cfg.name)
            if instance is None:
                raise BackendInitError(
                    f"No initialized instance found for tool '{cfg.name}' "
                    f"(type: {cfg.type}). Ensure tool initialization "
                    "completed before creating adapters."
                )
            adapters.append(
                HierarchicalDocToolAdapter(
                    config=cfg,
                    instance=instance,  # type: ignore[arg-type]
                )
            )
        elif isinstance(cfg, FunctionTool):
            func = load_function_tool(cfg, base_dir=base_dir)
            adapters.append(FunctionToolAdapter(config=cfg, callable=func))

    return adapters

build_holodeck_sdk_server

build_holodeck_sdk_server(adapters)

Bundle adapters into an in-process MCP server for the Claude subprocess.

Parameters:

Name Type Description Default
adapters list[VectorStoreToolAdapter | HierarchicalDocToolAdapter | FunctionToolAdapter]

Adapter objects produced by create_tool_adapters().

required

Returns:

Type Description
McpSdkServerConfig

A tuple of (server_config, allowed_tool_names) where

list[str]

server_config is a McpSdkServerConfig TypedDict and

tuple[McpSdkServerConfig, list[str]]

allowed_tool_names are the fully-qualified MCP tool names.

tuple[McpSdkServerConfig, list[str]]

Search-backed adapters contribute <name>_search; function-tool

tuple[McpSdkServerConfig, list[str]]

adapters contribute the raw tool name.

Source code in src/holodeck/lib/backends/tool_adapters.py
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
def build_holodeck_sdk_server(
    adapters: list[
        VectorStoreToolAdapter | HierarchicalDocToolAdapter | FunctionToolAdapter
    ],
) -> tuple[McpSdkServerConfig, list[str]]:
    """Bundle adapters into an in-process MCP server for the Claude subprocess.

    Args:
        adapters: Adapter objects produced by ``create_tool_adapters()``.

    Returns:
        A tuple of ``(server_config, allowed_tool_names)`` where
        *server_config* is a ``McpSdkServerConfig`` TypedDict and
        *allowed_tool_names* are the fully-qualified MCP tool names.
        Search-backed adapters contribute ``<name>_search``; function-tool
        adapters contribute the raw tool name.
    """
    sdk_tools: list[SdkMcpTool[Any]] = [a.to_sdk_tool() for a in adapters]

    server_config: McpSdkServerConfig = create_sdk_mcp_server(
        name=_SERVER_NAME,
        tools=sdk_tools,
    )

    allowed_tools: list[str] = []
    for a in adapters:
        if isinstance(a, FunctionToolAdapter):
            allowed_tools.append(f"mcp__{_SERVER_NAME}__{a.config.name}")
        else:
            allowed_tools.append(f"mcp__{_SERVER_NAME}__{a.config.name}_search")

    return server_config, allowed_tools

holodeck.lib.backends.mcp_bridge -- MCP Configuration Bridge

Translates HoloDeck MCPTool configurations into Claude Agent SDK McpStdioServerConfig format for subprocess-based MCP servers. Only stdio transport tools are supported.

build_claude_mcp_configs

build_claude_mcp_configs(mcp_tools)

Translate HoloDeck MCPTool configs to Claude SDK MCP server configs.

Only stdio transport tools are supported by the Claude subprocess. Non-stdio tools are skipped with a warning.

Parameters:

Name Type Description Default
mcp_tools list[MCPTool]

List of MCPTool configurations from agent YAML.

required

Returns:

Type Description
dict[str, McpStdioServerConfig]

Dictionary mapping tool names to McpStdioServerConfig TypedDicts.

Source code in src/holodeck/lib/backends/mcp_bridge.py
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
def build_claude_mcp_configs(
    mcp_tools: list[MCPTool],
) -> dict[str, McpStdioServerConfig]:
    """Translate HoloDeck MCPTool configs to Claude SDK MCP server configs.

    Only stdio transport tools are supported by the Claude subprocess.
    Non-stdio tools are skipped with a warning.

    Args:
        mcp_tools: List of MCPTool configurations from agent YAML.

    Returns:
        Dictionary mapping tool names to McpStdioServerConfig TypedDicts.
    """
    from holodeck.config.context import agent_base_dir

    base_dir = agent_base_dir.get()

    configs: dict[str, McpStdioServerConfig] = {}

    for tool in mcp_tools:
        if tool.transport != TransportType.STDIO:
            logger.warning(
                "Skipping MCP tool '%s': %s transport is not supported by "
                "Claude subprocess (only stdio is supported)",
                tool.name,
                tool.transport.value,
            )
            continue

        command = tool.command.value if tool.command else "npx"
        args = _resolve_relative_args(tool.args or [], base_dir)
        env = _resolve_mcp_env(tool)

        entry: McpStdioServerConfig = {
            "command": command,
            "args": args,
        }

        if env:
            entry["env"] = env

        configs[tool.name] = entry

    return configs

holodeck.lib.backends.otel_bridge -- Observability Bridge

Translates HoloDeck ObservabilityConfig into environment variable dicts that configure OpenTelemetry for the Claude subprocess.

translate_observability

translate_observability(config)

Translate ObservabilityConfig to env vars for the Claude subprocess.

Produces a dict of environment variable key-value pairs that configure OpenTelemetry in the Claude subprocess. All values are strings.

Parameters:

Name Type Description Default
config ObservabilityConfig

HoloDeck observability configuration from agent YAML.

required

Returns:

Type Description
dict[str, str]

Dictionary of environment variable names to string values.

dict[str, str]

Empty dict if observability is disabled.

Source code in src/holodeck/lib/backends/otel_bridge.py
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
def translate_observability(config: ObservabilityConfig) -> dict[str, str]:
    """Translate ObservabilityConfig to env vars for the Claude subprocess.

    Produces a dict of environment variable key-value pairs that configure
    OpenTelemetry in the Claude subprocess. All values are strings.

    Args:
        config: HoloDeck observability configuration from agent YAML.

    Returns:
        Dictionary of environment variable names to string values.
        Empty dict if observability is disabled.
    """
    if not config.enabled:
        return {}

    env: dict[str, str] = {}

    # Core telemetry enablement
    env["CLAUDE_CODE_ENABLE_TELEMETRY"] = "1"

    # Disable subprocess trace export when Python-side GenAI instrumentation
    # is active.  The instrumentor's hooks produce the authoritative
    # invoke_agent / execute_tool spans in-process; letting the subprocess
    # also export traces creates duplicate, unlinked trace trees in the
    # collector and confuses dashboards like Aspire.
    if config.traces.enabled:
        env["OTEL_TRACES_EXPORTER"] = "none"

    # OTLP exporter configuration
    otlp = config.exporters.otlp
    otlp_enabled = otlp is not None and otlp.enabled

    # Metrics exporter
    if otlp_enabled and config.metrics.enabled:
        env["OTEL_METRICS_EXPORTER"] = "otlp"
    else:
        env["OTEL_METRICS_EXPORTER"] = "none"

    # Logs exporter
    if otlp_enabled and config.logs.enabled:
        env["OTEL_LOGS_EXPORTER"] = "otlp"
    else:
        env["OTEL_LOGS_EXPORTER"] = "none"

    # Protocol and endpoint (only if OTLP is enabled)
    if otlp_enabled and otlp is not None:
        env["OTEL_EXPORTER_OTLP_PROTOCOL"] = _PROTOCOL_MAP[otlp.protocol]
        env["OTEL_EXPORTER_OTLP_ENDPOINT"] = otlp.endpoint

    # Export intervals
    interval = str(config.metrics.export_interval_ms)
    env["OTEL_METRIC_EXPORT_INTERVAL"] = interval
    env["OTEL_LOGS_EXPORT_INTERVAL"] = interval

    # Privacy controls (FR-038: default off)
    if config.traces.capture_content:
        env["OTEL_LOG_USER_PROMPTS"] = "true"
        env["OTEL_LOG_TOOL_DETAILS"] = "true"

    # Warn about unsupported fields
    unsupported = _collect_unsupported_fields(config)
    if unsupported:
        logger.warning(
            "Claude subprocess does not support these observability settings "
            "(they will be ignored): %s",
            ", ".join(unsupported),
        )

    return env

holodeck.lib.backends.validators -- Startup Validators

Pre-flight checks called by ClaudeBackend.initialize() before spawning the Claude subprocess. These surface configuration errors at startup rather than at runtime.

validate_nodejs

validate_nodejs(agent)

Validate that Node.js is available on PATH when the agent needs it.

Node.js is only required when at least one MCP tool spawns a Node interpreter (node, npx, yarn, pnpm). Agents without such tools skip this check entirely.

Parameters:

Name Type Description Default
agent Agent

Agent configuration to inspect for Node-dependent tools.

required

Raises:

Type Description
ConfigError

If node is not found on PATH and the agent needs it.

Source code in src/holodeck/lib/backends/validators.py
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
def validate_nodejs(agent: Agent) -> None:
    """Validate that Node.js is available on PATH when the agent needs it.

    Node.js is only required when at least one MCP tool spawns a Node
    interpreter (node, npx, yarn, pnpm).  Agents without such tools skip
    this check entirely.

    Args:
        agent: Agent configuration to inspect for Node-dependent tools.

    Raises:
        ConfigError: If node is not found on PATH and the agent needs it.
    """
    if not agent_needs_nodejs(agent):
        return  # No Node MCP tools — installation unnecessary.

    if shutil.which("node") is None:
        raise ConfigError(
            "nodejs",
            "Node.js is required to run Claude Agent SDK but was not found on "
            "PATH. Install Node.js from https://nodejs.org/ and ensure it is "
            "on your PATH.",
        )

    stdout = ""
    try:
        result = subprocess.run(  # noqa: S603 # nosec B603 B607
            ["node", "--version"],  # noqa: S607
            capture_output=True,
            text=True,
            timeout=5,
        )
        stdout = result.stdout.strip()
        version = stdout.lstrip("v")
        major = int(version.split(".")[0])
    except subprocess.TimeoutExpired as e:
        raise ConfigError(
            "nodejs",
            "Node.js version check timed out. Ensure 'node --version' "
            "completes quickly.",
        ) from e
    except subprocess.CalledProcessError as e:
        raise ConfigError(
            "nodejs",
            "Failed to determine Node.js version. Ensure 'node --version' " "works.",
        ) from e
    except ValueError as e:
        raise ConfigError(
            "nodejs",
            f"Could not parse Node.js version from output: {stdout}",
        ) from e

    if major < NODEJS_MIN_VERSION:
        raise ConfigError(
            "nodejs",
            f"Node.js version {version} found but >= {NODEJS_MIN_VERSION} is "
            "required by Claude Agent SDK. Download from https://nodejs.org/",
        )

    logger.debug(f"Node.js version {version} validated (>= {NODEJS_MIN_VERSION})")

validate_credentials

validate_credentials(model)

Validate authentication credentials for the LLM provider.

Checks that the required environment variables are present for the configured auth_provider, including cloud routing context for Bedrock, Vertex, and Foundry. Returns a dict of environment variables to inject into the Claude subprocess.

Parameters:

Name Type Description Default
model LLMProvider

LLM provider configuration.

required

Returns:

Type Description
dict[str, str]

Dict of environment variables to set for the subprocess.

Raises:

Type Description
ConfigError

If required credentials are absent.

Source code in src/holodeck/lib/backends/validators.py
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
def validate_credentials(model: LLMProvider) -> dict[str, str]:
    """Validate authentication credentials for the LLM provider.

    Checks that the required environment variables are present for the
    configured auth_provider, including cloud routing context for Bedrock,
    Vertex, and Foundry. Returns a dict of environment variables to inject
    into the Claude subprocess.

    Args:
        model: LLM provider configuration.

    Returns:
        Dict of environment variables to set for the subprocess.

    Raises:
        ConfigError: If required credentials are absent.
    """
    auth = model.auth_provider or AuthProvider.api_key

    if auth == AuthProvider.api_key:
        key = _get_required_env_var(
            "ANTHROPIC_API_KEY",
            "ANTHROPIC_API_KEY environment variable is not set. "
            "Set it with: export ANTHROPIC_API_KEY=sk-ant-...",
        )
        return {"ANTHROPIC_API_KEY": key}

    if auth == AuthProvider.oauth_token:
        token = _get_required_env_var(
            "CLAUDE_CODE_OAUTH_TOKEN",
            "CLAUDE_CODE_OAUTH_TOKEN environment variable is not set. "
            "Run `claude setup-token` to authenticate with OAuth.",
        )
        return {"CLAUDE_CODE_OAUTH_TOKEN": token}

    if auth == AuthProvider.bedrock:
        bedrock_region = _get_first_present_env_var(_BEDROCK_REGION_ENV_CANDIDATES)
        if bedrock_region is None:
            raise ConfigError(
                "AWS_REGION",
                "Missing AWS region for auth_provider: bedrock. Set either "
                "AWS_REGION or AWS_DEFAULT_REGION (for example: "
                "export AWS_REGION=us-east-1).",
            )
        return {
            "CLAUDE_CODE_USE_BEDROCK": "1",
            bedrock_region[0]: bedrock_region[1],
        }

    if auth == AuthProvider.vertex:
        region = _get_required_env_var(
            "CLOUD_ML_REGION",
            "CLOUD_ML_REGION environment variable is not set for "
            "auth_provider: vertex. Set it with: "
            "export CLOUD_ML_REGION=us-east5",
        )
        project_context = _get_first_present_env_var(_VERTEX_PROJECT_ENV_CANDIDATES)
        if project_context is None:
            raise ConfigError(
                "ANTHROPIC_VERTEX_PROJECT_ID",
                "Missing Vertex project context for auth_provider: vertex. "
                "Set one of: ANTHROPIC_VERTEX_PROJECT_ID, GCLOUD_PROJECT, "
                "GOOGLE_CLOUD_PROJECT, or GOOGLE_APPLICATION_CREDENTIALS.",
            )
        return {
            "CLAUDE_CODE_USE_VERTEX": "1",
            "CLOUD_ML_REGION": region,
            project_context[0]: project_context[1],
        }

    if auth == AuthProvider.custom:
        token = _get_required_env_var(
            "ANTHROPIC_AUTH_TOKEN",
            "ANTHROPIC_AUTH_TOKEN environment variable is not set. "
            "Set it for custom endpoint authentication "
            "(e.g., export ANTHROPIC_AUTH_TOKEN=ollama).",
        )
        return {"ANTHROPIC_AUTH_TOKEN": token}

    # AuthProvider.foundry (fallthrough)
    foundry_target = _get_first_present_env_var(_FOUNDRY_TARGET_ENV_CANDIDATES)
    if foundry_target is None:
        raise ConfigError(
            "ANTHROPIC_FOUNDRY_RESOURCE",
            "Missing Foundry target for auth_provider: foundry. "
            "Set one of: ANTHROPIC_FOUNDRY_RESOURCE or "
            "ANTHROPIC_FOUNDRY_BASE_URL.",
        )
    return {"CLAUDE_CODE_USE_FOUNDRY": "1", foundry_target[0]: foundry_target[1]}

validate_embedding_provider

validate_embedding_provider(agent)

Validate embedding provider configuration for vectorstore tools.

Anthropic does not support generating embeddings, so an external embedding_provider must be specified when using vectorstore tools with the Anthropic LLM provider.

Parameters:

Name Type Description Default
agent Agent

Agent configuration to validate.

required

Raises:

Type Description
ConfigError

If embedding configuration is invalid for the provider.

Source code in src/holodeck/lib/backends/validators.py
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
def validate_embedding_provider(agent: Agent) -> None:
    """Validate embedding provider configuration for vectorstore tools.

    Anthropic does not support generating embeddings, so an external
    embedding_provider must be specified when using vectorstore tools
    with the Anthropic LLM provider.

    Args:
        agent: Agent configuration to validate.

    Raises:
        ConfigError: If embedding configuration is invalid for the provider.
    """
    if not agent.tools:
        return

    has_vectorstore = any(
        isinstance(tool, VectorstoreTool | HierarchicalDocumentToolConfig)
        for tool in agent.tools
    )
    if not has_vectorstore:
        return

    # Anthropic cannot be used as an embedding provider
    if agent.embedding_provider is not None:
        if agent.embedding_provider.provider == ProviderEnum.ANTHROPIC:
            raise ConfigError(
                "embedding_provider",
                "Anthropic cannot generate embeddings. Use a different provider "
                "such as openai or azure_openai for embedding_provider when "
                "using vectorstore tools.",
            )
        return

    # Anthropic LLM + vectorstore tool + no embedding_provider
    if agent.model.provider == ProviderEnum.ANTHROPIC:
        raise ConfigError(
            "embedding_provider",
            "embedding_provider is required when using vectorstore tools with "
            "provider: anthropic. Add an embedding_provider using openai or "
            "azure_openai.",
        )

validate_working_directory

validate_working_directory(path)

Warn if CLAUDE.md in working directory may conflict with agent instructions.

Detects a CLAUDE.md file that contains a '# CLAUDE.md' header, which is the standard format used by Claude Code project instructions. Such a file may override or conflict with the agent's configured instructions.

Parameters:

Name Type Description Default
path str | None

Working directory path, or None to skip validation.

required
Source code in src/holodeck/lib/backends/validators.py
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
def validate_working_directory(path: str | None) -> None:
    """Warn if CLAUDE.md in working directory may conflict with agent instructions.

    Detects a CLAUDE.md file that contains a '# CLAUDE.md' header, which
    is the standard format used by Claude Code project instructions. Such
    a file may override or conflict with the agent's configured instructions.

    Args:
        path: Working directory path, or None to skip validation.
    """
    if path is None:
        return

    claude_md = Path(path) / "CLAUDE.md"
    if not claude_md.exists():
        return

    content = claude_md.read_text()
    if "# CLAUDE.md" in content:
        logger.warning(
            "A CLAUDE.md file with a '# CLAUDE.md' header was found in the "
            "working directory '%s'. This may conflict with the agent's system "
            "instructions. Review the file to avoid unexpected behavior.",
            path,
        )

validate_response_format

validate_response_format(response_format)

Validate response format schema is serializable and accessible.

Parameters:

Name Type Description Default
response_format dict[str, Any] | str | None

Inline schema dict, file path string, or None.

required

Raises:

Type Description
ConfigError

If the schema is not JSON-serializable or file not found.

Source code in src/holodeck/lib/backends/validators.py
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
def validate_response_format(response_format: dict[str, Any] | str | None) -> None:
    """Validate response format schema is serializable and accessible.

    Args:
        response_format: Inline schema dict, file path string, or None.

    Raises:
        ConfigError: If the schema is not JSON-serializable or file not found.
    """
    if response_format is None:
        return

    if isinstance(response_format, str):
        schema_path = Path(response_format)
        if not schema_path.exists():
            raise ConfigError(
                "response_format",
                f"response_format file not found: {response_format}",
            )
        try:
            json.loads(schema_path.read_text())
        except (json.JSONDecodeError, OSError) as e:
            raise ConfigError(
                "response_format",
                f"response_format file is not valid JSON: {e}",
            ) from e
        return

    # Must be a dict — verify it is JSON-serializable
    try:
        json.dumps(response_format)
    except (TypeError, ValueError) as e:
        raise ConfigError(
            "response_format",
            f"response_format contains non-JSON-serializable values: {e}",
        ) from e