Tools Reference Guide

This guide explains HoloDeck's five tool types that extend agent capabilities.

Overview

Tools are agent capabilities defined in agent.yaml. HoloDeck supports five tool types:

Tool Type Description Status
Vectorstore Tools Semantic search over data ✅ Implemented
Hierarchical Document Tools Structure-aware hybrid document search ✅ Implemented
MCP Tools Model Context Protocol servers ✅ Implemented
Function Tools Custom Python functions ✅ Implemented
Prompt Tools LLM-powered semantic functions 🚧 Planned

Note: Vectorstore Tools, Hierarchical Document Tools, MCP Tools, and Function Tools are fully implemented. Prompt tools are defined in the configuration schema but not yet functional.

Tool Filtering

Tool filtering automatically reduces tool context per request by selecting only the most relevant tools. It builds an in-memory index of tool metadata and uses Semantic Kernel's FunctionChoiceBehavior to include a filtered set of tools for each query.

Agent Configuration

# agent.yaml
name: support-agent
model:
  provider: openai
  name: gpt-4o-mini
instructions:
  file: instructions/system.md

tool_filtering:
  enabled: true
  top_k: 5
  similarity_threshold: 0.3
  always_include_top_n_used: 3
  search_method: semantic


tools:
  - name: filesystem
    type: mcp
    description: Read and write files
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "./data"]

Configuration Knobs

  • top_k: Max tools per request (includes always-included tools)
  • similarity_threshold: Filter out tools scoring below this value
  • always_include: Tool names always available (full name or function name suffix)
  • always_include_top_n_used: Keep most-used tools in context (usage-aware)
  • search_method: semantic, bm25, or hybrid

Sensible Defaults

Parameter Default Rationale
top_k 5 Enough tools for most tasks without token bloat
similarity_threshold 0.3 Include tools at least 30% as relevant as top result
always_include [] Agent-specific—add your critical tools here
always_include_top_n_used 3 Keep frequently used tools in context
search_method semantic Best semantic match for short prompts

Threshold Tuning by Search Method

All search methods return normalized scores in the 0-1 range, so similarity_threshold is consistent across methods.

Method Good Match Range Recommended Threshold
semantic 0.4 - 0.6 0.3 - 0.4
bm25 (normalized) 0.8 - 1.0 0.5 - 0.6
hybrid (normalized) 0.8 - 1.0 0.5 - 0.6

A threshold of 0.3 means "include tools scoring at least 30% of what the top result scores."
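
The selection step itself is small. Here is a minimal sketch of the relative-threshold logic (the filter_tools helper is hypothetical; the real implementation indexes tool metadata and hands the surviving tools to Semantic Kernel's FunctionChoiceBehavior):

def filter_tools(
    scored_tools: list[tuple[str, float]],
    top_k: int = 5,
    similarity_threshold: float = 0.3,
) -> list[str]:
    """Keep up to top_k tools scoring at least
    similarity_threshold times the best score."""
    if not scored_tools:
        return []
    ranked = sorted(scored_tools, key=lambda t: t[1], reverse=True)
    cutoff = ranked[0][1] * similarity_threshold
    return [name for name, score in ranked[:top_k] if score >= cutoff]

# With a top score of 0.6 and threshold 0.3, the cutoff is 0.18:
filter_tools([("search_kb", 0.6), ("filesystem", 0.25), ("subtract", 0.05)])
# -> ["search_kb", "filesystem"]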

Tip: always_include_top_n_used tracks usage across requests, so early or accidental tool calls can bias results. Set it to 0 during development if you want to avoid usage bias.

Common Tool Fields

All tools share these fields:

tools:
  - name: tool-id # Required: Tool identifier (unique)
    description: What it does # Required: Human-readable description
    type: vectorstore|hierarchical_document|function|mcp|prompt # Required: Tool type

Name

  • Required: Yes
  • Type: String
  • Format: 1-100 characters, alphanumeric + underscores
  • Uniqueness: Must be unique within agent
  • Purpose: Used to reference tool in test cases, execution logs
- name: search_kb

Description

  • Required: Yes
  • Type: String
  • Max Length: 500 characters
  • Purpose: Helps agent understand when to use this tool
- description: Search company knowledge base for answers

Type

  • Required: Yes
  • Type: String (Enum)
  • Options: vectorstore, hierarchical_document, function, mcp, prompt
  • Purpose: Determines which additional fields are required
- type: vectorstore

Vectorstore Tools ✅

Status: Fully implemented

Anthropic Users: Embedding Provider Required

Anthropic does not provide embedding models. When using model.provider: anthropic with vectorstore tools, you must define embedding_provider at the agent level. See Embedding Provider.

Semantic search over unstructured or structured data.

When to Use

  • Searching documents, knowledge bases, FAQs
  • Semantic similarity matching
  • Context retrieval for RAG (Retrieval-Augmented Generation)

Basic Example

- name: search-kb
  description: Search knowledge base for answers
  type: vectorstore
  source: knowledge_base/

Supported Vector Database Providers

HoloDeck supports multiple vector database backends through Semantic Kernel's VectorStoreCollection abstractions. You can switch providers via configuration without changing your agent code.

Provider Description Connection Install Command
postgres PostgreSQL with pgvector extension postgresql://user:pass@host/db uv add holodeck-ai[postgres]
qdrant Qdrant vector database http://localhost:6333 uv add holodeck-ai[qdrant]
chromadb ChromaDB (local or server) Local path or host URL uv add holodeck-ai[chromadb]
pinecone Pinecone serverless vector database API key + index name uv add holodeck-ai[pinecone]
in-memory Simple in-memory storage None required Built-in

Tip: Install all vector store providers at once with uv add holodeck-ai[vectorstores]. Use in-memory for development and testing without installing any dependencies. Switch to a persistent provider like postgres, qdrant, or chromadb for production.

Database Configuration Examples

PostgreSQL with pgvector

- name: search-kb
  type: vectorstore
  source: knowledge_base/
  database:
    provider: postgres
    connection_string: postgresql://user:password@localhost:5432/mydb

Azure AI Search

- name: search-kb
  type: vectorstore
  source: knowledge_base/
  database:
    provider: azure-ai-search
    connection_string: ${AZURE_SEARCH_ENDPOINT}
    api_key: ${AZURE_SEARCH_API_KEY}

Qdrant

- name: search-kb
  type: vectorstore
  source: knowledge_base/
  database:
    provider: qdrant
    url: http://localhost:6333
    # api_key: optional-api-key

In-Memory (development only)

- name: search-kb
  type: vectorstore
  source: knowledge_base/
  database:
    provider: in-memory

Reference to Global Config

You can also reference a named vectorstore from your global config.yaml:

# In agent.yaml
- name: search-kb
  type: vectorstore
  source: knowledge_base/
  database: my-postgres-store # Reference to config.yaml vectorstores section

# In config.yaml
vectorstores:
  my-postgres-store:
    provider: postgres
    connection_string: ${DATABASE_URL}

Required Fields

Source

  • Type: String (path)
  • Purpose: Data file or directory to index
  • Formats Supported:
  • Single files: .txt, .md, .pdf, .json, .csv
  • Directories: Recursively indexes supported formats
  • Remote URLs: Files are automatically cached locally
source: knowledge_base/
# OR
source: docs.json
# OR
source: https://example.com/data.pdf

Optional Fields

Embedding Model

  • Type: String
  • Purpose: Which embedding model to use
  • Default: Provider-specific default
  • Examples: text-embedding-3-small, text-embedding-ada-002, nomic-embed-text:latest
embedding_model: text-embedding-3-small

Embedding provider behaviour by backend:

  • Semantic Kernel backend (OpenAI, Azure OpenAI, Ollama): Uses the agent's own model provider for embeddings. The embedding_model field on the tool selects which embedding model to use from that provider.
  • Claude Agent SDK backend (Anthropic): Anthropic does not offer embedding models. The agent-level embedding_provider is used instead, and the tool's embedding_model field selects the model within that provider.

Mismatched embedding models across tools

If your agent defines both vectorstore and hierarchical_document tools with different embedding_model values, HoloDeck will raise a validation error at startup. All embedding-based tools in an agent must use the same embedding model because they share a single embedding provider instance.

Vector Field

  • Type: String or List of strings
  • Purpose: Which field(s) to vectorize (for JSON/CSV)
  • Default: Auto-detect text fields
  • Note: XOR with vector_fields (use one or the other)
vector_field: content
# OR
vector_field: [title, description]

Meta Fields

  • Type: List of strings
  • Purpose: Metadata fields to include in results
  • Default: All fields included
meta_fields: [title, source, date]

Chunk Size

  • Type: Integer
  • Purpose: Characters per chunk for text splitting
  • Default: 512
  • Constraint: Must be > 0
chunk_size: 1024

Chunk Overlap

  • Type: Integer
  • Purpose: Characters to overlap between chunks
  • Default: 0
  • Constraint: Must be >= 0
chunk_overlap: 100
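
To see how chunk_size and chunk_overlap interact, here is a minimal character-based splitter (an illustration of the semantics only, not HoloDeck's actual splitter):

def split_text(text: str, chunk_size: int = 512, chunk_overlap: int = 0) -> list[str]:
    """Split text into chunks of at most chunk_size characters, where
    each chunk repeats the last chunk_overlap characters of the previous one."""
    if chunk_size <= 0 or chunk_overlap < 0 or chunk_overlap >= chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("a" * 2000, chunk_size=1024, chunk_overlap=128)
# -> 1024-char, 1024-char, then a 208-char remainder; adjacent
#    chunks share 128 characters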

Record Path

  • Type: String
  • Purpose: Path to array in nested JSON (dot notation)
  • Example: For {data: {items: [{...}]}}, use data.items
record_path: data.records
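
record_path walks nested keys with dot notation. A small sketch of the lookup (resolve_record_path is a hypothetical helper, shown only to illustrate the semantics):

import json

def resolve_record_path(document: dict, record_path: str) -> list:
    """Follow dot-separated keys down to the array of records."""
    node = document
    for key in record_path.split("."):
        node = node[key]
    return node

doc = json.loads('{"data": {"records": [{"id": 1, "title": "Article 1"}]}}')
resolve_record_path(doc, "data.records")
# -> [{"id": 1, "title": "Article 1"}]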

Record Prefix

  • Type: String
  • Purpose: Prefix added to record fields
  • Default: None
record_prefix: record_

Meta Prefix

  • Type: String
  • Purpose: Prefix added to metadata fields
  • Default: None
meta_prefix: meta_

Complete Example

- name: search-docs
  description: Search technical documentation
  type: vectorstore
  source: docs/
  embedding_model: text-embedding-3-small
  vector_field: [title, content]
  meta_fields: [source, date, url]
  chunk_size: 1024
  chunk_overlap: 128

Data Format Examples

Text Files (.txt, .md)

# Document Title

This is the document content that will be
vectorized for semantic search.

JSON (Array of objects)

[
  {
    "title": "Getting Started",
    "content": "How to get started with the platform...",
    "source": "docs/intro.md"
  }
]

JSON (Nested structure)

{
  "data": {
    "records": [
      {
        "id": 1,
        "title": "Article 1",
        "content": "..."
      }
    ]
  }
}

Use record_path: data.records to access records.

CSV

title,content,source
"Getting Started","How to get started...","docs/intro"
"API Reference","API documentation...","docs/api"

Hierarchical Document Tools ✅

Status: Fully implemented

Anthropic Users: Embedding Provider Required

Anthropic does not provide embedding models. When using model.provider: anthropic with hierarchical document tools, you must define embedding_provider at the agent level. See Embedding Provider.

Structure-aware hybrid document search combining semantic, keyword, and exact match modalities with contextual embeddings for superior retrieval quality.

When to Use

  • Searching structured/legal/regulatory documents with section hierarchy
  • Hybrid search combining semantic, keyword, and exact match
  • RAG pipelines needing contextual embeddings (Anthropic approach — 49% better retrieval)
  • Documents with heading-based structure (markdown, PDF, Word)

Basic Example

tools:
  - name: docs_search
    type: hierarchical_document
    description: "Search company documentation"
    source: "./docs/"

The tool automatically:

  1. Converts documents to markdown (via markitdown)
  2. Parses hierarchical structure from headings
  3. Chunks by structure (max 800 tokens)
  4. Generates LLM context for each chunk (Anthropic contextual retrieval approach)
  5. Generates embeddings from contextualized text
  6. Builds BM25 and exact match indices
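
Step 2 is the interesting one. A rough sketch of heading-based structure chunking (illustrative only; the chunk shape mirrors the parent-chain idea described above, not HoloDeck's exact internals, and token-budget enforcement is omitted):

import re

def chunk_by_headings(markdown: str) -> list[dict]:
    """Split markdown at headings, recording the chain of ancestor
    headings (e.g. ["Chapter 5", "Section 5.3"]) for each chunk."""
    chunks: list[dict] = []
    parents: list[str] = []
    current = None
    body: list[str] = []

    def flush():
        if current is not None and body:
            chunks.append({
                "heading": current,
                "parent_chain": list(parents),
                "text": "\n".join(body).strip(),
            })

    for line in markdown.splitlines():
        match = re.match(r"^(#+)\s+(.*)", line)
        if match:
            flush()
            level = len(match.group(1))
            del parents[level - 1:]  # keep only ancestors above this level
            current = match.group(2)
            parents.append(current)
            body = []
        else:
            body.append(line)
    flush()
    return chunks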

Required Fields

Source

  • Type: String (path)
  • Purpose: Path to document file or directory to index
  • Formats Supported: .md, .pdf, .docx, .txt
  • Note: Directories are recursively indexed
source: "./docs/"
# OR
source: "./legal/compliance-manual.pdf"

Optional Fields

Chunking

Field Type Default Range Description
chunking_strategy Enum structure structure, token structure parses markdown headings; token uses fixed token-based splitting
max_chunk_tokens Integer 800 100–2000 Maximum tokens per chunk
chunk_overlap Integer 50 0–200 Token overlap between chunks
chunking_strategy: structure
max_chunk_tokens: 1000
chunk_overlap: 100

Document Domain

Field Type Default Description
document_domain Enum none Domain-specific subsection detection patterns
max_subsection_depth Integer None Maximum subsection nesting depth (None = unlimited)

Available domains:

Domain Description Example Patterns
none No subsection patterns (default)
us_legislative US Code style (a), (1), (A), (i) hierarchy
au_legislative Australian style (1), (a), (i), (A) hierarchy
academic Academic papers 1., 1.1, 1.1.1 numbered sections
technical Technical manuals Step 1, 1.1, Note:, Warning:
legal_contract Legal contracts Article I, Section 1, (a) clauses
document_domain: us_legislative
max_subsection_depth: 3

Search Configuration

Field Type Default Range Description
search_mode Enum hybrid semantic, keyword, exact, hybrid Search modality
top_k Integer 10 1–100 Number of results to return
min_score Float None 0.0–1.0 Minimum similarity score threshold
semantic_weight Float 0.5 0.0–1.0 Weight for semantic search in hybrid mode
keyword_weight Float 0.3 0.0–1.0 Weight for keyword search in hybrid mode
exact_weight Float 0.2 0.0–1.0 Weight for exact match in hybrid mode
rrf_k Integer 60 ≥ 1 Reciprocal Rank Fusion constant

Note: In hybrid mode, semantic_weight + keyword_weight + exact_weight must sum to 1.0.

search_mode: hybrid
top_k: 15
semantic_weight: 0.4
keyword_weight: 0.4
exact_weight: 0.2

Contextual Embeddings

Field Type Default Range Description
contextual_embeddings Boolean true Enable LLM-generated context per chunk
context_max_tokens Integer 100 50–200 Max tokens for context summary
context_concurrency Integer 10 1–50 Parallel LLM context generation requests

Contextual embeddings prepend an LLM-generated summary to each chunk before embedding, yielding ~49% better retrieval accuracy (per Anthropic research). Uses Claude Haiku by default.

Embedding provider behaviour by backend:

  • Semantic Kernel backend (OpenAI, Azure OpenAI, Ollama): Uses the agent's own model provider for embeddings. The tool's embedding_model field (if set) selects the model within that provider.
  • Claude Agent SDK backend (Anthropic): Anthropic does not offer embedding models. The agent-level embedding_provider is used instead for all embedding generation.
contextual_embeddings: true
context_max_tokens: 100
context_concurrency: 15
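
The contextualization step is prompt-and-prepend. A minimal sketch of the idea (the prompt wording and the generate helper are placeholders, not HoloDeck's internals):

def contextualize_chunk(document: str, chunk: str, generate, max_tokens: int = 100) -> str:
    """Prepend a short LLM-written context to a chunk before embedding."""
    prompt = (
        "Here is a document:\n"
        f"{document}\n\n"
        "Write one or two sentences situating the following chunk "
        f"within the document:\n{chunk}"
    )
    context = generate(prompt, max_tokens=max_tokens)  # e.g. Claude Haiku by default
    return f"{context}\n\n{chunk}"

# Both the dense vector and the BM25 entry are built from the
# contextualized text, e.g. embed(contextualize_chunk(doc, chunk, generate)).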

Feature Extraction

Field Type Default Description
extract_definitions Boolean true Extract and index term definitions from documents
extract_cross_references Boolean true Extract and resolve cross-references between sections
extract_definitions: true
extract_cross_references: true

Reranking

Field Type Default Description
enable_reranking Boolean false Enable LLM-based reranking of search results
reranker_model LLMProvider None LLM config for reranking (required when enable_reranking: true)
enable_reranking: true
reranker_model:
  provider: openai
  name: gpt-4o-mini
  temperature: 0.0

Storage

Field Type Default Description
database DatabaseConfig / String None Vector database configuration or named reference (None = in-memory)
keyword_index KeywordIndexConfig None Keyword index backend configuration (None = in-memory BM25)

See Vectorstore Tools — Database Configuration for database provider options.

Search Modes

Mode Use Case Example Query
semantic Conceptual questions "What are the compliance requirements?"
keyword Technical terms "authentication timeout error"
exact Section references "Section 203(a)(1)"
hybrid General purpose (default) "reporting requirements in Section 5"

Keyword Index Configuration

In-Memory BM25 (Default)

Zero-configuration — uses rank_bm25 in-process. This is the default when keyword_index is omitted or set to in-memory.

tools:
  - name: docs_search
    type: hierarchical_document
    description: "Search with in-memory BM25 keyword index"
    source: "./docs/"
    search_mode: hybrid
    keyword_index:
      provider: in-memory  # Optional — this is the default

OpenSearch (Production)

For production workloads, use an external OpenSearch cluster for the keyword index. This offloads BM25 scoring to a dedicated search engine, suitable for large corpora and multi-instance deployments.

tools:
  - name: docs_search
    type: hierarchical_document
    description: "Search with OpenSearch keyword backend"
    source: "./docs/"
    search_mode: hybrid
    keyword_index:
      provider: opensearch
      endpoint: "https://search.example.com:9200"
      index_name: "my-keyword-index"
      username: "${OPENSEARCH_USERNAME}"
      password: "${OPENSEARCH_PASSWORD}"
      verify_certs: true
      timeout_seconds: 10

KeywordIndexConfig fields (OpenSearch):

Field Type Default Description
provider Enum in-memory in-memory or opensearch
endpoint String None OpenSearch endpoint URL (required for opensearch)
index_name String None OpenSearch index name (required for opensearch)
username String None Basic auth username
password String None Basic auth password
api_key String None API key auth (alternative to basic auth)
verify_certs Boolean true Verify TLS certificates
timeout_seconds Integer 10 Connection timeout in seconds (1–120)

OpenSearch Docker Setup

Development (security disabled)

docker run -d -p 9200:9200 -p 9600:9600 \
  -e "discovery.type=single-node" \
  -e "DISABLE_SECURITY_PLUGIN=true" \
  opensearchproject/opensearch:latest

Then use endpoint: "http://localhost:9200" with no auth.

With admin password (OpenSearch 2.12+)

docker run -d -p 9200:9200 -p 9600:9600 \
  -e "discovery.type=single-node" \
  -e "OPENSEARCH_INITIAL_ADMIN_PASSWORD=<YourStrongPassword1!>" \
  opensearchproject/opensearch:latest

Note: On Linux, you may need to increase the vm.max_map_count kernel setting:

sudo sysctl -w vm.max_map_count=262144

Verify OpenSearch is running:

curl -s http://localhost:9200 | python -m json.tool

Configuration Examples

Semantic Search

Best for conceptual questions without specific terminology.

tools:
  - name: concept_search
    type: hierarchical_document
    description: "Search by concepts"
    source: "./docs/"
    search_mode: semantic
    contextual_embeddings: true

Technical Documentation

Best for technical documents with specific terminology.

tools:
  - name: tech_search
    type: hierarchical_document
    description: "Technical documentation search"
    source: "./technical/"
    search_mode: hybrid
    semantic_weight: 0.3
    keyword_weight: 0.5
    exact_weight: 0.2

Legal/Regulatory Documents

Optimized for section references and definitions.

tools:
  - name: legal_search
    type: hierarchical_document
    description: "Search legal documents"
    source: "./legal/"
    search_mode: hybrid
    document_domain: us_legislative
    extract_definitions: true
    extract_cross_references: true
    exact_weight: 0.3
    keyword_weight: 0.3
    semantic_weight: 0.4

Context Generation Settings

Tune LLM context generation (the context model defaults to Claude Haiku).

tools:
  - name: docs_search
    type: hierarchical_document
    description: "Search with custom context model"
    source: "./docs/"
    contextual_embeddings: true
    context_max_tokens: 100
    context_concurrency: 10

Large Corpus with Persistence

tools:
  - name: knowledge_base
    type: hierarchical_document
    description: "Persistent knowledge base"
    source: "./knowledge/"
    database:
      provider: postgres
      connection_string: "${POSTGRES_URL}"
    top_k: 20

How It Works

Preprocessing Pipeline (Anthropic Contextual Retrieval)

┌─────────────────────────────────────────────────────────────────────────┐
│                         INGESTION (per document)                        │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  document.pdf  →  [markitdown]  →  markdown text                        │
│                                                                         │
│  markdown text →  [StructuredChunker]  →  chunks with parent_chain      │
│                                                                         │
│  For each chunk:                                                        │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │ [LLM Context Generation] - Claude Haiku                          │  │
│  │                                                                   │  │
│  │ Input:  Whole document + chunk                                    │  │
│  │ Output: "This chunk describes the reporting requirements under    │  │
│  │          Section 203 of the Environmental Protection Act..."      │  │
│  │                                                                   │  │
│  │ Cost: ~$0.03 per 100-page document                                │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│  contextualized = "{LLM context}\n\n{original chunk}"                   │
│                                                                         │
│  contextualized  →  [Embedding]  →  dense vector                        │
│  contextualized  →  [BM25/Native] →  keyword index                      │
│  section_id      →  [Dict]        →  exact match index                  │
│                                                                         │
│  Result: 49% better retrieval vs standard RAG (Anthropic research)      │
└─────────────────────────────────────────────────────────────────────────┘

Hybrid Search Flow

Query: "reporting requirements in Section 403"
                    │
    ┌───────────────┼───────────────┐
    ▼               ▼               ▼
[Semantic]      [BM25]          [Exact]
contextualized  contextualized  section_id
embeddings      text search     lookup
    │               │               │
    └───────────────┼───────────────┘
                    ▼
              [RRF Fusion]
              k=60, weights
                    │
                    ▼
            [Top K Results]

  1. Query Analysis: Detects exact match patterns (section numbers, quoted phrases)
  2. Parallel Search: Runs semantic, keyword, and exact match searches
  3. RRF Fusion: Combines ranked results using Reciprocal Rank Fusion (k=60)
  4. Enrichment: Adds relevant definitions if terms are defined in documents
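
Reciprocal Rank Fusion itself is a one-line formula: each modality contributes weight / (k + rank) for every chunk it returns, and the sums are sorted. A sketch using the defaults above (k=60; the helper is illustrative):

def rrf_fuse(rankings: dict[str, list[str]],
             weights: dict[str, float],
             k: int = 60) -> list[tuple[str, float]]:
    """Fuse per-modality ranked lists of chunk ids.
    rankings maps modality -> ids ordered best-first;
    weights maps modality -> its hybrid weight."""
    scores: dict[str, float] = {}
    for modality, ranked_ids in rankings.items():
        w = weights[modality]
        for rank, chunk_id in enumerate(ranked_ids, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + w / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

rrf_fuse(
    {"semantic": ["sec_5_3", "sec_2_1"], "keyword": ["sec_5_3"], "exact": ["sec_5_3"]},
    {"semantic": 0.5, "keyword": 0.3, "exact": 0.2},
)
# sec_5_3 appears in all three lists, so it accumulates score from each.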

Result Format

Each search result includes:

[1] Score: 0.847 | Source: docs/compliance.md
Location: Chapter 5 > Section 5.3 > Reporting Requirements
Section: sec_5_3

Annual reporting must be submitted within 60 days of the fiscal year end.
All covered entities must include revenue breakdowns by category.

Relevant definitions:
  • Covered entity: Any organization with annual revenue exceeding...

Performance Tips

  1. Chunk size: Default 800 tokens works well. Increase for dense technical content, decrease for granular retrieval.
  2. Weights tuning: Increase semantic_weight for conceptual queries, keyword_weight for technical terminology, exact_weight for regulatory documents.
  3. Persistence: For large corpora (>1000 documents), use a persistent database to avoid re-indexing on restart.
  4. Contextual embeddings: Keep enabled (default) for 49% better retrieval accuracy per Anthropic research.
  5. Context generation cost: ~$0.03 per 100-page document using Claude Haiku. Increase context_concurrency to reduce ingestion time.
  6. Large documents: For very large documents that exceed context limits, the tool automatically truncates the document context while preserving the chunk content.

Error Handling

  • No results found: Returns empty results — check that documents exist in source path and format is supported
  • Slow ingestion: Large PDFs take longer; consider splitting very large documents or using persistent storage
  • Missing context: Ensure contextual_embeddings: true (default) and documents have heading structure

MCP Tools ✅

Status: Fully implemented (stdio transport)

Model Context Protocol (MCP) server integrations enable agents to interact with external systems through a standardized protocol. HoloDeck uses Semantic Kernel's MCP plugins for seamless integration.

Finding MCP Servers: Browse the official MCP server registry at github.com/modelcontextprotocol/servers for a curated list of available servers including filesystem, GitHub, Slack, Google Drive, PostgreSQL, and many more community-contributed integrations.

When to Use

  • File system operations (read, write, list files)
  • GitHub/GitLab operations (issues, PRs, code)
  • Database access (SQLite, PostgreSQL)
  • Web browsing and search
  • Any standardized MCP server

Basic Example

- name: filesystem
  description: Read and write files in the workspace
  type: mcp
  command: npx
  args: ["-y", "@modelcontextprotocol/server-filesystem", "./data"]

Complete Example

tools:
  # MCP filesystem tool for reading/writing files
  - type: mcp
    name: filesystem
    description: Read and write files in the workspace data directory
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "./sample/data"]
    config:
      allowed_directories: ["./sample/data"]
    request_timeout: 30

Required Fields

Command

  • Type: String (enum: npx, node, uvx, docker)
  • Purpose: How to launch the MCP server
  • Required: Yes (for stdio transport)
command: npx     # For npm packages (auto-installs if needed)
# OR
command: node    # For local .js files or installed packages
# OR
command: uvx     # For Python packages via uv
# OR
command: docker  # For containerized servers

When to use each:

  • npx - Run npm packages directly (e.g., @modelcontextprotocol/server-filesystem)
  • node - Run local JavaScript files (e.g., ./tools/my-server.js)
  • uvx - Run Python packages via uv (e.g., mcp-server-fetch)
  • docker - Run containerized MCP servers

Args

  • Type: List of strings
  • Purpose: Command-line arguments for the server
  • Note: Often includes the server package name and configuration
args: ["-y", "@modelcontextprotocol/server-filesystem", "./data"]

Optional Fields

Transport

  • Type: String (enum: stdio, sse, websocket, http)
  • Default: stdio
  • Purpose: Communication protocol with the server
  • Note: Currently only stdio is implemented
transport: stdio # Default, works with most servers

Config

  • Type: Object (free-form)
  • Purpose: Server-specific configuration passed via MCP_CONFIG env var
  • Validation: Server validates at runtime
config:
  allowed_directories: ["./data", "/tmp"]
  max_file_size: 1048576
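
On the server side, the config object arrives as a JSON string in the MCP_CONFIG environment variable. A sketch of how a custom Python server might read it (illustrative; each server defines which keys it honors):

import json
import os

# Inside a custom MCP server process:
config = json.loads(os.environ.get("MCP_CONFIG", "{}"))
allowed_dirs = config.get("allowed_directories", [])
max_file_size = config.get("max_file_size", 1_048_576)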

Env

  • Type: Object (string key-value pairs)
  • Purpose: Environment variables for the server process
  • Supports: Variable substitution with ${VAR_NAME}
env:
  GITHUB_TOKEN: "${GITHUB_TOKEN}"
  API_KEY: "static-value"

Env File

  • Type: String (path)
  • Purpose: Load environment variables from a file
  • Format: Standard .env file format
env_file: .env.mcp

Request Timeout

  • Type: Integer (seconds)
  • Default: 30
  • Purpose: Timeout for individual MCP requests
request_timeout: 60

Encoding

  • Type: String
  • Default: utf-8
  • Purpose: Character encoding for stdio communication
encoding: utf-8

Sample MCP Servers

Filesystem (stdio)

Read, write, and manage files:

- name: filesystem
  type: mcp
  description: File system operations
  command: npx
  args: ["-y", "@modelcontextprotocol/server-filesystem", "./data"]
  config:
    allowed_directories: ["./data"]

Tools provided: read_file, write_file, list_directory, create_directory, move_file, search_files, get_file_info

GitHub

Interact with GitHub repositories:

- name: github
  type: mcp
  description: GitHub repository operations
  command: npx
  args: ["-y", "@modelcontextprotocol/server-github"]
  env:
    GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}"

Tools provided: search_repositories, create_issue, list_issues, get_file_contents, create_pull_request, fork_repository

SQLite

Query SQLite databases:

- name: sqlite
  type: mcp
  description: SQLite database queries
  command: npx
  args:
    [
      "-y",
      "@modelcontextprotocol/server-sqlite",
      "--db-path",
      "./data/database.db",
    ]

Tools provided: read_query, write_query, create_table, list_tables, describe_table

Brave Search

Web search capabilities:

- name: brave-search
  type: mcp
  description: Web search via Brave
  command: npx
  args: ["-y", "@modelcontextprotocol/server-brave-search"]
  env:
    BRAVE_API_KEY: "${BRAVE_API_KEY}"

Tools provided: brave_web_search, brave_local_search

Puppeteer (Browser Automation)

Browser automation and web scraping:

- name: puppeteer
  type: mcp
  description: Browser automation
  command: npx
  args: ["-y", "@modelcontextprotocol/server-puppeteer"]

Tools provided: puppeteer_navigate, puppeteer_screenshot, puppeteer_click, puppeteer_fill, puppeteer_evaluate

Local Node.js Servers (node)

For local JavaScript MCP server files, use node:

- name: my-custom-server
  type: mcp
  description: Custom local MCP server
  command: node
  args: ["./tools/my-mcp-server.js", "--config", "./config.json"]

Note: Use node for local .js files. Use npx for npm packages.

Python MCP Servers (uvx)

For Python-based MCP servers, use uvx:

- name: mcp-server-fetch
  type: mcp
  description: Fetch web content
  command: uvx
  args: ["mcp-server-fetch"]

Memory (Short-Term Storage)

Scratchpad for agent short-term memory storage:

- name: memory
  type: mcp
  description: Scratchpad for short term memory storage
  command: uvx
  args: ["basic-memory", "mcp"]
  request_timeout: 30

Tools provided: write_note, read_note, search_notes, delete_note

Use case: Enable agents to persist information across conversation turns, store intermediate results, and maintain context during multi-step tasks, including across chat sessions.

Docker MCP Servers

For containerized servers:

- name: custom-server
  type: mcp
  description: Custom containerized server
  command: docker
  args: ["run", "-i", "--rm", "my-mcp-server:latest"]

Environment Variable Patterns

Static values:

env:
  API_KEY: "sk-1234567890"

Environment substitution:

env:
  GITHUB_TOKEN: "${GITHUB_TOKEN}" # From process environment

From env file:

env_file: .env.mcp
env:
  OVERRIDE_VAR: "override-value" # Overrides env_file
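
Resolution order, sketched in Python (the resolve_env helper is hypothetical): inline env entries override env_file values, then ${VAR_NAME} references are expanded from the process environment.

import os
import re

def resolve_env(env_file_vars: dict[str, str], inline_env: dict[str, str]) -> dict[str, str]:
    """Merge env_file values with inline env (inline wins), expanding
    ${VAR_NAME} references from the process environment."""
    merged = {**env_file_vars, **inline_env}
    pattern = re.compile(r"\$\{(\w+)\}")
    return {
        key: pattern.sub(lambda m: os.environ.get(m.group(1), ""), value)
        for key, value in merged.items()
    }

resolve_env({"GITHUB_TOKEN": "${GITHUB_TOKEN}"}, {"OVERRIDE_VAR": "override-value"})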

Error Handling

  • Server unavailable: Error during agent startup
  • Connection timeout: Configurable via request_timeout
  • Invalid config: Error during agent startup (validation)
  • Runtime errors: Logged and returned as tool error responses

Prerequisites

MCP tools require the appropriate runtime to be installed on your machine based on the command you use:

Command Required Software Installation
npx Node.js + npm nodejs.org or brew install node
node Node.js nodejs.org or brew install node
uvx uv (Python) curl -LsSf https://astral.sh/uv/install.sh | sh or brew install uv
docker Docker docker.com or brew install --cask docker

Verify installation:

# For npm-based MCP servers
node --version    # v18+ recommended
npx --version

# For Python-based MCP servers
uv --version
uvx --version

# For containerized servers
docker --version

Tip: Most MCP servers use npx with npm packages. Ensure Node.js 18+ is installed for best compatibility.

Lifecycle Management

MCP plugins are automatically managed:

  1. Startup: Plugin initialized and connected when agent starts
  2. Execution: Tools discovered and registered on the kernel
  3. Shutdown: Plugin properly closed when session ends

Important: Always terminate chat sessions properly (exit or quit) to ensure MCP servers are cleanly shut down.


Function Tools ✅

Status: Fully implemented (Semantic Kernel and Claude Agent SDK backends both dispatch through the same loader)

Execute custom Python functions defined in your project. Function tools are imported at agent-config load time via importlib.util.spec_from_file_location, validated as callable, and surfaced to the model with the function's signature and docstring as the tool's parameter schema and description.

When to Use

  • Custom business logic the agent needs to call deterministically
  • Pure functions (math, parsing, formatting) that don't need an MCP wrapper
  • Lightweight integrations with libraries you already have in the venv
  • Anything where MCP would be overkill (no separate process, no protocol overhead)

Required Fields

Field Type Required Notes
name string Yes Tool identifier (alphanumeric + underscores, must be unique on agent)
description string Yes Shown to the model — kept short, the function's docstring complements it
type "function" Yes Discriminator
file string (path) Yes Python file containing the function. Resolved relative to the directory containing agent.yaml (absolute paths also work)
function string Yes Name of the callable inside file

Optional Fields

Field Type Default Notes
parameters dict None Parameter schema for introspection. Optional — backends derive the JSON schema from the function's type annotations when omitted
defer_loading bool true When true, the tool is excluded from the initial tool context and surfaced via tool filtering on demand. Set to false for tools the agent should always have available

See src/holodeck/models/tool.py:FunctionTool for the full Pydantic model.

Basic Example

agent.yaml:

tools:
  - name: subtract
    type: function
    description: "Compute a - b. Pass a and b as numeric strings; returns the difference as a string."
    file: tools/financial_tools.py
    function: subtract

tools/financial_tools.py:

def subtract(a: str, b: str) -> str:
    """Compute ``a - b`` for two numeric string operands.

    Args:
        a: Minuend as a numeric string (e.g. ``"206588"`` or ``"1,234.56"``).
        b: Subtrahend as a numeric string.

    Returns:
        The difference ``a - b`` as a string.
    """
    return str(float(a.replace(",", "")) - float(b.replace(",", "")))

A complete working sample lives at sample/financial-assistant/: claude/agent.yaml wires subtract and divide into a Claude agent, and tools/financial_tools.py implements them.

How Loading Works

src/holodeck/lib/function_tool_loader.py:load_function_tool is shared by both backends. At agent startup (not on first call), it:

  1. Resolves tool.file relative to the agent project root (the directory holding agent.yaml); absolute paths pass through unchanged.
  2. Builds an import spec via importlib.util.spec_from_file_location and executes the module in isolation under a synthesized module name (holodeck_function_tool_<tool.name>), so two tools with the same source file don't collide.
  3. Looks up tool.function on the loaded module and verifies the attribute is callable.
  4. Returns the callable to the backend, which then wraps it as a Semantic Kernel function or a Claude SDK in-process MCP tool.

Any failure — missing file, import error, missing attribute, attribute that isn't callable — is re-raised as a ConfigError keyed on tools.<tool.name>, so the agent fails to start with a clear, actionable message instead of crashing mid-test (FR-025).
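
Condensed into a sketch, the loader does roughly this (illustrative; ValueError stands in for HoloDeck's ConfigError):

import importlib.util
from pathlib import Path

def load_function_tool(file: str, function: str, tool_name: str, project_root: Path):
    path = Path(file)
    if not path.is_absolute():
        path = project_root / path           # step 1: resolve against project root
    module_name = f"holodeck_function_tool_{tool_name}"  # step 2: unique module name
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ValueError(f"tools.{tool_name}: cannot import {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)          # import errors surface here
    fn = getattr(module, function, None)     # step 3: look up the attribute
    if not callable(fn):
        raise ValueError(f"tools.{tool_name}: {function} is not callable")
    return fn                                # step 4: backend wraps this callable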

Adding a New Function Tool — Checklist

  1. Create the Python file under your agent project (commonly tools/):
# tools/my_tool.py
def my_tool(query: str, limit: int = 10) -> list[dict]:
    """One-line description shown to the model alongside `description`.

    Args:
        query: ...
        limit: ...

    Returns:
        List of result records.
    """
    ...
  2. Use type hints on every parameter and the return — the backends generate the tool's JSON schema from them (a sketch of this derivation follows the checklist).

  3. Make the function self-contained. The loader executes the module fresh; relying on side effects from other modules in your project may not work as expected.

  4. Return JSON-serializable data. Strings, numbers, booleans, lists, and dicts are safe; arbitrary objects will fail when the backend serializes the tool result for the model.

  5. Wire it into agent.yaml:

tools:
  - name: my_tool
    type: function
    description: "Search the local catalog for matching records."
    file: tools/my_tool.py
    function: my_tool
  6. Run holodeck test agent.yaml (or holodeck chat) — a load-time ConfigError here means the path or function name is wrong; fix and re-run.
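
As an illustration of point 2, a backend can derive a minimal parameter schema from the signature alone (hypothetical sketch; the real backends use their own schema generators):

import inspect

TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean"}

def schema_from_signature(fn) -> dict:
    """Derive a minimal JSON-schema-like dict from type hints."""
    props = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        props[name] = {"type": TYPE_MAP.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": props, "required": required}

def my_tool(query: str, limit: int = 10) -> list[dict]:
    """Search the local catalog."""
    ...

schema_from_signature(my_tool)
# -> {"type": "object",
#     "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}},
#     "required": ["query"]}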

Best Practices

  • Keep each function focused on a single task — small, well-named tools are easier for the model to pick.
  • Always add type hints + a short docstring; both feed the model's tool description.
  • Validate inputs and raise a meaningful ValueError — the runtime captures it as the tool error and surfaces it to the agent.
  • Return JSON-serializable values (strings, numbers, dicts, lists). Avoid returning complex objects.
  • Avoid long-running calls inside a function tool; prefer an MCP server when you need a separate process or async I/O at scale.
  • Prefer pure functions where possible — they're trivially testable in isolation (e.g., tests/unit/sample/test_financial_tools.py).

Prompt Tools 🚧

Status: Planned - Configuration schema defined, execution not yet implemented

LLM-powered semantic functions with template substitution.

When to Use

  • Text generation with templates
  • Specialized prompts for specific tasks
  • Reusable prompt chains
  • A/B testing different prompts

Basic Example

- name: summarize
  description: Summarize text into key points
  type: prompt
  template: "Summarize this in 3 bullet points: {{text}}"
  parameters:
    text:
      type: string
      description: Text to summarize

Required Fields

Template or File

Either template (inline) or file (external), not both:

Inline Template

  • Type: String
  • Max Length: 5000 characters
  • Syntax: Mustache-style {{variable}}
template: "Summarize: {{content}}"

Template File

  • Type: String (path)
  • Path: Relative to agent.yaml
file: prompts/summarize.txt

File contents:

Summarize this text in 3 bullet points:

{{text}}

Focus on key takeaways.

Parameters

  • Type: Object mapping parameter names to schemas
  • Purpose: Template variables the agent can fill
  • Required: Yes (at least one)
parameters:
  text:
    type: string
    description: Text to process

Optional Fields

Model Override

  • Type: Model configuration object
  • Purpose: Use different model for this tool
  • Default: Uses agent's model
model:
  provider: openai
  name: gpt-4 # Different from agent's model
  temperature: 0.2

Complete Example

- name: code-reviewer
  description: Review code for best practices
  type: prompt
  file: prompts/code_review.txt
  model:
    provider: openai
    name: gpt-4
    temperature: 0.3
  parameters:
    code:
      type: string
      description: Code to review
    language:
      type: string
      description: Programming language
      enum: [python, javascript, go, java]

Template file (prompts/code_review.txt):

Review this {{language}} code for best practices.

Code:
{{code}}

Provide:
1. Issues found
2. Suggestions for improvement
3. Security considerations

Template Syntax

Variables use Mustache-style syntax:

Simple variable: {{name}}

Conditionals (if parameter provided):
{{#if description}}
Description: {{description}}
{{/if}}

Loops (if parameter is array):
{{#each items}}
- {{this}}
{{/each}}
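
Prompt tools are not yet implemented, so the renderer below is only a sketch of plain {{variable}} substitution (conditionals and loops would need a real Mustache/Handlebars engine; the render helper is hypothetical):

import re

def render(template: str, params: dict[str, str]) -> str:
    """Replace {{name}} placeholders with parameter values,
    leaving unknown placeholders untouched."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(params.get(m.group(1), m.group(0))),
        template,
    )

render("Summarize this in 3 bullet points: {{text}}", {"text": "Quarterly results..."})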

Tool Comparison

Feature Vectorstore Hierarchical Document MCP Function Prompt
Status ✅ Implemented ✅ Implemented ✅ Implemented ✅ Implemented 🚧 Planned
Use Case Search data Structured document search External integrations Custom logic Template-based
Execution Vector similarity Hybrid RRF fusion MCP protocol (stdio) Python function LLM generation
Setup Data files Document files Server config + runtime Python files Template text
Parameters Implicit (search query) Implicit (search query) Server-specific tools Defined in code Defined in YAML
Latency Medium (~100ms) Medium (~100-300ms) Medium (~50-500ms) Low (<10ms) High (LLM call)
Cost Embedding API Embedding + context LLM Server resource Internal LLM tokens

Common Patterns

Knowledge Base Search

- name: search-kb
  type: vectorstore
  source: kb/
  chunk_size: 512
  embedding_model: text-embedding-3-small

Database Query

- name: query-db
  type: function
  file: tools/db.py
  function: query
  parameters:
    sql:
      type: string

File Operations (MCP)

- name: filesystem
  type: mcp
  description: Read and write files
  command: npx
  args: ["-y", "@modelcontextprotocol/server-filesystem", "./data"]
  config:
    allowed_directories: ["./data"]

GitHub Integration (MCP)

- name: github
  type: mcp
  description: GitHub repository operations
  command: npx
  args: ["-y", "@modelcontextprotocol/server-github"]
  env:
    GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}"

Policy Search (Hierarchical Document)

- name: policy_search
  type: hierarchical_document
  description: "Search policy documents with section awareness"
  source: "./policies/"
  search_mode: hybrid
  document_domain: legal_contract
  extract_definitions: true
  extract_cross_references: true
  database:
    provider: postgres
    connection_string: "${POSTGRES_URL}"

Text Transformation

- name: translate
  type: prompt
  template: "Translate to {{language}}: {{text}}"
  parameters:
    text:
      type: string
    language:
      type: string

Error Handling

Vectorstore Tool Errors

  • No data found: Returns empty results
  • Invalid path: Error during agent startup (config validation)
  • Unsupported format: Error during agent startup

Function Tool Errors

All resolution failures raise ConfigError at agent startup (keyed on tools.<name>), so a misconfigured function tool fails fast instead of mid-test:

  • File not found: tool.file does not resolve to an existing path (relative paths are resolved against the agent project root)
  • Import failure: The module raised an exception during exec_module — the original traceback is chained into the ConfigError
  • Function not found: The module loaded but has no attribute matching tool.function
  • Not callable: The attribute exists but is not callable (e.g., a constant rather than a function)
  • Runtime error inside the function: Caught by the backend, surfaced to the model as a tool error response so the agent can recover or retry

MCP Tool Errors

  • Server unavailable: Error during agent startup (fails fast)
  • Command not found: Error if runtime (npx, uvx, docker) not installed
  • Connection timeout: Configurable via request_timeout, returns error
  • Invalid config: Error during agent startup (validation)
  • Runtime errors: Returned as tool error responses to the LLM

Prompt Tool Errors

  • Invalid template: Error during agent startup
  • LLM failure: Soft failure (logged, error message returned)
  • Template rendering: Error during execution

Performance Tips

Vectorstore Tools

  • Use appropriate chunk size (larger = fewer embeddings)
  • Enable caching for remote files
  • Reduce vector_field count if possible
  • Index only necessary fields

Function Tools

  • Keep functions fast (<1 second)
  • Use connection pooling for databases
  • Cache results when possible

MCP Tools

  • Use server-side filtering when available
  • Limit result sets
  • Cache responses locally

Prompt Tools

  • Use simpler models for repeated operations
  • Batch processing when possible
  • Limit template complexity

Best Practices

  1. Clear Names: Use descriptive tool names
  2. Clear Descriptions: Agent uses description to decide when to call tool
  3. Parameters: Define expected parameters clearly
  4. Error Handling: Handle errors gracefully
  5. Performance: Test with realistic data
  6. Versioning: Manage tool file versions in source control
  7. Testing: Include test cases that exercise each tool

Next Steps