
Test Execution Framework API

The test runner orchestrates the complete test execution pipeline for HoloDeck agents, from configuration resolution through evaluation and result reporting.

Test Case Configuration

Configuration models for defining test cases with multimodal file support.

TestCaseModel

Bases: BaseModel

Test case for agent evaluation.

Represents a single test scenario with input, optional expected output, expected tool usage, multimodal file inputs, and RAG context.
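A minimal construction sketch, assuming TestCaseModel is importable from holodeck.models.test_case (the module shown in the source listings below) and using only the fields covered by the validators on this page; the field names for expected tool usage and RAG context are not documented here and are omitted.

from holodeck.models.test_case import TestCaseModel

# A text-only test case: `input` is required and must be non-empty,
# while `name` and `ground_truth` are optional but rejected if blank.
case = TestCaseModel(
    input="Summarize the attached quarterly report in two sentences.",
    name="summarize_quarterly_report",
    ground_truth="A concise two-sentence summary of the report.",
)
print(case.name)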

validate_files(v) classmethod

Validate files list contains at most 10 entries if provided.

Source code in src/holodeck/models/test_case.py
@field_validator("files")
@classmethod
def validate_files(cls, v: list[FileInput] | None) -> list[FileInput] | None:
    """Validate files list is not empty if provided."""
    if v is not None and len(v) > 10:
        raise ValueError("Maximum 10 files per test case")
    return v
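A sketch of how this limit surfaces, assuming a TestCaseModel can be built from just input and files and that FileInput accepts path and type (see the FileInput section below); pydantic reports the validator's ValueError through a ValidationError.

from pydantic import ValidationError

from holodeck.models.test_case import FileInput, TestCaseModel

# Eleven files exceed the 10-file limit enforced by validate_files.
too_many = [FileInput(path=f"docs/report_{i}.pdf", type="pdf") for i in range(11)]
try:
    TestCaseModel(input="Summarize these reports.", files=too_many)
except ValidationError as exc:
    print(exc)  # includes "Maximum 10 files per test case"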

validate_ground_truth(v) classmethod

Validate ground_truth is not empty if provided.

Source code in src/holodeck/models/test_case.py
@field_validator("ground_truth")
@classmethod
def validate_ground_truth(cls, v: str | None) -> str | None:
    """Validate ground_truth is not empty if provided."""
    if v is not None and (not v or not v.strip()):
        raise ValueError("ground_truth must be non-empty if provided")
    return v

validate_input(v) classmethod

Validate input is not empty.

Source code in src/holodeck/models/test_case.py
@field_validator("input")
@classmethod
def validate_input(cls, v: str) -> str:
    """Validate input is not empty."""
    if not v or not v.strip():
        raise ValueError("input must be a non-empty string")
    return v

validate_name(v) classmethod

Validate name is not empty if provided.

Source code in src/holodeck/models/test_case.py
@field_validator("name")
@classmethod
def validate_name(cls, v: str | None) -> str | None:
    """Validate name is not empty if provided."""
    if v is not None and (not v or not v.strip()):
        raise ValueError("name must be non-empty if provided")
    return v

FileInput

Bases: BaseModel

File input for multimodal test cases.

Represents a single file reference for test case inputs, supporting both local files and remote URLs with optional extraction parameters.
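Two construction sketches, one per reference style, assuming path, url, type, and pages are the relevant fields (as suggested by the validators below); any other extraction parameters are omitted.

from holodeck.models.test_case import FileInput

# Local document referenced by path, restricted to a page range.
local_pdf = FileInput(path="data/contract.pdf", type="pdf", pages=[1, 2, 3])

# Remote image referenced by URL instead of a local path.
remote_image = FileInput(url="https://example.com/diagram.png", type="image")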

check_path_or_url(v, info) classmethod

Pass path and url through unchanged; the exactly-one-of check is deferred to model_post_init.

Source code in src/holodeck/models/test_case.py
@field_validator("path", "url", mode="before")
@classmethod
def check_path_or_url(cls, v: Any, info: Any) -> Any:
    """Validate that exactly one of path or url is provided."""
    # This runs before validation, so we check in root_validator
    return v

model_post_init(__context)

Validate path and url mutual exclusivity after initialization.

Source code in src/holodeck/models/test_case.py
def model_post_init(self, __context: Any) -> None:
    """Validate path and url mutual exclusivity after initialization."""
    if self.path and self.url:
        raise ValueError("Cannot provide both 'path' and 'url'")
    if not self.path and not self.url:
        raise ValueError("Must provide either 'path' or 'url'")

validate_pages(v) classmethod

Validate pages are positive integers.

Source code in src/holodeck/models/test_case.py
@field_validator("pages")
@classmethod
def validate_pages(cls, v: list[int] | None) -> list[int] | None:
    """Validate pages are positive integers."""
    if v is not None and not all(isinstance(p, int) and p > 0 for p in v):
        raise ValueError("pages must be positive integers")
    return v

validate_type(v) classmethod

Validate file type is supported.

Source code in src/holodeck/models/test_case.py
@field_validator("type")
@classmethod
def validate_type(cls, v: str) -> str:
    """Validate file type is supported."""
    valid_types = {"image", "pdf", "text", "excel", "word", "powerpoint", "csv"}
    if v not in valid_types:
        raise ValueError(f"type must be one of {valid_types}, got {v}")
    return v
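A sketch of the rejection path for an unsupported type, assuming path and type suffice to construct a FileInput; the validator's ValueError is reported through pydantic's ValidationError, which subclasses ValueError.

from holodeck.models.test_case import FileInput

try:
    FileInput(path="notes.md", type="markdown")  # "markdown" is not a supported type
except ValueError as exc:
    print(exc)  # names the accepted types: image, pdf, text, excel, word, powerpoint, csv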

Test Results

Data models for test execution results and metrics.

TestResult

Bases: BaseModel

Result of executing a single test case.

Contains the test input, agent response, tool calls, metric results, and overall pass/fail status, along with any errors encountered.

Example Usage

from holodeck.lib.test_runner.executor import TestExecutor
from holodeck.config.loader import ConfigLoader

# Load agent configuration
loader = ConfigLoader()
config = loader.load("agent.yaml")

# Create executor and run tests
executor = TestExecutor()
results = executor.run_tests(config)

# Access results
for test_result in results.test_results:
    print(f"Test {test_result.test_name}: {test_result.status}")
    print(f"Metrics: {test_result.metrics}")

Multimodal File Support

The test runner integrates with the file processor to handle:

  • Images: JPG, PNG with OCR support
  • Documents: PDF (full or page ranges), Word, PowerPoint
  • Data: Excel (sheet/range selection), CSV, text files
  • Remote Files: URL-based inputs with caching

Files are automatically processed before agent invocation and included in the test context, as sketched below.
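A sketch of a multimodal test case built from the models above; the exact fields for Excel sheet/range selection are not documented on this page and are omitted.

from holodeck.models.test_case import FileInput, TestCaseModel

# Mixes a page-limited local PDF with a remote image; both are handled
# by the file processor before the agent is invoked.
case = TestCaseModel(
    input="Compare the revenue table on page 2 with the chart in the image.",
    files=[
        FileInput(path="reports/q3.pdf", type="pdf", pages=[2]),
        FileInput(url="https://example.com/revenue-chart.png", type="image"),
    ],
)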

Integration with Evaluation Framework

Test results automatically pass through the evaluation framework:

  1. NLP Metrics: Computed on all test outputs (F1, BLEU, ROUGE, METEOR)
  2. AI-powered Metrics: Optional evaluation by Azure AI models
  3. Custom Metrics: User-defined evaluation functions

Evaluation configuration comes from the agent's evaluations section.
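A hedged sketch of aggregating metric scores across a run, building on the Example Usage above; it assumes each test result's metrics attribute maps metric names to numeric scores, which this page does not guarantee.

from holodeck.config.loader import ConfigLoader
from holodeck.lib.test_runner.executor import TestExecutor

config = ConfigLoader().load("agent.yaml")
results = TestExecutor().run_tests(config)

# Average each metric over all test cases (assumes a name -> float mapping).
totals: dict[str, float] = {}
counts: dict[str, int] = {}
for test_result in results.test_results:
    for metric_name, score in test_result.metrics.items():
        totals[metric_name] = totals.get(metric_name, 0.0) + score
        counts[metric_name] = counts.get(metric_name, 0) + 1

for metric_name, total in totals.items():
    print(f"{metric_name}: {total / counts[metric_name]:.3f}")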