@final
class TestRunBuilder(Generic[T])
Builder for test runs.
Execution flow: With platform evaluators only (e.g. "Bias"), the SDK pushes entries and the Maxim platform runs the prompt or workflow. With local evaluators or yields_output(), the SDK runs the prompt/workflow locally (or calls your output function) and then runs evaluators locally.
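In practice you obtain a `TestRunBuilder` from the Maxim client rather than constructing it directly. A minimal end-to-end sketch, assuming the client exposes `create_test_run(name, in_workspace_id)` as the entry point that returns this builder (the IDs and config keys below are placeholders):

```python
from maxim import Maxim

# Client setup; the config dict key is an assumption based on typical SDK usage.
maxim = Maxim({"api_key": "your-api-key"})

result = (
    maxim.create_test_run(name="Nightly regression", in_workspace_id="your-workspace-id")
    .with_data("your-dataset-id")          # a dataset hosted on the platform
    .with_evaluators("Bias")               # platform evaluator, referenced by name
    .with_workflow_id("your-workflow-id")  # the entity under test
    .run()
)
```

Because the chain above uses only a platform evaluator, entries are pushed and the Maxim platform executes the workflow; swapping in a local evaluator or `yields_output()` shifts execution to the SDK as described above.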
__init__
def __init__(base_url: str, api_key: str, name: str, workspace_id: str,
evaluators: List[Union[str, BaseEvaluator]])
Constructor
with_data_structure
def with_data_structure(data: T) -> "TestRunBuilder[T]"
Set the data structure for the test run
Arguments:
| Name | Type | Description |
|---|---|---|
| data | T | The data structure to use |
Returns:
| Name | Description |
|---|---|
| [TestRunBuilder](/sdk/python/references/test_runs/test_run_builder)[T] | The current TestRunBuilder instance for method chaining |
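A short sketch of declaring a data structure, assuming it maps your column names to Maxim column-type literals (the `"INPUT"` and `"EXPECTED_OUTPUT"` literals are assumptions; `"CONTEXT_TO_EVALUATE"` is the dataset column referenced elsewhere on this page):

```python
builder = builder.with_data_structure({
    "question": "INPUT",                       # column holding the model input
    "answer": "EXPECTED_OUTPUT",               # column holding the expected output
    "retrieved_docs": "CONTEXT_TO_EVALUATE",   # column evaluators treat as context
})
```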
with_data
def with_data(data: Data) -> "TestRunBuilder[T]"
Set the data for the test run
Arguments:
| Name | Type | Description |
|---|---|---|
| data | Data | The data to use |
Returns:
| Name | Description |
|---|---|
| [TestRunBuilder](/sdk/python/references/test_runs/test_run_builder)[T] | The current TestRunBuilder instance for method chaining |
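Two hedged examples of supplying data (support for inline entry lists alongside platform dataset IDs is an assumption; see the Data type reference for the accepted forms):

```python
# Option 1: reference a dataset hosted on the Maxim platform (ID is illustrative).
builder = builder.with_data("dataset-id-from-platform")

# Option 2: inline entries whose keys match the structure declared above.
builder = builder.with_data([
    {
        "question": "What is RAG?",
        "answer": "Retrieval-augmented generation",
        "retrieved_docs": "RAG combines retrieval with generation...",
    },
])
```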
with_evaluators
def with_evaluators(
*evaluators: Union[str, BaseEvaluator]) -> "TestRunBuilder[T]"
Add evaluators to the test run. Use platform evaluators by name (e.g. "Bias") or BaseEvaluator instances for local evaluators. With local evaluators, the SDK runs the prompt/workflow locally to get output before evaluating.
Arguments:
| Name | Type | Description |
|---|---|---|
| *evaluators | Union[str, BaseEvaluator] | The evaluators to add |
Returns:
| Name | Description |
|---|---|
| [TestRunBuilder](/sdk/python/references/test_runs/test_run_builder)[T] | The current TestRunBuilder instance for method chaining |
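A sketch of mixing platform and local evaluators. The import paths and the `evaluate()` signature below are assumptions; check the BaseEvaluator reference for the exact contract:

```python
from typing import Dict

from maxim.evaluators import BaseEvaluator
from maxim.models import LocalData, LocalEvaluatorResultParameter, LocalEvaluatorReturn

class ContainsGreeting(BaseEvaluator):
    def evaluate(
        self, result: LocalEvaluatorResultParameter, data: LocalData
    ) -> Dict[str, LocalEvaluatorReturn]:
        # Score 1 when the generated output greets the user, else 0.
        passed = "hello" in (result.output or "").lower()
        return {"contains-greeting": LocalEvaluatorReturn(score=1 if passed else 0)}

builder = builder.with_evaluators(
    "Bias",              # platform evaluator, referenced by name
    ContainsGreeting(),  # local evaluator; forces local execution of the entity
)
```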
with_human_evaluation_config
def with_human_evaluation_config(
config: HumanEvaluationConfig) -> "TestRunBuilder[T]"
Set the human evaluation configuration for the test run
Arguments:
| Name | Type | Description |
|---|---|---|
| config | HumanEvaluationConfig | The human evaluation configuration to use |
Returns:
| Name | Description |
|---|---|
| [TestRunBuilder](/sdk/python/references/test_runs/test_run_builder)[T] | The current TestRunBuilder instance for method chaining |
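A minimal sketch, assuming `HumanEvaluationConfig` accepts the rater emails and instructions shown below (field names are assumptions; see the HumanEvaluationConfig reference):

```python
from maxim.models import HumanEvaluationConfig  # import path is an assumption

builder = builder.with_human_evaluation_config(
    HumanEvaluationConfig(
        emails=["reviewer@example.com"],  # raters notified for this run
        instructions="Rate factual accuracy against the provided context.",
    )
)
```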
with_workflow_id
def with_workflow_id(
workflow_id: str,
context_to_evaluate: Optional[str] = None) -> "TestRunBuilder[T]"
Set the workflow ID for the test run. Optionally, set the context to evaluate for the workflow. (Note: setting the context to evaluate overrides the CONTEXT_TO_EVALUATE dataset column value.)
Arguments:
| Name | Type | Description |
|---|---|---|
| workflow_id | str | The ID of the workflow to use |
| context_to_evaluate | Optional[str] | The context to evaluate for the workflow (essentially a variable name) |
Returns:
| Name | Description |
|---|---|
| [TestRunBuilder](/sdk/python/references/test_runs/test_run_builder)[T] | The current TestRunBuilder instance for method chaining |
Raises:
ValueError - If a prompt version ID, prompt chain version ID, or output function is already set for this run builder
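A short sketch (IDs and the variable name are placeholders). Exactly one entity may be set per run, so the same pattern applies to `with_prompt_version_id()` and `with_prompt_chain_version_id()` below:

```python
builder = builder.with_workflow_id(
    "your-workflow-id",
    # Variable name whose value evaluators treat as context; this overrides
    # the CONTEXT_TO_EVALUATE dataset column for the run.
    context_to_evaluate="retrieved_context",
)
```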
with_prompt_version_id
def with_prompt_version_id(
prompt_version_id: str,
context_to_evaluate: Optional[str] = None) -> "TestRunBuilder[T]"
Set the prompt version ID for the test run. Optionally, set the context to evaluate for the prompt. (Note: setting the context to evaluate overrides the CONTEXT_TO_EVALUATE dataset column value.)
Arguments:
| Name | Type | Description |
|---|---|---|
| prompt_version_id | str | The ID of the prompt version to use |
| context_to_evaluate | Optional[str] | The context to evaluate for the prompt (essentially a variable name) |
Returns:
| Name | Description |
|---|---|
| [TestRunBuilder](/sdk/python/references/test_runs/test_run_builder)[T] | The current TestRunBuilder instance for method chaining |
Raises:
ValueError - If a workflow ID, prompt chain version ID, or output function is already set for this run builder
with_prompt_chain_version_id
def with_prompt_chain_version_id(
prompt_chain_version_id: str,
context_to_evaluate: Optional[str] = None) -> "TestRunBuilder[T]"
Set the prompt chain version ID for the test run. Optionally, set the context to evaluate for the prompt chain. (Note: setting the context to evaluate overrides the CONTEXT_TO_EVALUATE dataset column value.)
Arguments:
| Name | Type | Description |
|---|---|---|
| prompt_chain_version_id | str | The ID of the prompt chain version to use |
| context_to_evaluate | Optional[str] | The context to evaluate for the prompt chain (essentially a variable name) |
Returns:
| Name | Description |
|---|---|
| [TestRunBuilder](/sdk/python/references/test_runs/test_run_builder)[T] | The current TestRunBuilder instance for method chaining |
Raises:
ValueError - If a workflow ID, prompt version ID, or output function is already set for this run builder
with_simulation_config
def with_simulation_config(simulation_config: SimulationConfig) -> "TestRunBuilder[T]"
Set the simulation configuration for the test run. Use this to run AI-simulated multi-turn conversations against your prompt or workflow.
Execution flow: With platform evaluators only, the simulation runs on the Maxim platform. With local evaluators, the SDK fetches simulation output and runs your evaluators locally.
When used with yields_output(), the SDK runs your output function locally in a turn-by-turn loop. You can omit both with_prompt_version_id() and with_workflow_id() for SDK-only simulation. Your function receives SimulationContext with conversation_history, current_user_input, turn_number, total_cost, and total_tokens. Use turn.response.get("output", "") for the assistant’s text in each turn.
Arguments:
| Name | Type | Description |
|---|---|---|
| simulation_config | SimulationConfig | The simulation configuration (e.g., max_turns, persona) |
Returns:
| Name | Description |
|---|---|
| [TestRunBuilder](/sdk/python/references/test_runs/test_run_builder)[T] | The current TestRunBuilder instance for method chaining |
Raises:
ValueError - If simulation config is used with with_prompt_chain_version_id (use with_workflow_id or with_prompt_version_id instead)
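A minimal sketch of a platform-executed simulation. The `max_turns` and `persona` fields come from the argument description above; the import path is an assumption:

```python
from maxim.models import SimulationConfig  # import path is an assumption

builder = (
    builder
    .with_workflow_id("your-workflow-id")
    .with_simulation_config(SimulationConfig(
        max_turns=5,
        persona="A frustrated customer trying to cancel a subscription.",
    ))
)
```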
with_preset
def with_preset(preset_name: str) -> "TestRunBuilder[T]"
Set a preset (test configuration) to use for this test run. The preset provides default values for datasets, evaluators, simulation config, and context-to-evaluate settings. Any values explicitly set via other builder methods will take priority over preset defaults.
Requires an entity to be set first via with_workflow_id(), with_prompt_version_id(), or with_prompt_chain_version_id(), as the preset is scoped to a specific entity.
Arguments:
| Name | Type | Description |
|---|---|---|
| preset_name | str | The name of the preset (test config) saved on the Maxim platform |
Returns:
| Name | Description |
|---|---|
| [TestRunBuilder](/sdk/python/references/test_runs/test_run_builder)[T] | The current TestRunBuilder instance for method chaining |
Raises:
ValueError - If preset_name is empty or not a string
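A short sketch (the preset name is a placeholder). The entity must be set before the preset, since presets are scoped to it:

```python
builder = (
    builder
    .with_workflow_id("your-workflow-id")
    .with_preset("nightly-regression")  # test config saved on the Maxim platform
)
# Anything set explicitly on the builder still overrides the preset's defaults.
```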
yields_output
def yields_output(
output_function: Callable[..., Union[YieldedOutput, Awaitable[YieldedOutput]]]
) -> "TestRunBuilder[T]"
Set the output function for the test run. When combined with with_simulation_config(), enables local-execution simulation where your function is called turn-by-turn with SimulationContext (conversation history, current user input, turn number, total_cost, total_tokens). You can omit both with_prompt_version_id() and with_workflow_id() for SDK-only simulation. Use turn.response.get("output", "") for the assistant’s text in each turn of conversation_history.
Arguments:
| Name | Type | Description |
|---|---|---|
| output_function | Callable[..., Union[YieldedOutput, Awaitable[YieldedOutput]]] | The output function. Called as (data) in standard runs, or as (data, simulation_context) when used with simulation |
Returns:
| Name | Description |
|---|---|
| [TestRunBuilder](/sdk/python/references/test_runs/test_run_builder)[T] | The current TestRunBuilder instance for method chaining |
Raises:
ValueError - If a workflow ID, prompt chain version ID or prompt version ID is already set for this run builder
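A sketch of an output function that works in both standard and simulated runs. Import paths, the `YieldedOutput(data=...)` field, and the `"input"` column name are assumptions; `call_my_llm` is a hypothetical helper standing in for your model call:

```python
from typing import Optional

from maxim.models import LocalData, SimulationContext, YieldedOutput

def my_agent(data: LocalData, sim: Optional[SimulationContext] = None) -> YieldedOutput:
    if sim is not None:
        # Simulated multi-turn run: rebuild the conversation so far.
        # turn.response.get("output", "") holds each prior assistant reply.
        history = [turn.response.get("output", "") for turn in sim.conversation_history]
        prompt = sim.current_user_input
    else:
        # Standard run: read the input column from the dataset entry.
        prompt = data.get("input", "")
    reply = call_my_llm(prompt)  # hypothetical helper
    return YieldedOutput(data=reply)

builder = builder.yields_output(my_agent)
```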
with_concurrency
def with_concurrency(concurrency: int) -> "TestRunBuilder[T]"
Set the concurrency level for the test run
Arguments:
| Name | Type | Description |
|---|---|---|
| concurrency | int | The concurrency level to use |
Returns:
| Name | Description |
|---|---|
| [TestRunBuilder](/sdk/python/references/test_runs/test_run_builder)[T] | The current TestRunBuilder instance for method chaining |
with_logger
def with_logger(logger: TestRunLogger) -> "TestRunBuilder[T]"
Set the logger for the test run
Arguments:
| Name | Type | Description |
|---|---|---|
| logger | TestRunLogger | The logger to use |
Returns:
| Name | Description |
|---|---|
| [TestRunBuilder](/sdk/python/references/test_runs/test_run_builder)[T] | The current TestRunBuilder instance for method chaining |
run
def run(timeout_in_minutes: Optional[int] = 10) -> Optional[RunResult]
Run the test
Arguments:
| Name | Type | Description |
|---|---|---|
| timeout_in_minutes | Optional[int] | The timeout in minutes. Defaults to 10. |
Returns:
| Name | Description |
|---|---|
| [RunResult](/sdk/python/references/models/test_run) | The result of the test run |
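A short usage note, grounded in the `Optional[RunResult]` return type (what happens on the platform after a timeout is an assumption):

```python
result = builder.run(timeout_in_minutes=30)
if result is None:
    # The SDK stopped waiting after the timeout; the run may still be
    # progressing on the Maxim platform.
    print("Timed out waiting for the test run to finish.")
```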