Models
Pydantic response models returned by the CT SDK.
All models are declared with extra='ignore', so new server fields never break older SDK versions. Field names use snake_case throughout.
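A minimal sketch of what extra='ignore' buys you, using a standalone Pydantic v2 model (MiniTask is a hypothetical stand-in, not part of the SDK):

```python
from pydantic import BaseModel, ConfigDict

# Hypothetical stand-in for an SDK model, configured like the real ones.
class MiniTask(BaseModel):
    model_config = ConfigDict(extra="ignore")

    name: str
    feedback_count: int = 0

# A newer server can return fields this model does not declare;
# with extra='ignore' they are dropped instead of raising a ValidationError.
task = MiniTask.model_validate({
    "name": "summarize_ticket",
    "feedback_count": 3,
    "brand_new_server_field": True,  # unknown field: silently ignored
})
print(task.model_dump())  # {'name': 'summarize_ticket', 'feedback_count': 3}
```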
```python
from kaizen_sdk.models import Task, FeedbackResult, Prompt, Job, OptimizeResult, CostEstimate, TraceResult
```

Task
A tuning task (e.g. "summarize_ticket") with its configuration and current status.
```python
class Task(BaseModel): ...
```

Fields
| Field | Type | Default | Description |
|---|---|---|---|
| id | uuid.UUID | required | Unique task identifier. |
| name | str | required | Human-readable task name (e.g. "summarize_ticket"). |
| description | str \| None | None | Optional description of what the task does. |
| task_schema | dict \| None | None | JSON schema for expected input structure. (API alias: schema_json) |
| feedback_threshold | int | required | Number of feedback entries before auto-optimization triggers. |
| feedback_retention_limit | int | 1000 | Maximum feedback entries to retain per task. |
| evaluator_config | dict \| None | None | Custom evaluator configuration. |
| teacher_model | str \| None | None | LLM used as teacher during optimization (e.g. "gpt-4o"). |
| judge_model | str \| None | None | LLM used to judge prompt quality (e.g. "gpt-4o-mini"). |
| module_type | str | "predict" | DSPy module type. |
| cost_budget | float \| None | None | Maximum USD to spend per optimization run. |
| github_repo | str \| None | None | Git repository for auto-PR (e.g. "org/repo"). |
| github_base_branch | str \| None | None | Base branch for PRs (e.g. "main"). |
| prompt_path | str \| None | None | File path where the optimized prompt is written. |
| prompt_format | str \| None | None | Format for the prompt file (e.g. "python", "text"). |
| created_at | datetime | required | Timestamp when the task was created. |
| feedback_count | int | 0 | Total feedback entries logged for this task. |
| last_optimization | datetime \| None | None | When the most recent optimization run completed. |
| active_prompt_score | float \| None | None | Eval score of the currently active prompt version. |
| threshold_progress | str | "0/50" | Human-readable progress toward the feedback threshold (e.g. "42/50"). |
Example

```python
with CTClient() as client:
    task = client.create_task(name="summarize_ticket", feedback_threshold=100)
    print(task.id)                  # uuid.UUID
    print(task.threshold_progress)  # "0/100"
```

FeedbackResult
A single feedback entry logged for a task.
```python
class FeedbackResult(BaseModel): ...
```

Fields
| Field | Type | Default | Description |
|---|---|---|---|
| id | uuid.UUID | required | Unique feedback entry identifier. |
| task_id | uuid.UUID | required | Task this feedback belongs to. |
| inputs | dict \| None | None | Input variables that were passed to the LLM. |
| output | str \| None | None | LLM output text that was evaluated. |
| score | float \| None | None | Quality score in [0.0, 1.0]. |
| source | str \| None | None | Source label (e.g. "sdk", "human", "auto"). |
| metadata | dict \| None | None | Arbitrary metadata attached to this entry. (API alias: metadata_) |
| created_at | datetime | required | Timestamp when the feedback was logged. |
Example

```python
with CTClient() as client:
    result = client.log_feedback(
        task_id=str(task.id),
        inputs={"ticket": "Login broken"},
        output="User cannot log in.",
        score=0.95,
    )
    print(result.id)  # uuid.UUID
    print(result.created_at)
```

Prompt
A prompt version produced by optimization (or the initial baseline).
```python
class Prompt(BaseModel): ...
```

PromptVersion is an alias for Prompt used in listing contexts; they are the same type.
Fields
| Field | Type | Default | Description |
|---|---|---|---|
| id | uuid.UUID | required | Unique prompt version identifier. |
| task_id | uuid.UUID | required | Task this prompt version belongs to. |
| version_number | int | required | Monotonically increasing version number. |
| prompt_text | str \| None | None | The full prompt template text. |
| eval_score | float \| None | None | Evaluation score achieved by this version. |
| status | str | required | Version status: "active", "candidate", or "archived". |
| optimizer | str \| None | None | Optimizer that produced this version (e.g. "MIPROv2"). |
| dspy_version | str \| None | None | DSPy version used during optimization. |
| created_at | datetime | required | Timestamp when this prompt version was created. |
Example

```python
with CTClient() as client:
    prompt = client.get_prompt(str(task.id))
    print(prompt.prompt_text)
    print(f"v{prompt.version_number} — score: {prompt.eval_score}")
```

Job
An optimization job (running or completed).
```python
class Job(BaseModel): ...
```

Fields
| Field | Type | Default | Description |
|---|---|---|---|
| id | uuid.UUID | required | Unique job identifier. |
| task_id | uuid.UUID | required | Task being optimized. |
| prompt_version_id | uuid.UUID \| None | None | The prompt version produced by this job (set on completion). |
| status | str | required | Job status: "PENDING", "RUNNING", "EVALUATING", "COMPILING", "SUCCESS", "FAILURE", or "PR_FAILED". |
| triggered_by | str \| None | None | How the job was triggered ("api", "threshold", "webhook"). |
| feedback_count | int \| None | None | Number of feedback entries used for this run. |
| pr_url | str \| None | None | Pull request URL if auto-PR was created. |
| error_message | str \| None | None | Error detail if status is "FAILURE" or "PR_FAILED". |
| job_metadata | dict \| None | None | Internal metadata from the optimization run. |
| progress_step | str \| None | None | Human-readable current step (e.g. "Compiling with MIPROv2..."). |
| started_at | datetime \| None | None | When the job started running. |
| completed_at | datetime \| None | None | When the job completed or failed. |
| created_at | datetime | required | When the job was created. |
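The status values above lend themselves to a simple polling loop. A hedged sketch: wait_for_job, its parameters, and the injectable get_job/sleep callables are illustrative helpers, not part of the SDK; only the status strings come from the table.

```python
import time

# Statuses from the table above that mean the job will not change further.
TERMINAL_STATUSES = {"SUCCESS", "FAILURE", "PR_FAILED"}

def wait_for_job(get_job, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll until the job reaches a terminal status or the timeout elapses.

    `get_job` is any zero-argument callable returning an object with a
    `.status` attribute, e.g. `lambda: client.get_job(job_id)`.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = get_job()
        if job.status in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job.status} after {timeout}s")
        sleep(interval)
```

Injecting `get_job` and `sleep` keeps the helper testable without a live server.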
Example

```python
with CTClient() as client:
    job = client.get_job(str(optimize_result.job.id))
    print(f"{job.status}: {job.progress_step}")
    if job.status == "SUCCESS":
        print(f"PR: {job.pr_url}")
    elif job.status in ("FAILURE", "PR_FAILED"):
        print(f"Error: {job.error_message}")
```

OptimizeResult
Returned by trigger_optimization(). Contains the queued job and a cost estimate.
```python
class OptimizeResult(BaseModel): ...
```

Fields
| Field | Type | Default | Description |
|---|---|---|---|
| job | Job | required | The optimization job that was created. |
| cost_estimate | CostEstimate | required | Estimated cost for this optimization run. |
| budget_warning | str \| None | None | Warning message if the estimated cost exceeds the task's cost_budget. |
Example

```python
with CTClient() as client:
    result = client.trigger_optimization(str(task.id))
    print(f"Job: {result.job.id}, Status: {result.job.status}")
    print(f"Cost: ${result.cost_estimate.estimated_cost_usd:.4f}")
    if result.budget_warning:
        print(f"⚠ {result.budget_warning}")
```

CostEstimate
Estimated resource usage for an optimization run.
```python
class CostEstimate(BaseModel): ...
```

Fields
| Field | Type | Description |
|---|---|---|
| estimated_cost_usd | float | Estimated total USD cost for the optimization run. |
| estimated_llm_calls | int | Estimated number of LLM API calls. |
| train_size | int | Number of feedback entries used for training. |
| val_size | int | Number of feedback entries used for validation. |
| max_trials | int | Maximum optimization trials planned. |
| teacher_model | str | Teacher model that will be used (e.g. "gpt-4o"). |
| judge_model | str | Judge model that will be used (e.g. "gpt-4o-mini"). |
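These fields can be inspected before committing to a run, e.g. to derive a rough per-call cost and compare against a task's cost_budget. A small sketch with made-up numbers (the values below are illustrative, not real API output):

```python
# Illustrative CostEstimate values (not real API output).
estimated_cost_usd = 1.25
estimated_llm_calls = 500

# Rough per-call cost implied by the estimate.
cost_per_call = estimated_cost_usd / estimated_llm_calls
print(f"~${cost_per_call:.4f} per LLM call")  # ~$0.0025 per LLM call

# The same comparison the server makes when it sets budget_warning
# on OptimizeResult: estimated cost vs. the task's cost_budget.
cost_budget = 1.00
over_budget = estimated_cost_usd > cost_budget
print("over budget:", over_budget)  # over budget: True
```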
TraceResult
A trace captured by the auto-instrument SDK.
```python
class TraceResult(BaseModel): ...
```

Fields
| Field | Type | Default | Description |
|---|---|---|---|
| id | uuid.UUID | required | Unique trace identifier. |
| task_id | uuid.UUID | required | Task this trace belongs to. |
| prompt_text | str \| None | None | The prompt text sent to the LLM. |
| response_text | str \| None | None | The LLM's response text. |
| model | str \| None | None | Model name (e.g. "gpt-4o-mini"). |
| tokens | int \| None | None | Total tokens used in the LLM call. |
| latency_ms | float \| None | None | End-to-end latency in milliseconds. |
| source_file | str \| None | None | Source file where the LLM call was made. |
| source_variable | str \| None | None | Variable name holding the prompt in source code. |
| score | float \| None | None | Quality score applied to this trace (if scored). |
| scored_by | str \| None | None | Who or what scored the trace (e.g. "sdk", "human"). |
| created_at | datetime | required | Timestamp when the trace was captured. |
Example

```python
# TraceResult is returned by CTClient.score()
with CTClient() as client:
    trace = client.score(
        trace_id=response.ct_trace_id,
        score=0.9,
        scored_by="human",
    )
    print(f"Trace {trace.id}: score={trace.score}, latency={trace.latency_ms}ms")
```