Evaluate Run
POST /api/2/auto_eval/evaluation/evaluate_run
Similar to Evaluate, but persists the results as an EvaluationRun for further capabilities.
Request
- application/json
Body
metrics
object[] required
Possible values: [MODEL_DEPLOYMENT_STATUS_UNSPECIFIED, MODEL_DEPLOYMENT_STATUS_PENDING, MODEL_DEPLOYMENT_STATUS_ONLINE, MODEL_DEPLOYMENT_STATUS_OFFLINE, MODEL_DEPLOYMENT_STATUS_PAUSED]
The project where the evaluation run will be persisted.
If specified, the evaluation run will be associated with this experiment.
metadata
object
Common metadata relevant to the application configuration from which all request inputs were derived, e.g. 'llm_model', 'chunk_size'.
fields
object
required
property name*
object
Ordered row values, with length always equal to num_rows on the corresponding view.
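As a rough illustration of the request body described above, here is a minimal Python sketch that builds a payload and POSTs it to the endpoint. The base URL, auth header, metric reference shape, the project/experiment field names, and the inner shape of each `fields` entry are assumptions for illustration; only `metrics`, `metadata`, and `fields` appear in the schema above.

```python
import requests  # third-party HTTP client (pip install requests)

BASE_URL = "https://example.api.host"  # assumption: your deployment's host
TOKEN = "YOUR_API_TOKEN"               # assumption: bearer-token auth

# Assumed payload shape: only `metrics`, `metadata`, and `fields` appear in the
# schema above; the other keys and inner value shapes are illustrative guesses.
payload = {
    "metrics": [{"id": "my-metric-id"}],           # assumed metric reference shape
    "projectId": "my-project",                     # project where the run will be persisted
    "experimentId": "my-experiment",               # optional experiment association
    "metadata": {"llm_model": "gpt-4o", "chunk_size": "512"},
    "fields": {
        # each property holds ordered row values; lengths must equal num_rows
        "input": {"values": ["What is RAG?", "Define BLEU."]},   # assumed inner shape
        "output": {"values": ["...", "..."]},
    },
}

resp = requests.post(
    f"{BASE_URL}/api/2/auto_eval/evaluation/evaluate_run",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["runId"])
```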
Responses
- 200
Successful operation
- application/json
Schema
runId
string
metricScores
object[] required
metric
object required
Possible values: [MODEL_DEPLOYMENT_STATUS_UNSPECIFIED, MODEL_DEPLOYMENT_STATUS_PENDING, MODEL_DEPLOYMENT_STATUS_ONLINE, MODEL_DEPLOYMENT_STATUS_OFFLINE, MODEL_DEPLOYMENT_STATUS_PAUSED]
Example (from schema)

```json
{
  "runId": "string",
  "metricScores": [
    {
      "metric": {
        "id": "string",
        "name": "string",
        "description": "string",
        "deploymentStatus": "MODEL_DEPLOYMENT_STATUS_UNSPECIFIED"
      },
      "scores": [
        0
      ]
    }
  ]
}
```
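Assuming a response shaped like the example above, a short sketch of reading the run id and per-metric scores (the helper name and sample values are hypothetical):

```python
from typing import Any, Dict

def summarize_run(run: Dict[str, Any]) -> None:
    """Print the run id and a mean score per metric from an evaluate_run response."""
    print("run:", run["runId"])
    for entry in run["metricScores"]:
        metric, scores = entry["metric"], entry["scores"]
        mean = sum(scores) / len(scores) if scores else float("nan")
        print(f'  {metric["name"]} ({metric["deploymentStatus"]}): mean={mean:.3f}')

# Usage with a response shaped like the example above (values are made up):
summarize_run({
    "runId": "run-123",
    "metricScores": [{
        "metric": {"id": "m1", "name": "faithfulness", "description": "",
                   "deploymentStatus": "MODEL_DEPLOYMENT_STATUS_ONLINE"},
        "scores": [0.8, 0.9],
    }],
})
```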