Evaluate

POST /api/2/auto_eval/evaluation/evaluate

Evaluate a metric on rows of data, returning scores for each row. Specify metric.id or metric.name to identify the metric.

Request

Body

    metric (object, required)
        id (string)
        name (string)
        description (string)
        deploymentStatus (ModelDeploymentStatus, string)
            Possible values: [MODEL_DEPLOYMENT_STATUS_UNSPECIFIED, MODEL_DEPLOYMENT_STATUS_PENDING, MODEL_DEPLOYMENT_STATUS_ONLINE, MODEL_DEPLOYMENT_STATUS_OFFLINE, MODEL_DEPLOYMENT_STATUS_PAUSED]
    input (string[], required)
    output (string[], required)
    groundTruth (string[], required)
    projectId (string)
        The project where evaluation inference logs will be stored.
    metadata (object)
        Common metadata relevant to the application configuration from which all request inputs were derived, e.g. 'llm_model', 'chunk_size'.
        fields (object, required)
            property name* (any)
                Ordered row values with length always equal to num_rows on the corresponding view.
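As a sketch, a request body following the schema above might be built like this. The field names come from this page; the metric name, row values, and project id are invented example values, not defaults of the API:

```python
import json

# Illustrative request body for POST /api/2/auto_eval/evaluation/evaluate.
# "faithfulness" and "my-project" are made-up example values.
payload = {
    "metric": {"name": "faithfulness"},           # or {"id": "..."} to select by id
    "input": ["What is the capital of France?"],  # one entry per row
    "output": ["Paris is the capital of France."],
    "groundTruth": ["Paris"],
    "projectId": "my-project",                    # where inference logs are stored
    "metadata": {
        "fields": {"llm_model": "gpt-4", "chunk_size": 512}
    },
}

# input, output, and groundTruth are parallel arrays: one score per row.
assert len(payload["input"]) == len(payload["output"]) == len(payload["groundTruth"])
print(json.dumps(payload, indent=2))
```

The payload would then be sent as the JSON body of the POST request, with whatever authentication the API requires.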

Responses

Successful operation

Schema

    metric (object, required)
        id (string)
        name (string)
        description (string)
        deploymentStatus (ModelDeploymentStatus, string)
            Possible values: [MODEL_DEPLOYMENT_STATUS_UNSPECIFIED, MODEL_DEPLOYMENT_STATUS_PENDING, MODEL_DEPLOYMENT_STATUS_ONLINE, MODEL_DEPLOYMENT_STATUS_OFFLINE, MODEL_DEPLOYMENT_STATUS_PAUSED]
    scores (number[], required)
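A minimal sketch of reading a response: the JSON shape follows the response schema above, but the metric id, description, and score values are invented for illustration.

```python
import json

# Example response body matching the schema above (all values invented).
response_text = """
{
  "metric": {
    "id": "metric-123",
    "name": "faithfulness",
    "description": "Example metric description",
    "deploymentStatus": "MODEL_DEPLOYMENT_STATUS_ONLINE"
  },
  "scores": [0.92, 0.35, 0.78]
}
"""

body = json.loads(response_text)

# scores is parallel to the request's input/output/groundTruth arrays,
# so scores[i] is the metric's score for row i.
for i, score in enumerate(body["scores"]):
    print(f"row {i}: {score}")
```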