This page covers how to import and export data to and from the Vals platform: setting up a test suite, uploading historical Q&A pairs, or pulling run results for offline analysis.
To get started
If you’re building a new suite, start with Importing Data.
If you’ve already run an evaluation and want to analyze results, jump to Exporting Results.
A full test suite import includes tests, checks, context, tags, and any associated files. Only Test Input is required; all other fields are optional, so you can import inputs alone without defining any checks. Imported tests are appended after any existing tests in the suite.
If the import file includes a Test Id column and tests with matching IDs already exist in the suite, the import will be rejected by default. Check Overwrite tests with matching Test IDs in the import dialog to replace existing tests instead.
If you want to import existing model outputs and run checks against them, see this page.
Supported formats: CSV, JSON, ZIP
If your tests include file attachments (documents, images, etc.), use ZIP. Attached files should be stored under documents/ inside the ZIP.
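As a rough illustration of the ZIP layout described above, the following Python sketch packs a tests CSV together with attachments stored under documents/. The CSV file name tests.csv and the helper name build_import_zip are assumptions made for this example; this page does not state what the tests file inside the ZIP must be called.

```python
import csv
import io
import zipfile


def build_import_zip(rows, attachments):
    """Return ZIP bytes holding a tests CSV plus attachments under documents/.

    rows: list of dicts with "Test Input" and "Files" keys.
    attachments: dict mapping a bare filename to its bytes.
    """
    csv_buf = io.StringIO()
    writer = csv.DictWriter(csv_buf, fieldnames=["Test Input", "Files"])
    writer.writeheader()
    writer.writerows(rows)

    zip_buf = io.BytesIO()
    with zipfile.ZipFile(zip_buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # "tests.csv" is an assumed name, not one documented on this page.
        zf.writestr("tests.csv", csv_buf.getvalue())
        for name, data in attachments.items():
            # Attached files live under documents/ inside the ZIP.
            zf.writestr(f"documents/{name}", data)
    return zip_buf.getvalue()
```

Each value in the Files column should then match the path used inside the archive, for example documents/doc1.pdf.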
These columns define an individual test: the input sent to the model and any supporting context. Only the Test Input field is required.
| Column | Type | Description |
| --- | --- | --- |
| Test Id | str | Optional custom identifier for the test. Must be unique within the suite and at most 64 characters. If omitted, a random ID is generated. |
| Test Input | str | The prompt or question sent to the LLM (e.g., What is burden shifting under Title VII?) |
| Right Answer * | str | The expected correct answer |
| Tags | str | Labels for organizing tests (e.g., math, law). Spread across rows. See Formatting Rules |
| Files ** | str | Filename or path to an attached file (e.g., documents/doc1.pdf) |
| context:<key> | str | Context column for injecting key-value context into a test. Each context field gets its own column, prefixed with context: (e.g., context:date, context:user_id). See Context Columns |
* In most cases, either Right Answer or Checks are used. Learn more about Right Answer.
** Files will only work as expected for the .zip upload.
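The constraints stated above (Test Input is required; a Test Id, when given, must be unique within the suite and at most 64 characters) can be checked locally before uploading. The sketch below is a hypothetical pre-flight check, not part of the platform; the function name validate_tests and the existing_ids parameter are illustrative.

```python
def validate_tests(tests, existing_ids=()):
    """Return a list of problems found in a list of test dicts.

    existing_ids: Test Ids already present in the target suite, if known.
    """
    problems = []
    seen = set(existing_ids)
    for index, test in enumerate(tests, start=1):
        if not (test.get("Test Input") or "").strip():
            problems.append(f"test {index}: Test Input is required")
        test_id = test.get("Test Id")
        if not test_id:
            continue  # an ID is generated on import when none is given
        if len(test_id) > 64:
            problems.append(f"test {index}: Test Id exceeds 64 characters")
        if test_id in seen:
            problems.append(f"test {index}: duplicate Test Id {test_id!r}")
        seen.add(test_id)
    return problems
```

A duplicate against existing_ids is only a problem if you do not intend to enable the overwrite option in the import dialog.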
For fields that support multiple values (like Tags or Checks), each value goes in its own row beneath the test, rather than being comma-separated in a single cell.
Example:
| Test Id | Test Input | Tags | Operator | Criteria |
| --- | --- | --- | --- | --- |
| 19025787-… | Where is the Bay Area located? | Bay | includes | California |
|  |  | Easy | includes_exactly | Northern California, United States |
|  |  |  | excludes | Los Angeles |
|  |  |  | excludes_exactly | Atlantic Ocean |
Each row without a new Test Input belongs to the previous test. Tags stack down, and each check gets its own row.
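The row layout above can be produced programmatically. The following Python sketch spreads one test's tags and checks across rows, leaving Test Input blank on continuation rows. The helper names expand_test and to_csv are hypothetical, and the column set is limited to the ones shown in the example.

```python
import csv
import io
from itertools import zip_longest

COLUMNS = ["Test Input", "Tags", "Operator", "Criteria"]


def expand_test(test_input, tags=(), checks=()):
    """Spread one test's tags and (operator, criteria) checks across rows."""
    rows = []
    for i, (tag, check) in enumerate(zip_longest(tags, checks)):
        operator, criteria = check if check else ("", "")
        rows.append({
            # Only the first row carries the input; later rows belong to it.
            "Test Input": test_input if i == 0 else "",
            "Tags": tag or "",
            "Operator": operator,
            "Criteria": criteria,
        })
    # A test with no tags or checks still needs one row for its input.
    return rows or [
        {"Test Input": test_input, "Tags": "", "Operator": "", "Criteria": ""}
    ]


def to_csv(rows):
    """Serialize rows to CSV text; values containing commas are quoted."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Because the CSV writer quotes any value that contains a comma, a criteria string such as Northern California, United States stays in a single cell.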
Context is represented as columns rather than rows. Each context field has its own column, prefixed with context:. The column name after the prefix becomes the context key, and the cell value becomes the context value.
Example:
| Test Input | context:user_id | context:language | Operator | Criteria |
| --- | --- | --- | --- | --- |
| What is QSBS? | user-123 | en | includes | qualified |
| Describe vesting | user-456 | fr | includes | vesting |
This creates two tests, each with a user_id and language context entry. Empty cells in context columns are ignored — the context key will simply be omitted for that test.
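If your tests are held as dictionaries, the context columns can be derived from them. This is a minimal sketch under the assumption that each test carries an optional "context" dict; the function name context_rows and that input shape are illustrative, not a platform API.

```python
def context_rows(tests):
    """Flatten tests with a "context" dict into rows with context:<key> columns.

    Returns (fieldnames, rows), suitable for csv.DictWriter.
    """
    keys = []
    for test in tests:
        for key in test.get("context", {}):
            if key not in keys:
                keys.append(key)  # keep first-seen order for stable columns

    fieldnames = ["Test Input"] + [f"context:{key}" for key in keys]
    rows = []
    for test in tests:
        row = {"Test Input": test["Test Input"]}
        for key in keys:
            # A missing key becomes an empty cell, which the import ignores.
            row[f"context:{key}"] = test.get("context", {}).get(key, "")
        rows.append(row)
    return fieldnames, rows
```

A test that lacks a given key simply gets an empty cell in that column, matching the behavior described above.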
Results are best reviewed directly in the platform. If you need to export them for custom reporting or offline storage, we support CSV and JSON.
Supported formats: CSV, JSON
We recommend CSV if the data needs to be reviewed by non-technical users, and JSON for any programmatic use case.
Export completed human review data to analyze reviewer agreement, test-level feedback, and metric evaluations outside the platform.
Supported formats: CSV
Only completed reviews will be included in exports.