The Vals platform lets you test LLM applications such as copilots and RAG systems. This documentation covers how to use both the web application and the CLI/SDK tools.

At the core of the platform are Test Suites. Each Test Suite evaluates your model's performance in a specific area and is composed of multiple Tests. Each Test has exactly one input, typically representing user input. For example, if you are testing a math copilot, an input might be:
What is 3 * 2?
Each Test also includes a set of Checks. A Check evaluates whether the model’s response meets a specific expectation. For the input above, a Check might be:
The output includes the number 6.
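To make the relationship between Test Suites, Tests, and Checks concrete, here is a minimal Python sketch of the structure described above. It is an illustration only and does not use the Vals SDK: the class names, the `run_suite` helper, and the `stub_model` function are all hypothetical, and the substring match stands in for however the platform actually evaluates a Check.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Check:
    """One expectation about the model's output (hypothetical structure)."""
    description: str
    passes: Callable[[str], bool]  # stand-in evaluator, not the platform's own


@dataclass
class Test:
    """Exactly one input plus the Checks applied to the model's response."""
    input: str
    checks: List[Check] = field(default_factory=list)


@dataclass
class TestSuite:
    """A collection of Tests targeting one area of model behavior."""
    title: str
    tests: List[Test] = field(default_factory=list)


def run_suite(suite: TestSuite, model: Callable[[str], str]) -> List[Tuple[str, str, bool]]:
    """Call the model once per Test and evaluate every Check on its output."""
    results = []
    for test in suite.tests:
        output = model(test.input)
        for check in test.checks:
            results.append((test.input, check.description, check.passes(output)))
    return results


# The math copilot example from the text, with a simple substring check.
suite = TestSuite(
    title="Math copilot",
    tests=[
        Test(
            input="What is 3 * 2?",
            checks=[Check("The output includes the number 6.", lambda out: "6" in out)],
        )
    ],
)


def stub_model(prompt: str) -> str:
    """Placeholder for a real LLM application; always returns a fixed answer."""
    return "3 * 2 = 6"


results = run_suite(suite, stub_model)
for test_input, description, passed in results:
    print(f"{test_input!r} | {description} | {'PASS' if passed else 'FAIL'}")
```

In this sketch each Test holds a single input and any number of Checks, mirroring the one-input, many-Checks shape described above.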
The rest of this documentation explains how to create Test Suites and run them against models. To get started, see Creating Test Suites.