Suite object, then call create(). For example:
Note on Async: All SDK functions are asynchronous, so you will need to call them from an asynchronous context. See the async docs for more information.
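Because every SDK call is a coroutine, a plain script needs an event loop to drive it. The following is a minimal sketch using only the standard library. `create_suite` is a hypothetical stand-in for an asynchronous SDK call, not a real SDK function:

```python
import asyncio


async def create_suite(title: str) -> str:
    """Hypothetical stand-in for an asynchronous SDK call."""
    await asyncio.sleep(0)  # placeholder for the network round trip
    return f"created: {title}"


async def main() -> str:
    # Inside a coroutine, async functions are awaited directly.
    return await create_suite("My Suite")


if __name__ == "__main__":
    # From synchronous code, asyncio.run() starts an event loop and
    # runs the coroutine to completion.
    print(asyncio.run(main()))
```

In an environment that already runs an event loop, such as a Jupyter notebook, you would typically write `await main()` instead of calling `asyncio.run()`.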
context parameter of the Test:
NOTE: Context field values can be either raw strings or JSON objects. If it is a JSON object, it will be parsed correctly and pretty-printed in the UI.
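To make the two value shapes concrete, here is a small sketch of a context that mixes a raw string with a JSON object, along with one way such values could be pretty-printed. The helper `render_context_value` and the example keys are illustrative assumptions; this is not the UI's actual rendering code:

```python
import json
from typing import Union


def render_context_value(value: Union[str, dict]) -> str:
    """Pretty-print JSON objects; leave raw strings unchanged."""
    if isinstance(value, dict):
        return json.dumps(value, indent=2)
    return value


# Hypothetical context: one raw string value, one JSON object value.
context = {
    "user_name": "Ada",
    "profile": {"plan": "pro", "seats": 3},
}

rendered = {key: render_context_value(val) for key, val in context.items()}

for key, text in rendered.items():
    print(f"{key}:\n{text}")
```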
global_checks parameter. For example, this is how you would check the grammar of every test by default.
Suite.from_id:
run() function. This will run all the tests in the suite against your model. We support three different ways to produce outputs as you run the suite:
gpt-4o-mini performs on the tests you’ve defined.
NOTE: If you are using this method, the input_under_test field in the QuestionAnswerPair must match the input_under_test field in the test suite. Likewise, if you are using either the context or file features, the context and files must also match.
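The matching requirement can be pictured as a lookup keyed on the input and its context. The sketch below uses hypothetical stand-in classes (`SuiteTestSketch`, `PairSketch`) rather than the SDK's own types, invents the field name `llm_output`, and leaves files out for brevity. How the platform actually performs the match is an assumption:

```python
import json
from dataclasses import dataclass, field


@dataclass
class SuiteTestSketch:
    """Hypothetical stand-in for a test in the suite."""
    input_under_test: str
    context: dict = field(default_factory=dict)


@dataclass
class PairSketch:
    """Hypothetical stand-in for a question-answer pair."""
    input_under_test: str
    llm_output: str
    context: dict = field(default_factory=dict)


def match_key(input_under_test: str, context: dict) -> tuple:
    # Serialize the context with sorted keys so equal dicts compare equal.
    return (input_under_test, json.dumps(context, sort_keys=True))


def unmatched_tests(tests, pairs):
    """Return the tests for which no pair has the same input and context."""
    provided = {match_key(p.input_under_test, p.context) for p in pairs}
    return [
        t for t in tests
        if match_key(t.input_under_test, t.context) not in provided
    ]


tests = [
    SuiteTestSketch("What is 2 + 2?"),
    SuiteTestSketch("Capital of France?", {"lang": "en"}),
]
pairs = [
    PairSketch("What is 2 + 2?", "4"),
    PairSketch("Capital of France?", "Paris"),  # context missing: no match
]
missing = unmatched_tests(tests, pairs)
```

Here the second pair has the right input but omits the context, so its test is reported as unmatched.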
run() function to control its behavior, in addition to the model parameter. If you set wait_for_completion=True, the function will block until the run is complete (by default, it returns as soon as the run is started, not when it is complete). You can also pass a run_name parameter to uniquely identify the run. This is useful if you are starting many runs of the same test suite and need a way to disambiguate them.
Finally, you can also pass a RunParameters object to the run() function to control more aspects of the run. Some options include:
eval_model: The model to use as the LLM as judge.
parallelism: The number of tests to run at once.
heavyweight_factor: Run the auto eval multiple times and take the mode of the results.
max_output_tokens: If using the first model option above, control the max_output_tokens. Ignored if outputs are provided directly or using a function.
system_prompt: If using the first model option above, provide a system prompt to the model.
except_on_error: Will raise an exception if the run fails.
custom_parameters: Custom parameters to pass to the model. This will be shown in the run result page, even when running with a function.

Run object. You can access the results of each test in the test_results property, as well as the top-line pass rate, the URL, and other information.
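To illustrate what the parallelism and heavyweight_factor options describe, the sketch below caps concurrent evaluations with a semaphore and takes the mode of repeated verdicts. It is a standard-library illustration only: `judge_once` is a hypothetical, deterministic stand-in for an LLM judge, and how the platform actually schedules and aggregates evaluations is an assumption:

```python
import asyncio
from statistics import mode


async def judge_once(output: str, attempt: int) -> str:
    """Hypothetical judge; deterministic so the sketch is reproducible."""
    await asyncio.sleep(0)  # placeholder for a call to the judge model
    return "fail" if attempt == 1 else "pass"


async def judge_with_mode(output: str, heavyweight_factor: int) -> str:
    # Evaluate the same output several times and keep the most common verdict.
    verdicts = [await judge_once(output, i) for i in range(heavyweight_factor)]
    return mode(verdicts)


async def evaluate_all(outputs, parallelism: int, heavyweight_factor: int):
    # The semaphore allows at most `parallelism` evaluations at a time.
    limiter = asyncio.Semaphore(parallelism)

    async def bounded(output: str) -> str:
        async with limiter:
            return await judge_with_mode(output, heavyweight_factor)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(o) for o in outputs))


if __name__ == "__main__":
    print(asyncio.run(evaluate_all(["a", "b"], parallelism=1, heavyweight_factor=3)))
```

An odd heavyweight_factor avoids ties when the verdict is binary.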
Suite.from_file() function.
golden_answer field in the test. A full example is as follows: