Note: By default, we only make a subset of models available to users. If you’re interested in using additional models, including those hosted on Bedrock and Azure, please reach out to the Vals team.

Model Parameters:
includes
. By default, we evaluate with GPT-4o. However, using this dropdown, you can also choose another model for evaluation (such as Mistral, Llama, or Claude).
DEFAULT: Compute and display a confidence score (high or low) for each check.
DEFAULT: Create a summary of the run.
DEFAULT: Run the right-answer comparison.