Overview
The Human Review system lets you queue test runs for manual evaluation by human reviewers, providing a way to validate model outputs beyond automated checks.

Adding a Run to Review Queue
Queue a run for human review using the `add_to_queue()` method.
Parameters
- `assigned_reviewers` - List of reviewer email addresses (an empty list allows any reviewer)
- `number_of_reviews` - How many reviewers will evaluate each test (default: 1)
- `rereview_auto_eval` - Whether to re-run auto-evaluation after reviews (default: True)
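The SDK's exact client code is not shown here, so the following is a minimal sketch with a stand-in `Run` class that only models the parameters listed above; the real method signature may differ.

```python
from dataclasses import dataclass, field
from typing import List

# Stand-in for the SDK's Run object. The real client is not shown in
# these docs, so this stub only models the documented parameters.
@dataclass
class Run:
    queued: bool = False
    assigned_reviewers: List[str] = field(default_factory=list)
    number_of_reviews: int = 1
    rereview_auto_eval: bool = True

    def add_to_queue(self,
                     assigned_reviewers: List[str] = None,
                     number_of_reviews: int = 1,
                     rereview_auto_eval: bool = True) -> None:
        # An empty reviewer list means any reviewer may pick up the run.
        self.assigned_reviewers = assigned_reviewers or []
        self.number_of_reviews = number_of_reviews
        self.rereview_auto_eval = rereview_auto_eval
        self.queued = True

run = Run()
run.add_to_queue(assigned_reviewers=["alice@example.com"], number_of_reviews=2)
print(run.queued, run.number_of_reviews)  # True 2
```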
Working with Reviews
Once a run is queued, you can access the review through the `run.review` cached property.
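Because `run.review` is a cached property, the review object is fetched once and reused on subsequent accesses. The sketch below shows that access pattern with Python's `functools.cached_property`; the class and fields are illustrative stand-ins, not the SDK's implementation.

```python
from functools import cached_property

class Run:
    """Illustrative stand-in showing the cached-property access pattern."""

    def __init__(self, review_id: str):
        self.review_id = review_id
        self.fetch_count = 0

    @cached_property
    def review(self):
        # In the real SDK this would fetch the review from the API;
        # here we count calls to show the value is computed only once.
        self.fetch_count += 1
        return {"id": self.review_id, "status": "Pending"}

run = Run("rev_123")
first = run.review
second = run.review
print(first is second, run.fetch_count)  # True 1
```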
Key Properties
- `id` - Same as `run.review_id`
- `status` - Current review status (Pending, Archived, or Completed)
- `pass_rate_human_eval` - Pass rate across all human reviews
- `agreement_rate_human_eval` - Agreement rate between human reviewers
- `test_results` - List of completed test results (cached property, requires `await`)
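The docs do not define exactly how the two rates are computed, so the sketch below makes a plausible assumption: the pass rate is the fraction of individual human reviews marked passing, and the agreement rate is the fraction of tests on which all reviewers gave the same verdict.

```python
from typing import Dict, List

def pass_rate(reviews: List[bool]) -> float:
    """Fraction of individual human reviews marked as passing."""
    return sum(reviews) / len(reviews) if reviews else 0.0

def agreement_rate(reviews_by_test: Dict[str, List[bool]]) -> float:
    """Fraction of tests on which all reviewers gave the same verdict."""
    tests = list(reviews_by_test.values())
    agreed = sum(1 for verdicts in tests if len(set(verdicts)) == 1)
    return agreed / len(tests) if tests else 0.0

reviews_by_test = {
    "test_a": [True, True],    # both reviewers agree: pass
    "test_b": [True, False],   # reviewers disagree
}
all_reviews = [v for vs in reviews_by_test.values() for v in vs]
print(pass_rate(all_reviews))           # 0.75
print(agreement_rate(reviews_by_test))  # 0.5
```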
Working with Test Results
Access individual test result reviews to get detailed feedback.

Test Result Properties
- `reviewed_by` - List of reviewer email addresses
- `reviews` - List of all reviews for this test
- `test` - The original test being reviewed
- `check_results` - Auto-evaluated check results
Test Review Properties
- `feedback` - Optional reviewer feedback
- `completed_by` - Reviewer who completed this review
- `completed_at` / `started_at` - Timestamps
- `auto_eval_review_values` - Human validation of auto-evaluations
- `custom_review_values` - Custom template review data
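Tying the two property sets together, here is a hedged sketch of collecting reviewer feedback from a review's test results. The SDK describes `test_results` as an awaitable cached property; it is modeled here as an async method on stand-in classes for simplicity, and all names are illustrative.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative stubs mirroring the properties listed above;
# the real SDK classes may differ.
@dataclass
class TestReview:
    completed_by: str
    feedback: Optional[str] = None

@dataclass
class TestResult:
    reviewed_by: List[str]
    reviews: List[TestReview] = field(default_factory=list)

class Review:
    async def test_results(self) -> List[TestResult]:
        # Stand-in for the awaited fetch of completed test results.
        return [TestResult(
            reviewed_by=["alice@example.com"],
            reviews=[TestReview("alice@example.com", "Looks correct")],
        )]

async def collect_feedback(review: Review) -> List[str]:
    """Gather every non-empty feedback string, tagged with its reviewer."""
    notes = []
    for result in await review.test_results():
        for r in result.reviews:
            if r.feedback:
                notes.append(f"{r.completed_by}: {r.feedback}")
    return notes

feedback = asyncio.run(collect_feedback(Review()))
print(feedback)  # ['alice@example.com: Looks correct']
```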