# Evaluator

<figure><img src="https://3697023207-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFSlso1Kjob5CLDrh0dVn%2Fuploads%2FPUJBIPZX3JoaSJuFW1tg%2Fevaluator_view.gif?alt=media&#x26;token=7eede172-0398-405f-befa-435d21493224" alt=""><figcaption></figcaption></figure>

The Evaluator View is similar to the batch interface in that it lets you run a CSV file of inputs through your agent all at once. Use this view to test your agent before a project goes live; it leverages an LLM to evaluate your agent's outputs.

There are two types of evaluation:

### 1. Grading outputs based on criteria

On the right-hand side, create an evaluator:

* Select the output to evaluate
* Add a system prompt containing the evaluation logic (the criteria the output should meet)
* Give it a name
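As an illustration, an evaluator system prompt might look like the following. The criteria here are hypothetical examples; write criteria specific to your own use case:

```
You are grading a customer-support answer.
Score it PASS or FAIL against these criteria:
1. The answer directly addresses the user's question.
2. The tone is polite and professional.
3. No pricing information is invented.
Reply with the score followed by a one-sentence justification.
```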

Once the evaluator is created, a new column will appear in the table showing the evaluation results for each row.

<figure><img src="https://3697023207-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFSlso1Kjob5CLDrh0dVn%2Fuploads%2FOpqszZX39WgRKRPfhDBB%2Fimage.png?alt=media&#x26;token=b36471d9-3e64-498e-9150-092701afcc1f" alt=""><figcaption></figcaption></figure>

You can add as many evaluators as there are outputs in your workflow; each one evaluates a different output. Give each evaluator's model a system prompt and select which of your agent's outputs it should evaluate.

<figure><img src="https://3697023207-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFSlso1Kjob5CLDrh0dVn%2Fuploads%2FPx3um6v9diQmvwjkVauq%2Fimage.png?alt=media&#x26;token=0a75bfba-6458-483c-85d8-4010d2da8476" alt=""><figcaption></figcaption></figure>

You can add rows to evaluate manually, or upload a CSV containing all the scenarios you want to evaluate (click the three dots, then the upload CSV option).
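If you generate your test scenarios programmatically, a short script can produce the CSV to upload. This is a minimal sketch: the column names (`question`, `user_name`) are hypothetical and should match the input names defined in your own agent.

```python
import csv

# Hypothetical test scenarios -- one row per agent run.
# Replace the keys with your agent's actual input names.
scenarios = [
    {"question": "What is your refund policy?", "user_name": "Alice"},
    {"question": "How do I reset my password?", "user_name": "Bob"},
]

# Write a header row plus one row per scenario.
with open("scenarios.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(scenarios[0].keys()))
    writer.writeheader()
    writer.writerows(scenarios)
```

Each column becomes one input field per row when the file is uploaded.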

### 2. Comparing outputs to a gold standard answer

<figure><img src="https://3697023207-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFSlso1Kjob5CLDrh0dVn%2Fuploads%2Fho8flDgaYWwgGDBqF6iH%2Fimage.png?alt=media&#x26;token=a19d240a-d094-4336-9a6a-cff109e6731a" alt=""><figcaption></figcaption></figure>

Click 'Requires Expected Answer' to add a ground truth to your execution. This is the response you would expect from the AI model; the evaluator will take it into account in its analysis.
