Node level evaluation
Evaluate any component of your trace or log to gain insights into your agent’s behavior.
What is Node level evaluation (or Agentic evaluation)?
As your AI application grows in complexity, it becomes increasingly difficult to understand how it performs across different flows and components. This granular insight is necessary to identify bottlenecks or low-quality areas in your application's or agent's flow. By targeting the underperforming areas, you can optimize overall performance more effectively than with brute-force approaches.
This is where Node level evaluation helps. It enables you to evaluate a trace or any of its components (a span, generation, or retrieval) in isolation, via a simple API on the Maxim SDK's logger. Let us see how to start evaluating our nodes.
Before you start
You need to have logging set up to capture interactions between your LLM and your users before you can evaluate them. To do so, integrate the Maxim SDK into your application.
Understanding how the Maxim SDK logger evaluates
Evaluating a node mainly requires two actions:
- Attach Evaluators: This defines which evaluators to run on the particular node. It must be called to start an evaluation on any component.
- Attach Variables: Once evaluators are attached to a component, each evaluator waits for all the variables it needs to be attached to it. Only after all of an evaluator's required variables are attached does it start processing.
Once you have attached evaluators and their variables, we process the evaluators and display the results in the Evaluation tab under the respective node.
- The evaluator will not run until all of the variables it needs are attached to it.
- If we don't receive all the variables an evaluator needs within 5 minutes, we start displaying a Missing variables message (although we will still process the evaluator if the variables are received after 5 minutes).
- The variables an evaluator needs can be found on the evaluator's page. The evaluator test panel on the right lists all the variables the evaluator needs (all of them are required).
As per the image above, we can see that the evaluator needs the input, context, and expectedOutput variables.
Attaching evaluators via Maxim SDK
We use the withEvaluators method to attach evaluators to any component within a trace, or to the trace itself. It is as easy as listing the names of the evaluators you want to attach, which are available on the platform.
If you list an evaluator that doesn't exist in your workspace but is available in the store, we auto-install it into the workspace for you. If the evaluator is not available in the store either, we ignore it.
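A minimal sketch of what attaching evaluators looks like. The exact Maxim SDK signature is not shown here, so the chaining shape is illustrated with a stub component, and the evaluator names ("clarity", "toxicity") are hypothetical:

```typescript
// Stub standing in for a trace component (span, generation, or retrieval).
// Only the withEvaluators call mirrors the documented API; everything else
// is illustrative.
class SpanStub {
  evaluators: string[] = [];

  withEvaluators(...names: string[]): this {
    // Record which evaluators should run on this node.
    this.evaluators.push(...names);
    return this;
  }
}

// Usage: attach two evaluators (hypothetical names) to a span.
const span = new SpanStub();
span.withEvaluators("clarity", "toxicity");
console.log(span.evaluators); // the attached evaluator names
```

Returning `this` from the stub is what makes the chaining style shown later on this page possible.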
Providing variables to evaluators
Once evaluators are attached to a component, variables can be passed to them via the withVariables method. This method accepts key-value pairs mapping variable names to their values. You also need to specify which evaluators these variables should be attached to, by passing the list of evaluator names as the second argument.
You can chain the withVariables method directly after attaching evaluators to any component, allowing you to skip mentioning the evaluator names again.
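The two calls can be sketched together as follows. The signatures are assumptions based on the description above (`withVariables(variables, evaluatorNames?)`), again illustrated with a stub rather than the real SDK types:

```typescript
type Vars = Record<string, string>;

// Stub component showing how withVariables targets evaluators.
class NodeStub {
  evaluators: string[] = [];
  variables = new Map<string, Vars>();

  withEvaluators(...names: string[]): this {
    this.evaluators.push(...names);
    return this;
  }

  withVariables(vars: Vars, evaluatorNames?: string[]): this {
    // When chained after withEvaluators, the names can be omitted and
    // the variables apply to all evaluators attached to this node.
    const targets = evaluatorNames ?? this.evaluators;
    for (const name of targets) {
      this.variables.set(name, { ...(this.variables.get(name) ?? {}), ...vars });
    }
    return this;
  }
}

// Chained form: no need to repeat the evaluator name.
const gen = new NodeStub()
  .withEvaluators("faithfulness")
  .withVariables({ input: "What is Maxim?", context: "retrieved docs..." });
```

Passing an explicit second argument instead (`.withVariables({...}, ["faithfulness"])`) lets you send different variables to different evaluators on the same node.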
Viewing evaluation results on evaluations tab
This is very similar to Making sense of evaluations on logs, except that the evaluations for each component appear on their own card, just as they do for the trace.
Code example for agentic evaluation
This example shows how Node level evaluation might fit into a workflow.
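The following is a hedged end-to-end sketch of the lifecycle described above: evaluators attached to a generation wait until every required variable has arrived, with variables supplied in stages as the agent runs. All names here (the stub classes, the "faithfulness" evaluator, the variable keys) are illustrative, not the real SDK:

```typescript
type Vars = Record<string, string>;

// Tracks one evaluator's required vs. received variables.
class EvaluatorSlot {
  constructor(public required: string[], public received: Vars = {}) {}
  get ready(): boolean {
    // An evaluator only runs once every required variable has arrived.
    return this.required.every((k) => k in this.received);
  }
}

// Stub generation node modeling the documented behavior.
class GenerationStub {
  slots = new Map<string, EvaluatorSlot>();

  withEvaluators(...names: string[]): this {
    // Assume each evaluator needs input, context, and expectedOutput,
    // matching the example earlier on this page.
    for (const n of names) {
      this.slots.set(n, new EvaluatorSlot(["input", "context", "expectedOutput"]));
    }
    return this;
  }

  withVariables(vars: Vars, names?: string[]): this {
    for (const n of names ?? [...this.slots.keys()]) {
      Object.assign(this.slots.get(n)!.received, vars);
    }
    return this;
  }
}

// Variables arrive in stages as the agent executes:
const gen = new GenerationStub().withEvaluators("faithfulness");
gen.withVariables({ input: "user question" });       // at request time
gen.withVariables({ context: "retrieved chunks" });  // after retrieval
console.log(gen.slots.get("faithfulness")!.ready);   // false — still waiting
gen.withVariables({ expectedOutput: "reference answer" });
console.log(gen.slots.get("faithfulness")!.ready);   // true — evaluator can run
```

This staged pattern is why the 5-minute Missing variables notice exists: evaluation stays pending until the last variable lands, whenever in the workflow that happens.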
Best practices
- Use evaluators selectively to monitor key performance metrics; avoid attaching too many evaluators.
- Set up sampling and filtering according to your needs to ensure accurate evaluation processing without incurring excessive cost.
- Attach variables reliably so that no evaluation is left pending due to missing variables.