1. Set Up Your Environment
First, configure your AI model providers. Maxim requires at least one provider with access to GPT-3.5 and GPT-4 models. We use industry-standard encryption to securely store your API keys.
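If you are adding OpenAI as a provider, a quick sanity check like the one below (using the standard openai Python SDK, not a Maxim API) confirms that the key you are about to store can see the required models:

```python
# Optional sanity check before adding your key to Maxim (assumes OpenAI as the provider
# and the openai>=1.0 Python SDK). This is not a Maxim API call.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
model_ids = {model.id for model in client.models.list()}
print("GPT-3.5 access:", any(name.startswith("gpt-3.5") for name in model_ids))
print("GPT-4 access:  ", any(name.startswith("gpt-4") for name in model_ids))
```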
2. Create Your First Prompt or HTTP Endpoint
Create prompts to experiment with and evaluate a model call with attached context or tools. Use endpoints to test your complex AI agents through your application’s HTTP endpoint, without any integration.
Prompt
Create prompt
Navigate to the Prompts tab under the Evaluate section and click on Single prompts. Click Create prompt or Try sample to get started.
Configure model and parameters
Configure additional settings like model, temperature, and max tokens.
Iterate
Click Run to test your prompt and see the AI’s response. Iterate on your prompt based on the results.
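For reference, the settings in the prompt editor correspond to the parameters of an underlying chat completion call. The sketch below uses the OpenAI Python SDK purely as an illustration; the model name and values are placeholders, not Maxim defaults.

```python
# Illustration only: the model, temperature, and max tokens settings from the prompt editor,
# expressed as a raw chat completion call (OpenAI Python SDK; values are placeholders).
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",     # model selected in the prompt editor
    temperature=0.7,   # higher values produce more varied output
    max_tokens=256,    # upper bound on the length of the response
    messages=[
        {"role": "system", "content": "You are a helpful support assistant."},
        {"role": "user", "content": "How do I reset my password?"},
    ],
)
print(response.choices[0].message.content)
```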
HTTP Endpoint
Create endpoint
Navigate to the HTTP Endpoints option under the Agents tab in the Evaluate section. Click Create Endpoint or Try sample.
Configure agent endpoint
Enter your API endpoint URL in the URL field. Configure any necessary headers or parameters. You can use dynamic variables like {input} to reference context easily in any part of your endpoint by wrapping the variable name in {}.
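For reference, here is a minimal sketch of the kind of agent endpoint a test run could call. The framework (Flask), the /agent route, and the payload shape are illustrative assumptions, not requirements of the platform.

```python
# Minimal sketch of an agent HTTP endpoint for testing. Flask, the /agent route,
# and the {"input": ...} payload shape are illustrative assumptions.
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/agent", methods=["POST"])
def agent():
    payload = request.get_json(force=True)
    user_input = payload.get("input", "")   # value substituted for the {input} variable
    # ... run your model or agent logic here ...
    answer = f"Echo: {user_input}"          # placeholder response
    return jsonify({"output": answer})

if __name__ == "__main__":
    app.run(port=8000)
```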
3. Prepare Your Dataset
Organize and manage the data you’ll use for testing and evaluation:
Create dataset
Navigate to the Datasets tab under the Library section. Click Create New, Upload CSV, or Generate synthetic Data to get started.
Edit dataset
If creating a new dataset, enter a name and description, then add columns (e.g., ‘input’ and ‘expected_output’).
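If you prefer to prepare the file programmatically before using Upload CSV, a small script like the one below produces a dataset with the ‘input’ and ‘expected_output’ columns mentioned above; the rows are placeholder examples.

```python
# Build a small CSV dataset with 'input' and 'expected_output' columns for upload.
# The rows are placeholder examples.
import csv

rows = [
    {"input": "What is the capital of France?", "expected_output": "Paris"},
    {"input": "Summarize: The quick brown fox jumps over the lazy dog.",
     "expected_output": "A fox jumps over a dog."},
]

with open("dataset.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["input", "expected_output"])
    writer.writeheader()
    writer.writerows(rows)
```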
4. Add Evaluators
Set up evaluators to assess your prompt or endpoint’s performance:
Add evaluators from store
Navigate to the Evaluators tab under the Library section. Click Add Evaluator to browse available evaluators.
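Conceptually, an evaluator scores an output, optionally against an expected output from your dataset. The sketch below is a generic exact-match example to illustrate the idea; it is not Maxim’s evaluator interface.

```python
# Generic illustration of what an evaluator does; not Maxim's evaluator interface.
def exact_match(output: str, expected_output: str) -> float:
    """Return 1.0 if the model output matches the expected output, else 0.0."""
    return 1.0 if output.strip().lower() == expected_output.strip().lower() else 0.0

print(exact_match("Paris", "paris"))  # 1.0
```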
5. Run Your First Test
Execute a test run to evaluate your prompt or endpoint:
Select endpoint/prompt to test
Navigate to your saved prompt or endpoint. Click Test in the top right corner.
Configure test run
Select the dataset you created earlier. Choose the evaluators you want to use for this test run.
If you’ve added human evaluators, you’ll be prompted to set up human annotation on the report or via email.
6. Analyze Test Results
Review and analyze the results of your test run:
View report
Navigate to the Runs tab in the left navigation menu. Find your recent test run and click on it to view details.
Review performance
Review the overall performance metrics and scores for each evaluator. Drill down into individual queries to see specific scores and reasoning.
Next Steps
Now that you’ve completed your first cycle on the Maxim platform, consider exploring these additional capabilities:
- Prompt comparisons: Evaluate different prompts side-by-side to determine which ones produce the best results for a given task.
- Agents via no-code builder: Create complex, multi-step AI workflows. Learn how to connect prompts, code, and APIs to build powerful, real-world AI systems using our intuitive, no-code editor.
- Context sources: Integrate Retrieval-Augmented Generation (RAG) into your agent endpoints.
- Prompt tools: Enhance your prompts with custom functions and agentic behaviors.
- Observability: Use our stateless SDK to monitor real-time production logs and run periodic quality checks.
Schedule a demo to see how Maxim AI helps teams ship reliable agents.