The Test Runner
Every prompt has a Test button that opens the test runner:
- Auto-generated form based on your input schema
- Real-time execution against the actual LLM
- Response preview with timing and token usage
- History of past test runs
Single Request Testing
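The form fields are generated directly from the prompt’s input schema. A minimal sketch of that relationship, using hypothetical field names (the actual schema format depends on your endpoint):

```python
# Hypothetical input schema -- the field names and structure are
# illustrative, not taken from a real endpoint.
input_schema = {
    "customer_message": {"type": "string", "required": True},
    "tone": {"type": "string", "required": False},
}

# The test runner renders one form field per schema entry, so a single
# test input is just a dict whose keys match the schema:
test_input = {
    "customer_message": "My order arrived damaged.",
    "tone": "apologetic",
}
```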
Test Results Panel
After running a test, you’ll see:

| Section | Description |
|---|---|
| Response | The JSON output from the LLM |
| Latency | Time taken for the request |
| Tokens | Input and output token counts |
| Cost | Estimated cost of the request |
| Raw Output | Unprocessed LLM response |
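The cost figure is derived from the token counts. A back-of-the-envelope sketch of the arithmetic, using made-up per-1K-token prices (actual pricing varies by model and provider):

```python
# Assumed prices per 1,000 tokens -- illustrative only, NOT the
# platform's real pricing.
PRICE_PER_1K_INPUT = 0.03
PRICE_PER_1K_OUTPUT = 0.06

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Rough cost estimate from the token counts in the results panel."""
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT + \
           (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

print(f"${estimate_cost(850, 120):.4f}")  # 850 in + 120 out -> $0.0327
```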
Bulk CSV Testing
Test multiple inputs at once by uploading a CSV file.
CSV Best Practices
Use quoted strings
Wrap text containing commas in double quotes: "Hello, world"
Match field names exactly
Column headers must match your input schema field names.
Start small
Test with 5-10 rows first before running hundreds.
Include edge cases
Add rows with empty optional fields, long text, and special characters.
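If you build test CSVs programmatically, letting a CSV library handle quoting avoids the comma problem entirely. A Python sketch, reusing the hypothetical `customer_message` and `tone` fields from above:

```python
import csv

# Headers must match the input schema's field names exactly.
fieldnames = ["customer_message", "tone"]
rows = [
    {"customer_message": "Hello, world", "tone": "friendly"},       # contains a comma
    {"customer_message": "My order arrived damaged.", "tone": ""},  # empty optional field
]

with open("test_inputs.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    # Values containing commas are wrapped in double quotes automatically.
    writer.writerows(rows)
```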
Saved Test Datasets
Save frequently used test data for quick access.
Saved datasets are useful for:
- Regression testing after prompt changes
- Comparing outputs across different prompts
- Onboarding team members with realistic examples
Comparing Prompts
Test the same input against multiple prompts:
1. Open the endpoint’s Testing tab
2. Select multiple prompts to compare
3. Enter your test input
4. Run the comparison
5. View results side-by-side

Side-by-side comparisons help you:
- Compare model performance (GPT-4 vs Claude)
- Evaluate prompt variations
- Choose the best approach before going live
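The same comparison can be scripted if you prefer. A sketch built around a hypothetical `run_test` helper; this is not a documented API, just a stand-in for however you trigger a test run:

```python
# `run_test` is a hypothetical stand-in for your actual test-execution
# call -- it is not part of any documented client library.
def run_test(prompt_id: str, test_input: dict) -> dict:
    # Replace this stub with a real call; it returns dummy output here
    # so the sketch runs end to end.
    return {"prompt_id": prompt_id, "response": "<model output>"}

test_input = {"customer_message": "Where is my refund?"}
prompt_ids = ["support-reply-gpt4", "support-reply-claude"]  # illustrative IDs

for prompt_id in prompt_ids:
    result = run_test(prompt_id, test_input)
    print(f"--- {prompt_id} ---")
    print(result["response"])
```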
Test History
All test runs are saved in your history:
- View past test inputs and outputs
- Re-run previous tests
- Track how responses change over time
Test history is separate from production logs. Tests don’t count against your API usage limits.
Testing Best Practices
Test Edge Cases
Empty strings, very long inputs, special characters, missing optional fields
Test Realistic Data
Use real-world examples, not just “test” and “hello world”
Test Before Promoting
Always test Draft prompts before promoting to Live
Save Good Datasets
Build a library of test cases you can reuse
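A reusable edge-case dataset can be as simple as a JSON file you keep alongside your prompts. A sketch covering the cases above, again with the hypothetical `customer_message` and `tone` fields:

```python
import json

edge_cases = [
    {"customer_message": "", "tone": "friendly"},                  # empty string
    {"customer_message": "word " * 2000, "tone": "neutral"},       # very long input
    {"customer_message": 'She said "wait" & left…', "tone": ""},   # special characters
    {"customer_message": "Refund please."},                        # missing optional field
]

with open("edge_cases.json", "w") as f:
    json.dump(edge_cases, f, indent=2, ensure_ascii=False)
```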
Debugging Failed Tests
If your test fails or returns unexpected results:
- Check Input Validation
- Check JSON Parsing
- Check Token Limits
- Check Rate Limits
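For the JSON parsing check in particular, grab the Raw Output and try parsing it yourself; models often wrap JSON in markdown fences or emit trailing commas. A small sketch:

```python
import json
import re

raw_output = '```json\n{"sentiment": "positive",}\n```'  # example raw LLM output

# Strip markdown code fences the model may have added.
cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw_output.strip())

try:
    print("Valid JSON:", json.loads(cleaned))
except json.JSONDecodeError as e:
    # e.pos points at the offending character -- here, the trailing comma.
    print(f"Parse error at position {e.pos}: {e.msg}")
```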
Keyboard Shortcuts
| Shortcut | Action |
|---|---|
| Ctrl/Cmd + Enter | Run test |
| Ctrl/Cmd + S | Save test input |
| Escape | Close test runner |

