Supported Providers
OpenAI
GPT-4o, GPT-4, GPT-3.5-turbo
Anthropic
Claude 3 Opus, Sonnet, Haiku
Model Comparison
| Model | Provider | Speed | Quality | Cost | Best For |
|---|---|---|---|---|---|
| GPT-4o | OpenAI | Fast | Excellent | Medium | General purpose, balanced |
| GPT-4 | OpenAI | Medium | Excellent | High | Complex reasoning |
| GPT-3.5-turbo | OpenAI | Very Fast | Good | Low | Simple tasks, high volume |
| Claude 3 Opus | Anthropic | Medium | Excellent | High | Nuanced analysis, long context |
| Claude 3 Sonnet | Anthropic | Fast | Very Good | Medium | Balanced performance |
| Claude 3 Haiku | Anthropic | Very Fast | Good | Low | Fast responses, simple tasks |
Choosing a Model
For Quality-Critical Tasks
- GPT-4o or Claude 3 Opus
- Complex analysis, nuanced responses
- Higher cost, worth it for important outputs
For High-Volume Tasks
- GPT-3.5-turbo or Claude 3 Haiku
- Simple classification, extraction
- Low cost, high throughput
For Long Documents
- Claude 3 models
- Support up to 200K tokens context
- Ideal for document analysis
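The rules of thumb above can be sketched as a small helper. This is an illustrative function, not part of any product API; the model identifiers are shorthand for the models in the comparison table.

```python
# Hypothetical helper mirroring this guide's selection advice.
def choose_model(quality_critical: bool = False,
                 high_volume: bool = False,
                 long_document: bool = False) -> str:
    """Pick a model following the rules of thumb above."""
    if long_document:
        # Claude 3 models support up to 200K tokens of context.
        return "claude-3-opus" if quality_critical else "claude-3-haiku"
    if quality_critical:
        return "gpt-4o"
    if high_volume:
        # Cheapest option for simple, high-throughput tasks.
        return "gpt-3.5-turbo"
    return "gpt-4o"  # balanced default
```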
Model Capabilities
Context Window
How much input each model can handle:

| Model | Max Input Tokens |
|---|---|
| GPT-4o | 128,000 |
| GPT-4 | 128,000 |
| GPT-3.5-turbo | 16,385 |
| Claude 3 Opus | 200,000 |
| Claude 3 Sonnet | 200,000 |
| Claude 3 Haiku | 200,000 |
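As a pre-flight check, you can compare an estimated prompt token count against the limits in the table. A minimal sketch (the dictionary simply restates the table; exact input/output accounting varies by provider, so the reserved-output margin is an assumption):

```python
# Max input tokens per model, copied from the table above.
CONTEXT_WINDOWS = {
    "gpt-4o": 128_000,
    "gpt-4": 128_000,
    "gpt-3.5-turbo": 16_385,
    "claude-3-opus": 200_000,
    "claude-3-sonnet": 200_000,
    "claude-3-haiku": 200_000,
}

def fits_context(model: str, prompt_tokens: int,
                 reserved_output: int = 1_024) -> bool:
    """Rough check that a prompt leaves headroom in the model's window.

    We reserve some space for the response, since on some providers
    input and output share the same window.
    """
    return prompt_tokens + reserved_output <= CONTEXT_WINDOWS[model]
```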
JSON Mode
Some models support native JSON output mode:

| Model | JSON Mode |
|---|---|
| GPT-4o | ✅ Supported |
| GPT-4 | ✅ Supported |
| GPT-3.5-turbo | ✅ Supported |
| Claude 3 models | Via prompting |
JSON mode is enabled automatically when you define an output schema.
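The provider difference above might be handled like this. The sketch below is illustrative, not this product's internals: `response_format={"type": "json_object"}` is OpenAI's real JSON-mode parameter, while for Claude 3 the JSON instruction is appended to the prompt.

```python
# Sketch: enable JSON output per provider. Function name and structure
# are hypothetical; only the OpenAI `response_format` field is a real
# API parameter.
def build_request(model: str, prompt: str, wants_json: bool) -> dict:
    request = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if wants_json:
        if model.startswith("gpt-"):
            # OpenAI-native JSON mode (the prompt should also mention JSON).
            request["response_format"] = {"type": "json_object"}
        else:
            # Claude 3: steer toward JSON via prompting instead.
            request["messages"][0]["content"] += (
                "\n\nRespond with valid JSON only."
            )
    return request
```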
Model Selection in Prompts
When creating or editing a prompt:

- Click the Model dropdown
- Select your preferred model
- The model is saved with the prompt
Cost Considerations
LLM costs are based on tokens:

- Input tokens: your prompt text (including the rendered template)
- Output tokens: The model’s response
| Cost Factor | Impact |
|---|---|
| Model choice | Higher-end models cost more per token |
| Prompt length | Longer prompts = more input tokens |
| Response length | Higher max_tokens = potentially more output tokens |
| Request volume | More requests = more total cost |
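These cost factors combine multiplicatively, which a back-of-the-envelope estimator makes concrete. The per-million-token prices below are placeholders, not real rates; always check each provider's current pricing page.

```python
# (input_usd, output_usd) per 1M tokens -- PLACEHOLDER prices for
# illustration only, not actual provider rates.
PRICE_PER_M = {
    "gpt-4o": (5.00, 15.00),
    "gpt-3.5-turbo": (0.50, 1.50),
}

def estimate_cost(model: str, input_tokens: int,
                  output_tokens: int, requests: int = 1) -> float:
    """Estimated USD cost: tokens x per-token price x request volume."""
    in_price, out_price = PRICE_PER_M[model]
    per_request = (input_tokens * in_price
                   + output_tokens * out_price) / 1_000_000
    return per_request * requests
```

Note how each table row maps to a variable: model choice picks the prices, prompt and response length set the token counts, and volume scales the total.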
Model-Specific Tips
GPT-4o is the recommended default:
- Best balance of speed, quality, and cost
- Multimodal capabilities (can process images)
- Reliable JSON output
General OpenAI tips:
- Use temperature 0-0.3 for deterministic tasks
- Enable JSON mode for structured output
- GPT-3.5-turbo is roughly 10x cheaper than GPT-4o, making it a good fit for simple tasks
Adding Provider API Keys
Your tenant needs API keys configured for each provider you want to use:

- Go to LLM API Keys in the sidebar (under Configuration)
- Add your OpenAI and/or Anthropic API keys
- Keys are encrypted and stored securely

