Core Settings
Temperature
Controls randomness in the model’s output.

| Value | Behavior | Use Cases |
|---|---|---|
| 0.0 | Most deterministic | Extraction, classification, factual Q&A |
| 0.1 - 0.3 | Consistent with minor variation | Summarization, analysis |
| 0.4 - 0.6 | Balanced | General tasks |
| 0.7 - 0.9 | Creative variation | Writing, brainstorming |
| 1.0 | Maximum randomness | Creative exploration |
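Conceptually, temperature divides the model’s logits before the softmax: low values sharpen the distribution toward the most likely token, high values flatten it. A minimal, illustrative sketch (pure Python, not any provider’s actual implementation):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then apply softmax.

    Lower temperature sharpens the distribution toward the
    highest-logit token; higher temperature flattens it.
    Temperature 0 is treated as greedy (argmax) selection.
    """
    if temperature == 0:
        probs = [0.0] * len(logits)
        probs[logits.index(max(logits))] = 1.0
        return probs
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.1)   # near-deterministic
high = softmax_with_temperature(logits, 1.0)  # probability mass spread out
```

At temperature 0.1 the top token takes almost all of the probability mass; at 1.0 the alternatives remain plausible, which is why the same input can produce different outputs.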
Max Tokens
Maximum number of tokens in the model’s response.

| Setting | Typical Use |
|---|---|
| 100-300 | Short answers, classifications |
| 500-1000 | Summaries, explanations |
| 1000-2000 | Detailed analyses |
| 2000+ | Long-form content, reports |
System Prompt
An optional instruction that sets the AI’s behavior and persona, useful for:

- Setting a consistent persona
- Establishing behavioral guidelines
- Defining constraints and limitations
System prompts are separate from your main template. They’re sent to the model as a “system” message when supported.
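In chat-style APIs, the system prompt typically travels as the first message of the request. A minimal sketch of the common role/content message shape (the helper name is hypothetical):

```python
def build_messages(system_prompt, user_input):
    """Assemble a chat request: the system message (if any) sets
    persona and constraints; the user message carries the rendered
    prompt template."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_input})
    return messages

msgs = build_messages(
    "You are a concise support assistant. Answer in two sentences or fewer.",
    "How do I reset my password?",
)
```

Keeping the system prompt separate from the template means you can revise the persona without touching the variables your template depends on.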
Advanced Settings
Top P (Nucleus Sampling)
An alternative to temperature for controlling randomness:

- Top P = 1.0: Consider all tokens (default)
- Top P = 0.9: Consider tokens in top 90% probability mass
- Top P = 0.5: More focused, less diverse outputs
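Nucleus sampling keeps only the smallest set of tokens whose cumulative probability reaches Top P, then samples from that set. An illustrative sketch:

```python
def top_p_filter(probs, top_p):
    """Return the token indices kept by nucleus sampling.

    Tokens are taken in descending probability order until their
    cumulative mass reaches top_p; everything else is discarded
    before sampling.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept

probs = [0.5, 0.3, 0.15, 0.05]
```

With `top_p=0.9` the last token is dropped; with `top_p=0.5` only the single most likely token survives, which is why low Top P behaves like low temperature.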
Frequency Penalty
Reduces repetition by penalizing tokens based on how often they’ve already appeared.

| Value | Effect |
|---|---|
| 0.0 | No penalty (default) |
| 0.5 | Moderate reduction in repetition |
| 1.0+ | Strong avoidance of repeated phrases |
Presence Penalty
Encourages the model to talk about new topics.

| Value | Effect |
|---|---|
| 0.0 | No penalty (default) |
| 0.5 | Moderate encouragement of new topics |
| 1.0+ | Strong push for novelty |
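Both penalties adjust the next-token logits using counts of tokens already generated: the frequency penalty grows with each repetition of a token, while the presence penalty is a flat deduction for any token that has appeared at all. A sketch of that adjustment (modeled on the formula OpenAI publishes in its API docs; illustrative, not a provider implementation):

```python
def apply_penalties(logits, counts, frequency_penalty, presence_penalty):
    """Penalize tokens that already appeared in the output.

    counts[i] is how many times token i has been generated so far.
    The frequency penalty scales with the count; the presence
    penalty applies once per distinct token seen.
    """
    return [
        logit
        - counts[i] * frequency_penalty
        - (1.0 if counts[i] > 0 else 0.0) * presence_penalty
        for i, logit in enumerate(logits)
    ]

adjusted = apply_penalties(
    logits=[2.0, 2.0, 2.0],
    counts=[3, 1, 0],  # token 0 was generated three times, token 1 once
    frequency_penalty=0.5,
    presence_penalty=0.5,
)
```

The heavily repeated token ends up with the lowest logit, making it the least likely to be chosen again.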
Settings by Use Case
Classification / Extraction
Summarization
Content Generation
Brainstorming / Ideation
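The use cases above can be expressed as starting-point presets derived from the temperature and max-token tables earlier on this page (the exact values are illustrative, not prescriptive; tune against real inputs):

```python
# Starting-point settings per use case, based on the guidance above.
PRESETS = {
    "classification_extraction": {"temperature": 0.0, "max_tokens": 300},
    "summarization":             {"temperature": 0.2, "max_tokens": 1000},
    "content_generation":        {"temperature": 0.7, "max_tokens": 2000},
    "brainstorming_ideation":    {"temperature": 0.9, "max_tokens": 1500},
}
```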
Model-Specific Defaults
Different models may have different optimal settings:

| Model | Recommended Temp | Notes |
|---|---|---|
| GPT-4o | 0.3 | Very capable at low temps |
| GPT-4 | 0.3 | Excellent reasoning |
| GPT-3.5-turbo | 0.5 | May need higher temp for quality |
| Claude 3 Opus | 0.3 | Excellent at following instructions |
| Claude 3 Sonnet | 0.4 | Good balance |
| Claude 3 Haiku | 0.5 | May need higher temp |
Setting Configuration
In the prompt editor:

- Open the Settings panel (usually on the right side)
- Adjust the sliders or enter values
- Settings are saved with the prompt
Different prompts on the same endpoint can have different settings—useful for A/B testing configurations.
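Because each prompt stores its own settings, an A/B comparison can be two configurations that differ in exactly one setting, with users assigned deterministically so each sees a consistent variant. A hypothetical sketch (names and values are illustrative):

```python
import hashlib

# Two configurations of the same prompt, differing only in temperature.
VARIANTS = {
    "A": {"temperature": 0.2, "max_tokens": 800},
    "B": {"temperature": 0.5, "max_tokens": 800},
}

def assign_variant(user_id):
    """Hash the user id so each user consistently gets one variant."""
    digest = hashlib.sha256(user_id.encode()).digest()
    return "A" if digest[0] % 2 == 0 else "B"
```

Changing only one setting per experiment keeps the comparison interpretable: any quality difference can be attributed to that setting.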
Testing Settings
When experimenting with settings, change one setting at a time and compare outputs on the same inputs, so any difference can be attributed to that change.

Common Issues
Truncated Responses
Symptom: Response ends mid-sentence.
Solution: Increase max tokens.

Too Much Variation
Symptom: The same input gives wildly different outputs.
Solution: Lower temperature to 0.1-0.3.

Repetitive Output
Symptom: The model repeats phrases or ideas.
Solution: Increase frequency penalty to 0.3-0.5.

Boring/Generic Output
Symptom: Responses feel template-like.
Solution: Increase temperature to 0.6-0.8.

Best Practices
Match settings to task
Classification needs low temp. Creative writing needs higher temp.
Test before production
Always test settings with real-world inputs before going live.
Be conservative with max tokens
Set max tokens to what you actually need, not the maximum possible.
Document your choices
Note why you chose specific settings in the prompt description.

