The Settings tab lets you configure advanced endpoint behavior: caching, rate limits, visibility, timeouts, and other operational settings.

Accessing Settings

Open any endpoint and click the Settings tab in the workspace.

Caching

Caching stores LLM responses so identical requests return instantly without calling the LLM again.

Enable Caching

Setting         Description
Enable Cache    Toggle caching on or off for this endpoint
Cache Duration  How long to cache responses, in seconds
Caching is based on the complete request payload. Different inputs produce different cache keys.
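The exact key derivation is not specified here, but "based on the complete request payload" can be sketched as hashing a canonicalized form of the payload. The function name below is illustrative, not part of the product:

```python
import hashlib
import json

def cache_key(payload: dict) -> str:
    """Illustrative cache key: hash of the canonicalized request payload.

    Sorting keys makes the key independent of field order, so payloads
    that differ only in ordering still hit the same cache entry.
    """
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Same payload (any field order) -> same key; any content change -> new key.
a = cache_key({"text": "hello", "lang": "en"})
b = cache_key({"lang": "en", "text": "hello"})
c = cache_key({"text": "hello!", "lang": "en"})
```

This is why even a whitespace change in an input produces a cache miss: the key covers the entire payload, not just selected fields.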

When to Use Caching

Good for Caching

  • Reference data lookups
  • Static content generation
  • Classification tasks
  • Repeated queries

Avoid Caching

  • User-specific content
  • Time-sensitive data
  • Random/creative outputs
  • Conversational contexts

Cache Bypass

During testing, you may want fresh responses. The test runner includes a “Bypass Cache” option that forces a new LLM call. You can also bypass cache programmatically:
curl -X POST https://yourcompany.api.endprompt.ai/api/v1/summarize \
  -H "x-api-key: your-key" \
  -H "Content-Type: application/json" \
  -H "x-cache-bypass: true" \
  -d '{"text": "..."}'

Rate Limits

Protect your endpoints from abuse with rate limiting.

Rate Limit Settings

Setting              Description
Requests per Minute  Maximum requests allowed per minute
Requests per Hour    Maximum requests allowed per hour
Requests per Day     Maximum requests allowed per day
Rate limits apply per API key. Different keys have independent limits.

Rate Limit Headers

Responses include rate limit information:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1699999999
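Clients can read these headers to throttle proactively instead of waiting for a 429. A minimal sketch (the helper name is illustrative; `X-RateLimit-Reset` is assumed to be a Unix epoch timestamp, as in the example above):

```python
import time

def parse_rate_limit(headers, now=None):
    """Turn X-RateLimit-* response headers into usable numbers."""
    now = time.time() if now is None else now
    reset_epoch = int(headers.get("X-RateLimit-Reset", 0))
    return {
        "limit": int(headers.get("X-RateLimit-Limit", 0)),
        "remaining": int(headers.get("X-RateLimit-Remaining", 0)),
        "seconds_until_reset": max(0, int(reset_epoch - now)),
    }

# Using the header values from the example above, 45 seconds before reset:
info = parse_rate_limit(
    {
        "X-RateLimit-Limit": "100",
        "X-RateLimit-Remaining": "95",
        "X-RateLimit-Reset": "1699999999",
    },
    now=1699999954,
)
```

When `remaining` approaches zero, a well-behaved client slows down until `seconds_until_reset` has elapsed.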
When exceeded, requests receive a 429 Too Many Requests response:
{
  "error": "rate_limit_exceeded",
  "message": "Rate limit exceeded. Try again in 45 seconds.",
  "retry_after": 45
}
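A client can honor the `retry_after` field by sleeping and retrying. This is a sketch using only the standard library; the URL and function names are illustrative, and real code would likely cap total wait time:

```python
import json
import time
import urllib.error
import urllib.request

def retry_delay(error_body, default=1.0):
    """Seconds to wait before retrying, taken from the 429 error payload."""
    return float(error_body.get("retry_after", default))

def post_with_retry(url, api_key, payload, max_retries=3):
    """POST to an endpoint, sleeping out the rate-limit window on HTTP 429."""
    data = json.dumps(payload).encode("utf-8")
    headers = {"x-api-key": api_key, "Content-Type": "application/json"}
    for attempt in range(max_retries + 1):
        req = urllib.request.Request(url, data=data, headers=headers)
        try:
            with urllib.request.urlopen(req) as resp:
                return json.load(resp)
        except urllib.error.HTTPError as err:
            if err.code == 429 and attempt < max_retries:
                # The error body carries retry_after, as shown above.
                time.sleep(retry_delay(json.load(err)))
            else:
                raise
```

Any non-429 error, or a 429 after the final attempt, is re-raised for the caller to handle.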

Visibility

Control who can see and use this endpoint.
Setting   Description
Public    Visible to all team members
Internal  Only visible to you and admins
Use Internal visibility for endpoints under development or for personal experiments.

Timeout Settings

Configure how long to wait for LLM responses.
Setting          Default  Description
Request Timeout  60s      Maximum time to wait for an LLM response
Long-running prompts (complex analysis, long documents) may need higher timeouts.
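On the client side, it is worth setting a socket timeout slightly above the endpoint's configured Request Timeout, so the server can time out first and return a proper error instead of the connection simply dropping. A sketch (the margin and function names are assumptions, not product behavior):

```python
import json
import urllib.request

def client_timeout(server_timeout_s, margin_s=5.0):
    """Client-side timeout: the endpoint's Request Timeout plus a margin,
    so the server's timeout fires first and yields a structured error."""
    return server_timeout_s + margin_s

def call_endpoint(url, api_key, payload, server_timeout_s=60.0):
    """POST to an endpoint, waiting only slightly longer than the server would."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=client_timeout(server_timeout_s)) as resp:
        return json.load(resp)
```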

Danger Zone

Actions in the Danger Zone are permanent and cannot be undone.

Delete Endpoint

Permanently removes the endpoint along with all of its associated data:
  • Prompts and versions
  • Execution logs
  • Cached responses
To delete:
  1. Click Delete Endpoint
  2. Type the endpoint name to confirm
  3. Click Permanently Delete
Deleting an endpoint will break any applications calling it. Ensure nothing depends on the endpoint before deleting.

Configuration Best Practices

  • Caching: Enable caching only after you’ve stabilized your prompts and confirmed outputs are deterministic.
  • Rate limits: Start with lower limits and increase them based on actual usage patterns.
  • Visibility: Keep endpoints Internal while testing, then make them Public when ready.
  • Documentation: Use the endpoint description to note why certain settings were chosen.

Settings by Use Case

Use Case            Cache          Rate Limit  Timeout
Search/lookup       Yes, 1 hour    100/min     30s
Content generation  No             20/min      60s
Document analysis   No             10/min      120s
Classification      Yes, 24 hours  200/min     30s
Chatbot             No             30/min      45s

Next Steps