Endprompt logs every API request, giving you full visibility into what’s happening with your endpoints.

The Logs Page

Navigate to Logs in the sidebar to access the log explorer.

Log Entry Information

Each log entry shows:
Field | Description
Timestamp | When the request occurred
Endpoint | Which endpoint was called
Prompt | Which prompt was used
Model | LLM model used
Status | Success or failure
Latency | Time to complete (ms)
Tokens | Input + output tokens
Cost | Estimated cost

Filtering Logs

Use filters to find specific requests:

Time Range

Last hour, 24 hours, 7 days, or custom range

Endpoint

Filter by specific endpoint

Status

Success, failed, or all

Model

Filter by LLM model used

Search logs by:
  • Request ID
  • Input content (partial match)
  • Output content (partial match)

Log Details

Click any log entry to see full details:

Request Tab

{
  "text": "The content that was sent...",
  "max_length": 100,
  "options": {
    "format": "bullets"
  }
}
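
For reference, a log entry like this corresponds to a call such as the one sketched below. The request body and endpoint path come from this example and the Metadata tab; the base URL and the authorization header are placeholders, so substitute the values from your own endpoint settings.

import requests

# Placeholder base URL and API key; use the values from your endpoint settings.
BASE_URL = "https://api.endprompt.example.com"
API_KEY = "ep_xxxxxxxxxxxx"

payload = {
    "text": "The content that was sent...",
    "max_length": 100,
    "options": {"format": "bullets"},
}

# Endpoint path taken from the Metadata tab below.
response = requests.post(
    f"{BASE_URL}/api/v1/summarize",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=30,
)
response.raise_for_status()
print(response.json())  # e.g. {"summary": "...", "word_count": 45}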

Response Tab

{
  "summary": "The LLM's response...",
  "word_count": 45
}

Rendered Prompt Tab

See the actual text sent to the LLM after template rendering:
You are a helpful assistant.

Summarize the following text in bullet points:

The content that was sent...

Return your response as JSON.
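
The rendered text is produced by substituting the request's fields into the prompt template. As a rough illustration of that substitution (the template string below is invented for this example, not the actual summarizer-v2 template):

from string import Template

# Illustrative template only; the real template is managed on the Prompts page.
template = Template(
    "You are a helpful assistant.\n\n"
    "Summarize the following text in bullet points:\n\n"
    "$text\n\n"
    "Return your response as JSON."
)

print(template.substitute(text="The content that was sent..."))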

Metadata Tab

Field | Value
Request ID | req_xxxxxxxxxxxx
Endpoint | /api/v1/summarize
Prompt | summarizer-v2
Model | gpt-4o
Temperature | 0.3
Input Tokens | 256
Output Tokens | 89
Latency | 1,234 ms
Cache Hit | No
Cost | $0.0012

Replaying Requests

Replay any logged request to:
  • Debug issues with the same input
  • Compare results after prompt changes
  • Test with different models or settings

1. Open Log Entry

Click on the log entry you want to replay.

2. Click Replay

Click the Replay button.

3. Modify (Optional)

Optionally change the prompt, model, or settings.

4. Execute

Run the replay and compare results.

Monitoring Metrics

Endpoint Metrics

On each endpoint’s dashboard:
Metric | Description
Requests | Total requests over time
Success Rate | Percentage of successful requests
Avg Latency | Average response time
Token Usage | Total tokens consumed
Cost | Total estimated cost

Charts

  • Request Volume — Requests over time
  • Latency Distribution — P50, P95, P99 latencies (see the sketch after this list)
  • Error Rate — Failures over time
  • Model Usage — Breakdown by model
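
P50, P95, and P99 are percentiles: P95, for example, is the latency below which 95% of requests complete. If you export latency values (see Exporting Logs below), you can reproduce these figures yourself; a minimal sketch with the standard library, using made-up sample data:

import statistics

# Made-up latency samples in milliseconds.
latencies_ms = [820, 910, 1040, 1234, 1310, 1450, 1620, 1980, 2430, 3100]

# quantiles(n=100) returns the 1st through 99th percentiles.
percentiles = statistics.quantiles(latencies_ms, n=100)
p50, p95, p99 = percentiles[49], percentiles[94], percentiles[98]
print(f"P50={p50:.0f} ms  P95={p95:.0f} ms  P99={p99:.0f} ms")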

Setting Up Alerts

Custom alerting is available on Pro and Enterprise plans.
Configure alerts for:
  • Error rate exceeds threshold
  • Latency exceeds threshold
  • Daily cost exceeds budget
  • Unusual traffic patterns

Debugging Common Issues

High Latency

  1. Check the model — Some models are slower
  2. Check input size — Large inputs take longer
  3. Check time of day — Provider congestion varies
  4. Check max tokens — Higher limits allow longer outputs, which take longer to generate

High Error Rate

  1. Check validation errors — Are inputs malformed?
  2. Check provider status — Is OpenAI/Anthropic down?
  3. Check rate limits — Are you exceeding limits?
  4. Check prompt — Is it producing valid JSON?

Unexpected Outputs

  1. View rendered prompt — Is the template rendering correctly?
  2. Check temperature — Higher values produce more variable output
  3. Review recent changes — Did the prompt change?
  4. Replay request — Reproduce the issue

Log Retention

Plan | Retention
Free | 7 days
Pro | 30 days
Enterprise | 90 days (or custom)

Export important logs before they expire if you need them for compliance or analysis.

Exporting Logs

Export logs for external analysis (a quick analysis sketch follows the steps):
  1. Apply filters to select the logs you need
  2. Click Export
  3. Choose format (CSV or JSON)
  4. Download the file
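
Once downloaded, the export can be analyzed with standard tooling. The sketch below reads a CSV export and computes the average latency per endpoint; the file name and the column names (endpoint, latency) are assumptions, so adjust them to match the header row of your actual export.

import csv
from collections import defaultdict

# endpoint -> [total latency in ms, request count]; column names are assumed.
totals = defaultdict(lambda: [0.0, 0])

with open("endprompt_logs.csv", newline="") as f:
    for row in csv.DictReader(f):
        totals[row["endpoint"]][0] += float(row["latency"])
        totals[row["endpoint"]][1] += 1

for endpoint, (latency_sum, count) in sorted(totals.items()):
    print(f"{endpoint}: {latency_sum / count:.0f} ms average over {count} requests")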

API Access to Logs

Logs API is available on Enterprise plans.
Query logs programmatically for custom dashboards and integrations.
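
The exact routes, parameters, and authentication scheme are documented in the Enterprise API reference. As a rough sketch of what such a query might look like (the /api/v1/logs path, query parameters, and response shape below are hypothetical):

import requests

API_KEY = "ep_xxxxxxxxxxxx"  # placeholder

# Hypothetical logs endpoint and filter parameters.
response = requests.get(
    "https://api.endprompt.example.com/api/v1/logs",
    headers={"Authorization": f"Bearer {API_KEY}"},
    params={"endpoint": "/api/v1/summarize", "status": "failed", "range": "24h"},
    timeout=30,
)
response.raise_for_status()

# Hypothetical response shape: {"logs": [{"request_id": ..., "status": ..., "latency": ...}]}
for entry in response.json().get("logs", []):
    print(entry.get("request_id"), entry.get("status"), entry.get("latency"))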

Next Steps