Authentication
GuardLayer authenticates requests with API keys. Send your key in either the X-Api-Key header or an Authorization: Bearer {key} header on every request. The one exception is POST /api/v1/api-keys, which requires no authentication.
curl https://guardlayer.polsia.app/api/v1/metrics \
  -H "X-Api-Key: gl_your_api_key_here"
# Or using Bearer auth:
curl https://guardlayer.polsia.app/api/v1/metrics \
  -H "Authorization: Bearer gl_your_api_key_here"
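Both header forms carry the same key, so client code can choose between them in one place. The helper below is a minimal sketch of that idea; the function name `auth_headers` and the scheme labels `"x-api-key"` and `"bearer"` are illustrative and not part of the GuardLayer API.

```python
def auth_headers(api_key: str, scheme: str = "x-api-key") -> dict[str, str]:
    """Build the request headers for one of the two documented auth schemes."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    if scheme == "x-api-key":
        return {"X-Api-Key": api_key}
    if scheme == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    # The scheme labels are local conventions of this sketch.
    raise ValueError(f"unknown scheme: {scheme!r}")
```

The returned dict can be passed as the `headers` argument of an HTTP client call.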
🔑 Generating Your First API Key
API keys are generated using the POST /api/v1/api-keys endpoint (no authentication required). Store your key securely — it's shown only once.
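Because the key is shown only once, one common way to keep it out of source code is to read it from the environment at startup. This is a sketch under assumptions: the variable name `GUARDLAYER_API_KEY` is hypothetical, and GuardLayer does not prescribe any particular storage mechanism.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ,
                 var: str = "GUARDLAYER_API_KEY") -> str:
    """Read the API key from an environment mapping (variable name is hypothetical)."""
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"{var} is not set; generate a key and export it first")
    return key
```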
Rate Limiting
All API endpoints are rate-limited to 100 requests per minute per API key. Rate limit headers are included in every response.
| Header | Description |
|---|---|
| X-RateLimit-Limit | Total requests allowed per window (100) |
| X-RateLimit-Remaining | Remaining requests in current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
⏱️ Rate Limit Exceeded
If you exceed the rate limit, you'll receive a 429 Too Many Requests response with a retry_after_seconds field. Wait at least that many seconds, or until the time given in X-RateLimit-Reset, before retrying.
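A client can turn the 429 response and the rate limit headers into a concrete wait time. The sketch below assumes that retry_after_seconds is a top-level field of the JSON body and that X-RateLimit-Reset is expressed in seconds; neither detail is spelled out above. The function name and the 60-second fallback are illustrative.

```python
import time
from typing import Mapping, Optional

DEFAULT_WAIT_SECONDS = 60.0  # assumption: fall back to one full window


def seconds_until_retry(status: int, headers: Mapping[str, str],
                        body: object, now: Optional[float] = None) -> float:
    """Return how long to wait before retrying; 0.0 when the call was not rate-limited."""
    if status != 429:
        return 0.0
    # Prefer the explicit retry_after_seconds value when the body carries one.
    if isinstance(body, dict) and body.get("retry_after_seconds") is not None:
        return max(0.0, float(body["retry_after_seconds"]))
    # Otherwise derive the wait from the reset timestamp header.
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        current = time.time() if now is None else now
        return max(0.0, float(reset) - current)
    return DEFAULT_WAIT_SECONDS
```

Note that a plain dict lookup is case-sensitive; HTTP libraries such as requests expose headers through a case-insensitive mapping, which avoids that issue.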
Error Handling
GuardLayer uses standard HTTP status codes. Error responses include a JSON body with an error field describing the issue.
| Status Code | Meaning |
|---|---|
| 200 | Success — Request completed successfully |
| 201 | Created — Resource created successfully |
| 400 | Bad Request — Invalid parameters or missing fields |
| 401 | Unauthorized — Missing or invalid API key |
| 403 | Forbidden — API key is deactivated |
| 404 | Not Found — Endpoint or resource doesn't exist |
| 429 | Too Many Requests — Rate limit exceeded |
| 500 | Internal Server Error — Something went wrong on our end |
{
  "error": "Invalid API key."
}
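Since every error response carries an error field, client code can convert non-2xx responses into a single exception type. This is a minimal sketch: `GuardLayerError`, `check_response`, and the choice of 429 and 500 as retryable statuses are assumptions made for illustration, not behavior defined by GuardLayer.

```python
class GuardLayerError(Exception):
    """Raised for any non-2xx response (hypothetical client-side type)."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


# Assumption: only rate limiting and server errors are worth retrying.
RETRYABLE_STATUSES = {429, 500}


def check_response(status: int, body: object) -> object:
    """Return the parsed body on success; raise GuardLayerError otherwise."""
    if 200 <= status < 300:
        return body
    message = "unknown error"
    if isinstance(body, dict):
        message = str(body.get("error", message))
    raise GuardLayerError(status, message)


def is_retryable(err: GuardLayerError) -> bool:
    """Tell the caller whether retrying the same request could succeed."""
    return err.status in RETRYABLE_STATUSES
```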
API Keys
Generate API keys to authenticate requests to GuardLayer. Each key is scoped to your account; use separate keys to track separate applications or environments.
POST /api/v1/api-keys generates a new API key. The plaintext key is returned only once, so store it securely. Default alert rules are created automatically for the new key.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| customer_name | string | Required | Name for this API key (e.g., "my-app-production") |
curl -X POST https://guardlayer.polsia.app/api/v1/api-keys \
  -H "Content-Type: application/json" \
  -d '{
    "customer_name": "my-app-production"
  }'
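import requests
response = requests.post(
    'https://guardlayer.polsia.app/api/v1/api-keys',
    json={'customer_name': 'my-app-production'},
    timeout=10  # avoid hanging if the API is slow to respond
)
response.raise_for_status()  # raise on 4xx/5xx instead of failing on the lookup below
data = response.json()
# The plaintext key is returned only once, so persist it securely.
print(f"API Key: {data['api_key']}")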
const response = await fetch('https://guardlayer.polsia.app/api/v1/api-keys', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ customer_name: 'my-app-production' })
});
const data = await response.json();
console.log('API Key:', data.api_key);
Response (201 Created)
{
  "message": "API key created. Save this key — it cannot be retrieved again.",
  "api_key": "gl_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6",
  "key_id": 1,
  "key_prefix": "gl_a1b2c3d...",
  "customer_name": "my-app-production",
  "created_at": "2026-03-07T12:00:00.000Z",
  "default_alerts_created": 3
}
Data Ingestion
Send LLM call data to GuardLayer for monitoring, cost tracking, and alerting. Use the single ingestion endpoint for real-time calls or the batch endpoint for high-volume workloads.
POST /api/v1/ingest ingests a single LLM call. Use this endpoint to track latency, token usage, cost, and status for every call your application makes to LLM providers.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Model name (e.g., "gpt-4", "claude-3-opus") |
| latency_ms | number | Required | Request latency in milliseconds |
| tokens_in | number | Optional | Prompt tokens (default: 0) |
| tokens_out | number | Optional | Completion tokens (default: 0) |
| cost_usd | number | Optional | Cost in USD (default: 0) |
| status | string | Optional | Status: "success", "error", or custom (default: "success") |
| guardrail_flags | array | Optional | Array of guardrail flags if violations detected (default: []) |
curl -X POST https://guardlayer.polsia.app/api/v1/ingest \
  -H "X-Api-Key: gl_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "latency_ms": 850,
    "tokens_in": 120,
    "tokens_out": 45,
    "cost_usd": 0.0042,
    "status": "success"
  }'
import requests
response = requests.post(
    'https://guardlayer.polsia.app/api/v1/ingest',
    headers={'X-Api-Key': 'gl_your_api_key_here'},
    json={
        'model': 'gpt-4',
        'latency_ms': 850,
        'tokens_in': 120,
        'tokens_out': 45,
        'cost_usd': 0.0042,
        'status': 'success'
    },
    timeout=10  # avoid hanging your application if the API is slow
)
response.raise_for_status()  # raise on 4xx/5xx responses
print(response.json())
const response = await fetch('https://guardlayer.polsia.app/api/v1/ingest', {
  method: 'POST',
  headers: {
    'X-Api-Key': 'gl_your_api_key_here',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'gpt-4',
    latency_ms: 850,
    tokens_in: 120,
    tokens_out: 45,
    cost_usd: 0.0042,
    status: 'success'
  })
});
console.log(await response.json());
Response (201 Created)
{
  "id": 1234,
  "created_at": "2026-03-07T12:34:56.789Z",
  "status": "ingested"
}
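Building the request body in one function keeps the required fields and the documented defaults in a single place. The sketch below mirrors the parameter table for this endpoint; the function name and the local checks (non-empty model, non-negative latency) are assumptions of this example, and the server may validate differently.

```python
from typing import Optional, Sequence


def build_ingest_payload(model: str, latency_ms: float, tokens_in: int = 0,
                         tokens_out: int = 0, cost_usd: float = 0.0,
                         status: str = "success",
                         guardrail_flags: Optional[Sequence[str]] = None) -> dict:
    """Assemble a single-call ingest body, applying the documented defaults."""
    if not isinstance(model, str) or not model:
        raise ValueError("model is required")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(latency_ms, bool) or not isinstance(latency_ms, (int, float)):
        raise ValueError("latency_ms must be a number")
    if latency_ms < 0:
        raise ValueError("latency_ms must not be negative")
    return {
        "model": model,
        "latency_ms": latency_ms,
        "tokens_in": tokens_in,
        "tokens_out": tokens_out,
        "cost_usd": cost_usd,
        "status": status,
        "guardrail_flags": list(guardrail_flags or []),
    }
```

The resulting dict can be sent as the JSON body of the ingest request.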
POST /api/v1/ingest/batch ingests up to 100 LLM calls in a single request. Use it in high-volume applications to reduce the number of HTTP round trips.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| calls | array | Required | Array of call objects (max 100). Each object has the same fields as /ingest. |
curl -X POST https://guardlayer.polsia.app/api/v1/ingest/batch \
  -H "X-Api-Key: gl_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "calls": [
      {
        "model": "gpt-4",
        "latency_ms": 850,
        "tokens_in": 120,
        "tokens_out": 45,
        "cost_usd": 0.0042
      },
      {
        "model": "claude-3-opus",
        "latency_ms": 1200,
        "tokens_in": 200,
        "tokens_out": 80,
        "cost_usd": 0.008
      }
    ]
  }'
Response (201 Created)
{
  "ingested": 2,
  "calls": [
    {
      "id": 1234,
      "created_at": "2026-03-07T12:34:56.789Z"
    },
    {
      "id": 1235,
      "created_at": "2026-03-07T12:34:56.791Z"
    }
  ]
}
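Because a batch request accepts at most 100 calls, a client that buffers more than that has to split its buffer before sending. A minimal sketch of that splitting step follows; `chunk_calls` is an illustrative name, and the actual HTTP request is left to the caller.

```python
from typing import Iterator

MAX_BATCH_SIZE = 100  # documented maximum for the batch endpoint


def chunk_calls(calls: list[dict], size: int = MAX_BATCH_SIZE) -> Iterator[dict]:
    """Yield batch request bodies, each holding at most `size` call objects."""
    if not 1 <= size <= MAX_BATCH_SIZE:
        raise ValueError(f"size must be between 1 and {MAX_BATCH_SIZE}")
    for start in range(0, len(calls), size):
        yield {"calls": calls[start:start + size]}
```

Each yielded dict has the shape of the request body shown above, so 250 buffered calls become three requests of 100, 100, and 50 calls.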
Query Metrics
Retrieve aggregated metrics for your LLM calls. Metrics are scoped to your API key and can be filtered by time period.
GET /api/v1/metrics returns aggregated metrics, including total calls, latency, token usage, cost, error rates, and breakdowns by model and status.
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| hours | number | Optional | Time window in hours (default: 24) |
curl "https://guardlayer.polsia.app/api/v1/metrics?hours=48" \
  -H "X-Api-Key: gl_your_api_key_here"
Response (200 OK)
{
  "period_hours": 48,
  "since": "2026-03-05T12:00:00.000Z",
  "total_calls": 1523,
  "avg_latency_ms": 842,
  "p95_latency_ms": 1450,
  "total_tokens": 245600,
  "total_cost_usd": "12.4500",
  "error_count": 12,
  "error_rate": "0.79",
  "total_tokens_in": 180400,
  "total_tokens_out": 65200,
  "by_model": [
    {
      "model": "gpt-4",
      "calls": 890,
      "avg_latency_ms": 920,
      "total_tokens": 145000,
      "cost_usd": "8.2000",
      "errors": 5
    },
    {
      "model": "claude-3-opus",
      "calls": 633,
      "avg_latency_ms": 740,
      "total_tokens": 100600,
      "cost_usd": "4.2500",
      "errors": 7
    }
  ],
  "by_status": [
    {
      "status": "success",
      "count": 1511
    },
    {
      "status": "error",
      "count": 12
    }
  ],
  "timeline": [
    {
      "hour": "2026-03-07T10:00:00.000Z",
      "calls": 42,
      "avg_latency_ms": 850,
      "cost_usd": "0.350000",
      "errors": 1
    }
  ]
}
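In the sample response, monetary values arrive as strings, so parsing them with Decimal avoids floating-point drift when deriving further figures. The sketch below computes cost per call and an error percentage from a metrics response. The function name is illustrative. In the sample, error_rate "0.79" matches 12 errors out of 1523 calls expressed as a percentage, which suggests (but does not confirm) that the field is a percentage.

```python
from decimal import Decimal


def summarize_metrics(metrics: dict) -> dict:
    """Derive a few convenience figures from a metrics response."""
    total_calls = metrics["total_calls"]
    total_cost = Decimal(metrics["total_cost_usd"])  # string in the response
    cost_by_model = {m["model"]: Decimal(m["cost_usd"]) for m in metrics["by_model"]}
    if total_calls:
        cost_per_call = total_cost / total_calls
        error_pct = round(100 * metrics["error_count"] / total_calls, 2)
    else:
        # Avoid dividing by zero for an empty time window.
        cost_per_call = Decimal(0)
        error_pct = 0.0
    return {
        "cost_per_call": cost_per_call,
        "error_pct": error_pct,
        "cost_by_model": cost_by_model,
    }
```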
Alert Management
Create and manage threshold-based alerts for your LLM metrics. Get notified when latency spikes, error rates increase, or costs exceed thresholds.
GET /api/v1/alerts lists all alert rules and recent alert events for your API key.
curl https://guardlayer.polsia.app/api/v1/alerts \
  -H "X-Api-Key: gl_your_api_key_here"
Response (200 OK)
{
  "rules": [
    {
      "id": 1,
      "metric": "latency_p95",
      "operator": "gt",
      "threshold_value": "2000",
      "cooldown_minutes": 60,
      "enabled": true,
      "created_at": "2026-03-07T10:00:00.000Z"
    }
  ],
  "recent_events": [
    {
      "id": 42,
      "rule_id": 1,
      "metric": "latency_p95",
      "operator": "gt",
      "threshold_value": "2000",
      "metric_value": "2450",
      "triggered_at": "2026-03-07T11:30:00.000Z",
      "notification_sent": true
    }
  ]
}
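The rules and recent_events lists can be joined on rule_id to estimate when a rule's cooldown ends. The sketch below assumes that a cooldown starts at the most recent triggered_at for the rule and lasts cooldown_minutes; the server-side behavior is not described in that detail, and the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse timestamps such as 2026-03-07T11:30:00.000Z into aware datetimes."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def cooldown_ends(rule: dict, events: list[dict]) -> Optional[datetime]:
    """Return when the rule's cooldown ends, or None if it has never fired."""
    fired = [parse_timestamp(e["triggered_at"])
             for e in events if e["rule_id"] == rule["id"]]
    if not fired:
        return None
    return max(fired) + timedelta(minutes=rule["cooldown_minutes"])


def is_cooling_down(rule: dict, events: list[dict],
                    now: Optional[datetime] = None) -> bool:
    """Check whether the assumed cooldown window is still open at `now`."""
    end = cooldown_ends(rule, events)
    current = now if now is not None else datetime.now(timezone.utc)
    return end is not None and current < end
```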
POST /api/v1/alerts creates a new alert rule. Alerts are evaluated on every ingested LLM call and fire when thresholds are crossed.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| metric | string | Required | Metric name: latency_p95, error_rate, cost_per_call, hallucination_score |
| operator | string | Required | Operator: gt, gte, lt, lte |
| threshold_value | number | Required | Threshold value to trigger alert |
| cooldown_minutes | number | Optional | Cooldown period between alerts (default: 60) |
| enabled | boolean | Optional | Enable or disable rule (default: true) |
curl -X POST https://guardlayer.polsia.app/api/v1/alerts \
  -H "X-Api-Key: gl_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "metric": "error_rate",
    "operator": "gt",
    "threshold_value": 5,
    "cooldown_minutes": 30
  }'
Response (201 Created)
{
  "id": 2,
  "api_key_id": 1,
  "metric": "error_rate",
  "operator": "gt",
  "threshold_value": "5",
  "cooldown_minutes": 30,
  "enabled": true,
  "created_at": "2026-03-07T12:00:00.000Z"
}
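The four operator names map directly onto numeric comparisons, which makes it easy to check a rule locally before creating it. This is a sketch of that mapping, not a description of the server's evaluation logic; the function names are illustrative. Note that threshold_value is sent as a number but comes back as a string in the sample responses, so the code converts both sides to float.

```python
import operator

# Documented operator names mapped to Python comparisons.
OPERATORS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}
# Metric names listed in the request body table.
METRICS = {"latency_p95", "error_rate", "cost_per_call", "hallucination_score"}


def validate_rule(rule: dict) -> None:
    """Raise ValueError if a rule uses an undocumented metric or operator."""
    if rule.get("metric") not in METRICS:
        raise ValueError(f"unknown metric: {rule.get('metric')!r}")
    if rule.get("operator") not in OPERATORS:
        raise ValueError(f"unknown operator: {rule.get('operator')!r}")
    float(rule["threshold_value"])  # raises if the threshold is not numeric


def rule_fires(rule: dict, metric_value) -> bool:
    """Compare a metric value against the rule's threshold."""
    compare = OPERATORS[rule["operator"]]
    return compare(float(metric_value), float(rule["threshold_value"]))
```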
PUT /api/v1/alerts/{id} updates an existing alert rule, where {id} is the rule's id. You can modify the threshold, change the cooldown period, or enable or disable the rule.
curl -X PUT https://guardlayer.polsia.app/api/v1/alerts/2 \
  -H "X-Api-Key: gl_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "threshold_value": 10,
    "enabled": false
  }'
DELETE /api/v1/alerts/{id} deletes an alert rule, where {id} is the rule's id. This action is permanent.
curl -X DELETE https://guardlayer.polsia.app/api/v1/alerts/2 \
  -H "X-Api-Key: gl_your_api_key_here"
Response (200 OK)
{
  "message": "Alert rule deleted.",
  "id": 2
}
5-Minute Integration
Get up and running with GuardLayer in 5 minutes. Generate a key, wrap your LLM calls, and start monitoring.
📋 Quick Start Checklist
1. Generate an API key: POST /api/v1/api-keys
2. Add ingestion to your LLM calls: POST /api/v1/ingest after each call
3. View metrics in the Dashboard
4. Set up custom alerts: POST /api/v1/alerts
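Step 2 of the checklist can be sketched as a small wrapper that times an LLM call and reports the result afterwards. This is one possible shape, not an official client: `track_call` and the `send` callback are hypothetical names, and `send` stands in for whatever code posts the payload to /api/v1/ingest.

```python
import time
from typing import Any, Callable


def track_call(model: str, llm_call: Callable[[], Any],
               send: Callable[[dict], None]) -> Any:
    """Run llm_call, measure its latency, and report the outcome via send."""
    start = time.perf_counter()
    status = "success"
    try:
        return llm_call()
    except Exception:
        status = "error"
        raise  # the caller still sees the original failure
    finally:
        latency_ms = round((time.perf_counter() - start) * 1000)
        try:
            send({"model": model, "latency_ms": latency_ms, "status": status})
        except Exception:
            # Design choice of this sketch: monitoring failures never
            # break the application call.
            pass
```

Token counts and cost are omitted here because they depend on the provider's response format; they can be added to the payload in the same way.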
For detailed integration examples, see the Getting Started Guide.