Overview
Rate limiting controls how many API requests can be made within a specific time window. This helps protect your infrastructure from abuse, ensures fair usage among users, and maintains system performance.
**Key Benefits**
- Prevent API abuse and excessive usage
- Ensure fair resource allocation among users
- Maintain system performance and stability
- Graceful handling with automatic retries
Rate Limit Hierarchy
Rate limits can be configured at different levels, with more specific limits taking precedence.
1. Tenant Level
**Default: 360/m**
Base rate limit applied to all requests within a tenant. This is the broadest level of control.
2. Organisation Level
**Default: 120/m**
Rate limit applied to requests within a specific organisation. Users can configure this to limit their organisation's consumption of the tenant's rate limit.
3. API Key Level
**Custom**
Most specific rate limit applied to individual API keys. This allows fine-grained control over specific integrations or users.
**Important Note**
API key rate limits allow users to self-limit their usage to avoid consuming the entire organisation or tenant quota. Setting a lower limit helps prevent accidental overconsumption.
Rate Limit Format
Rate limits are specified using a simple format: `{number}/{timeunit}`. Multiple limits can be combined with commas to enforce different time windows simultaneously.
Syntax
Single limit:

`{number}/{timeunit}`

Multiple limits (all enforced):

`{number}/{timeunit}, {number}/{timeunit}, ...`
**Multiple Rate Limits**
When multiple limits are specified, ALL limits are enforced simultaneously. The request will be rate limited if ANY of the limits is exceeded.
Example: `32/s, 120/m, 1000/h, 10000/d`
This enforces: max 32 per second AND max 120 per minute AND max 1000 per hour AND max 10000 per day.
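The combined-limit check can be sketched in Python. This is an illustrative parser, not part of the API; `parse_policy` and `is_allowed` are hypothetical names.

```python
# Sketch: parse a policy string such as "32/s, 120/m" into
# (max_requests, window_seconds) pairs, and check all limits at once.
WINDOW_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_policy(policy: str) -> list[tuple[int, int]]:
    """Parse comma-separated '{number}/{timeunit}' terms."""
    limits = []
    for term in policy.split(","):
        count, unit = term.strip().split("/")
        limits.append((int(count), WINDOW_SECONDS[unit]))
    return limits

def is_allowed(policy: str, counts: dict[int, int]) -> bool:
    """A request passes only if EVERY limit still has headroom.
    `counts` maps window_seconds -> requests already made in that window."""
    return all(counts.get(window, 0) < limit
               for limit, window in parse_policy(policy))
```

Under these assumptions, `parse_policy("32/s, 120/m")` yields `[(32, 1), (120, 60)]`, and a request is rejected as soon as any one window is exhausted.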
Time Units
| Unit | Description |
|---|---|
| s | Seconds |
| m | Minutes |
| h | Hours |
| d | Days |
Examples
| Format | Description |
|---|---|
| `120/m` | 120 requests per minute |
| `1000/h` | 1000 requests per hour |
| `50/s` | 50 requests per second |
| `5/s, 100/m` | 5 per second AND 100 per minute |
| `10/s, 500/h, 5000/d` | Tiered rate limiting |
Rate Limit Headers
The API returns rate limit information through HTTP headers in every response.
| Header | Description | Example |
|---|---|---|
| X-RateLimit-Limit | Total number of requests allowed in the current time window | 120 |
| X-RateLimit-Remaining | Number of requests remaining in the current time window | 75 |
| X-RateLimit-Used | Number of requests already made in the current time window | 45 |
| X-RateLimit-Reset | Unix timestamp when the current rate limit window resets | 1693829400 |
| X-RateLimit-Policy | The rate limit policy currently in effect | 120/m |
| Retry-After | Number of seconds to wait before retrying (only present in 429 responses) | 30 |
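A minimal sketch of reading these headers in a client. The header names come from the table above; `rate_limit_status` is a hypothetical helper, and a plain dict stands in for whatever headers object your HTTP library returns.

```python
import time

def rate_limit_status(headers: dict) -> dict:
    """Extract rate limit state from response headers.
    Returns the quota, what is left, and seconds until the window resets."""
    reset_at = int(headers.get("X-RateLimit-Reset", 0))
    return {
        "limit": int(headers.get("X-RateLimit-Limit", 0)),
        "remaining": int(headers.get("X-RateLimit-Remaining", 0)),
        "resets_in": max(0, reset_at - int(time.time())),
    }
```

A client can call this after every response and proactively slow down when `remaining` approaches zero, rather than waiting to hit a 429.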
429 Response Handling
When rate limits are exceeded, the API returns a 429 status code with retry information.
Example 429 Response
```
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 120
X-RateLimit-Remaining: 0
X-RateLimit-Used: 120
X-RateLimit-Reset: 1693829400
X-RateLimit-Policy: 120/m
Retry-After: 30

{
  "error": "Rate limit exceeded (120/m). Please try again in 30 seconds."
}
```
Server Automatic Slowdown
When approaching rate limits, the server automatically manages request flow:
- If the required wait is less than 5 seconds, the server delays processing the request instead of rejecting it
- This provides smoother experience for burst traffic patterns
- Requests are processed successfully but with increased response time
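The slowdown behaviour described above could look roughly like this on the server side. This is an illustrative sketch, not the actual implementation; only the 5-second threshold comes from this section, and `process`/`reject` are placeholder callbacks.

```python
import time

SOFT_WAIT_THRESHOLD = 5  # seconds; short waits are absorbed, longer ones get a 429

def handle_request(wait_seconds: float, process, reject):
    """Delay requests with short waits instead of rejecting them outright."""
    if wait_seconds <= 0:
        return process()                  # within limits: process immediately
    if wait_seconds < SOFT_WAIT_THRESHOLD:
        time.sleep(wait_seconds)          # absorb the wait as extra latency
        return process()                  # request still succeeds
    return reject(retry_after=int(wait_seconds))  # too long: 429 + Retry-After
```

The trade-off: bursty clients see slightly slower responses instead of failures, while sustained over-limit traffic still receives explicit 429s.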
Automatic Client Handling
Our client libraries automatically handle 429 responses:
- Automatically retry requests after the specified delay
- Respect the Retry-After header for optimal timing
- Implement exponential backoff for repeated failures
- Log retry attempts for debugging purposes
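If you are writing your own client rather than using a library, the retry behaviour described above can be sketched as follows. `request_with_retries` is a hypothetical helper, and `send` stands in for your actual HTTP call (returning status, headers, and body).

```python
import time

def request_with_retries(send, max_retries: int = 5):
    """Retry on 429, preferring the Retry-After header and falling
    back to exponential backoff (1s, 2s, 4s, ...) when it is absent."""
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break
        # Respect Retry-After when present; otherwise back off exponentially.
        delay = float(headers.get("Retry-After", 2 ** attempt))
        time.sleep(delay)
    raise RuntimeError("rate limited: retries exhausted")
```

Note that the `Retry-After` value always wins over the computed backoff, since the server knows exactly when the window resets.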
Best Practices
Do
- Monitor rate limit headers in responses
- Implement proper retry logic with exponential backoff
- Set conservative API key limits for external integrations
- Cache API responses when possible to reduce request volume
- Use batch operations to reduce total request count
Don’t
- Ignore 429 responses without implementing retries
- Make excessive parallel requests
- Set API key limits higher than organisation limits
- Retry immediately without respecting Retry-After headers
- Assume rate limits are the same across all endpoints