Rate Limits & Quotas

Rate Limit Headers

Every API response includes rate limit information in the headers:

Header                            Description
x-ratelimit-limit-requests        Maximum requests per minute (RPM)
x-ratelimit-remaining-requests    Remaining requests this minute
x-ratelimit-limit-tokens          Maximum tokens per minute (TPM)
x-ratelimit-remaining-tokens      Remaining tokens this minute
x-ratelimit-reset-requests        Seconds until RPM counter resets
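
As an illustration of reading these headers, here is a minimal Python sketch. It assumes the header values arrive as plain numeric strings (the page does not specify the exact value format), and the names `RateLimitStatus` and `parse_rate_limit_headers` are hypothetical helpers, not part of any official client.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, TypeVar

T = TypeVar("T", int, float)


@dataclass
class RateLimitStatus:
    """Snapshot of the rate limit headers from one response."""
    limit_requests: Optional[int]
    remaining_requests: Optional[int]
    limit_tokens: Optional[int]
    remaining_tokens: Optional[int]
    reset_requests_seconds: Optional[float]


def _parse(value: Optional[str], cast: Callable[[str], T]) -> Optional[T]:
    # Assumption: values are plain numbers. Anything else becomes None.
    try:
        return cast(value)  # type: ignore[arg-type]
    except (TypeError, ValueError):
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitStatus(
        limit_requests=_parse(lowered.get("x-ratelimit-limit-requests"), int),
        remaining_requests=_parse(lowered.get("x-ratelimit-remaining-requests"), int),
        limit_tokens=_parse(lowered.get("x-ratelimit-limit-tokens"), int),
        remaining_tokens=_parse(lowered.get("x-ratelimit-remaining-tokens"), int),
        reset_requests_seconds=_parse(lowered.get("x-ratelimit-reset-requests"), float),
    )
```

The function accepts any mapping of header names to values, so it can be fed the headers object of whichever HTTP client you use. Missing or unparseable headers come back as `None` rather than raising.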

Plan Limits

Plan          RPM       TPM       Models
Starter       60        100K      50+ models
Pro           600       1M        200+ models
Enterprise    Custom    Custom    400+ models
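
Because RPM and TPM apply together, whichever limit is reached first caps throughput. The sketch below works through that arithmetic for the Starter and Pro plans. It assumes tokens are counted as a single total per request, which this page does not confirm, and `PLAN_LIMITS` and `max_requests_per_minute` are hypothetical names used only for illustration.

```python
# (RPM, TPM) taken from the plan table; Enterprise is omitted because its limits are custom.
PLAN_LIMITS = {
    "Starter": (60, 100_000),
    "Pro": (600, 1_000_000),
}


def max_requests_per_minute(plan: str, avg_tokens_per_request: int) -> int:
    """Requests per minute allowed once both RPM and TPM are taken into account."""
    if avg_tokens_per_request <= 0:
        raise ValueError("avg_tokens_per_request must be positive")
    rpm, tpm = PLAN_LIMITS[plan]
    # The token budget allows tpm // avg_tokens_per_request requests; RPM caps the rest.
    return min(rpm, tpm // avg_tokens_per_request)
```

For example, on Starter with an average of 2,000 tokens per request, the token budget allows 100,000 / 2,000 = 50 requests per minute, so TPM is exhausted before the 60 RPM limit.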

Best Practices

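  • Monitor headers: Track x-ratelimit-remaining-requests and x-ratelimit-remaining-tokens, and slow down as either approaches zero
  • Use exponential backoff: When you receive a 429 (Too Many Requests) response, wait before retrying and double the wait after each consecutive 429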
  • Batch requests: Combine multiple queries where possible
  • Cache responses: Cache common queries to reduce API calls
  • Upgrade proactively: Monitor usage trends and upgrade before hitting limits
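
The exponential backoff practice above can be sketched as follows. This is a minimal illustration, not an official client: `call_with_backoff` and `backoff_delay` are hypothetical names, `send` stands in for whatever function performs your HTTP request and returns a `(status_code, headers, body)` tuple, and the base delay, cap, and jitter scheme are assumptions you should tune.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2**attempt, capped at `cap`."""
    return min(cap, base * (2 ** attempt))


def call_with_backoff(
    send: Callable[[], Response],
    max_retries: int = 5,
    base: float = 1.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    jitter: Callable[[], float] = random.random,
) -> Response:
    """Call `send`, retrying on HTTP 429 with exponentially growing, jittered waits."""
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_retries:
            break
        # Scale the delay to between 50% and 100% of its nominal value so that
        # many clients retrying at once do not all wake at the same moment.
        sleep(backoff_delay(attempt, base, cap) * (0.5 + 0.5 * jitter()))
    raise RuntimeError(f"still rate limited after {max_retries} retries")
```

The `sleep` and `jitter` parameters are injectable so the retry schedule can be tested without real waiting. If the response carries x-ratelimit-reset-requests, waiting at least that many seconds is a reasonable alternative to the computed delay.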
