Learn about API rate limits and how to work with them
Falu's API rate limits requests to prevent abuse and overload of our services. Users who send many requests in quick succession may see error responses with status code 429.
Falu employs a rate limiter that limits the number of requests received by the API within any given second. Falu allows up to 100 read operations per second and 100 write operations per second in both live and test modes.
You can inspect the rate limit headers in the HTTP responses to your requests as you test Falu's APIs and keys, as outlined in the Introduction.
There are three key fields of note in the rate limit headers. Sample header values:
RateLimit-Limit: 1, 100;w=60
RateLimit-Reset: 11/21/2021 09:25:10
- The limit: the number of requests that can be made (RateLimit-Limit).
- The remaining count: the number of remaining requests that can be made.
- The reset time: the time at which the rate limit resets (RateLimit-Reset).
These limits should be treated as maximums and you should not generate unnecessary load. If you suddenly see a rising number of rate limited requests, please contact us.
While these limits are the default limits in Falu's APIs, we may reduce or increase them to prevent abuse or enable high-traffic applications, respectively. To request an increased rate limit, please contact us.
Because we may change rate limits at any time and rate limits can differ per application, do not hard code them into your application. To properly support our dynamic rate limits, your application should read the current limits from the response headers and throttle itself locally so that it stays within them as they change.
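As an illustration of reading the limits from the headers, here is a minimal Python sketch that parses the two sample values shown earlier. It rests on several assumptions that this page does not confirm: that the first item in RateLimit-Limit is the current limit, that later items are policies whose `w` parameter is a window in seconds, and that the reset time is formatted as month/day/year in UTC. The names `RateLimitPolicy`, `parse_rate_limit_limit`, and `parse_rate_limit_reset` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass
class RateLimitPolicy:
    """One policy item from the header, e.g. `100;w=60` (assumed meaning)."""
    quota: int
    window_seconds: Optional[int]


def parse_rate_limit_limit(value: str) -> Tuple[int, List[RateLimitPolicy]]:
    """Split a value such as `1, 100;w=60` into a limit and its policies."""
    items = [item.strip() for item in value.split(",")]
    # Assumption: the first item is the limit that currently applies.
    limit = int(items[0].split(";")[0].strip())
    policies = []
    for item in items[1:]:
        quota, *params = [part.strip() for part in item.split(";")]
        window = None
        for param in params:
            key, _, raw = param.partition("=")
            if key.strip() == "w":  # assumed: window length in seconds
                window = int(raw)
        policies.append(RateLimitPolicy(int(quota), window))
    return limit, policies


def parse_rate_limit_reset(value: str) -> datetime:
    """Parse a reset time such as `11/21/2021 09:25:10` (assumed to be UTC)."""
    parsed = datetime.strptime(value, "%m/%d/%Y %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)
```

Parsing on every response, rather than caching a constant, keeps the client in step with limits that change per application or over time.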
Common causes of rate limit errors
Rate limiting can occur under a variety of conditions, but it’s most common in these scenarios:
- Running a large volume of closely-spaced requests can lead to rate limiting. Often this is part of an analytical or migration operation. When engaging in these activities, you should try to control the request rate on the client side.
- A sudden increase in traffic volume might result in rate limiting. We try to set our rates high enough that the vast majority of users are never rate limited for legitimate traffic, but if you suspect that an upcoming event might push you over the limits listed above, please contact us.
Handling rate limits
1. The basic way
A basic technique for integrations to gracefully handle rate limiting is to watch for 429 status codes and build in a retry mechanism. We advise that you monitor the limit returned to you as part of the Falu client, so you’re always aware of what it is. This can be done through the provided X-RateLimit-Reset value, which indicates when your current rate count will be reset. You can use this to assess when it’s safe to retry your request.
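The basic approach could look roughly like the following Python sketch. It is not Falu client code: `send` is a hypothetical callable that performs one request and returns a `(status_code, headers)` pair, the reset value is assumed to use the month/day/year format from the sample header and to be in UTC, and the one-second fallback is an arbitrary choice for when no reset header is present.

```python
import time
from datetime import datetime, timezone

RESET_FORMAT = "%m/%d/%Y %H:%M:%S"  # assumed from the sample header value


def seconds_until_reset(reset_value, now=None):
    """Return how long to wait until the reset time, never less than zero."""
    now = now or datetime.now(timezone.utc)
    reset_at = datetime.strptime(reset_value, RESET_FORMAT)
    reset_at = reset_at.replace(tzinfo=timezone.utc)  # assumption: UTC
    return max(0.0, (reset_at - now).total_seconds())


def send_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call `send` until it stops returning 429 or attempts run out."""
    status, headers = None, {}
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            break
        if attempt < max_attempts - 1:
            # Accept either spelling of the header name used on this page.
            reset = headers.get("X-RateLimit-Reset") or headers.get("RateLimit-Reset")
            delay = seconds_until_reset(reset) if reset else 1.0
            sleep(delay)
    return status, headers
```

Capping the number of attempts matters here: without it, a client that is persistently over the limit would retry forever.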
2. Retry with exponential backoff
Waiting for the reset time is one way to deal with rate limiting, but you may want more granular control. For example, you might prefer to follow your own retry timeframe rather than wait for the full minute window to pass before your rate limit resets. In that case, check for the specific rate limit error returned and retry on an exponential backoff schedule, so that request volume drops when necessary. Add some randomization to the backoff calculation so that retries from many clients do not all arrive at the same moment.
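A minimal Python sketch of exponential backoff with randomization follows. It uses the "full jitter" variant, where each delay is drawn uniformly between zero and an exponentially growing ceiling; the base delay, cap, and attempt count are illustrative values, and `send` is again a hypothetical callable returning `(status_code, headers)`.

```python
import random
import time


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random):
    """Pick a random delay in [0, min(cap, base * 2**attempt)] seconds."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling)


def retry_with_backoff(send, max_attempts=5, sleep=time.sleep, rng=random):
    """Retry `send` on 429 responses, backing off longer after each failure."""
    status, headers = None, {}
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            break
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, rng=rng))
    return status, headers
```

The cap keeps a long run of failures from producing unbounded waits, and the randomization spreads out clients that were all rate limited at the same instant.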
3. Control traffic to Falu
You can only optimize individual requests to a limited degree, so a more sophisticated approach is to control traffic to Falu at a global level and throttle it back if you detect substantial rate limiting. A common technique for controlling the rate is to implement a token bucket rate limiting algorithm on the client side. Ready-made, mature token bucket implementations exist for most programming languages.
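To make the idea concrete, here is a small single-threaded token bucket sketch in Python. The class name, its parameters, and the injectable `clock` are choices made for this illustration; a production integration would more likely use an existing library and add locking for concurrent use.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last check, up to capacity.
        now = self._clock()
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now

    def try_acquire(self, tokens=1):
        """Take `tokens` if available and return True, otherwise return False."""
        self._refill()
        if self._tokens >= tokens:
            self._tokens -= tokens
            return True
        return False

    def wait_time(self, tokens=1):
        """Return the seconds until `tokens` would be available."""
        self._refill()
        return max(0.0, (tokens - self._tokens) / self.rate)
```

A caller would check `try_acquire()` before each request and, when it returns False, sleep for `wait_time()` before trying again. Setting `rate` somewhat below the limit reported in the response headers leaves headroom for other processes sharing the same keys.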