API rate limiting is a critical technique in modern web applications, especially for public APIs. It helps prevent abuse, protects server resources, and ensures fair usage among clients. In ASP.NET Core, Microsoft provides a built-in rate limiting middleware that makes it relatively straightforward to implement request throttling.
What Is Rate Limiting and Why It Matters
Rate limiting controls how many requests a client can send within a defined time window. It is widely used to prevent malicious attacks, reduce server overload, and improve overall system stability.
According to official documentation, rate limiting can help:
- Prevent abuse and brute-force attacks
- Ensure fair resource distribution
- Protect backend services from overload
- Improve performance and reliability
In short, it's one of the most effective anti-abuse (anti-spam) strategies for APIs.
Built-in Rate Limiting in ASP.NET Core
Starting with .NET 7, ASP.NET Core includes built-in rate limiting middleware in the Microsoft.AspNetCore.RateLimiting namespace. It lets developers define rate limiting policies and apply them globally or per endpoint.
1. Register Rate Limiting Services
In Program.cs, configure a rate limiting policy:
using Microsoft.AspNetCore.RateLimiting;
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("fixed", opt =>
    {
        opt.PermitLimit = 10;                  // max requests per window
        opt.Window = TimeSpan.FromSeconds(60); // time window
        opt.QueueLimit = 0;                    // reject excess requests instead of queuing
    });
});
This example allows 10 requests per minute. Note that a named policy like this keeps a single counter shared by all callers of the endpoint; per-client quotas require the partitioned limiters covered later.
2. Enable Middleware
var app = builder.Build();
app.UseRateLimiter();
This adds the rate limiting middleware to the pipeline. On its own it enforces nothing; limits apply only where an endpoint requires a policy or a global limiter is configured.
3. Apply Rate Limiting to Endpoints
app.MapGet("/api/test", () => "Hello World")
.RequireRateLimiting("fixed");
Only this endpoint will use the defined policy.
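For controller-based apps, the same policy can be applied with attributes instead of RequireRateLimiting. A minimal sketch, assuming a hypothetical TestController; the [EnableRateLimiting] and [DisableRateLimiting] attributes come from Microsoft.AspNetCore.RateLimiting:

```csharp
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.RateLimiting;

// Hypothetical controller illustrating attribute-based policy selection.
[ApiController]
[Route("api/[controller]")]
[EnableRateLimiting("fixed")] // applies the "fixed" policy to all actions
public class TestController : ControllerBase
{
    [HttpGet]
    public string Get() => "Hello World";

    // An individual action can opt back out of the controller-level policy.
    [HttpGet("unlimited")]
    [DisableRateLimiting]
    public string GetUnlimited() => "No limit here";
}
```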
Common Rate Limiting Algorithms
ASP.NET Core supports several built-in algorithms:
- Fixed Window: Limits requests within a fixed time period
- Sliding Window: Smooths request distribution across time segments
- Token Bucket: Allows bursts while maintaining average rate
- Concurrency Limiter: Controls simultaneous requests instead of request rate
Each algorithm fits different scenarios. For example, token bucket is ideal for APIs that allow burst traffic, while concurrency limiting is useful for heavy resource endpoints.
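The other algorithms are registered the same way as the fixed window shown earlier; the policy names and numbers below are illustrative, not recommendations:

```csharp
builder.Services.AddRateLimiter(options =>
{
    // Sliding window: 10 requests per 60 s, tracked across 6 ten-second segments
    options.AddSlidingWindowLimiter("sliding", opt =>
    {
        opt.PermitLimit = 10;
        opt.Window = TimeSpan.FromSeconds(60);
        opt.SegmentsPerWindow = 6;
    });

    // Token bucket: bursts up to 20 tokens, refilled with 5 tokens every 10 s
    options.AddTokenBucketLimiter("bucket", opt =>
    {
        opt.TokenLimit = 20;
        opt.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
        opt.TokensPerPeriod = 5;
    });

    // Concurrency: at most 3 requests in flight at once, 2 more may queue
    options.AddConcurrencyLimiter("concurrent", opt =>
    {
        opt.PermitLimit = 3;
        opt.QueueLimit = 2;
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    });
});
```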
Rate Limiting by IP or User
A powerful feature is partitioned rate limiting, which allows limits based on keys such as IP address or user identity.
Example (limit by IP):
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
    RateLimitPartition.GetFixedWindowLimiter(
        context.Connection.RemoteIpAddress?.ToString() ?? "unknown",
        _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = 5,
            Window = TimeSpan.FromSeconds(10)
        }));
This ensures each IP has its own request quota, preventing a single client from consuming all resources.
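The same pattern works for per-user limits. A sketch assuming authenticated requests carry a user name, falling back to the client IP for anonymous traffic; the quota numbers are illustrative:

```csharp
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
{
    // Partition authenticated users by name, anonymous clients by IP
    var key = context.User.Identity?.IsAuthenticated == true
        ? context.User.Identity.Name!
        : context.Connection.RemoteIpAddress?.ToString() ?? "unknown";

    return RateLimitPartition.GetFixedWindowLimiter(key, _ => new FixedWindowRateLimiterOptions
    {
        PermitLimit = 100,
        Window = TimeSpan.FromMinutes(1)
    });
});
```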
Handling Requests That Exceed Limits
When a request exceeds the limit, you should return HTTP 429 Too Many Requests (the middleware's default rejection status is 503 Service Unavailable):
builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
    options.OnRejected = async (context, token) =>
    {
        context.HttpContext.Response.Headers["Retry-After"] = "60";
        await context.HttpContext.Response.WriteAsync("Too many requests", token);
    };
});
Returning 429 with a Retry-After header tells well-behaved clients when it is safe to retry and aligns with standard API conventions.
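Rather than hard-coding the Retry-After value, the rejected lease can report it when the limiter knows it. A sketch using the lease metadata API from System.Threading.RateLimiting:

```csharp
options.OnRejected = async (context, cancellationToken) =>
{
    // Ask the lease how long the client should wait, if the limiter exposes it
    if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
    {
        context.HttpContext.Response.Headers.RetryAfter =
            ((int)retryAfter.TotalSeconds).ToString();
    }

    context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
    await context.HttpContext.Response.WriteAsync("Too many requests", cancellationToken);
};
```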
Practical Anti-Abuse Strategies
In real-world scenarios, rate limiting is often combined with other techniques:
- IP blacklisting / firewall rules
- CAPTCHA verification for suspicious traffic
- API keys with tiered limits
- Reverse proxy throttling (e.g., Nginx, Cloudflare)
Rate limiting alone is helpful but not a complete defense against large-scale attacks such as DDoS.
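Tiered API-key limits can be expressed with the same partitioning mechanism shown earlier. The X-Api-Key header, the "premium-" prefix check, and the quota numbers below are all illustrative; a real app would look the key up in a store:

```csharp
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
{
    // Illustrative: read an API key from a custom header
    var apiKey = context.Request.Headers["X-Api-Key"].ToString();

    // Hypothetical tier check; replace with a real key lookup
    var isPremium = apiKey.StartsWith("premium-");

    return RateLimitPartition.GetFixedWindowLimiter(
        apiKey.Length > 0 ? apiKey : "anonymous",
        _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = isPremium ? 1000 : 60,
            Window = TimeSpan.FromMinutes(1)
        });
});
```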
Best Practices
- Use different limits for different endpoints (e.g., login vs public API)
- Apply stricter limits on sensitive operations
- Combine with logging and monitoring
- Perform load testing before production deployment
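The first two practices can be combined by registering separate named policies; the policy names, routes, and numbers below are illustrative:

```csharp
builder.Services.AddRateLimiter(options =>
{
    // Strict policy for sensitive operations such as login
    options.AddFixedWindowLimiter("login", opt =>
    {
        opt.PermitLimit = 5;
        opt.Window = TimeSpan.FromMinutes(1);
    });

    // Looser policy for the public API
    options.AddFixedWindowLimiter("public", opt =>
    {
        opt.PermitLimit = 100;
        opt.Window = TimeSpan.FromMinutes(1);
    });
});

var app = builder.Build();
app.UseRateLimiter();

app.MapPost("/api/login", () => Results.Ok())
   .RequireRateLimiting("login");

app.MapGet("/api/products", () => Results.Ok())
   .RequireRateLimiting("public");
```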