API rate limiting is a critical technique in modern web applications, especially for public APIs. It helps prevent abuse, protects server resources, and ensures fair usage among clients. In ASP.NET Core, Microsoft provides a built-in rate limiting middleware that makes it relatively straightforward to implement request throttling.
What Is Rate Limiting and Why It Matters
Rate limiting controls how many requests a client can send within a defined time window. It is widely used to prevent malicious attacks, reduce server overload, and improve overall system stability.
According to official documentation, rate limiting can help:
- Prevent abuse and brute-force attacks
- Ensure fair resource distribution
- Protect backend services from overload
- Improve performance and reliability
In short, it's one of the most effective anti-abuse (anti-spam) strategies for APIs.
Built-in Rate Limiting in ASP.NET Core
Starting with .NET 7, ASP.NET Core includes built-in rate limiting middleware in the Microsoft.AspNetCore.RateLimiting namespace. It lets developers define rate limiting policies and apply them globally or per endpoint.
1. Register Rate Limiting Services
In Program.cs, configure a rate limiting policy:
using Microsoft.AspNetCore.RateLimiting;
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("fixed", opt =>
    {
        opt.PermitLimit = 10;                  // max requests per window
        opt.Window = TimeSpan.FromSeconds(60); // time window
        opt.QueueLimit = 0;                    // reject excess requests instead of queuing
    });
});
This example allows 10 requests per minute. Note that a named policy like this keeps a single counter shared by all callers of the endpoint; per-client quotas require the partitioned limiters covered later.
2. Enable Middleware
var app = builder.Build();
app.UseRateLimiter();
This adds the rate limiting middleware to the pipeline. On its own it enforces nothing; limits apply only where an endpoint requires a policy or a global limiter is configured.
3. Apply Rate Limiting to Endpoints
app.MapGet("/api/test", () => "Hello World")
.RequireRateLimiting("fixed");
Only this endpoint will use the defined policy.
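For controller-based apps, the same policy can be applied with attributes instead of RequireRateLimiting. A minimal sketch, assuming a hypothetical TestController; the [EnableRateLimiting] and [DisableRateLimiting] attributes come from Microsoft.AspNetCore.RateLimiting:

```csharp
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.RateLimiting;

// Hypothetical controller illustrating attribute-based policy selection.
[ApiController]
[Route("api/[controller]")]
[EnableRateLimiting("fixed")] // applies the "fixed" policy to all actions
public class TestController : ControllerBase
{
    [HttpGet]
    public string Get() => "Hello World";

    // An individual action can opt back out of the controller-level policy.
    [HttpGet("unlimited")]
    [DisableRateLimiting]
    public string GetUnlimited() => "No limit here";
}
```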
Common Rate Limiting Algorithms
ASP.NET Core supports several built-in algorithms:
- Fixed Window: Limits requests within a fixed time period
- Sliding Window: Smooths request distribution across time segments
- Token Bucket: Allows bursts while maintaining average rate
- Concurrency Limiter: Controls simultaneous requests instead of request rate
Each algorithm fits different scenarios. For example, token bucket is ideal for APIs that allow burst traffic, while concurrency limiting is useful for heavy resource endpoints.
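The other algorithms are registered the same way as the fixed window shown earlier; the policy names and numbers below are illustrative, not recommendations:

```csharp
builder.Services.AddRateLimiter(options =>
{
    // Sliding window: 10 requests per 60 s, tracked across 6 ten-second segments
    options.AddSlidingWindowLimiter("sliding", opt =>
    {
        opt.PermitLimit = 10;
        opt.Window = TimeSpan.FromSeconds(60);
        opt.SegmentsPerWindow = 6;
    });

    // Token bucket: bursts up to 20 tokens, refilled with 5 tokens every 10 s
    options.AddTokenBucketLimiter("bucket", opt =>
    {
        opt.TokenLimit = 20;
        opt.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
        opt.TokensPerPeriod = 5;
    });

    // Concurrency: at most 3 requests in flight at once, 2 more may queue
    options.AddConcurrencyLimiter("concurrent", opt =>
    {
        opt.PermitLimit = 3;
        opt.QueueLimit = 2;
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    });
});
```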
Rate Limiting by IP or User
A powerful feature is partitioned rate limiting, which allows limits based on keys such as IP address or user identity.
Example (limit by IP):
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
    RateLimitPartition.GetFixedWindowLimiter(
        context.Connection.RemoteIpAddress?.ToString() ?? "unknown",
        _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = 5,
            Window = TimeSpan.FromSeconds(10)
        }));
This ensures each IP has its own request quota, preventing a single client from consuming all resources.
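The same pattern works for per-user limits. A sketch assuming authenticated requests carry a user name, falling back to the client IP for anonymous traffic; the quota numbers are illustrative:

```csharp
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
{
    // Partition authenticated users by name, anonymous clients by IP
    var key = context.User.Identity?.IsAuthenticated == true
        ? context.User.Identity.Name!
        : context.Connection.RemoteIpAddress?.ToString() ?? "unknown";

    return RateLimitPartition.GetFixedWindowLimiter(key, _ => new FixedWindowRateLimiterOptions
    {
        PermitLimit = 100,
        Window = TimeSpan.FromMinutes(1)
    });
});
```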
Handling Requests That Exceed Limits
When a request exceeds the limit, you should return HTTP 429 Too Many Requests (the middleware's default rejection status is 503 Service Unavailable):
builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
    options.OnRejected = async (context, token) =>
    {
        context.HttpContext.Response.Headers["Retry-After"] = "60";
        await context.HttpContext.Response.WriteAsync("Too many requests", token);
    };
});
Returning 429 with a Retry-After header tells well-behaved clients when it is safe to retry and aligns with standard API conventions.
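Rather than hard-coding the Retry-After value, the rejected lease can report it when the limiter knows it. A sketch using the lease metadata API from System.Threading.RateLimiting:

```csharp
options.OnRejected = async (context, cancellationToken) =>
{
    // Ask the lease how long the client should wait, if the limiter exposes it
    if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
    {
        context.HttpContext.Response.Headers.RetryAfter =
            ((int)retryAfter.TotalSeconds).ToString();
    }

    context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
    await context.HttpContext.Response.WriteAsync("Too many requests", cancellationToken);
};
```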
Practical Anti-Abuse Strategies
In real-world scenarios, rate limiting is often combined with other techniques:
- IP blacklisting / firewall rules
- CAPTCHA verification for suspicious traffic
- API keys with tiered limits
- Reverse proxy throttling (e.g., Nginx, Cloudflare)
Rate limiting alone is helpful but not a complete defense against large-scale attacks such as DDoS.
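Tiered API-key limits can be expressed with the same partitioning mechanism shown earlier. The X-Api-Key header, the "premium-" prefix check, and the quota numbers below are all illustrative; a real app would look the key up in a store:

```csharp
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
{
    // Illustrative: read an API key from a custom header
    var apiKey = context.Request.Headers["X-Api-Key"].ToString();

    // Hypothetical tier check; replace with a real key lookup
    var isPremium = apiKey.StartsWith("premium-");

    return RateLimitPartition.GetFixedWindowLimiter(
        apiKey.Length > 0 ? apiKey : "anonymous",
        _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = isPremium ? 1000 : 60,
            Window = TimeSpan.FromMinutes(1)
        });
});
```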
Best Practices
- Use different limits for different endpoints (e.g., login vs public API)
- Apply stricter limits on sensitive operations
- Combine with logging and monitoring
- Perform load testing before production deployment
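The first two practices can be combined by registering separate named policies; the policy names, routes, and numbers below are illustrative:

```csharp
builder.Services.AddRateLimiter(options =>
{
    // Strict policy for sensitive operations such as login
    options.AddFixedWindowLimiter("login", opt =>
    {
        opt.PermitLimit = 5;
        opt.Window = TimeSpan.FromMinutes(1);
    });

    // Looser policy for the public API
    options.AddFixedWindowLimiter("public", opt =>
    {
        opt.PermitLimit = 100;
        opt.Window = TimeSpan.FromMinutes(1);
    });
});

var app = builder.Build();
app.UseRateLimiter();

app.MapPost("/api/login", () => Results.Ok())
   .RequireRateLimiting("login");

app.MapGet("/api/products", () => Results.Ok())
   .RequireRateLimiting("public");
```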