External API access needs guardrails. Without rate limiting, a single runaway script could exhaust database connections, rack up LLM costs, or overwhelm downstream providers. Astrelo uses two layers of protection: per-key rate limiting and monthly usage quotas.
## Who Gets Rate Limited?
An important distinction: JWT users (browser sessions) are exempt from both rate limiting and quotas. These protections apply only to API key requests — external integrations hitting Astrelo’s public API.
Why? Browser users are inherently rate-limited by human interaction speed. A user clicking buttons in the UI will never generate 100 requests per second. API keys, however, can be used by automated scripts, so they need mechanical guardrails.
## Layer 1: Per-Key Rate Limiting
The rate limiter lives in `src/infrastructure/auth/dbRateLimiter.ts` and uses a fixed-window algorithm backed by PostgreSQL:
```typescript
// src/infrastructure/auth/dbRateLimiter.ts
const RATE_LIMIT = 100;    // Max requests per window
const WINDOW_SECONDS = 60; // Window size: 1 minute

export async function checkRateLimit(apiKeyId: string): Promise<RateLimitResult> {
  const key = `ratelimit:${apiKeyId}`;
  const windowStart = new Date(Date.now() - WINDOW_SECONDS * 1000);

  const result = await pool.query(
    `INSERT INTO rate_limit_buckets (key, count, window_start)
     VALUES ($1, 1, NOW())
     ON CONFLICT (key) DO UPDATE SET
       count = CASE
         WHEN rate_limit_buckets.window_start < $2 THEN 1
         ELSE rate_limit_buckets.count + 1
       END,
       window_start = CASE
         WHEN rate_limit_buckets.window_start < $2 THEN NOW()
         ELSE rate_limit_buckets.window_start
       END
     RETURNING count, window_start`,
    [key, windowStart]
  );

  const count = result.rows[0].count;
  return {
    allowed: count <= RATE_LIMIT,
    remaining: Math.max(0, RATE_LIMIT - count),
    resetAt: new Date(result.rows[0].window_start.getTime() + WINDOW_SECONDS * 1000),
  };
}
```

### How the Fixed-Window Works
The `rate_limit_buckets` table has three columns: `key` (PK), `count`, and `window_start`.
The single INSERT ... ON CONFLICT DO UPDATE query does everything atomically:
- **First request in a window:** inserts a new row with `count: 1` and `window_start: NOW()`
- **Subsequent requests in the same window:** `window_start >= windowStart` (the window hasn't expired), so `count` increments by 1
- **First request in a new window:** `window_start < windowStart` (the old window has expired), so `count` resets to 1 and `window_start` resets to `NOW()`
No cleanup job needed. No separate expiry mechanism. The window resets itself on the next request.
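The same window logic can be sketched in plain TypeScript. This is an in-memory illustration only, not the production path: the real implementation keeps this state in `rate_limit_buckets` so the upsert stays atomic across processes. The names `checkRateLimitLocal` and `Bucket` are hypothetical.

```typescript
const RATE_LIMIT = 100;
const WINDOW_MS = 60_000;

interface Bucket { count: number; windowStart: number; }
const buckets = new Map<string, Bucket>();

// Mirrors the SQL upsert: reset the bucket when the window has expired,
// otherwise increment the counter in place.
function checkRateLimitLocal(apiKeyId: string, now = Date.now()) {
  const bucket = buckets.get(apiKeyId);
  if (!bucket || now - bucket.windowStart >= WINDOW_MS) {
    // First request in a (new) window: count resets, window restarts
    buckets.set(apiKeyId, { count: 1, windowStart: now });
    return { allowed: true, remaining: RATE_LIMIT - 1 };
  }
  // Same window: increment and compare against the limit
  bucket.count += 1;
  return {
    allowed: bucket.count <= RATE_LIMIT,
    remaining: Math.max(0, RATE_LIMIT - bucket.count),
  };
}
```

Note that a single process never needs to scan for expired buckets: each bucket is repaired lazily on the next request that touches it, which is exactly the property the SQL version exploits.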
### Fail-Open Design
```typescript
try {
  const result = await checkRateLimit(apiKeyId);
  if (!result.allowed) {
    return res.status(429).json({ error: 'Rate limit exceeded' });
  }
} catch (error) {
  // Database error — fail open (allow the request)
  console.warn('[RateLimit] Check failed, allowing request:', error);
}
```

If the rate limit check itself fails (database connection dropped, query timeout), the request is allowed through. This is a deliberate choice: a rate limiter should protect the system, not become a single point of failure. Better to occasionally allow an extra request than to block legitimate traffic because the rate limit table is temporarily unreachable.
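The fail-open pattern generalizes to a small wrapper. This is a sketch; `failOpen` is a hypothetical helper, not part of the codebase:

```typescript
interface RateLimitResult { allowed: boolean; remaining?: number; }

// Wrap any async limit check so infrastructure errors never block traffic:
// a thrown error is logged and treated as "allowed".
async function failOpen(
  check: () => Promise<RateLimitResult>
): Promise<RateLimitResult> {
  try {
    return await check();
  } catch (error) {
    console.warn('[RateLimit] Check failed, allowing request:', error);
    return { allowed: true };
  }
}
```

A genuine denial (`allowed: false`) still passes through unchanged; only errors in the checking machinery itself are converted into "allow".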
### Standard Headers
Every API response includes rate limit headers:
```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 73
X-RateLimit-Reset: 1711234620
```

These follow the de facto convention that API consumers expect. The `Reset` timestamp tells the client when the current window expires, so well-behaved clients can self-throttle.
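Deriving those header values from a rate limit result is mechanical. A sketch, assuming the `RateLimitResult` shape returned by `checkRateLimit` above (`rateLimitHeaders` is a hypothetical helper name):

```typescript
interface RateLimitResult { allowed: boolean; remaining: number; resetAt: Date; }

// Convert a rate limit result into the X-RateLimit-* headers.
// Reset is expressed as a Unix timestamp in seconds, per common convention.
function rateLimitHeaders(limit: number, result: RateLimitResult): Record<string, string> {
  return {
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': String(result.remaining),
    'X-RateLimit-Reset': String(Math.floor(result.resetAt.getTime() / 1000)),
  };
}
```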
## Layer 2: Monthly Usage Quotas
Beyond per-minute rate limits, API keys have monthly usage budgets tracked in the `usage_quotas` table:
```typescript
// src/infrastructure/auth/usageTracker.ts
export async function checkQuota(
  userId: string,
  endpoint: string
): Promise<QuotaCheckResult | null> {
  // Determine which quota category this endpoint falls under
  const category = getQuotaCategory(endpoint);

  const quota = await pool.query(
    `SELECT * FROM usage_quotas WHERE user_id = $1`,
    [userId]
  );
  if (!quota.rows[0]) return null; // No quota row = unlimited

  const row = quota.rows[0];

  // Safety-net auto-reset: if reset_at has already passed, reset inline
  if (new Date(row.reset_at) < new Date()) {
    await pool.query(
      `UPDATE usage_quotas SET
         current_api_count = 0,
         current_enrichment_count = 0,
         current_discovery_count = 0,
         reset_at = NOW() + INTERVAL '1 month'
       WHERE user_id = $1`,
      [userId]
    );
    return null; // After reset, all quotas are clear
  }

  // Check the specific category
  const current = row[`current_${category}_count`];
  const limit = row[`monthly_${category}_limit`];
  if (current >= limit) {
    return { exceeded: true, category, current, limit };
  }
  return null; // Within quota
}
```

### Three Quota Categories
```typescript
function getQuotaCategory(endpoint: string): 'api' | 'enrichment' | 'discovery' {
  if (endpoint.includes('bulk-enrich') || endpoint.includes('discovery/enrich')) {
    return 'enrichment';
  }
  if (endpoint.includes('goldilocks') || endpoint.includes('discovery/prospects')
      || endpoint.includes('recommendations') || endpoint.includes('ranking/calculate')) {
    return 'discovery';
  }
  return 'api'; // Default: general API usage
}
```

| Category | Default Monthly Limit | What Counts |
|---|---|---|
| `api` | 1,000 | All API requests not in the other two categories |
| `enrichment` | 50 | Bulk enrichment and discovery enrichment |
| `discovery` | 25 | Goldilocks recommendations, prospect discovery, scoring |
Enrichment and discovery have lower limits because they’re expensive operations — each enrichment call may trigger multiple web searches and LLM calls.
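A few sample paths make the routing concrete. The classifier is reproduced from above so the example is self-contained; the endpoint paths themselves are hypothetical, chosen only to exercise each branch:

```typescript
function getQuotaCategory(endpoint: string): 'api' | 'enrichment' | 'discovery' {
  if (endpoint.includes('bulk-enrich') || endpoint.includes('discovery/enrich')) {
    return 'enrichment';
  }
  if (endpoint.includes('goldilocks') || endpoint.includes('discovery/prospects')
      || endpoint.includes('recommendations') || endpoint.includes('ranking/calculate')) {
    return 'discovery';
  }
  return 'api';
}

// Hypothetical endpoint paths, for illustration only:
getQuotaCategory('/api/companies/bulk-enrich');      // 'enrichment'
getQuotaCategory('/api/goldilocks/recommendations'); // 'discovery'
getQuotaCategory('/api/companies/123');              // 'api'
```

The order of the checks matters: enrichment is tested first, so a path like `discovery/enrich` is billed as enrichment even though it mentions discovery.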
### Usage Tracking
After the handler completes, usage is incremented:
```typescript
export async function incrementUsage(
  userId: string,
  category: 'api' | 'enrichment' | 'discovery'
): Promise<void> {
  const column = `current_${category}_count`;
  await pool.query(
    `UPDATE usage_quotas SET ${column} = ${column} + 1, updated_at = NOW()
     WHERE user_id = $1`,
    [userId]
  );
}
```

This runs in the `finally` block of the auth middleware — fire-and-forget, never blocks the response:
```typescript
finally {
  incrementUsage(userId, category).catch(() => {});
  logUsage(apiKeyId, userId, endpoint, method, statusCode).catch(() => {});
}
```

Both `incrementUsage` and `logUsage` swallow errors silently. If the increment fails, the user gets a free request. If logging fails, we lose one audit entry. Neither failure should affect the user's experience.
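The fire-and-forget shape is worth spelling out: `.catch(() => {})` detaches the promise, so a failure can neither reject the response path nor surface as an unhandled rejection. A minimal sketch with a stub that always fails (`incrementUsageStub` and `handleRequest` are hypothetical names):

```typescript
// A tracking call that may fail; here it always does, to show the pattern.
async function incrementUsageStub(): Promise<void> {
  throw new Error('quota table unreachable');
}

function handleRequest(): string {
  // Fire-and-forget: start the tracking call, swallow any error,
  // and return the response without awaiting the result.
  incrementUsageStub().catch(() => {});
  return 'response sent';
}
```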
## The Full API Key Flow
Here’s the complete flow when an API key request arrives:
```
Request arrives with X-API-Key header
        ↓
1. verifyApiKey() — hash the key, look up in api_keys table
   → 401 if not found or inactive
   → 401 if expired
        ↓
2. checkRateLimit(apiKeyId)
   → 429 if rate limit exceeded
   → Set X-RateLimit-* headers
        ↓
3. checkQuota(userId, endpoint)
   → 429 if monthly quota exceeded
        ↓
4. handler() — process the actual request
        ↓
5. finally:
   → incrementUsage(userId, category) [fire-and-forget]
   → logUsage(apiKeyId, userId, ...) [fire-and-forget]
```

The rate limit is checked first (cheap — a single DB query). The quota is checked second (slightly more expensive — it reads the full quota row). The handler only runs if both pass.
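The ordering above can be sketched as a small pipeline. This is a hypothetical shape, not the actual middleware (which is presumably Express-style); the point is that checks run cheapest-first and the handler is reached only if every check passes:

```typescript
type Failure = { status: number; error: string };
type Step = Failure | null; // null means the check passed

interface ApiKeyChecks {
  verifyApiKey: () => Step;
  checkRateLimit: () => Step;
  checkQuota: () => Step;
}

// Run the checks in order; stop at the first failure.
function runApiKeyPipeline(checks: ApiKeyChecks, handler: () => string) {
  for (const step of [checks.verifyApiKey, checks.checkRateLimit, checks.checkQuota]) {
    const failure = step();
    if (failure) return failure;
  }
  return { status: 200, body: handler() };
}
```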
## The Usage Logs Table
Every API key request is logged for analytics:
```
api_usage_logs (8 cols):
  id               UUID PK
  api_key_id       UUID FK → api_keys
  user_id          UUID FK → users
  endpoint         VARCHAR(255)
  method           VARCHAR(10)
  status_code      INT
  response_time_ms INT
  created_at       TIMESTAMPTZ
```

This table answers questions like:
- “Which API key is generating the most traffic?” (`GROUP BY api_key_id`)
- “Which endpoints are slowest?” (`AVG(response_time_ms) GROUP BY endpoint`)
- “How many 500 errors are we returning?” (`WHERE status_code = 500`)
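The three questions above map to straightforward queries. These are sketches against the schema shown; column and table names are taken from it, but there is no claim that these exact queries exist in the codebase:

```sql
-- Which API key is generating the most traffic?
SELECT api_key_id, COUNT(*) AS requests
FROM api_usage_logs
GROUP BY api_key_id
ORDER BY requests DESC;

-- Which endpoints are slowest?
SELECT endpoint, AVG(response_time_ms) AS avg_ms
FROM api_usage_logs
GROUP BY endpoint
ORDER BY avg_ms DESC;

-- How many 500 errors are we returning?
SELECT COUNT(*) FROM api_usage_logs WHERE status_code = 500;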
## Alert Pipeline Rate Limiting
Beyond API-level protection, the alert pipeline has its own rate limit:
```typescript
// Max 20 alerts per hour per user
const recentCount = await getRecentAlertCount(userId, 60);
if (recentCount >= 20) {
  return { alertsCreated: 0 };
}
const allowedCount = Math.max(0, 20 - recentCount);
const finalMatches = dedupedMatches.slice(0, allowedCount);
```

This prevents a CRM bulk update from flooding a user with hundreds of alerts. The cap is 20 per hour — high enough that real events get through, low enough that a mass import doesn’t destroy the signal-to-noise ratio.
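The capping arithmetic is simple enough to isolate as a pure function. A sketch of the slice logic above; `capAlerts` is a hypothetical helper name, not one from the codebase:

```typescript
const ALERT_HOURLY_CAP = 20;

// Given how many alerts the user already received this hour,
// keep only as many new matches as the cap allows.
function capAlerts<T>(matches: T[], recentCount: number): T[] {
  const allowed = Math.max(0, ALERT_HOURLY_CAP - recentCount);
  return matches.slice(0, allowed);
}
```

The `Math.max(0, …)` guard matters: if `recentCount` is already over the cap, the allowance must clamp to zero rather than go negative, since `slice` with a negative count would index from the end of the array.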
## Key Takeaways

- **JWT users are exempt** — rate limiting and quotas protect against automated API abuse, not human UI usage.
- **Fixed-window rate limiting** uses a single atomic upsert — no cleanup jobs, no expiry mechanisms, self-resetting.
- **Fail-open design** means the rate limiter never becomes a single point of failure.
- **Three quota categories** (API, enrichment, discovery) with different limits reflect the different costs of each operation.
- **Fire-and-forget tracking** ensures usage logging never blocks or degrades the user’s request.
- **Alert pipeline rate limiting** (20/hour) protects the notification feed from CRM bulk operations.
Next chapter: the final piece — how all of this gets deployed to AWS Amplify.