Raw alerts are useful but terse: “Deal moved backward from Negotiation to Qualification.” AI content generation enriches each alert with analysis, context, and actionable recommendations — turning a notification into an intelligence briefing.
The Generation Pipeline
AI content is generated asynchronously. After alerts are persisted, the evaluation service queues alert_ai_content jobs:
// Jobs are queued in chunks of 3 to manage LLM costs
const CHUNK_SIZE = 3;
for (let i = 0; i < alertIds.length; i += CHUNK_SIZE) {
  const chunk = alertIds.slice(i, i + CHUNK_SIZE);
  await pool.query(
    `INSERT INTO jobs (user_id, job_type, status, params)
     VALUES ($1, 'alert_ai_content', 'pending', $2)`,
    [userId, JSON.stringify({ alertIds: chunk })]
  );
}
A cron job (process-alert-ai-content, every minute) picks up these jobs and calls the LLM.
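The worker side of that cron job isn't shown here, but the first thing it has to do is deserialize and sanity-check the params column before spending money on an LLM call. A minimal sketch of that step — the interface and function names are illustrative, not from the codebase:

```typescript
// Hypothetical shape of an alert_ai_content job's params column.
interface AlertAiContentParams {
  alertIds: string[];
}

// Parse and validate the JSON stored in jobs.params. Returning null lets the
// worker mark the job as failed instead of crashing the whole cron run.
function parseAlertJobParams(raw: string): AlertAiContentParams | null {
  try {
    const parsed = JSON.parse(raw);
    if (
      parsed === null ||
      typeof parsed !== "object" ||
      !Array.isArray(parsed.alertIds) ||
      parsed.alertIds.length === 0 ||
      !parsed.alertIds.every((id: unknown) => typeof id === "string")
    ) {
      return null;
    }
    return { alertIds: parsed.alertIds };
  } catch {
    return null;
  }
}
```

Defensive parsing matters here because the params column is just JSONB — nothing at the database layer guarantees the shape the worker expects.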
What AI Content Looks Like
The ai_content JSONB column on realtime_alerts stores structured analysis:
{
  "analysis": "This deal regression from Negotiation to Qualification is a significant red flag. The deal value of $85,000 combined with the backward movement suggests the prospect may have raised new objections or brought in additional stakeholders who need convincing.",
  "recommendations": [
    "Schedule a call with the primary contact to understand what changed",
    "Review the last meeting notes for any unresolved concerns",
    "Prepare a competitive comparison if a new vendor entered the picture"
  ],
  "riskLevel": "high",
  "suggestedNextStep": "Request a 15-minute check-in call within the next 48 hours",
  "generatedAt": "2024-03-20T14:30:00Z"
}
This is generated by sending the alert’s context data to the 70B model with a structured prompt:
You are a B2B sales advisor. Analyze this alert and provide actionable guidance.
ALERT: Deal stage regression on Acme Corp
- Previous stage: Negotiation
- New stage: Qualification
- Deal value: $85,000
- Days in previous stage: 12
- Last activity: 5 days ago
- Contact coverage: 2 contacts out of estimated 4-person committee
Respond in JSON format with: analysis, recommendations (array), riskLevel, suggestedNextStep.
The response is parsed (JSON mode ensures valid JSON), validated, and stored.
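JSON mode guarantees syntactically valid JSON, but not that the model produced the right fields — the validation step still has to check the shape before writing the row. A sketch of what that check might look like (the function name and the decision to stamp generatedAt at store time are assumptions):

```typescript
type RiskLevel = "low" | "medium" | "high";

interface AlertAiContent {
  analysis: string;
  recommendations: string[];
  riskLevel: RiskLevel;
  suggestedNextStep: string;
  generatedAt: string;
}

// Validate the model's JSON response. JSON.parse can still fail if the call
// was made without JSON mode, so the parse is guarded too.
function validateAiContent(raw: string): AlertAiContent | null {
  let parsed: any;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }
  const risks: RiskLevel[] = ["low", "medium", "high"];
  if (
    typeof parsed?.analysis !== "string" ||
    !Array.isArray(parsed?.recommendations) ||
    !parsed.recommendations.every((r: unknown) => typeof r === "string") ||
    !risks.includes(parsed?.riskLevel)
  ) {
    return null;
  }
  return {
    analysis: parsed.analysis,
    recommendations: parsed.recommendations,
    riskLevel: parsed.riskLevel,
    suggestedNextStep: String(parsed.suggestedNextStep ?? ""),
    // generatedAt is stamped server-side, not taken from the model.
    generatedAt: new Date().toISOString(),
  };
}
```

A null return here would mark the job for retry rather than storing a malformed ai_content row.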
Cost Management
AI content generation is the most expensive part of the alert pipeline. Each call costs roughly $0.001-0.003 (70B model). With 20 alerts per user per day, that’s $0.02-0.06 per user per day — manageable but worth optimizing.
Chunking (3 alerts per job) reduces overhead by batching multiple alerts into fewer LLM calls when possible.
Async processing means alerts appear instantly in the feed without waiting for AI content. The analysis loads later when the user opens the alert detail drawer.
Caching through the Groq client (Chapter 14) prevents regeneration if the same alert context appears twice.
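Context-based caching needs a deterministic key: two alerts with identical context data should hash identically regardless of object key order. A sketch of one way to derive such a key — the real Groq client's key scheme may differ, and this version only sorts top-level keys:

```typescript
import { createHash } from "node:crypto";

// Build a deterministic cache key from an alert's context data. Keys are
// sorted so that insertion order doesn't change the hash. Nested objects
// would need recursive sorting; omitted here for brevity.
function alertCacheKey(context: Record<string, unknown>): string {
  const stable = JSON.stringify(
    Object.keys(context)
      .sort()
      .map((k) => [k, context[k]])
  );
  return createHash("sha256").update(stable).digest("hex");
}
```

With a key like this, the client can return the stored analysis for a repeated context instead of paying for a second LLM call.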
The Alert Detail Drawer
When a user clicks an alert, the AlertDetailDrawer component fetches the AI content:
// GET /api/alerts/{id}/ai-content
const { data } = useQuery({
  queryKey: queryKeys.alerts.aiContent(alertId),
  queryFn: async () => {
    const res = await fetch(`/api/alerts/${alertId}/ai-content`, {
      headers: { Authorization: `Bearer ${token}` },
    });
    if (!res.ok) throw new Error(`AI content request failed: ${res.status}`);
    return res.json();
  },
  enabled: !!alertId,
});
If AI content hasn’t been generated yet (the job is still pending), the API returns { status: 'pending' } and the UI shows a loading skeleton. The query polls until content is available.
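The polling itself can be expressed through TanStack Query's refetchInterval option, which accepts a function of the query state. A sketch of the decision logic — the 2-second interval and the response shape are assumptions, not values from the codebase:

```typescript
// Assumed shape of GET /api/alerts/{id}/ai-content responses.
interface AiContentResponse {
  status?: "pending" | "ready";
  analysis?: string;
}

// Keep polling every 2s while generation is pending; returning false stops
// the polling once content arrives. Wired up roughly as:
//   refetchInterval: (query) => pollInterval(query.state.data)
function pollInterval(data: AiContentResponse | undefined): number | false {
  if (data === undefined || data.status === "pending") return 2000;
  return false;
}
```

Returning false from the interval function is what makes the polling self-terminating — no extra cleanup effect is needed in the component.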
Key Takeaways
- AI content is async. Alerts appear immediately; analysis loads in the background.
- Structured JSON output provides consistent fields (analysis, recommendations, risk level).
- Cost is managed through chunking, caching, and the async queue pattern.
- The alert drawer lazy-loads AI content on demand — most alerts are never opened, so most AI content is never displayed.
Next chapter: we enter Part 5 — the Cosmo chat agent — starting with its tool architecture.