Cosmo is Astrelo’s AI chat agent — a conversational interface over the entire platform. Ask it “how’s my pipeline?” and it pulls data from six tools, synthesizes them through an LLM, and responds with a strategic briefing. Ask it “draft an email to Sarah at Acme Corp” and it resolves the contact, generates a personalized email, and presents it for your approval.
This chapter covers how Cosmo works under the hood: the message pipeline, intent classification, and tool registry.
## The Message Pipeline
Every message flows through `chatService.processMessage()`, a ~1,000-line orchestrator that handles the full lifecycle:
```
User types message
    ↓
1. Create or load conversation
2. Persist user message
3. Load last 10 messages (for context)
4. Extract prior entity context (company names from last response)
5. Classify intent via 8B model
    ↓
Branch A: Confirmation of pending action → execute immediately
Branch B: Action request → resolve entities, propose or execute
Branch C: Data query → run tools, generate response via 70B
    ↓
6. Annotate company/contact names (clickable links)
7. Inject contextual action buttons
8. Persist assistant message
```

Two LLM calls per normal request: the 8B model classifies intent (fast, cheap), then the 70B model generates the response (slower, smarter). This two-model split keeps costs low while maintaining quality where it matters: in the response the user actually reads.
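The lifecycle above can be sketched as a simplified orchestrator. This is a minimal sketch, not the real code: every helper on `deps` (`loadHistory`, `classify`, `runTools`, `generate`) is a hypothetical stand-in for a piece of the actual ~1,000-line `processMessage()`.

```typescript
// Hypothetical, simplified shape of the message pipeline.
type Classification = {
  tools: string[];
  isAction: boolean;
  isConfirmation: boolean;
};

type Msg = { role: 'user' | 'assistant'; content: string };

interface Deps {
  loadHistory: (conversationId: string) => Promise<Msg[]>;
  classify: (text: string, history: Msg[]) => Promise<Classification>;
  runTools: (tools: string[]) => Promise<Record<string, unknown>>;
  generate: (text: string, toolResults: Record<string, unknown>) => Promise<string>;
}

async function processMessage(
  userId: string, // scopes all DB access in the real system
  conversationId: string | null,
  text: string,
  deps: Deps
): Promise<string> {
  const convId = conversationId ?? 'new-conversation'; // 1. create or load (id generation elided)
  const history = await deps.loadHistory(convId);      // 2–3. persist user message, load last 10
  const cls = await deps.classify(text, history);      // 4–5. prior entities + 8B classification

  if (cls.isConfirmation) return 'Executing pending action…'; // Branch A
  if (cls.isAction) return 'Proposing action for approval…';  // Branch B

  // Branch C: data query. Run the selected tools, then generate via the 70B model.
  const results = await deps.runTools(cls.tools);
  return deps.generate(text, results); // steps 6–8 (annotation, buttons, persist) follow
}
```

The branching order matters: confirmations short-circuit before anything else, so a "yes" never accidentally triggers a fresh tool run.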
## Conversation Persistence
Messages are stored in two tables:
- `chat_conversations`: id, user_id, title, created_at, updated_at
- `chat_messages`: id, conversation_id, role, content, tool_calls (JSONB), token_usage (JSONB), created_at

The `tool_calls` JSONB column is the workhorse. It stores everything the UI needs beyond plain text:
```json
{
  "tool_result": { "deals": [...], "pipeline": {...} },
  "chart_data": { "type": "bar", "data": [...] },
  "company_annotations": [{ "id": "uuid", "name": "Acme Corp" }],
  "chat_actions": [{ "type": "action_button", "label": "Draft Email", ... }],
  "email_preview": { "to": "sarah@acme.com", "subject": "...", "body": "..." },
  "pending_action": { "actionTool": "send_email", "params": {...} }
}
```

This is a clever design: the message content is always plain text (renderable anywhere), while `tool_calls` carries rich structured data for the React UI to render as charts, buttons, and interactive elements.
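Under this design the payload can be modeled as a single type with all-optional fields. The field names come from the example above; the exact nested shapes are assumptions for illustration:

```typescript
// Assumed shape of the tool_calls JSONB payload. Every field is optional:
// a plain-text answer stores {} and the UI renders nothing extra.
interface ToolCallsPayload {
  tool_result?: Record<string, unknown>;
  chart_data?: { type: string; data: unknown[] };
  company_annotations?: { id: string; name: string }[];
  chat_actions?: { type: string; label: string }[];
  email_preview?: { to: string; subject: string; body: string };
  pending_action?: { actionTool: string; params: Record<string, unknown> };
}

// The UI can branch on which fields are present (hypothetical helper).
function renderHints(p: ToolCallsPayload): string[] {
  const hints: string[] = [];
  if (p.chart_data) hints.push(`chart:${p.chart_data.type}`);
  if (p.chat_actions?.length) hints.push(`${p.chat_actions.length} action button(s)`);
  if (p.email_preview) hints.push('email preview');
  if (p.pending_action) hints.push(`pending:${p.pending_action.actionTool}`);
  return hints;
}
```

Storing `{}` for plain replies keeps the common case cheap; the renderer only branches when a field is actually present.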
## Intent Classification
The 8B model (Llama 3.1 8B via Groq) classifies every user message:
```typescript
// src/infrastructure/chat/intentClassifier.ts
const classification = await classifyIntent(message, conversationHistory, priorEntities);
```

The classifier returns:
```json
{
  "tools": ["get_pipeline_health", "get_stalled_deals"],
  "confidence": 0.85,
  "reasoning": "User asked about pipeline, selecting health overview and stalled deals",
  "isAction": false,
  "isConfirmation": false,
  "searchTerm": null,
  "actionParams": null
}
```

### How Tool Selection Works
The system prompt given to the 8B model is extensive: it lists all 83 valid tools organized by category, with example messages and selection rules:

```
You are an intent classifier for a B2B sales AI assistant.

DATA TOOLS (42):
Pipeline: get_pipeline_health, get_stalled_deals, get_deals_by_stage, ...
Discovery: get_discovery_prospects, get_top_recommendations, ...
Revenue: get_quota_attainment, get_quarterly_forecast, ...

ACTION TOOLS (22):
send_email, draft_email, create_followup_task, notify_rep, ...

SELECTION RULES:
- "How's my pipeline?" → [get_pipeline_health]
- "Show me stalled deals" → [get_stalled_deals]
- "Tell me about Acme Corp" → [get_company_detail] with searchTerm: "Acme Corp"
- "Draft an email to Sarah" → isAction: true, actionTool: draft_email
```

The model returns JSON (JSON mode enforced), which is then post-processed:
- **Tool name validation**: unknown tool names are filtered out
- **Action detection**: if any tool is in `ACTION_TOOL_NAMES`, `isAction` is set
- **Param sanitization**: `actionParams` is validated into typed shapes per action tool
- **Fallback**: on any error, returns `get_pipeline_health` with confidence 0.3
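That post-processing can be sketched as follows. The registries here are illustrative two- and three-entry subsets (the real ones hold 83 tools), and the param-sanitization step is elided:

```typescript
// Illustrative subsets; the real registries are much larger.
const VALID_TOOLS = new Set(['get_pipeline_health', 'get_stalled_deals', 'draft_email']);
const ACTION_TOOL_NAMES = new Set(['draft_email', 'send_email']);

interface Classification {
  tools: string[];
  confidence: number;
  isAction: boolean;
}

function postProcess(raw: unknown): Classification {
  try {
    const parsed = raw as { tools?: string[]; confidence?: number };
    // 1. Tool name validation: drop anything not in the registry.
    const tools = (parsed.tools ?? []).filter((t) => VALID_TOOLS.has(t));
    if (tools.length === 0) throw new Error('no valid tools');
    // 2. Action detection: any action tool flips isAction.
    const isAction = tools.some((t) => ACTION_TOOL_NAMES.has(t));
    // 3. Param sanitization omitted for brevity.
    return { tools, confidence: parsed.confidence ?? 0.5, isAction };
  } catch {
    // 4. Fallback: safe default with low confidence.
    return { tools: ['get_pipeline_health'], confidence: 0.3, isAction: false };
  }
}
```

Routing every failure to a low-confidence pipeline overview means a malformed classification degrades to a harmless default instead of an error.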
### Entity Carry-Forward
A subtlety: when you ask “tell me about Acme Corp” followed by “who are the contacts?”, the second message has no company name. The classifier handles this through entity carry-forward:
```typescript
// If priorEntities are present, inject them into the classification prompt
if (priorEntities?.companyNames?.length) {
  messages.push({
    role: 'system',
    content: `Previous conversation context: Companies discussed: ${priorEntities.companyNames.join(', ')}.
If the user's message is a follow-up, use these entities.`
  });
}
```

The prior entities are extracted from the last assistant message's `company_annotations` in `tool_calls`.
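That extraction step can be sketched as a walk backwards through the loaded history. The message shape here is an assumption based on the schema described earlier:

```typescript
// Assumed minimal shape of a stored chat message.
interface StoredMessage {
  role: 'user' | 'assistant';
  tool_calls?: { company_annotations?: { id: string; name: string }[] };
}

// Hypothetical helper: find the most recent assistant message that
// carries company_annotations and surface its company names.
function extractPriorEntities(history: StoredMessage[]): { companyNames: string[] } {
  for (let i = history.length - 1; i >= 0; i--) {
    const msg = history[i];
    if (msg.role !== 'assistant') continue;
    const annotations = msg.tool_calls?.company_annotations ?? [];
    if (annotations.length > 0) {
      return { companyNames: annotations.map((a) => a.name) };
    }
  }
  return { companyNames: [] };
}
```

Because the annotations are already persisted in `tool_calls`, no extra LLM call is needed to recover "which company were we just talking about".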
## The Tool Registry
42 data tools are organized across three files:

| File | Tools | Purpose |
|---|---|---|
| `tools/pipeline.ts` | 18 | Pipeline, deals, companies, contacts |
| `tools/discovery.ts` | 6 | Discovery, Goldilocks, exploration |
| `tools/revenue.ts` | 18 | Revenue metrics, forecasts, benchmarks |
Every tool has the same signature:
```typescript
type ChatToolFn = (
  userId: string,
  icpProfileId?: string,
  searchTerm?: string
) => Promise<Record<string, unknown>>;
```

Three parameters, that's it. The `userId` scopes all database queries (multi-tenancy). The `icpProfileId` selects which ICP profile's scores to use. The `searchTerm` is the entity the user mentioned ("Acme Corp", "Sarah", etc.).
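A tool conforming to this signature might look like the sketch below. Only the 3-parameter signature comes from the text; the data-access layer (`db.deals.findStalled`) and the returned field names are hypothetical stand-ins:

```typescript
type ChatToolFn = (
  userId: string,
  icpProfileId?: string,
  searchTerm?: string
) => Promise<Record<string, unknown>>;

// Stand-in data layer: the real tools query Postgres.
const db = {
  deals: {
    findStalled: async (userId: string) => [
      { company: 'Acme Corp', days_stalled: 23, value: 85000, owner: userId },
    ],
  },
};

// Hypothetical stalled-deals tool matching the shared signature.
const getStalledDeals: ChatToolFn = async (userId, _icpProfileId, _searchTerm) => {
  const deals = await db.deals.findStalled(userId); // userId scopes the query
  return { stalled_count: deals.length, deals };
};
```

Because every tool takes the same three parameters, the orchestrator can dispatch any of them without per-tool plumbing, and each one can be unit-tested with nothing but a user id.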
The tools are registered in a flat object:
```typescript
// src/infrastructure/chat/tools/index.ts
export const chatTools: ChatToolRegistry = {
  get_pipeline_health: getPipelineHealth,
  get_stalled_deals: getStalledDeals,
  get_company_detail: getCompanyDetail,
  find_contacts: findContacts,
  get_quota_attainment: getQuotaAttainment,
  // ... 37 more
};
```

Each tool is a pure function that queries the database and returns structured data. No tool calls the LLM directly; that's the orchestrator's job.
### Tool Descriptions for the LLM

The LLM needs to understand what each tool returned. `TOOL_DESCRIPTIONS` maps tool names to human-readable labels:
```typescript
export const TOOL_DESCRIPTIONS: Record<string, string> = {
  get_pipeline_health: 'Pipeline Health Overview',
  get_stalled_deals: 'Stalled Deals Analysis',
  get_company_detail: 'Company Detail',
  get_quota_attainment: 'Quota Attainment Report',
  // ...
};
```

When building the LLM prompt, tool results are prefixed with these descriptions so the model knows what data it's looking at:
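This prefixing step can be sketched with a hypothetical `buildToolSections` helper; only the `=== label ===` format and the description map come from the text:

```typescript
const TOOL_DESCRIPTIONS: Record<string, string> = {
  get_pipeline_health: 'Pipeline Health Overview',
  get_stalled_deals: 'Stalled Deals Analysis',
};

// Hypothetical helper: prefix each tool's result with its human-readable
// label so the 70B model knows what each JSON blob represents.
function buildToolSections(results: Record<string, unknown>): string {
  return Object.entries(results)
    .map(([tool, result]) => {
      const label = TOOL_DESCRIPTIONS[tool] ?? tool; // fall back to the raw name
      return `=== ${label} ===\n${JSON.stringify(result)}`;
    })
    .join('\n\n');
}
```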
```
=== Pipeline Health Overview ===
{total_deals: 47, total_value: 2340000, stalled_count: 5, ...}

=== Stalled Deals Analysis ===
[{company: "Acme Corp", days_stalled: 23, value: 85000}, ...]
```

### A Typical Data Query Flow
Let’s trace “how’s my pipeline?” through the system:

1. **Classify**: the 8B model returns `tools: ["get_pipeline_health"]`, confidence 0.9
2. **Dispatch**: `chatTools.get_pipeline_health(userId, icpProfileId)` runs
3. **Tool executes**: queries the `deals` and `pipeline_snapshots` tables, returns structured data
4. **Context building**: `buildCosmoContext()` adds pipeline snapshot, win profile, recent activity (cached 1 hour)
5. **LLM prompt**: tool results + context + conversation history → 70B model
6. **Response**: “Your pipeline looks solid with 47 active deals worth $2.3M. However, 5 deals are stalled — Acme Corp ($85K) has been in Negotiation for 23 days…”
7. **Annotation**: company names in the response become clickable `[[COMPANY:uuid:Acme Corp]]` markers
8. **Action buttons**: the system injects “View Stalled Deals” and “Create Follow-up Tasks” buttons
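The annotation pass in step 7 can be sketched as a simple text rewrite. The `[[COMPANY:uuid:Name]]` marker format comes from the text; the helper itself is hypothetical:

```typescript
// Hypothetical annotation pass: rewrite known company names in the
// response as [[COMPANY:uuid:Name]] markers the UI renders as links.
function annotateCompanies(
  text: string,
  companies: { id: string; name: string }[]
): string {
  let out = text;
  for (const c of companies) {
    // split/join replaces every occurrence without regex-escaping the name
    out = out.split(c.name).join(`[[COMPANY:${c.id}:${c.name}]]`);
  }
  return out;
}
```

The same company list that drives these markers is what gets persisted as `company_annotations`, which is how entity carry-forward works on the next turn.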
## Key Takeaways
- **Two-model architecture** keeps costs low (8B for classification) while maintaining response quality (70B for generation).
- **83 tools** (42 data + 22 action + shared utilities) cover the full surface area of the platform.
- **Entity carry-forward** lets users have natural follow-up conversations without repeating company names.
- **`tool_calls` JSONB** stores rich structured data (charts, buttons, annotations) alongside plain-text messages.
- **Every tool has the same 3-parameter signature**: simple, consistent, and testable.
Next chapter: how tools get dispatched, how results are combined, and what happens when Cosmo can’t find what you asked about.