The pipeline. A Vercel cron fires every morning at 6 AM ET. For yesterday's ReadyMode call recordings (~1,200/day), the pipeline pulls each MP3 from S3, decides a processing tier from call duration, runs transcription, runs analysis, and inserts a row into call_insights.
Calls are routed into three tiers by duration, after we pre-filter dispositions that have no conversation (Voicemail, WrongNumber, PressAny, DNC, NotLogged):
- Skipped: call < 30 s. Metadata only, no model spend.
- Cheap: call 30–59 s. gpt-4o-mini-transcribe + gpt-4o-mini analysis with a shortened prompt. ~$0.005/call.
- Full: call ≥ 60 s. gpt-4o-transcribe-diarize with the rep's reference voice clip + gpt-4.1-mini analysis on the full coaching schema. ~$0.03/call.
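The routing above can be sketched as one pure function. This is a minimal sketch, not the production code: the names chooseTier, Tier, and NO_CONVERSATION_DISPOSITIONS are hypothetical, and it assumes a fractional duration between 59 s and 60 s falls into the cheap tier, which the tier list does not state.

```typescript
// Hypothetical names; only the thresholds and the disposition list come from the text.
type Tier = "filtered" | "skipped" | "cheap" | "full";

// Dispositions that never contain a conversation and are dropped before tiering.
const NO_CONVERSATION_DISPOSITIONS = new Set<string>([
  "Voicemail",
  "WrongNumber",
  "PressAny",
  "DNC",
  "NotLogged",
]);

function chooseTier(disposition: string, durationSeconds: number): Tier {
  // Pre-filter runs first, regardless of call length.
  if (NO_CONVERSATION_DISPOSITIONS.has(disposition)) return "filtered";
  // Under 30 s: metadata only, no model spend.
  if (durationSeconds < 30) return "skipped";
  // 30 s up to (assumed) just under 60 s: mini models with the shortened prompt.
  if (durationSeconds < 60) return "cheap";
  // 60 s and longer: diarized transcription plus the full coaching schema.
  return "full";
}
```

Keeping the decision in a pure function means the thresholds can be unit-tested without touching S3 or any model API.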
The audit watches the watchers. A second cron at 8 AM ET runs call_insight_audit() against yesterday's rows. It flags stuck-pending rows, an empty-transcript rate above 5%, a regression rate above 5% in the three plain-English coaching fields (top_strength, top_mistake, one_thing_to_fix_first), meaning they collapse to enum values or short text, internal inconsistencies (e.g. an objection counted but none captured), and volume or cost anomalies vs. the trailing-7-day median. If anything trips, the system emails Keagan via Resend. On healthy days it stays silent.
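A minimal sketch of a few of these checks follows. It is not the body of call_insight_audit(): the row shape, the flag names, and the 50% volume tolerance are assumptions for illustration, and it treats any row still pending at audit time as stuck. Only the 5% empty-transcript threshold and the trailing-7-day median come from the text.

```typescript
// Hypothetical row shape; real call_insights columns may differ.
interface InsightRow {
  status: "pending" | "done";
  transcript: string;
  objectionCount: number;
  objections: unknown[];
}

function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty list");
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// trailingDailyCounts: row counts for the previous 7 days.
// volumeTolerance is an assumed threshold, not one stated in the text.
function auditDay(
  rows: InsightRow[],
  trailingDailyCounts: number[],
  volumeTolerance = 0.5,
): string[] {
  const flags: string[] = [];
  const total = rows.length;

  // Assumption: anything still pending when the audit runs counts as stuck.
  if (rows.some((r) => r.status === "pending")) flags.push("stuck_pending");

  // More than 5% of rows with an empty transcript.
  const empty = rows.filter((r) => r.transcript.trim() === "").length;
  if (total > 0 && empty / total > 0.05) flags.push("empty_transcripts");

  // Internal inconsistency: an objection was counted but none was captured.
  if (rows.some((r) => r.objectionCount > 0 && r.objections.length === 0)) {
    flags.push("objection_inconsistency");
  }

  // Volume anomaly relative to the trailing-7-day median.
  const typical = median(trailingDailyCounts);
  if (typical > 0 && Math.abs(total - typical) / typical > volumeTolerance) {
    flags.push("volume_anomaly");
  }
  return flags;
}
```

An empty result corresponds to a healthy day, so the caller would send the email only when the returned list is non-empty.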
Budget guardrail. A single-row table caps cumulative spend across all backfill + production runs. The chunk RPC auto-pauses the pipeline if a run would breach the cap.
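The guardrail logic can be sketched as follows. This is an illustration under stated assumptions, not the chunk RPC itself: BudgetRow, its field names, estimateChunkCostUsd, and reserveSpend are hypothetical, and the per-call costs are the approximate tier figures quoted earlier.

```typescript
// Hypothetical shape of the single budget row.
interface BudgetRow {
  capUsd: number;
  spentUsd: number;
  paused: boolean;
}

// Approximate per-call costs from the tier descriptions.
const CHEAP_COST_USD = 0.005;
const FULL_COST_USD = 0.03;

function estimateChunkCostUsd(cheapCalls: number, fullCalls: number): number {
  return cheapCalls * CHEAP_COST_USD + fullCalls * FULL_COST_USD;
}

// Returns the next state of the budget row. If the chunk would push cumulative
// spend past the cap, nothing is reserved and the pipeline is marked paused.
function reserveSpend(budget: BudgetRow, estimatedCostUsd: number): BudgetRow {
  if (budget.paused) return budget;
  if (budget.spentUsd + estimatedCostUsd > budget.capUsd) {
    return { ...budget, paused: true };
  }
  return { ...budget, spentUsd: budget.spentUsd + estimatedCostUsd };
}
```

In a real database the check and the update would likely need to happen in one atomic statement, so that a backfill and a production run cannot both pass the check at the same time.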
System prompt. Verbatim port of the n8n workflow that produced our first 60 days of analysis — kept identical so trend continuity is preserved. It frames the model as a sales-call analyst evaluating a 4-section call structure: intro · solution · qualifying · transfer.
User message. The diarized transcript (rep + customer turns labeled), plus the call's duration and the rep's name.
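As an illustration only, the user message could be assembled along these lines. The Turn type, the REP/CUSTOMER labels, and the header layout are assumptions; the text specifies what the message contains, not its exact format.

```typescript
// Hypothetical turn shape produced by the diarized transcription step.
interface Turn {
  speaker: "rep" | "customer";
  text: string;
}

function buildUserMessage(repName: string, durationSeconds: number, turns: Turn[]): string {
  // Header carries the two pieces of call metadata the prompt receives.
  const header = `Rep: ${repName}\nCall duration: ${durationSeconds} s`;
  // One line per turn, labeled by speaker.
  const body = turns
    .map((t) => `${t.speaker === "rep" ? "REP" : "CUSTOMER"}: ${t.text}`)
    .join("\n");
  return `${header}\n\n${body}`;
}
```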
Response format. Strict JSON schema (response_format: json_schema mode) with 25 required fields:
- section reach (which of the 4 sections the rep made it through, plus per-section status: skipped / partial / complete)
- transfer outcome (attempted? completed?)
- customer signals (bill amount mentioned, sunlight positive, address confirmed, homeowner status, credit hint)
- disqualification check (should the rep have DQ'd? why?)
- objection list — each with type, the rep's response, response quality (strong / partial / weak), whether it was resolved
- 7 numeric scores (intro · solution · qualifying · objection handling · control · transfer positioning · overall)
- 3 plain-English coaching fields — what the rep did best, their biggest mistake, the one thing to fix on their next call
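A partial sketch of what such a response_format value can look like is below, covering only the objection list and the three coaching fields. The schema name call_insight and the objection property names are hypothetical; the remaining required fields are omitted rather than guessed. In strict mode every property has to appear in required and additionalProperties has to be false, which is why both objects spell that out.

```typescript
// Hypothetical property names for one objection entry.
const objectionSchema = {
  type: "object",
  additionalProperties: false,
  required: ["type", "rep_response", "response_quality", "resolved"],
  properties: {
    type: { type: "string" },
    rep_response: { type: "string" },
    // Enum values taken from the field list above.
    response_quality: { type: "string", enum: ["strong", "partial", "weak"] },
    resolved: { type: "boolean" },
  },
} as const;

// Partial sketch: the real schema has 25 required fields.
const responseFormat = {
  type: "json_schema",
  json_schema: {
    name: "call_insight", // hypothetical schema name
    strict: true,
    schema: {
      type: "object",
      additionalProperties: false,
      required: ["objections", "top_strength", "top_mistake", "one_thing_to_fix_first"],
      properties: {
        objections: { type: "array", items: objectionSchema },
        top_strength: { type: "string" },
        top_mistake: { type: "string" },
        one_thing_to_fix_first: { type: "string" },
      },
    },
  },
} as const;
```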
Why this shape. The schema is designed to power top-down trend analysis: every field is enum-backed (where possible) so we can aggregate across all calls and surface drift. The 3 plain-English fields are the seed for personalized coaching — the audit watches them specifically for regression to enum values or short text.
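One way the regression check on the coaching fields could work is sketched below. The 20-character minimum and the "single token with no whitespace" test for enum-like values are assumptions chosen for illustration, as is counting a row as regressed when any of its three fields regresses.

```typescript
const COACHING_FIELDS = ["top_strength", "top_mistake", "one_thing_to_fix_first"] as const;
type CoachingRow = Record<(typeof COACHING_FIELDS)[number], string>;

// Assumed threshold for "short text"; not stated in the source.
const MIN_CHARS = 20;

// A field has regressed if it is too short to be a coaching sentence or
// looks like an enum value (a single token with no whitespace).
function isRegressed(text: string): boolean {
  const t = text.trim();
  const tooShort = t.length < MIN_CHARS;
  const enumLike = !/\s/.test(t);
  return tooShort || enumLike;
}

// Fraction of rows in which at least one coaching field has regressed.
function regressionRate(rows: CoachingRow[]): number {
  if (rows.length === 0) return 0;
  const regressed = rows.filter((row) => COACHING_FIELDS.some((f) => isRegressed(row[f]))).length;
  return regressed / rows.length;
}
```

The audit would then compare regressionRate against its 5% threshold for yesterday's rows.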