How Caleb would build AfterHours

How I'd build AfterHours

I'd reach for Next.js on the frontend with a Postgres database, Twilio for inbound call handling and SMS dispatch, Stripe for billing, and Resend for email delivery of the morning briefs. The call agent logic itself runs on Claude's API with structured prompts that enforce the triage decision tree and booking flow. I'd estimate 220-280 hours for a production-ready MVP: four weeks working roughly 55-70 hours a week. That gets you a multi-tenant dashboard, live call intake, calendar sync, brief generation, and billing automation, but not white-label customization or advanced analytics.

Day-by-day plan

Day 1: Postgres multi-tenant schema, row-level security policies, and NextAuth for operator authentication. Provision Stripe account and API keys.
Day 2: Wire Stripe billing, customer creation, and subscription management across three tiers. Add payment success webhooks.
Day 3-4: Build customer onboarding flow: business type selection, service area definition, timezone setup, calendar integration (Google Calendar and Outlook), and voice persona customization (name, greeting tone).
Day 5-6: Implement Twilio inbound call routing, call recording consent capture, and the Claude-powered call agent, with each tenant's triage rules inserted into a structured prompt.
Day 7-9: Build the brief generation and delivery pipeline (SMS and email), test calendar availability checks, and add manual booking overrides for the dashboard.
Day 10: Implement monitoring, error logging to Sentry, call analytics dashboards, and churn-risk signals (low call volume, missed briefs).
Day 11-13: Security hardening, load testing, edge case refinement, and production deployment behind a Caddy reverse proxy.
Day 14: 30-day runbook, credential rotation, and handoff documentation.
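
The churn-risk signals from Day 10 could start life as plain threshold checks before anything smarter is warranted. A minimal sketch in Python, assuming a hypothetical weekly rollup per tenant; the field names and thresholds are placeholders for illustration, not a real AfterHours schema:

```python
from dataclasses import dataclass


@dataclass
class TenantWeek:
    """Hypothetical weekly rollup for one tenant."""
    tenant_id: str
    calls_handled: int
    briefs_scheduled: int
    briefs_delivered: int


# Assumed thresholds; these would be tuned against real usage data.
MIN_WEEKLY_CALLS = 5
MAX_MISSED_BRIEFS = 1


def churn_signals(week: TenantWeek) -> list[str]:
    """Return the names of the churn-risk signals that fire for this tenant."""
    signals = []
    if week.calls_handled < MIN_WEEKLY_CALLS:
        signals.append("low_call_volume")
    missed = week.briefs_scheduled - week.briefs_delivered
    if missed > MAX_MISSED_BRIEFS:
        signals.append("missed_briefs")
    return signals
```

Returning signal names rather than a single score keeps the dashboard explainable: an operator can see why a tenant was flagged.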

What's hard about this build

The triage engine is the crux. A plumbing agent must distinguish "water coming through the ceiling" (P1, same-day callback, legal liability risk) from "Can you come next Tuesday for a quote" (P3, check calendar availability). Building that decision tree requires tight collaboration with a few HVAC and plumbing owners to enumerate the signals that matter. It's not a generic AI problem; it's a domain problem with legal and financial stakes. HIPAA for the clinic vertical is a second blocker: a vendor that handles patient calls is a business associate under HIPAA, and most AI voice vendors don't sign business associate agreements (BAAs) by default, which means call recording, storage, and transcription require either negotiated BAA terms or expensive encrypted workflows. For the legal vertical, conflict-of-interest screening is non-negotiable but adds latency to intake if done via API lookup. Real-time calendar sync can miss a booking if a technician schedules offline, and there's no clean recovery pattern without polling. I'd plan for a week of dead ends on vendor integrations alone.
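
One way to keep the stakes manageable is a deterministic floor under the model: a short list of signals that always wins, whatever the agent decides. A rough sketch in Python, assuming placeholder patterns; the real list would come out of those owner interviews, and the function name is hypothetical:

```python
import re
from typing import Optional

# Placeholder signals for illustration only.
P1_PATTERNS = [
    r"\bwater\b.*\bceiling\b",
    r"\bburst pipe\b",
    r"\bgas (smell|leak)\b",
    r"\bno heat\b",
]
P3_PATTERNS = [
    r"\bquote\b",
    r"\bestimate\b",
    r"\bnext (week|monday|tuesday|wednesday|thursday|friday)\b",
]


def triage_floor(transcript: str) -> Optional[str]:
    """Return "P1" or "P3" when a hard-coded signal matches, else None.

    P1 patterns are checked first so an emergency always outranks a
    scheduling request mentioned in the same call. None means no rule
    fired and the decision falls through to the model or a human.
    """
    text = transcript.lower()
    if any(re.search(p, text) for p in P1_PATTERNS):
        return "P1"
    if any(re.search(p, text) for p in P3_PATTERNS):
        return "P3"
    return None
```

Keyword rules like these are brittle on their own (word order and phrasing vary), so they are a safety net for the obvious cases, not a replacement for the prompt-driven decision tree.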

What's fast because of AI

Claude's API compresses what used to be multi-day work into hours. I'd use it to scaffold the prompt template for the triage decision tree, then iterate on real call transcripts to refine accuracy. Writing the test suite for the brief generation logic is maybe 80 percent faster; Claude enumerates edge cases (no callback number provided, caller is a referral, emergency but service area is closed) and auto-writes assertions. Copywriting the onboarding flow and dashboard UI takes a day instead of a week because Claude drafts marketing copy, error messages, and form labels that actually sound like an operator wrote them. Debugging the call agent's booking flow is faster too; I feed Claude a transcript, prompt it to explain why the agent booked or refused, and it spots the logic gap immediately. The real multiplier is exploration: What happens if a customer wants to override an urgent classification? What if they want to route calls to a phone number instead of SMS? Claude helps me enumerate those scenarios in an afternoon instead of a sprint of back-and-forth with stakeholders.
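
To make those edge cases concrete, here is a small sketch in Python of a brief formatter that surfaces each of them. The `CallSummary` record, its fields, and the output format are all assumptions for illustration, not the actual brief schema:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallSummary:
    """Hypothetical record produced by the call agent for one call."""
    caller_name: str
    issue: str
    priority: str                         # "P1" or "P3" in this sketch
    callback_number: Optional[str] = None
    is_referral: bool = False
    service_available: bool = True        # False when the business can't serve the call


def brief_line(call: CallSummary) -> str:
    """Render one line of the morning brief, flagging the awkward cases."""
    parts = [f"[{call.priority}] {call.caller_name}: {call.issue}"]
    if call.callback_number:
        parts.append(f"callback {call.callback_number}")
    else:
        parts.append("NO CALLBACK NUMBER")
    if call.is_referral:
        parts.append("referral")
    if call.priority == "P1" and not call.service_available:
        parts.append("EMERGENCY, SERVICE UNAVAILABLE")
    return " | ".join(parts)
```

Each edge case becomes one assertion against `brief_line`, which is the kind of test Claude can draft in bulk once the record shape is fixed.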

How I'd hand it off

I'd record a Loom walkthrough of the entire operator experience: logging in, creating a new business, triggering a test call, reviewing briefs. Then a written runbook that covers Twilio configuration, Stripe test mode, how to add new service categories, and how to adjust triage rules without touching code. For the first 30 days you'd carry the pager and I'd be on call as escalation to debug call routing failures or brief delivery misses. All Postgres credentials, API keys, and Twilio account access transfer to your infra; I'd set up a shared password vault (1Password) and train one operator to rotate them after day 14. Monitoring dashboards go into Datadog so you see call success rates, brief latency, and failed bookings in real time.
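
The three dashboard numbers are simple to define up front, which makes the handoff less ambiguous. A minimal sketch in Python of how they might be computed from call records before being shipped to a dashboard; the record keys are hypothetical and nothing here calls a monitoring API:

```python
from statistics import median
from typing import Optional


def summarize(calls: list[dict]) -> dict:
    """Compute call success rate, median brief latency, and failed bookings.

    Each record is assumed to carry three keys:
      "completed" (bool), "brief_latency_s" (float or None), "booking_failed" (bool).
    """
    total = len(calls)
    completed = sum(1 for c in calls if c["completed"])
    latencies = [c["brief_latency_s"] for c in calls
                 if c["brief_latency_s"] is not None]
    median_latency: Optional[float] = median(latencies) if latencies else None
    return {
        "call_success_rate": completed / total if total else 0.0,
        "median_brief_latency_s": median_latency,
        "failed_bookings": sum(1 for c in calls if c["booking_failed"]),
    }
```

Agreeing on these definitions in the runbook means a dip on the dashboard maps to a specific field in a specific record.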
