How Caleb would build Documentation AI

How I'd build Documentation AI

I'd reach for Next.js on the frontend with a Python FastAPI backend, Postgres for the database, Clerk for auth, and the Claude API for doc generation. Stripe handles billing across three tiers. I'd estimate 350-400 hours total, which works out to roughly nine to ten weeks at 40 hours a week, assuming I'm the sole technical operator.

Day-by-day plan

Day 1: Provision Postgres schema for multi-tenant support. Set up Clerk for OAuth so users can sign in via GitHub. Wire the session layer so each tenant's data is isolated.
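To make the tenant isolation concrete, here is a minimal sketch of a session wrapper that scopes every query to one tenant. It uses an in-memory SQLite database as a stand-in for Postgres so it runs anywhere. The table names, columns, and the TenantSession class are illustrative assumptions, not a final schema. In Postgres, row-level security would be another way to enforce the same rule.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # In-memory SQLite stands in for Postgres; the schema is illustrative only.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tenants (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE repos (
            id INTEGER PRIMARY KEY,
            tenant_id INTEGER NOT NULL REFERENCES tenants(id),
            full_name TEXT NOT NULL,
            UNIQUE (tenant_id, full_name)
        );
        """
    )
    return conn


class TenantSession:
    """Hypothetical wrapper: every read and write carries the tenant_id."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: int) -> None:
        self.conn = conn
        self.tenant_id = tenant_id

    def add_repo(self, full_name: str) -> None:
        self.conn.execute(
            "INSERT INTO repos (tenant_id, full_name) VALUES (?, ?)",
            (self.tenant_id, full_name),
        )

    def list_repos(self) -> list[str]:
        rows = self.conn.execute(
            "SELECT full_name FROM repos WHERE tenant_id = ? ORDER BY full_name",
            (self.tenant_id,),
        )
        return [row[0] for row in rows]
```

The point of the wrapper is that application code never writes a query without a tenant filter, so a forgotten WHERE clause cannot leak another team's repos.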

Day 2: Integrate Stripe. Build the three pricing tiers in the dashboard. Wire webhooks so subscription changes update the database immediately.
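One detail worth sketching for the webhook handler: Stripe can deliver the same event more than once, so the update should be idempotent. The sketch below works on plain dictionaries shaped like Stripe subscription events and keeps state in memory. The price IDs, tier names, and the BillingStore class are hypothetical, and a real handler would also verify the webhook signature before trusting the payload.

```python
from dataclasses import dataclass, field

# Hypothetical price IDs and tier names; real IDs come from the Stripe dashboard.
PRICE_TO_TIER = {
    "price_starter": "starter",
    "price_team": "team",
    "price_scale": "scale",
}
NO_PLAN = "none"
ACTIVE_STATUSES = {"active", "trialing"}
HANDLED_TYPES = {
    "customer.subscription.created",
    "customer.subscription.updated",
    "customer.subscription.deleted",
}


@dataclass
class BillingStore:
    """In-memory stand-in for the Postgres billing tables."""

    tier_by_customer: dict = field(default_factory=dict)
    seen_event_ids: set = field(default_factory=set)


def apply_event(store: BillingStore, event: dict) -> bool:
    """Apply one webhook event. Returns True if billing state was written."""
    if event["type"] not in HANDLED_TYPES:
        return False  # not a subscription event we care about
    if event["id"] in store.seen_event_ids:
        return False  # duplicate delivery, already applied
    store.seen_event_ids.add(event["id"])

    subscription = event["data"]["object"]
    customer = subscription["customer"]
    cancelled = event["type"] == "customer.subscription.deleted"
    if cancelled or subscription["status"] not in ACTIVE_STATUSES:
        store.tier_by_customer[customer] = NO_PLAN
        return True

    price_id = subscription["items"]["data"][0]["price"]["id"]
    store.tier_by_customer[customer] = PRICE_TO_TIER.get(price_id, NO_PLAN)
    return True
```

In the real backend the seen-event check and the tier update would live in one database transaction, so a crash between the two cannot leave them out of sync.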

Days 3-4: Build GitHub OAuth flow that lets users connect their repository. Write a parser that handles Python, JavaScript, Go, and Ruby files. Cache the file tree so we're not hammering the GitHub API on every request.
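A rough sketch of the file-tree cache and language detection described above. The fetch function is injected, so nothing here calls the GitHub API directly. The FileTreeCache class, the five-minute TTL, and the extension map are assumptions for illustration.

```python
import time
from pathlib import PurePosixPath
from typing import Callable, Optional

# The four languages named in the plan, keyed by file extension.
LANGUAGE_BY_EXTENSION = {
    ".py": "python",
    ".js": "javascript",
    ".go": "go",
    ".rb": "ruby",
}


def detect_language(path: str) -> Optional[str]:
    """Return the language for a repo path, or None if we don't parse it."""
    return LANGUAGE_BY_EXTENSION.get(PurePosixPath(path).suffix.lower())


class FileTreeCache:
    """Hypothetical TTL cache keyed by (repo, ref)."""

    def __init__(
        self,
        fetch_tree: Callable[[str, str], list],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_tree = fetch_tree
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict = {}

    def get(self, repo: str, ref: str) -> list:
        key = (repo, ref)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh, skip the network call
        tree = self._fetch_tree(repo, ref)
        self._entries[key] = (now, tree)
        return tree

    def documentable_files(self, repo: str, ref: str) -> list:
        """Paths in the tree that one of the parsers can handle."""
        return [p for p in self.get(repo, ref) if detect_language(p) is not None]
```

Keying on a commit SHA rather than a branch name would let entries live much longer, since the tree at a given commit never changes.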

Days 5-6: Integrate Claude API. Build the prompt that extracts docstrings, function signatures, and module-level comments, then generates markdown docs. Handle token counting so we don't surprise anyone with sudden costs.
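For the token-counting piece, a minimal sketch of how snippets could be grouped into batches under a token budget before each generation call. The four-characters-per-token figure is a rough heuristic and not Claude's actual tokenizer, so real counts should come from the API. The function names and the budget value are hypothetical.

```python
import math

# Rough heuristic only; this is an assumption, not Claude's tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def batch_by_budget(snippets: list, max_tokens_per_batch: int) -> list:
    """Greedily pack snippets, in order, into batches under the budget.

    A single snippet larger than the budget still gets its own batch;
    the caller decides whether to split or skip it.
    """
    batches = []
    current = []
    used = 0
    for snippet in snippets:
        cost = estimate_tokens(snippet)
        if current and used + cost > max_tokens_per_batch:
            batches.append(current)
            current, used = [], 0
        current.append(snippet)
        used += cost
    if current:
        batches.append(current)
    return batches
```

Summing estimate_tokens over all batches before a run gives a number we can show the user up front, which is what keeps costs from being a surprise.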

Day 7: Wire the onboarding flow. User connects repo, selects which files to document, hits generate, and gets a preview. Add a "publish to GitHub" button that opens a PR with the generated docs.

Day 8: Containerize the backend, set up monitoring with Sentry, deploy to Railway. Add LogRocket to the frontend for session replay on errors.

Week 2: Load test with medium-sized repos. Iterate on generation quality based on early user feedback. Ship the Product Hunt launch page.

What's hard about this build

LLM costs scale with codebase size. A monorepo with 50k functions could cost 40 dollars to fully document if we're not careful about batching and caching. That's roughly 80 percent of our 49 dollar monthly price gone on a single run, and a second full run in the same month would put us underwater. We need usage metering and per-team quotas, which adds complexity. GitHub's API rate limits will catch us if we naively fetch every file on every request. Data privacy is a real concern: developers don't want their code sitting on our servers. We'll need to either stream directly from GitHub on demand or delete cached code within 24 hours. Competitors like Mintlify already own this search term, so SEO is an uphill fight. And there's a real risk that IDE plugins from GitHub Copilot or Cursor will make this feel redundant within a year.
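The cost arithmetic and the per-team quota can be sketched in a few lines. The 40 dollar, 50k function, and 49 dollar figures come from the estimate above. The TeamQuota class and the 20 dollar budget in the usage note are hypothetical.

```python
from dataclasses import dataclass

FULL_RUN_COST_USD = 40.0   # estimated cost to document the whole monorepo
FUNCTIONS_IN_RUN = 50_000  # functions in that monorepo
MONTHLY_PRICE_USD = 49.0   # what the team pays us per month

# 40 / 50,000 is about 0.0008 dollars per function.
COST_PER_FUNCTION_USD = FULL_RUN_COST_USD / FUNCTIONS_IN_RUN
# 40 / 49 is about 0.82, so one full run uses most of the month's revenue.
SHARE_OF_PRICE = FULL_RUN_COST_USD / MONTHLY_PRICE_USD


@dataclass
class TeamQuota:
    """Hypothetical per-team spend limit, reset monthly elsewhere."""

    monthly_budget_usd: float
    spent_usd: float = 0.0

    def try_charge(self, functions: int) -> bool:
        """Reserve budget for a run; refuse if it would exceed the limit."""
        cost = functions * COST_PER_FUNCTION_USD
        if self.spent_usd + cost > self.monthly_budget_usd:
            return False
        self.spent_usd += cost
        return True
```

With a 20 dollar budget, for example, a 20k function run costs about 16 dollars and goes through, while a follow-up 10k function run (about 8 dollars more) is refused until the quota resets.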

What's fast because of AI

Claude scaffolds the entire API in minutes. I can give it the database schema and ask for 90 percent of the CRUD endpoints. Tests are dramatically faster because Claude enumerates edge cases I'd normally miss: what if the file is too large, what if the encoding is weird, what if the repo has no docstrings at all. I use Claude for all the product UI copy and the doc-generation prompts themselves. Debugging takes a fraction of the time because Claude can reason through complex stack traces and suggest fixes. Building custom file format parsers is straightforward because Claude can write the scaffolding and I only tweak it for our specific edge cases.

How I'd hand it off

I'd record a Loom walkthrough of the architecture: here's the auth layer, the billing flow, the generation pipeline, and how to add new languages. I'd write a runbook covering deployment, scaling decisions, and common failure modes. You'd own the pager rotation for the first 30 days. I'd transfer all API credentials, database access, and GitHub OAuth tokens via Dashlane or 1Password. Finally, I'd create a Linear board with known issues, feature requests, and growth opportunities so the next operator has a clear starting point.
