How Caleb would build Shipcheck

How I'd build Shipcheck

I'd ship this as a Next.js frontend on Vercel connected to a FastAPI backend on Railway, with a Postgres database in the same Railway project for scans, results, and user billing state. Stripe handles recurring subscriptions at three tiers, Resend sends receipt emails, and the scanning logic calls the Claude API, the Semgrep API, and OSV.dev. Rough estimate is 100 hours at $75/hr ($7,500 total), so a solid two-week project.

Day-by-day plan

Day 1-2: Postgres schema for users, repos, scan runs, and findings. GitHub OAuth flow via Clerk with a login page. File upload handler for zips, S3 bucket for artifact storage, and multi-tenant isolation.
Day 3: Stack detection logic (regex patterns plus package.json parsing, Gemfile checks for Rails, requirements.txt for Python). Semgrep API integration to pull AI-codegen-tuned rule sets.
Day 4: OSV.dev dependency audit integration to check for known vulnerabilities. Store CVE results alongside Semgrep findings in the scan record, deduped by CVE ID.
Day 5: Claude API loop for logic review. Pass repo context plus Semgrep gaps to Claude with a system prompt tuned for auth and input validation issues. Get structured findings back and store the results.
Day 6: Report generation and tiering (ship-blockers vs warnings vs noise) based on severity and pattern prevalence. Write explanations for developers who didn't write the code.
Day 7: Stripe billing schema. Hook up the three pricing tiers, metered usage for rescans, free tier quota logic.
Day 8-9: Next.js pages for upload, scan results, billing dashboard. Rescan button that targets flagged files only, not the full repo. Export report as JSON for CI integration.
Day 10: Quality pass. Add Sentry for error tracking, Cloudflare rate limiting to prevent free-tier abuse, and basic analytics (scans per day, average repo size).
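To make the Day 3 stack detection concrete, here is a minimal sketch of the marker-file approach, assuming detection only looks at the repo root. The function name detect_stacks and the label strings are hypothetical, and the real heuristics would need far more cases than this.

```python
import json
import re
from pathlib import Path


def detect_stacks(repo_root: str) -> set[str]:
    """Return coarse stack labels based on marker files in the repo root.

    Labels such as "nextjs" or "trpc" are illustrative, not a fixed taxonomy.
    """
    root = Path(repo_root)
    stacks: set[str] = set()

    package_json = root / "package.json"
    if package_json.is_file():
        stacks.add("node")
        try:
            manifest = json.loads(package_json.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            manifest = {}  # unreadable manifest: keep the "node" label only
        deps: dict = {}
        if isinstance(manifest, dict):
            for key in ("dependencies", "devDependencies"):
                section = manifest.get(key)
                if isinstance(section, dict):
                    deps.update(section)
        if "next" in deps:
            stacks.add("nextjs")
        # tRPC packages are published under the @trpc/ scope.
        if any(name.startswith("@trpc/") for name in deps):
            stacks.add("trpc")

    gemfile = root / "Gemfile"
    if gemfile.is_file():
        stacks.add("ruby")
        text = gemfile.read_text(encoding="utf-8", errors="ignore")
        # Match lines like: gem "rails", "~> 7.0"
        if re.search(r"""^\s*gem\s+["']rails["']""", text, re.MULTILINE):
            stacks.add("rails")

    if (root / "requirements.txt").is_file():
        stacks.add("python")

    return stacks
```

The returned labels would then pick which Semgrep rule sets to run, which is where the tRPC versus REST distinction starts to matter.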

What's hard about this build

The Claude cost model is the tricky part. Each scan hits Claude twice: once to analyze the repo structure and plan the attack surface, then again to review logic in the flagged areas. Large monorepos blow through tokens fast, which compresses margins if power users adopt first. I'd need to implement token budgeting per scan and either reject oversized repos or move them to a higher price tier upfront. The stack detection heuristics also need careful calibration. A Next.js project looks different if it uses tRPC versus REST, and missing that difference will cause the Semgrep rules to produce false positives. Finally, the Semgrep API has a rate limit and doesn't cache results across users, so back-to-back scans from different developers on the same public repo cost twice. I'd cache the Semgrep result keyed to (repo_url, commit_hash) to avoid redundant work.
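A rough sketch of the two mitigations above: a per-scan token budget check and a cache key built from (repo_url, commit_hash). Everything here is an assumption for illustration. The 4-characters-per-token ratio is only a crude estimate (real counts need the provider's tokenizer), the limits are placeholders, and lowercasing the URL assumes a host like GitHub where repo paths are case-insensitive.

```python
import hashlib
from dataclasses import dataclass

# Crude heuristic, an assumption: real token counts depend on the tokenizer.
CHARS_PER_TOKEN = 4


@dataclass(frozen=True)
class BudgetDecision:
    estimated_tokens: int
    action: str  # "accept", "upsell", or "reject"


def check_budget(total_chars: int, tier_limit: int, hard_limit: int) -> BudgetDecision:
    """Decide upfront whether a repo fits the scan's token budget.

    tier_limit: tokens included in the user's current tier (hypothetical).
    hard_limit: tokens above which the scan is refused outright (hypothetical).
    """
    estimated = -(-total_chars // CHARS_PER_TOKEN)  # ceiling division
    if estimated <= tier_limit:
        action = "accept"
    elif estimated <= hard_limit:
        action = "upsell"  # offer the higher price tier before scanning
    else:
        action = "reject"
    return BudgetDecision(estimated_tokens=estimated, action=action)


def semgrep_cache_key(repo_url: str, commit_hash: str) -> str:
    """Build a stable cache key for a Semgrep result on one commit of one repo."""
    normalized = repo_url.strip().lower().rstrip("/").removesuffix(".git")
    raw = f"{normalized}@{commit_hash.strip().lower()}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Keying on the commit hash means a cached result never goes stale: a new push produces a new key, so invalidation comes for free.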

What's fast because of AI

Claude generates the Semgrep rule set customization and the OSV.dev query logic in two Cursor-assisted sprints instead of a week of hand-tuning. The report formatter that converts raw findings into "here's why this matters and how to fix it in plain English" is basically a Claude system prompt plus a for-loop; a human writing in that tone would take days. Debugging deployment issues with Railway is faster because Claude can read stack traces and suggest fixes. Test coverage for the main flow (upload repo, run Semgrep, call Claude, return report) comes from prompt-driven test generation. The UI copy for the landing page and pricing tiers lands in one draft because Claude understands the indie hacker audience better than most copywriters. I'd also use Claude to enumerate edge cases during the QA pass that I'd otherwise miss.
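The "system prompt plus a for-loop" formatter could look roughly like the sketch below. The model call is injected as a plain callable so the loop stays testable without network access; ask_model, the prompt text, and the finding fields (rule_id, path, line, severity, snippet) are all hypothetical names, not the real schema or the Claude SDK.

```python
from typing import Callable

# Placeholder prompt; the real one would be tuned during the build.
SYSTEM_PROMPT = (
    "You explain security findings to developers who did not write the code. "
    "Say why the issue matters and how to fix it, in plain English."
)


def explain_findings(
    findings: list[dict],
    ask_model: Callable[[str, str], str],
) -> list[dict]:
    """Attach a plain-English explanation to each raw finding.

    ask_model(system_prompt, user_message) is a stand-in for whatever
    wrapper the backend uses to call the LLM.
    """
    report = []
    for finding in findings:
        user_message = (
            f"Rule: {finding['rule_id']}\n"
            f"File: {finding['path']}:{finding['line']}\n"
            f"Severity: {finding['severity']}\n"
            f"Snippet:\n{finding['snippet']}"
        )
        explanation = ask_model(SYSTEM_PROMPT, user_message)
        # Copy the finding so the caller's raw data is left untouched.
        report.append({**finding, "explanation": explanation})
    return report
```

In production the callable would wrap the Claude API client; in tests it can be a lambda that returns a canned string.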

How I'd hand it off

I'd record a Loom walkthrough covering GitHub OAuth setup, the Stripe billing keys, Claude API key, and how to read the Postgres schema. I'd commit a runbook to the repo detailing how to rotate API keys, how to add new Semgrep rules or update the Claude prompt, and what to do if Stripe webhooks fail. If you're hiring me to run Day 1-30 support, I take a pager rotation starting Week 4: any critical bugs (scans failing, auth broken, charges not processing) get handled the same day.
