Vertical-agent design spec

Shipcheck - Pre-Launch Scanner for AI-Built Apps - Vertical Agent Spec

One-line definition

An agent for solo developers shipping vibe-coded apps that scans AI-generated codebases for security vulnerabilities, broken auth logic, and deployment blockers.

The workflow it owns end-to-end

  • Developer connects a GitHub repo (or uploads a zip) and specifies their target stack and hosting platform before clicking "Launch."
  • Agent ingests the repo, detects the stack (Next.js, FastAPI, Rails, etc.), identifies AI-generated file signatures (Cursor artifacts, v0 output patterns, Bolt scaffolding), and maps the attack surface.
  • Agent runs three passes in sequence: static analysis via Semgrep rule sets tuned for AI codegen anti-patterns, dependency CVE audit via OSV.dev, and an LLM-based logic review targeting auth flows, input handling, and environment variable exposure.
  • Agent produces a tiered report: ship-blockers (must fix), warnings (fix this week), and noise (low priority). Each finding includes a one-line explanation written for a developer who did not write the code themselves.
  • Developer marks issues resolved and triggers a targeted rescan of the flagged files only, which closes the loop without rescanning the full repo.
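
The report tiers and the targeted rescan described above can be sketched roughly as follows. This is a minimal illustration, not Shipcheck's actual data model: the names Tier, Finding, build_report, and rescan_targets are hypothetical, and the sketch assumes a rescan covers every file that carried a finding in the previous report.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    SHIP_BLOCKER = "ship-blocker"  # must fix
    WARNING = "warning"            # fix this week
    NOISE = "noise"                # low priority


@dataclass(frozen=True)
class Finding:
    path: str      # file the finding was raised in
    source: str    # hypothetical pass label: "semgrep", "osv", or "llm"
    summary: str   # one-line explanation for the developer
    tier: Tier


def build_report(findings):
    """Group findings by tier, in the order the report presents them."""
    report = {tier: [] for tier in Tier}
    for finding in findings:
        report[finding.tier].append(finding)
    return report


def rescan_targets(findings):
    """Return the unique flagged file paths, sorted, for a targeted rescan."""
    return sorted({finding.path for finding in findings})
```

Keeping the file path on every finding is what makes the targeted rescan cheap: the second run only needs the set of paths, not the whole repo.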

What it knows that a generic LLM doesn't

  • AI codegen tools produce recognizable anti-patterns at high frequency: Supabase RLS left disabled after scaffolding, .env.example committed with real keys still in it, JWT verification present in one route but missing in the three routes added the next day.
  • Stack-specific deployment blockers by platform: Vercel serverless functions with no rate limiting, Railway apps binding to localhost instead of 0.0.0.0, Render deploys missing build cache invalidation that silently ships stale code.
  • What "good enough to ship for an indie project" actually looks like versus what a security scanner tuned for enterprise codebases will flag as critical. Most static analyzers were not calibrated for a 2,000-line MVP with zero users yet.
  • Common vibe-coder patterns that are not bugs but will get flagged by generic linters: TypeScript any types used intentionally throughout, console logs used as the only observability layer, commented-out code blocks left by Copilot that never got cleaned up.
  • Dependency trees for AI-recommended packages tend to include abandoned libraries that were popular in training data but have had no commits in three years.
  • The difference between a hardcoded secret that is a real credential and a placeholder the LLM generated that looks real but is not.
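
One way to approximate the last point, telling a real credential from an LLM-generated placeholder, is a marker list combined with a character-entropy check. The sketch below is only a heuristic under stated assumptions: PLACEHOLDER_MARKERS, the length cutoff, and the entropy threshold are all hypothetical values chosen for illustration, and a production detector would need provider-specific key formats on top of this.

```python
import math
from collections import Counter

# Hypothetical marker list; a maintained detector would need a longer one.
PLACEHOLDER_MARKERS = ("your", "example", "changeme", "placeholder", "xxxx", "todo", "dummy")


def shannon_entropy(value: str) -> float:
    """Bits of entropy per character, computed from character frequencies."""
    if not value:
        return 0.0
    counts = Counter(value)
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def classify_secret(value: str, min_length: int = 20, min_entropy: float = 3.5) -> str:
    """Return "placeholder", "unclear", or "likely-real" for a hardcoded string.

    The thresholds are assumptions for this sketch, not calibrated values.
    """
    lowered = value.lower()
    if any(marker in lowered for marker in PLACEHOLDER_MARKERS):
        return "placeholder"
    if len(value) < min_length:
        return "unclear"  # too short to judge from entropy alone
    if shannon_entropy(value) < min_entropy:
        return "placeholder"  # long but repetitive, e.g. a run of one character
    return "likely-real"
```

Anything classified "likely-real" would still warrant the human notification path rather than an automatic verdict, since entropy alone cannot prove a key is live.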

What it explicitly declines

  • Legal compliance review. If an app handles health data or processes payments, Shipcheck flags that a human review is needed but does not assess HIPAA, PCI-DSS, or GDPR conformance.
  • Code fixes. The agent reports findings and explains them; it does not open pull requests or patch files. The scan-and-fix loop is a separate product problem and a significant liability expansion.
  • Architecture recommendations. If the fundamental data model is wrong or the chosen stack will not scale, that is outside scope. The agent scans what exists, not what should have been built.
  • Penetration testing or runtime vulnerability confirmation. All findings are static. The agent cannot confirm whether a vulnerability is actually exploitable in the deployed environment.

Tools and integrations required

  • GitHub API (OAuth app) for repo access and targeted rescan on specific file paths.
  • Semgrep OSS with a custom rule registry maintained for AI codegen patterns; rules need to be updated as codegen tools evolve, which is an ongoing maintenance cost that is easy to underestimate.
  • OSV.dev or the npm/PyPI advisory APIs for dependency CVE lookups.
  • Stripe for per-scan billing and subscription management.
  • SendGrid or Postmark for report delivery and rescan notification emails.
  • An LLM API (Claude or equivalent) for the logic review pass; this is the variable cost that scales with repo size and is the primary margin risk at flat-rate pricing.
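
As an illustration of the dependency lookup, the sketch below queries the public OSV.dev query endpoint for a single package version using only the Python standard library. The helper names (build_osv_query, vulnerability_ids, query_osv) are hypothetical, and the sketch assumes one request per dependency with no batching, caching, or retry handling, all of which a real scanner would likely need.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_query(name, ecosystem, version):
    """Build the JSON body for a single-package OSV query."""
    return {"package": {"name": name, "ecosystem": ecosystem}, "version": version}


def vulnerability_ids(response_body):
    """Extract advisory IDs from a decoded OSV response.

    A response without a "vulns" key is treated as "no known advisories".
    """
    return [vuln["id"] for vuln in response_body.get("vulns", [])]


def query_osv(name, ecosystem, version, timeout=10.0):
    """POST one query to OSV.dev and return the matching advisory IDs."""
    payload = json.dumps(build_osv_query(name, ecosystem, version)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return vulnerability_ids(json.load(response))
```

A call such as query_osv("jinja2", "PyPI", "2.4.1") would return whatever advisory IDs OSV currently lists for that version; the result changes as the database is updated, so no fixed output is shown here.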

Trust escalation: when it pings a human

  • When the credential pattern detector identifies what appears to be a live API key, database password, or OAuth secret (not a placeholder), the scan pauses and the developer is notified before the report is generated. The credential is not logged.
  • When the repo contains indicators of regulated data handling (HIPAA field names, PAN patterns, SSN formats in schema files), the agent flags it and recommends a human security review rather than treating it as a standard finding.
  • When scan confidence is low because the repo is structured in a way the agent has not been calibrated for (monorepos with unusual layouts, non-standard framework configurations), the report says so explicitly rather than returning a clean result.
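
The three escalation triggers above could be combined into a single decision function along these lines. This is a rough sketch under assumptions: REGULATED_FIELD_HINTS, the 0.6 confidence cutoff, and the reason labels are hypothetical, the substring match on field names is deliberately crude and will produce false positives, and PAN candidates are filtered with the standard Luhn checksum.

```python
import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PAN_CANDIDATE = re.compile(r"\b\d{13,19}\b")
# Hypothetical field-name hints; substring matching here is intentionally crude.
REGULATED_FIELD_HINTS = ("ssn", "diagnosis", "patient_id", "card_number")


def luhn_valid(digits: str) -> bool:
    """Standard Luhn checksum: double every second digit from the right."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def escalation_reasons(schema_text, live_credential_found, scan_confidence, min_confidence=0.6):
    """Return the reasons, if any, for involving a human before reporting."""
    reasons = []
    if live_credential_found:
        reasons.append("live-credential")
    lowered = schema_text.lower()
    regulated = (
        any(hint in lowered for hint in REGULATED_FIELD_HINTS)
        or SSN_PATTERN.search(schema_text) is not None
        or any(luhn_valid(match) for match in PAN_CANDIDATE.findall(schema_text))
    )
    if regulated:
        reasons.append("regulated-data")
    if scan_confidence < min_confidence:
        reasons.append("low-confidence")
    return reasons
```

An empty list means the scan proceeds to a normal report; any non-empty list would route to the corresponding notification or disclaimer instead of a clean result.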

Pricing model

Pricing at $19 per full scan makes sense for this audience and use case. A free tier of one scan per month is the go-to-market hook, but the honest question is whether a vibe coder who gets a free scan with findings they do not understand will pay $19 to rescan after fixing them. Similar developer tools reportedly convert free users to paid at roughly 3 to 7 percent. A subscription at $39 per month for unlimited rescans targets the developer who is iterating daily in the week before launch, which is a real behavior, but the total addressable paying population is likely small. LLM API costs for scanning a 10,000-line repo run $0.50 to $2.00 depending on model and context window usage. That is a small share of a single $19 scan, but free-tier scans and unlimited subscription rescans incur the same cost without matching revenue, so margin at $19 per scan is likely acceptable only if the median repo stays under 5,000 lines.
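
The margin question can be explored with a small calculation. The sketch below rests on assumptions that are not part of the pricing model itself: LLM cost scales linearly with line count, each free user runs exactly one free scan, and each converting user buys exactly one paid scan. The function names are hypothetical.

```python
def llm_cost_per_scan(lines, cost_per_10k_lines):
    """LLM cost for one scan, assuming cost scales linearly with repo size."""
    return lines / 10_000 * cost_per_10k_lines


def margin_per_paid_scan(price, lines, cost_per_10k_lines, conversion_rate):
    """Gross margin on one paid scan after absorbing the free scans behind it.

    Assumes one free scan per free user and one paid scan per converter, so
    each paid scan carries the LLM cost of 1 / conversion_rate free scans.
    """
    scan_cost = llm_cost_per_scan(lines, cost_per_10k_lines)
    free_scans_per_paid = 1 / conversion_rate
    return price - scan_cost * (1 + free_scans_per_paid)
```

Under these assumptions, margin_per_paid_scan(19, 5_000, 0.50, 0.05) comes to about $13.75, while the same call with 10,000 lines, $2.00 per 10,000 lines, and 3 percent conversion goes negative. The result is highly sensitive to conversion rate and model cost, which is why the repo-size ceiling matters.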

Differentiation from a generic LLM wrapper

The honest answer is that an experienced developer can paste their codebase into Claude, ask it to identify security issues and deployment blockers, and get 70 percent of what Shipcheck produces. The remaining 30 percent is workflow friction reduction (no manual copy-paste, automatic file traversal, structured report format, rescan targeting), the Semgrep rule set calibrated specifically for AI codegen output, and the dependency CVE lookup that a plain LLM cannot do reliably. Whether that 30 percent gap justifies $19 per scan depends entirely on how lazy or time-pressed the buyer is, and vibe coders who just shipped a working app in three days tend to have little tolerance for tooling overhead.

The stronger argument for Shipcheck as a vertical agent is not capability differentiation but workflow integration: if the scan runs inside the GitHub push flow as a required check before a deployment proceeds, developers do not have to choose to use it. That requires a different go-to-market than Product Hunt, and it puts Shipcheck in direct competition with GitHub Advanced Security and Vercel's built-in checks, much of which is free at the tier this audience uses (GitHub's code scanning, for instance, is free for public repositories).
