Wishdeal Factory
Concept essay · too complex to MVP

realized my cursor chat history contains every customer record i pasted in for "help debug this." that history is. somewhere?

An honest investment memo for an idea the studio decided not to ship as a landing page. Investors and founders read this kind of memo. Marketing copy is on the homepage; this is the math.

What This Is

A tooling layer that maps, audits, and provides compliance-grade visibility into personal LLM chat histories where customer data lives. It's not about preventing developers from pasting data (they will anyway). It's about answering the question: where is my sensitive information actually stored, who has access, how long does it persist, and what's my exposure?

The concrete problem: you have 200+ Cursor chats. Embedded in those chats are real customer records, transaction IDs, email addresses, API errors showing internal infrastructure, Stripe webhook payloads, and database query results with PII. Copies of that content can sit in local chat history, on Cursor's servers, and with whichever model provider (Anthropic, OpenAI) handled the request, depending on the privacy settings in effect. You have almost no visibility into retention, no practical control over deletion, and no audit trail of what's in there. When you need to comply with GDPR, respond to a security audit, or prepare a data breach disclosure, you have no way to say "this data is sanitized."

This is real. It's happening at every startup using Claude, ChatGPT, and other LLMs for day-to-day debugging. The larger the company and the longer the tool adoption, the more data quietly accumulates in chat histories you don't control.

Why It's Interesting

The security industry has spent 20 years solving code repository scanning (secrets detection, SAST). Chat history is the blind spot of the LLM era. Developers are trained to think of chat as ephemeral. It isn't. A Cursor workspace is a persistent log of every question you asked the AI, and those questions are about your infrastructure, your customers, your business logic.

The business insight: companies will pay to know what they've inadvertently disclosed. Not because they want to prevent it (they won't), but because they need compliance documentation, they need to respond to security incidents, and they need to sleep at night. This is less "save developers from themselves" and more "give ops and legal peace of mind."

The second insight: this creates recurring revenue. Once a company knows the scope of exposure, they'll want continuous monitoring. They'll want alerts when something matches customer PII patterns. They'll want exportable audit logs for their SOC 2 auditor.

Why a Landing Page Would Fail

You can't sell this to founders or engineering managers directly. They know they have the problem but will rationalize it ("we're careful," "we mostly use open data," "it's fine"). The person who cares is the person whose job is on the line: the security officer, the compliance officer, the ops lead who just got asked by board counsel "what's your exposure to LLM chat data leaks?"

A landing page implies a self-serve product. This isn't. You're selling to a risk function, not a productivity function. The buyer needs proof that your analysis is trustworthy and auditable. That requires a pilot with a real company, visibility into their chat history, and a written report they can share with counsel.

The second friction: this is uncomfortable. Companies don't want to admit they've been pasting customer data into chat. They certainly don't want a vendor knowing exactly what they've disclosed. You'll need to earn trust, prove you're not storing the data yourself, and demonstrate that your analysis is confidential.

The Realistic Shape

Technical architecture:

A client that connects to Cursor, Claude, and ChatGPT workspaces, via OAuth where the vendor offers it. It pulls chat history to the machine running the audit and keeps it only for the duration of the run. Pattern matching and PII detection run locally. The output is a report of risk vectors and retention recommendations. The local copy is then deleted. Each chat is pulled once, and its contents are never sent back to any vendor API.

Other LLM platforms (ChatGPT, the Claude web app, and so on) are supported by connecting the customer's account. Chat data is transient on your side: you run the analysis and purge immediately.
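A minimal sketch of the local pattern-matching step, assuming chats have already been exported as plain text. The pattern set, the `Finding` record, and the `scan_chat` and `summarize` helpers are hypothetical names for illustration, not an existing API, and two regexes are nowhere near real PII coverage. The design point it illustrates is that a finding keeps only a keyed fingerprint of the match, never the raw value, so the report can be shared without re-disclosing the data.

```python
import hashlib
import hmac
import re
from dataclasses import dataclass

# Illustrative patterns only (assumption): a real detector needs far broader coverage.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "stripe_secret_key": re.compile(r"\bsk_(?:live|test)_[A-Za-z0-9]{16,}\b"),
}


@dataclass(frozen=True)
class Finding:
    chat_id: str
    kind: str
    fingerprint: str  # keyed hash of the match; the raw value is never stored


def scan_chat(chat_id: str, text: str, run_key: bytes) -> list[Finding]:
    """Scan one chat's text and return findings without retaining matched values."""
    findings = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            # HMAC with a per-run key: identical values within one run share a
            # fingerprint, but the fingerprint is useless once the key is discarded.
            digest = hmac.new(run_key, match.group(0).encode("utf-8"), hashlib.sha256)
            findings.append(Finding(chat_id, kind, digest.hexdigest()[:16]))
    return findings


def summarize(findings: list[Finding]) -> dict[str, int]:
    """Count findings per kind for the audit report."""
    counts: dict[str, int] = {}
    for finding in findings:
        counts[finding.kind] = counts.get(finding.kind, 0) + 1
    return counts
```

In use, the caller would generate `run_key` once per audit (for example with `secrets.token_bytes(32)`), call `scan_chat` on each exported chat, pass the combined findings to `summarize`, and then discard both the key and the exported text.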

Revenue model:

$5k-15k per year, per company (not per seat). Billed as security/compliance. Enterprise pricing based on company size and chat history volume.

Team:

Capital:

$200-400k to build, pilot with 5 companies, and reach PMF. Not a venture-scale bet as scoped. This is a lifestyle SaaS or a small early-stage raise, depending on founder goals.

6-month milestones:

Month 1-2: Build and test pattern matching pipeline locally.

Month 2-3: Integrate Cursor OAuth.

Month 3-4: Run 2 pilot audits with real companies.

Month 4-5: Polish reporting and add ChatGPT support.

Month 5-6: Launch and target 5 paying customers.

Honest 12-Month Case

Revenue scenarios:

Conservative: 3 customers at $8k/year = $24k ARR. Churn risk is low (once a company runs an audit, they're unlikely to stop). Growth comes from word-of-mouth in security circles.

Optimistic: 20 customers at $10k/year = $200k ARR. Requires strong positioning and active sales outreach to security officers at funded companies.

Realistic: 8-10 customers, $80-100k ARR by month 12. Slow early growth, then acceleration once you have case studies.
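The arithmetic behind the three scenarios, as a small sketch. The `arr` helper is a hypothetical name, and the $10k price in the realistic case is inferred from the stated 8-10 customers and $80-100k range rather than given directly.

```python
def arr(customers: int, price_per_year: int) -> int:
    """Annual recurring revenue for a flat per-company price."""
    return customers * price_per_year


scenarios = {
    "conservative": arr(3, 8_000),
    "realistic_low": arr(8, 10_000),    # price inferred from the stated range
    "realistic_high": arr(10, 10_000),  # price inferred from the stated range
    "optimistic": arr(20, 10_000),
}

for name, value in scenarios.items():
    print(f"{name}: ${value:,} ARR")
```

Every price used here sits inside the $5k-15k per-company band from the revenue model.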

Kill criteria:

If after 6 months and 5 pilots you still can't articulate why a company should pay for this, because the pilots don't view LLM chat exposure as a real risk, then the problem is either smaller than you think or will be solved by the LLM vendors themselves (Anthropic enforcing retention limits, OpenAI adding compliance controls to ChatGPT). Kill it.

If Cursor/Anthropic rolls out native compliance tooling that covers this, your window closes. Move fast.

Five Questions to Answer Before Committing

1. Will vendors let you build this? Cursor/Anthropic/OpenAI can shut you down if they view this as competitive or if their terms of service forbid automated chat export. Talk to legal before you build.

2. Who's actually the buyer? Is it the CISO? The ops lead? The CTO trying to cover their tracks? Get 3 real conversations with people in these roles before you commit. Understand what would make them sign a check.

3. Can you prove no data is stored? This is a trust business. Your entire value prop evaporates if a customer believes you're keeping copies of their chat history. What's your architecture for proving this to a SOC 2 auditor?

4. How do you handle deletion at the source? If you find sensitive data, can you actually request deletion from the vendor? Or are you limited to reporting? That's a feature question that affects buyer confidence.

5. What's the real unit of value? Is it the risk score, the compliance report, the continuous monitoring, or the remediation advice? Until you know this, you can't price it or position it.