# Marcus Fielding, Head of Platform Engineering at Carepath Analytics — read of Anonymizer, June 13, 2026

> 9 years in health-tech backend, currently holding the HIPAA line at a 90-person company that sells risk-scoring tools to regional payers. We have three devs poking at GPT-4 for internal stuff and I have been the person saying "not yet" for 8 months.

## How I got here

Searched "redact PHI before openai api call python" last Tuesday. Got a Stack Overflow thread, a Medium post from 2023 that was already half-wrong, and then this on the second page of results. I actually came back to it the next day because I had closed the tab and remembered the URL had the word "anonymizer" in it. That's the only reason I'm here now.

## What I clicked first

The hero did one thing right. "Once it reaches an API, it's logged. Training data. Competitors' intelligence. HIPAA violations. GDPR fines." That list is basically a verbatim transcript of my last three conversations with legal. I didn't click anything, I just kept reading. That's rare for me on a product page.

## Where I paused

The reversibility claim. "We map redactions to tokens so results stay coherent. De-anonymize output automatically." I stopped here for a while. This is actually the hardest part of the problem and they've given it one sentence. If the LLM response says "the patient [PERSON_1] showed elevated [MEDICAL_TERM_2] consistent with [CONDITION_3]" and you de-anonymize that, you need the mapping to be rock-solid across multi-turn sessions and streaming responses. I've seen three open-source attempts at this and they all fall apart on edge cases. One sentence of confidence on the hard part makes me more nervous, not less.
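To make the concern concrete, here is a minimal Python sketch of what I understand by reversible redaction. Everything in it is my own assumption for illustration, not Anonymizer's API: the `RedactionSession` class, the two regexes, and the `[LABEL_n]` token format are all hypothetical. It keeps one mapping per conversation so the same value gets the same token on every turn, and it leaves unknown tokens alone on the way back.

```python
import re

# Hypothetical detection patterns, for illustration only.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}
# Assumed token shape: [LABEL_n], e.g. [EMAIL_1].
TOKEN_RE = re.compile(r"\[[A-Z_]+_\d+\]")


class RedactionSession:
    """Holds one value<->token mapping for the lifetime of a conversation."""

    def __init__(self):
        self.token_for_value = {}
        self.value_for_token = {}
        self.counters = {}

    def _token(self, label, value):
        # Reuse the existing token if this value was already seen in the session.
        key = (label, value)
        if key not in self.token_for_value:
            self.counters[label] = self.counters.get(label, 0) + 1
            token = f"[{label}_{self.counters[label]}]"
            self.token_for_value[key] = token
            self.value_for_token[token] = value
        return self.token_for_value[key]

    def redact(self, text):
        for label, pattern in PATTERNS.items():
            text = pattern.sub(
                lambda m, label=label: self._token(label, m.group(0)), text
            )
        return text

    def restore(self, text):
        # Tokens the session never issued are left untouched, not guessed at.
        return TOKEN_RE.sub(
            lambda m: self.value_for_token.get(m.group(0), m.group(0)), text
        )


session = RedactionSession()
turn_1 = session.redact("Contact jane@example.com, SSN 123-45-6789.")
turn_6 = session.redact("Follow up with jane@example.com")
reply = session.restore("Sent the summary to [EMAIL_1]; [PERSON_9] is unknown.")
```

This is the easy version. It does nothing about a token split across two streamed chunks, a model that rewrites `[EMAIL_1]` as "EMAIL 1", or a mapping that has to survive a process restart, which are exactly the edge cases I'd want the vendor to talk about.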

## What I distrusted

Two things.

First: "30 minutes to integration." I've heard this from every SDK vendor in the last five years. What they mean is "30 minutes if your data is already clean, you're on a modern stack, and nothing weird is happening." For a compliance use case where the whole point is handling weird edge cases, this number is marketing math.

Second, and this is the one that stopped me cold: at the bottom, buried under pricing and CTAs, it says "Honest disclosure: we don't have live customers on this idea yet." The whole page is written in the present tense. "Anonymizer intercepts your data." "We detect and redact." "Results come back." There is no product. This is a business idea dossier being sold for $5 to $99. The homepage is written as if I'm evaluating live software. I had to read it twice to confirm that's what I was looking at.

That's not a small thing. That's the entire framing of my read, reversed.

## What would convince me

If this were a real product: false positive and false negative rates on a standard PII benchmark. Not "Regex + ML model identifies SSN, email, credit cards" -- what percentage of real PII does it catch, what percentage of what it flags isn't PII, and what happens to my LLM output coherence when 40% of a sentence is redacted tokens. A five-minute Loom of a real codebase running a real document through the SDK. Not a demo environment, a real messy doc.
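The numbers I'm asking for are cheap to compute once you have a labeled set. A rough Python sketch, assuming PII is annotated as `(start, end)` character spans and that only exact span matches count as hits (both are my simplifications, and `detection_rates` is a name I made up):

```python
def detection_rates(gold_spans, predicted_spans):
    """Compare detector output against hand-labeled PII spans (exact match)."""
    gold = set(gold_spans)
    predicted = set(predicted_spans)

    true_pos = len(gold & predicted)    # real PII that was flagged
    false_pos = len(predicted - gold)   # flagged, but not PII
    false_neg = len(gold - predicted)   # real PII that slipped through

    # Recall: share of real PII caught. Precision: share of flags that were real.
    recall = true_pos / len(gold) if gold else None
    precision = true_pos / len(predicted) if predicted else None
    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "false_neg": false_neg,
        "recall": recall,
        "precision": precision,
    }


# Toy example: three labeled spans, detector finds two of them plus one bogus span.
rates = detection_rates(
    gold_spans=[(0, 5), (10, 20), (30, 41)],
    predicted_spans=[(0, 5), (10, 20), (50, 55)],
)
```

"What it catches" is recall and "what it flags that isn't PII" is one minus precision. For a HIPAA use case the miss count is the number I care about most, since a single leaked identifier is the incident.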

If someone is actually building this from the dossier: I want to talk to the first three paying teams. What did they break. What did they have to customize. What patterns did the default detection miss.

## What I'd ask in an email reply

1. What does the token mapping look like in a multi-turn conversation where the LLM references a redacted entity 6 messages later? Does the de-anonymization hold?

2. The "on-device processing" framing is doing a lot of work here. Does the ML detection model run entirely locally, or does it phone home for model updates, telemetry, or licensing checks?

3. You say "custom patterns you define" -- what's the interface for that? Regex strings in a config file, or something that needs retraining?
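On question 3, the answer I'm hoping for looks roughly like the Python sketch below: regex strings in a config file, loaded at startup, no retraining. The config shape, the `custom_patterns` key, the `MEMBER_ID` pattern, and both function names are hypothetical, made up to show the kind of interface I mean. Nothing on the page says this is how it works.

```python
import json
import re

# Hypothetical config file contents: label -> regex string.
CONFIG_TEXT = r'''
{
  "custom_patterns": {
    "MEMBER_ID": "\\bMBR-\\d{8}\\b"
  }
}
'''


def load_patterns(config_text):
    """Compile every regex string in the config; a bad pattern fails loudly here."""
    config = json.loads(config_text)
    return {
        label: re.compile(pattern)
        for label, pattern in config["custom_patterns"].items()
    }


def find_matches(text, patterns):
    """Return (label, matched_text) pairs for every custom pattern hit."""
    return [
        (label, match.group(0))
        for label, pattern in patterns.items()
        for match in pattern.finditer(text)
    ]


patterns = load_patterns(CONFIG_TEXT)
hits = find_matches("Claim for MBR-00412345 was denied.", patterns)
```

If custom patterns instead mean fine-tuning a model, that's a different cost and a different review process on my side, which is why I'd want it spelled out.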

## Verdict: on-the-fence

The problem is real and the approach is technically sound, which puts this ahead of 80% of the compliance-tool pages I've read this year. But I got to the bottom and found out I wasn't evaluating a product, I was evaluating an idea someone is selling as a business kit. That's a different conversation than the one the page was having with me.

---
*Memo by skeptic persona, generated 2026-06-13. Studio breaks own self-grading loop.*
