# Priya Mehta, Principal AI Engineer at Clarafield (Series B, ~160 eng) — read of AgentX, June 24 2026

> 9 years backend, 2.5 years neck-deep in LLM infrastructure, currently the person on-call when our agent pipeline goes sideways at 2am.

## How I got here

Searched "AI agent observability tool" on Google last Thursday. We have three agents running in production — a document classifier, a customer-triage router, and something our ML lead built that I still don't fully trust. We've been duct-taping together Langfuse and a homegrown eval harness and it's starting to crack. I've been comparison-shopping. AgentX showed up on page one.

## What I clicked first

"Catch reasoning loops, hallucinations, constraint violations, and divergence patterns before they impact production workflows." That's the first hero sub-line under Automated Issue Detection and it's specific enough that I kept reading. Most tools in this space say "monitor your AI." This one named the actual failure modes. That earns thirty more seconds.

The "Real-time Replay Console" also made me stop. Replaying a failed agent run step-by-step is exactly the thing I've been manually cobbling together with log exports. If that actually works the way it sounds, I want it.

## Where I paused

The pricing. "$0.01 per 1K agent evaluations" is the kind of number that sounds cheap and then isn't. I did the math on our current volume: we run maybe 80K agent calls a day, so if one call is one evaluation, that's $0.80/day, or about $292/year. Fine, actually reasonable. But I have no idea what counts as "one evaluation." Is it per LLM call? Per tool invocation? Per full agent run? That ambiguity is load-bearing and the page just doesn't address it.
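Here's a rough sketch of why the unit matters. The price and the 80K/day volume are the figures above, and I'm treating each of our agent calls as one full run. The fan-out numbers (LLM calls and tool invocations per run) are placeholders I made up for illustration, not measurements from our pipeline and not anything the page states.

```python
PRICE_PER_1K_EVALS = 0.01  # USD, from the pricing line on the page
RUNS_PER_DAY = 80_000      # our rough daily volume, treated as full agent runs

# Hypothetical fan-out per agent run. Placeholder values, not measured.
EVALS_PER_RUN = {
    "per full agent run": 1,
    "per LLM call": 6,
    "per tool invocation": 15,
}


def annual_cost(evals_per_run, runs_per_day=RUNS_PER_DAY,
                price_per_1k=PRICE_PER_1K_EVALS, days=365):
    """Yearly cost in USD if each run is billed as `evals_per_run` evaluations."""
    daily_evals = runs_per_day * evals_per_run
    return daily_evals / 1000 * price_per_1k * days


if __name__ == "__main__":
    for unit, n in EVALS_PER_RUN.items():
        print(f"{unit:>20}: ${annual_cost(n):,.2f}/year")
```

Cost scales linearly with whatever the billing unit turns out to be, so the annual figure is only meaningful once "one evaluation" is defined. With nested sub-agents the real multiplier could sit well above or below these placeholders.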

## What I distrusted

"Trusted by Leading AI Teams" followed by "12+ Production deployments at scale." Then two scrolls later: "Honest disclosure: we don't have live customers on this idea yet."

Read that again. The same page puts a social proof badge at the top and then explicitly tells me they have zero live customers. I genuinely sat there for a minute trying to figure out if I misread something. I didn't. That's not a small contradiction. That's the page eating itself.

And then I hit the pricing table for "Unlock the dossier $5" and "Adopt the build $99-$199" and I finally understood: this isn't a product I can buy. This is a business idea for sale. The hero copy is a pitch for a product that doesn't exist yet. Someone is selling me the blueprint for AgentX, not AgentX.

The "42% Avg token reduction after optimization" stat is also doing a lot of heavy lifting with no denominator. 42% across how many runs, on what tasks, for which agent architectures? That number might be real or it might be from one pilot with a badly-written prompt. No way to know.

## What would convince me

If a real, shipped version of this tool existed, I'd want to see a trace from an actual agent failure. Not a screenshot of a dashboard, but a short screen recording of someone loading a failed run into the replay console and walking through what it caught. Twenty seconds of real UX beats every stat on this page.

For the "3,000+ issues caught in pilot" claim specifically: one paragraph describing what types of issues those were (reasoning loops vs. tool selection errors vs. cost overruns) would make that number mean something instead of floating.

## What I'd ask in an email reply

1. The page says "automated issue detection" but also that there are no live customers yet. Is there a working build I can actually run against my agent logs, or are you still in pre-build?
2. What's the unit of billing: per LLM call, per tool invocation, or per full agent run? Our pipelines nest sub-agents and I need to know if that gets expensive fast.
3. The replay console is the thing I'd pay for on day one. How does log ingestion work? Do I instrument my agents with an SDK, ship logs to an endpoint, or connect an existing provider like Langfuse?

## Verdict: on-the-fence

The underlying problem is real and the feature language is more specific than most of what I've seen. But I spent five minutes on this page and I'm still not certain whether I'm looking at a live beta I can sign up for or an idea package I can buy for $99. That ambiguity is fatal for a tool aimed at engineers. Fix that and I'd probably book the demo.

---
*Memo by skeptic persona, generated 2026-06-24. Studio breaks own self-grading loop.*