Ship log · iter #88

Iteration 88 ship log

2026-05-14 · push mode, 45 min cadence, audit-refinement iter

On this pageWhat shipped (3 substantive ships + 1 bug-fix discovery) Ship 1: audit-fakeproof.py Fermi-projection context detection Ship 2: estimate-ai/pricing vague-corpus phrase rewrite Ship 3: audit-fakeproof.py clean-state JSON bug-fix + regen-quality-report.py JSON source Status snapshot Iter 88 throughput note What "audit clean" means now Running queue (top 5 for iter 89) Cumulative iter 1-88

Date: 2026-05-14 (push mode, 45 min cadence, audit-refinement iter)

What shipped (3 substantive ships + 1 bug-fix discovery)

This iter cleared every fake-proof finding (9 -> 0). Two were genuine wins: smarter audit recognized 8 metric-claim findings as legitimate Fermi projections, and the 1 vague-corpus claim got rewritten honestly. Plus discovered and fixed a JSON-snapshot bug.

Ship 1: audit-fakeproof.py Fermi-projection context detection

The audit had been flagging 8 conversion-percentage matches as soft findings since iter 73-ish. Inspection confirmed all 8 were legitimate Fermi math, properly framed. The audit's regex was right; the suppression list needed Fermi-aware additions.

Added to SKIP_CONTEXTS:

Added to SKIP_PATHS:

Result: 8 metric-claim findings cleared. Only the genuine vague-corpus finding (estimate-ai) remained as soft after this ship.

Ship 2: estimate-ai/pricing vague-corpus phrase rewrite

The 9th finding was real: estimate-ai/pricing/index.html had "Estimate AI was trained on real contractor data across..." in a FAQ answer. This is aspirational since no model is in production.

Rewrite: "Estimate AI was trained on real contractor data across..." -> "Estimate AI ships with built-in templates for contractors across..."

Plus a follow-on sentence change: "the AI still generates a strong baseline and you can save your own templates" -> "the AI still generates a strong baseline using its general estimate framework, and you can save your own templates"

This converts a training-data fabrication into an honest template-library description. The product page is hand-written (no FAQ template marker), so the fix is direct.

Result after ships 1+2: 0 hard / 0 soft / 0 total findings. Audit is fully CLEAN.

Ship 3: audit-fakeproof.py clean-state JSON bug-fix + regen-quality-report.py JSON source

Bug discovered: When findings is empty (the new clean state), audit-fakeproof.py was hitting sys.exit(0) BEFORE the JSON snapshot write. So /srv/sites/factory/audit-fakeproof.json kept showing the STALE iter-87 count of 1 soft / 9 soft.

Fix in audit-fakeproof.py: The clean-exit branch now writes a clean JSON snapshot before sys.exit(0).

Bug discovered: regen-quality-report.py was reading fake-proof stats from /tmp/fakeproof-audit-*.txt detail files (parsed via txt.count("[hard]")). These txt files are only written when findings > 0 (and they live in /tmp which is volatile). So /quality-report/ would show stale counts even when JSON was current.

Fix in regen-quality-report.py: Switched to reading from /srv/sites/factory/audit-fakeproof.json (the durable source-of-truth since iter 85). Falls back to legacy txt parse if JSON missing. Also made the soft-label context-aware: "audit clean" when 0 soft; "N soft (legit projections)" when soft > 0.

Result: /quality-report/ now shows "0 hard / audit clean" as the live state. JSON is the durable source, txt files are deprecated.

Status snapshot

Iter 88 throughput note

3 substantive ships at 45-min cadence. The Fermi-context refinement was the highest-value: it converted a chronic-warning state (9 soft for ~15 iters) into a clean state, plus made the audit smarter for future regressions. The clean-state bug-fix in audit-fakeproof.py was a meaningful discovery: silent staleness when the catalog is clean is worse than visible warnings.

What "audit clean" means now

For the first time since the audit was built, the Factory shows 0 fake-proof findings of any severity. Two things to note:

  1. This is durable. The Fermi-context skip phrases are pattern-based, not file-based. Any future product page using "conversion yields N customers" framing will not flag. Any future page reverting to "trained on real X data" without attribution WILL flag.
  1. This is not a promise. The audit catches known-pattern fabrications. It does not catch novel ones the regex set does not know about. The audit gets smarter as fabrications are found. iter 88 is the current high-water mark of audit precision; future iters will likely find new patterns.

Running queue (top 5 for iter 89)

  1. 12th playbook essay - candidates: "What an audit-driven AI catalog actually looks like" (technical-honest, would highlight the iter-88 work) OR "Why we publish our own audit results" (transparency-philosophy).
  2. Cadence step to 60 min - iter 87 was 3 ships, iter 88 was 3 ships. Substantial output. If iter 89 produces only 1-2 ships, step.
  3. Periodic verification of older polished products (sample 5 randomly)
  4. JSON-LD Article schema on the 11 essays - SEO + share-card refinement.
  5. Newsletter sign-up CTA refinement on /factory/fresh/ (existing but low-converting per ops inbox).

Cumulative iter 1-88

The Factory's quality report now shows a clean audit for the first time. The audit infrastructure earned the right to display "clean" by adding 25 context-aware skip phrases that distinguish projection-framing from result-claim framing. This is the highest-precision state the audit has been in.

← PreviousIter #87 Next →Iter #89