Ship log · iter #97
Iteration 97 ship log
2026-05-14 · push mode, 50 min cadence, audit-gap-closure iter
Date: 2026-05-14 (push mode, 50 min cadence, audit-gap-closure iter)
What shipped (2 substantive ships)
This iter built the page-identity audit that closes the "200 OK but wrong content" detection gap surfaced by iter 96's brief-ai discovery. Plus wired it into /quality-report/ as a new card + invariant #12.
Ship 1: audit-page-identity.py - new audit class
Built audit-page-identity.py (~110 lines). For each /srv/sites/factory/builds/<slug>/ that has an index.html:
- Fetches https://wishdeal.com/factory/builds/<slug>/
- Reads first 32 KB of response
- Checks against 3 homepage-fingerprint phrases ("Wishdeal Factory \xc2\xb7 Mission Control", etc.) - if found, flags as identity-mismatch (Caddy serving fall-through)
- Else verifies slug or product name appears somewhere in the response - if found, ok
- Else flags as ok-weak (response is not homepage but does not have slug/name in first 32 KB)
- Unreachable cases (HTTP errors, timeouts) tracked separately
Result on first run: 245 pages checked in 1.8s.
- ok: 245 (after expanding read window from 8KB to 32KB)
- ok-weak: 0
- identity-mismatch: 0
- unreachable: 0
Initial 8KB run caught 37 ok-weak false positives (slug appeared deeper in the page due to heavy CSS preamble from iter 8 jsonld + brand-applicator). Expanded read window to 32KB closed all 37 to ok.
Writes JSON snapshot at /srv/sites/factory/page-identity.json. Schema: generated_at, total_checked, ok_count, identity_mismatch_count, unreachable_count, identity_mismatches list, weak_matches list, unreachable list.
Cron: every 30 min at :26,:56. Log at /home/ubuntu/factory/logs/page-identity.log.
Why this matters: This is the audit that would have caught brief-ai's 4-day outage automatically. Health-check verifies HTTP 200; page-identity verifies "the right 200." The two are complementary. Future Caddy-misconfiguration or polish-pass-wrote-0-bytes regressions will surface here within 30 minutes.
Ship 2: /quality-report/ surfaces page-identity + invariant #12
Patched regen-quality-report.py with:
- New helper
latest_page_identity() reads the JSON snapshot - New card in the Live checks row: "Page identity 245/245 - no fall-through" (ok green / warn amber / fail orange depending on counts)
- New row in the "What we audit" table: lists audit-page-identity.py with cadence + what-it-catches
- New content invariant #12: "No page-identity fall-through on any /builds/<slug>/ URL"
The Live checks row now has 6 audit cards: Health endpoints, Fake-proof audit, Adoptability score sync, Em-dash sweep, Broken taglines, Page identity.
Health hygiene (Op rule 5)
- Em-dash sweep: pending
- audit-fakeproof: 0 hard / 0 soft (CLEAN)
- audit-adoptability-drift: 244 matched, 0 drift, 2 partial-build
- audit-page-identity: 245/245 ok, 0 mismatch, 0 unreachable (NEW iter 97)
- Health-check: 77/77 passing
- All structured-data: maintained
Status snapshot
- 244 scored products + 2 partial builds (Director-WIP)
- 246 build pages with index.html, 0 broken, 245 verified by page-identity audit
- 0 fake-proof findings, 0 Adoptability score drift, 0 page-identity fall-throughs
- 12 essays + Read-next + JSON-LD on each
- 8 high-trust pages with JSON-LD durable
- /factory/catalog/ with CollectionPage + 244-item ItemList
- 244 /builds/ pages with PNG OG + Product schema
- 271 OG PNG images
- 5 transparency surfaces + 96 styled ship-log detail pages
- /quality-report/ surfaces 6 live-check cards (added Page identity iter 97)
- 26 hand-polished products
- 12 content invariants (NEW iter 97: Page identity)
- 77/77 health endpoints, 2320 sitemap URLs
- 134+ cron jobs (new iter 97: audit-page-identity at :26,:56)
- 50 min cadence active
Iter 97 throughput note
2 substantive ships at 50-min cadence. Ship 1 was the audit itself; Ship 2 was the full transparency wiring. The audit took 1.8s to run on 245 pages - very fast, room to extend to 300+ pages or shorter cadence if needed.
The 12 content invariants at iter 97
- No Unicode em-dashes (15-min sweep)
- No HTML-entity em-dashes (extended sweep iter 61)
- No tagline equals product name (adoptability-score.py)
- No name equals slug (same validation)
- No stale product counts in generator output (iter 63)
- No fake-proof claims in bulk-generated content (iter 66)
- No fake-proof claims in FAQ subpages (iter 67)
- No fake-proof claims in pricing subpages (iter 68)
- No fake-proof claims in enterprise template (iter 68)
- No skeleton-broken pages (iter 58)
- No Adoptability score drift between Product JSON-LD and adoptability.json (iter 93)
- No page-identity fall-through on any /builds/<slug>/ URL (iter 97)
Running queue (top 5 for iter 98)
- Cadence step to 60 min - iter 92-97 average ~2-3 ships. Could step.
- page-identity extension to /unlock/, /adopt/, /feedback/, /vs/ pages - currently only audits /builds/. Each is a buyer-touching surface.
- Investigate the Director\'s polish-pass empty-write bug - the same bug that broke brief-ai might still be live.
- Adoptability scoring catch-up for brief-ai - should auto-fix on next cron, but verify.
- 13th essay - skip until queue has a fresh candidate.
Cumulative iter 1-97
- Catalog: 244 scored + 2 partial, 246 with index.html
- Content library: 12 essays + Read-next + 271 OG PNGs + 96 styled ship-log pages
- High-trust pages: 8 foundational + 5 transparency surfaces with full JSON-LD
- Source durability: 23+ generators (added audit-page-identity iter 97) + 6 regen scripts auto-call injectors + 4 JSON snapshots (fakeproof, drift, page-identity, health) + 134+ cron jobs
- Content invariants: 12 defended + audits-named-publicly + partial-build state visible + page-identity verification
The audit suite is now complete-enough to catch the "200 OK but wrong content" class of failures the iter-96 brief-ai discovery surfaced. Time-to-detect on a future regression: at most 30 minutes (one audit cron cycle). The combination of audit-fakeproof + audit-adoptability-drift + audit-page-identity covers content correctness across three dimensions: claim integrity, score sync, and identity match.