Iteration 97 ship log

2026-05-14 · push mode, 50 min cadence, audit-gap-closure iter



What shipped (2 substantive ships)

This iter built the page-identity audit that closes the "200 OK but wrong content" detection gap surfaced by iter 96's brief-ai discovery. It also wired the audit into /quality-report/ as a new card and invariant #12.

Ship 1: audit-page-identity.py - new audit class

Built audit-page-identity.py (~110 lines). For each /srv/sites/factory/builds/<slug>/ that has an index.html:

Fetches https://wishdeal.com/factory/builds/<slug>/
Reads first 32 KB of response
Checks against 3 homepage-fingerprint phrases ("Wishdeal Factory · Mission Control", etc.) - if found, flags as identity-mismatch (Caddy serving fall-through)
Else verifies slug or product name appears somewhere in the response - if found, ok
Else flags as ok-weak (response is not homepage but does not have slug/name in first 32 KB)
Unreachable cases (HTTP errors, timeouts) tracked separately
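
The classification steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of audit-page-identity.py: the function names, the timeout, and the case-insensitive slug match are assumptions, and only the one fingerprint phrase quoted in this log is included (the real script checks three).

```python
from urllib.request import urlopen

READ_WINDOW = 32 * 1024  # bytes of the response body to inspect

# Only the phrase quoted in the log; the other two fingerprints are not listed there.
HOMEPAGE_FINGERPRINTS = ("Wishdeal Factory · Mission Control",)


def classify(body: str, slug: str, product_name: str = "") -> str:
    """Classify an already-fetched response body (hypothetical helper)."""
    # Homepage fingerprint present: the server fell through to the homepage.
    if any(phrase in body for phrase in HOMEPAGE_FINGERPRINTS):
        return "identity-mismatch"
    # Assumption: slug and product name are matched case-insensitively.
    lowered = body.lower()
    if slug.lower() in lowered:
        return "ok"
    if product_name and product_name.lower() in lowered:
        return "ok"
    # Not the homepage, but no slug or name inside the read window.
    return "ok-weak"


def check_slug(slug: str, product_name: str = "", timeout: float = 10.0) -> str:
    """Fetch the build page and classify it; network failures are tracked separately."""
    url = f"https://wishdeal.com/factory/builds/{slug}/"
    try:
        with urlopen(url, timeout=timeout) as resp:
            raw = resp.read(READ_WINDOW)
    except OSError:
        # HTTPError, URLError and socket timeouts are all OSError subclasses.
        return "unreachable"
    # errors="replace" tolerates a multi-byte character cut at the window edge.
    return classify(raw.decode("utf-8", errors="replace"), slug, product_name)
```

Keeping classify() separate from the fetch means the decision logic can be exercised against saved HTML without touching the network.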

Result on first run: 245 pages checked in 1.8s.

ok: 245 (after expanding read window from 8 KB to 32 KB)
ok-weak: 0
identity-mismatch: 0
unreachable: 0

The initial 8 KB run produced 37 ok-weak false positives (the slug appeared deeper in the page because of the heavy CSS preamble from iter 8 jsonld + brand-applicator). Expanding the read window to 32 KB moved all 37 to ok.
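
A toy reproduction of the read-window effect, assuming a synthetic page whose slug sits behind roughly 12 KB of inline CSS. The page content, the slug, and the helper name are invented for illustration only.

```python
def slug_in_window(page: bytes, slug: str, window: int) -> bool:
    """True if the slug occurs within the first `window` bytes of the page."""
    return slug.encode("utf-8") in page[:window]


# Synthetic page: about 12 KB of CSS preamble, then the first mention of the slug.
preamble = b"<style>" + b"a{}" * 4000 + b"</style>"
page = preamble + b"<h1>example-slug</h1>"

seen_at_8k = slug_in_window(page, "example-slug", 8 * 1024)
seen_at_32k = slug_in_window(page, "example-slug", 32 * 1024)
print(len(preamble), seen_at_8k, seen_at_32k)  # 12015 False True
```

With the slug first appearing past byte 8192, the 8 KB window reports no match (ok-weak), while the 32 KB window finds it.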

Writes JSON snapshot at /srv/sites/factory/page-identity.json. Schema: generated_at, total_checked, ok_count, identity_mismatch_count, unreachable_count, identity_mismatches list, weak_matches list, unreachable list.
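
A sketch of how a snapshot with those fields could be assembled from per-slug results. The key names follow the schema listed above; the helper name, the ISO-8601 UTC timestamp, the use of slugs as list entries, and counting only plain ok results in ok_count are assumptions.

```python
import json
from datetime import datetime, timezone


def build_snapshot(results: dict) -> dict:
    """Build the snapshot from a {slug: status} mapping (hypothetical helper).

    Status values: "ok", "ok-weak", "identity-mismatch", "unreachable".
    """
    def slugs_with(status: str) -> list:
        return sorted(slug for slug, value in results.items() if value == status)

    mismatches = slugs_with("identity-mismatch")
    weak = slugs_with("ok-weak")
    unreachable = slugs_with("unreachable")
    return {
        "generated_at": datetime.now(timezone.utc).isoformat(),  # assumed format
        "total_checked": len(results),
        "ok_count": len(slugs_with("ok")),  # assumption: ok-weak is not counted as ok
        "identity_mismatch_count": len(mismatches),
        "unreachable_count": len(unreachable),
        "identity_mismatches": mismatches,
        "weak_matches": weak,
        "unreachable": unreachable,
    }


example = build_snapshot({"a": "ok", "b": "ok-weak", "c": "identity-mismatch"})
print(json.dumps(example, indent=2))
```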

Cron: every 30 min at :26,:56. Log at /home/ubuntu/factory/logs/page-identity.log.

Why this matters: This is the audit that would have caught brief-ai's 4-day outage automatically. Health-check verifies HTTP 200; page-identity verifies "the right 200." The two are complementary. Future Caddy-misconfiguration or polish-pass-wrote-0-bytes regressions will surface here within 30 minutes.

Ship 2: /quality-report/ surfaces page-identity + invariant #12

Patched regen-quality-report.py with:

New helper latest_page_identity() reads the JSON snapshot
New card in the Live checks row: "Page identity 245/245 - no fall-through" (ok green / warn amber / fail orange depending on counts)
New row in the "What we audit" table: lists audit-page-identity.py with cadence + what-it-catches
New content invariant #12: "No page-identity fall-through on any /builds/<slug>/ URL"

The Live checks row now has 6 audit cards: Health endpoints, Fake-proof audit, Adoptability score sync, Em-dash sweep, Broken taglines, Page identity.
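
The helper and card logic might look something like the sketch below. Only the name latest_page_identity(), the snapshot path, and the card label come from this log; the page_identity_card() function, the thresholds for ok / warn / fail, and the fallback when the snapshot is missing are assumptions.

```python
import json
from pathlib import Path
from typing import Optional, Tuple

SNAPSHOT_PATH = Path("/srv/sites/factory/page-identity.json")


def latest_page_identity(path: Path = SNAPSHOT_PATH) -> Optional[dict]:
    """Read the JSON snapshot; return None if it is missing or unparseable."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None


def page_identity_card(snapshot: Optional[dict]) -> Tuple[str, str]:
    """Return (status, label) for the Live checks card (hypothetical helper)."""
    if snapshot is None:
        return "warn", "Page identity - no snapshot"
    total = snapshot.get("total_checked", 0)
    ok = snapshot.get("ok_count", 0)
    mismatches = snapshot.get("identity_mismatch_count", 0)
    unreachable = snapshot.get("unreachable_count", 0)
    # Assumed thresholds: any fall-through fails; anything else unverified warns.
    if mismatches:
        return "fail", f"Page identity {ok}/{total} - {mismatches} fall-through"
    if unreachable or ok < total:
        return "warn", f"Page identity {ok}/{total} - {total - ok} unverified"
    return "ok", f"Page identity {ok}/{total} - no fall-through"
```

Treating a missing snapshot as warn rather than ok is a deliberate choice in this sketch: a silent audit should not render as a green card.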

Health hygiene (Op rule 5)

Em-dash sweep: pending
audit-fakeproof: 0 hard / 0 soft (CLEAN)
audit-adoptability-drift: 244 matched, 0 drift, 2 partial-build
audit-page-identity: 245/245 ok, 0 mismatch, 0 unreachable (NEW iter 97)
Health-check: 77/77 passing
All structured-data: maintained

Status snapshot

244 scored products + 2 partial builds (Director-WIP)
246 build pages with index.html, 0 broken, 245 verified by page-identity audit
0 fake-proof findings, 0 Adoptability score drift, 0 page-identity fall-throughs
12 essays + Read-next + JSON-LD on each
8 high-trust pages with JSON-LD durable
/factory/catalog/ with CollectionPage + 244-item ItemList
244 /builds/ pages with PNG OG + Product schema
271 OG PNG images
5 transparency surfaces + 96 styled ship-log detail pages
/quality-report/ surfaces 6 live-check cards (added Page identity iter 97)
26 hand-polished products
12 content invariants (NEW iter 97: Page identity)
77/77 health endpoints, 2320 sitemap URLs
134+ cron jobs (new iter 97: audit-page-identity at :26,:56)
50 min cadence active

Iter 97 throughput note

2 substantive ships at 50-min cadence. Ship 1 was the audit itself; Ship 2 was the full transparency wiring. The audit took 1.8s to run on 245 pages, which leaves room to extend it to 300+ pages or run it on a shorter cadence if needed.

The 12 content invariants at iter 97

No Unicode em-dashes (15-min sweep)
No HTML-entity em-dashes (extended sweep iter 61)
No tagline equals product name (adoptability-score.py)
No name equals slug (same validation)
No stale product counts in generator output (iter 63)
No fake-proof claims in bulk-generated content (iter 66)
No fake-proof claims in FAQ subpages (iter 67)
No fake-proof claims in pricing subpages (iter 68)
No fake-proof claims in enterprise template (iter 68)
No skeleton-broken pages (iter 58)
No Adoptability score drift between Product JSON-LD and adoptability.json (iter 93)
No page-identity fall-through on any /builds/<slug>/ URL (iter 97)

Running queue (top 5 for iter 98)

Cadence step to 60 min - iters 92-97 averaged ~2-3 ships each, so the step is an option.
page-identity extension to /unlock/, /adopt/, /feedback/, /vs/ pages - currently only audits /builds/. Each is a buyer-touching surface.
Investigate the Director's polish-pass empty-write bug - the same bug that broke brief-ai might still be live.
Adoptability scoring catch-up for brief-ai - should auto-fix on next cron, but verify.
13th essay - skip until queue has a fresh candidate.

Cumulative iter 1-97

Catalog: 244 scored + 2 partial, 246 with index.html
Content library: 12 essays + Read-next + 271 OG PNGs + 96 styled ship-log pages
High-trust pages: 8 foundational + 5 transparency surfaces with full JSON-LD
Source durability: 23+ generators (added audit-page-identity iter 97) + 6 regen scripts auto-call injectors + 4 JSON snapshots (fakeproof, drift, page-identity, health) + 134+ cron jobs
Content invariants: 12 defended + audits-named-publicly + partial-build state visible + page-identity verification

The audit suite is now complete enough to catch the "200 OK but wrong content" class of failures that the iter-96 brief-ai discovery surfaced. Time-to-detect on a future regression: at most 30 minutes (one audit cron cycle). Together, audit-fakeproof, audit-adoptability-drift, and audit-page-identity cover content correctness across three dimensions: claim integrity, score sync, and identity match.
