# Wishdeal Factory buyer-path - iteration 97 ship log

**Date:** 2026-05-14 (push mode, 50 min cadence, audit-gap-closure iter)

## What shipped (2 substantive ships)

This iter built the page-identity audit that closes the "200 OK but wrong content" detection gap surfaced by iter 96's brief-ai discovery, then wired it into /quality-report/ as a new card plus invariant #12.

## Ship 1: audit-page-identity.py - new audit class

Built audit-page-identity.py (~110 lines). For each /srv/sites/factory/builds/<slug>/ that has an index.html:
- Fetches https://wishdeal.com/factory/builds/<slug>/
- Reads first 32 KB of response
- Checks against 3 homepage-fingerprint phrases ("Wishdeal Factory · Mission Control", etc.) - if found, flags as identity-mismatch (Caddy serving fall-through)
- Else verifies slug or product name appears somewhere in the response - if found, ok
- Else flags as ok-weak (response is not homepage but does not have slug/name in first 32 KB)
- Unreachable cases (HTTP errors, timeouts) tracked separately
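
A minimal sketch of the classification order above, not the shipped audit-page-identity.py. Assumptions: the function name `classify_page`, case-insensitive slug/name matching, and a fingerprint tuple holding only the one phrase this log quotes (the other two are not listed here). Fetching and the unreachable case are left out so the logic can be exercised offline.

```python
READ_WINDOW = 32 * 1024  # bytes inspected per response (was 8 KB on the first run)

# Only one of the 3 fingerprint phrases is quoted in this log; the rest are omitted.
HOMEPAGE_FINGERPRINTS = ("Wishdeal Factory · Mission Control",)


def classify_page(body, slug, product_name="", fingerprints=HOMEPAGE_FINGERPRINTS):
    """Return 'identity-mismatch', 'ok', or 'ok-weak' for one fetched response body."""
    # Decode only the read window; a multibyte char cut at the edge is replaced, not fatal.
    text = body[:READ_WINDOW].decode("utf-8", errors="replace")

    # 1. Homepage fingerprint present: the server fell through to the homepage.
    if any(phrase in text for phrase in fingerprints):
        return "identity-mismatch"

    # 2. Slug or product name present: the page is what it claims to be.
    lowered = text.lower()  # case-insensitive matching is an assumption
    if slug.lower() in lowered:
        return "ok"
    if product_name and product_name.lower() in lowered:
        return "ok"

    # 3. Not the homepage, but no identifying string inside the window.
    return "ok-weak"
```

The fingerprint check runs first on purpose: a fall-through homepage could still mention the slug in a catalog link, so mismatch has to win over a slug hit.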

**Result on first run:** 245 pages checked in 1.8s.
- ok: 245 (after expanding read window from 8KB to 32KB)
- ok-weak: 0
- identity-mismatch: 0
- unreachable: 0

**Initial 8KB run** flagged 37 ok-weak false positives: the slug appeared deeper in the page because of the heavy CSS preamble from iter 8 jsonld + brand-applicator. Expanding the read window to 32KB moved all 37 to ok.

**Writes JSON snapshot** at /srv/sites/factory/page-identity.json. Schema: generated_at, total_checked, ok_count, identity_mismatch_count, unreachable_count, identity_mismatches list, weak_matches list, unreachable list.
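
A rough sketch of how a snapshot with those fields could be assembled. Assumptions: the function name `build_snapshot`, the `(slug, status)` input shape, an ISO-8601 UTC timestamp for `generated_at`, and `ok_count` counting strong matches only (ok-weak goes to `weak_matches`). The real script may differ on any of these.

```python
import json
from datetime import datetime, timezone


def build_snapshot(results):
    """Build the snapshot dict from a list of (slug, status) pairs.

    status is one of: 'ok', 'ok-weak', 'identity-mismatch', 'unreachable'.
    """
    by_status = {"ok": [], "ok-weak": [], "identity-mismatch": [], "unreachable": []}
    for slug, status in results:
        by_status[status].append(slug)

    return {
        "generated_at": datetime.now(timezone.utc).isoformat(),  # format is an assumption
        "total_checked": len(results),
        "ok_count": len(by_status["ok"]),  # assumption: excludes ok-weak
        "identity_mismatch_count": len(by_status["identity-mismatch"]),
        "unreachable_count": len(by_status["unreachable"]),
        "identity_mismatches": sorted(by_status["identity-mismatch"]),
        "weak_matches": sorted(by_status["ok-weak"]),
        "unreachable": sorted(by_status["unreachable"]),
    }


if __name__ == "__main__":
    demo = build_snapshot([("brief-ai", "ok"), ("example-slug", "ok-weak")])
    print(json.dumps(demo, indent=2))
```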

**Cron**: every 30 min at :26,:56. Log at /home/ubuntu/factory/logs/page-identity.log.

**Why this matters**: This is the audit that would have caught brief-ai's 4-day outage automatically. Health-check verifies HTTP 200; page-identity verifies "the right 200." The two are complementary. Future Caddy-misconfiguration or polish-pass-wrote-0-bytes regressions will surface here within 30 minutes.

## Ship 2: /quality-report/ surfaces page-identity + invariant #12

Patched regen-quality-report.py with:
- New helper `latest_page_identity()` reads the JSON snapshot
- New card in the Live checks row: "Page identity 245/245 - no fall-through" (ok green / warn amber / fail orange depending on counts)
- New row in the "What we audit" table: lists audit-page-identity.py with cadence + what-it-catches
- New content invariant #12: "No page-identity fall-through on any /builds/<slug>/ URL"

The Live checks row now has 6 audit cards: Health endpoints, Fake-proof audit, Adoptability score sync, Em-dash sweep, Broken taglines, Page identity.
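
A hedged sketch of the helper and card logic. Only the name `latest_page_identity()`, the snapshot path, and the card text for the all-ok case come from this log. The function `page_identity_card`, the exact thresholds for ok / warn / fail, and the non-ok label wording are assumptions for illustration.

```python
import json
from pathlib import Path

SNAPSHOT_PATH = Path("/srv/sites/factory/page-identity.json")


def latest_page_identity(path=SNAPSHOT_PATH):
    """Return the parsed snapshot, or None if it is missing or unreadable."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None


def page_identity_card(snapshot):
    """Map a snapshot to (status, label) for the Live checks row.

    Assumed thresholds: any mismatch is 'fail'; any unreachable or weak
    match (or a missing snapshot) is 'warn'; otherwise 'ok'.
    """
    if snapshot is None:
        return ("warn", "Page identity - no snapshot")

    total = snapshot.get("total_checked", 0)
    ok = snapshot.get("ok_count", 0)

    if snapshot.get("identity_mismatch_count", 0) > 0:
        return ("fail", f"Page identity {ok}/{total} - fall-through detected")
    if snapshot.get("unreachable_count", 0) > 0 or snapshot.get("weak_matches"):
        return ("warn", f"Page identity {ok}/{total} - needs review")
    return ("ok", f"Page identity {ok}/{total} - no fall-through")
```

Treating a missing snapshot as warn rather than ok keeps a dead cron from showing up as a green card.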

## Health hygiene (Op rule 5)

- **Em-dash sweep**: pending
- **audit-fakeproof**: 0 hard / 0 soft (CLEAN)
- **audit-adoptability-drift**: 244 matched, 0 drift, 2 partial-build
- **audit-page-identity**: 245/245 ok, 0 mismatch, 0 unreachable (NEW iter 97)
- **Health-check**: 77/77 passing
- **All structured-data**: maintained

## Status snapshot

- 244 scored products + 2 partial builds (Director-WIP)
- 246 build pages with index.html, 0 broken, **245 verified by page-identity audit**
- 0 fake-proof findings, 0 Adoptability score drift, 0 page-identity fall-throughs
- 12 essays + Read-next + JSON-LD on each
- 8 high-trust pages with JSON-LD durable
- /factory/catalog/ with CollectionPage + 244-item ItemList
- 244 /builds/ pages with PNG OG + Product schema
- 271 OG PNG images
- 5 transparency surfaces + 96 styled ship-log detail pages
- /quality-report/ surfaces 6 live-check cards (added Page identity iter 97)
- 26 hand-polished products
- **12 content invariants** (NEW iter 97: Page identity)
- 77/77 health endpoints, 2320 sitemap URLs
- **134+ cron jobs** (new iter 97: audit-page-identity at :26,:56)
- 50 min cadence active

## Iter 97 throughput note

2 substantive ships at 50-min cadence. Ship 1 was the audit itself; Ship 2 was the full transparency wiring. The audit took 1.8s to run on 245 pages - very fast, room to extend to 300+ pages or shorter cadence if needed.

## The 12 content invariants at iter 97

1. No Unicode em-dashes (15-min sweep)
2. No HTML-entity em-dashes (extended sweep iter 61)
3. No tagline equals product name (adoptability-score.py)
4. No name equals slug (same validation)
5. No stale product counts in generator output (iter 63)
6. No fake-proof claims in bulk-generated content (iter 66)
7. No fake-proof claims in FAQ subpages (iter 67)
8. No fake-proof claims in pricing subpages (iter 68)
9. No fake-proof claims in enterprise template (iter 68)
10. No skeleton-broken pages (iter 58)
11. No Adoptability score drift between Product JSON-LD and adoptability.json (iter 93)
12. **No page-identity fall-through on any /builds/<slug>/ URL (iter 97)**

## Running queue (top 5 for iter 98)

1. **Cadence step to 60 min** - iters 92-97 averaged ~2-3 ships per iter, so a step is plausible.
2. **page-identity extension to /unlock/, /adopt/, /feedback/, /vs/ pages** - currently only audits /builds/. Each is a buyer-touching surface.
3. **Investigate the Director's polish-pass empty-write bug** - the same bug that broke brief-ai might still be live.
4. **Adoptability scoring catch-up for brief-ai** - should auto-fix on next cron, but verify.
5. **13th essay** - skip until queue has a fresh candidate.

## Cumulative iter 1-97

- **Catalog**: 244 scored + 2 partial, 246 with index.html
- **Content library**: 12 essays + Read-next + 271 OG PNGs + 96 styled ship-log pages
- **High-trust pages**: 8 foundational + 5 transparency surfaces with full JSON-LD
- **Source durability**: 23+ generators (added audit-page-identity iter 97) + 6 regen scripts auto-call injectors + 4 JSON snapshots (fakeproof, drift, page-identity, health) + 134+ cron jobs
- **Content invariants**: **12 defended** + audits-named-publicly + partial-build state visible + page-identity verification

The audit suite is now complete enough to catch the "200 OK but wrong content" class of failures that the iter-96 brief-ai discovery surfaced. Time-to-detect on a future regression: at most 30 minutes (one audit cron cycle). The combination of audit-fakeproof + audit-adoptability-drift + audit-page-identity covers content correctness across three dimensions: claim integrity, score sync, and identity match.