Docs Readiness Audit v5.0 changelog and validation plan

Four things changed today. First, a Docs Readiness Audit rubric refinement: Cat 7 split into 7a + 7b, and meta-description moved from Cat 4 to Cat 8. Second, a more specific methodology page: the citation-prediction boundary is now stated, the Score-deterministic / Story-LLM-written separation is explicit, hard-cap trigger conditions are inlined, references and conventions are split, and known limitations are listed. Third, the Docs Validation Study begins, with the first cohort run scheduled for 2026-05-11. Fourth, Site Readiness Audit v1.0, the consumer-neutral baseline for crawlability, semantic structure, and raw-HTML availability, enters internal preview.

This post is the narrative changelog explaining the v5.0 change. The methodology page is the canonical scoring reference.

At a glance

  • What changed: Cat 7 (10 pts in v4.3) splits into Cat 7a — Architecture for Traversal (4 pts) and Cat 7b — Payload Efficiency (6 pts). Meta-description moves from Cat 4 to Cat 8 (Cat 4: 12 → 11 pts; Cat 8: 9 → 10 pts). Total weight unchanged at 100; a short crosswalk sketch follows this list.
  • What didn’t change: the universal categories (Cat 1, 5, 6) and their hard caps; Cat 2 and Cat 3 weights; the Score scale; the Score + Story contract; the determinism property; the Lightning ceiling.
  • Product: Site Readiness Audit v1.0 enters internal preview — consumer-neutral baseline.
  • Research: Docs Validation Study v1 first run scheduled 2026-05-11; results publish after run completion and review.
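
The arithmetic is small enough to state in code. A minimal sketch of the v4.3 → v5.0 weight crosswalk, restricted to the touched categories; the dictionary keys are hypothetical labels, not Obaron's internal identifiers, and the canonical delta table lives in the methodology changelog:

```python
# Hypothetical weight maps for the categories v5.0 touches.
# Keys are illustrative labels, not Obaron's internal identifiers.
V4_3 = {"cat4_aeo": 12, "cat7_arch_and_payload": 10, "cat8_identity": 9}
V5_0 = {
    "cat4_aeo": 11,        # meta-description moved out (12 -> 11)
    "cat7a_traversal": 4,  # provisional weight, pending Validation Study data
    "cat7b_payload": 6,    # provisional weight, pending Validation Study data
    "cat8_identity": 10,   # meta-description moved in (9 -> 10)
}

# The split and the move are weight-neutral: the touched categories sum to
# the same 31 points in both versions, so the 100-point total is unchanged.
assert sum(V4_3.values()) == sum(V5_0.values()) == 31
```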

Why we split Cat 7

v4.3 scored site architecture and payload efficiency as a single 10-point bucket. Traversal signals (sitemap, BreadcrumbList, internal-link density) and payload signals (raw-HTML text-to-HTML ratio, visible-text floor) measure different failure modes, with different owners and different remediation paths. Bundling them obscured both diagnostics. Splitting them gives traversal and payload independent point losses, which makes the Story actionable instead of vague.
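
To make the payload side concrete: a minimal sketch of one Cat 7b-style signal, a raw-HTML text-to-HTML ratio computed from the raw fetch only. The extraction approach and any threshold hung off the ratio are illustrative assumptions, not the rubric's scoring code:

```python
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collects visible text from raw HTML, skipping script/style payloads."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)

def text_to_html_ratio(raw_html: str) -> float:
    """Visible-text length over total raw-HTML length, raw fetch only."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    visible = " ".join("".join(parser.chunks).split())
    return len(visible) / max(len(raw_html), 1)

# A script-heavy shell scores low; a low ratio is a payload-side point loss.
shell = "<html><head><script>window.__APP__={};</script></head><body><p>Install guide</p></body></html>"
print(f"{text_to_html_ratio(shell):.2f}")
```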

The combined weight stays at 10 because the relative re-evaluation of the two split categories against a real cohort is a Validation Study output, not a v5.0 deliverable. Both 7a and 7b weights are explicitly provisional; they move when the data lands.

The structural delta table is in the methodology changelog.

Why meta-description moved

Cat 4 — AEO Readiness narrows in v5.0 to signals specifically about answer-span extractability: question-style headings, concise direct answers in the paragraph immediately following each question heading, list and table structure. Identity signals — title quality, Organization schema, canonical URLs, Open Graph basics — already lived in Cat 8. Meta-description belongs with the other identity signals. Cat 4 lost one point; Cat 8 gained one; total stayed flat.
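
As a rough illustration of the answer-span pattern Cat 4 now targets, a sketch that pairs question-style headings with the paragraph immediately after them, assuming a markdown source; the '?'-suffix heuristic and the length cap are assumptions for illustration, not the rubric's actual checks:

```python
import re

def answer_spans(markdown: str, max_chars: int = 400) -> list[tuple[str, str]]:
    """Pair each question-style heading with the paragraph right after it.

    Sketch heuristic: a heading qualifies if it ends in '?', and its answer
    span is the next non-empty block. The real rubric checks more than this.
    """
    blocks = [b.strip() for b in re.split(r"\n\s*\n", markdown) if b.strip()]
    pairs = []
    for i, block in enumerate(blocks):
        if block.startswith("#") and block.endswith("?") and i + 1 < len(blocks):
            pairs.append((block.lstrip("# "), blocks[i + 1][:max_chars]))
    return pairs

doc = "## How do I install the CLI?\n\nRun the installer from the releases page.\n"
print(answer_spans(doc))  # [('How do I install the CLI?', 'Run the installer ...')]
```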

Reconciling a v4.3 Score

Customers who ran an audit under v4.3 can map their prior Score to v5.0 categories using the structural crosswalk: a side-by-side category mapping with structural deltas isolated, published alongside the changelog entry on /methodology. Per-customer numeric-Score reconciliation across versions is not part of v5.0; if a future version cut affects paid customers, a numeric reconciliation will be authored against persisted scan data at that time.

Site Readiness Audit v1.0 — internal preview

The Docs Readiness Audit is the calibrated v5.0 product. Underneath it sits a smaller product we are also building in public: the Site Readiness Audit v1.0 — universal hygiene only, scored against Cat 1 AI Crawlability & Access, Cat 5 Content Structure, and Cat 6 SSR & AI Rendering. No per-consumer calibration. It is designed to become the consumer-neutral baseline anyone can test their site against, and the floor the per-consumer rubrics build on top of.

Status: internal preview. The order is intentional: the consumer-neutral baseline ships, the calibrated rubrics layer on top, and the Validation Study begins measuring whether the assumptions behind the per-consumer weights hold up against observed fetch and content-recovery outcomes.

Docs Validation Study v1 — first cohort run scheduled 2026-05-11

The Docs Validation Study v1 measures two site-side outcomes against the audit cohort: fetcher success rate (ObaronBot) and content-recovery rate. The run uses the same raw-HTML-only fetch path as the audit, so the result tests Obaron’s observable site-side assumptions rather than hidden AI-engine behavior.

Two metrics. Both site-side. Both reproducible once the cohort URLs and canonical questions are frozen. Neither dependent on a proprietary AI engine’s behavior.

  1. Fetcher success rate (ObaronBot). Given a URL in the cohort, can a monitored fetcher retrieve the page’s main content from raw HTML at HTTP 200, with no robots-block and no client-side-rendering wall? AI consumers using different user-agents may see different results; the v1 study does not characterize that variance.
  2. Content-recovery rate. Each cohort page is paired with 3–5 canonical questions a real user would ask the docs to answer. Questions are authored before scoring, frozen in the study protocol, and written from documented DevTools user-intent patterns rather than from the observed scan result. A question passes when the answer text is present in raw HTML and locatable via heading or near-heading prose. A page passes when ≥ 80% of its canonical questions resolve. A sketch of both pass checks follows this list.
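
A minimal sketch of both pass checks under the definitions above. The helper names, the robots handling, the 200-character CSR-wall heuristic, and substring answer matching are illustrative assumptions, not ObaronBot's implementation or the frozen protocol:

```python
import re
import urllib.request
import urllib.robotparser
from urllib.parse import urlparse

USER_AGENT = "ObaronBot"  # the study fetcher; other UAs may see different results

def fetch_raw_html(url: str) -> str | None:
    """Metric 1 sketch: raw HTML at HTTP 200, robots-allowed, no CSR wall."""
    parts = urlparse(url)
    robots = urllib.robotparser.RobotFileParser(
        f"{parts.scheme}://{parts.netloc}/robots.txt")
    try:
        robots.read()
        if not robots.can_fetch(USER_AGENT, url):
            return None  # robots-block
        req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        with urllib.request.urlopen(req, timeout=10) as resp:
            if resp.status != 200:
                return None
            raw_html = resp.read().decode("utf-8", errors="replace")
    except OSError:  # network failure, timeout, or HTTP error
        return None
    # Crude CSR-wall heuristic: strip tags and require some visible text.
    visible = re.sub(r"\s+", " ", re.sub(r"<[^>]+>", " ", raw_html)).strip()
    return raw_html if len(visible) >= 200 else None

def page_passes_recovery(raw_html: str, canonical_answers: list[str]) -> bool:
    """Metric 2 sketch: pass when >= 80% of the page's frozen canonical
    questions have their answer text findable in the raw HTML. Substring
    matching stands in for 'locatable via heading or near-heading prose'."""
    hits = sum(answer in raw_html for answer in canonical_answers)
    return hits / len(canonical_answers) >= 0.80
```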

Cohort: DevTools docs sites. The study protocol publishes before the cohort run. The protocol includes cohort-selection criteria, frozen cohort-list timestamp, question-authorship rule, pass/fail thresholds, exclusion rules, and per-site result treatment. The cohort list is frozen and timestamped before the run, then published with the results alongside per-site outcomes.

What the v1 study does not measure: citation frequency in real AI engines (out of scope), Score correlation, per-engine variance, and statistical significance; v1 is an exploratory cohort run, not a powered statistical study.

The result line on /methodology will use this format: Docs Validation Study v1 — 2026-05-11. Cohort: n=[N] DevTools docs sites. Fetcher success (ObaronBot): [X/N], [%]. Content-recovery (≥80% threshold): [X/N], [%].
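
Because the format is frozen before any numbers exist, rendering it is mechanical. A sketch with loudly-placeholder values; nothing here is a result:

```python
def result_line(run_date: str, n: int, fetch_pass: int, recovery_pass: int) -> str:
    """Render the frozen /methodology result-line format."""
    return (
        f"Docs Validation Study v1 — {run_date}. "
        f"Cohort: n={n} DevTools docs sites. "
        f"Fetcher success (ObaronBot): {fetch_pass}/{n}, {fetch_pass / n:.0%}. "
        f"Content-recovery (≥80% threshold): {recovery_pass}/{n}, {recovery_pass / n:.0%}."
    )

# Placeholder values for illustration only; real numbers publish after the run.
print(result_line("2026-05-11", 20, 17, 14))
```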

Self-pilot this week

The self-pilot uses the Site Readiness Audit baseline against Obaron’s buyer-facing pages: homepage, methodology, pricing, docs hub, selected docs articles, and selected blog pages. The pilot’s purpose is calibration: shake out edge cases in the scoring before it scores someone else’s site, and put a Score on Obaron itself as part of earning trust in the audit. The self-pilot result publishes after run completion and review alongside the Docs Validation Study v1 results.

Why this shape

A measurement framework that never tests its assumptions is just positioning. The validation study is how Obaron starts turning the rubric into evidence. The Site Readiness Audit is the consumer-neutral baseline the rest stands on. The Docs Readiness Audit is the calibration on top. The Docs Validation Study starts the work of putting numbers behind whether the calibration earns its weight.

For the rubric itself, see the methodology page. For the v5.0 structural-changes table, see the methodology changelog.


Cite this version: Obaron Docs Readiness Audit v5.0. Published 2026-05-04. First validation run scheduled 2026-05-11. Canonical methodology: /methodology/. Changelog: /blog/methodology-v5-changelog/. Study protocol/results: /methodology/docs-validation-study/ — protocol pending publication before 2026-05-11.