In v4.3 of the Docs Readiness Audit rubric, Category 7 covered “narrative shape and structured-data signals” as a single 10-point bucket. When we rebuilt for v5.0, we split it into two distinct categories: 7a — Architecture for traversal (4 pts) and 7b — Payload efficiency (6 pts).
Here is why.
The failure modes are different
Architecture for traversal failures look like this: sitemap absent, BreadcrumbList schema missing, internal links sparse. An AI agent navigating the docs hits dead ends. It can’t follow the content graph because the graph isn’t declared.
Payload efficiency failures look like this: the raw HTML returned to an AI crawler is 90% JavaScript bundle and 10% content. An agent fetches the page and gets back markup that yields almost nothing extractable. The content exists — it’s just not in the payload.
These two failure modes have different owners, different remediation paths, and different severity profiles. A site can have excellent traversal signals and terrible payload efficiency, or vice versa. Combining them into one score obscures the diagnostic.
What the combined score hid
Under v4.3, a 7/10 in Cat 7 told you something was off but couldn’t tell you which half. The questions evaluators were actually asking while scoring were:
- Can an AI agent navigate from page to page through declared structure?
- Does the raw HTML payload contain extractable content, or is it mostly chrome?
These are independent questions. Collapsing them into one score produced a proxy for “something is probably wrong in this area” without directing remediation effort.
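A toy sketch makes the diagnostic problem concrete. The site names and sub-scores below are invented for illustration, not audit data; the point is that two very different failure profiles collapse to the same v4.3 number:

```python
# Two hypothetical sites, each scoring 7/10 under the combined v4.3 bucket.
# Sub-scores are (traversal out of 4, payload out of 6), i.e. v5.0's 7a/7b.
sites = {
    "site_a": {"traversal": 4, "payload": 3},  # navigable, bloated payload
    "site_b": {"traversal": 1, "payload": 6},  # lean payload, no nav graph
}

for name, s in sites.items():
    combined = s["traversal"] + s["payload"]  # the v4.3 view: identical 7s
    print(name, combined, s)
```

Both rows print a combined 7, yet the remediation work (and its owner) is entirely different for each site.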
How 7a and 7b separate the signal
7a — Architecture for traversal (4 pts) measures the declared navigation layer: sitemap presence and quality, BreadcrumbList schema, and internal-link density across the primary doc pages. These signals let AI coding agents traverse the content graph the way humans navigate menus. If they’re absent, an agent can complete one task but can’t reliably reach adjacent answers.
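As an illustration of one 7a signal, a minimal presence check for BreadcrumbList JSON-LD might look like the sketch below. The function name is hypothetical, and the real audit also covers sitemap quality and internal-link density; this shows only the schema-presence test, using a simplified regex rather than a full HTML parser.

```python
import json
import re

def has_breadcrumb_schema(html: str) -> bool:
    """Return True if any JSON-LD block in the page declares BreadcrumbList.

    Hypothetical helper: a sketch of one 7a check, not the rubric's code.
    """
    # JSON-LD blocks live in <script type="application/ld+json"> tags.
    pattern = r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>'
    for block in re.findall(pattern, html, flags=re.DOTALL | re.IGNORECASE):
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # malformed JSON-LD is its own finding, not this one
        items = data if isinstance(data, list) else [data]
        if any(isinstance(i, dict) and i.get("@type") == "BreadcrumbList"
               for i in items):
            return True
    return False

page = '''<html><head>
<script type="application/ld+json">
{"@type": "BreadcrumbList", "itemListElement": []}
</script></head><body>docs</body></html>'''
print(has_breadcrumb_schema(page))  # True
```

A page that fails this check isn’t necessarily unnavigable for humans; it just hasn’t declared its position in the content graph in a form an agent can read.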
7b — Payload efficiency (6 pts) measures the text-to-HTML ratio of raw fetched pages. We score the ratio across primary pages and surface a finding when the raw HTML falls below a floor of visible text. Sites whose real content happens to be delivered in a heavy template don’t get penalized; the check targets pages where AI consumers genuinely can’t extract what isn’t there.
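The ratio itself is straightforward to compute. The sketch below, using only Python’s standard library, is an assumption about the general shape of such a check, not the audit’s implementation; the `FLOOR` value is illustrative and is not the rubric’s actual threshold.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style/noscript contents."""
    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.depth = 0   # nesting depth inside skipped tags
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:
            self.chunks.append(data)

def text_to_html_ratio(raw_html: str) -> float:
    parser = TextExtractor()
    parser.feed(raw_html)
    # Normalize whitespace so formatting indentation doesn't inflate the count.
    visible = " ".join("".join(parser.chunks).split())
    return len(visible) / max(len(raw_html), 1)

FLOOR = 0.10  # illustrative threshold, not the rubric's actual floor

page = "<html><script>var x = 1;</script><body>Short doc text.</body></html>"
print(text_to_html_ratio(page))
```

A JS-bundle-dominated page drives this ratio toward zero even when the rendered page is rich, which is exactly the failure mode 7b is meant to catch.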
The 6:4 weighting reflects the asymmetry in impact: payload efficiency failures affect every AI consumer who fetches the page; traversal failures affect agents specifically (crawlers can still index what they reach). Both matter; they’re not equal.
Why both are provisional
7a and 7b are marked provisional in v5.0. The weights are informed by current audit data, but the sample is small. As the audit corpus grows, we’ll have better signal on whether 4 and 6 reflect the actual remediation effort distribution. If traversal fixes consistently produce larger score lifts than payload fixes — or vice versa — the weights will adjust at the next version cut.
Provisional means the methodology is live and the categories are scored, but the weights carry a confidence interval we’re not hiding. The full rubric and current weights are at /methodology.