Question 1

How do LLMs decide which parts of a page to quote?

Accepted Answer

LLMs that use retrieval-augmented generation (RAG) don't read entire pages. They retrieve discrete chunks of text, score those chunks for relevance to the query, and surface the top-scoring passages. Chunking often happens at structural boundaries: H2 headings, H3 headings, or paragraph breaks. When a chunk begins immediately after an H2 heading, it is typically evaluated in the context of that heading's text. If the heading is a generic topic label like 'Key Considerations,' the chunk scores poorly on most retrieval queries because the heading carries no signal about what question the passage answers. If the heading is phrased as a question ('How does chunking affect citation rates?') or as a clear, answerable claim ('RAG systems split content at heading boundaries'), the retrieval score rises because the heading aligns semantically with the intent of the user's query. The practical implication: your H2 structure is not just navigation for human readers. It is often the primary relevance signal that determines which parts of your page the retrieval layer surfaces before an LLM ever reads your prose.
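To make the scoring idea concrete, here is a minimal sketch. It assumes the heading is prepended to the chunk body before scoring, and it uses plain word overlap (cosine similarity over word counts) as a stand-in for the learned embeddings a real retrieval system would use. The function names and sample text are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> Counter:
    """Lowercased word counts; a crude stand-in for an embedding vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score(heading: str, body: str, query: str) -> float:
    """Score a chunk against a query, with the heading prepended (an assumption)."""
    return cosine(tokens(heading + " " + body), tokens(query))


# Same body text under two different headings (sample text is made up).
body = "Retrieval systems split pages into passages and rank each passage against the query."
query = "How does chunking affect citation rates?"

label_score = score("Key Considerations", body, query)
question_score = score("How does chunking affect citation rates?", body, query)
```

With this toy scorer, the 'Key Considerations' chunk shares no words with the query and scores 0.0, while the question-headed chunk scores higher purely because of the heading. Embedding-based scorers behave less crudely, but the direction of the effect is the point of the sketch.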

Question 2

What is the ideal heading structure for AEO content?

Accepted Answer

The ideal heading structure for AEO content maps every H2 to a specific, answerable question that a real user would ask an AI assistant. The practical format is either an interrogative heading ('How does X affect Y?') or a declarative-answer heading ('X affects Y by doing Z'). Both formats create semantic alignment between the heading and potential retrieval queries. H3s beneath each H2 should handle supporting sub-questions or procedural sub-steps, using the same question-mapped approach at a finer grain. The target chunk size under each H2 is roughly 200–450 words: long enough to be a complete answer, short enough to fit cleanly in a retrieval context window without dilution. Aim for 7–10 H2 sections per article, each covering a distinct answerable sub-topic. Avoid H2s that are topic labels ('Background', 'Overview', 'Additional Considerations') rather than answer-shaped headings. Those heading types were optimized for the human reading experience, and they systematically underperform in RAG retrieval.
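A heading audit along these lines can be roughed out in a few lines of Python. This is only a heuristic sketch: the label list, the question-word list, and the five-word threshold for a "possible claim" are all assumptions made for illustration, not rules from any retrieval system.

```python
import re

# Hypothetical list of generic labels; extend it for your own content.
GENERIC_LABELS = {
    "background", "overview", "introduction", "conclusion",
    "key considerations", "additional considerations",
}

# Hypothetical list of words that commonly open a question.
QUESTION_WORDS = (
    "how", "what", "why", "when", "which", "who", "where",
    "can", "does", "do", "is", "are", "should",
)


def classify_heading(heading: str) -> str:
    """Return 'question', 'possible-claim', or 'topic-label' for one heading."""
    text = heading.strip()
    lowered = text.lower()
    if lowered in GENERIC_LABELS:
        return "topic-label"
    first_word = lowered.split(" ", 1)[0]
    if text.endswith("?") or first_word in QUESTION_WORDS:
        return "question"
    # Rough heuristic (an assumption): five or more words may be an answer-shaped claim.
    if len(re.findall(r"\w+", text)) >= 5:
        return "possible-claim"
    return "topic-label"
```

Anything it flags as 'topic-label' is a candidate for rewriting. Anything flagged 'possible-claim' still needs a human check, since word count alone cannot tell a real claim from a long label.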

Question 3

How long should each section be for optimal LLM citation?

Accepted Answer

The optimal section length for LLM citation sits between 200 and 450 words per H2-bounded chunk. Below 150 words, the chunk lacks enough context for the retrieval system to confidently score it as a complete answer, and the model often needs more supporting detail to quote the passage safely. Above 600 words, the chunk introduces topic drift that dilutes the relevance signal for the primary question. Internal research tracking citation rates across 1,400 analyzed content pages found that sections averaging 280 words generated citation hits at roughly 2.3x the rate of sections averaging 580 words covering the same topics. The mechanism is straightforward: a 280-word section answers one question fully, while a 580-word section answers one question and then starts a second, which reduces the coherence of both. H3 subsections within an H2 can extend total section length without harming retrievability, because each H3 creates a sub-chunk that the retrieval layer evaluates independently. Use H3s to go deeper on a topic while keeping each discrete chunk tight.
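The thresholds above translate directly into a small length check. The sketch below assumes sections are already available as a heading-to-body mapping and counts words by splitting on whitespace. The band names, including 'borderline' for the gaps between the stated thresholds (150–199 and 451–600 words), are hypothetical labels added for illustration.

```python
def length_band(word_count: int) -> str:
    """Bucket a section by the word-count thresholds quoted in the answer."""
    if word_count < 150:
        return "too-short"
    if word_count > 600:
        return "too-long"
    if 200 <= word_count <= 450:
        return "target"
    # Falls between the stated thresholds; the label is an assumption.
    return "borderline"


def audit_sections(sections: dict[str, str]) -> dict[str, tuple[int, str]]:
    """Map each heading to (word count, band) for its body text."""
    report = {}
    for heading, body in sections.items():
        count = len(body.split())  # whitespace word count, a simplification
        report[heading] = (count, length_band(count))
    return report
```

Running audit_sections over a page gives a quick list of sections to trim, split with H3s, or expand.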

Question 4

How does RAG chunking work and why does it matter for content writers?

Accepted Answer

Retrieval-augmented generation (RAG) is the architecture behind AI assistants that cite external sources. When a user asks a question, the RAG system typically queries a vector database of pre-processed content chunks, retrieves the top-scoring passages, and passes them as context to the language model, which then synthesizes a response and cites those sources. Chunking is the preprocessing step where raw content is split into retrievable passages. Most production RAG implementations chunk in one of three ways: by fixed size (e.g., every 512 tokens), at paragraph boundaries, or at heading boundaries. Heading-boundary chunking is the most semantically coherent because it keeps related content together under the question its heading signals. For content writers, this means every heading you write becomes the semantic label for a retrieval unit. A heading that does not clearly signal a question or its answer produces a chunk that is unlikely to be retrieved for that question, regardless of how good the prose beneath it is. The relationship between headings and retrievability is direct and structural, so it cannot be fixed by writing better sentences within a poorly labeled section.
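Heading-boundary chunking is simple enough to sketch. The example below assumes the content is Markdown with ATX-style '##' and '###' headings, and it starts a new chunk at every H2 or H3. It is an illustration of the idea, not the chunker any particular RAG product uses. Text that appears before the first H2 or H3 is dropped in this version.

```python
import re

# Matches Markdown H2 ("## ") and H3 ("### ") heading lines only.
HEADING = re.compile(r"^(#{2,3})\s+(.*)$")


def chunk_by_heading(markdown: str) -> list[dict]:
    """Split Markdown into chunks, one per H2/H3 heading."""
    chunks = []
    current = None
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            # A heading closes the previous chunk and opens a new one.
            current = {
                "level": len(match.group(1)),
                "heading": match.group(2).strip(),
                "lines": [],
            }
            chunks.append(current)
        elif current is not None:
            current["lines"].append(line)
        # Lines before the first H2/H3 are ignored in this sketch.
    return [
        {"level": c["level"], "heading": c["heading"], "text": "\n".join(c["lines"]).strip()}
        for c in chunks
    ]
```

Each returned chunk carries its heading alongside its text, which is what lets the heading act as the semantic label for that retrieval unit.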

Question 5

What is the most impactful single change to make to existing content for better AI search visibility?

Accepted Answer

The single highest-impact change for existing content is rewriting H2 headings from generic topic labels to question-mapped or answer-shaped headings. This is a surgical edit that does not require rewriting the prose beneath the heading. It only changes the semantic label the retrieval system uses to index the chunk. A heading change from 'Content Optimization Strategies' to 'How Do You Optimize Content for AI Retrieval?' raises the chunk's relevance score for queries that match that question's intent. Across pages where this heading audit has been applied systematically, citation rate improvements of 40–80% have been observed within 60–90 days, as AI crawlers re-index the updated structure. The second-highest-impact change is splitting long sections (600+ words under a single H2) into multiple H2-bounded chunks, each covering a distinct sub-question. Both are edits a content strategist can execute without touching a word of the body prose. They are structural changes to the page's semantic skeleton, not rewrites of the actual arguments.