The audience you did not design for
Your product is no longer just a destination. It is a source.
For most of the web’s history, that distinction did not exist. Designing a product meant designing for a person who would visit it — someone who types a URL, clicks a search result, navigates your pages, and interacts with your content on your terms.
That assumption is breaking.
Today, a growing share of your audience never visits your product. They encounter your information through an intermediary: a search engine that summarizes your content in a generative snippet, an AI assistant that cites your data in a conversation, a research agent that extracts your structured information and presents it alongside competitors.
Between 60% and 68% of Google searches now end without a click to any website. Google AI Overviews — the feature that synthesizes answers directly in search results — appear in nearly half of all queries, up from 13% in March 2025.[^1] And the organizations that treat this as a design and governance problem — not an SEO tactic or an AI feature — will be the ones that retain authority over how their information is understood.
What changes when you are a source
When someone visits your product, you control the experience. You control the hierarchy, the framing, the sequence, the emphasis. You decide what is shown first, what is grouped together, what is compared, and what is left out.
When an intermediary uses your product as a source, all of that control disappears. The intermediary decides the framing. It decides what to extract, what to summarize, what to combine with other sources, and what to attribute.
This is already happening at scale:
- Search engines generate summaries from your content and present them above your link — with organic click-through rates dropping 61% on queries where AI Overviews appear.[^2]
- AI assistants answer questions using your data and may or may not link back.
- Research agents crawl your structured information and fold it into reports that never mention your brand.
- Aggregators repackage your content with their own editorial framing.
The downstream effect has been dramatic. Google’s AI Overviews caused a 33% global organic search traffic drop for publishers by November 2025.[^3]
You can build the best product in your category and still lose the narrative if your content is structured for visitors but not for intermediaries.
The trust problem
For institutions — public agencies, regulatory bodies, knowledge organizations — this is more than a traffic concern. It is a trust and authority problem.
I saw this directly in a project for a European public institution that maintains a curated technology catalog for a specialized audience. The information was authoritative — maintained by domain experts to institutional standards. But it was structured for visitors navigating the portal, not for the search engines and AI assistants that were increasingly becoming the first point of contact with that content.
The risk was concrete: an AI assistant could summarize the institution’s catalog, strip the caveats and provenance, combine it with less authoritative sources, and present the result as a neutral answer. The institution’s authority would be borrowed without being credited. The user would not know where the information came from or how reliable it was.
Research auditing generative search engines confirms this is systemic — responses from AI systems are “fluent and appear correct” but frequently fail standards for full verifiability, with sourcing that skews toward news outlets regardless of domain authority.[^4]
The institution did not risk losing the information. It risked losing the framing. And in knowledge-intensive domains, framing is trust.
The key design decision in that project was to define “AI readiness” as governance and structure — templates, semantics, and exports — rather than as adding an AI feature: a pragmatic path to remaining an authoritative reference in an intermediary-mediated world, without betting on a chatbot.
Why “adding an AI feature” is not the answer
When this problem becomes visible, the reflex is often to add an AI feature: a chatbot, a search copilot, a “smart” summary layer on top of the existing product.
This can help, but it misses the structural issue. The problem is not that your product lacks AI capabilities. The problem is that your content is not structured for reuse by systems you do not control.
Adding a chatbot to your product gives you one more channel. Structuring your content for governance gives you control over every channel — including the ones that have not been built yet.
The distinction matters:
- A chatbot answers questions within your product. Content governance ensures your information is accurately represented wherever it appears.
- A chatbot requires maintenance, prompts, and ongoing tuning. Content structure, once established, works passively across any system that consumes it.
- A chatbot is a feature. Governance is a layer.
There is a counterintuitive opportunity in this shift. Being cited by AI intermediaries does not just preserve your traffic — it amplifies it. AI referral traffic converts at 4 to 9 times the rate of traditional organic search, and brands cited in AI-generated answers see significantly more engagement than those that are not.[^5] The goal is not to survive intermediation. It is to be the source intermediaries prefer.
What “structured for reuse” actually means
Structuring content for reuse is not a technical project. It is a design and governance project. It requires deciding — before someone else decides for you — how your information should be understood, attributed, and bounded when it leaves your product.
The emerging disciplines describe this shift in strategic terms:
| Approach | What it optimizes for | Key signals | What you measure |
|---|---|---|---|
| SEO | Ranking in a list of links | Keywords, backlinks, page authority | Click-through rate, traffic |
| AEO | Being cited in direct answers | Schema, Q&A structure, factual density | Citation share, answer box presence |
| GEO | Feeding the LLM generation pipeline | Semantic embeddings, structured data, authority signals | Context inclusion in model outputs |
Where traditional SEO optimized for ranking, Answer Engine Optimization (AEO) optimizes for citation, and Generative Engine Optimization (GEO) optimizes for being the source an AI synthesizes from.[^6] The underlying mechanism is retrieval-augmented generation (RAG) — the architecture most AI assistants use to answer factual questions. Structured, clearly authored content is easier to retrieve, easier to quote accurately, and more likely to be cited.
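To make the retrieval step concrete, here is a minimal sketch of that mechanism. Plain token overlap stands in for the embedding similarity a real RAG pipeline would use, and every chunk, name, and date below is invented for illustration — the point is only that metadata attached to the retrievable unit travels with the answer.

```python
# Minimal retrieval sketch: bag-of-words overlap stands in for the
# embedding search a real RAG pipeline would use. All content invented.

def tokenize(text):
    return set(text.lower().split())

def retrieve(query, chunks):
    """Return the chunk whose text best overlaps the query tokens."""
    q = tokenize(query)
    return max(chunks, key=lambda c: len(q & tokenize(c["text"])))

chunks = [
    {
        "text": "Comparison of workflow engines in category Y.",
        "maintainer": "Example Institution",  # provenance rides with the chunk
        "last_verified": "2025-11-01",
    },
    {
        "text": "Our blog: thoughts on many topics.",
        "maintainer": None,
        "last_verified": None,
    },
]

best = retrieve("workflow engines comparison", chunks)
# The synthesized answer can now carry attribution, because the
# provenance was attached to the unit that got retrieved.
print(best["maintainer"], best["last_verified"])
```

An unlabeled paragraph can win retrieval too — but then the intermediary has nothing to attribute, which is exactly the failure mode described above.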
In practice, structuring for reuse means four things.
Semantic clarity
Each piece of content should have a defined type, scope, and relationship to other content. Not just “a page about X” but “a comparison of technologies in category Y, maintained by authority Z, last verified on date W.” When intermediaries consume this information, they inherit the structure — which reduces the chance of misrepresentation.
This means implementing structured data. Pages with correct schema markup earn up to 40% more rich-result impressions than unmarked pages, and pages with schema-aligned structure earn significantly higher citation rates in AI-generated responses. Fewer than a third of websites currently implement schema beyond the basics — a gap that translates directly into citation surface lost to competitors.[^7]
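As a sketch of what “type, scope, and relationship” looks like in practice, here is the JSON-LD a page like the hypothetical “comparison in category Y, maintained by authority Z, last verified on date W” example could carry. The names and dates are placeholders; the property names are standard schema.org vocabulary.

```python
import json

# Sketch of a schema.org Article block with the authority signals the
# text describes. "Authority Z" and the dates are invented placeholders.
schema = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Comparison of technologies in category Y",
    "about": "Category Y",
    "author": {"@type": "Organization", "name": "Authority Z"},
    "dateModified": "2026-01-15",  # the "last verified on date W" signal
    "isPartOf": {"@type": "WebSite", "name": "Authority Z Catalog"},
}

# Embedded in the page head as <script type="application/ld+json">,
# this is what an intermediary extracts instead of guessing from prose.
print(json.dumps(schema, indent=2))
```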
Exportable metadata
The information that gives your content authority — provenance, update dates, scope, licensing, methodology — should be machine-readable, not buried in prose. If an AI assistant cannot extract your authority signals, it cannot pass them on to the user. In practice, this is the layer most teams overlook until they discover their carefully caveated content is being cited without its caveats.
Standards are forming. The IPTC Photo Metadata Standard 2025.1 introduced fields specifically for AI-related content provenance. The Coalition for Content Provenance and Authenticity (C2PA), whose members include OpenAI, Meta, Google, and the BBC, provides cryptographic content credentials that make provenance tamper-evident.[^8] These are not yet mainstream for most product teams, but they signal the direction of travel for any organization where provenance is a competitive or regulatory concern.
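One way to operationalize “machine-readable, not buried in prose” is a check that asks, for each page’s metadata, which authority signals an intermediary could actually extract. The field list below is illustrative, not a standard — pick the provenance fields that matter in your domain.

```python
# Sketch of an authority-signal check over a page's JSON-LD metadata.
# The field list is an illustrative assumption, not a standard.
REQUIRED_SIGNALS = ["author", "dateModified", "license", "about"]

def missing_signals(jsonld: dict) -> list:
    """Return the provenance fields an intermediary could NOT pass on."""
    return [f for f in REQUIRED_SIGNALS if not jsonld.get(f)]

page = {
    "@type": "Article",
    "author": {"@type": "Organization", "name": "Example Institution"},
    "dateModified": "2026-01-15",
    # licensing and scope live only in the body text of this page,
    # so any summary of it sheds them silently
}

print(missing_signals(page))  # -> ['license', 'about']
```

Run across a content inventory, a check like this turns “the layer most teams overlook” into a concrete backlog.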
Bounded reuse
Not all information should be freely summarizable. Some content requires context that summaries strip. Some data requires caveats that generative snippets omit. Teams rarely think about this until they find their careful guidance flattened into a confident, unconditional answer.
Technical mechanisms for expressing these preferences are developing. robots.txt remains the baseline, but compliance is voluntary — AI crawlers increasingly bypass it. llms.txt is an emerging proposal that lets content owners specify how their content should be accessed and used by AI models. It is low-effort to implement and creates a documented position that may have legal value as regulations evolve.[^9]
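For orientation, here is what a minimal llms.txt could look like for the kind of curated catalog described earlier, following the proposed format (an H1 title, a blockquote summary, then sections of links). The institution name and URLs are invented.

```markdown
# Example Institution Technology Catalog

> Curated catalog of technologies in category Y, maintained by domain
> experts. Entries carry scope notes, caveats, and verification dates
> that should be preserved when the content is summarized.

## Catalog

- [Methodology and scope](https://example.org/catalog/methodology.md): how entries are selected and verified
- [Entry index](https://example.org/catalog/index.md): all entries with last-verified dates
```

The file lives at the site root, like robots.txt. Support across AI platforms is still uneven, but the cost of publishing it is close to zero.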
The regulatory layer is also moving. The EU AI Act requires machine-readable disclosure metadata for AI-generated content, with enforcement beginning August 2026. California’s provenance standards, effective since March 2025, already require major platforms to disclose provenance data.10 Product teams cannot ignore this compliance dimension for long.
Governance, not just structure
Structure without governance decays. Someone needs to own the question: “How is our information being represented outside our product, and is it accurate?”
This requires new monitoring practices. Only 12% of sources cited across ChatGPT, Perplexity, and Google AI Overviews overlap — which means each platform constructs its own picture of your authority from different parts of your content. ChatGPT only began appending attribution data to citation links in June 2025. Google AI Overviews and most mobile AI referrals still pass no attribution at all.[^11] Tracking “share of voice” in AI-generated responses is becoming as important as tracking organic rankings.
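The monitoring itself can start small. A sketch of the classification step: bucket incoming referrers into AI, other, and direct. The referrer list is an assumption that will need ongoing maintenance as platforms change — and, as noted above, traffic from AI Overviews and most mobile AI surfaces arrives with no referrer at all, so it lands in the direct bucket.

```python
from urllib.parse import urlparse

# Illustrative AI-referrer list; platforms change, so treat this as a
# maintained configuration, not a constant.
AI_REFERRERS = {
    "chat.openai.com", "chatgpt.com",
    "perplexity.ai", "www.perplexity.ai",
    "gemini.google.com", "copilot.microsoft.com",
}

def classify_referrer(referrer_url: str) -> str:
    """Bucket a referrer URL for an AI share-of-voice dimension."""
    host = urlparse(referrer_url).netloc.lower()
    if host in AI_REFERRERS:
        return "ai"
    if host:
        return "other"
    return "direct"  # no referrer: includes most AI Overviews traffic

print(classify_referrer("https://chatgpt.com/"))     # -> ai
print(classify_referrer("https://www.google.com/"))  # -> other
print(classify_referrer(""))                         # -> direct
```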
The parallel to operations design
In *Designing Operations, Not Just Interfaces*, I argued that front-stage interactions and back-stage operations must be designed as one system.
This is the same pattern, extended outward. Your product has a front-stage (the visitor experience), a back-stage (the operations and content production), and now a third layer: the intermediary experience — the information that leaves your product and enters someone else’s system.
That intermediary layer is not monolithic. It bifurcates into two distinct populations:
- Synthesis systems — Google AI Overviews, ChatGPT, Perplexity — that extract and recombine your content in real time to answer questions. These are the systems most product teams are starting to notice, because the traffic impact is visible.
- Agentic systems — autonomous AI agents that crawl, research, and compile reports across multiple sources without human direction at the point of retrieval. These are less visible but growing fast, and they have different content preferences, citation patterns, and compliance obligations.
If you design only the front-stage, you get products that look good but break under operational pressure. If you design the front-stage and back-stage but ignore the intermediary layer, you get products that work well for direct visitors but lose authority and control everywhere else.
The complete design challenge is all three.
Signals that this matters for your product
- Your analytics show declining direct visits but stable or growing brand mentions in AI-generated content.
- Users reference your information but cannot trace it back to your product.
- Competitors are cited alongside you in generative summaries, even when your information is more authoritative.
- Your content is structured for human reading but not for machine extraction — no schema, no structured data, no metadata beyond page titles.
- You have considered adding a chatbot but have not audited how your existing content is being consumed by external systems.
If any of these are familiar, here is a practical starting point:
- Audit your content for machine readability. Can an AI system identify what each page is about, who authored it, when it was last verified, and what its scope is — without reading the full text?
- Implement schema markup for your key content types. Article, FAQPage, Organization, Person, and HowTo schemas are the highest-leverage starting points.
- Add llms.txt. Low effort, documented preference position — even if AI platform support remains incomplete.
- Track your AI share of voice. Set up custom dimensions in your analytics to capture known AI referrers. Treat citation frequency as a KPI alongside traffic.
- Assign ownership of the intermediary layer. The question “How is our information being represented outside our product?” needs a named owner — whether that lives in product, content, or SEO is less important than the accountability being explicit.
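The first item on that list — the machine-readability audit — can be sketched in a few lines: given a page’s HTML, report which authority signals a machine can extract without reading the prose. The parsing here is deliberately naive (a regex over one known tag shape), enough for a quick audit pass, not a production crawler, and the sample page is invented.

```python
import json
import re

# Naive JSON-LD extraction for a quick audit; a real crawler would use
# a proper HTML parser and handle multiple script blocks.
JSONLD_RE = re.compile(
    r'<script type="application/ld\+json">(.*?)</script>', re.DOTALL
)
CHECKS = ["@type", "author", "dateModified"]

def audit(html: str) -> dict:
    """For each signal, report whether a machine could extract it."""
    m = JSONLD_RE.search(html)
    if not m:
        return {c: False for c in CHECKS}
    data = json.loads(m.group(1))
    return {c: bool(data.get(c)) for c in CHECKS}

page = """<html><head>
<script type="application/ld+json">
{"@type": "Article", "author": {"name": "Example Institution"}}
</script>
</head><body>Last verified: January 2026 (prose only)</body></html>"""

# dateModified lives only in the prose, so the machine misses it.
print(audit(page))
```

Run against a content inventory, the False entries are precisely the authority signals that intermediaries will drop when they cite you.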
The tools and standards are still forming. But the structural shift — from destination to source — is already underway. The organizations that respond to it as a design problem, rather than waiting for a fully baked solution, will have a meaningful head start.
If you are navigating this shift — whether as a content governance question, an AI readiness audit, or a product strategy problem — I would like to hear how it is showing up in your context. Book a call and let’s scope what makes sense for your situation.
References

[^1]: SparkToro/Datos and multiple industry analyses report between 60% and 68% of Google searches ending without a click in early 2026, with the figure reaching 77% on mobile. Google AI Overviews appear in approximately 48% of queries, up from 13% in March 2025. Gartner projects a 25% decline in traditional search engine volume by the end of 2026.

[^2]: Organic click-through rates collapse from 1.76% to 0.61% on queries where AI Overviews appear — a 61% decline. Paid CTR on the same queries drops 68%. AIVO analysis of 25.1 million impressions.

[^3]: Google’s AI Overviews caused a 33% global organic search traffic drop for publishers by November 2025 (AInvest, citing industry data). Digital Content Next member publishers saw median Google referral traffic fall nearly every week across May and June 2025, with losses outpacing gains two to one.

[^4]: Urman, A., & Makhortykh, M. (2025). Generative AI Search Engines as Arbiters of Public Knowledge: An Audit of Bias and Authority. The study audited ChatGPT, Bing Chat, and Perplexity, finding responses are “fluent and appear correct” but frequently fail standards for full verifiability, with sourcing that skews toward news and media outlets regardless of domain authority.

[^5]: AI referral traffic converts at 4 to 9 times the rate of traditional organic search traffic. Brands cited within AI Overviews see 35% more organic clicks and 91% more paid clicks than uncited brands. AI referrals currently represent just over 1% of all web traffic but are growing at 357% year-on-year.

[^6]: Generative Engine Optimization (GEO) emerged from academic work presented at KDD 2024. Answer Engine Optimization (AEO) focuses specifically on making content the preferred source for AI-generated direct answers — optimizing for citation rather than click.

[^7]: Pages with correct schema markup earn up to 40% more rich-result impressions than unmarked pages. Fewer than 33% of websites implement schema beyond the basics. Companies with robust schema strategies report 40–60% higher citation rates in AI-generated responses.

[^8]: IPTC Photo Metadata Standard 2025.1, ratified November 2025, introduced four new XMP fields for AI-related content provenance. The Coalition for Content Provenance and Authenticity (C2PA), whose members include OpenAI, Meta, Google, Amazon, the BBC, and Adobe, provides cryptographic content credentials for tamper-evident provenance verification.

[^9]: llms.txt is an emerging proposal — similar in concept to robots.txt — that lets content owners specify how their content should be accessed and used by AI models. About 21% of the top 1,000 websites include rules for GPTBot in robots.txt as of July 2025, but approximately 13% of AI crawlers were bypassing robots.txt declarations by Q2 2025.

[^10]: The EU AI Act Article 50 requires machine-readable disclosure metadata for AI-generated content, with enforcement beginning August 2026. California’s Provenance, Authenticity and Watermarking Standards Act, effective March 2025, requires major online platforms to disclose provenance data in distributed content.

[^11]: Only 12% of sources cited across ChatGPT, Perplexity, and Google AI Overviews overlap. ChatGPT began appending `utm_source=chatgpt.com` to citation links in June 2025, but Google AI Overviews, AI Mode, and most mobile AI referrals still pass no attribution data.