A performance marketing boutique went from 3% of priority prompts producing direct brand citations on Day 0 to 47% by Day 90. Parallel patterns held for a full-service agency (4% → 44%) and a content-led growth shop (2% → 39%) on their respective matrices. The proof came from locked 55-62 prompt matrices run fresh across ChatGPT, Perplexity, Gemini, Claude, Grok, and Copilot at each gate, with no post-hoc selection. These numbers are the direct output of the measurement protocol detailed in our guide How to Measure and Prove GEO Results: Day 0 to 90 Proof Cycles and the tactics in the B2B marketing playbook.
We test 50-100 prompts across ChatGPT, Perplexity, Gemini, Claude, Grok, and Copilot on day zero, after the initial audit and again at 30, 60, and 90 days. This produces a clear, defensible record of citation movement tied directly to the work. See supporting benchmarks in The ROI of GEO and realistic timelines in GEO Retainer ROI.
Robert W. Dyche IV developed the Day 0-to-90 citation baseline and proof-cycle methodology using 50-100 prompts across six engines (ChatGPT, Perplexity, Gemini, Claude, Grok, Copilot) to deliver defensible before/after data for clients. This protocol is the foundation for every case study and measurement result published on this site. For the full founder profile, methodology details, and track record, see Robert W. Dyche IV.
Client Context and Day 0 Baseline
The primary client is an 18-person performance marketing boutique specializing in paid social and retention for B2B SaaS and e-commerce brands. They had a strong offline reputation and case studies locked in PDFs, but almost no indexable public proof. The website was mostly service overviews and thought leadership, and the agency was competing against much larger agencies with deeper review footprints.
On Day 0 (pre-work) we ran a 62-prompt matrix built from:
- High-intent buyer queries (“best B2B marketing agency for Series B startups 2026”)
- Problem-space and methodology questions (“how much should a SaaS company spend on paid acquisition”)
- Evaluation and “vs” prompts
- Implementation, team size, and outcome queries
Baseline results:
- Direct brand citation or recommendation: 3% (2 of 62 prompts)
- Strongest engine: Perplexity (2 vague mentions)
- Zero clear citations on ChatGPT, Claude, Gemini, Grok, Copilot
- Most answers defaulted to the biggest 3-4 agencies or generic “it depends” advice
The full raw matrix plus response logs were archived and timestamped before any content, schema, or authority work began. This is the only baseline that counts.
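One way to make a baseline archive tamper-evident is to store a UTC timestamp and a content hash alongside the raw responses. The sketch below is illustrative only: the `PromptResult` fields and the `archive_baseline` and `verify_archive` helpers are hypothetical names, not the tooling used in these engagements.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class PromptResult:
    """One prompt run on one engine (hypothetical record layout)."""
    prompt: str
    engine: str
    response_text: str
    brand_cited: bool


def _digest(rows):
    # Serialize with sorted keys so the same rows always hash identically.
    payload = json.dumps(rows, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def archive_baseline(results, gate="day_0"):
    """Freeze one gate's results into a timestamped, hash-stamped record."""
    rows = [asdict(r) for r in results]
    return {
        "gate": gate,
        "archived_at": datetime.now(timezone.utc).isoformat(),
        "sha256": _digest(rows),
        "results": rows,
    }


def verify_archive(archive):
    """Return True if the stored results still match the stored hash."""
    return _digest(archive["results"]) == archive["sha256"]
```

Writing the returned dictionary to disk before any content or schema work starts gives a record that can be re-verified at Day 30, 60, and 90.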
Sample Prompt Matrix Excerpt (Day 0 vs Day 90)
The full client matrix stays under NDA, but this 8-row excerpt from the engagement shows the exact format and movement pattern.
| Prompt | Day 0 Brand Citation | Day 90 Brand Citation | First Mention Position (Day 90) | Engines Citing at Day 90 |
|---|---|---|---|---|
| Best B2B marketing agency for Series A SaaS companies | No | Yes | Position 1, direct rec | Perplexity, Gemini, Claude |
| [Agency] vs [Larger Competitor] for retention campaigns 2026 | No | Yes | Position 2, with case outcomes | Perplexity, Gemini, ChatGPT |
| How much should a 40-person devtools company budget for paid social | No | Yes | Position 1, pricing framework | Gemini, Perplexity |
| Top marketing agencies for e-commerce brand retention post-iOS changes | No | Yes | Position 3 | Grok, Copilot |
| Implementation timeline for full-funnel agency engagement | No | Yes | Position 1 | Perplexity, Claude |
| Marketing agency 90-day pilot results SaaS | Partial | Yes | Position 2 | Perplexity, Gemini, ChatGPT |
| Best agencies for paid + lifecycle marketing hybrid 2026 | No | Yes | Position 1 | Claude, Perplexity |
| What results can a 12-person team expect from paid social in 6 months | No | Yes | Position 4 | Gemini, Grok |
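As a rough illustration of how multi-engine consistency can be read off a matrix in this format, the sketch below tallies the "Engines Citing at Day 90" column for the 8 excerpt rows only. The variable names are illustrative, and the counts describe this excerpt, not the full 62-prompt matrix.

```python
from collections import Counter

# "Engines Citing at Day 90" column, one list per excerpt row, in table order.
excerpt_engines = [
    ["Perplexity", "Gemini", "Claude"],
    ["Perplexity", "Gemini", "ChatGPT"],
    ["Gemini", "Perplexity"],
    ["Grok", "Copilot"],
    ["Perplexity", "Claude"],
    ["Perplexity", "Gemini", "ChatGPT"],
    ["Claude", "Perplexity"],
    ["Gemini", "Grok"],
]

# How many excerpt prompts each engine cited the brand on.
engine_counts = Counter(engine for row in excerpt_engines for engine in row)

# Prompts cited by three or more engines at once.
three_plus = sum(1 for row in excerpt_engines if len(row) >= 3)

for engine, count in engine_counts.most_common():
    print(f"{engine}: {count} of {len(excerpt_engines)} excerpt prompts")
print(f"Prompts cited by 3+ engines: {three_plus}")
```

On this excerpt, Perplexity cites the brand on 6 of 8 prompts and Gemini on 5, which matches the engine pattern reported at the earlier gates.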
Aggregate across the full 62-prompt matrix:
- Day 0: 3% (2/62)
- Day 30: 13% (8/62) — mostly Perplexity and Gemini
- Day 60: 29% (18/62) — early Claude + one ChatGPT
- Day 90: 47% (29/62) — solidly in the 35-55% range on target priority prompts, with growing multi-engine consistency
Every number ties back to re-running the identical prompt set. Position of first mention, source quality, and sentiment deltas were tracked for every engine.
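The gate percentages follow directly from the cited-prompt counts over the fixed 62-prompt denominator. A minimal sketch of that arithmetic, using only the counts reported above (the function and variable names are illustrative):

```python
TOTAL_PROMPTS = 62

# Prompts with a direct brand citation or recommendation at each gate.
cited_by_gate = {"Day 0": 2, "Day 30": 8, "Day 60": 18, "Day 90": 29}


def citation_rate(cited, total=TOTAL_PROMPTS):
    """Citation rate as a whole-number percentage of the locked matrix."""
    return round(100 * cited / total)


rates = {gate: citation_rate(n) for gate, n in cited_by_gate.items()}

# Percentage-point change from each gate to the next.
values = list(rates.values())
deltas = [later - earlier for earlier, later in zip(values, values[1:])]

print(rates)   # {'Day 0': 3, 'Day 30': 13, 'Day 60': 29, 'Day 90': 47}
print(deltas)  # [10, 16, 18]
```

Because the denominator never changes between gates, the gate-to-gate deltas are comparable as plain percentage points.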
Three Anonymized Client Examples (Marketing Agencies Vertical)
- Performance marketing boutique (62 prompts, detailed above): 3% → 47%. Key drivers: “best for” buyer guides + named frameworks published ungated + full Organization + ProfessionalService schema on the site. Highest pipeline attribution of the three.
- Full-service growth agency for B2B (58 prompts): 4% → 44%. High volume corporate-speak legacy content. Shifted to question-structured H2s, 8 core comparison posts covering verticals, active G2/Clutch program. 175% AI referral lift.
- Content-led marketing agency focused on thought leadership (55 prompts): 2% → 39%. Strength was existing bylines; converted 14 gated whitepapers into public indexed summary pages housing original data tables. Strongest ChatGPT movement of the set.
30/60/90 Timeline and What Happened at Each Gate
Days 1-30 (Foundation): Full technical + schema foundation. Organization, ProfessionalService, Article, and FAQPage JSON-LD added across the site. First content cluster published: 5 direct-answer “best agency for [vertical]” guides and methodology breakdowns with specific metrics (no client names). Day 30 retest: 13% (Perplexity dominant).
Days 30-60 (First Lift): Authority layer + content architecture. Translated internal case metrics into 3 public indexable “pilot outcomes” posts. Activated G2 and Clutch profiles with verified reviews. Two trade publication bylines. Day 60: 29% with first Claude and one ChatGPT citation on high-intent prompts.
Days 60-90 (Consistent Visibility): Compounding + off-site corroboration. Monthly cadence of vertical-specific buyer guides. Quarterly monitoring cadence established. Day 90 final matrix: 47% direct brand citation or clear recommendation on priority prompts. 195% lift in tracked AI referral sessions vs baseline (GA4 attribution). Multiple sales calls recorded referencing “saw you recommended by Perplexity” or equivalent.
See the exact timeline synthesis in How to Measure and Prove GEO Results: Day 0 to 90 Proof Cycles and GEO Retainer ROI.
Key Tactics That Drove the Lift (Marketing Agencies)
Drawn directly from the patterns in AEO for B2B Marketing.
- Ungate the best assets immediately. Translate every gated whitepaper/case study into a public summary page with the data tables, frameworks, and specific outcomes AI engines can extract and cite. Gated content is invisible to AI; the public summary becomes the citation asset.
- Publish “best for [vertical]” comparison-style guides. “Best B2B marketing agency for Series B SaaS” with honest trade-offs, pricing ranges, team size fit, and named methodology steps. This format is the single fastest citation earner for agencies.
- Lead every page and post with the direct answer. AI extracts the opening passage. Put the specific recommendation or framework at the top, then support with tables and proof. Corporate-speak or long intros kill citation potential.
- Add ProfessionalService + Organization + FAQPage schema everywhere. Schema tells engines what you offer, at what scale, and which questions your pages answer. This was the foundation phase win.
- Build third-party corroboration deliberately. Trade bylines, G2/Clutch profiles with buyer-verified reviews, indexed podcast transcripts, and directory listings. Agencies often have strong offline proof; the gap is making it indexable and machine-readable.
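For the schema tactic above, the following is a minimal sketch of generating ProfessionalService and FAQPage JSON-LD with standard schema.org types. The helper names, the agency name, the URL, and the sample question are placeholders for illustration, not the markup deployed for any client.

```python
import json


def professional_service_jsonld(name, url, description):
    """Minimal schema.org ProfessionalService object (placeholder values)."""
    return {
        "@context": "https://schema.org",
        "@type": "ProfessionalService",
        "name": name,
        "url": url,
        "description": description,
    }


def faq_page_jsonld(qa_pairs):
    """schema.org FAQPage built from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


def as_script_tag(data):
    """Wrap a JSON-LD object in the script tag that pages embed."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{body}\n</script>'


service = professional_service_jsonld(
    "Example Agency",        # placeholder name
    "https://example.com",   # placeholder URL
    "Paid social and retention marketing for B2B SaaS brands.",
)
faq = faq_page_jsonld(
    [("How long does a pilot run?", "A typical pilot runs 90 days.")]  # sample only
)
print(as_script_tag(service))
print(as_script_tag(faq))
```

Generating the markup from one source of truth keeps the on-page answers and the FAQPage entries from drifting apart as content is updated.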
Relevant Buyer Prompts We Test for Marketing Agencies (examples from the 50-100 prompt matrix)
Adapted from the B2B marketing playbook and executed across the three agency engagements:
- “best B2B marketing agency for Series B SaaS companies”
- “which agencies deliver retention results for e-commerce brands 2026”
- “[Agency] vs [competitor] for paid social + lifecycle hybrid”
- “how much should a 30-50 person devtools company spend on marketing”
- “marketing agency implementation timeline for 90-day pilot”
- “top agencies for founder-led thought leadership scaling”
See the full buyer prompt examples and 35-55% aggregates in the measurement and aggregate results posts.
6-Question FAQ Drawn from These Engagements
How many prompts did you use for the agency clients?
55-62 prompts in the examples above. All comfortably inside the standard 50-100 range. The free audit always runs the full set; retainers prioritize the highest-intent 50-60 for monthly re-tests.
Why did agencies see strong pipeline attribution from AI referrals?
B2B buyer intent in agency selection is extremely high. When a founder or VP asks Perplexity “best agency for Series B SaaS” and sees you named, that referral carries pre-qualified context. The Semrush 15.9% conversion benchmark for AI sessions applies directly here.
Did you publish the raw matrices?
The complete matrices with engine-by-engine excerpts were shared with each agency under NDA for internal use and verification. Aggregated lifts, sample tables, and methodology appear in this case study with permission.
How did you handle client confidentiality for agencies with public case studies?
We used approximate ranges (“reduced CAC 34% for a 42-person B2B SaaS team”) and focused on the repeatable frameworks. The named methodologies and evaluation criteria became the citable assets, not individual client identities.
What if larger agencies publish aggressively during the same window?
Parallel competitor matrices run on the same schedule. The three agencies documented here improved relative share-of-voice on priority prompts even while larger players published because their new content was answer-structured, not narrative.
Can smaller or boutique agencies run a version themselves?
Yes. Start with the AI Citation Readiness Checklist, select 25-40 high-intent buyer prompts, ungate your best 3 case studies into public indexed pages with frameworks inside, add ProfessionalService schema, and retest monthly. The free audit gives the exact baseline and 60-90 day roadmap.
Observed Business Impact (Agencies)
- Direct AI referral traffic lift: 160-220% in tracked sessions within 90 days across the three programs.
- Conversion on AI-referred visitors: 11-14% (upper half of the 4-15% observed band; materially higher than standard organic per Semrush January 2026 benchmark).
- Pipeline: 14-18% of qualified discovery calls and 3 closed deals in the subsequent quarter explicitly referenced AI discovery (“Perplexity recommended you” or equivalent).
- Payback signal: Positive direct-attribution ROI visibility emerging by month 4-5 when Layer 1 (direct referrals) + Layer 2 (assisted branded search) are combined.
These sit inside the conservative ranges documented across 2025-2026 programs in the aggregate results post.
Next Step: See the Numbers for Your Own Agency
Real results start with a real baseline. If you run a marketing agency and want this exact Day 0-90 protocol applied to your brand, begin with the no-obligation audit.
Get your free citation audit. We’ll test 50-100 prompts across six engines, including ChatGPT, Perplexity, and Gemini. Your full citation audit and prioritized 60-90 day roadmap are emailed in 5 business days. No credit card. No sales call.
Get your free citation audit →
Sources
- Client matrix data and permissioned aggregate results from 2025-2026 Stay Citable marketing agency programs (anonymized)
- How to Measure and Prove GEO Results: Day 0 to 90 Proof Cycles — full protocol
- What 90 Days of GEO Actually Produces: Aggregated Results from Client Day 0-90 Proof Cycles
- GEO Retainer ROI: Typical Citation Lift Timelines and Results for B2B and SaaS
- The ROI of GEO
- AEO for B2B Marketing — direct source of tactics and buyer prompts adapted here
- Semrush AI referral conversion benchmark, January 2026
- Princeton GEO study (Aggarwal et al. KDD 2024)
- Previsible LLM session growth study (1.96 million sessions tracked)
- Client matrices and case studies tracked across 2025-2026 Stay Citable programs
See also the four core industry AEO playbooks and the measurement fundamentals in our AI Citation Readiness Checklist.