Five Companies Just Named the AI-Search KPIs They Actually Track. Here's the Decision Frame.

Thiago Victorino

Kyle Poyar’s recent piece in Growth Unhinged did something most AEO commentary refuses to do: it named names, named metrics, and showed the receipts. Five companies. Five different KPIs. Five different surfaces. None of them measuring the same thing.

That mismatch is the story.

If you read the post looking for “the right AI search metric,” you will leave disappointed. There is no single answer. What there is, instead, is a decision frame: each KPI corresponds to a signal class with a different cost of capture and a different defensibility profile in front of a finance team. The governance question for marketing leaders in 2026 is not “which KPI is best.” It is “which signal class can my organization actually defend when the CFO asks what we are paying for.”

What the five companies actually track

Five cases, five distinct measurement choices.

Beehiiv tracks LLM-attributed signups as the primary metric. CMO Darren Chait reports significant growth in leads and signups attributed to LLMs after restructuring their enterprise landing page, with no real change in visibility metrics. Read that twice. The visibility scores did not move. The conversions did. They picked the bottom-of-funnel signal and let the top-of-funnel noise stay noisy.

Reply consolidated 500-plus blog posts around a tighter positioning (“sales engagement and AI SDR”) and now appears in roughly 25 percent of the 300-plus prompts they monitor. Their KPI is prompt-coverage rate against a defined prompt set they curated.
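A prompt-coverage rate like Reply's reduces to a simple ratio over a curated prompt set. The sketch below is illustrative, not Reply's actual tooling; the helper name, the prompts, and the answer texts are invented, and real monitoring would need fuzzier brand matching than a substring check.

```python
# Hypothetical sketch: prompt-coverage rate over a curated prompt set.
# `responses` maps each monitored prompt to the LLM answer text; the
# metric is the share of answers that mention the brand at all.

def prompt_coverage(responses: dict[str, str], brand: str) -> float:
    """Fraction of monitored prompts whose answer mentions `brand`."""
    if not responses:
        return 0.0
    hits = sum(brand.lower() in answer.lower() for answer in responses.values())
    return hits / len(responses)

# Invented prompt set and answers, standing in for a 300-plus-prompt panel.
responses = {
    "best AI SDR tools": "Reply and two competitors lead this category...",
    "sales engagement platforms": "Options include Outreach and Salesloft...",
    "email sequencing software": "Reply offers multichannel sequences...",
    "cold outreach automation": "Popular picks here are Apollo and Lemlist...",
}
rate = prompt_coverage(responses, "Reply")  # 2 of 4 prompts mention the brand
```

The defensibility of this number rests entirely on the prompt set: coverage against prompts nobody asks is noise, which is why curation is part of the KPI, not a detail.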

Air tracks citation rate via Profound, a third-party measurement platform. Their signal is externally instrumented. They did not build the meter; they bought access to one.

ClickUp measures video appearance rate in AI Overviews, between 20 and 40 percent across the queries that matter to them. Their KPI is surface presence on a specific generative-search artifact, not a generic “share of voice.”

Skio did something none of the others did: they mined sales discovery calls to identify a 28-day post-content traffic window, which they describe as their best in four years. The measurement is causal-window inference grounded in qualitative signal from real buyer conversations.
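A temporal-causal signal like Skio's 28-day window can be operationalized as a before/after traffic comparison around each publish date. This is a minimal sketch under stated assumptions: the function name, the synthetic traffic numbers, and the equal-length baseline window are all illustrative choices, not Skio's method.

```python
# Hedged sketch of the temporal-causal signal: compare traffic inside a
# 28-day post-publish window to a same-length baseline window before
# publish. All dates and traffic figures below are synthetic.

from datetime import date, timedelta

def window_lift(daily_traffic: dict[date, int], publish: date,
                window_days: int = 28) -> float:
    """Ratio of post-publish window traffic to the preceding baseline window."""
    post = sum(daily_traffic.get(publish + timedelta(days=d), 0)
               for d in range(window_days))
    base = sum(daily_traffic.get(publish - timedelta(days=d), 0)
               for d in range(1, window_days + 1))
    return post / base if base else float("inf")

# Synthetic example: flat 100/day baseline, 150/day after publish.
publish = date(2026, 3, 1)
traffic = {publish - timedelta(days=d): 100 for d in range(1, 29)}
traffic.update({publish + timedelta(days=d): 150 for d in range(28)})
lift = window_lift(traffic, publish)  # 4200 / 2800 = 1.5
```

The number only becomes causal-ish when paired with the qualitative side Skio used: discovery-call evidence that buyers actually encountered the content inside that window.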

Five companies. Five measurement philosophies. All defensible on their own terms. None interchangeable.

The signal classes hiding behind the KPIs

Strip the brand names off the table and a pattern emerges. Each company picked a different signal class:

  • Beehiiv: conversion-attributed signal (LLM → signup)
  • Reply: prompt-coverage signal (curated prompt set → presence rate)
  • Air: third-party citation signal (instrumented by an external vendor)
  • ClickUp: surface-share signal (specific generative-search artifact)
  • Skio: temporal-causal signal (content-publish-to-traffic window)

These are not the same thing. They have different costs of capture, different audit trails, and different sensitivity to platform whim. A CFO evaluating any of them is implicitly evaluating which signal class your organization is willing to pay to maintain.

This is the governance question we have been mapping in our hard-signal AEO governance work and our marketing governance stack analysis. Five companies have just told us what their answer is. The answers are not the same.

The defensibility test

Here is the frame I would use if a marketing leader asked me which of these to adopt. Run each candidate KPI through three questions:

1. Where does the meter live?

Beehiiv’s meter is its own analytics stack, instrumented for LLM-source attribution. Air’s meter is Profound. ClickUp’s meter is Google’s AI Overviews surface itself. Skio’s meter is sales-call data plus traffic logs. Reply’s meter is a curated prompt set the team owns.

If the meter is yours, you control the methodology and bear the cost. If the meter is a third party's (Profound, Ahrefs, or one of their competitors), you outsource the methodology and the trust question. If the meter is the platform's (AI Overviews itself), you are at the mercy of the platform's reporting changes. Each posture has different defensibility. None is wrong. They are different bets.

2. What does a 10 percent move in the metric mean?

Beehiiv’s KPI ladder ends in revenue. A 10 percent lift in LLM-attributed signups is a number a CFO can multiply by ARPU. Reply’s prompt-coverage rate is an upstream proxy: a 10 percent move means more presence, but presence still has to be translated into pipeline downstream. Air’s citation rate is similarly upstream. ClickUp’s video-share rate sits in the middle: it tells you whether your specific format is winning a specific surface.

Pick the KPI whose unit of motion you can translate into a dollar without three slides of explanation. If you cannot translate the unit, the CFO will discount the metric, regardless of how sophisticated the measurement is.
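The translation test is just arithmetic, which is the point: if it takes more than one line, the metric is too far upstream. The signup count and ARPU below are invented for illustration.

```python
# Minimal translation of "a 10 percent move" into dollars for a
# conversion-attributed KPI. The 400 signups/month and $50 ARPU are
# illustrative assumptions, not figures from any of the five companies.

def lift_in_dollars(monthly_signups: int, lift_pct: float, arpu: float) -> float:
    """Incremental monthly revenue implied by a relative lift in signups."""
    return monthly_signups * lift_pct * arpu

incremental = lift_in_dollars(400, 0.10, 50.0)  # 400 * 0.10 * $50 = $2,000/month
```

An upstream metric like citation rate needs an extra conversion assumption (citations → visits → signups) before this multiplication is possible, and every extra assumption is a slide the CFO can discount.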

3. How brittle is the measurement to platform change?

Skio’s 28-day window is grounded in qualitative buyer evidence; it is durable, but only as durable as the discovery-call discipline behind it. ClickUp’s AI Overviews share is fragile to Google’s reporting changes. Air’s Profound number is fragile to Profound’s vendor decisions. Beehiiv’s attribution is fragile to whichever attribution method they chose. Reply’s prompt set is fragile to whether the curated prompts stay representative.

Every signal has a brittleness profile. The honest move is to know yours, document it, and have a fallback signal one layer over when the primary breaks.

What this means for governance teams

The five-company snapshot is not a menu of options. It is a forcing function. Every marketing org running AI search initiatives in 2026 already implicitly chose one of these signal classes. Most chose by default: whichever vendor demoed first, whichever metric the agency reports, whichever number looks cleanest in a slide.

Default choice is the failure mode. The compounding choice is to pick the signal class deliberately, document why, and run it through the defensibility test before the CFO does.

Three practical moves:

Map your current KPI stack to a signal class. Your team is already measuring something. Name the class. Conversion-attributed? Prompt-coverage? Third-party citation? Surface-share? Temporal-causal? If you cannot name the class in one sentence, the metric is not yet defensible.

Pick one primary, one fallback. The five companies are each strong on a primary. None of them rely on only one. Beehiiv has visibility scores in the background even though they do not lead with them. Reply has organic-search baseline data behind the prompt-coverage view. A marketing operation that survives platform turbulence has at least two signals from different classes.

Make the meter audit-ready. Whatever class you pick, write down where the meter lives, who controls the methodology, what a 10 percent move means in dollars, and what you do when the platform changes. That document is what survives a CFO conversation. The dashboard is downstream of it.
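One way to force those four answers into existence before any dashboard is a small structured record. The field names and example values below are assumptions for illustration, not a standard or any of the five companies' actual documentation.

```python
# Hypothetical audit-ready meter record: it cannot be instantiated
# without answering the four questions the CFO will ask.

from dataclasses import dataclass

@dataclass
class MeterRecord:
    signal_class: str         # e.g. "conversion-attributed", "prompt-coverage"
    meter_owner: str          # "in-house" | "vendor" | "platform"
    methodology: str          # who defines the method and who can change it
    dollars_per_10pct: float  # what a 10 percent move means in dollars
    fallback_signal: str      # the signal one layer over when this one breaks

record = MeterRecord(
    signal_class="conversion-attributed",
    meter_owner="in-house",
    methodology="UTM + referrer-based LLM attribution, owned by marketing ops",
    dollars_per_10pct=2000.0,
    fallback_signal="prompt-coverage rate over a curated prompt set",
)
```

Whether this lives in code, YAML, or a one-page doc matters less than that every field is filled in before the metric ships.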

The compounding asset

The companies in Poyar’s piece are not winning because they picked the perfect KPI. They are winning because they picked one, named it, and built operations around it. The compounding asset is not the metric. It is the discipline of treating AI-search measurement as a first-class governance question rather than a quarterly tactic refresh.

The first-wave AEO playbook, which we argued was already commoditized, measured the wrong thing on purpose: it measured what was easy. The next wave measures what is defensible. Defensibility is a cost. The five companies that paid the cost get to keep the data when the next platform shift happens. Everyone else is back at zero.

Pick your signal class on purpose. Document the meter. Translate to dollars. That is the work.


This analysis is grounded in What’s Working Right Now in AI Search (Kyle Poyar / Growth Unhinged, May 2026).

Victorino Group helps marketing leaders pick AI-search KPIs that survive a CFO conversation. Let’s talk.

All articles on The Thinking Wire are written with the assistance of Anthropic's Opus LLM. Each piece goes through multi-agent research to verify facts and surface contradictions, followed by human review and approval before publication. If you find any inaccurate information or wish to contact our editorial team, please reach out at editorial@victorinollc.com.
