June 14, 2026 · AI Search · 9 min read

How to get cited by ChatGPT for a local business (the 2026 AEO/GEO playbook)

A growing share of your future customers will never see a list of ten blue links. They will ask ChatGPT, Perplexity, or Google AI Overviews "who's the best HVAC company near Katy?" and act on the two or three names the model gives back. Getting cited is now a separate discipline from ranking — and most Houston businesses are losing it on a technicality they don't know exists.

The most common reason a Houston business is invisible inside ChatGPT and Perplexity isn't bad content, weak reviews, or thin pages. It's that the site is accidentally blocking the AI crawlers at the CDN — a checkbox someone flipped on, or a security plugin doing its job too well. The model literally can't read the page, so it never names you.

This is the technical playbook to fix that, plus the four moves that follow. For the strategy-level "why this matters," read "Getting your Houston SMB cited by ChatGPT, Perplexity, and AI Overviews." This is the wrench-turning version.

// THE SHORT ANSWER — HOW TO GET CITED BY CHATGPT
  1. Be crawlable. Allow GPTBot, ChatGPT-User, ClaudeBot, and PerplexityBot in robots.txt and at your CDN/firewall, where most accidental blocks actually live.
  2. Ship an llms.txt. A plain Markdown map at your root that points models straight at services, locations, hours, and FAQs.
  3. Write extractable answers. One-sentence direct answer + a cited statistic + a named quote, near the top of each page.
  4. Fix your entity. Identical name/address/phone everywhere, plus LocalBusiness and FAQPage schema.
  5. Earn trusted citations. Get named on the third-party sources the models already pull from.

Why those exact ingredients? Because a controlled study measured them. The Princeton-led "GEO: Generative Engine Optimization" research tested which page edits actually change whether an AI cites you: adding relevant statistics lifted citation likelihood by up to 41%, adding quotations by 28%, and adding inline citations by as much as 115%. The results point in one direction: write like a source, not a brochure.

1. Be crawlable — the block is almost always at the CDN

An AI assistant can only cite a page it can fetch. There are two layers where you can be blocked, and owners check the wrong one.

Layer one is robots.txt. The honest AI crawlers read it and obey it. The ones that matter for citations are GPTBot, ChatGPT-User, ClaudeBot, and PerplexityBot.

A line as small as User-agent: GPTBot followed by Disallow: / takes you out of ChatGPT. Pull up yoursite.com/robots.txt in a browser and read it with your own eyes.
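If you'd rather not trust your eyes alone, the same check can be scripted. This is a minimal sketch using Python's standard-library robots.txt parser; the `SAMPLE` file and the `example.com` URL are made up for illustration, and the bot list is just the four user-agents named in this article.

```python
from urllib.robotparser import RobotFileParser

# User-agents named in this article; adjust to the crawlers you care about.
AI_BOTS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot"]


def blocked_bots(robots_txt: str, url: str = "https://example.com/") -> list:
    """Return the AI user-agents that this robots.txt text disallows for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt: blocks GPTBot, allows everyone else.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_bots(SAMPLE))  # ['GPTBot']
```

Paste your own live robots.txt text in place of `SAMPLE`. An empty list means none of the four bots is disallowed at this layer. It says nothing about the CDN.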

Layer two is the ambush: the CDN or firewall. Cloudflare shipped a one-click "Block AI bots" toggle, and many owners — or their host's defaults — switched it on without connecting "AI bots" to "the thing that puts me in ChatGPT." Bot-mitigation plugins, aggressive WAF rules, and managed-hosting presets do the same silently. Your robots.txt can say "come in" while your CDN returns a 403 to every model. The crawler never reaches the page, you never get cited, and nothing in analytics tells you why.
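One rough way to spot a CDN-level block is to request the page while identifying as each crawler and compare status codes. The sketch below is an approximation under stated assumptions: the user-agent strings are simplified stand-ins (real crawlers send longer ones), the status-code buckets are a judgment call, and some CDNs also verify crawler IP ranges, so a clean result here is a hint and not proof.

```python
import urllib.error
import urllib.request

# Simplified, hypothetical user-agent strings used only for this spot check.
BOT_USER_AGENTS = {
    "GPTBot": "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "ChatGPT-User": "Mozilla/5.0 (compatible; ChatGPT-User/1.0)",
    "ClaudeBot": "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
    "PerplexityBot": "Mozilla/5.0 (compatible; PerplexityBot/1.0)",
}


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status the server sends for this user-agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise HTTPError; the code is what we want.
        return err.code


def verdict(status: int) -> str:
    """Bucket a status code; the buckets are an assumption, not a standard."""
    if status in (401, 403, 429):
        return "likely blocked"
    if 200 <= status < 300:
        return "reachable"
    return "check manually"


def audit(url: str) -> dict:
    """Map each bot name to a verdict for the given URL (makes real requests)."""
    return {
        bot: verdict(fetch_status(url, agent))
        for bot, agent in BOT_USER_AGENTS.items()
    }


# Example (performs live HTTP requests, so it is left commented out):
# print(audit("https://yoursite.com/"))
```

If a normal browser gets a 200 and every bot row says "likely blocked," the CDN or a security plugin is the place to look.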

The crawlability checklist

  1. Open yoursite.com/robots.txt — confirm none of the bots above are disallowed.
  2. In Cloudflare/your CDN, turn off the AI-bot block for the assistant crawlers you want (you can keep training-only bots blocked, but allow the live-fetch and search ones).
  3. Whitelist the same user-agents in any "block bad bots" plugin.
  4. Make sure each money page returns its real content in raw HTML, not only after JavaScript runs — many assistants take a light pass and miss client-side content.
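Item 4 can be spot-checked too. The sketch below assumes you have saved the page's raw HTML (for example via "view source," not the rendered DOM) and asks whether a key phrase appears in the visible text once script and style blocks are ignored. The two sample pages and the phrase are invented for illustration.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text outside <script> and <style> blocks."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def raw_html_contains(html: str, phrase: str) -> bool:
    """True if phrase appears in the visible text of the raw HTML."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return phrase.lower() in text


# Hypothetical pages: one server-rendered, one filled in by JavaScript.
SERVER_RENDERED = "<html><body><h1>AC repair in Katy</h1></body></html>"
CLIENT_ONLY = (
    '<html><body><div id="root"></div>'
    '<script>render("AC repair in Katy")</script></body></html>'
)

print(raw_html_contains(SERVER_RENDERED, "AC repair in Katy"))  # True
print(raw_html_contains(CLIENT_ONLY, "AC repair in Katy"))      # False
```

A False on a money page suggests the content only exists after JavaScript runs, which is the case a light-pass crawler would miss.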

2. Publish an llms.txt

llms.txt is a proposed standard: a single Markdown file at yoursite.com/llms.txt that hands models a clean, curated map of what matters — services, service areas, hours, pricing, and your best FAQ pages — stripped of the navigation and scripts that bury the signal on a normal page. Think of it as the index card you'd give a new dispatcher.

No assistant is required to read it yet, and it doesn't replace clean HTML or schema. But it's a 30-minute, low-risk file that makes the right facts trivial to lift. A good one names the company, the entity behind it, the neighborhoods served (Katy, Sugar Land, The Woodlands, Pearland, Cypress), the phone number, and links to the five pages you most want cited.
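To make that concrete, here is a small sketch that assembles such a file. It follows the layout in the llms.txt proposal as I understand it (a title line, a one-line summary as a blockquote, then sections of Markdown links). The business name, phone number, and `example.com` URLs are placeholders, not real data.

```python
def build_llms_txt(name: str, summary: str, sections: dict) -> str:
    """Build llms.txt text: title, blockquote summary, then link sections.

    sections maps a heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder business details for illustration only.
llms_txt = build_llms_txt(
    name="Acme Air Conditioning",
    summary=(
        "HVAC repair and installation serving Katy, Sugar Land, "
        "The Woodlands, Pearland, and Cypress. Phone: (281) 555-0100."
    ),
    sections={
        "Services": [
            ("AC repair", "https://example.com/ac-repair", "Same-day repair"),
            ("Installation", "https://example.com/install", "New systems"),
        ],
        "FAQ": [
            ("Pricing FAQ", "https://example.com/faq", "Common cost questions"),
        ],
    },
)

print(llms_txt)
```

Save the output as `llms.txt` at the site root and keep it in step with the pages it links to.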

3. Write extractable facts, stats, and direct answers

This is where the Princeton numbers earn their place. Models reward content that reads like a citable source. Three habits, applied to every important page: lead with a one-sentence direct answer, back it with a cited statistic, and include a named quote.

The flip side: the same research found keyword stuffing and fluff reduced citation rates. The old SEO instinct to repeat your keyword fifteen times now works against you. Write for a smart human and a literal machine at once — short answers, real numbers, clean attribution.

Stop writing brochures. Start writing the sentence you'd want the AI to quote back, word for word.

4. Entity and schema clarity

Models build a picture of "this business" by stitching together every mention across the web. If your name, address, and phone vary — "Acme A/C" here, "Acme Air Conditioning LLC" there, two phone numbers — the model isn't sure you're one entity, and an unsure model stays quiet. Pick one canonical name, address, and phone and use them identically everywhere: your site, Google Business Profile, Yelp, the chamber directory, every citation.
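A quick way to audit this is to list every profile in one place and compare each against the canonical record. The sketch below is illustrative only: the listings are invented, the phone normalizer assumes 10-digit US numbers, and the comparison forgives case and punctuation but nothing else, so "Acme A/C" still counts as a different name.

```python
import re


def normalize_phone(phone: str) -> str:
    """Keep digits only and take the last 10 (assumes US numbers)."""
    return re.sub(r"\D", "", phone)[-10:]


def normalize_text(value: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", value.lower()).split())


def nap_mismatches(canonical: dict, listings: dict) -> dict:
    """Return {source: [fields that differ from the canonical record]}."""
    problems = {}
    for source, listing in listings.items():
        fields = []
        for key in ("name", "address"):
            if normalize_text(listing[key]) != normalize_text(canonical[key]):
                fields.append(key)
        if normalize_phone(listing["phone"]) != normalize_phone(canonical["phone"]):
            fields.append("phone")
        if fields:
            problems[source] = fields
    return problems


# Hypothetical canonical record and two listings.
CANONICAL = {
    "name": "Acme Air Conditioning LLC",
    "address": "123 Main St, Katy, TX 77494",
    "phone": "(281) 555-0100",
}
LISTINGS = {
    "website": {
        "name": "Acme Air Conditioning LLC",
        "address": "123 Main St, Katy, TX 77494",
        "phone": "281-555-0100",
    },
    "yelp": {
        "name": "Acme A/C",
        "address": "123 Main St, Katy, TX 77494",
        "phone": "+1 281 555 0199",
    },
}

print(nap_mismatches(CANONICAL, LISTINGS))  # {'yelp': ['name', 'phone']}
```

Anything the function reports is a listing to correct by hand until the result is an empty dict.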

Then make the facts machine-readable. LocalBusiness schema in your <head> declares your name, address, hours, service area, and category in a format models and search engines parse directly. FAQPage schema exposes question-and-answer pairs in exactly the structure an assistant wants to reuse. Full markup walkthrough: the Houston SMB schema markup guide.
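As a rough illustration of what that markup looks like, the sketch below generates both blocks as JSON-LD. The property names (`LocalBusiness`, `PostalAddress`, `openingHours`, `areaServed`, `FAQPage`, `Question`, `acceptedAnswer`) come from schema.org; every business detail is a placeholder, and a real deployment would likely carry more fields.

```python
import json


def local_business_jsonld(name, street, city, region, postal_code,
                          phone, opening_hours, areas_served) -> dict:
    """Build a minimal schema.org LocalBusiness object."""
    return {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "telephone": phone,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "addressRegion": region,
            "postalCode": postal_code,
        },
        "openingHours": opening_hours,
        "areaServed": areas_served,
    }


def faq_page_jsonld(pairs) -> dict:
    """Build a schema.org FAQPage from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def as_script_tag(data: dict) -> str:
    """Wrap the object in a JSON-LD script tag for the page <head>."""
    body = json.dumps(data, indent=2).replace("</", "<\\/")  # keep "</script>" out of the JSON
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder business details for illustration only.
business = local_business_jsonld(
    "Acme Air Conditioning LLC", "123 Main St", "Katy", "TX", "77494",
    "+1-281-555-0100", "Mo-Fr 08:00-18:00", ["Katy", "Sugar Land", "Cypress"],
)
faq = faq_page_jsonld([("Do you offer same-day AC repair?", "Yes, in Katy and Cypress.")])

print(as_script_tag(business))
print(as_script_tag(faq))
```

Run the output through a structured-data validator before shipping it, and keep the values identical to the canonical name, address, and phone you chose.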

5. Digital PR — be on the sources models already trust

Assistants don't only read your site. When someone asks "best HVAC company near Katy," the model leans on third-party sources it treats as neutral: local roundups, "best of Houston" lists, chamber pages, news mentions, review aggregators. If your competitors appear there and you don't, the model names them.

So the off-site move is digital PR aimed at those trusted sources: get into reputable local roundups and "best of" lists, earn Houston-area press, sponsor something that links back, and keep your review profile genuinely strong. Reviews work twice — as a ranking signal and as raw material the model quotes when it explains why it picked you.

How to measure AI referrals

You can't manage what you can't see, and AI referrals hide in places the standard dashboard ignores. Four checks:

| Where | What to look for | Cadence |
| --- | --- | --- |
| Google Analytics 4 | Referral segment for chatgpt.com, perplexity.ai, gemini.google.com, copilot.microsoft.com | Monthly |
| Search Console | Impressions on long, question-shaped queries that feed AI Overviews | Monthly |
| Spot checks | Ask ChatGPT & Perplexity your top buyer questions; note if you're named | Monthly |
| Lead intake | Add "ChatGPT / AI search" as a "how did you find us?" option | Every lead |
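If you export referrer URLs from your analytics or server logs, the referral check can be approximated in a few lines. This sketch matches on the four hostnames listed above; the export format and the sample URLs are assumptions, and GA4 itself would handle this through a segment and needs no code.

```python
from collections import Counter
from typing import Optional
from urllib.parse import urlparse

# Hostnames from the table above, mapped to a display label.
AI_REFERRER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
    "copilot.microsoft.com": "Copilot",
}


def ai_source(referrer_url: str) -> Optional[str]:
    """Return the assistant label for a referrer URL, or None if not AI."""
    host = (urlparse(referrer_url).hostname or "").lower()
    for domain, label in AI_REFERRER_HOSTS.items():
        if host == domain or host.endswith("." + domain):
            return label
    return None


def count_ai_referrals(referrers) -> Counter:
    """Count sessions per assistant, ignoring non-AI referrers."""
    labels = (ai_source(url) for url in referrers)
    return Counter(label for label in labels if label)


# Hypothetical referrer export.
SAMPLE_REFERRERS = [
    "https://chatgpt.com/",
    "https://www.perplexity.ai/search?q=hvac+katy",
    "https://www.google.com/",
    "https://chatgpt.com/c/abc",
    "",
]

print(count_ai_referrals(SAMPLE_REFERRERS))
```

Tracked month over month, these counts give you the trend line even while absolute volume stays small.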

Volume is still small for most Houston SMBs, so don't panic at low session counts. The point is the trend and the quality: a visitor who arrived because a model vouched for you is pre-qualified. Which makes the next step — answering instantly — matter even more.

Why getting cited only pays if you answer fast

A model citing you is a warm introduction, not a closed deal. The visitor still has to reach you, and the business that responds first usually wins. Industry studies find replying within five minutes makes a lead 21x more likely to qualify than waiting 30 minutes, and roughly 78% of customers buy from the first business that responds. Meanwhile about 27% of inbound SMB calls go unanswered, each worth $350 to $800 in lost revenue for a service business — so a single Houston SMB can bleed $45,000 to $120,000 a year to missed and after-hours calls.
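To see how those missed-call figures combine, here is the arithmetic as a short sketch. The 27% unanswered rate and the $350 to $800 per-call values come from the paragraph above. The 500 inbound calls a year is my own assumption for illustration, and the calculation treats every missed call as a lost job, which overstates the loss if some callers try again.

```python
def missed_call_loss(inbound_calls_per_year: int,
                     unanswered_rate: float = 0.27,
                     value_low: int = 350,
                     value_high: int = 800):
    """Return (missed calls, low-end loss, high-end loss) in dollars per year."""
    missed = round(inbound_calls_per_year * unanswered_rate)
    return missed, missed * value_low, missed * value_high


# Assumed volume: 500 inbound calls a year (not a figure from the article).
missed, low, high = missed_call_loss(500)
print(missed, low, high)  # 135 47250 108000
```

At that assumed volume the loss lands inside the $45,000 to $120,000 range quoted above; run it with your own call count to get a number for your business.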

The AEO/GEO work and the response work are one campaign. The citation drives a qualified visitor; an instant, capture-everything response turns that visitor into a booked job. That's why we pair this playbook with a speed-to-lead system and a 24/7 AI chat agent that catches the lead the model just sent — including in Spanish, since roughly 45% of the Houston metro is Hispanic and a bilingual EN+ES capture surface is still a low-competition lane.

What to do this month

  1. Confirm you're crawlable — robots.txt plus the AI-bot toggle in your CDN and any security plugin.
  2. Publish an llms.txt with services, service areas, hours, phone, and five priority links.
  3. Rewrite your top 3 pages to lead with a direct answer, back it with a statistic, and add a named quote plus an inline citation.
  4. Unify your entity and add LocalBusiness + FAQPage schema.
  5. Set up AI-referral tracking in GA4 and on your lead-intake form.
  6. Make sure you answer instantly — speed-to-lead plus bilingual chat so the citation converts.

Frequently asked questions

How do you get cited by ChatGPT for a local business?

Five steps, in order: (1) let the AI crawlers in — allow GPTBot, ChatGPT-User, ClaudeBot, and PerplexityBot in robots.txt and at your CDN, where most blocks happen; (2) publish an llms.txt pointing models at your key pages; (3) write extractable answers — a one-sentence direct answer, a cited statistic, and a named quote near the top of each page; (4) unify your entity and add LocalBusiness schema; (5) earn third-party mentions on sources models trust. The Princeton GEO study found statistics lift citation likelihood by up to 41%, quotes by 28%, and inline citations by as much as 115%.

Are ChatGPT and Perplexity blocked from my website by accident?

Very often, yes — and the block usually isn't in robots.txt, it's at the CDN or WAF layer. Cloudflare's one-click "Block AI bots" toggle removes many sites from ChatGPT and Perplexity results, and bot plugins, firewall rules, and host defaults do the same silently. Check your live robots.txt and your CDN bot settings before assuming you're crawlable.

What is llms.txt and does my local business need one?

llms.txt is a plain Markdown file at your root (yoursite.com/llms.txt) that gives AI models a clean, curated map of your key pages — services, locations, hours, pricing, FAQs — without the clutter of a normal HTML page. It's a low-cost, ~30-minute signal worth doing alongside, not instead of, clean schema and crawlable pages.

How do I measure whether ChatGPT or AI Overviews are sending me traffic?

Three places: (1) in GA4, build a referral segment for chatgpt.com, perplexity.ai, gemini.google.com, and copilot.microsoft.com; (2) in Search Console, watch impressions on long, question-shaped queries that feed AI Overviews; (3) ask every lead how they found you and log "ChatGPT / AI search." Volume is still small, but these visitors convert well because the model pre-qualified them.

Sources & further reading

Stat note: the speed-to-lead, missed-call, and bilingual figures above are drawn from published industry studies and are presented as general benchmarks, not guarantees for any one business. Any example is illustrative.

Dimitri Dimitrovski · Founder, WhiteBoxForge (HarbingerScope LLC)
Houston-metro digital studio. I help SMBs get named by AI assistants — and answer the leads those assistants send within seconds, in English and Spanish.