How to get cited by ChatGPT for a local business (the 2026 AEO/GEO playbook)
A growing share of your future customers will never see a list of ten blue links. They will ask ChatGPT, Perplexity, or Google AI Overviews "who's the best HVAC company near Katy?" and act on the two or three names the model gives back. Getting cited is now a separate discipline from ranking — and most Houston businesses are losing it on a technicality they don't know exists.
The most common reason a Houston business is invisible inside ChatGPT and Perplexity isn't bad content, weak reviews, or thin pages. It's that the site is accidentally blocking the AI crawlers at the CDN — a checkbox someone flipped on, or a security plugin doing its job too well. The model literally can't read the page, so it never names you.
This is the technical playbook to fix that, plus the four moves that follow. For the strategy-level "why this matters," read getting your Houston SMB cited by ChatGPT, Perplexity, and AI Overviews. This is the wrench-turning version.
- 1. Be crawlable. Allow GPTBot, ChatGPT-User, ClaudeBot, and PerplexityBot in robots.txt and at your CDN/firewall — where most accidental blocks actually live.
- 2. Ship an llms.txt. A plain Markdown map at your root that points models straight at services, locations, hours, and FAQs.
- 3. Write extractable answers. One-sentence direct answer + a cited statistic + a named quote, near the top of each page.
- 4. Fix your entity. Identical name/address/phone everywhere, plus LocalBusiness and FAQPage schema.
- 5. Earn trusted citations. Get named on the third-party sources the models already pull from.
Why those exact ingredients? Because a controlled study measured them. The Princeton-led "GEO: Generative Engine Optimization" research tested which page edits actually change whether an AI cites you: adding relevant statistics lifted citation likelihood by up to 41%, adding quotations by 28%, and adding inline citations by as much as 115%. The results point in one direction: write like a source, not a brochure.
1. Be crawlable — the block is almost always at the CDN
An AI assistant can only cite a page it can fetch. There are two layers where you can be blocked, and owners check the wrong one.
Layer one is robots.txt. The honest AI crawlers read it and obey it. The ones that matter for citations:
- GPTBot: OpenAI's crawler for collecting training data
- ChatGPT-User: the fetch ChatGPT runs when a user's question triggers live browsing
- OAI-SearchBot: OpenAI's search index crawler
- ClaudeBot and Claude-User: Anthropic's crawlers
- PerplexityBot and Perplexity-User: Perplexity's crawlers
- Google-Extended: Google's robots.txt token controlling Gemini/AI use of your content (separate from regular Googlebot)
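A rule as small as User-agent: GPTBot followed by Disallow: / shuts that crawler out of your entire site, and the same two lines under OAI-SearchBot can keep you out of ChatGPT's search answers. Pull up yoursite.com/robots.txt in a browser and read it with your own eyes.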
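If reading the file by eye feels error-prone, the check can be scripted. The sketch below uses Python's standard-library `urllib.robotparser` to report which of the crawlers named above a given robots.txt would turn away. The sample rules, the `blocked_agents` helper, and the example.com URL are illustrative assumptions, and the check covers robots.txt only, not the CDN layer.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the crawler list above.
AI_AGENTS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot", "ClaudeBot",
    "Claude-User", "PerplexityBot", "Perplexity-User", "Google-Extended",
]


def blocked_agents(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the AI user-agents that this robots.txt text disallows for `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_AGENTS if not parser.can_fetch(agent, url)]


# Illustrative robots.txt: everything is open except GPTBot.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    print(blocked_agents(SAMPLE))
```

To run it against a real site, paste the contents of your own robots.txt in place of `SAMPLE`. An empty result means no crawler on the list is disallowed by that file.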
Layer two is the ambush: the CDN or firewall. Cloudflare shipped a one-click "Block AI bots" toggle, and many owners — or their host's defaults — switched it on without connecting "AI bots" to "the thing that puts me in ChatGPT." Bot-mitigation plugins, aggressive WAF rules, and managed-hosting presets do the same silently. Your robots.txt can say "come in" while your CDN returns a 403 to every model. The crawler never reaches the page, you never get cited, and nothing in analytics tells you why.
The crawlability checklist
- Open yoursite.com/robots.txt and confirm none of the bots above are disallowed.
- In Cloudflare/your CDN, turn off the AI-bot block for the assistant crawlers you want (you can keep training-only bots blocked, but allow the live-fetch and search ones).
- Whitelist the same user-agents in any "block bad bots" plugin.
- Make sure each money page returns its real content in raw HTML, not only after JavaScript runs — many assistants take a light pass and miss client-side content.
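The checklist above can be roughly approximated with a small probe: request a page while announcing a crawler's user-agent, then see whether the server refuses and whether a key phrase appears in the raw HTML before any JavaScript runs. This is a minimal sketch under assumptions: the function names are hypothetical, the set of "blocked" status codes is a guess at what firewalls commonly return, and sending a crawler's user-agent from your own machine is only an approximation. CDNs may verify real crawlers by IP address, so a pass or a fail here is a hint, not proof.

```python
import urllib.error
import urllib.request

# Assumption: statuses that firewalls and bot filters commonly return.
BLOCK_STATUSES = {401, 403, 429, 503}


def classify(status: int, html: str, must_contain: str) -> str:
    """Interpret one fetch: blocked, served without the key text, or fine."""
    if status in BLOCK_STATUSES:
        return "blocked"
    if status != 200:
        return f"unexpected status {status}"
    if must_contain.lower() not in html.lower():
        return "content missing from raw HTML"
    return "ok"


def probe(url: str, user_agent: str, must_contain: str) -> str:
    """Fetch `url` while announcing `user_agent`, then classify the result."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            body = response.read().decode("utf-8", errors="replace")
            return classify(response.status, body, must_contain)
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx, so refusals are classified here.
        return classify(error.code, "", must_contain)

# Example call (performs a real network request, so it is left commented out):
# print(probe("https://example.com/services/ac-tune-up", "GPTBot", "AC tune-up"))
```

A result of "content missing from raw HTML" suggests the phrase is only rendered client-side, which is the case the last checklist item warns about.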
2. Publish an llms.txt
llms.txt is a proposed standard: a single Markdown file at yoursite.com/llms.txt that hands models a clean, curated map of what matters — services, service areas, hours, pricing, and your best FAQ pages — stripped of the navigation and scripts that bury the signal on a normal page. Think of it as the index card you'd give a new dispatcher.
No assistant is required to read it yet, and it doesn't replace clean HTML or schema. But it's a 30-minute, low-risk file that makes the right facts trivial to lift. A good one names the company, the entity behind it, the neighborhoods served (Katy, Sugar Land, The Woodlands, Pearland, Cypress), the phone number, and links to the five pages you most want cited.
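For a sense of what such a file might contain, the sketch below renders one from a small data structure, following the shape described in the llms.txt proposal: a title line, a one-line summary in a blockquote, then headed lists of links. Everything in the sample data (the Acme business, the 555 phone number, the example.com URLs) is hypothetical, and `render_llms_txt` is an illustrative helper, not part of any standard tool.

```python
def render_llms_txt(
    name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render an llms.txt body: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical business data, for illustration only.
SAMPLE = render_llms_txt(
    name="Acme Air Conditioning",
    summary=(
        "Residential HVAC repair and tune-ups serving Katy, Sugar Land, "
        "The Woodlands, Pearland, and Cypress. Phone: 713-555-0100."
    ),
    sections={
        "Services": [
            ("AC tune-up", "https://example.com/services/ac-tune-up", "scope and pricing"),
            ("Emergency repair", "https://example.com/services/emergency", "after-hours service"),
        ],
        "Company": [
            ("Hours and service areas", "https://example.com/locations", "neighborhoods and hours"),
            ("FAQ", "https://example.com/faq", "common customer questions"),
        ],
    },
)

if __name__ == "__main__":
    print(SAMPLE)
```

Saving the printed output as `llms.txt` at the site root yields the kind of curated map described above.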
3. Write extractable facts, stats, and direct answers
This is where the Princeton numbers earn their place. Models reward content that reads like a citable source. Three habits, applied to every important page:
- Lead with the direct answer. Open the section with one complete sentence that answers the question a customer would type: "A standard residential AC tune-up in the Houston metro runs roughly $90 to $180." A model can lift that whole.
- Back it with a statistic. Statistics lifted citation likelihood by up to 41%. Numbers read as authority. Industry studies find around 62% of HVAC calls come in after hours — a concrete figure beats "we get a lot of evening calls."
- Add a named quote and an inline citation. Quotations added 28%; inline citations added up to 115%. A short attributed quote and a linked source give the model exactly the shape it likes to reproduce.
The flip side: the same research found keyword stuffing and fluff reduced citation rates. The old SEO instinct to repeat your keyword fifteen times now works against you. Write for a smart human and a literal machine at once — short answers, real numbers, clean attribution.
Stop writing brochures. Start writing the sentence you'd want the AI to quote back, word for word.
4. Entity and schema clarity
Models build a picture of "this business" by stitching together every mention across the web. If your name, address, and phone vary — "Acme A/C" here, "Acme Air Conditioning LLC" there, two phone numbers — the model isn't sure you're one entity, and an unsure model stays quiet. Pick one canonical name, address, and phone and use them identically everywhere: your site, Google Business Profile, Yelp, the chamber directory, every citation.
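A quick way to audit this is to list every profile's name, address, and phone and compare each against the canonical version. The following is a minimal sketch, assuming US phone numbers and hypothetical listing data. It ignores case, punctuation, and spacing, so it flags only substantive differences such as a different business name; an exact string comparison would be the stricter test.

```python
import re


def normalize_nap(name: str, address: str, phone: str) -> tuple[str, str, str]:
    """Reduce a listing to a comparable form (case, punctuation, spacing removed)."""
    def clean(text: str) -> str:
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

    # Keep the last 10 digits so "+1" prefixes do not count as a mismatch.
    digits = re.sub(r"\D", "", phone)[-10:]
    return clean(name), clean(address), digits


def inconsistent_listings(
    canonical: tuple[str, str, str],
    listings: dict[str, tuple[str, str, str]],
) -> list[str]:
    """Return the sources whose name/address/phone differ from the canonical one."""
    target = normalize_nap(*canonical)
    return [src for src, nap in listings.items() if normalize_nap(*nap) != target]


# Hypothetical data for illustration.
CANONICAL = ("Acme Air Conditioning LLC", "100 Main St, Katy, TX 77450", "(713) 555-0100")
LISTINGS = {
    "Google Business Profile": CANONICAL,
    "Yelp": ("Acme A/C", "100 Main St, Katy, TX 77450", "713-555-0100"),
    "Chamber directory": ("ACME Air Conditioning, LLC", "100 Main St., Katy, TX 77450", "+1 713 555 0100"),
}

if __name__ == "__main__":
    print(inconsistent_listings(CANONICAL, LISTINGS))
```

In this sample only the Yelp entry is flagged, because its business name is genuinely different rather than just punctuated differently.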
Then make the facts machine-readable. LocalBusiness schema, usually added as a JSON-LD script in your <head>, declares your name, address, hours, service area, and category in a format models and search engines parse directly. FAQPage schema exposes question-and-answer pairs in exactly the structure an assistant wants to reuse. Full markup walkthrough: the Houston SMB schema markup guide.
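As a rough illustration of the two schema types named above, the sketch below builds LocalBusiness and FAQPage objects from schema.org vocabulary and wraps them in the script tag that carries JSON-LD. The business details are hypothetical, the helper names are illustrative, and the set of properties is a minimal starting point, not a complete markup.

```python
import json


def local_business_jsonld(name, phone, street, city, region, postal_code, areas, hours):
    """Build a minimal schema.org LocalBusiness object."""
    return {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "telephone": phone,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "addressRegion": region,
            "postalCode": postal_code,
        },
        "areaServed": areas,
        "openingHours": hours,
    }


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def as_script_tag(data) -> str:
    """Wrap a JSON-LD object in the script tag that goes in the page markup."""
    return '<script type="application/ld+json">\n' + json.dumps(data, indent=2) + "\n</script>"


# Hypothetical business details for illustration.
BUSINESS = local_business_jsonld(
    "Acme Air Conditioning LLC", "+1-713-555-0100", "100 Main St", "Katy", "TX",
    "77450", ["Katy", "Sugar Land", "Cypress"], "Mo-Fr 08:00-18:00",
)
FAQ = faq_jsonld([
    ("How much does an AC tune-up cost?",
     "A standard residential AC tune-up in the Houston metro runs roughly $90 to $180."),
])

if __name__ == "__main__":
    print(as_script_tag(BUSINESS))
    print(as_script_tag(FAQ))
```

Keeping the name, address, and phone in this markup identical to the canonical versions used on your profiles ties the schema back to the entity consistency described above.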
5. Digital PR — be on the sources models already trust
Assistants don't only read your site. When someone asks "best HVAC company near Katy," the model leans on third-party sources it treats as neutral: local roundups, "best of Houston" lists, chamber pages, news mentions, review aggregators. If your competitors appear there and you don't, the model names them.
So the off-site move is digital PR aimed at those trusted sources: get into reputable local roundups and "best of" lists, earn Houston-area press, sponsor something that links back, and keep your review profile genuinely strong. Reviews work twice — as a ranking signal and as raw material the model quotes when it explains why it picked you.
How to measure AI referrals
You can't manage what you can't see, and AI referrals hide in places the standard dashboard ignores. Four checks:
| Where | What to look for | Cadence |
|---|---|---|
| Google Analytics 4 | Referral segment for chatgpt.com, perplexity.ai, gemini.google.com, copilot.microsoft.com | Monthly |
| Search Console | Impressions on long, question-shaped queries that feed AI Overviews | Monthly |
| Spot checks | Ask ChatGPT & Perplexity your top buyer questions; note if you're named | Monthly |
| Lead intake | Add "ChatGPT / AI search" as a "how did you find us?" option | Every lead |
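If you export referrer URLs (from server logs or an analytics export, for example), the same four domains from the table can be matched in a few lines. This is a sketch with a hypothetical `ai_source` helper; the domain list is taken from the table and may need extending as assistants change their hostnames.

```python
from typing import Optional
from urllib.parse import urlparse

# Domains from the measurement table above, mapped to a readable label.
AI_REFERRER_DOMAINS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
    "copilot.microsoft.com": "Copilot",
}


def ai_source(referrer: str) -> Optional[str]:
    """Return the assistant name if the referrer URL belongs to one, else None."""
    host = (urlparse(referrer).hostname or "").lower()
    for domain, label in AI_REFERRER_DOMAINS.items():
        # Match the domain itself or any subdomain of it, e.g. www.perplexity.ai.
        if host == domain or host.endswith("." + domain):
            return label
    return None


if __name__ == "__main__":
    for ref in ["https://chatgpt.com/", "https://www.perplexity.ai/search?q=hvac",
                "https://www.google.com/"]:
        print(ref, "->", ai_source(ref))
```

Matching on the parsed hostname instead of a substring keeps lookalike domains from being counted as AI referrals.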
Volume is still small for most Houston SMBs, so don't panic at low session counts. The point is the trend and the quality: a visitor who arrived because a model vouched for you is pre-qualified. Which makes the next step — answering instantly — matter even more.
Why getting cited only pays if you answer fast
A model citing you is a warm introduction, not a closed deal. The visitor still has to reach you, and the business that responds first usually wins. Industry studies find replying within five minutes makes a lead 21x more likely to qualify than waiting 30 minutes, and roughly 78% of customers buy from the first business that responds. Meanwhile about 27% of inbound SMB calls go unanswered, each worth $350 to $800 in lost revenue for a service business — so a single Houston SMB can bleed $45,000 to $120,000 a year to missed and after-hours calls.
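To see how figures in that range can arise, the arithmetic can be sketched with your own call volume. The 500 calls per year used below is a hypothetical input, not a number from the studies. The calculation also treats every unanswered call as a lost job at full value, so read the result as an upper-bound estimate.

```python
def missed_call_loss(
    calls_per_year: int,
    unanswered_rate: float,
    low_value: int,
    high_value: int,
) -> tuple[int, int, int]:
    """Return (missed calls, low revenue estimate, high revenue estimate)."""
    missed = round(calls_per_year * unanswered_rate)
    return missed, missed * low_value, missed * high_value


if __name__ == "__main__":
    # Hypothetical volume of 500 inbound calls per year; the 27% unanswered
    # rate and the $350 to $800 per-call value come from the text above.
    missed, low, high = missed_call_loss(500, 0.27, 350, 800)
    print(f"{missed} missed calls: ${low:,} to ${high:,} per year")
```

Swapping in your actual inbound call count and average job value gives a figure specific to your business.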
The AEO/GEO work and the response work are one campaign. The citation drives a qualified visitor; an instant, capture-everything response turns that visitor into a booked job. That's why we pair this playbook with a speed-to-lead system and a 24/7 AI chat agent that catches the lead the model just sent — including in Spanish, since roughly 45% of the Houston metro is Hispanic and a bilingual EN+ES capture surface is still a low-competition lane.
What to do this month
- Confirm you're crawlable — robots.txt plus the AI-bot toggle in your CDN and any security plugin.
- Publish an llms.txt with services, service areas, hours, phone, and five priority links.
- Rewrite your top 3 pages to lead with a direct answer, back it with a statistic, and add a named quote plus an inline citation.
- Unify your entity and add LocalBusiness + FAQPage schema.
- Set up AI-referral tracking in GA4 and on your lead-intake form.
- Make sure you answer instantly — speed-to-lead plus bilingual chat so the citation converts.
Frequently asked questions
How do you get cited by ChatGPT for a local business?
Five steps, in order: (1) let the AI crawlers in — allow GPTBot, ChatGPT-User, ClaudeBot, and PerplexityBot in robots.txt and at your CDN, where most blocks happen; (2) publish an llms.txt pointing models at your key pages; (3) write extractable answers — a one-sentence direct answer, a cited statistic, and a named quote near the top of each page; (4) unify your entity and add LocalBusiness schema; (5) earn third-party mentions on sources models trust. The Princeton GEO study found statistics lift citation likelihood by up to 41%, quotes by 28%, and inline citations by as much as 115%.
Are ChatGPT and Perplexity blocked from my website by accident?
Very often, yes — and the block usually isn't in robots.txt, it's at the CDN or WAF layer. Cloudflare's one-click "Block AI bots" toggle removes many sites from ChatGPT and Perplexity results, and bot plugins, firewall rules, and host defaults do the same silently. Check your live robots.txt and your CDN bot settings before assuming you're crawlable.
What is llms.txt and does my local business need one?
llms.txt is a plain Markdown file at your root (yoursite.com/llms.txt) that gives AI models a clean, curated map of your key pages — services, locations, hours, pricing, FAQs — without the clutter of a normal HTML page. It's a low-cost, ~30-minute signal worth doing alongside, not instead of, clean schema and crawlable pages.
How do I measure whether ChatGPT or AI Overviews are sending me traffic?
Three places: (1) in GA4, build a referral segment for chatgpt.com, perplexity.ai, gemini.google.com, and copilot.microsoft.com; (2) in Search Console, watch impressions on long, question-shaped queries that feed AI Overviews; (3) ask every lead how they found you and log "ChatGPT / AI search." Volume is still small, but these visitors convert well because the model pre-qualified them.
Sources & further reading
- Getting your Houston SMB cited by ChatGPT, Perplexity, and AI Overviews — the strategy-level companion to this technical guide
- The Houston SMB schema markup guide for 2026
- Speed-to-lead: catch the qualified visitor the model just sent you
- 24/7 bilingual AI chat agent
- Run a free 90-second site audit
- Aggarwal et al., "GEO: Generative Engine Optimization" (Princeton-led research) — the source of the +41% / +28% / +115% figures
Stat note: the speed-to-lead, missed-call, and bilingual figures above are drawn from published industry studies and are presented as general benchmarks, not guarantees for any one business. Any example is illustrative.