The 15 signals we test

The free CatalogScan scan reads your storefront's public surfaces and scores 15 machine-readable signals that AI shopping agents — ChatGPT Shopping, Perplexity Shopping, Google AI Mode, Shopify's Global Catalog — use to decide whether to surface your products. This page explains every signal: what it is, why agents read it, how to test it yourself, and how to fix it.

Last updated 2026-04-30 · 5 floor signals (100 pts) + 10 ranking-spread signals (70 pts) · 170 pts total · every signal now has a per-signal deep-dive page

How scoring works: Floor score (0–100) is what decides whether you show up at all. Ranking-spread (0–70) decides which of the stores that show up get placed first. Most well-kept Shopify stores clear the floor; the spread is where every ranking decision is made.
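
The arithmetic behind the two scores can be sketched in a few lines of Python. The point values are the ones listed on this page; the `SIGNALS` table, the `score` function, and the idea of passing in a credit fraction per signal are assumptions made for illustration, not CatalogScan's actual implementation.

```python
# Hypothetical scoring table: (signal name, tier, max points), copied from this page.
SIGNALS = [
    ("Public product feed", "floor", 25),
    ("Product JSON-LD on PDP", "floor", 30),
    ("Sitemap", "floor", 15),
    ("Open Graph tags", "floor", 15),
    ("Robots not blocking", "floor", 15),
    ("GTIN coverage across variants", "deep", 10),
    ("Review schema (AggregateRating)", "deep", 10),
    ("Brand entity in JSON-LD", "deep", 8),
    ("Canonical URL on PDP", "deep", 8),
    ("Description richness", "deep", 8),
    ("Offers availability", "deep", 6),
    ("SKU coverage across variants", "deep", 6),
    ("Image alt-text coverage", "deep", 6),
    ("Hreflang on PDP", "deep", 4),
    ("Structured data validity", "deep", 4),
]


def score(earned):
    """Return (floor_score, spread_score).

    `earned` maps a signal name to the fraction of credit earned
    (0.0 = none, 0.5 = half, 1.0 = full). Missing signals count as 0.
    """
    totals = {"floor": 0.0, "deep": 0.0}
    for name, tier, max_points in SIGNALS:
        totals[tier] += max_points * earned.get(name, 0.0)
    return totals["floor"], totals["deep"]
```

A store with full credit everywhere scores 100 on the floor and 70 on the spread, which matches the 170-point total above.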

The 15 signals

Public product feed · floor · 25 pts
Product JSON-LD on PDP · floor · 30 pts
Sitemap · floor · 15 pts
Open Graph tags · floor · 15 pts
Robots not blocking · floor · 15 pts
GTIN coverage across variants · deep · 10 pts
Review schema (AggregateRating) · deep · 10 pts
Brand entity in JSON-LD · deep · 8 pts
Canonical URL on PDP · deep · 8 pts
Description richness · deep · 8 pts
Offers availability · deep · 6 pts
SKU coverage across variants · deep · 6 pts
Image alt-text coverage · deep · 6 pts
Hreflang on PDP · deep · 4 pts
Structured data validity · deep · 4 pts

Floor signals · 5 signals · 100 pts · can't skip

Every AI shopping agent needs these five. Miss any one and you're invisible to that agent on that dimension, no matter how strong the rest of your catalog is. A default Shopify storefront gives you all five out of the box; headless stores frequently drop one or two during the cutover.

1. Public product feed · Floor · 25 pts

What it is: Shopify's open catalog endpoint at /products.json, the single most-ingested surface on your store. It returns up to 250 products per call with standard ?page=N&limit=N pagination and includes handles, titles, descriptions, vendor, product type, variants, options, images, and inventory flags.

Why agents care: This is how ChatGPT Shopping, Perplexity, Google AI Mode, and Shopify's own Global Catalog discover what you sell at scale. It's the bulk-ingest path: crawling 10,000 product HTML pages is orders of magnitude slower than reading the same catalog as 40 pages of JSON (250 products per call).

How to test: curl -s 'https://yourstore.com/products.json?limit=1' (quote the URL so the shell doesn't try to expand the ?). Expect a JSON body with a products array. A 404, a password or login page, or any other HTML response means the endpoint is not exposed on your storefront.

How to fix: On default Shopify, it's already on; make sure the storefront isn't password-protected. On headless (Hydrogen, Next.js + Storefront API, custom stacks), the endpoint was most likely lost during the cutover. Re-expose it by proxying /products.json back to Shopify, or ship a custom endpoint that emits the same shape. In our 100-store scan, 40% of headless DTC brands dropped this endpoint entirely.

Deep dive: /products.json: the AI bulk-ingest feed (full guide)
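
The pass/fail rule in "How to test" can be written down as a small check. This is a minimal Python sketch that works on a status code and a response body you have already fetched; the function name `looks_like_product_feed` is hypothetical, and the rule it encodes (HTTP 200 plus a JSON object with a `products` array) is our reading of the test above, not a published spec.

```python
import json


def looks_like_product_feed(status_code, body):
    """Return True when a /products.json response has the expected shape."""
    if status_code != 200:
        return False  # e.g. a 404 on a headless storefront
    try:
        data = json.loads(body)
    except ValueError:
        return False  # a password page or any other HTML body is not JSON
    # The feed is a JSON object whose "products" key holds an array.
    return isinstance(data, dict) and isinstance(data.get("products"), list)
```

An empty `products` array still passes here: the endpoint exists, the catalog is just empty.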

2. Product JSON-LD on PDP · Floor · 30 pts

What it is: A <script type="application/ld+json"> block on every product detail page with @type: "Product" describing price, availability, brand, GTIN, SKU, reviews, and description in machine-readable form.

Why agents care: This is the single biggest discovery signal, worth more points (30) than any other single test. Every AI shopping agent parses PDP JSON-LD first and only falls back to scraping the page body if the structured block is missing or malformed. It's also the source of truth for downstream signals (brand entity, aggregate rating, offers availability, canonical URL).

How to test: View source on any product page, search for application/ld+json, and confirm there is a block where @type equals Product. Validate it via Google's Rich Results Test.

How to fix: Shopify's default Dawn theme emits this. Custom themes and page-builder apps often break it. If your PDP has JSON-LD but nothing of @type: "Product", you're getting zero credit: the block has to be a Product, not just an Organization or WebPage. Fix in your theme's product.liquid template.

Deep dive: Product JSON-LD: how to fix the single biggest AI-shopping signal
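
The manual view-source check can be automated. The sketch below uses only Python's standard library to pull every JSON-LD block out of a page and look for a Product node. The names `JsonLdCollector`, `iter_nodes`, and `has_product_node` are hypothetical, and treating `@graph` wrappers and top-level arrays as valid containers is an assumption about how a tolerant parser would behave.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_jsonld = False

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self.blocks.append("")

    def handle_data(self, data):
        if self._in_jsonld:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False


def iter_nodes(data):
    """Yield every dict node, descending into lists and @graph wrappers."""
    if isinstance(data, list):
        for item in data:
            yield from iter_nodes(item)
    elif isinstance(data, dict):
        yield data
        yield from iter_nodes(data.get("@graph", []))


def has_product_node(html):
    """Return True if any JSON-LD block on the page contains an @type of Product."""
    parser = JsonLdCollector()
    parser.feed(html)
    parser.close()
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except ValueError:
            continue  # a malformed block contributes nothing
        for node in iter_nodes(data):
            types = node.get("@type")
            if types == "Product" or (isinstance(types, list) and "Product" in types):
                return True
    return False
```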

3. Sitemap · Floor · 15 pts

What it is: A valid XML sitemap at /sitemap.xml listing every canonical URL on your store. Either a flat <urlset> or a <sitemapindex> with sub-sitemaps per content type.

Why agents care: It tells crawlers what pages exist without having to spider. For a store with 10,000 products, the difference between "indexed from sitemap" and "indexed by random link discovery" is measured in weeks and in long-tail coverage gaps.

How to test: curl -s https://yourstore.com/sitemap.xml | head -5. Expect an <?xml ...?> declaration followed by a <urlset> or <sitemapindex> root element.

How to fix: Shopify auto-generates this. If you went headless and broke it, restore it on the custom front end. The common frameworks (Next.js, Remix, Hydrogen) have built-in or well-documented ways to generate a sitemap, and pointing one at the Storefront API is usually a one-file change.

Deep dive: Sitemap.xml: the discovery surface AI agents read first (full guide)
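
To go one step past `head -5`, you can parse the response and check the root element. A minimal Python sketch using the standard library; `sitemap_kind` is a hypothetical helper name, and it only inspects the root tag rather than validating the full sitemap schema.

```python
import xml.etree.ElementTree as ET


def sitemap_kind(xml_text):
    """Return 'urlset', 'sitemapindex', or None for anything else."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None  # an HTML error page, a truncated file, and so on
    # ElementTree reports tags as "{namespace}name"; keep only the local name.
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name if local_name in ("urlset", "sitemapindex") else None
```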

4. Open Graph tags · Floor · 15 pts

What it is: Three tags in the homepage <head>: og:title, og:description, og:image. Partial credit: 5 pts per tag present.

Why agents care: This is what AI assistants render when they cite your store in a response. Missing OG tags means your link shows up as a bare URL, with far lower click-through than a rich card with image, title, and description.

How to test: View source on your home page; search for og:title, og:description, og:image. The image should be at least 1200×630 and actually representative of the brand.

How to fix: Add the three tags to theme.liquid's <head>. Most Shopify themes have them for PDPs but not the homepage. Each tag is a one-liner.

Deep dive: Open Graph tags: the homepage gap most themes ship (full guide)
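
The 5-points-per-tag rule is easy to check in code. The Python sketch below counts which of the three tags are present in a page's HTML. `OpenGraphCollector` and `open_graph_points` are hypothetical names, and requiring a non-empty `content` attribute is an assumption: the rule above only says "present".

```python
from html.parser import HTMLParser

REQUIRED_TAGS = ("og:title", "og:description", "og:image")


class OpenGraphCollector(HTMLParser):
    """Records which required Open Graph properties appear with non-empty content."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = (attributes.get("property") or "").lower()
        content = (attributes.get("content") or "").strip()
        if prop in REQUIRED_TAGS and content:
            self.found.add(prop)


def open_graph_points(html):
    """Return 0, 5, 10, or 15 depending on how many required tags are present."""
    parser = OpenGraphCollector()
    parser.feed(html)
    return 5 * len(parser.found)
```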

5. Robots not blocking · Floor · 15 pts

What it is: A /robots.txt that doesn't blanket-block catalog paths for User-agent: *. Specifically: no Disallow: /, Disallow: /products, Disallow: /products.json, or Disallow: /collections.

Why agents care: Most AI shopping bots honor robots.txt. One wrong Disallow line renders every other signal moot: you can have perfect JSON-LD and it won't matter because the bot doesn't fetch the page. This is the most common "I got invisible overnight" bug we see.

How to test: curl -s https://yourstore.com/robots.txt. Confirm no blanket Disallow inside the User-agent: * block. Disallow: /*?sort_by is fine (it only blocks sort-permutation URLs).

How to fix: Shopify lets you customize robots.txt through the robots.txt.liquid theme template (Admin → Online Store → Themes → Edit code). Remove any catastrophic Disallow: / or Disallow: /products. If your store went private while testing and you forgot to flip it back (password protection lives under Admin → Online Store → Preferences), that's also this signal.

Deep dive: Robots.txt for AI shopping agents: Cloudflare, dev-store passwords, the AI-bot UA list
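
The rule above, no blanket Disallow for `User-agent: *`, can be checked with a small parser. This Python sketch is deliberately simplified: it does exact matching against the four paths named above and ignores Allow overrides and wildcard semantics. `BLANKET_PATHS` and `blanket_disallows` are hypothetical names.

```python
BLANKET_PATHS = {"/", "/products", "/products.json", "/collections"}


def blanket_disallows(robots_txt):
    """Return the catalog-blocking Disallow paths that apply to User-agent: *."""
    hits = []
    agents = []             # user agents of the group currently being read
    reading_agents = False  # True while consecutive User-agent lines continue
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not reading_agents:
                agents = []  # a rule line was seen, so this starts a new group
            agents.append(value)
            reading_agents = True
        else:
            reading_agents = False
            if field == "disallow" and "*" in agents and value in BLANKET_PATHS:
                hits.append(value)
    return hits
```

An empty list means the wildcard group is clean; rules that target a specific named bot are not reported.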

Ranking-spread signals · 10 signals · 70 pts · decides placement

These are what AI agents use to rank the stores that already cleared the floor. When 20 Shopify stores all sell the same-ish t-shirt, these 10 signals are how the agent decides which one to lead with. Every one of them is checkable from public data — no Shopify login required.

6. GTIN coverage across variants · Deep · 10 pts

What it is: Global Trade Item Numbers (UPC/EAN/ISBN) on your variants. Shopify stores them under variants[].barcode. Full credit at ≥90% of sampled variants having a barcode; half credit at 50–89%.

Why agents care: GTINs let an agent match your product to the same product on other retailers and to the manufacturer's listing. Without them, your product is a standalone text string: trust and confidence scores for a GTIN-less product are materially lower, and you're rarely chosen for commodity "where to buy X" queries.

How to test: Run the command below, which prints filled/total for the variants in the feed. Target ≥90%.

curl -s https://yourstore.com/products.json | jq '[.products[].variants[].barcode] | length as $t | map(select(. != null and . != "")) | length as $f | "\($f)/\($t)"'

How to fix: Three paths: (a) enter your manufacturer-issued UPC/EAN directly in Shopify admin per variant; (b) bulk-fill via CSV import on the Barcode column; (c) if you don't have GTINs, buy them from GS1 (a one-time per-SKU fee, usable across retailers). Our Pro tier auto-enriches via public GS1 lookups before you pay.

Deep dive: GTIN coverage across variants: bulk-fill paths and GS1 lookup
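
The coverage thresholds can be expressed as a pair of small functions. The Python sketch below works on an already-parsed /products.json payload and uses the 90% and 50% cut-offs stated above. `coverage` and `coverage_credit` are hypothetical names, and treating anything from 50% up to just under 90% as half credit is our reading of the "50–89%" band.

```python
def coverage(products, field):
    """Share of variants whose `field` (for example "barcode") is non-empty."""
    values = [
        variant.get(field)
        for product in products
        for variant in product.get("variants", [])
    ]
    if not values:
        return 0.0
    filled = sum(1 for value in values if value not in (None, ""))
    return filled / len(values)


def coverage_credit(share, full_at=0.90, half_at=0.50):
    """Map a coverage share to credit: 1.0 (full), 0.5 (half), or 0.0."""
    if share >= full_at:
        return 1.0
    if share >= half_at:
        return 0.5
    return 0.0
```

The same pair works for any per-variant field in the feed, so `coverage(products, "sku")` gives the figure for signal 12.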

7. Review schema (AggregateRating) · Deep · 10 pts

What it is: An aggregateRating node in your Product JSON-LD, with ratingValue and reviewCount. Either inline on the Product or as a separate AggregateRating graph node.

Why agents care: Review schema is how an agent decides between two equivalent products. A 4.8 star / 3,200 review product will be placed before a 4.9 / 12 review product for most tie-breaks, but only if the agent can actually read the scores. No aggregateRating node = no social proof input to the ranker. In our 100-store scan, roughly 9 out of every 10 stores were missing this.

How to test: View source on a PDP and search the JSON-LD block for aggregateRating. If the node is missing but Judge.me / Yotpo / Loox / Stamped etc. is installed, the review app isn't injecting it.

How to fix: Your reviews app almost certainly has a toggle: "Include structured data / rich snippets / SEO schema." In Judge.me it's under Settings → SEO; Loox has it under Integrations → Rich Snippets; Yotpo under Widget → Schema. Turn it on. Free fix; immediate effect on next cache flush.

Deep dive: AggregateRating: the signal 9 of 10 stores fail (with per-app fix table)

8. Brand entity in JSON-LD · Deep · 8 pts

What it is: A brand property on your Product JSON-LD, ideally nested as {"@type": "Brand", "name": "YourBrand"}. A plain string brand gets half credit; a missing brand gets zero.

Why agents care: The brand entity is how an agent decides which store is the "official" seller of a branded product vs. a marketplace reseller. If ChatGPT is asked "where to buy Allbirds Wool Runners," the store whose JSON-LD clearly identifies itself as the Allbirds brand wins over third-party listings.

How to test: View source on a PDP, find the Product JSON-LD block, look for "brand". A string ("brand": "Allbirds") gets 4 pts; a nested entity ("brand": { "@type": "Brand", "name": "Allbirds" }) gets 8.

How to fix: Most Shopify themes emit brand as a string pulled from the Vendor field. Update your theme's product.liquid schema block to emit the nested form. It's a single commit using the same data.

Deep dive: Brand entity in JSON-LD: string vs nested entity (one-line product.liquid fix)
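
The 0 / 4 / 8 split can be written as a short function over a parsed Product node. A minimal Python sketch; `brand_points` is a hypothetical name, and accepting any nested object with a non-empty `name` (without also checking its `@type`) is a simplifying assumption.

```python
def brand_points(product_node):
    """Score the `brand` property of a parsed Product JSON-LD node (a dict)."""
    brand = product_node.get("brand")
    if isinstance(brand, dict) and str(brand.get("name") or "").strip():
        return 8  # nested entity, e.g. {"@type": "Brand", "name": "Allbirds"}
    if isinstance(brand, str) and brand.strip():
        return 4  # plain string, e.g. "Allbirds"
    return 0      # missing or empty
```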

9. Canonical URL on PDP · Deep · 8 pts

What it is: A <link rel="canonical"> tag on every PDP pointing at the product's absolute canonical URL. Absolute URL = full credit; relative (/products/foo) = half credit; missing = zero.

Why agents care: Shopify PDPs are accessible via many URL shapes (root /products/foo, collection-scoped /collections/x/products/foo, tracking-param-decorated), and agents need a single trusted URL to consolidate ranking signals against. Missing canonicals split your authority across those URL shapes, and deduping across stores becomes impossible.

How to test: View source on a PDP. Search for rel="canonical". The href should be an absolute https:// URL, not a relative path.

How to fix: Shopify's default theme emits this. If your theme dropped it, add back to theme.liquid: <link rel="canonical" href="{{ canonical_url }}">. The canonical_url object is already absolute, so do not prefix it with the shop domain. If your theme emits a relative path instead (for example from {{ product.url }}), switch to {{ canonical_url }} or prefix the path: {{ shop.url }}{{ product.url }}.

Deep dive: Canonical URL on PDP: 5+ Shopify URL shapes and how to consolidate them (full guide)
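
The absolute / relative / missing distinction can be checked mechanically. The Python sketch below finds the first canonical link in a page and classifies its href. `CanonicalFinder` and `canonical_points` are hypothetical names; accepting `http://` as well as `https://` as "absolute" is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> on the page."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.href is None:
            self.href = (attributes.get("href") or "").strip()


def canonical_points(html):
    """Return 8 for an absolute canonical, 4 for a relative one, 0 if missing."""
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.href:
        return 0  # no canonical tag, or an empty href
    parsed = urlparse(finder.href)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return 8  # absolute URL
    return 4      # relative path such as /products/foo
```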

10. Description richness · Deep · 8 pts

What it is: Median description word count across your catalog (we sample your feed). Full credit ≥80 words per product; half credit 40–79; zero under 40.

Why agents care: Descriptions are the primary text signal an AI agent uses to match a user's free-text query to your product. A terse description like "100% cotton, breathable, runs small" gives the agent very little to match against a query like "where to buy a shirt for someone allergic to wool." Unique, attribute-rich descriptions answer more queries and surface for more intents.

How to test: curl -s https://yourstore.com/products.json | jq '.products[].body_html' | awk '{ gsub(/<[^>]*>/, " "); print NF }' | sort -n. This prints an approximate word count per product in ascending order; the middle value is the median.

How to fix: The bulk-rewrite path is what our Pro tier automates: each product gets a Claude-generated, brand-voiced description matching your existing tone, with attribute-rich body copy. Doing it by hand: aim for 80+ unique words covering material, use-cases, sizing, compatibility. Don't repeat the title in the description; that hurts more than it helps.

Deep dive: Description richness: the 5-block 80-word recipe and bulk-rewrite path (full guide)
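
The median-and-threshold rule can be computed directly from the feed. This Python sketch strips tags, counts words per product, and maps the median to credit using the 80 and 40 word cut-offs above. `word_count` and `description_credit` are hypothetical names, and the tag stripping is a rough regex rather than a full HTML parse.

```python
import re
from html import unescape
from statistics import median


def word_count(body_html):
    """Approximate word count of a product description, ignoring HTML tags."""
    # Replace tags with spaces so words in adjacent elements don't fuse together.
    text = re.sub(r"<[^>]*>", " ", body_html or "")
    return len(unescape(text).split())


def description_credit(products):
    """Map the median description length to credit: 1.0, 0.5, or 0.0."""
    counts = [word_count(product.get("body_html")) for product in products]
    if not counts:
        return 0.0
    middle = median(counts)
    if middle >= 80:
        return 1.0
    if middle >= 40:
        return 0.5
    return 0.0
```

Because the score uses the median, a handful of very long descriptions cannot compensate for a catalog where most products have a one-line blurb.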

11. Offers availability · Deep · 6 pts

What it is: The availability field on your Product JSON-LD's offers node, pointing to a schema.org vocabulary URL like https://schema.org/InStock. Full credit for the full URL; half credit for the short form (InStock); zero if missing or malformed.

Why agents care: AI agents don't want to recommend a product that's sold out, because that's a bad user experience that degrades trust in the agent. Without a parseable availability field, the agent has no way to know stock status and your product gets down-ranked for safety.

How to test: View source, find the offers object in your Product JSON-LD, check the availability value.

How to fix: In your theme's JSON-LD, emit the full schema.org URL form: "availability": "https://schema.org/{% if product.available %}InStock{% else %}OutOfStock{% endif %}". Strict parsers reject the short form.

Deep dive: Offers availability: URL form vs bare-name partial credit, variant-aware Liquid, pre-order/back-order/discontinued routing
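
The full / half / zero rule can be sketched as a function over a parsed offers node. In the Python below, `KNOWN_VALUES` is a deliberately partial list of schema.org ItemAvailability names and `availability_points` is a hypothetical helper; accepting the older `http://schema.org/` prefix as the full form is an assumption.

```python
# A subset of schema.org ItemAvailability values; extend as needed.
KNOWN_VALUES = {"InStock", "OutOfStock", "PreOrder", "BackOrder", "Discontinued"}

URL_PREFIXES = ("https://schema.org/", "http://schema.org/")


def availability_points(offer):
    """Return 6 for the full URL form, 3 for the bare name, 0 otherwise."""
    value = offer.get("availability")
    if not isinstance(value, str):
        return 0  # field missing or not a string
    for prefix in URL_PREFIXES:
        if value.startswith(prefix) and value[len(prefix):] in KNOWN_VALUES:
            return 6
    return 3 if value in KNOWN_VALUES else 0
```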

12. SKU coverage across variants · Deep · 6 pts

What it is: Your own internal stock-keeping unit on every variant. Separate from GTIN: SKU is yours, GTIN is global. Full credit ≥90% of sampled variants; half credit 50–89%.

Why agents care: SKUs give agents a stable ID per variant to cite, cache, and cross-reference. If a user asks an agent to "re-order the same t-shirt in medium," a reliable SKU is the agent's canonical handle for that specific variant across fetches.

How to test: Run the command below, which prints filled/total for the variants in the feed.

curl -s https://yourstore.com/products.json | jq '[.products[].variants[].sku] | length as $t | map(select(. != null and . != "")) | length as $f | "\($f)/\($t)"'

How to fix: Bulk-fill in Shopify admin, where the SKU column is editable on the variants page. If you manage inventory via a PIM or ERP, pull the SKUs down from there via Shopify's Admin API. A variant without a SKU usually indicates a stale catalog entry, so review the listing itself while you fill in the SKU.

Deep dive: SKU coverage: SKU vs GTIN vs MPN, four bulk-fill paths (admin, Matrixify, Admin API, PIM/ERP), and the duplicate-collision check

13. Image alt-text coverage · Deep · 6 pts

What it is: Alt text on your product images. Shopify exposes this as images[].alt in the feed. Full credit when ≥80% of sampled images have non-empty alt; half credit 30–79%.

Why agents care: Alt text is a secondary text-matching signal that complements the description. Agents parse alts to pick up attributes not covered in the main copy: color, angle, model, size on body. It's also an accessibility signal, which high-trust rankers weight positively.

How to test: Run the command below, which prints filled/total for the images in the feed.

curl -s https://yourstore.com/products.json | jq '[.products[].images[].alt] | length as $t | map(select(. != null and . != "")) | length as $f | "\($f)/\($t)"'

How to fix: Shopify admin lets you set alt per image, but for catalogs over a few hundred products this is the single highest-leverage Claude rewrite target. Our Pro tier generates brand-voiced alts from the product name + image content. Doing it by hand: 8–15 words per alt, include the product name and the distinguishing visual attribute.

Deep dive: Image alt-text: the 6 cheapest points in the catalog (full guide with bulk-fill paths)

14. Hreflang on PDP · Deep · 4 pts

What it is: <link rel="alternate" hreflang="…"> tags on your PDP mapping each region-specific version of a product page. Full credit for 2+ locales; half credit for 1.

Why agents care: If you ship internationally, hreflang tells agents which version of your product to show to which region. Without it, a UK shopper might get served your US page with US pricing and shipping, which costs you ranking because the agent learns to distrust your listings. For single-region stores, one hreflang is fine.

How to test: View source on a PDP, search for hreflang=. Count unique locale tags.

How to fix: If you use Shopify Markets, emit the Markets-aware hreflang block from your theme. If you maintain localized stores manually, emit one <link rel="alternate"> per store, pointing at the region-equivalent product URL. Don't forget the x-default fallback.

Deep dive: Hreflang on PDP: BCP 47 codes (en-GB not en-UK), Shopify Markets Liquid block, and the x-default fallback
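
Counting unique locale tags by hand gets tedious on pages with many alternates. A minimal Python sketch of the count and the 2+ / 1 / 0 rule follows. `HreflangCollector` and `hreflang_points` are hypothetical names, and leaving `x-default` out of the locale count is an assumption, since the rule above doesn't say how it is treated.

```python
from html.parser import HTMLParser


class HreflangCollector(HTMLParser):
    """Collects the distinct hreflang codes on <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.locales = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        code = (attributes.get("hreflang") or "").strip().lower()
        if tag == "link" and "alternate" in rel and code:
            self.locales.add(code)


def hreflang_points(html):
    """Return 4 for two or more locales, 2 for one, 0 for none."""
    collector = HreflangCollector()
    collector.feed(html)
    locales = collector.locales - {"x-default"}  # the fallback is not a locale
    if len(locales) >= 2:
        return 4
    return 2 if len(locales) == 1 else 0
```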

15. Structured data validity · Deep · 4 pts

What it is: Every application/ld+json block on your PDP parses as valid JSON. If you have 3 JSON-LD blocks and 1 throws a parse error, you score 0. Agents discard a whole script tag on a syntax error, including any valid nodes that appear before the break.

Why agents care: A single unescaped quote, trailing comma, or unterminated string in one block invalidates the entire graph from an agent's perspective. The failure mode is silent, with no visible page break, which is why this signal catches bugs that themes and apps ship into production unnoticed.

How to test: Copy each <script type="application/ld+json"> block's contents into Google's Rich Results Test. Any red X = one invalid block.

How to fix: The common culprit is a review-widget app injecting unescaped product names or description fragments with smart quotes. The fix is usually one Liquid filter, {{ description | strip_html | json }}, instead of manual string concatenation. When you upgrade your theme, re-test.

Deep dive: Structured data validity: smart-quote breakage, trailing-comma loops, the universal `| json` filter rule, and per-deploy CI checks
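
Since the failure is silent, a scripted check is worth having alongside the Rich Results Test. The Python sketch below extracts every JSON-LD block from a page and reports the ones that fail to parse. `invalid_jsonld_blocks` is a hypothetical name, and the regex extraction is a simplification that assumes reasonably conventional markup.

```python
import json
import re

# Matches <script type="application/ld+json"> ... </script> and captures the body.
SCRIPT_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.IGNORECASE | re.DOTALL,
)


def invalid_jsonld_blocks(html):
    """Return (block_index, error_message) for each block that is not valid JSON."""
    errors = []
    for index, raw in enumerate(SCRIPT_RE.findall(html)):
        try:
            json.loads(raw)
        except ValueError as exc:
            errors.append((index, str(exc)))
    return errors
```

An empty list corresponds to full credit under the rule above; any entry corresponds to a score of 0.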

Per-signal deep dives — all 15 signals now have one

Product JSON-LD: the 30-pt signal 60% of stores fail entirely · floor · 30 pts
Shopify /products.json: the 25-pt feed 40% of headless stores hide by accident · floor · 25 pts
Sitemap.xml: the floor signal headless rebuilds quietly break · floor · 15 pts
Open Graph tags: the homepage gap most themes ship · floor · 15 pts
Robots.txt: the "I went invisible overnight" floor signal — Cloudflare and dev-store password bugs · floor · 15 pts
AggregateRating: the signal 9 of 10 stores fail (and how to flip the toggle) · deep · 10 pts
GTIN coverage on variants: cross-retailer matching and bulk-fill paths · deep · 10 pts
Canonical URL on PDP: 5+ Shopify URL shapes and how to consolidate them · deep · 8 pts
Description richness: the 5-block 80-word recipe and bulk-rewrite path · deep · 8 pts
Brand entity in JSON-LD: string vs nested entity (4 pts to 8 pts in 5 minutes) · deep · 8 pts
Image alt-text: the 6 cheapest points in the catalog · deep · 6 pts
Offers availability: URL form vs bare-name partial credit, variant-aware Liquid · deep · 6 pts
SKU coverage: SKU vs GTIN vs MPN, four bulk-fill paths, and the duplicate-collision check · deep · 6 pts
Hreflang on PDP: BCP 47 codes, Shopify Markets Liquid, and the x-default fallback · deep · 4 pts
Structured data validity: smart-quote breakage and the universal | json filter rule · deep · 4 pts

Which of the 15 are you failing?

Free 2-minute scan. Paste your store URL, get a color-graded scorecard with every signal checked inline.

Scan my store → See pricing
