
How to audit Shopify structured data: the exact 3-tool workflow to verify your JSON-LD is working for AI shopping agents

CatalogScan — June 5, 2026 — Technical SEO Structured Data AI Shopping

Most Shopify merchants who've added JSON-LD to their theme believe their structured data is working. Most of the time, it isn't — not in the form AI shopping agents can actually parse and trust. Here's the exact audit workflow that surfaces what's broken.

  • 67% of Shopify stores with JSON-LD have at least one critical parse error (CatalogScan corpus, 2026)
  • 5 silent failure patterns that break structured data without throwing a visible page error
  • 3 tools needed for a complete structured data audit, and only one requires Search Console access

The gap between "I added JSON-LD" and "AI agents are reading it"

When Shopify's default theme outputs structured data, it includes a <script type="application/ld+json"> block on product pages with a basic Product schema: name, description, image, offers. Many merchants stop there and assume the work is done.

But AI shopping agents (ChatGPT Shopping, Perplexity Commerce, Google AI Mode, Shopify's own Global Catalog feed) don't give partial credit. If a required property is missing, incorrectly typed, or rendered in a form the parser doesn't recognize, the entire product may be excluded from the agent's consideration set for that query. The failure is invisible: your page still renders and your products still sell through human-initiated search, but you're absent from AI-mediated recommendations.

The audit gap has three common sources:

None of these trigger a 500 error. None produce a broken page. They only show up when you test the JSON-LD output directly — which is what the three-tool audit is for.

Tool 1: Rich Results Test — per-product verification

Rich Results Test: search.google.com/test/rich-results
No login required; works on any public URL

Google's Rich Results Test renders your page the way Googlebot would, executing JavaScript on the HTML your server returns, and then parses every structured data block it finds. Liquid is resolved on Shopify's servers before the response is sent, so the tool sees the actual JSON-LD that Googlebot sees, not the Liquid template source.

What to test:

  • Your highest-traffic product page (the one most likely to have complete data)
  • A product with no metafields set (the "worst case" — likely to expose Liquid rendering gaps)
  • A product that has multiple variants at different price points (tests price range rendering)
  • A product that is currently out of stock (tests availability enum rendering)

What to look for in the results:

  • Detected items: Should show "Product" — if it shows nothing, your JSON-LD is either absent, malformed to the point of being unparseable, or placed after the closing </body> tag
  • Errors vs. Warnings: Errors block rich result eligibility. Warnings are advisory. For AI shopping agent purposes, treat both as blocking — agents are stricter than Google's rich-result eligibility rules
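  • Missing required fields: Google requires "name" for every Product result, and merchant listings also require "image" and an "offers" block. "description" is a recommended field in Google's documentation, but treat it as required for this audit. Any missing required field is reported as an error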
  • Offer block: Check that "price", "priceCurrency", and "availability" are all present and showing real values — not empty strings or Liquid syntax leaking through

Reading a Rich Results Test output for a Shopify product

When you paste a product URL and click "Test URL," the tool shows you a parsed tree of every JSON-LD block it found. For Shopify's default theme output, you'll typically see a Product item with nested Offer items. The critical things to check in each Offer:

| Property | Expected value | Common broken state |
| --- | --- | --- |
| price | 29.99 | Empty string, "$29.99" (currency symbol included), or "29.99" (a string rather than a number; technically allowed in JSON-LD but flags a warning) |
| priceCurrency | USD | Empty string (metafield not set), "$" (symbol instead of ISO code), missing entirely |
| availability | https://schema.org/InStock | "InStock" (bare string), "available" (custom string), empty string when Shopify returns nil for out-of-stock variants |
| url | https://store.com/products/handle | Relative URL (/products/handle), missing entirely, or variant-scoped URL on a product with no variants |
| image | https://cdn.shopify.com/... | Empty array [] (product has no images), CDN URL without scheme (//cdn.shopify.com/...) |

Tip: Run the Rich Results Test on your canonical product URL (without variant query parameters), then again on a variant URL like /products/handle?variant=12345. Some Shopify themes output different JSON-LD depending on which variant is selected, and the default (no variant) URL may render an empty price if the theme uses JavaScript to populate it dynamically.
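The Offer checks in the table above can be approximated in a few lines of Python. This is a minimal sketch, not the logic any of the tools actually run: the function names (check_offer, is_absolute_url) are illustrative, the currency check only tests for three uppercase letters rather than a real ISO 4217 lookup, and the image property is left out because it lives on the Product rather than the Offer.

```python
import re
from urllib.parse import urlparse

# Shape checks only; these patterns are assumptions for the sketch.
SCHEMA_AVAILABILITY = re.compile(r"^https?://schema\.org/[A-Za-z]+$")
CURRENCY_SHAPE = re.compile(r"^[A-Z]{3}$")
NUMERIC_PRICE = re.compile(r"^[0-9]+(\.[0-9]+)?$")


def is_absolute_url(value):
    """True for strings that carry an http(s) scheme and a host."""
    if not isinstance(value, str):
        return False
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_offer(offer):
    """Return the names of Offer properties that look broken."""
    problems = []

    price = offer.get("price")
    is_number = isinstance(price, (int, float)) and not isinstance(price, bool)
    is_numeric_string = isinstance(price, str) and NUMERIC_PRICE.match(price) is not None
    if not (is_number or is_numeric_string):
        problems.append("price")

    currency = offer.get("priceCurrency")
    if not (isinstance(currency, str) and CURRENCY_SHAPE.match(currency)):
        problems.append("priceCurrency")

    availability = offer.get("availability")
    if not (isinstance(availability, str) and SCHEMA_AVAILABILITY.match(availability)):
        problems.append("availability")

    if not is_absolute_url(offer.get("url")):
        problems.append("url")

    return problems
```

A price such as "29.99" passes here because a numeric string is allowed; a stricter audit could flag it as a warning instead.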

Tool 2: Schema.org Validator — JSON-LD syntax and property validation

Schema.org Validator: validator.schema.org
No login required; accepts URL or paste-in markup

While the Rich Results Test validates against Google's rich-result eligibility requirements, the Schema.org Validator checks compliance against the full schema.org specification. It catches property names that Google silently ignores (but AI agents may rely on), incorrect type nesting, and deprecated properties that were replaced in more recent schema.org releases.

What it catches that Rich Results Test misses:

  • Properties on the wrong type (gtin13 on an Organization instead of a Product)
  • Deprecated property names (e.g., offers.seller syntax changes between schema.org versions)
  • Type mismatches where a property expects a URL type but receives a plain string
  • Missing @context or incorrect context URL
  • GTIN format validation — a gtin13 that isn't 13 digits will pass the Rich Results Test but fail the Schema.org Validator
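The GTIN point in the list above is easy to pre-check locally. The sketch below tests the digit count and, going one step beyond what the list mentions, the standard GS1 mod-10 check digit. The function names are illustrative, and this is not a description of how the Schema.org Validator itself implements the check.

```python
def gtin_check_digit(body):
    """Compute the GS1 check digit for the digits that precede it."""
    total = 0
    # Walking right to left, weights alternate 3, 1, 3, 1, ...
    for position, char in enumerate(reversed(body)):
        weight = 3 if position % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def gtin_problems(value, expected_length=13):
    """Return a list of problems for a GTIN string (empty list = looks valid)."""
    if not (value.isascii() and value.isdigit()):
        return ["not all digits"]
    problems = []
    if len(value) != expected_length:
        problems.append("wrong length")
    if gtin_check_digit(value[:-1]) != int(value[-1]):
        problems.append("bad check digit")
    return problems
```

Pass expected_length=12 or expected_length=14 to reuse the same check for gtin12 or gtin14 values.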

The paste-in workflow for Shopify

The Schema.org Validator's URL mode doesn't execute JavaScript, so it sees the raw server-rendered HTML. For Shopify stores using JavaScript-injected structured data (rare but possible with some headless setups), use the paste-in mode instead:

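  1. Open a product page in your browser, right-click → "Inspect", and find the script element in the Elements panel (not "View Page Source": JavaScript-injected JSON-LD exists only in the post-JavaScript DOM, so the raw HTML won't contain it)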
  2. Search for application/ld+json — copy the entire content of the script tag (the JSON object inside)
  3. Paste into the Schema.org Validator's "Validate by Direct Input" tab
  4. Note every red error (blocking) and yellow warning (advisory)

For Shopify Online Store 2.0 themes, the JSON-LD is server-rendered by Liquid, so the URL mode works correctly. The paste-in approach is only necessary if your theme uses a custom storefront or injects structured data via a JavaScript app.
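If you'd rather script the copy step than do it by hand, the extraction can be sketched with Python's standard library. This assumes you already have the page HTML as a string (saved from the browser, for example). The class and function names are illustrative, and the "parse_error" key is a convention made up for this sketch.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every application/ld+json script element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_jsonld(html_text):
    """Return one parsed object per JSON-LD block, or a parse_error marker."""
    parser = JsonLdExtractor()
    parser.feed(html_text)
    parser.close()
    results = []
    for raw in parser.blocks:
        try:
            results.append(json.loads(raw))
        except json.JSONDecodeError as error:
            results.append({"parse_error": str(error)})
    return results
```

A block that fails to parse is kept as a marker rather than dropped, since malformed JSON is itself one of the findings the audit is looking for.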

Comparing the two tools' error sets

It's worth running both tools on the same URL and comparing the error lists. They use different validation rule sets and will often catch non-overlapping issues. A clean Rich Results Test result does not mean clean Schema.org validation — and for AI shopping agents that implement the full schema.org spec rather than just Google's rich-result subset, the Schema.org Validator errors are the ones that matter.

Tool 3: Google Search Console — catalog-wide coverage errors

Google Search Console, Enhancements › Shopping tab: search.google.com/search-console
Requires GSC access and property verification

The Rich Results Test and Schema.org Validator check individual URLs. Google Search Console's structured data report shows you errors across your entire product catalog — and clusters them by error type so you can fix one template problem that's affecting 400 products simultaneously.

Where to find it: In Search Console, go to Shopping in the left nav (under Enhancements). This shows Product-type structured data errors across all crawled pages. If you don't see the Shopping section, your domain hasn't had any Product JSON-LD crawled yet (or the structured data is sufficiently broken that Googlebot couldn't classify it).

Reading the Search Console structured data report for Shopify

The report groups errors into three categories:

The most common Search Console errors seen in Shopify catalogs:

| Error message | What it means for Shopify | Typical cause |
| --- | --- | --- |
| "Missing field 'price'" | Offer block is present but price is empty or missing | Product has no active variants; Liquid variant.price returns nil |
| "Invalid value for field 'availability'" | Availability string not a recognized schema.org URI | Theme outputs bare string (InStock) instead of full URI |
| "Missing field 'priceCurrency'" | Offer block has price but no currency | Theme hardcodes USD or uses shop.currency which returns empty in some market configurations |
| "Invalid value for field 'image'" | Image URL is protocol-relative or returns 404 | CDN URL missing https: scheme, or deleted product image still referenced in Liquid |
| "Missing field 'description'" | Product description empty or stripped to empty string | Product has no description set in Shopify admin; Liquid outputs empty string with no fallback |

Important: Search Console data reflects Googlebot's crawl, which can lag 1–3 weeks behind your current theme state. After fixing a template-level structured data error, use the URL Inspection tool in Search Console to fetch the current live version of a specific URL and immediately see whether the fix resolved the error for that URL, without waiting for the next crawl.
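The clustering idea behind the Search Console report (one error type, many affected URLs) is simple to reproduce on your own audit notes. A minimal sketch, assuming you have recorded per-URL problems in a dict; the function name and input shape are made up for illustration and are not a Search Console export format.

```python
from collections import Counter


def cluster_errors(findings):
    """Count affected URLs per error type, most widespread first.

    findings maps a URL to the list of problems found on that page.
    """
    counts = Counter()
    for problems in findings.values():
        # Count each error type once per URL, even if it was logged twice.
        for problem in set(problems):
            counts[problem] += 1
    # Sort by descending count, then alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

An error type that tops this list across hundreds of URLs almost always points at a template-level problem rather than at individual product data.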

Bonus: manual curl verification for crawlability

Before any structured data can be read, the crawler has to be able to reach your pages. A quick curl check tells you whether AI shopping agent crawlers are being blocked, redirected, or served different content than what your browser sees.

Check that JSON-LD is present in the server-rendered HTML

curl -s https://your-store.com/products/your-product-handle | grep -c "application/ld+json"

Should return a non-zero number. Zero means no JSON-LD is server-rendered at all — the structured data is either absent or injected client-side (invisible to crawlers).

Check that AI crawlers aren't blocked by Cloudflare or a WAF

# Simulate OAI-SearchBot (ChatGPT Shopping crawler); the trailing \n keeps each status code on its own line
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "User-Agent: OAI-SearchBot/1.0 (+https://openai.com/searchbot)" \
  https://your-store.com/products/your-product-handle
# Simulate PerplexityBot
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "User-Agent: PerplexityBot/1.0 (+https://docs.perplexity.ai/bots)" \
  https://your-store.com/products/your-product-handle

Both should return 200. A 403 or 429 means your CDN or WAF is blocking AI crawlers by user agent — a common misconfiguration documented in detail in our Cloudflare AI crawler guide. A 301 or 302 redirect chain on product URLs can also cause crawlers to abandon indexing if the redirect target is slow or returns a different content type.
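If you log these checks from a script, the interpretation above reduces to a small lookup. The sketch below only builds the request and classifies a status code; it deliberately sends nothing. The user-agent strings are copied from the curl commands above, the function names are illustrative, and note that urllib.request.urlopen follows redirects by default, so you would need to disable that to observe a 301 or 302 directly.

```python
import urllib.request

# Simplified user-agent strings, copied from the curl examples.
CRAWLER_USER_AGENTS = {
    "OAI-SearchBot": "OAI-SearchBot/1.0 (+https://openai.com/searchbot)",
    "PerplexityBot": "PerplexityBot/1.0 (+https://docs.perplexity.ai/bots)",
}


def build_request(url, crawler):
    """Build (but do not send) a GET request that identifies as the given crawler."""
    headers = {"User-Agent": CRAWLER_USER_AGENTS[crawler]}
    return urllib.request.Request(url, headers=headers, method="GET")


def classify_status(code):
    """Map an HTTP status code to the audit outcome described above."""
    if code == 200:
        return "ok"
    if code in (403, 429):
        return "blocked"
    if code in (301, 302):
        return "redirect"
    return "other"
```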

Extract the raw JSON-LD for inspection

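# grep matches line by line and Shopify's JSON-LD usually spans several lines,
# so flatten the HTML to a single line before extracting the script blocks
curl -s https://your-store.com/products/your-product-handle \
  | tr -d '\n' \
  | grep -o '<script type="application/ld+json">[^<]*</script>' \
  | sed 's/<[^>]*>//g'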

This prints the raw JSON to your terminal. Pipe it to a JSON formatter (| node -e "process.stdin.resume();let d='';process.stdin.on('data',c=>d+=c);process.stdin.on('end',()=>console.log(JSON.stringify(JSON.parse(d),null,2)))") to check for Liquid syntax leaking through, empty strings in critical fields, or malformed JSON that would cause a parse error. If the page contains more than one JSON-LD block, format them one at a time, because JSON.parse rejects several objects concatenated together.

5 silent failure patterns and how to fix them

These patterns are "silent" because they produce no visible page error — your store works fine for human visitors. But they reliably cause AI shopping agents to either skip your product or index it with degraded signals.

01. Liquid variable rendering as empty string in JSON-LD
When a product is missing a metafield, variant attribute, or description, Shopify Liquid outputs an empty string, and your JSON-LD ends up with "price": "" or "description": "". This fails schema validation even though the JSON is syntactically valid. The product will either be excluded from AI recommendation sets or demoted due to incomplete signals.
How to detect: The Rich Results Test shows "Invalid value for field 'price'" on the affected products.
Fix: Add Liquid conditionals around each property so it is emitted only when the value is non-empty: {%- if product.price > 0 -%}"price": {{ product.price | money_without_currency | remove: "," }}{%- endif -%}. When a property is omitted this way, make sure the commas around it still leave valid JSON. For description, use a fallback chain: product description, then product type, then a static default that describes the product category.
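Empty strings and leaked Liquid delimiters can be found mechanically once the JSON-LD is parsed. A minimal sketch that walks a parsed block and reports the path of every suspect value; the function name and the two labels it returns are illustrative choices, not output from any of the tools above.

```python
import re

# Any Liquid output or tag delimiter surviving in rendered JSON is a leak.
LIQUID_DELIMITER = re.compile(r"\{\{|\}\}|\{%|%\}")


def find_suspect_values(node, path=""):
    """Return (path, reason) pairs for empty or Liquid-tainted strings."""
    suspects = []
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}" if path else key
            suspects.extend(find_suspect_values(value, child_path))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            suspects.extend(find_suspect_values(value, f"{path}[{index}]"))
    elif isinstance(node, str):
        if node.strip() == "":
            suspects.append((path, "empty string"))
        elif LIQUID_DELIMITER.search(node):
            suspects.append((path, "liquid syntax"))
    return suspects
```

Running it on a block with "description": "" and an Offer whose price is "" would report description and offers[0].price, which are exactly the paths a developer needs to wrap in a conditional.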
02. Availability value not a recognized schema.org URI
Shopify Liquid exposes availability as a boolean (product.available, variant.available), and many themes turn it into a short string such as "in stock", "out of stock", or "preorder" and write that directly into the JSON-LD availability field. The schema.org spec expects a full URI like https://schema.org/InStock. AI shopping agents that implement the full spec will reject the bare string and treat the product as having no availability signal, which often means they won't recommend it for availability-sensitive queries ("in stock now", "ships today").
How to detect: The Schema.org Validator shows "Invalid value for property availability", or the Rich Results Test shows "Invalid value for field 'availability'".
Fix: Emit the schema.org URI directly from the availability boolean with a Liquid conditional block:

{%- if product.available -%}https://schema.org/InStock{%- else -%}https://schema.org/OutOfStock{%- endif -%}

For pre-order products, add a check against a custom metafield (the namespace and key depend on how your store defines it): if product.metafields.availability.is_preorder == true, output https://schema.org/PreOrder instead.
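When auditing extracted JSON-LD, it helps to know which availability values could be repaired and which are unrecognizable. A minimal sketch of that mapping in Python; the lookup table covers only a subset of schema.org's ItemAvailability members, and the function name is illustrative.

```python
SCHEMA_ORG = "https://schema.org/"

# Subset of schema.org ItemAvailability members, keyed by a normalized form.
KNOWN_AVAILABILITY = {
    "instock": "InStock",
    "outofstock": "OutOfStock",
    "preorder": "PreOrder",
    "backorder": "BackOrder",
}


def normalize_availability(value):
    """Return the full schema.org URI for a recognizable value, else None."""
    if not isinstance(value, str):
        return None
    bare = value.strip()
    for prefix in ("https://schema.org/", "http://schema.org/"):
        if bare.startswith(prefix):
            bare = bare[len(prefix):]
    key = bare.replace(" ", "").replace("_", "").replace("-", "").lower()
    name = KNOWN_AVAILABILITY.get(key)
    return SCHEMA_ORG + name if name else None
```

A return value of None marks the strings that need a template fix rather than a simple mapping, such as "available" or an empty string.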
03. HTML entities encoded inside JSON-LD strings
Some Shopify themes run product names and descriptions through Liquid's escape filter before embedding them in JSON-LD. The result is a product name like "Women&#39;s Running Shoes" inside the JSON string. JSON-LD parsers don't decode HTML entities; they treat &#39; as literal text, so that garbled name is what lands in the AI agent's index. This corrupts the name, breaks keyword matching, and makes your product appear in AI recommendations with garbled metadata.
How to detect: Curl the product page, extract the raw JSON-LD, and search for &amp;, &#39;, &quot;, or &lt; inside string values.
Fix: In Shopify Liquid, use product.title | strip_html (not | escape or | xml_escape) when embedding values into JSON-LD. The escape filter is for HTML contexts. For JSON strings, use | strip_html | replace: '"', '\"' to handle embedded quotation marks without HTML-encoding ampersands and apostrophes. Shopify's json filter (product.title | strip_html | json) is another option: it emits an already quoted and escaped JSON string, so it also covers backslashes and line breaks.
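The detection step can be automated by comparing each string with its HTML-unescaped form: if the two differ, an entity was encoded into the value. A minimal sketch using Python's standard html module; the function name is illustrative.

```python
import html


def find_encoded_entities(strings):
    """Map each string containing HTML entities to its decoded form."""
    flagged = {}
    for text in strings:
        decoded = html.unescape(text)
        if decoded != text:
            flagged[text] = decoded
    return flagged
```

A bare ampersand followed by a space, as in "Salt & Pepper Mill", is left alone, so only genuinely encoded values are flagged. Because html.unescape also decodes some entities written without a trailing semicolon, treat the result as a list of candidates to eyeball rather than a verdict.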
04. Price formatted as a currency string instead of a number
Shopify Liquid's money filter outputs a human-formatted price string like "$29.99" or "USD 29.99". When this is embedded directly into the JSON-LD price property, the value is a string containing a currency symbol, not a number. The schema.org spec defines price as a Number or a string representation of a number without currency symbols. Some AI shopping agents accept formatted strings; many reject the currency symbol prefix and either exclude the price signal or fail to parse the offer entirely.
How to detect: Run the Schema.org Validator and look for the "The 'price' property has a value that does not look like a number" warning. Or curl the product and grep for the JSON-LD price value; if it starts with a currency symbol, it's malformed.
Fix: Use Shopify Liquid's money_without_currency filter and strip the thousands comma: {{ variant.price | money_without_currency | remove: "," }}. For a store whose currency format uses a period as the decimal separator, this outputs 29.99, a plain decimal with no currency symbol, which is the form the schema.org price definition describes. If the store's format uses a decimal comma (29,99), removing commas would corrupt the value, so check the rendered output after the change.
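To see which extracted price values are recoverable, a small cleaner is enough. This sketch assumes a period decimal separator and comma thousands grouping, and returns None for anything that doesn't fit that shape, so a decimal-comma value is never silently turned into the wrong number. The function name is illustrative.

```python
import re

# Plain digits, or comma-grouped thousands, with an optional decimal part.
GROUPED_NUMBER = re.compile(r"^(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]+)?$")


def clean_price(formatted):
    """Strip currency symbols and thousands commas, or return None if ambiguous."""
    candidate = re.sub(r"[^0-9.,]", "", formatted)
    if not GROUPED_NUMBER.match(candidate):
        return None
    return candidate.replace(",", "")
```

Values that come back as None (such as "29,99 €") are the ones where the theme's money format needs a closer look before any automated fix.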
05. Structured data present on homepage but absent from product pages
AI shopping agents crawl product page URLs, not the homepage, because that's where the actual product data lives. Some Shopify themes (especially older Debut-era themes) output rich structured data on the homepage (Organization, WebSite, perhaps a FeaturedCollection), but product pages only get a minimal or malformed Product block. The homepage test passes; the product pages fail. Since most merchants test their homepage first, this is one of the most common reasons an audit shows "structured data working" while the actual catalog is invisible to AI agents.
How to detect: Test three different URLs with the Rich Results Test: homepage, a product page, a collection page. Compare the detected item types. A healthy Shopify store should return "Product" on product pages with a complete Offer block.
Fix: Check your theme's product.liquid template (in Shopify Online Store 2.0, sections/main-product.liquid) for a script type="application/ld+json" block. If it's missing, add a Product schema block to the template. Do not rely on app-injected structured data added to the page footer by JavaScript. The block can sit in the <head> or the <body>; what matters is that the theme renders it server-side on every product page.
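The homepage-versus-product comparison can be scripted once you have parsed JSON-LD for each URL. A minimal sketch, assuming a dict that maps each URL path to its list of parsed blocks; the function names and the "/products/" path test are illustrative assumptions based on Shopify's default URL structure.

```python
def collect_types(node):
    """Gather every @type value in a JSON-LD structure, including @graph members."""
    found = set()
    if isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            found.add(declared)
        elif isinstance(declared, list):
            found.update(item for item in declared if isinstance(item, str))
        for value in node.values():
            found |= collect_types(value)
    elif isinstance(node, list):
        for item in node:
            found |= collect_types(item)
    return found


def product_pages_missing_product(pages):
    """Return product URLs whose JSON-LD declares no Product type."""
    missing = []
    for url, blocks in pages.items():
        if "/products/" in url and "Product" not in collect_types(blocks):
            missing.append(url)
    return missing
```

A homepage that declares only Organization is fine here; the check flags only product URLs that lack a Product item.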

How CatalogScan's automated audit fits into the workflow

The three-tool manual workflow above is thorough but time-consuming — especially for catalogs with hundreds or thousands of products where the Rich Results Test and Schema.org Validator require individual URL checks. CatalogScan's automated scan addresses the catalog-scale problem: it crawls your entire product feed the way AI shopping agents do, extracts the JSON-LD from every product page, and scores each of the 18 AI-agent-critical signals across your full catalog.

The output maps directly to what the three-tool workflow would find if you ran it on every product:

| Manual tool | What it checks | CatalogScan equivalent |
| --- | --- | --- |
| Rich Results Test | Per-product Google parse validity + required field presence | JSON-LD parse score across all products + signal-level pass/fail breakdown |
| Schema.org Validator | Full spec compliance, type correctness, deprecated properties | Property-level validation including availability URI format, price number type, GTIN digit count |
| Search Console | Catalog-wide error clustering, affected URL counts | "Top 5 fixes" report: errors ranked by how many products are affected and estimated AI visibility impact |
| curl + grep | Crawler accessibility, server-rendered JSON-LD presence | Crawlability check included in scan; flags bot-block patterns by user agent |

The manual workflow is still valuable for two reasons: it lets you verify CatalogScan's findings independently before making theme changes, and it gives you the context to explain specific errors to a developer in terms of the exact field, tool, and error message they'll see when they test the fix.

Recommended workflow: Run a CatalogScan scan to identify which products and which specific error types are affecting your catalog most broadly. Then use the Rich Results Test and Schema.org Validator to verify the top 3 error types on representative product pages before and after making theme changes. Use Search Console to confirm the fix has propagated across the crawled catalog 1–2 weeks after deployment.

10-step structured data audit checklist

Frequently asked questions

Does Shopify's default Dawn theme have correct structured data for AI shopping agents?

Dawn outputs the basic Product + Offer JSON-LD block with name, description, image, price, priceCurrency, and availability. However, it uses bare availability strings (InStock / OutOfStock) rather than schema.org URIs, it doesn't include GTIN or MPN fields even when metafields are populated, and it doesn't handle the case where a product has no active variants (which causes an empty Offer block). For basic Google rich results eligibility, Dawn is sufficient. For AI shopping agent optimization — where GTIN, MPN, brand, and condition signals all affect recommendation ranking — Dawn's output needs to be extended.

How often should I run a structured data audit?

Run the full 3-tool audit whenever you: upgrade your Shopify theme, install or update a JSON-LD app, modify your product.liquid template, or change your pricing structure (especially if adding multi-currency or Shopify Markets). For ongoing monitoring, Google Search Console's structured data report will flag new errors automatically as Googlebot crawls your catalog. For AI shopping agent-specific monitoring, a monthly CatalogScan run catches signal degradation that Search Console doesn't track (like GTIN coverage declining as new products are added without GTINs).

If the Rich Results Test passes, does that mean AI shopping agents can read my structured data?

Not necessarily. The Rich Results Test validates against Google's rich-result eligibility rules, which are a subset of the full schema.org specification. AI shopping agents like ChatGPT Shopping and Perplexity Commerce implement broader portions of the schema.org spec — they use GTIN, MPN, brand, material, condition, and other properties that aren't required for Google's Product rich results. A product that passes the Rich Results Test but is missing GTIN and brand will be eligible for Google shopping rich results but may rank lower (or be excluded) in AI agent recommendation sets for brand-specific or GTIN-matched queries.

What's the difference between an error and a warning in the Rich Results Test?

Errors in the Rich Results Test indicate that a required property is missing or has an invalid value that prevents the page from qualifying for any rich result type. Warnings indicate that a recommended property is missing — the page can still qualify for rich results, but the quality and ranking may be lower. For AI shopping agents, the error/warning distinction is less meaningful: both represent incomplete signals that reduce the agent's ability to match your product to relevant queries. Treat all Rich Results Test warnings as issues to fix for AI shopping optimization, not just errors.

Can I test structured data without a live public URL?

Yes — both the Schema.org Validator and Rich Results Test support "Direct Input" mode where you can paste in raw HTML or JSON-LD text. This is useful for testing structured data in a development or staging environment before deploying to production. For the Rich Results Test, use the "Code" tab (the small icon next to the URL field) to switch to paste mode. Note that Direct Input mode doesn't execute JavaScript, so if your theme injects structured data via JavaScript, paste in the JSON-LD from the post-JavaScript DOM (copy from the browser's Inspect panel, not View Source).

See exactly which structured data signals are failing across your catalog

CatalogScan checks all 18 AI-agent-critical signals across your full product catalog — not just one URL at a time. 90 seconds, no login required.
