SEO Guide · 2026
Shopify Headless Commerce SEO: What You Lose and How to Rebuild It
Going headless — with Shopify Hydrogen, Next.js Commerce, or a custom React storefront — gives you full design control. It also silently removes four things AI shopping agents depend on: your /products.json feed, Product JSON-LD, a compliant robots.txt, and your sitemap. None of these break loudly. Here's how each breaks and the exact rebuild for each.
What breaks when you go headless
| Lost signal | What it was | AI agent impact | Rebuild effort |
|---|---|---|---|
| /products.json | Shopify's built-in product feed endpoint (Products JSON API), available on any store at yourstore.com/products.json | Primary feed source for ChatGPT Shopping and Perplexity's catalog indexing. Without it, agents must crawl individual PDPs, which is much slower and less complete. | Medium: proxy the Storefront API or Admin API response at that path in your headless framework |
| Product JSON-LD | The <script type="application/ld+json"> block Shopify's Liquid theme injects in every PDP's <head> | Without Product JSON-LD, agents can't read GTIN, AggregateRating, ProductGroup, or Offer details from the page. They get raw HTML only, which they parse unreliably. | Medium: generate JSON-LD and render it server-side in your React/Next.js framework |
| robots.txt | Shopify generates a standard robots.txt at /robots.txt that allows all crawlers by default | Many headless frameworks return a 404 at /robots.txt by default. Under RFC 9309 a 404 is treated as allow-all, but a 5xx error at that path is treated as disallow-all, and either way you lose explicit control over crawler access. | Low: add an explicit robots.txt file in your headless app's public directory |
| sitemap.xml | Shopify generates a complete sitemap at /sitemap.xml covering all products, collections, and pages | Without a sitemap, AI crawlers must discover URLs organically from your homepage, and many product pages are never found. | Medium-high: generate dynamically from the Storefront API and serve at /sitemap.xml |
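As a rough illustration of the two lowest-level rebuilds in the table, the sketch below builds a sitemap.xml body and a permissive robots.txt as plain strings. The SitemapEntry shape, the function names, and the shop.example.com origin are all hypothetical; in a real app the entries would come from your Storefront API query, which is not shown here.

```typescript
// Hypothetical shape: only the fields a sitemap entry needs.
interface SitemapEntry {
  path: string;          // e.g. "/products/blue-widget"
  lastModified?: string; // ISO 8601 date, e.g. "2026-01-15"
}

// Escape XML special characters so URLs containing "&" stay valid XML.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a sitemap document following the sitemaps.org 0.9 schema.
function buildSitemapXml(origin: string, entries: SitemapEntry[]): string {
  const urls = entries.map((entry) => {
    const loc = escapeXml(new URL(entry.path, origin).toString());
    const lastmod = entry.lastModified
      ? `<lastmod>${escapeXml(entry.lastModified)}</lastmod>`
      : "";
    return `  <url><loc>${loc}</loc>${lastmod}</url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
  ].join("\n");
}

// Build an allow-all robots.txt that also advertises the sitemap location.
function buildRobotsTxt(origin: string): string {
  return [
    "User-agent: *",
    "Allow: /",
    `Sitemap: ${new URL("/sitemap.xml", origin).toString()}`,
    "",
  ].join("\n");
}
```

A single sitemap file is limited to 50,000 URLs by the sitemaps.org protocol, so a large catalog would need to split entries across several files behind a sitemap index.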
Rebuilding /products.json for a headless store
Shopify's standard /products.json endpoint returns paginated product data (up to 250 products per page) from the storefront. Once your custom domain points at a headless app, requests for /products.json reach that app instead of the Shopify storefront, so the endpoint is gone from your domain unless you rebuild it.
The rebuild options, in order of effort:
- Proxy the Shopify endpoint. In your Next.js app, add a route at /products.json that proxies requests to yourstore.myshopify.com/products.json. Lowest effort and accurate data, but it adds per-request latency and requires Shopify API rate limit management.
- Generate a static feed at build time. Pull from the Storefront API at build time, generate a products.json, and serve it as a static file. Fastest to serve, but data goes stale between builds, so it is not ideal for stores with frequent inventory changes.
- Use a product feed app. Apps like Flexify Feed Manager or Litcommerce generate and host a continuously updated product feed that AI agents can index without depending on your headless app.
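The proxy option could look roughly like the sketch below. It uses only the standard Request, Response, and fetch globals, so it is framework-agnostic; in a Next.js App Router project the handler would likely be exported as GET from a route file such as app/products.json/route.ts. The function names, the cache header values, and the yourstore.myshopify.com origin are assumptions, and the sketch presumes the myshopify.com endpoint still answers directly rather than redirecting to your custom domain.

```typescript
// Assumption: replace with your store's own myshopify.com hostname.
const SHOPIFY_ORIGIN = "https://yourstore.myshopify.com";

// Forward only the pagination parameters, clamping limit to Shopify's 250 maximum.
function buildUpstreamUrl(requestUrl: string, shopifyOrigin: string = SHOPIFY_ORIGIN): string {
  const incoming = new URL(requestUrl);
  const upstream = new URL("/products.json", shopifyOrigin);

  const limit = Number.parseInt(incoming.searchParams.get("limit") ?? "", 10);
  if (Number.isFinite(limit) && limit > 0) {
    upstream.searchParams.set("limit", String(Math.min(limit, 250)));
  }

  const page = Number.parseInt(incoming.searchParams.get("page") ?? "", 10);
  if (Number.isFinite(page) && page > 0) {
    upstream.searchParams.set("page", String(page));
  }

  return upstream.toString();
}

// Route-handler-shaped function. fetchImpl is injectable so the handler
// can be exercised in tests without touching the network.
async function handleProductsJson(
  request: Request,
  fetchImpl: typeof fetch = fetch,
): Promise<Response> {
  const upstream = await fetchImpl(buildUpstreamUrl(request.url), {
    headers: { Accept: "application/json" },
  });

  if (!upstream.ok) {
    // Surface upstream failures as a gateway error instead of an empty feed.
    return new Response(JSON.stringify({ error: "upstream unavailable" }), {
      status: 502,
      headers: { "Content-Type": "application/json" },
    });
  }

  return new Response(await upstream.text(), {
    status: 200,
    headers: {
      "Content-Type": "application/json",
      // Illustrative values: a short shared cache softens latency and rate limits.
      "Cache-Control": "public, s-maxage=300, stale-while-revalidate=60",
    },
  });
}
```

Whitelisting the query parameters keeps arbitrary input from being forwarded upstream, and the cache header is one way to trade a few minutes of staleness for fewer upstream requests.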
Rebuilding Product JSON-LD in Next.js / Hydrogen
In a Next.js headless storefront, add JSON-LD to every product page by including a <script> tag in your PDP component. This must be server-rendered (not client-side injected with useEffect) so crawlers see it in the raw HTML response.
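One detail worth sketching is how the JSON-LD payload is serialized into the server-rendered HTML. Product titles and descriptions can contain a literal "</script>", which would close the tag early, so a common precaution is to escape "<" in the serialized JSON. The helper names below are hypothetical.

```typescript
// Serialize JSON-LD so that no "<" can appear inside the script body.
// "\u003c" is a valid JSON escape, so parsers still read the original text.
function serializeJsonLd(data: unknown): string {
  return JSON.stringify(data).replace(/</g, "\\u003c");
}

// Produce the full tag as a string, e.g. for a server-side HTML template.
function jsonLdScriptTag(data: unknown): string {
  return `<script type="application/ld+json">${serializeJsonLd(data)}</script>`;
}
```

In a React server component, the same serializeJsonLd output can be passed to a script element through dangerouslySetInnerHTML, which keeps the block in the initial HTML response rather than adding it after hydration.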
Minimum viable Product JSON-LD for AI agent visibility:
- @type: "ProductGroup" as the outer container (not just "Product")
- hasVariant array listing each variant as a child Product with its own Offer and GTIN
- AggregateRating pulled from your review app's API
- Each Offer with real-time availability and price
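The checklist above might translate into a builder like the following sketch. The Variant and ProductInput interfaces are hypothetical stand-ins for whatever your Storefront API query and review app return; only the schema.org property names (productGroupID, hasVariant, gtin, offers, aggregateRating) are taken from the vocabulary itself.

```typescript
// Hypothetical input shapes; map your own API responses into these.
interface Variant {
  sku: string;
  title: string;
  gtin?: string;
  price: string;    // decimal string, e.g. "29.00"
  currency: string; // ISO 4217 code, e.g. "USD"
  inStock: boolean;
  url: string;
}

interface ProductInput {
  id: string;
  title: string;
  description: string;
  variants: Variant[];
  rating?: { value: number; count: number };
}

function buildProductGroupJsonLd(product: ProductInput): Record<string, unknown> {
  const jsonLd: Record<string, unknown> = {
    "@context": "https://schema.org",
    "@type": "ProductGroup",
    productGroupID: product.id,
    name: product.title,
    description: product.description,
    // Each variant becomes a child Product with its own Offer.
    hasVariant: product.variants.map((variant) => ({
      "@type": "Product",
      name: `${product.title} - ${variant.title}`,
      sku: variant.sku,
      // Only emit gtin when one exists; an empty value is worse than none.
      ...(variant.gtin ? { gtin: variant.gtin } : {}),
      offers: {
        "@type": "Offer",
        url: variant.url,
        price: variant.price,
        priceCurrency: variant.currency,
        availability: variant.inStock
          ? "https://schema.org/InStock"
          : "https://schema.org/OutOfStock",
      },
    })),
  };

  // Skip aggregateRating entirely when there are no reviews yet.
  if (product.rating && product.rating.count > 0) {
    jsonLd.aggregateRating = {
      "@type": "AggregateRating",
      ratingValue: product.rating.value,
      reviewCount: product.rating.count,
    };
  }

  return jsonLd;
}
```

Because price and availability are read at call time, running this builder inside the server render of each PDP request (rather than at build time) is what keeps the Offer data current.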
FAQ
Does Shopify Hydrogen (Remix-based) handle JSON-LD automatically?
Hydrogen provides React components for rendering structured data via the @shopify/hydrogen package, but it doesn't generate ProductGroup JSON-LD by default — you must add it. The Product component generates basic Product schema; you need to augment it with ProductGroup and hasVariant manually.
Will AI agents crawl my myshopify.com domain instead of my custom headless domain?
Only if you've set up canonical URLs pointing there. Agents typically crawl the primary domain (your custom headless domain). If your headless store doesn't serve the AI-required endpoints, agents won't find them at myshopify.com either — that URL is typically redirected to the custom domain.
My store scores low on CatalogScan. How do I know if it's the headless setup or something else?
CatalogScan distinguishes headless-specific failures from standard catalog gaps in the scan report. Signals like "products.json not found" and "Product JSON-LD missing or malformed" are typical headless indicators. Run the free scan — the top-5 findings will tell you whether you're hitting a headless-origin gap or a standard catalog hygiene issue.
Check if your headless store is AI-visible
CatalogScan scans your public storefront endpoints — no Shopify login required. See exactly which signals are missing and get a fix priority list.