SEO Guide · 2026
Shopify Sitemap Optimization for AI Search Agents
Shopify auto-generates a sitemap index at /sitemap.xml with sub-sitemaps for products, collections, pages, and blogs. For traditional SEO, this is sufficient. For AI shopping agent inclusion in 2026 — where Google Shopping, Bing's Shopping Graph, and LLM crawlers each have different needs — there are real gaps worth addressing.
Submit sitemap_products_1.xml directly to Bing Webmaster Tools and Google Search Console.
Shopify's sitemap structure
The sitemap index at /sitemap.xml references four sub-sitemaps, each auto-generated and maintained by Shopify:
| Sub-sitemap | Contents | Relevant to AI agents? |
|---|---|---|
| /sitemap_products_1.xml | All published products (canonical /products/ URLs only, not collection-path duplicates) | Critical: this is the Shopping Graph discovery file |
| /sitemap_collections_1.xml | All published collections | Moderate: helps agents understand category taxonomy |
| /sitemap_pages_1.xml | All published Shopify Pages (not blog posts) | Moderate: includes policy pages and landing pages created via Shopify Pages |
| /sitemap_blogs_1.xml | All published blog posts across all blogs | Moderate: content crawl for LLM training crawlers |
The sitemap_products_1.xml file is the most important for AI shopping agent inclusion. It uses a suffix number (_1) because Shopify can generate multiple product sitemap files if the product count exceeds the per-file limit. For large stores, check if /sitemap_products_2.xml exists.
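To check how many product sitemap files a store actually exposes, one option is to parse the sitemap index rather than guess at suffix numbers. The sketch below assumes the index follows the standard sitemaps.org schema; `SAMPLE_INDEX` is a hypothetical stand-in for the body you would fetch from /sitemap.xml, and the function names are illustrative.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_sub_sitemaps(index_xml: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap index document."""
    root = ET.fromstring(index_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:sitemap/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def product_sitemaps(index_xml: str) -> list[str]:
    """Keep only the product sub-sitemaps (sitemap_products_N.xml)."""
    return [u for u in list_sub_sitemaps(index_xml) if "/sitemap_products_" in u]


# Hypothetical index for illustration; a real one comes from /sitemap.xml.
SAMPLE_INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://yourstore.com/sitemap_products_1.xml</loc></sitemap>
  <sitemap><loc>https://yourstore.com/sitemap_products_2.xml</loc></sitemap>
  <sitemap><loc>https://yourstore.com/sitemap_collections_1.xml</loc></sitemap>
  <sitemap><loc>https://yourstore.com/sitemap_pages_1.xml</loc></sitemap>
  <sitemap><loc>https://yourstore.com/sitemap_blogs_1.xml</loc></sitemap>
</sitemapindex>"""

if __name__ == "__main__":
    for url in product_sitemaps(SAMPLE_INDEX):
        print(url)
```

If more than one product file comes back, each one is worth submitting and monitoring separately.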
What Shopify's sitemap does well
Before covering gaps, it's worth noting what Shopify handles correctly and doesn't need manual intervention:
- Canonical URL discipline. sitemap_products_1.xml only lists /products/[handle] URLs, never collection-product path duplicates. This is correct and avoids canonicalization confusion.
- Real-time lastmod dates. Shopify updates lastmod timestamps accurately when products are edited. This helps crawlers prioritize recently updated product pages.
- No noindex pages included. Products marked as hidden (draft status) are not included in the sitemap.
- Automatic removal. Deleted products are removed from the sitemap on the next Shopify refresh cycle, usually within hours.
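The lastmod behavior described above can be spot-checked by listing which product URLs report a modification after a given date. This is a rough sketch that assumes each `<url>` entry carries a W3C-style `<lastmod>` value; the sample XML, handles, and dates are made up for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_lastmod(value: str) -> datetime:
    """Parse a lastmod value; a missing timezone is treated as UTC."""
    parsed = datetime.fromisoformat(value.strip().replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def urls_modified_since(urlset_xml: str, cutoff: datetime) -> list[str]:
    """Return the <loc> of every entry whose lastmod is at or after cutoff."""
    root = ET.fromstring(urlset_xml)
    recent = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", namespaces=SITEMAP_NS)
        lastmod = url.findtext("sm:lastmod", namespaces=SITEMAP_NS)
        if not loc or not lastmod:
            continue  # entries without lastmod cannot be compared
        if parse_lastmod(lastmod) >= cutoff:
            recent.append(loc.strip())
    return recent


# Hypothetical product sitemap excerpt.
SAMPLE_URLSET = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yourstore.com/products/a</loc>
    <lastmod>2026-01-10T08:00:00-05:00</lastmod>
  </url>
  <url>
    <loc>https://yourstore.com/products/b</loc>
    <lastmod>2025-11-02T12:00:00-05:00</lastmod>
  </url>
</urlset>"""

if __name__ == "__main__":
    cutoff = datetime(2026, 1, 1, tzinfo=timezone.utc)
    print(urls_modified_since(SAMPLE_URLSET, cutoff))
```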
The three gaps to fix
Gap 1: Custom pages outside Shopify CMS are missing
If you've built landing pages, SEO guides, tool pages, or other content outside Shopify's Pages CMS (e.g., as static HTML files, as a headless frontend, or in a separate directory), these URLs will not appear in any Shopify-generated sitemap. AI crawlers that rely solely on /sitemap.xml for discovery will miss them entirely.
Fix: Maintain a supplemental sitemap (e.g., /sitemap-custom.xml) for any URLs outside the Shopify CMS, and reference it in your robots.txt:
# In robots.txt, reference all sitemaps explicitly:
Sitemap: https://yourstore.com/sitemap.xml
Sitemap: https://yourstore.com/sitemap-custom.xml
Submit both sitemaps to Google Search Console and Bing Webmaster Tools as separate entries.
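A supplemental sitemap is just a small XML file in the sitemaps.org format, so it can be generated from a list of URLs. The following is a minimal sketch, not a prescribed workflow: `build_sitemap` and the example URL are hypothetical, and where you host the resulting file depends on how your non-Shopify pages are served.

```python
import xml.etree.ElementTree as ET

SITEMAP_XMLNS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries: list[tuple[str, str]]) -> str:
    """Build a sitemap document from (url, lastmod) pairs.

    lastmod is passed through as given, e.g. "2026-01-05".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_XMLNS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc  # special characters are escaped on output
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical custom page that lives outside the Shopify CMS.
    print(build_sitemap([("https://yourstore.com/guides/sizing", "2026-01-05")]))
```

Write the output to the path you reference in robots.txt (for example /sitemap-custom.xml) and regenerate it whenever the custom pages change.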
Gap 2: No robots.txt sitemap reference in some themes
The sitemaps.org protocol lets you reference your sitemap(s) in robots.txt with a Sitemap: line, so any crawler can discover them without prior knowledge of the URL. Shopify's default robots.txt template does include a sitemap reference in recent versions, but some older or custom themes do not. Verify:
curl -s "https://yourstore.com/robots.txt" | grep -i sitemap
# Should return: Sitemap: https://yourstore.com/sitemap.xml
If missing, add the sitemap line to your Shopify robots.txt template (Online Store → Themes → Edit code → Templates → robots.txt.liquid).
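The same check can be scripted with Python's standard `urllib.robotparser`, which collects Sitemap: lines while parsing. This is a sketch under the assumption that you already have the robots.txt body as text; `SAMPLE_ROBOTS` is a hypothetical file, not Shopify's actual default.

```python
from urllib.robotparser import RobotFileParser


def declared_sitemaps(robots_txt: str) -> list[str]:
    """Return the sitemap URLs declared in a robots.txt body (empty if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap: line is present (Python 3.8+).
    return parser.site_maps() or []


# Hypothetical robots.txt content for illustration.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /cart
Disallow: /checkout

Sitemap: https://yourstore.com/sitemap.xml
Sitemap: https://yourstore.com/sitemap-custom.xml
"""

if __name__ == "__main__":
    found = declared_sitemaps(SAMPLE_ROBOTS)
    print(found if found else "No Sitemap: line found")
```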
Gap 3: No real-time indexing notification
Shopify updates its sitemap in near-real-time, but search engines discover sitemap changes on their own crawl schedule — typically weekly or slower for smaller stores. For time-sensitive products (new launches, limited inventory), waiting for organic re-crawl is too slow.
Fix: Use IndexNow for Bing and Yandex, and Search Console for Google. IndexNow is a push protocol that notifies participating search engines as soon as a URL changes:
curl -X POST "https://api.indexnow.org/indexnow" \
  -H "Content-Type: application/json" \
  -d '{
    "host": "yourstore.com",
    "key": "YOUR-INDEXNOW-KEY",
    "urlList": [
      "https://yourstore.com/products/new-product-handle"
    ]
  }'
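IndexNow notifications typically reach Bingbot (which feeds ChatGPT Shopping) within minutes. For Google, request indexing through Search Console's URL Inspection tool for individual product pages. Google documents its Indexing API only for job-posting and livestream pages, so treat it as unsupported for product URLs.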
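For stores that want to send these notifications from a script rather than by hand, the curl request translates to a few lines of standard-library Python. This is a sketch only: the helper names are hypothetical, the key is a placeholder, and dropping URLs that are not on the declared host is a conservative choice made here, not something the article prescribes.

```python
import json
import urllib.request
from urllib.parse import urlparse

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_payload(host: str, key: str, urls: list[str]) -> dict:
    """Build the JSON body, keeping only URLs that belong to the given host."""
    on_host = [u for u in urls if urlparse(u).netloc == host]
    return {"host": host, "key": key, "urlList": on_host}


def submit_indexnow(payload: dict) -> int:
    """POST the payload to the IndexNow endpoint and return the HTTP status."""
    request = urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


if __name__ == "__main__":
    payload = build_indexnow_payload(
        "yourstore.com",
        "YOUR-INDEXNOW-KEY",  # placeholder; use your real key
        ["https://yourstore.com/products/new-product-handle"],
    )
    print(json.dumps(payload, indent=2))
    # submit_indexnow(payload)  # uncomment once the key is set up
```

Batching several changed product URLs into one `urlList` keeps the number of requests low after a bulk edit.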
AI crawler behavior and sitemaps: what each agent actually does
| Crawler | Uses sitemap? | Primary use |
|---|---|---|
| Googlebot (Shopping Graph) | Yes | Uses sitemap_products_1.xml for product page discovery; prioritizes pages with recent lastmod |
| Bingbot (Shopping Graph / ChatGPT) | Yes | Actively follows sitemap submissions; IndexNow notifications trigger immediate recrawl |
| GPTBot (OpenAI training) | Occasionally | Primarily follows links; uses robots.txt more than sitemap |
| PerplexityBot | Occasionally | Link-following crawler; sitemap provides fallback discovery for orphaned pages |
| ClaudeBot (Anthropic) | Occasionally | Link-following; robots.txt compliance; sitemap used as breadcrumb when links are sparse |
| Google-Shopping | Yes | Dedicated product crawler that treats sitemap_products_1.xml as authoritative product list |
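Since several of these crawlers lean on robots.txt, it can be useful to confirm that none of them is blocked from product pages. The sketch below uses `urllib.robotparser` to evaluate a robots.txt body per user agent; the agent tokens are taken from the table above, and `SAMPLE_ROBOTS` plus the product URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the crawler table; adjust to the agents you care about.
AI_USER_AGENTS = ["Googlebot", "Bingbot", "GPTBot", "PerplexityBot", "ClaudeBot"]


def crawl_access(robots_txt: str, url: str, agents=AI_USER_AGENTS) -> dict[str, bool]:
    """Map each user agent to whether robots.txt allows it to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical robots.txt that blocks one crawler entirely.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /cart

User-agent: GPTBot
Disallow: /
"""

if __name__ == "__main__":
    result = crawl_access(SAMPLE_ROBOTS, "https://yourstore.com/products/example")
    for agent, allowed in result.items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Python's parser follows its own matching rules, so treat the result as a first-pass check rather than a guarantee of how each crawler interprets the file.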
FAQ
Where is the Shopify sitemap?
Shopify's sitemap index is at /sitemap.xml (e.g., https://yourstore.com/sitemap.xml). This references sub-sitemaps: /sitemap_products_1.xml, /sitemap_collections_1.xml, /sitemap_pages_1.xml, and /sitemap_blogs_1.xml. These are automatically generated and updated by Shopify — no manual management needed.
Can I add custom URLs to Shopify's sitemap?
Not directly: the Shopify admin has no way to edit the auto-generated sitemap. To add custom pages, either create them as Shopify Pages (they then appear automatically in sitemap_pages_1.xml) or maintain a separate custom sitemap file, reference it in robots.txt, and submit it in Google Search Console and Bing Webmaster Tools. This covers headless frontends, static HTML landing pages, and tool pages outside the Shopify CMS.
Do AI crawlers use sitemaps?
It depends on the agent. LLM training crawlers (GPTBot, ClaudeBot, PerplexityBot) primarily follow links and rely more on robots.txt than sitemaps. Google Shopping and Bingbot's Shopping Graph crawlers actively use the product sitemap for discovery and prioritization. Submitting sitemap_products_1.xml to Google Search Console and Bing Webmaster Tools is the most reliable way to ensure Shopping Graph inclusion.
How often does Shopify update its sitemap.xml?
Shopify updates sub-sitemaps in near-real-time as products, pages, collections, and blog posts are added, modified, or removed. The lastmod date reflects the last modification. However, Shopify doesn't automatically notify search engines when the sitemap changes — use IndexNow (Bing) or Google Search Console's URL Inspection for important new or changed product pages to trigger immediate recrawl.