Shopify SKU coverage on variants
A SKU is your private internal identifier for a specific variant — the medium navy crew-neck, not the t-shirt product page. GTIN identifies the same product across the world; SKU identifies the same variant inside your stack. AI shopping agents care about SKUs because once they have decided to surface your product, the user's follow-up actions — re-ordering, caching, "buy the same one again" — all need a stable handle to that exact variant. A blank or duplicate SKU forces the agent to guess; an agent that has to guess across catalogs decides "low confidence, skip." Shopify exposes SKUs at variants[].sku in the public products feed, and the bulk-fill is one of the cheapest score moves a Shopify operator can make.
We pull your /products.json feed (up to 250 products, all variants per product) and count the share of variants where sku is non-null and non-empty. Full credit (6 pts) at ≥90% coverage. Half credit (3 pts) at 50–89%. Zero below 50%. We also flag two "passes the static check but loses on retrieval" patterns separately: collisions (the same SKU on multiple variants) and ephemeral SKUs (auto-generated UUID-like strings that change between deploys).
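The scoring rule above can be sketched as a small function. This is a minimal illustration, not the scanner's actual code: the name scoreSkuCoverage and the shape of the returned object are assumptions, and the input is assumed to be an already-parsed /products.json response.

```javascript
// Sketch of the coverage score: 6 pts at >=90%, 3 pts at 50-89%, 0 below 50%.
// scoreSkuCoverage and its return shape are illustrative names, not a real API.
function scoreSkuCoverage(feed) {
  // Flatten every variant of every product in the parsed feed.
  const variants = (feed.products ?? []).flatMap((p) => p.variants ?? []);
  const total = variants.length;
  // A SKU counts only when it is neither null/undefined nor the empty string.
  const filled = variants.filter((v) => v.sku != null && v.sku !== "").length;
  const coverage = total === 0 ? 0 : filled / total;
  const points = coverage >= 0.9 ? 6 : coverage >= 0.5 ? 3 : 0;
  return { total, filled, coverage, points };
}

const coverageSample = {
  products: [
    {
      variants: [
        { id: 1, sku: "TEE-NAVY-S" },
        { id: 2, sku: "" },
        { id: 3, sku: null },
        { id: 4, sku: "TEE-NAVY-L" },
      ],
    },
  ],
};

console.log(scoreSkuCoverage(coverageSample));
// { total: 4, filled: 2, coverage: 0.5, points: 3 }
```

An empty feed is treated as 0% coverage here; that is a choice made for the sketch so the function never divides by zero.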
What it is
SKU stands for "stock-keeping unit." Inside Shopify, it's a free-text field on every variant. You set it; Shopify never auto-generates one (unlike GTIN, where Shopify will accept the manufacturer-issued UPC/EAN and validate the check digit). A typical SKU looks like TEE-NAVY-M, SHOE-BLK-09-W, or your ERP's internal ID like SKU48291. The encoding doesn't matter to AI agents — they just need a stable identifier. What matters is that the same variant always returns the same SKU and that no two variants share one.
What you want
"variants": [
{ "id": 1, "title": "Navy / S",
"sku": "TEE-NAVY-S" },
{ "id": 2, "title": "Navy / M",
"sku": "TEE-NAVY-M" },
{ "id": 3, "title": "Navy / L",
"sku": "TEE-NAVY-L" }
]
What we still find
"variants": [
{ "id": 1, "title": "Navy / S",
"sku": "" },
{ "id": 2, "title": "Navy / M",
"sku": null },
{ "id": 3, "title": "Navy / L",
"sku": "" }
]
Same SKU on every variant
"variants": [
{ "id": 1, "title": "Navy / S",
"sku": "TEE-NAVY" },
{ "id": 2, "title": "Navy / M",
"sku": "TEE-NAVY" },
{ "id": 3, "title": "Navy / L",
"sku": "TEE-NAVY" }
]
Auto-generated, churns
"variants": [
{ "id": 1, "title": "Navy / S",
"sku": "auto-7f3a91b" },
{ "id": 2, "title": "Navy / M",
"sku": "auto-2c4d8e1" },
{ "id": 3, "title": "Navy / L",
"sku": "auto-9b6f3d2" }
]
// ...regenerated every PIM sync
The third shape, every variant getting the same SKU, passes a naive "is the field populated" check but fails the harder uniqueness test. Agents need to differentiate the medium from the small, and they cannot if both report TEE-NAVY. Because every field is populated, this shape can earn full credit on the static coverage signal, so we surface the collision separately as a "SKU duplicates detected" alert in the scan output.
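A uniqueness check of this kind can be sketched in a few lines. The function name findDuplicateSkus and its output shape are assumptions made for illustration; the input is a parsed /products.json response.

```javascript
// Sketch: list every SKU value that appears on more than one variant.
// findDuplicateSkus is an illustrative name, not part of any Shopify API.
function findDuplicateSkus(feed) {
  const variantIdsBySku = new Map();
  for (const product of feed.products ?? []) {
    for (const variant of product.variants ?? []) {
      // Blank SKUs are a coverage problem, not a collision, so skip them.
      if (variant.sku == null || variant.sku === "") continue;
      const ids = variantIdsBySku.get(variant.sku) ?? [];
      ids.push(variant.id);
      variantIdsBySku.set(variant.sku, ids);
    }
  }
  // Keep only the SKUs shared by two or more variants.
  return [...variantIdsBySku]
    .filter(([, ids]) => ids.length > 1)
    .map(([sku, variantIds]) => ({ sku, variantIds }));
}

const collisionSample = {
  products: [
    {
      variants: [
        { id: 1, sku: "TEE-NAVY" },
        { id: 2, sku: "TEE-NAVY" },
        { id: 3, sku: "TEE-NAVY-L" },
      ],
    },
  ],
};

console.log(findDuplicateSkus(collisionSample));
// [ { sku: 'TEE-NAVY', variantIds: [ 1, 2 ] } ]
```

The check runs across the whole feed rather than per product, so the same SKU reused on two different products is reported too.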
SKU vs GTIN vs MPN: which is which?
| Field | Scope | Issued by | Where it lives |
|---|---|---|---|
| SKU | Your store, internal | You | variants[].sku |
| GTIN (UPC/EAN) | Global, across retailers | GS1 (purchased) or manufacturer | variants[].barcode |
| MPN | Manufacturer's part number | The brand that makes the product | Metafield (e.g. custom.mpn) |
SKU and GTIN are independent — set both. SKU is the cheapest to fix; GTIN often costs $30–$1,000 if you don't have a manufacturer-issued one (see GTIN coverage). For the rare case where you sell a third-party brand and the manufacturer's GTIN is the only stable cross-retailer identifier you have, set the GTIN as both the GTIN and the SKU; it's not ideal but it beats blank.
Why AI shopping agents care
- Stable handle for re-order intent. "Buy me the same medium navy tee I got last year" is exactly the kind of follow-up an AI shopping agent is supposed to handle. Without a per-variant SKU, the agent's only handle is the variant title string ("Navy / M") which collides across stores and changes when you rename a colorway. With a SKU, the agent has a value it can cache locally and look up against your feed deterministically.
- Cross-channel reconciliation. Your Shopify catalog feeds Google Merchant, Meta product feeds, Pinterest catalog, TikTok Shop, and the Shopify Global Catalog. Each surface uses SKU as the join key. Variants without SKUs fall out of channel sync silently — they ingest as a different product on each surface, fragmenting the listing.
- Inventory truth across stores. Multi-store brands (US site + UK site + B2B portal) reconcile inventory by SKU. Agents that observe the same product across multiple Shopify storefronts use the SKU to detect "this is the same variant, just in a different region" — and route the user to their region's store. Without SKUs, the variants look unrelated.
- Confidence in tied rankings. Catalog hygiene is a positive prior on tied rankings. SKU coverage is the easiest hygiene signal to read because it's a uniform per-variant field — a 100% covered catalog reads as well-managed; a 30% covered catalog reads as a side hustle.
How to test it on your store
Two jq commands against your public products feed give you the coverage rate and the duplicate count.
# Coverage check: collect every variant SKU into one array before counting
curl -s 'https://yourstore.com/products.json?limit=250' \
  | jq -r '
      [.products[].variants[].sku] as $all
      | ($all | length) as $t
      | ($all | map(select(. != null and . != "")) | length) as $f
      | if $t == 0 then "no variants found"
        else "\($f) of \($t) variants have a SKU (\(($f*100/$t) | floor)%)" end
    '
# Duplicate check: count SKU values that appear on more than one variant
curl -s 'https://yourstore.com/products.json?limit=250' \
  | jq '[.products[].variants[].sku | select(. != null and . != "")] | group_by(.) | map(select(length > 1)) | length'
The first command returns your coverage percentage. Target ≥90%. The second returns the number of SKU values that appear on more than one variant — should be zero. Anything above zero means you have collisions, which we flag as a separate alert.
For stores with more than 250 products, append &page=2, &page=3, and so on to the URL and concatenate the results. The CatalogScan free scan does this paginated walk for you up to the first 1,000 products and surfaces the per-page SKU coverage in case you have a problem isolated to one collection.
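A paginated walk with per-page coverage might look like the sketch below. It is not the CatalogScan implementation: coverageByPage is a hypothetical name, and fetchPage is a function you supply that returns the parsed JSON for a given page number, which keeps the sketch runnable without network access.

```javascript
// Sketch: walk the feed page by page and report SKU coverage for each page.
// coverageByPage and fetchPage are illustrative names; fetchPage(page) must
// resolve to a parsed /products.json response for that page.
async function coverageByPage(fetchPage, maxProducts = 1000, pageSize = 250) {
  const report = [];
  const maxPages = Math.ceil(maxProducts / pageSize);
  for (let page = 1; page <= maxPages; page++) {
    const { products = [] } = await fetchPage(page);
    if (products.length === 0) break; // ran past the end of the catalog
    const variants = products.flatMap((p) => p.variants ?? []);
    const filled = variants.filter((v) => v.sku != null && v.sku !== "").length;
    report.push({ page, variants: variants.length, filled });
    if (products.length < pageSize) break; // a short page is the last page
  }
  return report;
}

// Possible real-world fetcher, assuming the URL shape from the curl examples:
// const fetchPage = (page) =>
//   fetch(`https://yourstore.com/products.json?limit=250&page=${page}`)
//     .then((res) => res.json());
```

A page whose filled count is far below its variants count points at a group of products that needs attention first.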
How to fix it
Shopify Admin → Products → Bulk editor. Add the SKU column. You get an editable spreadsheet view; type SKUs directly. Fast for one-off fixes; painful past a couple hundred variants. Generate values from the variant title with a consistent encoding — {PRODUCT-CODE}-{COLOR}-{SIZE} — so a human reading the SKU can decode the variant. Don't use the variant title verbatim; whitespace and slashes break some downstream channel integrations.
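One way to apply a {PRODUCT-CODE}-{COLOR}-{SIZE} encoding consistently is a small normalizing function. This is a sketch under assumptions: encodeSku and toSkuPart are illustrative names, and the normalization rules (uppercase, accents stripped, any run of other characters collapsed to one hyphen) are one reasonable choice rather than a Shopify requirement.

```javascript
// Sketch: build a human-readable SKU from a product code, a color, and a size.
// toSkuPart and encodeSku are illustrative helpers, not Shopify functions.
function toSkuPart(value) {
  return String(value ?? "")
    .normalize("NFKD")
    .replace(/[\u0300-\u036f]/g, "") // drop accent marks left by NFKD
    .toUpperCase()
    .replace(/[^A-Z0-9]+/g, "-") // whitespace, slashes, punctuation -> hyphen
    .replace(/^-+|-+$/g, ""); // no leading or trailing hyphens
}

function encodeSku(productCode, color, size) {
  // Empty parts are dropped so a missing option does not leave a double hyphen.
  return [productCode, color, size].map(toSkuPart).filter(Boolean).join("-");
}

console.log(encodeSku("tee", "Navy", "M")); // TEE-NAVY-M
console.log(encodeSku("shoe", "Blk / Wide", "09")); // SHOE-BLK-WIDE-09
```

Because the output depends only on the three inputs, running it twice on the same variant yields the same SKU, which is the stability property that matters.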
The Matrixify app exports a single CSV with one row per variant, which is easier to bulk-edit than Shopify's native CSV export. Export with the SKU column included, fill the gaps in Excel/Google Sheets using a formula (=A2&"-"&B2&"-"&C2 on product handle / option1 / option2), and re-import. Matrixify supports updating only the rows that changed, so the re-import is incremental and carries no risk of overwriting other fields. Matrixify is paid ($20–$50/mo depending on tier) but pays for itself the first time you have 5,000 variants to fill.
If your SKU rule is non-trivial (it encodes warehouse location, supplier code, or season identifier), script it. Walk the products connection via the GraphQL Admin API, generate the SKU per variant, and push it back via productVariantUpdate (newer API versions replace this mutation with productVariantsBulkUpdate):
// Pseudo-code: paginate, mutate, and encodeSku are placeholder helpers you supply
for await (const product of paginate('products')) {
  for (const variant of product.variants) {
    // Only backfill variants whose SKU is null or empty
    if (!variant.sku) {
      const sku = encodeSku(product.handle, variant.option1, variant.option2);
      await mutate('productVariantUpdate', { id: variant.id, sku });
    }
  }
}
Throttle to Shopify's GraphQL Admin API rate limit: the API meters calculated query cost in points against a bucket that refills every second, and each response reports the remaining budget under extensions.cost.throttleStatus. Set this up as a scheduled job that runs nightly to backfill any new variant; variants get added all the time and someone always forgets the SKU.
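A throttle can be driven by the cost data Shopify returns with each GraphQL Admin API response, under extensions.cost.throttleStatus (fields currentlyAvailable and restoreRate, the latter in points per second). The helper below is a sketch: msUntilAffordable is a hypothetical name, and the cost you pass in is whatever your next request is expected to need.

```javascript
// Sketch: how long to pause before sending a request of a given cost.
// throttleStatus is the object found at extensions.cost.throttleStatus in a
// GraphQL Admin API response; msUntilAffordable is an illustrative helper.
function msUntilAffordable(throttleStatus, requestedCost) {
  const { currentlyAvailable, restoreRate } = throttleStatus;
  if (currentlyAvailable >= requestedCost) return 0; // enough budget already
  const deficit = requestedCost - currentlyAvailable;
  // restoreRate is points per second, so convert the wait to milliseconds.
  return Math.ceil((deficit / restoreRate) * 1000);
}

const exampleStatus = { maximumAvailable: 1000, currentlyAvailable: 20, restoreRate: 50 };
console.log(msUntilAffordable(exampleStatus, 120)); // 2000
```

In a backfill loop you would read the status from the previous response, await a timer for the returned number of milliseconds, then send the next mutation. The numbers in exampleStatus are made up for the example; use the values your store's responses report.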
Most stores doing $1M+ already have SKUs in their inventory or ERP system that just aren't getting written into Shopify. Find the integration: NetSuite (via Celigo or Boomi), Cin7, Skubana, Linnworks, or a homegrown one. The SKU should already be the join key — write a one-time backfill script that pulls every variant's SKU from the source-of-truth and pushes to Shopify. Going forward, the integration writes SKU at variant creation. Don't try to maintain SKUs in two places.
5 mistakes themes and operators keep making
1. Same SKU on every variant of a product
Founders new to Shopify sometimes set the SKU to a product-level identifier (the parent SKU) on every variant. Static check passes, uniqueness check fails. Two variants with the same SKU look identical to an agent's reconciliation logic — when the agent caches "user re-orders TEE-NAVY," it can't pick which size to surface, so it picks the cheapest one or skips the product entirely. Always make SKUs variant-unique.
2. Auto-generated UUID SKUs that change every PIM sync
Some PIM/ERP integrations write a generated string SKU into Shopify at variant create — auto-7f3a91b, prod_8210391. Worse: some implementations regenerate on every sync, so the SKU on a given variant changes every night. Agents that cache by SKU can never re-find the variant they cited yesterday. SKUs must be stable across the variant's lifetime; if your PIM regenerates, store the PIM-generated value as a metafield and use a deterministic encoding (product handle + option) for the actual SKU field.
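Churn of this kind can be detected by comparing two snapshots of the feed taken on different days. The sketch below assumes you have saved both parsed /products.json responses yourself; findChurnedSkus is an illustrative name.

```javascript
// Sketch: report variants whose SKU changed between two saved feed snapshots.
// findChurnedSkus is an illustrative helper, not part of any Shopify API.
function findChurnedSkus(previousFeed, currentFeed) {
  // Map each variant id to its SKU for quick lookup.
  const skuByVariantId = (feed) =>
    new Map(
      (feed.products ?? []).flatMap((p) =>
        (p.variants ?? []).map((v) => [v.id, v.sku])
      )
    );
  const before = skuByVariantId(previousFeed);
  const churned = [];
  for (const [variantId, sku] of skuByVariantId(currentFeed)) {
    const old = before.get(variantId);
    // A blank SKU that later gets filled is a backfill, not churn, so only
    // flag variants that already had a SKU and now report a different value.
    if (old && old !== sku) {
      churned.push({ variantId, before: old, after: sku ?? null });
    }
  }
  return churned;
}

const monday = { products: [{ variants: [{ id: 1, sku: "auto-7f3a91b" }, { id: 2, sku: "" }] }] };
const tuesday = { products: [{ variants: [{ id: 1, sku: "auto-2c4d8e1" }, { id: 2, sku: "TEE-NAVY-M" }] }] };
console.log(findChurnedSkus(monday, tuesday));
// [ { variantId: 1, before: 'auto-7f3a91b', after: 'auto-2c4d8e1' } ]
```

Variant ids are used as the join key because they stay fixed while the SKU is the value under test.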
3. Whitespace and slashes that break downstream channel sync
Shopify accepts almost anything as a SKU — including "M / Navy" with the slash. Channels downstream are stricter: Google Merchant Center rejects slashes, Meta strips trailing whitespace, the TikTok Shop sync silently truncates at 40 characters. Your variant ingests onto each channel as a different product. Stick to A-Z 0-9 - _, max 30 characters; this is the durable subset every channel accepts.
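The "A-Z 0-9 - _, max 30 characters" rule is easy to enforce before a SKU ever reaches Shopify. The check below is a sketch of that rule only; checkSku is an illustrative name, and it does not model any individual channel's validation.

```javascript
// Sketch: validate a SKU against the conservative subset A-Z 0-9 - _ with a
// 30-character cap. checkSku is an illustrative helper.
function checkSku(sku) {
  if (typeof sku !== "string" || sku === "") return ["empty"];
  const problems = [];
  if (sku.length > 30) problems.push("longer than 30 characters");
  if (/[^A-Z0-9_-]/.test(sku)) problems.push("characters outside A-Z 0-9 - _");
  return problems; // an empty array means the SKU passes
}

console.log(checkSku("TEE-NAVY-M")); // []
console.log(checkSku("M / Navy")); // [ 'characters outside A-Z 0-9 - _' ]
```

Returning a list of problems rather than a boolean makes it simple to print a per-variant fix list from a bulk export.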
4. SKUs visible in public source — the founder is upset about this
SKUs appear in /products.json, in the rendered HTML of variant selectors, and in some review apps' rich snippets. Operators sometimes want to "hide" SKUs by leaving them blank to keep their internal coding private from competitors. Don't. The public SKU is your internal identifier exposed; if you want internal-only metadata (cost, supplier, warehouse), put it in a private metafield. Empty SKUs cost you 6 pts and a stable agent handle for a privacy benefit a determined competitor can recover anyway.
5. SKUs that change when you rename a colorway
Marketing renames "Navy" to "Midnight" in 2026. The product handle changes; the variant title changes; if your SKU encoding is {HANDLE}-{TITLE}, the SKU changes too. Caches everywhere — including agent caches that promised to re-order this exact product — go stale silently. SKUs should encode stable attributes (a season ID, an option-position number, an immutable internal code), not the public-facing variant title. If you rename, the SKU should not.
See also
- The 15 signals — full reference
- GTIN coverage on variants (the global identifier signal SKU sits next to)
- /products.json: the AI bulk-ingest feed (where SKUs surface at variants[].sku)
- Product JSON-LD on PDPs (the second place SKU should appear, in the sku property of the Product node)
- The full 18-signal Agentic Storefronts checklist
- Leaderboard: 100 DTC stores scored on SKU coverage and 14 other signals
What's your SKU coverage right now?
Free 2-minute scan. We pull your /products.json, check every variant's SKU, flag duplicates and empties, and tell you exactly which products to fix first.