March 16, 2026
Your shoppable videos are invisible to AI agents
AI agents are becoming shoppers. ChatGPT handles 50 million shopping queries a day. Google AI Mode has native checkout. Copilot lets people buy mid-conversation. Shopify just launched Agentic Storefronts, plugging a million-plus merchants into every major AI platform at once. Most shoppable video apps aren't ready for any of this.

How AI agents actually discover products
To understand why video metadata matters, you first need to understand how AI-powered product discovery works. It's fundamentally different from traditional search.
When someone asks ChatGPT "find me a summer dress with video reviews that runs true to size," the system doesn't return a list of links. It doesn't even work like Google. Instead, a specialized shopping model — trained specifically for product research — kicks in. It asks clarifying questions. It researches across multiple sources. Then it builds a personalized recommendation with specific products, explaining why each one fits.
Behind the scenes, AI agents pull product information from three separate channels:
Product feeds. Shopify Catalog, Google Merchant Center, and direct merchant feed submissions provide structured product data — titles, descriptions, prices, images, availability. This is the primary data source for ChatGPT Instant Checkout and Google AI Mode shopping. For Shopify merchants, this happens automatically through Agentic Storefronts.
Web crawling. AI platforms operate their own crawlers — OAI-SearchBot for ChatGPT, Google's various bots, PerplexityBot, and others. These crawlers visit product pages, read the HTML, and parse any structured data they find. This is the organic discovery path, and it's where Schema.org markup (like VideoObject JSON-LD) becomes critical.
Third-party signals. Reviews on Reddit, expert roundups, comparison sites, and user-generated content across the web all feed into how AI systems evaluate and rank products. ChatGPT's shopping model was specifically trained to read trusted sites and synthesize information across many sources.
Here's the thing most merchants and app developers miss: product feeds contain basic product data — titles, images, prices. They don't contain video metadata. They don't tell an AI agent that your product page has a UGC try-on haul where a real customer talks about fit, fabric feel, and sizing. That information only exists on the product page itself, in the structured data.
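To make the gap concrete, here's a sketch of a typical feed entry. The field names are modeled on common Google Merchant Center attributes; the exact schema varies by platform, and the values are illustrative:

```python
# A typical product feed entry: structured, but video-blind.
feed_item = {
    "id": "dress-ss26-001",
    "title": "White Satin Prom Dress",
    "description": "Floor-length satin dress with corset back.",
    "price": "189.00 USD",
    "image_link": "https://example.com/dress.jpg",
    "availability": "in_stock",
}

# No field here says "this product page has a UGC try-on video
# discussing fit and sizing" -- that signal can only live in the
# page's own structured data (e.g. VideoObject JSON-LD).
video_fields = [k for k in feed_item if "video" in k]
print(video_fields)  # []
```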
And that's where the problem starts.
The JavaScript blind spot
Most Shopify apps — including nearly every shoppable video app on the market — inject their content into product pages using JavaScript. The widget loads, the videos appear, and if the app includes any structured data at all, it gets injected into the DOM dynamically via JS.
For human visitors, this works fine. They see the videos. They can watch, click, and buy.
For Google's crawler, this mostly works too. Googlebot uses a headless Chrome browser that executes JavaScript, renders the page, and can read dynamically injected structured data. This is why apps can claim their structured data is "SEO-ready" — because Google can see it.
But here's what nobody talks about: every other AI agent that matters for agentic commerce does not execute JavaScript.
ChatGPT's OAI-SearchBot fetches raw HTML. It does not render JavaScript. If your VideoObject JSON-LD is injected via JS, OAI-SearchBot sees an empty page where your video metadata should be.
Perplexity's crawler fetches raw HTML. Same result. No video metadata.
Claude, when used as a shopping research tool, fetches raw HTML through web search. No JS execution. No video metadata.
Microsoft Copilot's crawler — same story.
When these AI agents visit a product page with a JS-injected shoppable video widget, they see the product itself (title, price, images from Shopify's server-side rendered Liquid templates), but the videos? The rich descriptions of what's in those videos? The language, the content type, the key moments? Completely invisible.
Any shoppable video app that injects structured data via JavaScript and claims to be ready for agentic commerce is telling you half the truth. They're visible to Google Search. They're invisible to ChatGPT, Perplexity, Google AI Mode's organic crawling, and every other AI agent that reads raw HTML.
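You can simulate what a non-JS crawler sees yourself. The sketch below parses raw HTML (what you'd get from a plain HTTP fetch, with no rendering step) and looks for VideoObject JSON-LD; the sample pages are illustrative:

```python
import json
import re

def find_video_objects(html: str) -> list[dict]:
    """Extract VideoObject JSON-LD blocks from raw, un-rendered HTML --
    i.e. what a crawler that doesn't execute JavaScript actually sees."""
    blocks = re.findall(
        r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
        html, re.DOTALL | re.IGNORECASE,
    )
    found = []
    for block in blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # malformed JSON-LD is just as invisible to agents
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and item.get("@type") == "VideoObject":
                found.append(item)
    return found

# Server-side rendered markup: present in the raw HTML.
ssr_page = """
<html><head>
<script type="application/ld+json">
{"@type": "VideoObject", "name": "Try-On Review", "duration": "PT45S"}
</script>
</head><body>...</body></html>
"""

# JS-injected markup: the widget script that *would* add the JSON-LD
# never runs for a crawler that doesn't execute JavaScript.
js_page = '<html><body><script src="widget.js"></script></body></html>'

print(len(find_video_objects(ssr_page)))  # 1
print(len(find_video_objects(js_page)))   # 0
```

Run the same extraction against your own product page's raw HTML and you have the answer for your store, no app vendor required.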
With the exception of Googlebot, every major AI crawler works this way today.
Why video descriptions matter for AI product discovery
Let's say a potential customer asks ChatGPT: "I'm looking for a satin prom dress in white, preferably with video reviews showing how it looks on different body types."
ChatGPT's shopping research handles exactly these kinds of multi-constraint, detail-heavy requests. The system will search across product feeds, crawl product pages, and synthesize what it finds.
Now consider three product pages selling similar dresses:
Store A has a shoppable video widget with three UGC videos. The videos are embedded via JavaScript. There is no structured data describing what's in the videos. The AI agent sees: a product title, a price, some images, and a basic product description. It has no idea the videos exist, let alone that one of them shows a creator trying on the exact white satin dress and discussing the fit.
Store B has a shoppable video app that pulls video descriptions from Instagram captions. The structured data technically exists, but looks like this: "obsessed with this look 🤍✨ new drop SS26 link in bio 💫 #promdress #ootd #grwm #satin." An AI agent can parse it — and learns absolutely nothing useful. No mention of white, no mention of satin (buried in a hashtag), no information about fit, fabric, or sizing. The metadata exists but carries zero signal.
Store C uses Storista. The same three UGC videos are embedded, but the page also contains server-side rendered VideoObject JSON-LD with AI-generated descriptions:
```json
{
  "@type": "VideoObject",
  "name": "White Satin Prom Dress Try-On Review",
  "description": "Creator tries on a floor-length white satin prom dress, shows fit from front and back, discusses fabric drape and comfort. Mentions dress runs slightly long for petite frames. Shows movement and how the satin catches light.",
  "inLanguage": "en",
  "duration": "PT45S"
}
```
The AI agent can now match this page to the user's query with high confidence. White? Yes. Satin? Yes. Video reviews? Yes, and here's what they show. Different body types? The description mentions petite fit considerations.
Store C gets recommended. Store A and Store B don't exist in the conversation.
The gap here is not between having videos and not having videos — it's between having described videos and having invisible ones. In agentic commerce, if the AI can't read it, it can't recommend it.
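A toy sketch of the matching above, with the three stores' metadata condensed from the examples. Real agents use embeddings and LLM reasoning rather than keyword checks, so treat this purely as an illustration of why signal-free metadata scores like no metadata at all:

```python
def match_score(query_terms: list[str], metadata: str) -> float:
    """Fraction of query constraints found in the page's machine-readable
    video metadata. A crude stand-in for LLM-based relevance matching."""
    text = metadata.lower()
    hits = sum(1 for term in query_terms if term in text)
    return hits / len(query_terms)

query = ["white", "satin", "prom", "fit", "petite"]

store_a = ""  # JS-only widget: no metadata reaches the crawler at all
store_b = "obsessed with this look new drop SS26 link in bio #promdress #ootd"
store_c = ("Creator tries on a floor-length white satin prom dress, shows fit "
           "from front and back; dress runs slightly long for petite frames.")

for name, meta in [("A", store_a), ("B", store_b), ("C", store_c)]:
    print(name, match_score(query, meta))  # A 0.0, B 0.2, C 1.0
```

Store B's hashtag soup does land one accidental hit ("prom" inside #promdress), but it never clears the bar; only Store C's descriptive metadata satisfies every constraint.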
What Google's Head of Search just confirmed
The same shift is playing out in traditional search.
In March 2026, Google's VP of Search Liz Reid confirmed on the Access Podcast that multimodal LLMs now allow Google to understand video content at a level that wasn't possible before. Not just transcripts — but what the video is actually about, its style, its depth, its relevance.
Google has been adjusting its ranking systems to surface more short-form video, forums, and user-generated content since October 2025. And with Google I/O 2026 scheduled for May, more announcements around video understanding in search are expected.
But here's the important nuance: even as Google's ability to understand video content improves, structured data remains the explicit signal that Google trusts today. VideoObject JSON-LD tells Google exactly what your video contains, in a format that's unambiguous and machine-readable. It's the difference between Google having to figure out your video content and you telling Google directly.
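One way to emit that explicit signal is to build the markup on the server and render it directly into the page HTML. A minimal sketch, assuming the property names from Schema.org's VideoObject type; the helper function and its inputs are hypothetical:

```python
import json

def video_object_jsonld(name: str, description: str, duration_s: int,
                        language: str = "en") -> str:
    """Build a VideoObject JSON-LD <script> tag to render server-side,
    so it exists in the raw HTML that non-JS crawlers fetch."""
    data = {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": name,
        "description": description,
        "inLanguage": language,
        # ISO 8601 duration: 45 seconds -> "PT45S"
        "duration": f"PT{duration_s}S",
    }
    # Escape "</" so a description can't terminate the script tag early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'

tag = video_object_jsonld(
    "White Satin Prom Dress Try-On Review",
    "Creator shows fit from front and back; dress runs long for petite frames.",
    45,
)
print(tag)
```

Because the tag is part of the server response rather than injected by a widget, the same markup is readable by Googlebot and by every raw-HTML crawler alike.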
Merchants who have both — rich structured data AND quality video content — will have a compounding advantage. The structured data ensures discoverability now, across every AI platform. The quality content ensures relevance as video understanding matures.
The agentic commerce stack is forming now
The pieces are snapping together fast. Shopify's Agentic Storefronts connect merchants to ChatGPT, Google AI Mode, and Microsoft Copilot. The Universal Commerce Protocol (co-developed by Shopify and Google) standardizes how AI agents transact with merchants. OpenAI's Agentic Commerce Protocol powers Instant Checkout inside ChatGPT.
The transaction layer is built. The product feed layer is built. What's missing is the content understanding layer — the part where AI agents know not just what a product is and what it costs, but what real customers think about it, how it looks on real people, and whether it lives up to its description.
Video is the richest source of that information. But only if the AI can actually read it.
Shopify's own Catalog system — the centralized product index that powers agentic commerce — currently returns images only. No video metadata. That means even if a merchant has incredible UGC content on their product pages, Shopify Catalog can't surface it to AI agents through the feed path. The only way video metadata reaches AI agents today is through the web crawling path — and that requires server-side rendered structured data.
The window is open. Merchants and apps that solve video discoverability now — while the agentic commerce stack is still forming — will have an entrenched advantage as AI-driven shopping scales. Those that rely on JavaScript injection and hope for the best will discover, too late, that their videos were never part of the conversation.
What this means for merchants
If you're a Shopify merchant using shoppable video on your product pages, ask your app provider one question: Is your structured data rendered server-side, or injected via JavaScript?
If the answer is JavaScript, your video metadata is visible to Google but invisible to ChatGPT, Perplexity, Google AI Mode's organic crawler, and every other AI agent. In the context of agentic commerce — where AI agents are making purchase recommendations and processing transactions on behalf of your customers — your videos don't exist.
The shift from search-driven commerce to agent-driven commerce is accelerating across multiple platforms simultaneously. The merchants who prepare their product pages for this reality — with AI-analyzed video content, server-side rendered structured data, and rich metadata that agents can actually parse — will be the ones whose products get recommended.
Everyone else will be wondering why their traffic from AI channels isn't growing.
TL;DR
AI shopping agents don’t behave like traditional search engines. They:
- Don’t just return links; they run a specialized shopping model that asks follow-up questions, researches across sources, and explains recommendations.
- Rely heavily on structured, crawlable product and content metadata—especially VideoObject JSON-LD—to understand which products match nuanced, intent-rich queries.
- Don’t execute JavaScript (Googlebot aside), so JS-injected metadata never reaches them; only server-side rendered markup is visible.