Large Language Models and JavaScript Consumption

Modified on Thu, 15 Jan at 9:48 AM

Schema App has two primary ways of inserting markup onto a webpage: client-side rendering and server-side rendering. The intent of this article is to provide comprehensive information about how LLMs make use of markup so users can make informed decisions about how they want to render and deploy their JSON-LD. 


Alongside an analysis of how LLM bots crawl and index JSON-LD, this article explains client-side and server-side rendering as they relate to Schema App deployment.


Key Takeaways and Recommendations


Although Gemini and Copilot use crawlers that can render JavaScript, other major LLM platforms use crawlers that cannot. This presents a visibility risk for users who want their structured data to be accessible to all LLMs. This section outlines how LLMs use schema markup and suggests ways to mitigate this "indexing gap".


Strategies to mitigate the "indexing gap"

The most direct way to mitigate the impact of this indexing gap is to explore server-side rendering solutions. In the context of Schema App, this will typically involve integration via a specific content management system (CMS) such as WordPress, AEM, or Drupal.


If this is not an option (e.g. your website is not hosted on one of these platforms), explore solutions that let you include JSON-LD in the initial HTML using pre-rendering or edge-side rendering.
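As a rough illustration of what "including JSON-LD in the initial HTML" means, the sketch below inserts a JSON-LD script element into a page before it is sent to the requester. This is a minimal sketch, not Schema App's implementation: the function name inject_json_ld and the sample markup are hypothetical, and a real pre-rendering or edge-side setup would run equivalent logic inside its own platform.

```python
import json
import re


def inject_json_ld(html: str, data: dict) -> str:
    """Return html with a JSON-LD script element inserted just before </head>.

    Hypothetical helper for illustration only.
    """
    # Escape "</" so a string value cannot close the script element early.
    # "\/" is a valid JSON escape, so the payload still parses as JSON.
    payload = json.dumps(data, ensure_ascii=False).replace("</", "<\\/")
    tag = f'<script type="application/ld+json">{payload}</script>'

    match = re.search(r"</head\s*>", html, re.IGNORECASE)
    if match is None:
        # No head element found: leave the page untouched rather than guess.
        return html
    return html[: match.start()] + tag + html[match.start():]


page = "<html><head><title>Example</title></head><body></body></html>"
markup = {"@context": "https://schema.org", "@type": "Organization", "name": "Example Co"}
rendered = inject_json_ld(page, markup)
```

Because the markup is already part of the HTML response, a crawler that never executes JavaScript can still read it.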


How Schema Influences LLM Outputs

Schema markup is typically used in the retrieval and grounding stages of LLM responses. LLMs that perform web grounding and/or RAG (Retrieval-Augmented Generation) are likely to use structured data to produce cited results and more accurate outputs.


LLMs and their Rendering Capabilities

Some crawlers (Googlebot and Bingbot) can render JavaScript. Others, such as those used by Perplexity and possibly ChatGPT, cannot render client-side markup when they crawl a page themselves, but they may draw on indices and sources that can render client-side JSON-LD.


Google and Bing can render JavaScript

Googlebot and Bingbot can reliably render JavaScript. This means that Gemini and Copilot can ground their AI responses in markup rendered by Googlebot and Bingbot, respectively.


Perplexity and LLM Aggregators

Perplexity and other LLM aggregators use the data that their various web sources (scrapers, indices, etc.) provide. If these upstream sources can "see" and index client-side rendered JSON-LD, then Perplexity will likely have access to the rendered JSON-LD.


ChatGPT and Claude

ChatGPT and Claude both currently use web crawlers that do not typically render JavaScript when they crawl a page. Although ChatGPT and Claude may use other indices and sources that include structured data, their own crawlers are likely unable to render and index JSON-LD that is inserted client-side.
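One way to approximate what a non-rendering crawler sees is to parse the HTML exactly as delivered, without executing any scripts, and check whether it contains JSON-LD. The sketch below does this with Python's standard library. The names JsonLdExtractor and json_ld_in_raw_html are hypothetical, and the two sample pages are made up for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the text of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.blocks.append("".join(self._buffer))
            self._in_json_ld = False


def json_ld_in_raw_html(html: str) -> list:
    """Return parsed JSON-LD found in the HTML as delivered (no scripts are run)."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    parsed = []
    for block in parser.blocks:
        try:
            parsed.append(json.loads(block))
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole page
    return parsed


# Made-up pages: one relies on a script to add markup later, one ships it in the HTML.
client_side_page = '<html><head><script src="/app.js"></script></head><body></body></html>'
server_side_page = (
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "Organization"}'
    "</script></head><body></body></html>"
)
```

Here json_ld_in_raw_html(client_side_page) returns an empty list, while the server-side page yields one parsed object. Running the same check against the raw response of a live page gives a quick indication of whether its markup depends on JavaScript.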



Client-side and Server-side Rendering of JSON-LD

A key component of this discussion is the distinction between client-side and server-side rendering.


Client Side Rendering

In client-side rendering, the browser downloads an initial HTML shell and JavaScript code. The JavaScript then runs in the browser to fetch data (e.g. Schema App fetches and generates JSON-LD as the page loads).


Server Side Rendering

In server-side rendering, the server generates a full page for each request and sends it to the browser. Schema App JSON-LD will have already been generated and sent to the customer's server to be included in the full page. Configuring server-side rendering requires additional communication between Schema App's CDN and the servers that host the customer's website. Typically, it is limited to CMS-specific integrations (WordPress, AEM, and Drupal).


AI Crawler Technical Specifications & Client Side Rendering Compatibility

The following table describes various AI crawlers and their compatibility with client-side rendering (CSR).

 

| Crawler Identity | Organization | JavaScript Execution | Schema Support | Primary Constraint | CSR Compatibility |
| --- | --- | --- | --- | --- | --- |
| OAI-SearchBot | OpenAI (Search) | No | JSON-LD (Raw) | Latency / Speed | Zero |
| GPTBot | OpenAI (Training) | No | JSON-LD (Raw) | Packet Volume | Zero |
| PerplexityBot | Perplexity AI | No | JSON-LD (Raw) | Real-time Fetch | Zero |
| ClaudeBot | Anthropic | No | Text Extraction | Safety / Scale | Zero |
| Googlebot | Google | Yes (WRS) | Extensive | Queue Time | High |
| Bingbot | Microsoft | Yes (Limited) | Extensive | Consistency | Moderate |
| Gemini (Agent) | Google | Variable | Native Grounding | Context Window | Low (unless grounded) |
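
The table can also be applied to server logs to see which of these crawlers visit a site and whether they would execute JavaScript. The sketch below is a minimal, hypothetical lookup: the values are transcribed from the table above, the function name js_execution_for is made up, and it assumes each crawler identifies itself with its name somewhere in the User-Agent header. The Gemini (Agent) row is omitted because the table gives no single crawler name to match against.

```python
# JavaScript Execution values transcribed from the table above.
# Keys are lowercase crawler names, matched as substrings of the User-Agent header.
CRAWLER_JS_EXECUTION = {
    "oai-searchbot": "No",
    "gptbot": "No",
    "perplexitybot": "No",
    "claudebot": "No",
    "googlebot": "Yes (WRS)",
    "bingbot": "Yes (Limited)",
}


def js_execution_for(user_agent: str):
    """Return (crawler, js_execution) for a known crawler, or None otherwise."""
    ua = user_agent.lower()
    for crawler, js_execution in CRAWLER_JS_EXECUTION.items():
        if crawler in ua:
            return crawler, js_execution
    return None


# Example User-Agent strings; real log entries may differ in version and format.
print(js_execution_for("Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"))
print(js_execution_for("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
```

Substring matching is deliberately loose; it is meant for a quick survey of log traffic, not for verifying that a request really comes from the named crawler.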
