External Entity Linking Explainer

Modified on Thu, 19 Dec at 11:06 AM

External Entity Linking (External EL) is one of the methods Schema App uses for its Entity Linking service. This document describes what External EL is, what it does, and how Schema App implements it. It also covers content considerations, current limitations, and planned enhancements for this feature.



What is External EL?

External Entity Linking is the automated process of identifying named entities in text and then linking them to external identifiers from authoritative knowledge bases (like Wikipedia and the Google Knowledge Graph). Schema App's External EL feature automatically embeds identified entities within your Schema Markup.


What does External EL do?

Once embedded in your markup, these entities provide additional semantic value to your metadata. They help Google (and other web crawlers) better understand your content by pulling in known entities. This reduces ambiguity in the interpretation of your content and supports more accurate matching to user queries.


How does External EL work?

External EL tags can be applied to Schema App’s Highlighter templates. Once applied, External EL runs text through an API to identify linked entities. If URIs for those entities are found, the API returns the following information:


URIs from external sources are added to the entity using the sameAs property. As a result, the markup will look something like this:
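As a rough illustration only, the Python sketch below builds a small JSON-LD object in which a linked entity carries a sameAs URI. The BlogPosting wrapper, the headline, the choice of the mentions property, and the entity itself are assumptions made for this example; they are not actual output from Schema App or the API.

```python
import json

# Hypothetical example: a BlogPosting whose body mentions one linked entity.
# All values are illustrative placeholders, not real External EL output.
markup = {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    "headline": "Example headline",
    "mentions": [
        {
            "@type": "Place",
            "name": "Amazon rainforest",
            # URIs from external sources are attached with sameAs.
            "sameAs": [
                "https://en.wikipedia.org/wiki/Amazon_rainforest",
            ],
        }
    ],
}

# Serialize to the JSON-LD text that would sit in the page markup.
print(json.dumps(markup, indent=2))
```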

 

The results are cached for 12 weeks.
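
To make the 12-week window concrete, here is a minimal sketch of a freshness check. The function name and the idea of storing a timestamp with each result are assumptions for illustration; how Schema App actually keys and refreshes its cache is not described in this article.

```python
from datetime import datetime, timedelta

# From the article: results are cached for 12 weeks.
CACHE_LIFETIME = timedelta(weeks=12)


def is_cache_fresh(cached_at: datetime, now: datetime) -> bool:
    """Return True while a cached result is still inside the 12-week window."""
    # Hypothetical helper: assumes each cached result stores its creation time.
    return now - cached_at < CACHE_LIFETIME
```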


How is External EL implemented?

First, you’ll consult with your Customer Success Manager (CSM) to find a page set with content that contains entities: people, places, things, or concepts. Then, you and your CSM will decide which schema.org property to use for mapping the returned entities. 


If you expect to receive entities of many different types, you'll want to use a property that expects schema.org/Thing so that entities of any type can be added to your content.


Example: if applying to the article body of a BlogPosting, use the mentions property to capture entities of any type.


You can also restrict the results to only one type in order to use more precise properties.


Example: if you want to say a Service has an areaServed, the External EL results will need to be restricted to the Place type.


Example: if you want to say a Product is from a brand, the External EL results will need to be restricted to the Organization type.
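
The relationship between a property and its type restriction can be sketched as a simple lookup. The three property and type pairs come from the examples above; the dictionary, the function name, and the shape of the entity records are hypothetical and only illustrate the idea.

```python
from typing import Optional

# Property -> required entity type (None means any type is accepted).
# Pairs taken from the examples above; the structure itself is an assumption.
PROPERTY_TYPE_RESTRICTION: dict[str, Optional[str]] = {
    "mentions": None,          # expects Thing, so any entity type fits
    "areaServed": "Place",
    "brand": "Organization",
}


def entities_for_property(prop: str, entities: list[dict]) -> list[dict]:
    """Keep only the entities whose type is allowed for the given property."""
    required = PROPERTY_TYPE_RESTRICTION[prop]
    if required is None:
        return list(entities)
    return [e for e in entities if e.get("@type") == required]
```

For instance, passing a mixed list of Place and Organization entities with "areaServed" would keep only the Place entries, while "mentions" would keep them all.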




What types can External EL identify?

External EL identifies entities with the following schema.org types:



Note: The API is able to identify entities typed as Product and Event. However, because these types are eligible for Rich Results, they can trigger errors in Google Search Console. To prevent errors from appearing in Google Search Console enhancement reports, we type Product and Event entities as Thing.
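
The retyping rule in the note above can be expressed in a few lines. This is a sketch of the rule as described, with a hypothetical function name; it is not Schema App's actual code.

```python
# Types the note says are emitted as Thing instead of their own type.
RETYPED_AS_THING = {"Product", "Event"}


def output_type(api_type: str) -> str:
    """Return the type used in the markup for a type identified by the API."""
    return "Thing" if api_type in RETYPED_AS_THING else api_type
```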


How do I know External EL is working?

Your CSM will validate whether External EL is working by:

  • Validating sample URLs in the Schema Validator tool
  • Checking the Response in the Console Network tab
  • Running a report in Schema App's administrator tools

External EL reporting is available at the following levels:

  • Project (website)
  • Highlighter Template
  • Individual External EL tags

General Content Considerations

1. Use standardized names

When possible, use terms that can be found on authorities like Wikipedia or Wikidata. Provide additional content that includes language familiar to users. This way your content is optimized for both entity SEO and content SEO.


2. Capitalization is important

Proper nouns are differentiated from common nouns with the same name by capitalization. Ensure that proper nouns are consistently capitalized to facilitate External EL matches.

Example: Apple, the “American multinational technology company” and apple, the “fruit of the apple tree”.

3. Consider surrounding content

What other entities are relevant to your primary entity? Take Amazon, for example. When surrounded by keywords like “technology”, “e-commerce”, and “digital streaming”, it is identified as Amazon the company, whereas keywords such as “tropical”, “trees”, or “biodiversity” make it clear that the entity being mentioned is the Amazon rainforest in Brazil.

This approach isn’t that different from keyword clusters in content SEO. The only difference is that it takes NLP APIs into account alongside human users searching for content. 
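
To show why surrounding keywords matter, here is a deliberately simple toy sketch that picks between the two Amazon entities by counting context keywords. The keyword lists come from the example above; the scoring rule and all names are assumptions, and real NLP APIs use far more sophisticated models than this.

```python
from typing import Optional

# Context keywords from the Amazon example; the scoring rule is a toy assumption.
CANDIDATES = {
    "Amazon (company)": {"technology", "e-commerce", "digital streaming"},
    "Amazon rainforest": {"tropical", "trees", "biodiversity"},
}


def disambiguate(text: str) -> Optional[str]:
    """Return the candidate whose keywords appear most often, or None if none do."""
    lowered = text.lower()
    scores = {
        name: sum(keyword in lowered for keyword in keywords)
        for name, keywords in CANDIDATES.items()
    }
    best = max(scores, key=scores.get)  # ties resolve to the first candidate
    return best if scores[best] > 0 else None
```

With no supporting keywords at all, the sketch returns None, which mirrors the point of this section: content without context leaves the entity ambiguous.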


External EL Limitations

Limitations of API

The API used for External EL occasionally matches content to the wrong entity (for example, Hamilton the person rather than Hamilton the place). This is a limitation of the API itself. In cases like this, it’s best to omit the affected URL from the Highlighter page set to avoid deploying inaccurate markup.


Our testing found API results to be correct 83% of the time.


An enhancement to enable more control over the results is planned.


For some content, External EL recognizes an entity but returns nothing. This happens when the entity returned by the API lacks identifying metadata such as a Wikipedia URL or Google Knowledge Graph MID.
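
A minimal sketch of this filtering behavior is shown below. The field names wikipedia_url and kg_mid are hypothetical, chosen only for the example; the actual API response format is not documented in this article.

```python
def has_identifier(entity: dict) -> bool:
    """True if the entity carries at least one external identifier."""
    # Field names are assumptions for illustration, not the real API schema.
    return bool(entity.get("wikipedia_url") or entity.get("kg_mid"))


def linkable_entities(entities: list[dict]) -> list[dict]:
    """Drop recognized entities that have no Wikipedia URL or Knowledge Graph MID."""
    return [e for e in entities if has_identifier(e)]
```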


Tag Run Limitations

External EL tags can be run up to 10K times on up to 10K characters per URL. This means one External EL tag can be run on up to 10K URLs, and two External EL tags can be run on up to 5K URLs.
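
The arithmetic behind these limits can be sketched as follows, assuming each tag runs once per URL. The function name is hypothetical, and rounding down for tag counts that do not divide evenly is an assumption that goes beyond the two cases stated above.

```python
MAX_TAG_RUNS = 10_000       # total tag runs allowed, per the article
MAX_CHARS_PER_URL = 10_000  # characters processed per URL, per the article


def max_urls(tag_count: int) -> int:
    """Return how many URLs can be covered when each URL runs every tag once."""
    if tag_count < 1:
        raise ValueError("tag_count must be at least 1")
    # Assumption: runs are split evenly across tags, rounding down.
    return MAX_TAG_RUNS // tag_count


print(max_urls(1))  # 10000
print(max_urls(2))  # 5000
```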


If you're interested in increasing the scope of External EL on your account, you can do so as a monthly add-on.


To start implementing External EL on your account, get in touch with one of our Customer Success Managers at [email protected]



 
