External Entity Linking Explainer

Modified on Wed, 30 Jul at 3:20 PM

External Entity Linking (External EL) is one method Schema App uses for its Entity Linking service. This document describes what the External EL feature is, what it does, and how Schema App implements it. It also provides content considerations and notes current limitations and future enhancements for this feature.

TABLE OF CONTENTS

What is External EL?
What does External EL do?
How does External EL work?
How is External EL implemented?
What types can External EL identify?
How do I know External EL is working?
General Content Considerations
External EL Limitations
- Limitations of API Results
- Tag Run Limitations

What is External EL?

External Entity Linking is the automated process of identifying named entities in text and then linking them to external identifiers from authoritative knowledge bases (like Wikipedia and the Google Knowledge Graph). Schema App's External EL feature automatically embeds identified entities within your Schema Markup.

What does External EL do?

Once embedded in your markup, these entities provide additional semantic value to your metadata. They help Google (and other web crawlers) better understand your content by pulling in known entities. This reduces ambiguity in the interpretation of your content and supports more accurate matching to user queries.

How does External EL work?

External EL tags can be applied to Schema App’s Highlighter templates. Once applied, External EL runs text through an API to identify linked entities. If a URI for an entity is found, it will return that entity's:

Type (e.g. Organization)
Name (e.g. Apple)

And a URI from at least one of the following resources:

Wikipedia URI (e.g. https://en.wikipedia.org/wiki/Apple_Inc.)
Wikidata URI (e.g. https://www.wikidata.org/entity/Q312)
Google Knowledge Graph URI (e.g. kg:/m/0k8z)
Crunchbase URI (e.g. https://www.crunchbase.com/organization/communitech)
LinkedIn URI (e.g. https://www.linkedin.com/in/danielwaisberg/)

URIs from external sources are added to the entity using the sameAs property. As a result, each linked entity in the markup carries its type, its name, and one or more sameAs URIs.
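
As a rough illustration of the sameAs pattern described above, the following Python sketch builds a JSON-LD node for one linked entity and prints it. The function name build_mention, the BlogPosting wrapper, and the headline value are assumptions made for this example; the markup Schema App actually generates may be structured differently.

```python
import json


def build_mention(entity_type, name, same_as):
    """Return a JSON-LD node for one linked entity (hypothetical helper)."""
    return {"@type": entity_type, "name": name, "sameAs": list(same_as)}


# Hypothetical page markup; the URIs are the examples listed in this article.
markup = {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    "headline": "Example post",  # placeholder value, not real output
    "mentions": [
        build_mention(
            "Organization",
            "Apple",
            [
                "https://en.wikipedia.org/wiki/Apple_Inc.",
                "https://www.wikidata.org/entity/Q312",
                "kg:/m/0k8z",
            ],
        )
    ],
}

print(json.dumps(markup, indent=2))
```

Each URI returned for the entity becomes one item in the sameAs list, so an entity with fewer known identifiers simply has a shorter list.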

The results are cached for 12 weeks.

How is External EL implemented?

First, you’ll consult with your Customer Success Manager (CSM) to find a page set with content that contains entities: people, places, things, or concepts. Then, you and your CSM will decide which schema.org property to use for mapping the returned entities.

If you expect to receive entities of many different types, you'll want to use a property that expects schema.org/Thing so that entities of any type can be added to your content.

Example: if applying to the article body of a BlogPosting, use the mentions property to capture entities of any type.

You can also restrict the results to only one type in order to use more precise properties.

Example: if you want to say a Service has an areaServed, the External EL results will need to be restricted to the Place type.

Example: if you want to say a Product is from a brand, the External EL results will need to be restricted to the Organization type.
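
The three examples above can be sketched as a small lookup from a type restriction to a schema.org property. This is only an illustration of the decision you make with your CSM: the names PROPERTY_FOR_TYPE and attach_entities are hypothetical, and the sample entities are made up for the example.

```python
# Hypothetical mapping based on the three examples in this section.
PROPERTY_FOR_TYPE = {
    None: "mentions",            # no restriction: entities of any type
    "Place": "areaServed",       # e.g. a Service and the area it serves
    "Organization": "brand",     # e.g. a Product and its brand
}


def attach_entities(node, entities, restrict_to=None):
    """Return a copy of node with matching entities added under one property."""
    prop = PROPERTY_FOR_TYPE[restrict_to]
    kept = [
        e for e in entities
        if restrict_to is None or e["@type"] == restrict_to
    ]
    result = dict(node)
    if kept:
        result[prop] = kept
    return result


# Made-up entities standing in for External EL results.
entities = [
    {"@type": "Place", "name": "Hamilton"},
    {"@type": "Organization", "name": "Apple"},
]

service = attach_entities(
    {"@type": "Service", "name": "Example service"}, entities, restrict_to="Place"
)
post = attach_entities({"@type": "BlogPosting"}, entities)
```

Here service receives only the Place entity under areaServed, while post receives both entities under mentions.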

What types can External EL identify?

External EL identifies entities with the following schema.org types:

Note: The API is able to identify entities typed as Product and Event. However, since these types can be eligible for Rich Results, they trigger errors in Google Search Console. As a result, we have chosen to type Product and Event entities as Thing to prevent errors from appearing in Google Search Console enhancement reports.
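
The retyping rule in the note above amounts to a simple substitution. The sketch below is an assumption about how such a rule could be expressed, not Schema App's actual code; the names RETYPED_AS_THING and output_type are hypothetical.

```python
# Types that the note says are emitted as Thing to avoid Search Console errors.
RETYPED_AS_THING = {"Product", "Event"}


def output_type(api_type):
    """Return the schema.org type to emit for a type reported by the API."""
    return "Thing" if api_type in RETYPED_AS_THING else api_type


print(output_type("Product"))       # Thing
print(output_type("Organization"))  # Organization
```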

How do I know External EL is working?

Once pages have been run through the API, results will start to populate in the Entity Hub's Entity Reports.

For more information, see our Entity Reports documentation.

General Content Considerations

1. Use standardized names

When possible, use terms that can be found on authorities like Wikipedia or Wikidata. Provide additional content that includes language familiar to users. This way your content is optimized for both entity SEO and content SEO.

2. Capitalization is important

Proper nouns are differentiated from common nouns with the same name by capitalization. Ensure that proper nouns are consistently capitalized to facilitate External EL matches.

Example: Apple, the “American multinational technology company” and apple, the “fruit of the apple tree”.

3. Consider surrounding content

What other entities are relevant to your primary entity? Take Amazon, for example. When surrounded by other keywords like “technology”, “e-commerce” and “digital streaming”, it is identified as Amazon the company, whereas keywords such as “tropical”, “trees”, or “biodiversity” make it clear that the entity being mentioned is the Amazon rainforest in Brazil.

This approach isn’t that different from keyword clusters in content SEO. The only difference is that it takes NLP APIs into account alongside human users searching for content.

External EL Limitations

Limitations of API Results

External EL will not return an entity if the API has no identifying metadata for it, such as a Wikipedia URL or Google Knowledge Graph MID.

The API used for External EL has an accuracy rate of approximately 83%. This means it occasionally matches text to the wrong entity (for example, matching to Hamilton the person, rather than Hamilton the place).

Use the Entity Manager to edit an entity's properties for improved accuracy or to block entities from deploying to eliminate irrelevant or redundant data.

For more information, see our Entity Manager documentation.

Tag Run Limitations

External EL tags can be run up to 10K times on up to 10K characters per URL. This means one External EL tag can be run on up to 10K URLs, and two External EL tags can be run on up to 5K URLs.
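
The arithmetic behind these figures can be sketched as follows. The sketch assumes that "10K" means 10,000, that each tag applied to each URL counts as one run, and that uneven splits round down; the names TAG_RUN_LIMIT, max_urls, and runs_needed are hypothetical.

```python
TAG_RUN_LIMIT = 10_000       # total tag runs, from the limit stated above
CHAR_LIMIT_PER_URL = 10_000  # characters processed per URL


def max_urls(tag_count):
    """Return how many URLs can be covered when every URL gets tag_count tags."""
    if tag_count < 1:
        raise ValueError("tag_count must be at least 1")
    return TAG_RUN_LIMIT // tag_count


def runs_needed(tag_count, url_count):
    """Return the number of tag runs used by tag_count tags on url_count URLs."""
    return tag_count * url_count


print(max_urls(1))  # 10000
print(max_urls(2))  # 5000
print(runs_needed(2, 5_000) <= TAG_RUN_LIMIT)  # True
```

Under the same assumptions, three tags would cover up to 3,333 URLs.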

If you're interested in increasing the scope of External EL on your account, you can do so as a monthly add-on.

To start implementing External EL on your account, get in touch with one of our Customer Success Managers at support@schemaapp.com.