GEO Strategy · 14 min read

The Death of the Keyword:
Entity-First Indexing Explained

Matt Ryan
Founder & CEO
Oct 24, 2025

"Strings are dead. Things are alive." (Amit Singhal, former Head of Google Search, 2012)

For two decades, the SEO industry has been obsessed with strings of characters. We called them "keywords." We counted them, we stuffed them into meta tags, and we tracked their density like obsessive accountants. We treated Google like a filing cabinet: if the label on the file matched the user's query, we ranked.

But to a Large Language Model (LLM) or a modern search engine like Google, a keyword is a relic of a simpler time. The algorithm no longer matches strings; it maps entities.

The shift from "Strings" to "Things" is not just a semantic update; it is a fundamental restructuring of how information is organised on the web. If you are still optimising for keywords in 2026, you are optimising for a version of Google that doesn't exist anymore.

What is an Entity?

In the context of the Knowledge Graph, an entity is a distinct, independent concept that exists in the real world. It can be a person, a place, an organisation, a book, a chemical element, or a historical event.

Crucially, an entity is defined by its relationships to other entities (edges), not by the words used to describe it (labels). This relationship is often stored as a "Subject-Predicate-Object" triplet.
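To make the triplet idea concrete, here is a toy sketch in Python. The facts and predicate names below are illustrative, not Google's actual schema; real knowledge graphs store billions of such triples with opaque entity IDs.

```python
# Toy subject-predicate-object triples, the shape in which
# Knowledge Graph facts are commonly stored.
triples = [
    ("The Big Apple", "aliasOf",    "New York City"),
    ("New York City", "instanceOf", "City"),
    ("New York City", "locatedIn",  "United States"),
]

def facts_about(entity: str) -> list[tuple[str, str]]:
    """Return every (predicate, object) pair whose subject is `entity`."""
    return [(p, o) for s, p, o in triples if s == entity]

# Resolving the alias first, then querying the canonical entity,
# is what lets the engine answer "The Big Apple" with NYC facts.
```

Note that the entity is defined entirely by its edges: nothing in the store depends on the surface string a user typed.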

Case Study: The Disambiguation Problem

Consider the search query: "The Big Apple".

Legacy Search (Keyword Based)

The engine scans the web for pages containing the strings "Big" and "Apple". It might return a page about a large fruit variety, a grocery store promotion, or New York City. It has no way of knowing which one you mean without more words.

Modern Search (Entity Based)

Google identifies "The Big Apple" as a known alias for the entity [New York City] (Entity ID: /m/02_286). It returns results about the city's population, weather, and hotels, even if the page content only says "NYC" and never mentions a fruit.

This capability is powered by the Knowledge Graph—a massive database of over 500 billion facts about 5 billion entities. When you search, you aren't searching the web; you are querying this graph.

Vector Space: The Math of Meaning

How does a computer know that "King" is related to "Prince"? It doesn't have a dictionary; it has math.

Modern search engines use Vector Embeddings to map words into a multi-dimensional geometric space. Imagine a 3D graph where every concept is a point. Concepts that are semantically similar are placed close together in this space.

vector([King]) - vector([Man]) + vector([Woman]) ≈ vector([Queen])

This calculation allows Google to understand intent without exact keyword matches. If you write an article about "Best DSLR Cameras," Google's vector analysis knows you are also relevant for queries like "Professional Photography Gear" because those entities occupy the same vector neighbourhood.
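The famous king/queen arithmetic can be reproduced with toy vectors. This is a minimal sketch: real embeddings have hundreds of dimensions learned from text, whereas the three hand-picked dimensions below (roughly "royalty", "maleness", "femaleness") exist only to make the geometry visible.

```python
def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(x * x for x in b) ** 0.5
    return dot / (norm_a * norm_b)

# Hand-made 3-d embeddings: [royalty, maleness, femaleness]
vectors = {
    "king":  [1.0, 1.0, 0.0],
    "man":   [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "queen": [1.0, 0.0, 1.0],
}

# vector(king) - vector(man) + vector(woman)
result = [k - m + w for k, m, w in
          zip(vectors["king"], vectors["man"], vectors["woman"])]

# Nearest neighbour in the space is the answer to the analogy.
nearest = max(vectors, key=lambda word: cosine(result, vectors[word]))
```

The nearest neighbour of the result vector is "queen": the arithmetic moved the point out of the "male" region while keeping it in the "royalty" region.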

The Takeaway: You no longer need to stuff every synonym into your H2 tags. You need to cover the *topic* comprehensively to establish your document's position in the correct vector space.

The "Entity-First" Optimisation Protocol

So, how do we optimise for a machine that thinks in concepts rather than words? At DubSEO, we have moved all clients to an Entity-First framework. Here are the three pillars:

1. Disambiguation via Schema

You cannot rely on Google to "guess" which entity you are referring to. You must tell it explicitly using Structured Data (JSON-LD).

Don't just mark up your address. Use the `sameAs` property to link your content to the definitive source of truth for that entity, such as a Wikipedia page, a Wikidata ID, or a Crunchbase profile.

```json
"about": {
  "@type": "Thing",
  "name": "Generative Artificial Intelligence",
  "sameAs": "https://www.wikidata.org/wiki/Q107593635"
}
```

2. Co-Occurrence & Semantic Proximity

Google determines the confidence of an entity relationship based on how often two entities appear together across the web.

If you want to rank for [Coffee Beans], your content graph must also contain nodes for [Arabica], [Roasting], [Ethiopia], [Caffeine], and [Grinding]. If you talk about coffee but never mention roasting, your entity signal is weak. This is not keyword stuffing; it is "Entity Gap Analysis."
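An Entity Gap Analysis can be sketched in a few lines. This is a deliberately naive version, assuming a hand-built list of related entities and plain substring matching; a production tool would use named-entity recognition and lemmatisation rather than raw string containment.

```python
# Hypothetical related-entity set for the [Coffee Beans] topic.
RELATED_ENTITIES = {"arabica", "roasting", "ethiopia", "caffeine", "grinding"}

def entity_gaps(page_text: str) -> set[str]:
    """Return related entities the page never mentions (naive check)."""
    text = page_text.lower()
    return {entity for entity in RELATED_ENTITIES if entity not in text}

page = "Our Arabica coffee beans are sourced from Ethiopia and ground fresh."
# "arabica" and "ethiopia" are present; "ground" does not match "grinding"
# under this naive substring check, so it still counts as a gap.
gaps = entity_gaps(page)
```

The output is the list of missing nodes; filling those gaps with genuine coverage, not keyword insertion, is what strengthens the entity signal.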

3. Becoming an Entity

The ultimate goal of SEO in 2026 is not to rank for a keyword; it is to become an entity in the Knowledge Graph.

When Google recognises your brand as a named entity, you graduate from being a "publisher of content" to being a "source of facts." This is the key to appearing in AI Overviews (SGE). Google cites entities it "knows." It does not cite random blogs it just "found."

Future-Proofing for the AI Web

As we move toward an Agentic Web where AI bots browse for us, keywords become even less relevant. An AI agent doesn't type "best shoes" into a search bar. It has a complex, multi-step goal: "Find me a pair of running shoes suitable for flat feet, under £100, available for delivery by Friday."

To answer this, the AI parses entities: [Running Shoes] (Category), [Flat Feet] (Attribute), [£100] (Price Constraint), [Friday] (Time Constraint).

If your product data is trapped in unstructured text (keywords), the agent will miss it. If it is structured as entities with clear attributes, you win the sale.
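The difference is easy to demonstrate. In the sketch below, all product and field names are invented for illustration; the point is that once attributes exist as structured data, the agent's constraints become simple filters instead of fuzzy text matching.

```python
# Hypothetical structured product entities with explicit attributes.
products = [
    {"name": "RoadRunner X",   "category": "running shoes",
     "arch_support": "flat feet", "price_gbp": 89,  "delivery_days": 2},
    {"name": "TrailBlazer Pro", "category": "running shoes",
     "arch_support": "high arch", "price_gbp": 120, "delivery_days": 1},
]

# Constraints parsed from the agent's goal:
# running shoes, flat feet, under £100, delivery within 4 days.
constraints = {"category": "running shoes", "arch_support": "flat feet",
               "max_price_gbp": 100, "max_delivery_days": 4}

def matches(product: dict, c: dict) -> bool:
    """Check a product entity against every parsed constraint."""
    return (product["category"] == c["category"]
            and product["arch_support"] == c["arch_support"]
            and product["price_gbp"] <= c["max_price_gbp"]
            and product["delivery_days"] <= c["max_delivery_days"])

hits = [p["name"] for p in products if matches(p, constraints)]
```

Only the product whose attributes are machine-readable and satisfy every constraint survives the filter; a product described only in free text never enters the candidate set at all.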

"In the age of AI, you are not competing for a ranking position. You are competing for a node in the neural network."


About the Author: Matt Ryan is the Founder of DubSEO. He has been deconstructing search algorithms since 2012 and specialises in Knowledge Graph architecture.