The Role of Natural Language Processing (NLP) in SEO

Natural Language Processing, or NLP, is an area of AI that focuses on how humans and computers interact through natural language, whether via speech or text.

In the context of search engines, an NLP exchange between a human and a computer would be a user either typing a query into a search engine or asking a question aloud through a smart device. The computer then deciphers and processes this input, analysing content on web pages to provide relevant results.

Google has continued to improve its language understanding capabilities over the years; however, they admit that they “sometimes still don’t quite get it right, particularly with complex or conversational queries”. This is where Google’s machine learning, or NLP, comes in.


When it comes to SEO, NLP goes further than looking at words or phrases alone. As Google describes it:

“Natural Language uses machine learning to reveal the structure and meaning of text. You can extract information about people, places, and events, and better understand social media sentiment and customer conversations.”

In 2018, Google introduced their neural network-based technique for NLP pre-training, which they called ‘Bidirectional Encoder Representations from Transformers’, more commonly abbreviated to BERT. Then, in late 2019, Google released BERT as an algorithm update, with the aim of serving users more accurate search results.

Here’s an example of how BERT affected search results for the better. Google’s example is the query “2019 brazil traveler to usa need a visa”. In this query, the word “to” and its relationship to the other words are critical to the meaning of the phrase. It’s about a Brazilian traveling to the U.S., not the other way around. Google stated that “Previously, our algorithms wouldn’t understand the importance of this connection”.
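
To see why the word “to” matters so much, here is a small Python sketch. It is not how BERT works internally; it only illustrates, under our own simplified assumptions, why matching keywords without regard to word order cannot tell the two directions of travel apart. The shortened queries and the function names are made up for illustration.

```python
from collections import Counter


def bag_of_words(query: str) -> Counter:
    """Count the words in a query, ignoring their order (a naive keyword view)."""
    return Counter(query.lower().split())


def word_after_to(query: str) -> str:
    """Return the word that follows 'to', i.e. the destination in these queries."""
    words = query.lower().split()
    return words[words.index("to") + 1]


# Two shortened, illustrative queries with opposite meanings.
to_usa = "brazil traveler to usa"
to_brazil = "usa traveler to brazil"

# An order-blind comparison sees exactly the same keywords in both queries.
same_keywords = bag_of_words(to_usa) == bag_of_words(to_brazil)

print(same_keywords)             # True
print(word_after_to(to_usa))     # usa
print(word_after_to(to_brazil))  # brazil
```

Both queries contain identical keywords, yet the destination flips depending on which word follows “to”. That relationship between words is the kind of context a keyword-only approach misses.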

Essentially, BERT is trained on sets of data relating to content, and learns how to analyse this data. It looks at the context of words and phrases, considering the words that both precede and follow them, as well as the wider content of the page.

Google explains:

“With the latest advancements from our research team in the science of language understanding–made possible by machine learning–we’re making a significant improvement to how we understand queries, representing the biggest leap forward in the past five years, and one of the biggest leaps forward in the history of Search.”

How to Analyse the Meaning & Structure of Your Content

To help Google’s machine learning better comprehend your content for SEO, you can perform your own analysis to see how your content would be processed.

Google offers a free trial of its Natural Language API, which you can use to analyse your (and your competitors’) content. The tool splits analysis into four areas. These are:

1. Entities

An entity is a word or a phrase that reflects an object that Google can identify and categorise. These can include:

  • Organizations
  • Events
  • Numbers
  • Prices
  • Locations
  • People
  • Consumer goods
  • Addresses
  • Other
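
If you would rather call the API directly than use the demo tool, the text has to be sent as a JSON request. Below is a minimal Python sketch of the request body for the v1 REST method `documents:analyzeEntities`. The helper name `build_entity_request` and the sample sentence are our own, and authentication (an API key or OAuth token) is deliberately left out, so treat this as a starting point and not a complete client.

```python
import json

# Public v1 REST endpoint for entity analysis. A real call also needs
# credentials, which this sketch does not handle.
ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeEntities"


def build_entity_request(text: str) -> dict:
    """Build the JSON body for an analyzeEntities call on plain text."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }


# Hypothetical sample content, used only to show the shape of the body.
body = build_entity_request("How to grow strawberries in a small garden.")
print(json.dumps(body, indent=2))
```

The same document structure can be sent to the sentiment, syntax and classification methods, which is why it is worth wrapping in a small helper.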

Entity Salience

This is the importance of a particular ‘object’ in the content. Google’s NLP scores salience from 0.0 to 1.0 and ranks entities in order of importance. For example, if an entity is seen as less important and relevant to the context at hand, it will have a lower salience score; if it is more important and relevant, the score will be higher.

An example of this would be a piece of content about strawberry growing. Random words that don’t add to the context of the page, e.g., ‘dog’, would have a very low salience score.

Why Entities Matter for SEO

When looking at entity salience from an SEO perspective, you would want your primary keyword to have a high entity salience. If it isn’t as high as you expected, then Google may not be understanding or recognising the importance of what you’re trying to target. Therefore, you may need to look at how you can re-frame your content.
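
As a rough illustration of this check, the Python sketch below ranks entities by salience and tests whether a primary keyword comes out on top. The response data is made up, shaped like the `entities` list the API returns (name, type, salience), and the function names are our own.

```python
# Made-up data in the shape of an analyzeEntities response.
# The scores are illustrative, not real API output.
entity_response = {
    "entities": [
        {"name": "dog", "type": "OTHER", "salience": 0.02},
        {"name": "strawberries", "type": "OTHER", "salience": 0.71},
        {"name": "garden", "type": "LOCATION", "salience": 0.18},
    ]
}


def rank_by_salience(response: dict) -> list:
    """Return the entities sorted from most to least salient."""
    return sorted(response["entities"], key=lambda e: e["salience"], reverse=True)


def keyword_is_most_salient(response: dict, keyword: str) -> bool:
    """Check whether the keyword is the top-ranked entity (case-insensitive)."""
    ranked = rank_by_salience(response)
    return bool(ranked) and ranked[0]["name"].lower() == keyword.lower()


for entity in rank_by_salience(entity_response):
    print(f'{entity["name"]}: {entity["salience"]:.2f}')

print(keyword_is_most_salient(entity_response, "strawberries"))  # True
```

If your target keyword sits well down the list, that is a hint the page is talking around the topic and may need re-framing.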

2. Sentiment

In addition to understanding the topic at hand, Google’s NLP is also able to comprehend the feeling (i.e. positive or negative) towards the topic. Google’s example of this is:

“Sundar Pichai said in his keynote that users love their new Android phones.” This would undoubtedly be positive sentiment, as a positive word has been used – ‘love’.

To measure sentiment, Google uses a sentiment scoring system, which works as follows:

Positive: 0.25 to 1.0
Neutral: -0.25 to 0.25
Negative: -1.0 to -0.25
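
The bands above translate into a few lines of Python. Note one assumption on our part: the ranges as listed overlap at exactly 0.25 and -0.25, so this sketch treats those boundary values as positive and negative respectively. The function name `sentiment_label` is our own.

```python
def sentiment_label(score: float) -> str:
    """Map a sentiment score between -1.0 and 1.0 to a label using the bands above."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be between -1.0 and 1.0")
    if score >= 0.25:    # assumption: exactly 0.25 counts as positive
        return "positive"
    if score <= -0.25:   # assumption: exactly -0.25 counts as negative
        return "negative"
    return "neutral"


print(sentiment_label(0.8))   # positive
print(sentiment_label(0.0))   # neutral
print(sentiment_label(-0.6))  # negative
```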

Why Sentiment Matters for SEO

Sentiment’s role in SEO is a difficult one. Google seems to mostly assert that they want to show diversity in opinions within the SERPs. Therefore, we shouldn’t expect a result with negative sentiment to be left out from amongst those with positive sentiment just because it differs.

In fact, Google patent expert Bill Slawski stated:

“I don’t believe that Google would favor one sentiment over another. That smells of showing potential bias on a topic. I would expect Google to want some amount of diversity when it comes to sentiment, so if they were considering ranking based upon it, they would not show all negative or positive.”

Where it should be considered for SEO is when reference is made to your brand or services. Are your reviews mostly positive or negative? Are links to your website coming from articles or commentary with good sentiment?

3. Syntax

This part of Google’s NLP focuses on the structure of content itself. Google explains:

“Syntactical Analysis breaks up the given text into a series of sentences and tokens (generally, words) and provides linguistic information about those tokens”

This linguistic information covers the content’s morphology (the study of the internal structure of words) and syntax (the study of the structure of phrases and sentences). Basically, this means that it looks at the structure of the text and where words are placed within it.
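
To make “tokens” a little more concrete, the Python sketch below groups words by their part-of-speech tag. The data is a made-up, trimmed-down example in the shape of a v1 `analyzeSyntax` response (each token has `text`, `partOfSpeech` and `lemma`); real responses carry more fields, and the function name is our own.

```python
from collections import defaultdict

# Made-up, trimmed-down data in the shape of an analyzeSyntax response.
syntax_response = {
    "tokens": [
        {"text": {"content": "Users"}, "partOfSpeech": {"tag": "NOUN"}, "lemma": "user"},
        {"text": {"content": "love"}, "partOfSpeech": {"tag": "VERB"}, "lemma": "love"},
        {"text": {"content": "phones"}, "partOfSpeech": {"tag": "NOUN"}, "lemma": "phone"},
        {"text": {"content": "."}, "partOfSpeech": {"tag": "PUNCT"}, "lemma": "."},
    ]
}


def words_by_tag(response: dict) -> dict:
    """Group token text by part-of-speech tag, keeping the original word order."""
    groups = defaultdict(list)
    for token in response["tokens"]:
        groups[token["partOfSpeech"]["tag"]].append(token["text"]["content"])
    return dict(groups)


print(words_by_tag(syntax_response))
```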

Why Syntax Matters for SEO

Good sentence structure means better content, and good, useful content is at the heart of SEO. Though measuring syntax isn’t something you can easily act on, it serves as a good reminder to ensure every bit of content that we write is high quality, and has been checked before publishing.

4. Categories

Google’s Natural Language API text classification service classifies text into a large set of categories. The categories are structured in a hierarchical way. These categories include anything from business to health, video games to law.

You can access the full list of categories here.

Why Categories Matter for SEO

We know from our personal experience that Google favours sites (from an SEO standpoint) that are niche and centred around a specific topic, i.e., it’s clear what ‘categories’ it would fit into from the get-go.

So, as an example, if you’re trying to tell Google that you provide IT services to businesses, you would want to make sure that your content hits both ‘IT’ and ‘business’ categories. If you’re managing to fit into IT but not business, you need to consider the context and semantics to do with business on your page (e.g., business, corporate, SMEs, etc.).
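
A check like this is easy to sketch in Python. The response below is made up, shaped like the `categories` list returned by the classification service (a hierarchical name plus a confidence score). The 0.5 confidence cut-off, the topic strings and the function name are our own assumptions, not Google’s.

```python
# Made-up data in the shape of a classifyText response.
category_response = {
    "categories": [
        {"name": "/Computers & Electronics/Enterprise Technology", "confidence": 0.83},
        {"name": "/Internet & Telecom", "confidence": 0.55},
    ]
}


def missing_topics(response: dict, wanted: list, min_confidence: float = 0.5) -> list:
    """Return the wanted topics that appear in no sufficiently confident category."""
    found = [
        category["name"].lower()
        for category in response["categories"]
        if category["confidence"] >= min_confidence  # assumption: 0.5 cut-off
    ]
    return [topic for topic in wanted if not any(topic.lower() in name for name in found)]


missing = missing_topics(category_response, ["Computers & Electronics", "Business"])
print(missing)  # ['Business']
```

Here the page registers as IT-related but not business-related, which is the cue to work more business context into the copy.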

In Summary

At the end of the day, Google wants to serve the best search results possible. To reiterate this in the context of NLP, remember that Danny Sullivan tweeted:

“There’s nothing to optimize for with BERT, nor anything for anyone to be rethinking. The fundamentals of us seeking to reward great content remain unchanged.”

As BERT is centred around Natural Language Processing, our advice would be not to over-optimise your content. The whole point of NLP is that it can comprehend requests better than ever, and so, as SEMs and business owners, we should be doing less ‘writing for SEO’ or ‘keyword-ese’ writing, as Google puts it (words that we want Google to understand as opposed to what’s best for the reader). Instead, we should be focusing on quality content that will rise more naturally to the top.

Delivering quality SEO and content is our speciality. Find out more about our SEO services here.
