The biggest SEO myths: Bill Slawski's version

Based on materials from an article by Bill Slawski, Director of SEO Research at Go Fish Digital and an expert on Google patents.

This article was written following a discussion of industry myths and of what can be done to help SEO specialists avoid misconceptions and misinformation about search engines and SEO.

There are several topics that have attracted a large number of SEO myths. In this article we will look at what in them is true and what is not.

Understanding Google

Google is one of the most popular search engines in the world, used by many companies in North America and Europe to drive traffic to their websites. Google is one of the main targets of many SEO campaigns, and many SEO myths are associated with it.

Therefore, it is worthwhile for optimizers to learn more about what Google is from a business perspective. Fortunately, Google provides a lot of information about itself. In this section we will look at resources that are worth following and reading.

First of all, we would recommend the book In The Plex: How Google Thinks, Works, and Shapes Our Lives. It was written by the American technology journalist Steven Levy and published in 2011. It will give you a good idea of what Google was like at the dawn of its existence.

This book humanizes Google: perceiving Google as a business made up of people who are trying to do something useful for other people creates a more accurate image of the company.

Google is now a public company and regularly publishes financial reports for its shareholders, which often discuss the company's goals and directions of development. These reports can be found on the website of Google's parent company, Alphabet.

In 2004, Google's founders also wrote the so-called «An Owner's Manual for Google's Shareholders», which is worth reading to understand the direction in which they want to develop the company.

Google resources

Google works actively to provide information to users of its search engine, and it hires webmaster-relations evangelists who share important news and clarifications on channels such as Twitter. Among these experts are John Mueller, Gary Illyes and Danny Sullivan, whose accounts are worth following. They are active and respond to emerging questions about Google.

It is also useful to follow these reference resources: Think with Google, Google Developers, Google AI, blog.google, webmasters.googleblog.com and ai.googleblog.com.

Additionally, Google maintains help forums where employees and volunteers answer site owners' questions about their resources. The forum for webmasters is available at the link.

The Google Webmasters team also holds hangouts for webmasters (Office Hours), where you can ask questions directly to search employees. These sessions are broadcast live on YouTube.

Using all these resources helps you understand Google better.

Now let's turn to the myths themselves.

Popular SEO myths

Latent Semantic Indexing (LSI)

In the late 1980s, before the advent of the Internet, researchers at Bell Labs published scientific papers and registered patents on an approach to indexing that works well with small, static data sets. The patent cited an example built on data from eight books and noted that every time new information is added to the corpus being indexed with LSI, the indexing has to be performed all over again.

The Internet, by contrast, contains a much larger amount of information, which is frequently modified: new information is added, and old information is deleted or updated.

LSI is a technology that was developed before the Internet and was never meant for indexing anything like it. Some Google patents occasionally mention it as an approach to indexing, but not necessarily one that can be used on the data indexed by the search engine. One of these patents is Computer information retrieval using latent semantic structure.

Instead, Google uses newer natural language processing technologies, such as BERT (introduced in the paper «Pre-training of Deep Bidirectional Transformers for Language Understanding») and others, which prepare documents for downstream approaches used in search, such as answering questions and determining the sentiment of documents. The company has also developed a word embedding approach that is used in RankBrain.
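To make the question-answering idea concrete, here is a minimal sketch using the open-source Hugging Face transformers library. This is our illustration, not Google's production system; it assumes the library is installed and downloads a default pretrained BERT-family model:

```python
# A toy question-answering example with a pretrained BERT-family model.
# This illustrates the general technique, not Google's internal systems.
from transformers import pipeline

qa = pipeline("question-answering")  # loads a default pretrained model

result = qa(
    question="Who developed LSI?",
    context="Latent Semantic Indexing was developed by researchers "
            "at Bell Labs in the late 1980s.",
)
print(result["answer"])  # e.g. "researchers at Bell Labs"
```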

These newer approaches are technologies that were developed with an understanding of the size and nature of the body of data in an index of the Internet.

What you need to know about the phrase «LSI keywords»

There is a tool called LSI Keywords which does not actually use LSI and does not generate keywords; rather, it picks related words that could be placed on the same page as the keyword you have chosen for that page.

Nothing on the tool's page claims that it uses LSI in the sense in which it was invented and patented in the late 1980s: as an approach to indexing, not as a tool for keyword selection.

Some people use «LSI keywords» to mean adding synonyms and semantically related words to a page. But LSI is something else entirely.

LSI is a process that uses the underlying (latent) structure of pages in order to understand how the meanings of words may be related to each other.
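For intuition, here is a minimal sketch of the underlying technique, latent semantic analysis via a truncated SVD of a term-document matrix, using scikit-learn. The corpus and parameters are toy assumptions of ours, not anything a search engine actually uses:

```python
# Toy latent semantic indexing/analysis: factor a term-document matrix
# with a truncated SVD so that documents (and words) that occur in
# similar contexts end up close together in the latent space.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "search engines index pages",
    "engines crawl and index the web",
    "cats purr and sleep",
    "sleepy cats purr loudly",
]

tfidf = TfidfVectorizer()
X = tfidf.fit_transform(docs)        # term-document matrix
svd = TruncatedSVD(n_components=2)   # keep 2 latent dimensions
X_latent = svd.fit_transform(X)      # documents in the latent space

print(X_latent.round(2))  # the two "search" docs cluster together,
                          # as do the two "cats" docs
```

Note that adding a new document requires recomputing the factorization from scratch, which is exactly why LSI does not scale to a corpus the size of the web.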

Some people also believe that the related queries shown at the bottom of the search results page are «LSI keywords» in action, but again, this is not necessarily so.

Google has shown us that it can rewrite the queries people search with in order to show pages that, in the search engine's opinion, meet the user's situational or informational needs with content that means essentially the same thing. This is the idea that underlies the Hummingbird algorithm.

For a patent that explains how Google can rewrite queries, follow the link: Synonym identification based on co-occurring terms. However, this patent says nothing about how to optimize a web page for Hummingbird.
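As a rough illustration of the co-occurrence idea (our sketch, not the patent's actual method), two terms can be treated as candidate substitutes when they tend to appear alongside the same neighbouring words in queries:

```python
# Toy synonym identification from co-occurring terms: terms that share
# many neighbouring words across queries are candidate substitutes.
from collections import defaultdict

queries = [
    "cheap flights to paris", "inexpensive flights to rome",
    "cheap hotels in paris", "inexpensive hotels in rome",
]

neighbours = defaultdict(set)
for q in queries:
    words = q.split()
    for w in words:
        neighbours[w].update(set(words) - {w})

def overlap(a: str, b: str) -> float:
    """Jaccard overlap of the two terms' neighbour sets."""
    union = neighbours[a] | neighbours[b]
    return len(neighbours[a] & neighbours[b]) / len(union) if union else 0.0

print(overlap("cheap", "inexpensive"))  # high overlap: candidate synonyms
print(overlap("cheap", "flights"))      # lower overlap: not substitutable
```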

TF-IDF

Like LSI, TF-IDF is an old indexing method developed before the advent of the Internet. It analyzes how frequently a term is used within a document and how frequently that term appears across the indexed documents as a whole. This makes it possible to determine how strongly a page is associated with a particular term, and how common or rare that term is across the collection of documents.

Moreover, this method discounts so-called «stop words», words frequently found in English texts such as «and», «or», «the» and «to», because a term that appears in almost every document carries very little distinguishing weight.
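A minimal sketch of the classic formula, tf(t, d) multiplied by idf(t), with the common logarithmic variant of IDF (our toy example):

```python
# Toy TF-IDF: tf(t, d) * idf(t), where idf(t) = log(N / df(t)).
# A term that appears in every document (a stop word) gets
# idf = log(1) = 0, which is why TF-IDF naturally discounts "the".
import math

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
]

def tf_idf(term: str, doc: str) -> float:
    tf = doc.split().count(term)                    # term frequency
    df = sum(1 for d in docs if term in d.split())  # document frequency
    idf = math.log(len(docs) / df) if df else 0.0
    return tf * idf

print(tf_idf("the", docs[0]))  # 0.0   (appears in every document)
print(tf_idf("mat", docs[0]))  # ~1.10 (distinctive for this document)
```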

This approach to indexing was probably replaced early in the search engine's history by a more advanced algorithm called BM25. Google patents do mention TF-IDF, as one part of the process of determining the related queries that appear at the bottom of Google's search results. But we have not seen TF-IDF mentioned as part of how pages on the Internet are indexed.
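For comparison, here is a minimal sketch of the published BM25 scoring function. The parameter values k1 and b are conventional textbook defaults, not anything confirmed about Google:

```python
# BM25 term score: idf(t) * tf * (k1+1) / (tf + k1 * (1 - b + b * dl/avgdl)).
# Unlike raw TF-IDF, the term-frequency contribution saturates, and the
# score is normalized by document length relative to the corpus average.
import math

def bm25_term_score(tf: int, df: int, n_docs: int,
                    doc_len: int, avg_doc_len: float,
                    k1: float = 1.2, b: float = 0.75) -> float:
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
    norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
    return idf * norm

# A term occurring 3 times in a 100-word document, in a corpus of 1000
# documents where 50 contain the term and the average length is 120 words:
print(bm25_term_score(tf=3, df=50, n_docs=1000, doc_len=100, avg_doc_len=120))
```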

TrustRank

The concept of TrustRank first appeared in a joint paper by researchers from Stanford University and Yahoo, «Combating Web Spam with TrustRank». The aim of the process described in the paper was to identify spam pages on the Internet.

The paper's abstract reads:

"To get higher positions in search results using a variety of methods spamnye page. Although experts can identify spam, evaluate manually a large number of pages are too expensive. Instead, we propose a method of semi-automatic reliable separation pages from spam. First, we select a small set of initial pages for an expert assessment. Once we manually identify the reputable home pages, we use the link structure of the Internet to discover other pages that can also be good.

In this article, we discuss different methods for selecting the initial pages and pages of reliable detection. We present results of experiments carried out on the Internet that are indexed by AltaVista, and evaluate the effectiveness of their techniques. Our results show that we can effectively filter out spam from a significant part of the Internet-based quality basic set of less than 200 sites. "
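The paper's core mechanism is trust propagation along links from a hand-picked seed set, essentially a biased PageRank. Here is a minimal sketch of that idea (our simplification, on a made-up link graph):

```python
# Toy TrustRank: trust starts at manually vetted seed pages and is
# propagated along outgoing links with damping (a biased PageRank).

graph = {  # page -> pages it links to (a made-up example)
    "seed": ["a", "b"],
    "a": ["b", "spam"],
    "b": ["a"],
    "spam": ["spam2"],
    "spam2": ["spam"],
}
seeds = {"seed"}
alpha = 0.85  # damping factor

trust = {p: (1.0 if p in seeds else 0.0) for p in graph}
for _ in range(50):  # power iteration until scores settle
    new = {p: ((1 - alpha) if p in seeds else 0.0) for p in graph}
    for page, links in graph.items():
        for target in links:
            new[target] += alpha * trust[page] / len(links)
    trust = new

for page, score in sorted(trust.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")  # pages close to the seed keep the most trust
```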

Misunderstandings

Since the paper about TrustRank was first published on a Stanford University website, many people associate it with Google, because Google was founded by Stanford students and researchers. In fact, there is no such connection.

Google has mentioned «trust» as something that can be taken into account when ranking pages, but nothing like Yahoo's TrustRank is used in how Google's search engine works.

TrustRank is not concerned with ranking pages on the Internet, although some experts claim that this approach is used for ranking content in search results.

It is enough to read the abstract quoted above, or the entire paper, to understand that there are no grounds for these assertions about TrustRank's role.

Google has developed its own trust-related approach, based on how people create Google Custom Search Engines: when users select and annotate specific sites in the context files of those custom search engines, the annotated pages and sites can be treated as expert resources on the topics those search engines cover.

This approach is very different from the TrustRank developed at Yahoo. On our website we called it «Google's version of TrustRank» and tried to explain how much it differs from what was invented and patented at Yahoo (it is unlikely that Google would simply copy their method of filtering spam out of search results).

Google has also explained in its guidelines for assessors that it wants them to evaluate web pages based on the concept of «EAT» (Expertise, Authoritativeness and Trustworthiness).

According to Google's vice president of search, Ben Gomes, the guidelines for assessors show the direction in which the company wants to develop its algorithm:

"You can consider recommendations for assessors such as where we want to move our algorithm. They do not tell you how the algorithm ranks search results, but they show, what he should do. "

Thus, the «trustworthiness» in the guidelines for assessors has nothing to do with TrustRank or with the ranking of web pages; rather, Google would like its search results to contain pages that users will perceive as trustworthy.

If someone tells you that Google uses TrustRank (meaning the TrustRank developed at Yahoo) to rank pages, then you are being misled on several counts:

  • TrustRank does not rank web pages;
  • the trust-based approach patented by Google has nothing to do with the TrustRank created at Yahoo;
  • Google mentions «trustworthiness» in its guidelines for assessors, but this again has nothing to do with TrustRank, so that argument is groundless.

Be careful with what you read about TrustRank. Some articles on this topic mix verifiable facts with hasty generalizations («Google has a patent on trust...») and perfectly reasonable evidence («Google's guidelines for assessors mention 'trustworthiness'») to back up the assertion that Google uses something like Yahoo's TrustRank to rank pages. These inconsistencies in the arguments presented are what make them SEO myths.

RankBrain

Google has a long history of developing approaches to rewriting queries (in the past these were called «query expansion»), going back to at least 2003: the oldest patent on query synonyms we have found is Search queries improved based on query semantic information. Google modifies a query by finding terms that can serve as substitutes for, or synonyms of, terms in the original query.

The launch of the Hummingbird algorithm in 2013 showed us how Google can rewrite queries.

In 2015, Google introduced a new algorithm called RankBrain. It was announced through a Bloomberg News interview with a member of the Google Brain team, which developed the RankBrain update.

In that interview, the Google employee said that RankBrain is an approach to query rewriting based on the word vector technology created by the Google Brain team. We also managed to find a patent from which you can learn more about word vectors. You can read about this patent in our article Citations behind the Google Brain Word Vectors Approach.

Google has published at least one patent on an approach to query rewriting that names a Google Brain team member as an inventor. This patent draws on a large amount of data from search history and web pages, and it is called Using concepts as contexts for query term substitutions.

We cannot say for certain that this is the patent behind RankBrain, but there is enough evidence to suggest it as a possibility.

As Google has mentioned, RankBrain aims to reduce ambiguity in queries.
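As a toy sketch of the word-vector idea behind this kind of query rewriting (our illustration with made-up vectors; real systems learn embeddings from massive corpora), terms whose vectors are close can substitute for one another in a query:

```python
# Toy word vectors: terms whose embeddings point in similar directions
# are candidate substitutes when rewriting an ambiguous query.
import math

vectors = {  # made-up 3-dimensional embeddings, for illustration only
    "car":        (0.90, 0.10, 0.00),
    "automobile": (0.85, 0.15, 0.05),
    "bank":       (0.10, 0.90, 0.20),
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine(vectors["car"], vectors["automobile"]))  # near 1: substitutable
print(cosine(vectors["car"], vectors["bank"]))        # much lower: not
```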

Is it possible to optimize for RankBrain?

According to Google employees, web pages cannot be optimized for RankBrain. Since RankBrain works by rewriting queries, this seems quite logical. However, there are people whose articles describe an approach by which, they argue, it is possible to optimize pages for RankBrain and improve their rankings.

Some articles about optimizing pages for RankBrain recommend improving the quality of the content on your pages, increasing the time users spend on those pages, and increasing the likelihood that users select one of those pages when they see it in the search results. This is useful advice, but it does not optimize pages for RankBrain. So these statements are misleading, which also makes them SEO myths.

None of these things optimizes your pages for the query-rewriting approach known as RankBrain. And if someone tells you otherwise, check what their claims are based on.

Instead of a conclusion: myths, experts and gurus

There is a lot of information, and misinformation, on the Internet about many things, including building websites and optimizing them for search engines. Be careful about SEO myths.

In our industry there are people who call themselves experts or gurus. Such claims should also be treated with a certain skepticism.

When you read about how to optimize a page, or about something that may be a ranking factor, examine the arguments used to substantiate those claims.

If the authors of an article make assumptions that are not backed by actual data, do they admit it? Do they provide information showing that they have the knowledge and experience to make such statements? This, too, is worth paying attention to.

If, while reading about SEO, you learn about things you can test yourself, and the author emphasizes this and offers you something to think about and verify, such articles can be considered valuable.

If you write about SEO, do you back up your conclusions with facts, documents and links, or do you simply offer unfounded assumptions and sweeping generalizations? That is exactly what is worth thinking about.

Source: SEMRush blog