
How do search engines identify keywords in domain names?

@Karen161

Posted in: #Domains #Keywords #SearchEngines #Seo

Do search engines identify keywords in domain names as substrings or special character separated strings, or both? For example: abcwidgets.com vs abc-widgets.com (obviously with "widgets" as the keyword).




3 Comments


 

@Pierce454

The beauty of the new domain extensions is that you can have a new structure called keyword.keyword. Now that you can have actual words on the right side of the dot, the game has changed. In the case of coffee.club, the site managed to rank on page 1 for the term "coffee club" within weeks. The only signal to Google was the anchor text from links, which in most cases was the domain name.



 

@Becky754

Okay, just for the sake of explanation, I thought I would jump in. This is a simple process to understand. I thought I would explain it theoretically so that you can take the concepts forward with you into the future and apply them in other areas.

There are a few things you need to know in order to understand how this works.

The first is how Google stores terms in the index.

Google stores terms in a relatively simple index file. This file contains known dictionary terms along with terms found on the Internet that exist on more than one site and can be understood using semantic analysis. Each term in the index is assigned a checksum, a number that uniquely identifies the term. A checksum is a number computed from a block of data by an algorithm, so that two pieces of text can be checked for equality quickly by comparing their numbers; in search engines it is generally applied to content or terms. If a term is found on the web and Google wants to see whether it is a known term, a checksum of the found term is created and then looked up in the index. Google does not compare the words themselves, because numbers are smaller and far more efficient to compare.
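To make the checksum lookup concrete, here is a minimal sketch. It assumes CRC32 (via Python's zlib) as the checksum and a tiny hand-made term list; the names term_checksum, KNOWN_TERMS, TERM_INDEX, and is_known_term are made up for illustration, and nothing here is a description of what Google actually runs.

```python
import zlib

def term_checksum(term: str) -> int:
    """Return a CRC32 checksum of the lower-cased term (CRC32 is an assumption)."""
    return zlib.crc32(term.lower().encode("utf-8"))

# Hypothetical term index: only the checksums of known terms are stored.
KNOWN_TERMS = ["domain", "name", "widgets", "head", "ache", "headache"]
TERM_INDEX = {term_checksum(t) for t in KNOWN_TERMS}

def is_known_term(candidate: str) -> bool:
    """Look up the candidate by number instead of comparing strings."""
    return term_checksum(candidate) in TERM_INDEX

print(is_known_term("Widgets"))
print(is_known_term("abcwidgets"))
```

A short checksum like CRC32 can collide for different strings, so a real system would need a wider hash or a follow-up string comparison.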

How Google matches terms to a web page.

That said, Google creates a series of indexes that tie the terms to the web page. Google's original description covers two indexes, each consisting of a document ID (web page) and a term ID, where one is sorted by term ID and the other by document ID. The reason for this is so that Google can find which terms are on a web page (a forward index) and which web pages contain a particular term (an inverted index). The semantics behind all of this are heavily hinted at. Google tells us the title tag is important, along with any link text where the web page is the target, the URL/URI, and the description meta tag. This indicates that additional indexes exist specifically for this purpose.
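The two-index idea can be sketched in a few lines. The document IDs, term IDs, and the build_indexes helper below are invented for illustration; the point is only that the same (document ID, term ID) pairs can be organised in both directions.

```python
from collections import defaultdict

# Hypothetical (document ID, term ID) pairs found while crawling.
postings = [(1, 10), (1, 11), (2, 11), (3, 10), (3, 12)]

def build_indexes(pairs):
    """Build one lookup keyed by document ID and one keyed by term ID."""
    by_doc = defaultdict(set)   # document ID -> term IDs on that page
    by_term = defaultdict(set)  # term ID -> document IDs containing it
    for doc_id, term_id in pairs:
        by_doc[doc_id].add(term_id)
        by_term[term_id].add(doc_id)
    return by_doc, by_term

by_doc, by_term = build_indexes(postings)
print(sorted(by_doc[1]))    # terms on page 1
print(sorted(by_term[11]))  # pages containing term 11
```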

How are terms discovered in a domain name?

Programmers understand word boundaries. A word boundary marks off a segment of alphanumeric text between delimiters, generally a space or a special character. Such a segment is not to be confused with an actual term, which it may not be; it is just a run of readable characters. In the case of domain-name.com, the hyphen (-) is the delimiter that separates domain and name, and the period (.) separates the domain name from the TLD. If Google wants to check whether valid terms exist within a domain name, it simply creates a checksum for each text segment found in the domain name and looks it up in the term index.
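As a rough sketch of the delimiter case, the snippet below splits a host name on the dot and on any non-alphanumeric character. The helper names and the small KNOWN_TERMS set are assumptions, and treating the last label as the TLD is a simplification that ignores multi-part suffixes such as co.uk. A plain set of strings stands in for the checksum lookup.

```python
import re

# Hypothetical set of known terms, standing in for the term index.
KNOWN_TERMS = {"domain", "name", "abc", "widgets"}

def split_domain(host: str):
    """Split a host name into text segments and a TLD (last label, simplified)."""
    labels = host.lower().split(".")
    name_labels, tld = labels[:-1], labels[-1]
    segments = [
        seg
        for label in name_labels
        for seg in re.split(r"[^a-z0-9]+", label)
        if seg
    ]
    return segments, tld

def known_segments(host: str):
    """Return only the segments that are known terms."""
    segments, _ = split_domain(host)
    return [seg for seg in segments if seg in KNOWN_TERMS]

print(split_domain("domain-name.com"))
print(known_segments("abc-widgets.com"))
print(known_segments("abcwidgets.com"))
```

Note that abcwidgets.com yields a single segment, abcwidgets, which is not a known term on its own; a domain with no delimiter needs a different approach.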

In the case of a domain name that does not contain a delimiter, the process changes slightly. For domainname.com, the domain name could be examined by extracting and adding characters of the domain name one at a time in a loop. This is an old-school method that is no longer used and is mentioned just for illustration. For example, domain and name could be discovered in domainname.com by extracting d, then do, then dom, then doma, and so on until each term is discovered.
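The character-at-a-time loop might look like the sketch below. The function name grow_prefix_terms and the term sets are hypothetical, and the loop stops at the first (shortest) known term it reaches, which is one of several possible readings of the method described.

```python
def grow_prefix_terms(label: str, known_terms: set) -> list:
    """Grow a candidate one character at a time until it matches a known term."""
    found = []
    start = 0
    while start < len(label):
        for end in range(start + 1, len(label) + 1):
            candidate = label[start:end]  # d, do, dom, doma, ...
            if candidate in known_terms:
                found.append(candidate)
                start = end  # continue right after the matched term
                break
        else:
            start += 1  # no known term starts here; skip one character
    return found

print(grow_prefix_terms("domainname", {"domain", "name"}))
print(grow_prefix_terms("abcwidgets", {"widgets"}))
```

Because it accepts the shortest match first, this sketch would split differently if a short prefix such as do were also a known term, which is one reason the approach is fragile.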

Another similar method is using n-grams, which is what Google and other search engines actually use. In this case, n is a number, and Google takes n characters at a time. For example, a 3-gram pass would get dom, then oma, then mai, and so on, and Google would check whether each n-gram segment is a valid term by creating a checksum and looking it up in the term index. N would then be increased by one and the process repeated until all the terms are found. This allows a term like headache to be indexed as headache, head, and ache.
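Here is a minimal sketch of the sliding n-gram pass with n increasing by one each round. The function name ngram_terms, the min_n default of 3, and the term set are assumptions for illustration, and a plain set of strings again stands in for the checksum lookup.

```python
def ngram_terms(label: str, known_terms: set, min_n: int = 3) -> list:
    """Slide an n-character window over the label, increasing n by one per pass."""
    found = []
    for n in range(min_n, len(label) + 1):
        for i in range(len(label) - n + 1):
            gram = label[i:i + n]
            if gram in known_terms and gram not in found:
                found.append(gram)
    return found

print(ngram_terms("headache", {"head", "ache", "headache"}))
# prints ['head', 'ache', 'headache']
```

Since every window is tested, this also reports terms that straddle the intended words: with main in the term set, domainname would yield main as well as domain and name.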

If a valid term is found for a domain name, then an entry is added to the URL/URI index for the term and the site/page. Google does not give too many clues on this, except that it appears that the domain name, directory path, and file name are at least segmented. These may or may not be stored in a single index, but let's assume for the moment that they are not. Based upon this assumption, the domain name would have its own index that identifies the domain name as having domain and name as terms found within it.
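To illustrate the segmentation assumed above, the sketch below separates a URL into domain name, directory path, and file name using Python's standard library. The segment_url helper and the example URL are hypothetical; how a search engine actually stores these parts is not documented here.

```python
import posixpath
from urllib.parse import urlsplit

def segment_url(url: str) -> dict:
    """Split a URL into the three parts that might each be indexed separately."""
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    return {
        "domain": parts.hostname or "",
        "directory": directory,
        "filename": filename,
    }

print(segment_url("https://domain-name.com/widgets/blue/index.html"))
```

Each part could then be run through whichever term-discovery method applies and recorded in its own index.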

Simple, huh?

So you can see that Google and indeed all search engines can easily find terms that exist within a domain name no matter how the domain name is actually presented.

It is important to recognize that when we talk about rank, we are talking about two things: site and page. Terms found in a domain name do not help a site to rank, but they do help pages rank in the SERPs, and then only as a last resort when search query matches are limited. This is because of over-optimization and the earlier spam sites that took advantage of it.



 

@Heady270

They can do both. But keywords in the domain name are no longer a ranking factor that could give you an edge over your competitors.


