Harper822

Why would a website with keyword stuffing rank higher than one without in Google search results?

@Harper822

Posted in: #Keywords #KeywordStuffing #Ranking #Seo #Serps

Like some other websites, I try to maintain a balance of keywords in comparison to other words. I ran tests of my (optimized) website as well as a (keyword-stuffed) competitor's website via this SEO tool's keyword analyzer.

What's even crazier is that the exact phrase people search for ("bloke and 4th") contains a word search engines ignore. Take a look at the results:

[screenshots of the keyword analyzer results for both sites]
As you can see, the website that ranks higher has possible spam indicators attached to it, whereas my site does not.

So why is it possible for a website with many spam indicators to rank higher than a website with none? Is Google actually beginning to promote keyword stuffing now, with all the changes they make to their own pages?





1 Comment


 

@Sherry384

This is an easy one. Keyword density is a myth, sorta. At least it is now.

What is important to note is how the terms are used, not how many times they are used. SEOs like to intentionally confuse the issue to keep you dependent upon them and paying for tools and advice. P.T. Barnum used to say that there is a sucker born every minute. In SEO, the sideshow seems to be all the online advice. Sadder still, SEOs move slower than PageRank, which is much slower than grass growing in the Sahara. They do not let go of the old concepts easily, even when those concepts were dead wrong to begin with.

This is a mini-tutorial on how terms on a site are weighted. It is not a complete explanation by any stretch, but an illustration. It is a worthwhile trip to take to better understand how SEO works.

Prior to weighing site terms and topics using semantics, keyword weighting was done using a few indicators: the use and placement of terms in tags such as the title tag, header tags, and description meta-tag; proximity of terms to each other and to important tags; and other importance indicators. Part of indicating importance was the use of terms, synonyms, and complementary terms, and how prominent those terms appeared to be. This follows the notion of keyword density somewhat, and term ratios were indeed applied to determine a page's topic. However, it was not high or low ratios of terms that mattered, but a ratio that would effectively remove common terms, repetitive terms, unnatural use of terms, and terms that simply have no value through lack of use. These term ratios were automatically evaluated on a page-by-page basis, and the results were matched against calculations that determine whether they fell within an operational range. When all was said and done, terms did determine topic and topic scope, using the semantics described later. But density had no bearing on search rank per se; it fed into determining topic and matching search intent. The secondary effect is matching on terms of a certain density by happenstance, because the same terms fit a profile determined through semantic links and were used for determining search intent. This followed the parser model, which in part still exists but is no longer the entire model.
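As a rough sketch of that kind of term-ratio filtering, in Python (the stop-word list and the 10% ceiling are invented for illustration, not published Google values):

from collections import Counter

# Hypothetical list of common words a parser might discard as carrying no topical value.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "for", "to", "in", "is"}

def term_ratios(text, max_ratio=0.10):
    """Count terms, drop common words, and flag terms whose ratio of use
    looks unnatural. The 10% ceiling is an illustrative threshold only."""
    words = [w.lower().strip(".,!?") for w in text.split()]
    counts = Counter(w for w in words if w and w not in STOP_WORDS)
    total = sum(counts.values()) or 1
    ratios = {term: count / total for term, count in counts.items()}
    natural = {t: r for t, r in ratios.items() if r <= max_ratio}
    flagged = {t: r for t, r in ratios.items() if r > max_ratio}
    return natural, flagged

natural, flagged = term_ratios(
    "cheap shoes cheap shoes buy cheap shoes online for the best cheap shoes")
print(flagged)  # {'cheap': 0.36..., 'shoes': 0.36...} - the repetition stands out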

Semantics is the primary model today, though because the web follows a traditional text model, the parser model cannot be dropped entirely. The reason for this is simple. It still applies and makes sense and is very useful.

Semantics can be described as "relational pairing", even though more complex semantic models are really about "relational chains". These are known as semantic links, and the network of relationships between semantic links is known as the semantic web, which has nothing to do with the World Wide Web except that one is handy for the other. Semantics gets rather complicated rather fast, so for my illustration I will keep to simple pairs and over-simplify things quite a bit.

Relational pairing is the simple notion of triplets: the subject, the predicate, and the object. The predicate can be anything as long as it represents a relationship between the subject and the object.

I will deviate to an early PageRank model. Please stick with me. It applies.

When Google was conceived, the notion of page rank was a fairly simple representation of trust networks using semantics. A link is made from one page to another. In this case:

Subject: examplea.com
Predicate: trusts
Object: exampleb.com
Read as: examplea.com trusts exampleb.com

Subject: exampleb.com
Predicate: trusts
Object: examplec.com
Read as: exampleb.com trusts examplec.com, therefore examplea.com trusts examplec.com


While we know that the "therefore" clause above is not necessarily true, this was the early model, and it still holds somewhat true today, though not absolutely. We know that examplea.com may have no knowledge of examplec.com and therefore cannot entirely trust examplec.com. Still, a relationship exists that must be accounted for.
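To make the pairing concrete, here is a minimal Python sketch of a triple store and the transitive "trusts" chain described above (the domains and the traversal are purely illustrative):

# Illustrative triple store for the "trusts" relationship above.
triples = [
    ("examplea.com", "trusts", "exampleb.com"),
    ("exampleb.com", "trusts", "examplec.com"),
]

def trusted_by(subject, triples):
    """Follow "trusts" links transitively: a trusts b and b trusts c
    implies some (weaker) trust of c by a, as in the early model."""
    seen, frontier = set(), {subject}
    while frontier:
        nxt = {o for s, p, o in triples if p == "trusts" and s in frontier}
        nxt -= seen
        seen |= nxt
        frontier = nxt
    return seen

print(trusted_by("examplea.com", triples))
# {'exampleb.com', 'examplec.com'}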

Early PageRank was calculated on a page-by-page, link-by-link basis but applied to the entire site. For exampleb.com, how many trust links exist? PageRank was a fairly simple calculation over the links to the pages of a site. But there were obvious problems with this: links can be made to artificially inflate the importance of a site. The calculation contained a fairly standard decay rate that could correct for this; however, the decay rate by itself posed new issues, in that no single decay rate can fully account for actual value, since its natural inclination is to introduce a curve into the calculation.
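For reference, the classic iterative PageRank calculation looks roughly like this in Python; the damping factor plays the role of the "decay rate" mentioned above, and the link graph is made up:

def pagerank(links, damping=0.85, iterations=50):
    """Simple iterative PageRank. 'links' maps each page to the pages it
    links to; the damping factor is the standard decay that keeps
    artificial link loops from inflating scores without bound."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / len(pages) for p in pages}
        for page, targets in links.items():
            if not targets:
                continue
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical link graph: two sites pointing at exampleb.com inflate its score.
print(pagerank({"examplea.com": ["exampleb.com"],
                "examplec.com": ["exampleb.com"],
                "exampleb.com": ["examplea.com"]}))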

Using the trust model further, domains were weighted based upon factors that indicated trust. For example, the greatest trust metric is site age: older sites can generally be trusted. Sites with consistent registration, a consistent IP address, a quality registrar, a quality network (host), and no history of spam, porn, phishing, and so on all indicate trust. I count over 50 domain trust factors, so I will skip these and continue to keep it simple.

Subject: examplea.com
Predicate: domain trust score
Object: 67

Subject: exampleb.com
Predicate: domain trust score
Object: 54

Subject: examplea.com
Predicate: trusts
Object: exampleb.com
Read as: examplea.com trusts exampleb.com


Using another calculation, a level of trust can be derived rather than just a binary "one site trusts another". Where the first example simply passed trust, the second example passes a trust value proportional to how it is calculated.
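A tiny Python sketch of that proportional (rather than binary) trust passing, with invented scores and an invented scaling formula:

# Hypothetical domain trust scores (0-100), as in the triples above.
domain_trust = {"examplea.com": 67, "exampleb.com": 54}

def passed_trust(source, link_weight=1.0):
    """Trust passed along a link, scaled by the source's own trust score.
    The formula is illustrative only; the point is that the value passed
    is proportional, not a binary trusted/not-trusted flag."""
    return link_weight * domain_trust[source] / 100.0

print(passed_trust("examplea.com"))  # 0.67 of the link's weight
print(passed_trust("exampleb.com"))  # 0.54 of the link's weight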

Now, please understand that PageRank is calculated on a page-by-page basis, while TrustRank makes up the majority of SiteRank, in which links, link quality, and link value all play a part, though they are far less important than they were originally and far less important than the site trust score. Keep this in mind.

How does this apply to keywords on a page?

All content terms are weighted; however, only some tag terms are weighted. One primary example is the keywords meta-tag: we all know there is no weight for terms within this tag at all. In fact, it is completely ignored. One misconception is that the description meta-tag does not count for SEO. This is not true. Terms within this tag do carry weight, though relatively little. The description meta-tag does have value; you will understand why in a bit.

The old parser model still has value. In it, the page is read top to bottom, and tags and content blocks are read and weighted using values that gauge importance following that top-to-bottom model. Some metrics are static: for example, the title tag will have an importance score higher than the h1 tag, which will be higher than any h2 tag, and so on. The description meta-tag will have an importance metric that is fairly high. Why? Because it is still an important indicator of what the page is about. However, the terms found in the tag carry little weight. This is done so that search-intent matches will still match the description meta-tag almost as easily as a title tag or an h1 tag, but the tag cannot be manipulated too heavily to game the system. Please note that conditions apply. For example, a search will not match against the description meta-tag without also matching elsewhere, primarily the title tag, the h1 tag, or the content.
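Here is a hypothetical Python sketch of that static weighting. The numbers are made up; they only preserve the ordering described above (title above h1 above h2, keywords meta-tag ignored, description counted only when the term also matches a stronger signal):

# Hypothetical importance weights for the parser model; values are invented.
TAG_WEIGHTS = {
    "title": 1.0,
    "h1": 0.8,
    "h2": 0.6,
    "meta_description": 0.2,   # still read, but terms carry little weight
    "meta_keywords": 0.0,      # completely ignored
    "body": 0.4,
}

def term_score(term, page_tags):
    """Sum a term's weight across the tags it appears in, counting the
    description meta-tag only if the term also matches a stronger signal."""
    score = sum(weight for tag, weight in TAG_WEIGHTS.items()
                if tag != "meta_description" and term in page_tags.get(tag, ""))
    if score > 0 and term in page_tags.get("meta_description", ""):
        score += TAG_WEIGHTS["meta_description"]
    return score

page = {"title": "cheap running shoes", "h1": "running shoes",
        "meta_description": "buy cheap running shoes online",
        "body": "our shoes are built for distance"}
print(term_score("shoes", page))  # title + h1 + body, then the description on top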

Continuing with the parser model, imagine a point at the beginning of the actual content. Proximity is a measure that is used in a variety of ways. One is where a term, tag, or content block sits in relation to that point at the beginning of the content. Now think of header tags as indicators of sub-topics, and imagine a point at the beginning of the content that immediately follows a header tag and is terminated by the next header tag. Again, proximity is measured. Proximity is also measured between terms in a paragraph, sets of paragraphs, header tags, and so on. These measures are calculated as weights for terms based on how they are used and their apparent importance. Going beyond this, terms, phrases, citations, and indeed any similar portion of content can be compared between pages and sites using a slightly different but still similar proximity model.
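A minimal Python sketch of one such proximity measure, weighting a term by its distance from that imaginary starting point (the decay constant is arbitrary; only the shape of the curve matters):

def proximity_weight(term, content, decay=0.001):
    """Weight a term by how close its first occurrence is to the start of
    the content block (the imaginary point described above)."""
    position = content.lower().find(term.lower())
    if position == -1:
        return 0.0
    return 1.0 / (1.0 + decay * position)

section = "Keyword density is a myth. What matters is how terms are used in context."
print(proximity_weight("keyword", section))  # near the start, close to 1.0
print(proximity_weight("context", section))  # further in, slightly lower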

Pages are related using links, both from page to page and by proximity to the home page or any other page where a relationship cloud can be determined. For example, a topic page on SEO can have links to several SEO sub-topic pages. This indicates that the SEO topic page is important in that it links out to several similar topic pages, and a relationship cloud can be determined. So for any SEO sub-topic page, proximity would be a count of the links between the SEO topic page and the sub-topic page, as well as the number of links from the home page. In this way, a page's importance can be calculated. How important is the SEO topic page? It is one link from the navigation links on the home page, and indeed on every page: very important. However, the SEO sub-topic pages do not have links from the navigation and therefore get their importance from the metric for the SEO topic page. This follows the PageRank semantic link trust network model.
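As a sketch, click depth from the home page can stand in for that importance measure; the site structure below is hypothetical:

from collections import deque

def click_depth(site_links, start="home"):
    """Breadth-first distance (in clicks) from the home page to every page.
    Fewer clicks from home, or from a hub like the SEO topic page, is read
    here as higher importance."""
    depth, queue = {start: 0}, deque([start])
    while queue:
        page = queue.popleft()
        for target in site_links.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

# Hypothetical site: the SEO topic page is in the navigation, the sub-topics are not.
site = {"home": ["seo"], "seo": ["seo/keywords", "seo/links"]}
print(click_depth(site))  # {'home': 0, 'seo': 1, 'seo/keywords': 2, 'seo/links': 2}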

Going back to the original PageRank model, you can value pages by how you link to them, just as links pass value throughout the World Wide Web. This is called sculpting, though excessive, manipulative sculpting can be detected and ignored, so be natural. As you do this, you are also indicating the importance of the terms found on those pages. So any term on any page is weighted not only by where and how it is used on that page, but also by the apparent importance of the page in how and where it sits on your site. Is it starting to make sense?

Okay. Well and good, but how are terms related and how does semantics help with this? Again, keeping it very simple.

I have a site about cars. You are in the U.K. and have a site about automobiles. It is rather obvious that cars and automobiles mean the same thing. Search engines use a dictionary to better understand relationships between words and topics. Google differentiated itself early on by creating a self-learning dictionary. I will not get into that, but you will still get the picture. Using semantics:

Subject: cars
Predicate: equals
Object: automobiles


In this, Google can figure out that my site and your site are about the same thing. Taking it a step further:

Subject: car
Predicate: is painted
Object: dark red

Subject: automobile
Predicate: is painted
Object: maroon

Subject: deep red
Predicate: equals
Object: maroon


Assuming for a moment that only these two sites exist, any search for "deep red automobile" could return both the maroon automobile and the deep red car, even though the phrase "deep red automobile" does not exist anywhere on the web.
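A toy Python sketch of how such equivalences can expand a search so that "deep red automobiles" matches a page that only says "maroon automobile" (the pair list mirrors the triples above):

# Equivalence pairs mirroring the triples above (purely illustrative).
equivalences = [("cars", "automobiles"), ("deep red", "maroon")]

def expand(query_terms):
    """Expand query terms through the equivalence pairs until nothing new
    can be added, so equivalent phrasings match each other."""
    expanded = set(query_terms)
    changed = True
    while changed:
        changed = False
        for a, b in equivalences:
            if (a in expanded) != (b in expanded):
                expanded.update({a, b})
                changed = True
    return expanded

print(expand({"deep red", "automobiles"}))
# {'deep red', 'maroon', 'automobiles', 'cars'}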

In the early days of SEO, it was recommended that synonyms and plural versions of terms be used. This was back when semantics was not used, or was not as strong. Today, you can see this is not necessary, since relationships between words and their usage are kept in a semantics database.

Using the same model but jumping ahead quite a bit, if I write a brilliant piece that is quoted on several other webpages, semantics can note this as a citation and attribute it back to my original work, giving it much more importance even without any links to my page. In this way, a page with no inbound (back) links can outrank a page with a high number of inbound links simply because of a citation. Citations are an important part of applying the semantic web to the World Wide Web. In fact, while SEOs were chasing the elusive AuthorRank, there was no such thing. It was all semantics and data-pair matching, which I will not get into except to say that, for example, "written by" could indicate that an author's name immediately follows, and therefore a citation credit can be applied to the author if the piece is quoted.
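A toy version of that data-pair matching in Python, using a simple pattern to pair a byline with an author (real systems rely on far richer semantic cues than a regular expression):

import re

def extract_author(text):
    """If a byline such as 'written by <Name>' appears, pair the surrounding
    work with that author so a citation credit could be applied."""
    match = re.search(r"written by ([A-Z][a-z]+(?: [A-Z][a-z]+)*)", text)
    return match.group(1) if match else None

print(extract_author("A brilliant piece on semantics, written by Jane Doe, quoted widely."))
# Jane Doe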

Why did I go through all of this?

So that you can easily see that the mechanism behind valuing any term on a site is far more complicated and no longer dependent upon density, which was never fully the case anyway. In fact, density is no longer even a secondary effect. The reason for this is simple: it was easily gamed, and no decay rate could compensate for the gaming, just like in the original PageRank scheme.

As for any keyword-stuffed site, it is only a matter of time before semantics gives it away. Panda started out as a periodic task designed specifically to measure this and other similar things and to adjust metrics to downgrade the effects of an offending site in the SERPs. While the SiteRank generally stays the same, any site found to spam will take a knock in its TrustRank score for having a violation, thus downgrading the SiteRank slightly. I believe there is a severity component to this mechanism that allows minor offenses to be corrected without harm. The knock sticks around even when the problem is solved, because the violation is retained in the site's history. So what happens is that the SERP placement will drop until the problem is solved, at which point the placement will begin to rise again, but never back to the level the offending site once had, due to the notation of the violation. The older a violation becomes, the more it is forgiven, allowing a previous offense to lose its negative effect over time.

As a note, while it is said that Panda and others run more often and may be a continual process today, it still takes time to build the semantic link map needed to know whether a site is an offender. This means a site will get away with stuffing for a period, but fail in the end once the semantic links and metrics are fully established. As well, I am sure there is an initial effect from stuffing, but it is greatly diminished under the semantic model, and the effect is rather superficial as a by-product. This is because when a page is discovered, there is little to go on until the semantic link maps are filled out. Google, in its wisdom, allows some grace, letting the page rank high for terms within the important signals initially before settling into its proper placement in the SERPs. Assuming the signals match the semantics, recalculating SERP placement will result in a relative shift in how the page is found. Otherwise, if the signals and the semantics do not agree, the placement within the SERPs will be based upon semantics, and how the page is found will change. This is why it is important to send the right signals in the first place by using keywords and tags accurately and honestly.
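A hypothetical Python sketch of how a violation's knock on the trust score could be forgiven over time; the penalty value and the half-life are invented for illustration:

import math

def trust_penalty(base_penalty, days_since_violation, half_life_days=365):
    """Decay a spam violation's effect on a site's trust score: the knock
    is largest when fresh and is gradually forgiven as it ages."""
    return base_penalty * math.exp(-math.log(2) * days_since_violation / half_life_days)

for days in (0, 180, 365, 730):
    print(days, round(trust_penalty(10.0, days), 2))
# 0 10.0, 180 7.11, 365 5.0, 730 2.5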

[Update]

I cut and pasted this answer into the TextRazor demo at www.textrazor.com/demo and here is an example. You will see the relative position to that imaginary point at the beginning of the content and other linguistic analysis in the table, as well as the topic scores to the right. You can do the same by copying the text of this answer (above this update) and pasting it into the demo page and playing around a bit. I encourage it. It will give you a good idea of how content is processed.


