
Will space typos help my page ranking? (Dutch language)

@BetL925

Posted in: #Content #Keywords #Ranking #Seo

In Dutch, we concatenate all the words that make up a compound noun. English does the same in a few cases: spacebar, doorbell and whitespace. A phrase like "app development", which translates word for word to "app ontwikkeling", is written as one word in Dutch: "appontwikkeling". This applies no matter how many words are combined: "fietszadelleer" means leather for bicycle saddles (probably not the best example, but I was looking at a bicycle while thinking of a word).

What I see a lot on highly ranked Dutch pages is that they are full of space typos. Words like "androidontwikkelaar" and "marketingexpert" all get written with spaces (which is grammatically wrong).

I can think of three reasons why they do this:

1. If people search with the words separated, both words match the query.
2. The Google algorithm tries to understand what your article is about. Because not many languages concatenate nouns, maybe Google doesn't understand those concatenated words.
3. They don't know Dutch grammar (probably not the case).


Will deliberately making these space typos ("spatiefouten", or incorrectly "spatie fouten") improve my page ranking or search traffic?


4 Comments


 

@Hamaas447

Information retrieval technologies lemmatize search terms. That means the search engine interprets each term in its base form instead of using it verbatim. The same applies to indexed content. The problem with a language like Dutch is that the technology Google currently employs might not yet support the language well enough to lemmatize all words correctly, especially words not previously found in the dictionary, such as "androidontwikkelaar" (Android developer).

So the general rule should be to write as correctly as possible and not worry about details that the search engine is supposed to take care of, unless you have good reason to believe that the search engine gets a term wrong. For example, if you search for the term in Google and it says "did you mean" with a space added in between, then you should probably add the space as well.
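A minimal sketch of why a correct spelling and a spaced spelling can end up matching each other: if both the query and the indexed text are normalized to the same form, either spelling finds the page. The `COMPOUNDS` table and the `normalize` and `matches` functions are hypothetical names invented for this illustration, and a fixed lookup table stands in for real lemmatization. Nothing here is a description of how Google actually works.

```python
# Hypothetical, hand-made lexicon: spaced variant -> correct compound.
# A real search engine would learn such mappings from data, not a fixed table.
COMPOUNDS = {
    ("android", "ontwikkelaar"): "androidontwikkelaar",
    ("app", "ontwikkeling"): "appontwikkeling",
    ("fiets", "zadel"): "fietszadel",
}


def normalize(text):
    """Lowercase, split on whitespace, and join known spaced compounds."""
    tokens = text.lower().split()
    out = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in COMPOUNDS:
            out.append(COMPOUNDS[pair])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def matches(query, document):
    """True if every normalized query term occurs in the normalized document."""
    doc_terms = set(normalize(document))
    return all(term in doc_terms for term in normalize(query))
```

Under this toy table, `matches("fiets zadel", "Een nieuw fietszadel kopen")` and `matches("fietszadel", "Een nieuw fiets zadel kopen")` both hold, because each side is reduced to the same compound before comparison.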



 

@Nimeshi995

Nothing really matters if ontologies are not considered first.

For example, I can type fastredcar, which is not a term in the English ontology. No matter what people type into the search bar and what matches are found, your site can never rank for such a term, because the term does not exist within an ontology and is therefore not indexed as-is with a weight.

Ontologies are how search engines understand content. This is based upon information retrieval (IR) technologies that are older than most of us. So for fastredcar, if Google does not recognize it as fast red car, it will not recognize fastredcar as a term, simply because it is not one.

Keep in mind that Google does not directly match search terms and never has. Direct term matching was the paradigm that the original Google research paper by Brin and Page railed against. The point behind the creation of Google was that direct term matches return poor results.

Quoted from: The Anatomy of a Large-Scale Hypertextual Web Search Engine, written by Sergey Brin and Lawrence Page in 1997/98:


Automated search engines that rely on keyword matching usually return
too many low quality matches. To make matters worse, some advertisers
attempt to gain people’s attention by taking measures meant to mislead
automated search engines. We have built a large-scale search engine
which addresses many of the problems of existing systems. It makes
especially heavy use of the additional structure present in hypertext
to provide much higher quality search results. We chose our system
name, Google, because it is a common spelling of googol, or 10^100 and
fits well with our goal of building very large-scale search engines.


With direct term matching, the results are documents that closely match the query, not documents of high relevance, which is what such a small presentation space as the first page of search results requires. (Paraphrased from the research paper.) For this reason, semantic analysis is used, including topical analysis. Topical analysis weights the entire content and content segments by the terms and topics recognized within topical ontologies. These topical weights are used to determine whether a term is used within a topical context and not just arbitrarily. For example, I can write a web page about cats but try to attract searches for other terms by inserting them into the text. However, terms such as car, tire, and engine have no context on a web page about cats. Topical strength scores also weed out short content and content that is watered down.

As an example, quite a few years ago I worked with a webmaster who had a website about cars and was appearing in search results for women's clothing and shoes. After a quick review, I realized that the problem was his use of terms such as sexy, curves, svelte, vinyl, leather, and tight fit to describe a car. This was before the shift that brought Google Scholar, which relied heavily upon semantic analysis, into the regular search algorithm. Today this is not a problem, especially as adjectives are recognized as belonging to topics using the same sets of ontologies.
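The idea of topical weighting can be pictured with a toy score: count how much of a page's text falls inside each topic's vocabulary. The `TOPICS` vocabularies and the `topic_shares` function are hypothetical and only illustrate the principle. Real topical ontologies are far larger, and this is not a description of Google's actual scoring.

```python
from collections import Counter

# Hypothetical topic vocabularies, invented for this illustration.
TOPICS = {
    "cats": {"cat", "cats", "kitten", "fur", "purr", "litter", "whiskers"},
    "cars": {"car", "tire", "engine", "brake", "wheel"},
}


def topic_shares(text):
    """Return, per topic, the fraction of tokens found in its vocabulary."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    counts = Counter()
    for token in tokens:
        for topic, vocab in TOPICS.items():
            if token in vocab:
                counts[topic] += 1
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {topic: counts[topic] / total for topic in TOPICS}
```

On a page about cats with "car tire engine" appended, the cats share stays higher than the cars share, which is the sense in which inserted terms lack topical context.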

You are fortunate in that Google does recognize that not all valid terms exist within ontologies. For this reason, Google creates its own ontology using some simple AI (artificial intelligence) that allows terms used in multiple places to be defined both as a term and as a topic, using the surrounding context and linguistic semantic analysis. While this may not always be perfect, it does work.

Another consideration is n-gram analysis. A term that does not already exist within an ontology may be broken down into smaller parts to help understand the term itself. For example, prototype would be broken into proto and type so that the term can be understood. Ontologies of different languages and common roots are used for this. In the same way, fastredcar can be broken down into fast red car so that it can be understood.

So to answer your question: using my example, fastredcar cannot be weighted as it is. However, it could be weighted as fast red car and can return valid results as separate terms. When concatenating terms, you would have to consider whether the search engine would be confused and what possible values would be assigned. It is possible that the terms you are concatenating are in an ontology. I am not familiar with Dutch, but it is possible that some of the concatenations you are using are common enough to be recognized within a custom ontology. Who knows? It is something to explore. As a rule, I would say to avoid this convention as much as possible (unless you are sure it is correct and okay) to better ensure that what you mean is what the search engine understands.
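Breaking an unknown term into known parts can be sketched as dictionary-based segmentation. This is a swapped-in, simplified technique, not the n-gram analysis any particular search engine uses. The `split_compound` function and the small `LEXICON` are hypothetical and exist only for this example.

```python
# Hypothetical mini-lexicon; a real one would hold a full vocabulary.
LEXICON = {"fast", "red", "car", "proto", "type", "fiets", "zadel", "leer"}


def split_compound(word, lexicon):
    """Split `word` into lexicon entries, or return None if impossible.

    Dynamic programming: best[i] holds a segmentation of word[:i]
    with the fewest parts found so far.
    """
    word = word.lower()
    best = [None] * (len(word) + 1)
    best[0] = []
    for end in range(1, len(word) + 1):
        for start in range(end):
            piece = word[start:end]
            if best[start] is not None and piece in lexicon:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[len(word)]
```

With this lexicon, `split_compound("fastredcar", LEXICON)` yields the parts fast, red and car, and the Dutch "fietszadelleer" splits into fiets, zadel and leer. A word with no valid split returns `None`, which corresponds to a term the engine cannot interpret.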



 

@Miguel251

It's actually not dependent on what Google understands but on what users type. This is a really common issue in Dutch language optimisation. Many users will do partial word searches that may be relevant, and Google will offer similar searches at the bottom of the page ("Users also searched for:") but unfortunately gives results based on the individual nouns used.

Google's language understanding is done from the query perspective. If users commonly concatenate the noun, then it's better to join it. In fact, if this is the common way to write it, Google will even autocorrect.

If it's a concatenated noun rarely searched for - then you will need to add space errors. This is why you'll see it all over the place. Sometimes it's better safe than sorry.

I realise it's inconvenient to research every individual noun combination, but once you understand your demographic's language use this becomes a lot easier. A technique that might assist you is studying your users' social media and the Latent Dirichlet Allocation of nouns. There are guides online on how to perform this study with a program called KNIME.
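As a much simpler first step than Latent Dirichlet Allocation, you could count how often your audience writes a compound joined versus spaced. The sketch below is a plain frequency count, not LDA and not KNIME. The `count_variants` function is a hypothetical helper, and the texts you feed it would come from your own sample of queries or posts.

```python
import re


def count_variants(texts, parts):
    """Count joined vs. spaced spellings of a compound across `texts`.

    `parts` are the compound's components, e.g. ("fiets", "zadel").
    """
    escaped = [re.escape(p) for p in parts]
    joined = re.compile(r"\b" + "".join(escaped) + r"\b", re.IGNORECASE)
    spaced = re.compile(r"\b" + r"\s+".join(escaped) + r"\b", re.IGNORECASE)
    counts = {"joined": 0, "spaced": 0}
    for text in texts:
        counts["joined"] += len(joined.findall(text))
        counts["spaced"] += len(spaced.findall(text))
    return counts
```

If the joined count clearly dominates in your sample, that supports keeping the correct spelling. If the spaced form dominates, that is the situation in which the answer above suggests adding space errors.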



 

@YK1175434

I'm a fan of "Make it right for your visitors, only tweak if you know you benefit from it".

The rule is very simple: if two words can be connected (fiets zadel), you must connect them (-> fietszadel). Don't reinvent it for your needs.

Google knows synonyms or similarities between words (this is why you don't put 'fiets, zadel, fietszadel' in your keyword meta tag), so I'd say keep your text proper.



I just Googled 'fiets zadel' and got a 'Did you mean fietszadel?'


