Generating statistically likely searches for an existing web page

@Jamie184

Posted in: #Keywords #Seo #SeoAudit

In the process of doing a Stack Exchange beta community quality evaluation, which includes the instruction to take a question and:


Run comparative Google searches on these questions and see if the
content is better or worse than what is already out there on the
internet.


What is the best way to ensure that the search queries I'm reviewing for a given question are, statistically speaking, likely to produce the results users would most likely get?

(Worth noting that my usual solution for this would be Google's Webmaster Tools, though even that has limits: it only covers the top 2,000 searches for a site over a relatively short timeframe. That means not all the searches for a given URL would show up, nor would they necessarily be a good sample of what users would really search for over a longer duration.)

10.03% popularity



3 Comments


 

@Pierce454

Let me see if I understand what you are trying to achieve:


Person enters content which can be anywhere from a clear and concise question to a vague concept with additional content supporting the primary question.
Community attempts to answer by providing content including a clear answer, supporting evidence or explanations, and links or assets to support their answer.
You want to automatically take the content produced by both parties and compare it with content found both on your site and elsewhere on the web, to determine whether it statistically answers the question on par with, or better than, what is found elsewhere.


If this is the case then you need to understand a few inherent flaws with what you are trying to achieve:


Unless you want to build a Natural Language Processing system and an artificial-intelligence platform that knows what a "good answer" is and what is not, at best you'll be able to find which keywords are or aren't present compared to the competition.
Because of the many variations and possibilities in asking a question, it is difficult to programmatically determine whether the answer portion of the content succinctly answers the question. Case in point, this very post: we have all provided, let's say, 85% of the answer in some form or fashion, yet because of the presentation of the content, or the lack of the remaining 15% of supporting evidence, you have determined that our answers do not correctly answer your question. Even if you were to establish statistical significance, your results could well be false positives, much as we thought we had correctly answered your question here.


What you could do (to attempt to answer your question):

Build a topic model graph for questions & answers


Take the question content and run it through an NLP entity-extraction tool like AlchemyAPI. I say question content only because you do not want other page content skewing your data.
Store the results, which include sentiment, entities, concepts, etc., so you can analyze the data both for this question and for future questions on the topic.
Run that same content through a keyword or content-planning tool to get keyword ideas and search volume. Ideally (I'm not aware of a tool that does this) you'll add modifiers to the content: who, what, when, where, why, how, is, etc., to get results that include the content plus any of these modifiers.
Do the same as in steps 1 and 2 with the answers portion of the content.
Compare the results from the keyword tool with the content of the answer portion of your page. Here is where you'll use statistics on the two result sets (question and answer) to determine whether there are enough supporting or expansionary topics that clarify and answer the question.
Find the entities, concepts, topics, etc. that are lacking in the answers and provide them to community members as guidance when answering the question. Think of it as meta-tag optimization, but for your answer content: "try to include supporting concepts like x, y, z when answering this question, as users have found these helpful".
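The steps above can be sketched in a few lines of Python. This is a minimal sketch under loud assumptions: extract_entities is a hypothetical stand-in for a real entity-extraction service (it just tokenizes and lowercases, which is not real NLP), and the modifier list mirrors step 3.

```python
from collections import Counter

# Placeholder for a real entity-extraction call (e.g. a service like
# AlchemyAPI); this merely tokenizes and lowercases -- NOT real NLP.
def extract_entities(text):
    return Counter(w.strip(".,?!").lower() for w in text.split() if len(w) > 3)

MODIFIERS = ["who", "what", "when", "where", "why", "how", "is"]

def expand_with_modifiers(question):
    """Step 3: candidate queries to feed a keyword-planning tool."""
    return [f"{m} {question}" for m in MODIFIERS]

def missing_topics(question, answers):
    """Steps 4-6: terms present in the question but absent from all answers."""
    q_entities = extract_entities(question)
    a_entities = Counter()
    for a in answers:
        a_entities += extract_entities(a)
    return [e for e in q_entities if e not in a_entities]

gaps = missing_topics(
    "Is UX the same as usability testing?",
    ["UX covers the whole experience, not just one test."],
)
```

Those gaps are exactly what step 6 would surface to answerers as supporting concepts to include.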


So far we have only determined whether the answers on this site address the questions to an acceptable degree. Now we want to see how this compares with answers found elsewhere on the web. This may be a bit more difficult, and I'd welcome help if anyone wants to jump in.


If the question is clear and concise enough, e.g. "Is UX the same as Usability?", run it through a tool that returns the top X pages in the SERPs, then extract the page content from those results to compare with your work above. (This is a bit more difficult because you will be looking at the entire page, not just the Q&A content, unless you find a way to identify and extract only that content.)
If the question is not clear and concise, take the results from both the Q&A portions and run them through the same process as above.
Now compare these web results with your page, measuring the overlap of topics, entities, and concepts in both, to determine which better corresponds to the most popular queries found with the keyword research tool in step 3 above.
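One crude way to score that comparison, assuming entity sets have already been extracted, is Jaccard overlap between each page's entities and the topic set derived from keyword research. The tokenizer and all sample strings below are invented for illustration.

```python
def entity_set(text):
    # Naive tokenizer standing in for real entity/concept extraction.
    return {w.strip(".,?!").lower() for w in text.split() if len(w) > 3}

def jaccard(a, b):
    """Set overlap |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def covers_topics_better(topics, our_page, competitor_page):
    """True when our page overlaps the researched topic set at least as
    well as the competitor's page does."""
    return (jaccard(topics, entity_set(our_page))
            >= jaccard(topics, entity_set(competitor_page)))

topics = {"usability", "experience", "testing"}   # from keyword research
ours = "Usability testing measures the user experience directly."
theirs = "Our agency offers many digital services."
result = covers_topics_better(topics, ours, theirs)
```

Jaccard is only one possible measure; TF-IDF or embedding similarity would be more robust, at the cost of more machinery.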


Now the graph part...


For long-term success, you'll want to build a graph of all pages with these entities as nodes of the graph. Something like Neo4j should work.
Build a topic model that identifies which pages contain which entities, and the strength or relevance of those entities as they pertain to the overall topic. For example, "What is UX?" would contain general topics related to UX plus other supporting topics. On the other hand, "How do I use [specific tool] in conjunction with [other specific tool] to achieve [specific concept]?" is very specific and should (presumably) contain a larger number of entities and fewer general terms, answering a question from a presumably experienced UX user.
Leverage this graph data to provide helpful hints to the community when answering a user's question. At the end of the day, your entity graph will be used to ask those answering the question to provide more detail around a specific set of topics.
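A tiny in-memory adjacency structure shows the idea behind that hint mechanism; the page name, entities, and relevance weights here are made up for illustration, and a production system would store this in Neo4j as suggested.

```python
from collections import defaultdict

class TopicGraph:
    """Tiny in-memory stand-in for a page/entity graph (Neo4j in production)."""
    def __init__(self):
        self.edges = defaultdict(dict)   # page -> {entity: relevance}

    def link(self, page, entity, relevance):
        self.edges[page][entity] = relevance

    def hints(self, page, draft_entities, top_n=3):
        """Entities strongly tied to this page but missing from a draft answer,
        strongest first -- the 'try to include x, y, z' suggestions."""
        missing = {e: r for e, r in self.edges[page].items()
                   if e not in draft_entities}
        return sorted(missing, key=missing.get, reverse=True)[:top_n]

g = TopicGraph()
g.link("what-is-ux", "usability", 0.9)
g.link("what-is-ux", "user research", 0.7)
g.link("what-is-ux", "wireframes", 0.4)
suggestions = g.hints("what-is-ux", {"usability"})
```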


Phew! Hope that helps...

10% popularity


 

@Heady270

When I want to figure out the keywords that users are actually searching for, I use the Google AdWords Keyword Planner. It used to be publicly available, but now you need an AdWords account to access it.

I start by coming up with a list of things I think users would be likely to search for to get to the page, then plug that into the tool, which tells me which of them actually get the most searches and gives me further examples.

Take this question as an example. Here are four phrases that I plugged in that I thought might be what users were searching for:



It also gave me a list of related keywords that I can pick through. Here are some of its best suggestions:


keyword research (14,800 monthly searches)
keyword generator (6,600 monthly searches)
keyword research tool (6,600 monthly searches)


If I were asked to evaluate the quality of this question I would compare it to other content available for those searches.



If you can't get an AdWords account, you can use Google Trends to see how popular the ideas you came up with are. Unfortunately, it won't suggest related terms.

10% popularity


 

@Holmes151

This probably isn't the ideal way, but it's one possible way to get ideas.

Chances are people will only reach the site/page if it appears in the first one or two pages of the SERPs. So it's a matter of finding which search terms make the webpage show in the top 10-20 results, then using the other pages that rank for those terms, primarily the top few, as quality comparisons. The higher the page in question ranks in the SERPs, the more "statistically likely" it is to be found by users, and the more relevant the competing content.

The goal is to find any search term for which the page in question appears in the top 10-20 results. Rather than testing every possible query, you can assume the page will only rank for relevant terms, and that only relevant terms will get users to that page. So:

Brainstorm every possible relevant term, then expand the list dramatically with Google Keyword Planner and Google autocomplete.
Plug the full list into a bulk SERP checker. A premium SEO tool might work better, but a couple of free ones I found quickly, searchenginereports.net/ and serp-checker.ezmlm.org/, might work.
Narrow the list down to the terms that put the page within the top 10-20 results; those are probably the most relevant search queries.
Use the other pages that rank for those terms as quality comparisons. The higher the page in question appears in the SERPs for a query, the more likely people are to reach it with that query, and the more relevant the competing pages are for quality comparison.
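That narrowing-down step is essentially a filter over candidate queries. In this sketch, rank_for is a hypothetical callable standing in for a bulk SERP-checker lookup, and the positions dict is made-up data.

```python
# rank_for(query) -> position (1 = top result) or None if the page does
# not rank; a real implementation would call a rank-tracking API.
def shortlist_queries(candidates, rank_for, max_rank=20):
    """Keep only queries where the page already ranks in the top results."""
    ranked = {q: rank_for(q) for q in candidates}
    hits = {q: r for q, r in ranked.items() if r is not None and r <= max_rank}
    # Sort best-ranking first: higher positions mean users are more
    # likely to reach the page with that query.
    return sorted(hits, key=hits.get)

positions = {"keyword research": 4, "keyword generator": 35, "seo audit": 12}
shortlist = shortlist_queries(positions, positions.get)
```

The surviving queries are the ones worth running comparative searches on.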

A backlink checker will also help find keywords the page could plausibly rank for reasonably well in the SERPs: the anchor text used in those links boosts the page for those terms, increasing the likelihood that people would find the site with that search. ahrefs.com/, moz.com/researchtools/ose/, majestic.com/
It may not be ideal, but there don't seem to be many other suggestions. Hope it points you in the right direction.

10% popularity

