
Advice on trying to determine the rankings of my keywords via PHP

@Berryessa370

Posted in: #Google #Search

I've written a piece of code that crawls the first 40 to 50 search engine results,
checks them for the specified keywords, and tries to gather some statistics on them.

I've heard that this may be called content scraping, but I really don't think
that's an issue here, since I would only run it maybe twice a day, on a set of
keywords that I think will be relevant to my service.

As for using SEOmoz and the other players in the game, I think it's unfair, because
they charge a lot for some simple keyword tracking.

Please give me some advice on what can go wrong and what would be a good way to do this.

Thank you very much


2 Comments


 

@XinRu657

If you're only running the script a few times a day, then it's unlikely anything will actually go wrong. The main issues are as follows:


You need to supply a dummy user agent string similar to one sent by a browser.
Search results vary depending on factors such as the IP you're using, so you might not get results that are as useful as they seem.
Google regularly changes the way it displays results, which could break your code, so it will need to be updated.
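
To make the first and third points concrete, here is a minimal sketch, written in Python for brevity (the same logic ports to PHP). It builds a request with a browser-like user agent and keeps the rank lookup separate from any page parsing, so that only the parser needs updating when the result markup changes. The endpoint `SEARCH_URL`, the user agent string `BROWSER_UA`, and both function names are hypothetical placeholders, not anything the question's script is known to use.

```python
from urllib.parse import urlencode, urlparse
from urllib.request import Request

# Hypothetical endpoint and user agent string; substitute your own.
SEARCH_URL = "https://search.example/search"
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) ExampleRankChecker/0.1"


def build_request(keyword, user_agent=BROWSER_UA):
    """Build (but do not send) a search request with a browser-like User-Agent."""
    url = SEARCH_URL + "?" + urlencode({"q": keyword})
    return Request(url, headers={"User-Agent": user_agent})


def find_rank(result_urls, domain):
    """Return the 1-based position of the first result on `domain`, or None."""
    for position, url in enumerate(result_urls, start=1):
        host = urlparse(url).hostname or ""
        # Match the domain itself or any of its subdomains.
        if host == domain or host.endswith("." + domain):
            return position
    return None
```

`find_rank` takes a plain list of result URLs, however they were obtained, so the statistics code never depends on the page layout.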


Google calls what you are doing "screen scraping", and it is against their terms of service no matter how few times the script is run. They state that this is because it puts excessive load on their servers.

However, I think that there's probably a copyright issue as well. If they let people easily copy their content then it would be possible to create a very cheap search engine that takes results from Google but replaces the ads. Google spends a lot of money on their search algorithm and needs to collect ad revenues to make a profit and continue development.

I guess that they also don't want marketers running these types of scripts because they don't want SEOs trying to gain insights they can use to unfairly manipulate rankings. There might be a similar issue for AdWords, where they don't want marketers knowing too much about the keywords and ads that their competitors are using.

The main defense that Google uses against automated scripts is to temporarily block IPs that issue too many requests within a short period of time. As I stated earlier, your script shouldn't run into this problem, but people issuing large numbers of queries usually use proxy servers to make regular IP changes and reduce the chance of detection.
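
Since the blocking described above is triggered by many requests in a short period, one simple precaution is to space requests out. The sketch below is only an illustration in Python (easy to port to PHP); the `Throttle` class and its parameters are hypothetical names, and the interval you pick is your own assumption, not a limit published by any search engine.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests
        self._clock = clock               # injectable for testing
        self._sleep = sleep               # injectable for testing
        self._last = None                 # time of the previous request

    def wait(self):
        """Block until at least `min_interval` seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` before each query keeps a twice-a-day batch of keywords from arriving as a burst.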

Finally, this is beyond the scope of the question, but another issue with this type of script is that it also violates the terms of service of other Google systems. For example, the AdWords API TOS explicitly states you can't use the API with such a program. This could lead to API Developer Tokens being revoked on detection of these methods.



 

@Pope3001725

Use Google's web search API and you won't be content scraping anymore. (I linked to the search results since it returned a lot of resources on the topic).
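
The answer doesn't name a specific API, so as an assumption this sketch uses Google's Custom Search JSON API, which takes an API key (`key`), a search engine ID (`cx`), the query (`q`), and a 1-based `start` index, and returns results in an `items` array with a `link` field each. It is written in Python for brevity and ports to PHP easily. The function names are hypothetical, the key and engine ID are placeholders you would have to obtain yourself, and the API's results may not match what a browser shows.

```python
import json
from urllib.parse import urlencode, urlparse

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_api_url(api_key, engine_id, keyword, start=1):
    """Build a query URL; `start` is the 1-based index of the first result."""
    params = {"key": api_key, "cx": engine_id, "q": keyword, "start": start}
    return API_ENDPOINT + "?" + urlencode(params)


def rank_in_response(response_text, domain, start=1):
    """Return the overall 1-based rank of `domain` in one page of results, or None."""
    items = json.loads(response_text).get("items", [])
    for offset, item in enumerate(items):
        host = urlparse(item.get("link", "")).hostname or ""
        # Match the domain itself or any of its subdomains.
        if host == domain or host.endswith("." + domain):
            return start + offset
    return None
```

To cover the first 40 to 50 results, you would request successive pages by increasing `start` and pass the same value to `rank_in_response`.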


