
Google search results are invalid

@Berryessa370

Posted in: #Google #GoogleSearch #Http #Redirects #SearchEngines

I'm writing a program that lets a user perform a Google search.

When the results come back, every link in them points to Google rather than to the other site, and if the user clicks on one, the page is fetched from Google instead of from that site.

Can anyone explain how to fix this problem?

My Google URL is:
google.com/search?q=gargle
But this is what I get back when the user clicks on the Wikipedia search result, whose link was www.google.com/url?q=http://en.wikipedia.org/wiki/Gargling&sa=U&ei=_4vkT5y555Wh6gGBeOzECg&ved=0CBMQejAe&usg=AFQjeNHd1eRV8Xef3LGeH6AvGxt-AF-Yjw
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en" dir="ltr" class="client-nojs" xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Gargling - Wikipedia, the free encyclopedia</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="MediaWiki 1.20wmf5" />
<meta http-equiv="last-modified" content="Fri, 09 Mar 2012 12:34:19 +0000" />
<meta name="last-modified-timestamp" content="1331296459" />
<meta name="last-modified-range" content="0" />
<link rel="alternate" type="application/x-wiki" title="Edit this page" >
<link rel="edit" title="Edit this page" >
<link rel="apple-touch-icon" >
<link rel="shortcut icon" >
<link rel="search" type="application/opensearchdescription+xml" >
<link rel="EditURI" type="application/rsd+xml" >
<link rel="copyright" >
<link rel="alternate" type="application/atom+xml" title="Wikipedia Atom feed" >
<link rel="stylesheet" href="//bits.wikimedia.org/en.wikipedia.org/load.php?debug=false&amp;lang=en&amp;modules=ext.gadget.teahouse%7Cext.wikihiero%7Cmediawiki.legacy.commonPrint%2Cshared%7Cskins.vector&amp;only=styles&amp;skin=vector&amp;*" type="text/css" media="all" />
<style type="text/css" media="all">#mwe-lastmodified { display: none; }</style><meta name="ResourceLoaderDynamicStyles" content="" />
<link rel="stylesheet" href="//bits.wikimedia.org/en.wikipedia.org/load.php?debug=false&amp;lang=en&amp;modules=site&amp;only=styles&amp;skin=vector&amp;*" type="text/css" media="all" />
<style type="text/css" media="all">a:lang(ar),a:lang(ckb),a:lang(fa),a:lang(kk-arab),a:lang(mzn),a:lang(ps),a:lang(ur){text-decoration:none}

/* cache key: enwiki:resourceloader:filter:minify-css:7:d5a1bf6cbd05fc6cc2705e47f52062dc */</style>


What's unclear to me is how a regular browser ever receives the wikipedia.org link to be able to put it in its address bar...

The headers aren't too helpful either:

"Cache-Control" = "private, max-age=0";
"Content-Type" = "text/html; charset=UTF-8";
Date = "Fri, 22 Jun 2012 15:28:51 GMT";
Expires = "-1";
Server = gws;
"Transfer-Encoding" = Identity;
"X-Frame-Options" = SAMEORIGIN;
"X-XSS-Protection" = "1; mode=block";


The status code was 200, not a redirect.


3 Comments


@Si4351233

You have to scrape the search results page and get the first <a> tag in each <h3> tag within the element whose ID is ires.

There are plenty of PHP classes for parsing the DOM and HTML that let you select tags, nested tags, and even tag attributes to get the data you want.
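
The same idea can be sketched in Python with only the standard library, in case PHP isn't what you're using. This is a rough sketch, not a definitive scraper: it assumes the results still sit under an element with the ID ires and that the markup is reasonably well balanced. The class name ResultLinkParser, the helper extract_result_links and the SAMPLE markup are all made up for illustration; Google's real markup may differ.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect depth tracking.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ResultLinkParser(HTMLParser):
    """Collects the href of the first <a> inside each <h3> under id="ires"."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.depth_in_ires = 0   # 0 means we are outside the ires container
        self.in_h3 = False
        self.h3_has_link = False

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self.depth_in_ires == 0:
            # Only start tracking once the container itself is seen.
            if attrs.get("id") == "ires":
                self.depth_in_ires = 1
            return
        self.depth_in_ires += 1
        if tag == "h3":
            self.in_h3 = True
            self.h3_has_link = False
        elif tag == "a" and self.in_h3 and not self.h3_has_link:
            href = attrs.get("href")
            if href:
                self.links.append(href)
                self.h3_has_link = True

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth_in_ires == 0:
            return
        self.depth_in_ires -= 1
        if tag == "h3":
            self.in_h3 = False


def extract_result_links(html):
    """Hypothetical helper: return the result links found in the page text."""
    parser = ResultLinkParser()
    parser.feed(html)
    return parser.links


# Made-up markup that only imitates the structure described above.
SAMPLE = """
<div id="nav"><h3><a href="/ignored">Ignored</a></h3></div>
<div id="ires"><ol>
  <li><h3 class="r"><a href="/url?q=http://en.wikipedia.org/wiki/Gargling&amp;sa=U">Gargling</a></h3>
      <a href="/cached">Cached</a></li>
  <li><h3 class="r"><a href="/url?q=http://example.com/&amp;sa=U">Example</a></h3></li>
</ol></div>
<h3><a href="/after">After</a></h3>
"""

if __name__ == "__main__":
    for link in extract_result_links(SAMPLE):
        print(link)
```

Note that the links you get this way are still the Google tracking links, so you would still need to resolve them to the real destination before handing them to your users.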


@Goswami781

They use a redirect page to track what you click on. If you fetch that redirect page yourself, you will get redirected to the link, and you could give that page to your users. The redirect page uses JavaScript. Here's one example:

<script>window.googleJavaScriptRedirect=1</script>
<script>var f={};
f.navigateTo=function(b,a,g){if(b!=a&&b.google)
{if(b.google.r){b.google.r=0;b.location.href=g;a.location.replace("about:blank");}}
else{a.location.replace(g);}};f.navigateTo(window.parent,window,"http://www.parliament.uk/bigben");
</script>
<noscript><META http-equiv="refresh" content="0;URL='http://www.parliament.uk/bigben'"></noscript>


You can see it has a <noscript> fallback for when JavaScript is not enabled. (The way I got that was to right-click on the title to get the link and then put view-source:(link) into my browser, Chrome.)
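
If your program can't run JavaScript, one option is to read the target out of that meta refresh fallback yourself. Here is a minimal Python sketch, assuming the redirect page looks like the example above. MetaRefreshParser and find_refresh_target are names I made up, and REDIRECT_PAGE is just the example page with the script body trimmed.

```python
import re
from html.parser import HTMLParser


class MetaRefreshParser(HTMLParser):
    """Remembers the URL of the first <meta http-equiv="refresh"> tag seen."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names, so <META> arrives as "meta".
        if tag != "meta" or self.target is not None:
            return
        attrs = {name: (value or "") for name, value in attrs}
        if attrs.get("http-equiv", "").lower() != "refresh":
            return
        # The content value looks like: 0;URL='http://example.com/'
        match = re.search(r"url\s*=\s*['\"]?([^'\"]+)",
                          attrs.get("content", ""), re.IGNORECASE)
        if match:
            self.target = match.group(1).strip()


def find_refresh_target(html):
    """Hypothetical helper: return the meta refresh URL, or None if absent."""
    parser = MetaRefreshParser()
    parser.feed(html)
    return parser.target


# The redirect page from the example above, with the script shortened.
REDIRECT_PAGE = """
<script>window.googleJavaScriptRedirect=1</script>
<script>var f={};f.navigateTo=function(b,a,g){a.location.replace(g);};
f.navigateTo(window.parent,window,"http://www.parliament.uk/bigben");
</script>
<noscript><META http-equiv="refresh" content="0;URL='http://www.parliament.uk/bigben'"></noscript>
"""

if __name__ == "__main__":
    print(find_refresh_target(REDIRECT_PAGE))
```

In a real program you would fetch the tracking link first and pass the response body to find_refresh_target. If it returns None, the page probably isn't one of these redirect pages.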

The link is also listed on the second line of the results (but not as a link) so you could take that text and use that for the link you give to your users.
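
Along the same lines, the tracking link quoted in the question carries the destination in its q parameter, so you may be able to recover it without fetching anything. This is only a sketch based on that one URL. I'm assuming the parameter is always called q, which Google could change at any time, and unwrap_google_link is a made-up name.

```python
from urllib.parse import parse_qs, urlsplit


def unwrap_google_link(link):
    """Return the destination stored in the q parameter, or the link unchanged."""
    parts = urlsplit(link)
    # Only links whose path ends in /url are treated as tracking links (an assumption).
    if not parts.path.endswith("/url"):
        return link
    targets = parse_qs(parts.query).get("q")
    return targets[0] if targets else link


# The tracking link from the question.
TRACKING_LINK = ("www.google.com/url?q=http://en.wikipedia.org/wiki/Gargling"
                 "&sa=U&ei=_4vkT5y555Wh6gGBeOzECg&ved=0CBMQejAe"
                 "&usg=AFQjeNHd1eRV8Xef3LGeH6AvGxt-AF-Yjw")

if __name__ == "__main__":
    print(unwrap_google_link(TRACKING_LINK))
```

Links that don't look like tracking links are returned untouched, so it should be safe to run every scraped link through it.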


@Sarah324

That's personalized search. If someone is signed in to Google and they allow Google to track their web history, or Google is tracking clicks for any other reason, that is what they will see. You can't do anything to prevent this from happening.
