Can Google index/crawl random subdomains that aren't linked to from anywhere?

@Eichhorn148

Posted in: #Googlebot #GoogleIndex #GoogleSearch #Seo #WebCrawlers

A friend and I are having a debate: she claims that each and every subdomain will get crawled/indexed by Google unless you specifically tell it not to, and I'm saying that if a page isn't linked to from anywhere then it shouldn't get crawled.

For example, let's say I own example.com and I make a new subdomain with a weird random name such as adayinthewoods.example.com, and I throw a WordPress installation on there that I plan to use for testing purposes.

What would have to happen for Google to start crawling and indexing this? Is Google going to look at the whois records, see that I've added a subdomain in my DNS table, and then start crawling it as a result? Is the fact that I've installed WordPress on it making my installation "ping search engines"? How does that work? How do top-level domains get crawled when it's a brand new domain? I assume the mechanism there is different from random subdomains.

What if I add a new page in my root folder called "noOneWillEverSeeThis.html", could that ever get crawled/indexed if it wasn't included in any sitemap and wasn't linked to from anywhere?

Would really appreciate a solid answer from someone that understands what's going on with this.

Thanks much


3 Comments


@Berumen354

Nope. If your subdomain has no high-quality content and no links pointing to it (from the main domain or from anywhere else), Google is not going to crawl or index it.


@Angela700

Yes, Google can do that, and it is best to assume that anything publicly available on the internet may be indexed by Google, linked to or not.

Of course, if you don't link to it, the chances of it being indexed go way down. However, Google uses a multitude of tools to gather URLs for indexing. Recently there was a news item about Dropbox links that had been shared (thus becoming publicly available) being indexed by Google because people clicked on links in the documents or put their URLs into the Google search box.

It doesn't really matter how Google finds the link. The point is that it may.

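So the bottom line is that if you don't want something to wind up in Google, you must put a robots.txt file in place to keep Google's crawler out. Keep in mind that robots.txt only blocks crawling: a blocked URL that gets linked from elsewhere can still show up in results as a bare URL, so for anything truly private, require a login as well. Relying on obscurity is not advised.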
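
As a rough sketch of the robots.txt approach, assuming the test subdomain from the question and a blanket "disallow everything" rule (both the file contents and the URL below are illustrative, not taken from a real site), Python's standard urllib.robotparser can show how a compliant crawler would read the file:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt served at the root of the test subdomain.
# It asks every compliant crawler to stay away from every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Illustrative URL on the subdomain named in the question.
    url = "http://adayinthewoods.example.com/wp-login.php"
    print(is_crawl_allowed(ROBOTS_TXT, "Googlebot", url))
```

This only models what a well-behaved crawler is asked to do; it says nothing about crawlers that ignore robots.txt.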


@Harper822

Think of linking as a chain reaction. Google won't crawl a domain if it has no way of finding it or accessing it. If a friend advertises your URL on a popular forum site that Google always indexes, then there's a chance Google will follow the link to your URL and possibly index it, treating the link as part of the site it was crawling.


Is the fact that I've installed WordPress on it making my installation "ping search engines"? How does that work?


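I doubt content management systems ping search engines at random, though WordPress does ship with an "Update Services" setting that notifies ping services when you publish a post, so check its settings first. Beyond that, you can always check the source code and see whether any of it contains calls that open remote URLs. Examples of such code in PHP (which is what WordPress uses) might include (in no particular order):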

$data = file_get_contents("http://www.searchengine.com/submittoengine/data.cgi?whatever=whatever");

$remote = fopen("http://www.remote.com/upload.cgi?website=bla.com", "r"); // fopen() requires a mode argument


Or even curl functions, which may include:

curl_exec($webdata); // $webdata is a handle returned by curl_init()
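
To make the "check the source code" step less tedious, here is a small sketch that walks a directory of PHP files and reports lines that call the functions mentioned above. The function list is an assumption about what is worth flagging (it adds curl_init plus wp_remote_get and wp_remote_post, WordPress's own HTTP helpers), and a match only means the line deserves a look, not that it contacts a search engine:

```python
import re
from pathlib import Path

# PHP functions that can open remote URLs. This list is an illustrative
# starting point, not an exhaustive catalogue.
REMOTE_CALL = re.compile(
    r"\b(file_get_contents|fopen|curl_init|curl_exec|"
    r"wp_remote_get|wp_remote_post)\s*\("
)


def find_remote_calls(root):
    """Return (file path, line number, function name) for each matching line."""
    hits = []
    for path in sorted(Path(root).rglob("*.php")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = REMOTE_CALL.search(line)
            if match:
                # Record only the first matching call on each line.
                hits.append((str(path), lineno, match.group(1)))
    return hits
```

Calling find_remote_calls() with the path to the WordPress install returns a list you can go through by hand. Many hits will be harmless, since file_get_contents and fopen are mostly used on local files.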


A good way to see how WordPress behaves on the network is to create your own LAMP/WAMP setup. That means installing Apache, MySQL and PHP on a single Linux or Windows computer, disconnecting it from the internet, and setting up Apache so that you can reach the site when you type in one of the following URLs:
http://127.0.0.1/ or http://localhost/


Once Apache is installed correctly, you should see something like "It works", or at least something other than a "could not connect to remote server" type of message, regardless of the status of your real internet connection. Next, install WordPress and see whether it complains about the internet connection, pinging, etc. I bet it won't.

Everything is best done by experimentation.
