
The right way of using index.html

@Holmes151

Posted in: #GoogleSearch #Html #Seo

I have quite a lot of issues I'd like to hear your opinion on, so I hope I'll manage to explain them well enough. I should also note that I'm a beginner equipped only with knowledge of HTML and CSS, so although I'm almost sure there is a simple solution using powerful PHP, it won't help me.

Let's say that I have my personal blog at example.com/blog.html, with links to several sub-blogs: example.com/blog/math.html, example.com/blog/coding.html, etc. So my root folder contains blog.html and a blog folder, and the blog folder itself contains the files math.html and coding.html.

First of all, I learned (from Google Webmaster Tools) that for SEO and aesthetic purposes it's good to unify example.com and example.com/index.html by adding the rel="canonical" attribute to the source of index.html. Using a couple of other tricks (like linking to ../ and ./) I got rid of the ugly index.html appearing in my web addresses.

And now I wonder whether this trick can be used not only for the root folder but for any folder. I mean, I would move my blog.html into the blog folder, rename it to index.html and add rel="canonical" to unify example.com/blog/index.html with example.com/blog/.
This trick would change the address of my blog from example.com/blog.html into example.com/blog/.

Not finished! I'm also having problems with Googlebot indexing my folders. When I type site:example.com/ into Google search, a link to my folder example.com/blog/, with its raw files, icons etc., appears among the other results. I guess there are other ways to fix this, but IMHO the change mentioned above would do the trick too: the index.html in the blog folder would prevent users from viewing the raw contents of that folder, only the right link example.com/blog/ would appear in Google search, and (I hope) rel="canonical" would keep the second, unwanted link example.com/blog/index.html out of the search results.

So my questions are:


Is it a good practice to have the index.html file in every subfolder or is it intended to be only in the root folder?
Are there any disadvantages or problems that may occur when using the second, "index in every folder" method?
Which one of the two ways of structuring the website described above would you prefer?


2 Comments


@Kevin317

The technical term for index.html is Directory Index for Apache and Default Document for IIS. The other Apache directive of interest is the Options directive. As indicated in the documentation, when Options Indexes is set:


If a URL which maps to a directory is requested, and there is no DirectoryIndex (e.g., index.html) in that directory, then mod_autoindex will return a formatted listing of the directory.


When I set up a website that is not using a content management system, my preferred setup is to have one content page per directory. That page is the directory index (default document) for the directory. All links on the site link only to the directory and end with a trailing slash (e.g., example.com/blog/ instead of example.com/blog/index.html or ./blog/ instead of ./blog/index.html). The trailing slash is important to avoid what is commonly referred to as a courtesy redirect. (If the trailing slash is omitted, the server first redirects the browser to the URL with the slash, so everything still resolves correctly, but each such link costs an extra HTTP request and thus bandwidth.)

My primary motivation for the above methodology is twofold. First, it facilitates switching the technology used on the website. For example, I can change a page from index.html to index.php without breaking any links or search engine listings. Second, the file extension of a content page is "noise"; removing the file extension from the URL results in shorter and hopefully more readable URLs.

As for other file types:


All CSS files reside in a css directory in the root of the website.
All image files reside in an image directory or subdirectory thereof in the root of the website.
All JavaScript files reside in a scripts directory in the root of the website.
All Flash and other movie files reside in a video directory or subdirectory thereof in the root of the website.


On an Apache server, I disable Options Indexes for the abovementioned directories. On both Apache and IIS servers, I do not specify a directory index (default document) for the abovementioned directories. Thus, a request for any of the directories results in an HTTP 403 error.
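
A minimal sketch of the Apache part of this setup, assuming a hypothetical document root of /var/www (the paths are illustrative, not taken from any real server) and access to the main server configuration:

```apache
# Assumed paths; adjust them to your own document root.
# Turn off automatic directory listings for the asset directories.
<Directory "/var/www/css">
    Options -Indexes
</Directory>

<Directory "/var/www/scripts">
    Options -Indexes
</Directory>
```

With listings off and no directory index file present, a request for /css/ or /scripts/ should return a 403 rather than a file listing. On shared hosting where the main configuration is off limits, the same Options -Indexes line can go in a .htaccess file, provided the host permits it through AllowOverride.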


@Rambettina238

The reason we use index.html or home.html, or derivatives thereof, is that the web server software itself looks for that file and serves it. For example:

This is INVALID: (www-directory)

/var/www/
|_blog.html
|_blog/
  |_math.html
  |_page2.html
  |_page3.html
  |_(...)


This will in fact be served as a page listing the folders and files (not what you want). You can try this structure and also put an index.html file next to blog.html: notice how the server will not serve blog.html unless you request www.site.com/blog.html explicitly. This is why www.google.com/ shows the page without you having to specify www.google.com/index.html.

This is VALID:

/var/www/
|_index.html (renamed blog.html to index.html)
|_blog/
  |_math.html
  |_page2.html
  |_page3.html
  |_(...)


This will serve your renamed blog.html file AS THE HOMEPAGE (rather than listing all the folders/files in that directory).

The web server software has (in its config) a list of file names that will be served as the homepage or main page of a folder. (In my experience, index.html takes precedence over index.php, so if you have both index.html and index.php in a folder, index.html is what the public will see; the actual order depends on that configured list.) Of course that can all be changed, and you can even set blog.html to be recognized as an "index".
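
To make the ordering concrete, here is a small sketch of the Apache directive that controls it (other servers have their own equivalent). The file names are only examples:

```apache
# Names are tried left to right: index.html wins if both files exist.
DirectoryIndex index.html index.php

# Alternative (hypothetical): treat blog.html as the index instead.
# DirectoryIndex blog.html
```

This can go in the server configuration or, if the host permits it, in a .htaccess file.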

Addressing your comment:


"This trick would change the address of my blog from xxx.com/blog.html into xxx.com/blog/."


This would be done by moving blog.html entirely into /blog/ and renaming it to index.html.

Your new structure would be:

/var/www/
|_blog/
  |_index.html (renamed from blog.html)
  |_math.html
  |_page2.html
  |_page3.html
  |_(...)


With this structure, www.site.com/blog/ should show the contents of your blog.html, which we renamed to index.html so the server software can use it as the index of your /blog/ directory.

You're also now free to put an index.html file into the root of your site (www.site.com/index.html) with links to /blog/ and whatever else you wish.

Specifically answering your questions in short statements:


Is it a good practice to have the index.html file in every subfolder or is it intended to be only in the root folder?

Yes, because it prevents people from seeing what files are in your directories. You can also prevent directory listings with a .htaccess file containing Options -Indexes.

Are there any disadvantages or problems that may occur when using the second, "index in every folder" method?

None that I can think of.

Which one of the two ways of structuring the website described above would you prefer?

I usually have an index.html or index.php file in the root, subfolders based on category (such as forum or news or login etc.) and then some sort of index inside each of those.
