Merenda212: How can I see what filename my browser is reading when it's on a site?

@Merenda212

Posted in: #Browsers #Filenames #Homepage

It would be valuable to know what file a browser has found to read to render the home page of a domain.

I know the browser goes to something like google.com and then starts to look for default file names/types such as index.html, then perhaps index.htm, and so on through a list of about a dozen other files.

I'm curious what file my browser has actually begun to render (right-clicking in the browser window and clicking "Save As" doesn't give the file name). Secondly, I wonder if a site would begin to render faster if the file name present on the domain (e.g. index.php) was one of the initial file names the browser looked for vs. something more atypical (e.g. placeholder.html).


4 Comments


@Hamaas447

I'm limited to 2 links in my response, which makes it a challenge to talk about how links actually work :-/
Perhaps I'll gain a few points and can come back and improve this explanation soon.

As mentioned already, a website's behaviour and responses are specific to the server used and its wide range of settings.
I can describe the 'typical' behaviour of a LAMP server (Linux, Apache, MySQL, PHP; probably the most common web server stack used nowadays).

In the Apache configuration of exemple.com you will have a DocumentRoot directive telling Apache which folder holds the files for that site, let's say /www/.

You can have a file called .htaccess that holds specific directives for that directory and all its subdirectories.

Placing .htaccess in /www/ applies it to the entire site (note that by Unix convention, a filename starting with a dot is a hidden file).

Placing that file in /www/test/ applies it to all requests (GET, POST, PUT, etc.) whose URL starts with exemple.com/test/. A GET of http://exemple.com/test/ is in reality:

GET /test/ HTTP/1.1
Host: exemple.com


But it's easier to write GET http://exemple.com/test/

Apache will receive that request because it listens on port 80, the default port for http.
When you see exemple.com:8080/ it means the port is explicitly set to 8080.

http://exemple.com/test/ = http://exemple.com:80/test/
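
As a rough illustration of the two points above (what the server actually receives, and the implied port 80), here is a minimal Python sketch. The helper name describe_request is made up for this example, and a real browser sends many more headers than the single Host line shown here.

```python
from urllib.parse import urlsplit

# Ports implied by the scheme when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_request(url, method="GET"):
    """Return (host, port, raw_request_text) for a URL.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    default_port = DEFAULT_PORTS[parts.scheme]
    port = parts.port or default_port
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # The Host header carries the port only when it is not the default one.
    host_header = parts.hostname if port == default_port else f"{parts.hostname}:{port}"
    raw = f"{method} {path} HTTP/1.1\r\nHost: {host_header}\r\n\r\n"
    return parts.hostname, port, raw


if __name__ == "__main__":
    for url in ("http://exemple.com/test/", "http://exemple.com:8080/test/"):
        host, port, raw = describe_request(url)
        print(host, port)
        print(raw)
```

Both http://exemple.com/test/ and http://exemple.com:80/test/ come out with port 80 and the same request text; only the :8080 variant differs.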

Apache will look for a .htaccess file in /www/ and in /www/test/.

It will interpret the one in /www/ first, then the one in /www/test/ (if the files exist, of course), so directives in the deeper directory can override the ones above it.

So the first thing to do is look for those files, as they are there to give specific directives that differ from the default behaviour.
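
The lookup order described above can be sketched in a few lines of Python. This is only an approximation under stated assumptions: htaccess_candidates is a hypothetical helper, it computes paths without reading any files, it starts at the document root, and it assumes the URL path maps directly onto directories (no aliases or rewrites).

```python
from pathlib import PurePosixPath


def htaccess_candidates(document_root, url_path):
    """List the .htaccess paths to check, outermost directory first."""
    root = PurePosixPath(document_root)
    segments = [s for s in url_path.split("/") if s]
    # Without a trailing slash, treat the last segment as a file name
    # (an assumption of this sketch) and keep only the directories.
    if segments and not url_path.endswith("/"):
        segments = segments[:-1]
    candidates = [root / ".htaccess"]
    current = root
    for segment in segments:
        current = current / segment
        candidates.append(current / ".htaccess")
    return [str(path) for path in candidates]


if __name__ == "__main__":
    for path in htaccess_candidates("/www/", "/test/"):
        print(path)
```

For the document root /www/ and the request path /test/, the list is /www/.htaccess followed by /www/test/.htaccess, matching the order given above.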



 

@Hamm4606531

Your browser doesn't load any file; it requests a resource, which the server then provides at its discretion (lengthy elaboration below).

If you type google.com into your browser's address bar, the browser will first prepend a scheme, either http or https.
Then the browser will look up an IP address belonging to google.com, for example 172.217.19.206. Your browser will then open a connection to that server on the proper port (80 for http, 443 for https).

Your browser will then send the following request to the server:

GET / HTTP/1.1
Host: google.com


The web server will then decide what to do. This can involve a lot of steps.

A web server usually has something called a document root for each domain it serves. Files that the web server is allowed to serve to the user usually reside inside this document root. For example, the document root for google.com might be /var/www/domains/google.com/htdocs/.

Now, when you request a resource, the web server first inspects the request and then takes the proper action. For example, if the resource path ends with .php, the web server might decide not to serve anything itself, but instead invoke the PHP interpreter, let it execute the PHP file for the requested resource, and then serve the output to the user.

Take for example this request:

GET /article.php?id=123456 HTTP/1.1
Host: news.example.org


In this case, the web server on news.example.org is tasked with serving the resource /article.php?id=123456. What likely happens is that the web server starts the PHP interpreter, fetches the article.php file from the document root, feeds it into the PHP interpreter and waits for the output. It then sends the output back to the browser that requested it. In this case, this would likely be a page from a blog with content loaded from a database (the contents of the article stored with the id 123456).
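
To make that decision step concrete, here is a small Python sketch of how a server might classify a request target. It is not how any particular server is implemented: plan_response and the handler labels "php" and "static" are invented for this example.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit


def plan_response(request_target):
    """Decide, from the request target alone, how a server might handle it.

    Returns a (handler, path, params) tuple.
    """
    parts = urlsplit(request_target)
    extension = posixpath.splitext(parts.path)[1].lower()
    # Keep the first value of each query parameter, e.g. {"id": "123456"}.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    handler = "php" if extension == ".php" else "static"
    return handler, parts.path, params


if __name__ == "__main__":
    print(plan_response("/article.php?id=123456"))
    print(plan_response("/logo.png"))
```

The query string never selects a different file here; it is just input handed to whatever runs article.php.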

But other things can happen, too.

Let's get back to the original example:

GET / HTTP/1.1
Host: google.com


What happens with any standard web server (Apache, Lighttpd, etc.) is more or less the following:


Look for a file named index.html (in the document root) and serve it
If that doesn't exist, look for a file index.htm and serve it
If that doesn't exist, look for a file index.php and start the PHP interpreter, serve the output
If that doesn't exist, serve an error (404 NOT FOUND, or on some configurations 403 FORBIDDEN or a directory listing)
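
The steps above can be sketched in Python as a plain loop over candidate names. The order in INDEX_NAMES simply mirrors the list above and is an assumption; resolve_index is a hypothetical helper, not part of any web server.

```python
import os
import tempfile

# Assumed lookup order, mirroring the list above; real servers make it configurable.
INDEX_NAMES = ["index.html", "index.htm", "index.php"]


def resolve_index(directory, index_names=INDEX_NAMES):
    """Return the first index file name that exists in `directory`, or None."""
    for name in index_names:
        if os.path.isfile(os.path.join(directory, name)):
            return name
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as docroot:
        # Create only index.php, so the first two candidates are misses.
        open(os.path.join(docroot, "index.php"), "w").close()
        print(resolve_index(docroot))
        print(resolve_index(docroot, ["placeholder.html"]))
```

Note that the whole search happens on the server. The browser sends a single request for / either way and never sees which name matched.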


The precedence of the file names is usually configurable on the web server. The server might not serve any index.xxx file at all. For example, if you have a node.js server running, then the web server would task the node.js server to provide the resource /, which might be anything the JS app running on node decides it to be.

tl;dr: The browser doesn't look up a file. The browser requests a resource; the web server then handles the request and serves the content appropriate for the requested resource, which might be a file, but might also be the output of a third-party program.

As far as speed is concerned, this depends on the web server. But if you want your web server to always serve the asjkdjhfz9874jykdfndsk.html file when / is requested, you would usually configure the web server to look for that file first, making it as fast as any other configuration.

Disclaimer: This is not a full description of how any web server works, nor is it tailored to a specific one. Most web servers work similarly, but sites like google.com in particular will likely run custom software tailored specifically to their needs.



Your browser will usually offer tools to inspect network activity. Using Chrome, you can open the "Dev tools" and inspect the headers. This is what my browser sent to SE to let me edit this answer:

GET /posts/93567/edit HTTP/1.1
Host: webmasters.stackexchange.com


There are a few more headers that tell the server about caching, what language I am expecting, what browser I am using and where I'm coming from, but those aren't interesting here. The point is, my browser requests the resource /posts/93567/edit. My browser will never have any idea about what file the web server serves. SE runs on ASP.NET MVC 5, which means that the web server (in SE's case, IIS) will likely route the request to some server-side code (that can be located anywhere) and let the runtime evaluate it with the parameter postId=93567. The actual file or inner workings are never exposed to the browser, because the browser doesn't need to know (and because it is safer for whoever runs the server to hide that information).

The view will also show you any other resources (CSS files, JS files, images, etc.) that your browser requests to render the site correctly. But for them too, you will only learn about the resource, not whether or not those are actual files in the file system.
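
If you want to see this from both sides at once, the following Python sketch runs a throwaway server on a local port and then plays the browser. Everything in it is an assumption for demonstration purposes: STORAGE stands in for whatever the server reads internally, and HomeHandler and fetch are hypothetical names. It uses only the standard library.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical server-side storage. The key plays the role of a file name
# that only the server knows about.
INTERNAL_NAME = "asjkdjhfz9874jykdfndsk.html"
STORAGE = {INTERNAL_NAME: "<h1>Home page</h1>"}


class HomeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The server alone decides what "/" maps to.
        if self.path == "/":
            status, body = 200, STORAGE[INTERNAL_NAME].encode()
        else:
            status, body = 404, b"not found"
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default request logging


def fetch(path):
    """Serve one request on a free local port and return what the client sees."""
    server = HTTPServer(("127.0.0.1", 0), HomeHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", path)
        response = conn.getresponse()
        seen = (response.status, dict(response.getheaders()), response.read().decode())
        conn.close()
        return seen
    finally:
        server.shutdown()
        server.server_close()
        thread.join()


if __name__ == "__main__":
    status, headers, body = fetch("/")
    print(status, headers, body)
```

The client gets a status, some headers and the HTML, and nothing in them names the internal source. In this sketch, guessing the internal name as a URL does not help either: the handler answers anything other than / with 404.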



 

@LarsenBagley505

I'm wondering how I know what file is being rendered. I can eventually guess it accurately on a long enough timeline by simply explicitly calling on that filename in the URL. Does xyz.com/index.html fail to load anything? Then try xyz.com/index.htm, and so on until I get the site to render. I'm just looking for a shortcut to know what file my browser has loaded.


I agree with John here that what you are requesting by specifying a URL is a resource (or, for want of a better word, an object) from the server.

You'll never know 100% for sure what actual disk file is being read when a URL is requested. This is especially true if the server relies on a third-party program to produce the output.

A typical third-party program is the PHP interpreter, which WordPress uses to deliver content. The interpreter can run code that loads any number of files from the server's disk in order to construct the HTML, which is then delivered to the user's browser.

On top of that, special configuration can be applied to the server to map special URLs to resources. This (in an Apache environment) is known as URL rewriting, and it is very useful since it's the starting point for creating friendly URLs.
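
As a rough sketch of what URL rewriting does, the Python below maps friendly public paths to internal targets with regular expressions. The rules, paths and script names are all made up for illustration; they do not come from a real site or from Apache's own rule syntax.

```python
import re

# Hypothetical rewrite rules: (pattern for the public URL, internal target).
REWRITE_RULES = [
    (re.compile(r"^/articles/(\d+)/?$"), r"/article.php?id=\1"),
    (re.compile(r"^/about/?$"), "/pages.php?name=about"),
]


def rewrite(public_path):
    """Return the internal target for a public URL path, or the path unchanged."""
    for pattern, target in REWRITE_RULES:
        match = pattern.match(public_path)
        if match:
            # Fill captured groups (such as the article id) into the target.
            return match.expand(target)
    return public_path


if __name__ == "__main__":
    for path in ("/articles/123456", "/about/", "/contact"):
        print(path, "->", rewrite(path))
```

The visitor only ever sees the left-hand side, for example /articles/123456. The right-hand side stays inside the server.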

The users won't know the exact filenames of the files loaded nor will they care (unless they are hackers) because all they care about is the actual content on the page.

It's also possible that some server admins decide not to use actual filenames in URLs for security reasons.



 

@Pope3001725

The browser isn't looking for a file. It's just asking for a resource. The server then decides what to return for that resource.

At its most basic level that "file" is literally just a file. In the case of the default index page of a directory, how the server is set up determines which file is returned. Some servers are configured by default to return index.html if the file exists, then fall back to index.htm, etc. Others default to default.html, etc. They will keep trying until the list of default files is exhausted and will then return an error (such as 404, or 403 on some configurations).

When URL rewriting is turned on or pages are generated dynamically, the content being returned typically isn't a file at all. It resembles a file because the output is (typically) HTML, just like a .html file would contain. But behind the scenes tens or hundreds of files may be involved in creating that content.


