
How to copy a web page as static content (no JavaScript)

@Connie744

Posted in: #Copy #Download

Part of my job is to create mockup designs for clients, where our plugin is inside their existing website. This involves finding a page, downloading all the content, adapting it to run offline (use no online content), and adding the plugin.

By far the hardest (most time-consuming) stage is adapting the site to use local content. I'm familiar with Chrome's "save page as" option, and always use it as the first step, but I'm left with a mountain of JavaScript and CSS that still references online content. These aren't the best-made websites, they're quite large, and most of the referenced content comes from CDNs.

Since this is only a mockup, I don't care about the interactive JavaScript, so I use Chrome's "copy as HTML" option to get a snapshot of the HTML and delete any JavaScript it contains. I also use Chrome to delete any complicated bits that aren't needed, like Facebook/Twitter plugins. But I'm still left with the task of searching through the HTML and CSS for URLs, checking that the resource each one points to was downloaded properly (quite often, resources linked from CSS aren't), and updating each URL to use a relative path.
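The manual steps described above (dropping the scripts, then finding remote URLs in the CSS and pointing them at local files) could be roughly automated. The following is only a minimal sketch using Python's standard library, not a tool anyone in this thread used: the function names `strip_scripts` and `rewrite_css_urls` and the `assets` folder are hypothetical, the regular expressions are a crude stand-in for a real HTML/CSS parser, and downloading the files is left out.

```python
import posixpath
import re
from urllib.parse import urlsplit

# Crude match for <script ...>...</script> blocks, including multi-line ones.
SCRIPT_RE = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)

# Matches url(...) in CSS, with optional quotes, for http(s) and
# protocol-relative (//host/path) URLs only. Relative URLs are left alone.
CSS_URL_RE = re.compile(
    r"""url\(\s*(['"]?)((?:https?:)?//[^'")\s]+)\1\s*\)""", re.IGNORECASE
)


def strip_scripts(html):
    """Remove script elements from an HTML string."""
    return SCRIPT_RE.sub("", html)


def rewrite_css_urls(css, asset_dir="assets"):
    """Point remote url(...) references at local files.

    Returns the rewritten CSS and a dict mapping each original URL to the
    relative path it was rewritten to, so the files can be fetched separately.
    Assumption: file names are unique; two URLs with the same base name
    would collide in this sketch.
    """
    downloads = {}

    def replace(match):
        url = match.group(2)
        name = posixpath.basename(urlsplit(url).path) or "unnamed"
        local = f"{asset_dir}/{name}"
        downloads[url] = local
        return f'url("{local}")'

    return CSS_URL_RE.sub(replace, css), downloads
```

The returned mapping doubles as a checklist: every key is a remote resource that still has to be saved under the path given as its value.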

My background doesn't go far enough into web design, so I'm hoping there's a tool or workflow I don't know about which does this sort of thing easily.

Can anybody suggest an easier/faster method for getting a 100%-local, static copy of a page?


2 Comments


@Lengel546

To view an entire web page offline without having to save its dependent files separately, consider saving the page as a .mht file. IE supports the format, but not all browsers do.

More on MHT from Wikipedia:


MHTML, short for MIME HTML, is a web page archive format used to
combine resources that are typically represented by external links
(such as images, Flash animations, Java applets, audio files) with
HTML code into a single file. The content of an MHTML file is encoded
as if it were an HTML e-mail message, using the MIME type
multipart/related. The first part of the file is normally encoded
HTML; subsequent parts are additional resources identified by their
original URLs and encoded in base64. This format is sometimes referred
to as MHT, after the suffix .mht given to such files by default when
created by Microsoft Word, Internet Explorer, or Opera.

Beginning with Opera 9.50, the default format for saving pages is
MHTML

Creating MHTML files in current versions of Google Chrome (25.0) is
supported by toggling the "Save Page as MHTML" option on the
"chrome://flags" page. However, enabling this experimental option
disables the options to save pages as HTML-only or HTML Complete files
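Because an MHTML file is structured like a MIME multipart/related e-mail message, as the quote above describes, its parts can in principle be pulled back out with an ordinary e-mail parser. The sketch below uses Python's standard `email` module; the function name `unpack_mhtml` is hypothetical, and it assumes each part carries a `Content-Location` header with its original URL, which a given browser's output may not guarantee.

```python
import email
from email import policy


def unpack_mhtml(data):
    """Return {original URL: decoded bytes} for each part of an MHTML archive.

    `data` is the raw bytes of the .mht/.mhtml file. Parts without a
    Content-Location header get a placeholder key (an assumption made
    for this sketch).
    """
    message = email.message_from_bytes(data, policy=policy.default)
    parts = {}
    for index, part in enumerate(message.walk()):
        if part.is_multipart():
            continue  # container parts hold no content of their own
        location = part.get("Content-Location", f"part-{index}")
        # decode=True undoes base64 / quoted-printable transfer encoding
        parts[str(location)] = part.get_payload(decode=True)
    return parts
```

Each entry could then be written to disk under a local file name, which gives the individual resources without fetching anything again.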


@Radia820

The catch is that outgoing links are treated as external sites, so you may need to edit these out, but generally the best application for grabbing a website to a local copy is HTTrack. Give it a try; it has many options.

A quote from their manual page regarding external pages:


No external pages

Rewrite all external links (links that needs an Internet connection)
so that there can be a warning page before ("Warning, you need to be
online to go to this link..") Useful if you want to separate the local
and online realm
