How to Save Web Pages and Blogs for Offline Reading

Store Web Pages for Offline Viewing

If you have Google Desktop running in the background, you already have a local copy of all web pages that you have recently opened in any browser on your computer. You can click "Browse Timeline" inside Google Desktop and your web history will be listed in reverse chronological order - the most recently visited websites will be listed at the top.

The problem with web history in Google Desktop is that it gets cluttered easily, and finding relevant pages in the history may require some effort. In that case, you may install Scrapbook for Firefox and save only the web pages that you intend to read offline.

Scrapbook, like Google Notebook, is primarily for organizing web research but it’s an excellent offline browser as well. You can specify a depth level, and all pages linked from the current web page (up to that level) will be saved offline automatically. For instance, suppose you want to read all stories on the CNN and BBC websites offline. Capture each home page with Scrapbook and set the depth to 1 - it will then save the full text of all the front page stories as well.

Scrapbook can export all the web captures as an HTML web page so you can easily read the saved content on a mobile phone or your PDA. Another popular tool for downloading web pages in Firefox is DownloadThemAll.

The limitation of both of the above tools is that they work only in Firefox and also require some manual work. What if you want to read all front page stories from all major news websites while offline? Most news sites provide RSS feeds, but these are often not full text, so you have no option but to scrape content from the main website in order to read it offline.

HTTrack is a free website copier in which you can create download jobs and execute them whenever you go online. For example, you can create a single download job for all news websites (like BBC, NYT, etc.), set the depth limit to 1 and get an offline version of all the front page news stories in one go. You can also save this job and re-execute it anytime later, either manually or as a scheduled task.
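To make the idea of a depth limit concrete, here is a minimal Python sketch of a depth-limited capture. It is not how HTTrack or Scrapbook work internally; it only illustrates the concept. The names `crawl` and `fetch` are hypothetical, and `fetch` stands in for whatever function actually downloads a page.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_urls, fetch, max_depth=1):
    """Breadth-first capture of pages up to max_depth links away.

    fetch is a hypothetical callable: it takes a URL and returns the page
    HTML as a string, or None if the page could not be downloaded.
    Returns a dict mapping each captured URL to its HTML.
    """
    saved = {}
    seen = set(start_urls)
    queue = deque((url, 0) for url in start_urls)
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        saved[url] = html
        if depth >= max_depth:
            continue  # at the depth limit: keep the page, ignore its links
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            target, _fragment = urldefrag(urljoin(url, href))
            if target.startswith(("http://", "https://")) and target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return saved
```

With `max_depth=1`, the home pages plus every story they link to are captured, but links inside those stories are not followed. That is why a depth of 1 is usually enough for front page news.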

Another good alternative to HTTrack is wget, which is available for Mac, Windows and Linux. You don’t have to spend time learning the complicated command line switches of wget as there are nice GUI apps available for both Mac (CocoaWget) and Windows (WinWget).
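If you would rather script wget than use a GUI, a small Python wrapper can assemble the switches for you. This is only a sketch: the function names and the `offline` output folder are my own assumptions, while the switches themselves (`--recursive`, `--level`, `--page-requisites`, `--convert-links`, `--adjust-extension`, `--directory-prefix`) are standard wget options.

```python
import shutil
import subprocess


def build_wget_command(urls, depth=1, output_dir="offline"):
    """Return the wget argument list for a depth-limited offline copy."""
    return [
        "wget",
        "--recursive",                       # follow links from each start page
        f"--level={depth}",                  # ...but only this many levels deep
        "--page-requisites",                 # also fetch images and stylesheets
        "--convert-links",                   # rewrite links to point at local copies
        "--adjust-extension",                # save HTML pages with an .html suffix
        f"--directory-prefix={output_dir}",  # hypothetical output folder
        *urls,
    ]


def run_job(urls, depth=1, output_dir="offline"):
    """Run the download job and return wget's exit code."""
    if shutil.which("wget") is None:
        raise RuntimeError("wget is not installed or not on PATH")
    command = build_wget_command(urls, depth, output_dir)
    return subprocess.run(command, check=False).returncode
```

Calling `run_job` with a list of news home pages would approximate the single download job described above. By default, wget stays on the same host when recursing, so stories hosted on other domains are skipped.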

Download Blogs for Offline Reading

Blogs, or websites that offer RSS feeds, are much easier to handle and save because we know exactly what has changed since we last visited the site.
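To show why feeds make this easy, here is a minimal Python sketch that picks out the items published after your last visit. It assumes a plain RSS 2.0 feed whose items carry a `pubDate` element; the function name `new_items` is hypothetical, and real feed readers handle many more cases (Atom feeds, missing dates, GUIDs).

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def new_items(rss_xml, last_visit):
    """Return (title, link) pairs for items published after last_visit.

    rss_xml is the feed as a string; last_visit is a timezone-aware datetime.
    Items without a usable pubDate are skipped.
    """
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        raw_date = item.findtext("pubDate")
        if not raw_date:
            continue
        try:
            published = parsedate_to_datetime(raw_date)
        except (TypeError, ValueError):
            continue  # unparseable date: skip the item
        if published.tzinfo is None:
            # Assumption: treat dates without a time zone as UTC.
            published = published.replace(tzinfo=timezone.utc)
        if published > last_visit:
            title = item.findtext("title", default="")
            link = item.findtext("link", default="")
            fresh.append((title, link))
    return fresh
```

An offline reader only needs to remember the time of the last sync and download the pages behind the links this returns.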

There are two categories of blog readers - (a) Addicts or people who are subscribed to several hundred feeds and want to read them all while offline and (b) Casual Readers or people who follow only a dozen or so feeds.

Casual readers can simply add their favorite feeds to Tabbloid and download them all as a PDF newsletter (example).

For addicts, the solution that will work best is a dedicated offline reader that can pre-fetch all the new articles. Here are some good choices:

My first recommendation has always been FeedDemon - it’s fast, rich in features and the upcoming v2.8 is even better since it lets you export unread items as an HTML web page that can be read on any device.

If you are subscribed to feeds in Google Reader, you can try either RSS Bandit or Scoop - these are desktop based readers that work in offline mode and can synchronize with your Google Reader subscriptions. If you are on Bloglines, a similar solution exists in the form of GreatNews - a desktop RSS reader that is also portable. Google Gears is another solution for Google Reader users but it has limitations.

The advantage of the above solutions is that they all support synchronization - so if you mark an item as read while offline, the change will get propagated when you next go online and there’s no double work.

Saving Blogs & Web Pages for Mobile Phones

If you plan to save web pages for offline viewing on a mobile device (with a small screen), I would recommend Web2Book - it not only downloads multiple web pages and blogs in one go but also converts them into formats like HTML or PDF that are supported on almost every mobile device.

Web pages saved with Web2Book can be easily read on ebook devices like the Microsoft Reader or the new Sony Reader. Another option for mobile devices is Plucker - it’s an offline browser available both for Windows Mobile and Palm based PDAs.

If you are an iPod owner (the old models, not the latest iPod touch), you can even turn your MP3 player into a notes reader and read web pages as plain text.
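As a rough illustration of the plain text route, the Python sketch below strips the HTML from a saved page and splits the text into note-sized pieces. The 4000 byte default is an assumption on my part (older iPods are commonly reported to cut off notes at around 4 KB), and the function names are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, ignoring <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(html):
    """Return the page text with tags removed and whitespace collapsed.

    Simplification: text on either side of any tag is joined with a space,
    so a tag in the middle of a word will split that word.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def split_notes(text, max_bytes=4000):
    """Split text on word boundaries into chunks of at most max_bytes.

    max_bytes=4000 is an assumed limit, not a documented one. A single word
    longer than max_bytes is kept whole and will exceed the limit.
    """
    notes, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if current and len(candidate.encode("utf-8")) > max_bytes:
            notes.append(current)
            current = word
        else:
            current = candidate
    if current:
        notes.append(current)
    return notes
```

Each chunk returned by `split_notes` could then be saved as a separate .txt file in the iPod's Notes folder.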

Drawloop, an online service that I mentioned in the previous Adobe PDF guide, can also join multiple web pages and save them as a single PDF file, as in this example where the home pages of three news websites are saved in a single file.

source:labnol
