BBC News Online recently celebrated its 10th birthday. Although 1997 may not seem so long ago, it has been long enough for many aspects of the original design and content to have been lost forever.

As the organization itself admits, little thought was given back in the 1990s to preserving its web activity.

Yet the idea of creating a permanent website archive predates even the establishment of BBC Online.

In 1996, a group of enthusiasts set up the Internet Archive. The aim was to record a 'snapshot' of the design and content of as many sites as possible, at regular intervals. These snapshots could then be preserved as fully navigable and working models for the rest of time.

To its credit, the Wayback Machine (as the database is called) now holds over 85 billion web pages. This includes a copy of the BBC News site from December 1998, the iQ Content website from 2003 and probably yours too!
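
If you want to check whether your own site has been captured, the Internet Archive offers a public availability lookup. The sketch below is not part of the original advice and should be treated with some caution: the endpoint address and the JSON field names (`archived_snapshots`, `closest`, `available`, `url`) are my understanding of that service and should be verified against its current documentation, and the function names are invented for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of the Internet Archive availability lookup.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(site: str, timestamp: str = "") -> str:
    """Build the lookup URL; timestamp is an optional YYYYMMDD string."""
    params = {"url": site}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text: str):
    """Return the URL of the closest archived copy, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(site: str, timestamp: str = ""):
    """Ask the service about one site (needs a network connection)."""
    with urlopen(build_query(site, timestamp), timeout=30) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

Calling `lookup("news.bbc.co.uk", "19981201")` would ask for the copy nearest to 1 December 1998. A result of `None` simply means no snapshot was reported, which is itself a good reason to start your own archive.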

Other website archiving services

A variety of commercial solutions are now available if you wish to create a custom archive of your site.

The advantage of these tools over the Internet Archive is that they can capture all the content on a website, including any dynamic elements.

Unfortunately, because of the sheer volume of data it collects, many images and other formats (e.g. video, Flash) are missing from the Wayback Machine. Indeed, the non-subscription part of the Internet Archive should not be counted on if you are serious about keeping a permanent record of your website.

For the price of a couple of hundred Euro per year, the other vendors will record a complete inventory of the design and content of your website at agreed intervals. These will then be made available to you - possibly on DVD - where they can be browsed as if live.

Why start a website archive?

Well, the first reason is history.

Believe it or not, your website is an important document. It is a record of the circumstances of your organization at a particular moment in time.

More generally, it offers revealing insights into the economic, social, technological (and design!) development of your country at a point in history.

Understandably, this line of reasoning may not be enough to persuade the owner of a small brochureware site to inaugurate an archive. Nor should it. For a site of that kind, the Internet Archive is more than adequate (and free)!

Nevertheless, there are some organizations that should seriously consider establishing their own collection. Government, in particular, stands out.

Government, official records & the web

Government has an almost insatiable appetite for recording and preserving its deeds and actions. Just think of the gigantic size of the Library of Congress.

This penchant for record keeping should now be extended to the web. Indeed, this is already a statutory obligation in some jurisdictions, e.g. USA, Sweden, and is likely to be introduced elsewhere.

If you are a webmaster for a government body, I strongly recommend that you begin an archive of all your websites. This is particularly important for once-off initiatives, microsites or projects of a limited timescale.

For example, is a record being kept of short-term Irish government websites?

What about sites that are already past their 'use by' date, such as www.eu2004.ie? Will it be kept online forever?

The danger is that, in the absence of an official record, the information, design and knowledge within these sites could be lost for good. Reformatting a disk is just too easy!

Business website = intellectual capital

For business, the driver for starting an archive is to preserve the raw intellectual capital that the design & content of your website represents.

Think of your site in terms of the investment it has absorbed. No matter what its stage of development, it represents the output of some bloody hard work. Do you really want this to be deleted?

Intel had to shell out $10,000 for an original copy of the magazine that described Moore's Law, simply because the firm forgot to preserve a copy itself!

How often to update your archive

There is no hard and fast rule for how often you should update your archive.

My advice is to consider the 'scale' of your site. In particular, think about its size and the frequency of changes made to it. You will then be able to make an informed decision.

As a rule of thumb, the following may be appropriate:

  • Once a month for a busy site.
  • Once every 8-12 weeks for an average site.
  • Once every 6-9 months for a quiet site.
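
The rule of thumb above can be turned into a simple reminder. This is only a sketch under stated assumptions: the labels `busy`, `average` and `quiet` and the function `next_archive_date` are invented for illustration, a month is taken as 30 days, and each range is reduced to its longest interval.

```python
from datetime import date, timedelta

# Longest gap between snapshots for each kind of site, in days.
# Labels and day counts are assumptions for illustration only:
# a month is taken as 30 days and the upper end of each range is used.
MAX_GAP_DAYS = {
    "busy": 30,      # once a month
    "average": 84,   # once every 8-12 weeks (12 weeks used)
    "quiet": 270,    # once every 6-9 months (9 months used)
}


def next_archive_date(last_archived: date, activity: str) -> date:
    """Return the latest date by which the next snapshot should be taken."""
    if activity not in MAX_GAP_DAYS:
        raise ValueError(f"unknown activity level: {activity!r}")
    return last_archived + timedelta(days=MAX_GAP_DAYS[activity])


if __name__ == "__main__":
    # An average site last archived on 1 November 2007.
    print(next_archive_date(date(2007, 11, 1), "average"))
```

If your site sits between two categories, pick the shorter interval. An extra snapshot costs little, while a missing one cannot be recovered.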

'Store in a dry location (away from direct sunlight)'

Happily for you, government webmasters have the ultimate in long-term storage at their disposal. Just toss your DVDs into the National Archive and be assured that they will be professionally catalogued and stored for future researchers.

(Worryingly, the National Archive of Ireland makes no mention on its own site of the means by which it intends to record web content generated by the state!)

For business, the best approach is to treat your archive DVDs in the same way as other important media, e.g. customer data, financial books. That is, store them offsite in a secure (and insured) location where they can be protected against fire, theft, flooding, etc. Many such services are available.
