The term 'Disaster Recovery' describes the contingency measures that organizations adopt at key computing sites to prevent, or to recover from, a catastrophic event. A disaster may result from natural causes such as fire, flood or earthquake, or from other sources such as a violent takeover, wilful or accidental destruction of equipment, or any other act of such catastrophic proportions that the organization could be ruined. The primary objective of a disaster recovery plan is to assure management that normal operations will be restored within a set time after a disaster occurs, thereby minimizing losses to the organization. The plan must take into account the physical location of the computer centre, since the location can increase or decrease the chance of a disaster. Protection against fire, flood, earthquake, waterlogging and similar hazards must be considered.
Although each organization will want a disaster recovery plan tailored to its own circumstances, the general components of such a plan are as follows:
1. Emergency Plan: This part of the Disaster Recovery Plan (DRP) outlines the actions to be taken immediately after a disaster occurs. It identifies the parties to be notified at once, for example the fire service, police, management and the insurance company. It provides guidelines on shutting down equipment, cutting off the power supply, and removing storage files and removable disks, if any. It sets out evacuation procedures such as sounding the alarm bell, activating fire extinguishers and evacuating personnel. It also provides return procedures for when the primary facility is ready for operation again, such as backing up data files off-site, deleting data from disk drives at the third party's site and relocating the proper versions of backup files.
2. Recovery Plan: This part of the DRP sets out how full capabilities will be restored. A recovery committee is constituted, and it is responsible for preparing the recovery specifications, such as setting priorities for the recovery of application systems and for hardware replacement. The following steps may be carried out under this plan:
i. An inventory of the hardware, application systems, system software, documentation etc. must be taken.
ii. The criticality of each application system to the organization, and the impact of its loss, must be evaluated. An indication must be given of the effort and cost involved in restoring the various application systems.
iii. An application systems hierarchy must be spelt out. This would be used when management decides to accept a degraded mode of operation.
iv. Selection of a disaster recovery site must be made. A reciprocal agreement with another organization having compatible hardware and software could be made. However, systems availability and data security problems must be considered at this point. Hiring a service bureau is another option. If the situation warrants, a fully operational backup site could also be considered.
v. A formal backup agreement with another company must be made. This should cover the periodic exchange of information between the two sites regarding changes to hardware and software, the time and duration of systems availability, the modalities of testing the plan etc.
3. Backup Plan: No matter how physically secure an organization is, its systems are always vulnerable to disaster. An effective safeguard is therefore to keep a backup of anything that could be destroyed, whether hardware or software. As regards hardware, standby equipment, as discussed above, must be arranged according to the needs of the particular computer environment. As regards software, it is necessary to make copies of important programs, data files, operating systems, test programs etc. so that operations can resume before the company suffers an intolerable loss. Often the originals are stored at a site physically distant from the actual site, and duplicate copies are used for processing. The backup copies must be kept in a place that is not susceptible to the same hazards as the originals.
4. Test Plan: This plan covers testing of the DRP and analysis of the results. It identifies deficiencies in the emergency, backup or recovery plans. It contains procedures for conducting DRP testing, such as:
i. Paper walk-throughs: The personnel critical to the plan's execution reason out what might happen in the event of different disasters.
ii. Localised tests: These simulate a system crash. They are performed on different aspects of the DRP.
iii. Full operational test: This comes closest to actual disaster conditions. Paper walk-throughs and localised tests should have been conducted before operations are completely shut down to simulate a disaster.
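The criticality evaluation and application systems hierarchy described under the Recovery Plan (steps ii and iii) can be illustrated with a small sketch. Everything in it is an assumption made for illustration only: the 1 to 5 criticality scale, the cost and effort figures, the system names, and the rule of restoring the most critical systems first within a fixed effort budget. A real recovery committee would set its own criteria.

```python
from dataclasses import dataclass


@dataclass
class ApplicationSystem:
    name: str
    criticality: int      # hypothetical scale: 1 (low) to 5 (essential)
    restore_cost: float   # hypothetical estimated cost of restoration
    restore_hours: float  # hypothetical estimated effort in hours


def recovery_order(systems):
    """Hierarchy: most critical first; among equals, cheapest to restore first."""
    return sorted(systems, key=lambda s: (-s.criticality, s.restore_cost))


def degraded_mode(systems, budget_hours):
    """Walk the hierarchy and keep each system that still fits the effort budget."""
    chosen, used = [], 0.0
    for s in recovery_order(systems):
        if used + s.restore_hours <= budget_hours:
            chosen.append(s.name)
            used += s.restore_hours
    return chosen


# Illustrative inventory; all names and figures are made up.
inventory = [
    ApplicationSystem("payroll", 4, 20000.0, 12.0),
    ApplicationSystem("order_processing", 5, 50000.0, 24.0),
    ApplicationSystem("internal_newsletter", 1, 2000.0, 4.0),
    ApplicationSystem("general_ledger", 4, 15000.0, 10.0),
]

order = [s.name for s in recovery_order(inventory)]
print(order)
print(degraded_mode(inventory, budget_hours=36.0))
```

With these made-up figures, order processing heads the hierarchy, and a 36-hour budget covers only order processing and the general ledger; the remaining systems would wait until full capability is restored.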