Disaster Recovery Plan

Information Technology Disaster Recovery Plan

Issued by: John Murray
Version: 1.0
Date: 7/20/18

1.0 Purpose and Scope

This plan is designed to be used as a guide when a critical event affects any part of the operation of the Information Technology department. It is structured around plan management, plan objectives, the responsible personnel, and a plan of action. The decision to invoke this plan is the responsibility of the Disaster Management Team Leader, a role assigned to the Chief Information Officer. This plan contains all the information necessary to recover and restore operational service in the event of a serious disruption in IT services.

1.1 Updating this plan

This plan must be kept up to date.  It is the responsibility of the Plan Owner to ensure that procedures are in place to implement the plan.  The Plan Owner will review the document quarterly for any revisions. The Plan Owner will ensure that all members of the recovery team are kept informed of changes to the plan.
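
The quarterly review cycle can be illustrated with a short sketch. It assumes that "quarterly" means every three calendar months counted from the issue date (7/20/18, read as month/day/year); the plan itself does not fix the exact dates, and the function names here are hypothetical.

```python
from datetime import date
import calendar


def add_months(d, months):
    """Return the date `months` calendar months after `d`.

    The day is clamped to the last day of the target month,
    so that e.g. 31 January plus one month gives the end of February.
    """
    idx = d.month - 1 + months
    year = d.year + idx // 12
    month = idx % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def review_dates(issued, count):
    """List the next `count` review dates, one every three months (assumed)."""
    return [add_months(issued, 3 * i) for i in range(1, count + 1)]


if __name__ == "__main__":
    # Issue date taken from the plan header; the interval is an assumption.
    for due in review_dates(date(2018, 7, 20), 4):
        print(due.isoformat())
```

Under these assumptions, the first four reviews after issue would fall on 2018-10-20, 2019-01-20, 2019-04-20, and 2019-07-20.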

1.2 Distribution List

The Plan Owner will ensure the distribution of the plan to team members as needed. The members of the recovery team are listed below. Each team member should receive two copies of the plan: one to be kept at the place of work, and the other at home or at some other off-site location.

Disaster Management Team Leader
Network and Operations Team Leader
Facilities Team Leader
Other (IT application and support staff)

2.0 Plan Objectives

A disaster is defined as an incident that results in the loss of computer processing or network access at Saint Joseph’s College, where assets are lost or damaged to the extent that computer processing capability is lost over an extended period of time or assets must be relocated to a standby facility. A disaster can result from a number of accidental, malicious, or environmental events, such as fire, storm damage, terrorist attack, human error, or software or hardware errors.

2.1 Specific goals of the plan are:

  • To minimize the total amount of downtime and disruption to business.
  • To be operational using alternate facilities (if needed) within a stated period of time.
  • To reinstate operations at the original location within a stated period of time.
  • To bring a standby facility online, if required, with the necessary environmental and power provisions. This facility would likely be located in another on-campus building that was not directly affected by the disaster.

3.0 Recovery Team and responsibilities

The Recovery Team will be made up of various Saint Joseph’s College personnel. The Team is responsible for providing overall direction of the data center recovery operations. It ascertains the extent of the damage, activates the recovery organization, and notifies Team Leaders. Its prime role is to monitor and direct the recovery effort.

3.1 The Disaster Management Team Leader

Responsible for deciding whether the situation warrants invoking disaster recovery procedures. If so, the organization defined in this section comes into force and, for the duration of the disaster, supersedes any current management structures within IT. The Team Leader is also responsible for:

  • Notifying senior management of the disaster, recovery progress and problems.
  • Initiating disaster recovery procedures.
  • Coordinating recovery operations.
  • Monitoring recovery operations and ensuring that the schedule is met.
  • Documenting recovery operations.
  • Liaising with user management.
  • Expediting authorization of expenditures by other teams.
  • Recording extraordinary emergency costs and expenditures.
  • Making a detailed accounting of the damage to aid in insurance claims.
  • Ensuring that the conversion to the standby facilities and the final resumption of operations at the data center are under sufficient audit control to provide reliability and consistency to the accounting records.
  • Monitoring computer security standards.
  • Ensuring that appropriate arrangements are made to restore the site and return to the status quo within the time limits allowed for emergency mode processing.
  • Approving the results of audit tests on the applications which are processed at the standby facility shortly after they have been produced.
  • Performing a detailed audit review of the critical accounting files after the first backup cycle has been completed.
  • Declaring that the Disaster Recovery Plan is no longer in effect when computer processing is restored at the primary site.

3.2 Network and Operations Team

The Network and Operations Team is responsible for the computer environment (computer room and other vital computer locations) and for performing tasks within those environments.  This team is responsible for restoring computer processing and for performing computer room activities. The team is responsible for all computer networking and communications.

  • Providing ongoing technical support during the recovery stage and at the standby facility (if required).
  • Obtaining all necessary backups from off-site storage.
  • Initiating operations at the standby facility (if required).
  • Re-establishing software libraries and databases to the last backup.
  • Coordinating the user groups to aid the recovery of any non-recoverable data.
  • Providing sufficient personnel to support operations at the standby facility.
  • Managing the standby facilities to meet users’ requirements.
  • Establishing the processing schedule and informing team leaders.
  • Arranging for acquisition and/or availability of necessary computer supplies.
  • Evaluating the extent of damage to the voice and data network and discussing alternate communications arrangements with telecom service providers.
  • Establishing the network at the standby facilities (if required) in order to bring up the required operations.
  • Defining the priorities for restoring the network in the user areas.
  • Ordering the voice/data communications and equipment as required.
  • Supervising the line and equipment installation for the new network.
  • Providing necessary network documentation.
  • Providing ongoing support of the networks at the standby facility.
  • Re-establishing the networks at the primary site when the post-disaster restoration is complete.

3.3 Facilities Team

The Facilities Team is responsible for the general environment including buildings, services and all environmental issues outside of the computer rooms. This team has responsibility for security, health and safety and for replacement building facilities.

  • In conjunction with the Disaster Management Team, evaluating the damage and identifying equipment which can be salvaged.
  • As soon as the standby site (if required) is occupied, cleaning up the disaster site and securing it to prevent further damage.
  • Supplying information for initiating insurance claims.
  • Ensuring that insurance arrangements are appropriate for the prevailing circumstances (e.g., that any replacement equipment is immediately covered).
  • Preparing the original data center for reoccupation.
  • Maintaining current configuration schematics of the data center (stored off-site). These should include:
    • air conditioning
    • power distribution
    • electrical supplies and connections
    • specifications and floor layouts
  • Arranging for all necessary office support services.

4.0 Disaster determination and action

The Disaster Management Team Leader decides whether to activate the Disaster Recovery Plan and which recovery scenario will be followed. The recovery teams then follow the defined recovery activities and act within the responsibilities of each team, as defined in this Disaster Recovery Plan.

4.1 Disaster Management Team Tasks

4.1.1 Immediate

  • Receive an initial assessment of the nature and extent of the problem.
  • Decide whether to activate the Plan.
  • Alert all Recovery team leaders.
  • Alert and mobilize all other team members.
  • Make a preliminary (verbal) report to senior management.
  • Call an initial meeting of the recovery team leaders with the following objectives:
    • To define the problem, the extent of the disruption, its consequences and the probable implications for the foreseeable future.
    • To set up a specified location as a Control Center.
    • To agree each team’s objectives for the next three hours.
    • To set up a second meeting for three hours later.
  • Make a second, more detailed, report to senior management on the content of the meeting and the actions being taken.

4.1.2 Within Three Hours

  • Call a second meeting of the recovery team leaders with the following objectives:
    • To receive initial reports from the recovery team leaders.
    • To take the decision to implement disaster recovery procedures.
    • To agree each team’s objectives for the next twenty-four hours.
    • To set up a third meeting for twenty-four hours later.
  • Contact vendors to alert them to the situation and to the fact that their services will be needed.
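
The meeting cadence described in 4.1.1 and 4.1.2 can be worked through as simple date arithmetic. The sketch below is illustrative only: the function name is hypothetical and the start time is an invented example, not a value from this plan. It follows the text literally, placing the second meeting three hours after the initial meeting and the third meeting twenty-four hours after the second.

```python
from datetime import datetime, timedelta


def meeting_schedule(initial_meeting):
    """Derive the recovery team leaders' meeting times from the initial meeting.

    Second meeting: three hours after the initial meeting (4.1.1).
    Third meeting: twenty-four hours after the second meeting (4.1.2).
    """
    second = initial_meeting + timedelta(hours=3)
    third = second + timedelta(hours=24)
    return {"initial": initial_meeting, "second": second, "third": third}


if __name__ == "__main__":
    # Hypothetical start time, chosen only to make the arithmetic concrete.
    schedule = meeting_schedule(datetime(2018, 7, 20, 9, 0))
    for name, when in schedule.items():
        print(f"{name:8s} {when:%Y-%m-%d %H:%M}")
```

With an initial meeting at 09:00, the second meeting falls at 12:00 the same day and the third at 12:00 the following day, that is, twenty-seven hours after the initial meeting.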

4.1.3 Within Twenty-Four Hours

  • Work with vendors to order needed replacement equipment.
  • Prepare plans for the transition to the standby facility (if required).
  • Report progress to senior management.
  • Act as the main point of contact with Campus Safety.
  • Monitor all activities regularly to maintain control over delivery and installation dates.
  • Document progress against agreed schedules.

4.1.4 Ongoing

  • Call all user contacts on a regular basis, advising them of the disruption and the actions being taken.
  • In conjunction with the Facilities Team, monitor the delivery and installation of new/replacement hardware, communications and ancillary equipment.
  • In the light of the disruption, review all production schedules in terms of jobs to be run, timings, priorities and dependencies.
  • Prepare production schedules in readiness for start up at the standby site (if required).
  • Accept handover of the standby site from the Facilities Team.
  • In conjunction with the Network and Operations Team, initialize and test the systems:
    • Hardware
    • operating systems
    • communications network
  • Start processing in accordance with prepared production schedules.
  • Discontinue work at any interim site(s).

4.2 Network and Operations Team Tasks

4.2.1 Immediate

  • Alert and mobilize all other team members.
  • Attend the initial meeting called for recovery team leaders.

4.2.2 Within Three Hours

  • Inform all staff of the problem and the actions being taken.
  • Ensure all staff remain calm and understand their roles.
  • Inform all staff of any temporary instructions.
  • Help to compile an inventory of surviving communications equipment (voice/data) and of equipment that must be acquired.
  • Ensure that all relevant documentation for the reinstatement of the network is at hand or retrieved from the off-site storage facility.
  • Provide further information, if required, to enable the Disaster Management Team Leader to keep users informed of the current position.
  • Ensure that all documentation and information is available for the vendors to connect the voice, local, and wide area networks to the standby facility (if required).
  • Liaise with the Standby Facility and telecom service providers to monitor progress of communications reinstatement.
  • Report back at the second meeting of recovery team leaders.

4.2.3 Within Twenty-Four Hours

  • Contact suppliers of:
    • Hardware
    • communications equipment
    • ancillary equipment
  • Inform them of the arrangements for moving to the standby facilities (if required).
  • Order new equipment and arrange to have it installed in the standby facility.
  • Define the priorities for restoring the network on a gradual basis in order to provide a minimum initial communications requirement for normal operations.
  • Liaise with suppliers of communications equipment to ensure prompt delivery, if required.
  • Ensure that the reinstated communications network is operable and tested.
  • Provide ongoing support for the communications network and carry out any reconfiguration of the reinstated network that may be necessary.
  • Attend the third meeting of the disaster recovery team leaders and report the restoration status.

4.2.4 Ongoing

  • Monitor the delivery and installation of new/replacement hardware, communications and ancillary equipment.
  • In the light of the disruption, review all production schedules in terms of jobs to be run, timings, priorities and dependencies.
  • Prepare production schedules in readiness for start up at the standby site (if required).
  • Initialize and test the systems:
    • Hardware
    • operating systems
    • communications network
  • Start processing in accordance with prepared production schedules.
  • Discontinue work at any interim site(s).
  • Monitor the network’s performance.
  • Monitor and deal with users’ requests in the light of the restricted network.
  • Prepare an inventory of all communications equipment requiring replacement in order for the original computer processing environment to be re-utilized.
  • Order replacement equipment as required (in conjunction with the Disaster Management Team for expenditure approval).

4.3 Facilities Team Tasks

4.3.1 Immediate

  • Provide an initial damage report to the Disaster Management Team Leader.
  • Alert and mobilize all other team members.
  • Attend the initial meeting called for recovery team leaders.

4.3.2 Within Three Hours

  • Conduct an asset inventory.
  • Make a full evaluation of the damage.
  • In conjunction with the Network and Operations Team identify all potentially salvageable equipment.
  • Carry out safety inspections.
  • Make the site secure, to prevent unauthorized access by staff or the public.
  • Estimate the time required to recover.
  • Report back at the second meeting of recovery team leaders.

4.3.3 Within Twenty-Four Hours

  • Provide the required facilities at the Control Center.
  • Transfer staff to temporary locations.
  • Remove vital documents from disaster site.
  • Remove reusable equipment from disaster site.

4.3.4 Ongoing

  • Remove salvaged items from the disaster area.
  • Contact suppliers of essential services (electricity, gas, water) and make any arrangements required as a result of the disruption.
  • Supervise delivery and installations at the standby facility (if required).
  • Monitor the installation of:
    • Electricity
    • heating/lighting
    • air conditioning
    • fire detection systems
    • access control systems
  • Provide office furniture for the standby facility (if required).