BUSINESS CONTINUITY AND DISASTER RECOVERY
DECEMBER 2004
ISSUE OVERVIEW
Business Continuity Management (BCM) is defined by the Business Continuity Institute as “a holistic management process that identifies potential impacts that threaten an organization and provides a framework for building resilience and the capability for an effective response that safeguards the interests of its key stakeholders, reputation, brand and value-creating activities.”[1] More tangibly, BCM, also referred to as BCP (Business Continuity Planning), is designed to reduce the risk of an unexpected disruption to the critical functions and operations (both manual and automated) necessary for the survival of the business.[2]
Since the terrorist attacks of September 11, 2001, financial institutions have increased the emphasis placed on business continuity management. As a result, BCM today differs significantly from BCM in the early 1990s. Figure I, below, illustrates the gradual evolution of BCM practices and processes, as originally developed by Gartner.[3,4]
Figure I: Business Continuity Timeline
1990 - Principal component of BCM is IT disaster recovery; BCM provides protection from natural disasters and component failure; recovery time objective is 72 hours.
1995 - BCM begins to include business process protection; development of recovery plans for critical work processes is common.
1999 - Y2K concerns cause institutions to reassess contingency planning; traditional 72-hour recovery periods are insufficient and are improved to between four and 24 hours.
2001 - Post-September 11th BCM concerns include increased emphasis on crisis management, improved “people” management and testing scenarios for new situations such as loss of life, lack of transportation and destruction of facilities.
2004 - Concerns with four- to 24-hour site outages lead institutions to incorporate BCM into business processes, applications and technology architecture designs. Emphasis on ability to provide round-the-clock availability.
With the increased dedication of resources to BCM, financial institutions have developed
new departments and have increased employee responsibilities. Not surprisingly, the
organization of these departments varies by institution. This primary research brief
highlights the various organizational structures and planning practices used at six
institutions from across the globe.
Profiled Institution    Net Premiums Written    Geographic Region
A                       Under $1 Billion        Europe
B                       Under $1 Billion        Australia
C                       $3 - $5 Billion         North America
D                       $5 - $10 Billion        North America
E                       $5 - $10 Billion        North America
F                       $1 - $3 Billion         North America
Primary Research Brief
Business Continuity and Disaster Recovery
Key Questions:
§ How fast is a company’s critical information restored?
§ Which software solutions and vendors assist in recovering data?
§ Who supports the business continuity and disaster recovery programs?
§ What form of training is provided on business continuity and disaster recovery?
www.insuranceadvisoryboard.com
INSURANCE ADVISORY BOARD
DECEMBER 2004
TABLE OF CONTENTS
Issue Overview
Summary of Findings
Testing and Recovery Process
Strategies, Staff and Training
Vendors
Research Methodology
Recommended Readings
This project was researched and written
to fulfill the specific research request of
a single member of the Insurance
Advisory Board and as a result may not
satisfy the information needs of other
members. In its short-answer research,
the Insurance Advisory Board refrains
from endorsing or recommending a
particular product, service or program
in any respect. Sources are contacted at
random within the parameters set by the
requesting member, and the resulting
sample is rarely of statistically
significant size. That said, it is the goal
of the Insurance Advisory Board to
provide a balanced review of the study
topic within the parameters of this
project. The Insurance Advisory Board
encourages members who have
additional questions about this topic to
assign short-answer research projects of
their own design.
Catalogue No.: IAB12OAA1I
SUMMARY OF FINDINGS
§ Business continuity is defined by Gartner to include five principal components. Table I, below, highlights the objectives of the various BCM components. The crisis management component, as illustrated, addresses the management of an event and the plans to protect employees and the institution, regardless of the type of interruption.[5]
Table I: Principal Components of Business Continuity Management
Disaster Recovery
    Objective: Mission-critical applications
    Focus: Site or component outage (external)
    Deliverable: Disaster recovery plan
    Sample Event: Fire at data center
    Sample Solution: Recovery site in different location
Business Recovery
    Objective: Mission-critical business processing (workspace)
    Focus: Site outage (external)
    Deliverable: Business recovery plan
    Sample Event: Electrical outage
    Sample Solution: Recovery site in different power grid
Business Resumption
    Objective: Business process workarounds
    Focus: Application outage (internal)
    Deliverable: Alternate processing plan
    Sample Event: Credit authorization system down
    Sample Solution: Manual procedure
Contingency Planning
    Objective: External event
    Focus: External behavior forcing change to internal
    Deliverable: Business contingency plan
    Sample Event: Main supplier cannot ship due to external problem
    Sample Solution: 25% backup of vital products; backup supplier
Crisis Management (shown in the original table as a single band across all four components)
Source: Gartner Research, 2004.
§ Financial companies’ spending on Business Continuity and Disaster Recovery (BCDR) has spiked in recent years. After the September 11th attacks, spending rose 19 percent to US$3.4 billion. According to TowerGroup, spending on disaster recovery is expected to rise another 12 percent to US$4 billion by the end of 2004.[6]
§ Figure II, below, depicts the growing concern for disaster recovery in relation to other IT security measures.
[Pie chart. Categories as extracted: Disaster Recovery, Privacy, Hackers/Viruses, Identity Theft/Fraud, Other. Shares as extracted: 41.3%, 17.4%, 34.8%, 2.2%, 4.3%.]
§ For the past 11 years, the federal government has mandated that financial institutions possess some form of disaster recovery program. Since September 11th, 2001, more attention has been given to BCDR programs and companies are now held accountable for their programs. All federally supervised financial institutions are now required to have BCDR plans.[7]
Business Continuity Management is generally defined to include five components: disaster recovery, business
recovery, business resumption, contingency planning and crisis management.
Figure II: Top IT-Security Concerns at Insurance Companies
Source: Best’s Review, May 2004.
Disaster recovery is currently cited as the number one concern for Information Technology security at financial
institutions around the world.
Disaster recovery is cited as the most important security concern for chief information officers and IT decision makers.[8] Every hour of downtime carries a substantial cost. Table II, below, displays the average hourly cost of downtime by type of business or technology.
Source: AIIM E-Doc Magazine, July/August 2004.
BCDR experts state that it is better to have a plan that prevents a disaster from taking systems down than to rely on quickly restoring data after a disaster or attack. However, some businesses would rather pay the cost of downtime than pay for the upkeep of a mirrored site that replicates data, applications and IT infrastructure.[9]
To manage the IT needs of organizations, disaster recovery vendors offer identical sets of applications and data, which are available at off-site locations in the event of a catastrophic system failure. Experts state that most financial services firms run “hot” or “warm” sites, where data are mirrored at an off-site facility on either a real-time or near real-time basis, since several minutes of downtime can result in significant revenue losses.[10,11]
Contingency Planning & Management Online identifies four options for alternate sites:[12]
§ Hot Sites: Monthly subscriber fees pay for the availability, space, equipment, and services of fully operational facilities maintained by independent providers.
§ Cold Sites: Computer-ready space held in reserve for the user’s own systems, for any subscriber whose recovery exceeds the period pre-designated for hot site usage.
§ Warm Sites: Data center or office space partially equipped with hardware, communications interfaces, power sources, and environmental conditioning.
§ Mobile Recovery Centers: Custom-designed, transportable structures outfitted with computer and telecommunications equipment, transported to and set up at the chosen location.
To prevent downtime in a crisis, it is imperative that Human Resources (HR) be a key player in programs that manage internal and external operations and communication.[13]
An effective key contingency plan includes setting aside reserve funds for dealing with the unexpected expenses associated with replacing a key executive. Assigning a contingency planning team whose responsibilities include proactively developing high-potential employees in tandem with succession plans is a useful tactic for preparing the company for sudden leadership vacuums. Consequently, if a key officer at the top suddenly needs a replacement, executives ideally have already been groomed to assume key positions.[14,15] One tactic is to mentor and cross-train high-potential employees for key positions throughout the organization. Such training also serves as an effective retention tool in an industry with high turnover.[16]
Table II: Average Hourly Cost of Downtime
Type of Business/Technology                     Cost of Downtime
Brokerage House or Large E-commerce Site        $6.4 million
Credit Card Sales and Authorization             $2.6 million
Catalog Sales                                   $90 thousand
Package Shipping and Transportation Industry    $28 thousand
UNIX Networks                                   $75 thousand
PC LANs                                         $18 thousand
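As a rough illustration of how the hourly figures in Table II translate into the cost of a whole outage, the sketch below multiplies an hourly cost by an outage length. The function name `estimated_outage_cost` and the assumption that cost accrues linearly over the outage are illustrative only and do not come from the cited source; the dollar figures are copied from Table II.

```python
# Average hourly downtime cost in US dollars, copied from Table II.
HOURLY_DOWNTIME_COST_USD = {
    "Brokerage House or Large E-commerce Site": 6_400_000,
    "Credit Card Sales and Authorization": 2_600_000,
    "Catalog Sales": 90_000,
    "Package Shipping and Transportation Industry": 28_000,
    "UNIX Networks": 75_000,
    "PC LANs": 18_000,
}


def estimated_outage_cost(business_type: str, outage_hours: float) -> float:
    """Return hourly cost times outage length.

    Assumption (not from the source): cost accrues linearly with time.
    """
    if outage_hours < 0:
        raise ValueError("outage_hours must be non-negative")
    return HOURLY_DOWNTIME_COST_USD[business_type] * outage_hours


if __name__ == "__main__":
    # Compare outage lengths that match recovery windows discussed in the brief.
    for hours in (4, 24, 72):
        cost = estimated_outage_cost("Catalog Sales", hours)
        print(f"{hours:>2} h outage, Catalog Sales: ${cost:,.0f}")
```

Under this simple model, shortening a recovery window from 72 hours to 24 hours cuts the estimated loss by two thirds, which is the kind of comparison an institution might weigh against the upkeep cost of a mirrored site.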
KEY POINT
Human resources play a key role in minimizing the loss of data through contingency business resumption planning.
TESTING AND RECOVERY STRATEGIES
Business Continuity and Disaster Recovery Process
Figure III, below, charts the business-defined recovery time objective (RTO) and the business-defined recovery point objective (RPO) for each of the six profiled institutions. Institution E currently lacks a definition for RPO, and therefore is not included in the graph.
[Bar chart: RTO and RPO in hours (vertical axis 0 to 80) for Institutions A through F. Data labels as extracted: 48, 72, 72, 24, 24, 72, 48, 16, 24, 72, 0, 24.]
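For readers less familiar with the two measures, RTO is the longest acceptable time to restore a service after an outage, and RPO is the longest acceptable window of lost data, measured back from the outage to the last usable copy. The sketch below shows one way a recovery test result could be checked against both objectives. The class and function names, and the timestamps in the example, are hypothetical and are not taken from any profiled institution.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    rto: timedelta  # longest acceptable time to restore service
    rpo: timedelta  # longest acceptable window of lost data


def evaluate_recovery_test(
    objectives: RecoveryObjectives,
    outage_start: datetime,
    last_good_backup: datetime,
    service_restored: datetime,
) -> dict:
    """Compare one test's measured downtime and data loss with the objectives."""
    downtime = service_restored - outage_start
    data_loss_window = outage_start - last_good_backup
    return {
        "downtime": downtime,
        "data_loss_window": data_loss_window,
        "rto_met": downtime <= objectives.rto,
        "rpo_met": data_loss_window <= objectives.rpo,
    }


# Hypothetical example: 24-hour RTO and 24-hour RPO.
objectives = RecoveryObjectives(rto=timedelta(hours=24), rpo=timedelta(hours=24))
result = evaluate_recovery_test(
    objectives,
    outage_start=datetime(2004, 12, 1, 8, 0),
    last_good_backup=datetime(2004, 11, 30, 20, 0),  # 12 hours before the outage
    service_restored=datetime(2004, 12, 2, 10, 0),   # 26 hours after the outage
)
print(result["rto_met"], result["rpo_met"])
```

In this made-up case the data loss window (12 hours) is within the RPO, but the 26 hours of downtime exceeds the RTO, so the test would flag the recovery time rather than the backup frequency.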
Each of the profiled institutions conducts a full continuity test once a year. The scope of these tests varies by institution, although all include testing of critical systems:
§ Institution A tests its desktop and in-house databases. No third party is involved in the testing.
§ Institution B examines its mainframe and midrange systems. Employees participate in the testing of recovered critical process chains and follow prepared test scripts.
§ Institution D conducts a continuity test by staging a complete building outage at one of its seven offices. Third-party vendors are employed to carry out the testing.
§ Institution F conducts a full recovery test twice per year on the mainframe side and annually for distributed processing and business functional plans.
None of the six companies that participated in this study had its BCDR plans certified. However, Institution B works closely with the Australian Prudential Regulation Authority’s guidelines and has its plans benchmarked to best practice.
Business Continuity and Disaster Recovery testing generally occurs over the course of several weeks. Table III, below, displays the preparation, frequency, and involvement within the six profiled institutions.
Table III: Preparation, Frequency and Involvement of Testing
Institution A
    Preparation: 3-4 days spread over 4-5 weeks
    Frequency: NA
    Involvement: Business coordinators from each department
Institution B
    Preparation: Rolling basis
    Frequency: Once per year
    Involvement: IT Service Continuity and Corporate Business Continuity Group
Institution C
    Preparation: NA
    Frequency: Rolling basis
    Involvement: IT and Business involved
Institution D
    Preparation: Rolling basis (meet monthly to discuss testing)
    Frequency: NA
    Involvement: IT and Business involved
Institution E
    Preparation: 4-5 months
    Frequency: Once per year
    Involvement: NA
Institution F
    Preparation: 4-6 weeks part time
    Frequency: NA
    Involvement: Recovery Coordinator and Disaster Recovery team leaders and test participants
Figure III: RTO vs. RPO for Critical Resources
Source: Board Research, 2004.
STRATEGIES, STAFF AND TRAINING
Human Resources and Business Continuity and Disaster Recovery Staff and Training
Institution A has one full time BCM manager and one part time coordinator in each department.
Institution B has a full time BCM manager who reports to Fraud and Security Risk. Three additional
BCM coordinators work for each regional area of the company and report upwards to the BCM manager.
Two of these coordinators are part time staff, while the third operates full time. There is one contracted BCM
consultant who reports independently to the Risk Manager. This consultant has full responsibility for
Information Technology/Disaster Recovery Plans (ITDRP).
Institution C’s Business Continuity Program consists of one full time BCM coordinator, one full time
technical continuity coordinator and five directors of Information Security and Continuity Services.
Institution D has two full time staff on the IT/DR side who report through the CIO and up to the CFO. An
additional two full time staff are on the Business Recovery side, reporting up to the CFO. All four employees
meet monthly to coordinate their Information Security issues.
Institution F’s BCDR program consists of three full time Information Security/Information Technology
Recovery Coordinators. These coordinators report to the Security Officer, who then reports to the Director of
Enterprise Operations Applications. There are also nine part time Business Recovery Coordinators, all director-level employees who report to Vice Presidents across the company.
Institution F also has a Corporate Business Continuity Team that is led by an IT/DR specialist and staffed
by Recovery Coordinators. About 200 Disaster Recovery Team Leaders, composed of department
managers, supervisors, or IT/IS technicians, are responsible for applications and support.
Institution E does not currently have a support staff for their BCDR program.
The number of people supporting the business continuity programs at each institution varies widely. However, all institutions except Institution E have a mixture of full and part time employees who report to the overall BCM manager.
RECOVERY STRATEGIES
Institution B has recovery strategies at the operational level to cover three scenarios: loss of a physical site, loss of
computers, and loss of telephones at a physical site. Each scenario has its own Action Plan which covers loss of systems
nationwide.
Institution F has implemented a mainframe hot site and work area for alliance distributed environments through SunGard.
Additionally, they have an in-house work area for home office and failover for critical distributed, quick ship for less critical
distributed, and acquire ATOD for least critical distributed.
BUSINESS CONTINUITY AND DISASTER RECOVERY TRAINING
§ Institution A offers ongoing training for specific individuals in each area of the institution.
§ Institution B administers a Training and Awareness Pack to all employees at the business level. An additional intranet site on business continuity exists to assist employees.
§ Institution C is revamping its strategy and developing a BCP Awareness program for all employees.
§ Institution D exercises teams annually on BCDR strategies; however, at this time there is no formal instruction available to employees. A website defines expectations, policies and procedures for employees to follow in the case of a disaster.
§ To implement BCDR training across levels of the institution, Institution F provides walk-through scenarios of business plans, mock disaster sessions, hot site and question/answer testing.
§ Institution E provides cursory training on BCDR, though not everyone is accountable for it.
VENDORS
Business Continuity and Disaster Recovery Vendors
Institutions A, B, C and F blend in-house and outsourced processes to maintain a hybrid solution to disaster recovery.
Table IV, below, depicts the vendors and software employed by the institutions in order to maintain BCDR plans.
Table IV: BCDR Vendors and Software
Institution A
    Outsource, Insource or Hybrid Solution: Hybrid solution: in-house cold site, outsource contracts to Synstar
    Software Used to Maintain BCDR Plans: Utilizes C-Plan, an internally developed software solution, to maintain the BCDR plans. C-Plan needs redevelopment as there is a low level of satisfaction with it.
Institution B
    Outsource, Insource or Hybrid Solution: Hybrid solution: shared warm site facility that is run by IBM/GSA and staffed by Institution B’s employees
    Software Used to Maintain BCDR Plans: Utilizes Microsoft Office products to maintain BCDR plans. The products “work well and are cost effective.”
Institution C
    Outsource, Insource or Hybrid Solution: Hybrid solution: many systems, such as email and network-based services, are replicated between locations and the rest is outsourced to IBM
    Software Used to Maintain BCDR Plans: Utilizes LDRPS to maintain its BCDR plans. Likely to discontinue because the product “adds little value to the process.”
Institution D
    Outsource, Insource or Hybrid Solution: In-house solutions
    Software Used to Maintain BCDR Plans: NA
Institution E
    Outsource, Insource or Hybrid Solution: Outsource to SunGard
    Software Used to Maintain BCDR Plans: NA - currently evaluating several plans
Institution F
    Outsource, Insource or Hybrid Solution: Hybrid solution: in-house solutions for critical information that cannot withstand a 72-hour RTO; SunGard is used as a hot site vendor
    Software Used to Maintain BCDR Plans: Utilizes ComPAS software and is currently evaluating other products.
Living Disaster Recovery Planning System (LDRPS) is a software product that assists in organizing and carrying out BCDR plans. LDRPS offers plan templates, language modules, work station recovery, call lists and databases to support data recovery.
ComPAS is a software program developed by SunGard. Its plans are action-oriented task lists organized by function, department and location. ComPAS guides companies through a step-by-step process of developing, testing and implementing a comprehensive continuity plan.
KEY POINT
Institution D decided that it is less expensive to establish a hot site run by its own employees. The company set up a data center 12 miles from its downtown office. Although the center is close to the downtown offices, the worst-case natural disaster that could occur in the area is a tornado, and Institution D remarked that “a tornado is only likely to hit any given point every 999 years and to hit two points 12 miles apart is highly unlikely.”
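Institution D's reasoning can be made concrete with a back-of-the-envelope calculation. The sketch below reads "every 999 years" as an annual probability of 1/999 at any one point and treats strikes at the two sites as independent events. Both readings are assumptions made here for illustration, and the function name is hypothetical; sites only 12 miles apart can be affected by the same storm system, so the independent figure is best treated as a lower bound on the real risk.

```python
def joint_annual_probability(return_period_years: float, sites: int = 2) -> float:
    """Annual probability that every site is hit, assuming independent strikes.

    Assumption: a return period of N years means an annual probability of 1/N
    at each site, and strikes at different sites are independent.
    """
    p_single = 1.0 / return_period_years
    return p_single ** sites


p_one = joint_annual_probability(999, sites=1)
p_both = joint_annual_probability(999, sites=2)
print(f"Annual probability at one site:   {p_one:.6f}")
print(f"Annual probability at both sites: {p_both:.3e}")
print(f"Implied return period for both:   {1 / p_both:,.0f} years")
```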
KEY OBSERVATION
Each of the profiled institutions that uses software to maintain its BCDR plans, except for Institution B, is dissatisfied with the results. While numerous software vendors exist, companies have trouble effectively integrating the software into their institutions. Despite the immense loss of data for some companies after September 11, 2001, no standard practice has been implemented across the insurance industry for BCDR plans, leaving many institutions questioning what steps to take to protect their company from a disaster.
RESEARCH METHODOLOGY
Board staff interviewed business continuity and disaster recovery professionals at six
insurance companies. These individuals discussed the recovery process, IT vendors and
human resources within each institution. This report represents findings from primary and
secondary sources.
Key Questions:
1. What is the business-defined recovery time objective (RTO) for the Company’s critical resources?
2. What is the business-defined recovery point objective (RPO) for the Company’s critical information?
3. Does the Company outsource (e.g., SunGard) or insource disaster recovery, or do you have a hybrid solution?
    a. If insourced solution, does the Company run a hot site, worksite or cold site?
    b. If outsourced, with what firm do you contract?
    c. If hybrid solution, explain the solution.
4. How often does the Company run a full continuity/recovery test?
    a. How often does the Company do point solution recovery tests?
    b. What is the scope of testing? (e.g., mainframe only, application testing, etc.)
    c. Does the testing include third party business partners, vendors, etc.?
5. Does the Company prepare for the test or is it spontaneous?
    a. If the Company prepares, how much preparation / planning is involved?
    b. Who is involved in this planning?
6. Are the Company’s business resumption plans certified? If so, by which organization?
7. Is the Company’s DR plan(s) certified? If so, by which organization?
8. How many people support the business continuity program? Clarify full time or part time, roles and responsibilities and where they report.
9. How often does the Company perform a business impact analysis?
    a. Is this done using existing staff or external resources?
10. Has the Company developed high availability solutions for critical business processes? Describe.
11. How many recovery strategies have been implemented? Describe.
12. Does the Company perform business continuity training? If so, to what extent?
    a. Is it company-wide training or for specific individuals?
13. Does the Company utilize a software solution for maintaining business resumption plans, disaster recovery plans, etc.? If so, please name the product and the satisfaction level.
    a. If not, have you evaluated certain products?
    b. What is your opinion?
RECOMMENDED READINGS
1. “Trends in Information Security and Business Continuity Planning: From Infrastructure Protection to Business Planning.” Working Council for Chief Information Officers, 2003.
2. “Disaster Recovery Technologies.” Infrastructure Executive Council, April 2003.
3. “Business Continuity Management Structures.” Operations Council, April 2004.
4. “Crisis Management Strategies.” Corporate Leadership Council, May 2003.
5. “Business Continuity Planning.” Working Council for Chief Executive Officers, September 2002.
6. “Disaster Recovery Planning at Banks.” Operations Council, April 2002.
7. “Disaster Recovery / Business Continuity Plans and Key Insights from September 11th.” Working Council for Chief Information Officers, September 2002.
8. “Disaster Recovery and Business Continuity Vendors.” Working Council for Chief Information Officers, July 2002.
9. “HR and Business Process Recovery and Contingency Planning.” Corporate Leadership Council, February 2002.
[1] “Business Continuity and Crisis Management.” Management Quarterly, January 2003.
[2] “Leaving it to Chance.” CA Charter, 1 February 2004.
[3] “Management Update: Best Practices in Business Continuity and Disaster Recovery.” Gartner, 17 March 2004.
[4] “Business Continuity Management Structures.” Operations Council, April 2004.
[5] Ibid.
[6] Marlin, Steven and Martin J. Garvey. “Disaster-Recovery Spending on the Rise.” Information Week, 9 August 2004.
[7] Swann, James. “Be Prepared: Disaster Recovery Strategies.” Community Banker, February 2004.
[8] “Moving IT Forward.” Best’s Review, May 2004.
[9] Garvey, Martin J. “Mirrored Sites Keep Systems Up.” Information Week, August 2003.
[10] “Disaster Recovery and Business Continuity Vendors.” Working Council for Chief Information Officers, July 2002.
[11] “IT Challenge.” Wall Street & Technology Online, 9 October 2001.
[12] “Contingency Planning & Management.” http://www.contingencyplanning.com
[13] “Crisis Management Strategies.” Corporate Leadership Council, May 2003.
[14] “Executive Succession Plans.” Change Management Group, November 2001.
[15] “HR and Business Recovery and Contingency Planning.” Corporate Leadership Council, February 2002.
[16] Ibid.
Professional Services Note
The Insurance Advisory Board has worked to ensure the accuracy of the information it provides to its members. This project relies upon data obtained from many sources, however, and the Insurance Advisory Board cannot guarantee the accuracy of the information or its analysis in all cases. Further, the Insurance Advisory Board is not engaged in rendering legal, accounting or other professional services. Its projects should not be construed as professional advice on any particular set of facts or circumstances. Members requiring such services are advised to consult an appropriate professional. Neither Corporate Executive Board nor its programs is responsible for any claims or losses that may arise from any errors or omissions in their reports, whether caused by Corporate Executive Board or its sources.