Wednesday, June 25, 2008

Tape is not dead!

For several years I have been chagrined by the various proclamations from industry experts that “Tape is dead”. These reports (many of which are published by vendors that do not offer a tape product) extol the virtues of using disk as a replacement for tape, but ignore the operational benefits and cost savings that tape provides in the enterprise IT environment.

These benefits become more apparent as the quantity of data and the size of the enterprise grow. At some point, it becomes evident that keeping all copies of all data on spinning disk storage is simply unsupportable – both operationally and financially.

Even so, the many myths regarding the demise of tape have been perpetuated over the years. Therefore I was very pleased to see an article debunking some of these myths appear in the June/July 2008 issue of Z/Journal.

This article, “Mainframe Tape Technology: Myths and Realities” (by Stephen Kochishan and John Hill), discusses several of these popular myths and then describes the reality and the best practice that apply to each. It should be an interesting read for the enterprise storage professional.

Saturday, June 7, 2008

AT&T Study confirms DRJ and Forrester results

The following study announcements were found clogging my inbox on Friday...

Survey finds most businesses prepared for disasters
Bizjournals.com - Charlotte, NC, USA
The survey found that 77 percent of Seattle and Portland executives indicate their companies have a business continuity plan.

AT&T 2008 Business Continuity Study
Continuity Central (press release) - Huddersfield, UK
AT&T has published the results of its latest annual survey of business continuity practices in US organizations. The 2008 survey is the seventh such survey...

AT&T Study: One in Five US Businesses Does Not Have a Business ...
Converge Network Digest - USA
For the seventh consecutive year, AT&T's Business Continuity Study surveyed IT executives from companies throughout the United States that have at least $25...

Reading just these excerpts, it took me a moment or two before I realized that they were all describing the same AT&T study.

These particular reports are a little light on specific results from the AT&T survey, but they are still very timely considering the topic of my last post. The consensus of these reports seems to be that 80% of companies have a Business Continuity plan. 59% of the respondents have updated their plan within the last 12 months, but fewer (46%) have had their plans fully tested during the same period.

Using these numbers in place of the percentages from the DRJ/Forrester study referenced in the previous post, we end up with a set of calculations that look something like this:

80% of the companies have a Business Continuity Plan.
59% of the companies update their plan at least once a year. This means that 47% of the companies (.80 * .59 = 47%) have a plan that is updated at least once a year.
46% of the companies actually test their recovery plans at least once a year. This indicates that 22% of the companies (.46 * .47 = 22%) have a plan, update it, and test it at least once a year.

This result isn't too bad, I guess, but it doesn't incorporate the result from the Gartner study that only 28% of the planned tests are actually successful and meet all of their objectives. If we apply this percentage to the results of the AT&T study, we find that:

(.28 * .22 = 6%) Only six percent of the surveyed companies can be expected to have successful Business Continuity exercises that meet all of their business requirements.
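
The chain of multiplications above can be sketched in a few lines of Python. This is only an illustration under the same assumption the calculation makes, namely that each percentage applies to the companies left over from the previous step. The names `ATT_STEPS` and `overall_share` are my own and do not come from any of the studies.

```python
import math

# Percentages quoted in this post. ASSUMPTION: each one applies only to the
# companies that passed the previous step, as in the calculation above.
ATT_STEPS = {
    "has a plan": 0.80,
    "updated within 12 months": 0.59,
    "fully tested within 12 months": 0.46,
    "test met all objectives (Gartner)": 0.28,
}


def overall_share(rates):
    """Multiply the step-by-step rates into one overall fraction."""
    return math.prod(rates)


att_result = overall_share(ATT_STEPS.values())
print(f"{att_result:.1%}")  # prints 6.1%
```

Multiplying the unrounded values gives roughly the same six percent as rounding at each step.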

This is discouraging news indeed.

Sunday, June 1, 2008

Are we prepared?

I attended a Disaster Recovery seminar last week aimed at building a better disaster recovery plan. Some of the statistics presented sparked my interest but rather than taking the presentation at face value, I thought that some additional discussion and analysis about some of these findings would be useful.

It is fortunate that I did take a closer look as some of the statistics that were presented as fact did not bear up to close scrutiny. In fact, upon verifying the presenters’ source information it became apparent that one of the statistics I had chosen as the starting point for my analysis had been misinterpreted and used out of context of the original source document.

However, I was still interested in where this information might have led with the proper analysis, so I discarded the seminar materials and went looking for similar – but more accurate and verifiable – data that could stand up to analysis.

These are the results.

The current state of DR/BC


79%
The percentage of enterprises that report having a formal and documented recovery plan in place. [Source: DRJ/Forrester article] This is a very strong showing and represents the significant progress that the industry as a whole has achieved. While no one can argue that the percentage should be anything less than 100%, 19% of the respondents indicated that they expect to have a plan in place within the next 6-12 months, leaving only 2% of the respondents with no plan whatsoever.

81%
Of those with a DR plan, 81% responded that their plans are updated at least once a year. 26% of the respondents indicated that their plans are updated in an ongoing fashion as part of the change and configuration management processes. Kudos to these folks! 14% update their plans quarterly, 18% update their plans twice a year, and 23% update their plans once a year. [Source: DRJ/Forrester article]

82%
82% of those that responded perform a full exercise of their disaster recovery plans at least once a year. 50% test once a year, 22% test twice a year while 10% test more than twice a year. [Source: DRJ/Forrester article]

At this point, the numbers look really encouraging. As a DR/BC professional myself I can look at these numbers and say “Wow! 80% is really good. We’re doing a great job!” On a personal level, this causes warm and fuzzy feelings as my GQ (Goodness Quotient) is set to 80.

However…

Simple numbers, such as these, can be deceiving. In fact, much better business decisions can be made if additional understanding and analysis of the overall results can be achieved.

First of all, even though each of the numbers presented so far is very close to 80%, it is important to realize that each additional statistic represents only a portion of the prior sample: there is a further reduction in the effective end-product success at each iteration.

In other words, the numbers should be understood in this context:

  • 79 out of 100 enterprises have a disaster recovery plan. (GQ=79)
  • 81% of the enterprises with a plan update it at least once a year: 79 * .81 = 64 (GQ = 64)
  • 82% of those actually test their recovery plans at least once a year: 64 * .82 = 52 (GQ = 52)


    Hmmm. So this means that only about 52% of all enterprises actually have a DR plan, update it and test it at least once a year. While that doesn’t make me as warm and fuzzy as the 80% number did, it’s still pretty good, right?

    Well, maybe not. While this still appears to be a relatively positive indication, it doesn’t yet include any indication of how many of these DR plans are successful and actually meet or exceed the client requirements.

    In order to proceed with this next analytical step, it is necessary to reference the results of an additional study: a recent Gartner study that found, “Twenty-eight percent of organizations reported that their last disaster recovery exercise went well and met all their service targets. However, 61 percent of survey participants reported that they had problems with the exercise.”

    28%
    A 28% “success” rate: the share of organizations whose last disaster recovery exercise went well and met all their service targets. [Source: Gartner article]

    However, we must remember that this percentage only applies to those that actually have a plan and update and test their plan. So in reality, this is a 28% success rate of only 52% of the total. (52% * .28 = 15%)
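
To make the step-by-step shrinkage easier to follow, here is a small Python sketch of the running GQ. It assumes, as the analysis here does, that each percentage applies only to the enterprises that made it through the previous stage. The names `STAGES` and `running_gq` are made up for this illustration.

```python
from itertools import accumulate
from operator import mul

# Figures quoted in this post (DRJ/Forrester for the first three, Gartner for
# the last). ASSUMPTION: each rate applies only to the survivors of the
# previous stage.
STAGES = [
    ("has a documented plan", 0.79),
    ("updates it at least yearly", 0.81),
    ("tests it at least yearly", 0.82),
    ("last test met all targets", 0.28),
]


def running_gq(stages):
    """Return (label, cumulative percent of all enterprises) for each stage."""
    rates = [rate for _, rate in stages]
    cumulative = accumulate(rates, mul)  # running product of the rates
    return [
        (label, round(share * 100))
        for (label, _), share in zip(stages, cumulative)
    ]


for label, gq in running_gq(STAGES):
    print(f"GQ = {gq:>2}  {label}")
```

The running values come out as 79, 64, 52 and 15, matching the figures worked out by hand.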

    The complete breakdown of this analysis is shown graphically here:

    [Figure: Spreadsheet showing the number of successful DR tests as a percentage of the whole]

    This indicates that only 15% of the total companies will actually recover from a disaster as they have planned. In other words, 85% of all companies will either fail following a disaster or will experience difficulties that will cause them to exceed either their RTO or RPO or both.

    It would make a great - although completely irresponsible - headline if we were to state that "85% of all organizations will fail following a disaster". Tempting to some perhaps, but no.

    To do so would totally ignore the 61% (as reported by Gartner) who reported that they had "some problems with their last exercise". Since we do not have the detailed information regarding this statistic, we cannot ascertain the severity or number of the problems they encountered. It is, however, safe to assume that many of these problems have been corrected and that a portion of these enterprises will enjoy greater success the next time they exercise their DR validation program.

    The areas of Disaster Recovery and Business Continuity are ones that capitalize on the benefits of a Continual Service Improvement methodology. Maintaining the plan and keeping it current, and validating the plan via frequent test executions, are two of the cornerstones necessary for a compliant and resilient enterprise.

    Source Information
    Website: Disaster Recovery Journal, "The State Of DR Preparedness", http://www.drj.com/index.php?option=com_content&task=view&id=794&Itemid=159&ed=10, with references from the Forrester/Disaster Recovery Journal October 2007 Global Disaster Preparedness Online Survey, verified 06/01/2008

    Website: Gartner, "Gartner Says Most Organizations Are Not Prepared For a Business Outage Lasting Longer Than Seven Days", http://www.gartner.com/it/page.jsp?id=579708, verified 06/01/2008