By Fadi Albatal
In my last post on disaster recovery (DR)
automation, I explored how to address the complexity and challenges of DR implementation and execution through the
deployment and adoption of new technologies. In many cases, however, the
question is not whether organizations have the ability to implement a complex
DR solution – it is a question of being able to afford one.

A recently published survey by Enterprise Strategy Group (ESG) found that 74
percent of respondents could tolerate no more than three hours of downtime before
they start suffering revenue loss, and 54 percent of those surveyed could not
tolerate even an hour of downtime. This is an indicator of two things: first,
our business processes are heavily dependent on IT, which is a good thing; and
second, a DR implementation for systems that drive and support business
processes is crucial to the survival of an enterprise.

With today’s stricter criteria for downtime tolerance, we can already dismiss
certain legacy DR strategies as nonviable. One such strategy, long the
backbone of many DR processes, is recovery from tape. Tape
recovery times can be hours and even days in many cases, especially when it
comes to partial or entire data center recovery. Although virtual tape
libraries (VTL) and other disk-based de-duplication solutions have accelerated
the backup and recovery process, data restoration performed with traditional
backup software is an extremely complex and lengthy process that far exceeds
the tolerance level of the 74 percent of respondents to the ESG survey.

To make the right decision on where to invest, let’s first categorize the recovery
scenarios. In most cases, data loss is the result of human error or a software
or hardware malfunction. These causes account for almost 84 percent of failure and
data-loss scenarios. In all these instances, a local recovery may be the best
way to go. Remote recovery, however, remains the best response to natural
disasters where the impact on primary data center operations is significant.
The second tier of categorization is the services that you’re providing to your
users. Since not all data is created equal, it’s important to identify what
constitutes critical tier-one versus tier-two services and the different
service level agreements (SLA) associated with these applications.

This categorization does not mean that you should neglect some services in favor of others.
There is often a correlation between the desired up-time of a certain
application and the cost of the infrastructure supporting the service. Tiering
your services based on SLA will help you distribute your investment accordingly
and ensure optimal up-time for all your business-critical applications.
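
As a rough illustration of SLA-based tiering, the sketch below matches each service against the recovery methods whose estimated recovery time fits within its tolerated downtime. The service names, method names, and hour values are hypothetical placeholders, not figures from the ESG survey or from any product; in practice they would come from your own SLAs and measured restore tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    tier: int          # 1 = most critical
    rto_hours: float   # maximum downtime tolerated under the SLA


@dataclass(frozen=True)
class RecoveryMethod:
    name: str
    est_recovery_hours: float  # illustrative estimate, not a measured value


def viable_methods(service, methods):
    """Return names of methods whose estimated recovery time fits the service's RTO."""
    return [m.name for m in methods if m.est_recovery_hours <= service.rto_hours]


def plan(services, methods):
    """Map each service (most critical tier first) to its viable recovery methods."""
    ordered = sorted(services, key=lambda s: s.tier)
    return {s.name: viable_methods(s, methods) for s in ordered}


# Hypothetical catalogue: replace the hour values with results from real restore tests.
METHODS = [
    RecoveryMethod("snapshot_rollback", 0.25),
    RecoveryMethod("cdp_replay", 0.5),
    RecoveryMethod("tape_restore", 24.0),
]

# Hypothetical services and SLAs.
SERVICES = [
    Service("order_processing", tier=1, rto_hours=1.0),
    Service("internal_wiki", tier=2, rto_hours=48.0),
]

if __name__ == "__main__":
    for name, options in plan(SERVICES, METHODS).items():
        print(f"{name}: {', '.join(options) or 'no viable method'}")
```

A table like this makes the investment question concrete: any service with an empty list needs a faster recovery method, and any service for which the slowest method is still viable does not need the most expensive infrastructure.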

Here are some best practices to achieve the highest level of availability while
keeping your costs under control:

1. Start with the local recovery infrastructure and reduce the level of dependency on tape
by leveraging newer technologies such as snapshots and continuous data
protection (CDP). This has three distinct benefits: dramatically improved
recovery time, elimination of the backup window, and reduced tape production
from daily to monthly backups. It’s important, though, to maintain the
secondary copy of the data separate from your primary infrastructure for
optimal availability and to assign the right resources to the right service as
per your service-level tiering. The result is a massive reduction of the tape
infrastructure and a much lower operating cost for both backup and recovery
operations.

2. When it comes to your remote copy, you should consider your needs and the frequency
of utilization of your remote infrastructure. Your needs in the case of a major
disaster are not the same as in normal operations. In many cases, an infrastructure that ensures
essential services is sufficient for DR (this is where that tiering model is
important, too). Also, the performance-level requirements may be reduced in a
disaster scenario, which should allow you to invest in a lower-cost
infrastructure at the remote site. And, as with tape-based DR systems, a big
portion of the cost of data replication is associated with data transport. A
WAN-optimized replication solution is essential to keep your DR costs down and
to allow you to invest your money strategically in the right areas.
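
To see why data transport dominates replication cost, it can help to estimate how long a day's changed data takes to cross the WAN link. The function below is a back-of-the-envelope sketch: it assumes decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second), ignores protocol overhead and link contention, and treats `reduction_ratio` as a hypothetical combined compression and de-duplication factor that you would have to measure for your own data.

```python
def transfer_hours(changed_gb, link_mbps, reduction_ratio=1.0):
    """Estimate hours needed to replicate changed data over a WAN link.

    changed_gb      -- data changed since the last replication cycle, in GB (10^9 bytes)
    link_mbps       -- usable link bandwidth in megabits per second
    reduction_ratio -- assumed data-reduction factor (4.0 means 4:1); 1.0 means none
    """
    if changed_gb < 0 or link_mbps <= 0 or reduction_ratio <= 0:
        raise ValueError("changed_gb must be >= 0; link_mbps and reduction_ratio must be > 0")
    bits_on_wire = changed_gb * 8 * 1e9 / reduction_ratio
    seconds = bits_on_wire / (link_mbps * 1e6)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical inputs: 100 GB of daily change over a 100 Mbps link.
    for ratio in (1.0, 4.0):
        hours = transfer_hours(100, 100, ratio)
        print(f"reduction {ratio}:1 -> {hours:.2f} hours")
```

Because the estimate scales inversely with both bandwidth and reduction ratio, a better reduction ratio has the same effect on replication time as buying a proportionally larger link.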

Ultimately, any local or remote DR solution needs to answer one question: how can I
restore my data and get my services back online as soon as possible? After all,
the highest cost of any DR scenario is the downtime and the loss of business associated with it.
About the Author:
Fadi Albatal is the vice president of marketing at FalconStor Software. With more
than 12 years in senior-level management in the IT market, Albatal has
substantial experience with large-scale storage systems.