Reducing the Cost of Disaster Recovery
By Fadi Albatal
In my last post on disaster recovery (DR) automation, I explored how to address the complexity and challenges of DR implementation and execution through the deployment and adoption of new technologies. In many cases, however, the question is not whether organizations have the ability to implement a complex DR solution – it is a question of being able to afford one.
A recently published survey by Enterprise Strategy Group (ESG) found that 74 percent of respondents could tolerate at most three hours of downtime before they start suffering revenue loss, and 54 percent of those surveyed could not tolerate even an hour of downtime. This indicates two things: first, our business processes are heavily dependent on IT, which is a good thing; and second, a DR implementation for the systems that drive and support business processes is crucial to the survival of an enterprise.
With today’s stricter criteria for downtime tolerance, we can already exclude and dismiss certain legacy DR strategies as nonviable. One such strategy that has been the backbone of any DR process for a long time is recovery from tape. Tape recovery times can be hours and even days in many cases, especially when it comes to partial or entire data center recovery. Although virtual tape libraries (VTL) and other disk-based de-duplication solutions have accelerated the backup and recovery process, data restoration performed with traditional backup software is an extremely complex and lengthy process that far exceeds the tolerance level of the 74 percent of respondents to the ESG survey.
To make the right decision on where to invest, let’s first categorize the recovery scenarios. In most cases, data loss is the result of human error or a software or hardware malfunction. Together, these causes account for almost 84 percent of failure and data-loss scenarios. In all these instances, a local recovery may be the best way to go. Remote recovery, however, remains the best response to natural disasters where the impact on primary data center operations is significant. The second tier of categorization is the services that you’re providing to your users. Since not all data is created equal, it’s important to identify what constitutes critical tier-one versus tier-two services and the different service level agreements (SLA) associated with these applications.
This categorization does not mean that you should neglect some services over others. There is often a correlation between the desired up-time of a certain application and the cost of the infrastructure supporting the service. Tiering your services based on SLA will help you distribute your investment accordingly and ensure optimal up-time for all your business-critical applications.
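The tiering idea above can be sketched in a few lines of Python. This is only an illustration, not a prescribed model: the one-hour and three-hour boundaries are assumptions that loosely echo the survey figures, and the `Service` class, the function names, and the sample services are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical tier boundaries, in hours of tolerable downtime.
# These are assumptions for illustration; set them from your own SLAs.
TIER_ONE_MAX_HOURS = 1.0
TIER_TWO_MAX_HOURS = 3.0


@dataclass
class Service:
    name: str
    max_downtime_hours: float  # downtime the service's SLA tolerates


def assign_tier(service: Service) -> int:
    """Return 1, 2, or 3 based on the downtime the SLA tolerates."""
    if service.max_downtime_hours <= TIER_ONE_MAX_HOURS:
        return 1
    if service.max_downtime_hours <= TIER_TWO_MAX_HOURS:
        return 2
    return 3


def group_by_tier(services):
    """Group service names by tier so investment can be planned per tier."""
    tiers = {1: [], 2: [], 3: []}
    for service in services:
        tiers[assign_tier(service)].append(service.name)
    return tiers


if __name__ == "__main__":
    # Made-up services and tolerances, purely for demonstration.
    catalog = [
        Service("order-processing", 0.5),
        Service("email", 2.0),
        Service("reporting", 24.0),
    ]
    print(group_by_tier(catalog))
```

Grouping services this way gives you a simple inventory to attach infrastructure budgets to, one tier at a time.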
Here are some best practices to achieve the highest level of availability while keeping your costs under control:
1. Start with the local recovery infrastructure and reduce the level of dependency on tape by leveraging newer technologies such as snapshots and continuous data protection (CDP). This has three distinct benefits: dramatically improved recovery time, elimination of the backup window, and reduced tape production from daily to monthly backups. It’s important, though, to maintain the secondary copy of the data separate from your primary infrastructure for optimal availability and to assign the right resources to the right service as per your service-level tiering. The result is a massive reduction of the tape infrastructure and a much lower operating cost for both backup and recovery operations.
2. When it comes to your remote copy, you should consider your needs and the frequency of utilization of your remote infrastructure. Your needs in the case of a major disaster are not the same as in normal operations. In many cases, an infrastructure that ensures essential services is sufficient for DR (this is where that tiering model is important, too). Also, the performance-level requirements may be reduced in a disaster scenario, which should allow you to invest in a lower-cost infrastructure at the remote site. And, as with tape-based DR systems, a big portion of the cost of data replication is associated with data transport. A WAN-optimized replication solution is essential to keep your DR costs down and to allow you to invest your money strategically in the right areas.
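To see why data transport weighs so heavily on replication cost, a rough back-of-the-envelope estimate helps. The sketch below is an assumption-laden illustration: the function name, its parameters, and the idea of a single `reduction_ratio` standing in for whatever WAN optimization achieves are all hypothetical simplifications, and it ignores protocol overhead and link contention.

```python
def replication_hours(changed_gb: float, link_mbps: float,
                      reduction_ratio: float = 1.0) -> float:
    """Estimate hours needed to ship changed data over a WAN link.

    changed_gb      -- data changed since the last replication, in decimal GB
    link_mbps       -- usable link bandwidth in megabits per second
    reduction_ratio -- assumed factor by which optimization shrinks the data
                       before transfer; 1.0 means no reduction
    """
    if link_mbps <= 0 or reduction_ratio <= 0:
        raise ValueError("link_mbps and reduction_ratio must be positive")
    # Convert decimal gigabytes to megabits, then apply the assumed reduction.
    megabits = changed_gb * 8 * 1000 / reduction_ratio
    seconds = megabits / link_mbps
    return seconds / 3600


if __name__ == "__main__":
    # Illustrative inputs only: 100 GB of daily change over a 100 Mbps link.
    print(replication_hours(100, 100))        # no optimization
    print(replication_hours(100, 100, 4.0))   # assumed 4x data reduction
```

Comparing the two results for your own change rate and link speed shows how much bandwidth, and therefore recurring cost, a given reduction ratio would save.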
Ultimately, any local or remote DR solution needs to answer one question: how can I restore my data and get my services back online as soon as possible? After all, the highest cost of any DR scenario is the downtime and the loss of business associated with it.
About the Author: Fadi Albatal is the vice president of marketing at FalconStor Software. With more than 12 years of senior level management in the IT market, Albatal has substantial experience with large-scale storage systems.