The digest of current topics on Continuous Processing Architectures. More than Business Continuity Planning.
BCP tells you how to recover from the effects of downtime.
CPA tells you how to avoid the effects of downtime.
In this issue:
Database Backup and Archiving
In my last editor’s message in the October issue of the Availability Digest, I asked for opinions on whether data replication eliminated the need for database backups. This led to a very active discussion on our LinkedIn Continuous Availability Forum thread. Thanks to all who contributed.
The answer was a resounding “You’d better back up!” In this issue, I summarize the various points made by our respondents and illustrate the reasons for backing up with several horror stories from our Never Again series. We report on one more pertinent Never Again experience in this issue – a major bank’s three-day outage of its online banking services because corruption of its primary database was replicated to its standby SAN.
We also note the difference between backing up near-term data on disk for fast recovery and archiving that data for long-term storage on disk or tape. We review an independent analyst’s cost study comparing archiving to disk versus archiving to tape following the retention period for virtual tape backups stored on disk. The study shows the impact of deduplication technologies.
If you have further views on this issue, please post them on our thread “Does Active/Active mean that you no longer have to do backups” in the LinkedIn Continuous Availability Forum. This is still an active and important topic.
Dr. Bill Highleyman, Managing Editor
JPMorgan Chase is a big bank. If its operations are compromised, millions of customers feel it. In September 2010, exactly this happened: for three days, the bank lost the online banking services used by over 16 million customers.
The problem occurred in a large Oracle database. The database is managed by an Oracle cluster. Mirrored EMC SANs provide the data storage. The database holds authentication data and user profiles for its online customers.
It appears that an Oracle bug corrupted key files in the authentication database. This corruption was dutifully replicated by the EMC SANs so that both the active and mirrored SANs were corrupted. With no authentication, the online applications became inaccessible. The bank had to restore the database from its backups.
This incident graphically illustrates the need for database backups. If the bank had relied only on its redundant database to provide data protection, it would have lost its entire authentication and user-profile database.
Data replication has become the standard way to keep the database of a standby system synchronized with its production system. In such an architecture, there are two copies of the database – one at the production system and one at the geographically remote standby system.
With two independent copies of the database, is there still any need to back it up to magnetic tape or virtual tape, especially since the replicated copy is only seconds or minutes old, not hours or days?
The fact is, a company would be foolish not to perform periodic backups. Data replication does not protect data. It protects system operations. It is backup that protects data. If the production database gets corrupted, or if a file or table is lost, data replication provides no protection. If there is a simultaneous failure of both the production and standby databases, data replication provides no protection. Only backup provides this protection.
Therefore, the database must be backed up. Near-term backups should be kept on disk for rapid recovery and reference. Long-term archiving of data should be on magnetic tape for economy.
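The distinction between replication and backup can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the class `ReplicatedDatabase`, its methods, and the sample record are hypothetical names invented for illustration, and real replication engines are far more involved. It shows that a corrupting write reaches the standby just as faithfully as a good one, and that only the point-in-time backup still holds the good data.

```python
import copy


class ReplicatedDatabase:
    """Toy model: every write to the primary is mirrored to the standby."""

    def __init__(self):
        self.primary = {}
        self.standby = {}
        self.backups = []  # point-in-time snapshots, oldest first

    def write(self, key, value):
        self.primary[key] = value
        # Replication copies good and bad writes alike.
        self.standby[key] = value

    def take_backup(self):
        # A backup is a frozen copy that later writes cannot touch.
        self.backups.append(copy.deepcopy(self.primary))

    def restore_latest_backup(self):
        snapshot = self.backups[-1]
        self.primary = copy.deepcopy(snapshot)
        self.standby = copy.deepcopy(snapshot)


db = ReplicatedDatabase()
db.write("alice", "valid-credentials")
db.take_backup()

# A software bug corrupts the record; replication mirrors the damage.
db.write("alice", "<corrupted>")
corrupted_everywhere = db.primary["alice"] == db.standby["alice"] == "<corrupted>"

# Only the backup still holds the good data.
db.restore_latest_backup()
recovered = db.primary["alice"]
print(corrupted_everywhere, recovered)
```

Note that anything written after the last backup is lost on restore, which is why backup frequency matters as much as having a backup at all.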
Users are not the only ones interested in the availability of a system. System operators and management are interested as well.
Users will typically be interested in recovery time (MTR). The faster the recovery time, the less impact an outage has on them. System operators will typically be more interested in failure intervals (MTBF) since that defines their workload in terms of component repairs (though fast recovery helps them by minimizing the stress to get a component fixed). Management is perhaps most interested in total system downtime as that represents a cost to the enterprise.
We have talked a lot about recovery times and the probability that a system will be up (its availability). But how do we calculate the MTBF of a system made up of several critical components, each with its own MTBF? We explore that question in this article.
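The article's own derivation is not reproduced in this summary. As a rough sketch, the textbook result for the simplest case can be computed as follows, assuming that the components fail independently at constant rates and that the failure of any one critical component takes the system down. Under those assumptions the failure rates add, so the system MTBF is the reciprocal of the sum of the reciprocals of the component MTBFs. The function names and the sample figures are hypothetical.

```python
def series_mtbf(component_mtbfs):
    """System MTBF when any single component failure causes an outage.

    Assumes independent components with constant failure rates,
    so the system failure rate is the sum of the component rates.
    """
    failure_rate = sum(1.0 / mtbf for mtbf in component_mtbfs)
    return 1.0 / failure_rate


def availability(mtbf, mtr):
    """Fraction of time the system is up, given MTBF and recovery time (MTR)."""
    return mtbf / (mtbf + mtr)


# Hypothetical component MTBFs in hours (illustrative values only).
mtbfs_hours = [10_000.0, 20_000.0, 20_000.0]
system_mtbf = series_mtbf(mtbfs_hours)

# Hypothetical four-hour recovery time.
system_availability = availability(system_mtbf, 4.0)
print(system_mtbf, system_availability)
```

The system MTBF is always shorter than that of its least reliable component, which is why operators feel every critical component that is added. Redundant configurations, in which a backup component takes over on failure, need a different calculation.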
There are two needs for backing up a database – fast recovery in the event of a partial or full database loss and long-term archiving. Fast recovery can only be achieved with disk-based backup. Disk backup can reduce restoration time for a full database from days to hours, and it can also reduce from days to hours the amount of data lost.
However, the cost of fast recovery afforded by disk may not be justified for long-term archiving of data. The costs of archiving data can be significantly reduced by using a disk-to-disk-to-tape (D2D2T) strategy for backup and archiving. Recent backups of data are stored on disk for fast recovery, but older data is archived to tape for economy.
A study by The Clipper Group shows a 23:1 cost savings by archiving to tape rather than to disk. Furthermore, energy consumption is reduced by a factor of 290.
Using only disk or only tape may not address all of an organization’s goals. Disk and tape in a tiered D2D2T solution provide complementary value in meeting a company’s recovery, data-protection, compliance, energy, and TCO objectives.
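The tiering rule behind D2D2T described above can be sketched in a few lines. This is a minimal illustration rather than a description of any product: the function `storage_tier` is a hypothetical name, and the 30-day disk retention window is an assumed value, since neither the summary nor the study cited here specifies one.

```python
from datetime import date, timedelta

# Assumed retention window for disk-based backups (illustrative only).
DISK_RETENTION_DAYS = 30


def storage_tier(backup_date, today, disk_retention_days=DISK_RETENTION_DAYS):
    """Return where a backup should live under a simple D2D2T policy."""
    age_days = (today - backup_date).days
    # Recent backups stay on disk for fast recovery; older ones go to tape.
    return "disk" if age_days <= disk_retention_days else "tape"


today = date(2010, 11, 1)
backup_dates = [today - timedelta(days=d) for d in (1, 7, 45, 365)]
tiers = [storage_tier(b, today) for b in backup_dates]
print(tiers)
```

In practice the retention window would be set from the organization's recovery-time needs on one side and the cost of disk capacity on the other.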
Sign up for your free subscription at http://www.availabilitydigest.com/signups.htm
The Availability Digest is published monthly. It may be distributed freely. Please pass it on to an associate.
Managing Editor - Dr. Bill Highleyman firstname.lastname@example.org.
© 2010 Sombers Associates, Inc., and W. H. Highleyman