Tuesday, January 06, 2009

The importance of proper BACKUP!

I came across a techcrunch.com article today. A blog site called journalspace.com has been completely wiped out after a few years of operation, simply because an ex-IT person deliberately overwrote all the data on the SQL server. And guess what, they were only using RAID mirrored drives as 'backup'.

http://www.techcrunch.com/2009/01/03/journalspace-drama-all-data-lost-without-backup-company-deadpooled/

This might be one of the extreme cases, but it's certainly a wake-up call to the many companies that have given up traditional tape backup and rely solely on standby and replication technologies.

In the case of Oracle databases, I know of a famous financial website that didn't back up its RAC production servers. They have set up multiple Data Guard standby servers and replicate data remotely to off-site servers. They even set up a two-day delayed log apply mechanism to guard against bad data contamination. But is that enough? Well, it seems to cover pretty much all potential hardware and system failures. In most events they can bring the production servers back relatively quickly, without a painfully slow tape restore. Cool, huh?

But they overlooked one of the most common, and often the most deadly, forms of system failure: human error, whether accidental or malicious.
What if a developer accidentally introduced an application bug into the system, it updated some records, and nobody noticed until two days later? Of course you can say, let's increase the log apply delay to 7 days. Hmm, what if you didn't find the bug until 8 days later? You can't increase the log apply delay indefinitely. Besides, this particular website has millions of users doing thousands of online transactions every second. It's not hard to imagine the cost of keeping all the transaction logs for many days.
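
To put rough numbers on that trade-off, here is a small Python sketch. It is only a back-of-the-envelope model, not anything taken from the website in question: the function names are made up for illustration, and the 1,000 bytes of log per transaction is purely an assumed figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def standby_can_recover(apply_delay_days, detection_days):
    """Return True if a bad change is caught before the delayed standby applies it.

    Simplified model: the standby applies each log apply_delay_days after it
    was generated, so a problem found later than that is already on the standby.
    """
    return detection_days < apply_delay_days


def log_storage_gb(tx_per_second, log_bytes_per_tx, retention_days):
    """Estimate the log volume (in GB) that must be kept to cover the delay window.

    log_bytes_per_tx is an assumed average, not a measured value.
    """
    total_bytes = tx_per_second * log_bytes_per_tx * retention_days * SECONDS_PER_DAY
    return total_bytes / 1e9


if __name__ == "__main__":
    # A 7-day delay still loses to a bug that is found on day 8.
    for delay, found in [(2, 1), (2, 2), (7, 8)]:
        print(delay, found, standby_can_recover(delay, found))

    # Hypothetical workload: 1,000 transactions/s, 1,000 bytes of log each.
    for days in (2, 7, 30):
        print(days, "days:", log_storage_gb(1000, 1000, days), "GB")
```

The point is not the exact figures but the shape: the protection window and the log volume both grow linearly with the delay, and no finite delay covers an error that is found after it.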

Even now, tape backup is still the most cost-effective method for massive, long-term backup. Many modern technologies such as Flashback Database, Data Guard, RAC, replication and storage snapshots have been introduced in the last few years to help ease the DBA's burden of database recovery. But so far they can only cover database failures over a matter of days; they will not completely replace tape backup anytime soon.

1 comment:

Yingkuan Liu said...

I can't stress enough how important proper backup is, even with all these modern High Availability technologies.

The latest victim is a subsidiary of software giant Microsoft. No matter how unbelievable it sounds, the truth is that many companies are not making proper backups to protect their data, under the misconception that modern HA methods can protect everything.

http://www.techcrunch.com/2009/10/10/t-mobile-sidekick-disaster-microsofts-servers-crashed-and-they-dont-have-a-backup/