Disaster Recovery

Preparations Aren’t Enough
20 Sep

Considering my recent post on technology failure preparation, it was quite ironic that one of my clients was the victim of a system failure today. This incident shows that even the best-laid plans go awry if someone’s not watching carefully.

It all started with a call first thing this morning. One of those “I can’t believe I just did this… can you help us” calls. This organization uses a SharePoint intranet, and there are a couple of very proficient super users who have full control access to the site. But mistakes happen, and they accidentally deleted one of the departmental sites.

Windows SharePoint Services 3.0 (WSS) has a very useful new feature appropriately called the Recycle Bin. Just like the Recycle Bin on your desktop, it exists to save your hide when you delete something prematurely. Documents, list items, images. Most everything you delete from WSS goes into the Recycle Bin. Most everything, of course, except sites.

So we couldn’t just undo the mistake. But no problem, their Managed Service Provider (MSP) had scheduled daily backups of all their servers. Get the tape, grab the SQL database files and do the restore. We’d only lose a few hours of information assuming the backup ran successfully last night.

Oh man. The database files haven’t been backing up because the backup software couldn’t get exclusive rights to the database. Not a single backup over the past 4 months.

We explored restoring from transaction logs but that wasn’t going to work either. Eventually we just realized that the site was going to have to be rebuilt.

So what went wrong? Well, the first thing was the action to delete the site. WSS gives you a confirmation page to make sure you really want to delete the site. But if you’re busy and not reading closely sometimes even that’s not enough. I don’t think you can easily turn off the ability to delete sites. If you can build them, you can delete them.

Next, as the primary WSS support for the client, I should have been more active in ensuring the backup process was working. I assumed the tape backup would be sufficient, when I should have verified that the applications I support are fully covered. I’ve got a SharePoint backup scheduled now.

Finally, the MSP reports monthly on overall network performance. They’ve been reporting that a full backup has not been successful since June. Only a few files were failing to back up, so it was never treated as a major concern. Now they know that some of those files are very important. I’m sure they won’t be glossing past the errors any longer.
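One way to keep a quiet failure like this from sliding by for months is a small check that complains loudly whenever an expected backup file is missing, empty, or older than it should be. Here is a minimal sketch in Python. It is not what the client or the MSP actually runs; the function name, the file names, and the 26-hour threshold are all assumptions I made up for illustration.

```python
import os
import time
from pathlib import Path


def find_stale_backups(backup_dir, expected_names, max_age_hours=26, now=None):
    """Return (name, reason) pairs for each expected backup file that is
    missing, empty, or older than max_age_hours.

    All names and the 26-hour default are hypothetical; adjust them to
    match your own backup schedule.
    """
    now = time.time() if now is None else now
    problems = []
    for name in expected_names:
        path = Path(backup_dir) / name
        if not path.is_file():
            problems.append((name, "missing"))
            continue
        info = path.stat()
        if info.st_size == 0:
            # A zero-byte file usually means the job started but wrote nothing.
            problems.append((name, "empty"))
            continue
        age_hours = (now - info.st_mtime) / 3600
        if age_hours > max_age_hours:
            problems.append((name, f"stale ({age_hours:.0f}h old)"))
    return problems
```

Run something like this every morning against the folder where the database backups land, and send an email or page if the returned list is not empty. The point is that "a few files didn't back up" becomes a named list that someone has to look at, rather than one line in a monthly report.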

No fewer than 4 of the 16 tips I posted earlier this week were relevant and implemented in this situation. I only wish those preparations could have prevented this unfortunate incident.
