With 2010 and New Year’s resolutions just around the corner, Tiger Woods isn’t the only person who should be reviewing his disaster recovery strategy (see related New York Times article, Woods Is Silent as Spin Takes On Life of Its Own). Recently, I’ve met with several SAP BusinessObjects customers and am concerned about the verbal responses (and corresponding facial gestures) that I get when I ask about disaster recovery and business continuity.
Let’s review the basics. “Classic” BusinessObjects required the backup of only a single repository database. Starting with the XI R1 platform (and continuing through BI 4.0/4.1), the system has multiple components that require backup: the system database (also known as the CMS database), an optional audit database, and two file stores managed by the Input File Repository Server (iFRS) and Output File Repository Server (oFRS).
Similar to the “classic” (pre-XI) BusinessObjects repository database, the system database in XI (sometimes referred to as the CMS database) stores metadata about users and groups, folders, reports, and universes. However, unlike “classic” BusinessObjects, reports and universes are no longer stored as BLOBs in a relational database. Instead, the relational database contains pointers to report and universe files that reside on the iFRS and oFRS, which are file system directory structures. A proper backup captures the system database and a full copy of the input and output file repository server directories at the same point in time. All three items should be treated as a single entity and backed up together during a period of system inactivity. If the database backup and the file system backup occur at different times, a system restored from these backups may not have all of the required information, because the database can point to files that the file store backup does not contain. If your organization is using the auditing feature, the auditing database should be included in the backup and restore process. However, even though it is important, the audit database is not critical to system operation.
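To make the "single entity" idea concrete, here is a minimal Python sketch that bundles a system database dump and both file store directories into one archive. It is not an SAP-provided procedure. The function name, archive naming, and directory labels are all hypothetical, and it assumes the BI services are already stopped and that the database dump was produced separately with your database vendor's own backup tool.

```python
import tarfile
import time
from pathlib import Path


def backup_as_unit(db_dump, input_frs, output_frs, dest_dir, label=None):
    """Bundle a system database dump and both file stores into one archive.

    Assumptions (hypothetical, for illustration only): the BI services are
    stopped, and db_dump is a file already written by the database vendor's
    backup tool.
    """
    label = label or time.strftime("%Y%m%d-%H%M%S")
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive_path = dest_dir / f"bi-backup-{label}.tar.gz"

    sources = {
        "system_db": Path(db_dump),
        "input_frs": Path(input_frs),
        "output_frs": Path(output_frs),
    }

    # Refuse to write a partial backup: all three components must be present.
    missing = [name for name, path in sources.items() if not path.exists()]
    if missing:
        raise FileNotFoundError(f"missing backup components: {missing}")

    with tarfile.open(archive_path, "w:gz") as archive:
        for name, path in sources.items():
            # Directories are added recursively under the given archive name.
            archive.add(str(path), arcname=name)
    return archive_path
```

Keeping the three components in one archive means they are restored together or not at all, so a database dump can't quietly be paired with file stores from a different night.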
I frequently hear that organizations do not view their BI systems as “business critical” and therefore not subject to the same scrutiny as other IT systems in the enterprise. But in addition to taking proper backups, it is imperative to test the restoration process. To test the backup, restore the system database and two file stores on an isolated server and confirm that the recovered environment is viable.
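One way to check that a restored environment is viable is to compare the file paths the restored system database points to against what is actually on disk in the restored file stores. The Python sketch below assumes you have already extracted that list of relative paths from the database. How to do so is platform-specific and not shown here, and the function name and result keys are hypothetical.

```python
from pathlib import Path


def check_restore(referenced, frs_root):
    """Compare paths referenced by the restored database with files on disk.

    `referenced` is an iterable of paths relative to `frs_root`. Extracting
    that list from the system database is platform-specific and is assumed
    to have been done elsewhere.
    """
    frs_root = Path(frs_root)
    expected = {Path(p).as_posix() for p in referenced}
    on_disk = {
        p.relative_to(frs_root).as_posix()
        for p in frs_root.rglob("*")
        if p.is_file()
    }
    return {
        # Database points at a file that the file store backup lacks.
        "missing": sorted(expected - on_disk),
        # File exists on disk but nothing in the database refers to it.
        "orphaned": sorted(on_disk - expected),
    }
```

A non-empty "missing" list means the restored system would have reports or universes it cannot open, which is the kind of gap that backups taken at different times can produce.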
There are some additional nuances here that I haven’t included for the sake of brevity. But I hope that you’ll take the time to review your business continuity plans while updating your personal career goals for 2010. And be careful when parking your Cadillac Escalade.
Great post, Dallas. I know I owe you one about VMware and Macs; it’s still on my list!
I’d definitely agree that a backup and DR strategy is worth putting in place. Organisations tend to underestimate the importance of their business intelligence and reporting systems – reports that arrive every week start to become critical over time, and if one day they stop, it can seriously affect business processes.
Cheers, Josh
I am very glad to see that you included a test restore as part of your post. All too often I hear stories that start out with, “… but we had backups” only to find out that the backup strategy (or media or something) was flawed. If you can’t restore your backup, you might as well not have one.
Hi Dallas – I wonder what/who the inspiration for this blog was…
I was extremely bothered by the statement Dallas made about customers’ lack of focus on backup and disaster recovery. Unfortunately, I have had to put our DR plan to work. You cannot stress enough the timing of the DB and file repository backups. As Dallas said, they must be taken at the same time and be completely in sync, preferably when the environment is shut down.
Two additional points:
– Redundant hardware (identical to production) is required. I have needed to fail over to my QA/Test cluster, pointing at our Production CMS DB and file repositories.
– Another very helpful tip is to put a tested plan in place to repoint your Production URL to the backup environment, so that you do not need to pass out a new URL during a disaster.
What types of issues should we look out for if our backups are not performed in sync? We have the FRS on NAS and the BO DB on Oracle, and backups are handled by different jobs at different times. What are our risks?