Baby, you can drive my CAR (Continuity, availability, recovery)
Organisations are now almost entirely dependent on their IT platforms – a mix of applications and data that need to be highly available to allow the organisation to carry on its day-to-day activities. Historically, the focus has been on disaster recovery as a safety net for any failure in the IT platform, using backup and restore as the primary means of recovering data. However, a full restore used to take days, and many vendors in the space built their markets on compressing this time down to less than one working day, and then down to just a few hours.
It has to be remembered that while service is suspended, particularly in areas such as e-commerce, financial trading or supply chain management, the organisation could be losing significant amounts of money every minute the systems are unavailable. This has forced a move towards investigating business continuity – but this has its own set of problems.
The idea with business continuity is that, as far as possible, when a component of the IT platform fails, the business retains some level of capability to continue operating. This involves ensuring that all applications and their data, along with the required connectivity, remain available. For high levels of business continuity, not only does everything have to be architected so that every component is backed up by at least one similar component, but this redundancy also has to be managed across multiple geographic zones, so that a major issue in one geography (e.g. floods or an earthquake) can be handled by failing over to a different geography.
Full business continuity, where the organisation owns all its own infrastructure, is therefore out of reach of the majority due to its high expense.
An organisation should therefore look to create a strategy that provides an optimal approach to systems and data availability, balancing the constraints of budget and corporate risk. Through a strategy that mixes continuity, availability and restore approaches, along with the use of cloud-based services, a cost-effective business solution can be put in place.
Firstly, an organisation needs to decide where business continuity is mandatory. This can be determined by identifying which business processes are so important that they cannot be allowed to fail. The applications and data that facilitate these processes will need to be fully available, and a highly virtualised internal or external cloud environment with shared resources may give much of what an organisation is looking for in this space, particularly if the business decides that it can carry the risk of an outside chance of a catastrophic problem, such as an earthquake or flood.
The next aspect to look at is the availability of the underlying data. In many cases, securing the availability of the application will be easier than doing the same for the underlying data – which may, for example, have been accidentally deleted by a user. Here, granular data availability – where single or multiple files can be pulled back from a secondary store by administrators, or accessed directly by end users – enables rapid recovery from such accidental deletion.
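As a minimal sketch of what such granular recovery can look like, the Python fragment below pulls a single file back from a point-in-time snapshot held on a secondary store. The directory layout, paths and function names are assumptions for illustration only, not a reference to any particular product.

```python
import shutil
from pathlib import Path

# Assumed layout: the secondary store keeps point-in-time snapshots under
# /mnt/secondary/snapshots/<timestamp>/, mirroring the primary tree.
SNAPSHOT_ROOT = Path("/mnt/secondary/snapshots")

def restore_file(relative_path: str, snapshot: str, primary_root: str = "/data") -> Path:
    """Pull one file back from a named snapshot into the primary store."""
    source = SNAPSHOT_ROOT / snapshot / relative_path
    target = Path(primary_root) / relative_path
    if not source.is_file():
        raise FileNotFoundError(f"{relative_path} is not in snapshot {snapshot}")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)  # copy2 preserves timestamps and permissions
    return target

# Example: recover one accidentally deleted file from last night's snapshot.
# restore_file("reports/q2-forecast.xlsx", "2012-06-18T02:00")
```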
It may be that the problem is caused by data corruption, leaving the application with no valid data to work against. In this case, being able to fail over elegantly from one dataset to an alternative “live” dataset provides a form of business continuity, with downtime measurable in seconds.
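As an illustrative sketch only – the file paths and the use of SQLite are assumptions, and a production system would lean on the replication features of its own database – the following shows the shape of such a failover: try the primary dataset, check its integrity, and drop to the replica if the check fails.

```python
import sqlite3

# Assumed pair of data stores: a primary plus a continuously replicated
# alternative "live" copy. Both paths are illustrative.
DATASETS = ["/data/orders-primary.db", "/data/orders-replica.db"]

def open_first_valid(paths=DATASETS):
    """Return a connection to the first dataset that passes an integrity check."""
    for path in paths:
        try:
            conn = sqlite3.connect(path)
            # integrity_check catches many forms of corruption; a real
            # system would add application-level validation on top.
            row = conn.execute("PRAGMA integrity_check").fetchone()
            if row and row[0] == "ok":
                return conn
            conn.close()
        except sqlite3.Error:
            continue  # corrupt or unreachable: try the next dataset
    raise RuntimeError("no valid dataset available")
```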
However, this may not always be possible, and a full data backup and recovery strategy will be required. Datasets that have been backed up to a reasonably current time can be recovered quickly, minimising downtime and enabling the organisation to get back up and running to a known point in time, known as the recovery point objective (RPO). Even where the application and the data fail together, fully functional images can be recovered that include the application as well as the data.
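A worked example makes the RPO concrete: if backups complete every four hours, the worst case is that everything written since the last completed backup is lost. The timestamps below are purely illustrative.

```python
from datetime import datetime

# Illustrative schedule: backups completed at 00:00, 04:00, 08:00 and 12:00,
# with a failure striking at 14:30 the same day.
backup_times = [datetime(2012, 6, 18, h) for h in (0, 4, 8, 12)]
failure_time = datetime(2012, 6, 18, 14, 30)

# The recovery point is the most recent backup completed before the failure;
# everything written after it is lost, so the RPO achieved here is 2h30m.
last_good = max(t for t in backup_times if t <= failure_time)
print(f"Recovering to {last_good}; up to {failure_time - last_good} of data is lost")
# -> Recovering to 2012-06-18 12:00:00; up to 2:30:00 of data is lost
```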
Using backup and restore is not the same as business continuity, as some downtime will be required to regain full functionality. However, through the use of tools such as data deduplication to minimise data volumes, regular snapshotting of changes to data stores to minimise incremental backup sizes, and wide area network (WAN) acceleration, full restores can now be completed in minutes or hours – a timescale that can be agreed with the business as the recovery time objective (RTO).
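The principle behind deduplication is simple enough to sketch in a few lines: split the data into chunks, hash each chunk, and store or transfer only the chunks not already held. The fixed-size chunking below is an assumption made for brevity; commercial products typically use variable-length chunking, but the idea is the same.

```python
import hashlib

def dedup_store(data: bytes, store: dict, chunk_size: int = 4096) -> list:
    """Split data into chunks and keep only one copy of each unique chunk.

    Returns the ordered list of chunk hashes (the "recipe") needed to rebuild
    the data. Only chunks whose hash is not already in the store have to be
    kept or sent, which is what shrinks backup volumes and WAN traffic.
    """
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # new chunks only; repeats cost nothing
        recipe.append(digest)
    return recipe

def reassemble(recipe: list, store: dict) -> bytes:
    """Rebuild the original byte stream from its chunk recipe."""
    return b"".join(store[digest] for digest in recipe)
```

Restoring is then just a matter of walking the recipe – which is also why the deduplicated store itself must be protected as carefully as the primary data.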
The last aspect is the need for long-term storage of information assets, whether for business or governance reasons. This will need an archive strategy to back up the continuity, availability and recovery plans, and could involve vaulting assets to an external provider for long-term management, or an ongoing in-house plan for rolling data from one storage medium to another. Beware the belief that long-term storage can be done on a single type of storage medium: if the information has to be kept for just a 10-year period, think back to 2002 and try to remember what media were in use at that point – essentially all of it is already obsolete.
Like a great many other decisions for the CIO, this one is always going to be a trade-off. Different processes and activities vary in operational and financial significance, so the optimal solution in terms of protection versus expenditure requires careful analysis – and the CIO will need a clear mandate from the board, because not all processes are equal.
A corporate information availability strategy needs a blended approach. The organisation will need some aspects of business continuity but, unless it has exceedingly deep pockets, it will also need to address how information availability and recovery can deliver a low-risk, cost-effective strategy that gives the business an optimal platform.
Quocirca’s free report on the subject can be downloaded here.
Clive Longbottom, Service Director, Business Process Analysis, Quocirca