Edward Roozenburg: Business continuity is more than just IT

This column was originally written in Dutch. This is an English translation.

By Edward Roozenburg, Senior Risk Management Consultant, Probability & Partners

When a cyber incident occurs, the first questions directors often ask are: ‘Is everything still working? Can customers log in? Can payments go through? Are the core applications available?’ These are logical questions. But they are not enough.

After all, business continuity is not just about systems. It is about whether an organisation can continue to fulfil its promises to customers, policyholders or participants when under pressure. That question goes beyond IT and is often more difficult to answer.

Take a fictional insurer. Let’s call it BestAssurance.

BestAssurance has its IT in order. Policy administration is running smoothly. The claims platform is available. The customer portal is working. The telephone system is operational. The disaster recovery environment has been tested. Reports state that the key processes are safeguarded. On paper, business continuity appears to be well organised.

Then an incident occurs. No ransomware. No system failure. No encrypted files. An external service provider that handles customer communications turns out to have been the victim of a data breach. The data involved includes names, contact details, policy numbers and some customer attributes. Legally, the incident is manageable. Technically, there is no disruption. All of BestAssurance’s systems continue to run.

But within a few hours, the first enquiries start coming in. Customers want to know if they’ve been affected. Some ask exactly what data has been leaked. Others, furious, want to cancel their policies. Customer service manages to respond effectively on the first day. After that, the workload builds up rapidly. Not because staff aren’t doing their jobs, but because routine work is being put on hold whilst incident-related tasks are added to the mix.

New claims are still being registered, but the assessment process takes longer. Not because of a system issue, but due to a lack of capacity. The communications team is working on customer letters and internal instructions. The legal department is reviewing the wording. Compliance is requesting evidence. Risk management wants a factual account. The management team wants updates. The regulator is asking further questions. The external service provider is not yet able to provide all the details.

Meanwhile, the systems are still functioning. The IT dashboard is showing green.

The customer contact backlog has now doubled. Complaints are taking longer to resolve. Staff are fielding calls from angry customers and are becoming more cautious in their responses. As a result, calls are taking longer. Waiting times are increasing further. Posts are appearing on social media claiming that BestAssurance’s communications are too slow and too vague. A consumer programme is asking for a response.

The problem is not that IT is faltering. Nor is the problem that BestAssurance lacks a business continuity plan. The problem is that the plan was primarily written with technical disruptions in mind. It contains scenarios for the failure of the claims platform, the unavailability of the office and problems with the cloud provider. It specifies who the crisis manager is and which applications take priority during recovery.

But the plan provides hardly any answers to other questions. And this happens to organisations in the real world too. In practice, Business Impact Analyses (BIAs) and business continuity plans do a very good job of managing the activities needed to keep systems running. Other questions are not addressed, or only to a limited extent: which processes become critical if customer demand suddenly triples, which activities may be temporarily scaled back, which staff can be redeployed without creating new risks, and when reputational damage becomes a continuity risk. It is precisely these issues that go to the heart of what actually happens when an organisation comes under pressure.

An incident does not need to bring a process to a standstill to disrupt the organisation. It may be enough for customer contact, communication, legal assessment and management decision-making to all come under pressure at the same time.

At BestAssurance, therefore, the real threat to business continuity does not originate on the server. It arises in the work queue. In the backlog. In the gap between what customers expect and what the organisation can reliably deliver at that moment. And in reputational damage that arises because the organisation is functioning but is not acting visibly enough.

This calls for a different approach to BIA and business continuity planning. Not just: which application supports which process? But also: what promise do we make to customers? What capacity is needed to fulfil that promise under pressure? Which external dependencies could cause an incident without our own systems failing? What signs tell us that continuity is under pressure, before the board notices it through complaints, media reports or regulatory scrutiny?

Of course, the classic continuity question remains relevant: how quickly can we recover? But this must be accompanied by a second question: how long can we continue to operate responsibly if everything is working technically, but the organisation becomes overburdened in terms of governance, operations and communication?

Financial institutions are not judged solely on the availability of their systems. They are judged on trust, diligence and predictability when the going gets tough. Business continuity therefore does not end with the observation that the systems are available. That is only where it begins.