CrowdStrike shows the systemic risk of being dependent on a single major provider

What might be the longer term impact of Friday's outages?


Alina Timofeeva, currently a Board Member for the British Computer Society, The Chartered Institute for IT, and a strategic advisor in data and technology to the C-suite of major financial services organisations, shares her thoughts on the impact of the outages in areas such as crisis communications and regulation.

The global IT outages that occurred last week will have a lasting and far-reaching impact, well beyond the initial chaos that the CrowdStrike update caused.

We saw with the outage that with technologies come risks. And customers may not necessarily be fully aware of what they are exposed to. The global ripple effect of the outage illustrates the interconnectivity across the supply chain and risk concentration in this market.

Alina Timofeeva

Software vendors like CrowdStrike have become so large and so interconnected that their failures can damage the global economic system and harm tens of millions of customers.

It is key that companies, but also governments and the regulatory ecosystem, are more mindful of, and perhaps more concerned about, the systemic risk of being dependent on a single major provider.

Last week it was CrowdStrike and Microsoft. In the future it could be cloud giants like Amazon, Microsoft or Google that fail, and this would impact tens of millions of customers.

From a government perspective, we need to start monitoring in detail the impact of this interconnected state, and identify future events that could start small but become much bigger. This would help us build the nation's resilience and ability to respond to similar events.

Digital trust & communication 

Building and restoring digital trust is key. I would define digital trust as a confident relationship with the unknown. Further enhancements to technology, third-party management and operational resilience, driven by a combination of existing and incoming regulations such as DORA, which comes into effect in 2025, can help ensure that the future of financial services and products is cost-efficient but also safe and secure.

I do feel that the real disruption happening isn't technological. At its core is empowerment: empowering us as customers to navigate change and uncertainty in an agile and safe way. Communication is key here to maintain transparency and trust.

A proportionate response to Friday's outage would include clear, transparent and timely communication, both externally with customers and internally with employees, covering the impact the outage had on the organisation, the material services affected, the key steps being taken to restore those services, and clear timelines for when they will be restored.

However, I believe that the communication should not stop there. More strategically, what are the steps being taken by the company to ensure that this situation does not impact or harm the employees and customers in future, so that trust is restored longer term?

I would anticipate that, after the crisis communication, strategic communication over the next few days will be necessary around what will happen, not only at company level but also at the government and regulatory level, to mitigate these risks in future.

Regulation 

Friday's events should be a wake-up call for employers to invest in tried and tested disaster recovery plans. In many cases, existing plans were exposed as more of a paper-based exercise than something that had been tried and tested at scale across the key simulation scenarios, including extremely unlikely crisis scenarios, just like this one.

I believe that there will be a much bigger focus from regulators on operational resilience, holistically across data, technology, people and processes. 

To ensure a proportionate response, material services (e.g. payroll or making payments) would need to be prioritised. Whilst there is already a focus on DORA implementation for some industries by the 2025 deadline, it doesn't apply to all sectors and there will be questions about whether it is enough.

I would anticipate a bigger push from regulators to mitigate concentration risk, not only within companies but also at the level of the providers of material services. I anticipate both tighter regulations and increased scrutiny from regulators of companies that choose to prioritise cost and efficiency over the safety and security of their operations and the potential harm to customers.

Whilst I am not anticipating new regulations being developed on the back of this situation in addition to existing ones, I am anticipating greater adherence to existing ones, including DORA and the FCA and EBA guidelines among others, and regulators being more stringent. Currently compliance varies, and I would anticipate that in conversations with regulators, companies will want to demonstrate at least full Level 3 compliance with the key risks and existing or proposed control frameworks, rather than treating this as a checkbox exercise.

Friday's outage wasn't just about a technology update that went terribly wrong. It was about the impact it had both internally on operations and externally on customers. From the regulators' perspective, at the company level the key questions to answer would be:

  • In the event of an issue, what do you do to restore material services fast and restore the trust of your customers? 
  • Do you have a comprehensive view of your material services? 
  • Do you have a view of all the data, technology, people and processes that would be key to restoring these services fast? 
  • Do you have tried and tested recovery plans, including crisis communications both internally and externally with customers? 
  • How fast can you access all of these and stand this up?

 

 

 
