What We Can Learn From the CrowdStrike and Microsoft Outage

Friday 19th July saw global chaos as CrowdStrike and Microsoft suffered a huge technological failure. Deemed the most serious IT outage the world has ever seen, it affected everything from airlines and banks to GP surgeries and television companies.

This blog will unpack what happened, how it happened and what we can learn from it.  

What happened?

At 18:00 on Thursday 18th July, the first reports emerged of leading banks, media outlets and airlines in the US suffering major IT outages. Microsoft acknowledged the issues and said it was taking mitigating actions.

By Friday morning, major organisations such as the London Stock Exchange were reporting serious IT issues. By 08:00 the impact was global: trains were cancelled, Sky News was temporarily off air and flights suffered huge delays and cancellations, with US airlines issuing a global ground stop on all their flights. Some shops also suffered card machine issues, forcing them to close or operate on a cash-only basis. The state of Alaska reported phone line issues which meant people couldn’t contact 911, and some GPs were unable to access patient records, which halted operations.

Appointments were cancelled, Delhi airport switched to manual processes, operations were cancelled, TV broadcasts came off air and more than 5,000 flights were cancelled worldwide. Whilst the issue was ‘fixed’ within 24 hours, the impact was lasting and took days to recover from.

How did it happen?

Quite early on, attention turned to CrowdStrike, a Texas-based cyber security firm founded in 2011 with the aim of safeguarding the world’s biggest companies and their hardware from cyber threats and vulnerabilities. It specialises in endpoint security protection and tries to keep malicious software out of corporate networks.

Microsoft attributed the problem to third-party software, stating that a software update had been issued which caused damage to Microsoft Windows devices. Many users reported a blue screen, a critical error screen displayed on devices that indicates a system crash.

George Kurtz, CEO of CrowdStrike, issued a statement on Friday lunchtime in which he began by sincerely apologising to all those affected by the outage and explained that the issue began with a ‘defect found in a single content update for Windows hosts’. He clarified that this was not a security incident or cyber attack, and that the issue had been identified and isolated and a fix had been deployed.

So far there are no reports of anything malicious, or of any data being compromised, accessed or stolen. Organisations and individuals are encouraged to keep on top of software updates and report anything unusual.

Whilst the impact on all those affected was huge, most of it was resolved before the weekend was over.

What can we learn from this?

As with all mistakes, there are lessons to take. The first is that the outage highlighted the world’s reliance on technology: many services and sectors ground to a halt as a result.

It also showed that IT systems are a complex web of interdependencies. An issue in one area can quickly lead to issues in other areas.
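
The way a single fault spreads through dependent systems can be sketched in a few lines of Python. This is an illustration only: the service names and the dependency map below are hypothetical, invented for the example rather than taken from any real organisation's systems. The sketch simply follows the chain of "who relies on what" outwards from the point of failure.

```python
from collections import deque

# Hypothetical dependency map (an assumption for illustration):
# each service lists the things it relies on to keep working.
DEPENDS_ON = {
    "windows_hosts": ["security_update"],
    "airline_check_in": ["windows_hosts"],
    "card_payments": ["windows_hosts"],
    "patient_records": ["windows_hosts"],
    "flight_departures": ["airline_check_in"],
    "cash_only_till": [],
}


def affected_by(failed, depends_on):
    """Return every service that directly or indirectly relies on `failed`."""
    # Invert the map so we can ask "what relies on X?" instead of
    # "what does X rely on?".
    dependants = {}
    for service, requirements in depends_on.items():
        for requirement in requirements:
            dependants.setdefault(requirement, []).append(service)

    # Breadth-first walk outwards from the failed component.
    affected = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for service in dependants.get(current, []):
            if service not in affected:
                affected.add(service)
                queue.append(service)
    return affected


impacted = affected_by("security_update", DEPENDS_ON)
print(sorted(impacted))
```

In this made-up map, one faulty update reaches check-in desks, card payments, patient records and flight departures, while the till that never relied on the affected machines carries on working.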

The final big lesson is that, no matter how rigorous testing is, real-world external factors can reveal unforeseen problems that are beyond an organisation’s immediate control.