“It won’t happen to us,” they said. “We’re too big to fail.” And then? Thousands of cancelled flights, angry customers and a financial impact that will likely total in the tens of millions of dollars.
Welcome to Delta’s nightmare, circa two weeks ago.
At first glance, we could call it a fluke. An unforeseeable event that created a big mess for Delta to clean up.
What went wrong?
It’s not the first time this has happened to an airline this year – even this season. Just a few weeks prior, Southwest Airlines experienced what was its second significant outage in less than a year. And these airlines are far from alone. Outages like this bring down companies of all sizes.
Perhaps most surprising is that both outages could have been avoided with working backups, and both airlines reportedly had backups in place. Those backups were clearly not adequate.
Southwest Airlines’ outage was caused by a downed router. According to Chief Operating Officer Mike Van de Ven, the airline’s redundant systems failed to activate. The airline had to replace the router and reboot some 400 servers. As a result, it cancelled over 2,000 flights, and financial analysts estimate that the blunder cost the airline between $54 million and $82 million. Never mind the egg on its face.
Meanwhile, back at Delta, the worldwide outage boiled down to the failure of a single power control module, which in turn caused a surge and a loss of power. Unfortunately, 300 of the airline’s 7,000 servers were not connected to a backup power source, according to Delta executives, and that vulnerability had gone undetected. For Delta, the result was the cancellation of over 1,700 flights and a cost that will likely run into the tens of millions of dollars as well.
The one thing you can do
Clearly, your backup plan is only good if it works. So, how can you avoid sharing a similar fate?
Make sure that you, or your IT provider, test your backup systems regularly to confirm that they work properly. That means that they kick in automatically when they’re needed, prevent costly outages, restore data quickly and prevent loss of information.
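For readers who want a concrete picture of what "testing a backup" can look like, here is a minimal sketch in Python. It assumes you have already restored a backup into a scratch folder and want to confirm that every original file came back intact. The function names (`file_digest`, `verify_restore`) and the folder layout are hypothetical, chosen only for illustration; a real test would also cover failover, power and restore time.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: str, restored_dir: str) -> list[str]:
    """List files that are missing from, or differ in, the restored copy.

    Hypothetical helper: both arguments are plain folder paths.
    An empty list means every source file was restored byte for byte.
    """
    source = Path(source_dir)
    restored = Path(restored_dir)
    problems = []
    for original in sorted(source.rglob("*")):
        if not original.is_file():
            continue  # skip directories; only file contents are compared
        relative = original.relative_to(source)
        copy = restored / relative
        if not copy.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif file_digest(copy) != file_digest(original):
            problems.append(f"changed: {relative.as_posix()}")
    return problems
```

Run on a schedule against a fresh test restore, a check like this turns "we have backups" into "we know our backups restore." Any non-empty result is a reason to investigate before an outage forces the question.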
When commenting on the snafu, Delta’s CEO, Ed Bastian, said in no uncertain terms that there was more Delta could have done to preempt the outage, simply by being more aware of its vulnerabilities and more proactive in removing them. Southwest Airlines executives could easily say the same. There is no reason why either airline should have been caught in a situation where its backups didn’t protect it or didn’t activate properly.
Backups that are up to date, adequately provisioned and tested periodically are far less likely to fail when they are most needed. What sounds on the surface like a technology failure is actually an indication of poor technology planning. That is likely why technology experts predict similar outages will occur until the airlines upgrade their technology and their mindset.
Can your IT handle a power failure or server outage?
Get a free assessment and be confident in your IT