British Airways Blames Power Surge for Massive Outage
Airline Sees No Cyberattack; Experts See Scant Disaster Recovery Planning
British Airways grounded all of its flights at London's two biggest airports Saturday after an IT failure led to massive delays. By Monday, many information systems had been restored, but delays and disruptions continued.
Alex Cruz, chairman and CEO of British Airways, says there's no evidence the outage was the result of a cyberattack.
"All of our check-in and operational systems have been affected and we have canceled all flights from Heathrow and Gatwick today," Cruz said Saturday in a video message posted to Twitter, apologizing for the disruptions and delays.
BA typically operates hundreds of flights - carrying about 120,000 passengers - out of the two airports each day, and the long bank-holiday weekend is one of the busiest travel periods of the year in Britain.
The outages began early Saturday morning and, according to media reports, disrupted the airline's ability to generate flight plans - a legal requirement before any flight is allowed to depart. Heathrow and Gatwick quickly filled with angry passengers, and BA staff were left largely unable to help them because none of the computers were functioning.
"We believe the root cause was a power supply issue, and we have no evidence of any cyberattack," Cruz said Saturday.
By Monday, Cruz told the Guardian that the airline's investigation found that the outage was due to a "power surge" at a U.K.-based data center that began Saturday at 9:30 a.m. British Time. That had a "catastrophic effect" on the airline's communication hardware and eventually disrupted "all the messaging across our systems," Cruz said. "We will make sure that nothing like this ever happens in British Airways again," he added.
Disaster Recovery Questions
IT experts and airline industry officials say that a power surge does not explain why backup systems failed to kick in and prevent - or mitigate - the outage.
"What seems remarkable is there was no back-up system kicking in within a few minutes of the system failing," Paul Charles, a former Virgin Airlines spokesman, tells the BBC. "Businesses of this type need systems backing up all the time, and this is what passengers expect."
BA has apologized for the outages, delays, cancellations and overall disruption and promised to expedite refunds for anyone who could not travel - or chose to not travel - over the weekend. "We are extremely sorry for the huge inconvenience this is causing our customers, and we understand how frustrating this must be especially for families hoping to get away on holiday," Cruz said.
By Monday, BA said most long-haul flights were running normally and that it had partially restored short-haul flight services, although it expected further disruptions. The airline has also offered free rebooking to affected travelers.
Poor Resiliency Planning
Every industry has its IT bugbears - point-of-sale malware breaches in the restaurant, retail and hotel industries; ransomware in the healthcare sector; document retention in the political sphere; stolen and leaked attack tools in the intelligence sector. But for the airline industry, poor planning and resiliency failures - leading to outages - remain a recurring challenge.
Last month, an outage of information systems used by Air France and Germany's Lufthansa prevented passengers from boarding. In January, a United Airlines computer outage grounded all domestic flights on a Sunday evening, leading to massive disruptions around the country.
Last September, BA apologized after a glitch in its check-in systems led to delays at Heathrow and Gatwick. That was a repeat of problems it had last July after it first introduced the check-in system. Last August, a power surge near Delta's U.S. headquarters in Atlanta grounded all of its flights worldwide, leading to 2,000 canceled flights over a three-day period (see System Outage Grounds Delta Flights Worldwide).
And Southwest Airlines last July was forced to issue a "ground stop" after a router failed and a backup failed to kick in. The resulting one-hour "system outage" led to five days of delays and 2,300 canceled flights.
Computer experts say a lack of spending, including on replacing outdated systems, is at least partially to blame for airlines' recurring outages, as is a failure to test operational systems to ensure that backup systems are in place and that failovers will be smooth.
Some market analysts say that investors' close scrutiny of airlines' returns has led to inadequate IT spending.
Protected: Fliers in Europe
When such disruptions occur, airline passengers in the United States have relatively few rights.
Thanks to a 2004 European Commission regulation, however, anyone whose flight originates or terminates in Europe can receive monetary compensation for flight delays of two hours or more, capping out at €600 ($670) if the flight was delayed for over four hours or canceled outright. Passengers stranded by the delays can also claim back hotel, transportation and meal expenses, up to a point.
British Airways could therefore face a bill in excess of £100 million ($128 million) as a result of this past weekend's outage.
"This is not like an ash cloud or traffic controllers' strike that can't be predicted. The computer system breaking down is within its control. BA is going to have to pay out and it looks like its costs will be north of £100 million," James Walker, CEO of Resolver - a free site that passengers can use to claim compensation, under EU consumer protection laws, for delays - tells the Guardian.