Why Facebook and Instagram went down for hours on Monday
When a company can't use the internet's core protocols, it's as if its online domains simply don't exist. That happened to Facebook, creating a cascade of problems.
When Facebook suffered an outage of about six hours on Monday, businesses suffered along with it. The platform and its Instagram and WhatsApp siblings play key roles in commerce, with some companies relying on Facebook's network instead of their own websites.
But on Monday, that network came crashing down. It wasn't a hack, Facebook said, but rather a self-inflicted problem.
An update to Facebook's routers that coordinate network traffic went wrong, sending a wave of disruptions rippling through its systems. As a result, all things Facebook were effectively shut down, worldwide.
Why did the outage last so long?
The problem was made worse — and its solution more elusive — because the outage also whacked Facebook's own internal systems and tools that it relies on for daily operations. Employees also reportedly faced difficulty in physically reaching the space where the routers are housed.
"We're aware that some people are having trouble accessing our apps and products. We're working to get things back to normal as quickly as possible, and we apologize for any inconvenience," Facebook (@Facebook) tweeted on October 4, 2021.
"From a technical perspective, they're going to have to review what they do and how they've designed things," cybersecurity expert Barrett Lyon said in an interview with NPR.
The outage cost the company tens of millions of dollars, MarketWatch says, comparing the company's lost hours with its most recent revenue report.
The disruption stands as one of Facebook's worst setbacks since a 2019 incident that took the platform offline for nearly 24 hours — an outage that, like Monday's, was attributed to a change in Facebook's server configuration.
So, what happened?
This week's outage struck around 11:40 a.m. ET. At about 6:30 p.m. ET, the company announced that it had resolved the problem and was bringing services back online.
In an update on the outage, Facebook blamed "configuration changes on the backbone routers that coordinate network traffic between our data centers," saying the changes blocked the data centers' ability to communicate and set off a cascade of network failures.
That explanation suggests the problem arose between Facebook and the Border Gateway Protocol, a vital tool underlying the Internet.
Border Gateway Protocol is often compared with the GPS system or the Postal Service. Similar to ideas like map coordinates or ZIP codes, the system tells the rest of the world where to route traffic and information.
When a company can't use the gateway protocol, it's as if its online domains simply don't exist. But that didn't stop web pages, searches and messages from looking for Facebook's properties. And that, in turn, led to other problems.
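To make the idea concrete, here is a minimal sketch in Python of what happens when a network's routes are withdrawn. It is a toy model, not Facebook's configuration or a real implementation of the protocol: the class name ToyRoutingTable, the address block 192.0.2.0/24 (a range reserved for documentation) and the label "AS64496 (hypothetical)" are all assumptions made up for illustration.

```python
import ipaddress


class ToyRoutingTable:
    """Toy model of the routes one network has learned from its neighbors."""

    def __init__(self):
        # Maps an address block (prefix) to a label for whoever announced it.
        self.routes = {}

    def announce(self, prefix, origin):
        """Record that `origin` says it can deliver traffic for `prefix`."""
        self.routes[ipaddress.ip_network(prefix)] = origin

    def withdraw(self, prefix):
        """Forget a previously announced prefix."""
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        """Return the origin of the most specific matching prefix, or None."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None  # no known route: the destination is unreachable
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


table = ToyRoutingTable()

# Hypothetical announcement using a documentation-only address block.
table.announce("192.0.2.0/24", "AS64496 (hypothetical)")
before = table.lookup("192.0.2.53")

# Once the route is withdrawn, the same address has nowhere to go.
table.withdraw("192.0.2.0/24")
after = table.lookup("192.0.2.53")

print("before withdrawal:", before)
print("after withdrawal:", after)
```

In this sketch the servers behind the address never change. Only the directions to them disappear, which is why the rest of the internet can no longer find them.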
"Many organizations saw network disruptions and slowness thanks to billions of devices constantly asking for the current coordinates of Facebook.com, Instagram.com and WhatsApp.com," tech expert Brian Krebs notes.
The outage came as Facebook faces intense scrutiny over its products and policies, including from a whistleblower who is testifying before a Senate subcommittee on Tuesday. The timing prompted some to wonder whether the company had been hacked. But the company said the cause was simply "a faulty configuration change."
Facebook also stressed that there is "no evidence that user data was compromised as a result of this downtime."
Some businesses lost nearly a day of work
The Facebook outage lasted nearly an entire working day, leaving some businesses rattled and online habits frustrated.
Many people use Facebook, Instagram and WhatsApp to share photos and videos with their family and friends, but many businesses see the platforms as a primary tool, using them to advertise, connect with customers and sell products and services.
Christopher Sumner, the owner of Lowcountry Overstock, a small clothing store based in South Carolina, says that while Monday's outage didn't severely impact sales, his main concern was losing touch with customers.
"We've had longer periods when we've been locked out of Facebook completely, but our main concern was customer relations and not being able to communicate with customers," Sumner told NPR.
Sumner said his store regularly makes sales on Facebook Marketplace, the website's e-commerce platform. Despite Monday's disruption, he said the outage isn't enough to make him take his business off Facebook entirely.
"While yes, there's been a few operational problems from the beginning with Facebook Marketplace, we wouldn't move our entire business or any portion of it, just because the sales are so good," Sumner said.
Editor's note: Facebook is among NPR's financial supporters
Copyright 2021 NPR. To see more, visit https://www.npr.org.