Swathes of the internet including Amazon, Reddit and many news outlets went offline on Tuesday following a glitch affecting a relatively obscure cloud computing company.
Dozens of major websites and hundreds overall were rendered inaccessible when Fastly started experiencing issues, leaving the likes of the Government’s gov.uk domain and streaming services Twitch and Spotify unavailable for most of an hour.
Some feared hackers were responsible for the disruption but it emerged that a problem with Fastly was behind the international wave of outages.
What is Fastly?
Fastly was founded by developer Arthur Bergman in 2011 and has grown into a crucial component of the technology underpinning modern-day internet use. It is a content delivery network that sells security and tools to other large companies to help them deliver their content to users more quickly.
Essentially, Fastly’s tools aim to ensure web pages are delivered reliably to users as quickly as possible.
In 2017 it launched an “edge cloud platform” designed to give users fast access to sites hosted far from where they live. It cuts loading times by storing copies of some content on servers closer to those users.
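The idea of keeping copies of content close to users can be illustrated with a short sketch. This is a toy model for explanation only, not Fastly's software: the `EdgeCache` class and the `hypothetical_origin` function are invented names, and a real edge server would also handle expiry, invalidation and much else.

```python
from typing import Callable, Dict


class EdgeCache:
    """Toy model of one edge server that keeps copies of origin content."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._store: Dict[str, str] = {}
        self.origin_fetches = 0  # how many slow trips to the origin were made

    def get(self, path: str) -> str:
        # Serve the local copy when there is one; otherwise make the slow
        # trip to the distant origin once and keep the result.
        if path not in self._store:
            self._store[path] = self._fetch_from_origin(path)
            self.origin_fetches += 1
        return self._store[path]


def hypothetical_origin(path: str) -> str:
    # Stand-in for a far-away web server (an assumption for this sketch).
    return f"<html>content for {path}</html>"


edge = EdgeCache(hypothetical_origin)
first = edge.get("/news")   # fetched from the origin
second = edge.get("/news")  # served from the nearby copy
```

Only the first request travels to the origin. Every later request for the same page is answered from the copy held nearby, which is where the speed-up comes from.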
Fastly’s website says it helped Buzzfeed to load pages 50pc more quickly, while its tools allowed the New York Times to handle the 2m readers who visited its website on election night last November.
Mark Hendry at legal business DWF explains:
Fastly provide content delivery network services to companies. The intention of these networks is to route internet traffic and services through “nodes” to balance the load of traffic, prevent bottlenecks and result in high availability and faster content delivery.
Requests for content are directed by an algorithm, for instance the algorithm might direct the traffic so that it routes through the most available or highest performing node, or so that the traffic takes the fastest network route to the requestor. This is the reason that some internet users are reporting no issues with accessing content that is unavailable to others.
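The routing Mr Hendry describes can be sketched roughly as follows. This is a simplified illustration, not Fastly's actual algorithm: the `Node` fields, the `pick_node` function and the node names and latencies are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    healthy: bool
    latency_ms: float  # assumed round-trip time from this user to the node


def pick_node(nodes: List[Node]) -> Optional[Node]:
    """Return the fastest healthy node, or None if every node is down."""
    healthy = [n for n in nodes if n.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda n: n.latency_ms)


# Hypothetical views of the network from two different users.
london_view = [Node("lon", False, 8.0), Node("fra", True, 21.0)]
sydney_view = [Node("syd", False, 5.0)]

london_choice = pick_node(london_view)  # falls back to the healthy "fra" node
sydney_choice = pick_node(sydney_view)  # None: no working node in reach
```

In this sketch the first user is quietly sent to a slower but working node, while the second has no working node to fall back on. That is one way the same outage can leave a site available to some users and not to others.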
What went wrong at Fastly?
Fastly first flagged a problem on its service status page at 10.58am, warning: “We’re currently investigating potential impact to performance with our CDN services.”
Just over 45 minutes later, it added: “The issue has been identified and a fix is being implemented.”
By that point, the parts of the internet that were still working were alight with reports of the problems.
At 12.09pm it wrote on Twitter:
Fastly said a new service configuration sent a wave of disruption across its so-called “POPs” – the servers that store cached copies of web pages to speed up access – leaving users unable to access certain sites.
So it wasn’t a cyber attack?
No, but experts are nonetheless concerned about the extent of the internet’s reliance on a comparative handful of hosting companies.
Gaz Jones of digital agency Think3 says: “This is what happens when half of the internet relies on Goliaths like Amazon, Google and Fastly for all of its servers and web services. The entire internet has become dangerously geared on just a few players.”
David Warburton at cyber security company F5 Labs adds:
The web as a whole was intended to be decentralised. By not relying on any one central system, it meant that many different components could fail, and internet traffic could still find a way to get where it needed to go.
What we’ve seen over the past decade, however, is the unintentional centralisation of many core services through large cloud solution providers, like infrastructure vendors and content delivery networks.
In a traditional internet app deployment model, an outage of a server or misconfigured application might take out a single website. As we saw today, similar problems with a cloud solution provider can end up taking out all of their customers, resulting in not one website being taken offline, but hundreds or thousands.
What did the outage cost?
It’s hard to put a price on the impact of the outages, but ParcelHero estimates the disruption will have cost close to £1bn for retailers in the UK, Europe and the US.
Its head of consumer research, David Jinks, says: “Amazon alone turns over $950,000 a minute. It was one of the quickest sites to get back online but some organisations were down for around an hour. We believe retail worldwide will have lost around £1bn. Time really is money in the era of e-commerce.”
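To give a sense of scale, the per-minute figure Mr Jinks quotes can be multiplied out. The sketch below uses only that $950,000 figure. The outage lengths are illustrative assumptions, and it makes no attempt to reproduce ParcelHero's £1bn estimate, whose method is not set out here.

```python
AMAZON_REVENUE_PER_MINUTE_USD = 950_000  # figure quoted by David Jinks


def revenue_at_risk(minutes_offline: float) -> float:
    """Turnover that would normally pass through the site in that time."""
    return AMAZON_REVENUE_PER_MINUTE_USD * minutes_offline


ten_minutes = revenue_at_risk(10)  # hypothetical short outage: $9.5m
full_hour = revenue_at_risk(60)    # upper bound of "around an hour": $57m
```

Amazon was among the quickest sites to return, so the full-hour figure is better read as a ceiling than as an estimate of what the company actually lost.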
That raises the question of what kind of compensation these big web companies can demand from Fastly. As Bloomberg reports:
All website system administrators know that network outages and downtime can happen, no matter the size of their hosting platform. As such, a single outage in a year for under 60 minutes isn’t usually enough to warrant moving to a rival provider.
Depending on the service level agreement signed between it and affected customers, however, Fastly may offer refunds or credits equivalent to the number of minutes a website was unavailable, according to its website.
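A credit tied to the number of minutes offline could, in principle, be worked out pro rata. The sketch below is an assumption about how such a calculation might look, not Fastly's actual terms: the `prorata_credit` function, the 30-day month and the monthly fee are all hypothetical.

```python
MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes (assumed billing period)


def prorata_credit(monthly_fee: float, minutes_unavailable: float) -> float:
    """Refund the share of the monthly fee matching the time offline."""
    if minutes_unavailable < 0:
        raise ValueError("minutes_unavailable cannot be negative")
    # Cap the refund at the whole month's fee.
    fraction = min(minutes_unavailable / MINUTES_PER_30_DAY_MONTH, 1.0)
    return monthly_fee * fraction


# Hypothetical customer paying 43,200 a month, offline for 60 minutes.
credit = prorata_credit(monthly_fee=43_200.0, minutes_unavailable=60)
```

On these assumed numbers an hour offline earns a credit of about 60, a tiny fraction of the monthly bill, which shows how small a strictly per-minute refund would be next to an hour of lost sales.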