Yesterday, one of the internet’s most trusted backbone providers went down for about an hour and fifteen minutes. This affected a large portion of sites worldwide, as well as traffic to our data centers.
With any significant event that affects our customers, we conduct an extensive examination to understand the root cause and develop a course of action to improve our systems and procedures. To that end, we wanted to provide a synopsis of the situation that occurred.
During yesterday’s event the Servebolt Cloud remained operational and there was no data loss. However, as a customer you may have been unable to connect to SFTP and SSH services, or to reach your site over HTTP, e.g. in the browser.
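If you want to check from your own location whether these services are reachable, a minimal Python sketch along the following lines could help. It only attempts a TCP connection and does not log in or fetch a page. The function names and the ports used in the usage note (22 for SSH/SFTP, 443 for HTTPS) are assumptions for illustration, so substitute the values that apply to your own environment.

```python
import socket
from typing import Dict


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_services(host: str, ports: Dict[str, int], timeout: float = 3.0) -> Dict[str, bool]:
    """Check several named services on one host and report which ones accept connections."""
    return {name: is_reachable(host, port, timeout) for name, port in ports.items()}
```

For example, `check_services("example.com", {"SSH/SFTP": 22, "HTTPS": 443})` returns a dictionary that maps each service name to `True` or `False`; `example.com` is a placeholder host name. Because a backbone outage affects only some routes, the same check may succeed from one network and fail from another.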
Here’s what happened
On Thursday, October the 7th, at 18:05 CEST our monitoring services notified us of a severe outage affecting customers worldwide. Our emergency staff was called in immediately to identify and mitigate the issue. Our investigation quickly pointed to a large portion of the internet’s backbone being down.
Telia is one of the main providers of the core infrastructure of the internet, referred to as the internet backbone. The internet backbone is maintained by the Tier 1 providers, and when they make mistakes, it can cause large parts of the internet to become inaccessible.
As a result, parts of the Servebolt Cloud were inaccessible for customers and website visitors whose traffic was routed over the Telia Carrier network. The explanation Servebolt received from one of our data center providers was the following:
“During a routine update to an aggregate routing policy within AS1299 an error was committed which impacted our internal support systems and traffic in some regions.” (Source)
The internet is a network of networks, which means there are alternative routes across the internet that may still work. As a result, customers running Accelerated Domains stayed online the entire time.
At 19:20 CEST the issue was resolved and full access, including all Servebolt services, was available again.
The outage was a case of force majeure: as Telia indicated, it resulted from a misconfiguration at Telia and was completely out of our hands.
Here’s what you can do to minimize impact
Fortunately, there are a couple of solutions available to reduce the impact of events like this on your site.
Accelerated Domains not only greatly improves the performance and security of your site, it also minimizes the impact of internet backbone outages. Accelerated Domains is built on top of Cloudflare’s Enterprise backbone, uses all of Cloudflare’s priority nodes, and has priority in the Cloudflare network. For the websites using Accelerated Domains yesterday, this resulted in little to no downtime.
Always Online in Cloudflare Pro and Cloudflare Business
An alternative is having your site on Cloudflare. Cloudflare Pro and Cloudflare Business offer many benefits, including one that addresses this particular problem: a feature called Always Online. It uses the Internet Archive’s Wayback Machine to keep your website online for visitors when your origin server is unavailable.
Note that Always Online serves only limited copies of web pages to users, instead of errors, when your server is unreachable.
Get in touch with our Support Team to add Cloudflare to your site or order online here.
Let us help you to be prepared in the best possible way.
We take this seriously and we have redundancies built in to prevent as much as possible, but in this case we’re all victims of a problem outside of our control. Outages like these disrupt your life and your business. We understand that, and we take our responsibility to you very seriously. We sincerely apologize for the disruption and the inconvenience this has likely caused you.