Telia Outage and Its Ripple Effect

Yesterday, Telia Carrier AB, a major European backbone carrier and one of the most trusted internet service providers, experienced an outage lasting about an hour and fifteen minutes. The outage affected downstream partners and customers, taking many sites worldwide offline and interrupting traffic to our data centers.

After any significant event that affects our customers, we conduct a thorough review to understand the root cause and decide how to improve our systems and procedures. To that end, we want to provide a summary of what happened.

During yesterday’s event, the Servebolt Cloud remained operational, and there was no data loss. However, as a customer, you may have been unable to connect to SFTP and SSH services or to reach your site over HTTP, for example in a browser.
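If you want to see for yourself which services are reachable from your location during an event like this, a small script can help. The following is a minimal sketch, not an official Servebolt tool: the function names and the default ports (22 for SSH/SFTP, 80 and 443 for HTTP/HTTPS) are assumptions, so substitute the host and ports shown in your own Control Panel. Note that a failed check only tells you the connection did not succeed from where you are; it cannot tell a server problem apart from a network path problem.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_services(host: str, ports: dict[str, int], timeout: float = 3.0) -> dict[str, bool]:
    """Check several named ports on one host and report which ones accept connections."""
    return {name: port_reachable(host, port, timeout) for name, port in ports.items()}
```

For example, calling check_services("example.com", {"SSH/SFTP": 22, "HTTP": 80, "HTTPS": 443}) returns a dictionary mapping each service name to True or False. Here "example.com" is a placeholder for your own hostname. Running the same check from a second network, such as a mobile connection, can hint at whether the problem lies on the route between you and the server.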

Telia Outage Overview

On Thursday, October 7th, at 18:05 CEST, our monitoring services notified us of a severe outage affecting customers worldwide. Our emergency staff was called in immediately to identify and mitigate the problem. Our investigation quickly showed that a large portion of the internet’s backbone was down.

Telia is one of the main providers of the core infrastructure of the internet, referred to as the internet backbone. The internet backbone is maintained by the Tier 1 providers, and when they make mistakes, large parts of the internet can become inaccessible.

As a result, parts of the Servebolt Cloud were inaccessible to customers and website visitors whose traffic was routed over the Telia Carrier network. One of our data center providers gave Servebolt the following explanation:

“The connection to our data centers went down due to Telia not broadcasting the full set of Border Gateway Protocol (BGP) routes from the AS1299 transit Telia network.”

The Internet is a network of networks, which means that alternative routes across it may still work when one carrier fails. As a result, customers running Accelerated Domains stayed up the entire time.

Telia described the cause as follows: “During a routine update to an aggregate routing policy within AS1299 an error was committed which impacted our internal support systems and traffic in some regions.”

Source

At 19:20 CEST, the issue was resolved, and full access, including all Servebolt services, was available again.  

The outage was a force majeure event. As Telia indicated, it resulted from a misconfiguration at Telia and was completely out of our hands.

What Can You Do to Minimize the Impact

Fortunately, there are ways to reduce the impact of events like this. We offer a couple of solutions that minimize the effect on your site.

Accelerated Domains

Accelerated Domains not only greatly improves the performance and security of your site, it also minimizes the impact of Internet backbone outages. Accelerated Domains is built on top of Cloudflare’s Enterprise backbone, uses all of Cloudflare’s priority nodes, and has priority in the Cloudflare network. For the websites using Accelerated Domains yesterday, this resulted in little to no downtime.

Order Accelerated Domains right from your site’s Control Panel or directly from our Help Center.

Always Online in Cloudflare Pro and Cloudflare Business

An alternative is to put your site on Cloudflare. Cloudflare Pro and Cloudflare Business offer many benefits, one of which addresses this particular problem: a feature called Always Online. It uses the Internet Archive’s Wayback Machine to keep your website online for visitors when your origin server is unavailable.

Note that Always Online serves only limited copies of your web pages, in place of error pages, when your server is unreachable.

Get in touch with our Support Team to add Cloudflare to your site or order online here. Let us help you to be prepared in the best possible way.

Your Stability is Our Mission

We have redundancies built in to prevent as many outages as possible, but in this case, we were all affected by a problem outside of our control. Outages like these disrupt your life and your business. We understand, and we take our responsibility to you very seriously. We sincerely apologize for the disruption and the inconvenience this has likely caused you.