If COVID-19 has taught us one thing, it’s how much of our digital lives are built on interdependencies and chains of trust.

For many people, this came as a crash course in home internet. 

Anyone who has seen NBN marketing will be familiar with its end-to-end connection disclaimer: that it directly controls only part of internet service delivery, and that other factors, from your in-home setup to the time of day, affect your speeds.

The internet is as unpredictable as it is critical. 

It’s a "best-effort" collection of networks connecting myriad providers. This makes it vulnerable to outages, congestion and exploitation that can affect users’ experiences. With many of us working from home, this became a lived experience.

On the evening of 6 May, some of us received a different sort of reminder as all manner of popular iOS apps started crashing.

The culprit was somewhat unlikely: Facebook. A change made to Facebook itself caused bad data to be sent to a Facebook software development kit (SDK) incorporated into countless apps to enable users to log in using their Facebook credentials. 

“Since this happened during the initialisation of the SDK, something that occurs right after launching the app, the apps simply became unusable,” third-party app developer Guilherme Rambo wrote in a post-mortem.

That caused problems even if you weren’t using social login. As ThousandEyes lead internet researcher Arash Molavi Kakhki told ‘The Internet Report’, “For a lot of the cases, even if the user is not explicitly using Facebook to log into Spotify or TikTok or those apps, just the fact the SDK is being used in the app could cause the app to crash. You don’t have to use Facebook to log in to necessarily be affected by this.”
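The failure mode described above can be illustrated with a minimal sketch. It is not Facebook’s SDK code: the function names, the payload shape and the `login_enabled` setting are all hypothetical, chosen only to show how start-up code that trusts a server response can crash an app, and how validating that response lets the app fall back to safe defaults instead.

```python
import json

# Hypothetical safe defaults used when the server response cannot be trusted.
DEFAULTS = {"login_enabled": False}


def init_sdk_strict(payload: str) -> dict:
    """Trusts the server response: any unexpected shape raises at start-up."""
    config = json.loads(payload)
    return {"login_enabled": config["features"]["login"]["enabled"]}


def init_sdk_defensive(payload: str) -> dict:
    """Validates the response and falls back to defaults if it is malformed."""
    try:
        config = json.loads(payload)
        enabled = config["features"]["login"]["enabled"]
        if not isinstance(enabled, bool):
            raise TypeError("enabled must be a boolean")
        return {"login_enabled": enabled}
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, a missing key or a value of the wrong type:
        # keep the host app running with the feature switched off.
        return dict(DEFAULTS)


# Hypothetical payloads: one well-formed, one with an unexpected type.
good_payload = '{"features": {"login": {"enabled": true}}}'
bad_payload = '{"features": true}'
```

With the malformed payload, `init_sdk_strict(bad_payload)` raises an exception during initialisation, which in an app would surface as a crash at launch, while `init_sdk_defensive(bad_payload)` returns the defaults and the app carries on without social login.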

The case again highlights interdependencies in our digital lives: the fact a Facebook bug temporarily killed so many popular apps, because the apps run a tiny piece of Facebook’s code, is a problem.

Breaking business

The Facebook incident was an unintentional bug and easily fixed.

But in recent years, businesses have woken up to the potential for bugs or malware to be purposely slipped into third-party code or services they call on for their own apps. Known as a ‘supply chain attack’, the risk has been enough for Australian banks and other large organisations to step up their preparations.

The more immediate challenge most Australian organisations faced over the past couple of months was making sure employees could work effectively from home. This meant having them access enterprise applications over unsecured and uncontrolled home connections.

Application delivery is already complicated, without adding more interdependencies that impact performance. 

Delivery is already dependent on a large and complex ecosystem of internet-facing services, such as content delivery networks (CDNs), the domain name system (DNS), distributed denial-of-service (DDoS) mitigation and public cloud. These services work together to provide exceptional digital experiences to users, and even a brief disruption to any one of them can have a significant impact.
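To make that chain of dependencies concrete, here is a small sketch that records which services each application relies on and works out what an outage in one of them would touch. The application names and the dependency map are hypothetical; a real organisation would build this map from its own inventory.

```python
# Hypothetical dependency map: each entry lists the services it relies on.
DEPENDS_ON = {
    "web-store": ["cdn", "dns", "public-cloud"],
    "staff-portal": ["dns", "ddos-mitigation"],
    "cdn": ["dns"],
    "public-cloud": [],
    "ddos-mitigation": [],
    "dns": [],
}


def affected_by(outage: str, depends_on: dict) -> set:
    """Return everything that directly or indirectly depends on `outage`."""
    affected = set()
    changed = True
    # Keep sweeping the map until no new dependants are found.
    while changed:
        changed = False
        for name, deps in depends_on.items():
            if name == outage or name in affected:
                continue
            if any(dep == outage or dep in affected for dep in deps):
                affected.add(name)
                changed = True
    return affected
```

In this made-up map, `affected_by("dns", DEPENDS_ON)` returns the web store, the staff portal and the CDN, because the CDN itself depends on DNS, while an outage in DDoS mitigation reaches only the staff portal.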

While outages are inevitable, having visibility into these outages can significantly reduce the time to escalate and resolve these incidents, as well as enable better communication with your employees or customers.

COVID-19 has shown us the importance of that. Weekly numbers collected by ThousandEyes reveal that the number of network outages rose from the middle of February through March before achieving some stability. 

Companies that had no visibility into this situation are likely to have had a much tougher time providing IT performance and stability in a time of heightened need, since they could not accurately identify and pinpoint where their problems lay.

Only by taking a proactive approach to visualising all the dependencies that matter to your organisation and managing outage-related risks can you create effective outage recovery plans and measure resilience when those plans are called into play. 

Mike Hicks, principal solutions architect at ThousandEyes

 
