Optus Doesn’t Think Emergency Outage Roaming Would Work

Optus’ responses to Senate questions about its recent outage are kind of brief – but they do say a lot about its views on everything from emergency mobile roaming to financial compensation and just how heavily its networks get attacked on a daily basis.

The 8th of November 2023 can’t have been a happy day to work at the nation’s second biggest telco, Optus.

You’re probably well aware of the outage that saw Optus customers offline for upwards of 13 hours. It was at first blamed on “a routine software upgrade by a third-party infrastructure provider”, and it led to the resignation of CEO Kelly Bayer Rosmarin.

It’s also fair to assume that a number of Optus customers have jumped ship to rival networks, though we probably won’t know the scale of that at least until Optus next reveals its financial results.

That’s not where the story ends, however, with a number of inquiries now underway into the incident at a government level.

Former Optus CEO Kelly Bayer Rosmarin faced the Senate on the 17th of November, but there were a number of questions taken on notice that Optus wanted to answer in more detail.

It’s now provided those answers, and digging into them reveals some fascinating insights into what went wrong, how Optus’ network is configured and what Optus is doing – or planning to do – to minimise disruptions in the future.

No doubt, being questions on notice, Optus has carefully considered each answer, and some of them are rather brief. I’m going to quote each one, and then ponder on its implications.

Does Optus have good risk management?

Actually, there’s one area where Optus’ answers aren’t in the least bit brief, and that’s in detailing its crisis management processes, noting at the start of its answers that (deep breath)

Optus has an established, comprehensive risk management framework and supporting processes which aligns to the ISO 31000 industry standard. This framework and supporting processes ensure Optus maintains a sound understanding of the nature and extent of its material risks and maintains effective internal controls to manage these risks.

In doing this, Optus has taken a systematic approach through its risk management processes and systems for continuous identification, quantification, monitoring and control of risks, which is further enabled by robust risk governance, risk capability and risk culture. This includes the following measures:

1. Conducting annual risk profiling exercises to determine the nature and extent of material risks to Optus and ensure appropriate control measures are in place;

2. Performing issue management processes to ensure control gaps are tracked and remediated;

3. Enacting overarching governance and oversight processes through business unit/operational level risk committees and the executive-level risk committee (ERC).

That’s good – and it’s good to see that it’s working to a standard for these matters, not that any of it stopped its network from falling over.

Did Optus react right away to the outage?

As per Optus:

The Networks Operation Centre (NOC) is monitored 24 hours, and the first Optus engineer was physically present in the NOC at 04:45am.

“Monitored” is doing some heavy lifting here, given that staff weren’t on site until – as per reports from the time – some 45 minutes after the outages first started to kick in. Still, it’s clear that Optus was indeed working away at the issue at a time well before many (but not all) Australians would have been impacted by the outage.

It gets more interesting when we look at what Optus is now saying about the causes of the outage:

The cause of the outage was that Optus’ Cisco routers hit a fail-safe mechanism which meant that each one of them independently shut down.

That’s an interesting evolution from earlier statements that saw some of the blame shift to routing information coming through from an international peering network, though Optus did note that this was after a software upgrade – so it’s possible that both tie together if an upgrade saw those Cisco routers glitching out. I’ve got to think that Optus won’t have made this kind of statement without being sure that Cisco’s lawyers aren’t likely to start grumbling in expensive ways towards them, however.

On the subject of whether any of this was as the result of some kind of hacking attempt, Optus somewhat repeats itself at first:

The cause of the outage was that Optus’ Cisco routers hit a fail-safe mechanism which meant that each one of them independently shut down. There is no evidence to suggest influence by any foreign actor or sabotage.

That’s also good – while the outage itself was clearly a disaster area for Optus and its customer base, it would be worse if it appeared in any way to be a matter of external forces creating a network outage.  

Not that Optus is a stranger to such concepts as cyber-attacks, noting that:

There is no universally agreed formula for calculating this across industry. To provide a response, we have collated events from relevant cyber security controls which indicate around 17M attacks per day on average.

17 million attacks per day… on average. Yikes. Suddenly the myGov spam SMS I get from time to time shrink into insignificance.

What About Compensation For The Optus Outage?

Optus’ initial offer to consumers was to provide bonus data to postpaid and prepaid customers, as well as speed boosts for NBN customers.

As per its terms and conditions for consumer customers as I read them, that’s actually above its absolute requirements, which just call for cost-of-access refunds that would equate to a few dollars at most.

Not surprisingly, a lot of Optus customers weren’t entirely taken with bonus data by way of compensation, and it appears that Optus is coming around to offering higher levels of compensation. On the question of compensation, Optus’ response was quite brief.

Optus has paid out both cash and account credits.

No mention of how much it’s paid out at all, but as the ABC notes, when Bayer Rosmarin fronted the Senate, she said that some $430,000 was under discussion and that $36,000 had been “applied”.

The brevity of Optus’ answers doesn’t make it entirely clear who gets that cash, but there’s the implication from other answers that Optus is primarily considering financial compensation for business customers.

On roaming in emergencies

One key question asked during the day when Optus was offline was whether it would be feasible for Optus customers to temporarily utilise the networks of competitors Telstra and Vodafone.

After all, this is what happens if you’re using a phone on any network, you’re out of range of your own network’s towers and you need to call 000 for emergency services. If you see an SOS on your phone, it means it’ll connect to any network to make that call happen – unless you’re totally out of range of anything, in which case your only emergency option might be Emergency SOS via satellite if you’ve got an iPhone 14 or iPhone 15. So it must be feasible, right?

Optus says it’s discussing it with Telstra and Vodafone, but it does not ultimately think it would work:

Optus is working with the other MNOs to assess the viability of temporary disaster roaming. If a roaming solution was in place, it would have likely resulted in other mobile networks being unable to accommodate the extra traffic given the number of users trying to roam. If the capacity issue could have been addressed, roaming would likely not have worked as the Optus core network was down and Optus subscribers would not have been able to be authenticated for roaming.

There are a couple of details worth unpacking here.

I did a bunch of radio interviews on the day of the outage trying to piece it together, and the question of roaming to Telstra or Vodafone kept coming up. That made me ponder exactly how you’d authenticate a phone for usage on a secondary network if the primary network was itself unable to authenticate it, short of every telco keeping some form of pre-shared authentication data on file.

That itself opens up a whole different can of worms around data security, consumer and business privacy and the prospects of account jacking if any of those 17 million daily hack attempts were to sneak past the security barriers.
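To make the authentication problem a bit more concrete, here’s a deliberately simplified sketch in Python. It is not how any real telco core works; the `HomeCore` class, the `try_roaming_attach` function and the subscriber ID are all made up for illustration. The only thing it models is the dependency Optus describes: a visited network has to ask the subscriber’s home network whether they’re legitimate, and if the home core is down, that question never gets answered.

```python
class HomeCore:
    """Hypothetical stand-in for a home network's subscriber systems."""

    def __init__(self, subscribers, online=True):
        self.subscribers = set(subscribers)
        self.online = online

    def verify(self, subscriber_id):
        # If the home core is down, nobody can be verified at all.
        if not self.online:
            raise ConnectionError("home core unreachable")
        return subscriber_id in self.subscribers


def try_roaming_attach(subscriber_id, home_core):
    """A visited network asks the home core to vouch for the subscriber."""
    try:
        return home_core.verify(subscriber_id)
    except ConnectionError:
        # No answer from the home network means no roaming access.
        return False


# Made-up subscriber ID, purely for illustration.
home = HomeCore({"sub-001"}, online=True)
ok_when_up = try_roaming_attach("sub-001", home)

home.online = False  # simulate the core network outage
ok_when_down = try_roaming_attach("sub-001", home)

print(ok_when_up, ok_when_down)
```

Under those assumptions, the same customer who can roam while the home core is up gets refused the moment it goes down, which is the chicken-and-egg problem in Optus’ answer.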

But it’s the capacity issue that’s more interesting here, I think – and while most folks won’t like it, I think Optus is probably dead right here for most Australians.

Think about whenever you’ve been to a big concert or sporting event where thousands of people were present. Odds are good that your mobile reception was a bit on the crappy side, right?

While there are solutions around technologies such as 5G designed to mitigate this to an extent – though having a few more mmWave 5G phones for Aussies to buy wouldn’t hurt here either – the practical reality is that the more users you pack onto a single mobile cell, the more it has to share that capacity, and the worse it gets for everyone.
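As a rough back-of-the-envelope illustration of that sharing effect, here’s a tiny Python sketch. It assumes a cell’s capacity is split perfectly evenly between active users, which real networks don’t do, and every number in it is hypothetical rather than anything Optus, Telstra or Vodafone has published.

```python
def per_user_mbps(cell_capacity_mbps: float, active_users: int) -> float:
    """Idealised even split of one cell's capacity among its active users."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return cell_capacity_mbps / active_users


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
CELL_CAPACITY_MBPS = 300.0
usual_users = 100
roaming_users = 50  # extra users arriving from an offline network

before = per_user_mbps(CELL_CAPACITY_MBPS, usual_users)
after = per_user_mbps(CELL_CAPACITY_MBPS, usual_users + roaming_users)

print(f"before: {before:.1f} Mbps each, after: {after:.1f} Mbps each")
```

With those made-up numbers, everyone on the cell goes from 3 Mbps to 2 Mbps once the extra users turn up, and it only gets worse as more pile on.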

Factor in the reality that most Australians do live in specific clusters – mostly along the eastern seaboard by population numbers – and it’s not hard to see one network collapse triggering a second, or even third if everyone bounced from one to another. That would be an even worse scenario.

It’s perhaps not much better in more regional and rural areas, either. There you might not have the population numbers to deal with, but you’ve equally got more sparse coverage and the fact that every network – including Optus – does a lot of load management work on a dynamic basis to allocate resources. If you’ve got scant coverage in a regional area, adding even a few hundred off-network users could tip the scales towards bad network performance.
