More Cloud Outages You Say? Microsoft Proves Me Right.

Posted on Wednesday, Feb 8, 2023 by Ned Bellavance

Featured in this episode of Chaos Lever

Remember when I said that there would be more cloud outages in 2023, and also that no one would really care? Boy, is my back sore from all the patting!

During a planned maintenance update on January 25th, Microsoft made a woopsie-doosie with their routers, which caused a multi-hour outage for Teams, Azure, and M365 services. According to their post-incident report, the command sent to the router caused it to tell all adjacent routers in the WAN to recompute their adjacency and forwarding tables. While that computation was happening, the routers couldn’t properly forward traffic.

Amazingly, the root cause was not actually DNS for once. It was BGP. Which messed up DNS. So it’s still kinda DNS. Microsoft has assured everyone that they’ve taken steps to prevent this issue from occurring in the future, which is what every company always says.

Entropy simply laughs maniacally, because there are always infinitely more ways for a thing to be broken than fixed. It’s why I never made my bed growing up and why the universe will eventually descend into a heat death from which no escape is possible.

Put in that context, I guess a four-hour outage doesn’t seem that bad now, does it?