Additional backup capacity to be added for Gmail to ensure delivery during network failures
Many of you may have noticed outages or issues with several Google services yesterday, but none were hit as hard as Gmail. Google acknowledged yesterday on its app status dashboard that Gmail was experiencing complete outages or at least service delays for most of the day, and has taken to its official blog today to explain what happened.
According to its post, an extremely rare "dual network failure" knocked out two separate redundant paths, cutting Gmail's capacity starting at about 6am PST yesterday. Although engineers were made aware of the outage right as it happened, it took most of the day to clear up the issue, and mail didn't start flowing at a regular pace until about 4pm PST.
Google says that 71 percent of messages delivered during the period were unaffected, and that across the other 29 percent that were hit by the outage, the average delivery delay was a mere 2.6 seconds. Naturally, some of us noticed delays much longer than this, and Google does say that roughly 1.5 percent of mail was delayed by more than two hours.
Just as we would expect, the Gmail team apologized profusely for the delays and downtime yesterday, saying that they want to ensure "that Gmail users get the experience they expect." The plan is to make Gmail delivery more resilient, even during network failures, by adding extra backup capacity and reviewing internal policies for dealing with rare failures such as this one.
Just in case you were worried, this small outage didn't drop Gmail below its beloved 99.9 percent uptime.
Source: Official Gmail Blog