Why WordPress Was Down On Thursday

On Thursday, United Tech Guys went down along with more than 10 million other blogs hosted on WordPress.com. The WordPress team said little at the time beyond a few vague tweets, but they fixed the problem quickly. Today they published an explanation of the outage on the official WordPress.com blog. In it, founder Matt Mullenweg says the cause was a simple code error, a single out-of-place character, and that the team took the site down to keep it from causing further damage. A candid account like this is reassuring: it’s good to know that the WordPress team really cares about keeping everything running smoothly.

Here’s Matt’s blog post in its entirety:

As some of you noticed, including a number of major media organizations, WordPress.com had some unexpected downtime on Thursday evening. Whether you’re eating delicious BBQ, as I was, watching a marathon, or about to post your opus, downtime is an annoying interruption and we hate it.

This had nothing to do with our network providers, or data centers, or aliens, it was completely our fault. A single line, nay, a single character out-of-place, slipped by our normal review and testing and started overwriting settings when triggered. The team immediately took the site down to prevent further damage and clean up the mess that had been caused. All hands were called to deck.

First we determined that 11.2 million blogs were unaffected by the bug. So we brought those back up. For the remaining 50,000 or so, including some VIPs, we started restoring the lost settings using backups, audit trails, and logs. This was largely automated and we brought blogs back online as they were fixed, but a few final tricky ones were brought back one-by-one by hand because we wanted to make sure everything was in its right place.

For most folks (99%) your site was only unavailable for an hour, the rest came up a bit after that, and the tricky ones we worked on until Friday morning. Fortunately because of the time of day and the shorter duration, this had a smaller effect on traffic (about 3.9m) versus the last time (5.5m).

As a silver lining to this failing of the cloud, we learned a lot. We’ll be using our newfound experience to keep WP.com a safe, stable, and robust place to hang your hat and have your blog call home.

If you have any questions, notice any remaining wonkiness, or just want to say howdy, we’d be happy to hear from you.

Again, it’s good to know that they are well prepared for situations like this. Keep it up, WordPress.
