Houston H1 Goes Supernova: ThePlanet’s Datacenter Transformer Explosion
Posted by Jay Charles | Filed under Rants and Not-So-Rants
Saturday at about 4pm (EST), my server host, ThePlanet.com, experienced a major disaster at their Houston, TX H1 datacenter. Apparently, a transformer in one of their electrical equipment rooms exploded, taking out most of the power infrastructure of the datacenter (as well as the walls surrounding the equipment room).
For those not familiar with ThePlanet, they are one of the biggest providers of dedicated servers in the US… with 5 datacenters serving upwards of 20,000 customers (that’s direct customers, not considering the hundreds of thousands of reseller accounts making up a million+ domains on their network). I currently have two servers with ThePlanet, and as luck would have it, they are both in ThePlanet’s H1 datacenter in Houston, TX. As a result, my servers were down all weekend, and just came back on yesterday evening. Power is still a bit shaky, and I’ve read reports that some of the phases of the datacenter are still up and down. It’s scary to think that there are portions of the datacenter currently relying on portable generators (the kind that are in a semi trailer… not the little Honda generators you get at Home Depot, silly).
Here are the good things I have to say about the incident:
- ThePlanet reports that nobody was hurt, and that in itself is a miracle. If you’ve ever seen a large (5+ MW) transformer blow at an outdoor transfer station, you know what sort of power we’re talking about. Now imagine that happening indoors, and the destruction it would (and did) cause. Had anyone been near it during the event, that person wouldn’t be with us anymore… so thank whatever deity or statue you bow to for the fact that nobody was within harm’s reach that day.
- ThePlanet is doing everything they can to put the pieces back together as fast as possible… but…
Here are the bad things I have to say:
- ThePlanet really dropped the ball when it comes to handling support tickets and calls surrounding the event. Although I’m sure they are being flooded with support requests right now, it’s their job to see them through immediately, regardless of whether that means calling in every one of their support people to work 12 hour shifts. We’re not talking about some itty-bitty hosting company here, we’re talking about 5 major datacenters pulling in millions of dollars a month in hosting fees.
- When the tickets do get responded to, I’m getting canned responses that don’t actually answer the question or solve the problem. That, friends, is just unacceptable.
- The canned responses provide comments about what the customer can do to fix problems for himself/herself. For example, the nameservers in the H1 datacenter are now borked, so the customer servers using those nameservers simply cannot resolve domain names. Although changing nameservers is not a major undertaking for an experienced server manager, it still pisses me off when someone I pay for a service fails, and then tells me to go fix it myself.
- I understand that nobody can plan for every contingency, but really, to have the potential for a single event to take out an entire datacenter (we’re talking 9000 servers here) is insane. I would think that a facility of that calibre would have enough redundancy in place to get back up within 24 hours, regardless of the magnitude of the event.
Things I’ve learned from this event
- Having both of my servers in the same datacenter is a really bad idea.
- I need to move one of my servers to another datacenter, and run a “worst case failure” backup from my office. That way, I’ll still at least get my email.
- I should feel lucky that my servers are both on the floor of the DC that got power back first. My understanding is that other customers are not as lucky.
- I need to evaluate whether this is something that warrants moving to another hosting provider. ThePlanet has been good to me for the past 5 years, but this whole mess has really shaken my confidence.
- I need to be thankful that my servers are intact. While the datacenter’s electrical infrastructure was seriously damaged, my servers are unharmed (ThePlanet has reported that no customer servers were damaged during the event).
- I need to be thankful that I don’t work for ThePlanet’s support department… life must be really tough for those guys and gals right about now.
June 6th, 2008 at 4:33 am
I have two boxes at ThePlanet and I didn’t even know they had an incident. I’m not sure where my servers are but I think they are not in Houston. Still, I guess they should have told me about a major incident.
It sounds like you got away with a small bruise. I don’t think I’d move hosts because of this, but, as you mentioned, I’d put some additional measures in place. My next server will probably be a colocated one, somewhere near where I live. Just to see how that goes.
All in all I think ThePlanet are delivering a lot for what they charge. I don’t expect any service from them really apart from uptime, and that they have so far delivered (flashcomguru.com is on one of the boxes there).
June 6th, 2008 at 2:34 pm
Agreed, Stefan.
All in all, the people at ThePlanet have been handling things as well as one could expect.
Today, we had another outage when the generator broke down (seems Phase1 of H1 is running only on generator power with no UPS backup). The outage was short, but when the power came back, a number of servers had dead hard drives (luckily, mine survived it). ThePlanet immediately offered to ship the affected drives to RDS for recovery at their (ThePlanet’s) expense, but still… it seems that things are getting worse before they get better over there in Houston.
All said… the people at ThePlanet are doing everything they can to keep me happy. Today, I ordered two new servers in separate datacenters (neither will be in H1) and this weekend I’ll migrate. ThePlanet gave me the first month free, and waived the setup costs, so really I can’t complain too much. It could have been a whole lot worse.
June 7th, 2008 at 6:56 pm
Hi Jay,
The main thing is that everyone remained unharmed, which is good to know. God knows how a generator blows up just like that.
I guess you’re quite lucky, considering you were the first to get power back and your HDD remains intact.
Hope all is well, and the little one is settling in fine.
Regards,
Jason
July 10th, 2008 at 4:22 pm
Guess it’s easy enough to say, “Welcome back from the dead, again!”