UPS Overwhelmed
May 10, 2013 by Wm. Josiah Erikson (wjens)
I guess the new nodes draw a lot more power when going full-tilt. The latest job that Tom submitted took out one of the UPSes, and therefore a bunch of nodes. I’ve moved some of them over to wall power, and they’re reinstalling now…. should be back up soon, but you’ll have to restart some of your jobs (or your whole run, depending on how you feel about the randomness of that event), Tom. Sorry! I guess we need another couple of UPSes if we want to cover the whole cluster… I have some, but they need batteries – around $500 would put them both back in business….
May 10th, 2013 at 9:30 am
I guess that the great idea that arose in discussions at the workshop we’re attending in Ann Arbor had some unintended consequences 🙂
May 10th, 2013 at 9:33 am
Heh. My fault for plugging too much into the UPS 🙂
May 13th, 2013 at 3:47 pm
It happened again… this time with the bottom two UPSes. Perhaps it’s time to take the MGHPCC’s approach – UPSes for head node only.
May 18th, 2013 at 11:10 am
My goodness. It looks like it happened AGAIN – around 1/3 of the cluster is down. I thought I’d redistributed things so that this wouldn’t occur again. Perhaps I should get some 208V rack-mounted PDUs and use that 208V 30A outlet that used to power rack2 to power the new rack2. That should take care of this problem.
May 20th, 2013 at 10:29 am
This time it was one of the 20A circuits – P3-10. Redistributed a bit, re-racked rack 2 while I was at it, since the rails came in, and powered everything back up. Will order a nice PDU with power draw monitoring and an L6-30P input this afternoon. It’ll be interesting to find out what these C6100s actually draw at full tilt. I suspect it’s around 1000 watts.
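For a rough sense of why a 20A circuit keeps tripping and why the 208V 30A outlet should help, here is a back-of-the-envelope sketch. It assumes the 20A circuits are 120V, that continuous load should stay under 80% of the breaker rating (a common rule of thumb, not something measured here), and it uses my unverified guess of about 1000 watts per C6100 at full tilt. The function names are made up for illustration.

```python
# Rough circuit budgeting sketch. All inputs are assumptions, not measurements:
#   - 120V for the 20A circuits (assumed; only the 208V outlet is known)
#   - 80% continuous-load derating (common rule of thumb)
#   - ~1000 W per chassis at full load (a guess until the metered PDU says otherwise)

def usable_watts(volts: int, amps: int, derate_pct: int = 80) -> int:
    """Continuous power budget for a circuit, in watts (integer math)."""
    return volts * amps * derate_pct // 100

def max_chassis(volts: int, amps: int, watts_each: int, derate_pct: int = 80) -> int:
    """How many chassis of a given draw fit within the derated budget."""
    return usable_watts(volts, amps, derate_pct) // watts_each

GUESSED_CHASSIS_WATTS = 1000  # hypothetical full-tilt draw per chassis

for label, volts, amps in [("120V 20A circuit", 120, 20),
                           ("208V 30A outlet", 208, 30)]:
    budget = usable_watts(volts, amps)
    count = max_chassis(volts, amps, GUESSED_CHASSIS_WATTS)
    print(f"{label}: {budget} W usable, room for {count} chassis")
```

Under those assumptions a single 20A circuit has room for only one chassis at full load, while the 208V 30A outlet has room for four. Once the metered PDU is in, the guessed wattage can be swapped for a real number.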
May 24th, 2013 at 12:15 pm
PDU came in (funny story about getting a Big Jammer instead - ask me sometime), NIMBYing rack2 so I can shut ’em down for the install….
May 24th, 2013 at 1:58 pm
Wow. I thought all modern-day PC power supplies ran on 120 – 240V power these days. Nope. I just plugged the cheapo power supply in compute-1-18 into 208V and it EXPLODED with a noise that was loud enough that my ears still hurt and my heart rate went up significantly. Nothing, however, is on fire. Now we’ll see if I destroyed just the power supply, or more than that….
May 24th, 2013 at 2:48 pm
Oh look, it had a switch on the back. Silly of me. Well, the motherboard and everything else is fine.
May 24th, 2013 at 3:12 pm
Bought three new active-PFC, 80 Plus Bronze certified SeaSonic 300W ATX 2.3 power supplies for like $30 each. Good stuff.
May 24th, 2013 at 3:13 pm
(Active PFC is more efficient, and autosenses input voltage)