Archive for June, 2014

2-15 and 2-27 having hard drive issues

Monday, June 30th, 2014

I may just have a lot of bad hard drives, or these nodes may have backplane issues. I saw this with compute-2-1 at some point, and I don’t remember the exact resolution, so I’ll go look for that post and see if I blogged about it. After replacing the hard drives multiple times and seeing different kinds of errors (the drive can’t be found at all, or it loses communication in the middle of an install and the node resets multiple times), I’ve now got a couple of hard drives in there that are working, but I still think it might be a physical backplane connectivity issue.

compute-4-3 reset

Monday, June 30th, 2014

BIOS says: “Warning: A fatal error has caused system reset! Continue?” I said yes. If it happens again, I’ll worry about it.

compute-1-12 has hardware errors

Thursday, June 26th, 2014

I’m getting this in the kernel log:

kernel: [Hardware Error]: Combined Unit Error: VB Data/ECC error

Could be CPU, memory, or motherboard. The poster in this thread has almost exactly the same hardware we’ve got: http://ubuntuforums.org/showthread.php?t=2010489 and he didn’t have bad memory. Those motherboards have been really lousy. We’ve got two coming back from RMA soon, and I’ve got a couple of spare CPUs sitting around that came back from RMA, so I guess I could swap one or the other of those out.
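To see whether the node keeps throwing the same error or several different ones, it can help to tally the messages. This is only a rough sketch: it assumes each error shows up on a single log line containing the “[Hardware Error]:” marker, as in the message above, and the function name count_hardware_errors is made up for this example.

```python
import re
from collections import Counter

# Captures whatever text follows the "[Hardware Error]:" marker on a log line.
HW_ERROR = re.compile(r"\[Hardware Error\]:\s*(.*)")


def count_hardware_errors(lines):
    """Count each distinct message that follows '[Hardware Error]:'.

    `lines` is any iterable of log lines (an open file, a list, ...).
    Lines without the marker are ignored.
    """
    counts = Counter()
    for line in lines:
        match = HW_ERROR.search(line)
        if match:
            counts[match.group(1).strip()] += 1
    return counts
```

Feeding it the lines of a saved kernel log and printing counts.most_common() would show which message dominates. That won’t say whether the CPU, memory, or motherboard is at fault, but it’s a quick way to compare before and after swapping a part.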