Archive for January, 2009

Initial Hardware Status Update

Friday, January 9th, 2009

The hardware status of fly right now is a little confusing because some nodes are down and I’m waiting for some replacement hardware so I can take the older nodes apart to fix the newer ones.

We have basically three types of nodes in fly at the moment… well kinda four:

1. dual 2.4GHz HT P4’s with 2GB of RAM each. These are 1U nodes from PSSC, and they’re dying like flies (get it, flies? ha ha). These are compute-0-1 through compute-0-13.

2. dual 3.0GHz HT P4’s with 4GB of RAM each. These are also 1U nodes from PSSC, and they’re dying too (mostly power supplies and hard drives). These are compute-0-14 through compute-0-23.

3. Intel Core 2 Duo 2.13GHz nodes with 4GB of RAM each. These are compute-1-6, 1-7, and 1-9.

4. AMD Athlon X2 5000+ nodes with 4GB of RAM each. These are the rest of Rack 1. The nodes in Rack 1 were originally Athlon 1200’s with 768MB of RAM. I gutted them and replaced the internals with their current ones for around $300 a node for the Athlon X2’s and around $500 a node for the Intel Core 2 Duos (I could do it even cheaper now, but that’s the way of things). These cases are great: they’re easy to work in, the fans are big and quiet and don’t die, and no power supply in this rack has ever died, despite the fact that they’re several years older than Rack 0.

So, Jared Benedict at Dartmouth, an old friend and Hampshire alum, offered to sell me a blade chassis full of Dell PowerEdge 1855’s. Eight of them are dual 3.0GHz Xeons of the 800MHz FSB variety with 2MB of L2 cache (newer than the 3GHz Xeons we already have in compute-0-14 through compute-0-23), and two of them are dual dual-core (total 4 processor cores) 2.8GHz Xeons. All have 4GB of RAM apiece. He wanted $5K for the whole kit and caboodle, and they’ve got some nice 10K RPM Ultra320 SCSI drives in ‘em too, not that we need them (I really need to set up Lustre now!). They’re all 64-bit processors too, which is nice… someday I can make the whole cluster 64-bit… maybe next year when we replace the rest of Rack 0. So this sounded good, but getting the check from the Business Office has been more difficult than I anticipated. They somehow missed our note to please not send the check to Dartmouth, and Dartmouth can’t find the check, so now we have to put a stop payment on the check and start over. Meanwhile, our nodes are dying and the PowerEdge 1855’s are worth less and less every day…

My plan is to replace compute-0-1 through compute-0-13 with these new nodes and use the spare parts from the old nodes to fix 0-20 and 0-22, which are currently dead (power supplies). The new nodes have to replace the old ones rather than sit alongside them, for power/heat reasons (I have to free up a 30A 240V plug on the wall, and our AC can barely handle what we have in there now, let alone adding MORE power draw to the situation). Maybe someday I’ll convince somebody to make an outside air cooling system for the room, or use the server room to heat the annex, or something that makes more sense than air conditioning a room in the winter when there is a plentiful free source of nice cold air just three feet from the backs of the servers. I have been known to open the window…
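For a rough sense of what the swap buys us, here is a back-of-the-envelope tally in Python. It is only a sketch: the NodeGroup class and totals function are names made up for this illustration, it counts physical cores only (hyperthreading is ignored), and it assumes each "dual" P4 or Xeon box is two single-core processors unless the post says otherwise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    """A batch of identical nodes (hypothetical helper for this tally)."""
    name: str
    count: int
    cores_per_node: int   # physical cores only; hyperthreading ignored (assumption)
    ram_gb_per_node: int


def totals(groups):
    """Return (nodes, physical cores, GB of RAM) summed over the groups."""
    nodes = sum(g.count for g in groups)
    cores = sum(g.count * g.cores_per_node for g in groups)
    ram_gb = sum(g.count * g.ram_gb_per_node for g in groups)
    return nodes, cores, ram_gb


# compute-0-1 through compute-0-13: dual 2.4GHz P4, 2GB each
retiring = [NodeGroup("compute-0-1..0-13", 13, 2, 2)]

# The PowerEdge 1855 blades: 8 dual 3.0GHz Xeons, 2 dual dual-core 2.8GHz Xeons
incoming = [
    NodeGroup("PowerEdge 1855, dual 3.0GHz Xeon", 8, 2, 4),
    NodeGroup("PowerEdge 1855, dual dual-core 2.8GHz Xeon", 2, 4, 4),
]

for label, groups in (("retiring", retiring), ("incoming", incoming)):
    nodes, cores, ram_gb = totals(groups)
    print(f"{label}: {nodes} nodes, {cores} cores, {ram_gb}GB RAM")
```

Under those assumptions the core count stays about the same (26 going out, 24 coming in), but the RAM goes from 26GB to 40GB, and every core coming in is faster and 64-bit.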

Since not much is going on with fly right now, I might also upgrade to the latest release of ROCKS at the same time, as we’re a bit behind with 4.2.1.

Feel free to post comments, suggestions, or questions.