A shed load of disk!

By | January 23, 2007

Normally I like to see a well-planned and well-implemented storage design, be it nicely balanced and symmetrical or a bit of well-thought-through isolation.  Seldom do I like to see sloppiness and laziness.  I say "seldom" because there *are* rare occasions where a bit of sloppiness can make life a little easier.  For example, my current project would be a lot simpler if a little less thought had gone into the management of the storage arrays that I'm currently looking at.

I'm heading a project to remove a lot of disk from several large HDS 9980V subsystems.  These boxes were originally well specced (is that how you spell it?) by the HDS guys, allowing for a good initial implementation as well as keeping those good practices going forward.  However, circumstances have changed (consolidations etc.) and it's now thought that some money can be reclaimed by pulling a lot of disk out and getting some money back for it.  Sooo, my job is to identify which disks can be easily and safely pulled, and to prepare the subsystems so that the engineers can go in and remove the surplus disk.  We have around 2,000 spindles!

The problem for me is that, because the boxes have on the whole been quite well managed, almost every Array Group (RAID set) has at least a handful of LDEVs residing on it.  That makes it necessary for me to Cruise Control those LDEVs onto other Array Groups before we can tell the engineers which Array Groups to pull out.

Then of course there is the issue of performance, as well as trying not to ruin the previous good design.  I don't want to be the one who butchered a good design and walked away leaving a smouldering heap behind me.  It's a standing joke among contractors that as soon as you finish a project and move elsewhere, you will be blamed for everything that ever goes wrong: "Oh right, yes, I seem to remember that Nigel the consultant who was in a few months ago changed something, and that must be the problem…"  I'm sure you all know the story.  However, I'm faced with a situation where such comments may be justified if I'm not careful.

I've used Cruise Control a couple of times in the past, when hot spots (both times busy Array Groups) were identified and I was moving LDEVs onto cooler spindles.  The software is pretty intuitive and has worked well for me in the past.  However, this time around I'm moving a LOT of LDEVs, and I'm not moving them for performance reasons.  I'm moving them as part of a major internal subsystem reorganisation that basically involves removing a ton of capacity from several subsystems.  So how do I make sure that performance doesn't take a nosedive?

Anyway, the most interesting thing for me is how to ship 2,000 disks off site.  The solution that has been decided on is to use 5 empty frames, fill them up, load them onto a lorry and ship them back to base.  Quite a lot better than ordering 2,000 cardboard boxes with the associated polystyrene packing and the immense space that would take!

Also, I'm not sure what the rules are for HDS 9980V storage, but when I worked with HP EVA storage the documentation from HP stipulated waiting 60 seconds between each drive removal in order to maintain cooling and a consistent airflow through the array.  If it's similar with HDS 9980V storage, then that would require approximately 33 hours just to pull the disks out.  Of course these 2,000 disks (an estimate, as we don't know exactly how many we can pull yet) are spread across 6 subsystems, so you *could* do it quicker, but that would require the subsystems to be right next to each other, and they are obviously not.  And that would assume your engineer(s) were being slapdash and rushing, and our engineers are good!  So I think it's safe to say that the simple job of pulling the disks out is a task and a half in itself, and our engineers won't need to go to the gym after work that night.
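For what it's worth, the back-of-the-envelope arithmetic behind that 33-hour figure looks like this.  This is just a sketch using the HP EVA 60-second interval mentioned above (the actual HDS 9980V requirement may differ), and the 2,000-drive count is still an estimate:

```python
# Rough timing for staggered drive pulls.
# Assumes a 60-second cooling interval between removals, borrowed
# from the HP EVA documentation; the HDS 9980V figure may differ.

DRIVES = 2000        # estimated spindles to remove (not yet final)
INTERVAL_S = 60      # seconds to wait between pulls on one subsystem
SUBSYSTEMS = 6       # the drives are spread across 6 boxes

# One engineer working through every drive in sequence:
serial_hours = DRIVES * INTERVAL_S / 3600

# Best case: one engineer per subsystem, all pulling in parallel
# (unrealistic here, since the boxes aren't next to each other):
parallel_hours = (DRIVES / SUBSYSTEMS) * INTERVAL_S / 3600

print(f"serial:   {serial_hours:.1f} hours")    # ~33.3 hours
print(f"parallel: {parallel_hours:.1f} hours")  # ~5.6 hours
```

Even the best-case parallel figure is most of a working day, before you add travel between sites or any margin for care.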

Actually, talking about the disk removals on a conference call this morning, the conversation with our lead engineer descended so far that we were worried the engineers were going to stack the disks and have a Jenga tournament 😉

It's going to be interesting!

3 thoughts on "A shed load of disk!"

  1. snig

    Here’s a question for you. How will you wipe all the data from the disk before removal?

Oh, and make sure you have 2,000 blanks to replace those drives with, or you will definitely have cooling problems. 😉

  2. Nigel

Ah yes, the blanks are arriving in the "empty" frames, and the destruction of the data is being done at the other end (not my call or recommendation). However, it has been decided to load the disks into the empty delivery frames in a random order.

On the recent USP decommission I did, we discussed whether pulling the disks out and putting them back in a random order would be good enough. Obviously we would have to be careful to mix things up enough to compensate for RAID etc. However, it was decided that was too risky 😉 So we opted to mix them up AND have the engineers recut the disks in a different OPEN-x emulation.

I know of past cases where customers have said that Array Groups could be reformatted, then changed their minds and decided they actually needed the data on the disks, sent them off to a disk recovery specialist, and been told that the data was not recoverable.

In each of these recent cases we did not have enough time to use the Data Shredder feature, which writes multiple patterns of data over each sector and is US government approved.
