Hi there. Remember me? I’m your friendly (not) neighborhood (probably not) storage blogger (OK, well, sometimes).
So the situation keeps coming up, and it’s worthy of a post, so here I am. You’ve all (all six of you) probably gathered that I’m not here to promote anything. I don’t play favorites, and I certainly don’t get any money from the stupid ads I placed on the sidebar… (why are those still there?) Hell, I’m well aware that I am sorely lacking as a writer, even.
Why are companies locking the storage admins out of the hosts?
Why, for the love of Pete? I have a customer where the storage admin’s job stops at the connection to the server, for ‘security reasons’.
It’s a useless endeavor. It doesn’t gain you ANYTHING as far as security goes, and in fact it *WILL* end up costing you more than you would ever have dreamed of saving.
It also makes your storage admins feel untrusted and unappreciated… (but employers don’t care about that so much these days.)
So, in a nutshell, I will list the three common reasons for locking the storage people out of the server environments, and why doing so is a complete waste:
1. “My computers have sensitive information on them that the storage people shouldn’t have access to.”
If you trust someone to manage your storage environment, you trust them with your data. I can name two different ways off the top of my head that a storage admin could gain access to data without ever going NEAR the server, either physically or over the network. And one of those would be COMPLETELY UNTRACEABLE.
Long story short, the storage admin has access to the data. Just get used to that fact and stop making up ways to make their lives more difficult.
If you have doubts about the people you’re hiring, look at your hiring practices.
2. “The storage admin could inadvertently crash the host.”
Well, gee. Anyone with access to the power cord could do that. Again, I can think of at LEAST two different ways a storage admin could do that without even trying, mistakes that happen on a daily basis (removing device masking, removing zones). Again: you’re fixing the chicken coop with the fox inside.
Try trusting the people you hire to do what you pay them to do.
3. “The storage admin doesn’t require access.”
Well, this is kind of a generalization. Many companies practice an “if your job doesn’t directly relate to the server, you aren’t granted access to it” policy. If troubleshooting only extended to the point where the server connects to the SAN, the above would be a true statement. But as with most systems, there are inter-relationships that are crucial. Multi-path software, HBA management software, drivers and firmware are *ALL* part of the storage environment.
And the bottom line is this: Storage touches EVERYTHING.
If, as at most sane companies, backup is included in the storage job, that’s 100% everything. Otherwise there are SOME hosts, the ones not attached to the SAN, where the storage admin doesn’t need access.
Troubleshooting in an environment where the storage admin’s access ends at the HBA connection can take HOURS longer than it otherwise would, and requires at least twice the manpower.
Storage doesn’t stop at the physical layer. Storage management software counts!
My scenario: here’s why giving your storage administrator access to the servers *WILL* save you money.
It’s 4:15 on a Friday afternoon. The dual-port PCI-e HBA you put into the server (to save money and slots, which are tight in 1U servers) has failed. Not the port (which, granted, is infinitely more likely) but the chip itself. The SAN storage for the host is down.
As the storage admin, I got a page when the switch ports went dark. Assuming the storage environment is managed properly, I instantly know which host is experiencing the problem. (It’s also safe to assume that the host owner knows, because his disks are MISSING.)
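For what it’s worth, “managed properly” here mostly means somebody wrote down what is plugged into which switch port. A minimal sketch of that lookup, assuming a hand-maintained inventory (the switch names, port numbers and host names below are all made up for illustration, and this is not any vendor’s API):

```python
# Hypothetical inventory: (switch name, port number) -> attached host.
# Every name here is invented for illustration.
PORT_INVENTORY = {
    ("fabric-a-sw1", 4): "dbhost01",
    ("fabric-b-sw1", 4): "dbhost01",
    ("fabric-a-sw1", 5): "apphost02",
    ("fabric-b-sw1", 5): "apphost02",
}


def hosts_for_dark_ports(dark_ports, inventory=PORT_INVENTORY):
    """Group offline ports by the host they belong to.

    Ports missing from the inventory are grouped under None.
    """
    affected = {}
    for port in dark_ports:
        host = inventory.get(port)
        affected.setdefault(host, []).append(port)
    return affected


def hosts_fully_down(dark_ports, inventory=PORT_INVENTORY):
    """Return hosts whose every documented port is offline (all paths lost)."""
    dark = set(dark_ports)
    ports_by_host = {}
    for port, host in inventory.items():
        ports_by_host.setdefault(host, set()).add(port)
    return sorted(h for h, ports in ports_by_host.items() if ports <= dark)
```

In the dual-port HBA failure, both of the host’s ports go dark at once, so `hosts_fully_down` would flag that host as having lost every path, while a single dark port would only show up in `hosts_for_dark_ports`.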
Now, as the storage admin, I’ve tested the connections and the switch ports, and I’ve narrowed it down to an HBA issue. The host needs to be shut down (assuming it’s not Windows and blue-screened at the first sign of trouble).
Now if I have to coordinate the reboot, the installation of the new HBA, flashing up-to-date firmware, pulling WWPNs, rezoning, remasking and rebooting the host again, we’re talking about time. Maybe not much, maybe the host admin is on the ball, and maybe if you’re clever you can zone/mask before the initial boot, but you still need to flash firmware to stay within supportability and not risk further problems.
I’ve done this. If I’m doing it myself, the system is back up by now, and the only thing I need the application owner to do is validate that the app is functioning correctly.
If you don’t have access but are sitting in the same room with the host admin, it’s still fairly simple but takes a little longer, though not much.
So let’s hope the failure happens during business hours. If it’s after hours, you’ve got two people driving in instead of one. That’s hours of downtime in total, and that’s if you’re lucky enough to be able to get ahold of the host admin.
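If you want to see where the time goes, count the hand-offs. Here’s a rough sketch that walks the HBA replacement steps and counts how many times the work has to pass from one person to the other. The “host” versus “san” tagging of each step is my own assumption for illustration, and there are deliberately no durations in here because I’m not going to invent numbers.

```python
# Steps from the HBA replacement scenario, tagged with where the work happens.
# The "host" / "san" tags are assumptions made for illustration.
STEPS = [
    ("shut down host", "host"),
    ("install new HBA", "host"),
    ("flash HBA firmware", "host"),
    ("pull WWPNs", "host"),
    ("rezone", "san"),
    ("remask", "san"),
    ("reboot host", "host"),
]


def count_handoffs(steps, storage_admin_has_host_access):
    """Count how often the work passes from one person to another.

    The storage admin is assumed to start the chain, since they
    diagnosed the failure from the switch side.
    """
    def owner(where):
        if where == "san" or storage_admin_has_host_access:
            return "storage admin"
        return "host admin"

    owners = ["storage admin"] + [owner(where) for _, where in steps]
    # A hand-off is any point where consecutive steps have different owners.
    return sum(1 for a, b in zip(owners, owners[1:]) if a != b)
```

With host access, `count_handoffs(STEPS, True)` comes out to zero. Without it, the same list needs three hand-offs, and every one of them is a phone call, a page, or a drive in.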
Now this came about because I had an outage happen. A VMware LUN disappeared and the owners of the “secure” VMware environment were nowhere to be found. (On what planet is it OK for an IT person to not respond to a page?)
The owners of the “unsecure” VMware environment and I sat around for a while twiddling our thumbs before management decided that the host owner wasn’t going to get back to us and that we would leave it for the night.
That’s a whole night this host will be down because the people who were there didn’t have the information needed to finish fixing the problem.
I’ve said it before, I’ll say it again. If you don’t trust the people you hire, maybe who has access to what isn’t your primary problem.