So Dell contacted us a few weeks back and told us they would love to remote into our MD3000i Storage Array (hereafter referred to as the “SAN”) and update its firmware.
Sure, we told Dell. Love to have ya!
So some scheduling took place, and emails were sent announcing the downtime, and backups were made, and the day came and Dell updated our firmware.
And ye olde SAN box would no longer respond to pings. So our ESX hosts couldn’t find their precious VMs, and for those of you who’ve never had an ESX host lose its VMs, that’s BAD.
So the troubleshooting began. The SAN is on its own VLAN. The ESX hosts connect to a LAG on the switch that trunks to said VLAN. The ESX hosts could happily ping each other, so the VLAN was set up correctly. If I plugged a laptop, configured on the same subnet as the SAN, into one of the network cables that had been plugged into the SAN, it could ping the ESX hosts. So the network cables weren’t bad.
So we called Dell support, and we got a lively one. I could tell from the beginning of the call he wasn’t going to be our savior that night, so I continued troubleshooting while he placed us on hold to “check with someone else” (read: smoke some more chronic). I changed the IP addresses of the iSCSI hosts to something else and back. I disabled the iSCSI controllers and re-enabled them. I rebooted the array. I turned off the VLAN tagging on the iSCSI ports.
Voila, the answer! I had set the switch ports where the SAN attached to untagged VLAN access mode. Apparently, if you set the SAN to put a VLAN tag on its traffic, that worked with those ports before the firmware upgrade, but not after. Now you either have to turn off the VLAN tagging on the SAN, or set the switch port mode to tagged. Otherwise network no worky.
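For anyone trying to picture the mismatch, here is a minimal Python sketch of one strict way a switch port could treat incoming frames. It is a toy model under my own assumptions, not the actual logic of the switch or the SAN firmware: the names Frame, SwitchPort and ingress_vlan are made up for illustration, VLAN 100 is a placeholder, and real switches differ (some accept tagged frames on an access port when the tag matches the port's VLAN, which would explain why the old setup worked).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    # 802.1Q VLAN ID carried by the frame; None means the frame is untagged.
    vlan_tag: Optional[int]


@dataclass(frozen=True)
class SwitchPort:
    mode: str  # "access" (untagged) or "tagged" (hypothetical labels)
    vlan: int  # the one VLAN this port belongs to in this toy model


def ingress_vlan(port: SwitchPort, frame: Frame) -> Optional[int]:
    """Return the VLAN the frame lands in, or None if the port drops it."""
    if port.mode == "access":
        # Strict assumption: an access port only takes untagged frames.
        return port.vlan if frame.vlan_tag is None else None
    if port.mode == "tagged":
        # Assumption: no native VLAN, so only a matching tag gets through.
        return frame.vlan_tag if frame.vlan_tag == port.vlan else None
    raise ValueError(f"unknown port mode: {port.mode!r}")


SAN_VLAN = 100  # placeholder VLAN ID, not our real one

# Our broken combo: SAN tagging its traffic, switch port in access mode.
print(ingress_vlan(SwitchPort("access", SAN_VLAN), Frame(SAN_VLAN)))
# Fix 1: turn off tagging on the SAN.
print(ingress_vlan(SwitchPort("access", SAN_VLAN), Frame(None)))
# Fix 2: leave tagging on and set the switch port to tagged.
print(ingress_vlan(SwitchPort("tagged", SAN_VLAN), Frame(SAN_VLAN)))
```

In this model the first combination is dropped and both fixes put the frame in the SAN VLAN, which matches what we saw after the upgrade.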
If anyone out there knows which way is the correct behavior, I’d love to know. We were using a Dell PowerConnect 6248 as our switch.
Anyway, we figured it out before our wonderful Dell tech, so we sent him on his way, rescanned the iSCSI HBAs on our ESX hosts, restarted the VMs, and were back in business.
My $.02 Weed