Dell firmware update broke my MD3000i…kind of

So Dell contacted us a few weeks back and told us they would love to remote into our MD3000i storage array (hereafter referred to as the “SAN”) and update its firmware.

Sure, we told Dell. Love to have ya!

So some scheduling took place, and emails were sent notifying everyone of the downtime, and backups were made, and the day came and Dell updated our firmware.

And ye olde SAN box would no longer respond to pings. So our ESX hosts couldn’t find their precious VMs, and for those of you who’ve never had an ESX host lose its VMs, that’s BAD.

So the troubleshooting began. The SAN is on its own VLAN. The ESX hosts connect to a LAG on the switch that trunks to said VLAN. The ESX hosts could happily ping each other, so the VLAN was set up correctly. If I plugged a laptop, configured on the same subnet as the SAN, into one of the network cables that had been plugged into the SAN, it could ping the ESX hosts. So the network cables weren’t bad.

So we called Dell support, and we got a lively one. I could tell from the beginning of the call he wasn’t going to be our savior that night, so I continued troubleshooting while he placed us on hold to “check with someone else” (read: smoke some more chronic). I changed the IP addresses of the iSCSI hosts to something else and back. I disabled the iSCSI controllers and re-enabled them. I rebooted the array. I turned off the VLAN tagging on the iSCSI ports.

Voila, the answer! I had set the switch ports where the SAN attaches to untagged VLAN access mode. Apparently, having the SAN put a VLAN tag on its traffic while plugged into those untagged ports worked before the firmware upgrade, but not after. Now you either have to turn off the VLAN tagging on the SAN, or set the switch port mode to tagged. Otherwise network no worky.
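For anyone who wants the rule of thumb spelled out, here is a minimal Python sketch of the post-upgrade behavior as I observed it. The function name `link_works` and its two flags are made up for illustration; this is a toy model of what happened on our gear, not how the MD3000i or the switch actually decides anything. The "tagged switch port, untagged SAN" case is an assumption on my part, since I never tested it and it would depend on the port's native VLAN settings.

```python
from itertools import product


def link_works(switch_port_tagged: bool, san_tags_traffic: bool) -> bool:
    """Toy model: traffic flows only when both ends agree about VLAN tags.

    Observed after the upgrade:
      - untagged switch port + SAN tagging ON  -> no connectivity
      - untagged switch port + SAN tagging OFF -> works
      - tagged switch port   + SAN tagging ON  -> works
    Assumed (not tested): tagged switch port + SAN tagging OFF -> fails.
    """
    return switch_port_tagged == san_tags_traffic


# Print all four combinations of the two settings.
for switch_tagged, san_tagged in product([False, True], repeat=2):
    status = "works" if link_works(switch_tagged, san_tagged) else "no worky"
    print(f"switch tagged={switch_tagged!s:5}  SAN tagging={san_tagged!s:5}  -> {status}")
```

The pre-upgrade firmware evidently tolerated the mismatched combination, which is why the problem only showed up after the update.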

If anyone out there knows which behavior is the correct one, I’d love to know. We were using a Dell PowerConnect 6248 as our switch.

Anyway, we figured it out before our wonderful Dell tech, so we sent him on his way, rescanned the iSCSI HBAs on our ESX hosts, restarted the VMs, and were back in business.

My $.02 Weed

4 thoughts on “Dell firmware update broke my MD3000i…kind of”

  1. VLAN – MD3000i

    When VLAN is enabled on the MD3000i the array will reject untagged packets on the specified ports. A firmware upgrade shouldn’t touch this though.

    On the flip side, if the SAN switch port is set to untagged, then VLAN must be disabled on the corresponding iSCSI array port.

    As such what you’re seeing with the 6248 and MD3000i is correct behavior.

    Do you have logs of the settings prior to the upgrade?

    Dave

    1. VLANs

      Dave,

      Before the upgrade, the switch ports were in Untagged mode, the MD3000i was set to tag to VLAN 16, and it worked fine. After the upgrade, I had to make it “compliant” (either set the switch port to tagged mode or turn off the MD3000i VLAN) before it would work.

      Just a side question, if you know. What scenario would exist where you’d need to tag packets at the client level? I could see if you had multiple network cards in a client (aka an ESX box) that needed access to multiple VLANs, but outside of that, is there any situation where I wouldn’t let the switch handle what VLAN the port is in?

      My $.02 Weed

      1. Just like what we have

        Imagine you’re running 802.1q trunking to an ethernet port and the box needs to bootstrap from the network. How do you do it?

        Well, one way is to get the network team to restrict the port to just one of the, say, four vlans currently running on the port. Then you can bootstrap with one IP address bound directly to the interface. Once the system is up, you have all four bound via 802.1q to the interface again, configure your client (I’m familiar with Linux, so we’ll assume that’s it) to tag packets outbound on particular virtual interfaces with specific VLAN information, and you’re done.

        Imagine a case like the one I’m trying to work through today where you have FOUR vlans assigned to one port due to running virtual machines on a box, all with different network environment requirements. You have a database, mid-tier, and web server all running as virtuals under a Xen hypervisor fault-tolerant on two separate boxes. You need a separate vlan for each, because you don’t want your Oracle RAC communication to be happening on the same VLAN as the front-end of your web server talking to the Internet, for instance.

        So client VLAN tagging is both necessary and useful. But Linux makes it REQUIRED if you’re running 802.1q, and that’s confusing since eth0 will have no configuration information, but you’ll have interfaces like “eth0.28” and “eth0.92” for vlans 28 and 92, respectively.

        Make sense?

        –Matt B.
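To make the “eth0.28” and “eth0.92” naming above concrete, here is a small Python sketch that prints the iproute2 commands typically used to create such VLAN subinterfaces on Linux. The helper name `vlan_subinterface_commands` is hypothetical, and the parent interface and VLAN IDs are simply the ones mentioned in the comment. It only builds command strings and does not touch the system, so treat it as an illustration rather than a provisioning script.

```python
from typing import List, Sequence


def vlan_subinterface_commands(parent: str, vlan_ids: Sequence[int]) -> List[str]:
    """Build `ip link` commands that create 802.1Q subinterfaces on `parent`.

    Each subinterface is named "<parent>.<vlan id>", e.g. eth0.28.
    """
    commands = []
    for vid in vlan_ids:
        # 802.1Q VLAN IDs 0 and 4095 are reserved, so usable IDs are 1..4094.
        if not 1 <= vid <= 4094:
            raise ValueError(f"VLAN ID out of range: {vid}")
        name = f"{parent}.{vid}"
        commands.append(f"ip link add link {parent} name {name} type vlan id {vid}")
        commands.append(f"ip link set dev {name} up")
    return commands


# Example: the two VLANs from the comment above.
for cmd in vlan_subinterface_commands("eth0", [28, 92]):
    print(cmd)
```

IP addresses would then be assigned to eth0.28 and eth0.92 rather than to eth0 itself, which matches the point that eth0 ends up with no configuration of its own.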

  2. VLAN tagging

    VLAN tagging has been the source of all my work-related woes this past week, too. On a tangentially related note, I learned that Xen virtual hosting platforms (“Dom0”) cannot easily run an interface in bridged mode and carry an IP address on that interface at the same time. There are work-arounds, but they are kind of hackish and result in interfaces that may appear to be working long before they are.

    Lesson learned. Week wasted. Hopefully finally fix the bloody thing on the morrow.
