Networks HP 1910 switch problems

bestseany · 30 Jan 2014

I'm wondering if anyone has any ideas about a switch problem I have at work.

We have two HP 1910-48G switches which we use for a Hyper-V cluster with an iSCSI SAN. The switches are linked together by 6 ports in a link aggregation group, and port 48 on each switch is used as an uplink to our existing Cisco LAN infrastructure.

The Hyper-V cluster servers use network teaming and iSCSI MPIO so that everything is split evenly across both switches for fault tolerance.

The problem we have is that the uplink on port 48 of the first switch seems to go down if you so much as blow on it, which takes down our entire cluster! It's as if the brief loss of the uplink is taking the switches down, even though they carry on running fine. The only events in the syslog show the uplink port going down and coming back up.

I thought that the uplink on the second switch would take over through STP, or that the Hyper-V hosts would at least still see each other and the SAN even with no working uplink to the rest of the LAN.

This does appear to be an issue with the switches themselves. This was all configured before I started working here, but I've been through the configuration and haven't noticed any issues. Could this be an STP problem? Or a bug maybe?

I'm waiting for a quiet Sunday to take this all down and do some proper testing and maybe a firmware update, but it's a strange problem that I've not come across before. I've been managing Hyper-V clusters for years and never had this issue in the past.

deathtaker27 · 30 Jan 2014

Have you run Wireshark to see what is happening at the moment it goes down?

Chairboy · 30 Jan 2014

[Edit]

Just read what you've put, so I was no help there, sorry!

bestseany · 30 Jan 2014

deathtaker27 said: ↑

Have you run Wireshark to see what is happening at the moment it goes down?

No, not yet. It will be something I do when the whole thing is taken down for testing.

BigM2006 · 31 Jan 2014

Do you have both HPs connected to the core Cisco, and then 6 links in a link aggregation group between the two HPs?

If that's the case, STP and the root bridge placement may be forcing traffic to go HP1 -> Cisco -> HP2 (i.e. the link aggregation group is blocked to remove the loop). Pulling the uplink on one of them then takes everything down until STP re-converges, which depending on the config could take 30 to 60 seconds.
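As a rough check on that figure, here is the arithmetic for classic (non-rapid) spanning tree in a short Python sketch. It assumes the IEEE 802.1D default timers (max age 20 s, forward delay 15 s); the timers actually configured on these switches are not stated in the thread, and the function names are just for illustration.

```python
# Assumed IEEE 802.1D default timers, in seconds.
# The values configured on the HP 1910s or the Cisco may differ.
MAX_AGE = 20
FORWARD_DELAY = 15


def direct_failure_outage(forward_delay=FORWARD_DELAY):
    """Outage when the switch that lost its root port holds the blocked port.

    The blocked port spends one forward delay in listening and one in
    learning before it starts forwarding.
    """
    return 2 * forward_delay


def indirect_failure_outage(max_age=MAX_AGE, forward_delay=FORWARD_DELAY):
    """Outage when the blocked port is on the other switch.

    That switch must first age out its stored BPDU information (max age),
    then go through listening and learning.
    """
    return max_age + 2 * forward_delay


print(direct_failure_outage())    # 30
print(indirect_failure_outage())  # 50
```

Either way the cluster would be cut off for far longer than a brief port flap. Rapid spanning tree normally recovers much faster than this.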

I would look at the PortFast settings, change to rapid spanning tree, or look at the design to remove the loop so this isn't an issue in the first place.
I would assume that the Cisco can also do link aggregation, so perhaps look at changing to a setup with Cisco -> (multi-port link aggregation) -> HP1 -> (multi-port link aggregation) -> HP2.
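To show why the chained layout avoids the problem, here is a small Python sketch that checks a topology for a layer-2 loop. It treats each link aggregation group as a single logical link, and the device names and the `has_loop` helper are only illustrative, not taken from any real configuration.

```python
def has_loop(links):
    """Return True if the undirected links contain a cycle (a layer-2 loop)."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # Both ends are already connected, so this link closes a loop.
            return True
        parent[root_a] = root_b
    return False


# Layout as described: both HPs uplinked to the Cisco, plus the LAG between them.
current = [("Cisco", "HP1"), ("Cisco", "HP2"), ("HP1", "HP2")]
# Suggested layout: Cisco -> HP1 -> HP2, each hop being one aggregated link.
proposed = [("Cisco", "HP1"), ("HP1", "HP2")]

print(has_loop(current))   # True
print(has_loop(proposed))  # False
```

With no loop, STP has nothing to block between the two HPs. The catch is that HP2 then reaches the LAN only through HP1, so the redundancy has to come from the multiple ports in each aggregation.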

Mike
