
Networks HP 1910 switch problems

Discussion in 'Tech Support' started by bestseany, 30 Jan 2014.

  1. bestseany

    bestseany What's a Dremel?

    Joined:
    2 Jul 2009
    Posts:
    448
    Likes Received:
    6
    I'm wondering if anyone has got any idea on this switch problem I have at work....

    We have two HP 1910-48G switches which we use for a Hyper-V cluster with an iSCSI SAN. The switches are linked together by 6 ports in a link aggregation group, and port 48 on each switch is used as an uplink to our existing Cisco LAN infrastructure.

    The Hyper-V cluster servers use network teaming and iSCSI MPIO so that everything is split evenly across both switches for fault tolerance.

    The problem we have is that the uplink in port 48 in the first switch seems to go down if you so much as blow on it, which takes down our entire cluster! It's as if the brief downtime in the uplink is taking the switches down, even though they carry on running fine. The only event in the syslogs just shows the uplink port going down and up.

    I thought that the uplink in the second switch would take over instead through STP, or the Hyper-V hosts should still at least see each other and the SAN even with no uplink working to the rest of the LAN.

    This does appear to be an issue with the switches themselves. This was all configured before I started working here, but I've been through the configuration and haven't noticed any issues. Could this be an STP problem? Or a bug maybe?

    I'm waiting for a quiet Sunday to take this all down and do some proper testing and maybe a firmware update, but it's a strange problem that I've not come across before. I've been managing Hyper-V clusters for years and never had this issue in the past.
     
  2. deathtaker27

    deathtaker27 Modder

    Joined:
    17 Apr 2010
    Posts:
    2,238
    Likes Received:
    186
    Have you run a Wireshark capture to see what is happening at the time it goes down?
     
  3. Chairboy

    Chairboy I want something good to die for...

    Joined:
    10 Jun 2004
    Posts:
    1,773
    Likes Received:
    112
    [Edit]

    Just read what you've put, so I was no help there, sorry!
     
  4. bestseany

    bestseany What's a Dremel?

    Joined:
    2 Jul 2009
    Posts:
    448
    Likes Received:
    6
    No, not yet. It will be something I do when the whole thing is taken down for testing.
     
  5. BigM2006

    BigM2006 What's a Dremel?

    Joined:
    8 May 2006
    Posts:
    71
    Likes Received:
    1

    Do you have both HPs connected to the core Cisco? And then 6 links in a link aggregation group between the two HPs?

    If that's the case, it may be that STP / the root bridge election is causing the connection to go HP1 -> Cisco -> HP2 (i.e. the link aggregation group is blocked to remove the loop). So removing the uplink port on one of them causes everything to go down until STP re-converges, which depending on the config could take 30 - 60 seconds.
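
    As a rough check on that 30 - 60 second figure, here is a minimal sketch of the classic (non-rapid) STP arithmetic. It assumes the standard 802.1D default timers (max age 20 s, forward delay 15 s); the real values on these switches may differ, and the function name is just made up for illustration.

```python
# Assumed 802.1D default timers in seconds; check the actual switch config.
MAX_AGE = 20
FORWARD_DELAY = 15


def classic_stp_outage(direct_failure: bool,
                       max_age: int = MAX_AGE,
                       forward_delay: int = FORWARD_DELAY) -> int:
    """Rough time before a previously blocked port forwards under classic STP."""
    # A port moving to forwarding spends forward_delay in listening,
    # then forward_delay again in learning.
    transition = 2 * forward_delay
    # If the failed link is not local to the switch, the stale root
    # information has to age out (max_age) before the transition starts.
    return transition if direct_failure else max_age + transition


if __name__ == "__main__":
    print(classic_stp_outage(direct_failure=True))   # 30
    print(classic_stp_outage(direct_failure=False))  # 50
```

    So with default timers you would expect somewhere between 30 and 50 seconds of outage, which is plenty to upset iSCSI and cluster heartbeats.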



    I would look at port-fast settings, change to rapid spanning tree, or look at the design to remove the loop, so this isn't an issue in the first place.
    I would assume that the Cisco can also do link aggregation, so perhaps look at changing to a setup with Cisco -> (multi-port link aggregation) -> HP1 -> (multi-port link aggregation) -> HP2.
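
    To show why the current layout has a loop and the chained one doesn't, here is a very simplified sketch. It is not real STP (no bridge IDs, priorities or path costs); it just grows a breadth-first tree from an assumed root and reports any link left out, treating each link aggregation group as one logical link. The switch names and the choice of the Cisco as root are assumptions.

```python
from collections import deque


def blocked_links(links, root):
    """Return the links left out of a breadth-first tree grown from root."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    visited = {root}
    tree = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in neighbours.get(node, []):
            if peer not in visited:
                visited.add(peer)
                tree.add(frozenset((node, peer)))
                queue.append(peer)

    # Any link not used by the tree is redundant and would be blocked.
    return [link for link in links if frozenset(link) not in tree]


# Current layout: both HPs uplinked to the Cisco, plus the LAG between them.
looped = [("cisco", "hp1"), ("cisco", "hp2"), ("hp1", "hp2")]
# Suggested layout: Cisco -> HP1 -> HP2, each hop a LAG.
chained = [("cisco", "hp1"), ("hp1", "hp2")]

if __name__ == "__main__":
    print(blocked_links(looped, "cisco"))   # [('hp1', 'hp2')]
    print(blocked_links(chained, "cisco"))  # []
```

    In the looped case the HP1 <-> HP2 group is the link that ends up blocked, so losing one uplink forces a re-convergence. In the chained case nothing is blocked, so there is nothing to re-converge, though HP2 then depends on HP1 for its path to the LAN.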

    Mike
     
