…and around it goes

April 1, 2007

Tech challenge

Filed under: Personal — steve @ 11:42 pm

Something interesting for the techs who might be reading:

Problem:

Every time I shut down the old mail server, everything stops moving on the network ten minutes later. No machine can ping another, and no traffic reaches any of the other machines. Nothing in or out. The outage never starts until ten minutes after the old server is shut down. If the old server is powered back on, everything moves again, and the recovery happens as soon as I hit the power button.

The machine has two NICs, on two separate networks. Each NIC attaches to a switch, and each switch goes through a separate firewall to a different Internet gateway. Both networks are active, but one is failover. Tcpdump on the machine shows only the ssh traffic to and from the terminal running tcpdump, plus normal broadcasts.

Answer:

I do know the answer. That old server is now offline. Given the information above, it was actually the first place I looked. But I thought it was an interesting enough issue to post here. It’s not one I run into every day. Let’s see how many techs are reading. Comments are open, and I will give the answer in two days if no one gets it before then.

I know, only two of you are reading, it’s too new for more, but humor me :)

3 Comments »

  1. let’s see… can’t be DNS - tcpdump would find it (although cotse does have a 15 minute ttl according to dig). same for ARP and DHCP/BOOTP (both contain broadcast requests but replies in both cases are unicast)… something on the physical level? if a machine can’t ping another machine by IP on the same subnet, what’s happening on the switch? check for goblins :)

    Comment by jtatum — April 3, 2007 @ 4:57 pm

  2. Not bad, I’ll have to find a more difficult one next time. It was the switch. For some reason it crapped out when the NIC went dead, but if I unplugged the cable first before powering down, it worked fine. The strange thing is that it is only that server: I’ve powered down others without incident, and even tested one in that port to see if it was just that port.

    Comment by steve — April 3, 2007 @ 6:05 pm

  3. I work for a large software company and the department I am in has a very messy subnet. The core switches are expensive Cisco jobbers but the company for some reason still buys Linksys, Netgear, and D-Link stuff for hooking up small groups of machines. I’m not sure how many MAC addresses are floating around on this subnet (I should count someday), but we overflow nearly any MAC table I’ve seen on these little switches. A 32-port Netgear in the lab has a very unusual failure condition. When the table is overflowed on most switches, they turn into hubs. This one, when the table gets overflowed, just gets bizarre. You can’t plug anything new into it - the link light goes on but no data goes in or out of that port. It also seems to “remember” old ports even when the cable is pulled out. Bottom line - if you change anything around on it, you have to reboot the switch :) Fairly new switch btw, it’s gigabit. Just cheap - we really should know better (or make another VLAN, honestly!)

    Comment by jtatum — April 9, 2007 @ 4:38 pm
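
For anyone who has not watched a switch "turn into a hub", here is a rough Python sketch of the usual behavior described in comment 3. It is a toy model only and not how any particular Netgear or Cisco box works: the `LearningSwitch` class, its `table_size` parameter, and the short MAC strings are all made up for illustration, and it assumes no address aging and no eviction once the table is full.

```python
class LearningSwitch:
    """Toy model of a switch with a bounded MAC address table.

    Assumptions (hypothetical, for illustration): entries never age out,
    and once the table is full, new source addresses are simply not learned.
    """

    def __init__(self, num_ports, table_size):
        self.num_ports = num_ports
        self.table_size = table_size
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        # Learn (or refresh) the source address if it is already known
        # or if there is still room in the table.
        if src_mac in self.mac_table or len(self.mac_table) < self.table_size:
            self.mac_table[src_mac] = in_port

        # Known destination: forward out of exactly one port
        # (or drop the frame if it would go back out the way it came in).
        if dst_mac in self.mac_table:
            out_port = self.mac_table[dst_mac]
            return [] if out_port == in_port else [out_port]

        # Unknown destination: flood to every other port, like a hub.
        return [p for p in range(self.num_ports) if p != in_port]


if __name__ == "__main__":
    sw = LearningSwitch(num_ports=4, table_size=2)
    print(sw.receive(0, "aa", "bb"))  # "bb" not learned yet, so the frame floods
    print(sw.receive(1, "bb", "aa"))  # "aa" is known, so one port only
    print(sw.receive(2, "cc", "aa"))  # table is full, "cc" is never learned
    print(sw.receive(0, "aa", "cc"))  # so traffic to "cc" floods every time
```

With more MAC addresses on the subnet than `table_size`, every frame to an unlearned address goes out of all ports, which is the hub-like behavior. The stranger failure of the 32-port switch in the comment is not modeled here.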
