PC's Suspend-Resume cycle induces session breakup

Issue #7 closed
Markus KARG created an issue

We noticed problems with open chat sessions when one of the chat partners suspends their PC and resumes it a long time later. While other chat programs like Skype or Pidgin recover from this within seconds, Babbler fails more often than not.

Sometimes it just works. Often the session still works but is really, really slow. And sometimes the session does not work at all: you can send messages, but they never arrive on the other side of the network.

I assume that this is actually a problem of the JVM being unable to detect the operating system's suspend and resume events, so it cannot recover the TCP session correctly, or something like that.

It would be good if this could be checked to decide how to proceed: if it is a Babbler problem, we should fix it. If it is a JVM problem, we should report it upstream, i.e. to Oracle as the ultimate upstream of nearly all JREs currently on the market.

Comments (6)

  1. Christian Schudt repo owner

    I assume you mean XMPP sessions in general, i.e. the connection to the server?

    Broken TCP connections can take a while until the operating system discovers and reports them. I've read that it can take as long as 2 hours (OS dependent) until Java's socket sees a dead connection. Java eventually detects a broken connection when reading from or writing to the socket, but in my experience this can also take a while, because the socket doesn't immediately throw an exception. That's why messages can be lost when they are written to the socket even though the OS has already closed the connection.

    Do you mean the reconnection logic does not work properly? What exactly is slow? Sending messages over the wire after reconnection?

    Are you on Windows? You could add a SessionStatusListener to detect/track disconnects.

  2. Markus KARG reporter

    I think this is related to the trouble I currently have with the hanging reconnection. Let's keep this issue open until I can tell you the exact point in the software where it hangs.

  3. Markus KARG reporter

    Sure, close it. The effect happened frequently earlier this year, and we haven't noticed it for months. So while I cannot say whether exactly #19 solved it, at least something you did made it go away. :-)

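The first comment notes that a Java socket may not report a dead connection for a long time. One common way to shorten that window is to send a small probe and wait for a reply with a bounded read timeout. The following is only a minimal sketch of that idea using plain `java.net` sockets; it is not Babbler's reconnection logic. The class `LivenessProbe`, its methods, the single-byte probe and the loopback test peer are all hypothetical names and choices made for illustration. A real XMPP client would more likely use a protocol-level ping (for example XEP-0199) than a raw byte.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class LivenessProbe {

    /**
     * Sends one probe byte and waits up to timeoutMillis for any reply byte.
     * Returns true if the peer answered; false if it stayed silent,
     * closed the stream, or the I/O failed.
     */
    public static boolean peerAnswers(Socket socket, int timeoutMillis) {
        try {
            socket.setSoTimeout(timeoutMillis); // bound the blocking read below
            OutputStream out = socket.getOutputStream();
            out.write('p'); // hypothetical probe byte, not an XMPP stanza
            out.flush();
            InputStream in = socket.getInputStream();
            return in.read() != -1; // -1 means the peer closed the stream
        } catch (SocketTimeoutException e) {
            return false; // no reply in time: treat the connection as dead
        } catch (IOException e) {
            return false; // the OS has already reported the failure
        }
    }

    /** Starts a hypothetical loopback peer that either echoes one byte or stays silent. */
    public static ServerSocket startPeer(boolean echo) throws IOException {
        ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
        Thread peer = new Thread(() -> {
            try (Socket s = server.accept()) {
                int b = s.getInputStream().read();
                if (echo && b != -1) {
                    s.getOutputStream().write(b);
                    s.getOutputStream().flush();
                }
                Thread.sleep(1000); // keep the connection open so silence is observed, not a close
            } catch (IOException | InterruptedException ignored) {
                // the sketch does not need to handle peer-side errors
            }
        });
        peer.setDaemon(true);
        peer.start();
        return server;
    }

    /** Connects to a fresh loopback peer and probes it with a 300 ms timeout. */
    public static boolean probe(boolean echo) throws IOException {
        try (ServerSocket server = startPeer(echo);
             Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(InetAddress.getLoopbackAddress(),
                    server.getLocalPort()), 1000);
            return peerAnswers(socket, 300);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("echoing peer alive: " + probe(true));
        System.out.println("silent peer alive: " + probe(false));
    }
}
```

With a check like this, the application decides after a few hundred milliseconds that the peer is unresponsive instead of waiting for the OS to report the broken connection. The timeout value is a trade-off: too short and a slow network triggers needless reconnects, too long and messages keep being written into a dead socket.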