reconnectionStrategy doesn't work with WebSocket
Using
XmppSessionConfiguration xmppSessionConfiguration =
    XmppSessionConfiguration.builder()
        .reconnectionStrategy(ReconnectionStrategy
            .truncatedBinaryExponentialBackoffStrategy(60, 5))
        .build();
works well with a TCP connection but not with a WebSocket one. All I get when shutting down the XMPP server (after connecting and logging in) is:
juil. 26, 2016 4:21:03 PM org.glassfish.tyrus.container.jdk.client.ClientFilter processError
GRAVE: Connection error has occurred
java.io.IOException: Le nom réseau spécifié n’est plus disponible.
at sun.nio.ch.Iocp.translateErrorToIOException(Iocp.java:309)
at sun.nio.ch.Iocp.access$700(Iocp.java:46)
at sun.nio.ch.Iocp$EventHandlerTask.run(Iocp.java:399)
at java.lang.Thread.run(Thread.java:745)
Whereas with TCP:
juil. 26, 2016 6:11:08 PM rocks.xmpp.core.session.ReconnectionManager scheduleReconnection
PRÉCIS: Disconnect detected. Next reconnection attempt in 5 seconds.
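For readers unfamiliar with the strategy named above, here is a minimal sketch of a generic truncated binary exponential backoff. It is not the library's implementation: the class and method names are hypothetical, and it assumes the two arguments (60, 5) are a slot time in seconds and a ceiling on the exponent. Under that assumption the first delay falls between 0 and 60 seconds, which is consistent with the "5 seconds" in the TCP log.

```java
import java.util.Random;

// Hypothetical helper, not part of rocks.xmpp: a generic truncated
// binary exponential backoff, sketched under the assumptions stated above.
final class BackoffSketch {

    /** Exclusive upper bound, in seconds, of the delay before the given 1-based attempt. */
    static long upperBoundSeconds(int attempt, int slotTimeSeconds, int ceiling) {
        if (attempt < 1) {
            throw new IllegalArgumentException("attempt must be >= 1");
        }
        // The exponent stops growing once the ceiling is reached ("truncated").
        int exponent = Math.min(attempt, ceiling);
        long slots = (1L << exponent) - 1; // 1, 3, 7, 15, 31, ...
        return slots * slotTimeSeconds;
    }

    /** Picks a random delay in [0, upperBoundSeconds) for the given attempt. */
    static long nextDelaySeconds(int attempt, int slotTimeSeconds, int ceiling, Random random) {
        long bound = upperBoundSeconds(attempt, slotTimeSeconds, ceiling);
        return (long) (random.nextDouble() * bound);
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (int attempt = 1; attempt <= 7; attempt++) {
            System.out.println("attempt " + attempt
                    + ": delay " + nextDelaySeconds(attempt, 60, 5, random)
                    + " s (bound " + upperBoundSeconds(attempt, 60, 5) + " s)");
        }
    }
}
```

With a slot time of 60 and a ceiling of 5, the bounds in this sketch are 60, 180, 420, 900 and 1860 seconds, and they stay at 1860 seconds for every later attempt.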
Comments (20)
-
repo owner -
reporter I can reproduce it every time: Windows 7 and JDK 8u91 (32-bit). I will try with the latest JDK (102). I'll also test on a target server (Suse, JDK 8u72, 32-bit). If it works well on that one, that's fine. I don't know how to debug this further. The only traces I get are those I included in the issue description...
-
repo owner I've tested on another machine (Windows 10, JDK 8u92) and it works well, too. Connected to a remote machine, then shut down and restarted the server (Openfire) => reconnection started normally and successfully.
Unfortunately, I cannot tell what is causing your error.
-
reporter I am testing with a server that is running on my own machine. Maybe that makes a difference?
-
repo owner I've also tested with localhost and then with a remote machine. The result is the same for me.
-
reporter I'm cursed then :-). I suggest closing this issue...
-
repo owner Although I still couldn't reproduce it, I've browsed through the code and hopefully fixed it with this change: 6bfce85 (The onError method never got called in my tests).
Are you able to compile and test this?
-
reporter I'm trying right now. It may take some time as I'm not used to working with Git... I'll let you know.
-
reporter Just for your information, I've got one test failure:
Results :
Failed tests: unmarshalThumbnail(rocks.xmpp.extensions.jingle.apps.filetransfer.JingleFileTransferTest): expected object to not be null
Tests run: 524, Failures: 1, Errors: 0, Skipped: 0
-
reporter It does not work any better :-(.
Something is puzzling me anyway. In WebSocketConnection (line 269) you create a "new EndPoint" without overriding the "onClose" method. And, as far as I understand, it is this "onClose" that is called on the failure I get. Does that make sense?
-
reporter Just added an "onClose":
@Override
public void onClose(Session session, CloseReason closeReason) {
    System.out.println("Connection closed: " + closeReason);
}
and... it is called (Windows test)!
Connection closed: CloseReason[1006,Closed abnormally.]
It looks like handling onClose properly would fix my issue?
-
reporter But I still can't understand why I have this issue and you don't. Maybe it's related to how the server closes (or doesn't close) the connection during the shutdown?
-
reporter Last update. It looks like I was not waiting long enough:
IN : <iq xmlns="jabber:client" to="activation@lh6kl662/res" id="86b2343a-26f3-4114-bc80-94aaf2e55c9f" type="result"><query xmlns="jabber:iq:roster" ver=""/></iq>
Connection closed: CloseReason[1006,Closed abnormally.]
juil. 28, 2016 8:04:24 PM org.glassfish.tyrus.container.jdk.client.ClientFilter processError
GRAVE: Connection error has occurred
java.io.IOException: Le nom réseau spécifié n’est plus disponible.
at sun.nio.ch.Iocp.translateErrorToIOException(Iocp.java:309)
at sun.nio.ch.Iocp.access$700(Iocp.java:46)
at sun.nio.ch.Iocp$EventHandlerTask.run(Iocp.java:399)
at java.lang.Thread.run(Thread.java:745)
Connection OK/NOK: rocks.xmpp.core.session.ConnectionEvent[type=DISCONNECTED, nextReconnectionAttempt=PT0S]
juil. 28, 2016 8:19:12 PM rocks.xmpp.core.session.ReconnectionManager scheduleReconnection
PRÉCIS: Disconnect detected. Next reconnection attempt in 55 seconds.
Connection OK/NOK: rocks.xmpp.core.session.ConnectionEvent[type=RECONNECTION_PENDING, nextReconnectionAttempt=PT54.994S]
So reconnection attempts do start, but only about 15 minutes after the failure. Not what I expected...
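The "about 15 minutes" figure can be checked against the two timestamps in the log: the connection error logged at 8:04:24 PM and the reconnection scheduled at 8:19:12 PM. A small sketch of that arithmetic follows; the class and method names are hypothetical and used only for illustration.

```java
import java.time.Duration;
import java.time.LocalTime;

// Hypothetical helper: measures the gap between the logged connection
// error and the moment the reconnection was scheduled.
final class DetectionGapSketch {

    /** Parses two ISO local times (HH:mm:ss) and returns the time elapsed between them. */
    static Duration gap(String errorLoggedAt, String reconnectionScheduledAt) {
        LocalTime start = LocalTime.parse(errorLoggedAt);
        LocalTime end = LocalTime.parse(reconnectionScheduledAt);
        return Duration.between(start, end);
    }

    public static void main(String[] args) {
        // 8:04:24 PM and 8:19:12 PM from the log, written in 24-hour form.
        System.out.println(gap("20:04:24", "20:19:12")); // prints PT14M48S
    }
}
```

The gap is 14 minutes and 48 seconds, so the disconnect went unnoticed by the session for just under a quarter of an hour.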
-
repo owner I tested with Openfire. It sends:
<stream:error xmlns:stream="http://etherx.jabber.org/streams"><system-shutdown xmlns="urn:ietf:params:xml:ns:xmpp-streams"/></stream:error>
Then onClose is called with NORMAL_CLOSURE code. The stream error causes an exception and eventually a reconnection.
It seems that Tigase does not send the <system-shutdown/> error (on the XMPP layer), but kills the connection the hard way on the WebSocket layer with a "Closed abnormally" error. (If so, it's also a bug in Tigase.) I will try to deal with it.
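The distinction between the two servers comes down to the WebSocket close code: 1000 is a normal closure and 1006 means the connection was lost without a Close frame (RFC 6455). The sketch below shows one way a client could decide whether a close event should be reported to the session as an error. It is only an assumption about a reasonable policy, not the code of the actual fix, and the class and method names are hypothetical.

```java
// Hypothetical sketch, not the library's code: decides whether a WebSocket
// close event should be surfaced to the XMPP session as a connection error.
final class CloseHandlingSketch {

    // Close codes defined by RFC 6455.
    static final int NORMAL_CLOSURE = 1000;
    static final int CLOSED_ABNORMALLY = 1006;

    /**
     * Returns true if the close should be reported as an error.
     * Assumed policy: a close the client initiated itself is never an error,
     * and a normal closure is left to the XMPP layer (for example a stream
     * error sent before the close). Anything else is reported.
     */
    static boolean shouldReportAsError(int closeCode, boolean closedLocally) {
        if (closedLocally) {
            return false;
        }
        return closeCode != NORMAL_CLOSURE;
    }

    public static void main(String[] args) {
        System.out.println(shouldReportAsError(CLOSED_ABNORMALLY, false)); // true
        System.out.println(shouldReportAsError(NORMAL_CLOSURE, false));    // false
    }
}
```

In the Openfire case the stream error already triggers the reconnection, so the normal closure needs no extra handling here. In the Tigase case only the abnormal close is seen, so it has to be reported from the close callback.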
-
repo owner I don't have the time to set up a Tigase server with WebSocket and test with it, but I've implemented the onClose method based on your comments: 6d96d5a
I am happy to get feedback from you.
Btw.: the 15 minutes could be caused by the attempt to write the next XMPP Ping to the (dead) connection, which causes an exception, which in turn causes the reconnection.
-
reporter Tested both on Windows and Suse, your fix works perfectly! Thanks.
-
repo owner Great!
With regard to the failed test: I've never had that, and the continuous integration on this site (drone.io) runs the test successfully as well.
I've reviewed the code and saw that the namespace was wrong. It is still weird that my JDK/JAXB never complained about it. Fixed it with 102f679.
-
reporter Fine!
-
reporter - changed status to closed
-
repo owner Fixed with version 0.7.1
I've tried to reproduce it, but I couldn't (JDK 8u60). Reconnection worked as expected after server shutdown and restart with WebSocket.
I've also seen this: https://java.net/jira/browse/TYRUS-399, which describes this problem. Which operating system do you use? Are you able to debug this and/or maybe get more of the stack trace?