Reconnecting socket after internet disconnect

Issue #12 new

Daniel Sun created an issue 2017-06-13

I've noticed that there isn't an exposed method that allows me to set the socket that I am using (tcp in my case) to reconnect, if the internet has disconnected.

I see in the sendLogOverTCP: method, there is a check to see if the socket had be deallocated or not initialized, but when the phone drops the internet connection, the socket is just disconnected.

I don't know if this is by design, and there is a higher level method somewhere that I can call to reconnect the socket, or if there is a bug in a lower level manager that doesn't reconnect the socket once it detects the network connection to have comeback online.

For now, I've added another condition in the sendLogOverTcp: method to check if the socket is disconnected, so that I can call connectTcpSocket.

I don't exactly know what the implications of adding that would be, since the thread is locked in that call, but it hasn't broken anything for me.

If anyone can explain how they've dealt with a disconnected socket, I would be grateful.

Comments (16)

Daniel Sun reporter
- edited description
- 2017-06-13T20:46:09+00:00
Daniel Sun reporter
- edited description
- 2017-06-13T20:46:27+00:00
jlange88
I am having an issue with socket disconnect as well. I am seeing that when connection is dropped, (in this case Wifi on an iPad). Papertraillumberjack never reconnects the socket and remote logging essentially stops until the application is re-initialized. Is there a way to manually re-connect the TCP connection?
- 2017-11-21T18:57:04+00:00
George Malayil
Hi,

Apologies for the extremely late reply. PaperTrailLumberjack uses CocoaAsyncSocket to do the actual network tasks. At the moment it doesn't look like CocoaAsyncSocket automatically reconnects the socket (after the loss of network connection). However, based on this issue, I think it should be fine for you to call disconnect on the logger, followed by a call to connectTcpSocket (once, you determine somehow the network connection has been lost).
- 2017-11-21T19:23:45+00:00
jlange88
Sorry if this is a dumb question, where is the connectTcpSocket method? I'm not seeing that in CocoaAsyncSocket.
- 2017-11-21T19:46:24+00:00
George Malayil
Hi,

The methods should be called on your instance of RMPaperTrailLogger. However, they currently are not declared in RMPaperTrailLogger.h. So, you might have to resort to some kind of performSelector method for now. That said, I have not tested out any of this. Do let me know if you run into any issues with this. Thanks
- 2017-11-21T19:54:52+00:00
jlange88
Awesome! Thank you for the update, I believe I have a working solution. I am using Alamofire networking and it has a NetworkReachabilityManager. From there I was able to determine if connectivity is lost, (breaking the socket), and reconnect. I am using this line
```
paperTrailLogger?.perform(Selector(("connectTcpSocket")))
```
and the connection was restored after a disconnect.
- 2017-11-21T20:41:18+00:00
Watson Ξ
This last solution ^^ helped us workaround an issue where we were losing logs when the app backgrounded and re-foregrounded. Thanks!
- 2018-05-18T22:08:22+00:00
Yves-Eric Martin
We have also been facing a problem with logs stopping, and I believe it is this very issue.

Would it be possible for PaperTrailLumberjack to check the socket status by itself, and reestablish the connection if needed?

Because right now, I am not sure where and when I should apply the workaround proposed by jlange88, and I would have many questions. Is applicationDidBecomeActive the proper place, as suggested in https://stackoverflow.com/a/7969310) ? If I am not using a reachability manager, is it safe to call if the socket is already connected? Or can I introspect the connection status of the socket? Etc...

So, if it is technically difficult for PaperTrailLumberjack to handle the reconnect by itself, can you at least mention this issue in the README to save others the trouble, and add a code sample of the workaround so users can implement it properly right away?

Thank you.
- 2019-02-03T23:47:55+00:00
Yves-Eric Martin
I looked at the code, and unless I am missing something obvious (which is quite likely given my unfamiliarity with the code and my rusty Objective C), it looks like fixing this inside PaperTrailLumberjack should be quite easy.

sendLogOverTcp already assumes the responsibility of creating and connecting the socket. So it only makes sense for it to also take on the responsibility of reconnecting the socket should it be disconnected. Inside that function, we have:
```
@synchronized(self) {
    if (self.tcpSocket == nil) {
        GCDAsyncSocket *tcpSocket = [[GCDAsyncSocket alloc] initWithDelegate:self delegateQueue:dispatch_get_main_queue()];
        self.tcpSocket = tcpSocket;
        [self connectTcpSocket];
    }
}
```
which I changed to:
```
@synchronized(self) {
    if (self.tcpSocket == nil) {
        GCDAsyncSocket *tcpSocket = [[GCDAsyncSocket alloc] initWithDelegate:self delegateQueue:dispatch_get_main_queue()];
        self.tcpSocket = tcpSocket;
    }
    if ([self.tcpSocket isDisconnected]) {
        [self connectTcpSocket];
    }
}
```
Now connectTcpSocket is called not only when the socket is first created, but also on subsequent calls, if it is found in a disconnected state. The change is a oneliner, and I created the following PR, but take it with a large fistful of salt, given, again, my unfamiliarity with the code and my rusty Objective C:

https://bitbucket.org/rmonkey/papertraillumberjack/pull-requests/4/attempt-to-reconnect-a-disconnected-tcp/diff

Could this fix, or something similar, be included?
- 2019-02-04T13:58:22+00:00
George Malayil
Off the top of my head, the socket could get disconnected in the following scenarios
- macOS - Machine goes to sleep.
- iOS - App is backgrounded
- macOS/iOS - Network connection is lost
- Possibly more scenarios :)
Currently, we are leaving it to the clients to determine if they are being woken up/foregrounded or regained connectivity (and, then manually reconnect). The process isn't optimal as the connectTcpSocket isn't public. Plus, there exists the possibility of losing messages that might have been logged in the interim.

I think the proposed fix would work well in the first two scenarios (app backgrounding/machine sleeping). But, in the case of a bad network connection (either on client or remote end), each time a new log message is sent, the logger would try to reconnect the socket. In this case, I'm not sure if continued attempts to automatically reconnect would be appropriate (and, log messages would still be lost). This makes me a little hesistant to go down this route.

Offhand, I am wondering if it might be better to
1. Have the logger subscribe to app foregrounding/machine waking up events, and do the reconnect there. However, this is a bit more work, and we would need to handle macOS/iOS differently.
2. Leaving it to the client to determine reachability in the face of network issues. We could make the connectTcpSocket public (or have an alternative that would report errors as well) to make the call easier.
However, this would still not guarantee messages reaching PaperTrail - that would require some level of caching of failed send on either the papertrailogger or client.

Let me know your thoughts (and thank you for the PR).
- 2019-02-04T19:55:57+00:00
Yves-Eric Martin
Thank you for the explanations, and for sharing your concerns and hesitation. I have a much better understanding of the issue now.

The following is a bit long so let me break it down into 5 parts:

1/5: My candid take on the current situation

Currently, we are leaving it to the clients to run into the issue on their own, possibly after their app is already in production, in order to find out about it (*1). Then, it is up to them to implement, without guidance (*2), a hacky workaround (*3) that still may not work (*4).

"This is fine" comes to mind... ;-)
- (*1): My bad: I did not check all open issues until now, only the README. My bad... or should such a major issue really be mentioned in the README?
- (*2): I was going to put the hack in applicationDidBecomeActive, but with your new explanation, it looks like applicationWillEnterForeground is more appropriate.
- (*3): As you noted: connectTcpSocket is not public.
- (*4): Even with our best efforts, the client-side workaround can leave the socket disconnected and logs will stop. More on this below.
2/5: About the current workarounds

The main issue I see with foreground / wakeup / reachability monitoring is that these and the state of the socket can be unrelated:
- The socket may remain connected through a short network interruption.
- As your "Possibly more scenarios" implies: there are other cases, unrelated to network reachability, where the socket may disconnect: client side (change of IP address? resource constriction?), server-side (server restart?), or anywhere in between (other undetectable network issues).
So using network reachability to decide if the hack should be applied may still leave the socket in a disconnected state. This is not just unreliable, it is simply wrong and misleading: as other comments in this issue have shown, people believe they have fixed the issue, while really it has just been buried deeper, but it is still there, lurking in the shadows...

A more reliable way to implement the workaround in the client would be to check the state of the socket. Unfortunately, RMPaperTrailLogger's tcpSocket is not public either, so client's don't have access to its state (if there is a way to access it now, please let me know).

So the only robust client-side workaround I see now, to ensure the socket is connected, is to simply call connectTcpSocket every single time I log a message since it just fails silently if the socket is already connected. But this sounds much worse than the solution I proposed... However, it may not be: is there a risk in repeatedly calling connectTcpSocket when the socket is already connected?

3/5: On the other directions you are considering
1. Subscribing to foregrounding/wake-up events: like the current workarounds, it will deal only with a subset of disconnections, and logging can still stop entirely if the socket disconnects for another reason. Also, I agree with you: it is quite a lot of extra work and added complexity, and would require different handling for iOS and macOS: probably not worth the effort since it would only be an improvement, not a complete fix.
2. Leaving it to the client: again reachability is not enough, so we would need to make tcpSocket public so the client can check the state of the socket. In that case, to ensure my log messages get delivered to the best of what I can do, I would write a log method wrapper that checks the socket connection, and calls connectTcpSocket if it was disconnected, basically exactly what the proposed PR does internally, but in the client. The difference is that I could do a reachability check before calling connectTcpSocket, but I probably wouldn't bother, seeing that sendLogOverTcp already calls connectTcpSocket regardless of network reachability.
4/5: About the proposed solution (PR)

I hear your valid concerns about continued attempts to automatically reconnect, and messages still being lost, but putting these aside for now (we'll get back to them), I think this auto-reconnect solution is at least good enough, because:
1. The change is minimal: it is a oneliner.
2. It works out of the box, without requiring extra work (and maybe new dependencies) on the client side.
3. The change makes sense: since sendLogOverTcp already assumes the responsibility of connecting the socket, it makes a lot of sense for it to also handle reconnecting.
4. The change is compatible with the existing workarounds: it will not try again to connect the socket if an existing workaround already reconnected it.
5. It is more robust than the workarounds: the current workarounds only work for a subset of cases (after the app was backgrounded, or after a full network connectivity loss), but the proposed PR will work whatever the reason of the disconnect.
6. It is not perfect, but it is a huge improvement: yes, some logs messages may still be dropped, but not more than with the workarounds, and it is way better than for the logging to stop entirely.
5/5: Now to address your concerns
- Some log messages can still be dropped: in the long term, we could think of implementing caching, retries, error handling and reporting... But for now, a note in the README would do. Something along the lines of: Caveat: Even with TCP logging, there is no guarantee log messages will reach their destination. Network issues may cause the connection to drop and all logging to stop. However, since vx.x.x, PaperTrailLumberjack does its best to reopen the connection, and logging will automatically resume after the connection has been reestablished.
- Continued attempts to automatically reconnect: I really don't know about this one. Is it really a concern? sendLogOverTcp already calls connectTcpSocket regardless of network reachability, but I guess the repeated attempts may cause some issue? To alleviate this, I propose:
  1. Research if repeated connection attempts really are an issue, so you can make an informed decision (it may be perfectly fine after all). But I am afraid this is above my head... Maybe the CocoaAsyncSocket mailing list is a good place to ask?
  2. I can amend the PR to introduce an autoReconnect flag (or any name you prefer) and make the behavior opt-in. The PR wouldn't be a one-liner anymore, but still very simple.
  3. If repeated attempts really are an issue, we could do our own network reachability check inside PaperTrailLumberjack, and attempt to (re)connect the socket only if the logging host is reachable (yes, as a bonus, we could pinpoint the check to verify the reachability of the logging host, instead of a generic "network up/down"). Quite a lot of extra work though. So, like caching/buffering, maybe keep this for a later?
Please let me know if you would rather confirm whether or not there really is an issue with repeated connection attempts, or if I should just go ahead and amend the PR to make the behavior opt-in. I will also add the caveat note in the README about the risk of losing log messages.

Thank you.
- 2019-02-05T13:00:18+00:00
George Malayil
Quick thoughts on this
1. Repeatedly calling connectTcpSocket on a connected socket could be detrimental. Essentially, we might be getting rid of the connected socket and re-establishing it? In my mind, with this approach we'd lose much of the benefits of the streaming nature of TCP.
2. With regards to automatic reconnect, I'm not sure if it would be good citizen behavior. Under bad network conditions, it might end up taking away network resources from the main app.
I think a less invasive solution for now might be to
1. Expose connectTcpSocket to the public
2. Expose state of the tcpSocket to the public (esp isDisconnected and isConnected).
3. On disconnection of the socket, inform the user via a delegate method.
This would also leave open the possibility for client authors to wrap the DDLog macros, inside their own log functions, that could reconnect (automatically) the socket as they deem fit.
- 2019-02-05T15:15:06+00:00
Yves-Eric Martin
A bit of background for why this is so important to me:

For regular consumer apps, the user frequently switches apps or puts the device to sleep, so reconnecting the socket on wake/foregrounding mostly works. But my app is an unattended, remotely managed, single app mode (kiosk) application. The Home button is disabled, the app never goes into the background and is never restarted. So when logs stop, they stop for good. Since applicationWillEnterForeground is never called, and connection may drop regardless of network reachability, my only option for now is to call connectTcpSocket for every log message. As you noted, this is a bad idea, but it is all I can do for now (there would also be switching to UDP, but unfortunately UDP does not support encryption).

With regards to automatic reconnect, I'm not sure if it would be good citizen behavior. Under bad network conditions, it might end up taking away network resources from the main app.

which is why I also propose to make this behavior opt-in.

While pursuing the solution that gives more control to the client is also a good idea, it is not mutually exclusive with having an internal reconnection behavior, and can be done separately. And I am still convinced that PaperTrailLumberjack really should take on the responsibility of having at least some form of internal reconnection handling, since it creates and connects the socket internally already.

Please see this as a starting point: the internal reconnection behavior can be further improved (exponential backoff...), maybe to a point where you will be confident enough to make it the default behavior.

For now, I have amended the PR, making the behavior opt-in, and adding a caveat section to the README, clearly stating that the auto-reconnect behavior is experimental. Please take a look and let me know if we can try and work in this direction (again, independently from the other approach you mention):

https://bitbucket.org/rmonkey/papertraillumberjack/pull-requests/4/wip-attempt-to-reconnect-a-disconnected/diff
- 2019-02-06T02:43:45+00:00
George Malayil
Hi,

The behavior as requested, might have other implications regarding apple guidelines on background proceses and the tasks that you can execute in them.
- Item 2.5.4 here
- https://forums.developer.apple.com/thread/97824
As I understand it, only a subset of actions are permitted while in background mode. I don't think establishing a TCP socket for logging is one of them - and, it might present an issue, if the call turns into a long blocking one under bad network conditions.

I think the safer approach is to let the client users decide on when they would like to re-establish the socket connection. To aid in this, we could
- expose the state of the socket
- expose a public method to connect the socket
- possibly let the client user know when the socket has been disconnected, and, let them decide as to when they want to reconnect.
- 2019-02-11T19:28:32+00:00
Yves-Eric Martin
Thank you, very good point, we should warn the users about that in the README. That is easily dealt with though: the auto-reconnect behavior is not only opt-in, it is also dynamic. So the recommended implementation should be:
```
  func applicationWillEnterForeground(_ application: UIApplication) {
    paperTrailLogger.autoReconnect = true
  }

  func applicationDidEnterBackground(_ application: UIApplication) {
    paperTrailLogger.autoReconnect = false
  }
```
About the "safer approach": this is really a stretch for my rusty and limited Objective C, but if it can help you accept this PR, I think I may be able to include exposing the socket state and the reconnect method. Letting the client know when the socket has been disconnected, however, is way above my head.

But I really want to keep the internal auto-reconnect behavior too, with all the disclaimers and warnings that you deem necessary: with it, it is only 2 lines of code in the client to fix this socket issue (and only 5 lines library-side).

Without it, for comparison, I wrote what it would take to wrap the DDLog macros as you suggested, and implement the auto-reconnect behavior client-side, using the exposed socket state and reconnect methods. My first naive approach was:
```
internal func myLogVerbose(_ message) {
  if (paperTrailLogger.isDisconnected) { paperTrailLogger.reconnect() }   // <-------- The auto-reconnect is here
  DDLogVerbose(message)  
}
```
But this does not work: all logs now report the same file and line number: the one of the wrapper! Maybe there is a better way, but to make this wrapping work, here is what I had to do:
```
internal func myLogVerbose(
  _ message: @autoclosure () -> Any = "",
  level: DDLogLevel = CocoaLumberjackSwift.defaultDebugLevel,
  context: Int = 0,
  file: StaticString = #file,
  function: StaticString = #function,
  line: UInt = #line,
  tag: Any? = nil,
  asynchronous async: Bool = true,
  ddlog: DDLog = DDLog.sharedInstance
  ) {
  if (paperTrailLogger.isDisconnected) { paperTrailLogger.reconnect() }   // <-------- The auto-reconnect is here
  CocoaLumberjackSwift.DDLogVerbose(
    "\(message())",
    level: level,
    context: context,
    file: file,
    function: function,
    line: line,
    tag: tag,
    asynchronous: async,
    ddlog: ddlog
  )
}
```
and that is only for DDLogVerbose. I'll spare you the copy-paste here, but I would need 5 copies of this code, to cover also DDLogDebug, DDLogInfo, DDLogWarn and DDLogError.

That's a whopping 120 lines of code... It goes without saying, for the same behavior, I much prefer the internal reconnect and adding only 2 lines of code client-side.
- 2019-02-12T10:29:24+00:00
Log in to comment

Assignee: George Malayil

Type: task

Priority: major

Status: new

Votes: 0

Watchers: 2

Comments (16)

1/5: My candid take on the current situation

2/5: About the current workarounds

3/5: On the other directions you are considering

4/5: About the proposed solution (PR)

5/5: Now to address your concerns