Flatcam beta won't re-open after a crash

Issue #383 resolved
Matthew Goulart created an issue

Flatcam version: r2350.e2e8bde5-1

OS: Manjaro (linux)

When attempting to change theme settings, flatcam crashed. Attempting to re-open flatcam results in this error:

[INFO][MainThread] FlatCAM Starting...
[DEBUG][MainThread] Application path is /opt/flatcam
[DEBUG][MainThread] Started in /home/matthew
[DEBUG][MainThread] FlatCAM defaults loaded from: current_defaults
Traceback (most recent call last):
File "/opt/flatcam/FlatCAMApp.py", line 12439, in my_loop
listener = Listener(*address)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 448, in __init__
self._listener = SocketListener(address, family, backlog)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 591, in __init__
self._socket.bind(address)
OSError: [Errno 98] Address already in use

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/opt/flatcam/FlatCAMApp.py", line 12463, in run
self.my_loop(self.address)
File "/opt/flatcam/FlatCAMApp.py", line 12444, in my_loop
conn = Client(*address)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 502, in Client
c = SocketClient(address)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 629, in SocketClient
s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused
/usr/bin/flatcam: line 3:  2124 Aborted                 (core dumped) python /opt/flatcam/FlatCAM.py

I believe it is attempting to re-use an existing socket. In my experience, using SO_REUSEPORT would fix this but I don’t know if this applies to AF_UNIX sockets…

The only way to fix this is to reboot my machine.

Comments (16)

  1. Marius Stanciu

    Hi,

    This issue was fixed on 19.Jan in my working copy. I am on the final steps before I will make a push with the latest changes. Expect the fix to be available in a few days.

  2. oltupnasn

    Hi, I have the same error. It happens not only after a crash but also after a normal close.

  3. Андрей Б.

    Hello.

    This problem occurs when the IPC file is not cleaned up after a crash. On the next run this file cannot be recreated. To make a new run possible without a reboot, delete the file with rm /tmp/testipc and then run FlatCAM.

    Sorry for my English. I am writing this so that you can avoid a reboot after a crash and save time.
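
A minimal sketch of the failure described in this comment, assuming a Unix-like system: closing an AF_UNIX socket (which is also what the OS does when a process crashes) does not delete its file, so the next bind to the same path fails with "Address already in use". The helper names and the scratch path are made up for the illustration; FlatCAM itself uses /tmp/testipc.

```python
import errno
import os
import socket
import tempfile


def bind_unix(path):
    """Bind a new AF_UNIX stream socket to `path` and return it."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
    except OSError:
        sock.close()
        raise
    return sock


def second_bind_errno(path):
    """Bind, close without unlinking (as a crash would), then bind again.

    Returns the errno of the second bind, or None if it succeeded.
    """
    bind_unix(path).close()  # close() does not remove the socket file
    try:
        bind_unix(path).close()
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    # Hypothetical path in a scratch directory, standing in for /tmp/testipc.
    demo_path = os.path.join(tempfile.mkdtemp(), "demo_ipc")
    print(second_bind_errno(demo_path) == errno.EADDRINUSE)
    os.unlink(demo_path)          # the manual `rm` step from this comment
    bind_unix(demo_path).close()  # binding works again once the file is gone
```

The `os.unlink` call plays the role of the manual rm: once the leftover file is gone, the path can be bound again.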

  4. Marius Stanciu

    Hi,

    Ok. Here is why I use and open this socket.
    When FlatCAM beta is first run, it opens this socket. If another instance is launched while the first one is still running, the second instance detects that the socket is open, knows that another FlatCAM instance is running, passes the arguments it received to the first instance, and then closes itself.

    The reason for this behavior is that if FlatCAM is running and another instance is started with, let’s say, a Gerber filename as its argument, the Gerber file is opened in the first instance of FlatCAM, not in the second one.
    This way there is never more than one instance of FlatCAM running (there is a reason for that: crashes due to locked resources and so on).

    Normally, if FlatCAM is closed successfully, the socket is closed too, so the next start of FlatCAM works OK. Of course, in case of a crash this goes out of the window 🙂 since there is no normal close and therefore the socket survives.

    Now, what I may try is this: even if the app crashed and the socket is still open, when the application is run without arguments it will first attempt to send a ‘close’ signal to the supposedly open instance of FlatCAM, close the socket, and then run as usual. Of course this means that if the user tries to start the app with an argument after a previous crash, we are still in a no-go scenario.

    That’s the best I can think of. If any of you have other suggestions on how to handle this, please share them. Otherwise, if what I have presented above is not acceptable, the best I can do is to disable this completely for Linux.
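
The single-instance mechanism described in this comment can be sketched roughly as follows. This is not FlatCAM's actual code: it only reuses the `Listener` and `Client` classes from `multiprocessing.connection` that appear in the traceback, and the function names and the scratch socket path are assumptions made for the example.

```python
import os
import tempfile
from multiprocessing.connection import Client, Listener


def start_or_forward(address, args):
    """Become the first instance, or hand `args` to the one already running.

    Returns a Listener if this process is the first instance, otherwise
    sends `args` to the existing instance and returns None.
    """
    try:
        return Listener(address)
    except OSError:
        # The address is taken: assume another instance is listening on it.
        with Client(address) as conn:
            conn.send(args)
        return None


def receive_forwarded(listener):
    """Accept one connection and return the arguments it carries."""
    with listener.accept() as conn:
        return conn.recv()


if __name__ == "__main__":
    # Hypothetical socket path, standing in for /tmp/testipc.
    address = os.path.join(tempfile.mkdtemp(), "demo_ipc")
    first = start_or_forward(address, [])
    second = start_or_forward(address, ["board.gbr"])  # forwards, returns None
    print(second, receive_forwarded(first))
    first.close()  # a clean close also removes the socket file
```

After a crash, the `Client(address)` call is the step that fails: nothing is listening behind the leftover file, so the connection is refused, which matches the second traceback in the report.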

  5. Matthew Goulart reporter

    I might be missing some information, but couldn’t you just use a lockfile? Say, when flatcam opens, it creates a lockfile and writes its PID into it. If you try to start another instance of flatcam, it first tries to read the lockfile. If there is no lockfile, it opens normally; if there is one, it reads the PID of the existing instance. If the PID is valid, it sends the arguments to the existing instance of flatcam; if not, it starts a new instance with the given arguments.
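
A rough sketch of the PID check in this lockfile idea, for POSIX systems only. On Windows `os.kill` behaves differently and must not be used as an existence check. The function names are hypothetical, and the sketch ignores two real caveats: a PID can be reused by an unrelated process, and two instances can race between the check and the write.

```python
import os


def pid_is_alive(pid):
    """POSIX only: signal 0 checks for the process without signalling it."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process: the lockfile is stale
    except PermissionError:
        return True   # the process exists but belongs to another user
    return True


def running_instance_pid(lockfile):
    """Return the PID stored in `lockfile` if that process is alive, else None."""
    try:
        with open(lockfile) as fh:
            pid = int(fh.read().strip())
    except (FileNotFoundError, ValueError):
        return None
    return pid if pid_is_alive(pid) else None


def claim_lockfile(lockfile):
    """Record this process as the running instance."""
    with open(lockfile, "w") as fh:
        fh.write(str(os.getpid()))
```

A new instance would call `running_instance_pid` first. If it returns a PID, the arguments go to that instance; if it returns None, the file is missing or left over from a crash and the new instance calls `claim_lockfile`.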

    I have a good deal of experience with sockets and I think you might find it’s a big headache trying to get them to behave consistently between win/linux, especially when it comes to things like SO_REUSEADDR and SO_REUSEPORT, which I suspect you’ll end up having to play with to get it to work.

  6. Marius Stanciu

    Hi Matthew,

    You are right, but let me tell you how things started. It started in Windows (in which I do the programming and which I use as my main OS), where it works very well as it is, as a way to detect double-clicking on a file with an extension recognized by FlatCAM, which leads to opening FlatCAM if there is no instance, or opening the file in the running FlatCAM if there is one.

    I wanted to extend this feature to Linux (for the case presented in my post above) but it seems that Linux does not do what I want, which is:

    • a way to detect that another instance of the app is running without doing much maintenance

    The accent is on: without maintenance.
    In Windows even if the program crashes, I don’t have the problem that we have in Linux.

    Using a lockfile is actually a good idea, but it does not solve the problem in case of a crash because the file will still be left there. And that would mean extra programming just to decide whether the current file is junk (a leftover from a crash) or a valid one from a running instance of the app.
    And while I am willing to do a bit of ‘if the OS is …’ programming, I won’t start to make special methods just for Linux.
    Which means that in the end, if I can’t come up with a solution, I will disable this feature for Linux (that was actually my initial view of the problem: initially it was enabled only for Windows, and right now, for example, it is not running on MacOS) and therefore the problem will be solved.
    Right now I am running FlatCAM in Linux (virtual machine, XUbuntu 19.10) and I do not see the issue. I guess that only a crash will yield this result, since I’ve added to the ‘close’ method a piece of code that actually closes the Listener, which deletes that file.

    I know that it’s not ideal, but since there are crashes I lean toward disabling it. Yet, I will still search for a way out of this, for a while.

    Thanks for your thoughts!

    -Marius

  7. Matthew Goulart reporter

    In that case, try setting SO_REUSEPORT. It will allow you to bind to a socket that is already open.

  8. Marius Stanciu

    But that’s the whole idea: if there is a socket open, close the new FlatCAM instance after passing any parameters to the old instance of FlatCAM. There is a QThread running in the background waiting for such an event.
    So it’s not about reusing the socket, it’s about detecting its presence and using it as a conduit to send data between two FlatCAM instances that may exist at one time (not for long). Sorry if I did not go into much detail, since this is the behavior that it has in Windows.
    I just thought that it might be useful in Linux too, in a more limited way, and that I could apply it with a few lines of code, even if the whole mechanism is not used in Linux because here we don’t get to double-click a file and open it in FlatCAM. Yet it seems that it’s more trouble than benefit for Linux.

  9. Marius Stanciu

    Ok guys, I think I may have solved it and it was easy.
    All I had to do is to handle the Exception ConnectionRefusedError:

    except ConnectionRefusedError:
        if sys.platform == 'win32':
            pass
        else:
            os.system('rm /tmp/testipc')
            self.listener = Listener(*address)
            while True:
                conn = self.listener.accept()
                self.serve(conn)
    

    I remove the mentioned file, like Andrei suggested, and then start the socket again.
    I think it should work. I will make the push in a moment. Could any of you try the latest version on the repo and see if it’s solved?
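
For comparison, here is a self-contained sketch of the same recovery idea, not the code that was pushed. It assumes an AF_UNIX socket path on a Unix-like system, uses `os.remove` in place of shelling out to rm, and deletes the file only after a probe connection has been refused. The function name is hypothetical.

```python
import errno
import os
import socket
import tempfile
from multiprocessing.connection import Client, Listener


def listen_or_recover(address):
    """Return a Listener, or None if a live instance already owns `address`.

    A leftover socket file from a crashed instance is detected by a refused
    probe connection and removed before listening again.
    """
    try:
        return Listener(address)
    except OSError as exc:
        if exc.errno != errno.EADDRINUSE:
            raise
    try:
        # Probe: a live instance accepts this connection. It carries no data,
        # so the live side would see EOF on its next recv().
        Client(address).close()
        return None
    except ConnectionRefusedError:
        os.remove(address)  # stale file: nothing is listening behind it
        return Listener(address)


if __name__ == "__main__":
    # Hypothetical path; simulate a crash by closing without unlinking.
    address = os.path.join(tempfile.mkdtemp(), "demo_ipc")
    crashed = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    crashed.bind(address)
    crashed.close()
    listener = listen_or_recover(address)  # removes the stale file, listens
    print(listener is not None, listen_or_recover(address) is None)
    listener.close()
```

Removing the file only after a refused connection keeps a second instance from deleting the socket of a FlatCAM that is still running.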

  10. Matthew Goulart reporter

    Working on it. Sorry it’s taking so long. I only got around to it this morning and installing the deps takes over an hour. python-or-tools is a meg beast…

  11. Matthew Goulart reporter

    Ok so I got it working nicely, looks like you also fixed the dark mode option! This is looking fantastic! Great work.
