Telescope mount disconnected

Issue #688 on hold
John L created an issue

Hi, last night I downloaded nightly #016, and ran it overnight. In the morning, I found in the log files that during the night the telescope mount was “not connected” and was not talking to NINA (or ASCOM) around 23:23. However, the mount and NINA were both still running separately and doing their own thing, as if nothing happened. This continued until I discovered it in the morning. Just so I’m clear: after the disconnect, NINA was sending commands to ASCOM, but ASCOM was not communicating with CPWI. Simultaneously, the mount continued to follow the same target it was commanded to (i.e., the last NINA slew command), and it doesn’t appear the mount/telescope changed from tracking that same target for the rest of the night.

Additionally, and perhaps relevant to the main problem above, I had PHD2 running as well (although I was not planning to use guiding… I should have shut guiding off, but didn’t), and PHD2 (about an hour after the “disconnect”) started sending guide commands to the mount, and the mount was receiving and executing them! Note that I did not program NINA to turn on the guider in the sequence, but PHD2 was running and I did not set it manually to guide. Lastly, FYI, in the ASCOM logs, it appears that NINA still tried to do an autofocus around 00:23 in the morning, but CPWI log doesn’t show a record that it executed the NINA commands.

Currently, I’m not sure how this all happened, and it very well could be the telescope, CPWI, PHD2 or ASCOM. Though I can’t pinpoint the cause, there is a possibility that NINA had a problem; because of that I thought I would bring this to your attention.

As background, I’ve been running with my Celestron CPC1100 telescope/mount using CPWI, consistently using the CPWI ASCOM driver, and have been running it this way with NINA for several months; to date I haven’t experienced this particular problem. I’ve never seen the mount do anything like this, except when there’s been a power glitch or a USB hiccup. In those cases, CPWI will typically warn there is a disconnect, and will prompt the user to reconnect to the base (i.e., the mount will just stop, and it won’t continue to track the same target).

FYI, here is the header from last night’s NINA log file:

----------------NINA - Nighttime Imaging 'N' Astronomy----------------
-------------------Running NINA Version 1.11.0.1016-------------------
-------------------------2020-11-07T20:29:48--------------------------
----------------------ASCOM Platform Version 6.5----------------------
---------------------.NET Version 4.0.30319.42000---------------------
---------------------Oparating System Information---------------------
---------------------------Is 64bit OS True---------------------------
------------------------Is 64bit Process True-------------------------
---------------------------Platform Win32NT---------------------------
--------------Version Microsoft Windows NT 10.0.19041.0---------------
---------------------------Major 10 Minor 0---------------------------
----------------------------Service Pack -----------------------------

I really don’t want to include the entire set of log files since they contain my GPS info, so instead I’ve attached a separate file with excerpts from the related log files; hopefully this is enough for you to work with. If not, let me know and I will try to send you more when I have the chance.

I’m planning to revert to an earlier version of NINA for now. The forecast is for two good nights in a row, and I don’t want to miss out on clear sky.

Again, I’m not expecting you to troubleshoot for me, but if you think my problem might be due to a NINA issue or an error caused by my own faulty sequence programming or settings, or other software issues, please let me know.

Thanks,

John

Comments (42)

  1. John L reporter

    Hi, I just wanted to give you an update. Because I ran out of time yesterday afternoon, I decided to run again with Nightly #016 last night. After a quick look this morning, it appears that NINA ran properly last night. I did notice some weird behavior with my mount early during setup, but after that it seemed to run fine all night. In any case, regarding the problem I reported above, my major suspicion is that the telescope mount has some sort of issue.

    However, as I said in my original post above, it still remains that NINA did not seem to recognize or react to the fact that the telescope mount shut down. Is this the way it should be? It appears that NINA continued to command slews (unsuccessfully), take exposures (successfully, of the side of my house 😉), do centering/plate solves (unsuccessfully), and auto focusing (unsuccessfully).

    Thanks again for all of your hard work to develop NINA, and I will appreciate any comments you may provide.

    John

  2. Stefan B repo owner

    Hi John,

    Thank you for your details; I haven’t had the chance to look through them in depth. However, there is currently no recovery for a scenario where equipment disconnects mid-session, and the sequencer will just try to continue.
    Recovery options and/or failure handling will come in the future with the sequencer rework that is ongoing in 1.11.

  3. John L reporter

    Stefan, thank you for your response; I will look forward to the newer versions. I will plan to continue using the nightly versions, and if I discover anything new, I will let you know.

    Thanks,

    John

  4. robert hasson

    I find that my ASCOM telescope is disconnected almost systematically after running through a sequence panel. This is on version 1.10. With other imaging programs I never have the issue. PHD2 or APT or even Stellarium keep the connection up through the whole session.

    I also tried upgrading to 1.11 (any version), but then I get an error when trying to connect to an ASCOM camera (get_offset() method not found). This happens with 2 different camera drivers (my own Lumix one, which works on 1.10, and also the Ascom.Simulator).

    Link to 2 relevant log files: https://1drv.ms/u/s!AqAl-gkQ24Fu4H72vodr5pjOqLT7?e=ppH3hC

  5. John L reporter

    Hi, I’m running nightly version #024 tonight. The connection between my telescope (via ASCOM/CPWI) and NINA has disconnected 3 times so far (in about 3 hours of run time). The telescope keeps running, and appears to not disconnect from my PC. In the past, if the telescope disconnected from the PC, I had to go into CPWI and reconnect; I haven’t had to do that tonight. Luckily the disconnects have happened during the middle of an imaging sequence and I’m running without PHD2 and am unguided, so the disconnects haven't caused any problem yet. However, if it disconnects again before the next target, likely I’ll miss that and subsequent targets (something similar happened previously, as I reported in my original post above). When each disconnect happened tonight, I was able to go into the equipment tab in NINA and reconnect the telescope manually.

  6. Dale Ghent

    There’s not much we can do if the driver goes AWOL of its own accord. ASCOM drivers generally keep their own logs of what is going on, or offer the option to turn on such logging through their own options. I installed the current CPWI release from August (version 2.3.5) and tried to see if I could get into its configuration, but it appears you need an actual Celestron mount to get that far. But what I did notice is that it started putting its own logs into Documents\ASCOM\CPWI_Logs. If you see any logs there, they might contain lower level info that is relevant to this issue.

    On that note, I do know that CPWI has generally been problematic in the past and there are people who opt to use the classic Celestron ASCOM driver with their mounts instead of CPWI. I know it has seen some updates recently, so make sure you’re updated on that front. There can also be issues if you’re running ASCOM 6.5 instead of 6.4.

  7. John L reporter

    Hi, I’m on nightly 79. As reported back in November I was having an issue with my mount disconnecting (my scope side setup is a mini pc that runs NINA and CPWI - software interface to my Celestron CPC1100 telescope). In any case, it’s happened again; NINA recognizes that the mount has disconnected, and flashes a warning popup on the screen. Then, if I go into the equipment>telescope and hit connect to telescope, most times the mount will re-connect and keep going. So, I wanted to check back in to see if you have a software update planned to do this function automatically in NINA? And/or, some sort of reporting mechanism to send the disconnect report (and/or other failures, events or conditions) to a cell phone would be fantastic. Thanks.

  8. Dale Ghent

    Blindly reconnecting to a driver that cannot stay connected to its own hardware is not something to be taken lightly. There is no guarantee that a mount is in a sane state upon reconnecting to the driver after a driver or hardware fault caused a disconnect, so there should certainly be a human in the loop. In this case, there is an obvious communications issue between CPWI and the mount, and the root cause of that should definitely be found and addressed. Doing so will obviate the need for any kind of spit-and-baling-wire approach such as blind auto-reconnects in hopes that life will carry on for just a little while more. Did you take the advice in my above response from 2020-11-23 and look for CPWI’s own logs? You might wish to look them over and take them to Celestron support so that they can assist you with their driver+hardware.

  9. John L reporter

    Hi, I understand what you’re saying about the danger of re-connecting blindly. Until I find the root cause, I guess I’ll have to check status during the night. I didn’t look at the old logs, because like I said, this disconnect problem hadn’t happened since I last reported it. I thought I was in the clear. It very well could be CPWI, so I’ll look into that. I will try to contact Celestron, but the TeamCelestron software website seems to have gone dark for a while; not sure if/when I’ll get a response, but I’ll give it a shot. I just updated to the latest ASCOM software. So I suppose that could be suspect as well.

    I did look at the latest logs. I’m no expert at all, and not sure I’m even reading them correctly. I can see the time where it disconnects in the NINA, CPWI and ASCOM logs, but nothing I can see indicates why it’s disconnecting. Funny thing is, unless I’m reading things wrong, it looks to me that PHD2 is still running properly at the same time, and it’s still commanding the telescope even after NINA is disconnected from CPWI! In my setup, usage of the CPWI ASCOM driver is the same for both NINA and PHD2. I looked in the PHD2 log and there’s nothing that indicates that PHD2 saw a disconnect. So, evidently CPWI and the mount are not entirely disconnected. And I would guess that because of that I can eliminate cable or USB port issues, because the USB connection is the same for both NINA and PHD2: from the mini PC to the telescope’s USB connection.

  10. Dale Ghent

    When it comes to intermittent connection losses like this, always check the cables and make sure none are too loose at their connections and are properly restrained to prevent excessive movement at their points of connection. You might also want to swap out the data cable for your mount. Also make sure that enough voltage is being supplied to the mount, as even a momentary drop (such as during a slew) below a certain level can cause the internal electronics to reset. Lastly, I don’t know about Celestron’s support organization, but do contact them as they are the best people to talk to for issues with their product.

  11. Ruediger

    Hi all,

    I have also had this experience with a high-end mount which is connected via Ethernet. Even with perfect hardware there may be hiccups, e.g. caused by electrostatics or cosmic rays, even when shielded. I think it is basically a good idea to have some kind of auto reconnect or at least error handling.
    An uncontrolled mount which keeps moving until it hits a hard limit is the last thing you want when you operate unattended. At the very least I would reconnect and issue a park command or a “stop”. I could also trigger an emergency shutdown via an external switch or power cut-off. But ignoring a disconnect is the worst way to handle this event.
    So an auto reconnect or a trigger would be fine, or at least firing the safe state could be a solution. I think an option for how often to retry and for the delay between reconnect attempts would be a good feature, e.g. try 3 times with 15s in between. These two variables should be editable in options.
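A minimal sketch of the bounded retry proposed above (3 attempts, 15 s apart), written in Python purely for illustration. It is not NINA code and does not use any NINA or ASCOM API: `connect` is a hypothetical callable supplied by the caller, and the function name and parameters are assumptions. Note that it says nothing about whether the mount is in a sane state after a successful reconnect, which is the concern raised elsewhere in this thread.

```python
import time
from typing import Callable


def reconnect_with_retries(
    connect: Callable[[], bool],           # hypothetical: returns True once connected
    attempts: int = 3,                     # "how often" option from the proposal
    delay_s: float = 15.0,                 # "delay for reconnect" option
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call connect() up to `attempts` times, pausing `delay_s` between tries."""
    for attempt in range(1, attempts + 1):
        try:
            if connect():
                return True
        except Exception:
            # Assumption for this sketch: any driver error counts as a failed try.
            pass
        if attempt < attempts:
            sleep(delay_s)                 # no pause after the final attempt
    return False
```

If every attempt fails, the caller still has to decide what to do next, for example aborting the sequence and bringing the equipment into a safe state.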

    BTW: Regarding the argument that it is not a good idea to simply reconnect since you do not know the state: that is what I have to do anyway when working remotely. There is no other option at all. And all the better mounts are always in a coherent state regardless of their communication state, since they have their own controller. Hence this argument does not really hold.

    Thanks for considering.
    Rüdiger

  12. Dale Ghent

    All mounts act differently so it cannot be assumed that what is the case for one situation will be the case for another. Some mount controllers have watchdog functions that can stop tracking or park the mount upon loss of connection from the PC. Some mounts depend completely on the ASCOM driver for their management functions and must be manually initialized. The disconnect can also be due to a software fault such as a crash in the driver, and automatically reconnecting and restarting the driver will cause an endless loop of crashes that could cause further harm. All these reasons are why it’s best to involve a human in what can be an unexpected excursion from the norm.

  13. Ruediger

    Hi Dale,

    That’s true for some mounts, but not too many. But all remotely used mounts are fail-safe, and a reconnect is absolutely safe, even mandatory. So I see no reason not to integrate some functionality because some HW cannot handle it.

    Everywhere you have hardware which is capable of something or not. Following this argument, you would have to remove two thirds of all functions because they are HW-specific.

    Also, my human interaction is always the same: hit “connect” again. Nothing to decide at all. There is no alternative to reconnecting.

    Why not simply make it optional to be on the safe side? Enable reconnect yes | no. So the user can decide and it would be very helpful for many users.

    Cheers

    Rüdiger

  14. Dale Ghent

    Again, we cannot play games or guesses with the many situations that can come from the wide range of hardware and software faults that can occur. Even making it optional does not make sense because that would be asking - or enticing - the user to make a bet that things will always be ok after an unexpected system fault. I have personally seen mounts that lose their state because of a sudden low voltage transient and must be completely reinitialized. I’ve seen drivers crash endlessly because of a bug. The usual case might seem benign, but that isn’t always the case and the worse cases aren’t predictable, nor are they even really detectable. Blind assumptions, even if they are made optionally, are never the correct course of action. In this case, the user needs to intervene, correct the fault, and ensure a consistent state of their equipment.

  15. Stefan B repo owner

    Hi,

    The ASCOM driver is not a direct interface to the hardware, but all hardware specifics need to be handled inside the driver, so that a common unified interface can exist. Just because a cosmic ray hits the Ethernet cable, the connection state should not flick to false. In all the years using EQMOD it never disconnected randomly even once, only when there was a serious failure, like a power outage, where human interaction was necessary (the pointing model was lost).

    This comment is not for or against a reconnection / error recovery feature, just want to point out that ASCOM drivers also have some responsibilities to make sure that things are working properly.

  16. Ruediger

    Hi Dale,
    OK, here our experiences differ completely 🙂. I have never seen such behavior with any mount. But on a remote setup there are only two possible reactions:

    1. to reconnect.
    2. to abort the sequence and shut down everything

    There are no alternatives when running unattended.

    Hi Stefan,
    Totally agree. And that’s exactly what e.g. the 10 Micron driver does. It recovers immediately. But that does not help when NINA is disconnected. The complete setup is actually working flawlessly after recovery, but NINA is not aware of it.

    To both: I totally accept your arguments, though I have a different point of view and the requirements for unattended operation are a bit different. In any case, it would make sense to be able to react to such an event. It makes no sense that NINA continues to execute the plan if a critical event like a disconnect happens. At the very least, I must be able to execute a script which shuts down the mount, parks it, switches it off, or sends an SMS or message. Letting the mount run uncontrolled without the operator being informed is a big risk. Even triggering the “is safe” state would be a help.

    For me the worst case is a disconnect from the mount while everything keeps moving until it reaches the hard limits. This has happened to me twice. My workaround was to set up a watchdog which sends a stop, park or shutdown command via scheduler in the morning.

    The reason for the very short disconnects was the NIC, because Win10 was toggling the link state for a fraction of a second. So it was not a hardware issue, but the OS. There are many potential reasons. Also, Win10 sometimes re-enumerates the USB tree, which sometimes leads to device disconnects without any real problem.

    Anyway, many thanks for this discussion and sharing your point of view.
    I can implement a solution to cover my use case via script based tools.

    Cheers Rüdiger

  17. Stefan B repo owner

    The one thing I don’t understand is: why does the 10 Micron mount driver report that it is not connected when it immediately recovers anyway? That makes no sense to me from an ASCOM driver perspective.

  18. Ruediger

    I can only guess, but it looks like it takes too long. NINA is already disconnected and it is probably toggling connection state in between. But that is only guessing.

  19. robert hasson

    I can only confirm Ruediger’s experience in a different setup. Here is mine:

    1. Skywatcher Synscan app on mini PC.
    2. HEQ5 Pro connected either via serial cable (e.g. EQMOD cable) or, more recently, via WiFi adapter (both the SW original and a manually created DIY WiFi serial repeater)
    3. NINA connects on first attempt via the Synscan mobile ASCOM driver to the Synscan app running on the SAME mini PC.
    4. After (typically, but not verified systematically) one sequence, NINA reports that the connection with the driver has stopped (regardless of the number of retries or the timeout I have set up in the ASCOM driver)
    5. At the same time PHD never reports any such problems.
    6. If I fall back to APT, the connection lasts through the night for however long my sequence is.
    7. I can always reconnect to the mount from NINA on the equipment tab after such a disconnect

    Hence, in my experience this happens with or without a cable, at a much more frequent rate than could be accounted for by cosmic rays. No power outage can explain this, as all equipment is running off the same mains and on the same PC.

    Could it be that

    1. the logic for sensing disconnect is too drastic?
    2. on disconnect a “retry” attempt could be instantiated to make sure that the disconnect is truly effective?

    Maybe the root cause is with the Synscan Mobile ASCOM driver; however, as I mentioned, I do not see disconnects from other ASCOM clients like APT, PHD or Sharpcap. I have also had no disconnects on the INDI platform (on Kstars/EKOS on Raspberry).

    All I am asking is for some double checking of the connection routine within NINA, which seems to be the only client causing this issue. Until this is fixed I will have to stay with APT, I am afraid.

  20. Stefan B repo owner

    The disconnect logic is simple. The driver reports a disconnect, and NINA considers it disconnected unexpectedly. If there is an internal retry mechanism inside the ASCOM driver, why does it report a disconnect?

    There is no way to see on the client side why a disconnect happened from the ascom interface, so we have to treat those events equally.

    The only way to get around this would be to completely ignore what the driver reports. This might be worth a discussion on the ascom board on what the connection property should reflect on the driver level and what it should not.
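The client-side logic described in this comment can be sketched as follows. This is an illustrative Python sketch, not NINA’s actual implementation: `ConnectionMonitor`, `read_connected` and `on_unexpected_disconnect` are hypothetical names, and `read_connected` stands in for whatever the driver reports as its connection state. The point it shows is that the client only sees a boolean flip, with no reason attached.

```python
from typing import Callable


class ConnectionMonitor:
    """Flags a connected -> disconnected transition that the client did not ask for."""

    def __init__(
        self,
        read_connected: Callable[[], bool],          # hypothetical: driver's reported state
        on_unexpected_disconnect: Callable[[], None],
    ) -> None:
        self._read_connected = read_connected
        self._on_unexpected_disconnect = on_unexpected_disconnect
        self._was_connected = False
        self._client_requested_disconnect = False

    def request_disconnect(self) -> None:
        # Called when the user deliberately disconnects the device.
        self._client_requested_disconnect = True

    def poll(self) -> bool:
        connected = bool(self._read_connected())
        if self._was_connected and not connected:
            # The driver gives no cause, so every unrequested drop is treated alike.
            if not self._client_requested_disconnect:
                self._on_unexpected_disconnect()
        if connected and not self._was_connected:
            self._client_requested_disconnect = False  # new session, clear the flag
        self._was_connected = connected
        return connected
```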

  21. Ruediger

    Hi Stefan,
    I think it is a serious design flaw (by ASCOM) to presume a 100% stable connection and that any disconnect is related to the actual device. I think there is no doubt that this should be handled in ASCOM in some way.

    But the question is what to do in the meantime to compensate for this design gap, aside from a fundamental ASCOM discussion. As Robert mentioned, other programs try to reconnect after an unexpected disconnect, since the vast majority of disconnects are due to simple, one-off communication issues between hardware and ASCOM driver and can be solved by an easy reconnect. Not a big thing.

    I understand your and Dale’s point of view that this is an ASCOM issue. But in this case it should be closely evaluated whether continuing the sequence is valid behavior, since dithering, targeting and many more functions will consequently fail during execution and may even produce highly dangerous situations (mount limits etc.)

    At minimum, if you do not want to implement a reconnect, there should be a trigger “On unexp. disconnect of device camera | mount | guider” terminate sequence and trigger an emergency script, so I can get the gear in a defined safe state.
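A rough sketch of the trigger suggested here, in Python for illustration only. Nothing in it is an existing NINA feature: `handle_unexpected_disconnect`, `abort_sequence` and `emergency_command` are hypothetical names, and the emergency script is whatever the user provides to park, stop or power down their gear.

```python
import subprocess
from typing import Callable, Sequence


def handle_unexpected_disconnect(
    device: str,                           # e.g. "camera", "mount" or "guider"
    abort_sequence: Callable[[], None],    # hypothetical hook that stops the running sequence
    emergency_command: Sequence[str],      # user-supplied script and its arguments
    timeout_s: float = 60.0,
) -> int:
    """Stop the sequence first, then run the emergency script with the device name appended."""
    abort_sequence()
    result = subprocess.run(
        [*emergency_command, device],
        timeout=timeout_s,                 # raises TimeoutExpired if the script hangs
        check=False,                       # the caller inspects the exit code instead
    )
    return result.returncode
```

The sequence is aborted before the script runs so that no further slews or dithers are issued while the equipment is being brought into a defined state.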

    But actually I would prefer an optional soft reconnect, since this could save the observing night and solve 90% of the problems.

    Many thanks for this interesting discussion.

    Cheers
    Rüdiger

  22. Stefan B repo owner

    I have posted a query regarding this ASCOM design into their group today. First I want to get the official stance on how the Connection Property should behave. Once we have this info we can think about further steps on how to handle your specific scenario where there is no hardware failure but just a small driver hiccup.

    Creating a trigger as a plugin to do just a simple reconnect is also already possible if you know a bit of C#. (However, the plugin system is still in development and plugins can break with new nightlies.)
    The new plugin system is there to cover the cases which don’t necessarily make sense for the broad variety of users.

  23. Ruediger

    Many thanks for your support. Very much appreciated.

    BTW:
    I don't think that only a small group of users is bothered by these disconnect issues. Only most of the “interactive users” just hit “connect” again. That's it and not a big thing. Problem solved.
    Here locally all AP friends are well aware of this disconnect issue. That's the reason why many simply say: I stay awake and observe it running just for such a case. I know some guys from AB who refrain from running unattended exactly for this reason.

    Addendum: I think error handling in any situation is very important feature and crucial for a good sleep 😃

  24. Dale Ghent

    So what if the driver does not set Connected=false while it is trying to reconnect in the background, and NINA (or something like PHD2) sends a command (slew, dither/pulse guide, etc.) during this period? Are you assuming that the reconnection attempt in the background is always going to succeed instantly? What if it takes a while, or never succeeds at all? During this period the app will continue to operate under the belief that the mount is working fine. What happens to these commands that are being issued? Does the driver queue them? For how long? The app is also waiting for these to complete. How should the driver handle it if multiple dithers stack up while PHD2 is also sending pulseguide commands that are also stacking up? All of this may be happening while one or more exposures might be going on.

    This attempt to be clever or throw caution to the wind with wild assumptions is rather misguided. FIX the underlying problems that cause the connection issues and you won’t ever have to contemplate these maneuvers. This kind of terrible gaming is just shoving the problem off and is lazy.

  25. Ruediger

    Hi Dale,

    your points are valid to a certain extent, but what are the alternatives and consequences? A failed reconnect cannot make things worse than they already are. And I actually prefer trying to recover the system than let it run uncontrolled since the sequence is continuing.

    There are always communication interruption possible, which have to be considered in any design. Designing any software (here ASCOM) on the assumption of a 100% reliable connection is ridiculous. If you cannot handle a secure recovery, then you have to make sure that in case of a connection loss the system is switched to a well-defined safe state. Any OS (e.g. “system halt”, “kernel panic”, BSOD), any serious software, and any machine control does this. To continue without any reaction to a disconnect is potentially dangerous.

    I agree the target should be to find the root cause for a disconnect, but hardware also becomes defective, cables can break, and many other unplanned and uncontrollable situations can cause interruptions. Even here the switches are configured as hot standby failover, but you lose one or two pings of heartbeat until the other takes over.

    Error handling and robustness is a part of any serious SW implementation. Or do you consider “find your underlaying problem proactively” is a substitute for error handling? Sorry, but this is not feasible. Unexpected things cannot be proactively excluded, only minimized. Reasonable and safe error handling (in whatever form or component) is mandatory. Shoving the problem off to “get your connection 100% stable” is not the answer to an undefined system state after an error/disconnect.

    Also, to make this very clear: this is not the fault of NINA, but NINA has to react in an appropriate way to these design flaws of ASCOM.

    So, concluding from your arguments and the problems you pointed out with reconnecting, the only consequence could be to abort the sequence and trigger a fail-safe script to terminate everything in a defined state. That's exactly what I also suggested as an alternative.

    Rüdiger

  26. Dale Ghent

    > A failed reconnect cannot make things worse than they already are.

    Again, assuming always-good outcomes to a fault situation.

    > And I actually prefer trying to recover the system than let it run uncontrolled since the sequence is continuing.

    Recovery from a problem requires knowledge about the situation. NINA, or any ASCOM application, does not have any visibility into why a driver suddenly has set its Connected property to false. Zero. None. Nada. There could be any range of fault, and the driver and/or the mount controller itself can be confused as to what the disposition of the mount is. On top of this, there can be influences on the mount outside of the driver and application that add to the confusion.

    > There are always communication interruption possible, which have to be considered in any design.

    Yes, and communication between the driver and the hardware is a concern between the driver and the hardware. ASCOM applications have no visibility into this area of operation. ASCOM apps do not know that the driver is connecting to the mount via serial, or WiFi, or ethernet, or is using tin cans and string, and ASCOM apps should not concern themselves with the particularities of this and if there are errors. This is all for the driver to handle accordingly.

    It has always been the hard stance of NINA that faults at the lower levels must be addressed directly by the components involved at the level(s) in question. IF there is a bug in an ASCOM driver, the ASCOM driver’s vendor must fix it. IF there is an issue with the mount’s wifi connection, then an alternative connection method should be used or the wifi operating environment be improved, and so on. NINA, or any ASCOM application, cannot guess how best to proceed based on the extremely minimal amount of information that is available to it.

    > Error handling and robustness is a part of any serious SW implementation. Or do you consider “find your underlaying problem proactively” is a substitute for error handling?

    Error handling and fault recovery (errors and faults are not the same) are, as I have already said, the responsibility of the layers in which the fault or error has taken place. It’s simply because the information needed to recover is most readily available at those levels and tends to become unavailable or highly abstracted or generalized in higher levels. This has been a universal concept in open systems design for decades, and not just some wild concept that I just now invented. If your ethernet port is killing the TCP connection that your 10micron driver makes to the mount, then you need to address and fix the causes of this either in the driver that is making the TCP connection, the controller firmware or with your ethernet hardware and its driver.

    • The ASCOM driver and/or mount controller needs to be more resilient to UDP or TCP connection faults
    • The ASCOM driver must handle reconnections in a better way without doing something that interrupts the applications that use it
    • The ethernet driver has a bug and must be fixed by its vendor
    • The ethernet port and associated hardware is faulty or failing and must be replaced
    • If there is an ethernet switch involved between the host and the mount, that needs to be investigated
    • The ASCOM driver’s configuration of the UDP or TCP connection may be inadequate in terms of socket options and timeouts
    • And so on
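As a concrete illustration of the socket options and timeouts item in the list above, here is a hedged Python sketch of how a driver-level TCP connection could be configured to detect a dead link. It is not taken from any real mount driver: the function name and the default values are assumptions, and the `TCP_KEEP*` tuning constants are platform-dependent, so they are applied only where the platform exposes them.

```python
import socket


def configure_mount_socket(
    sock: socket.socket,
    io_timeout_s: float = 5.0,   # assumed value: fail a blocked send/recv after 5 s
    idle_s: int = 10,            # assumed value: idle time before the first keepalive probe
    interval_s: int = 5,         # assumed value: gap between probes
    probes: int = 3,             # assumed value: unanswered probes before the link is dead
) -> None:
    """Apply an I/O timeout and TCP keepalive settings to an existing TCP socket."""
    sock.settimeout(io_timeout_s)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tuning = (
        ("TCP_KEEPIDLE", idle_s),
        ("TCP_KEEPINTVL", interval_s),
        ("TCP_KEEPCNT", probes),
    )
    for name, value in tuning:
        option = getattr(socket, name, None)  # not defined on every platform
        if option is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
        except OSError:
            pass  # the OS rejected this tuning knob; basic keepalive stays enabled
```

With settings like these the layer that owns the connection learns about a broken link within a bounded time and can decide how to recover, which is where the comment argues that decision belongs.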

    Outsourcing the problem up the stack, where the context of what is happening at the lower levels is generalized at best and completely absent at worst (which is the ASCOM case), is never the best way to deal with a problem. I’m sorry if these lower-level problems are harder to deal with, or the fix might take more time, or the vendors are not as accessible as we may be; that is just the situation. The problem’s root cause will continue to exist until it is directly addressed and, in some cases, the problem can grow worse over time if left unaddressed, such as in the case of failing hardware.

  27. Ruediger

    Hi Dale,
    I am agreeing on your basic theoretical approach. But let me ask one question: When your windows system crashes once every 2-3 month, do you start debugging or even reinstalling everything or do you simply restart your PC? I agree debugging would make sense but it is not a practical approach. I simply reboot.

    Same for a disconnect: I simply reconnect. Because not doing with certain likelihood of success it will definitely end up up in physical limits of the mount, since it continuous tracking. Other devices via USB are anyways prone to disconnects, since USB is not stable per se.

    Also your explanation is fully supporting my argument: Since ASCOM does not know why it fails and disconnects (no error is provided) the only consequents is to stop everything (shutting off power via remote switch over other communication e.g. serial port). Continuing would be grossly negligent.

    BTW: What does a pilot when the engine fails? He tries to restart it, and not analyzing what went wrong. This might also lead to bigger catastrophe because the state is unknown, but there is a certain chance to get it running again and prevent the crash. A detailed failure analysis you can do after the event, but not during the night. Same with the mount. Doing nothing will test the hard limits and in worst case your money purse.

    But this topic is not worth getting into a dispute over. I take a quite practical approach, since we have to live with ASCOM as it is. It is a very old design. My workaround is a watchdog scheduler that fires a “stop” and a “mount shutdown” at the planned end of the session to prevent the mount from running into its hard limits. So in the worst case I only lose imaging time.
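
    A watchdog of this kind could be sketched roughly as below, in Python. This is a minimal illustration only, not the scheduler actually used: `run_watchdog`, `stop_mount` and `shutdown_mount` are hypothetical names, and the two callables stand in for whatever out-of-band mechanism (e.g. a remote power switch) issues the real commands.

```python
import time
from typing import Callable


def run_watchdog(
    deadline: float,
    stop_mount: Callable[[], None],      # hypothetical: sends the "stop" command
    shutdown_mount: Callable[[], None],  # hypothetical: powers the mount down
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
    poll_seconds: float = 30.0,
) -> None:
    """Wait until `deadline` (in the clock's units), then stop and shut down."""
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            break
        # Sleep in short slices so a long night does not rely on one huge sleep.
        sleep(min(poll_seconds, remaining))
    try:
        stop_mount()
    finally:
        # Shutdown is attempted even if the stop command raised an error.
        shutdown_mount()
```

    The clock and sleep functions are parameters only so the logic can be exercised without waiting; in normal use the defaults apply and `deadline` would be something like `time.monotonic() + seconds_until_planned_end`.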

    Many thanks for sharing your point of view. Much appreciated.

    Rüdiger

  28. John L reporter

    Hi Stefan, just checking back to see if you learned anything useful from the ASCOM group?

    Thanks,

    John

  29. Dale Ghent

    John, if the kind of problem you’re experiencing were normal and many users chronically encountered it as you are, I might be sympathetic to this ask. But that isn’t the case, and the root causes of these sorts of problems are always with the driver itself or the connection between the system and the mount. To me, the onus is not on the app to paper over the issue for the user by blindly reconnecting in the hope that all remains well.

    The fact of the matter is this: You have a unique systemic problem that exists in the layers and systems external to NINA’s purview, and those problems should be addressed where they are. You have not engaged Celestron support on this issue after being asked to and, if you have, you have not relayed the results of that here. I have a hard time being convinced that developer hours must be invested to invent a system that would not need to exist if the chronic issues were directly addressed through the proper support channels - support channels which preclude NINA and 100% involve Celestron.

  30. Dale Ghent

    Rüdiger,

    As a pilot in training myself, your analogy is out of bounds. A malfunctioning ASCOM driver is not a life or death situation. Such appeals to extremes are not going to work well.

    If my Windows PC blue-screens regularly, I would not simply reboot and carry on. I would suspect that there is likely something wrong with my RAM, as memory corruption is a common reason for kernel panics, known as blue screens in the Windows world. There are memory stress testing tools I can employ to directly test my memory for faults and pinpoint the stick. I can also review the contents of the panic message to get an idea if it was a fault in a driver. There are many ways to directly address such a situation. Simply rebooting and hoping it doesn’t happen again is not the only option.

    If your 10Micron driver is not properly recovering from a communications fault, then it should be made more resilient. If your communications fault is caused by a system issue that you’ve apparently already been able to identify, then you should correct that fault. Doing so would render this entire concept moot and unnecessary.

  31. Stefan B repo owner

    @John L

    The answers there are pretty much what we already said too. https://ascomtalk.groups.io/g/Developer/topic/connected_property_and/83071409?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,83071409

    When a driver reports its connection state as false and then flips back to true without any request to do so by an ASCOM client, but rather due to some internal retry mechanics, that is a driver issue: driver-specific behavior cannot, and should not, be handled in the client.

  32. Ruediger

    Dale,

    From reading your vita I knew that you are a pilot in training and that you are in the IT business as well, like me. I studied computer science and have been working as a leading software architect for more than 20 years (still active). I also have more than 30 years of experience with AP.
    But you are mixing up many things: problem management, which deals with removing the root cause, versus incident management, which gets things working again with a workaround; and also appropriate error handling and safe states.
    I have a completely different point of view and different experiences. Your approach is more theoretical than practical.

    But I have to be honest: for me this issue is not important enough to start a lengthy discussion. I have my own solutions in place. So for me this issue is closed.

    But one thing I want to emphasize:
    I accept your different point of view, since I think it is a good thing to have different opinions. This will bring everyone forward. An open and frank exchange of opinions and experience is a benefit for all.

    Again many thanks for sharing your point of view.
    Rüdiger

  33. John L reporter

    Stefan, thanks for letting me know. I’m in the process of preparing a request with my specific info to send to Celestron so they can look into this problem. I’ll include all of the info provided.

    Thank you, I appreciate it.

  34. John L reporter

    “The fact of the matter” is that at the top of this thread, Stefan said “Recovery options and/or failure handling will come in future with the sequencer rework that is ongoing in 1.11”

    So, originally, I was holding out hope that some new feature would be forthcoming that would help with my problem. You are the one that’s made it abundantly clear that won’t happen. Message received.

    I have never insisted that NINA is the problem. I’m not insisting that you develop anything different. And once you guys explained it, I was fine. Since then I’ve just been trying to gather enough info and learn enough so I can figure out why this is happening with my setup and then ask the right questions. And, I’m working at my own pace. I’m planning on contacting Celestron soon. Didn’t know I was required to do that before asking another quick question here.

    John

  35. Stefan B repo owner

    There are quite a few topics mixed together here.

    What will not happen: randomly reconnecting to equipment that drops its connection due to driver issues or, in general, workarounds for non-compliant drivers.

    What will come in the future: more error handling and recovery in the advanced sequencer. It just needs time, and the sequencer will be enhanced incrementally. A solid baseline will be established with what we have and then enhanced step by step.

  36. John L reporter

    Thanks Stefan, I appreciate your response.

    Believe me, I am not complaining about NINA. I really do like this software a lot. You guys have done an excellent job!

  37. Ruediger

    Thanks Stefan!

    That is exactly the point: Some error handling to react to an unexpected event and enable me to get into a safe state/condition with the equipment.

    Thanks!

    Cheers Rüdiger

  38. Dale Ghent

    It’s not required that you engage your vendor’s support before coming here as problems can be complex and it might take some time and a trained eye or two to pick out what and where the real problem is. But when the problem is clearly an issue with the vendor’s hardware or software, it is certainly highly appreciated that an effort is made to pursue and (hopefully) resolve the issue with them. Your issue, in particular, can range in cause from a faulty electronics board, to controller firmware, to an issue in the CPWI driver and everything in between. We here on the NINA project don’t have the knowledge or capabilities to make these in-depth determinations and it would be asking a lot for community volunteers to stand in as support agents for any vendor’s commercial products. However, we are always happy to assist a vendor in understanding a user’s issue if we have information or observations that are helpful.

    I understand that astrophoto systems are a complex layer cake of technologies that come from a multitude of vendors and touch almost every area of computing, electronics, and mechanical engineering. Apps such as NINA try to lasso all these things together under a single pane of glass, so any errors or issues that exist at lower levels of this cake of madness are often first apparent through the app which is used to interface with it all. This does not mean that the responsibility to address these issues always lies with the app. If there is a true fault at those lower levels, then that’s where the fix needs to be found and applied. Papering over these faults at the upper layer where NINA resides might seem like the most convenient thing to do or be a quick workaround, but it adds technical debt. Importantly, it doesn’t directly address the underlying root cause, which is an anti-pattern in software development. The problem will still exist, albeit in an obfuscated form, and it could metastasize further depending on its nature, potentially rendering the original workaround less useful.

    We’re all for helping you get rolling here. We really and truly are. Spontaneous disconnects of any equipment aren’t normal and usually indicate a low-level problem that needs to be explored and addressed by the owners of those areas; usually the equipment vendor. We’re happy to help but prescribing solutions before the problem is even properly characterized is putting the cart before the horse.

  39. robert hasson

    couple of (ok maybe more) observations:

    1. These disconnects seem to happen with multiple types of mounts/drivers. Mine is ASCOM Synscan Mobile connecting to the Synscan App. The disconnect happens whether over Wi-Fi or COM/serial.
    2. For me, the status on the imaging screen still shows connected during sequence shooting; however, when I go to the equipment tab, it has lost the connection!? Would that mean that the status is not synced across the various tabs?
    3. What I was looking for is what Stefan hinted at: a plugin or an option in the sequencer to “connect on sequence”, the same way there is a “center” or “guide” command in the sequence editor.
    4. There seem to be many more lines of text in this exchange than the lines of code this option would have required, all for the virtue of principles and purity. Sometimes workarounds are not defeats or concessions of moral high ground… Just saying…

    Robert

  40. Dale Ghent

    Robert,

    Have you contacted SkyWatcher at all regarding your issues? I feel like I’m repeating myself here.

  41. Stefan B repo owner

    Device reconnects after unknown failures are not trivial, even if they sound like they are at first glance.

    The application has no deeper knowledge about why equipment gets disconnected, especially when using ASCOM drivers, which abstract that layer even further. A blind reconnect can potentially worsen the situation. If you want to brute-force this, you can abuse the connector plugin in the advanced sequencer and drag it into your imaging loop to always connect to the equipment - or do nothing if it is already connected.
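
    The "connect, or do nothing if already connected" behavior amounts to an idempotent check, which can be sketched in Python as below. This is an illustration of the idea only, not NINA's or the connector plugin's code: `Connectable` and `ensure_connected` are hypothetical names, and the device interface is an assumption made for the example.

```python
from typing import Protocol


class Connectable(Protocol):
    """Hypothetical minimal device interface assumed for this sketch."""

    connected: bool

    def connect(self) -> None: ...


def ensure_connected(device: Connectable) -> bool:
    """Connect only if the device reports itself as disconnected.

    Returns True if a connect attempt was made, False if nothing was done.
    """
    if device.connected:
        return False  # already connected: do nothing
    # Blind attempt: the reason for the drop is still unknown at this level.
    device.connect()
    return True
```

    Calling this once per loop iteration is harmless while the device stays connected, but it does nothing to diagnose why a disconnect happened.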
