ZWO: Repeating camera errors - corrupted images - failing image download

Issue #1248 resolved
Ruediger created an issue

Hello,
I am not sure what is going on here, but I am posting it since it could be a real issue. In the last few nightlies I have noticed continuous image corruption, which never happened before, and there was no change to the setup at all. The first few times I ignored it, but now that it happens more often and stalls complete nights, I am worried.

The session started normally and a couple of images were taken as expected, but suddenly the images got corrupted at 22:55. The log also indicates an issue, which then causes some follow-up issues.

What worries me is the error handling. A single failed file can happen, but in this situation it left the session in an unrecoverable state.

Any suggestion or comments welcome.

Rüdiger

Comments (11)

  1. Stefan B repo owner
    • changed status to new

    Failures like these are outside of N.I.N.A.'s control as the image data is just retrieved from the SDK. The ZWO SDK reports some exposure failures in the logs which indicate some communication problems. Almost always this is either some hardware fault or a USB connection problem.

  2. Ruediger reporter

    Hello Stefan,
    many thanks for the quick reply. Defects may occur at any time, that's true, but so far I have only experienced this with N.I.N.A. FireCapture and the ZWO tool suite do not suffer from the problem, which is a bit suspicious. There were also no changes to the permanent setup.
    I could try to replace the ZWO SDK DLL with an older one and see what happens.

    But I am also thinking about how such a failure could be handled within N.I.N.A., because the error occurred once during that session, yet all the following images were affected and corrupted. Maybe it is possible, when something like this happens, to fire a reconnect or reset of the connection in order to restore a defined state and avoid compromising the complete session. An “on error reconnect trigger”, so to speak. This is just an idea. I am aware that this is no solution for the root cause, but I am looking for a way to save unattended sessions.

    Rüdiger

  3. Dale Ghent

    The comparison with FireCapture (as well as SharpCap) is not entirely valid, as those apps initialize the camera into a completely different mode than N.I.N.A. does. We have observed that the mode (video vs single shot) usually makes a difference to the overall reliability of the camera when it comes to problems like this. So saying it works in FireCapture but not in N.I.N.A. isn’t the smoking gun one might think it is. Why this difference exists is unknown and unexplained by ZWO. We also see this same effect with QHY cameras.

    The USB communication is managed by the ZWO-provided SDK library for their cameras. I do not think we have changed it between the versions you have used. At any rate, all responsibility for USB communications reliability rests on that and the ZWO camera USB driver itself. N.I.N.A. interacts with the SDK only at a high, abstract level (take image, set cooling, etc.)

    I am inclined to suggest that you start with the common cause for this problem: cables. A loose USB connector or one with corrosion on the pins. A crushed cable or one that has been flexed too much over its life. Since this is ZWO and they have historically provided those terrible flat-style cables with their products, I very much advise donating any of those to the trash bin and replacing them with quality USB cables.

  4. Ruediger reporter

    Hi Dale,
    many thanks for the elaborate answer.

    Even if FireCapture uses another mode, a defective cable or a bad plug connection should also result in some glitches there. If anything, the probability should increase due to the large amount of data transmitted.

    I think I may have seen some commits in the nightly that could affect this, but I might be wrong.

    You are right: the included flat cables are trash, so I do not use the standard cables. I use high-end, double-shielded silicone cables cut to the exact length. But of course, they can also break or develop a defective plug, even though it is gold plated.

    I think the best way is to swap the cables and check again. My hope was that you guys would see more hints in the log than I do.

    But what I might suggest:
    If such a fatal error occurs (as it was detected and logged), it would be great if:

    1. You could send a message (Pushover, Ground Station, mail…). The pop-ups are only useful if somebody sits in front of the screen, so they are not suitable for unattended imaging. I run my sessions 80-90% unattended.
    2. You could also react in the sequence with a trigger “on fatal error….”

    I think point 1 is generally required when you run N.I.N.A. remotely and unattended. It would be very useful, because currently you miss these fatal errors and only become aware of them the next day.

    If preferred, I can create an improvement for this. Please let me know. Thanks!

    Rüdiger

  5. Dale Ghent

    As I said, ZWO has never offered an explanation for the reliability differences between modes. The USB communication and low-level camera management are handled by their SDK and USB driver. N.I.N.A. interacts with the camera only at a high level and has no influence over what goes on in detail over the USB connection, or within the camera and its own firmware.

    If your experience with camera programming is sufficient to point out changes we made to Nina that we are not aware of that can cause this problem, I would be happy for you to point these out.

  6. Stefan B repo owner

    Regarding the suggestions: failures from instructions are already sent when Ground Station is set to do so.
    Corrupted images like the ones attached are not detected because the SDK does not indicate that they are corrupt, so the application has no way to detect them.

    Instructions also have failure modes in their advanced options, to repeat them or to skip to the end of the sequence. A simple reconnect on any arbitrary error is, however, not a good approach and can cause more issues. For example, a mount connection problem could result in a pier crash if we just reconnect and the mount potentially has its pointing state ruined.

  7. Ruediger reporter

    Thanks both!
    @Stefan: Please correct me if I have misunderstood.

    I can get an alarm when this error happens?

    2023-08-23T23:00:12.4012|ERROR|ASICamera.cs|DownloadExposure|385|ASI: Camera reported unsuccessful exposure: ASI_EXP_FAILED
    2023-08-23T23:00:12.4017|ERROR|ImagingVM.cs|CaptureImage|225

    In that case, I am happy. I will then just add the trigger “Failures to Pushover”.
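
For anyone who wants an external check on top of the notification trigger, here is a minimal sketch of scanning a log for the error quoted above. It assumes only the pipe-separated layout visible in the two log lines in this comment (timestamp, level, source file, member, line number, optional message). The names `LogEntry`, `parse_line` and `exposure_failures` are made up for illustration and are not part of N.I.N.A. or the ZWO SDK.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class LogEntry:
    timestamp: str
    level: str
    source: str
    member: str
    line: str
    message: str


def parse_line(raw: str) -> Optional[LogEntry]:
    """Split one pipe-separated log line; return None if it does not fit."""
    # maxsplit=5 keeps any '|' characters inside the message intact.
    parts = raw.strip().split("|", 5)
    if len(parts) < 5:
        return None
    # The message field may be missing, as in the second quoted line.
    message = parts[5] if len(parts) == 6 else ""
    return LogEntry(parts[0], parts[1], parts[2], parts[3], parts[4], message)


def exposure_failures(
    lines: Iterable[str], marker: str = "ASI_EXP_FAILED"
) -> Iterator[LogEntry]:
    """Yield ERROR entries whose message contains the given marker."""
    for raw in lines:
        entry = parse_line(raw)
        if entry and entry.level == "ERROR" and marker in entry.message:
            yield entry


if __name__ == "__main__":
    sample = [
        "2023-08-23T23:00:12.4012|ERROR|ASICamera.cs|DownloadExposure|385|"
        "ASI: Camera reported unsuccessful exposure: ASI_EXP_FAILED",
        "2023-08-23T23:00:12.4017|ERROR|ImagingVM.cs|CaptureImage|225",
    ]
    for hit in exposure_failures(sample):
        print(hit.timestamp, hit.source, hit.message)
```

Note that this only catches failures the SDK actually reports. As Stefan points out, corrupted frames that the SDK does not flag would not show up in the log and cannot be found this way.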
