NINA 174 hard crash with memory access error

Issue #962 closed
Barry King created an issue

I have not tried to reproduce this, since I do not yet have a use case for synchronized imaging, the possible culprit. The crash occurred in build #174. I have since updated to #176 and removed the suspected Synchronization plugin.

First, what was observed:

Imaging had run for several hours. Sequencer successfully switched targets. Some time later Windows event log shows NINA crashed with the following:

  <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Application Popup" Guid="{47bfa2b7-bd54-4fac-b70b-29021084ca8f}" />
    <EventID>26</EventID>
    <Version>0</Version>
    <Level>4</Level>
    <Task>0</Task>
    <Opcode>0</Opcode>
    <Keywords>0x8000000000000000</Keywords>
    <TimeCreated SystemTime="2021-10-31T06:24:28.4355662Z" />
    <EventRecordID>6865</EventRecordID>
    <Correlation />
    <Execution ProcessID="2148" ThreadID="5144" />
    <Channel>System</Channel>
    <Computer>GEM45G</Computer>
    <Security UserID="S-1-5-18" />
    </System>
  <EventData>
    <Data Name="Caption">.NET-BroadcastEventWindow.4.0.0.0.797c81.0: NINA.exe - Application Error</Data>
    <Data Name="Message">The exception unknown software exception (0xc0020001) occurred in the application at location 0x00007FF990E94F99. Click on OK to terminate the program</Data>
    </EventData>
    </Event>

This caused a hard crash of NINA with no dump file, which was odd. I have the PC set to reboot on failure and it booted a few seconds later.

Tail of the last log entries:
2021-10-31T01:24:18.9208|INFO|SequenceItem.cs|Run|213|Finishing Category: Guider, Item: Dither
2021-10-31T01:24:18.9208|INFO|SequenceItem.cs|Run|213|Finishing Category: * Instruction Set *, Item: NINA.Sequencer.Container.SequentialContainer, Strategy: SequentialStrategy, Items: 1, Conditions: Triggers:
2021-10-31T01:24:18.9208|INFO|MeridianFlipTrigger.cs|ShouldTrigger|287|Meridian Flip - (Side of Pier usage is disabled) There is still time remaining. Max remaining time 03:08:47.7070000, next instruction time 300.
2021-10-31T01:24:18.9208|INFO|SequenceItem.cs|Run|195|Starting Category: Camera, Item: TakeExposure, ExposureTime 300, Gain -1, Offset -1, ImageType LIGHT, Binning 1x1
2021-10-31T01:24:18.9318|INFO|CameraVM.cs|Capture|709|Starting Exposure - Exposure Time: 300s; Filter: ; Gain: 101; Offset 10; Binning: 1x1;
2021-10-31T01:24:25.2091|INFO|SynchronizationPlugin.cs|StartHeartbeat|124|Stopping heartbeat
2021-10-31T01:24:25.2091|INFO|SynchronizationPlugin.cs|StartServerHeartbeat|103|Stopping server heartbeat
2021-10-31T01:24:25.2121|INFO|SynchronizationPlugin.cs|Teardown|138|Shutting down server

(Three seconds later, NINA crashes.)
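For anyone cross-checking the timing, here is a minimal sketch that compares the Event Log `SystemTime` (UTC) with the last NINA log entry. It assumes the NINA log is written in local time at UTC-5; the issue does not state the offset, it is only inferred from the two timestamps. The helper names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the NINA log uses local time at UTC-5 (not stated in the issue).
LOCAL_TZ = timezone(timedelta(hours=-5))


def parse_event_time(system_time: str) -> datetime:
    """Parse an Event Log SystemTime value such as 2021-10-31T06:24:28.4355662Z."""
    stamp = system_time.rstrip("Z")
    head, _, frac = stamp.partition(".")
    micros = (frac + "000000")[:6]  # Event Log has 7 fractional digits; keep 6
    parsed = datetime.strptime(f"{head}.{micros}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def parse_log_time(stamp: str) -> datetime:
    """Parse a NINA log timestamp such as 2021-10-31T01:24:25.2121 (local time)."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=LOCAL_TZ)


crash = parse_event_time("2021-10-31T06:24:28.4355662Z")
last_entry = parse_log_time("2021-10-31T01:24:25.2121")
gap_seconds = (crash - last_entry).total_seconds()
print(f"{gap_seconds:.3f}")  # prints 3.223
```

Under that offset assumption, the crash popup was logged roughly 3.2 seconds after the "Shutting down server" entry.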

The following is curious and may be noise, but I am including it anyway:

Earlier in the log:
2021-10-31T01:08:33.7692|INFO|MeridianFlipTrigger.cs|ShouldTrigger|287|Meridian Flip - (Side of Pier usage is disabled) There is still time remaining. Max remaining time 03:24:33.9470000, next instruction time 300.
2021-10-31T01:08:33.7692|INFO|SequenceItem.cs|Run|195|Starting Category: Camera, Item: TakeExposure, ExposureTime 300, Gain -1, Offset -1, ImageType LIGHT, Binning 1x1
2021-10-31T01:08:33.8549|INFO|CameraVM.cs|Capture|709|Starting Exposure - Exposure Time: 300s; Filter: ; Gain: 101; Offset 10; Binning: 1x1;
2021-10-31T01:08:44.7648|INFO|StarDetection.cs|Detect|221|Average HFR: 3.97144701191334, HFR σ: 1.70611121611765, Detected Stars 496
2021-10-31T01:08:45.0440|INFO|BaseImageData.cs|FinalizeSave|136|Saving image at C:\Astro\Capture\Spaghetti Nebula\Redcat51\ZWO ASI094MC Pro\2021-10-30\LIGHT\Altair Quad\2021-10-31_01-03-31_Altair Quad_300.00s_-6.30_G101_O10_0012.fits
2021-10-31T01:08:45.1956|ERROR|SynchronizationPlugin.cs|StartServerHeartbeat|107|An error occurred while pinging the server
Grpc.Core.RpcException: Status(StatusCode="DeadlineExceeded", Detail="")
at GrpcDotNetNamedPipes.Internal.MessageReader1.<MoveNext>d__5.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at GrpcDotNetNamedPipes.Internal.MessageReader1.<>c__DisplayClass9_0.<<ReadNextMessage>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Synchronization.Service.DitherServiceClient.<Ping>d__15.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at Synchronization.SynchronizationPlugin.<<StartServerHeartbeat>b__8_0>d.MoveNext()
2021-10-31T01:08:45.2056|ERROR|SynchronizationPlugin.cs|StartHeartbeat|126|An error occurred while pinging the server
Grpc.Core.RpcException: Status(StatusCode="DeadlineExceeded", Detail="")
at GrpcDotNetNamedPipes.Internal.MessageReader1.<MoveNext>d__5.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at GrpcDotNetNamedPipes.Internal.MessageReader1.<>c__DisplayClass9_0.<<ReadNextMessage>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Synchronization.Service.DitherServiceClient.<Ping>d__15.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at Synchronization.SynchronizationPlugin.<<StartHeartbeat>b__9_0>d.MoveNext()
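
To pull entries like these out of a long log, a small parser can help. This is a rough sketch that assumes the pipe-delimited layout seen in the excerpts here (timestamp, level, source file, member, line number, message) and treats any line that does not start with a timestamp as a continuation of the previous entry, such as a stack trace. `LogEntry` and `parse_entries` are illustrative names, not part of NINA, and the sample messages are abbreviated.

```python
import re
from typing import List, NamedTuple


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    source: str
    member: str
    line: int
    message: str


# A new entry starts with a timestamp followed by a pipe.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+\|")


def parse_entries(text: str) -> List[LogEntry]:
    entries: List[LogEntry] = []
    for raw in text.splitlines():
        if ENTRY_START.match(raw):
            # maxsplit=5 keeps any pipes inside the message intact
            ts, level, source, member, line, message = raw.split("|", 5)
            entries.append(LogEntry(ts, level, source, member, int(line), message))
        elif entries:
            # Continuation line (e.g. stack trace): attach to the previous entry
            last = entries[-1]
            entries[-1] = last._replace(message=last.message + "\n" + raw)
    return entries


sample = """2021-10-31T01:08:45.0440|INFO|BaseImageData.cs|FinalizeSave|136|Saving image
2021-10-31T01:08:45.1956|ERROR|SynchronizationPlugin.cs|StartServerHeartbeat|107|An error occurred while pinging the server
Grpc.Core.RpcException: Status(StatusCode="DeadlineExceeded", Detail="")
2021-10-31T01:08:45.2056|ERROR|SynchronizationPlugin.cs|StartHeartbeat|126|An error occurred while pinging the server"""

errors = [e for e in parse_entries(sample) if e.level == "ERROR"]
for e in errors:
    print(e.timestamp, e.member, e.line)
```

Filtering on `level == "ERROR"` this way keeps each stack trace together with the entry that produced it.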

From past experience, 0xc0020001 can occur when managed and unmanaged code mix memory access in the Microsoft world. There may be other causes.
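
As a side note, the exception code can be split into its bit fields. The sketch below assumes the value follows the standard NTSTATUS layout (2 severity bits, 1 customer bit, 1 reserved bit, 12 facility bits, 16 code bits). It only shows the structure of the number and does not by itself identify the cause of the crash. `decode_ntstatus` is an illustrative helper name.

```python
def decode_ntstatus(value: int) -> dict:
    """Split a 32-bit status value into NTSTATUS-style fields (layout assumed)."""
    return {
        "severity": (value >> 30) & 0x3,    # 3 means error
        "customer": (value >> 29) & 0x1,    # 0 means Microsoft-defined
        "facility": (value >> 16) & 0xFFF,
        "code": value & 0xFFFF,
    }


fields = decode_ntstatus(0xC0020001)
print(fields)  # prints {'severity': 3, 'customer': 0, 'facility': 2, 'code': 1}
```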

Also, although the Synchronization plugin was installed, it was not part of any sequence and I had never invoked it on purpose; at most, NINA may have loaded it to enumerate the list of available instructions in the sequencer.

I am curious why it tries to connect to the sync service when no sequence has instructed it to do so.

I have uninstalled the Synchronization plugin as a precaution just in case it was the culprit that caused the memory access violation.

Hope that helps. Maybe non-issue in latest code.

Now I need to look at how to automate imaging recovery after a hard failure of NINA. I lost ~5 hrs of imaging opportunity on one of those super dim targets where every photon seems to be needed. That’ll teach me to try and get sleep! The good thing is, odds are decent the target will still be there next clear night.
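
One possible starting point for the recovery idea is a small watchdog that relaunches a process when it exits abnormally. This is only a sketch under stated assumptions: `supervise` is a made-up helper, it knows nothing about NINA, and it only restarts the process; resuming a sequence after the restart is a separate problem. The demo uses a throwaway Python child process in place of the real application.

```python
import subprocess
import sys
import time
from typing import List, Sequence


def supervise(command: Sequence[str], max_restarts: int = 3,
              delay_seconds: float = 5.0) -> List[int]:
    """Run command; relaunch it after a non-zero exit, up to max_restarts times.

    Returns the exit code of every run, in order.
    """
    exit_codes: List[int] = []
    for attempt in range(max_restarts + 1):
        completed = subprocess.run(command)
        exit_codes.append(completed.returncode)
        if completed.returncode == 0:
            break  # clean exit: nothing to recover from
        if attempt < max_restarts:
            time.sleep(delay_seconds)  # short pause before relaunching
    return exit_codes


# Demo with a child process that always fails (a stand-in for a crashing app).
codes = supervise([sys.executable, "-c", "raise SystemExit(1)"],
                  max_restarts=2, delay_seconds=0)
print(codes)  # prints [1, 1, 1]
```

In a setup like the one described here, where the PC reboots on failure, such a script would also have to be started at boot to be of any use.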

Thanks.

Comments (4)

  1. Stefan B repo owner

    Hi,

    The sync plugin recently got an update so that pinging the server only starts when it is required during the sequence, not on startup. So if that was one of the problems, it should occur less often or be gone completely.

    In the Event Viewer, could you check the Application log for the .NET stack trace? The full stack trace should be available there.

  2. Barry King reporter

    Thanks Stefan.

    That change may help, but it is difficult to say, since the actual crash, well separated in time from the exception thrown earlier, does not give sufficient context.

    I agree it is worth checking the location of the memory exception in the #174 build to see if there’s a clue, since if it is a managed/unmanaged code thing, the bug could still be lurking. That's why I looked for a corresponding Application event and a stack trace when I reported the System event above, but found nothing remotely corresponding. Odd, because crash dumps and exception logging are usually pretty dialed in.

    Happy to check other things. There was no crash log in the corresponding NINA folder, or a crash dump of any other kind found so far.

  3. Stefan B repo owner

    Most likely not an issue anymore. Reopen if it still occurs with the latest versions of the plugin and app.
