Image writes stalling on exFAT

Issue #1025 closed
Douglas Triggs created an issue

Image writes seem to stall when writing to an exFAT device.

With the following setup: Eagle 4 computer, external USB 3.0 thumb drive (plugged into any USB 3.0 port) formatted as exFAT, running N.I.N.A. v1.10 HF3. With this combination, any time a series runs (first noticed with the flats wizard, but it seems to be the same when running a normal sequence), any test images are fine and the first image is fine, but there’s about a 20-second stall on every subsequent frame. The trace looks like this (no errors, nothing else that looks relevant, just incredibly slow):

[2021-12-25T23:09:57.3237] [TRACE] [MemberName] SaveToDisk
[2021-12-25T23:09:57.3237] [TRACE] [FileName] E:\Projects\nina\NINA\Model\ImageData\ImageData.cs
[2021-12-25T23:09:57.3237] [TRACE] [Message] Start: 26.12.2021 06:09:37.306; Stopped: 26.12.2021 06:09:57.323; Elapsed: 00:00:20.0180396

When testing the drive, performance otherwise seems fine: I can copy a 1GB file in a few seconds with a throughput of >30MB/s, or a large set of files with similar performance. And the first file in the series takes about as long as expected (along with any test images from the flats wizard; I assume those are staged on disk, but maybe not. Also, since the logs don’t really tell me which file is the exact problem, maybe it’s actually stalling because the second image is waiting for the first?)

As a workaround, reformatting the USB drive as NTFS fixed the problem, but that’s not ideal since I can only mount it read-only on the machine I use for image processing (macOS).

Honestly I expect that this is potentially going to be really difficult to fix, but OTOH maybe having an existing issue that’s google-able may help someone else who runs into this see the above work-around and they might not pull out as much hair as I did trying to figure out what was wrong.

Comments (10)

  1. James Malone

    I am also having this exact same issue. It only happens in NINA when using exFAT. It’s quite noticeable, and annoying, when shooting calibration frames.

    Using the latest 2.0 nightly.

  2. Dale Ghent

    If there’s an issue here, I’m not convinced that it would even be NINA’s problem to fix. The entire file system access layer in any OS is virtualized to the point where the application does not need to know any particulars about the underlying filesystem (outside of certain permissions management realms, which we do not engage in), and certainly not for ordinary tasks such as creating a file and writing to it, or renaming it. NINA uses the ordinary file handling methods that .NET provides.

    I’m not sure what your actual storage device is, but exFAT is designed with solid state storage in mind. To that extent, exFAT lacks some traditional file system design aspects, such as journaling, that were used to speed up write performance on spinning hard drives. If you are using spinning hard drives as your storage medium, it’s plausible that you’re encountering some effect of this.

  3. Dale Ghent

    Also to add - thumb drives are - not - performance devices. They often have small buffers that fill up and must be flushed as the data is committed to the flash memory cells. M.2 devices are an example of solid state storage that stays performant across sustained writes. These are more appropriate for the task and can be found inside portable USB3 enclosures.

  4. James Malone

    The reason it struck me as weird was that the choice of filesystem made such a difference. It seemed like this would be an implementation detail that NINA should not really care about. Agreed the problem may lie outside of NINA altogether. I’ve seen it with an SDXC card w/ a reader that I use for other applications without an issue, which was another reason it seemed weird.

    I’ll try to repro this with an M.2 over USB 3.1 - it’s a really curious problem to me at this point. 🙂

  5. Dale Ghent

    I just remembered that I have a 500GB M.2 SSD in an external enclosure that is unused, but also formatted with exFAT. It is below. I have a similar enclosure with a 2TB SSD in it, opened up for demonstration purposes (it’s formatted with APFS). Pen is to show scale.

    I plugged this into my Windows box, set NINA to write images to it, and ran NINA with TRACE level logging to see the file write times. It took around 6 seconds to write a 120MB FITS file. I ran this in a continuous loop. The write speeds were fairly consistent:

    2022-02-14T13:03:10.2987|TRACE|BaseImageData.cs|PrepareSave|43|Start: 14.02.2022 06:03:03.607; Stopped: 14.02.2022 06:03:10.298; Elapsed: 00:00:06.6916955
    2022-02-14T13:03:17.1075|TRACE|BaseImageData.cs|PrepareSave|43|Start: 14.02.2022 06:03:10.300; Stopped: 14.02.2022 06:03:17.107; Elapsed: 00:00:06.8073582
    2022-02-14T13:03:23.9144|TRACE|BaseImageData.cs|PrepareSave|43|Start: 14.02.2022 06:03:17.109; Stopped: 14.02.2022 06:03:23.914; Elapsed: 00:00:06.8048679
    2022-02-14T13:03:37.4102|TRACE|BaseImageData.cs|PrepareSave|43|Start: 14.02.2022 06:03:30.681; Stopped: 14.02.2022 06:03:37.410; Elapsed: 00:00:06.7289566
    2022-02-14T13:03:44.0065|TRACE|BaseImageData.cs|PrepareSave|43|Start: 14.02.2022 06:03:37.411; Stopped: 14.02.2022 06:03:44.006; Elapsed: 00:00:06.5948611
    

    I’m thinking that, if you’re using an actual flash thumb drive (rather than a USB-connected SSD), then you are running into the limitations of these small drives, which are not designed for sustained write activity at the scale you find with astro images. The journaling feature of NTFS is likely working to hide this from you.
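
As a rough cross-check of the PrepareSave timings quoted above, the sketch below pulls the Elapsed value out of each trace line and converts it to throughput. It assumes every file is the 120 MB mentioned in the comment (the actual file sizes are not in the log), and `elapsed_seconds` and `throughput_mb_per_s` are illustrative helper names, not part of NINA.

```python
import re

# Trace lines copied verbatim from the comment above.
TRACE_LINES = [
    "2022-02-14T13:03:10.2987|TRACE|BaseImageData.cs|PrepareSave|43|Start: 14.02.2022 06:03:03.607; Stopped: 14.02.2022 06:03:10.298; Elapsed: 00:00:06.6916955",
    "2022-02-14T13:03:17.1075|TRACE|BaseImageData.cs|PrepareSave|43|Start: 14.02.2022 06:03:10.300; Stopped: 14.02.2022 06:03:17.107; Elapsed: 00:00:06.8073582",
    "2022-02-14T13:03:23.9144|TRACE|BaseImageData.cs|PrepareSave|43|Start: 14.02.2022 06:03:17.109; Stopped: 14.02.2022 06:03:23.914; Elapsed: 00:00:06.8048679",
    "2022-02-14T13:03:37.4102|TRACE|BaseImageData.cs|PrepareSave|43|Start: 14.02.2022 06:03:30.681; Stopped: 14.02.2022 06:03:37.410; Elapsed: 00:00:06.7289566",
    "2022-02-14T13:03:44.0065|TRACE|BaseImageData.cs|PrepareSave|43|Start: 14.02.2022 06:03:37.411; Stopped: 14.02.2022 06:03:44.006; Elapsed: 00:00:06.5948611",
]

# Assumption: each file is the 120 MB FITS size quoted in the comment.
FILE_SIZE_MB = 120.0

# Matches the "Elapsed: hh:mm:ss.fffffff" field at the end of a trace message.
ELAPSED_RE = re.compile(r"Elapsed: (\d+):(\d+):(\d+(?:\.\d+)?)")


def elapsed_seconds(line):
    """Return the Elapsed field of one trace line as seconds."""
    match = ELAPSED_RE.search(line)
    if match is None:
        raise ValueError("no Elapsed field in line: " + line)
    hours, minutes, seconds = match.groups()
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def throughput_mb_per_s(line, size_mb=FILE_SIZE_MB):
    """Return the implied write throughput for one trace line."""
    return size_mb / elapsed_seconds(line)


if __name__ == "__main__":
    for line in TRACE_LINES:
        print(f"{elapsed_seconds(line):.2f} s -> {throughput_mb_per_s(line):.1f} MB/s")
```

Under that 120 MB assumption, all five writes work out to roughly 18 MB/s.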

  6. Douglas Triggs reporter

    I mean, yeah, it’s probably unfixable, and not worth a whole lot of effort to try, and I’ve moved on to a solution that works for me, but I gotta wonder what the heck .NET is even doing. I’d be curious if we ever do figure it out. Everything else is managing (sustained) transfer rates of 30MB/s for arbitrarily large amounts of data (hundreds of GB) with the exact same drive/filesystem (and even .NET if you switch the FS), and it’s not even managing 1/10th of that. Even a Raspberry Pi (ASIAir Pro) was fast enough doing the exact same thing to be unnoticeable. But it feels a lot less like being slow than stalling, and the fact that it was almost exactly 20 seconds (but just slightly longer) every time feels like a timeout of some sort. (And <20MB/s is still ludicrously slow for an M.2 drive, though the times there are variable enough that it’s less suspicious. Though that in itself is weird.) I can’t imagine FS mattering that much without something in the implementation (meaning .NET/Windows FS driver) being outright broken.

    ¯\_(ツ)_/¯

  7. James Malone

    Because this is curious to me, I created a very basic test that writes a file with an arbitrary size that is about the size of an image (40 MB) and had it run in a loop. I ran this test a few times and noticed it completed without the massive delay.

    @echo off
    REM Create 50 files of 41943040 bytes (40 MiB) each and print start/stop times.
    ECHO Start Measure %Time%
    for /l %%x in (1, 1, 50) do (
       echo %%x
       fsutil file createnew d:\speedtest\%%x.txt 41943040
    )
    ECHO Stop Measure %Time%
    

    Here is an example run:

    D:\speedtest>loop.bat
    Start Measure 16:40:22.49
    1
    File d:\speedtest\1.txt is created
    2
    File d:\speedtest\2.txt is created…
    49
    File d:\speedtest\49.txt is created
    50
    File d:\speedtest\50.txt is created
    Stop Measure 16:41:32.24
    

    Granted, it’s a basic test, but the result seems curious - writing sequential files seems to display none of the slowdown seen via NINA. It sounds like the issue exists outside of NINA, but it’s still pretty weird. 🙂
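
For reference, the run above created 50 files of 40 MiB in about 70 seconds, or roughly 29 MiB/s. Since `fsutil file createnew` only produces zero-filled files and may not exercise the same write path as an application streaming image data, a closer test might time each file individually while writing real bytes. The following is a minimal sketch of that idea in Python; it does not go through .NET, so it cannot reproduce NINA's exact code path, and the function name `time_sequential_writes` and the `.bin` file names are made up for illustration.

```python
import os
import time


def time_sequential_writes(directory, count=5, size_bytes=40 * 1024 * 1024):
    """Write `count` files of `size_bytes` each; return seconds taken per file."""
    os.makedirs(directory, exist_ok=True)
    payload = os.urandom(1024 * 1024)  # 1 MiB of non-zero data, reused for every chunk
    timings = []
    for index in range(1, count + 1):
        path = os.path.join(directory, f"{index}.bin")
        start = time.perf_counter()
        with open(path, "wb") as handle:
            remaining = size_bytes
            while remaining > 0:
                chunk = payload[:min(len(payload), remaining)]
                handle.write(chunk)
                remaining -= len(chunk)
            handle.flush()
            # Ask the OS to push the data to the device before stopping the
            # clock, so the timing is not just the cost of filling a cache.
            os.fsync(handle.fileno())
        timings.append(time.perf_counter() - start)
    return timings
```

Calling `time_sequential_writes` with a directory on the exFAT drive and printing the returned list would show whether the second and later files take noticeably longer than the first, which is the pattern reported in this issue.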

  8. Stefan B repo owner

    Is this problem still reproducible in the 3.x nightly version? With the underlying .NET 7 version this might have seen some improvements.
