slow "FILE" history loading from ZFS

Issue #294 new
dd1 created an issue

in the agmini system, run start is very slow (30 sec or more) because mlogger takes that long to initialize the history system. this is caused by reading the headers of all FILE-type history files. When these files are in the ZFS cache or on SSD storage, run start is almost instantaneous, but reading the headers from HDD storage takes a long time. In the agmini system there are about 5000 history files.

to fix this, I propose to add a cache file for history file headers.
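A rough sketch of what such a header cache could look like, not the actual mlogger code. The cache file format (JSON), the `.dat` suffix, the 4 KiB header size and the names `load_headers` and `read_header` are all assumptions made for illustration. A file is re-read only if its size or mtime differs from the cached record, so the directory scan and `stat()` calls remain.

```python
import json
import os

HEADER_BYTES = 4096  # assumption: a header fits in the first 4 KiB of a file


def read_header(path):
    """Read the leading bytes of one history file (the slow step on HDD)."""
    with open(path, "rb") as f:
        return f.read(HEADER_BYTES).decode("latin-1")


def load_headers(history_dir, cache_path, suffix=".dat"):
    """Return {filename: header}, opening only files the cache cannot vouch for."""
    try:
        with open(cache_path) as f:
            cache = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        cache = {}  # no usable cache yet: fall back to reading every file

    fresh = {}
    with os.scandir(history_dir) as entries:
        for entry in entries:
            if not entry.is_file() or not entry.name.endswith(suffix):
                continue
            st = entry.stat()  # the directory scan and stat() still happen
            old = cache.get(entry.name)
            if old and old["size"] == st.st_size and old["mtime_ns"] == st.st_mtime_ns:
                header = old["header"]  # unchanged file: reuse the cached header
            else:
                header = read_header(entry.path)
            fresh[entry.name] = {
                "size": st.st_size,
                "mtime_ns": st.st_mtime_ns,
                "header": header,
            }

    tmp = cache_path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(fresh, f)
    os.replace(tmp, cache_path)  # atomic swap: a crash leaves the old cache intact
    return {name: rec["header"] for name, rec in fresh.items()}
```

The cache file should live outside the history directory (or use a different suffix) so it is not mistaken for a history file. The file currently being written will always miss the cache, which costs one read per start.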

K.O.

Comments (2)

  1. dd1 reporter

    slightly better understanding of ZFS cache.

    • if data is in zfs cache, everything is very quick
    • if data is on SSD, zfs cache loads very quickly
    • if data is on HDD, zfs cache will take forever to load (seconds, minutes)
    • zfs cache has an MRU section and an MFU section. first data is saved in the “most recently used” cache. this cache expires quickly. if we keep using the data, it migrates to the “most frequently used” cache, where it seems to be kept for a long time.

    Normal data access by the FILE history will not cause history data to migrate to the MFU cache. But if we set up a cron job (or a thread) to keep accessing history files, the data will migrate to the MFU cache and everything is fast afterwards. not sure how often we need to access the data to prevent its expiration.
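A minimal sketch of such a keep-warm job, usable from cron or a background thread. The `*.dat` pattern, the 4 KiB read size and the 600 sec interval are placeholders, not measured values: the interval needed to keep data in the MFU cache is still unknown, and `touch_headers` and `keep_warm` are hypothetical names.

```python
import glob
import os
import time


def touch_headers(history_dir, pattern="*.dat", nbytes=4096):
    """Read the first nbytes of every matching file; return how many were read."""
    count = 0
    for path in glob.glob(os.path.join(history_dir, pattern)):
        try:
            with open(path, "rb") as f:
                f.read(nbytes)  # the read itself is the point; data is discarded
            count += 1
        except OSError:
            pass  # file vanished or is unreadable: skip it
    return count


def keep_warm(history_dir, interval_s=600, rounds=None):
    """Call touch_headers() repeatedly; rounds=None means run forever."""
    done = 0
    while rounds is None or done < rounds:
        touch_headers(history_dir)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_s)
    return done
```

From cron, a single `touch_headers()` call per invocation is enough. `keep_warm()` is the variant for a long-running thread.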

    Of course mlogger writing bulk data to ZFS storage will cause ZFS cache churn and will expire useful data…

    Other solution is to read fewer history files:

    • mlogger only needs to read the latest files for currently active events. (currently reads all files for all events)
    • mhttpd history reader only needs to read the files for the requested history event and only files that go as far back in time as the data requested. (currently reads only files for the correct event (good), but does read them all the way to the beginning of time (bad))
    • “ls -ltr” on all history files has to run… unfortunately.
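For the mhttpd case, the file selection could look roughly like this. It assumes the start time of each file is known without opening it (for example from the directory listing), and that a file holds data from its start time up to the start time of the next file for the same event. `files_for_range` is a hypothetical helper, not an existing function.

```python
from bisect import bisect_right


def files_for_range(files, t_from, t_to):
    """Pick the files of ONE event that can contain data in [t_from, t_to].

    files: list of (start_time, path) tuples, sorted by start_time.
    """
    if not files or t_to < t_from:
        return []
    starts = [start for start, _ in files]
    # Last file that starts at or before t_from (it may still cover t_from).
    first = max(bisect_right(starts, t_from) - 1, 0)
    # Everything that starts after t_to cannot contain requested data.
    last = bisect_right(starts, t_to)
    return [path for _, path in files[first:last]]
```

With files starting at 100, 200 and 300, a request for 250 to 260 selects only the file that starts at 200 and never goes back to the beginning of time.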

    K.O.

  2. dd1 reporter

    commit 20a923226196f3cb7e63caa98d4a746f53293625 implements incremental loading of FILE history schema. I see mlogger start time is much reduced - in agmini we read 100 history files instead of all the existing 9000 files.
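The idea behind incremental loading can be sketched as follows. This is an illustration only and is not taken from the commit: `load_active_schema` and the `read_event_name` callback (which stands in for opening a file and parsing its header) are hypothetical. Files are visited newest first, and reading stops as soon as every active event has been seen.

```python
def load_active_schema(files, active_events, read_event_name):
    """Find the newest file for each active event, reading as few files as possible.

    files: iterable of (mtime, path) tuples, e.g. from the directory listing.
    read_event_name: callable(path) -> event name; this is the expensive step.
    Returns ({event: newest_path}, number_of_files_read).
    """
    missing = set(active_events)
    latest = {}
    files_read = 0
    for _mtime, path in sorted(files, reverse=True):  # newest files first
        if not missing:
            break  # every active event is accounted for: stop reading
        event = read_event_name(path)
        files_read += 1
        if event in missing:
            latest[event] = path
            missing.discard(event)
    return latest, files_read
```

If an active event has no file at all, this still walks the whole list, so the worst case is unchanged. The saving comes from the common case where recent files cover all active events.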

    “ls -ltr” on the history directory is still there, nothing I can do about it…

    also disable the watchdog while reading history files, as this can take an arbitrary amount of time.

    K.O.
