de-sychronized writing of sqlite history for TID_STRUCT bank with mserver

Issue #41 resolved
Thomas Lindner created an issue

Odd little bug, originally reported by SCDMS-SNOLAB group.


Setup: 1) I have an example frontend program that write a TID_STRUCT bank (example frontend attached). The struct being written looks like

typedef struct { double mean1; double mean2; double mean3; double mean4; double mean5; double mean6; } Baselines_bank;

2) For debugging purposes, frontend set all the variables to the same random number (between 0-99).

3) I connect the frontend program through the mserver and set the mlogger to write the history using SQLITE option.


Problem: As noted, I would expect to be able to open the sqlite table and see that the variables are the same for a particular row. But that's not what I see. Instead, I find that the first variable is different from all the other variables. Here's an example of what the sqlite table looks like

sqlite> SELECT * FROM readouthistory14_bsln where _i_time > 1488403886; 2017-03-02 09:12:15.0|1488474735| 77.0 | 86.0 | 86.0 | 86.0| 86.0 | 86.0 2017-03-02 09:12:25.0|1488474745| 15.0 | 77.0 | 77.0 | 77.0| 77.0 | 77.0 2017-03-02 09:12:35.0|1488474755| 93.0 | 15.0 | 15.0 | 15.0| 15.0 | 15.0 2017-03-02 09:12:45.0|1488474765| 35.0 | 93.0 | 93.0 | 93.0| 93.0 | 93.0 2017-03-02 09:12:55.0|1488474775| 86.0 | 35.0 | 35.0 | 35.0| 35.0 | 35.0

The first variable is always different from the other variables. The first variable is the correct current value of the variable in the ODB, whereas the other variables are the old values of the variable in the ODB; the last five variable always match the value of the first variable on the previous sqlite row.

So it seems that what is happening is that the sqlite logging happens as soon as an ODB variable gets touched; so it somehow happens before the frontend has had the chance to write all the rest of the changed variables to the ODB.


A couple further debugging points:

1) This problem only happens if the frontend program is connecting through the mserver. If the frontend runs locally and connects directly to the ODB then all the entries in the sqlite row are the same (though it is possible that they are all set to the old value in that case).

2) The problem seems to happen on a per variable basis (rather than being somehow related to the size of the structure in memory). For instance, if the struct was instead just an array of doubles (as below) then I don't see this problem.

typedef struct { double mean[16]; } Baselines_bank;


I will continue investigating bug (for instance, testing other history writing options), but wanted to post in case explanation is obvious to someone.

Comments (3)

  1. dd1

    Yes, this is a general problem with the midas history. Only things written atomically (i.e. whole arrays written with db_set_data()) go into the history correctly. For items written in parts what you describe is what actually happens. K.O.

  2. Thomas Lindner reporter

    Stefan made changes to mfe.c that fix this problem. If I understand correctly, Stefan changed mfe.c so that when the frontend writes the variables to the ODB, the hot-linking (that trigger the mlogger history writing) is disabled for every ODB write except for the last one.

    I confirmed that this fix (in the head of develop branch) fixes my problem.

    It does introduce a small new problem that any hot-linking to just a single frontend variable will no longer work (unless it happens to be the last variable). To avoid this people should setup hot-links (ie open records) to watch the whole variables directory, rather than a single variable.

  3. Thomas Lindner reporter

    Correction: my previous post was incorrect. The changes Stefan made do not introduce a new bug where hot-linking a single frontend variable doesn't reliably work. There are no known negative side-effects of this bug fix.

  4. Log in to comment