sequencer/mhttpd race condition causes "Stop after current run" button to be temperamental

Issue #181 new
Ben Smith created an issue

The Sequencer webpage has a "Stop after current run" button that is very useful. Unfortunately there is a race condition where mhttpd can update "/Sequencer/State/Stop after run", but it is then reset to the original value by the sequencer. (If mhttpd does the update any time between sequencer calling db_get_record and db_set_record).

This happens very often, and makes the button either useless, annoying, or misleading, depending on the specific outcome.

The immediate solution is simply to edit the end of sequencer() in sequencer.cxx from:

   /* get steering parameters, since they might have been changed in between */
   SEQUENCER seq1;
   size = sizeof(seq1);
   db_get_record(hDB, hKeySeq, &seq1, &size, 0);
   seq.running = seq1.running;
   seq.finished = seq1.finished;
   seq.paused = seq1.paused;

   /* update current line number */
   db_set_record(hDB, hKeySeq, &seq, sizeof(seq), 0);

to include the extra line:

   seq.stop_after_run = seq1.stop_after_run;

I've tested the above fix and it fixes the problem.

However, I am sure the same bug can also happen in reverse, where mhttpd reads the state, the sequencer updates ODB with a new line number (for example), then mhttpd updates the paused/stop_after_run flags based on the old state. I would suggest that mhttpd uses db_set_value instead of db_set_record to avoid these issues, but perhaps a deeper look is required.

Comments (4)

  1. dd1

    the race condition is still there - between db_get_record() and db_set_record() - the race window is smaller than before - which can be bad because now it will happen less often, but when it does happen it will be much harder to reproduce. better solution is to straighten the communications - only one person writes into a data structure. K.O.

  2. Stefan Ritt

    I committed Ben's proposal from above. The race window spans now three lines of code, which gives kind of one in a million chance to happen. That's a quick and dirty fix for until I have more time to rework the logic. One possibility would be to split the sequencer parameter set into two sets, once which is only written by the sequencer (such as line number etc.) and one "steering parameter" set which is only written from outside (mhttpd, odb, ...). This way stop_after_run cannot be overwritten by the sequencer.

  3. dd1

    To close this off, need to move following odb entries from “state” to “command”:

    • paused
    • new file (already in Command)
    • stop after run

    All other commands are already in “Command”:

    • user interface writes to “Command”, msequencer clears “Command”
    • msequencer writes to “State”, nobody else writes to “State”.

    K.O.

  4. Log in to comment