frontend crash - buffer read pointer did not move while waiting for bytes

Issue #96 resolved
dd1 created an issue

feudp crashed because of an old problem that was never logged in this bug tracker. There is a long-standing bug in the event buffer code: under an unknown condition, the writer to the event buffer goes into an infinite loop. Years ago I replaced the infinite loop with a crash and an accompanying error message. I never figured out the root cause of the problem.

(messages are in reverse order: the last message is shown first)
00:16:34.904 2017/11/15 [feudp,INFO] Program feudp on host alphagdaq stopped
00:16:34.904 2017/11/15 [feudp,ERROR] [mfe.c:1597:receive_trigger_event,ERROR] rpc_send_event error 203
00:16:34.903 2017/11/15 [feudp,ERROR] [midas.c:7310:bm_wait_for_free_space,ERROR] BUG: read pointer did not move while waiting for 98764 bytes, bytes available: 4304, buffer size: 500000000
00:16:34.903 2017/11/15 [feudp,INFO] Corrected read pointer for client 'feudp' on buffer 'BUFUDP' from 74511528 to 74515752

K.O.
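
For readers unfamiliar with this failure mode, here is a minimal sketch of a writer that waits for free space in a ring buffer and returns an error when the read pointer does not move, instead of spinning forever. It is not the midas.c code: the status codes, the `advance_fn` callback, the `reader_at` demo callback, and the convention of keeping one byte unused are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical status codes, not MIDAS constants. */
enum { WAIT_OK = 0, WAIT_NO_PROGRESS = 1 };

/* Free bytes in a ring buffer of `size` bytes. One byte is kept unused
 * so that rp == wp always means "empty" (an assumption of this sketch). */
static size_t ring_free_bytes(size_t rp, size_t wp, size_t size)
{
    size_t used = (wp >= rp) ? (wp - rp) : (size - rp + wp);
    return size - used - 1;
}

/* Hypothetical callback: gives the readers a chance to run and returns
 * the read pointer observed afterwards. */
typedef size_t (*advance_fn)(void *ctx);

/* Wait until `need` bytes are free. If the read pointer is unchanged
 * after the readers had their turn, report it to the caller instead of
 * looping: no progress now means no progress on the next pass either. */
static int wait_for_free_space(size_t *rp, size_t wp, size_t size,
                               size_t need, advance_fn advance, void *ctx)
{
    while (ring_free_bytes(*rp, wp, size) < need) {
        size_t new_rp = advance(ctx);
        if (new_rp == *rp)
            return WAIT_NO_PROGRESS;
        *rp = new_rp;
    }
    return WAIT_OK;
}

/* Demo callback: reports the position stored in ctx. Pointing ctx at the
 * old read pointer models a stuck reader; pointing it at the write
 * pointer models a reader that has consumed everything. */
static size_t reader_at(void *ctx)
{
    return *(const size_t *)ctx;
}
```

A real implementation would presumably sleep or use a timeout before concluding that the reader is stuck. The point of the sketch is only that the "no progress" condition becomes a return value the caller must handle.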

Comments (8)

  1. dd1 reporter

    Looked at the core dumps and replaced the failing asserts with cm_msg() and an error return. Most likely this creates an infinite loop of error messages. Will see. K.O.

  2. dd1 reporter

    Saw the crash again. One of the asserts replaced by an error message is now an infinite loop. The error messages show a crazy event data length: it looks like either bad events go into the buffer or events become corrupted in the buffer. Need to write a consistency checker for the event buffer content. K.O.

  3. dd1 reporter

    the error message in bm_push_event() produces an infinite loop as it is called from cm_yield() and the error status about corrupted event buffer is not propagated there. Replaced this infinite loop with a crash, for now. K.O.
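
The consistency checker mentioned in comment 2 could look roughly like the sketch below. It walks the records between the read and write pointers and rejects any record whose length does not fit into the occupied part of the buffer. The record layout (a 4-byte payload length in host byte order followed by the payload) and the names `ring_read` and `ring_check` are hypothetical; the real MIDAS event header is different and carries more fields.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout: 4-byte payload length, then the payload. */
#define HDR_SIZE sizeof(uint32_t)

/* Copy n bytes starting at pos out of a ring of `size` bytes, wrapping
 * at the end. Requires pos < size and n <= size. */
static void ring_read(const unsigned char *buf, size_t size, size_t pos,
                      void *out, size_t n)
{
    size_t first = size - pos;
    if (first > n)
        first = n;
    memcpy(out, buf + pos, first);
    memcpy((unsigned char *)out + first, buf, n - first);
}

/* Walk all records between rp and wp. Returns the number of well-formed
 * records, or -1 if a pointer is out of range or a record length does
 * not fit into the occupied region. */
static long ring_check(const unsigned char *buf, size_t size,
                       size_t rp, size_t wp)
{
    if (size == 0 || rp >= size || wp >= size)
        return -1;

    size_t used = (wp >= rp) ? (wp - rp) : (size - rp + wp);
    long count = 0;

    while (used > 0) {
        uint32_t len;
        if (used < HDR_SIZE)
            return -1;                 /* truncated header */
        ring_read(buf, size, rp, &len, HDR_SIZE);
        if (len > used - HDR_SIZE)
            return -1;                 /* "crazy" data length */
        size_t step = HDR_SIZE + len;
        rp = (rp + step) % size;
        used -= step;
        count++;
    }
    return count;
}
```

Because the checker returns a status instead of logging and continuing, a caller such as an event loop can stop on the first bad record rather than repeating the same error message forever, which is the propagation problem described in comment 3.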
