ODB error with link to open record variables

Issue #136 resolved
Thomas Lindner created an issue

Suzannah found a problem with more recent versions of MIDAS; it seems to me that the problem is the result of having an ODB link to a hot-linked (open record) variable.

Description of bug:

I start with a simple MIDAS experiment, just running mhttpd and odbedit. The odbedit command "sor" shows that mhttpd has open records for a bunch of variables, including "Allowed hosts":

    [local:simdaq:R]/>sor
    /Experiment/Security/RPC hosts/Allowed hosts open 2 times by "ODBEdit" "mhttpd"
    /Experiment/Security/mhttpd hosts/Allowed hosts open 1 times by "mhttpd"
    /Sequencer/State open 1 times by "mhttpd"
    /RCParams/testtest2 open 1 times by a deleted client

Now create a symbolic link to that hot-linked variable:

    [local:simdaq:R]/>mkdir Params
    [local:simdaq:R]/>cd Params/
    [local:simdaq:R]/Params>ln "/Experiment/Security/mhttpd hosts/Allowed hosts[0]" testlink

Now stop and restart odbedit and we get this error message:

    bash-4.1$ odbedit
    odbedit: src/odb.c:970: db_update_open_record: Assertion `xkey->notify_count == pkey->notify_count' failed.
    Aborted

More debugging info:

- If mhttpd is stopped (i.e., the hot-linking program is stopped), then we stop getting this error.
- If the symbolic link is removed, we stop getting this error.
- If the symbolic link is made to the whole array (a link to "Allowed hosts" rather than "Allowed hosts[0]"), then we don't get the error.

It seems that we probably need to add code so that we ignore symbolic links when doing these checks:

    if (pkey->type == TID_LINK)
        return;
    assert(xkey->notify_count == pkey->notify_count);

but I don't understand this function well enough to be sure.
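To make the proposed guard concrete, here is a minimal, self-contained sketch of the idea. It is not the real db_update_open_record(): the struct sketch_key, the function open_record_count_ok() and the numeric TID values are hypothetical stand-ins introduced only for illustration. Only the names TID_LINK and notify_count come from the snippet above. The check returns a status instead of asserting so that both outcomes can be observed.

```c
/* Illustrative values only; the real definitions live in midas.h. */
#define TID_STRING 12
#define TID_LINK   16

/* Hypothetical, stripped-down stand-in for an ODB key. */
typedef struct {
    int type;          /* key type, e.g. TID_LINK for a symbolic link */
    int notify_count;  /* number of open records registered on this key */
} sketch_key;

/*
 * Returns 1 if the two keys agree on notify_count, or if pkey is a
 * symbolic link (in which case the comparison is skipped).
 * Returns 0 on a mismatch, which is the case that trips the assertion.
 */
int open_record_count_ok(const sketch_key *pkey, const sketch_key *xkey)
{
    if (pkey->type == TID_LINK)
        return 1;  /* proposed guard: ignore symbolic links */
    return xkey->notify_count == pkey->notify_count;
}
```

With a link key whose own notify_count is 0 and a target key whose notify_count is 1, the guarded check passes, whereas the same comparison on a non-link key with mismatched counts fails. Whether skipping links here is the right fix in the real function is exactly the open question raised above.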

Comments (4)

  1. dd1

    odbedit "sor" also works incorrectly for symlinks to watched variables: it reports both the watched variable (correctly) and the symlink (as "open by deleted client", incorrectly):

    [local:javascript1:S]/>sor
    /Experiment/Security/RPC hosts/Allowed hosts open 4 times by "mhttpd" "ODBEdit" "Logger" "ODBEdit1"
    /aaa/testlink open 1 times by a deleted client

    K.O.

  2. dd1

    The wrong db_scan_tree() function was used in odbedit "sor" (db_get_open_records()) and in ODB validation (db_validate_open_records()). Instead of db_scan_tree(), which partially follows symlinks, one should use db_scan_tree_link(), which does not follow symlinks. K.O.
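
    The effect of following or not following symlinks during a tree scan can be sketched with a toy model. This is not MIDAS code: the node type and count_open() below are hypothetical, and the mapping of the two modes onto the two MIDAS scan functions is an assumption based on the comment above. The sketch only shows why a scan that follows links reports a watched key twice when a symlink points at it.

```c
#include <stddef.h>

enum node_type { NODE_DIR, NODE_VALUE, NODE_LINK };

/* Hypothetical toy ODB node. */
typedef struct node {
    enum node_type type;
    int notify_count;              /* open-record count (values only) */
    const struct node *target;     /* link target (NODE_LINK only) */
    const struct node **children;  /* child nodes (NODE_DIR only) */
    size_t n_children;
} node;

/*
 * Walks the subtree rooted at n and returns how many times a key with
 * open records is reported. With follow_links != 0, a link is resolved
 * and its target is reported again; with follow_links == 0, links are
 * skipped. No cycle protection: the sketch assumes links do not loop.
 */
int count_open(const node *n, int follow_links)
{
    size_t i;
    int total = 0;

    switch (n->type) {
    case NODE_LINK:
        if (!follow_links || n->target == NULL)
            return 0;
        return count_open(n->target, follow_links);
    case NODE_DIR:
        for (i = 0; i < n->n_children; i++)
            total += count_open(n->children[i], follow_links);
        return total;
    default: /* NODE_VALUE */
        return n->notify_count > 0 ? 1 : 0;
    }
}
```

    For a directory holding one watched value and one link to that value, the link-following scan reports two open keys and the non-following scan reports one, which matches the spurious extra entry shown by "sor" in the first comment.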
