Use of RPC_FTCP is not thread-safe

There are several places in mfe.c and midas.c that set the FAST TCP RPC option: rpc_set_option(-1, RPC_OTRANSPORT, RPC_FTCP). Unfortunately this is done without holding the RPC mutex, so another thread can call something like db_get_key() while the option is still RPC_FTCP. In that case rpc_call() returns early and doesn't wait for a response. However, rpc_call() still returns SUCCESS, so db_get_key() also returns SUCCESS, even though the KEY structure hasn't been populated! Similar issues will affect many other functions.

I assume this behaviour (all multi-threaded access to the ODB is unsafe on remote connections) is unintentional. I see the issue regularly in one of our unlucky frontends for SuperCDMS.

I think the solution is that every place that sets the RPC_FTCP option on the server connection needs to grab the RPC mutex first, and hold it until the option has been set back to RPC_TCP.

E.g. change

         rpc_set_option(-1, RPC_OTRANSPORT, RPC_FTCP);
         db_send_changed_records();
         rpc_set_option(-1, RPC_OTRANSPORT, RPC_TCP);

to

         get_the_rpc_mutex();
         rpc_set_option(-1, RPC_OTRANSPORT, RPC_FTCP);
         db_send_changed_records();
         rpc_set_option(-1, RPC_OTRANSPORT, RPC_TCP);
         release_the_rpc_mutex();
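
One way to avoid forgetting the release (or the switch back to RPC_TCP) on an early return would be a small scope guard. The sketch below is self-contained and does not call MIDAS: g_rpc_mutex, g_transport, FastTcpScope and the two helper functions are hypothetical names. It also assumes the RPC mutex can be re-acquired by the thread that already holds it, since the calls made inside the protected region would presumably take the same mutex; I haven't checked whether that holds for the real one.

```cpp
#include <mutex>

// Hypothetical stand-ins for illustration only; NOT the real MIDAS definitions.
enum Transport { TRANSPORT_TCP, TRANSPORT_FTCP };

static std::recursive_mutex g_rpc_mutex;      // stand-in for the RPC mutex
static Transport g_transport = TRANSPORT_TCP; // stand-in for the transport option

// Scope guard: takes the mutex, switches to fast TCP, and restores plain TCP
// before the mutex is released (the destructor body runs before fLock is destroyed).
class FastTcpScope {
public:
   FastTcpScope() : fLock(g_rpc_mutex) { g_transport = TRANSPORT_FTCP; }
   ~FastTcpScope() { g_transport = TRANSPORT_TCP; }
   FastTcpScope(const FastTcpScope&) = delete;
   FastTcpScope& operator=(const FastTcpScope&) = delete;

private:
   std::lock_guard<std::recursive_mutex> fLock;
};

// Stand-in for an RPC that needs a reply (e.g. db_get_key()). It takes the
// same mutex, so it can never run while another thread has fast TCP selected.
Transport transport_seen_by_blocking_call()
{
   std::lock_guard<std::recursive_mutex> lock(g_rpc_mutex);
   return g_transport;
}

// Stand-in for the "set FTCP, send changed records, set TCP" sequence.
Transport send_changed_records_fast()
{
   FastTcpScope scope;
   return g_transport; // fast TCP is selected only while the scope is alive
}
```

With something like this, each of the existing three-line sequences would become a block containing a FastTcpScope and the db_send_changed_records() call.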
