Method 'removeUnreachableNodes' pruning nodes that are not barren nodes

The method 'removeUnreachableNodes' of the 'ProbNetOperations' class (package 'org.openmarkov.core.model.network'), is in some scenarios pruning nodes that should not be pruned as they are not barren nodes.

Attached to the issue there is a network where, not always, this problem arises. Talking specifically of the network attached to the issue, depending on the order in which the nodes are explored, there are cases where the node 'A' is pruned.

If the exploration of the nodes happens to be:

K H F D B C A E G I D4 D2 J D3 D1 L

Node 'A' is not removed.

But if the exploration happens to be:

D1 B C E G I D4 F H D2 D L K J D3

When the node 'C' is being treated, and the code reaches line number 289, it occurs that neither node 'A' (parent of 'C'), nor node 'E' (children of 'C') are in the list 'nodesToKeep' and therefore the method 'pushInExploreAndAddToKeep' is never called with the node 'A' as the first parameter of the call and hence it is pruned. When it should not be.

The scenarios above described refer to the first call to the method 'getPruned' done in the ArcReversal algorithm, just before the inference is performed. But the method is called several times during the inference, and sometimes the pruning of the node 'A' is during the inference.

Maybe I have not understood completely the logic behind the algorithm of the method 'removeUnreachableNodes' but in the part of the code that is failing, where it searches for not head to head connected in the form of 'X->Y->Z' and 'X<-Y<-Z', the part is coded in a way that assumes that when node 'Y' is being examined, nodes 'X' and 'Z' have being examined. Talking about the network that is giving problems, the sequence of the nodes involved in the issue is: 'A' -> 'C' -> 'E'. So when 'C' is being explored, 'A' will never be found in nodesToKeep (as it is examined later); and when 'E' has not being added to this list earlier, the algorithm fails. 'E' is in this list if node 'D' is explored before 'C'.

If have not been able to explain myself clearly, please let me know and I will try to clarify further the issue.

If it is of any help, in the limidEvaluation repository, it is coded the method 'reachableNodes' that implements the algorithm 3.1 of koller2009.

Comments (3)