SSM version 1.1

The Significant Subgraph Miner (SSM) is an algorithm for finding subgraphs in a single graph that are significantly associated with a set of vertices. It can handle a wide variety of graph dataset types and vertex selections. It also traverses the search space efficiently, so most problems should be computable.

The accompanying implementation is provided for free for research purposes.
Some bugs may be present within the software and no guarantees are given!
We would appreciate any comments, bug descriptions, suggestions or success stories regarding the tool.

This implementation should work on most regular-sized biological networks with any number of selected vertices and several node labels, as demonstrated in the accompanying publication. The only requirement to run this version is a standard Perl 5+ installation. It has been extensively tested on Linux and Mac OS X systems. The example dataset is included in the zip file for illustrative and testing purposes.

Citation:

P. Meysman, Y. Saeys, E. Sabaghian, W. Bittremieux, Y. Van de Peer, B. Goethals and K. Laukens. 
Discovery of Significantly Enriched Subgraphs Associated with Selected Vertices in a Single Graph.
Proceedings of the 14th International Workshop on Data Mining in Bioinformatics (BioKDD15), Sydney, 2015.

Installation:

- Check whether Perl is installed (e.g. by running perl -v on the command line); if it is not, install it
- Run the script with 'perl subgraph.pl'

Git repository:

https://bitbucket.org/pmeysman/sigsubgraphminer

Help:

Input parameters
-graph   	Tab delimited graph file with two columns.
-vertices	Text file with ids for vertices of interest
-labels  	Label file with two columns: Vertex id - Label
-output  	Output file to which significant motifs are printed
-maxsize 	Maximum number of vertices allowed in the subgraph
-pvalue  	Maximum pvalue allowed (default 0.05)
-bgfile  	Variant to use a list of vertices as the graph background to compare against
-uniquelabel	Variant where each node has exactly one label and this label must exactly match for the motif
-nestedpval	Variant where the significance of the child motif is based on the parent matches
-undirected	Undirected option where A->B = B->A and self-loops aren't allowed
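
As a rough illustration of the -graph format and the -undirected option, the Python sketch below parses tab-delimited edge lines and collapses them the way the option is described (A->B equal to B->A, no self-loops). It is not part of SSM. The function names read_edges and as_undirected are hypothetical, reading the two columns as source and target is an assumption, and silently dropping self-loops is also an assumption, since how subgraph.pl itself treats them is not stated here.

```python
def read_edges(text):
    """Parse tab-delimited lines with exactly two columns into (source, target) pairs."""
    edges = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        columns = line.split("\t")
        if len(columns) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 tab-separated columns, got {len(columns)}"
            )
        edges.append((columns[0].strip(), columns[1].strip()))
    return edges


def as_undirected(edges):
    """Treat A->B and B->A as one edge and drop self-loops (assumed handling)."""
    unique = set()
    for source, target in edges:
        if source == target:
            continue  # self-loop: not allowed in undirected mode
        unique.add(tuple(sorted((source, target))))
    return sorted(unique)


if __name__ == "__main__":
    sample = "A\tB\nB\tA\nB\tC\nC\tC\n"
    print(as_undirected(read_edges(sample)))  # [('A', 'B'), ('B', 'C')]
```

Running a check like this on a graph file before starting the miner can catch lines that use spaces instead of tabs.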

Example command:

perl subgraph.pl -graph example/example_graph.txt -labels example/example_labels.txt -vertices example/example_vertexset.txt -maxsize 2 -uniquelabel
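
When the miner is run repeatedly with different settings, it can help to assemble the command line programmatically. The sketch below is a minimal Python wrapper under stated assumptions: build_command and run are hypothetical helper names, only the options shown in the example command are covered, and subgraph.pl is assumed to be in the working directory.

```python
import subprocess


def build_command(graph, labels, vertices, maxsize, unique_label=False,
                  script="subgraph.pl"):
    """Assemble the argument list for one SSM run (options taken from the example)."""
    command = [
        "perl", script,
        "-graph", graph,
        "-labels", labels,
        "-vertices", vertices,
        "-maxsize", str(maxsize),
    ]
    if unique_label:
        command.append("-uniquelabel")  # flag without a value, as in the example
    return command


def run(command):
    """Run the assembled command and raise if perl exits with an error."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    command = build_command(
        "example/example_graph.txt",
        "example/example_labels.txt",
        "example/example_vertexset.txt",
        maxsize=2,
        unique_label=True,
    )
    print(" ".join(command))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with file paths that contain spaces.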

Label restrictions:

The following characters are not allowed in the labels: comma (','), dash ('-'), tilde ('~').
Labels should never start with a numerical value.
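
The label restrictions above can be checked before running the miner. The following Python sketch is one possible pre-flight check, not part of SSM. The name label_problems is hypothetical, and "numerical value" is interpreted here as a leading digit, which is an assumption.

```python
FORBIDDEN_CHARACTERS = (",", "-", "~")


def label_problems(label):
    """Return a list of human-readable problems for one label (empty if it is fine)."""
    problems = []
    found = [ch for ch in FORBIDDEN_CHARACTERS if ch in label]
    if found:
        problems.append("contains forbidden character(s): " + " ".join(found))
    if label[:1].isdigit():
        # Assumption: "starts with a numerical value" means a leading digit.
        problems.append("starts with a digit")
    return problems


if __name__ == "__main__":
    for label in ["kinase", "GO-term", "2fold", "a,b~c"]:
        print(label, label_problems(label))
```

An empty list means the label passes both rules; anything else lists each rule it violates.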

Contact:

pieter.meysman@uantwerpen.be