How to cite

Aksoy et al. "PiHelper: an open source framework for drug-target and antibody-target data". Bioinformatics (2013) 29 (16): 2071-2072. PMID:23766416


PiHelper is a drug- and antibody-target information aggregator and provider service. The data aggregated by PiHelper can be accessed either programmatically via a Java API or through a web user interface. The framework works in a human-gene centric manner and models drug-target and antibody-target relationships accordingly. PiHelper collectively provides:

  • a command line user interface for importing gene-centric drug- and antibody-target data from multiple sources and for exporting data in various formats
  • a Java API, for programmers interested in building applications that query, modify or analyze drug- and antibody-target data
  • a RESTful web service API, to allow developers to use the framework without coding in Java
  • a web-based user interface for easy querying of the data and visualizing it in an interactive manner

PiHelper is structured as a multi-module Maven project, of which the three main submodules are described below:

  • Core
  • Administrator
  • Web

PiHelper is supported by funding from the U.S. National Institutes of Health, National Human Genome Research Institute, grant number U41 HG006623, National Resource for Network Biology, grant number P41 GM103504, and Cancer Technology Discovery and Development Network, grant number U01 CA168409.

PiHelper Core

  • built with Spring Roo
  • contains core model classes such as Drug, Gene, Antibody, DrugTarget, etc.
  • provides Roo-based JSON/HTML formatted web services
  • supports persistence using MySQL and Hibernate technologies
  • provides various finder methods in order to ease accessing/querying data

The following two screenshots show the Spring Roo-powered web service component: PiHelper Core Screenshot 1 PiHelper Core Screenshot 2

PiHelper Administrator

  • depends on the core module
  • provides main exporter/importer facilities via a command-line interface

PiHelper Administrator: Importer

Provides support for importing data from the following resources:

Gene data

Drug data

Antibody data

Custom Data

You can also import tabulated drug and/or antibody data from a tab-delimited file -- either from a local file or a URL. Please see the relevant sections within the import section of the document for more details.

Notice about the usage of these data sets

Please be aware of the different data usage policies of each of the data resources we support as part of the framework if you plan to re-distribute the data aggregated by PiHelper.

PiHelper Administrator: Exporter

Provides support for exporting aggregated data in a tab-delimited plain text format. This exported file can then be used in other tools, such as Cytoscape, for further analyses. To import drug-target data into Cytoscape 3.x, please use the File > Import > Network > File... menu item and then select the drugtargets.tsv file exported by PiHelper. The following is a screenshot of the Cytoscape Import Dialog showing the necessary configuration for a successful import:

Cytoscape Import Dialog

PiHelper Web

  • built with Twitter Bootstrap, Backbone.js, CytoscapeWeb
  • provides a web-based interface for exploring drugs and drug-target interactions through interactive network graphs.
  • requires a running instance of the core module for queries.

The following screenshots show how users can query the data and visualize it as an interactive network (via Cytoscape Web): PiHelper Web Screenshot 1 PiHelper Web Screenshot 2


Getting the source code

PiHelper is a free software project and the code is currently hosted on the following BitBucket repository: The source code is provided as a Maven 3 project, so Maven 3 is required to install PiHelper and invoke the admin interface.

In order to get the latest stable code, please use the following command:

hg clone -b stable

Configuring database connection

All database-related configuration options can be found in core/src/main/resources/META-INF/spring/ This provided configuration is an example file. Before running install and/or preparing WAR/JAR packages, this file should be edited and copied to the following destination:

cp -f core/src/main/resources/META-INF/spring/ core/src/main/resources/META-INF/spring/

In the copied configuration file, please make sure that the username, password and database name are correct for a successful database connection.

Configuring the cache folder

PiHelper's importer component uses a local cache folder so that each file is downloaded only once. Downloaded files are stored under this folder, using the md5sum of their full URL as a name. By default, this folder is /tmp/pihelper_cache/, but it can be configured within the admin/src/main/resources/META-INF/spring/ file:


If the data resources have updated their data sets since the import, simply removing this folder will reset the PiHelper cache:

rm -rf /tmp/pihelper_cache
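
The cache naming scheme described above can be illustrated with a short Python sketch. This is not PiHelper's own code (the importer is written in Java); the function names `cache_path_for` and `is_cached` are hypothetical, and the sketch assumes each cache entry is named by the hexadecimal MD5 digest of the full URL.

```python
import hashlib
from pathlib import Path

# Default cache folder mentioned in the text; the real location is configurable.
CACHE_DIR = Path("/tmp/pihelper_cache")


def cache_path_for(url: str, cache_dir: Path = CACHE_DIR) -> Path:
    """Return the assumed cache location for a URL.

    Assumption: the entry name is the hex MD5 digest of the full URL.
    """
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return cache_dir / digest


def is_cached(url: str, cache_dir: Path = CACHE_DIR) -> bool:
    """True if an entry for this URL already exists, so no download is needed."""
    return cache_path_for(url, cache_dir).exists()
```

Because the name depends only on the URL, a changed file behind the same URL is not detected, which is why the cache folder has to be removed by hand after upstream updates.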

(Optional) Allocating more memory for the Java Virtual Machine

Importing some of the data resources may require increasing the maximum memory size that the JVM can use. The importer and exporter scripts mentioned below are run through Maven utilities, so the maximum allocatable memory can be configured by setting the MAVEN_OPTS environment variable before the importer/exporter utilities are used:

export MAVEN_OPTS="-Xmx5g"

The command above, for example, will set the maximum allowable memory size that Maven can use to 5 GB.

(Optional) Configuring the service URL

Users who want to use the web interface should configure the URL that the core web service will be running at. This value can be modified within the web/src/main/webapp/ph/js/pihelper.js file:

var CORE_API_URL = "../pihelper-core/";
var WEB_API_URL = "/pihelper-web/";

The default values assume that both the core and the web modules are deployed to the same directory; any other configuration will require adjusting these variables accordingly.

Running unit tests

Before trying to set up an instance of PiHelper on your local machine, please make sure that all the JUnit tests succeed. This will not only confirm that the database setup is correct, but also help catch potential errors that might arise from the system setup. In order to run the tests, please use the following command in the main pihelper folder:

mvn clean test


In order to use/deploy the PiHelper files, first compile and install the project with the following Maven command:

mvn clean install

This will create the necessary war files both for the core and the web modules:

  • core/target/core-VERSION.war
  • web/target/web-VERSION.war

Users can directly copy these war files for deploying the core and the web modules, e.g. into the Tomcat webapps folder.

cp -f core/target/core-VERSION.war ~/Library/Tomcat/webapps/pihelper-core.war
cp -f web/target/web-VERSION.war ~/Library/Tomcat/webapps/pihelper-web.war

The install command will also compile and prepare the necessary classes for admin operations, including importing and exporting data. In order to invoke the command-line admin interface, please run the following command in the admin folder:

bash src/main/scripts/ 
usage: [export|import] [...]

Each of these subcommands, import and export, will give different lists of options along with their help text:

bash src/main/scripts/ import


bash src/main/scripts/ export

Usage: Import interface

The importer admin interface readily supports multiple data sources, listed above and categorized by the data type they provide. In order to list these resources, please use the following command:

bash src/main/scripts/ import -l

Before running any importers, the admin interface can be used to create the database schema:

bash src/main/scripts/ import -CREATE dbname

The command above will reset the database, if it already exists, and create the necessary tables without any data in them.

The importer interface can be used to import either all the data at once or portions of data. The following command will import data from all supported data resources:

bash src/main/scripts/ import -a

or for only gene-related data, for example, the following command can be used:

bash src/main/scripts/ import -r

Optionally, the importers can be run individually:

bash src/main/scripts/ import -e GeneImporter,DrugBankImporter,KEGGDrugImporter,CancerDrugImporter

All drug and antibody importers should be run after the GeneImporter because this importer provides the minimal background gene data.

Finally, the importer provides the option to merge the drug-targets that represent the same drug-gene relationship -- or similarly the same antibody-gene relationship. Normally, each data source will create its own drug-gene relationship and all these will be kept as separate entities in the database. It is sometimes more advantageous to merge these entities into a single one, and in order to do this, the -m switch is used:

bash src/main/scripts/ import -m
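
To make the effect of merging concrete, here is a minimal Python sketch of the idea, not PiHelper's implementation. The record layout `(drug, gene, source)`, the function name `merge_targets`, and the choice to keep the list of contributing sources on the merged entry are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_targets(records):
    """Collapse records that share the same (drug, gene) pair into one entry.

    `records` is an iterable of (drug, gene, source) tuples (hypothetical layout).
    Returns a dict mapping (drug, gene) to the sorted list of contributing sources.
    """
    merged = defaultdict(set)
    for drug, gene, source in records:
        merged[(drug, gene)].add(source)
    return {pair: sorted(sources) for pair, sources in merged.items()}


# Illustrative input: two sources report the same drug-gene relationship.
records = [
    ("DrugA", "EGFR", "SourceX"),
    ("DrugA", "EGFR", "SourceY"),
    ("DrugB", "BRAF", "SourceX"),
]
merged = merge_targets(records)
```

Before merging, the example holds three separate relationships; afterwards it holds two, one per distinct drug-gene pair.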

Custom Data Importers

Additionally, users can import custom drug and/or antibody data from a tabulated text file using the corresponding CustomImporter. For example, the following command will pull a sample drug data sheet from the URL and will import it into the database:

bash src/main/scripts/ import -D ""

The same import can also be done via a local file, e.g.:

wget -O /path/to/drugs.tsv ""
bash src/main/scripts/ import -D "file:///path/to/drugs.tsv"

The above syntax is also valid for the custom antibody data importer:

bash src/main/scripts/ import -T "file:///path/to/antibodies.tsv"

The custom drug data importer utility assumes the following tab-delimited format for the file it is trying to import:

#Drug Name    Synonym(s)    Description (max 1024 chars)    Target Gene Symbols    is Cancer Drug?    is FDA Approved?    is Nutraceutical?    External References
Example1    S11|S12    Some description 1    AKT*|TP53|EGFR    TRUE    TRUE    FALSE    PubChem:1|NCI Drug:3
Example2        Some description 2    BRAF    FALSE    FALSE    TRUE    KEGG:2

A similar format is expected for antibody data:

#Antibody Name    Description    Target Gene Symbols    External References
Example1    Some description 1    AKT1|AKT2    ProteinAtlas:1
Example2    Some description 2    ACTN    ProteinAtlas:2

For target gene names, wild-card characters are allowed at the end of the gene symbol -- e.g. AKT* will match all gene symbols that start with AKT (AKT1, AKT2, AKT3, ...). All lines starting with the character # are ignored to leave room for comments in the input file.
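
The format rules above (tab-delimited fields, `|` between multiple values, a trailing `*` wildcard on gene symbols, `#` comment lines) can be sketched in Python as follows. This is an illustration under stated assumptions, not the CustomImporter itself: the function and column names are hypothetical, only a few columns are returned, and dropping symbols that are not in the known gene set is an assumed behavior.

```python
# Column order taken from the header line of the custom drug format.
DRUG_COLUMNS = [
    "name", "synonyms", "description", "targets",
    "is_cancer_drug", "is_fda_approved", "is_nutraceutical", "references",
]


def expand_symbol(pattern, known_symbols):
    """Expand a gene symbol; a trailing '*' matches every symbol with that prefix."""
    if pattern.endswith("*"):
        prefix = pattern[:-1]
        return sorted(s for s in known_symbols if s.startswith(prefix))
    # Assumption: symbols missing from the gene table are silently dropped.
    return [pattern] if pattern in known_symbols else []


def parse_custom_drugs(text, known_symbols):
    """Parse tab-delimited drug lines, skipping blank lines and '#' comments."""
    drugs = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        row = dict(zip(DRUG_COLUMNS, line.split("\t")))
        targets = []
        for pattern in row["targets"].split("|"):
            targets.extend(expand_symbol(pattern, known_symbols))
        drugs.append({
            "name": row["name"],
            "synonyms": [s for s in row["synonyms"].split("|") if s],
            "targets": targets,
            "is_cancer_drug": row["is_cancer_drug"] == "TRUE",
        })
    return drugs


sample = (
    "#Drug Name\tSynonym(s)\tDescription\tTargets\tCancer\tFDA\tNutra\tRefs\n"
    "Example1\tS11|S12\tSome description 1\tAKT*|TP53|EGFR\tTRUE\tTRUE\tFALSE\tPubChem:1|NCI Drug:3\n"
    "Example2\t\tSome description 2\tBRAF\tFALSE\tFALSE\tTRUE\tKEGG:2\n"
)
known = {"AKT1", "AKT2", "AKT3", "TP53", "EGFR", "BRAF"}
drugs = parse_custom_drugs(sample, known)
```

The wildcard can only be resolved against genes that are already known, which is consistent with the requirement that the GeneImporter runs before any drug or antibody importer.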

Additional ID Importer Scripts

Additional Groovy scripts are provided (in scripts/) to retrieve PubChem Compound IDs (CIDs) for the Broad Cancer Cell Line Encyclopedia (CCLE) and the Sanger Cancer Genome Project (CGP) using the PubChem API. PubChem is searched for CIDs where a given name matches, through a lowercase exact match, one of the synonyms of a PubChem compound.
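
The matching rule can be sketched in Python without any network access. The provided scripts are written in Groovy and query PubChem directly; here a small in-memory dictionary stands in for PubChem, and the function name `find_cids`, the compound IDs, and the synonyms are all made up for illustration.

```python
def find_cids(name, compounds):
    """Return the IDs of compounds having a synonym equal to `name`, ignoring case.

    `compounds` maps a compound ID to its list of synonyms (hypothetical
    stand-in for a PubChem lookup). Only whole-string matches count.
    """
    needle = name.strip().lower()
    return sorted(
        cid
        for cid, synonyms in compounds.items()
        if any(needle == synonym.strip().lower() for synonym in synonyms)
    )


# Made-up data: IDs 1 and 3 match exactly after lowercasing, ID 2 only partially.
compounds = {
    1: ["Drug A", "Alpha"],
    2: ["drug a extended"],
    3: ["DRUG A"],
}
matches = find_cids("drug a", compounds)
```

Requiring an exact match after lowercasing avoids false hits on longer names that merely contain the query, at the cost of missing spelling variants.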

Usage: Export interface

Once the data is loaded into the database, it is possible to create tab-delimited export files for different categories. The -l switch will list which exporters are available for use, but for simplicity the following command can be used to export all aggregated data at once:

bash src/main/scripts/ export -a -o /path/to/output/folder

The default file names for each data type are as follows:

  • Drug: drugs.tsv
  • Drug-Target: drugtargets.tsv
  • Antibody: antibodies.tsv
  • Antibody-Target: antibodytargets.tsv


This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this program. If not, see
