
Dénes Türei How to load non-human interactions directly in `pypath`

Created by Dénes Türei

File pypath_signor_mouse_example.py Added

+#!/usr/bin/env python
+
+# Dénes Türei Uniklinik RWTH Aachen & EMBL Heidelberg 2018
+# turei.denes@gmail.com
+
+# how to load non-human interaction data directly in pypath
+# note: for mammals the recommended method is to
+# translate the human network by orthology
+# as most of the data is available for human
+
+import pypath
+
+# first we need to set the `ncbi_tax_id` parameter
+# of the PyPath instance to 10090
+pa = pypath.PyPath(ncbi_tax_id = 10090)
+
+# all network inputs in pypath are defined by
+# `ReadSettings` objects
+# many predefined input settings reside in various
+# dicts in `data_formats`
+# to load mouse data from Signor we need to set the
+# parameter below; this makes the reader check
+# column #13 of the original Signor data (`col` 12 below,
+# counting from zero), load only the records with the value
+# `10090` or `10090;10090` in this column, and assign the
+# taxon ID `10090` to them (the values in the dict; this
+# determines ID translation later)
+# these settings are specific to individual resources;
+# one needs to check the output of the corresponding
+# `pypath.dataio` method to find out the values;
+# find the method under the `inFile` attribute of the
+# `ReadSettings` object, e.g.
+# `pypath.data_formats.pathway['signor'].inFile`
+# the attribute name is misleading because here it is not
+# a file but a function; in principle it could also be
+# a file name or a URL
+pypath.data_formats.pathway['signor'].ncbiTaxId = {
+    'col': 12,
+    'dict': {
+        '10090': 10090,
+        '10090;10090': 10090
+    }
+}
+
+# these `reference lists` are used to make sure all entities
+# have their type and taxon correctly identified
+# for example this is simply a list of all mouse UniProts
+# if you want only SwissProts add `swissprot = True`
+pa.reflists[('uniprot', 'protein', 10090)] = (
+    pypath.reflists.ReferenceList('uniprot', 'protein', 10090, 'all_uniprots')
+)
+pa.reflists[('uniprot', 'protein', 10090)].load()
+
+# in order to get any mouse interaction from Signor
+# we need to pass `organism = 10090` to
+# `dataio.signor_interactions()`
+# this is redundant and undocumented, sorry about that;
+# it will be improved later
+pypath.data_formats.pathway['signor'].inputArgs['organism'] = 10090
+
+# after setting all these it is possible to load mouse interactions
+# as they are defined in Signor
+pa.init_network({'signor': pypath.data_formats.pathway['signor']})
+# output:
+# > 355 interactions between 373 nodes
+# from 1 resources have been loaded
+#
+# this network is much smaller than the human one
+# in my opinion it is better to use the more complete
+# data translated from human
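
To make the `ncbiTaxId` setting in the snippet more concrete, here is a minimal standalone sketch of the kind of filtering it describes: look up one column of each record in a dict and keep only the records whose value is listed. This is not pypath's actual reader code. The function `filter_by_taxon`, the helper `make_row` and the toy rows are hypothetical and only illustrate the idea, assuming that `col` is a zero-based column index.

```python
# Hypothetical illustration only; pypath's own reader is not shown here.
# Same shape as the `ncbiTaxId` setting used in the snippet above.
taxon_setting = {
    'col': 12,  # assumed zero-based, i.e. column #13 of the table
    'dict': {
        '10090': 10090,
        '10090;10090': 10090,
    },
}


def filter_by_taxon(rows, setting):
    """Yield (row, taxon_id) for rows whose taxon column is in the dict."""
    col = setting['col']
    mapping = setting['dict']
    for row in rows:
        if len(row) <= col:
            # row too short to carry a taxon column: skip it
            continue
        value = row[col].strip()
        if value in mapping:
            yield row, mapping[value]


def make_row(taxon_value, n_cols=13):
    """Build a toy record with placeholder fields and a taxon column."""
    row = ['x'] * n_cols
    row[12] = taxon_value
    return row


# made-up records: two mouse, one human, one mixed human/mouse
rows = [
    make_row('10090'),
    make_row('9606'),
    make_row('10090;10090'),
    make_row('9606;10090'),
]

kept = list(filter_by_taxon(rows, taxon_setting))
```

With these toy rows only the two mouse records pass the filter. The human record and the mixed `9606;10090` record are dropped because their values are not keys of the dict.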