Implementation of an r_environment to easily install R packages in a reproducible way.

#219 Merged at 8de5aa8
Repository: galaxy-central-bgruening
Branch: R_environment
Repository: galaxy-central
Branch: default
Author: Björn Grüning
Description

Hi,

As pointed out by John, a real r_environment would be nice to facilitate the installation of R packages and make that procedure reproducible. Here is my solution, to drive the discussion further. With this patch, the following should be possible:

            <action type="setup_r_environment">

                <repository changeset_revision="bae5c9880b71" name="package_r_3_0_1" owner="bgruening" toolshed="http://testtoolshed.g2.bx.psu.edu">
                    <package name="R_3_0_1" version="3.0.1" />
                </repository>

                <r_package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/BiocGenerics_0.6.0.tar.gz</r_package>
                <r_package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/IRanges_1.18.2.tar.gz</r_package>

                <r_package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/GenomicRanges_1.12.4.tar.gz</r_package>
                <r_package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/Rcpp_0.10.4.tar.gz</r_package>
                <r_package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/RcppArmadillo_0.3.900.0.tar.gz</r_package>
                <r_package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/locfit_1.5-9.1.tar.gz</r_package>
                <r_package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/Biobase_2.20.1.tar.gz</r_package>
                <r_package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/DBI_0.2-7.tar.gz</r_package>
                <r_package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/RSQLite_0.11.4.tar.gz</r_package>
                <r_package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/AnnotationDbi_1.22.6.tar.gz</r_package>
                <r_package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/xtable_1.7-1.tar.gz</r_package>
                <r_package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/XML_3.98-1.1.tar.gz</r_package>
                <r_package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/annotate_1.38.0.tar.gz</r_package>
                <r_package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/genefilter_1.42.0.tar.gz</r_package>
                <r_package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/RColorBrewer_1.0-5.tar.gz</r_package>
                <r_package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/DESeq2_1.0.18.tar.gz</r_package>

            </action>

            <action type="set_environment">
                <environment_variable action="append_to" name="R_LIBS">$INSTALL_DIR</environment_variable>
            </action>

If we can merge this system at some point, I will start writing a best-practice guide for R packaging. @peterjc and others requested that, and I think it's a good idea. If you review this patch, please also take John's ruby-environment pull request into account:

https://bitbucket.org/galaxy/galaxy-central/pull-request/207/john-chiltons-august-2013-tool-shed/diff

You can test this patch with the DESeq2 package located at:

http://testtoolshed.g2.bx.psu.edu/view/bgruening/package_deseq2_1_0_17

With that integrated, we have nice installation routines for R, Ruby, and Python. It would be nice to have that in the next stable release!

As a side note, I have collected John's Galaxy-R scripts, ported to RPy2 under:

https://github.com/bgruening/galaxytools/tree/master/R_mixed

and hope to migrate them at some point to the system proposed here.

As you can see, we need a mirror for tarballs. Maybe now is the right time to push that idea further and create a mirror system for bioinformatics tarballs.

Thanks!

Bjoern

Comments (9)

  1. Peter Cock

    What do you mean by nice installation routines for Python? Is there another new action type I've missed?

    The R tarballs requirement does seem to be part of a larger need. Perhaps the Galaxy Tool Shed should cache those? It wouldn't really need revision control... something to discuss on the mailing lists?

    1. Björn Grüning author

      There is a "setup_virtualenv" action type for Python, afaik.

      Yes, the caching idea needs to be discussed. I need to write several mails, hopefully today. James raised that idea a few days ago on the mailing list.
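
For comparison, a minimal sketch of how the "setup_virtualenv" action mentioned above might appear in a tool_dependencies.xml file. It assumes the action accepts an inline list of pip-style requirements; the package names and versions are placeholders and are not part of this pull request.

```xml
<!-- Sketch only: assumes setup_virtualenv takes inline pip-style requirements. -->
<!-- The requirement lines below are illustrative placeholders. -->
<action type="setup_virtualenv">
pyyaml==3.2.0
lxml==2.3.0
</action>
```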

  2. John Chilton

    Fantastic! Keep it up :)

    Can the 'set_environment' occur automatically when using setup_r_environment? Will that expression be different in different circumstances?

    Also, I think I would prefer package to r_package for the name of the inner tag; we are already in an R action environment, so the r_ prefix seems redundant. Just a personal preference though, obviously feel free to ignore :).

    Another thing that would be cool is to allow the actual R packages to be placed right in the repository, but maybe tool_dependency_definitions are not going to allow this, right? Regardless, it's something that could always be added later.

    1. Björn Grüning author

      Changed 'r_package' to 'package'.

      I was not sure about the 'set_environment' thing and wanted to discuss that first with @greg or @inithello. We definitely need to discuss where we want to store the tarballs.

      1. Björn Grüning author

        Added the R_LIBS env variable automatically. I also added a ToDo noting that we should refactor that part at some point.

        Thanks @jmchilton for pointing that out!
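
Taking the two changes reported in this thread together (the inner tag renamed from 'r_package' to 'package', and R_LIBS set automatically), the definition from the description would presumably shrink to something like the sketch below. This is an assumption based on the comments alone; the revised patch is not shown in this thread, and only two of the tarballs from the original list are repeated here for brevity.

```xml
<!-- Sketch only: reconstructed from the comments above, not taken from the revised patch. -->
<action type="setup_r_environment">

    <repository changeset_revision="bae5c9880b71" name="package_r_3_0_1" owner="bgruening" toolshed="http://testtoolshed.g2.bx.psu.edu">
        <package name="R_3_0_1" version="3.0.1" />
    </repository>

    <!-- Inner tag renamed from r_package to package. -->
    <package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/BiocGenerics_0.6.0.tar.gz</package>
    <package>https://github.com/bgruening/download_store/raw/master/DESeq2-1_0_18/IRanges_1.18.2.tar.gz</package>

</action>
<!-- No separate set_environment action: R_LIBS is assumed to be appended automatically. -->
```

The packages are listed in the same order as in the description, with the two shown here placed ahead of the packages that follow them in the full list.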

  3. Peter Cock

    I was also wondering about bundling the dependency tarballs into the Tool Shed repository, but as you say, John, it would break the current rule that a package only has the single tool_dependencies.xml file (and it could result in a lot of duplicated tarballs too).

  4. Björn Grüning author

    You guys are so fast ... great! I will try to address all issues and start a discussion on the mailing list.