Results of checking and suggestion for further improvement

Issue #114 resolved
Xinqiu Yao created an issue

Hi,

Good news! We've passed R CMD check --as-cran. There are still 2 warnings and 1 critical note, which need your helps to remove:

  • 1st warning
* checking dependencies in R code ... WARNING
Unexported object imported by a ':::' call:stats:::plot.dendrogramSee the note in ?`:::` about the use of this operator.
  Including base/recommended package(s):statsSee the information on DESCRIPTION files in the chapterCreating R
packagesof theWriting R Extensionsmanual.
  • 2nd warning
* checking S3 generic/method consistency ... WARNING
pca:
  function(...)
pca.tor:
  function(data, subset)

pca:
  function(...)
pca.xyz2z:
  function(xyz.coord, pca)

pca:
  function(...)
pca.z2xyz:
  function(z.coord, pca)

See sectionGeneric functions and methodsof theWriting R
Extensionsmanual.
  • NOTE
* checking R code for possible problems ... NOTE
.calcAlnModes: no visible binding for '<<-' assignment toiipb.calcAlnModes: no visible binding for global variableiipbnma.pdbs: no visible binding for '<<-' assignment toiipbnma.pdbs: no visible binding for global variableiipb

In addition, we got a table of time-consuming examples. It would be great if we can improve them, too:

* checking examples ... OK
Examples with CPU or elapsed time > 5s
                    user system elapsed
geostas           36.409  0.080  38.144
nma.pdbs          10.705  0.142  10.924
dccm.enma          7.898  0.099   9.971
pdbsplit           6.283  0.088   8.506
binding.site       5.651  0.224   6.482
plot.cna           5.750  0.019   5.765
cna                5.726  0.011   5.771
network.amendment  5.307  0.010   5.315
community.tree     5.257  0.006   5.262
hmmer              2.948  0.036  25.070
plot.hmmer         1.580  0.009   8.085
plot.blast         0.294  0.019 175.284
pfam               0.088  0.009  11.189

Comments (19)

  1. Lars Skjærven

    can we get into CRAN with these warnings?

    1. I couldn't find any other way of calling plot.dendrogram. any tips ?
    2. NOTE in nma.pdbs(): this is the only way we found to make the progressbar work with multicore.
  2. Xinqiu Yao reporter

    No idea about plot.dendrogram... Maybe we leave it for now and see if CRAN complains.

    About progressbar, I used an alternative way writing a temp external file, and it works fine. For example, in dccm.pca():

    ...
          ## Initialize progress bar
          pb <- txtProgressBar(min=1, max=nmodes, style=3)
    
          if(ncore > 1) { # Parallel
    
             # For progress bar
             fpb <- fifo(tempfile(), open = "w+b", blocking = T)
    
             # spawn a child process for message printing
             child <- mcparallel({
                progress <- 0.0
                while(progress < nmodes && !isIncomplete(fpb)) {
                   msg <- readBin(fpb, "double")
                   progress <- progress + as.numeric(msg)
                   setTxtProgressBar(pb, progress)
                }
             } )
             ###################
             ...
             # In the loop of parallel jobs, write the temp file
             mclapply(1:ncore, function(i) {
                ...
               writeBin(1, fpb)
                ...
             })
             close(fpb)
             mccollect(child) # End the child for message printing
    
         } else {     # serial
            ...
            setTxtProgressBar(pb, i)
        }
        close(pb)
    ...
    

    I may write a function to wrap these steps and so easy to use for many other functions.

    Let me know what you think?

  3. Barry Grant

    I don't know of a good solution for warning 1 (‘stats:::plot.dendrogram’). There is a long thread here but with no real solution for us I think https://stat.ethz.ch/pipermail/r-devel/2013-August/thread.html#67180 There are CRAN check reports with these warning for quite a few packages, e.g. ftp://www.postfix.org/mirror/CRAN/web/checks/check_results_DMwR.html

    For warning 2 we can consider changing the names to xyz2z.pca() and z2xyz.pca(). Although this will break some Rd files (probably revealed by R CMD check), potentially mktraj.pca(), and maybe the PCA vignette? This change may not be worth the extra running around.

    Some of the long running examples would be good to address. I can look into the cna related ones. Others that require web access should really be modified to use local files where possible or else wrapped in the skip_on_cran() mentioned earlier.

  4. Lars Skjærven

    warning 1: suggesting to live with it for now. Alternatively, we can leave out hclustplot() from this release. It's nice to have, but nothing we cant live without either. apropos function names...

    warning 2: should be a relatively quick fix. I can give it a try if you want. however, I guess there are several more functions that could be re-named for consistency reasons. perhaps we should either do the entire re-naming job now, or wait with it.

    Long running examples: it's also a point to provide examples which are easy understandable for new users. e.g. I prefer

    read.pdb("1HEL") 
    

    as compared to

    file <- system.file("examples/....", package="bio3d")
    read.pdb(file)
    

    no ?

    (the geostas example may take long, but it is a nice one).

    Xin-qiu: Not sure whats the best solution here. I'm not convinced that keeping track of the variable through IO to the disk is the way to go. Can't we keep it the way it is for now?

    Let me know if any other tasks before release.

  5. Barry Grant

    Long examples are fine if they are justified - but we should skip them on CRAN.

    I agree that the typical system.file() requirement is confusing for new users (but then again so is R in general). Using it was necessary as not having net access was common previously (folks working at home or on plane with a laptop etc.). This is generally no longer the case so I am fine with using read.pdb("1HEL") like calls in the examples from now on.

    The one big disadvantage is that it can often slow things down considerably. Trajectories are obviously not to be read from online and blast and hmmer jobs/services should not be called on CRAN...

    Regarding release tasks.

    Lars: could you check the package on Windows and look after updating the website download links and tutorials/vignettes please?

    Xinqiu: can you generate the nice, example included, html docs - I know these can be tricky as we strip many of the /dontrun{} blocks to generate these. Also I just noticed that I am may of deleted the README.md from the bio3d package dir on master that we need for these html docs. My attempts to add it back have not worked yet for some reason. Anyway, I guess the online vignettes should be updated first as we hard link to these in the README.md

    Once all this is done we can try posting to CRAN.

  6. Xinqiu Yao reporter

    I have made a couple commits to change functions. See here and here. So, warning 2 has been removed. Let's keep warning1 and the note and see how it comes with CRAN.

    I am going to take care some long examples, too. Will try the "skip_on_cran()" as Barry suggested if applicable.

    I guess read.pdb() should be fine for CRAN but definitely should avoid longer database searching like blast

  7. Xinqiu Yao reporter

    After a couple days struggling, we finally reach a cleaner report for package check. Note that I used devtools::check() function, which is better than R CMD check --as-cran. To reproduce my results:

    git checkout releases
    cd ver_devel/util
    
    # If you have R-devel installed and called it R-devel/Rscript-devel 
    ./go_check.sh -devel -cran 
    
    # Otherwise
    ./go_check.sh -cran
    

    Note that it will pass with the minimal requirement of 'Imports' and 'Suggests' packages installed. It means that it passes even if you don't have MUSCLE or DSSP installed, which is important for actual CRAN check.

    Updating bio3d documentation
    Loading bio3d
    First time using roxygen2 4.0. Upgrading automatically...
    '/opt/R/R-devel/lib64/R/bin/R' --vanilla CMD build  \
      '/u2/xqyao/Project/bio3d/ver_devel/bio3d' --no-manual --no-resave-data 
    
    * checking for file/u2/xqyao/Project/bio3d/ver_devel/bio3d/DESCRIPTION... OK
    * preparingbio3d:
    * checking DESCRIPTION meta-information ... OK
    * installing the package to build vignettes
    * creating vignettes ... OK
    * checking for LF line-endings in source and make files
    * checking for empty or unneeded directories
    * looking to see if adata/datalistfile should be added
    * buildingbio3d_2.1-0.tar.gz'/opt/R/R-devel/lib64/R/bin/R' --vanilla CMD check  \
      '/tmp/RtmpbZWYZu/bio3d_2.1-0.tar.gz' --timings 
    
    * using log directory/tmp/RtmpbZWYZu/bio3d.Rcheck* using R Under development (unstable) (2014-10-05 r66720)
    * using platform: x86_64-unknown-linux-gnu (64-bit)
    * using session charset: UTF-8
    * checking for filebio3d/DESCRIPTION... OK
    * this is packagebio3dversion2.1-0* checking package namespace information ... OK
    * checking package dependencies ... OK
    * checking if this is a source package ... OK
    * checking if there is a namespace ... OK
    * checking for executable files ... OK
    * checking for hidden files and directories ... OK
    * checking for portable file names ... OK
    * checking for sufficient/correct file permissions ... OK
    * checking whether packagebio3dcan be installed ... OK
    * checking installed package size ... NOTE
      installed size is  9.7Mb
      sub-directories of 1Mb or more:
        doc   6.8Mb
    * checking package directory ... OK
    * checkingbuilddirectory ... OK
    * checking DESCRIPTION meta-information ... OK
    * checking top-level files ... OK
    * checking for left-over files ... OK
    * checking index information ... OK
    * checking package subdirectories ... OK
    * checking R files for non-ASCII characters ... OK
    * checking R files for syntax errors ... OK
    * checking whether the package can be loaded ... OK
    * checking whether the package can be loaded with stated dependencies ... OK
    * checking whether the package can be unloaded cleanly ... OK
    * checking whether the namespace can be loaded with stated dependencies ... OK
    * checking whether the namespace can be unloaded cleanly ... OK
    * checking loading without being on the library search path ... OK
    * checking dependencies in R code ... NOTE
    Unexported object imported by a ':::' call:stats:::plot.dendrogramSee the note in ?`:::` about the use of this operator.
    See the information on DESCRIPTION files in the chapterCreating R
    packagesof theWriting R Extensionsmanual.
    * checking S3 generic/method consistency ... OK
    * checking replacement functions ... OK
    * checking foreign function calls ... OK
    * checking R code for possible problems ... NOTE
    .calcAlnModes: no visible binding for '<<-' assignment toiipb.calcAlnModes: no visible binding for global variableiipbaa2index: no visible binding for global variableaa.indexaa2mass: no visible binding for global variableaa.massff.sdenm: no visible binding for global variablesdENMnma.pdbs: no visible binding for '<<-' assignment toiipbnma.pdbs: no visible binding for global variableiipb* checking Rd files ... OK
    * checking Rd metadata ... OK
    * checking Rd line widths ... OK
    * checking Rd cross-references ... OK
    * checking for missing documentation entries ... OK
    * checking for code/documentation mismatches ... OK
    * checking Rd \usage sections ... OK
    * checking Rd contents ... OK
    * checking for unstated dependencies in examples ... OK
    * checking contents ofdatadirectory ... OK
    * checking data for non-ASCII characters ... OK
    * checking data for ASCII and uncompressed saves ... OK
    * checking sizes of PDF files underinst/doc... WARNINGgs+qpdfmade some significant size reductions:
         compactedBio3D_md.pdffrom 987Kb to 456Kb
         compactedBio3D_nma.pdffrom 3.6Mb to 2.5Mb
         compactedBio3D_pca.pdffrom 1221Kb to 632Kb
      consider running tools::compactPDF(gs_quality = "ebook") on these files
    * checking installed files frominst/doc... OK
    * checking files invignettes... OK
    * checking examples ... [128s/219s] OK
    Examples with CPU or elapsed time > 5s
                   user system elapsed
    geostas      33.693  0.068  41.024
    binding.site  5.437  0.220   6.150
    hmmer         2.720  0.038  22.262
    plot.hmmer    1.537  0.014   8.230
    fit.xyz       1.527  0.018   7.222
    plot.blast    0.193  0.002   7.795
    pfam          0.067  0.010  10.975
    * checking for unstated dependencies in tests ... OK
    * checking tests ...
      Runningtest-all.R[38s/69s]
     [38s/69s] OK
    * checking for unstated dependencies in vignettes ... OK
    * checking package vignettes ininst/doc... OK
    * checking running R code from vignettes ...Bio3D_cna-transducin.Rnw... OKBio3D_install.Rnw... OKBio3D_md.Rnw... OKBio3D_nma.Rnw... OKBio3D_pca.Rnw... OK
     OK
    * checking re-building of vignette outputs ... OK
    * checking PDF version of manual ... OK
    * DONE
    
    WARNING: There was 1 warning.
    NOTE: There were 3 notes.
    See/tmp/RtmpbZWYZu/bio3d.Rcheck/00check.logfor details.
    

    To do:

    • Check examples/tests to avoid writting to local disk
  8. Barry Grant

    This is looking much better. The one WARNING is still about our vignette sizes. One option to consider is moving all existing vignettes to online ONLY (i.e. removing from the package) and including only one small 'all-in-one' vignette in the package that demos several key areas very briefly and points to the online versions for more details.

  9. Xinqiu Yao reporter

    That is a good idea to have one small 'all-in-one' vignette and points to others online. Not sure whether this WARNING is actually related to CRAN check. Will explore more later.

    Update: Examples now won't write anything to local work directory. Instead, all outputs are redirected to system's temporary directory. Haven't touched test functions yet, but I guess they should be fine because both R CMD check and devtools::check() work in an enclosed environment, i.e. outputs are automatically redirected.

  10. Barry Grant

    To get rid of the remaining WARNING

    Unexported object imported by a ':::' call:stats:::plot.dendrogram

    Lets just copy the plot.dendrogram function from the stats package into bio3d as suggested here: https://stat.ethz.ch/pipermail/r-devel/2013-August/067189.html https://stat.ethz.ch/pipermail/r-devel/2013-August/thread.html#67180

    Then the only other things are the two NOTES about no visible binding for '<<-' assignment to ‘iipb’ and the large PDFs.

    Do we know how to tackle the first of these?

    For the PDFs I think one, perhaps html, vignette that points to all the others online would be ok. It does not even need to contain any material for this version. I will think about this more over the weekend.

  11. Xinqiu Yao reporter

    Hi,

    We only need the name list of plot.dendrogram() arguments, and so the simplest way is just hard code it:

    > names(formals(stats:::plot.dendrogram))
     [1] "x"          "type"       "center"     "edge.root"  "nodePar"
     [6] "edgePar"    "leaflab"    "dLeaf"      "xlab"       "ylab"
    [11] "xaxt"       "yaxt"       "horiz"      "frame.plot" "xlim"
    [16] "ylim"       "..."
    

    The only problem is we may forget updating it if the function will change in future. Not sure about copying entire function: Will we have any copyright issue?

    About the first "no visible binding" NOTE, I suggest that instead of defining a global variable we pass the variable iipb to the function as an argument. This should do the same thing of sharing the area opened by bigmemory between two functions.

    Another NOTE is kind of weired: I guess it is because data is not in the namespace of a package and thus is not exported. The simplest way to solve it is define the same variable at the beginning of the function, e.g.

    aa.index = bio3d::aa.index
    ...
    

    Finally, it would be great if we can reduce sizes of PDFs. I will start doing revisions of other parts, if you all agree with my suggestions.

  12. Xinqiu Yao reporter

    Thanks, Barry. I've updated with this commit, which seem neater than suggestions above but do remove ALL the notes in the CRAN check.

    I think, after solving the problem of PDF sizes, we can resubmit it to CRAN.

  13. Lars Skjærven

    All-in-one vignette: how about expanding the intro vignette ("getting started with bio3d"), and add further examples to this. e.g. some basic functionality on (1) reading and writing PDBs, and atom selection, (2) easy PDB analysis, like torsions, contact map, nma, etc (3) sequence alignment and databases, (4) introduction to the pdbs object, (6) basic examples for PCA, (7) eNMA, (8) CNA, etc, with links to the "main" vignettes. perhaps even shorter ...?

  14. Barry Grant

    That sounds good. It might still be too large though once various molecular images are inserted.

    The smallest and simplest that will enable CRAN submission would be a html vignette that links to the online material. Perhaps a slightly modified version of the statidocs index.html would be sufficient???

  15. Log in to comment