pdbsplit lacks a warning for missing files

Issue #32 resolved
Barry Grant created an issue

There is a problem here: when a particular id is not found, no warning is issued and no file is returned. This is not ideal. E.g.:

## first two chains don't exist but pdbsplit will not warn about this!!
ids <- c("1a70_Z","1a70__","1a70_A", "1czp_A")
raw.files <- get.pdb(ids, path = "raw_pdbs")
files <- pdbsplit(raw.files, ids, path = "raw_pdbs/split_chain/")
files
##[1] "raw_pdbs/split_chain//1a70_A.pdb" "raw_pdbs/split_chain//1czp_A.pdb"
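
Until pdbsplit reports this itself, the missing ids can be recovered on the caller's side by comparing the requested ids against the file names that come back. The following is only a sketch: `unmatched_ids` is a hypothetical helper, not a bio3d function, and it assumes the split files are named `<id>.pdb` as in the output above. The `files` vector is typed in by hand here so the snippet runs without downloading anything.

```r
# Hypothetical helper (not part of bio3d): list the requested ids that have
# no corresponding file in the vector returned by pdbsplit().
# Assumes split files are named "<id>.pdb", as in the output above.
unmatched_ids <- function(ids, files) {
  # "raw_pdbs/split_chain//1a70_A.pdb" -> "1a70_A"
  produced <- sub("\\.pdb$", "", basename(files))
  setdiff(ids, produced)
}

ids <- c("1a70_Z", "1a70__", "1a70_A", "1czp_A")

# Stand-in for the value returned by pdbsplit() in the example above.
files <- c("raw_pdbs/split_chain//1a70_A.pdb",
           "raw_pdbs/split_chain//1czp_A.pdb")

not_found <- unmatched_ids(ids, files)
if (length(not_found) > 0) {
  warning("unmatched ids: ", paste(not_found, collapse = ", "))
}
```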

Comments (5)

  1. Barry Grant reporter

    Also, the function looks like it still processes all chains present even though only a subset may be requested with the ids argument. We might want to add this to the ToDo list as a future enhancement.

  2. Lars Skjærven

    Agreed. I have tried a solution now. Please check whether it makes sense.

    I am not sure whether using "%in%" is smart in this case. It might be better with grep.

  3. Lars Skjærven

    Right. I did it the other way around, i.e. a warning was issued if there were elements of 'pdb.files' not in use.

    Now it looks like this with your example:

    ids <- c("1a70_Z","1a70__","1a70_A", "1czp_A")
    raw.files <- get.pdb(ids, path = "raw_pdbs")
    files <- pdbsplit(raw.files, ids, path = "raw_pdbs/split_chain/")
    Warning message:
    In pdbsplit(raw.files, ids, path = "raw_pdbs/split_chain/") :
      unmatched ids: 1a70_Z, 1a70__
    
    files
    [1] "raw_pdbs/split_chain//1a70_A.pdb" "raw_pdbs/split_chain//1czp_A.pdb"
    

    And if pdb id '1a70' is not in use:

    ids <- c("1a70_Z", "1czp_A")
    raw.files <- get.pdb(ids, path = "raw_pdbs")
    files <- pdbsplit(raw.files, ids, path = "raw_pdbs/split_chain/")
    Warning messages:
    1: In pdbsplit(raw.files, ids, path = "raw_pdbs/split_chain/") :
      unmatched pdb files: 1a70
    2: In pdbsplit(raw.files, ids, path = "raw_pdbs/split_chain/") :
      unmatched ids: 1a70_Z
    

    OK, so the two warnings are redundant, but they might be useful when 'ids' and 'raw.files' are obtained differently. We could also drop the 'unmatched pdb files' warning, perhaps.

    Is matching the exact 'ids' with '%in%' (line 26) OK?

    tmp.names <- paste(substr(basename(pdb.files[i]), 
                               1, 4), "_", chains, sep = "")
    tmp.inds <- tmp.names %in% ids
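
To make the '%in%' versus grep question concrete, here is a small sketch of how the two behave on made-up input. The chain names and the bare id "1a70" are assumptions chosen for illustration, and nothing here is taken from the actual pdbsplit source beyond the `paste()` pattern quoted above. The prefix variant uses `grepl()` with an anchored pattern, which is one possible reading of "grep" in this thread.

```r
# Chain names as they would be built for one PDB file (values made up here).
tmp.names <- paste("1a70", "_", c("A", "B"), sep = "")   # "1a70_A" "1a70_B"

# Requested ids: one existing chain, one missing chain, one bare PDB id.
ids <- c("1a70_A", "1a70_Z", "1a70")

# Exact matching: a chain is kept only if its full name appears in 'ids'.
exact <- tmp.names %in% ids

# Prefix matching: a chain is kept if any id matches the start of its name,
# so the bare id "1a70" selects every chain of that entry.
prefix <- Reduce(`|`, lapply(ids, function(id) grepl(paste0("^", id), tmp.names)))

# Ids that select nothing under each scheme.
unmatched.exact  <- setdiff(ids, tmp.names)
unmatched.prefix <- ids[!vapply(ids, function(id) any(grepl(paste0("^", id), tmp.names)),
                                logical(1))]
```

With exact matching the bare id "1a70" counts as unmatched, while prefix matching treats it as a request for all chains. Note that ids are interpreted as regular expressions in the prefix variant, so an id containing a character such as "." would match more than intended.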
    