pdbsplit lacks a warning for missing files
There is a problem here: when a particular id is not found, no warning is issued and no file is returned. This is not ideal. E.g.
## first two chains don't exist but pdbsplit will not warn about this!!
ids <- c("1a70_Z","1a70__","1a70_A", "1czp_A")
raw.files <- get.pdb(ids, path = "raw_pdbs")
files <- pdbsplit(raw.files, ids, path = "raw_pdbs/split_chain/")
files
##[1] "raw_pdbs/split_chain//1a70_A.pdb" "raw_pdbs/split_chain//1czp_A.pdb"
Comments (5)
-
reporter -
Agreed. I have tried a solution now. Check if it makes sense.
Not sure if using "%in%" is smart in this case. It might be better with grep.
-
reporter hmm... Still does not give any warning with the above example code.
-
Right. I did it the other way around, i.e. a warning was issued if there were elements of 'pdb.files' not in use.
Now it looks like this with your example:
ids <- c("1a70_Z","1a70__","1a70_A", "1czp_A")
raw.files <- get.pdb(ids, path = "raw_pdbs")
files <- pdbsplit(raw.files, ids, path = "raw_pdbs/split_chain/")
Warning message:
In pdbsplit(raw.files, ids, path = "raw_pdbs/split_chain/") :
  unmatched ids: 1a70_Z, 1a70__
files
[1] "raw_pdbs/split_chain//1a70_A.pdb" "raw_pdbs/split_chain//1czp_A.pdb"
And if pdb id '1a70' is not in use:
ids <- c("1a70_Z", "1czp_A")
raw.files <- get.pdb(ids, path = "raw_pdbs")
files <- pdbsplit(raw.files, ids, path = "raw_pdbs/split_chain/")
Warning messages:
1: In pdbsplit(raw.files, ids, path = "raw_pdbs/split_chain/") :
  unmatched pdb files: 1a70
2: In pdbsplit(raw.files, ids, path = "raw_pdbs/split_chain/") :
  unmatched ids: 1a70_Z
OK, so the two warnings are redundant here, but the second one might be useful when 'ids' and 'raw.files' are obtained differently. We could also delete the 'unmatched pdb files' warning, perhaps.
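For illustration only, the two checks described above could be sketched in base R roughly as follows. The helper name report.unmatched and its 'found' argument (the chain ids that were actually written) are hypothetical and are not part of bio3d; the actual code inside pdbsplit may differ.

```r
# Hypothetical helper, not part of bio3d: reports files and ids without a match.
# 'found' is assumed to hold the chain ids that were actually written to disk.
report.unmatched <- function(pdb.files, ids, found) {
  # Four-character PDB codes taken from the file names.
  file.codes <- substr(basename(pdb.files), 1, 4)

  # Files from which no requested chain was extracted.
  unused.files <- setdiff(file.codes, substr(found, 1, 4))
  # Requested ids that did not correspond to any written chain.
  missing.ids <- setdiff(ids, found)

  if (length(unused.files) > 0)
    warning("unmatched pdb files: ", paste(unused.files, collapse = ", "))
  if (length(missing.ids) > 0)
    warning("unmatched ids: ", paste(missing.ids, collapse = ", "))

  invisible(list(files = unused.files, ids = missing.ids))
}

# Mirrors the second example above: 1a70_Z does not exist, so 1a70 is unused.
res <- report.unmatched(c("raw_pdbs/1a70.pdb", "raw_pdbs/1czp.pdb"),
                        ids = c("1a70_Z", "1czp_A"),
                        found = "1czp_A")
```

Keeping the two checks separate means either warning can be dropped without touching the other.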
Is matching the exact 'ids' with '%in%' (line 26) OK?
tmp.names <- paste(substr(basename(pdb.files[i]), 1, 4), "_", chains, sep = "")
tmp.inds <- tmp.names %in% ids
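To make the '%in%' versus grep question concrete, here is a small base R sketch. The chain labels are made-up inputs for illustration and are not read from a real PDB file.

```r
# Hypothetical inputs; 'chains' is an assumed set of chains, not read from a file.
ids    <- c("1a70_Z", "1a70__", "1a70_A", "1czp_A")
chains <- c("A", "B")
tmp.names <- paste("1a70", "_", chains, sep = "")

# Exact matching: TRUE only where the whole string equals one of 'ids'.
exact.inds <- tmp.names %in% ids

# Pattern matching: the regex "1a70_" also hits "1a70_Z" and "1a70__".
partial.hits <- grep("1a70_", ids, value = TRUE)

# ids requested for this PDB code that have no corresponding chain.
unmatched <- setdiff(grep("^1a70_", ids, value = TRUE), tmp.names)
```

Under these assumptions '%in%' selects only "1a70_A", whereas a bare grep on the PDB code matches every id sharing that prefix, so exact matching looks like the safer choice for picking chains.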
-
- changed status to resolved
I think this is ok now
-
Also, the function looks like it still processes all chains present, even though only a subset may be requested with the ids argument. We might want to add this to the ToDo list as a future enhancement.
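One possible shape for that enhancement, as a rough sketch: filter the chain list against 'ids' before any chain is extracted. The helper select.chains is hypothetical, and treating ids = NULL as "process every chain" is an assumption about the intended behaviour.

```r
# Hypothetical helper, not part of bio3d: keep only the chains named in 'ids'.
select.chains <- function(pdb.file, chains, ids = NULL) {
  # Assumption: with no ids given, every chain is processed.
  if (is.null(ids)) return(chains)

  code <- substr(basename(pdb.file), 1, 4)
  tmp.names <- paste(code, "_", chains, sep = "")
  chains[tmp.names %in% ids]
}

# Assumed chains "A" and "B"; only "A" is requested for this file.
wanted <- select.chains("raw_pdbs/1czp.pdb", c("A", "B"),
                        ids = c("1a70_A", "1czp_A"))
```

The splitting loop would then iterate over 'wanted' instead of all chains, so unrequested chains are never extracted or written.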