Issue with core.find()

Lars Skjærven

Hi, thanks for reporting this. Could you save your pdbs object and send it to me via email? That would make it much easier to debug. You can do

save(pdbs, file="my_pdbs.RData")

and send the my_pdbs.RData file. (email: larsss [at] gmail dot com)

2015-04-20T18:45:36+00:00

Barry Grant

The core.find() function should work with gaps in your aligned structures. Can you please tell us what the dimensions of your 'pdbs$xyz' object are? E.g.

dim(pdbs$xyz)
# Or just type 'pdbs$xyz'

Without a minimal reproducible example (e.g. a small set of your PDB ids that result in this error) it is hard to see what could be wrong here beyond a dimension problem in your pdbs object. Thanks!

2015-04-20T18:48:34+00:00

Lars Skjærven

Adjust the stop.at argument down to below 11, e.g. core.find(pdbs, stop.at=5).

Looking at your alignment, it seems you have only 12 non-gap columns. Open the aln.fa file generated by pdbaln() to see this (e.g. with seaview). You can also check which non-gap columns you have with the gap.inspect function:

> gaps <- gap.inspect(pdbs$ali)
> gaps$f.inds
 [1] 510 511 512 544 627 645 646 647 648 649 650 654

(12 positions).

You can certainly align all your PDB files on these 12 residues (or even on the core identified), but a PCA here will omit all gap-containing columns, which means that the PCA will be performed on only 12 residues.
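To make the size of that PCA input concrete, here is a small base R sketch that maps the 12 non-gap positions listed above to coordinate columns. It assumes that pdbs$xyz stores one atom per alignment position with three consecutive columns (x, y, z) each; the variable name xyz.inds is only illustrative.

```r
# Non-gap alignment positions as reported by gaps$f.inds above
f.inds <- c(510, 511, 512, 544, 627, 645, 646, 647, 648, 649, 650, 654)

# Assumption: position k occupies columns 3k-2, 3k-1 and 3k of the xyz matrix
xyz.inds <- as.vector(rbind(3 * f.inds - 2, 3 * f.inds - 1, 3 * f.inds))

# Number of coordinate columns that would enter the PCA
n.cols <- length(xyz.inds)
```

Under that assumption the PCA would see only 36 coordinate columns, which is very little signal for a set of large structures.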

The cryptic error message you got certainly calls for some internal checking and clearer messaging in the core.find function, to be precise prior to line 130:

while(length(res.still.in) > stop.at) {

Thanks!
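A minimal sketch of the kind of up-front check meant here, in base R only. The helper name check_core_input and its arguments are hypothetical and not part of bio3d, the toy matrix stands in for pdbs$ali with "-" marking gaps, and the exact comparison against stop.at is an assumption about where the loop would fail.

```r
# Hypothetical helper, not part of bio3d: fail early with a readable message
# when the alignment has too few gap-free columns for the requested stop.at.
check_core_input <- function(ali, stop.at) {
  # A column is gap-free when no structure has "-" at that position
  n.nongap <- sum(colSums(ali == "-") == 0)
  if (n.nongap <= stop.at) {
    stop(sprintf(
      "Only %d non-gap alignment columns, but stop.at=%d; lower stop.at or revise the alignment",
      n.nongap, stop.at))
  }
  invisible(n.nongap)
}

# Toy alignment standing in for pdbs$ali: 3 structures, 5 positions
ali <- rbind(c("A", "-", "G", "K", "L"),
             c("A", "L", "G", "-", "L"),
             c("A", "L", "G", "K", "L"))

# Passes: the toy alignment has more gap-free columns than stop.at
n.ok <- check_core_input(ali, stop.at = 2)

# Fails with the explicit message instead of a cryptic downstream error
msg <- tryCatch(check_core_input(ali, stop.at = 5),
                error = function(e) conditionMessage(e))
```

A check of this kind would let the user see immediately that the alignment, not the core-finding step, is the limiting factor.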

2015-04-20T19:16:45+00:00

Barry Grant

Indeed the error message should be improved upon here - thanks for reporting!

On a more fundamental level, I think your current alignment is likely not what you want. Note that you have at least 10 very different sequence groups in there, indicative of distinct proteins (i.e. different protein families), which come from combining all the chains of your chosen PDB files. Examining your alignment with a viewer like seaview can help you here, as can examining the pairwise sequence identity values. E.g.

i <- seqidentity(pdbs)
hc <- hclust(as.dist(1-i))
plot(hc, hang=-1)

You might want to focus on only the chains representative of the family you are most interested in here.
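One possible way to act on the dendrogram is to cut the tree into groups and keep only the group containing a chain of interest. The base R sketch below uses a made-up identity matrix in place of seqidentity(pdbs); the chain names and the 0.5 cut height are illustrative assumptions, not values from your data.

```r
# Made-up pairwise identity matrix standing in for seqidentity(pdbs)
ids <- c("s1", "s2", "s3", "s4")
i <- matrix(c(1.0, 0.9, 0.2, 0.1,
              0.9, 1.0, 0.2, 0.1,
              0.2, 0.2, 1.0, 0.8,
              0.1, 0.1, 0.8, 1.0),
            nrow = 4, dimnames = list(ids, ids))

# Same clustering as above: distance = 1 - identity
hc <- hclust(as.dist(1 - i))

# Cut the tree at distance 0.5 (arbitrary, illustrative threshold)
grps <- cutree(hc, h = 0.5)

# Keep the members of the group that contains the chain of interest ("s1")
keep <- names(grps)[grps == grps["s1"]]
```

With real data, the retained ids could then be used to select the input files before re-running pdbaln(), so that the alignment covers a single family.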

2015-04-20T19:37:01+00:00

Barry Grant

changed status to resolved

Resolved but with a note for future core.find() error/warning message improvements

2015-04-20T19:47:04+00:00

Xinqiu Yao

changed version to v2.2

2016-09-21T15:00:20+00:00
