Validate accession-type pairs

Issue #315 resolved
Reece Hart
created an issue

Depends on: #365

Users are reporting incorrect pairings of accessions and variant types generated by other tools. This isn't a problem with hgvs itself, but it would be nice to provide a mechanism to warn about them.

For example, SnpEff currently returns p. variants with NM (i.e., mRNA/cDNA) accessions. A similar problem was also reported for an unspecified LOVD installation for BRCA1.

This issue proposes a new intrinsic validation step that will validate accessions against types.

Three categories:

1) Known good pairs, e.g., NM w/ c.
2) Known bad pairs, e.g., NM w/ p., NP w/ c., etc.
3) Everything else.

Two modes seem likely. In a strict mode, cases 2 and 3 raise exceptions because they are not known to be valid. In a relaxed mode, only case 2 raises an exception and case 3 is presumed okay. In other words, strict is pessimistic about unknown pairs and relaxed is optimistic about them.
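
A minimal sketch of how the three categories and two modes could fit together is below. It is not the hgvs implementation: the names `KNOWN_GOOD`, `KNOWN_BAD`, `AccessionTypeError`, `classify_pair`, and `validate_pair` are hypothetical, and the pair tables are illustrative rather than exhaustive.

```python
import re

# Hypothetical lookup tables keyed by (accession prefix, variant type).
# Illustrative only; a real table would cover more prefixes.
KNOWN_GOOD = {("NM", "c"), ("NP", "p"), ("NC", "g")}
KNOWN_BAD = {("NM", "p"), ("NP", "c")}


class AccessionTypeError(ValueError):
    """Raised when an accession-type pair is rejected (hypothetical name)."""


def classify_pair(accession, var_type):
    """Return 'good', 'bad', or 'unknown' for an accession-type pair."""
    # Take the leading run of capital letters as the prefix, e.g. "NM".
    match = re.match(r"[A-Z]+", accession)
    prefix = match.group(0) if match else ""
    pair = (prefix, var_type)
    if pair in KNOWN_GOOD:
        return "good"
    if pair in KNOWN_BAD:
        return "bad"
    return "unknown"


def validate_pair(accession, var_type, strict=True):
    """Raise on known-bad pairs; in strict mode also raise on unknown pairs."""
    category = classify_pair(accession, var_type)
    if category == "bad" or (strict and category == "unknown"):
        raise AccessionTypeError(
            "{}:{}. is a {} accession-type pair".format(accession, var_type, category)
        )
    return True
```

With these tables, `validate_pair("NM_000551.3", "p")` raises in both modes, while an accession with an unlisted prefix raises only when `strict=True`.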

Comments (6)

  1. Reece Hart reporter

    Meng- after seeing your proposed change, my only recommendations are:

    1) Always reject NM with p.
    2) Add CM\d+\.\d+ as a valid g. type (e.g., CM000663.2 is equivalent to NC_000001.11).
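
The second recommendation can be sketched as a pattern check. This is an illustration only, not code from hgvs or from the proposed change: `GENOMIC_ACCESSION_RE` and `is_genomic_accession` are hypothetical names, and the `NC_` alternative is an assumption added so that both accessions from the example are accepted.

```python
import re

# Accept either an NC_ accession or a CM accession, each with a
# numeric version after a literal dot. Hypothetical pattern.
GENOMIC_ACCESSION_RE = re.compile(r"^(?:NC_\d+\.\d+|CM\d+\.\d+)$")


def is_genomic_accession(accession):
    """Return True if the accession looks valid for a g. variant."""
    return GENOMIC_ACCESSION_RE.match(accession) is not None
```

Note that the dot is escaped so that it matches only a literal version separator.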
