Cleanup Receptor and format parsing code
Issue #109
resolved
Lots of changes were made to the code in the Receptor and Parsers modules to accommodate the AIRR format. It needs a lot of cleanup after these changes.
Also, we should probably convert the Receptor class to use AIRR naming and indexing rules, which will require us to change and test every other tool that uses the Receptor class.
Comments (3)
-
reporter -
reporter
- REV_COMP is now in all parsers.
- Keeping CDR3_IGBLAST_* for compatibility, except renaming CDR3_IGBLAST_NT to CDR3_IGBLAST for consistency.
- Started moving parser-specific field lists into parser classes.
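A minimal sketch of how the rename above could be handled when reading older files. The `LEGACY_NAMES` table and the `normalize_fields` helper are hypothetical names used for illustration only; they are not part of changeo.

```python
# Hypothetical table of legacy column names mapped to their current names.
# Only the rename mentioned in this comment is listed.
LEGACY_NAMES = {"CDR3_IGBLAST_NT": "CDR3_IGBLAST"}


def normalize_fields(record):
    """Return a copy of a row dict with legacy column names replaced."""
    return {LEGACY_NAMES.get(key, key): value for key, value in record.items()}


# Example row as it might come out of csv.DictReader on an older file.
row = {"SEQUENCE_ID": "seq1", "CDR3_IGBLAST_NT": "TGTGCGAGA"}
print(normalize_fields(row))
```

Keeping the mapping in one table means older files can still be read without touching the rest of the parsing code.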
-
reporter - changed status to resolved
- Leaving *_VDJ fields as is: _ALIGN was already used by AlignRecords.
- Leaving FUNCTIONAL as is for now as well.
- Leaving indexing as is for the moment. Will revisit later.
Lots of cleanup on parsing, schemas and Receptor in a31a721. It's as clean as it's going to get for now.
Looks like we will stick with changeo field names (except lowercase) for the Receptor class attributes.
Some decisions need to be made about the following changeo fields:
- REV_COMP - should probably be a core (required) field, which means adding extraction of this info to the IMGT and iHMMune-align parsers, if available.
- CDR3_IGBLAST_* - We are now using the IgBLAST CDR3 fields directly to determine JUNCTION, so we can probably drop these entirely.
- *_VDJ - might make sense to change these to *_ALIGN for clarity between the changeo and airr formats.
- FUNCTIONAL - Is technically the wrong name for this field. It should be PRODUCTIVE, but it's one of those fields that's regularly used and breaking backwards compatibility might be unwise.
We should probably move the extra alignment field sets (regions, scores, junction, etc) out of ChangeoSchema and AIRRSchema and into the parser classes IMGTReader, IgBLASTReader, iHMMuneReader and make them Receptor attributes instead of output fields. Should be easier to understand that way, as the fields method of the schema is only relevant to the aligner parsing task.
Should probably still switch index fields in Receptor to 0-based, because python.
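For the 0-based indexing point, a minimal sketch of the conversion involved. It assumes the stored coordinates are 1-based and inclusive at both ends, which may not match how Receptor actually stores positions (for example, a start plus a length). The function names `to_slice` and `from_slice` are hypothetical.

```python
def to_slice(start, end):
    """Convert a 1-based, inclusive (start, end) pair to a Python slice.

    Assumption: both coordinates are 1-based and the end is inclusive.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    # 0-based, half-open: shift the start down by one, keep the end as is.
    return slice(start - 1, end)


def from_slice(region):
    """Convert a 0-based, half-open slice back to 1-based, inclusive."""
    return (region.start + 1, region.stop)


sequence = "ACGTACGT"
region = to_slice(3, 5)
print(sequence[region])    # GTA
print(from_slice(region))  # (3, 5)
```

Doing the conversion in one place at the parser boundary would let the rest of the code slice sequences directly without off-by-one adjustments.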