Allow reprocessing of a dataset without rereading it in

Issue #4 resolved
Jason Wang created an issue

The user should be able to call klip_dataset multiple times and not have to do any bookkeeping.

Currently, values such as dataset.center and dataset.wcs get modified in klip_dataset, meaning that if you want to run klip_dataset multiple times, you have to reset dataset.center and dataset.wcs manually. We need some way to specify input dataset centers/wcs and output dataset centers/wcs (possibly by adding new fields).
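
As a rough illustration of the "new fields" idea, here is a minimal sketch. Everything in it is a hypothetical stand-in: ToyDataset, toy_klip_dataset, output_centers and output_wcs are assumed names for illustration, not pyklip's actual classes, functions, or fields.

```python
import copy


class ToyDataset:
    """Hypothetical stand-in for an instrument dataset (not pyklip's GPIData)."""

    def __init__(self, centers, wcs):
        # Input fields: set once when the data are read in, never modified later.
        self.centers = [list(c) for c in centers]
        self.wcs = copy.deepcopy(wcs)
        # Output fields (assumed names): overwritten by every processing run.
        self.output_centers = None
        self.output_wcs = None


def toy_klip_dataset(dataset, aligned_center):
    """Hypothetical processing step that writes only to the output fields."""
    # After alignment every frame shares the same center.
    dataset.output_centers = [list(aligned_center) for _ in dataset.centers]
    # Work on a copy so the input headers stay untouched.
    dataset.output_wcs = copy.deepcopy(dataset.wcs)
    return dataset


dataset = ToyDataset(
    centers=[[140.2, 139.8], [140.5, 140.1]],
    wcs=[{"frame": 0}, {"frame": 1}],
)
original_centers = copy.deepcopy(dataset.centers)

# Two runs in a row, with no manual reset of the input fields in between.
toy_klip_dataset(dataset, aligned_center=(140.0, 140.0))
toy_klip_dataset(dataset, aligned_center=(141.0, 141.0))
```

Because the inputs are only ever read, the second call starts from the same state as the first, which is the "no bookkeeping" behaviour requested above.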

Comments (5)

  1. Rob De Rosa

    How about changing the variable names so that none of the variables set by GPI.GPIData() are changed by klip_dataset? We could also put in some logic so that if klip_dataset is called again, the align_and_scale step can be skipped which would ~halve the time required to process a sequence.

  2. Jason Wang reporter

    Yeah, that's what I meant by adding new fields. We would always retain an original copy of everything.

    The second issue you address is a different issue as it's more of a memory management problem. If we hold on to the align_and_scale data, that means we'll be potentially holding on to tens of GB of memory after pyklip finishes even if the user doesn't wish to reprocess a dataset. That would be best put in as an optional parameter I would guess.

  3. Jason Wang reporter

    Sounds good. Note that to not break p1640 data and the interface, we should make new fields for output centers/wcs headers and not modify how the input centers/wcs headers are defined.

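
The optional caching discussed in the first two comments could look roughly like the sketch below. It is only an illustration under stated assumptions: CachingDataset, toy_align_and_scale, toy_klip_dataset and the keep_aligned parameter are hypothetical names, and the doubling and summing steps are placeholders for the real alignment and KLIP computations.

```python
class CachingDataset:
    """Hypothetical dataset that can optionally hold on to aligned frames."""

    def __init__(self, frames):
        self.frames = frames
        self.aligned_cache = None  # stays None unless the user opts in


def toy_align_and_scale(frames):
    # Placeholder for the expensive align-and-scale step.
    return [[2 * value for value in frame] for frame in frames]


def toy_klip_dataset(dataset, keep_aligned=False):
    """Hypothetical processing call; keep_aligned is an assumed parameter."""
    if dataset.aligned_cache is not None:
        # A previous run kept the aligned data, so the expensive step is skipped.
        aligned = dataset.aligned_cache
        reused = True
    else:
        aligned = toy_align_and_scale(dataset.frames)
        reused = False
    # Keep the (potentially very large) aligned data only when asked to.
    dataset.aligned_cache = aligned if keep_aligned else None
    result = [sum(frame) for frame in aligned]  # placeholder for the KLIP output
    return result, reused


ds = CachingDataset([[1, 2], [3, 4]])
first_result, first_reused = toy_klip_dataset(ds, keep_aligned=True)
second_result, second_reused = toy_klip_dataset(ds, keep_aligned=False)
```

With the default keep_aligned=False nothing is retained after the call, so users who never reprocess pay no extra memory cost.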