- marked as minor
pca tool's -s/--separate_colors should be extracted as a re-usable helper
this helper would be nice to have on other diagnostic plotting tools such as the distribution tool
it would also be nice to have for scripting whenever a counts_superdict with reps from different conditions has been loaded
for reference, the usage for this on the tool side is to request a parameter
    pca_parser.add_argument(
        '-s', '--separate_colors',
        type=str,
        help='''Specify a shell-quoted, comma-separated list of class names
        (which must be substrings of the replicate names) to color-code the
        output with. For example, 'ES,NPC'.''')
on the API side, the color-coding relies on two kwargs passed to the plotting function: labels (equivalent to rep_order, simply the name for each rep if the reps are passed in an unlabeled data structure) and levels (a parallel list storing the condition name for each replicate in the same order as labels)
under this API spec, the levels can be computed with the following code block:
    # determine levels
    levels = None
    if args.separate_colors is not None:
        classes = args.separate_colors.split(',')
        levels = []
        for rep_name in rep_names:
            target_class = None
            for c in classes:
                if c in rep_name:
                    print('assigning rep %s to class %s' % (rep_name, c))
                    target_class = c
                    break
            if target_class is None:
                raise ValueError('could not assign replicate %s to any of the '
                                 'color-coding classes %s'
                                 % (rep_name, classes))
            else:
                levels.append(target_class)
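as a rough sketch of what the extracted helper might look like, the loop above could be wrapped in a function that takes the replicate names and the raw -s/--separate_colors string. the name compute_levels and its signature are hypothetical (nothing by that name exists in lib5c as far as this issue states), and the progress print from the original block is dropped here to keep the helper quiet when used from scripts:

```python
def compute_levels(rep_names, separate_colors=None):
    """Hypothetical helper: assign each replicate to a color-coding class.

    rep_names: list of replicate names, in plotting order.
    separate_colors: comma-separated class names (e.g. 'ES,NPC'), or None.

    Returns None if separate_colors is None, otherwise a list parallel to
    rep_names holding the first class name that is a substring of each
    replicate name.
    """
    if separate_colors is None:
        return None
    classes = separate_colors.split(',')
    levels = []
    for rep_name in rep_names:
        for c in classes:
            if c in rep_name:
                levels.append(c)
                break
        else:
            # the inner loop finished without a break: no class matched
            raise ValueError('could not assign replicate %s to any of the '
                             'color-coding classes %s' % (rep_name, classes))
    return levels
```

on the tool side this would presumably reduce to something like levels = compute_levels(rep_names, args.separate_colors), with the same first-match-wins behavior as the original block.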
there is an alternate API possible, which is that if the plotting function accepts a labeled data structure such as a counts_superdict, then labels can be a dict mapping from keys of the counts_superdict to short replicate names suitable for plotting as labels (defaulting to the identity map), and levels can be a dict mapping from the keys of the counts_superdict to condition names
currently lib5c.plotters.distribution.plot_global_distributions() and lib5c.plotters.distribution.plot_regional_distributions() use half of this second API, with the labels being a dict
probably both approaches should be supported by the extracted helper - the code block above can be easily modified to return a dict instead of a list if needed
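a minimal sketch of the dict-returning variant, assuming the counts_superdict is an ordinary dict keyed by replicate name. the function names compute_levels_dict and default_labels are hypothetical and only illustrate the second API described above; they are not existing lib5c functions:

```python
def compute_levels_dict(counts_superdict, separate_colors=None):
    """Hypothetical helper: map each counts_superdict key to a class name.

    Returns None if separate_colors is None, otherwise a dict from
    replicate name to the first class name found as a substring of it.
    """
    if separate_colors is None:
        return None
    classes = separate_colors.split(',')
    levels = {}
    for rep_name in counts_superdict:
        matches = [c for c in classes if c in rep_name]
        if not matches:
            raise ValueError('could not assign replicate %s to any of the '
                             'color-coding classes %s' % (rep_name, classes))
        # first match wins, in the order the classes were listed
        levels[rep_name] = matches[0]
    return levels


def default_labels(counts_superdict):
    """Hypothetical helper: identity map from replicate name to plot label."""
    return {rep_name: rep_name for rep_name in counts_superdict}


# toy usage with an empty counts dict per replicate (illustrative data only)
counts_superdict = {'ES_Rep1': {}, 'ES_Rep2': {}, 'NPC_Rep1': {}}
levels = compute_levels_dict(counts_superdict, 'ES,NPC')
labels = default_labels(counts_superdict)
```

a plotting function taking both dicts could then look up labels[rep] and levels[rep] for each key of the counts_superdict, which avoids relying on list order.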
Comments (3)
-
reporter -
reporter it's not likely that the CLI tools will be refactored unless we tackle #66. in the context of that proposal though, this would be a good candidate for a "middleware" layer
-
reporter - changed status to on hold
this seems highly optional since things are working as they stand right now