reduce boilerplate re-use in toolbox

in particular, the following sequence is reused way too often:

# expand infiles
expanded_infiles = []
for infile in args.countsfiles:
    expanded_infiles.extend(glob.glob(infile))

# resolve level
primermap, resolved_level, resolved_primerfile = resolve_level(
    expanded_infiles[0], primerfile=args.primerfile, level=args.level)

# load counts
print('loading counts')
if resolved_level == 'fragment':
    counts_superdict = {infile: load_primer_counts(infile, primermap)
                        for infile in expanded_infiles}
elif resolved_level == 'bin':
    counts_superdict = {infile: load_counts(infile)
                        for infile in expanded_infiles}
else:
    raise ValueError('invalid level')

though of course there may be other repeated sequences
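
one way the sequence above could be pulled into a single helper is sketched below. this is only a sketch under assumptions: the names `expand_infiles` and `load_counts_superdict` are hypothetical, the empty-match check is an addition not present in the original sequence, and `resolve_level`, `load_primer_counts`, and `load_counts` are passed in as arguments purely so the block runs standalone (in the toolbox they would presumably be the existing functions of the same names).

```python
import glob


def expand_infiles(patterns):
    """Expand a list of glob patterns into a flat list of matching paths."""
    expanded = []
    for pattern in patterns:
        expanded.extend(glob.glob(pattern))
    return expanded


def load_counts_superdict(patterns, resolve_level, load_primer_counts,
                          load_counts, primerfile=None, level=None):
    """Hypothetical helper wrapping the expand/resolve/load sequence.

    The three callables are injected only to keep this sketch
    self-contained; they stand in for the existing toolbox functions.
    """
    # expand infiles
    expanded_infiles = expand_infiles(patterns)
    if not expanded_infiles:
        # assumption: failing early is preferable to an IndexError below
        raise ValueError('no input files matched')

    # resolve level using the first input file, as in the original sequence
    primermap, resolved_level, resolved_primerfile = resolve_level(
        expanded_infiles[0], primerfile=primerfile, level=level)

    # load counts with the loader that matches the resolved level
    if resolved_level == 'fragment':
        counts_superdict = {infile: load_primer_counts(infile, primermap)
                            for infile in expanded_infiles}
    elif resolved_level == 'bin':
        counts_superdict = {infile: load_counts(infile)
                            for infile in expanded_infiles}
    else:
        raise ValueError('invalid level: %r' % (resolved_level,))

    return counts_superdict, primermap, resolved_level, resolved_primerfile
```

with something like this in place, each tool would replace the whole block with one call that passes `args.countsfiles`, `args.primerfile`, and `args.level`.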

in general, a *_tool() function should do the following steps:

1. imports
2. calls to helpers (parallelization, loading from disk, discerning labels)
3. custom code to convert the exposed API (command line flags) to the upstream API (kwargs on the high-level scripting function)
4. calls to high-level scripting functions (may be nested in logic ladders that depend on the command line flags)
5. write plots to disk (if this is a plotting tool and it is obeying the new-style plotting API, where high-level plotting functions return axes and do not actually save the figure to disk)

if any other kind of logic is being performed in a *_tool() function, it should either be extracted as a helper, or the high-level scripting function should be refactored to simplify its API
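
as a rough illustration of that shape, here is a minimal non-plotting sketch. everything in it is hypothetical: `example_tool`, `load_counts_helper`, `scale_counts`, and the flags `factor`, `clip`, and `clip_value` are invented for the example, and the single-argument signature is an assumption rather than the toolbox's actual convention.

```python
import argparse


def scale_counts(counts, factor=1.0, clip=None):
    """Stand-in for a high-level scripting function (hypothetical)."""
    scaled = [c * factor for c in counts]
    if clip is not None:
        scaled = [min(c, clip) for c in scaled]
    return scaled


def load_counts_helper(args):
    """Stand-in for a shared helper that would load data from disk."""
    return [1.0, 2.0, 3.0]


def example_tool(args):
    # imports: a real tool would import its dependencies here

    # calls to helpers
    counts = load_counts_helper(args)

    # convert the exposed API (command line flags) to the upstream API
    # (kwargs on the high-level scripting function)
    kwargs = {'factor': args.factor}
    if args.clip:
        kwargs['clip'] = args.clip_value

    # call the high-level scripting function
    return scale_counts(counts, **kwargs)


# usage: build a Namespace the way argparse would after parsing flags
args = argparse.Namespace(factor=2.0, clip=True, clip_value=5.0)
result = example_tool(args)
```

the only logic left in the tool body is the flag-to-kwargs translation; anything more involved than that would be a candidate for a helper.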
