can distribution overlay plotting be simplified by using sns.FacetGrid?

Issue #29 new
Thomas Gilgenast created an issue

currently in lib5c.plotters.distribution.plot_global_distributions() we use a for loop like this to overlay distributions:

    for rep in reps:
        sns.kdeplot(
            flattened_counts[rep],
            shade=shade,
            color=colors[levels[labels[rep]]],
            label=labels[rep]
        )

this already feels like a hack, and it also makes legend construction much more complicated:

    legend_labels = list(set(levels.values())) if hue_order is None \
        else hue_order
    legend_handles = [mpatches.Patch(color=colors[l]) for l in legend_labels]
    plt.legend(legend_handles, legend_labels, scatterpoints=1, loc='upper left',
               bbox_to_anchor=(1, 1.05))

we wonder if it might be possible to use sns.FacetGrid to simplify this, for example:

    df = pd.DataFrame([{'count': count, 'label': labels[rep]}
                       for rep in reps
                       for count in flattened_counts[rep]])
    g = sns.FacetGrid(df, hue='label', hue_order=hue_order,
                      palette={label: colors[levels[label]]
                               for label in labels.values()})
    g.map(sns.kdeplot, 'count', shade=shade)
    plt.legend(loc='upper left', bbox_to_anchor=(1, 1.05))

this calls into question how custom legend creation and hue_order work in lib5c's reps/labels/levels/colors convention: custom legend elements are made per level, not per rep, and these elements are ordered by hue_order, which is a list of levels

in practice, this behavior is actually being overridden in clients like comparison-manuscript's draw_distribution.py, which removes the legend and then manually adds one with per-replicate entries

the change proposed in the snippet above only works if hue_order is a list of labels (e.g., hue_order = labels.values() works). colors can still map levels to colors to allow for clustering-style visualizations; only hue_order poses a problem here
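
one possible way to bridge the two conventions is to expand a level-based hue_order into a label-based one before handing it to sns.FacetGrid. the sketch below is only an illustration: expand_hue_order is a hypothetical helper (not part of lib5c), and it assumes that labels maps reps to labels and that levels maps labels to levels, as the colors[levels[labels[rep]]] lookup above suggests:

```python
def expand_hue_order(reps, labels, levels, hue_order=None):
    """Hypothetical helper: turn a hue_order given as a list of levels into a
    list of labels, keeping labels that share a level next to each other.

    Assumes labels maps rep -> label and levels maps label -> level.
    """
    # unique labels, in the order their reps were passed
    ordered_labels = list(dict.fromkeys(labels[rep] for rep in reps))
    if hue_order is None:
        return ordered_labels
    # rank each level by its position in hue_order; sorted() is stable, so
    # labels within one level keep their original relative order
    rank = {level: i for i, level in enumerate(hue_order)}
    return sorted(ordered_labels, key=lambda label: rank[levels[label]])


reps = ['a1', 'a2', 'b1']
labels = {'a1': 'A rep1', 'a2': 'A rep2', 'b1': 'B rep1'}
levels = {'A rep1': 'A', 'A rep2': 'A', 'B rep1': 'B'}
label_order = expand_hue_order(reps, labels, levels, hue_order=['B', 'A'])
```

the resulting label_order could then be passed as hue_order to sns.FacetGrid, so that clients could keep specifying hue_order in terms of levels.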

Comments (2)

  1. Thomas Gilgenast reporter

    we considered removing the requirement that clients following the reps/labels/levels/colors convention had to manually construct colors using the following code block:

        if colors is None:
            unique_levels = list(set(levels.values()))
            palette = sns.color_palette('husl', len(unique_levels))
            colors = {unique_levels[i]: palette[i]
                      for i in range(len(unique_levels))}
    

    if colors was not passed, we could construct the sns.FacetGrid with palette={label: colors[levels[label]] for label in labels.values()} if colors else None, so that seaborn falls back to its default palette when colors is None. however, this color inference is still necessary at the tool level when calling these plotting tools with the -s/--separate_colors flag, where we would need to override seaborn's default color palette construction, which would assign different reps in the same level to different colors. this could be handled at the tool level by defining the following function in lib5c.util.plotting:

    def make_default_palette(levels):
        """
        Creates a palette dict (mapping level names to colors) from a list of levels
        by assigning them random evenly spaced colors from the husl color space.
    
        Parameters
        ----------
        levels : list of str
            The list of levels to assign colors for.
    
        Returns
        -------
        dict
            The palette dict, mapping level names to colors.
        """
        unique_levels = list(set(levels))
        palette = sns.color_palette('husl', len(unique_levels))
        return dict(zip(unique_levels, palette))
    

    importantly, this could move the responsibility of automatic inference of colors to tool-level clients which would inherit this responsibility as a result of supporting the -s/--separate_colors flag

    in the end, however, it seems to be useful for any client - not just tool-level clients - to have the option to accept only levels and auto-assign colors from there (otherwise we might as well just get rid of levels entirely if the client always has to hand-construct and pass both kwargs)
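
    a minimal sketch of what "accept only levels and auto-assign colors" could look like is shown below. the function names are hypothetical, and to keep the example free of third-party imports it uses evenly spaced hues from the standard library's colorsys module as a stand-in for sns.color_palette('husl', n); it also sorts the levels so that the assignment is deterministic, which the set-based code above is not. it assumes labels maps reps to labels and levels maps labels to levels:

```python
import colorsys


def make_level_colors(levels, lightness=0.6, saturation=0.65):
    """Hypothetical stand-in for a husl palette: map each unique level to an
    RGB tuple with evenly spaced hues (HLS space, not husl)."""
    unique_levels = sorted(set(levels))  # sorted for a deterministic mapping
    n = len(unique_levels)
    return {level: colorsys.hls_to_rgb(i / n, lightness, saturation)
            for i, level in enumerate(unique_levels)}


def make_label_palette(labels, levels, colors=None):
    """Build a per-label palette dict, inferring per-level colors when the
    client did not pass any."""
    if colors is None:
        colors = make_level_colors(levels.values())
    # every label inherits the color of its level
    return {label: colors[levels[label]] for label in labels.values()}


labels = {'a1': 'A rep1', 'a2': 'A rep2', 'b1': 'B rep1'}
levels = {'A rep1': 'A', 'A rep2': 'A', 'B rep1': 'B'}
inferred = make_label_palette(labels, levels)
explicit = make_label_palette(labels, levels, colors={'A': 'red', 'B': 'blue'})
```

    with something like this, a client that only knows levels would get one color per level shared by all of its reps, while a client that passes colors explicitly keeps full control.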
