KDE Bandwidth and Resolution Configuration

Issue #14 closed
Daniel Marsh-Patrick repo owner created an issue

We need to see whether we can calculate an optimal KDE bandwidth and resolution or, failing that, at least allow end users to configure them. If we get this wrong, the plot may not show the distribution of the data correctly.

Comments (8)

  1. Daniel Marsh-Patrick reporter

    We might be able to do this with science.js, as it has some KDE functions and bandwidth derivation. However, we may still wish to allow the end user to tailor this.

  2. Daniel Marsh-Patrick reporter

    I'll add some basic number inputs for bandwidth and resolution to begin with, and then start experimenting with ways we could suggest values for them.

  3. Daniel Marsh-Patrick reporter

    #14: Refactored kernels so that the bandwidth adjustments are done in the main KDE function; Set default kernel to Epanechnikov; Added triweight kernel.

    Resolves #14

    → <<cset 3b578aa395ee>>
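As a rough illustration of the refactoring described in this comment, the sketch below keeps each kernel as a pure function of the standardized distance u and applies the bandwidth only inside the main KDE function. It is not the code from the changeset: the names `KernelName`, `kernels`, `DensityPoint` and `kde` are hypothetical, and only the kernel formulas themselves are standard.

```typescript
// Hypothetical names throughout; only the kernel formulas are standard.
type KernelName = "epanechnikov" | "gaussian" | "quartic" | "triweight";
type Kernel = (u: number) => number;

// Each kernel is defined on the standardized distance u = (x - xi) / h,
// so none of them needs to know about the bandwidth h.
const kernels: Record<KernelName, Kernel> = {
    epanechnikov: u => (Math.abs(u) <= 1 ? 0.75 * (1 - u * u) : 0),
    gaussian: u => Math.exp(-0.5 * u * u) / Math.sqrt(2 * Math.PI),
    quartic: u => (Math.abs(u) <= 1 ? (15 / 16) * Math.pow(1 - u * u, 2) : 0),
    triweight: u => (Math.abs(u) <= 1 ? (35 / 32) * Math.pow(1 - u * u, 3) : 0)
};

interface DensityPoint {
    x: number;
    y: number;
}

// The bandwidth adjustment happens here, once, for whichever kernel is passed in:
// density(x) = 1 / (n * h) * sum over i of K((x - xi) / h)
function kde(kernel: Kernel, bandwidth: number, samplePoints: number[], values: number[]): DensityPoint[] {
    return samplePoints.map(x => {
        let sum = 0;
        for (const v of values) {
            sum += kernel((x - v) / bandwidth);
        }
        return { x, y: sum / (values.length * bandwidth) };
    });
}
```

With this shape, adding a kernel such as triweight only means adding one entry to the lookup; the scaling by the bandwidth stays in one place.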

  4. Daniel Marsh-Patrick reporter

    This has been resolved as follows:

    • Resolution has been modified to an enum that offers the following factors:
      • 10 - Low
      • 25 - Standard (Default)
      • 50 - High
      • 100 - Very High
    • The above are fed into the axis ticks function to produce an array of sufficient length to offer reasonable changes to the resolution (based on 2 sets of test data I'm using; we may need to revisit later with more profiling)
    • Have researched a number of articles on KDE bandwidth selection and have implemented an auto-bandwidth selection based on Silverman's rule-of-thumb, with factors for the following kernels (which have been added as additional properties in the pane):
      • Epanechnikov (default)
      • Gaussian
      • Quartic
      • Triweight
    • I've added a Specify Bandwidth setting, which will allow the user to manually override with a number of their choosing
    • Because of this, I've opted to calculate the optimal bandwidth over the entirety of the selected data set, rather than per group, as it's difficult to apply a manual override to individual series at this time. We can review this later if necessary.
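The approach summarised in the list above could look roughly like the sketch below. Several parts are assumptions, because the thread does not give the details: the per-kernel factors are approximate canonical-bandwidth ratios relative to the Gaussian kernel, not necessarily the factors used in the visual; Silverman's rule is taken in its common form 0.9 · min(σ, IQR / 1.34) · n^(-1/5); the resolution value is treated as a requested number of sample points; and an evenly spaced grid stands in for the axis ticks function. All identifiers are hypothetical.

```typescript
// Kernels listed in the thread. All identifiers here are hypothetical.
type KernelChoice = "epanechnikov" | "gaussian" | "quartic" | "triweight";

// ASSUMPTION: approximate canonical-bandwidth ratios relative to the Gaussian
// kernel. The thread does not state which factors the visual actually uses.
const kernelFactor: Record<KernelChoice, number> = {
    gaussian: 1,
    epanechnikov: 2.214,
    quartic: 2.623,
    triweight: 2.978
};

// Resolution options from the thread, ASSUMED to be a requested point count.
const resolution = { low: 10, standard: 25, high: 50, veryHigh: 100 };

// Linear-interpolation quantile over an already sorted array.
function quantile(sorted: number[], p: number): number {
    const pos = (sorted.length - 1) * p;
    const lo = Math.floor(pos);
    const hi = Math.ceil(pos);
    return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Silverman's rule-of-thumb, scaled by the assumed kernel factor.
function silvermanBandwidth(values: number[], kernel: KernelChoice): number {
    const n = values.length;
    if (n < 2) {
        throw new Error("Need at least two values to estimate a bandwidth");
    }
    const mean = values.reduce((a, b) => a + b, 0) / n;
    const variance = values.reduce((a, b) => a + Math.pow(b - mean, 2), 0) / (n - 1);
    const sd = Math.sqrt(variance);
    const sorted = values.slice().sort((a, b) => a - b);
    const iqr = quantile(sorted, 0.75) - quantile(sorted, 0.25);
    // Fall back to the standard deviation alone if the IQR collapses to zero.
    const spread = iqr > 0 ? Math.min(sd, iqr / 1.34) : sd;
    return kernelFactor[kernel] * 0.9 * spread * Math.pow(n, -0.2);
}

// A positive user-specified bandwidth overrides the automatic estimate.
function resolveBandwidth(values: number[], kernel: KernelChoice, specified?: number): number {
    return specified !== undefined && specified > 0 ? specified : silvermanBandwidth(values, kernel);
}

// Evenly spaced grid; a stand-in for the axis ticks function (count >= 2).
function samplePoints(min: number, max: number, count: number): number[] {
    const step = (max - min) / (count - 1);
    const points: number[] = [];
    for (let i = 0; i < count; i++) {
        points.push(min + i * step);
    }
    return points;
}

// One bandwidth for the whole selected data set, not one per group.
const groups: number[][] = [[1, 2, 3], [4, 5]];
const allValues = groups.reduce((acc, g) => acc.concat(g), [] as number[]);
const autoBandwidth = resolveBandwidth(allValues, "epanechnikov");
const manualBandwidth = resolveBandwidth(allValues, "epanechnikov", 0.5);
const grid = samplePoints(1, 5, resolution.standard);
```

Computing the bandwidth once over the flattened values mirrors the decision in the last bullet: a single manual override then maps cleanly onto a single automatic value, at the cost of ignoring differences in spread between groups.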