interval histogramming

Issue #3 resolved
Justin Mason created an issue

Hi there -- this looks great, and thanks for implementing it!

I have one feature request, however. Would it be possible to support HdrHistogram's interval recording mode?

As I describe at http://taint.org/2014/01/16/145944a.html , IMO the default way that Metrics records Histograms and Timers is incorrect when publishing to a time-series store like Graphite or Ganglia, since it incorrectly "smears" recorded data beyond the time point when it actually occurred, and a more correct approach is to restart with an empty histogram at the start of each interval, once the previous interval has been recorded. As far as I can tell, HdrHistogram.getIntervalHistogram() supports this nicely -- and if HdrHistogramReservoir was to omit tracking the runningTotals, it would too. ;)

Perhaps a nice way to do this would be to provide a second Reservoir implementation which implemented this behaviour -- HdrHistogramIntervalReservoir, or something like that. WDYT?

I'd be happy to provide a pull request, if you're interested.

Comments (8)

  1. Marshall Pierce repo owner

    I've been thinking about the relationship between gathering samples and the storage/display/alerting system that uses them for some time, and I'm rather unhappy with the lack of flexibility we have right now in Metrics (along with the annoying inefficiencies forced upon us by the Reservoir interface, ah well). Maybe I'll do something about it for Metrics 4...

    Anyway, to your point: yes, I think it's a great idea to give people the choice to have a reservoir that resets on each snapshot. I do wish that there was a better axis upon which to control that than by simply plugging in a different reservoir implementation, but that's what we have right now.

    See what you think of https://bitbucket.org/marshallpierce/hdrhistogram-metrics-reservoir/branch/feature/reset-on-snapshot#diff. Is this what you're after? Do you have any ideas for better names? And of course I'll write some tests when I have a little more time.

  2. Justin Mason reporter

    Awesome, that looks perfect ;) Thanks! (One minor quibble -- updateRunningTotals() no longer does what the method name says it does, so could do with a rename. also, s/percentage/percentile/ in the javadoc comment.)

    I agree that putting it into the reservoir is slightly imperfect -- it really should be under the control of the Reporter. But it at least works now ;) I guess that could be considered for the Metrics 4 timeframe alright.

    Regarding naming, I can't come up with anything better, really. Maybe "HdrHistogramResetOnReportReservoir", perhaps? For a Metrics user, they interact with this in terms of how frequently the Reporter calls it. The fact that the Reporter calls getSnapshot() under the covers is not visible to the user.

    cheers!

  3. Marshall Pierce repo owner

    Good catch on the confusing name. Fixed.

    I think I prefer to keep the name more descriptive of what it does vs how it could be used so as to not be confusing in other use cases; people probably shouldn't be monkeying in the reservoir -> reporter flow like this unless they know what snapshots are and how they're used. :)

    I've deployed 1.0.2-SNAPSHOT to the OSS snapshots repo if you care to give it a try.

  4. Log in to comment