Commits

Anonymous committed d163f9e

Benchmarks for Ruby vs. Rubinius array manipulation

Comments (0)

Files changed (6)

+This small project provides benchmarking for various approaches to
+implementing the following function:
+
+Array#topn(n) { |a,b| } => ary
+
+where	n	is the maximum number of elements to select
+	ary	is an array containing the top n elements
+
+The block should return:
+-1 if a is ahead of b
+ 0 if they are equal
+ 1 if a is after b
+
+The returned array does not have to be sorted itself, it merely
+must contain the top n items.
+
+I've supplied three versions of this method:
+
+topn0.rb	the obvious sort and slice
+topn1.rb	collect only the top n by insertion sort
+topn2.rb	collect only the top n, without sorting
+
+If also provided the following two benchmark harnesses:
+
+test_ruby.rb	uses the 'profiler' module for Ruby
+test_rbx.rb	uses the -Xprofile option for Rubinius
+
+Invoke either of these at the command line, specifying the
+version of the method to use as the first argument.  An optional
+second argument is the number of top items to collect (default = 15),
+and the optional third argument is the maximum size of the array to
+test (default = 100).
+
+Using Ruby 1.8.7, topn0.rb beats the others hands down in my testing.
+Surprisingly, with Rubinius this approach is the slowest of the three.
+Apparently, Rubinius' quicksort algorithm needs improving.
+#!/usr/local/bin/rbx -Xprofile
+# Script to test various flavors of Array#topn
+
+require "#{ARGV[0]}"
+
+TOP_N = (ARGV[1] || 15).to_i
+MAX_LENGTH = (ARGV[2] || 100).to_i
+srand(1337)	# Always use the same sequence
+arrays = []
+(0..MAX_LENGTH).each do |len|
+  ary = []
+  len.times { ary << rand(MAX_LENGTH) }
+  arrays << ary
+end
+
+arrays.each { |ary| ary.topn(TOP_N) { |a,b| b <=> a } }
+#!/usr/bin/env ruby
+# Script to test various flavors of Array#topn
+
+require 'profiler'
+require "#{ARGV[0]}"
+
+TOP_N = (ARGV[1] || 15).to_i
+MAX_LENGTH = (ARGV[2] || 100).to_i
+srand(1337)	# Always use the same sequence
+arrays = []
+(0..MAX_LENGTH).each do |len|
+  ary = []
+  len.times { ary << rand(MAX_LENGTH) }
+  arrays << ary
+end
+
+Profiler__::start_profile
+arrays.each { |ary| ary.topn(TOP_N) { |a,b| b <=> a } }
+Profiler__::stop_profile
+Profiler__::print_profile($stderr)
+# The obvious 'sort and slice' approach
+class Array
+  def topn(n, &block)
+    self.sort(&block)[0,n]
+  end
+end
+# Collect only the top n so far, by keeping the array of the top n sorted from worst to best.
+class Array
+  def topn(n, &block)
+    self.inject([]) do |ary, elem|
+      ary.shift if (ary.length >= n) && (block.call(elem, ary.first) < 0)
+      if (ary.length < n)
+        max = ary.length - 1
+	if max > 0
+	  place = ary.each_with_index do |item, ndx|
+	    break ndx if block.call(elem, item) < 0
+	    break ndx+1 if ndx == max
+	  end
+	else
+	  place = 0
+	end
+	ary.insert place, elem
+      end
+      ary
+    end
+  end
+end
+# Unsorted approach to keeping only the top n, by keeping track of the worst case
+# and recomputing if needed.
+class Array
+  def topn(n, &block)
+    self.inject([[],nil,nil]) do |results, elem|
+      ary, worst, windex = results
+      if worst
+        if ary.length < n
+	  ary << elem
+	  if block.call(elem,worst) >= 0
+	    worst = elem
+	    windex = ary.length - 1
+	  end
+	else
+	  if block.call(elem,worst) < 0
+	    ary[windex] = worst = elem
+	    ary.each_with_index do |itm,ndx|
+	      if block.call(itm,worst) > 0 
+	        worst = itm
+		windex = ndx
+	      end
+	    end
+	  end
+	end
+      else
+        ary = [worst = elem]
+	windex = 0
+      end
+      [ary, worst, windex]
+    end.first
+  end
+end
Tip: Filter by directory path e.g. /media app.js to search for public/media/app.js.
Tip: Use camelCasing e.g. ProjME to search for ProjectModifiedEvent.java.
Tip: Filter by extension type e.g. /repo .js to search for all .js files in the /repo directory.
Tip: Separate your search with spaces e.g. /ssh pom.xml to search for src/ssh/pom.xml.
Tip: Use ↑ and ↓ arrow keys to navigate and return to view the file.
Tip: You can also navigate files with Ctrl+j (next) and Ctrl+k (previous) and view the file with Ctrl+o.
Tip: You can also navigate files with Alt+j (next) and Alt+k (previous) and view the file with Alt+o.