A simple C library for finding nearest neighbors of positions in 3D space.
The code on its own is mostly ignorable. Likely, with some additional discussion it provides something of value.
The Creation of Nrnr, Part 1
Several weeks ago I had some ideas on how to find nearest neighbors in a collection of points in 3D space. For many visual effects studios this means using something like a kdtree.
A long, long time ago I was working at a place where we need to combine all points in the same position into the same point. We had written the straightforward approach of comparing all points with each other. This turned out to be slow. At the time we were using a piece of commercial software, Prisms from Side Effects Software, that had the same operation, but ran orders of magnitude faster than our own. The wise +Mark Elendt explained the secret; Prisms would sort the points along the first dimension.
Driving in to work recently, I remembered these interactions and wondered what if you were to do the same thing across all three axis. Hopeful, but cautious, I began creating some experiments that would, at best, end the interest and use of kdtree structures for all time.
The end result is a cleaned up codebase I named "nrnr" for nearest neighbor searching. Unfortunately, this is not appear to revolutionize anything. There are some cool highlights. I learned a lot along the way, which was my main motivation for packaging up the final code and writing a series of posts that descrive the journey.
The Creation of Nrnr, Part 2
There were going to be three main parts to my algorithm. First, sort the points along each axis. Second, find the ranges of possibly valid points for each axis. Finally, find the intersecting points across each axis.
Step one was the sort. This was simple, and my first implementations used the stdlib qsort. Later I moved to a hand written quicksort that switched to an insertion sort once the range of numbers got small enough. The custom sort did not turn out to be a large improvement over the stock qsort.
One big win I did was to play around with OpenMP. Sorting each axis can be done independently. A few quick pragma statements and I was creating my data structure three times faster. The OpenMP thing was just for curiousity, and the final versions of the source run in a boring, single thread.
The Creation of Nrnr, Part 3
Once I had sorted the positions along each dimension I could begin the search for positions inside an area. The first step of the search is to find the ranges of points to use along each dimension.
It turns out the straightforward bisect is all that is needed. Give it 10 million points and bisect gets you the range in "free time". I was expecting to figure out some clever techniques to keep the bisect fast, but it simply is not an issue. At the late stages I added a few optimizations that help it do less searching when bisecting for the last index, after if had found the first index.
In the end, bisecting was easy and did not present room for optimizations. I did have to work with it a bit to ensure it was really finding the right "floor" and "ceiling" indices for the range I wanted.
Later in Nrnr's development the pool of points was split into smaller sections. When there got to be thousands of sections the bisection time did start to show up in measurements. I tried to add several early-outs, but the evenly distributed random dataset I had managed to artfully avoid those.
There are many areas of Nrnr that greatly benefit by splitting the point array into smaller sections. In those cases, the bisect is the biggest bottleneck. If I ever find some inspiration or algorithm I may attack the bisection again some day.
The Creation of Nrnr, Part 4
So now Nrnr has a range of sorted point indices across each dimension. Time to find the intersection of all these ranges. Then we'll know which points are inside the search area.
My first implementation sorted each of those ranges then marched through them looking for matches, incrementing the smallest index. This was several times too slow, so I needed an alternate solution.
What I ended up with was a second set of arrays that allowed me to find the index of each point inside the next axis. So, if at index 5 of the first dimension we had a reference to point 400, I had an array that told me which index contained point 400 on the next dimension. Since each dimension knew the first and last index inside its range, I could step through an index across each dimension, stopping as soon as the index was outside the range.
It turns out when dealing with ten million points, the biggest cost in performance in all this is each memory lookup into the arrays. With a wide search area I was getting several hundred thousands of matches across each dimension. The intersection of these ended up being a few tens of thousands of points. Originally it was hard to beat iterating across the dimension with the fewest points, and then jumping directly to look up the position to see if it fell inside the range of interest.
By incresing the required memory a bit, I could generate the index of any index on any dimension to all the other dimensions. Or only keep the index from the first dimension to all the others. That actually might be something worth trying, but could end up with more work to do if the first dimension happened to have an abnormally large number of the points inside its search area.