use Numpy nan-functions in Aggregate raster layer bands algorithm.

Issue #977 resolved
Andreas Janz created an issue

Requested by @Florian Katerndahl .

We could just always use the NaN functions (e.g. np.nanmean) instead of the standard functions (e.g. numpy.mean), but the numpy.nanpercentile function is implemented poorly and iterates over each pixel in Python, which is very slow.

I would introduce a checkbox “Exclude no data values”, which is checked per default and will use the nan-functions. If it is unchecked, we use the standard functions.

In the long run, we should reimplement the np.nanpercentile function, which is possible using standard numpy functions. I've already done it in the past, but didn’t have an implementation right now.

Comments (7)

  1. Andreas Janz reporter

    In the long run, we should reimplement the np.nanpercentile function, which is possible using standard numpy functions. I've already done it in the past, but didn’t have an implementation right now.

    I found it 🙂

    def nanpercentiles(arr, percentiles, copy=True):
    
        def zvalue_from_index(arr, ind):
            nB, nL, nS = arr.shape
            idx = nS * nL * ind + nS * numpy.arange(nL)[:, None] + numpy.arange(nS)[None, :]
            return numpy.take(arr, idx)
    
        if copy:
            arr = arr.copy()
    
        # valid (non NaN) observations along the first axis
        valid_obs = numpy.sum(numpy.isfinite(arr), axis=0)
        invalid_pixel = valid_obs == 0
    
        arr[numpy.isnan(arr)] = numpy.Inf
    
        # sort - former NaNs will move to the end
        arr = numpy.sort(arr, axis=0)
    
        result = []
        for percentile in percentiles:
    
            # desired position as well as floor and ceiling of it
            k_arr = (valid_obs - 1) * (percentile / 100.0)
            f_arr = numpy.floor(k_arr).astype(numpy.int32)
            c_arr = numpy.ceil(k_arr).astype(numpy.int32)
    
            # linear interpolation (like numpy percentile) takes the fractional part of desired position
            floor_value = zvalue_from_index(arr=arr, ind=f_arr)
            floor_weight = (c_arr - k_arr)
            ceil_value = zvalue_from_index(arr=arr, ind=c_arr)
            ceil_weight = (k_arr - f_arr)
            floor_weight[f_arr == c_arr] = 1.  # if floor == ceiling take floor value
    
            quant_arr = floor_value*floor_weight + ceil_value*ceil_weight
    
            # fill invalid pixels with fill value
            quant_arr[invalid_pixel] = numpy.NaN
    
            result.append(quant_arr[None])
    
        return result
    

  2. Log in to comment