implement "sample size" option in "Create unsupervised dataset (from feature raster)" algorithm

Issue #1062 resolved
Andreas Janz created an issue

Comments (9)

  1. Andreas Janz reporter

    Note that the given sample size is only an approximate upper limit: no-data pixels will further decrease the sample size. I will update the description to reflect that.

  2. Agustin Lobo

    Yes, good idea to point this out to the user. It would also be nice to get the number of samples actually used in the result.

  3. Andreas Janz reporter

    The actual sample size/shape [nsamples, nfeatures] should be printed in the log and is also available here:

  4. Agustin Lobo

    When I do this in R with n samples requested, I generate 1.5*n random points and discard those with NA values. If fewer than n points remain, I generate 2*n points and filter again (this never occurs in practice, as I never have that many NA values). As soon as the filtered number of points is >= n, I take the first n points.

    But this could be unnecessary: if the user sets n and gets a much lower final number, he/she can just re-run with a higher n.
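
The oversample-and-filter idea described in comment 4 can be sketched roughly as follows. This is only an illustration in Python/NumPy, not the code used by the algorithm or by the R workflow mentioned above. The function name `sample_valid_pixels`, the `[bands, rows, cols]` array layout, and the single finite `no_data` value are all assumptions made for the example.

```python
import numpy as np

def sample_valid_pixels(raster, n, no_data, factor=1.5, seed=None):
    """Hypothetical helper: return up to n random pixels without no-data values.

    raster is assumed to have shape [bands, rows, cols]; the result has
    shape [nsamples, nfeatures].
    """
    rng = np.random.default_rng(seed)
    bands, rows, cols = raster.shape
    total = rows * cols
    flat = raster.reshape(bands, total)
    while True:
        # Draw more candidates than requested (1.5*n first, then 2*n, ...),
        # but never more than the raster contains.
        k = min(total, int(np.ceil(factor * n)))
        idx = rng.choice(total, size=k, replace=False)
        # Keep only candidates where no band equals the no-data value.
        valid = idx[np.all(flat[:, idx] != no_data, axis=0)]
        if valid.size >= n or k == total:
            # Take the first n valid points; fewer are returned only if
            # the whole raster does not contain n valid pixels.
            return flat[:, valid[:n]].T
        factor += 0.5
```

With this approach the returned array has exactly n rows whenever the raster holds at least n valid pixels. The cost is one or more extra random draws, which is why simply re-running with a higher n may be good enough in practice.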

  5. Andreas Janz reporter

    Yeah, I also thought about being a bit more clever to get an exact number of samples, but I haven’t done it so far.

    For now, I use the exact same logic as QGIS, when it calculates BandStatistics with a given SampleSize:
