Issue #675 new

SoundData samples starting at 1 (and also ImageData pixels)

created an issue

I'm thinking that SoundData samples should start at 1 instead of 0, and I'm actually kinda thinking that ImageData pixels should too.

First I thought about SoundData. It seemed a bit strange that the first sample is number 0. Wouldn't it be more natural if SoundData:getSample(1) returned the first sample, and SoundData:setSample(2) set sample number 2, the second sample? And it seems to be the Lua way to be 1-indexed. Contrast! :D

-- Proposed: sample positions start at 1
samples = {-1, 0, 1}

for i = 1, SoundData:getSampleCount() do
    SoundData:setSample(i, samples[i])
end

-- Current: sample positions start at 0
samples = {-1, 0, 1}

for i = 0, SoundData:getSampleCount() - 1 do
    SoundData:setSample(i, samples[i + 1])
end

So, that seems to make sense to me. But then I thought... what about ImageData?

I remembered issue #248, where this was suggested. The conclusion seemed to be that ImageData arguments are coordinates, not indices. But I wonder... are they? :D

I should mention that it totally makes sense to me why the coordinate system has an origin of (0, 0). It's so natural, (1.2, 0) is 1.2 units to the right, and (-1.2, 0) is 1.2 units to the left.

However, I'm not sure if ImageData pixels are really a coordinate system.

Coordinates can be:

  • Negative
  • Real numbers and not just integers
  • Transformed (scaled, rotated, etc.)

ImageData pixels aren't like this, they're referring to pixels, and I think they're closer to indices than coordinates.

As well as being more natural to use with Lua (as noted in #248), I think it's also more natural to think about. ImageData:getPixel(3,4) would return pixel number 3 horizontally and pixel number 4 vertically. It's like how a matrix might be represented with Lua tables, grid[3][4].

I can of course see how ImageData pixel positions are like coordinates, and I can also see how SoundData sample positions are like coordinates too; audio editing/playing software starts the "coordinate" position at 0. But, I think pixels/samples are different, they're not "points in space", they're more like "entries in a table" or "references to entities". Like, SoundData:getSample(1) would be the sample at "index" 1, and ImageData:getPixel(1, 1) would be the pixel at horizontal "index" 1 and vertical "index" 1. Actually I think I'm misusing the term "index" here (and, y'know, every other term I try to use), because referring to a pixel requires two values, but, yeah, I hope what I'm trying to say is still kind of understandable. :P
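
To make the two conventions concrete, here's a minimal sketch in plain Lua. It uses a made-up grid table rather than a real ImageData (the names newGrid, getPixel0, and getPixel1 are hypothetical, not LÖVE API), storing pixels the way a grid[3][4] style Lua matrix would and exposing both a 0-based and a 1-based accessor.

```lua
-- Hypothetical stand-in for ImageData; not part of the LÖVE API.
local function newGrid(width, height)
    local grid = { width = width, height = height }
    for x = 1, width do
        grid[x] = {}
        for y = 1, height do
            -- Store a value that encodes the 1-based position.
            grid[x][y] = (y - 1) * width + x
        end
    end
    return grid
end

-- 0-based access, as ImageData:getPixel behaves today.
local function getPixel0(grid, x, y)
    return grid[x + 1][y + 1]
end

-- 1-based access, as proposed in this issue.
local function getPixel1(grid, x, y)
    return grid[x][y]
end

local grid = newGrid(4, 3)
-- Both calls refer to the same top-left pixel.
print(getPixel0(grid, 0, 0), getPixel1(grid, 1, 1))
```

Either way the top-leftmost pixel is the same pixel; only the number used to reach it changes.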

Comments (7)

  1. Alex Szpakowski

    I think it might make sense for SoundData to be 1-based in Lua.

    It does not make sense for ImageData. Having no easy way to transform them with built-in methods does not make pixel coordinates any less pixel coordinates, same deal goes for integers versus floats.

  2. hahawoo reporter

    I see! I'm still having a bit of trouble wrapping my head around it though. SoundData could make sense being 1-based in Lua because it's like an array, right? Like, sample N is the Nth sample, like a table. But why doesn't this make sense for ImageData, like X,Y is the pixel at the Xth column and Yth row, like a 2D table? If... if that made any sense? :P

  3. Bart van Strien

    Because for ImageData they are very much coordinates. Even if you do not agree with me directly on that, their sole use is being turned into Images, where they directly correspond to the coordinates in the Image. Would it not be weird if suddenly everything got translated by (-1,-1)?

  4. hahawoo reporter

    Okay, so, pixel coordinates are coordinates too. :D

    But, would you agree that pixel coordinates and the global coordinate system are conceptually quite distinct?

    Do Image pixel coordinates directly correspond to the coordinate system? Wouldn't everything be translated by (-0.5, -0.5), with the Image's pixels placed in the middle of the screen pixels, judging by how lines work? (A diagram by Boolsheet illustrates this.) And what if the drawing was transformed? Is there a time when pixel coordinate numbers not lining up with coordinate system numbers causes things to be confusing?

    It seems to me that pixel coordinates are actually array-like, rather than LÖVE-global-coordinate-system-like. And, I'd suggest that the starting point for counting them is arbitrary; the top-leftmost pixel will always be the top-leftmost pixel. The concept of an "origin" doesn't really make sense for an image, right?

    I think Images and Sources are really similar like this. For a sound, there is a dimension of "seconds", and there are the actual samples. If you say Source:seek(100), what sample will that be? Well, it'll depend on the sample rate. But the first sample of the sound is always the first sample of the sound, and whether you reference it by 0 or 1 or "first", it's referring to an actual sample value, not a "position". Y'know what I mean?

  5. vrld

    Two cents on pixel indexing:

    Not every coordinate system must be the standard Cartesian one. It is totally possible to define a coordinate system on positive integers only (e.g. a 2D-vector space over natural numbers). So it makes sense to think of the x,y parameters to ImageData:[sg]etPixel() as coordinates.

    In image processing, pixel data is usually represented as a (multi-channel) matrix. In this context 1-indexing makes sense as it is consistent with mathematical notation. However, if one interprets pixel data as a matrix, the pixel at (x,y) is indexed with img[y,x], since rows are indexed first. I am sure this will be the source of even more confusion than the current coordinate-based indexing.

    Regarding the actual issue:

    As with image data, it depends on what you think SoundData represents. If you think of it as a list of samples, then yes: 1-based indexing should be preferred. On the other hand, you can think of it as values of a continuous signal, where each index i corresponds to the signal at time t = i / samplerate.

    I am leaning towards the second view (i.e. sampling a continuous signal), since this view is more consistent with the usage in Source objects.

  6. hahawoo reporter

    Cheers! I think I have a better understanding now.

    So perhaps the question is, which way of looking at things is more useful in practice?

  7. vrld

    I guess that depends on the use case and personal preference. I prefer the signal/coordinate-system (0-indexing) view, as it makes both synthesis and analysis easier. This is also the quasi-standard across many different libraries (both sound and images), so being different for the sake of being (seemingly) internally consistent might backfire: Seasoned programmers will be alienated and newbies will have difficulty adapting techniques from tutorials using other libraries.
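
    A minimal sketch of the signal view in plain Lua, assuming a made-up sample rate and tone: with 0-based indices, sample i sits at time t = i / samplerate, so the first sample is at t = 0 with no offset. The helper names sampleTime and sineSample are hypothetical; with a real SoundData the loop body would call SoundData:setSample(i, value) instead of writing to a table.

    ```lua
    local rate = 44100      -- assumed sample rate in Hz
    local frequency = 441   -- assumed tone; one cycle spans 100 samples

    -- Time in seconds of the 0-based sample index i.
    local function sampleTime(i)
        return i / rate
    end

    -- Value of a sine tone at the 0-based sample index i.
    local function sineSample(i)
        return math.sin(2 * math.pi * frequency * sampleTime(i))
    end

    -- Fill one cycle, keyed by 0-based index like the current API.
    local samples = {}
    for i = 0, 99 do
        samples[i] = sineSample(i)
    end

    -- With 1-based indices the same formula would need (i - 1) / rate.
    print(sampleTime(0), samples[0], samples[25])
    ```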
