Grid/Array naming
Ch 8: "For a non-reference type T, the type “N-dimensional grid with element type T” is denoted: ndarray<T, N> where N is a positive compile-time int constant."
Currently the normative text of the spec uses the term "grid" almost everywhere, the chapter 8 headings use the term "Array", but the type name is defined to be "ndarray<>".
Is there a strong motivation for this inconsistency in terminology? Why isn't the type name "grid<>"? Why do Chapter 3 and Chapter 8 titles both use the term "Array" and yet discuss two completely different datatypes?
As a first-time "user" of UPC++ this naming inconsistency for core features seems like an invitation to future confusion and ambiguity. I would strongly recommend we choose one term for each datatype and apply it universally (in text, chapter titles and type names).
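For reference, the Ch 8 definition quoted above can be sketched in plain C++ as below. This is only an illustration of the two stated constraints (non-reference T, positive compile-time N). The class body, the `element_type` and `rank` members, and the `grid3d` alias are hypothetical names chosen for this sketch and are not taken from the spec or from any UPC++ implementation.

```cpp
#include <type_traits>

// Hypothetical stand-in for the spec's ndarray<T, N>.
// It models only the constraints stated in the Ch 8 definition.
template <typename T, int N>
class ndarray {
    static_assert(!std::is_reference<T>::value,
                  "element type T must be a non-reference type");
    static_assert(N > 0,
                  "N must be a positive compile-time int constant");

public:
    using element_type = T;         // assumed member name, for illustration
    static constexpr int rank = N;  // assumed member name, for illustration
};

// A "3-dimensional grid with element type double" in the spec's wording.
// The alias name is hypothetical.
using grid3d = ndarray<double, 3>;
```

Under these assumptions, something like `ndarray<int&, 2>` or `ndarray<int, 0>` would be rejected at compile time as soon as the type is instantiated.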
Comments (7)
-
reporter -
I agree with Dan that "distributed_array" is more appropriate than "shared_array". Originally I used shared_array to mimic UPC shared arrays, but I have no problem changing the name, especially before the first version of the spec.
Additionally, inspired by Bill Carlson, I think it would be very nice to use a single "shared<T>" template for everything that lives in the global address space. But I'm not sure whether this is implementable or how complicated the semantics would be. We would need a bit more design and a proof-of-concept prototype before committing to this direction.
-
The distinction between shared_arrays and the ndarrays described here needs to be clearer. The ndarrays are never distributed, while the shared_arrays are almost always distributed, correct? Having a single shared data type that can handle both cases would be great, but I agree with Yili that the semantics might become complicated. Perhaps a section near the beginning highlighting the differences between the various provided array/domain/grid types and how they are related would be helpful.
-
The term "grid" is carried over in the spec from Titanium. The typename "ndarray" is inspired by NumPy. I agree that this is inconsistent. Do you prefer one over the other?
As for "shared_array", Yili at one point suggested combining shared arrays and shared variables in a single type, which hopefully would be simpler and more doable than a generic "shared<T>" template. If we can unify them, then maybe we can avoid the "shared_array" terminology.
I think Dan makes a good point about distinguishing between an entity that lives in the global address space and an entity that has a specific distribution across UPC++ ranks. If I remember correctly, that's why we chose the term "global pointer" rather than "shared pointer" in UPC++ for the former. So I agree that we should make it clearer that "grids" (or whatever we decide to call them) are in the global address space but are not distributed across ranks.
-
I move that we close this issue, as it is not relevant to v1.0.
-
reporter - changed status to resolved
- edited description
On a related note, the name "shared_array" (for the datatype described in Ch 3) seems like a bad choice, since it seems to imply that other array data structures cannot be shared (and it invites confusion/ambiguity when discussing other arrays that live in shared space).
Perhaps a better term would be "distributed_array", since parallel distribution seems to be their defining characteristic?