Semantics of shared arrays and variables

Issue #10 resolved
Amir Kamil created an issue

The following are various semantic issues that need to be clarified with respect to shared variables and arrays. These are related to issue #4, but as they affect both shared variables and arrays, I've chosen to create a new issue to collect them.

1. Lifetime of shared objects

Section 3.1 states: "Shared variables should be defined in the C++ global scope as their lifetimes are expected to be the entire program execution." This is an ambiguous statement. What does it mean that the lifetime is "expected" to be the entire program execution? Is it legal to define a shared variable outside of the global scope? What is meant by "global scope?"

I believe the intention here is that a shared object must have static storage duration (see here), which is distinct from the scope of any names that reference it.

If shared objects are not required to have static storage duration, then we need to specify complete semantics for creating and destroying them, including whether or not these operations are collective.
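
To make the distinction between storage duration and scope concrete, here is a minimal sketch in standard C++. Plain ints stand in for shared objects; the names `at_namespace_scope`, `holder`, `bump_block_static`, and `bump_automatic` are hypothetical and do not come from the UPC++ spec.

```cpp
#include <cassert>

// Static storage duration, name at namespace scope.
int at_namespace_scope = 1;

struct holder {
  static int at_class_scope;  // declaration; the name has class scope
};
int holder::at_class_scope = 2;  // definition; static storage duration

int bump_block_static() {
  // Static storage duration, but the name is visible only in this block.
  // The same object persists across calls.
  static int counter = 0;
  return ++counter;
}

int bump_automatic() {
  // Automatic storage duration: a fresh object on every call.
  int counter = 0;
  return ++counter;
}
```

All three of the first objects live for the whole program even though their names have different scopes; only the variable in `bump_automatic` is recreated on each call.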

The spec also states that "all static shared variable are stored on rank 0." What about non-static shared variables? Does "static" here refer to the qualifier or the storage duration?

"Objects of shared variable type are not initialized by default." I'm not sure what this means. I think the intention is that the storage that the shared object logically represents is not initialized, not that the object itself is not initialized.

2. Copy semantics for shared objects

What happens when a shared object is copied? In particular, what does the following code do?

shared_var<int> x = 42;
shared_var<int> y = x;

Related to this is whether or not shared objects may be used as parameter and return values.

The fundamental question here is whether an object of type shared_var or shared_array represents data or metadata. In other words, does it act as a value or a reference?

Also related is whether or not a shared object can be passed between ranks, whether directly or indirectly through a global pointer (e.g. global_ptr<shared_var<T>>).
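
The data-versus-metadata question can be illustrated with two hypothetical stand-in types in standard C++. Neither `value_like` nor `reference_like` is a UPC++ type, and the sketch does not claim to describe how shared_var or shared_array are implemented; it only shows how the two copy behaviors differ.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that represents data: copying duplicates the value.
template <typename T>
struct value_like {
  T data;
};

// Hypothetical type that represents metadata: copying duplicates only the
// handle, so both copies refer to the same underlying storage.
template <typename T>
struct reference_like {
  std::shared_ptr<T> storage;
  explicit reference_like(T init) : storage(std::make_shared<T>(init)) {}
  T& get() const { return *storage; }
};

int copy_then_write_value() {
  value_like<int> x{42};
  value_like<int> y = x;  // independent copy of the data
  y.data = 7;
  return x.data;          // x is unaffected by the write through y
}

int copy_then_write_reference() {
  reference_like<int> x(42);
  reference_like<int> y = x;  // copy of the handle only
  y.get() = 7;
  return x.get();             // x observes the write made through y
}
```

Under value semantics the write through the copy is invisible to the original; under reference semantics it is visible.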

3. Discussion in 5/13/15 meeting

These issues may be obviated by the answer to the first one above, but I would like to document our discussion in the UPC++ meeting on 5/13/15, which involved construction of shared objects, particularly from a non-collective context.

What does construction of a shared object mean in a non-collective context? Is this allowed? Does constructing a shared variable on rank i allocate storage for it on rank i or on rank 0?

Can a shared object be constructed using the new operator? What does this do? Is there a semantic difference between creating a shared object with or without new? What is the lifetime of the resulting shared storage in either case?

Comments (14)

  1. Yili Zheng

    These are very good points for discussion. I will try to give my thoughts on each bullet separately.

    1. Lifetime of shared objects Section 3.1 states: "Shared variables should be defined in the C++ global scope as their lifetimes are expected to be the entire program execution." This is an ambiguous statement. What does it mean that the lifetime is "expected" to be the entire program execution? Is it legal to define a shared variable outside of the global scope? What is meant by "global scope?"

    I will change "global scope" to "file scope", which is in line with the UPC spec and C++ spec.

    I believe the intention here is that a shared object must have static storage duration (see here), which is distinct from the scope of any names that reference it. If shared objects are not required to have static storage duration, then we need to specify complete semantics for creating and destroying them, including whether or not these operations are collective. The spec also states that "all static shared variable are stored on rank 0." What about non-static shared variables? Does "static" here refer to the qualifier or the storage duration?

    At least in the first version of the spec, I intend to limit shared_vars to have static storage duration, which means the storage for these entities shall last for the duration of the program. This is in line with UPC.

    "Objects of shared variable type are not initialized by default." I'm not sure what this means. I think the intention is that the storage that the shared object logically represents is not initialized, not that the object itself is not initialized.

    Yes, objects contained in shared_var were not initialized in the past. But with the auto init feature available, I think we should initialize the object in the shared_var template and delete that sentence.

  2. Yili Zheng

    2. Copy semantics for shared objects What happens when a shared object is copied? In particular, what does the following code do? shared_var<int> x = 42; shared_var<int> y = x;

    The current design of shared_var is copy-by-value.

    Related to this is whether or not shared objects may be used as parameter and return values.

    I intend to prohibit using shared_vars as parameters and return values in the first version of the spec. But if someone wants to do the design and implementation in the current time frame, I think that would be great, and I look forward to incorporating it.

    The fundamental question here is whether an object of type shared_var or shared_array represents data or metadata. In other words, does it act as a value or a reference?

    This probably needs some discussion. The current design is that shared_var represents data (value) but shared_array represents metadata (reference). Therefore, it's legal to use shared_array as function arguments and return values.

    Also related is whether or not a shared object can be passed between ranks, whether directly or indirectly through a global pointer (e.g. global_ptr<shared_var<T>>).

    A shared object is not allowed to be passed between ranks. But a global pointer to a shared object can be. Taking the address of a shared object returns a global_ptr type. In UPC, this type is formally called "pointer-to-shared", but in UPC++ we decided to call it "global pointer".

    shared_var<int> a;
    global_ptr<int> pa = &a;
    // pa can be passed between ranks.
    
  3. Amir Kamil reporter

    I will change "global scope" to "file scope", which is in line with the UPC spec and C++ spec.

    At least in the first version of the spec, I intend to limit shared_vars to have static storage duration, which means the storage for these entities shall last for the duration of the program. This is in line with UPC.

    As far as I can tell, "file scope" is not a concept defined in the C++ spec. The spec defines the following scopes: block, function prototype, function, namespace (including global namespace scope, also called global scope), class, enumeration, and template parameter. I see no reason to prevent variables of shared object type from having block, namespace, or class scope, as long as they have static storage duration.

    It seems to me that the relevant restriction is on a shared object's storage duration, not on the scope of any names that reference it. So I see no reason at all to say anything about scope in the UPC++ spec. Instead, we should just declare that shared objects must have static storage duration.

    Yes, objects contained in shared_var were not initialized in the past. But with the auto init feature available, I think we should initialize the object in the shared_var template and delete that sentence.

    If we do make this change, then we have to specify how we initialize the object. In particular, what constructor do we call? What if no constructor matches the specified requirements?

  4. Amir Kamil reporter

    A shared object is not allowed to be passed between ranks. But a global pointer to a shared object can be. Taking the address of a shared object returns a global_ptr type.

    This makes sense for shared variables, but I'm not sure it does for shared arrays. Since shared_array represents metadata, it seems reasonable to expect it to work when passed to another thread. And I'm not sure if it makes sense to produce a global_ptr when taking the address of a shared_array.

  5. Yili Zheng

    As far as I can tell, "file scope" is not a concept defined in the C++ spec. The spec defines the following scopes: block, function prototype, function, namespace (including global namespace scope, also called global scope), class, enumeration, and template parameter. I see no reason to prevent variables of shared object type from having block, namespace, or class scope, as long as they have static storage duration. It seems to me that the relevant restriction is on a shared object's storage duration, not on the scope of any names that reference it. So I see no reason at all to say anything about scope in the UPC++ spec. Instead, we should just declare that shared objects must have static storage duration.

    Namespace scope is fine as it's similar to C file scope. Other scopes are trickier because the creation of shared_var is a symmetric/collective operation but we can't guarantee collective entry of a function or symmetric creation of an object in general.

    In particular, what constructor do we call? What if no constructor matches the specified requirements?

    Will require the object class to provide a copy constructor. For example,

    shared_var<MyClass> a = b; // calls MyClass(b)
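
    One way such a requirement could be expressed is with a compile-time check. The following is a minimal sketch in standard C++, assuming a hypothetical wrapper named `checked_var` and a hypothetical element type `my_class`; it is not the UPC++ shared_var template and makes no claim about how that template is written.

    ```cpp
    #include <cassert>
    #include <type_traits>

    // Hypothetical wrapper that rejects element types lacking a copy constructor.
    template <typename T>
    class checked_var {
      static_assert(std::is_copy_constructible<T>::value,
                    "T must provide a copy constructor");
      T value_;

     public:
      checked_var() : value_() {}                  // value-initializes the T
      checked_var(const T& init) : value_(init) {} // calls T's copy constructor
      const T& get() const { return value_; }
    };

    // Hypothetical element type with user-provided constructors.
    struct my_class {
      int n;
      my_class() : n(0) {}
      my_class(const my_class& other) : n(other.n) {}
    };
    ```

    With these definitions, `checked_var<my_class> a = b;` copies `b` into the wrapper, and instantiating `checked_var` with a non-copyable type fails at compile time.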
    
  6. Amir Kamil reporter

    Namespace scope is fine as it's similar to C file scope. Other scopes are trickier because the creation of shared_var is a symmetric/collective operation but we can't guarantee collective entry of a function or symmetric creation of an object in general.

    I don't understand why objects declared at class or function scope pose a problem if they have static storage duration. For example:

    class foo {
      static shared_var<int> x;
    };
    
    shared_var<int> foo::x = 3;
    
    void bar() {
      static shared_var<int> y = 4;
    }
    

    Both foo::x and y have static storage duration. They are initialized at program startup, which is a collective context. Neither has namespace scope; foo::x has class scope and y has function scope.

    Will require the object class to provide a copy constructor.

    What is the object a initialized to in the following code?

    shared_var<MyClass> a;
    
  7. Yili Zheng

    This makes sense for shared variables, but I'm not sure it does for shared arrays. Since shared_array represents metadata, it seems reasonable to expect it to work when passed to another thread. And I'm not sure if it makes sense to produce a global_ptr when taking the address of a shared_array.

    The address of a shared_array element (e.g., &A[i]) is a global pointer. The address of a shared_array (e.g., &A) is just a regular local pointer since shared_array itself is just metadata. A shared_array is a collective object but the local metadata on each rank are different. I think it's possible to make the metadata of a shared array globally sharable but that comes with some additional performance overheads and implementation complexities.

  8. Yili Zheng

    class foo {
      static shared_var<int> x; // declared in the class scope
    };
    
    shared_var<int> foo::x = 3; // defined in the file/namespace scope
    
    void bar() {
      static shared_var<int> y = 4; // this won't get executed if bar() is not called
    }
    
    shared_var<MyClass> a; // calls MyClass()
    
  9. Amir Kamil reporter

    You are correct, block-scope static variables aren't guaranteed to be initialized until the block is entered. So we can't allow them.
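
    The initialization-timing difference can be observed in standard C++ with a small tracer type. This is a sketch only: `tracer`, `constructed`, `at_namespace_scope`, and `enter_block` are hypothetical names, and no UPC++ types are involved.

    ```cpp
    #include <cassert>

    // Counts how many tracer objects have been constructed so far.
    // Zero-initialized before any dynamic initialization runs.
    int constructed = 0;

    struct tracer {
      tracer() { ++constructed; }
    };

    // Namespace scope: dynamically initialized during program startup,
    // in practice before the first statement of main().
    tracer at_namespace_scope;

    void enter_block() {
      // Block scope: initialized the first time control passes through
      // this declaration, and never again afterwards.
      static tracer at_block_scope;
    }
    ```

    If only some ranks ever call `enter_block()`, only those ranks run the block-scope initialization, which is why it cannot be treated as a collective context.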

    However, as far as I can tell, scope is defined in the C++ spec for names/declarations, not for definitions. So to be precise and match the C++ spec, we should say that shared objects can only be declared at namespace or class scope. We should also specify for clarity that shared objects have static storage duration.

  10. Amir Kamil reporter

    After looking into the terminology more closely, I've changed my mind again and think we should say that shared objects can only be defined at namespace or class scope. While scope isn't specifically defined for definitions, I see no reason to prevent an extern declaration at block scope:

    void foo() {
      extern shared_var<int> x; // declaration (but not definition) at block scope; should be OK
    }
    

    My only concern about using the term defined is that the spec is unclear in what scope a class member definition occurs:

    class bar {
      static shared_var<int> y; // declaration at class scope
    };
    
    shared_var<int> bar::y = 4; // definition; unclear what scope this is
    

    However, I think we can cover our bases if we use the following wording: Shared variables and arrays may only be defined as static class members or at namespace scope.

  11. Yili Zheng

    The new wording sounds good to me. If we can resolve issue #12, then we can loosen the restrictions on shared variables and arrays even further.
