API Consistency

Issue #26 resolved
Former user created an issue

With more authors adding content, I can see inconsistency creeping into our user-facing APIs. This needs to be squashed. I think these should be uncontested as this is how the standard looks:

1. Symbol names, variables, class members, and templated and non-templated types need to be all lowercase. Underscores may or may not separate words depending on what what feels right.

2. #define's are UPPERCASE and prefixed with UPCXX_.

3. Template type arguments (class/typename), and template template arguments are capitalized CamelCase.

4. Template value arguments (int, etc) are as in rule 1.

5. Don't use the class keyword to declare template type arguments. Use typename.

My personal flare follows. I expect opinions will begin to differ. Please try to come armed with objective evidence as to why a particular convention is superior. For instance: fewer keystrokes but still immediately clear to the properly indoctrinated. With all the power and sway that I can muster, I will not tolerate conventions founded on the shortcomings of your favorite editor, though I expect those to come up more in coding-style rather than API naming. In the likely absence of objectivity I propose we give final say to Bryce (@blelbach).

6. Collections of things (lists, vectors, etc) should be named with a plural noun, like std::vector<T> stupid_ranks.

7. Population counts, possibly of the aforementioned collections, should be singular nouns suffixed with "_n", as in stupid_rank_n.

8. Try to order the words in a name such that lexicographic sort will group our API along conceptual boundaries for the important things in big namespaces. Like upcxx::rank_n() and upcxx::rank_me() instead of upcxx::ranks() and upcxx::myrank(). This rule can disrupt natural English flow (ranks_stupid seems worse than stupid_ranks), so use best judgement. Also getters/setters violate this. I won't discourage class members like promise::get_future() since the get_ prefix helps avoid collision with the future type. But let's try and keep get_/set_ out of the bare ucpxx namespace. Also, there is no need to add words just to coerce sorting into grouping things together. allocate and deallocate are fine just as they are.

9. Singular template type arguments should be named T if nothing else fits better.

10. Template arguments for function objects are usually F, I've been leaning towards Lambda to make it really clear we support lambdas (which is the intended style). Want to encourage lambda-centric thinking. Example:

template<typename Lambda>
void rpc_ff(intrank_t, Lambda&&);

Let's keep any discussion to the API style. Coding style like indentation and curly placement do not belong in the spec repo.

Brian's recent commits broke consistency with API I had already laid down in other sections. Let's hash this out now:

  • The typedefs for rank indices should be upcxx::intrank_t and upcxx::uintrank_t. We could add an underscore (upcxx::int_rank_t) to be like std::int8_fast_t and friends, but without the underscore is like std::intptr_t which is shorter hence better. Just rank_t alone does not communicate to the perusing reader that it is an integer, which in so many contexts is a necessity to understand.

  • Let's favor using the signed rank indices over the unsigned even though we probably won't support teams numbering their ranks in the negatives. Compilers produce faster loops with signed numbers than unsigned thanks to the UndefinedBehavior of signed overflow. This applies to all of our "casual" integers really. Leave unsigned for the cases where you need modular arithmetic or the maximal information content per stored bit.

  • Let's go with ucpxx::rank_n() and upcxx::rank_me() instead of upcxx::myrank() upcxx::ranks() as justified by the rules above.

  • The global team should not be called upcxx::global(). Global is not a noun, let's use a noun like upcxx::world().

Comments (16)

  1. Dan Bonachea
    Compilers produce faster loops with signed numbers than unsigned thanks to the UndefinedBehavior of signed overflow. 
    

    Do you have a reference for this claim?

    I've actually heard the opposite, because the common special case of unsigned division by a power of two can always be generated as a (generally much faster) shift instruction, however the same transformation cannot be performed on signed integers unless compiler analysis can first prove the value is non-negative (and value analysis is notoriously conservative).

    Also, all modern architectures natively use 2's complement integer representations that automatically enforce the defined overflow behavior of unsigned, so there are no extra instructions or overhead to provide that behavior.

    Note this is orthogonal to usability issues such as which type meshes best with other types in the language, or programmer taste preferences.

  2. Former user Account Deleted reporter

    I'm googling for a reference now. Bryce believes this too and he's had extensive experience fighting compilers to get loops to unroll and vectorize.

    Nvidia says this (http://docs.nvidia.com/cuda/maxwell-tuning-guide/#smm-instruction-throughput)

    Note: As was already the recommended best practice, signed arithmetic should be preferred over unsigned arithmetic wherever possible for best throughput on SMM. The C language standard places more restrictions on overflow behavior for unsignedmath, limiting compiler optimization opportunities.

    And this guy points out the same in GCC: https://kristerw.blogspot.com/2016/02/how-undefined-signed-overflow-enables.html

    And LLVM says so too: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html

  3. Dan Bonachea

    Nvidia says this (http://docs.nvidia.com/cuda/maxwell-tuning-guide/#smm-instruction-throughput)

    This statement is talking about a GPU architecture (which won't be running UPC++ library code), not a general-purpose processor.

    And this guy points out the same in GCC: https://kristerw.blogspot.com/2016/02/how-undefined-signed-overflow-enables.html

    This is a good argument for why signed overflow should remain undefined, but that's not the same as claiming signed is faster than unsigned in general (ie many of the more important axioms he lists for signed with undefined overflow also apply directly to unsigned types).

  4. BrianS

    In my limited experience with wrestling compilers to generate vectorized code it has changed multiple times over the years. The current flavoring is signed integer makes the vectorizer happier, since it can be certain that a[i] is adjacent to a[i+1]. The other demon is alignment.

    the intel compiler is aggressive in generating the special case code so that both types get vectorized. gcc and clang give up for unclear reasons but one of them is adjacency issues.

    but, I don't think this is a big issue for UPC++.

  5. BrianS

    I suggest we put all these things into a Coding Standard. It's what we do with Chombo. we can put this document right next to the spec. make can build both documents. I had made a slack channel for coding standard at the start of the project, but it has sat quiet.

  6. b

    I've only skimmed about this and don't have time to give this the deep thought it needs, but at a glance I concur with everything Jon says.

  7. Scott Baden

    Guys, referring back to John's start of this issue: we need to have a short appendix, or a table somewhere, that lists our coding standards. It will be a sure way to catch ourselves, and also give our readers a heads up. I've put a lot what John wrote into [this document]https://docs.google.com/document/d/1kyRufQcxqiZONW663dGMM9XmZcnUnJJWnOuMyHs6jEY/edit). Please modify as needed and keep in mind that a version of the document will find its way into the standard document. Scott

  8. b

    Not a Google doc, please. I agree we should have this document but it should be in the GitHub repo for the codebase.

    -- Bryce Adelstein Lelbach aka wash Lawrence Berkeley National Laboratory ISO C++ Committee Member CppCon and C++Now Program Chair

    Transmitted from my Android Mobile Command Center

  9. Scott Baden

    I've just interpolated the google doc into Conventions.tex, and made my google doc read only. It needs some massaging.

  10. BrianS

    The coding standard should be it's own document and not in the spec. I already suggested this on slack.

    also, the Conventions.tex check in broke the spec build.

    i will pull it out to it's own document. Conventions is for describing our spec document conventions, not our UPC++ coding standard.

    thanks.

  11. Scott Baden

    It is not broken. I always check before committing.

    Agreed " Conventions is for describing our spec document conventions, not our UPC++ coding standard."

    At yesterday's meeting with STRUMPACK, Sherry comments on the need for a user's guide. It belongs there, but that probably won't get started until after the spec has been release 6/30

  12. Scott Baden

    Where does Dan's PGAS glossary fit? I added it to Conventions. But see the comment about consistency model

  13. Dan Bonachea

    I believe this issue is resolved in spec v1.0 draft 2.

    Please open new, specific issues for any remaining consistency problems.

  14. Log in to comment