Document upcxx-jsrun

Issue #436 resolved
Paul Hargrove created an issue

There is now a upcxx-jsrun installed on Summit.
A description of its use should replace the two inline wrapper scripts currently present in the wiki's site-docs.md.

Comments (7)

  1. Paul Hargrove reporter
    • changed status to open

    I now have my initial draft of the requested wiki updates.

    However, unless I am missing something, BitBucket doesn't provide for pull requests for wiki content, even though the wikis are backed by git repos.

    The new documentation is staged in my fork (let me know if you lack access).
    The rendered page is currently here, though that will not outlive this issue. The changes are in this commit.

  2. Paul Hargrove reporter

    @Dan Bonachea updates to site-docs.md for the upcxx-jsrun wrapper were written in mid-January and then forgotten. Your copy-editing and general feedback are requested. See the links in my previous comment.

    Although the wiki page is not part of our release, I'd prefer to see the necessary updates made to it prior to our release announcement (in the vain belief that a non-empty set of people actually respond to our announcements by reading our documentation and using our software).

  3. Dan Bonachea

    It appears BitBucket does not allow me to comment on a wiki commit, so that's not a usable mechanism for detailed review.

    The changes were not "forgotten" so much as postponed, pending investigation into the best practice that should be implemented by upcxx-jsrun --high-bandwidth and upcxx-jsrun --low-latency. Those implementations still differ semantically from the recommendations in ibv-conduit/README (which recommends 'mlx5_0+mlx5_3' for all cores to maximize bandwidth, and single nearby HCA to reduce latency).

    We can of course merge the documentation change and tweak the script later, but we might want the document to at least mention the --1-hca option for latency-sensitive applications and the default of --2-hca.

    I'd also like to see us preserve some of the useful technical details from the "Network ports on Summit" and "Correctness with multiple I/O buses" sections somewhere (eg the HCA layout table), perhaps in a separate "technical details" document? The replacement section is very high-level and gives the info users need to get started, but having these technical details collected together seems valuable (if nothing else, for our own reference when trying to understand how a given config is operating).

  4. Paul Hargrove reporter

    I agree that BitBucket does not provide the right mechanism(s) for review of a wiki.
    So, when I have time I plan to use a clone of the upcxx wiki as a full-fledged private git repo under my account for the purpose of a real PR. I had forgotten, but I did so a year ago and thus the PHHargrove/upcxx-wiki repo already exists (though it is not remotely up-to-date).

    I can definitely add the 1 and 2 hca options.

    The ibv-conduit/README is also up for revision, but I've not gotten to that yet.
    I also have an action item to update the script to match the benchmarking findings.
    So, all three should match by the time we release.

    I had not considered a distinct document for the tech details, but will first try to restore them to site-docs to see how that works. However, that is not going to happen until after I move this activity to a real PR UI.

  5. Paul Hargrove reporter

    The PR mentioned above has iterated to APPROVED.

    However, it makes statements which are not true for the current release.
    So, I am planning to defer the merge of that PR until we are making the other wiki updates normally associated with a release. That will leave only hours, rather than days, during which the docs and the install are out of sync.
