- changed component to Memory Kinds
Explore chipStar for HIP-over-Level0 on Intel GPUs
With the forthcoming hip_device feature, UPC++ will have native support for AMD ROCm/HIP devices.
We already have a (currently experimental) mode capable of running the UPC++ and GASNet HIP support on NVIDIA GPU hardware using the HIP-over-CUDA shim library.
There is also a HIP-over-Level0 library called HIPLZ.
This task is to experiment with that shim library to see if it's sufficient to allow our hip_device support in UPC++ and GASNet to function on Intel GPU hardware.
Comments (10)
-
reporter - changed milestone to 2023.3.0 release
-
- changed status to open
I was able to make some progress on my first attempt at this.
TL;DR:
- With four source modifications (one in configure) I can compile and link
- I cannot yet run, but for a reason I believe I can resolve when I am next able to work on this
Full version:
My first hurdle was the fact that the HIPLZ installs I have access to lack a hip_runtime_api.h header (having only hip_runtime.h). My understanding is that hip_runtime_api.h from a ROCm install is a wrapper which includes either hcc_detail/hip_runtime_api.h or nvcc_detail/hip_runtime_api.h. There are no *_detail directories in the HIPLZ install. So, as I've stated verbally already, the project's focus appears to be on applications and not on low-level runtimes such as ours (and I can't fault them for that prioritization).
✅ Substituting hip_runtime.h for the three instances of hip_runtime_api.h in UPC++ (one of those in configure) lets me configure and compile the runtime.
Next problem was a link failure on every test because hipMemcpyPeerAsync() is missing. I've confirmed that name is not present in the provided headers, and this is probably not implemented yet.
✅ Flipping a #if 1 in src/hip.cc to avoid use of hipMemcpyPeerAsync() lets me link tests.
Now, as I run out of time for this trial, I get a run time failure to locate libOpenCL.so.1. That is almost certainly a matter of an LD_LIBRARY_PATH to resolve a dependency of one of the explicitly linked libs. Note that in my experience the executable's RPATH is often (always?) ignored when resolving those. But I will try an RPATH before I attempt kludging application launch to propagate a (yet to be determined) LD_LIBRARY_PATH.
-
@Dan Bonachea noted that disabling the call to hipMemcpyPeerAsync() in favor of hipMemcpyDtoDAsync() is not actually equivalent for the case of multiple devices. I am currently limiting my testing to nodes with a single Intel GPU device to avoid the potential issue for now.
-
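The single-device restriction above can be stated as a small decision rule. The sketch below contains no HIP calls; CopyPath, choose_copy_path() and the have_peer_copy flag are hypothetical names introduced only for illustration. It rests on the observation that hipMemcpyPeerAsync() takes explicit source and destination device IDs while hipMemcpyDtoDAsync() takes none.

```cpp
#include <stdexcept>

// Which kind of device-to-device copy a transfer needs.
enum class CopyPath {
  SameDevice,  // would map to hipMemcpyDtoDAsync()
  Peer         // would map to hipMemcpyPeerAsync()
};

// Source and destination on the same device: a plain DtoD copy suffices.
// Different devices: a peer copy is required.
inline CopyPath choose_copy_path(int src_device, int dst_device) {
  return (src_device == dst_device) ? CopyPath::SameDevice : CopyPath::Peer;
}

// Guard for a build where the peer-copy entry point is unavailable:
// fail loudly rather than silently substituting a same-device copy.
inline CopyPath choose_copy_path_checked(int src_device, int dst_device,
                                         bool have_peer_copy) {
  CopyPath path = choose_copy_path(src_device, dst_device);
  if (path == CopyPath::Peer && !have_peer_copy) {
    throw std::runtime_error(
        "cross-device copy requested but peer copy is unavailable");
  }
  return path;
}
```

With a guard of this shape, a build without peer copies would still work on single-GPU nodes and would report an error, not risk an incorrect transfer, on multi-GPU nodes.
-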
Next installment (final for today)
TL;DR:
- I can pass make check for udp and smp conduits, including test-memory_kinds (nodes I am using lack InfiniBand).
- This result required two more source modifications (now at a total of six)
- GPU-enabled tests in make dev-check are NOT all passing, but I lack time to diagnose the failures just now
Full Version:
Now, as I run out of time for this trial, I get a run time failure to locate libOpenCL.so.1
✅ As was noted in my testing of PR #473, the oneAPI install on the JLSE systems (where I am testing HipLZ) already required a kludge to propagate LD_LIBRARY_PATH to get Level Zero working. So, resolving the missing libOpenCL (from the oneAPI install) was automatic with resolving the Level Zero issue. So, this appears to be a platform issue, not a HipLZ one.
Next issue is that hipStreamDestroy() in hip_device::destroy() leads to an assertion failure within HipLZ. It is worth noting that the default for module load hiplz on the system in use adds paths which contain -debug in the version portion of the directory names.
✅ So, I've commented out that call to hipStreamDestroy() (source mod #5, for those keeping track).
Next up: the intentionally over-large allocation call in memory_kinds.cpp is seeing a return of hipErrorInvalidValue where we expect hipErrorOutOfMemory from hipMalloc(). My understanding of the HIP documentation is that this is probably a conformance error in HipLZ.
✅ Regardless of conformance, I've added a case for hipErrorInvalidValue to the error-handling switch statement in make_segment() in src/hip.cpp (source mod #6, for those keeping track). This lets me make forward progress, but could mask real errors in the call to hipMalloc().
With the above, I can pass make check for NETWORK=smp or udp. The output of test-memory_kinds confirms an Intel-branded GPU listed as a HIP device.
There is no hipcc provided by HipLZ, and I've not attempted to follow their instructions for use of their clang as a replacement. So, I am ignoring hip_vecadd for now.
Even ignoring hip_vecadd, most GPU-enabled tests in make dev-check are not passing, but I lack time to investigate the failures right now. Most (maybe all) of the failures appear to be timeouts, though non-GPU tests run just fine.
-
I had hoped to continue this work on one of the systems at UOregon with Intel GPUs. However, I have been unable to build HipLZ's prerequisites.
Note for whoever may continue this:
The HipLZ and HipCL projects have merged to become CHIP-SPV.
-
reporter - changed milestone to 2023.9.0 release
Mass roll-over of open issues to next release milestone
-
reporter - removed milestone
Clear past Milestone for open issues
-
reporter - changed title to Explore chipStar for HIP-over-Level0 on Intel GPUs
chipStar has replaced HIPLZ as the recommended way to run HIP over Intel GPUs