Clamp: An open source C++ compiler for heterogeneous devices

About Clamp

This repository hosts Clamp, a C++ compiler implementation project. The goal is to implement a compiler that takes a program conforming to the C++AMP 1.2 standard and transforms it into HSAIL, SPIR binary, or OpenCL C.

Tested targets are:

  • Khronos OpenCL SPIR 1.2 and OpenCL C for:
    • AMD Stack/AMD GPU with Khronos SPIR 1.2 and OpenCL C
    • NVIDIA Stack/NVIDIA GPU with OpenCL C
    • Apple Mac OS X 10.9 Stack with OpenCL C
  • HSAIL and BRIG for HSA devices:
    • AMD Kaveri APU

Downloads

In the latest release (0.5.0), OpenCL and HSA support are shipped in a single release package.

See also Downloads for older versions


Install

Ubuntu binary packages for x86-64

To install, download the clamp and libcxxamp DEB files from the links above, then run:

sudo dpkg -i libcxxamp-<version>-Linux.deb clamp-<version>-Linux.deb

clamp-bolt and boost are optional packages. Install them only if you want to use the automatic rewriting of C++ STL calls into AMD Bolt calls.

sudo dpkg -i clamp-bolt-<version>-Linux.deb

boost is provided only as a tarball. Use the following command to install it:

sudo tar zxvf boost_1_55_0-hsa-milestone3.tar.gz -C /opt/clamp

Default installation directory is /opt/clamp.

Binary tarballs for x86-64

To install, download the clamp and libcxxamp tar.gz files from the links above, then run:

sudo tar zxvf libcxxamp-<version>-Linux.tar.gz
sudo tar zxvf clamp-<version>-Linux.tar.gz

clamp-bolt and boost are optional packages. Install them only if you want to use the automatic rewriting of C++ STL calls into AMD Bolt calls.

sudo tar zxvf clamp-bolt-<version>-Linux.tar.gz

boost is provided only as a tarball. Use the following command to install it:

sudo tar zxvf boost_1_55_0-hsa-milestone3.tar.gz -C /opt/clamp

Default installation directory is /opt/clamp.

Dynamic Libraries

Since 0.4.0, the platform-specific libraries and libc++ are built as dynamic libraries. After installing or building, edit /etc/ld.so.conf so that the dynamic linker can locate these libraries at runtime.

If you installed from DEB files or tarballs, add the following lines to /etc/ld.so.conf:

# C++AMP runtime libraries
# libc++ & C++AMP runtime implementations
/opt/clamp/lib

If you built from source, add the following lines to /etc/ld.so.conf instead:

# C++AMP runtime libraries
# libc++
(path_of_your_build_directory)/libc++/libcxx/lib
(path_of_your_build_directory)/libc++/libcxxrt/lib
# C++AMP runtime implementations
(path_of_your_build_directory)/build/Release/lib

Please make sure OpenCL or HSA runtime libraries can be located by ld.so as well. For example, your ld.so.conf might also need to include:

# OpenCL runtime (libOpenCL.so)
/opt/AMDAPP/lib/x86_64

# HSA runtime (libhsaruntime-64.so)
/opt/hsa/lib

After every change to ld.so.conf, remember to run sudo ldconfig -v to rebuild the ld.so cache.
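
The ld.so.conf edit described above can be made repeatable with a small script. The following is a minimal sketch, assuming a POSIX shell; conf_has_dir and ensure_conf_dir are hypothetical helper names introduced here for illustration and are not part of Clamp. It appends a library directory to a configuration file only when that exact line is not already present.

```shell
#!/bin/sh
# Sketch only: conf_has_dir and ensure_conf_dir are hypothetical helpers,
# not tools shipped with Clamp.

# Succeeds if LIB_DIR already appears as a whole line in CONF_FILE.
conf_has_dir() {
    conf_file=$1
    lib_dir=$2
    # -F fixed string, -x whole-line match, -q no output
    grep -Fxq -- "$lib_dir" "$conf_file" 2>/dev/null
}

# Appends LIB_DIR to CONF_FILE unless it is already listed.
# Writing to /etc/ld.so.conf requires root privileges.
ensure_conf_dir() {
    conf_file=$1
    lib_dir=$2
    if ! conf_has_dir "$conf_file" "$lib_dir"; then
        printf '%s\n' "$lib_dir" >> "$conf_file"
    fi
}
```

As root, this could be used as ensure_conf_dir /etc/ld.so.conf /opt/clamp/lib, followed by ldconfig -v. Afterwards, ldconfig -p | grep OpenCL lists the cached entries whose names contain OpenCL, which is a quick way to confirm that libOpenCL.so can be found.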

Install on Mac OS X (Experimental)

See InstallOnMacOSX

Install on AMD Kaveri HSA (Experimental)

See HSA Support Status


How to compile C++AMP source code

A new clang driver has been merged in the latest release to streamline the build process. Here is an example that builds (compiles and links) in one step:

# Assumes clamp and libcxxamp are installed and on PATH.
# Use --install if you installed clamp from the Ubuntu package.
# Use --build if you built from source.
# If neither is specified, --install is the default.
clang++ `clamp-config --install --cxxflags --ldflags` -o test.out test.cpp

To use the HSA extension:

# Use -Xclang -fhsa-ext to enable HSA extension
clang++ `clamp-config --install --cxxflags --ldflags` -Xclang -fhsa-ext foo.cpp -o foo.out

To emit object files (note that GPU code is stored in a special section named ".kernel"):

# Use -c to emit object files.
# GPU kernels are stored in the .kernel section.
clang++ `clamp-config --install --cxxflags` -c foo.cpp -o foo.o

To link objects, use the same driver. Clang extracts the ".kernel" sections from all object files and lowers them to the target architecture (SPIR/OpenCL C/HSAIL):

# clang extracts the .kernel section from every object and lowers it to the target architecture (SPIR/OpenCL C/HSAIL)
clang++ `clamp-config --install --ldflags` foo.o bar.o -o foo.out
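
The separate compile and link steps above can be wrapped in a small script when a program has several source files. The following is a minimal sketch, assuming a POSIX shell and file names without spaces; build_amp, run, CLAMP_MODE and DRY_RUN are hypothetical names introduced for this example and are not Clamp options. Only the clang++ and clamp-config invocations come from the commands shown above.

```shell
#!/bin/sh
# Sketch only: build_amp, run, CLAMP_MODE and DRY_RUN are hypothetical names.
# $(...) is used in place of the backticks shown above; both forms are equivalent.

# --install for binary packages, --build for a source build.
CLAMP_MODE=${CLAMP_MODE:---install}

# Print a command line, and execute it unless DRY_RUN=1.
run() {
    printf '%s\n' "$*"
    [ "${DRY_RUN:-0}" = 1 ] || eval "$*"
}

# build_amp OUTPUT SOURCE.cpp...
# Compiles every source to an object file, then links all objects.
build_amp() {
    output=$1
    shift
    objs=
    for src in "$@"; do
        obj=${src%.cpp}.o
        run "clang++ \$(clamp-config $CLAMP_MODE --cxxflags) -c $src -o $obj" || return 1
        objs="$objs $obj"
    done
    run "clang++ \$(clamp-config $CLAMP_MODE --ldflags)$objs -o $output"
}
```

Calling DRY_RUN=1 build_amp foo.out foo.cpp bar.cpp prints the two compile commands and the link command without running them, which is useful for checking the command lines before a real build.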

Since 0.5.0, it is also possible to generate CPU-only code, which does not need any GPU platform such as OpenCL or HSA.

clang++ `clamp-config --install --cxxflags --ldflags` -cpu -o test.out test.cpp

Choose C++AMP runtime

C++AMP programs automatically detect the available GPU platforms on the system and pick one with the following precedence:

  • HSA
  • OpenCL 1.2 (to use zero-copy performance optimizations)
  • OpenCL 1.1
  • CPU

To force the C++AMP runtime to use a specific platform, set the CLAMP_RUNTIME environment variable:

# force set C++AMP runtime to HSA
export CLAMP_RUNTIME=HSA

# force set C++AMP runtime to OpenCL 1.2
export CLAMP_RUNTIME=CL12

# force set C++AMP runtime to OpenCL 1.1
export CLAMP_RUNTIME=CL11

# force set C++AMP runtime to CPU
export CLAMP_RUNTIME=CPU

To turn auto detection back on:

unset CLAMP_RUNTIME

Please note that if the C++AMP runtime finds that the specified runtime cannot be loaded, it falls back to automatic detection.
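
The selection rules described above can be modeled in a few lines of shell. This is an illustration of the documented behavior only, not Clamp's actual implementation; pick_runtime is a hypothetical function whose first argument stands for the value of CLAMP_RUNTIME (empty when unset) and whose remaining arguments stand for the runtimes that can be loaded on the machine.

```shell
#!/bin/sh
# Illustration only: models the documented selection order.
# pick_runtime is a hypothetical name, not part of the C++AMP runtime.

# pick_runtime FORCED AVAILABLE...
pick_runtime() {
    forced=$1
    shift
    # A forced runtime is honored only if it can actually be loaded.
    if [ -n "$forced" ]; then
        for rt in "$@"; do
            if [ "$rt" = "$forced" ]; then
                echo "$rt"
                return 0
            fi
        done
    fi
    # Otherwise fall back to automatic detection in precedence order.
    for want in HSA CL12 CL11 CPU; do
        for rt in "$@"; do
            if [ "$rt" = "$want" ]; then
                echo "$rt"
                return 0
            fi
        done
    done
    return 1
}
```

For example, pick_runtime HSA CL12 CPU models a machine without HSA where CLAMP_RUNTIME=HSA was requested: the request cannot be satisfied, so the result is CL12. In practice the variable can also be set for a single run only, as in CLAMP_RUNTIME=CPU ./test.out, which leaves the rest of the shell session on automatic detection.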


SPIR vs. OpenCL C

On OpenCL machines with SPIR support, SPIR kernels will be used instead of OpenCL C ones. You can alter the precedence by:

# turn off SPIR, force use OpenCL C
export CLAMP_NOSPIR=1

This forces the C++AMP runtime to pick OpenCL C instead of SPIR. To turn SPIR back on:

# turn on SPIR
unset CLAMP_NOSPIR

The environment variable CLAMP_NOSPIR has no effect on devices without SPIR support.
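
To compare the two kernel formats for the same program, the variable can be set for one run at a time rather than exported. The following is a minimal sketch, assuming a POSIX shell; run_both is a hypothetical helper name introduced here, and only the CLAMP_NOSPIR variable comes from the text above.

```shell
#!/bin/sh
# Sketch only: run_both is a hypothetical helper, not a Clamp tool.

# run_both PROGRAM [ARGS...]
# Runs the program once with the default kernel choice and once with SPIR disabled.
# Subshells keep the environment change local to each run.
run_both() {
    echo "== default (SPIR if supported) =="
    ( unset CLAMP_NOSPIR; "$@" )
    echo "== OpenCL C forced =="
    ( CLAMP_NOSPIR=1; export CLAMP_NOSPIR; "$@" )
}
```

For example, run_both ./foo.out runs the binary built earlier under both settings. On a device without SPIR support both runs use OpenCL C, since CLAMP_NOSPIR has no effect there.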


How to use the STL-to-AMD-Bolt rewrite plugin

Since release 0.4.0, normal C++ programs can be preprocessed so that they are accelerated by the GPU. This is achieved by:

  • Porting AMD Bolt, a parallel C++ template library, from Windows to Linux and integrating it with this project.
  • Implementing a new tool, clamp-preprocess, which transforms C++ STL calls into AMD Bolt API calls.

The feature is still experimental, so it is not enabled by default. To use it, see the example under src/tests/Unit/BoltRewrite in the source repository and follow the steps below:

# remember to install clamp-bolt and boost packages

# transform source code
clamp-preprocess foo.cpp foo_transformed.cpp

# remember to add --bolt
clang++ `clamp-config --bolt --cxxflags --ldflags` -o foo.out foo_transformed.cpp

For more detailed information about this feature, please refer to Parallelize C++ programs through Bolt.


Sample code

We have collected a few code samples. The sample package is also available for download.

On Linux, use the build script buildme.binary to invoke the compiler correctly and build the C++AMP samples. See README.BINARY.TXT for details.


News

02/01/2015

HSA/OpenCL Unified Release (0.5.0 Release Milestone 4)

Changes

  • Support HSA 1.0P
  • Fix one major memory leak within Clang 3.3
  • Various bug fixes
  • Support more generic SVM on HSA. It is now possible to capture host objects by reference in GPU kernels.
  • Preliminary support for the "auto-auto" feature on HSA. GPU kernels do not necessarily have to carry the restrict(amp) specifier.
  • Clang 3.5 support is mostly on par with Clang 3.3

Clang 3.5 upgrade

A new repository based on Clang/LLVM 3.5 is available. For now, it can only be used by building from source with the following instructions:

git clone https://bitbucket.org:/multicoreware/cppamp-driver-ng-35.git src
mkdir build
cd build
cmake ../src
make -j4 world && make

11/08/2014

HSA/OpenCL Unified Release (0.4.0 Release Milestone 3)

Changes

  • Unified HSA build and OpenCL build into one release package
  • Simplified cmake procedure.
  • Introduced performance improvements in OpenCL and HSA.
  • Implemented "fat binary" : one C++AMP binary could now contain multiple versions of GPU kernels (HSA / OpenCL)
  • Decoupled C++AMP programs from C++AMP runtimes. It's now possible to use an environment variable to dynamically pick which GPU platform to use.
  • Implemented a preliminary port of AMD Bolt C++AMP version from Windows to Linux.
  • Implemented a preliminary version of transforming C++ STL calls to AMD Bolt calls, so that normal C++ programs can be accelerated by the GPU.

9/29/2014

Updated OpenCL/SPIR Release (0.3.0 Release Milestone 2)

Changes

  • OpenCL/SPIR version based on the codebase for Second HSA Release
  • Improved clang driver interface
  • Implemented restrict(auto) which is an optional feature in C++AMP Chapter 13.

9/27/2014

Second HSA Release (0.3.0 Release Milestone 2)

Changes

  • Improved clang driver interface
  • Implemented restrict(auto) which is an optional feature in C++AMP Chapter 13.
  • Relaxed C++ language rules on HSA.
  • Implemented new asynchronous parallel_for_each interface on HSA.
  • See HSA Support Status for more detailed HSA-related information.

8/18/2014

First HSA Release (0.3.0 Release)

Changes

  • First C++AMP for HSA release.
  • See HSA Support Status for more detailed HSA-related information.
  • Please note that the C++AMP for HSA package is NOT compatible with OpenCL/SPIR at this moment.
  • Please also note that C++AMP for HSA no longer depends on the HSA Okra runtime, and the Okra port is no longer supported.

7/2/2014

Updated OpenCL/SPIR Release

Changes

  • More bug fixes. The conformance rate on SPIR is now more than 99% (passed + skipped).

6/1/2014

Milestone 5 (0.2.0 Release)

Changes

  • Default installation directory changed to /opt/clamp.
  • Changed and encapsulated a few compile options. We suggest using clamp-config to abstract away all compile options.
  • Various bug fixes thanks to MS Conformance Tests. The conformance rate on SPIR is now 97.5% (passed + skipped).

3/20/2014

Milestone 4 (HSA/Okra port)

  • HSA Foundation's HSAIL using Okra runtime for HSA devices (e.g. AMD Kaveri).
  • Note that the HSA/Okra port requires a different configuration flag to build and is currently not compatible with the OpenCL/SPIR version.

3/18/2014

Milestone 3 Updates (140 patches since milestone 3)

Changes

  • Various bug fixes thanks to MS Conformance Tests

3/15/2014

A preliminary port to the HSA Okra runtime is functional on a Kaveri machine. The most current status of HSAIL support can be found here


Older News

Here


Roadmap

See here


More about this project and build instruction

See the Overview
