* Further documentation is provided in docs/html/index.html.

* To INSTALL clMAGMA, modify the make.inc file to indicate your C/C++ compiler,
  Fortran compiler, and where AMD's clBLAS, CPU BLAS, and LAPACK are installed on your
  system. Examples are given in make.inc.acml, make.inc.atlas, make.inc.macos,
  make.inc.mkl-*, make.inc.openblas showing how to link to ACML, ATLAS, MacOS
  veclib, MKL, and OpenBLAS, respectively.

  All the make.inc files assume $CUDADIR is set in your environment.
  For bash (sh), put in ~/.bashrc (with your system's path):
      export CUDADIR=/usr/loca/cuda
  For csh/tcsh, put in ~/.cshrc:
      setenv CUDADIR /usr/local/cuda
  MAGMA is tested with CUDA >= 5.5. Some functionality requires a newer version.
  The MKL make.inc files assume $MKLROOT is set in your environment.
  For bash (sh), put in ~/.bashrc (with your system's path):
      source /opt/intel/composerxe/mkl/bin/mklvars.sh intel64
  For csh/tcsh, put in ~/.cshrc:
      source /opt/intel/composerxe/mkl/bin/mklvars.csh intel64
  clMAGMA is tested with MKL 11.2.3 (2015), both LP64 and ILP64;
  other versions may work.
  The ACML make.inc file assumes $ACMLDIR is set in your environment.
  For bash (sh), put in ~/.bashrc (with your system's path):
      export ACMLDIR=/opt/acml-5.3.1
  For csh/tcsh, put in ~/.cshrc:
      setenv ACMLDIR  /opt/acml-5.3.1
  clMAGMA is tested with ACML 5.3.1; other versions may work.
  See comments in make.inc.acml regarding ACML 4;
  a couple testers fail to compile with ACML 4.
  The ATLAS make.inc file assumes $ATLASDIR and $LAPACKDIR are set in your environment.
  If not installed, install LAPACK from http://www.netlib.org/lapack/
  For bash (sh), put in ~/.bashrc (with your system's path):
      export ATLASDIR=/opt/atlas
      export LAPACKDIR=/opt/LAPACK
  For csh/tcsh, put in ~/.cshrc:
      setenv ATLASDIR  /opt/atlas
      setenv LAPACKDIR /opt/LAPACK

  The OpenBLAS make.inc file assumes $OPENBLASDIR is set in your environment.
  For bash (sh), put in ~/.bashrc (with your system's path):
      export OPENBLASDIR=/opt/openblas
  For csh/tcsh, put in ~/.cshrc:
      setenv OPENBLASDIR /opt/openblas
* Compiling OpenCL kernels at compile time or at run time

  By default, clMAGMA compiles its kernels at compile time and saves them in
  lib/clmagma_kernels.co. At runtime, it searchs CLMAGMA_PATH or LD_LIBRARY_PATH
  for this file.
  Alternatively, if you do not wish to compile kernels at compile time -- for
  instance, if you don't know what GPU will be used at runtime -- install a copy
  of the clmagmablas directory and include this directory in CLMAGMA_PATH or
  TODO add make targets to do each of these.

* Building without Fortran

  clMAGMA can be built without Fortran by commenting out FORT in the make.inc file.
  However, some testers will not be able to check their results.

* Building a Shared Library

  By default now, all make.inc files add the -fPIC option to CFLAGS, FFLAGS,
  F90FLAGS, and NVCCFLAGS, required for building a shared library. Note in
  NVCCFLAGS that -fPIC is passed via the -Xcompiler option. Running:
      make lib
      make test
  will create a shared library lib/libmagma.so, static library lib/libmagma.a,
  and testing drivers in 'testing'.

  (The current exception is for ATLAS, in make.inc.atlas, which in our
   install is a static library, thus requiring clMAGMA to be a static library.)
* Building a Static Library

  Alternatively, comment out FPIC in the make.inc file to compile only a static
  library. Running:
      make lib
      make test
  will create a static library lib/libmagma.a, and testing drivers in 'testing'.
* Install

  To install libraries and include files in a given prefix, run:
      make install prefix=/usr/local/magma
  The default prefix is /usr/local/clmagma. You can also set prefix in make.inc.
* Environment variables

  For multi-GPU functions, set $MAGMA_NUM_GPUS to set the number of GPUs to use.
  For multi-core BLAS libraries, set $OMP_NUM_THREADS or $MKL_NUM_THREADS or
  $VECLIB_MAXIMUM_THREADS to set the number of CPU threads, depending on your
  BLAS library.

* A short standalone EXAMPLE is provided in directory 'example'. This is
  intended to show the minimum needed to start using clMAGMA, without all the
  extra Makefiles, headers, and libraries used in testing. You must edit
  example/Makefile to reflect your make.inc, or use pkg-config, as described in

* To TEST clMAGMA, go to directory 'testing'. Provided are a number of
  drivers testing different routines. These drivers are also useful
  as examples on how to use clMAGMA, as well as to benchmark the performance.

  The testers print "ok" or "failed" for whether the error passes the tolerance.
  In some cases, the tolerance may be too strict, so a test may "fail" even
  though it is only slightly above the tolerance. Error values around 1e-15 for
  double and double-complex, and 1e-7 for single and single-complex, are
  generally acceptable. Values larger than 1e-4 are very suspicious.

* To TUNE clMAGMA, you can modify the blocking factors for the algorithms of
  interest in file 'control/get_nb_tahiti.cpp'. The default values are tuned for
  AMD Radeon 7970 (Tahiti) GPUs. You can also compare your performance to
  what we get, given in file
  'testing/results_clmagma.txt', as an easy check for your installation.

* To autotune clBLAS, set the CLBLAS_STORAGE_PATH environment variable
  to a working directory and run clBLAS tune. Subsequent clMAGMA runs will
  use the optimized routines found in CLBLAS_STORAGE_PATH.

* To use a GPU on a server, disable X forwarding when you ssh to the server,
  using 'ssh -x hostname'. To see whether OpenCL finds your GPU, use 'clinfo'.

For more INFORMATION, please refer to the MAGMA homepage and user forum:


The MAGMA project supports the package in the sense that reports of
errors or poor performance will gain immediate attention from the
developers. Such reports, descriptions of interesting applications,
and other comments should be posted on the MAGMA user forum.