PLASMA  2.8.0
PLASMA - Parallel Linear Algebra for Scalable Multi-core Architectures
int PLASMA_zcungesv ( PLASMA_enum  trans,
int  N,
int  NRHS,
PLASMA_Complex64_t *  A,
int  LDA,
PLASMA_Complex64_t *  B,
int  LDB,
PLASMA_Complex64_t *  X,
int  LDX,
int *  ITER 
)

PLASMA_zcungesv - Solves overdetermined or underdetermined linear systems involving an M-by-N matrix A using the QR or the LQ factorization of A. It is assumed that A has full rank. The following options are provided:

trans = PlasmaNoTrans and M >= N: find the least squares solution of an overdetermined system, i.e., solve the least squares problem: minimize || B - A*X ||.

trans = PlasmaNoTrans and M < N: find the minimum norm solution of an underdetermined system A * X = B.

Several right hand side vectors B and solution vectors X can be handled in a single call; they are stored as the columns of the M-by-NRHS right hand side matrix B and the N-by-NRHS solution matrix X.
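As a small illustration of this storage convention, the sketch below copies right-hand-side vectors into the columns of a matrix B. It assumes the LAPACK-style column-major layout with a leading dimension; the helper names `colmajor_index` and `set_rhs_column` are hypothetical, and a real `double` element type stands in for PLASMA_Complex64_t to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: linear offset of element (i, j), 0-based, in a
 * column-major array whose leading dimension is ld. */
static size_t colmajor_index(int ld, int i, int j)
{
    assert(ld > 0 && i >= 0 && i < ld && j >= 0);
    return (size_t)i + (size_t)j * (size_t)ld;
}

/* Hypothetical helper: store the m-vector b as column j of B.
 * Rows m..ldb-1 of that column are padding and are left untouched. */
static void set_rhs_column(double *B, int ldb, int m, int j, const double *b)
{
    assert(m <= ldb);
    for (int i = 0; i < m; i++)
        B[colmajor_index(ldb, i, j)] = b[i];
}
```

With M = 3, NRHS = 2 and LDB = 4, the two right-hand sides start at offsets 0 and 4 of the array, and offsets 3 and 7 are unused padding.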

PLASMA_zcungesv first attempts to factorize the matrix in COMPLEX and use this factorization within an iterative refinement procedure to produce a solution with COMPLEX*16 normwise backward error quality (see below). If the approach fails the method switches to a COMPLEX*16 factorization and solve.
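The general idea of mixed-precision iterative refinement can be sketched on a small problem. The code below is not the PLASMA implementation: it assumes a real, square, column-major system, and it swaps in an LU factorization with partial pivoting where the routine described here uses a QR-based complex factorization. All function names and the NMAX limit are hypothetical. The factorization and the correction solves run in `float`, while the residual and the solution update are kept in `double`.

```c
#include <assert.h>
#include <math.h>

#define NMAX 8  /* illustrative size limit for the fixed work arrays */

/* Single-precision LU with partial pivoting on a column-major n-by-n
 * matrix, in place. Returns 0 on success, -1 on an exactly zero pivot. */
static int lu_factor_f(int n, float *lu, int *piv)
{
    for (int k = 0; k < n; k++) {
        int p = k;
        for (int i = k + 1; i < n; i++)
            if (fabsf(lu[i + k * n]) > fabsf(lu[p + k * n])) p = i;
        if (lu[p + k * n] == 0.0f) return -1;
        piv[k] = p;
        for (int j = 0; j < n && p != k; j++) {   /* swap rows k and p */
            float t = lu[k + j * n];
            lu[k + j * n] = lu[p + j * n];
            lu[p + j * n] = t;
        }
        for (int i = k + 1; i < n; i++) {
            lu[i + k * n] /= lu[k + k * n];
            for (int j = k + 1; j < n; j++)
                lu[i + j * n] -= lu[i + k * n] * lu[k + j * n];
        }
    }
    return 0;
}

/* Solve with the single-precision factors; x holds b on entry. */
static void lu_solve_f(int n, const float *lu, const int *piv, float *x)
{
    for (int k = 0; k < n; k++) {                 /* apply all row swaps */
        float t = x[k]; x[k] = x[piv[k]]; x[piv[k]] = t;
    }
    for (int k = 0; k < n; k++)                   /* forward: L*y = P*b */
        for (int i = k + 1; i < n; i++) x[i] -= lu[i + k * n] * x[k];
    for (int k = n - 1; k >= 0; k--) {            /* backward: U*x = y */
        x[k] /= lu[k + k * n];
        for (int i = 0; i < k; i++) x[i] -= lu[i + k * n] * x[k];
    }
}

/* Factor once in float, then apply `steps` refinement corrections.
 * Returns the infinity-norm of the final double-precision residual,
 * or -1.0 if the single-precision factorization fails. */
static double refine_steps(int n, const double *A, const double *b,
                           double *x, int steps)
{
    float lu[NMAX * NMAX], w[NMAX];
    double r[NMAX], rnrm = 0.0;
    int piv[NMAX];
    assert(n > 0 && n <= NMAX && steps >= 0);

    for (int i = 0; i < n * n; i++) lu[i] = (float)A[i];
    if (lu_factor_f(n, lu, piv) != 0) return -1.0;
    for (int i = 0; i < n; i++) w[i] = (float)b[i];
    lu_solve_f(n, lu, piv, w);                    /* initial float solve */
    for (int i = 0; i < n; i++) x[i] = w[i];

    for (int s = 0; ; s++) {
        rnrm = 0.0;
        for (int i = 0; i < n; i++) {             /* r = b - A*x in double */
            r[i] = b[i];
            for (int j = 0; j < n; j++) r[i] -= A[i + j * n] * x[j];
            if (fabs(r[i]) > rnrm) rnrm = fabs(r[i]);
        }
        if (s == steps) break;
        for (int i = 0; i < n; i++) w[i] = (float)r[i];
        lu_solve_f(n, lu, piv, w);                /* correction in float */
        for (int i = 0; i < n; i++) x[i] += w[i]; /* update in double */
    }
    return rnrm;
}
```

Calling `refine_steps` with `steps = 0` returns the residual of the plain single-precision solve, so comparing it with the value for a few steps shows how much the refinement recovers on a well-conditioned system. A full routine would additionally fall back to a double-precision factorization when refinement does not converge, which this sketch omits.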

Iterative refinement does not pay off if the ratio of COMPLEX performance to COMPLEX*16 performance is too small. A reasonable strategy should take the number of right-hand sides and the size of the matrix into account. This might be done with a call to ILAENV in the future. For now, iterative refinement is always attempted.

The iterative refinement process stops if ITER > ITERMAX or if, for all right-hand sides, RNRM < N*XNRM*ANRM*EPS*BWDMAX, where:

  • ITER is the number of the current iteration in the iterative refinement process
  • RNRM is the infinity-norm of the residual
  • XNRM is the infinity-norm of the solution
  • ANRM is the infinity-operator-norm of the matrix A
  • EPS is the machine epsilon returned by DLAMCH('Epsilon').

Actually, in its current state (PLASMA 2.1.0), the test is slightly relaxed.

The values ITERMAX and BWDMAX are fixed to 30 and 1.0D+00 respectively.
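The stopping rule described above can be written as a small predicate. This is a sketch of the test as stated in the text, not of the slightly relaxed test the library actually applies. The function names are hypothetical, and `DBL_EPSILON` is used as a stand-in for DLAMCH('Epsilon'), which may differ from it by a small constant factor.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

#define ITERMAX 30    /* value stated in the text */
#define BWDMAX  1.0   /* value stated in the text */

/* Test for one right-hand side: RNRM < N*XNRM*ANRM*EPS*BWDMAX.
 * Assumption: DBL_EPSILON stands in for DLAMCH('Epsilon'). */
static bool rhs_converged(int n, double rnrm, double xnrm, double anrm)
{
    assert(n >= 0 && rnrm >= 0.0 && xnrm >= 0.0 && anrm >= 0.0);
    return rnrm < n * xnrm * anrm * DBL_EPSILON * BWDMAX;
}

/* Refinement stops when the iteration budget is exceeded or when every
 * right-hand side passes the test. rnrm[j] and xnrm[j] are the
 * infinity-norms of the residual and solution for right-hand side j. */
static bool refinement_should_stop(int iter, int nrhs, int n,
                                   const double *rnrm, const double *xnrm,
                                   double anrm)
{
    if (iter > ITERMAX)
        return true;
    for (int j = 0; j < nrhs; j++)
        if (!rhs_converged(n, rnrm[j], xnrm[j], anrm))
            return false;
    return true;
}
```

A single right-hand side with a large residual is enough to keep the refinement going until the iteration limit is reached.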

We follow Bjorck's algorithm proposed in "Iterative Refinement of Linear Least Squares solutions I", BIT, 7:257-278, 1967.

Parameters
[in] trans - Intended usage: = PlasmaNoTrans: the linear system involves A; = PlasmaConjTrans: the linear system involves A**H. Currently only PlasmaNoTrans is supported.
[in] N - The number of columns of the matrix A. N >= 0.
[in] NRHS - The number of right hand sides, i.e., the number of columns of the matrices B and X. NRHS >= 0.
[in] A - The M-by-N matrix A. This matrix is not modified.
[in] LDA - The leading dimension of the array A. LDA >= max(1,M).
[in] B - The M-by-NRHS matrix B of right hand side vectors, stored columnwise. Not modified.
[in] LDB - The leading dimension of the array B. LDB >= MAX(1,M,N).
[out] X - If return value = 0, the solution vectors, stored columnwise. If M >= N, rows 1 to N of X contain the least squares solution vectors; the residual sum of squares for the solution in each column is given by the sum of squares of the modulus of elements N+1 to M in that column. If M < N, rows 1 to N of X contain the minimum norm solution vectors.
[in] LDX - The leading dimension of the array X. LDX >= MAX(1,M,N).
[out] ITER - The number of iterations performed by the iterative refinement process.
Returns
Return values
PLASMA_SUCCESS - successful exit
<0 - if -i, the i-th argument had an illegal value
See also
PLASMA_zcungesv_Tile
PLASMA_zcungesv_Tile_Async
PLASMA_dsungesv
PLASMA_zgels