Commits

David McClosky committed 9fe91c8

Add new reranker retrainer (cvlm-lbfgs)
This is a new Apache license-compatible reranker retrainer. It
uses libLBFGS for optimization and COBYLA to tune regularization
coefficients. This doesn't give exactly the same results as previous
rerankers, but the resulting models appear to be empirically close.
libLBFGS supports both L1 and L2 regularization, using OWLQN for the
former.
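
For orientation, the sketch below shows the typical shape of a liblbfgs call, with the L1 penalty handled by the library's OWL-QN mode (param.orthantwise_c > 0) and an L2 penalty folded directly into the objective and gradient. The toy objective and variable names are purely illustrative and are not taken from cvlm-lbfgs:

    #include <stdio.h>
    #include <lbfgs.h>

    /* Toy objective: sum_i (w[i] - 1)^2, plus an optional L2 term supplied
     * through the instance pointer.  The real retrainer minimizes the
     * reranker loss over the 20-fold feature-count data instead. */
    static lbfgsfloatval_t evaluate(void *instance, const lbfgsfloatval_t *w,
                                    lbfgsfloatval_t *g, const int n,
                                    const lbfgsfloatval_t step) {
        double l2 = *(double *) instance;        /* L2 coefficient (0 = none) */
        lbfgsfloatval_t fx = 0;
        for (int i = 0; i < n; ++i) {
            fx += (w[i] - 1) * (w[i] - 1) + l2 * w[i] * w[i];
            g[i] = 2 * (w[i] - 1) + 2 * l2 * w[i];
        }
        return fx;    /* the L1 term itself is added internally by OWL-QN */
    }

    int main(void) {
        const int n = 10;
        double l2 = 0.0;                 /* set > 0 (and orthantwise_c = 0) for L2 */
        lbfgsfloatval_t fx, *w = lbfgs_malloc(n);
        lbfgs_parameter_t param;
        lbfgs_parameter_init(&param);
        param.orthantwise_c = 1.0;       /* > 0 enables OWL-QN, i.e. an L1 penalty */
        param.linesearch = LBFGS_LINESEARCH_BACKTRACKING;  /* required by OWL-QN */
        for (int i = 0; i < n; ++i) w[i] = 0;
        int rc = lbfgs(n, w, &fx, evaluate, NULL, &l2, &param);
        printf("lbfgs returned %d, f = %g\n", rc, fx);
        lbfgs_free(w);
        return 0;
    }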

Minor changes:
- updated READMEs to drop remaining mentions of TAO/PETSc and include install
instructions for libLBFGS and Boost
- updated Makefile to make cvlm-lbfgs the default optimizer (also made "final"
the default version since that's presumably what most expect)
- updated wlle Makefile to drop remaining mentions of TAO/PETSc and older
optimizers which used to use it
- minor improvements to the main README (a task never finished...)

  • Parent commit 3073d6c

Files changed (8)

 glob:second-stage/programs/wlle/avper
 glob:second-stage/programs/wlle/gavper
 glob:second-stage/programs/wlle/cvlm
+glob:second-stage/programs/wlle/cvlm-lbfgs
 glob:second-stage/programs/wlle/oracle
 glob:*.swp
 glob:*.orig
 # section 24 is used as dev and sections 22 and 23 are used as test1
 # and test2 respectively.
 #
-VERSION=nonfinal
-# VERSION=final
+# VERSION=nonfinal
+VERSION=final
 
 # FEATUREEXTRACTOR is the program used to extract features from
 # the 20-fold n-best parses.  If you change this, please pick a new 
 # of these variable values can be found in the train-eval-reranker.sh
 # script.
 #
-ESTIMATOR=second-stage/programs/wlle/avper
+ESTIMATOR=second-stage/programs/wlle/cvlm-lbfgs
 
 # ESTIMATORFLAGS are flags given to the estimator
 #
-ESTIMATORFLAGS=-n 10 -d 0 -F 1 -N 10
+ESTIMATORFLAGS=-l 1 -c 10 -F 1 -n -1 -p 2
 
 # ESTIMATORNICKNAME is used to name the feature weights file
 #
-ESTIMATORNICKNAME=avper
+ESTIMATORNICKNAME=cvlm-lbfgs
 
 # ESTIMATORSTACKSIZE is the size (in KB) of the per-thread stacks
 # used during estimation
 #       program, sparseval, etc., 
 #  extract-spfeatures, which produces feature-count files used to train 
 #       the reranker, 
-#  cvlm/avper, which estimates the feature weights.
+#  cvlm-lbfgs/avper, which estimates the feature weights.
 #
 .PHONY: reranker
 reranker: top TRAIN
 ~BLLIP/reranking-parser/README
 
-(c) Mark Johnson,Eugene Charniak, 24th November 2005 --- August 2006
+(c) Mark Johnson, Eugene Charniak, 24th November 2005 --- August 2006
 
 We request acknowledgement in any publications that make use of this
 software and any code derived from this software.  Please report the
-release date of the software that you are using (this is part of the
-name of the tar file you have downloaded), as this will enable others
-to compare their results to yours.
+release date of the software that you are using, as this will enable
+others to compare their results to yours.
 
-COMPILING THE PARSER
-====================
+COMPILING AND RUNNING THE PARSER
+================================
 
 To compile the two-stage parser, first define GCCFLAGS appropriately
-for your machine, e.g., with csh or tcsh
+for your machine, e.g., with csh or tcsh:
 
 > setenv GCCFLAGS "-march=pentium4 -mfpmath=sse -msse2 -mmmx"
 
 
 > setenv GCCFLAGS "-march=opteron -m64"
 
-(if unsure, it is safe to leave GCCFLAGS unset -- it just won't be
-as optimized as possible)
+(if unsure, it is safe to leave GCCFLAGS unset -- the defaults are
+generally good these days)
 
 Then execute 
 
-make
+> make
 
 After it has built, the parser can be run with
 
 
 > parse.sh sample-text/sample-data.txt
 
+The input text must already be segmented into sentences, with each sentence wrapped in <s> tags:
+
+    <s> Sentence 1 </s>
+    <s> Sentence 2 </s>
+    ...
+
+Note that there must be a space between each sentence and the surrounding <s> and </s> tags.
+
 The script parse-eval.sh takes a list of treebank files as arguments
 and extracts the terminal strings from them, runs the two-stage parser
 on those terminal strings and then evaluates the parsing accuracy with
 license section at the bottom of this file).
 
 If you're using cvlm as your estimator (the default), you'll also need
-the Boost C++ and the Petsc/Tao C++ libraries in order to retrain the
+the Boost C++ libraries and the libLBFGS library in order to retrain the
 reranker.  If you're using cvlm-owlqn as your estimator, you can ignore
-these steps.  Install instructions for Petsc/Tao are given later in
-this document.  The environment variables PETSC_DIR and TAO_DIR should
-all point to the installation directories of this software.  I define
-these variables in my .login file as follows on my machine.
+this. libLBFGS is available at http://www.chokkan.org/software/liblbfgs/
+under the MIT license. In Ubuntu, you'll need the liblbfgs-dev package:
 
-setenv PETSC_DIR /usr/local/share/petsc
-setenv TAO_DIR /usr/local/share/tao
-setenv PETSC_ARCH linux
-setenv BOPT O_c++
+> sudo apt-get install liblbfgs-dev
+
+Boost can be obtained from http://www.boost.org/ or with the libboost-dev
+package in Ubuntu:
+
+> sudo apt-get install libboost-dev
 
 While many modern Linux distributions come with the Boost C++
 libraries pre-installed, if the Boost C++ libraries are not included

second-stage/README

 This software uses g++ and gcc 4.0; it may also function using earlier
 versions of g++ and gcc.
 
-It also uses the PETSc and TAO optimization packages available from
-Argonne National Labs.  You should install these packages before attempting
-to install this software.  PETSc (which you should install first) can be 
-obtained from
-
-	http://www-unix.mcs.anl.gov/petsc/petsc-2/
-
-and TAO can be obtained from
-
-	http://www-fp.mcs.anl.gov/tao/
-
-PETSc and TAO are used in the programs in the programs/wlle directory.
-If you set the PETSC_DIR, PETSC_ARCH and TAO_DIR environment variables
-as required for the proper functioning of PETSc and TAO, you should not
-need to change anything in my programs.  But if you have a non-standard
-installation of PETSc or TAO, you may need to edit
-programs/wlle/Makefile accordingly.
-
 A single "make" command in the top-level directory should run the feature
 extractor on the trees, run the training program to estimate the model,
 and run the evaluation program to evaluate the model's predictions.

second-stage/programs/wlle/Makefile

 # License for the specific language governing permissions and limitations
 # under the License.
 
-SOURCES = avper.cc cvlm-owlqn.cc hlm.cc gavper.cc lm.cc lmdata.c oracle.cc wavper.cc wlle.cc OWLQN.cpp TerminationCriterion.cpp # cvlm.cc
-TARGETS = avper gavper oracle # cvlm lm oracle wavper cvlm-owlqn hlm
+SOURCES = avper.cc cvlm-lbfgs.cc hlm.cc gavper.cc lm.cc lmdata.c oracle.cc wavper.cc wlle.cc # cvlm.cc OWLQN.cpp TerminationCriterion.cpp
+TARGETS = avper gavper oracle cvlm-lbfgs # cvlm lm oracle wavper cvlm-owlqn hlm
 OBJECTS = $(patsubst %.cpp,%.o,$(patsubst %.l,%.o,$(patsubst %.c,%.o,$(SOURCES:%.cc=%.o))))
 
 all: $(TARGETS)
 
-lm: liblmdata.a lm.o
-	$(CXX) $(LDFLAGS) lm.o liblmdata.a $(TAO_PETSC_LIBS) -o lm
-
 lm-owlqn: lm-owlqn.o OWLQN.o TerminationCriterion.o liblmdata.a
 	$(CXX) $(LDFLAGS) $^ -o lm-owlqn
 
-cvlm: liblmdata.a cvlm.o
-	$(CXX) $(LDFLAGS) cvlm.o liblmdata.a $(TAO_PETSC_LIBS) -o cvlm
-
-cvlm-nolib: cvlm.o lmdata.o
-	$(CXX) $(LDFLAGS) cvlm.o lmdata.o $(TAO_PETSC_LIBS) -o cvlm-nolib
-
 cvlm-owlqn: cvlm-owlqn.o OWLQN.o TerminationCriterion.o liblmdata.a
 	$(CXX) $(LDFLAGS) $^ -o cvlm-owlqn
 
+cvlm-lbfgs: cvlm-lbfgs.o liblmdata.a cobyla.o
+	$(CXX) $(LDFLAGS) $^ -L/usr/local/lib -llbfgs -o cvlm-lbfgs
+
 hlm: hlm.o OWLQN.o TerminationCriterion.o liblmdata.a
 	$(CXX) $(LDFLAGS) $^ -o $@
 
-lnne: libdata.a lnne.o 
-	$(CXX) $(LDFLAGS) lnne.o libdata.a $(TAO_PETSC_LIBS) -o lnne
-
 avper: avper.o liblmdata.a
 	$(CXX) $(LDFLAGS) $^ -o $@ 
 
 wavper: wavper.o liblmdata.a
 	$(CXX) $(LDFLAGS) $^ -o $@ 
 
-wlle: libdata.a wlle.o
-	$(CXX) $(LDFLAGS) wlle.o libdata.a $(TAO_PETSC_LIBS) -o wlle
-
-cvwlle: libdata.a cvwlle.o
-	$(CXX) $(LDFLAGS) cvwlle.o libdata.a $(TAO_PETSC_LIBS) -o cvwlle
-
 oracle: liblmdata.a oracle.o
 	$(CXX) $(LDFLAGS) oracle.o liblmdata.a -o oracle
 
 liblmdata.a: lmdata.o
 	ar rcv liblmdata.a lmdata.o; ranlib liblmdata.a
 
-# TAO stuff
-
-# PETSC_DIR = /usr/local/share/petsc
-# PETSC_ARCH = linux
-# TAO_DIR = /usr/local/share/tao
-# BOPT = O_c++
-
-TAO_PETSC_INCLUDE = -I${PETSC_DIR} -I${PETSC_DIR}/bmake/$(PETSC_ARCH) -I${PETSC_DIR}/include -I${PETSC_DIR}/include/mpiuni -I${TAO_DIR} -I${TAO_DIR}/include -I${PETSC_DIR}/$(PETSC_ARCH)/include
-
-# TAO_PETSC_LIBS = -L${TAO_DIR}/lib/${PETSC_ARCH} -ltaopetsc -ltao -Wl,-rpath,${TAO_DIR}/lib/${PETSC_ARCH} -Wl,-rpath,${PETSC_DIR}/lib/${PETSC_ARCH} -L${PETSC_DIR}/lib/${PETSC_ARCH} -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc   -Wl,-rpath,${PETSC_DIR}/lib/${PETSC_ARCH} -L${PETSC_DIR}/lib/${PETSC_ARCH} -lmpiuni -L${LLAPACK_DIR} -llapack -lblas -lm -lstdc++ -lgcc_s
-# TAO_PETSC_LIBS = -L${TAO_DIR}/lib/${PETSC_ARCH} -ltaopetsc -ltao -Wl,-rpath,${TAO_DIR}/lib/${PETSC_ARCH} -Wl,-rpath,${PETSC_DIR}/lib/${PETSC_ARCH} -L${PETSC_DIR}/lib/${PETSC_ARCH} -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc   -Wl,-rpath,${PETSC_DIR}/lib/${PETSC_ARCH} -L${PETSC_DIR}/lib/${PETSC_ARCH} -lmpiuni -llapack -lblas -lm -lstdc++ -lgcc_s
-# TAO_PETSC_LIBS = -L${TAO_DIR}/lib/${PETSC_ARCH} -ltaopetsc -ltao -Wl,-rpath,${TAO_DIR}/lib/${PETSC_ARCH} -Wl,-rpath,${PETSC_DIR}/lib/${PETSC_ARCH} -L${PETSC_DIR}/lib/${PETSC_ARCH} -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -Wl,-rpath,${PETSC_DIR}/lib/${PETSC_ARCH} -L${PETSC_DIR}/lib/${PETSC_ARCH} -lmpiuni -lm -lstdc++ -lgcc_s
-TAO_PETSC_LIBS = -L${TAO_DIR}/lib/${PETSC_ARCH} -ltaopetsc -ltao -Wl,-rpath,${TAO_DIR}/lib/${PETSC_ARCH} -Wl,-rpath,${PETSC_DIR}/lib/${PETSC_ARCH} -L${PETSC_DIR}/lib/${PETSC_ARCH} -L${PETSC_DIR}/${PETSC_ARCH}/lib -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -Wl,-rpath,${PETSC_DIR}/lib/${PETSC_ARCH} -Wl,-rpath,${PETSC_DIR}/${PETSC_ARCH}/lib -L${PETSC_DIR}/lib/${PETSC_ARCH} -lmpiuni -lm -lstdc++ -lgcc_s
-
-# end of TAO stuff
-
 # Compilation help: you may need to remove -march=native on older compilers.
 GCCFLAGS=-march=native -mfpmath=sse -msse2 -mmmx
-FOPENMP=-fopenmp
 
 CC=gcc
+
+# fast options
+FOPENMP=-fopenmp
 CFLAGS=-MMD -O6 -ffast-math -fstrict-aliasing -Wall -finline-functions $(GCCFLAGS) $(FOPENMP)
 LDFLAGS=$(FOPENMP)
-CXXFLAGS=${CFLAGS} ${TAO_PETSC_INCLUDE} -Wno-deprecated
+CXXFLAGS=${CFLAGS} -Wno-deprecated
 
+# debugging options
 # FOPENMP=
-# CFLAGS=-MMD -O1 -g $(GCCFLAGS) $(FOPENMP)
+# CFLAGS=-MMD -O0 -g $(GCCFLAGS) $(FOPENMP)
 # LDFLAGS=-g $(FOPENMP)
-# CXXFLAGS=${CFLAGS} ${TAO_PETSC_INCLUDE} -Wno-deprecated
+# CXXFLAGS=${CFLAGS} -Wno-deprecated
 
 .PHONY: real-clean
 real-clean: clean

second-stage/programs/wlle/cobyla.c

+/* cobyla : constrained optimization by linear approximation */
+
+/*
+ * Copyright (c) 1992, Michael J. D. Powell (M.J.D.Powell@damtp.cam.ac.uk)
+ * Copyright (c) 2004, Jean-Sebastien Roy (js@jeannot.org)
+ * 
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ * 
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Software.
+ * 
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/*
+ * This software is a C version of COBYLA2, a constrained optimization by linear
+ * approximation package developed by Michael J. D. Powell in Fortran.
+ * 
+ * The original source code can be found at :
+ * http://plato.la.asu.edu/topics/problems/nlores.html
+ */
+
+static char const rcsid[] =
+  "@(#) $Jeannot: cobyla.c,v 1.11 2004/04/18 09:51:36 js Exp $";
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <math.h>
+
+#include "cobyla.h"
+
+#define min(a,b) ((a) <= (b) ? (a) : (b))
+#define max(a,b) ((a) >= (b) ? (a) : (b))
+#define abs(x) ((x) >= 0 ? (x) : -(x))
+
+/*
+ * Return code strings
+ */
+char *cobyla_rc_string[6] =
+{
+  "N<0 or M<0",
+  "Memory allocation failed",
+  "Normal return from cobyla",
+  "Maximum number of function evaluations reached",
+  "Rounding errors are becoming damaging",
+  "User requested end of minimization"
+};
+
+static int cobylb(int *n, int *m, int *mpp, double *x, double *rhobeg,
+  double *rhoend, int *iprint, int *maxfun, double *con, double *sim,
+  double *simi, double *datmat, double *a, double *vsig, double *veta,
+  double *sigbar, double *dx, double *w, int *iact, cobyla_function *calcfc,
+  void *state);
+static int trstlp(int *n, int *m, double *a, double *b, double *rho,
+  double *dx, int *ifull, int *iact, double *z__, double *zdota, double *vmultc,
+  double *sdirn, double *dxnew, double *vmultd);
+
+/* ------------------------------------------------------------------------ */
+
+int cobyla(int n, int m, double *x, double rhobeg, double rhoend, int iprint,
+  int *maxfun, cobyla_function *calcfc, void *state)
+{
+  int icon, isim, isigb, idatm, iveta, isimi, ivsig, iwork, ia, idx, mpp, rc;
+  int *iact;
+  double *w;
+
+/*
+ * This subroutine minimizes an objective function F(X) subject to M
+ * inequality constraints on X, where X is a vector of variables that has 
+ * N components. The algorithm employs linear approximations to the 
+ * objective and constraint functions, the approximations being formed by 
+ * linear interpolation at N+1 points in the space of the variables. 
+ * We regard these interpolation points as vertices of a simplex. The 
+ * parameter RHO controls the size of the simplex and it is reduced 
+ * automatically from RHOBEG to RHOEND. For each RHO the subroutine tries 
+ * to achieve a good vector of variables for the current size, and then 
+ * RHO is reduced until the value RHOEND is reached. Therefore RHOBEG and 
+ * RHOEND should be set to reasonable initial changes to and the required 
+ * accuracy in the variables respectively, but this accuracy should be 
+ * viewed as a subject for experimentation because it is not guaranteed. 
+ * The subroutine has an advantage over many of its competitors, however, 
+ * which is that it treats each constraint individually when calculating 
+ * a change to the variables, instead of lumping the constraints together 
+ * into a single penalty function. The name of the subroutine is derived 
+ * from the phrase Constrained Optimization BY Linear Approximations. 
+ *
+ * The user must set the values of N, M, RHOBEG and RHOEND, and must 
+ * provide an initial vector of variables in X. Further, the value of 
+ * IPRINT should be set to 0, 1, 2 or 3, which controls the amount of 
+ * printing during the calculation. Specifically, there is no output if 
+ * IPRINT=0 and there is output only at the end of the calculation if 
+ * IPRINT=1. Otherwise each new value of RHO and SIGMA is printed. 
+ * Further, the vector of variables and some function information are 
+ * given either when RHO is reduced or when each new value of F(X) is 
+ * computed in the cases IPRINT=2 or IPRINT=3 respectively. Here SIGMA 
+ * is a penalty parameter, it being assumed that a change to X is an 
+ * improvement if it reduces the merit function 
+ *      F(X)+SIGMA*MAX(0.0,-C1(X),-C2(X),...,-CM(X)), 
+ * where C1,C2,...,CM denote the constraint functions that should become 
+ * nonnegative eventually, at least to the precision of RHOEND. In the 
+ * printed output the displayed term that is multiplied by SIGMA is 
+ * called MAXCV, which stands for 'MAXimum Constraint Violation'. The 
+ * argument MAXFUN is an int variable that must be set by the user to a 
+ * limit on the number of calls of CALCFC, the purpose of this routine being 
+ * given below. The value of MAXFUN will be altered to the number of calls 
+ * of CALCFC that are made. The arguments W and IACT provide real and 
+ * int arrays that are used as working space. Their lengths must be at 
+ * least N*(3*N+2*M+11)+4*M+6 and M+1 respectively. 
+ *
+ * In order to define the objective and constraint functions, we require 
+ * a subroutine that has the name and arguments 
+ *      SUBROUTINE CALCFC (N,M,X,F,CON) 
+ *      DIMENSION X(*),CON(*)  . 
+ * The values of N and M are fixed and have been defined already, while 
+ * X is now the current vector of variables. The subroutine should return 
+ * the objective and constraint functions at X in F and CON(1),CON(2), 
+ * ...,CON(M). Note that we are trying to adjust X so that F(X) is as 
+ * small as possible subject to the constraint functions being nonnegative. 
+ *
+ * Partition the working space array W to provide the storage that is needed 
+ * for the main calculation.
+ */
+
+  if (n == 0)
+  {
+    if (iprint>=1) fprintf(stderr, "cobyla: N==0.\n");
+    *maxfun = 0;
+    return 0;
+  }
+
+  if (n < 0 || m < 0)
+  {
+    if (iprint>=1) fprintf(stderr, "cobyla: N<0 or M<0.\n");
+    *maxfun = 0;
+    return -2;
+  }
+
+  /* workspace allocation */
+  w = malloc((n*(3*n+2*m+11)+4*m+6)*sizeof(*w));
+  if (w == NULL)
+  {
+    if (iprint>=1) fprintf(stderr, "cobyla: memory allocation error.\n");
+    *maxfun = 0;
+    return -1;
+  }
+  iact = malloc((m+1)*sizeof(*iact));
+  if (iact == NULL)
+  {
+    if (iprint>=1) fprintf(stderr, "cobyla: memory allocation error.\n");
+    free(w);
+    *maxfun = 0;
+    return -1;
+  }
+  
+  /* Parameter adjustments */
+  --iact;
+  --w;
+  --x;
+
+  /* Function Body */
+  mpp = m + 2;
+  icon = 1;
+  isim = icon + mpp;
+  isimi = isim + n * n + n;
+  idatm = isimi + n * n;
+  ia = idatm + n * mpp + mpp;
+  ivsig = ia + m * n + n;
+  iveta = ivsig + n;
+  isigb = iveta + n;
+  idx = isigb + n;
+  iwork = idx + n;
+  rc = cobylb(&n, &m, &mpp, &x[1], &rhobeg, &rhoend, &iprint, maxfun,
+      &w[icon], &w[isim], &w[isimi], &w[idatm], &w[ia], &w[ivsig], &w[iveta],
+      &w[isigb], &w[idx], &w[iwork], &iact[1], calcfc, state);
+
+  /* Parameter adjustments (reverse) */
+  ++iact;
+  ++w;
+
+  free(w);
+  free(iact);
+  
+  return rc;
+} /* cobyla */
+
+/* ------------------------------------------------------------------------- */
+int cobylb(int *n, int *m, int *mpp, double 
+    *x, double *rhobeg, double *rhoend, int *iprint, int *
+    maxfun, double *con, double *sim, double *simi, 
+    double *datmat, double *a, double *vsig, double *veta,
+     double *sigbar, double *dx, double *w, int *iact, cobyla_function *calcfc,
+     void *state)
+{
+  /* System generated locals */
+  int sim_dim1, sim_offset, simi_dim1, simi_offset, datmat_dim1, 
+      datmat_offset, a_dim1, a_offset, i__1, i__2, i__3;
+  double d__1, d__2;
+
+  /* Local variables */
+  double alpha, delta, denom, tempa, barmu;
+  double beta, cmin = 0.0, cmax = 0.0;
+  double cvmaxm, dxsign, prerem = 0.0;
+  double edgmax, pareta, prerec = 0.0, phimin, parsig = 0.0;
+  double gamma;
+  double phi, rho, sum = 0.0;
+  double ratio, vmold, parmu, error, vmnew;
+  double resmax, cvmaxp;
+  double resnew, trured;
+  double temp, wsig, f;
+  double weta;
+  int i__, j, k, l;
+  int idxnew;
+  int iflag = 0;
+  int iptemp;
+  int isdirn, nfvals, izdota;
+  int ivmc;
+  int ivmd;
+  int mp, np, iz, ibrnch;
+  int nbest, ifull, iptem, jdrop;
+  int rc = 0;
+
+/* Set the initial values of some parameters. The last column of SIM holds */
+/* the optimal vertex of the current simplex, and the preceding N columns */
+/* hold the displacements from the optimal vertex to the other vertices. */
+/* Further, SIMI holds the inverse of the matrix that is contained in the */
+/* first N columns of SIM. */
+
+  /* Parameter adjustments */
+  a_dim1 = *n;
+  a_offset = 1 + a_dim1 * 1;
+  a -= a_offset;
+  simi_dim1 = *n;
+  simi_offset = 1 + simi_dim1 * 1;
+  simi -= simi_offset;
+  sim_dim1 = *n;
+  sim_offset = 1 + sim_dim1 * 1;
+  sim -= sim_offset;
+  datmat_dim1 = *mpp;
+  datmat_offset = 1 + datmat_dim1 * 1;
+  datmat -= datmat_offset;
+  --x;
+  --con;
+  --vsig;
+  --veta;
+  --sigbar;
+  --dx;
+  --w;
+  --iact;
+
+  /* Function Body */
+  iptem = min(*n,4);
+  iptemp = iptem + 1;
+  np = *n + 1;
+  mp = *m + 1;
+  alpha = .25;
+  beta = 2.1;
+  gamma = .5;
+  delta = 1.1;
+  rho = *rhobeg;
+  parmu = 0.;
+  if (*iprint >= 2) {
+    fprintf(stderr,
+      "cobyla: the initial value of RHO is %12.6E and PARMU is set to zero.\n",
+      rho);
+  }
+  nfvals = 0;
+  temp = 1. / rho;
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    sim[i__ + np * sim_dim1] = x[i__];
+    i__2 = *n;
+    for (j = 1; j <= i__2; ++j) {
+      sim[i__ + j * sim_dim1] = 0.;
+      simi[i__ + j * simi_dim1] = 0.;
+    }
+    sim[i__ + i__ * sim_dim1] = rho;
+    simi[i__ + i__ * simi_dim1] = temp;
+  }
+  jdrop = np;
+  ibrnch = 0;
+
+/* Make the next call of the user-supplied subroutine CALCFC. These */
+/* instructions are also used for calling CALCFC during the iterations of */
+/* the algorithm. */
+
+L40:
+  if (nfvals >= *maxfun && nfvals > 0) {
+    if (*iprint >= 1) {
+      fprintf(stderr,
+        "cobyla: maximum number of function evaluations reached.\n");
+    }
+    rc = 1;
+    goto L600;
+  }
+  ++nfvals;
+  if (calcfc(*n, *m, &x[1], &f, &con[1], state))
+  {
+    if (*iprint >= 1) {
+      fprintf(stderr, "cobyla: user requested end of minimization.\n");
+    }
+    rc = 3;
+    goto L600;
+  }
+  resmax = 0.;
+  if (*m > 0) {
+    i__1 = *m;
+    for (k = 1; k <= i__1; ++k) {
+      d__1 = resmax, d__2 = -con[k];
+      resmax = max(d__1,d__2);
+    }
+  }
+  if (nfvals == *iprint - 1 || *iprint == 3) {
+    fprintf(stderr, "cobyla: NFVALS = %4d, F =%13.6E, MAXCV =%13.6E\n",
+      nfvals, f, resmax);
+    i__1 = iptem;
+    fprintf(stderr, "cobyla: X =");
+    for (i__ = 1; i__ <= i__1; ++i__) {
+      if (i__>1) fprintf(stderr, "  ");
+      fprintf(stderr, "%13.6E", x[i__]);
+    }
+    if (iptem < *n) {
+      i__1 = *n;
+      for (i__ = iptemp; i__ <= i__1; ++i__) {
+        if (!((i__-1) % 4)) fprintf(stderr, "\ncobyla:  ");
+        fprintf(stderr, "%15.6E", x[i__]);
+      }
+    }
+    fprintf(stderr, "\n");
+  }
+  con[mp] = f;
+  con[*mpp] = resmax;
+  if (ibrnch == 1) {
+    goto L440;
+  }
+
+/* Set the recently calculated function values in a column of DATMAT. This */
+/* array has a column for each vertex of the current simplex, the entries of */
+/* each column being the values of the constraint functions (if any) */
+/* followed by the objective function and the greatest constraint violation */
+/* at the vertex. */
+
+  i__1 = *mpp;
+  for (k = 1; k <= i__1; ++k) {
+    datmat[k + jdrop * datmat_dim1] = con[k];
+  }
+  if (nfvals > np) {
+    goto L130;
+  }
+
+/* Exchange the new vertex of the initial simplex with the optimal vertex if */
+/* necessary. Then, if the initial simplex is not complete, pick its next */
+/* vertex and calculate the function values there. */
+
+  if (jdrop <= *n) {
+    if (datmat[mp + np * datmat_dim1] <= f) {
+      x[jdrop] = sim[jdrop + np * sim_dim1];
+    } else {
+      sim[jdrop + np * sim_dim1] = x[jdrop];
+      i__1 = *mpp;
+      for (k = 1; k <= i__1; ++k) {
+        datmat[k + jdrop * datmat_dim1] = datmat[k + np * datmat_dim1]
+            ;
+        datmat[k + np * datmat_dim1] = con[k];
+      }
+      i__1 = jdrop;
+      for (k = 1; k <= i__1; ++k) {
+        sim[jdrop + k * sim_dim1] = -rho;
+        temp = 0.f;
+        i__2 = jdrop;
+        for (i__ = k; i__ <= i__2; ++i__) {
+          temp -= simi[i__ + k * simi_dim1];
+        }
+        simi[jdrop + k * simi_dim1] = temp;
+      }
+    }
+  }
+  if (nfvals <= *n) {
+    jdrop = nfvals;
+    x[jdrop] += rho;
+    goto L40;
+  }
+L130:
+  ibrnch = 1;
+
+/* Identify the optimal vertex of the current simplex. */
+
+L140:
+  phimin = datmat[mp + np * datmat_dim1] + parmu * datmat[*mpp + np * 
+      datmat_dim1];
+  nbest = np;
+  i__1 = *n;
+  for (j = 1; j <= i__1; ++j) {
+    temp = datmat[mp + j * datmat_dim1] + parmu * datmat[*mpp + j * 
+        datmat_dim1];
+    if (temp < phimin) {
+      nbest = j;
+      phimin = temp;
+    } else if (temp == phimin && parmu == 0.) {
+      if (datmat[*mpp + j * datmat_dim1] < datmat[*mpp + nbest * 
+          datmat_dim1]) {
+        nbest = j;
+      }
+    }
+  }
+
+/* Switch the best vertex into pole position if it is not there already, */
+/* and also update SIM, SIMI and DATMAT. */
+
+  if (nbest <= *n) {
+    i__1 = *mpp;
+    for (i__ = 1; i__ <= i__1; ++i__) {
+      temp = datmat[i__ + np * datmat_dim1];
+      datmat[i__ + np * datmat_dim1] = datmat[i__ + nbest * datmat_dim1]
+          ;
+      datmat[i__ + nbest * datmat_dim1] = temp;
+    }
+    i__1 = *n;
+    for (i__ = 1; i__ <= i__1; ++i__) {
+      temp = sim[i__ + nbest * sim_dim1];
+      sim[i__ + nbest * sim_dim1] = 0.;
+      sim[i__ + np * sim_dim1] += temp;
+      tempa = 0.;
+      i__2 = *n;
+      for (k = 1; k <= i__2; ++k) {
+        sim[i__ + k * sim_dim1] -= temp;
+        tempa -= simi[k + i__ * simi_dim1];
+      }
+      simi[nbest + i__ * simi_dim1] = tempa;
+    }
+  }
+
+/* Make an error return if SIGI is a poor approximation to the inverse of */
+/* the leading N by N submatrix of SIG. */
+
+  error = 0.;
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    i__2 = *n;
+    for (j = 1; j <= i__2; ++j) {
+      temp = 0.;
+      if (i__ == j) {
+        temp += -1.;
+      }
+      i__3 = *n;
+      for (k = 1; k <= i__3; ++k) {
+        temp += simi[i__ + k * simi_dim1] * sim[k + j * sim_dim1];
+      }
+      d__1 = error, d__2 = abs(temp);
+      error = max(d__1,d__2);
+    }
+  }
+  if (error > .1) {
+    if (*iprint >= 1) {
+      fprintf(stderr, "cobyla: rounding errors are becoming damaging.\n");
+    }
+    rc = 2;
+    goto L600;
+  }
+
+/* Calculate the coefficients of the linear approximations to the objective */
+/* and constraint functions, placing minus the objective function gradient */
+/* after the constraint gradients in the array A. The vector W is used for */
+/* working space. */
+
+  i__2 = mp;
+  for (k = 1; k <= i__2; ++k) {
+    con[k] = -datmat[k + np * datmat_dim1];
+    i__1 = *n;
+    for (j = 1; j <= i__1; ++j) {
+      w[j] = datmat[k + j * datmat_dim1] + con[k];
+    }
+    i__1 = *n;
+    for (i__ = 1; i__ <= i__1; ++i__) {
+      temp = 0.;
+      i__3 = *n;
+      for (j = 1; j <= i__3; ++j) {
+        temp += w[j] * simi[j + i__ * simi_dim1];
+      }
+      if (k == mp) {
+        temp = -temp;
+      }
+      a[i__ + k * a_dim1] = temp;
+    }
+  }
+
+/* Calculate the values of sigma and eta, and set IFLAG=0 if the current */
+/* simplex is not acceptable. */
+
+  iflag = 1;
+  parsig = alpha * rho;
+  pareta = beta * rho;
+  i__1 = *n;
+  for (j = 1; j <= i__1; ++j) {
+    wsig = 0.;
+    weta = 0.;
+    i__2 = *n;
+    for (i__ = 1; i__ <= i__2; ++i__) {
+      d__1 = simi[j + i__ * simi_dim1];
+      wsig += d__1 * d__1;
+      d__1 = sim[i__ + j * sim_dim1];
+      weta += d__1 * d__1;
+    }
+    vsig[j] = 1. / sqrt(wsig);
+    veta[j] = sqrt(weta);
+    if (vsig[j] < parsig || veta[j] > pareta) {
+      iflag = 0;
+    }
+  }
+
+/* If a new vertex is needed to improve acceptability, then decide which */
+/* vertex to drop from the simplex. */
+
+  if (ibrnch == 1 || iflag == 1) {
+    goto L370;
+  }
+  jdrop = 0;
+  temp = pareta;
+  i__1 = *n;
+  for (j = 1; j <= i__1; ++j) {
+    if (veta[j] > temp) {
+      jdrop = j;
+      temp = veta[j];
+    }
+  }
+  if (jdrop == 0) {
+    i__1 = *n;
+    for (j = 1; j <= i__1; ++j) {
+      if (vsig[j] < temp) {
+        jdrop = j;
+        temp = vsig[j];
+      }
+    }
+  }
+
+/* Calculate the step to the new vertex and its sign. */
+
+  temp = gamma * rho * vsig[jdrop];
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    dx[i__] = temp * simi[jdrop + i__ * simi_dim1];
+  }
+  cvmaxp = 0.;
+  cvmaxm = 0.;
+  i__1 = mp;
+  for (k = 1; k <= i__1; ++k) {
+    sum = 0.;
+    i__2 = *n;
+    for (i__ = 1; i__ <= i__2; ++i__) {
+      sum += a[i__ + k * a_dim1] * dx[i__];
+    }
+    if (k < mp) {
+      temp = datmat[k + np * datmat_dim1];
+      d__1 = cvmaxp, d__2 = -sum - temp;
+      cvmaxp = max(d__1,d__2);
+      d__1 = cvmaxm, d__2 = sum - temp;
+      cvmaxm = max(d__1,d__2);
+    }
+  }
+  dxsign = 1.;
+  if (parmu * (cvmaxp - cvmaxm) > sum + sum) {
+    dxsign = -1.;
+  }
+
+/* Update the elements of SIM and SIMI, and set the next X. */
+
+  temp = 0.;
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    dx[i__] = dxsign * dx[i__];
+    sim[i__ + jdrop * sim_dim1] = dx[i__];
+    temp += simi[jdrop + i__ * simi_dim1] * dx[i__];
+  }
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    simi[jdrop + i__ * simi_dim1] /= temp;
+  }
+  i__1 = *n;
+  for (j = 1; j <= i__1; ++j) {
+    if (j != jdrop) {
+      temp = 0.;
+      i__2 = *n;
+      for (i__ = 1; i__ <= i__2; ++i__) {
+        temp += simi[j + i__ * simi_dim1] * dx[i__];
+      }
+      i__2 = *n;
+      for (i__ = 1; i__ <= i__2; ++i__) {
+        simi[j + i__ * simi_dim1] -= temp * simi[jdrop + i__ * 
+            simi_dim1];
+      }
+    }
+    x[j] = sim[j + np * sim_dim1] + dx[j];
+  }
+  goto L40;
+
+/* Calculate DX=x(*)-x(0). Branch if the length of DX is less than 0.5*RHO. */
+
+L370:
+  iz = 1;
+  izdota = iz + *n * *n;
+  ivmc = izdota + *n;
+  isdirn = ivmc + mp;
+  idxnew = isdirn + *n;
+  ivmd = idxnew + *n;
+  trstlp(n, m, &a[a_offset], &con[1], &rho, &dx[1], &ifull, &iact[1], &w[
+      iz], &w[izdota], &w[ivmc], &w[isdirn], &w[idxnew], &w[ivmd]);
+  if (ifull == 0) {
+    temp = 0.;
+    i__1 = *n;
+    for (i__ = 1; i__ <= i__1; ++i__) {
+      d__1 = dx[i__];
+      temp += d__1 * d__1;
+    }
+    if (temp < rho * .25 * rho) {
+      ibrnch = 1;
+      goto L550;
+    }
+  }
+
+/* Predict the change to F and the new maximum constraint violation if the */
+/* variables are altered from x(0) to x(0)+DX. */
+
+  resnew = 0.;
+  con[mp] = 0.;
+  i__1 = mp;
+  for (k = 1; k <= i__1; ++k) {
+    sum = con[k];
+    i__2 = *n;
+    for (i__ = 1; i__ <= i__2; ++i__) {
+      sum -= a[i__ + k * a_dim1] * dx[i__];
+    }
+    if (k < mp) {
+      resnew = max(resnew,sum);
+    }
+  }
+
+/* Increase PARMU if necessary and branch back if this change alters the */
+/* optimal vertex. Otherwise PREREM and PREREC will be set to the predicted */
+/* reductions in the merit function and the maximum constraint violation */
+/* respectively. */
+
+  barmu = 0.;
+  prerec = datmat[*mpp + np * datmat_dim1] - resnew;
+  if (prerec > 0.) {
+    barmu = sum / prerec;
+  }
+  if (parmu < barmu * 1.5) {
+    parmu = barmu * 2.;
+    if (*iprint >= 2) {
+      fprintf(stderr, "cobyla: increase in PARMU to %12.6E\n", parmu);
+    }
+    phi = datmat[mp + np * datmat_dim1] + parmu * datmat[*mpp + np * 
+        datmat_dim1];
+    i__1 = *n;
+    for (j = 1; j <= i__1; ++j) {
+      temp = datmat[mp + j * datmat_dim1] + parmu * datmat[*mpp + j * 
+          datmat_dim1];
+      if (temp < phi) {
+        goto L140;
+      }
+      if (temp == phi && parmu == 0.f) {
+        if (datmat[*mpp + j * datmat_dim1] < datmat[*mpp + np * 
+            datmat_dim1]) {
+          goto L140;
+        }
+      }
+    }
+  }
+  prerem = parmu * prerec - sum;
+
+/* Calculate the constraint and objective functions at x(*). Then find the */
+/* actual reduction in the merit function. */
+
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    x[i__] = sim[i__ + np * sim_dim1] + dx[i__];
+  }
+  ibrnch = 1;
+  goto L40;
+L440:
+  vmold = datmat[mp + np * datmat_dim1] + parmu * datmat[*mpp + np * 
+      datmat_dim1];
+  vmnew = f + parmu * resmax;
+  trured = vmold - vmnew;
+  if (parmu == 0. && f == datmat[mp + np * datmat_dim1]) {
+    prerem = prerec;
+    trured = datmat[*mpp + np * datmat_dim1] - resmax;
+  }
+
+/* Begin the operations that decide whether x(*) should replace one of the */
+/* vertices of the current simplex, the change being mandatory if TRURED is */
+/* positive. Firstly, JDROP is set to the index of the vertex that is to be */
+/* replaced. */
+
+  ratio = 0.;
+  if (trured <= 0.f) {
+    ratio = 1.f;
+  }
+  jdrop = 0;
+  i__1 = *n;
+  for (j = 1; j <= i__1; ++j) {
+    temp = 0.;
+    i__2 = *n;
+    for (i__ = 1; i__ <= i__2; ++i__) {
+      temp += simi[j + i__ * simi_dim1] * dx[i__];
+    }
+    temp = abs(temp);
+    if (temp > ratio) {
+      jdrop = j;
+      ratio = temp;
+    }
+    sigbar[j] = temp * vsig[j];
+  }
+
+/* Calculate the value of ell. */
+
+  edgmax = delta * rho;
+  l = 0;
+  i__1 = *n;
+  for (j = 1; j <= i__1; ++j) {
+    if (sigbar[j] >= parsig || sigbar[j] >= vsig[j]) {
+      temp = veta[j];
+      if (trured > 0.) {
+        temp = 0.;
+        i__2 = *n;
+        for (i__ = 1; i__ <= i__2; ++i__) {
+          d__1 = dx[i__] - sim[i__ + j * sim_dim1];
+          temp += d__1 * d__1;
+        }
+        temp = sqrt(temp);
+      }
+      if (temp > edgmax) {
+        l = j;
+        edgmax = temp;
+      }
+    }
+  }
+  if (l > 0) {
+    jdrop = l;
+  }
+  if (jdrop == 0) {
+    goto L550;
+  }
+
+/* Revise the simplex by updating the elements of SIM, SIMI and DATMAT. */
+
+  temp = 0.;
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    sim[i__ + jdrop * sim_dim1] = dx[i__];
+    temp += simi[jdrop + i__ * simi_dim1] * dx[i__];
+  }
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    simi[jdrop + i__ * simi_dim1] /= temp;
+  }
+  i__1 = *n;
+  for (j = 1; j <= i__1; ++j) {
+    if (j != jdrop) {
+      temp = 0.;
+      i__2 = *n;
+      for (i__ = 1; i__ <= i__2; ++i__) {
+        temp += simi[j + i__ * simi_dim1] * dx[i__];
+      }
+      i__2 = *n;
+      for (i__ = 1; i__ <= i__2; ++i__) {
+        simi[j + i__ * simi_dim1] -= temp * simi[jdrop + i__ * 
+            simi_dim1];
+      }
+    }
+  }
+  i__1 = *mpp;
+  for (k = 1; k <= i__1; ++k) {
+    datmat[k + jdrop * datmat_dim1] = con[k];
+  }
+
+/* Branch back for further iterations with the current RHO. */
+
+  if (trured > 0. && trured >= prerem * .1) {
+    goto L140;
+  }
+L550:
+  if (iflag == 0) {
+    ibrnch = 0;
+    goto L140;
+  }
+
+/* Otherwise reduce RHO if it is not at its least value and reset PARMU. */
+
+  if (rho > *rhoend) {
+    rho *= .5;
+    if (rho <= *rhoend * 1.5) {
+      rho = *rhoend;
+    }
+    if (parmu > 0.) {
+      denom = 0.;
+      i__1 = mp;
+      for (k = 1; k <= i__1; ++k) {
+        cmin = datmat[k + np * datmat_dim1];
+        cmax = cmin;
+        i__2 = *n;
+        for (i__ = 1; i__ <= i__2; ++i__) {
+          d__1 = cmin, d__2 = datmat[k + i__ * datmat_dim1];
+          cmin = min(d__1,d__2);
+          d__1 = cmax, d__2 = datmat[k + i__ * datmat_dim1];
+          cmax = max(d__1,d__2);
+        }
+        if (k <= *m && cmin < cmax * .5) {
+          temp = max(cmax,0.) - cmin;
+          if (denom <= 0.) {
+            denom = temp;
+          } else {
+            denom = min(denom,temp);
+          }
+        }
+      }
+      if (denom == 0.) {
+        parmu = 0.;
+      } else if (cmax - cmin < parmu * denom) {
+        parmu = (cmax - cmin) / denom;
+      }
+    }
+    if (*iprint >= 2) {
+      fprintf(stderr, "cobyla: reduction in RHO to %12.6E and PARMU =%13.6E\n",
+        rho, parmu);
+    }
+    if (*iprint == 2) {
+      fprintf(stderr, "cobyla: NFVALS = %4d, F =%13.6E, MAXCV =%13.6E\n",
+        nfvals, datmat[mp + np * datmat_dim1], datmat[*mpp + np * datmat_dim1]);
+
+      fprintf(stderr, "cobyla: X =");
+      i__1 = iptem;
+      for (i__ = 1; i__ <= i__1; ++i__) {
+        if (i__>1) fprintf(stderr, "  ");
+        fprintf(stderr, "%13.6E", sim[i__ + np * sim_dim1]);
+      }
+      if (iptem < *n) {
+        i__1 = *n;
+        for (i__ = iptemp; i__ <= i__1; ++i__) {
+          if (!((i__-1) % 4)) fprintf(stderr, "\ncobyla:  ");
+          fprintf(stderr, "%15.6E", x[i__]);
+        }
+      }
+      fprintf(stderr, "\n");
+    }
+    goto L140;
+  }
+
+/* Return the best calculated values of the variables. */
+
+  if (*iprint >= 1) {
+    fprintf(stderr, "cobyla: normal return.\n");
+  }
+  if (ifull == 1) {
+    goto L620;
+  }
+L600:
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    x[i__] = sim[i__ + np * sim_dim1];
+  }
+  f = datmat[mp + np * datmat_dim1];
+  resmax = datmat[*mpp + np * datmat_dim1];
+L620:
+  if (*iprint >= 1) {
+    fprintf(stderr, "cobyla: NFVALS = %4d, F =%13.6E, MAXCV =%13.6E\n",
+      nfvals, f, resmax);
+    i__1 = iptem;
+    fprintf(stderr, "cobyla: X =");
+    for (i__ = 1; i__ <= i__1; ++i__) {
+      if (i__>1) fprintf(stderr, "  ");
+      fprintf(stderr, "%13.6E", x[i__]);
+    }
+    if (iptem < *n) {
+      i__1 = *n;
+      for (i__ = iptemp; i__ <= i__1; ++i__) {
+        if (!((i__-1) % 4)) fprintf(stderr, "\ncobyla:  ");
+        fprintf(stderr, "%15.6E", x[i__]);
+      }
+    }
+    fprintf(stderr, "\n");
+  }
+  *maxfun = nfvals;
+  return rc;
+} /* cobylb */
+
+/* ------------------------------------------------------------------------- */
+int trstlp(int *n, int *m, double *a, 
+    double *b, double *rho, double *dx, int *ifull, 
+    int *iact, double *z__, double *zdota, double *vmultc,
+     double *sdirn, double *dxnew, double *vmultd)
+{
+  /* System generated locals */
+  int a_dim1, a_offset, z_dim1, z_offset, i__1, i__2;
+  double d__1, d__2;
+
+  /* Local variables */
+  double alpha, tempa;
+  double beta;
+  double optnew, stpful, sum, tot, acca, accb;
+  double ratio, vsave, zdotv, zdotw, dd;
+  double sd;
+  double sp, ss, resold = 0.0, zdvabs, zdwabs, sumabs, resmax, optold;
+  double spabs;
+  double temp, step;
+  int icount;
+  int iout, i__, j, k;
+  int isave;
+  int kk;
+  int kl, kp, kw;
+  int nact, icon = 0, mcon;
+  int nactx = 0;
+
+
+/* This subroutine calculates an N-component vector DX by applying the */
+/* following two stages. In the first stage, DX is set to the shortest */
+/* vector that minimizes the greatest violation of the constraints */
+/*   A(1,K)*DX(1)+A(2,K)*DX(2)+...+A(N,K)*DX(N) .GE. B(K), K=2,3,...,M, */
+/* subject to the Euclidean length of DX being at most RHO. If its length is */
+/* strictly less than RHO, then we use the resultant freedom in DX to */
+/* minimize the objective function */
+/*      -A(1,M+1)*DX(1)-A(2,M+1)*DX(2)-...-A(N,M+1)*DX(N) */
+/* subject to no increase in any greatest constraint violation. This */
+/* notation allows the gradient of the objective function to be regarded as */
+/* the gradient of a constraint. Therefore the two stages are distinguished */
+/* by MCON .EQ. M and MCON .GT. M respectively. It is possible that a */
+/* degeneracy may prevent DX from attaining the target length RHO. Then the */
+/* value IFULL=0 would be set, but usually IFULL=1 on return. */
+
+/* In general NACT is the number of constraints in the active set and */
+/* IACT(1),...,IACT(NACT) are their indices, while the remainder of IACT */
+/* contains a permutation of the remaining constraint indices. Further, Z is */
+/* an orthogonal matrix whose first NACT columns can be regarded as the */
+/* result of Gram-Schmidt applied to the active constraint gradients. For */
+/* J=1,2,...,NACT, the number ZDOTA(J) is the scalar product of the J-th */
+/* column of Z with the gradient of the J-th active constraint. DX is the */
+/* current vector of variables and here the residuals of the active */
+/* constraints should be zero. Further, the active constraints have */
+/* nonnegative Lagrange multipliers that are held at the beginning of */
+/* VMULTC. The remainder of this vector holds the residuals of the inactive */
+/* constraints at DX, the ordering of the components of VMULTC being in */
+/* agreement with the permutation of the indices of the constraints that is */
+/* in IACT. All these residuals are nonnegative, which is achieved by the */
+/* shift RESMAX that makes the least residual zero. */
+
+/* Initialize Z and some other variables. The value of RESMAX will be */
+/* appropriate to DX=0, while ICON will be the index of a most violated */
+/* constraint if RESMAX is positive. Usually during the first stage the */
+/* vector SDIRN gives a search direction that reduces all the active */
+/* constraint violations by one simultaneously. */
+
+  /* Parameter adjustments */
+  z_dim1 = *n;
+  z_offset = 1 + z_dim1 * 1;
+  z__ -= z_offset;
+  a_dim1 = *n;
+  a_offset = 1 + a_dim1 * 1;
+  a -= a_offset;
+  --b;
+  --dx;
+  --iact;
+  --zdota;
+  --vmultc;
+  --sdirn;
+  --dxnew;
+  --vmultd;
+
+  /* Function Body */
+  *ifull = 1;
+  mcon = *m;
+  nact = 0;
+  resmax = 0.;
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    i__2 = *n;
+    for (j = 1; j <= i__2; ++j) {
+      z__[i__ + j * z_dim1] = 0.;
+    }
+    z__[i__ + i__ * z_dim1] = 1.;
+    dx[i__] = 0.;
+  }
+  if (*m >= 1) {
+    i__1 = *m;
+    for (k = 1; k <= i__1; ++k) {
+      if (b[k] > resmax) {
+        resmax = b[k];
+        icon = k;
+      }
+    }
+    i__1 = *m;
+    for (k = 1; k <= i__1; ++k) {
+      iact[k] = k;
+      vmultc[k] = resmax - b[k];
+    }
+  }
+  if (resmax == 0.) {
+    goto L480;
+  }
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    sdirn[i__] = 0.;
+  }
+
+/* End the current stage of the calculation if 3 consecutive iterations */
+/* have either failed to reduce the best calculated value of the objective */
+/* function or to increase the number of active constraints since the best */
+/* value was calculated. This strategy prevents cycling, but there is a */
+/* remote possibility that it will cause premature termination. */
+
+L60:
+  optold = 0.;
+  icount = 0;
+L70:
+  if (mcon == *m) {
+    optnew = resmax;
+  } else {
+    optnew = 0.;
+    i__1 = *n;
+    for (i__ = 1; i__ <= i__1; ++i__) {
+      optnew -= dx[i__] * a[i__ + mcon * a_dim1];
+    }
+  }
+  if (icount == 0 || optnew < optold) {
+    optold = optnew;
+    nactx = nact;
+    icount = 3;
+  } else if (nact > nactx) {
+    nactx = nact;
+    icount = 3;
+  } else {
+    --icount;
+    if (icount == 0) {
+      goto L490;
+    }
+  }
+
+/* If ICON exceeds NACT, then we add the constraint with index IACT(ICON) to */
+/* the active set. Apply Givens rotations so that the last N-NACT-1 columns */
+/* of Z are orthogonal to the gradient of the new constraint, a scalar */
+/* product being set to zero if its nonzero value could be due to computer */
+/* rounding errors. The array DXNEW is used for working space. */
+
+  if (icon <= nact) {
+    goto L260;
+  }
+  kk = iact[icon];
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    dxnew[i__] = a[i__ + kk * a_dim1];
+  }
+  tot = 0.;
+  k = *n;
+L100:
+  if (k > nact) {
+    sp = 0.;
+    spabs = 0.;
+    i__1 = *n;
+    for (i__ = 1; i__ <= i__1; ++i__) {
+      temp = z__[i__ + k * z_dim1] * dxnew[i__];
+      sp += temp;
+      spabs += abs(temp);
+    }
+    acca = spabs + abs(sp) * .1;
+    accb = spabs + abs(sp) * .2;
+    if (spabs >= acca || acca >= accb) {
+      sp = 0.;
+    }
+    if (tot == 0.) {
+      tot = sp;
+    } else {
+      kp = k + 1;
+      temp = sqrt(sp * sp + tot * tot);
+      alpha = sp / temp;
+      beta = tot / temp;
+      tot = temp;
+      i__1 = *n;
+      for (i__ = 1; i__ <= i__1; ++i__) {
+        temp = alpha * z__[i__ + k * z_dim1] + beta * z__[i__ + kp * 
+            z_dim1];
+        z__[i__ + kp * z_dim1] = alpha * z__[i__ + kp * z_dim1] - 
+            beta * z__[i__ + k * z_dim1];
+        z__[i__ + k * z_dim1] = temp;
+      }
+    }
+    --k;
+    goto L100;
+  }
+
+/* Add the new constraint if this can be done without a deletion from the */
+/* active set. */
+
+  if (tot != 0.) {
+    ++nact;
+    zdota[nact] = tot;
+    vmultc[icon] = vmultc[nact];
+    vmultc[nact] = 0.;
+    goto L210;
+  }
+
+/* The next instruction is reached if a deletion has to be made from the */
+/* active set in order to make room for the new active constraint, because */
+/* the new constraint gradient is a linear combination of the gradients of */
+/* the old active constraints. Set the elements of VMULTD to the multipliers */
+/* of the linear combination. Further, set IOUT to the index of the */
+/* constraint to be deleted, but branch if no suitable index can be found. */
+
+  ratio = -1.;
+  k = nact;
+L130:
+  zdotv = 0.;
+  zdvabs = 0.;
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    temp = z__[i__ + k * z_dim1] * dxnew[i__];
+    zdotv += temp;
+    zdvabs += abs(temp);
+  }
+  acca = zdvabs + abs(zdotv) * .1;
+  accb = zdvabs + abs(zdotv) * .2;
+  if (zdvabs < acca && acca < accb) {
+    temp = zdotv / zdota[k];
+    if (temp > 0. && iact[k] <= *m) {
+      tempa = vmultc[k] / temp;
+      if (ratio < 0. || tempa < ratio) {
+        ratio = tempa;
+        iout = k;
+      }
+    }
+    if (k >= 2) {
+      kw = iact[k];
+      i__1 = *n;
+      for (i__ = 1; i__ <= i__1; ++i__) {
+        dxnew[i__] -= temp * a[i__ + kw * a_dim1];
+      }
+    }
+    vmultd[k] = temp;
+  } else {
+    vmultd[k] = 0.;
+  }
+  --k;
+  if (k > 0) {
+    goto L130;
+  }
+  if (ratio < 0.) {
+    goto L490;
+  }
+
+/* Revise the Lagrange multipliers and reorder the active constraints so */
+/* that the one to be replaced is at the end of the list. Also calculate the */
+/* new value of ZDOTA(NACT) and branch if it is not acceptable. */
+
+  i__1 = nact;
+  for (k = 1; k <= i__1; ++k) {
+    d__1 = 0., d__2 = vmultc[k] - ratio * vmultd[k];
+    vmultc[k] = max(d__1,d__2);
+  }
+  if (icon < nact) {
+    isave = iact[icon];
+    vsave = vmultc[icon];
+    k = icon;
+L170:
+    kp = k + 1;
+    kw = iact[kp];
+    sp = 0.;
+    i__1 = *n;
+    for (i__ = 1; i__ <= i__1; ++i__) {
+      sp += z__[i__ + k * z_dim1] * a[i__ + kw * a_dim1];
+    }
+    d__1 = zdota[kp];
+    temp = sqrt(sp * sp + d__1 * d__1);
+    alpha = zdota[kp] / temp;
+    beta = sp / temp;
+    zdota[kp] = alpha * zdota[k];
+    zdota[k] = temp;
+    i__1 = *n;
+    for (i__ = 1; i__ <= i__1; ++i__) {
+      temp = alpha * z__[i__ + kp * z_dim1] + beta * z__[i__ + k * 
+          z_dim1];
+      z__[i__ + kp * z_dim1] = alpha * z__[i__ + k * z_dim1] - beta * 
+          z__[i__ + kp * z_dim1];
+      z__[i__ + k * z_dim1] = temp;
+    }
+    iact[k] = kw;
+    vmultc[k] = vmultc[kp];
+    k = kp;
+    if (k < nact) {
+      goto L170;
+    }
+    iact[k] = isave;
+    vmultc[k] = vsave;
+  }
+  temp = 0.;
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    temp += z__[i__ + nact * z_dim1] * a[i__ + kk * a_dim1];
+  }
+  if (temp == 0.) {
+    goto L490;
+  }
+  zdota[nact] = temp;
+  vmultc[icon] = 0.;
+  vmultc[nact] = ratio;
+
+/* Update IACT and ensure that the objective function continues to be */
+/* treated as the last active constraint when MCON>M. */
+
+L210:
+  iact[icon] = iact[nact];
+  iact[nact] = kk;
+  if (mcon > *m && kk != mcon) {
+    k = nact - 1;
+    sp = 0.;
+    i__1 = *n;
+    for (i__ = 1; i__ <= i__1; ++i__) {
+      sp += z__[i__ + k * z_dim1] * a[i__ + kk * a_dim1];
+    }
+    d__1 = zdota[nact];
+    temp = sqrt(sp * sp + d__1 * d__1);
+    alpha = zdota[nact] / temp;
+    beta = sp / temp;
+    zdota[nact] = alpha * zdota[k];
+    zdota[k] = temp;
+    i__1 = *n;
+    for (i__ = 1; i__ <= i__1; ++i__) {
+      temp = alpha * z__[i__ + nact * z_dim1] + beta * z__[i__ + k * 
+          z_dim1];
+      z__[i__ + nact * z_dim1] = alpha * z__[i__ + k * z_dim1] - beta * 
+          z__[i__ + nact * z_dim1];
+      z__[i__ + k * z_dim1] = temp;
+    }
+    iact[nact] = iact[k];
+    iact[k] = kk;
+    temp = vmultc[k];
+    vmultc[k] = vmultc[nact];
+    vmultc[nact] = temp;
+  }
+
+/* If stage one is in progress, then set SDIRN to the direction of the next */
+/* change to the current vector of variables. */
+
+  if (mcon > *m) {
+    goto L320;
+  }
+  kk = iact[nact];
+  temp = 0.;
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    temp += sdirn[i__] * a[i__ + kk * a_dim1];
+  }
+  temp += -1.;
+  temp /= zdota[nact];
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    sdirn[i__] -= temp * z__[i__ + nact * z_dim1];
+  }
+  goto L340;
+
+/* Delete the constraint that has the index IACT(ICON) from the active set. */
+
+L260:
+  if (icon < nact) {
+    isave = iact[icon];
+    vsave = vmultc[icon];
+    k = icon;
+L270:
+    kp = k + 1;
+    kk = iact[kp];
+    sp = 0.;
+    i__1 = *n;
+    for (i__ = 1; i__ <= i__1; ++i__) {
+      sp += z__[i__ + k * z_dim1] * a[i__ + kk * a_dim1];
+    }
+    d__1 = zdota[kp];
+    temp = sqrt(sp * sp + d__1 * d__1);
+    alpha = zdota[kp] / temp;
+    beta = sp / temp;
+    zdota[kp] = alpha * zdota[k];
+    zdota[k] = temp;
+    i__1 = *n;
+    for (i__ = 1; i__ <= i__1; ++i__) {
+      temp = alpha * z__[i__ + kp * z_dim1] + beta * z__[i__ + k * 
+          z_dim1];
+      z__[i__ + kp * z_dim1] = alpha * z__[i__ + k * z_dim1] - beta * 
+          z__[i__ + kp * z_dim1];
+      z__[i__ + k * z_dim1] = temp;
+    }
+    iact[k] = kk;
+    vmultc[k] = vmultc[kp];
+    k = kp;
+    if (k < nact) {
+      goto L270;
+    }
+    iact[k] = isave;
+    vmultc[k] = vsave;
+  }
+  --nact;
+
+/* If stage one is in progress, then set SDIRN to the direction of the next */
+/* change to the current vector of variables. */
+
+  if (mcon > *m) {
+    goto L320;
+  }
+  temp = 0.;
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    temp += sdirn[i__] * z__[i__ + (nact + 1) * z_dim1];
+  }
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    sdirn[i__] -= temp * z__[i__ + (nact + 1) * z_dim1];
+  }
+  goto L340;
+
+/* Pick the next search direction of stage two. */
+
+L320:
+  temp = 1. / zdota[nact];
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    sdirn[i__] = temp * z__[i__ + nact * z_dim1];
+  }
+
+/* Calculate the step to the boundary of the trust region or take the step */
+/* that reduces RESMAX to zero. The two statements below that include the */
+/* factor 1.0E-6 prevent some harmless underflows that occurred in a test */
+/* calculation. Further, we skip the step if it could be zero within a */
+/* reasonable tolerance for computer rounding errors. */
+
+L340:
+  dd = *rho * *rho;
+  sd = 0.;
+  ss = 0.;
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    if ((d__1 = dx[i__], abs(d__1)) >= *rho * 1e-6f) {
+      d__2 = dx[i__];
+      dd -= d__2 * d__2;
+    }
+    sd += dx[i__] * sdirn[i__];
+    d__1 = sdirn[i__];
+    ss += d__1 * d__1;
+  }
+  if (dd <= 0.) {
+    goto L490;
+  }
+  temp = sqrt(ss * dd);
+  if (abs(sd) >= temp * 1e-6f) {
+    temp = sqrt(ss * dd + sd * sd);
+  }
+  stpful = dd / (temp + sd);
+  step = stpful;
+  if (mcon == *m) {
+    acca = step + resmax * .1;
+    accb = step + resmax * .2;
+    if (step >= acca || acca >= accb) {
+      goto L480;
+    }
+    step = min(step,resmax);
+  }
+
+/* Set DXNEW to the new variables if STEP is the steplength, and reduce */
+/* RESMAX to the corresponding maximum residual if stage one is being done. */
+/* Because DXNEW will be changed during the calculation of some Lagrange */
+/* multipliers, it will be restored to the following value later. */
+
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    dxnew[i__] = dx[i__] + step * sdirn[i__];
+  }
+  if (mcon == *m) {
+    resold = resmax;
+    resmax = 0.;
+    i__1 = nact;
+    for (k = 1; k <= i__1; ++k) {
+      kk = iact[k];
+      temp = b[kk];
+      i__2 = *n;
+      for (i__ = 1; i__ <= i__2; ++i__) {
+        temp -= a[i__ + kk * a_dim1] * dxnew[i__];
+      }
+      resmax = max(resmax,temp);
+    }
+  }
+
+/* Set VMULTD to the VMULTC vector that would occur if DX became DXNEW. A */
+/* device is included to force VMULTD(K)=0.0 if deviations from this value */
+/* can be attributed to computer rounding errors. First calculate the new */
+/* Lagrange multipliers. */
+
+  k = nact;
+L390:
+  zdotw = 0.;
+  zdwabs = 0.;
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    temp = z__[i__ + k * z_dim1] * dxnew[i__];
+    zdotw += temp;
+    zdwabs += abs(temp);
+  }
+  acca = zdwabs + abs(zdotw) * .1;
+  accb = zdwabs + abs(zdotw) * .2;
+  if (zdwabs >= acca || acca >= accb) {
+    zdotw = 0.;
+  }
+  vmultd[k] = zdotw / zdota[k];
+  if (k >= 2) {
+    kk = iact[k];
+    i__1 = *n;
+    for (i__ = 1; i__ <= i__1; ++i__) {
+      dxnew[i__] -= vmultd[k] * a[i__ + kk * a_dim1];
+    }
+    --k;
+    goto L390;
+  }
+  if (mcon > *m) {
+    d__1 = 0., d__2 = vmultd[nact];
+    vmultd[nact] = max(d__1,d__2);
+  }
+
+/* Complete VMULTC by finding the new constraint residuals. */
+
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    dxnew[i__] = dx[i__] + step * sdirn[i__];
+  }
+  if (mcon > nact) {
+    kl = nact + 1;
+    i__1 = mcon;
+    for (k = kl; k <= i__1; ++k) {
+      kk = iact[k];
+      sum = resmax - b[kk];
+      sumabs = resmax + (d__1 = b[kk], abs(d__1));
+      i__2 = *n;
+      for (i__ = 1; i__ <= i__2; ++i__) {
+        temp = a[i__ + kk * a_dim1] * dxnew[i__];
+        sum += temp;
+        sumabs += abs(temp);
+      }
+      acca = sumabs + abs(sum) * .1f;
+      accb = sumabs + abs(sum) * .2f;
+      if (sumabs >= acca || acca >= accb) {
+        sum = 0.f;
+      }
+      vmultd[k] = sum;
+    }
+  }
+
+/* Calculate the fraction of the step from DX to DXNEW that will be taken. */
+
+  ratio = 1.;
+  icon = 0;
+  i__1 = mcon;
+  for (k = 1; k <= i__1; ++k) {
+    if (vmultd[k] < 0.) {
+      temp = vmultc[k] / (vmultc[k] - vmultd[k]);
+      if (temp < ratio) {
+        ratio = temp;
+        icon = k;
+      }
+    }
+  }
+
+/* Update DX, VMULTC and RESMAX. */
+
+  temp = 1. - ratio;
+  i__1 = *n;
+  for (i__ = 1; i__ <= i__1; ++i__) {
+    dx[i__] = temp * dx[i__] + ratio * dxnew[i__];
+  }
+  i__1 = mcon;
+  for (k = 1; k <= i__1; ++k) {
+    d__1 = 0., d__2 = temp * vmultc[k] + ratio * vmultd[k];
+    vmultc[k] = max(d__1,d__2);
+  }
+  if (mcon == *m) {
+    resmax = resold + ratio * (resmax - resold);
+  }
+
+/* If the full step is not acceptable then begin another iteration. */
+/* Otherwise switch to stage two or end the calculation. */
+
+  if (icon > 0) {
+    goto L70;
+  }
+  if (step == stpful) {
+    goto L500;
+  }
+L480:
+  mcon = *m + 1;
+  icon = mcon;
+  iact[mcon] = mcon;
+  vmultc[mcon] = 0.;
+  goto L60;
+
+/* We employ any freedom that may be available to reduce the objective */
+/* function before returning a DX whose length is less than RHO. */
+
+L490:
+  if (mcon == *m) {
+    goto L480;
+  }
+  *ifull = 0;
+L500:
+  return 0;
+} /* trstlp */

second-stage/programs/wlle/cobyla.h

+/* cobyla : constrained optimization by linear approximation */
+
+/*
+ * Copyright (c) 1992, Michael J. D. Powell (M.J.D.Powell@damtp.cam.ac.uk)
+ * Copyright (c) 2004, Jean-Sebastien Roy (js@jeannot.org)
+ * 
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ * 
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Software.
+ * 
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/*
+ * This software is a C version of COBYLA2, a constrained optimization by linear
+ * approximation package developed by Michael J. D. Powell in Fortran.
+ * 
+ * The original source code can be found at :
+ * http://plato.la.asu.edu/topics/problems/nlores.html
+ */
+
+/* $Jeannot: cobyla.h,v 1.10 2004/04/18 09:51:37 js Exp $ */
+
+#ifndef _COBYLA_
+#define _COBYLA_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+ * Verbosity level
+ */
+typedef enum {
+  COBYLA_MSG_NONE = 0, /* No messages */
+  COBYLA_MSG_EXIT = 1, /* Exit reasons */
+  COBYLA_MSG_ITER = 2, /* Rho and Sigma changes */
+  COBYLA_MSG_INFO = 3, /* Informational messages */
+} cobyla_message;
+
+/*
+ * Possible return values for cobyla
+ */
+typedef enum
+{
+  COBYLA_MINRC     = -2, /* Constant to add to get the rc_string */
+  COBYLA_EINVAL    = -2, /* N<0 or M<0 */
+  COBYLA_ENOMEM    = -1, /* Memory allocation failed */
+  COBYLA_NORMAL    =  0, /* Normal return from cobyla */
+  COBYLA_MAXFUN    =  1, /* Maximum number of function evaluations reached */
+  COBYLA_ROUNDING  =  2, /* Rounding errors are becoming damaging */
+  COBYLA_USERABORT =  3  /* User requested end of minimization */
+} cobyla_rc;
+
+/*
+ * Return code strings
+ * use cobyla_rc_string[rc - COBYLA_MINRC] to get the message associated with
+ * return code rc.
+ */
+extern char *cobyla_rc_string[6];
+
+/*
+ * A function as required by cobyla
+ * state is a void pointer provided to the function at each call
+ *
+ * n     : the number of variables
+ * m     : the number of constraints
+ * x     : on input, the vector of variables (should not be modified)
+ * f     : on output, the value of the function
+ * con   : on output, the value of the constraints (vector of size m)
+ * state : on input, the value of the state variable as provided to cobyla
+ *
+ * COBYLA will try to make all the values of the constraints positive.
+ * So if you want to input a constraint j such that x[i] <= MAX, set:
+ *   con[j] = MAX - x[i]
+ * The function must return 0 if no error occurs, or 1 to immediately end
+ * the minimization.
+ *
+ */
+typedef int cobyla_function(int n, int m, double *x, double *f, double *con,
+  void *state);
+
+/*
+ * cobyla : minimize a function subject to constraints
+ *
+ * n         : number of variables (>=0)
+ * m         : number of constraints (>=0)
+ * x         : on input, initial estimate; on output, the solution
+ * rhobeg    : a reasonable initial change to the variables
+ * rhoend    : the required accuracy for the variables
+ * message   : see the cobyla_message enum
+ * maxfun    : on input, the maximum number of function evaluations
+ *             on output, the number of function evaluations done
+ * calcfc    : the function to minimize (see cobyla_function)
+ * state     : used by function (see cobyla_function)
+ *
+ * The cobyla function returns a code defined in the cobyla_rc enum.
+ *
+ */
+extern int cobyla(int n, int m, double *x, double rhobeg, double rhoend,
+  int message, int *maxfun, cobyla_function *calcfc, void *state);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _COBYLA_ */
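
As a quick orientation to the API declared above, here is a minimal, hypothetical usage sketch (not part of this commit): it minimizes a toy quadratic subject to one inequality, following the documented sign convention that con[j] is positive when the constraint is satisfied.

  /* Hypothetical example -- not part of the commit.  Minimize
     f(x) = (x0 - 1)^2 + (x1 - 2)^2  subject to  x0 + x1 <= 4,
     expressed as con[0] = 4 - (x0 + x1) >= 0. */
  #include <stdio.h>
  #include "cobyla.h"

  static int toy_fc(int n, int m, double *x, double *f, double *con, void *state)
  {
    (void) n; (void) m; (void) state;
    *f = (x[0] - 1.0) * (x[0] - 1.0) + (x[1] - 2.0) * (x[1] - 2.0);
    con[0] = 4.0 - (x[0] + x[1]);   /* positive when the constraint holds */
    return 0;                       /* return 1 to abort the minimization */
  }

  int main(void)
  {
    double x[2] = { 0.0, 0.0 };     /* initial estimate */
    int maxfun = 1000;              /* in: evaluation budget; out: evaluations used */
    int rc = cobyla(2, 1, x, 0.5, 1e-8, COBYLA_MSG_NONE, &maxfun, toy_fc, NULL);
    printf("%s: x = (%g, %g) after %d evaluations\n",
           cobyla_rc_string[rc - COBYLA_MINRC], x[0], x[1], maxfun);
    return rc != COBYLA_NORMAL;
  }

cvlm-lbfgs.cc (below) appears to drive this same interface through estimator1_wrapper when cross-validating the regularizer constants.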

second-stage/programs/wlle/cvlm-lbfgs.cc

+// cvlm-lbfgs.cc -- A linear model estimator for a variety of user-selectable loss functions.
+
+const char usage[] =
+"cvlm-lbfgs -- A cross-validating linear model estimator for a variety of user-selectable loss functions.\n"
+"\n"
+" Based on earlier versions of cvlm written by Mark Johnson\n"
+"\n"
+" It uses liblbfgs for feature weight optimization and COBYLA to tune regularization coefficents.\n"
+" The regularizer weight(s) are set by cross-validation on development data.\n"
+"\n"
+"Usage: cvlm-lbfgs [-h] [-d debug_level] [-c c0] [-C c00] [-p p] [-r r] [-s s] [-t tol]\n"
+"                  [-l ltype] [-F f] [-G] [-n ns] [-f feat-file]\n"
+"                  [-o weights-file]  [-e eval-file] [-x eval-file2]\n"
+"                  [-i iterations]\n"
+"	           < train-file\n"
+"\n"
+"where:\n"
+"\n"
+" debug_level > 0 controls the amount of output produced\n"
+"\n"
+" -c c0 is the initial value for the regularizer constant.\n"
+"\n"
+" -C c00 multiplies the regularizer constant for the first feature class\n"
+" by c00 (this can be used to allow the first feature class to be regularized less).\n"
+"\n"
+" -i iterations specifies the maximum number of regularization constants to search\n"
+" (defaults to 10 if not binning feature classes and 50 otherwise)\n"
+"\n"
+" -l ltype identifies the type of loss function used:\n"
+"\n"
+"    -l 0 - log loss (c0 ~ 5)\n"
+"    -l 1 - EM-style log loss (c0 ~ 5)\n"
+"    -l 2 - pairwise log loss \n"
+"    -l 3 - exp loss (c0 ~ 25, s ~ 1e-5)\n"
+"    -l 4 - log exp loss (c0 ~ 1e-4)\n"
+"    -l 5 - maximize expected F-score (c ~ ?)\n"
+"\n"
+" -r r specifies that the weights are initialized to random values in\n"
+"   [-r ... +r],\n"
+"\n"
+" -t tol specifies the stopping tolerance for the LBFGS/OWLQN optimizer\n"
+"\n"
+" -F f indicates that a parse should be taken as correct\n"
+"   proportional to f raised to its f-score, and\n"
+"\n"
+" -G indicates that each sentence is weighted by the number of\n"
+"   edges in its gold parse.\n"
+"\n"
+" -n ns is the maximum number of ':' characters in a <featclass>, used to\n"
+" determine how features are binned into feature classes (ns = -1 bins\n"
+" all features into the same class)\n"
+"\n"
+" -f feat-file is a file of <featclass> <featuredetails> lines, used for\n"
+" cross-validating regularizer weights,\n"
+"\n"
+" train-file, eval-file and eval-file2 are files from which training and evaluation\n"
+" data are read (if eval-file ends in the suffix .bz2 then bzcat is used\n"
+" to read it; if no eval-file is specified, then the program tests on the\n"
+" training data),\n"
+"\n"
+" weights-file is a file to which the estimated weights are written,\n"
+"\n"
+"The function that the program minimizes is:\n"
+"\n"
+"   Q(w) = s * (- L(w) + c * sum_j pow(fabs(w[j]), p) ), where:\n"
+"\n"
+"   L(w) is the loss function to be optimized.\n"
+"\n"
+"With debug = 0, the program writes a single line to stdout:\n"
+"\n"
+"c p r s it nzeroweights/nweights neglogP/nsentence ncorrect/nsentences\n"
+"\n"
+"With debug >= 10, the program writes out a histogram of weights as well\n"
+"\n"
+"Data format:\n"
+"-----------\n"
+"\n"
+"<Data>     --> [S=<NS>] <Sentence>*\n"
+"<Sentence> --> [G=<G>] N=<N> <Parse>*\n"
+"<Parse>    --> [P=<P>] [W=<W>] <FC>*,\n"
+"<FC>       --> <F>[=<C>]\n"
+"\n"
+"NS is the number of sentences.\n"
+"\n"
+"Each <Sentence> consists of N <Parse>s.  <G> is the gold standard\n"
+"score.  To get parsing precision and recall results, set <G> to the\n"
+"number of edges in the gold standard parse.  To get accuracy results,\n"
+"set <G> to 1 (the default).\n"
+"\n"
+"A <Parse> consists of <FC> pairs.  <P> is the parse's possible highest\n"
+"score and <W> is the parse's actual score.  To get parsing precision and\n"
+"recall results, set <P> to the number of edges in the parse and <W> to\n"
+"the number of edges in common between the gold and parse trees.\n"
+"\n"
+"A <FC> consists of a feature (a non-negative integer) and an optional\n"
+"count (a real).\n"
+"\n"
+"The default for all numbers except <W> is 1.  The default for <W> is 0.\n";
+
+#include "custom_allocator.h"    // must come first
+#define _GLIBCPP_CONCEPT_CHECKS  // uncomment this for checking
+
+#include <cassert>
+#include <cctype>
+#include <cerrno>
+#include <cmath>
+#include <cstdio>
+#include <cstdlib>
+#include <cstring>
+#include <iostream>
+#include <unistd.h>
+#include <vector>
+
+#include <lbfgs.h>
+
+#include "lmdata.h"
+#include "cobyla.h"
+#include "utility.h"
+
+typedef std::vector<double> doubles;
+typedef std::vector<size_t> size_ts;
+
+int debug_level = 0;
+
+enum loss_type { log_loss, em_log_loss, pairwise_log_loss, exp_loss, log_exp_loss, 
+		 expected_fscore_loss };
+
+lbfgsfloatval_t loss_function_objective_wrapper(
+        void *instance,
+        const lbfgsfloatval_t *x,
+        lbfgsfloatval_t *grad,
+        const int n,
+        const lbfgsfloatval_t step);
+int estimator1_wrapper(int n, int m, double *x, double *f, double *con, void *func_data);
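
The first prototype has the shape of liblbfgs's evaluate callback and the second matches cobyla_function from cobyla.h above. The sketch below shows how a callback of this shape is typically handed to liblbfgs, with OWLQN switched on for an L1 penalty; it is illustrative only, and names such as run_lbfgs and corpus are hypothetical rather than the commit's actual driver code.

  // Illustrative only -- not the commit's code.  Assumes <lbfgs.h> and the
  // loss_function_objective_wrapper declaration above.  A nonzero
  // orthantwise_c makes liblbfgs run OWLQN, which applies the L1 penalty
  // itself, so the callback only returns the unregularized loss and gradient.
  static int run_lbfgs(int nfeatures, lbfgsfloatval_t* weights, void* corpus,
                       double l1_c, double tol) {
    // weights are best allocated with lbfgs_malloc() for proper alignment
    lbfgs_parameter_t param;
    lbfgs_parameter_init(&param);
    param.epsilon = tol;                    // stopping tolerance (cf. the -t flag)
    if (l1_c > 0) {
      param.orthantwise_c = l1_c;           // L1 coefficient; enables OWLQN
      param.linesearch = LBFGS_LINESEARCH_BACKTRACKING;  // OWLQN uses backtracking
    }
    lbfgsfloatval_t fx = 0;
    return lbfgs(nfeatures, weights, &fx, loss_function_objective_wrapper,
                 NULL /* no progress callback */, corpus, &param);
  }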
+
+void print_histogram(int nx, const double x[], int nbins=20) {
+  int nx_nonzero = 0;
+  for (int i = 0;