David McClosky committed e2c6400

README and parseIt cleanups to discourage multithreading
- update all references to multithreading to mention instability
- remove PETSc/TAO compilation instructions since cvlm is currently
busted (and will be replaced by a version that doesn't require PETSc/TAO
soon).
- fix clash between pretty printing and POS tag smoothing option parsing
(both were 'P', smoothing is now 'p')
- minor cleanups in parseIt.C

  • Parent commits f1c669c

Files changed (4)

File README

 name of the tar file you have downloaded), as this will enable others
 to compare their results to yours.
 
-MULTI-THREADED PARSING
-======================
-
-NEW!!!  The first stage parser, which uses about 95% of the time, is
-now multi-treaded. The default is two threads.  Currently the maximum
-is four.  To change the number (or maximum) see the README file for
-the first-stage parser.  For the time being a non-threaded version is
-available in case there are problems with threads.  Send email to ec
-if you have problems. See below for details.
-
 COMPILING THE PARSER
 ====================
 
 
 > make real-clean
 
+MULTI-THREADED PARSING
+======================
 
-NON-THREADED PARSER
-===================
-To use the non-threaded parser instead change the following line
-in the Makefile
+The first stage parser, which uses about 95% of the time, is multitreaded.
+However, multithreading support does NOT appear to be stable at this
+time and its use is discouraged.  The default is to only use one thread
+which does not cause problems.  If you're willing to try your luck with
+multithreading, currently the maximum is 64 threads.  To change the
+number (or maximum) see the README file for the first-stage parser.
+For the time being a non-threaded version (oparseIt) is available in
+case there are problems with threads.
+
+To use the non-threaded parser instead change the following line in
+the Makefile:
 
 NBESTPARSER=first-stage/PARSE/parseIt
 
-It should now read:
+to
+
 NBESTPARSER=first-stage/PARSE/oparseIt
 
 That is, it is identical except for the "o" in oparseIt
 
 Then run oparse.sh, rather than parse.sh.  
-
-
-INSTALLING PETSC AND TAO
-========================
-
-If you're using cvlm as your estimator, you'll need to have PETSc and
-Tao installed in order to retrain the reranker.  Otherwise, you can
-safely ignore this section.
-
-These installation instructions work for gcc version 4.2.1 (you also
-need g++ and gfortran).
-
-1. Unpack PETSc and TAO somewhere, and make shell variables point
-to those directories (put the shell variable definitions in your
-.bash_profile or equivalent)
-
-export PETSC_DIR=/usr/local/share/petsc
-export TAO_DIR=/usr/local/share/tao
-export PETSC_ARCH="linux"
-export BOPT=O_c++
-
-cd /usr/local/share
-ln -s petsc-2.3.3-p6 petsc
-ln -s tao-1.9 tao
-
-2. Configure and build PETSc
-
-cd petsc
-FLAGS="-march=native -mfpmath=sse -msse2 -mmmx -O3 -ffast-math"
-./config/configure.py --with-cc=gcc --with-fc=gfortran --with-cxx=g++ --download-f-blas-lapack=1 --with-mpi=0 --with-clanguage=C++ --with-shared=1 --with-dynamic=1 --with-debugging=0 --with-x=0 --with-x11=0 COPTFLAGS=$FLAGS FOPTFLAGS=$FLAGS CXXOPTFLAGS=$FLAGS
-make all
-
-3. Configure and build TAO
-
-cd ../tao
-make all

File first-stage/PARSE/Params.C

    if(args.isset('N'))
      {
        Bchart::Nth = atoi(args.value('N').c_str());
-       //cerr << "Set Nth to " << Bchart::Nth << endl;
      }
    if(args.isset('s')) Bchart::smallCorpus = true;
    if(args.isset('S')) Bchart::silent = true;
      extPosIfstream=new ifstream(nm.c_str());
      assert(extPosIfstream);
    }
-   if(args.isset('P'))
+   if(args.isset('p'))
      {
-       float smoothPosAmount = atof(args.value('P').c_str());
+       float smoothPosAmount = atof(args.value('p').c_str());
        assert(smoothPosAmount >= 0);
        assert(smoothPosAmount <= 1);
        Bchart::smoothPosAmount = smoothPosAmount;
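
The hunk above reads the -p value with atof and range-checks it with assert. atof returns 0 for text it cannot convert, so a mistyped value would pass both checks, and the asserts are compiled out when NDEBUG is defined. A minimal sketch of a stricter check follows, assuming C++17 for std::optional. The helper name parseSmoothPosAmount is hypothetical and is not part of Params.C.

```cpp
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper, not part of Params.C: parse the text given to -p
// and accept it only if it is a complete number in the range [0, 1].
std::optional<float> parseSmoothPosAmount(const std::string& text) {
  const char* begin = text.c_str();
  char* end = nullptr;
  const float value = std::strtof(begin, &end);
  // Reject empty input and trailing characters; atof would accept both.
  if (end == begin || *end != '\0') return std::nullopt;
  // Written this way so that NaN is rejected as well.
  if (!(value >= 0.0f && value <= 1.0f)) return std::nullopt;
  return value;
}
```

A caller could print a usage message when the result is empty instead of aborting, and assign the value to Bchart::smoothPosAmount otherwise.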

File first-stage/PARSE/parseIt.C

 
   cerr << "\nPerformance/Quality:\n";
   cerr << "-s: small training corpus flag [off by default]\n";
-  cerr << "-t: number of threads [2]\n";
+  cerr << "-t: number of threads [1 -- multithreading may be unstable]\n";
   cerr << "-T: over-parsing level [210]\n";
-  cerr << "-P: smooth known part of speech probabilities. Set to a float to enable. [0]\n";
+  cerr << "-p: smooth known part of speech probabilities. Set to a float to enable. [0]\n";
 
   cerr << "\nInput:\n";
   cerr << "-C: case-insensitive flag\n";
   cerr << "-P: pretty-print flag\n";
   cerr << "-S: silent failure flag\n";
 
-  //cerr << "-t: report timings\n"; // this was in comment at start of main but appears deprecated
-
   cerr << "\nSee README file for additional information.\n\n";
 }
 

File first-stage/README

 [Update 2013] Using multiple threads is not currently recommended as
 there appear to be thread safety issues.
 
-parseIt is multi threaded.  It currently assumes two threads (for dual
-processors).  To change this, use the command line argument, -t4 to
-have it use, e,g, 4 threads.  Currently the maximum number of threads
-allowed is 4.  To change this change the following line in Features.h
-and recompile parseIt.
+parseIt is multithreaded.  It currently defaults to using a single thread.
+To change this, use the command line argument, -t4 to have it use, e,g,
+4 threads.  To change the maximum number of threads, change the following
+line in Features.h and recompile parseIt.
 
-#define MAXNUMTHREADS 4
+#define MAXNUMTHREADS [maximum number of threads]
 
 9. evalTree
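
The README text in this hunk describes a -t argument bounded by a compile-time MAXNUMTHREADS. The commit does not show how parseIt treats a request outside that bound, so the following is only a sketch of one possible policy: fall back to the single-thread default for unusable input and cap at the maximum. The helper name threadCountFromArg is hypothetical, and the value 64 is a stand-in taken from the top-level README text in this commit; the real limit is whatever Features.h defines.

```cpp
#include <cstdlib>
#include <string>

// Stand-in for illustration only; Features.h defines the real value.
#ifndef MAXNUMTHREADS
#define MAXNUMTHREADS 64
#endif

// Hypothetical helper: turn the text that follows -t into a usable
// thread count between 1 and MAXNUMTHREADS.
int threadCountFromArg(const std::string& text) {
  // atoi returns 0 when the text is not a number.
  const int requested = std::atoi(text.c_str());
  if (requested < 1) return 1;  // fall back to the single-thread default
  if (requested > MAXNUMTHREADS) return MAXNUMTHREADS;  // cap at the maximum
  return requested;
}
```

With this policy, -t4 yields 4 threads, while a missing or non-numeric value yields the default of 1.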