David McClosky committed 8aefb9e

Change how we handle the PU tag in Chinese.
We now only use the word as a tag if the word is a known tag. Otherwise,
we continue to use the PU tag. Better error message for the TRAIN/
version.
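
A minimal standalone sketch of the rule described above, assuming a hypothetical `isKnownTag` helper and a made-up tag set in place of the parser's real `Term::get` lookup (the names `kKnownTags`, `isKnownTag`, `resolveChineseTag`, and `requireKnownTag` are illustrative and not part of the repository):

```cpp
#include <cassert>
#include <iostream>
#include <set>
#include <string>

// Hypothetical stand-in for the parser's term table; the real tag set
// comes from the parser's data files, not from this list.
static const std::set<std::string> kKnownTags = {"PU", "NN", "VV", ",", "."};

// Hypothetical analogue of Term::get(): true if `name` is a known tag.
bool isKnownTag(const std::string& name) {
  return kKnownTags.count(name) > 0;
}

// New rule: a PU terminal takes the word as its tag only when the word
// is itself a known tag; otherwise the PU tag is kept.
std::string resolveChineseTag(const std::string& trm, const std::string& wrd) {
  if (trm == "PU" && isKnownTag(wrd)) {
    return wrd;
  }
  return trm;
}

// TRAIN-side check: name the missing tag on stderr before asserting.
void requireKnownTag(const std::string& trm) {
  if (!isKnownTag(trm)) {
    std::cerr << "Couldn't find term: " << trm << std::endl;
    assert(isKnownTag(trm));
  }
}
```

Under these assumptions, `resolveChineseTag("PU", ",")` yields `","` because the comma is in the tag set, while `resolveChineseTag("PU", "?")` keeps `"PU"`. Before this change the second call would have produced `"?"`, which then failed the term lookup.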

Files changed (2)

first-stage/PARSE/InputTree.C

   /* fixes bugs in Chinese Treebank */
   if(Term::Language == "Ch")
     {
-      if(trm == "PU") trm = wrd;
+      if (trm == "PU" && Term::get(wrd)) {
+        trm = wrd;
+      }
       const Term* ctrm = Term::get(trm);
       if(!ctrm)
 	{

first-stage/TRAIN/InputTree.C

 		}
 	}
     }
-  if(Term::Language == "Ch" && trm == "PU") trm = wrd;
+  if(Term::Language == "Ch" && trm == "PU") {
+    if (Term::get(wrd)) {
+      trm = wrd;
+    }
+  }
   if (!Term::get(trm))
     {
-      cerr<<trm<<endl;
+      cerr << "Couldn't find term: " << trm << endl;
       assert(Term::get(trm));
     }
   if(wrd == "" && subTrs.size() == 0) return NULL;