Commits

Author Commit Message Labels Comments Date
david_walker
initial progress report
Branches
parse
david_walker
completed draft of first progress report
Branches
parse
david_walker
BibTeX bibliography file
Branches
parse
david_walker
update token.cend for merging tokens in "years old" type expressions
Branches
parse
david_walker
checkpoint
Branches
parse
david_walker
handle quoted parenthesis
Branches
parse
david_walker
implementation progress report, initial (incomplete) version
Branches
parse
david_walker
rename parser and token modules
Branches
parse
david_walker
converting pos from simple string to PosContainer object
Branches
parse
david_walker
Token.pos was a single Penn Treebank token type, such as 'NN'. With this checkin, it becomes a list of PosTag namedtuple objects, each of which has a token type and a probability value. In most cases there will be only a single entry in the list, but there can be three or more. This change is necessary because the parser fails to parse some sentences given only the highest-probability part-of-speech tag for each token, but succeeds if lower-probability alternatives are present.
Branches
parse
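The change to Token.pos described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the field names `tag` and `prob`, the `Token` constructor signature, and the helper methods `best_tag` and `candidate_tags` are all assumptions made for the example.

```python
from collections import namedtuple

# Assumed field names; the commit message only says each PosTag has a
# token type and a probability value.
PosTag = namedtuple("PosTag", ["tag", "prob"])


class Token:
    """Hypothetical stand-in for the project's Token class."""

    def __init__(self, text, pos):
        self.text = text
        # pos is a list of PosTag entries; keep the most probable first.
        self.pos = sorted(pos, key=lambda t: t.prob, reverse=True)

    def best_tag(self):
        # Equivalent to the old behaviour: one Penn Treebank tag per token.
        return self.pos[0].tag

    def candidate_tags(self, min_prob=0.0):
        # Lower-probability alternatives the parser can fall back on.
        return [t.tag for t in self.pos if t.prob >= min_prob]


token = Token("saw", [PosTag("NN", 0.25), PosTag("VBD", 0.7)])
```

A parser that fails with only `token.best_tag()` could then retry with the tags from `token.candidate_tags()`, which is the situation the commit message gives as the reason for the change.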
david_walker
rename parser.py to myparser.py to avoid conflict with system module
Branches
parse
david_walker
improve handling of punctuation characters
Branches
parse
david_walker
add test for "nn-year-old" type expressions
Branches
parse
david_walker
changes needed to support YearOldRule, which depends on parse trees
Branches
parse
david_walker
launch cheap as xml-rpc server if not already running
Branches
parse
david_walker
code migrated into rules.py
Branches
parse
david_walker
passing unit tests, except improve/expand, which can be re-enabled once it is possible to search for a noun phrase
Branches
parse
david_walker
work in progress: removing transforms and making rules directly change tokens
Branches
parse
david_walker
revamp of rules.py continues
Branches
parse
david_walker
debug tokensearch.py
Branches
parse
david_walker
checkpoint before massive change--about to merge transform code directly into rules, and have rules apply themselves directly rather than in a two-phase rule/transform process.
Branches
parse
david_walker
tokensearch.py: finish get_levenshtein_dist and modify_tokens
Branches
parse
david_walker
in-development code to search and replace tokens
Branches
parse
david_walker
make repr more selective about what it displays
Branches
parse
david_walker
fix right_token.cbegin in ParagraphTransform
Branches
parse
david_walker
first incomplete code to interface to PET HPSG parser binary 'cheap'
Branches
parse
david_walker
finish adding code to update cbegin and cend attributes of tokens
Branches
parse
david_walker
modify get_transforms methods of rules to skip non_printing tokens
Branches
parse
david_walker
make eof token have string *EOF*
Branches
parse
david_walker
split Token class out of base into mytoken.py (token.py conflicts with standard module)
Branches
parse