Commits

Author Commit Message Labels Comments Date
david_walker
add only a single space after sentence-final punctuation
david_walker
don't add space before closing quote
david_walker
merge
david_walker
Append an s to currency names unless the next word is "loan"
david_walker
don't split at apostrophes because the simple approach of having a list of words which can contain them ("you'll" etc.) fails to account for names, some of which can contain multiple apostrophes (e.g. "Ng'ang'a").
da...@david-office.Bubka
add ll to list of apostrophe endings
david_walker
prevent pycountry logging complaint by adding null handler
david_walker
put log file in temp directory instead of current dir
da...@david-office
add two spaces after sentence-final period
da...@david-office.Bubka
add a.m. and p.m. as abbreviations
david_walker
one acre fund template cleanup
david_walker
new regexes for One Acre Fund
david_walker
merge the unicode logging fix to transforms.py from the dev branch
david_walker
change debug log file from full path to logfile.txt to just 'kea.log'
david_walker
fix unicode error in debug logging with helper function token_strings()
Branches: parse
david_walker
Rename kea2.py to kea.py. From now on, there will be only a single kea.py. The production version lives in the default branch, and development versions have their own branches, whose changes will be merged into the default branch once stable.
david_walker
delete obsolete kea.py
david_walker
preparation for branching
david_walker
prepare for sentence delimiter token
david_walker
make ':' a non-spacing punctuation character
david_walker
add checks for token.is_URL to split rules
david_walker
don't split contractions at apostrophe
david_walker
convert X USD to $X
david_walker
force abbreviations to normal form
david_walker
more IndexSplitTransform fixes
david_walker
fix bugs in punct and alphanum splits
david_walker
fix IndexSplitTransform bug
david_walker
improve debug output
david_walker
add splitting of alphanumeric tokens
david_walker
proper spacing on output text generation
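
Note on the apostrophe commits above ("don't split at apostrophes...", "don't split contractions at apostrophe"): the rule they describe can be illustrated with a small sketch. This is not the code in kea.py or transforms.py; the function name tokenize and the regular expression are hypothetical, chosen only to show why treating a word-internal apostrophe as part of the word handles both contractions ("you'll") and names with several apostrophes ("Ng'ang'a") without a word list.

```python
import re

# Hypothetical sketch, not the project's implementation.
# A word is a run of letters, optionally followed by any number of
# (apostrophe + letters) groups, so "you'll" and "Ng'ang'a" each stay whole.
# Digits form their own tokens; any other non-space character is punctuation.
TOKEN_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*|\d+|[^\w\s]")


def tokenize(text):
    """Split text into word, number, and punctuation tokens.

    Apostrophes that sit between letters are kept inside the word;
    apostrophes at a word boundary (e.g. quote marks) become their own token.
    """
    return TOKEN_RE.findall(text)


if __name__ == "__main__":
    print(tokenize("Ng'ang'a said you'll go."))
    print(tokenize("'hello'"))
```

Because the pattern only keeps an apostrophe when letters appear on both sides of it, a quoting apostrophe at the start or end of a word is still split off as punctuation.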