Commits

Author / Commit Message
Stephen Roller
Add a kd-split-method command line option.
Stephen Roller
Lots of whitespace changes for consistency. Use 4 spaces instead of tabs in java. Use 2 spaces in scala.
Stephen Roller
Ben's renaming from article to document messed up some filename references for the twitter data. Undo this reference until he renames the files.
Stephen Roller
Manually merge work that got obliterated during the merge conflict fiasco from a few weeks ago.
Ben Wing
Fix argument document-data-file to document-file
Ben Wing
Last bits of geotag -> geolocate
Ben Wing
More changes of article -> document
Ben Wing
Change remaining variable names referring to articles to refer to documents instead
Ben Wing
Deal with article/document inconsistency by using 'document' consistently; base class Article -> GeoDocument, class GeoArticle -> DistDocument; argument --article-data-file -> --document-data-file
Ben Wing
Ignore .class files
Ben Wing
Add some stuff only in my local repository (e.g. in bin/old), move some Python stuff into subdir 'twitter-process'; add README.wikigrounder and README.wikigrounder.old to 'python/wikigrounder'
Ben Wing
Automatic merge
Ben Wing
Automatic merge
Ben Wing
Add a comment explaining collectionutil
Ben Wing
Move utility packages from opennlp.textgrounder.geolocate to opennlp.textgrounder.util, break up tgutil into more specific files, making general collections of functions into package objects
Ben Wing
Move utility code into textgrounder.util
Ben Wing
Add twokenize.scala, version of 2011-06-13 from https://bitbucket.org/jasonbaldridge/twokenize
Ben Wing
Rearrange article fields to put general fields first, followed by Wikipedia-specific fields, and add program to convert article-data files to new format
Ben Wing
Automatic merge
Ben Wing
Fixes to split_bzip.py, especially to solve a deadlock
Stephen Roller
Modify bucketSize to be a parameter passed in the constructor.
Ben Wing
Add twokenize, emoticon from GeoText preprocess code
Ben Wing
Add file for concatenating and splitting bzip files
Ben Wing
Automatic merge
Ben Wing
Add unescape_entities.py, code from Fredrik Lundh, http://effbot.org/zone/re-sub.htm#unescape-html
Ben Wing
Separate GenerateKML app into GenerateKML.scala
Ben Wing
Extract cell-related code from Geolocate.scala to Cell.scala
Ben Wing
Finish converting GeoArticleTable to use task-specific counters for the statistics
Ben Wing
Rename ExperimentApp.scala to Experiment.scala -- now contains various classes for experiments
Ben Wing
Itsy-bitsy comment addition