Ark-SAGE is a Java library that implements the L1-regularized version of Sparse Additive GenerativE models of Text (SAGE). SAGE is an algorithm for learning sparse representations of text. Details of the algorithm are described in
Eisenstein, Jacob, Amr Ahmed, and Eric P. Xing. "Sparse additive generative models of text." In Proceedings of ICML. pp. 1041-1048. 2011. PDF
The idea behind the L1-regularized implementation of SAGE is briefly described in
Yanchuan Sim, Noah A. Smith, David A. Smith. "Discovering Factions in the Computational Linguistics Community." In Proceedings of the Association for Computational Linguistics (ACL 2012) Special Workshop on Rediscovering 50 Years of Discoveries. pp. 22-32. 2012. PDF
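As a rough illustration of the idea in the two papers above, the sketch below models word log-probabilities as background log-frequencies plus a deviation vector, and subtracts an L1 penalty on the deviations from the log likelihood. The class name `SageSketch`, its method names, and the parameter `lambda` are hypothetical and chosen for this example only; they are not part of the Ark-SAGE API, and the library's actual optimization procedure is not shown.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not part of the Ark-SAGE API. */
final class SageSketch {

    /** Returns p(w) proportional to exp(background[w] + eta[w]). */
    static double[] wordDistribution(double[] background, double[] eta) {
        int n = background.length;
        double[] scores = new double[n];
        double max = Double.NEGATIVE_INFINITY;
        for (int w = 0; w < n; w++) {
            scores[w] = background[w] + eta[w];
            max = Math.max(max, scores[w]);
        }
        // Subtract the maximum before exponentiating for numerical stability.
        double sum = 0.0;
        for (int w = 0; w < n; w++) {
            scores[w] = Math.exp(scores[w] - max);
            sum += scores[w];
        }
        for (int w = 0; w < n; w++) {
            scores[w] /= sum;
        }
        return scores;
    }

    /** Log likelihood of the word counts minus lambda times the L1 norm of eta. */
    static double penalizedLogLikelihood(int[] counts, double[] background,
                                         double[] eta, double lambda) {
        double[] p = wordDistribution(background, eta);
        double logLikelihood = 0.0;
        double l1Norm = 0.0;
        for (int w = 0; w < counts.length; w++) {
            if (counts[w] > 0) {
                logLikelihood += counts[w] * Math.log(p[w]);
            }
            l1Norm += Math.abs(eta[w]);
        }
        return logLikelihood - lambda * l1Norm;
    }

    public static void main(String[] args) {
        // Toy vocabulary of three words with a uniform background (assumed values).
        double[] background = {Math.log(1.0 / 3), Math.log(1.0 / 3), Math.log(1.0 / 3)};
        double[] eta = {1.0, 0.0, 0.0}; // sparse deviation: only word 0 is affected
        int[] counts = {5, 1, 1};
        System.out.println(Arrays.toString(wordDistribution(background, eta)));
        System.out.println(penalizedLogLikelihood(counts, background, eta, 0.5));
    }
}
```

Because the penalty grows with the absolute value of every nonzero deviation, a deviation that does not improve the fit enough to pay for itself is driven to exactly zero, which is what makes the learned effects sparse.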
There are several ways you can use this library. The most straightforward way is to use SAGE for learning sparse effects without latent variables using the tool included in the library. You can run the tool using the shell script
See the relevant Javadoc for ark-sage for more details.
Fixes to Version 0.1 (4/7/2013)
- Fixed usage information appearing more than once.
Fixes to Version 0.1 (3/9/2013)
- Added regularization penalty to log likelihood calculation.
- Fixed wrong commons-math library version in
- -XX:ParallelGCThreads as a default option in
Version 0.1 (3/3/2013)
- Initial release of SAGE along with SupervisedSAGE implementation.