----------------------------------------------------------
Scalabha

Author: Jason Baldridge (jasonbaldridge@gmail.com)
----------------------------------------------------------


Introduction
============

This package is intended to help teach Computational Linguistics using
Scala. It has no particular aspiration to be like NLTK; it just
provides some basic functionality and a build structure for students.

It's called Scalabha because "bha" is a Proto-Indo-European root that
is connected with language and speech.

Requirements
============

* Version 1.6 of the Java 2 SDK (http://java.sun.com)
* Version 0.20.2 of Hadoop: http://hadoop.apache.org/common/releases.html


Configuring your environment variables
======================================

The easiest thing to do is to set the environment variables JAVA_HOME
and SCALABHA_DIR to the relevant locations on your system. Set JAVA_HOME
to match the top level directory containing the Java installation you
want to use.

For example, on Windows:

C:\> set JAVA_HOME=C:\Program Files\jdk1.6.0_04

or on Unix:

% setenv JAVA_HOME /usr/local/java
  (csh)
> export JAVA_HOME=/usr/java
  (ksh, bash)

On Windows, to get these settings to persist, it's actually easiest to
set your environment variables through the System Properties from the
Control Panel. For example, under WinXP, go to Control Panel, click on
System Properties, choose the Advanced tab, click on Environment
Variables, and add your settings in the User variables area.

Next, likewise set SCALABHA_DIR to be the top level directory where you
unzipped the Scalabha download. In Unix, type 'pwd' in the directory
where this file is and use the path given to you by the shell as
SCALABHA_DIR.  You can set this in the same manner as for JAVA_HOME
above.

Next, add the directory SCALABHA_DIR/bin to your path. For example, you
can set the path in your .bashrc file as follows:

export PATH=$PATH:$SCALABHA_DIR/bin

Once you have taken care of these three things, you should be able to
build and use the Scalabha Library.
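
If your .bashrc is sourced more than once, the plain export above can
append the same directory to PATH repeatedly. The following is a minimal
sketch of one way to avoid that in a POSIX-style shell. The fallback
location $HOME/scalabha is only an assumption for illustration; use the
directory where you actually unzipped Scalabha.

```shell
# Use the existing SCALABHA_DIR if set; otherwise fall back to a
# hypothetical location (an assumption, adjust to your system).
SCALABHA_DIR="${SCALABHA_DIR:-$HOME/scalabha}"
export SCALABHA_DIR

# Append $SCALABHA_DIR/bin to PATH only if it is not already listed.
case ":$PATH:" in
    *":$SCALABHA_DIR/bin:"*) ;;   # already present, nothing to do
    *) PATH="$PATH:$SCALABHA_DIR/bin" ;;
esac
export PATH
```

Wrapping PATH in colons before matching lets the same pattern catch the
directory whether it is first, last, or in the middle of the list.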

Note: Spaces are allowed in JAVA_HOME but not in SCALABHA_DIR.  To set
an environment variable with spaces in it, you need to put quotes around
the value when on Unix, but you must *NOT* do this when under Windows.
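
As a small illustration of the Unix quoting rule, here is a sketch for
ksh or bash. The path /opt/my java/jdk1.6.0 is made up for the example
and is not a location Scalabha expects.

```shell
# Hypothetical Java location containing a space; the quotes keep the
# whole path together as a single value.
export JAVA_HOME="/opt/my java/jdk1.6.0"

# Quote the variable when you use it as well, otherwise the shell
# splits the value at the space.
echo "JAVA_HOME is '$JAVA_HOME'"
```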

It is assumed that you have Hadoop 0.20.2 installed and in your path,
and that you have set HADOOP_HOME to be the location of your Hadoop
0.20.2 installation.
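
Before building, it can help to confirm that all of the variables
mentioned in this section are set. The sketch below is one way to do
that in a POSIX shell; check_env is a hypothetical helper written for
this README, not something shipped with Scalabha or Hadoop.

```shell
#!/bin/sh
# check_env (hypothetical helper): print each required variable, or a
# message if it is unset or empty. Returns 1 if any are missing.
check_env() {
    missing=0
    for name in JAVA_HOME SCALABHA_DIR HADOOP_HOME; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            missing=1
        else
            echo "$name=$value"
        fi
    done
    return $missing
}

check_env || echo "Set the missing variables before building."
```

This only checks that the variables are non-empty; it does not verify
that the directories exist or contain the right versions.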


Building the system from source
===============================

Scalabha uses SBT (Simple Build Tool) with a standard directory
structure.  To build Scalabha, type (in the $SCALABHA_DIR directory):

$ scalabha build update compile

This will compile the source files and put the resulting class files in
./target/classes. If this is your first time running it, you will see
messages about Scala being downloaded -- this is fine and
expected. Once that is over, the Scalabha code will be compiled.

To try out other build targets, do:

$ scalabha build

This will drop you into the SBT interface. To see the actions that are
possible, hit the TAB key. (In general, you can do auto-completion on
any command prefix in SBT, hurrah!)

Documentation for SBT is here:

https://github.com/harrah/xsbt/wiki

Note: if you have SBT 0.10.1 already installed on your system, you can
also just call it directly with "sbt" in SCALABHA_DIR.


Trying it out
=============

Assuming you have completed all of the above steps, including running
the "compile" action in SBT, you should now be able to try out some
examples. (The examples themselves are still to be added.)


Now what?
=========

One purpose of this package is to allow people to easily build a jar
of their own without needing anything other than the command line, a
Hadoop installation, and Java. You should be able to adapt the SBT
build to your own project and start creating your own packages based
on these fairly straightforwardly. You'll want to:

 * Change $SCALABHA_DIR/build.sbt properties and configurations to be
   appropriate for your project. If you need to specify new managed
   dependencies, you can do so easily in that file (see SBT
   documentation for details). If you prefer to add dependencies
   manually, just add them to $SCALABHA_DIR/lib and they'll get picked
   up without any fuss.

 * Change the script in $SCALABHA_DIR/bin to be an executable of your
   choice, named for your project, and adapt it as necessary
   (including changing $SCALABHA to your project name, etc).

Good luck!


Questions or suggestions?
=========================

Email Jason Baldridge: jasonbaldridge@gmail.com

Or, create an issue on Bitbucket: 

    https://bitbucket.org/jasonbaldridge/scalabha/issues