Commits

Christos Kannas  committed 9c2dbd3

Add How To.

  • Parent commits f38b133

Files changed (1)

+*******
+How To:
+*******
+
+Prerequisites
+=============
+Pyparsing (http://pyparsing.wikispaces.com/):
+*********************************************
+- First install setuptools for Python to get easy_install: **sudo apt-get install python-setuptools**
+- Then install pyparsing: **easy_install pyparsing**
+
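A quick way to confirm the prerequisite is in place before continuing is to ask Python whether it can locate the module. This is a minimal sketch, not part of the commit: the helper name `missing_modules` is hypothetical, and it assumes a Python 3 interpreter, whereas the project itself may target an older Python.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be located."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["pyparsing"])
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("pyparsing is importable")
```

If the script reports pyparsing as missing, repeat the installation steps above with the same interpreter that will run DocIRHadoop.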
+Execute
+=======
+- Unzip the source code in a directory of your choice.
+- Make sure that all mapper and reducer scripts in DocIRHadoop/InvertIndex have execute permission.
+- Open a terminal.
+- Add the parent directory of DocIRHadoop to the PYTHONPATH environment variable: **export PYTHONPATH=/path/to/parent/of/DocIRHadoop:$PYTHONPATH**
+- Change to the parent directory of DocIRHadoop.
+- In a second terminal, go to the Hadoop directory and start Hadoop (**bin/start-all.sh**).
+- In the first terminal type: **python DocIRHadoop/run.py**
+- At the first prompt, enter the full path to the english-documents directory. Press Enter.
+- Enter the name of the destination directory in HDFS. Press Enter.
+- DocIRHadoop now prints detailed information about its execution, mostly from the MapReduce jobs.
+- When the inverted-indexing jobs finish, the Search section starts.
+- Type a query there; the job execution info is printed first.
+- The result of the query follows.
+- To exit, press Ctrl+D.
+
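
The permission and PYTHONPATH steps in the Execute list can also be scripted. The sketch below is not part of the commit. The helper names `make_executable` and `env_with_pythonpath` are hypothetical, and it assumes the mapper and reducer scripts are `.py` files sitting directly inside the given directory, which the how-to does not state.

```python
import os
import stat


def make_executable(directory):
    """Add execute bits to every .py file directly inside `directory`.

    Assumption: the mapper and reducer scripts end in .py.
    Returns the sorted list of file names whose mode was updated.
    """
    changed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and name.endswith(".py"):
            mode = os.stat(path).st_mode
            # Keep the existing bits and add execute for user, group, other.
            os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
            changed.append(name)
    return changed


def env_with_pythonpath(parent_dir, base_env=None):
    """Return a copy of the environment with `parent_dir` first in PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH", "")
    if existing:
        env["PYTHONPATH"] = parent_dir + os.pathsep + existing
    else:
        env["PYTHONPATH"] = parent_dir
    return env
```

One possible use is to call `make_executable` on the DocIRHadoop/InvertIndex directory, then pass the result of `env_with_pythonpath` as the `env` argument of `subprocess.call`, with `cwd` set to the parent directory, when launching `python DocIRHadoop/run.py`. Hadoop still has to be started separately, as described in the list.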