OS: Unix, Mac, Linux.

Create the HapZipper directory:
    mkdir HapZipper
Go to the HapZipper directory:
    cd HapZipper
Download HapZipper.tar.bz2 into this directory, then type:
    tar -xvf HapZipper.tar.bz2
    make clean
    make
    ls -al

If you run into problems with compilation, see the section "COMPILATION" below.

You will find the following directories: hapmap, dbsnp, encoded and decoded.

hapmap  : contains a sample hapmap file for chromosome 1,
          hapmap3_r2_b36_fwd.consensus.qc.poly.chr1_jpt+chb.unr.phased
dbsnp   : contains the dbsnp file for chromosome 1, chr1.txt
encoded : directory to store the compressed hapmap file
decoded : directory to store the decompressed hapmap file

Change the permissions of the executables:
------------------------------------------
    chmod +x Main
    chmod +x compress.sh
    chmod +x decompress.sh

To learn about the tool, type:
    ./Main --help

Example compression
-------------------
Run the shell script "compress.sh" as
    ./compress.sh
This will store the compressed files in the ./encoded directory.

Example decompression
---------------------
Run the shell script "decompress.sh" as
    ./decompress.sh
This will store the decompressed files in the ./decoded directory.

To check that the compression is lossless, compare the original and decompressed files:
    diff -w ./hapmap/hapmap3_r2_b36_fwd.consensus.qc.poly.chr1_jpt+chb.unr.phased ./decoded/JPT.1
No output from diff means the two files match (ignoring whitespace differences).

The main executable is named "Main". To run a compression, you need to provide the flag "--c 1", the hapmap file to compress ("--hapmap"), optionally the dbsnp file ("--dbsnp"), the output compressed file name ("--out"), the chromosome number ("--chr") and the dbsnp version number ("--ver"). To compress without dbsnp, simply omit the "--dbsnp" option. An example is presented in the file compress.sh; you have to uncomment it to run it. Before running it, comment out all the lines that belong to the compression-with-dbsnp part.
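As an illustration of the compression flags described above, the following sketch assembles the command line and prints it, so it can be reviewed before being run. The function name build_compress_cmd is a hypothetical helper, and the output path ./encoded/JPT and the dbsnp version 129 are placeholder values (assumptions), not values taken from compress.sh. Whether "--dbsnp" expects the file or its directory is also not stated here, so check compress.sh and "./Main --help" for the real values.

```shell
#!/bin/sh
# Sketch only: build (and print) a HapZipper compression command.
# Arguments: hapmap file, output file, chromosome, dbsnp version, [dbsnp path].
# If the fifth argument is empty or missing, --dbsnp is omitted.
build_compress_cmd() {
    bc_hapmap=$1
    bc_out=$2
    bc_chr=$3
    bc_ver=$4
    bc_dbsnp=${5:-}

    # Start with the mandatory compression flags.
    set -- ./Main --c 1 --hapmap "$bc_hapmap"

    # --dbsnp is optional; add it only when a path was given.
    if [ -n "$bc_dbsnp" ]; then
        set -- "$@" --dbsnp "$bc_dbsnp"
    fi

    set -- "$@" --out "$bc_out" --chr "$bc_chr" --ver "$bc_ver"

    # Print instead of executing, so the command can be reviewed first.
    printf '%s\n' "$*"
}

# Example calls with placeholder values (output name and version are assumptions).
HAPMAP=./hapmap/hapmap3_r2_b36_fwd.consensus.qc.poly.chr1_jpt+chb.unr.phased
build_compress_cmd "$HAPMAP" ./encoded/JPT 1 129 ./dbsnp/chr1.txt
build_compress_cmd "$HAPMAP" ./encoded/JPT 1 129
```

The first call prints a command that includes "--dbsnp"; the second prints the same command without it. The printed line is not quoted, so paths containing spaces would need quoting by hand before the command is pasted into a shell.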
To run a decompression, you need to provide the flag "--c 0", the file to decompress ("--src"), the output decompressed file name ("--out"), the chromosome number ("--chr") and the dbsnp version number ("--ver").

Options for compression:-
    --c 1    : For compression.
    --hapmap : Path to hapmap files.
    --dbsnp  : Path to dbsnp files.
    --out    : Path to the compressed file.
    --chr    : Chromosome number.
    --ver    : dbsnp version being used for compression.

Options for decompression:-
    --c 0    : For decompression.
    --dbsnp  : Path to dbsnp files.
    --src    : Path to the compressed file.
    --out    : Path to the decompressed files.
    --chr    : Chromosome number.
    --ver    : dbsnp version being used for compression.

###################################################### COMPILATION ##############################################################

If you need to recompile the code (for Mac, Unix, Linux):

Make sure you have the g++ compiler. src.tgz contains the source code; download it into the HapZipper directory. Then do the following:-
    tar -xzvf src.tgz
    cd src
    make
This will create the executable Main. Copy it to the HapZipper directory and give it execution permission.

For Windows, you need to have a C++ compiler, e.g. Microsoft Visual Studio or Intel C++.

###################################################### Typical problems: ###########################################################

make: Nothing to be done for 'all'
    From the src folder, type 'make clean', hit 'enter', and then type 'make'. Change the permission of the Main file and copy it into the HapZipper main folder.

-bash: ./Main: cannot execute binary file
    The Main file needs to be compiled for your current OS; see the 'COMPILATION' section.
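
Some of the problems above can be checked before running the tool. The following is a minimal sketch; check_main and check_compiler are hypothetical helper functions, not part of HapZipper. It only checks that Main exists and has execute permission, and that g++ is on the PATH. It cannot tell whether Main was compiled for a different OS; that only shows up when Main is actually run.

```shell
#!/bin/sh
# Sketch only: check for common setup problems before running HapZipper.
# Argument: directory that should contain the Main executable (default: .).
check_main() {
    cm_dir=${1:-.}

    # Main has not been built or has not been copied into this directory.
    if [ ! -f "$cm_dir/Main" ]; then
        echo "missing: build Main and copy it here (see COMPILATION)"
        return 1
    fi

    # Main exists but lacks execute permission.
    if [ ! -x "$cm_dir/Main" ]; then
        echo "not executable: run chmod +x $cm_dir/Main"
        return 1
    fi

    echo "ok: Main is present and executable"
    return 0
}

# Report whether a g++ compiler is on the PATH, which recompiling requires.
check_compiler() {
    if command -v g++ >/dev/null 2>&1; then
        echo "ok: g++ found"
    else
        echo "missing: g++ not found on PATH"
        return 1
    fi
}

# Example: check the current directory and the compiler.
check_main . || echo "fix the problem above, then retry"
check_compiler || echo "install g++ before recompiling"
```

Run it from the HapZipper directory, or pass the directory as the argument to check_main.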