sparkseq / Introduction


SparkSeq is a prototype application for RNA/DNA-Seq analyses with nucleotide precision in the cloud. The goal of the project is to create a scalable and extremely fast tool for interactive RNA/DNA studies. The project is powered by Apache Spark, a fast and general engine for large-scale data processing, and Hadoop-BAM, a library for manipulating files in common bioinformatics formats using Hadoop MapReduce.