The Burrows Wheeler Aligner with networking capabilities. The
unmaintainable multi-threading code was removed from upstream BWA and a
new function 'bwa bam2bam' was added.
This function adds a new workflow that reads a BAM file and writes a BAM
file, in a single invocation and with no intermediate files. It uses
0MQ (ZeroMQ) to allow multithreading and network transparent
Why would anyone want this? First off, the simple workflow makes life
easy. Second, if you're aligning lots of data and running on a single
machine becomes impractical, you will be happy about this program, which
distributes work on the fly to a fully dynamic network of cooperating
processes and machines.
To use the improved workflow, run something like
bwa bam2bam input.bam -f output.bam -t 8
This would run locally with 8 worker threads, handling everything (both single ended
and paired end reads) automatically. To make use of the network distribution, run
bwa bam2bam input.bam -f output.bam -t 0 -p PORT
where PORT is a free port number. Then run many instances of this
bwa worker -h HOST -p PORT -t N
where HOST and PORT denote the host running 'bam2bam' and N is the number of desired
worker threads. Bwa doesn't case how these jobs are started, so you can use any load
balancer or grid scheduler you happen to run.