Parallel BZip2
I ran some benchmarks that included PBZip2, a multi-threaded implementation of BZip2 (which is slow yet effective, and so my preferred compressor for basically everything).
Running the Burrows–Wheeler transform over the input blocks is a task well suited to parallelization, and the benchmarks show that Jeff Gilchrist did a great job at this:
Compressor | Time | Archive Size |
---|---|---|
None (cat) | 2.3s | 50 MB |
GZip | 4.0s | 34 MB |
BZip2 | 16.3s | 29 MB |
PBZip2 | 3.0s | 29 MB |
LZip | 41.8s | 24 MB |
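As a rough illustration of why block-wise compression parallelizes so well, the sketch below splits a file into chunks, compresses each chunk with stock bzip2 in a background job, and concatenates the results. This is not PBZip2's actual code: the function name, the default chunk size, and the one-job-per-chunk scheduling are assumptions made for brevity. It relies on bzip2 being able to decompress a file made of several concatenated streams.

```shell
# Hypothetical sketch: compress FILE into OUT as a series of independent bzip2 streams.
# Assumes FILE is non-empty; starts one background job per chunk (no limit on job count).
parallel_bzip2_sketch() {
    in=$1
    out=$2
    chunk_size=${3:-8m}                 # assumed default; any size split(1) accepts works
    tmp=$(mktemp -d) || return 1

    # Cut the input into fixed-size pieces: chunk.aa, chunk.ab, ...
    split -b "$chunk_size" "$in" "$tmp/chunk."

    # Compress every piece concurrently; bzip2 replaces chunk.aa with chunk.aa.bz2.
    for c in "$tmp"/chunk.*; do
        bzip2 "$c" &
    done
    wait                                # block until all background jobs finish

    # Glob expansion is sorted, so the pieces are concatenated in original order.
    cat "$tmp"/chunk.*.bz2 > "$out"
    rm -rf "$tmp"
}
```

With a hypothetical input file Avian.tar, a round trip could be checked with: parallel_bzip2_sketch Avian.tar Avian.tar.bz2 && bzip2 -dc Avian.tar.bz2 | cmp - Avian.tar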
The timings were produced by running the code below 4 times and taking the average of the last 3 runs (for each compressor).
This was executed on a 2 × 2.8 GHz Quad Core Mac Pro where PBZip2 (correctly) auto-detected 8 cores.
I am running PBZip2 version 1.1.0 from MacPorts (sudo port install pbzip2).
# Time a tar run with each compressor in turn (cat is the uncompressed baseline).
for Z in cat gzip bzip2 pbzip2 lzip; do
    time tar -cf "${Z}.res" --use-compress-prog="${Z}" Avian
done
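The "run 4 times, average the last 3" procedure could be automated along these lines. This is only a sketch: the helper names avg_skip_first, time_runs and bench_all are made up for illustration, and timing with date +%s gives whole-second resolution, which is coarser than the time keyword used above. The tar invocation is copied unchanged from the loop above.

```shell
# Hypothetical helper: print the mean of all arguments except the first (the warm-up run).
avg_skip_first() {
    shift
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }'
}

# Hypothetical helper: run a command N times and print each run's elapsed time in seconds.
# date +%s only has whole-second resolution, so short runs will be measured imprecisely.
time_runs() {
    runs=$1
    shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s)
        "$@" > /dev/null 2>&1
        end=$(date +%s)
        echo $((end - start))
        i=$((i + 1))
    done
}

# Four runs per compressor, averaging the last three (Avian is the directory used above).
bench_all() {
    for Z in cat gzip bzip2 pbzip2 lzip; do
        echo "${Z}: $(avg_skip_first $(time_runs 4 tar -cf "${Z}.res" --use-compress-prog="${Z}" Avian))s"
    done
}
```

Calling bench_all from the directory that contains Avian prints one averaged figure per compressor. Discarding the first run keeps a cold disk cache from skewing the result.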
Update: Added a test with LZip (an LZMA-based compressor). There is a multi-threaded implementation of this (plzip), but a quick ./configure && make did not cut it.