Last summer, under Google Summer of Code, I implemented the commonly used WAH (Word-Aligned Hybrid) compression algorithm in C under the direction of David Chiu.

This summer, I will implement a different compression algorithm with significant promise. Unlike WAH, which uses a constant compression length, VAL (Variable-Aligned Length) varies its compression length depending on the column being compressed. By allowing different compression lengths, VAL can adapt to noisier columns and compress them more efficiently than WAH can.

VAL currently exists only as a Java implementation, and a C implementation would leave more room for optimization. Run in parallel, VAL would yield even greater benefits. Over the course of the summer, I intend to implement VAL in C and to support parallelized compression and querying in order to demonstrate these benefits.




  • David Chiu