Efficient frequency spectrum computation over large sample sizes
- Mentors: Simon Gravel
- Organization: Canadian Centre for Computational Genomics
The goal of this summer project is to revise the current implementation so that it can compute large frequency spectra efficiently, allowing inference to be carried out on very large sample sizes at a reasonable computational cost. The idea is to track a small subset of the entries in the full allele frequency spectrum (AFS) and interpolate between them to recover the full AFS, which is then used to proceed with integration for larger sample sizes. The key problem is how to recover the frequency spectrum accurately, so several experiments will be carried out to tune parameters and compare implementations. A framework is also expected to be developed for balancing the trade-off between computational complexity and recovery accuracy, that is, for deciding which data should be computed by approximation and which should be computed directly.
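The track-and-interpolate idea can be illustrated with a minimal Python sketch. It is not the project's implementation. The function names `tracked_indices` and `recover_afs`, the log-spaced choice of tracked entries, and the log-log interpolation are all assumptions made for illustration. The test spectrum is the standard neutral expectation, in which entry i is proportional to 1/i.

```python
import numpy as np

def tracked_indices(n, k):
    """Hypothetical helper: pick at most k distinct entries from 1..n-1,
    log-spaced so that low-frequency entries are sampled densely."""
    return np.unique(np.round(np.geomspace(1, n - 1, k)).astype(int))

def recover_afs(idx, values, n):
    """Hypothetical helper: interpolate the tracked entries in log-log
    space to recover all entries 1..n-1 (values must be positive)."""
    full = np.arange(1, n)
    return np.exp(np.interp(np.log(full), np.log(idx), np.log(values)))

n = 10_000                         # sample size (illustrative)
theta = 1.0                        # scaled mutation rate (illustrative)
exact = theta / np.arange(1, n)    # neutral expectation: entry i is theta / i

idx = tracked_indices(n, 50)       # track about 50 of the 9,999 entries
approx = recover_afs(idx, exact[idx - 1], n)
loglog_err = np.max(np.abs(approx - exact) / exact)

# Same tracked entries, but interpolated on the untransformed scale.
linear = np.interp(np.arange(1, n), idx, exact[idx - 1])
linear_err = np.max(np.abs(linear - exact) / exact)

print(f"tracked entries: {len(idx)} of {n - 1}")
print(f"max relative error, log-log interpolation: {loglog_err:.2e}")
print(f"max relative error, linear interpolation:  {linear_err:.2e}")
```

The neutral spectrum is exactly linear in log-log space, so the log-log recovery error here is at rounding level. Spectra under more complex demography or selection would not be recovered this cleanly. The comparison with plain linear interpolation shows that the interpolation scheme affects accuracy even when the tracked entries are the same, which is the kind of choice the planned experiments would need to tune.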