Project 5: Implementation of parallel analysis in MDAnalysis
- Mentors
- Oliver Beckstein, Richard Gowers, RMeli, yuxuanzhuang, Ian Kenney
- Organization
- MDAnalysis
- Technologies
- python, dask
- Topics
- molecular dynamics, Parallel Programming, hpc computing
The proposal introduces a parallel backend based on `dask` for the MDAnalysis package. This could allow researchers to analyze molecular dynamics trajectories much faster, depending on their access to HPC facilities, and so shorten the time needed to obtain insights from large-scale molecular simulations.
Briefly, the proposal aims to rewrite the protocol of the `MDAnalysis.analysis.base.AnalysisBase.run()` method to use a scheduler-based approach. The work is planned in stages of gradually increasing complexity, with each step covered by tests to ensure backward compatibility:
- rewrite the protocol without adding complexity such as multiple processes
- add support for multiple cores on a single node
At this point, a mid-project checkpoint is set.
- introduce cluster support that allows users to run analyses in an HPC environment with a `dask` backend
- (optional) add tests for subclasses of the base class, and mark those that don't support such parallelization
- (optional) add an example Jupyter notebook showing that a large-scale molecular dynamics trajectory can be analyzed in parallel much faster than before
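
The first two stages above (a scheduler-agnostic rewrite, then several workers on one node) can be illustrated with a minimal split-apply-combine sketch. It is not the MDAnalysis implementation: `split_frames`, `analyze_slice`, `run`, the `mapper` argument, and `toy_observable` are hypothetical names chosen for illustration, and the standard-library thread pool only stands in for whatever scheduler (for example a `dask` one) the project ends up using.

```python
from concurrent.futures import ThreadPoolExecutor


def split_frames(n_frames, n_parts):
    """Split frame indices 0..n_frames-1 into n_parts contiguous, near-equal slices."""
    base, extra = divmod(n_frames, n_parts)
    slices, start = [], 0
    for i in range(n_parts):
        # The first `extra` slices each take one additional frame.
        stop = start + base + (1 if i < extra else 0)
        slices.append(range(start, stop))
        start = stop
    return slices


def analyze_slice(frames, single_frame):
    """Apply the per-frame function to one slice, preserving frame order."""
    return [single_frame(i) for i in frames]


def run(n_frames, single_frame, n_parts=1, mapper=map):
    """Split-apply-combine over frames (hypothetical stand-in for a run() protocol).

    `mapper` is the pluggable scheduler hook: the builtin `map` runs serially,
    while an executor's `.map` runs the slices concurrently.
    """
    slices = split_frames(n_frames, n_parts)
    partial = mapper(lambda frames: analyze_slice(frames, single_frame), slices)
    # Combine: concatenate the partial results in slice order.
    return [value for part in partial for value in part]


def toy_observable(frame_index):
    # Hypothetical stand-in for a real per-frame computation.
    return frame_index ** 2


# Stage 1: same protocol, serial execution.
serial = run(10, toy_observable)

# Stage 2: same protocol, several workers on one node.
with ThreadPoolExecutor(max_workers=3) as pool:
    parallel = run(10, toy_observable, n_parts=3, mapper=pool.map)
```

Because the serial and concurrent paths share one code path and differ only in `mapper`, a backward-compatibility test can simply compare their results. Threads are used here only to keep the sketch self-contained; in CPython they do not speed up CPU-bound pure-Python work, so a real backend would rely on processes or a distributed scheduler.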