Geant4-FastSim - Memory footprint optimization for ML fast shower simulation
- Mentors
- Dalila Salamani, Anna
- Organization
- CERN-HSF
- Technologies
- python, c++, tensorflow, docker, pytorch, Kubeflow
- Topics
- machine learning, optimization, MLOps, Generative Model
Geant4 is a highly accurate and detailed toolkit for simulating the passage of particles through matter. Because of its strict precision requirements, simulation is slow and has proven to be a bottleneck for physics analysis. To overcome this bottleneck, machine learning techniques such as generative modelling have been employed as a fast simulation alternative. This project focused on the optimization and inference components of the ML fast shower simulation. Very little work had previously been done on reducing the model's memory footprint. The project aimed to implement different post-training memory footprint reduction techniques in a Kubeflow pipeline, consolidate the insights from the resulting experiments, and use them to extend that pipeline into a more general form that can be applied to different ML models at CERN to output the best optimized model.
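To make the idea of post-training memory footprint reduction concrete, here is a toy sketch of 8-bit affine quantization in plain Python. It is not the project's implementation: the function names `quantize_uint8` and `dequantize` and the sample weights are hypothetical, chosen only to show why storing 8-bit integers instead of 32-bit floats shrinks weight storage roughly fourfold at the cost of a small rounding error.

```python
from array import array


def quantize_uint8(weights):
    """Map float weights to unsigned 8-bit integers (affine quantization)."""
    # Extend the range to include 0.0 so that zero maps to an exact integer.
    lo = min(min(weights), 0.0)
    hi = max(max(weights), 0.0)
    scale = (hi - lo) / 255.0
    if scale == 0.0:  # all weights are zero; any positive scale works
        scale = 1.0
    zero_point = round(-lo / scale)
    quantized = array(
        "B",
        (min(255, max(0, round(w / scale) + zero_point)) for w in weights),
    )
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate float values from the 8-bit representation."""
    return [scale * (q - zero_point) for q in quantized]


# Hypothetical weights stored as 32-bit floats.
weights = array("f", [-1.0, -0.5, 0.0, 0.25, 0.5, 1.0])
quantized, scale, zero_point = quantize_uint8(weights)
restored = dequantize(quantized, scale, zero_point)

fp32_bytes = len(weights) * weights.itemsize
uint8_bytes = len(quantized) * quantized.itemsize
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"float32 storage: {fp32_bytes} bytes, uint8 storage: {uint8_bytes} bytes")
print(f"largest round-trip error: {max_error:.5f} (one step = {scale:.5f})")
```

In a real model the scale and zero point add a few bytes per tensor, so the saving is slightly below 4x, and the effect on the model's output has to be checked against the original model.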
I integrated several ONNX Runtime Execution Providers (MLAS, the default CPU provider; CUDA; TensorRT; oneDNN; and OpenVINO) into the Geant4 Par04 inference module. I also implemented an end-to-end Kubeflow pipeline containing CPU and CUDA optimization modules, which use different quantization and graph optimization workflows to reduce the memory footprint of models while maintaining similar performance. Optimization modules for oneDNN and TensorRT are still in development as GSoC ends.
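As an illustration of how an execution provider can be selected, the sketch below uses the ONNX Runtime Python API. It is not the Par04 code, which is written in C++. The helpers `pick_providers` and `make_session` and the short-name mapping are hypothetical; the provider strings are the names ONNX Runtime registers for these backends. Which of them are actually available depends on how ONNX Runtime was built and installed.

```python
# Short labels used in this document mapped to ONNX Runtime provider names.
PROVIDER_NAMES = {
    "MLAS": "CPUExecutionProvider",  # default CPU provider
    "CUDA": "CUDAExecutionProvider",
    "TensorRT": "TensorrtExecutionProvider",
    "oneDNN": "DnnlExecutionProvider",
    "OpenVINO": "OpenVINOExecutionProvider",
}


def pick_providers(preferred, available):
    """Keep the preferred providers that are available, in order.

    The CPU provider is appended as a final fallback if not already chosen.
    """
    chosen = [
        PROVIDER_NAMES[label]
        for label in preferred
        if PROVIDER_NAMES[label] in available
    ]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


def make_session(model_path, preferred):
    """Create an inference session with the first usable providers."""
    # Imported lazily so the selection logic above works without onnxruntime.
    import onnxruntime as ort

    providers = pick_providers(preferred, ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)


# Example: a machine with CUDA but without TensorRT.
available = ["CUDAExecutionProvider", "CPUExecutionProvider"]
print(pick_providers(["TensorRT", "CUDA"], available))
```

Calling `make_session("model.onnx", ["TensorRT", "CUDA"])` (the file name is a placeholder) would then try TensorRT first, fall back to CUDA, and finally to the CPU provider.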