Scalable geometric and statistical software

Technologies
python, c++, r, jupyter, github-actions
Topics
mathematics, data science, computational biology, computational geometry, statistics
GeomScale is a research and development project that delivers open source code for state-of-the-art algorithms for problems at the intersection of data science, optimization, and geometric and statistical computing. The current focus of GeomScale is on scalable algorithms for sampling from high-dimensional distributions, integration, convex optimization, and their applications. One of our ambitions is to bridge the gap between theory and practice by turning state-of-the-art theoretical tools in geometry and optimization into state-of-the-art implementations. Towards this goal, we will deliver innovative solutions in a variety of application fields, such as finance, computational biology, and statistics, that extend the limits of contemporary computational tools. GeomScale aims to serve as a building block for an international, interdisciplinary, and open community in high-dimensional geometric and statistical computing.
The main development is currently performed in volesti, a generic open source C++ library with R and Python interfaces (the latter is hosted in the package dingo) for high-dimensional sampling, volume approximation, and copula estimation for financial modelling. In particular, the current implementation scales up to hundreds or thousands of dimensions, depending on the problem. To our knowledge, it is the most efficient software package for sampling and volume computation to date. It is, in several cases, orders of magnitude faster than other packages that solve the same problems. It can be used to compute challenging multivariate integrals and to approximate optimal solutions in optimization problems.
It has already found important applications in systems biology, by analyzing large metabolic networks (e.g., the latest human network), and in FinTech, by detecting shock events and by evaluating portfolio performance in stock markets with thousands of assets. Other application areas include AI, in particular approximate weighted model integration.
Recent studies have shown potential applications of volesti methods in trustworthy AI, static analysis of programs, and differential privacy.
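To give a flavour of the kind of sampling problem described above, here is a minimal pure-Python sketch of coordinate-direction hit-and-run, a standard random walk for drawing approximately uniform points from a bounded polytope {x : Ax <= b}. It is an illustration only and does not use or reproduce the volesti or dingo APIs; the function name `coordinate_hit_and_run` and its parameters are hypothetical, and the starting point is assumed to lie strictly inside the polytope.

```python
import random


def coordinate_hit_and_run(A, b, x0, n_samples, seed=0):
    """Illustrative sketch (not the volesti API): sample from {x : A x <= b}.

    Assumes the polytope is bounded and x0 is strictly inside it.
    """
    rng = random.Random(seed)
    x = list(x0)
    dim = len(x)
    samples = []
    for _ in range(n_samples):
        j = rng.randrange(dim)  # pick a random coordinate axis
        # Feasible step sizes t along axis j satisfy t * A[i][j] <= slack_i.
        lo, hi = float("-inf"), float("inf")
        for row, b_i in zip(A, b):
            slack = b_i - sum(a * x_k for a, x_k in zip(row, x))
            a_j = row[j]
            if a_j > 0:
                hi = min(hi, slack / a_j)
            elif a_j < 0:
                lo = max(lo, slack / a_j)
        # Move to a uniformly random point on the feasible chord.
        x[j] += rng.uniform(lo, hi)
        samples.append(tuple(x))
    return samples


# Unit square [0, 1]^2 written as A x <= b.
A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
b = [1, 0, 1, 0]
points = coordinate_hit_and_run(A, b, x0=(0.5, 0.5), n_samples=20000)
mean_x = sum(p[0] for p in points) / len(points)
```

In this toy case every sample stays inside the square and the empirical mean of each coordinate approaches 0.5. Production libraries use compiled code and more elaborate walks and preprocessing to reach the dimensions mentioned above.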
2024 Program

Successful Projects

Contributor
Ho Thi Minh Ha
Mentor
Apostolos Chalkis, Cyril
Organization
GeomScale
Machine Learning and Optimization for Finance: Index Replication
Indexes are baskets of stocks with specific characteristics. They provide examples for diversification in investing to mitigate the volatility of...
Contributor
Sotirios Touliopoulos
Mentor
Vissarion Fisikopoulos, Haris Zafeiropoulos
Organization
GeomScale
Pre- and post-sampling features to leverage flux sampling at both the strain and the community level
The first genome-scale models of metabolism appeared in 1999 and 2000. In the following years fundamental Microbial Systems concepts were developed...
Contributor
Luca Perju
Mentor
Elias Tsigaridas, Zafeirakis Zafeirakopoulos
Organization
GeomScale
Develop a new rounding method for convex polytopes
The goal of this project is to improve the existing implementations for rounding convex polytopes in volesti, as well as to develop a new method...
Contributor
Vladimir Necula
Mentor
Apostolos Chalkis, Marios Papachristou
Organization
GeomScale
Efficient Volume Computation
The current state-of-the-art algorithms for the volume computation of high dimensional convex bodies, such as the one currently used within the...
Contributor
Akis Schinas
Mentor
Vissarion Fisikopoulos, Evangelos Skotadis
Organization
GeomScale
Refactor Multiphase Monte Carlo Sampling for volesti and dingo
The proposed project aims to refactor the Multiphase Monte Carlo Sampling (MMCS) algorithm, currently implemented in Python with Cython bindings, to...
Contributor
Ke Shih
Mentor
Vissarion Fisikopoulos, Elias Tsigaridas
Organization
GeomScale
Modernize Linear Program solver interface in dingo
The proposed project aims to modernize the Linear Program (LP) solver interface in dingo, a Python package for analyzing metabolic networks. By...
Contributor
Atrayee Samanta
Mentor
Elias Tsigaridas, Apostolos Chalkis
Organization
GeomScale
Improving sampling routines for correlation matrices and R interface
The goal of this project is to (1) improve sampling routines for correlation matrices and (2) further enhance the R interface with sampling...