Open-source development of scalable algorithms for geometric statistics

Technologies
python, c++, r, jupyter, github-actions
Topics
mathematics, data science, computational biology, statistics, computational geometry

GeomScale is a research and development project that delivers open-source code for state-of-the-art algorithms at the intersection of data science, optimization, and geometric and statistical computing. The current focus of GeomScale is scalable algorithms for sampling from high-dimensional distributions, integration, convex optimization, and their applications. One of our ambitions is to bridge the gap between theory and practice by turning state-of-the-art theoretical tools in geometry and optimization into state-of-the-art implementations. We believe that, in pursuing this goal, we will deliver innovative solutions in a variety of application fields, such as finance, computational biology, and statistics, that extend the limits of contemporary computational tools. GeomScale aims to serve as a building block for an international, interdisciplinary, and open community in high-dimensional geometric and statistical computing. The main development currently takes place in volesti, a generic open-source C++ library, with R and (limited) Python interfaces, for high-dimensional sampling, volume approximation, and copula estimation for financial modelling.

In particular, the current implementation scales up to hundreds or thousands of dimensions, depending on the problem. It is the most efficient software package for sampling and volume computation to date, in several cases outperforming packages that solve the same problems by orders of magnitude. It can be used to compute challenging multivariate integrals and to approximate optimal solutions of optimization problems. It has already found important applications in systems biology, by analyzing large metabolic networks (e.g. the latest human network), and in FinTech, by detecting shock events and by evaluating portfolio performance in stock markets with thousands of assets. Other application areas include AI, in particular approximate weighted model integration, and data-driven control of power systems.
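To make the idea of sampling-based integration concrete, here is a minimal pure-Python sketch of a hit-and-run random walk on a polytope {x : A x <= b}, used to estimate a simple integral over the unit cube. It is an illustration only and does not use or reproduce the volesti API; the function name `hit_and_run`, its parameters, and the test integrand are assumptions made for this example.

```python
import random


def hit_and_run(A, b, x0, n_samples, burn_in=100, seed=0):
    """Sample approximately uniform points from the polytope {x : A x <= b}.

    Assumes x0 lies strictly inside the polytope and the polytope is bounded.
    Hypothetical helper for illustration; not part of volesti.
    """
    rng = random.Random(seed)
    d = len(x0)
    x = list(x0)
    samples = []
    for step in range(burn_in + n_samples):
        # Draw a uniformly random direction by normalizing a Gaussian vector.
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = sum(c * c for c in u) ** 0.5
        u = [c / norm for c in u]
        # Intersect the line x + t*u with every half-space a.x <= b_i
        # to find the chord [t_min, t_max] inside the polytope.
        t_min, t_max = float("-inf"), float("inf")
        for row, bi in zip(A, b):
            au = sum(r * c for r, c in zip(row, u))
            slack = bi - sum(r * c for r, c in zip(row, x))
            if au > 1e-12:
                t_max = min(t_max, slack / au)
            elif au < -1e-12:
                t_min = max(t_min, slack / au)
        # Move to a uniformly random point on the chord.
        t = rng.uniform(t_min, t_max)
        x = [xi + t * ui for xi, ui in zip(x, u)]
        if step >= burn_in:
            samples.append(x)
    return samples


# Unit cube [0, 1]^3 written as A x <= b (rows: x_i <= 1 and -x_i <= 0).
d = 3
A = [[1.0 if j == i else 0.0 for j in range(d)] for i in range(d)]
A += [[-1.0 if j == i else 0.0 for j in range(d)] for i in range(d)]
b = [1.0] * d + [0.0] * d

samples = hit_and_run(A, b, x0=[0.5] * d, n_samples=20000)
# The cube has volume 1, so the integral of f(x) = x_1 + x_2 + x_3
# equals the sample mean of f; the exact value is 1.5.
estimate = sum(sum(x) for x in samples) / len(samples)
print(round(estimate, 2))  # should be close to 1.5
```

For a general polytope the sample mean would have to be multiplied by the polytope's volume, which is itself a hard quantity to compute in high dimensions; that is the kind of problem the volume approximation methods mentioned above address.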

2021 Program

Successful Projects

Contributor
Alexandros Manochis
Mentor
Apostolos Chalkis, Vissarion Fisikopoulos, Elias
Organization
GeomScale
High dimensional geometric computations with linear matrix inequalities
Package volesti supports volume estimation for polytopes, providing several randomized approximation methods. The most efficient implementation...
Contributor
Haris Zafeiropoulos
Mentor
Apostolos Chalkis, Vissarion Fisikopoulos
Organization
GeomScale
From DNA sequences to metabolic interactions: building a pipeline to extract key metabolic processes
Metabolic modeling has been interwoven with constraint-based methods. The value of randomized sampling in the framework of metabolic modeling has...
Contributor
Konstantinos Pallikaris
Mentor
Apostolos Chalkis, Vissarion Fisikopoulos, Marios Papachristou, Elias
Organization
GeomScale
Parallel Geometric Random Walks with Sparse Numerical Optimizations
Package volesti provides several geometric random walks for high dimensional sampling from convex polytopes. The current implementations can be used...
Contributor
Suraj Choubey
Mentor
Apostolos Chalkis, Vissarion Fisikopoulos, Marios Papachristou
Organization
GeomScale
GeomScale: Monte Carlo Integration
Integration is a fundamental problem in mathematics, physics and computer science with many applications that span the whole spectrum of sciences and...
Contributor
Vaibhav Thakkar
Mentor
Apostolos Chalkis, Vissarion Fisikopoulos, Elias
Organization
GeomScale
Counting linear extensions
The problem of counting the linear extensions of a given partial order consists of finding (counting) all the possible ways that we can extend the...
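
For very small partial orders, the counting problem can be illustrated with an exact dynamic program over subsets of elements. The sketch below is a standard exponential-time approach and is not the method used in the project; the function name `count_linear_extensions` and the encoding of the order as pairs `(a, b)` meaning "a must come before b" are assumptions made for this example.

```python
def count_linear_extensions(n, relations):
    """Count total orders of elements 0..n-1 that respect every pair (a, b),
    where (a, b) means element a must appear before element b.

    Runs in O(2^n * n) time, so it is only practical for small n.
    """
    # preds[v] is a bitmask of the elements that must precede v.
    preds = [0] * n
    for a, b in relations:
        preds[b] |= 1 << a

    # ways[mask] = number of valid orderings that use exactly the elements in mask.
    ways = [0] * (1 << n)
    ways[0] = 1
    for mask in range(1 << n):
        if ways[mask] == 0:
            continue
        for v in range(n):
            already_placed = (mask & (1 << v)) != 0
            preds_placed = (preds[v] & mask) == preds[v]
            if not already_placed and preds_placed:
                # Element v can be appended as the next element of the ordering.
                ways[mask | (1 << v)] += ways[mask]
    return ways[(1 << n) - 1]


# Three unrelated elements: every permutation is a linear extension.
print(count_linear_extensions(3, []))  # 6
# A chain 0 < 1 < 2 has exactly one linear extension.
print(count_linear_extensions(3, [(0, 1), (1, 2)]))  # 1
```

Because the table has one entry per subset, this exact approach stops being feasible beyond a few dozen elements.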