Apache Gora is an open source framework that aims to give users an easy-to-use in-memory data model and persistence layer for big data frameworks, with data-store-specific mappings. The overall goal of Apache Gora is to become the standard data representation and persistence framework for big data by providing an easy-to-use Java API for accessing data regardless of where the data is stored. It uses Apache Avro for data serialisation and depends on mapping files specific to each data store.

In this project, we will develop a benchmark module to identify and understand the performance characteristics of Apache Gora. It will also help to quantify the overhead incurred by Gora compared to using the native NoSQL systems directly. This will help in fixing bugs and guiding performance improvements. The performance characteristics may range from execution time to resource utilisation. The proposed module could be used to benchmark a native implementation against the equivalent Apache Gora implementation.
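As a rough illustration of the kind of comparison described above, the sketch below times two interchangeable operations and reports the relative overhead of one over the other. It is not part of the proposed module: the class `OverheadSketch`, its methods, and the two stand-in workloads are hypothetical, and an in-memory `HashMap` is used in place of a real data store so that the example stays self-contained. A real benchmark would replace the stand-ins with calls to a native client and to the corresponding Gora data store.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: compares the mean execution time of two workloads.
public class OverheadSketch {

    /** Runs the task the given number of times and returns the mean wall-clock time per call in nanoseconds. */
    static double meanNanos(Runnable task, int iterations) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / (double) iterations;
    }

    /** Relative overhead of the wrapped path over the native path, as a percentage. */
    static double overheadPercent(double nativeNanos, double wrappedNanos) {
        return (wrappedNanos - nativeNanos) / nativeNanos * 100.0;
    }

    public static void main(String[] args) {
        // Assumption: a HashMap stands in for the underlying data store.
        Map<String, String> store = new HashMap<>();

        // Stand-in for a write issued through a native client.
        Runnable nativeWrite = () -> store.put("key", "value");
        // Stand-in for the same write with extra work in front of it,
        // imitating an abstraction layer such as serialisation.
        Runnable wrappedWrite = () -> store.put("key", new StringBuilder("value").toString());

        int iterations = 1_000_000;

        // Warm-up runs so that JIT compilation does not dominate the measurement.
        meanNanos(nativeWrite, iterations);
        meanNanos(wrappedWrite, iterations);

        double nativeTime = meanNanos(nativeWrite, iterations);
        double wrappedTime = meanNanos(wrappedWrite, iterations);

        System.out.printf("native:   %.1f ns/op%n", nativeTime);
        System.out.printf("wrapped:  %.1f ns/op%n", wrappedTime);
        System.out.printf("overhead: %.1f %%%n", overheadPercent(nativeTime, wrappedTime));
    }
}
```

Hand-rolled timing loops like this are sensitive to JIT compilation and garbage collection, so the numbers are only indicative. The sketch covers execution time only; resource utilisation would need separate instrumentation.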


Sheriffo Ceesay


  • Kevin Ratnasekera
  • Renato Marroquin
  • Lewis McGibbney