Extend SkyhookDM programmable object storage with statistics, sort/aggregate, or data compaction functions.
- Mentors
- Jeff LeFevre, Aaron Chu, ivotron, Noah Watkins
- Organization
- CERN-HSF
SkyhookDM supports dynamic data management in the cloud by enabling data management tasks to be executed directly within the storage layer. It uses customised C++ object classes to offload database operations to the object storage layer. The project aims to improve and extend SkyhookDM’s capabilities by adding the following functionality:
- GROUPBY and ORDERBY database operations: Extend the current aggregation method to support GROUPBY and sorting (ORDERBY) over an object’s formatted data partitions
- Statistics collection: Implement a custom method that collects statistics on an object’s formatted data partitions in the form of histograms
- Data compaction: Implement compaction of multiple formatted sub-partitions within an object into a single partition.
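
The three operations listed above can be illustrated with a minimal, storage-independent C++ sketch. It does not use SkyhookDM's actual object-class interfaces or data formats: the `Row` and `Partition` types and the functions `group_by_sum`, `order_by_value`, `histogram`, and `compact` are hypothetical names chosen for illustration, and a partition is assumed to be a simple in-memory list of key/value rows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical row and partition types (assumption: real partitions
// use a formatted on-disk layout, not a plain vector of structs).
struct Row {
    std::string key;
    double value;
};
using Partition = std::vector<Row>;

// GROUPBY with a SUM aggregate: one output entry per distinct key.
// std::map keeps the groups ordered by key.
std::map<std::string, double> group_by_sum(const Partition& p) {
    std::map<std::string, double> sums;
    for (const Row& r : p) {
        sums[r.key] += r.value;
    }
    return sums;
}

// ORDERBY value, ascending. A stable sort preserves the input order
// of rows whose values compare equal.
Partition order_by_value(Partition p) {
    std::stable_sort(p.begin(), p.end(),
                     [](const Row& a, const Row& b) { return a.value < b.value; });
    return p;
}

// Statistics: equi-width histogram of `value` over [lo, hi).
// Values outside the range are ignored.
std::vector<std::size_t> histogram(const Partition& p, double lo, double hi,
                                   std::size_t bins) {
    assert(bins > 0 && hi > lo);
    std::vector<std::size_t> counts(bins, 0);
    const double width = (hi - lo) / static_cast<double>(bins);
    for (const Row& r : p) {
        if (r.value < lo || r.value >= hi) continue;
        std::size_t i = static_cast<std::size_t>((r.value - lo) / width);
        if (i >= bins) i = bins - 1;  // guard against floating-point rounding
        ++counts[i];
    }
    return counts;
}

// Compaction: merge several sub-partitions into a single partition,
// keeping rows in their original order.
Partition compact(const std::vector<Partition>& parts) {
    std::size_t total = 0;
    for (const Partition& p : parts) total += p.size();
    Partition out;
    out.reserve(total);
    for (const Partition& p : parts) {
        out.insert(out.end(), p.begin(), p.end());
    }
    return out;
}
```

In this sketch each operation takes a partition and returns a new result rather than modifying data in place. In an in-storage setting, the point of running such functions inside the object store is that only the (usually smaller) result has to be returned to the client.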