Contributor
Kush Kothari

[IRIS-HEP] Uproot + Dask


Mentors
Jim Pivarski, David Lange
Organization
CERN-HSF
Technologies
Python, Dask, Uproot, Scientific Python libraries
Topics
distributed computing, data analysis, File I/O
This project aims to create an API that gives users the data in ROOT files directly in a “delayed” form supported by Dask. It will reimplement the uproot.lazy function under the new name uproot.dask. The new function will support all of the Dask backends by leveraging the delayed types dask.array, dask.dataframe, and dask-awkward. Uproot is adding Dask support so that it integrates smoothly with other common data processing libraries. The project is a major revamp of Uproot's structure and codebase, and the changes also include updating Uproot to use Awkward Array v2. The result will be a new major version of Uproot, i.e., Uproot v5.
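To make the “delayed” form described above concrete, here is a minimal sketch using only Dask and NumPy. It does not use Uproot or show the uproot.dask signature. The function read_chunk and the chunk boundaries are hypothetical stand-ins for reading entry ranges of a branch from a ROOT file.

```python
import numpy as np
import dask
import dask.array as da


def read_chunk(start, stop):
    """Hypothetical stand-in for reading entries [start, stop) of one branch.

    A real reader would open a ROOT file here; this sketch just fabricates
    a NumPy array so the example is self-contained.
    """
    return np.arange(start, stop, dtype=np.float64)


# Assumed entry ranges, one per chunk (illustrative values only).
chunk_bounds = [(0, 4), (4, 8), (8, 10)]

# Wrap each read in dask.delayed so nothing is read yet, then tell Dask
# the shape and dtype each chunk will have once it is materialized.
lazy_chunks = [
    da.from_delayed(
        dask.delayed(read_chunk)(start, stop),
        shape=(stop - start,),
        dtype=np.float64,
    )
    for start, stop in chunk_bounds
]

# One lazy array spanning all chunks; still no data has been read.
lazy_branch = da.concatenate(lazy_chunks)

# Building the computation only extends the task graph.
total = lazy_branch.sum()

# Only compute() triggers the reads and the reduction.
result = total.compute()
```

The point of the sketch is that lazy_branch behaves like an array with a known shape and chunking while the reads are deferred until compute() is called. The project's uproot.dask function is meant to hand back Dask collections of this kind directly from ROOT files.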