[IRIS-HEP] Uproot + Dask
- Mentors: Jim Pivarski, David Lange
- Organization: CERN-HSF
- Technologies: Python, Dask, Uproot, Scientific Python Libs
- Topics: distributed computing, data analysis, File I/O
This project aims to create an API that gives users the data in ROOT files
directly in a “delayed” form that Dask supports. It will reimplement the
uproot.lazy function under the new name uproot.dask. The new function will support all
of the Dask backends, building on Dask's delayed collection types: dask.array,
dask.dataframe, and dask-awkward.
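To make the idea of a “delayed” form concrete, here is a toy sketch in plain Python. It is not Dask's or Uproot's implementation. The `Delayed` class and the `read_chunk` helper are hypothetical names invented for this illustration, and `read_chunk` only pretends to read a range of entries from a file. The point it shows is that building the computation records what to do, and no data is read until `compute()` is called.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class Delayed:
    """A recorded function call that has not run yet (toy stand-in for a Dask task)."""

    func: Callable[..., Any]
    args: Tuple[Any, ...] = ()

    def compute(self) -> Any:
        # Evaluate any delayed arguments first, then run this task.
        resolved = [a.compute() if isinstance(a, Delayed) else a for a in self.args]
        return self.func(*resolved)


reads = []  # records which entry ranges were actually "read"


def read_chunk(start: int, stop: int) -> list:
    """Hypothetical stand-in for reading entries [start, stop) from a ROOT file."""
    reads.append((start, stop))
    return list(range(start, stop))


# Build the task graph: two chunk reads, a sum per chunk, then a final sum.
chunks = [Delayed(read_chunk, (0, 5)), Delayed(read_chunk, (5, 10))]
partial_sums = [Delayed(sum, (chunk,)) for chunk in chunks]
total = Delayed(lambda *parts: sum(parts), tuple(partial_sums))

# Nothing has been read so far; the graph only describes the work.
reads_before_compute = len(reads)

# Only now are the chunks read and the sums evaluated.
result = total.compute()
```

In a real Dask collection the same pattern applies at scale: each chunk of a file becomes a task, and a scheduler decides when and where each task runs.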
Uproot is expanding to include Dask for smooth integration with other common
data processing libraries. This project is a major revamp of the structure and
codebase of Uproot, and the changes will also include updating Uproot to use
Awkward Array v2. The result will be a new major version of Uproot, i.e., Uproot v5.