The development of GPUs for general-purpose computing has revolutionized the field of deep learning. GPUs are critical for training large deep neural networks (DNNs), and they also improve inference performance.

OpenCV’s DNN module offers very fast inference on CPUs compared with other popular libraries such as TensorFlow or PyTorch. It supports inference on GPUs using OpenCL but not CUDA. NVIDIA GPUs support OpenCL, but OpenCL does not expose their full capabilities. A separate CUDA backend is required to get maximum performance out of NVIDIA GPUs.

This project aims to add a complete CUDA backend to OpenCV’s DNN module. By the end of the project, the DNN module should be capable of performing inference on CUDA-enabled GPUs nearly as fast as, or faster than, existing deep learning frameworks such as TensorFlow or PyTorch.



Yashas Samaga B L


  • Davis King