Contributor
Likhit Talasila

OpenCX: Extending OpenCV with NeRFs and LLMs


Mentors
Gary Bradski, Douglas B Lee
Organization
OpenCV
Technologies
python, opencv, flutter, dart, OpenAI, HuggingFace, FastAPI, Langchain, Nerfstudio
Topics
machine learning, computer vision, natural language processing, LLMs, neural radiance fields
Investigative work on extending OpenCV. Researched state-of-the-art and emerging 3D multimodal scene representations and their use cases, e.g., queryable NeRF methods and semantic embeddings. This led to spearheading and designing two new initiatives: NerfNet and CognitiveStudio. NerfNet is an open platform for gathering large-scale data, à la ImageNet, for 3D scene representations. It goes end-to-end from images and metadata to 3D scene representations, with support for various configurations, NeRF models, and more. CognitiveStudio is a NerfStudio fork focused on modularity and multimodal integrations. Also developed proofs of concept (PoCs) of LLM agents that convert natural language to NerfStudio API calls and gather metadata for NerfNet.