OpenCX: Extending OpenCV with NeRFs and LLMs
- Mentors
- Gary Bradski, Douglas B Lee
- Organization
- OpenCV
- Technologies
- python, opencv, flutter, dart, OpenAI, HuggingFace, FastAPI, Langchain, Nerfstudio
- Topics
- machine learning, computer vision, natural language processing, LLMs, neural radiance fields
Investigative work on extending OpenCV. Researched state-of-the-art and emerging 3D multimodal scene representations and their use cases, for example queryable NeRF methods and semantic embeddings. This led to spearheading and designing two new initiatives: NerfNet and CognitiveStudio. NerfNet is an open platform for gathering large-scale data, à la ImageNet, for 3D scene representations. It goes end-to-end from images and metadata to 3D scene representations, with support for various configurations and NeRF models, among other options. CognitiveStudio is a NerfStudio fork focused on modularity and multimodal integrations.
Also developed proof-of-concept (PoC) LLM agents that convert natural language into NerfStudio API calls and that gather metadata for NerfNet.
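As an illustration only, the natural-language-to-NerfStudio idea can be sketched as a validation layer between an LLM and the toolkit. This is not the project's implementation: the JSON schema, `TrainRequest`, `parse_llm_reply`, and `to_command` are all hypothetical names, the LLM call itself is left out, and the sketch targets the `ns-train` command line as a stand-in for the API calls mentioned above. The `--data` and `--max-num-iterations` flags and the method names are assumed from the Nerfstudio CLI and should be checked against the installed version.

```python
import json
import shlex
from dataclasses import dataclass
from typing import List, Optional

# Assumption: methods the agent is allowed to request. Adjust to whatever
# the installed Nerfstudio version actually lists.
ALLOWED_METHODS = {"nerfacto", "instant-ngp", "vanilla-nerf"}


@dataclass(frozen=True)
class TrainRequest:
    """Hypothetical structured form of a user's training request."""
    method: str
    data: str
    max_iterations: Optional[int] = None


def parse_llm_reply(reply: str) -> TrainRequest:
    """Validate the JSON an LLM was asked to produce for a user request.

    Raises ValueError on anything outside the expected schema, so that
    free-form model output never reaches the command line unchecked.
    """
    try:
        payload = json.loads(reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise ValueError("reply must be a JSON object")

    method = payload.get("method")
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method!r}")

    data = payload.get("data")
    if not isinstance(data, str) or not data:
        raise ValueError("'data' must be a non-empty string path")

    max_iter = payload.get("max_iterations")
    if max_iter is not None:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(max_iter, bool) or not isinstance(max_iter, int) or max_iter <= 0:
            raise ValueError("'max_iterations' must be a positive integer")

    return TrainRequest(method=method, data=data, max_iterations=max_iter)


def to_command(req: TrainRequest) -> List[str]:
    """Build an argument list for the (assumed) ns-train CLI."""
    cmd = ["ns-train", req.method, "--data", req.data]
    if req.max_iterations is not None:
        cmd += ["--max-num-iterations", str(req.max_iterations)]
    return cmd


if __name__ == "__main__":
    # Stand-in for what an LLM might return for
    # "train nerfacto on my_scene for 5000 steps".
    reply = '{"method": "nerfacto", "data": "data/my_scene", "max_iterations": 5000}'
    print(shlex.join(to_command(parse_llm_reply(reply))))
```

Returning an argument list rather than a shell string keeps the result safe to pass to `subprocess.run` without shell interpolation, and the whitelist means a hallucinated method name fails early instead of being executed.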