Supporting Image-to-Text (Image Captioning) in Tika for Image MIME Types
- Mentors: Thamme Gowda, Chris Mattmann
- Organization: Apache Software Foundation
An image caption is a short piece of text, usually one line, added to an image's metadata to briefly summarize the scene it depicts. Captions help text-based Information Retrieval (IR) systems "understand" the content of images. Generating captions automatically is a very useful feature, yet it remains a challenging and interesting problem in computer vision.
The objective of this project is to give Apache Tika image captioning capabilities, along with a scalable architecture for working with deep learning models in the future.
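To illustrate why a caption stored in image metadata helps a text-based IR system, here is a minimal sketch in Python. It is not Tika code: the records, the `caption` metadata key, and the helper functions `build_index` and `search` are all hypothetical and exist only for this illustration. The sketch builds a tiny inverted index over captions so that images become searchable by plain keywords.

```python
from collections import defaultdict

# Hypothetical image records. The "caption" key is an assumed metadata
# field name for this sketch, not a documented Tika metadata key.
images = [
    {"file": "img_001.jpg", "metadata": {"caption": "a dog running on a beach"}},
    {"file": "img_002.jpg", "metadata": {"caption": "two people riding bicycles in a park"}},
    {"file": "img_003.jpg", "metadata": {"caption": "a dog sleeping on a couch"}},
    {"file": "img_004.jpg", "metadata": {}},  # no caption: invisible to text search
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def build_index(records, field="caption"):
    """Map each caption word to the set of files whose caption contains it."""
    index = defaultdict(set)
    for record in records:
        caption = record["metadata"].get(field, "")
        for token in tokenize(caption):
            index[token].add(record["file"])
    return index


def search(index, query):
    """Return the sorted list of files whose caption contains every query word."""
    tokens = tokenize(query)
    if not tokens:
        return []
    matches = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        matches &= index.get(token, set())
    return sorted(matches)


index = build_index(images)
print(search(index, "dog"))
print(search(index, "dog beach"))
```

In this sketch, an image without a caption (`img_004.jpg`) can never be returned by a keyword query, which is the gap that automatic captioning is meant to close.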