Contributor
Thejan Wijesinghe

Supporting Image-to-Text (Image Captioning) in Tika for Image MIME Types


Mentors
Thamme Gowda, Chris Mattmann
Organization
Apache Software Foundation

An image caption is a short piece of text, usually one line, added to an image's metadata to briefly summarize the scene it depicts. Captions help text-based Information Retrieval (IR) systems "understand" the content of images. Generating them automatically is a very useful feature, yet a challenging and interesting problem in computer vision.

The objective of this project is to provide Apache Tika with image captioning capabilities and a scalable architecture for handling deep learning models in the future.