Research on Multimodal Communication

Red Hen Lab is a distributed consortium of researchers in multimodal communication, with participants all over the world. Our members range from senior professors at major research universities and senior developers at technology corporations to junior professors, postdoctoral scholars, graduate and undergraduate students, and even a few advanced high school students. Red Hen develops code for natural language processing, audio parsing, computer vision, and joint multimodal analysis.

Red Hen's multimodal communication research involves locating, identifying, and characterizing auditory and visual elements in videos and pictures, such as gestures, eye movements, and tone of voice. We may provide annotated clips or images and pose the challenge of developing machine learning tools to find additional instances in a much larger dataset. We favor projects that combine more than one modality and have a clear communicative function; an example would be floor-holding techniques. Once a feature has been successfully identified across our full dataset of several hundred thousand hours of news videos, cognitive linguists, communication scholars, and political scientists can use this information to study higher-level phenomena in language, culture, and politics, and to develop a better understanding of the full spectrum of human communication. Our dataset spans a large number of languages, giving Red Hen a global perspective.
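The annotate-then-detect workflow above can be sketched in a few lines. This is a minimal illustration, not Red Hen's actual pipeline: the feature vectors, labels, and threshold below are synthetic stand-ins for real audio/video descriptors, using scikit-learn (one of the technologies listed on this page).

```python
# Hypothetical sketch of the workflow: train a classifier on a small set of
# annotated clips, then scan a much larger unannotated pool for additional
# candidate instances of a communicative feature (e.g. a floor-holding cue).
# All data here is synthetic; real projects would extract audio/video features.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Annotated clips: 40 feature vectors with binary labels supplied by analysts.
# The labeling rule here is a toy placeholder.
X_annotated = rng.normal(size=(40, 8))
y_annotated = (X_annotated[:, 0] > 0).astype(int)

clf = LogisticRegression().fit(X_annotated, y_annotated)

# Much larger unannotated pool: keep only clips the model scores highly,
# and hand those back to researchers as candidate instances to verify.
X_pool = rng.normal(size=(10_000, 8))
scores = clf.predict_proba(X_pool)[:, 1]
candidates = np.flatnonzero(scores > 0.9)
print(f"{len(candidates)} candidate clips flagged from {len(X_pool)}")
```

In practice the pool is hundreds of thousands of hours of video, so the scoring step would run distributed on the dataset's compute infrastructure rather than in memory.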

For GSoC 2018, we invite proposals from students for components of a unified multimodal processing pipeline, whose aim is to extract information from text, audio, and video, and to develop integrative cross-modal feature detection tasks. Red Hen Lab is directed jointly by Francis Steen (UCLA) and Mark Turner (Case Western Reserve University).



Technologies:

  • python
  • scikit-learn
  • tensorflow
  • singularity
  • syntaxnet


Topics:

  • Science and Medicine
  • multimedia
  • audio processing
  • video processing
  • artificial intelligence
  • machine learning

Red Hen Lab 2018 Projects

  • Ahmed Ismail
    Arabic Speech Recognition and Dialect Identification
    The proposed project aims to implement an Arabic speech recognition model using training data from the MGB-3 Arabic datasets to perform speech...
  • Shuwei Xu
    Automatic Speech Recognition for Speech-to-Text on Chinese
    In this project, a Chinese Speech-to-Text conversion engine is built, resulting in a working application. There are two leading candidates...
  • Xu Tony
    Chinese Pipeline
    This project is roughly divided into three parts: OCR, which uses existing tools to extract captions from videos as text; Speech...
  • Devendra Yadav
    Emotion detection and characterization in video using CNN-RNN
    This project aims to develop a pipeline for emotion detection using video frames. Specifically, we detect and analyze faces present in the video...
  • Gyanesh Malhotra
    Multi modal Egocentric Perception (with video and eye tracking data)
    This project aims to tackle the problem of egocentric activity recognition based on the information available from two modalities: video and...
  • Vikrant Goyal
    Multilingual Neural Machine Translation System
    The aim of this project is to build a single machine translation system using neural networks (RNNs: LSTMs, GRUs, Bi-LSTMs) to translate between...
  • Sumit Vohra
    Multimodal Egocentric Perception (with video, audio, eyetracking data)
    Hey, I have been in constant touch with Mehul regarding my project on Multi-modal Egocentric Perception. I have already had a Skype meeting with him...
  • Awani Mishra
    Multimodal Television Show Segmentation
    Universities and libraries of social science and literature departments have large collections of digitized legacy video recordings but are...
  • Vaibhav Gupta-1
    Rapid Annotator
    With Red Hen Lab’s Rapid Annotator, we try to enable researchers worldwide to annotate large chunks of data in a very short period of time with the least...
  • Burhan Ul Tayyab
    Russian Ticker Tape OCR
    We propose an OCR framework for recognizing ticker text in Russian videos. We do this by solving two main problems: improving the OCR by...