Contributor
Saurabh Shrivastava

CCAligner - Word by Word Subtitle Synchronization.


Mentors
Carlos Fernandez Sanz, Alex Bratosin
Organization
CCExtractor Development

CCAligner

Word by Word Subtitle Synchronization Tool

Typical subtitle files (such as SubRip) are synchronized line by line: the subtitle containing a line of dialogue appears when the person starts talking and disappears when the dialogue finishes. This continues for the whole video. For example:

1274
01:55:48,484 --> 01:55:50,860
The Force is strong with this one

In the above example, dialogue #1274, "The Force is strong with this one", appears at 1:55:48, remains on the screen for about two seconds, and disappears at 1:55:50.
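To make the line-level timing concrete, here is a minimal Python sketch that parses a single SubRip cue like the one above and computes how long it stays on screen. The names `Cue`, `to_ms`, and `parse_cue` are hypothetical helpers chosen for illustration; they are not part of CCAligner.

```python
import re
from dataclasses import dataclass

# Matches a SubRip timing line such as "01:55:48,484 --> 01:55:50,860".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class Cue:
    """One subtitle entry (hypothetical structure, for illustration only)."""
    index: int
    start_ms: int
    end_ms: int
    text: str


def to_ms(hours, minutes, seconds, millis):
    """Convert the four timestamp fields to milliseconds from the start."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_cue(block: str) -> Cue:
    """Parse one cue: an index line, a timing line, then the dialogue text."""
    lines = block.strip().splitlines()
    match = TIMING.fullmatch(lines[1].strip())
    if match is None:
        raise ValueError(f"not a SubRip timing line: {lines[1]!r}")
    fields = match.groups()
    return Cue(
        index=int(lines[0]),
        start_ms=to_ms(*fields[:4]),
        end_ms=to_ms(*fields[4:]),
        text=" ".join(line.strip() for line in lines[2:]),
    )


cue = parse_cue("""1274
01:55:48,484 --> 01:55:50,860
The Force is strong with this one""")

# Start, end, and on-screen duration, all in milliseconds.
print(cue.start_ms, cue.end_ms, cue.end_ms - cue.start_ms)
```

For this cue the start is 6948484 ms and the duration is 2376 ms, which is the "about two seconds" mentioned above.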

The aim of the project is to tag each word as it is spoken, similar to karaoke systems.

E.g.

The           [6948484:6948500]
Force         [6948501:6948633]
is            [6948634:6948710]
strong        [6948711:6949999]
with          [6949100:6949313]

In the above example, each word from the subtitle is tagged with beginning and ending timestamps (in milliseconds) based on the audio.
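As a rough illustration of how the word-level tags relate to the subtitle's own timestamps, the following Python sketch converts a millisecond offset back into SubRip notation and prints words in the layout shown above. The helpers `ms_to_srt` and `format_word`, and the 14-character column width, are assumptions made for this example and do not describe CCAligner's actual output code.

```python
def ms_to_srt(ms: int) -> str:
    """Render milliseconds from the start of the video as HH:MM:SS,mmm."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def format_word(word: str, start_ms: int, end_ms: int, width: int = 14) -> str:
    """Left-align the word in a fixed column, then append [start:end]."""
    return f"{word:<{width}}[{start_ms}:{end_ms}]"


# The first two word tags from the example above.
words = [
    ("The", 6948484, 6948500),
    ("Force", 6948501, 6948633),
]

for word, start_ms, end_ms in words:
    print(format_word(word, start_ms, end_ms), ms_to_srt(start_ms))
```

The first word starts at 6948484 ms, which is the same instant as the subtitle's start time of 01:55:48,484.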

Project for CCExtractor Development, GSoC 2017 by Saurabh Shrivastava.
saurabh.shrivastava54@gmail.com
saurabhshri.github.io