ITSP project (Project Reference: ITS/226/13) with HKD 1.4M funding.
07/2014 - 12/2015

Project demonstration at InnoCarnival 2015, from 10/31 to 11/08, 2015.

Project summary

Subtitles are a valuable aid for understanding conversational speech in movies and videos. However, the traditional practice of placing subtitles at the bottom of the screen causes eyestrain for ordinary viewers, who must constantly move their eyes back and forth between the speaker's expression and the subtitle. Traditional subtitles are also inadequate for helping people with hearing impairment follow conversational dialogue in videos, because they do not show who is speaking.

This project will develop a new technology that improves upon the traditional subtitle presentation in movies and videos. In the improved presentation, the subtitles associated with each speaker in a video are placed right next to that speaker, so viewers can follow what is said without moving their eyes far between the speaker and the subtitle, which reduces eyestrain. Furthermore, with the aid of the improved subtitles, viewers with hearing impairment can clearly associate each speaker with the spoken content, which enhances their viewing experience.

We shall apply computer vision techniques for face detection and lip motion detection to identify the speakers in a video scene. We shall also study the optimal placement of subtitles around an identified speaker. Finally, we shall develop a prototype software system that takes a subtitled video in a standard format and produces a new video with the improved subtitle presentation. Our initial user study suggests that the outcome of this project could benefit everyone who watches subtitled video, and in particular viewers with hearing difficulties.
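The speaker identification step can be illustrated with a small sketch. This is not the project's actual algorithm: it assumes that a face detector has already produced, for each tracked face, a short sequence of grayscale mouth-region crops, and it simply picks the face whose mouth region changes most between frames. The names motion_energy, pick_speaker and min_energy are hypothetical and used for illustration only.

```python
from typing import Dict, List, Optional, Sequence

# A grayscale mouth-region crop: rows of pixel intensities (assumed input format).
Frame = Sequence[Sequence[float]]


def motion_energy(crops: List[Frame]) -> float:
    """Mean absolute intensity change between consecutive mouth crops."""
    total = 0.0
    count = 0
    for prev, curr in zip(crops, crops[1:]):
        for row_prev, row_curr in zip(prev, curr):
            for p, c in zip(row_prev, row_curr):
                total += abs(c - p)
                count += 1
    return total / count if count else 0.0


def pick_speaker(mouth_tracks: Dict[str, List[Frame]],
                 min_energy: float = 1.0) -> Optional[str]:
    """Return the id of the face with the most mouth motion.

    Returns None when no face exceeds min_energy (a hypothetical
    threshold, e.g. for off-screen narration).
    """
    if not mouth_tracks:
        return None
    energies = {fid: motion_energy(crops) for fid, crops in mouth_tracks.items()}
    best = max(energies, key=energies.get)
    return best if energies[best] >= min_energy else None
```

A real system would need to be more robust than this, for example by distinguishing speech from chewing or head movement, but the sketch shows the basic idea of ranking detected faces by lip motion.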

Project deliverables

Subtitle placement software for movies and TV programmes will be developed.

  1. The software detects people in a video and analyzes their facial motions to identify speakers. It then places the input subtitles next to the speaker without blocking salient visual information on screen.
  2. The software also provides a comprehensive, user-friendly interface for manually adjusting auto-placed subtitles and for other control functions.

The software will be compatible with the most common movie and subtitle file formats used in the video broadcast and production industry. It will support subtitle placement for most languages, including right-to-left languages such as Arabic and Hebrew.
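The placement step in deliverable 1 (putting a subtitle next to the speaker without blocking salient visual information) might be sketched as follows. This is an illustrative simplification, not the project's method: it assumes the speaker's face and any salient regions are given as axis-aligned boxes, tries four fixed positions around the face, and keeps the in-frame position that overlaps salient content the least. The names place_subtitle, overlap_area and margin are hypothetical.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), origin at top-left


def overlap_area(a: Box, b: Box) -> int:
    """Area of the intersection of two boxes (0 if they do not overlap)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return w * h if w > 0 and h > 0 else 0


def place_subtitle(face: Box, sub_size: Tuple[int, int],
                   frame_size: Tuple[int, int], salient: List[Box],
                   margin: int = 10) -> Optional[Box]:
    """Pick a subtitle box beside the speaker's face.

    Candidates are tried in the order right, left, below, above.
    Returns None when no candidate fits inside the frame.
    """
    fx, fy, fw, fh = face
    sw, sh = sub_size
    frame_w, frame_h = frame_size
    candidates = [
        (fx + fw + margin, fy + (fh - sh) // 2, sw, sh),  # right of the face
        (fx - margin - sw, fy + (fh - sh) // 2, sw, sh),  # left of the face
        (fx + (fw - sw) // 2, fy + fh + margin, sw, sh),  # below the face
        (fx + (fw - sw) // 2, fy - margin - sh, sw, sh),  # above the face
    ]
    best, best_cost = None, None
    for box in candidates:
        x, y, w, h = box
        if x < 0 or y < 0 or x + w > frame_w or y + h > frame_h:
            continue  # candidate would be cut off by the frame edge
        # Cost: pixels of salient content (and of the face itself) covered.
        cost = sum(overlap_area(box, region) for region in salient + [face])
        if best_cost is None or cost < best_cost:
            best, best_cost = box, cost
    return best
```

When the function returns None, a caller could fall back to the traditional bottom-of-screen position. The manual adjustment interface in deliverable 2 would then let an editor override any automatic choice.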


Other info

This project builds on our previous work on Speaker-following Video Subtitles, which was published in ACM TOMM (2014).