Yongtao Hu¹, Jan Kautz², Yizhou Yu¹, Wenping Wang¹

¹ The University of Hong Kong, Hong Kong      ² University College London, United Kingdom

ACM Transactions on Multimedia Computing, Communications, and Applications (ACM TOMM 2014)

Sample result for the TV episode "Friends.S10E15"
(French audio, English subtitles).


We propose a new method for improving the presentation of subtitles in video (e.g., TV and movies). With conventional subtitles, the viewer has to constantly look away from the main viewing area to read the subtitles at the bottom of the screen, which disrupts the viewing experience and causes unnecessary eyestrain. Our method places on-screen subtitles next to the respective speakers, allowing the viewer to follow the visual content while simultaneously reading the subtitles. We use novel identification algorithms to detect the speakers based on audio and visual information. The placement of the subtitles is then determined using global optimization. A comprehensive usability study indicated that our subtitle placement method outperformed both conventional fixed-position subtitling and a previous dynamic subtitling method in terms of enhancing the overall viewing experience and reducing eyestrain.



  title={{Speaker-Following Video Subtitles}},
  author={Hu, Yongtao and Kautz, Jan and Yu, Yizhou and Wang, Wenping},
  journal={ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)},