Subtitle Quality

Measuring and improving subtitle quality

Published: 1 January 2012

Around 10% of television viewers in the UK use subtitles on a daily basis. Our work reviews the problems of subtitling and how they can be overcome on all our platforms.

Project from 2012 - 2016

What we're doing

Around 10% of television viewers in the UK use subtitles on a daily basis and many more now use them when watching clips and programmes online. They help many people enjoy television where they cannot have sound turned on, as well as being an access service for people with hearing difficulties or language issues.

For the past four years we have been looking into the ways in which the quality and quantity of subtitling can be improved for our audiences. Our work benefits from our being part of a public service broadcaster. We have access to the resources of the BBC, including its audience research and programme archives, and with help from production teams we can also create bespoke content for our tests.

Audience surveys have helped us understand how people use subtitles. 90% of people who watch television with subtitles do so with the sound turned on. They use subtitles in combination with sound and lip reading to follow the programme. The surveys also help us design our user research to model how people use subtitles at home. We carry out research in a purpose-built lab that replicates a living room environment. We recruit representative groups of subtitle users to take part, and by using opinion scores and structured interviews we can build up a detailed understanding of the experience of using subtitles.

Subtitle availability

The most important issue for our audience is the availability of subtitles, and our focus now is on ways in which we can provide subtitles for the many thousands of video clips on the BBC's web pages. We have been developing ways of matching video clips on our web pages to the original broadcast programme in our archive and then locating the matching subtitles for the web video. Our first prototype focused on the News web pages, where it was able to find matches for around 40% of the video clips. We have applied for a patent for our technique, and it was written up as a paper for the NAB 2015 conference, "Automatic retrieval of closed captions for web clips from broadcast TV content". The disadvantage of this approach was that it required human intervention to verify and, if necessary, edit the subtitles it produced, because the subtitles being recycled had been produced live.

We have now developed an approach that is entirely automatic, capable of finding and verifying subtitles for video clips that were sourced from pre-recorded programmes. The work has produced a successful proof-of-concept implementation, which has been tested against a set of over 7,000 video clips from the BBC Bitesize web site. The system returned matching subtitles for around 47% of the clips. Further tests with other BBC brands indicate that around half of the video clips on the BBC web site could be subtitled in this way. The key aim is to produce subtitles for web clips without requiring human intervention, enabling tens, if not hundreds, of thousands of video clips to be subtitled at marginal cost. This work has now been written up as a paper, "Automatic recovery and verification of subtitles for large collections of video clips", which was presented at IBC2016 in the Paper Session: Novel Technologies for Assisting Sensory-Impaired Viewers.

Subtitle quality

At last year's IBC2015 conference we published two papers on subtitle quality.

The first was called "The Impact of Subtitle Display Rate on Enjoyment Under Normal Television Viewing Conditions" and was based on user research we carried out in March and April 2015. The tests used specially shot news stories, which were read at a series of different word rates, along with a series of off-air clips. The results showed that subtitle users want the subtitles to match the word rate of the speech, even when the rate of the speech far exceeds current subtitling guidelines. Indeed, the rating of the speed of the news clips by subtitle users was closely matched by the rating of hearing viewers when watching without subtitles. This paper was amongst the top eight in the conference and was published in the journal 'The Best of IET & IBC 2015-16'. James wrote a blog post to go with the paper, called "How fast should subtitles be?". This work has overturned previous claims from some academics and supports the working practices of much of the industry, which have been developed in line with the views of the audience.

The second paper, "Understanding the Diverse Needs of Subtitle Users in a Rapidly Evolving Media Landscape" gave an overview of our subtitles research covering the past two years, bringing together work which had been published at academic conferences with developments in our understanding of the viewers' experience of subtitles.

Previous work

At IBC 2015 we also demonstrated our work on Responsive Subtitles, which showed the potential for subtitles to be formatted into blocks in response to the device capabilities and user input. This work was first presented at the Web for All conference in May 2015, in our paper "Responsive Design for Personalised Subtitles".

Subtitled firefighters on various screens

In December 2014 we carried out a programme of user research in collaboration with Mike Crabb who was on placement with BBC R&D from The University of Dundee at the time. This research looked at how subtitles could be presented with a video clip on a web page, adjustment of subtitle size and follow-up work on dynamic subtitles. This work is in the process of being written up as a series of papers. The first of these papers, "Dynamic Subtitles: the User Experience" was presented at TVX2015 in June 2015 and the second paper, "Online News Videos: The UX of Subtitle Position" was presented at ASSETS’15 in October 2015, along with a short paper, "The Development of a Framework for Understanding the UX of Subtitles", which was part of the poster session.

When we first started this work we began by looking into ways in which we might use language models for individual programme topics to improve the performance of speech-to-text engines and to detect errors in existing subtitles. We had some early success modelling weather forecast subtitles, which suggests there may be some value in this approach, but it appears that other topics would be less successful. See White Paper WHP 256: "Candidate Techniques for Improving Live Subtitle Quality" for more details.

Then, at the request of our Technology, Distribution and Archives, Solution Design team we carried out a ground-breaking study into the relative impact of subtitle delay and subtitle accuracy. This work required the development of new test methodologies based on industry standards for measuring audio quality. A user study was carried out in December 2012 with a broad sample of people who regularly use subtitles when watching television. The results were presented at IBC2013 in September and are available as White Paper WHP 259: "The Development of a Methodology to Evaluate the Perceived Quality of Live TV Subtitles". Following on from this work the BBC and its subtitling partners have been making significant improvements to live subtitles available on news bulletins by using the presenter’s scripts to create the subtitles. This can result in news bulletins which have word-for-word subtitles presented without delay and without errors. Further work by Trevor Ware and industry partners has led to the development of a way of retiming live subtitles by making use of the video coding delay. This work has been published as a white paper, "Live subtitles re-timing proof of concept".
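As a rough illustration of the retiming idea, the sketch below assumes that the broadcast video is held back by a fixed coding delay and that the average lag of the live subtitles behind the speech has been estimated separately. Under those assumptions each subtitle can be moved earlier by the smaller of the two values. The Cue type, the retime function and the example figures are hypothetical and are not taken from the white paper.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Cue:
    """One subtitle on the programme timeline (times in seconds). Hypothetical type."""
    start: float
    end: float
    text: str


def retime(cues: List[Cue], subtitle_lag: float, coding_delay: float) -> List[Cue]:
    """Shift every cue earlier by the part of the lag the coding delay can absorb.

    subtitle_lag: assumed estimate of how far subtitles trail the speech.
    coding_delay: assumed time the video spends in the encoder.
    """
    if subtitle_lag < 0 or coding_delay < 0:
        raise ValueError("lag and delay must be non-negative")
    # The subtitles cannot be advanced by more than the video is delayed.
    shift = min(subtitle_lag, coding_delay)
    return [replace(c, start=c.start - shift, end=c.end - shift) for c in cues]


# Illustrative numbers only: a 5 s subtitle lag and a 3 s coding delay.
live = [Cue(10.0, 12.0, "Good evening.")]
retimed = retime(live, subtitle_lag=5.0, coding_delay=3.0)
# The cue moves 3 s earlier, leaving 2 s of residual lag.
```

In this simplified model any lag beyond the coding delay remains visible to the viewer, which is why the shift is capped rather than applied in full.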

BBC Audiences have conducted surveys for us to provide background data on the level of use of subtitles and how people are using them and what issues they have. More recently we have started to examine the iPlayer statistics on subtitle use as they have the potential to give us insight into the use of subtitles on a programme-by-programme basis. This data is in the process of being verified and should be made public in the coming months.

We have also built a prototype subtitle monitoring tool to allow us to track long term trends with issues that we can measure, such as position and reading rate, as originally outlined in White Paper WHP 255: "Measurement of Subtitle Quality: an R&D Perspective".
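To make the reading-rate measurement concrete, the following minimal sketch computes words per minute for individual subtitles and lists those above a chosen rate. It is not the monitoring tool itself: the Cue type, the function names and the 200 words-per-minute limit are assumptions made for illustration, and counting whitespace-separated words is only one possible definition of reading rate.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    """One subtitle with its on-screen times in seconds. Hypothetical type."""
    start: float
    end: float
    text: str


def words_per_minute(cue: Cue) -> float:
    """Reading rate of a single cue, counting whitespace-separated words."""
    duration = cue.end - cue.start
    if duration <= 0:
        raise ValueError("cue must have a positive duration")
    return len(cue.text.split()) * 60.0 / duration


def cues_above(cues: List[Cue], limit_wpm: float) -> List[Cue]:
    """Return the cues whose reading rate exceeds limit_wpm."""
    return [c for c in cues if words_per_minute(c) > limit_wpm]


cues = [
    Cue(0.0, 2.0, "Good evening and welcome"),           # 4 words in 2 s = 120 wpm
    Cue(2.0, 3.0, "Here are tonight's main headlines"),  # 5 words in 1 s = 300 wpm
]
fast = cues_above(cues, limit_wpm=200.0)  # 200 is an arbitrary example limit
```

Logging these per-cue rates over time would be one way to follow long term trends. The limit is only a reporting aid here, since the research described above found that viewers prefer subtitles that keep pace with the speech.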

Project Team

  • James Sandford (BSc)

    Research Technologist
  • Michael Armstrong (BSc Eng)

    Senior R&D Engineer
