Automatic Subtitling in Limecraft Flow

Maarten Verwaest
January 29, 2017

Ever more producers and content agencies are looking to automate video captioning. Automatic Speech Recognition (ASR) and automated cueing of subtitles are the key to controlling the cost and the delay otherwise incurred by manual subtitling processes. Because expensive subtitles (or the lack thereof) may hamper the retrievability and the accessibility of valuable content, we took on the challenge of radically automating the process.

Why Automatic Subtitling

For producers of audiovisual content, including Film, Television and Corporate Video, subtitling has become an essential part of the production process. People with a hearing impairment depend on it to fully appreciate the content. It is also best practice to publish non-broadcast content with complementary subtitles, because people do not necessarily want to switch on their audio. Moreover, subtitles delivered alongside the video will improve the Search Engine Optimisation (SEO) for video.

However, the cost of subtitling is significant. Depending on the type of content, it may take 10 to 20 hours of manual work to edit the subtitles of a 50-minute documentary or drama. So we asked ourselves what it would take to radically automate the subtitling process, and we cracked the code.

The Limecraft Approach to Subtitling

To automate the subtitling process, our development teams faced a number of different challenges. Turning audio into text is the easy part, as we already offer transcription services as part of Flow. Turning the script or the transcription into properly formatted subtitles is the more challenging part. First of all, we rely on Natural Language Processing (NLP) to remove the non-meaningful words. Subsequently, we must cut the modified transcription into separate lines of text (captions) according to a specific set of rules. The styling rules differ from producer to producer, so we had to make them configurable.

From a user’s point of view, we strive to hide all underlying complexity behind a nice and easy-to-use interface. You may instruct us to produce subtitles with a single click.
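
To make the caption-cutting step more concrete, here is a minimal sketch in Python. It is not the Limecraft Flow implementation: the names `StyleRules`, `remove_fillers` and `split_into_captions`, the default limits, and the filler-word list are all assumptions chosen for illustration, and a simple word list stands in for real NLP.

```python
from dataclasses import dataclass


@dataclass
class StyleRules:
    """Hypothetical per-producer styling rules (illustrative defaults)."""
    max_chars_per_line: int = 37
    max_lines_per_caption: int = 2
    filler_words: frozenset = frozenset({"uh", "um", "erm"})


def remove_fillers(words, rules):
    """Drop filler words; a plain word list stands in for real NLP here."""
    return [w for w in words
            if w.lower().strip(".,!?") not in rules.filler_words]


def split_into_captions(text, rules):
    """Greedily wrap words into lines, then group lines into captions."""
    words = remove_fillers(text.split(), rules)
    lines = []
    current = ""
    for word in words:
        candidate = word if not current else current + " " + word
        # Accept the word if the line stays within the limit, or if the
        # line is still empty (an over-long single word gets its own line).
        if len(candidate) <= rules.max_chars_per_line or not current:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    n = rules.max_lines_per_caption
    return [lines[i:i + n] for i in range(0, len(lines), n)]


if __name__ == "__main__":
    rules = StyleRules(max_chars_per_line=20)
    text = "Um, we asked ourselves what it would take to automate subtitling"
    for caption in split_into_captions(text, rules):
        print(caption)
```

Because the limits live in a single rules object, a different producer's style guide only requires different values, not different code. A production system would also take timing, sentence boundaries and reading speed into account, which this sketch ignores.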

As soon as Limecraft Flow finishes the transcription, it returns automatically generated subtitles as well. These are displayed side by side with the transcript, to enable quality control and manual correction if necessary. You can play back the sequence with the subtitles. When finished, you can export the subtitles in the format of your choice (SRT, STL, EBU-TT-D).
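
As an illustration of the export step, the sketch below writes timed captions in the SRT format, where each entry is a sequence number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, and the caption text, followed by a blank line. The function names `format_timestamp` and `to_srt` and the `(start, end, lines)` tuple layout are assumptions made for this example, not the Flow export API.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(captions):
    """Render (start_seconds, end_seconds, lines) tuples as SRT text."""
    blocks = []
    for index, (start, end, lines) in enumerate(captions, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n" + "\n".join(lines))
    # Entries are separated by a blank line.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    example = [
        (0.0, 2.5, ["we asked ourselves", "what it would take"]),
        (2.5, 4.0, ["to automate", "subtitling"]),
    ]
    print(to_srt(example))
```

STL and EBU-TT-D are structurally different (binary and XML-based respectively), so each would need its own writer; only SRT is sketched here.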

Key Benefits

While for some cases the automatically generated captions are good enough, you’ll appreciate the ability to review and to correct what the machine generated in the first place. In either case we anticipate a cost saving of 60% to 80%, depending on the quality of the audio used for transcription.

More importantly, because we took great care over usability, journalists and reporters can now produce subtitles themselves and make their story immediately available for distribution. From the feedback we have received to date, we understand that the shorter cycle time is at least as important as the reduced cost.

The Limecraft subtitling services are available free of extra charges when using the Limecraft platform.

💡 If you would like to give it a try, please activate your account, or drop us a line at info@limecraft.com if you need more information.