AI transcription, Subtitling, Events

Using AI in AVT – What works and what doesn’t

Charlotte Coppejans
October 27, 2022

On November 10th 2022, at the Translating Europe Forum, our CEO Maarten Verwaest shared valuable insights on the dos and don’ts of using AI for professional transcription, subtitling and Audiovisual Translation (AVT). 400 people attended the conference in person, and another 2,000 followed online via the website of the European Commission. The keynote is now available as an online video.


Using AI in AVT

Maarten Verwaest discusses the latest advances in AI for media. Many use cases of AI in media and entertainment, such as AI transcription and subtitling, are very promising, and language professionals in particular have discovered several time-saving applications. The key questions are how to deal with inaccuracy, what user interface is needed to make it work, how to address people’s concerns about how it will affect the quality of their work, and the more existential question of how to measure “success” in the first place.

Whether or not an application is adopted by professionals as part of their day-to-day work depends on many factors. In this presentation, illustrated by real cases of transcription and subtitling, we explored what works, what doesn’t quite work yet, and which areas are unlikely to be solved by technology alone.

Key take-aways:

  • AI is prone to errors; never publish the results without professional review (“Never leave AI unattended”)
  • Time savings depend heavily on the type of content and the audio quality
  • Repeated error correction is detrimental to efficiency and acceptance
  • Fine-tuning is a step-by-step process
  • Usability is the key to adoption: a good user interface for post-editing is at least as important as the accuracy of the result

1. The Problem: manual work is slow and expensive

Producers and post-production facilities worldwide face similar challenges. The volumes of material that need to be processed are increasing and the number of output formats is growing, while at the same time turnaround times are shrinking. Very similar effects apply to subtitling and audiovisual translation, aggravated by exploding demand.

Post-production companies and Language Service Providers need to reinvent themselves. Adding more manpower is not the right solution, and adding more point solutions only adds to the complexity. They need automation, and recent advances in Artificial Intelligence (AI) look very promising. However, AI needs to be handled with great care, as it does not always deliver a perfect result. An expert-in-the-loop workflow is strongly advised, effectively establishing co-creation between artificial and human intelligence.

Today's video processing solutions don't scale; Limecraft is up to 75% faster because it uses AI for automation

2. The Solution: a single Workspace that seamlessly integrates AI services

As suggested, AI services such as AI transcription, automated subtitling and computer-assisted shot listing can be very helpful, provided that (1) they are properly trained and (2) they are tightly integrated into the workflow. This is why Limecraft invented the Workspace: a consistent user experience that connects to local or cloud-based storage and AI services, without forcing users to switch between apps or resort to copy and paste. Together, these provide solutions for ingest, computer-assisted logging or spotting, transfer to and from post-production, and, for that matter, subtitling and localisation.

Limecraft is a workspace for video teams that interacts consistently with storage, AI services, editing software, archives, etc.


3. How it was and how it is now

Limecraft was one of the first Asset Management solutions to properly implement Automatic Speech Recognition (ASR), or AI transcription, technology at the core of its product. While the original purpose was to index material for search, supporting the work of archive producers and journalists, we started working on AI subtitling for the BBC and NPO in 2016. That work focused on same-language subtitling, and we delivered an industry-first implementation of broadcast-grade subtitle spotting using configurable spotting rules.

💡 What are subtitling spotting rules?
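In short, spotting rules constrain how a transcript is cut into timed subtitles: line length, number of lines, reading speed, and on-screen duration. A minimal sketch in Python, assuming typical broadcast-style constraints (the values and names below are illustrative, not Limecraft's actual configuration):

```python
# Illustrative subtitle spotting rules and a validity check.
# The rule values are common broadcast-style defaults, used here as an example.
from dataclasses import dataclass

@dataclass
class SpottingRules:
    max_chars_per_line: int = 42   # maximum characters per subtitle line
    max_lines: int = 2             # maximum lines per subtitle
    max_cps: float = 17.0          # reading speed: characters per second
    min_duration: float = 1.0      # minimum seconds on screen
    max_duration: float = 7.0      # maximum seconds on screen

def fits(rules: SpottingRules, text: str, start: float, end: float) -> bool:
    """Check whether a candidate subtitle respects the spotting rules."""
    duration = end - start
    lines = text.split("\n")
    if len(lines) > rules.max_lines:
        return False
    if any(len(line) > rules.max_chars_per_line for line in lines):
        return False
    if not (rules.min_duration <= duration <= rules.max_duration):
        return False
    chars = len(text.replace("\n", " "))
    return chars / duration <= rules.max_cps

rules = SpottingRules()
print(fits(rules, "Never leave AI unattended.", 0.0, 2.0))  # True: 13 cps, 1 line
```

An automatic spotter applies checks like these while choosing cut points, so every generated subtitle is readable at the target reading speed without post-editors having to retime it by hand.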

Limecraft relies on a solid track record when it comes to applying AI in media and AVT


Next, we added Neural Machine Translation (NMT) to support interlingual subtitling, or localisation, and we also started working on visual transcription to enable AI shot listing. The bulk of the engineering work was conducted in the MeMAD project, a large-scale R&D action funded by the European Commission.

One of the key improvements was further work on the spotting rules, which are now close to perfect. Once the transcript has been post-edited, cutting it into high-quality subtitles takes only seconds, and the result is impressive. Have a look for yourself in the following clip, where we display the subtitles from 2018 in the middle of the screen and state-of-the-art AI subtitles at the bottom 👇



💡 Interested to know more or to give it a try? You are welcome to join us for a demo.