Facebook has announced the availability of automatic captions for the IGTV platform, starting with captions for videos on demand in up to 16 languages worldwide.
This release comes after the launch of the automatic captions feature for Facebook Live and Workplace Live, which arrived in March in six languages: English, Spanish, Portuguese, Italian, German and French.
Facebook says its expanded captions feature builds on alternative text updates it made a few years ago to support people with visual impairments.
“As more people use captions, artificial intelligence learns, and we expect quality to continue to improve, and this is a small step, and we look forward to expanding into more places, languages and countries,” the company added.
In a post, the social media platform explained that it developed a technique for training the machine learning models behind automatic speech recognition to directly predict the characters (graphemes) of words, which simplifies model training.
Using public Facebook posts, engineers trained the models to adapt to new words and to predict where they will occur in videos.
Facebook also says it was able to deploy these models thanks to a number of infrastructure improvements, which let it handle the additional video traffic associated with the pandemic.
According to the platform, the number of Facebook Live broadcasts from Pages doubled in June 2020 compared with the same period a year earlier.
Facebook launched its first automatic caption product in February 2016 for video ads, and in October of the same year it launched a free video captioning tool across all Facebook pages in English.
Although the tools have improved over the years, the evidence indicates that they still have a long way to go.
As a recent Forbes article noted, caption errors disproportionately affect the video-viewing experience for people with hearing impairments.
Given its clear awareness of the shortcomings of its systems, Facebook says it is looking at ways to improve captions in the future.
In a study last month, Facebook was able to reduce the word error rate, a common measure of speech recognition performance, by more than 20 percent using a new method.
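For readers unfamiliar with the metric, word error rate is usually defined as the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition in Python. It is not Facebook's evaluation code, and the function name and sample sentences are made up for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: splits on whitespace and does no text
    normalization (case, punctuation), which real evaluations usually add.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical caption with one wrong word and one dropped word.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"{wer:.3f}")
```

In this made-up example the caption has two errors against a six-word reference, so the rate is 2/6. If the reported reduction of more than 20 percent is relative (the article does not say), a system with a 10 percent word error rate would drop to below 8 percent.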