Tatarstan Academy of Sciences to fund Russian-Tatar neural translator

Photo: realnoevremya.ru/Dinar Fatykhov

TV, radio and theatre professionals to be invited to voice the translator

Realnoe Vremya has learned that the republic's Academy of Sciences plans to create a neural network for machine speech translation. The project is budgeted at 6.8 million rubles. According to the documentation, the developers must collect, record and time-code audio data and prepare a speech dataset for the Tatar language. They must also train acoustic and language models for Tatar speech recognition and synthesis on the collected dataset.

The resulting web service must be able to recognise and synthesise Tatar speech. The translator should be available both as a website and as a mobile app, which must make clear that speech can be translated from Russian into Tatar and vice versa. The platform is to be engineered to run even on devices with minimal specifications. The app is meant to be free and available on all operating systems.


The Tatar audio data will be distributed across speech styles as follows: informal, less than 10%; journalistic, no less than 60%; official, literary and scientific, no less than 30%. Professional presenters will also take part in recording audio for Tatar speech synthesis: men and women working in television, radio or theatre will provide the voices. Voices from 2,000 unique speakers of different ages, sexes and Tatar dialects will be loaded into the translator.
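
The style quotas above amount to a simple composition check on the dataset. As a minimal sketch, assuming that shares are measured by recording duration (the article does not say whether the percentages refer to duration or to the number of recordings) and using hypothetical style labels, such a check could look like this in Python:

```python
from collections import Counter

# Hypothetical labels and structure: the article names the speech styles
# but does not describe any file format or schema for the dataset.
QUOTAS = {
    "informal": ("below", 0.10),         # less than 10%
    "journalistic": ("at_least", 0.60),  # no less than 60%
    "formal": ("at_least", 0.30),        # official, literary and scientific combined
}


def style_shares(recordings):
    """Return each style's share of the total duration.

    `recordings` is an iterable of (style, seconds) pairs.
    """
    totals = Counter()
    for style, seconds in recordings:
        totals[style] += seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no audio to check")
    return {style: totals[style] / grand_total for style in QUOTAS}


def check_quotas(recordings):
    """Return the list of styles whose share violates its quota."""
    shares = style_shares(recordings)
    violations = []
    for style, (kind, bound) in QUOTAS.items():
        share = shares[style]
        ok = share < bound if kind == "below" else share >= bound
        if not ok:
            violations.append(style)
    return violations
```

A dataset with 5% informal, 62% journalistic and 33% formal speech would pass, while one with 20% informal speech would be flagged for both the informal and the journalistic quotas.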

Tatar audio literature portal

The Tatarstan Academy of Sciences told Realnoe Vremya that the voice translator will run within the web service of the Tatsoft Russian-Tatar machine translator, which is publicly available. The result will serve as the basis for the following in-demand IT services for the Tatar language:

  • automatic subtitling of TV programmes and videos;
  • automatic translation of performances;
  • a website and a Telegram bot with speech services for the public and journalists;
  • a modern portal of audio literature in Tatar;
  • a Tatar version of ChatGPT.

“Also, the Russian-Tatar machine translator will be used in the SmartCat system by the republic’s public institutions to prepare norms and regulations in Tatarstan’s official languages,” the Academy of Sciences said.

How the Liliya assistant was launched in Tatarstan

The Liliya phone robot was the most recent voice assistant launched in Tatarstan. It helped residents of the republic who were self-isolating during the coronavirus pandemic. Liliya could process up to 2,000 phone calls a day.

realnoevremya.ru/Rinat Nazmetdinov

Later, the ability to submit gas, water, heating and electricity meter readings was added. Over a thousand people used this feature. The bot still exists, but the service does not provide up-to-date information for the republic. According to it, 179,000 people have had COVID-19 in the republic, although official statistics put the number above 200,000. Nor is it possible to submit meter readings: Liliya asks for your phone number and offers to send an SMS code, but no message arrives.

Alexander Zaripov
Tatarstan