How neural networks can help preserve the Tatar language

Generation of ideas, improvement of texts and much more. Part 1

Photo courtesy of Raynur Khasanov

It has been announced that more than 20 languages of the peoples of Russia will be added to the Yandex Translator service, and neural network technology for speech recognition and synthesis will also be built in for some of them. Tatar, the most popular language in the aggregator's search after Russian, will be the first language the system begins working with in this format. Raynur Khasanov, chairman of the World Forum of Tatar Youth and an IT specialist who uses artificial intelligence services daily to improve, simplify and speed up work on various projects, talked about his experience of working with such services.

What do Neurotatarlar do?

How will this news change the life of the Tatar-speaking population? It means that users will be able to make search queries in Tatar on their gadgets, in maps and in instant messengers, and the Alisa virtual assistant will learn to read fairy tales in the language of the Tatar poet Tukay. The project is being implemented jointly with the Federal Agency for Nationalities’ Affairs and regional language institutes.

Note that neural networks find it harder to work with small languages, since there are few translated texts to learn from. On the other hand, it helps that Tatar is a Turkic language, which makes it possible to create a unified model for related languages.

“Unfortunately, many artificial intelligence services do not yet support the Tatar language,” says Khasanov. “But the activists of the Neurotatarlar community and I understand how to change this. To do so, we are collecting the largest monocorpus of the Tatar language, which will allow us to train open-source language models. The monocorpus will also be publicly available so that global corporations can use it to train their models.”
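
To give a rough idea of what collecting a monocorpus involves, here is a minimal Python sketch of one cleaning pass: normalise each line, keep only lines that look Tatar, and drop duplicates. It is an illustration only, not the Neurotatarlar pipeline. The function names, the minimum-length threshold and the letter-based language check are all assumptions made for this example. The check itself is deliberately crude, since the same letters also occur in other Turkic Cyrillic alphabets such as Bashkir and Kazakh.

```python
import unicodedata

# Assumption for this sketch: letters used in Tatar Cyrillic but absent from
# Russian serve as a cheap hint that a line is written in Tatar.
TATAR_LETTERS = set("әөүҗңһӘӨҮҖҢҺ")


def normalize(line: str) -> str:
    """Apply Unicode NFC normalisation and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", line).split())


def looks_tatar(line: str, min_hits: int = 1) -> bool:
    """Return True if the line contains at least `min_hits` Tatar-specific letters."""
    return sum(ch in TATAR_LETTERS for ch in line) >= min_hits


def build_monocorpus(lines, min_chars: int = 20):
    """Keep normalised, sufficiently long, Tatar-looking lines, without duplicates."""
    seen = set()
    corpus = []
    for raw in lines:
        line = normalize(raw)
        if len(line) < min_chars or not looks_tatar(line):
            continue
        key = line.casefold()  # case-insensitive duplicate detection
        if key in seen:
            continue
        seen.add(key)
        corpus.append(line)
    return corpus
```

Calling `build_monocorpus` on a list of raw lines returns the surviving lines in their original order. A real corpus would need a proper language identifier and near-duplicate detection; the sketch only shows the shape of the task.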

Raynur Khasanov. Photo courtesy of the press service of the Tatarstan Ministry of Culture.

Let’s start with ChatGPT

“I think there is no one left who has not heard about this neural network, but a small share of people still do not fully understand what this AI is and what real possibilities are hidden behind its functions,” says Khasanov. “ChatGPT is a smart programme designed to communicate with people. It can answer questions, help with writing texts, give you advice and keep a conversation going. Imagine a robot that can talk, but uses text instead of voice. You write questions to it, and it answers, trying to be helpful and understandable.”

So it is a service that can answer questions, write and translate texts, analyse data, generate ideas, make plans and much more. Like any typical SaaS solution, it has free and paid versions. The free one only works with the GPT-3.5 model, and the service itself is only available via VPN. But how can all this be applied to popularise the Tatar language? Let’s start with the fact that GPT-4o, unlike earlier versions, understands and writes Tatar better. It can therefore be used to generate texts in Tatar.

“Here’s an example: I asked it to find information on the Internet about our Hay Bazaar summer Tatar urban culture festival,” says Khasanov and sends a screenshot.

It’s not perfect, there are some factual errors, but overall it’s very good.

The neural network’s ideas for the festival’s design. Photo courtesy of Raynur Khasanov.

Text generation

The service can create a variety of texts, including articles, stories, dialogues and scripts, which is especially useful for writers working on books and comics in the Tatar language. Let’s say you want to make a Tatar comic: the service will help generate an idea for it. The same scripts can be used to create an animated film by asking the service to divide them into scenes. The tool is also suitable for generating realistic dialogues, which is useful for cartoon and comic book writers.

And although the service works better with English, it can be used to translate texts into Tatar. Together with translators such as Yandex Translator and Tatsoft, this can significantly speed up the process of content adaptation.

A neural network can write a script. Photo courtesy of Raynur Khasanov.

Brainstorming and idea generation

“ChatGPT can offer new ideas and directions for your projects,” Khasanov points out. “This is useful for authors who are looking for inspiration or new approaches to their work. Let’s say you want to create a project to develop the Tatar language, but you don’t know where to start. You can ask ChatGPT: ‘What is missing on the Internet to popularise the Tatar language among young people?’ And it will offer you some options. By reflecting on its answers, you can find an interesting area where your project can be implemented, and write a road map as well.”

A neural network is still far from matching our proofreaders. Photo courtesy of Raynur Khasanov.

Text processing and improvement

The tool can help with editing and improving existing texts, making them more coherent and readable. “As I said above, GPT-4o already writes well in the Tatar language and can also find errors and correct them. In particular, the team and I are preparing our own analogue of Grammarly, due by the end of this summer, which will help in working with Tatar texts,” says Khasanov.

In the next part, we’ll talk about specific service plugins as well as neural networks such as Leonardo Ai, HeyGen and Suno.

Radif Kashapov
