Voiced by AI: How robots are taking over book streaming
Artificial intelligence is closing in on narrators: how algorithms are changing the audiobook market and who will win the race for listeners

Artificial intelligence is already narrating novels, choosing voices based on the genre, and even mimicking the intonations of beloved actors. It’s convenient, fast, and cheap to produce, but what will happen to human narrators once the market is dominated by algorithms? At the International Fair of Intellectual Literature non/fictioNvesna, experts discussed digital voices, robot narrators, and the fine line between comfort and imitation. Will we soon stop hearing human voices in books, and will we even notice the difference?
Mechanical narration
Currently, book streaming platforms offer two types of audio content: professional, complex narration by actors and books narrated by neural networks. “Users complain that AI narration lacks personalisation and a choice of voices, and that AI narrators sound mechanical,” said Yaroslav Tarnopolsky, head of the product division at the book service Stroki. He added that Stroki plans to strike a balance between the two types of content.
Tarnopolsky noted that AI narration is not suitable for all types of books. “It’s easier, faster, and cheaper to use artificial intelligence for narrating non-fiction, psychology, and self-help books because producing audiobooks with narrators is quite an expensive endeavor. The sheer volume of content we receive every week cannot be narrated by humans. It would be extremely costly,” added Tarnopolsky.

The audiobook publishing company VIMBO produces audiobooks and collaborates with several book streaming services. According to Vadim Bukh, the publishing house’s CEO and founder, book services actively promote books that have been narrated by professional actors. “The better an audiobook is made, the more the platforms promote it. In other words, performances and series by talented actors receive much greater exposure and a larger audience,” said Bukh.
Meanwhile, Diana Smirnova, head of the RUGRAM platform and producer at the Everbook service, believes that AI-narrated content will soon dominate book streaming platforms. “Yes, of course, there are still problems with AI narration today. But given the pace of its development, within five years, not even ten, we’ll reach very high-quality narration,” said Smirnova. Tarnopolsky supported this forecast, noting that in the future it will be possible not only to choose a narrator’s voice from the available options, but even to listen to fairy tales recorded in the voices of loved ones.
Will there be no humans left, only AI?
All participants in the discussion agreed that there is no point in ignoring artificial intelligence or trying to deny its impact on the market. Smirnova suggested that in the near future, around 80% of all audio content will be generated using AI narration. This is neither good nor bad — it’s simply a fact. Vadim Bukh added that in some cases, AI narration can even be better than human narration.
“AI will first replace bad narrators who can’t even place the stress correctly. Then it will replace those who simply can’t read the text properly. A large number of books will be narrated by so-called robots named Ivan and Maria, and there will be an audience for that kind of audio content.”

Experts are confident that artificial intelligence will certainly push unqualified professionals out of the market. However, it is still difficult to imagine how AI will place emphasis within a text, convey the author’s intent through intonation, and, as Bukh put it, “claim ownership” of the narrative.
Mikhail Litvakov, general producer at the audiobook publishing house VIMBO, said that working with AI today is similar to working with an actor. The AI is also given direction — where to sound more tense, where to be more cheerful, where to add fear to the voice. Afterwards, a sound engineer compiles all these fragments into a single audio file. But there’s a key difference: “What sets this apart from working with an actor? Only that an actor can surprise you at any given moment,” added Litvakov.
But this ability to “surprise” isn’t always what listeners, viewers, or readers are looking for. At the same non/fictioNvesna fair, there was an experimental stand showcasing fairy tales — those written and illustrated by humans on one side, and those created by artificial intelligence on the other. Often, people expressed a preference for the works generated by AI. The explanation is simple: neural networks synthesise content based on what has already been created — content that is familiar and stereotypical. In other words, it’s content the audience is used to. The human creator, on the other hand, strives for originality, which the broader public does not always understand.
It all comes down to money
Diana Smirnova suggested that streaming services raise prices for audio content narrated by humans, thereby increasing not only its monetary but also its perceived value. Vadim Bukh added that the more expensive the production of an audiobook, the better it sells in the end. “Right now, we’re not aiming to save on content. Our most expensive projects are also our best-selling ones,” said Bukh.

Mikhail Litvakov sees two business models for monetising audio content. The first is essentially what VIMBO is already doing — creating high-budget projects that are currently selling well. The second is to invest in lower-quality AI narration but produce a large volume of audiobooks. According to Litvakov, both models are viable. However, he also noted that it is currently very difficult to determine the true value of the product.
“This year, LitRes has estimated the audiobook market at 6.5 billion rubles. But how was that calculated? That’s based on the rights holder’s prices, not the consumer’s,” said Mikhail Litvakov. “For example, I subscribe to a book streaming service. How much of that money goes toward audio? We don’t know. Yes, the rights holder gets their share. That’s the price of a compromise between two parties. Then the authors receive their portion based on the number of listening hours. But right now, there is no set price for one hour of listening.”
Not so long ago, when the Storytel service still operated in the Russian market, that price did exist. At the time, the cost of one hour of listening ranged from 12 to 18 rubles. “At one point, Storytel shook up the market. It turned out that subscription revenue brought in no less per unit listened than pay-per-download. But now we have no idea how much money audiobooks are generating. We only know how many hours our audiobooks were listened to and how much money we received from the platform. We don’t know how much the individual consumer actually paid,” added Litvakov.

Smirnova drew a sobering conclusion from this: “The larger the share of AI-narrated content becomes, the smaller the share of high-quality content consumption will be. Therefore, the price of that content might rise for rights holders, compensating for our income from pure listening. Everything is heading in that direction.”
Ekaterina Petrova is a literary critic for the online newspaper Realnoe Vremya and the author of the Telegram channel Bulochki s Makom (Poppy Seed Buns).