
Deep Learning AI Mimics Human Voices In 30 Minutes or Less

Advancements made in artificial intelligence are seemingly announced every single week. Earlier this year, Deep Voice 1 was released by Baidu Silicon Valley AI Lab. That system was received with considerable enthusiasm and hesitation alike, as it generates synthetic human voices through deep neural networks. (As it happens, the end result is quite creepy.) Now that Deep Voice 2 has been announced, things have taken a dramatic new turn.

Deep Voice 2 Contains Hundreds of Voices

Transforming text to speech in a human-like fashion has proven to be difficult. Even when projects succeed in doing so, the end results are incredibly synthetic and far less professional than we would like. Thanks to solutions such as Deep Voice 1 and Deep Voice 2, that situation is slowly changing for the better. There are some definite benefits to these solutions, even though they may not necessarily become accessible to everyday consumers anytime soon.

The Deep Voice 1 platform generated synthetic human voices through deep neural networks. It did so in real time, allowing the solution to synthesize audio as fast as it needed to be played. This was a major development for interactive applications such as video games and conversational interfaces, as well as for the media industry. Baidu Silicon Valley AI Lab took the time to train deep neural networks to learn from vast amounts of data. As a result, it created one of the most comprehensive voice synthesis solutions the world had seen to date.

However, the company has not rested on its laurels. Instead, it has been working hard on improving its Deep Voice solution over the past few months. Over the course of roughly 90 days, the team managed to make its next Deep Voice system far more advanced than its first. Deep Voice 1 was trained on 20 hours of speech and offered only one voice from which to choose. Although this was not necessarily a major limitation, companies have to keep moving forward at all times, improving upon existing solutions and scaling the technology.

That is exactly what the company did. Deep Voice 2 is now trained on hundreds of hours of speech and provides hundreds of voices to choose from. It can learn from hundreds of voices and imitate them convincingly. Indeed, it’s pretty creepy to hear a computer use your voice to say something you never said. Deep Voice 2 can learn a new voice from less than 30 minutes of recorded speech.

Based on samples provided on the official website, one can hardly tell which voice is human and which is a copy, as the difference between the two is incredibly small. This goes to show that AI technology has come a long way and is advancing at an exceptional pace. Although this is by far one of the creepier developments we have seen (or heard) in recent months, it also represents a major breakthrough for any sector reliant on voice synthesis.

Baidu will not release this project’s source code free of charge, nor does it have to. It would be nice to know if this platform will ever become a consumer-grade product. Technology such as this can have major consequences and powerful use cases, assuming people have a chance to play around with it at some point. Any company working with chat assistants, robotics, or other tools involving some degree of communication would certainly benefit from embracing this technology in the future.

JP Buntinx

JP Buntinx is a FinTech and Bitcoin enthusiast living in Belgium. His passion for finance and technology made him one of the world's leading freelance Bitcoin writers, and he aims to achieve the same level of respect in the FinTech sector.
