Adobe demonstrated its VoCo software, which could generate speech from text after listening to 20 minutes of a voice. Montreal-based AI startup Lyrebird claims it can do text-to-speech using just one minute of audio. These technologies represent the kind of leap in AI that researchers and theorists raised concerns about when deepfakes democratized machine learning-generated videos.
Trying out Lyrebird AI is free, but if you want to use it longer, a monthly fee will apply. Most people associate deepfake technology with videos and images, since those are the media most often featured when people paint dark pictures of a future full of deepfake-altered content. However, another branch of deepfakes is becoming popular just as quickly, mainly because of its wide range of potential applications across many industries.
We are talking about deepfake audio, in which artificial intelligence (AI) and machine learning algorithms are used to make anyone say anything, at any time. If that sounds both cool and frightening, that is because it is exactly that.
With a simple application like Lyrebird AI, it is now easier than ever to mimic a voice precisely. Record a bit of audio, and the software will extrapolate all the sounds and intonations it needs to make that person say pretty much anything you type into the app. It points to a future where voice cloning is as normal as any other app on your smartphone.
The concept of voice cloning is not new: speech synthesis has been an important technology for many decades. The artificial production of human speech is often done with text-to-speech (TTS) systems, which have improved drastically in quality over time. The latest improvement in speech synthesis uses AI and machine learning to produce near-realistic but computer-generated voices.
Voice cloning essentially takes an audio file of any individual voice and uses it as source material for creating eerily similar AI-generated audio recordings of that same voice.
With just several hours of source material (audio recordings of an individual voice), a deepfake application such as Lyrebird AI can clone the voice, allowing it to be used to create other deepfake audio outputs. In essence, all voice cloning does is take audio file A and extract the intonations, emotions, and other subtle nuances from that voice.
It then uses a set of algorithmic rules to form a completely new set of words or sentences that were never spoken by the person in the source material. For example, suppose the audio file tells a story about why rabbits are cute. The AI-generated voice clone can turn that into any other type of story: it could talk about the latest political news, the weather, or anything else supplied as TTS input. Voice cloning is a method to make anyone say anything, at any time.
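The two stages described above (extract a voice's characteristics from a recording, then render new text with those characteristics) can be illustrated with a deliberately toy Python sketch. It is not how Lyrebird or any real voice-cloning system works: real systems rely on learned neural models, while this sketch reduces a "voice" to a pitch and a loudness and renders new text as a plain tone. The names VoiceProfile, extract_profile, and synthesize are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class VoiceProfile:
    """Toy stand-in for the characteristics a real system would learn."""
    pitch_hz: float   # estimated fundamental frequency
    loudness: float   # root-mean-square amplitude


def extract_profile(samples, sample_rate):
    """Stage 1: summarize the source recording as a small profile."""
    # Count upward zero crossings: a pure tone has one per cycle.
    crossings = sum(
        1 for prev, cur in zip(samples, samples[1:]) if prev < 0 <= cur
    )
    duration_s = len(samples) / sample_rate
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return VoiceProfile(pitch_hz=crossings / duration_s, loudness=rms)


def synthesize(text, profile, sample_rate, seconds_per_char=0.05):
    """Stage 2: render new text using only the extracted profile.

    Assumption: the text controls just the output length here; a real
    TTS model would turn it into actual speech sounds.
    """
    peak = profile.loudness * math.sqrt(2)  # sine peak from its RMS
    n = round(len(text) * seconds_per_char * sample_rate)
    return [
        peak * math.sin(2 * math.pi * profile.pitch_hz * i / sample_rate)
        for i in range(n)
    ]


# "Audio file A": one second of a 220 Hz tone standing in for a voice.
SAMPLE_RATE = 8000
source = [
    0.5 * math.sin(2 * math.pi * 220 * i / SAMPLE_RATE)
    for i in range(SAMPLE_RATE)
]

profile = extract_profile(source, SAMPLE_RATE)
# New content never present in the source, rendered in the "same voice".
clone = synthesize("the weather is nice", profile, SAMPLE_RATE)
print(profile, len(clone))
```

The point of the sketch is the separation of concerns: the source audio is only ever used to build the profile, and the new output depends on nothing but that profile and the typed text.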
Voice cloning needs only minimal source audio and a few lines of code, and it allows anyone to mimic any voice using a very simple TTS software tool and a short voice recording or existing audio file. Developed by a small Montreal-based start-up, Lyrebird AI has quickly climbed the ranks of deepfake audio applications. Since its first tech demo, the program has only gotten better and more accurate at mimicking human voices.
To quote Futurama: "Amy, technology isn't intrinsically good or evil. It's how it's used."
By Rosie McCall, 27 Feb