Microsoft creates AI that replicates 'exact voice' of humans - but it's too dangerous to release

Microsoft has developed a groundbreaking artificial intelligence (AI) speech generator, VALL-E 2, that is so realistic it won't be released to the public. The text-to-speech (TTS) system can mimic a human voice from just a few seconds of audio. According to a paper published June 17 on the pre-print server arXiv, VALL-E 2 was capable of generating "accurate, natural speech in the exact voice of the original speaker, comparable to human performance."

"VALL-E 2 is the latest advancement in neural codec language models that marks a milestone in zero-shot text-to-speech synthesis (TTS), achieving human parity for the first time," the researchers wrote in the paper.
Human parity in this context means that speech generated by VALL-E 2 matched or exceeded the quality of human...
