Unboxing the Voice Comparison Saga: Scarlett Johansson versus OpenAI’s Sky

Scarlett Johansson vs. Sky: A Voice Analysis Journey

In a study at the intersection of artificial intelligence and voice analysis, researchers at Arizona State University compared the voice of Scarlett Johansson with Sky, the now-discontinued voice of OpenAI’s GPT-4o. The analysis was commissioned by NPR, which adds an extra layer of intrigue to this tale of vocal doppelgangers.

A Close Encounter with Scarlett Johansson’s Voice

Researchers at Arizona State University set out to measure how closely Scarlett Johansson’s voice resembles Sky, the AI-generated voice from OpenAI’s GPT-4o. Their headline finding was startling: Johansson’s voice was more similar to Sky than the voices of 98% of the 600 other actresses in the comparison. This was not a casual listening test; the team used sophisticated AI models designed specifically to analyze vocal similarity.

The resemblance was not unique, however. The models rated the voices of actresses Anne Hathaway and Keri Russell as even more similar to Sky than Johansson’s. One intriguing result was that if Sky’s voice came from a physical vocal tract (the combination of throat, mouth, and nasal passages), that tract would be as long as Johansson’s. The two voices also differ: Johansson’s is slightly more breathy, while Sky’s is higher-pitched and more expressive.
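The 98% figure is a percentile rank, and a small sketch shows how such a number can be computed and why it is compatible with other voices scoring closer to Sky. The article does not describe the researchers’ actual models or scoring method, so everything here is an assumption for illustration: cosine similarity stands in for whatever similarity measure the models produced, the function names are hypothetical, and the scores are made up rather than taken from the study.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors.

    Assumption: each voice is summarized as a numeric feature vector.
    The study's real representation is not described in the article.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def percentile_rank(target_score, other_scores):
    """Percentage of other_scores that fall strictly below target_score."""
    below = sum(1 for s in other_scores if s < target_score)
    return 100.0 * below / len(other_scores)


# Made-up similarity-to-Sky scores, NOT data from the study:
# 588 comparison voices score lower than the target and 12 score higher.
other_scores = [0.30] * 588 + [0.80] * 12
target_score = 0.70

rank = percentile_rank(target_score, other_scores)
closer_voices = sum(1 for s in other_scores if s > target_score)
print(rank, closer_voices)  # prints: 98.0 12
```

With 600 comparison voices, ranking above 98% of them still leaves room for about 12 voices that score closer, which is how Hathaway and Russell can rate as more similar to Sky without contradicting the headline number.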

The Enigma of AI Voice Models

Professor Visar Berisha, who led the analysis, pointed to the challenge posed by the black-box nature of AI models. A model can report that two voices are similar, but exactly which vocal similarities and differences it picked up on remains hard to determine. Asking why a model rated one voice as more similar than another often raises more questions than it answers.

Berisha’s other notable work includes OriginStory, a microphone that watermarks recordings to verify that they were created by a human. The technology won an FTC challenge, which underlines its relevance to verifying the authenticity of audio recordings.

The Drama Behind the Voice

The plot thickens with OpenAI’s own actions and public statements. Both CEO Sam Altman and CTO Mira Murati have denied that Sky’s voice was designed to resemble Johansson’s. Yet after a GPT-4o demo, Altman cryptically posted the single word “her,” and Johansson later revealed that Altman had approached her to lend her voice to the model, a request she declined.

Johansson has not yet taken legal action against OpenAI, although she has retained legal counsel. Experts suggest that if she were to sue, she might not even need to prove that OpenAI intentionally created the resemblance in order to pose a serious challenge to the company.

The Future of AI Voices

This entire saga underscores the ethical and legal complexities intertwined with AI-generated voices. As AI continues to evolve, determining boundaries, safeguarding personal likenesses, and ensuring ethical use will become increasingly critical.

From the perspective of an investor and tech enthusiast, this scenario raises intriguing questions about the future direction of AI-generated content. We may soon need frameworks and regulations to navigate this rapidly shifting landscape.

Conclusion

The similarity between Scarlett Johansson’s voice and OpenAI’s Sky is a remarkable instance of how closely an AI-generated voice can resemble a real person’s. It also highlights the ethical dilemma surrounding creators, their rights over their own likeness, and proprietary AI voices. The advances are impressive, but they call for equal caution as the technology moves forward.
