One of the more unexpected products to launch out of the Microsoft Ignite 2023 event is a tool that can create a photorealistic avatar of a person and animate that avatar saying things that the person didn’t necessarily say.
Called Azure AI Speech text to speech avatar, the new feature, available in public preview as of today, lets users generate videos of an avatar speaking by uploading images of a person they want the avatar to resemble and writing a script. Microsoft’s tool trains a model to drive the animation, while a separate text-to-speech model (either prebuilt or trained on the person’s voice) “reads” the script aloud.
“With text to speech avatar, users can more efficiently create video … to build training videos, product introductions, customer testimonials [and so on] simply with text input,” writes Microsoft in a blog post. “You can use the avatar to build conversational agents, virtual assistants, chatbots and more.”
Avatars can speak in multiple languages. And, for chatbot scenarios, they can tap AI models like OpenAI’s GPT-3.5 to respond to off-script questions from customers.
Now, there are countless ways such a tool could be abused, which Microsoft, to its credit, realizes. (Similar avatar-generating tech from AI startup Synthesia has been misused to produce propaganda in Venezuela and false news reports promoted by pro-China social media accounts.) Most Azure subscribers will only be able to access prebuilt, not custom, avatars at launch; custom avatars are currently a “limited access” capability available by registration only and “only for certain use cases,” Microsoft says.
But the feature raises a host of uncomfortable ethical questions.
One of the major sticking points in the recent SAG-AFTRA strike was the use of AI to create digital likenesses. Studios ultimately agreed to pay actors for their AI-generated likenesses. But what about Microsoft and its customers?
I asked Microsoft its position on companies using actors’ likenesses without, in the actors’ views, proper compensation or even notification. The company didn’t respond, nor did it say whether it would require that companies label avatars as AI-generated, as YouTube and a growing number of other platforms do.
Microsoft appears to have more guardrails around a related generative AI tool, personal voice, that’s also launching at Ignite.
Personal voice, a new capability within Microsoft’s custom neural voice service, can replicate a user’s voice in a few seconds when provided a one-minute speech sample as an audio prompt. Microsoft pitches it as a way to create personalized voice assistants, dub content into different languages and generate bespoke narrations for stories, audio books and podcasts.
To ward off potential legal headaches, Microsoft is requiring that users give “explicit consent” in the form of a recorded statement before a customer can use personal voice to synthesize their voices. Access to the feature is gated behind a registration form for the time being, and customers must agree to use personal voice only in applications “where the voice does not read user-generated or open-ended content.”
“Voice model usage must remain within an application and output must not be publishable or shareable from the application,” Microsoft writes in a blog post. “[C]ustomers who meet limited access eligibility criteria maintain sole control over the creation of, access to and use of the voice models and their output [where it concerns] dubbing for films, TV, video and audio for entertainment scenarios only.”
Microsoft didn’t answer TechCrunch’s questions about how actors might be compensated for their personal voice contributions, or whether it plans to implement any sort of watermarking tech so that AI-generated voices might be more easily identified.
This story was originally published at 8am PT on Nov. 15 and updated at 3:30pm PT.