The EMO AI model by Alibaba turns any image and audio clip into an animated talking or singing video

Alibaba has introduced EMO, a model that turns a single portrait image and an audio clip into an animated talking or singing video. The method operates in two main stages. First, it encodes the reference photo to capture the facial features that must be preserved throughout the video (Frames Encoding). Then, it processes the audio to generate matching facial movements frame by frame (Diffusion Process). This keeps the facial expressions aligned with both the voice and the mood of the audio (expression mapping). It can produce videos that keep up with fast songs and can even animate old portraits or drawings, making them appear as though they are speaking or singing.
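
The two-stage flow described above can be illustrated with a toy sketch. This is not EMO's code or API: every name here (`encode_reference`, `generate_frames`, `Frame`) is hypothetical, and the diffusion stage is swapped for a trivial loudness-to-mouth-opening rule so that the example stays runnable. It only shows the shape of the pipeline: encode the image once, then derive one frame per slice of audio.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    identity: List[float]  # features carried over from the reference image
    mouth_open: float      # 0.0 (closed) to 1.0 (fully open)


def encode_reference(pixels: List[float]) -> List[float]:
    """Stage 1 stand-in: summarise the reference image as a tiny feature vector."""
    mean = sum(pixels) / len(pixels)
    return [mean, max(pixels) - min(pixels)]


def audio_windows(samples: List[float], n_frames: int) -> List[List[float]]:
    """Split the audio into one equal-length window per video frame."""
    if n_frames <= 0 or len(samples) < n_frames:
        raise ValueError("need at least one audio sample per frame")
    size = len(samples) // n_frames
    return [samples[i * size:(i + 1) * size] for i in range(n_frames)]


def generate_frames(identity: List[float], samples: List[float],
                    n_frames: int) -> List[Frame]:
    """Stage 2 stand-in (NOT diffusion): louder audio opens the mouth wider."""
    windows = audio_windows(samples, n_frames)
    energies = [sum(abs(s) for s in w) / len(w) for w in windows]
    peak = max(energies) or 1.0  # avoid dividing by zero on silent audio
    return [Frame(identity=identity, mouth_open=e / peak) for e in energies]


# Usage: a 4-pixel "image" and a 6-sample "audio clip" rendered as 3 frames.
identity = encode_reference([0.1, 0.5, 0.9, 0.3])
frames = generate_frames(identity, [0.0, 0.0, 0.2, -0.2, 0.8, -0.8], n_frames=3)
for frame in frames:
    print(frame)
```

Every frame reuses the same identity features while only the audio-driven value changes, which mirrors the idea that the face stays recognisable while the expression follows the sound. A real system would replace the loudness rule with a learned generative model.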