CN118648026A - Method, apparatus and computer program - Google Patents

Method, apparatus and computer program

Info

Publication number
CN118648026A
Authority
CN
China
Prior art keywords
user
machine learning
learning model
avatar
audio data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202380019541.6A
Other languages
English (en)
Chinese (zh)
Inventor
P·J·卡梅伦
C·P·B·莫里森
M·P·格雷森
D·马斯赛蒂
M·A·约翰逊
E·S·L·林特尔
R·法亚·马奎斯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Technology Licensing LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-01-31
Filing date
2023-01-06
Publication date
2024-09-13
Application filed by Microsoft Technology Licensing LLC
Publication of CN118648026A

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 Three-dimensional [3D] animation
    • G06T13/205 Three-dimensional [3D] animation driven by audio data
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 Three-dimensional [3D] animation
    • G06T13/40 Three-dimensional [3D] animation of characters, e.g. humans, animals or virtual beings
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L21/10 Transforming into visible information
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L21/10 Transforming into visible information
    • G10L2021/105 Synthesis of the lips movements from speech, e.g. for talking heads

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Processing Or Creating Images (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
CN202380019541.6A 2022-01-31 2023-01-06 Method, apparatus and computer program Pending CN118648026A (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
EP22154373.9A EP4220565A1 (en) 2022-01-31 2022-01-31 Method, apparatus and computer program
EP22154373.9 2022-01-31
PCT/US2023/010261 WO2023146741A1 (en) 2022-01-31 2023-01-06 Method, apparatus and computer program

Publications (1)

Publication Number Publication Date
CN118648026A (zh) 2024-09-13

Family

ID=80119033

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202380019541.6A Pending CN118648026A (zh) 2022-01-31 2023-01-06 Method, apparatus and computer program

Country Status (6)

Country Link
US (1) US20250069308A1
EP (2) EP4220565A1
JP (1) JP2025505340A
KR (1) KR20240142448A
CN (1) CN118648026A
WO (1) WO2023146741A1

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12518482B2 (en) * 2023-08-10 2026-01-06 Qualcomm Incorporated Virtual representative conditioning system
US12367425B1 (en) 2024-01-12 2025-07-22 THIA ST Co. Copilot customization with data producer(s)
US12242503B1 (en) 2024-01-12 2025-03-04 THIA ST Co. Copilot architecture: network of microservices including specialized machine learning tools
US12536045B2 (en) 2024-01-12 2026-01-27 THIA ST Co. Distribution of tasks among microservices in a copilot
US12367426B1 (en) * 2024-01-12 2025-07-22 THIA ST Co. Customization of machine learning tools with occupation training
US20250267239A1 (en) * 2024-02-15 2025-08-21 Microsoft Technology Licensing, Llc Generative communication session event effects

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160134840A1 (en) * 2014-07-28 2016-05-12 Alexa Margaret McCulloch Avatar-Mediated Telepresence Systems with Enhanced Filtering
US10559111B2 (en) * 2016-06-23 2020-02-11 LoomAi, Inc. Systems and methods for generating computer ready animation models of a human head from captured data images
EP3797404A4 (en) * 2018-05-22 2022-02-16 Magic Leap, Inc. SKELETAL SYSTEMS FOR ANIMATION OF VIRTUAL AVATARS
US10755463B1 (en) * 2018-07-20 2020-08-25 Facebook Technologies, Llc Audio-based face tracking and lip syncing for natural facial animation and lip movement
US11568645B2 (en) * 2019-03-21 2023-01-31 Samsung Electronics Co., Ltd. Electronic device and controlling method thereof
US10949715B1 (en) * 2019-08-19 2021-03-16 Neon Evolution Inc. Methods and systems for image and voice processing
EP4081985A1 (en) * 2020-01-29 2022-11-02 Google LLC Photorealistic talking faces from audio
US11127225B1 (en) 2020-06-01 2021-09-21 Microsoft Technology Licensing, Llc Fitting 3D models of composite objects
WO2022103877A1 (en) * 2020-11-13 2022-05-19 Innopeak Technology, Inc. Realistic audio driven 3d avatar generation
US11734888B2 (en) * 2021-04-23 2023-08-22 Meta Platforms Technologies, Llc Real-time 3D facial animation from binocular video

Also Published As

Publication number Publication date
US20250069308A1 (en) 2025-02-27
KR20240142448A (ko) 2024-09-30
EP4220565A1 (en) 2023-08-02
EP4473490A1 (en) 2024-12-11
WO2023146741A1 (en) 2023-08-03
JP2025505340A (ja) 2025-02-26

Similar Documents

Publication Publication Date Title
US12277640B2 (en) Photorealistic real-time portrait animation
US20250069308A1 (en) Method, apparatus and computer program
CN113287118B (zh) Systems and methods for face reenactment
US11410364B2 (en) Systems and methods for realistic head turns and face animation synthesis on mobile device
US11114086B2 (en) Text and audio-based real-time face reenactment
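KR102863164B1 (ko) Methods and systems for realistic head turns and face animation synthesis on mobile device
CN113228163B (zh) Text and audio-based real-time face reenactment
CN118842975A (zh) Digital human video generation method, apparatus, device and medium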
WO2008087621A1 (en) An apparatus and method for animating emotionally driven virtual objects
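KR102809162B1 (ko) System and method for generating emotional talking-face video using identity image, text and audio
CN120976376A (zh) Character animation generation method, apparatus, device, storage medium and program product
CN120956978A (zh) Portrait video generation method, apparatus and electronic device
CN119359872A (zh) Virtual human expression synthesis method, apparatus, electronic device and storage medium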

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination