CN118648026A - Method, apparatus and computer program - Google Patents
Method, apparatus and computer program
- Publication number
- CN118648026A
- Authority
- CN
- China
- Prior art keywords
- user
- machine learning
- learning model
- avatar
- audio data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—Three-dimensional [3D] animation
- G06T13/205—Three-dimensional [3D] animation driven by audio data
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—Three-dimensional [3D] animation
- G06T13/40—Three-dimensional [3D] animation of characters, e.g. humans, animals or virtual beings
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transforming into visible information
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transforming into visible information
- G10L2021/105—Synthesis of the lips movements from speech, e.g. for talking heads
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Quality & Reliability (AREA)
- Processing Or Creating Images (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP22154373.9A EP4220565A1 (en) | 2022-01-31 | 2022-01-31 | Method, apparatus and computer program |
| EP22154373.9 | 2022-01-31 | | |
| PCT/US2023/010261 WO2023146741A1 (en) | 2022-01-31 | 2023-01-06 | Method, apparatus and computer program |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN118648026A (zh) | 2024-09-13 |
Family
ID=80119033
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202380019541.6A (Pending), published as CN118648026A (zh) | 2022-01-31 | 2023-01-06 | Method, apparatus and computer program |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US20250069308A1 |
| EP (2) | EP4220565A1 |
| JP (1) | JP2025505340A |
| KR (1) | KR20240142448A |
| CN (1) | CN118648026A |
| WO (1) | WO2023146741A1 |
Families Citing this family (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12518482B2 (en) * | 2023-08-10 | 2026-01-06 | Qualcomm Incorporated | Virtual representative conditioning system |
| US12367425B1 (en) | 2024-01-12 | 2025-07-22 | THIA ST Co. | Copilot customization with data producer(s) |
| US12242503B1 (en) | 2024-01-12 | 2025-03-04 | THIA ST Co. | Copilot architecture: network of microservices including specialized machine learning tools |
| US12536045B2 (en) | 2024-01-12 | 2026-01-27 | THIA ST Co. | Distribution of tasks among microservices in a copilot |
| US12367426B1 (en) * | 2024-01-12 | 2025-07-22 | THIA ST Co. | Customization of machine learning tools with occupation training |
| US20250267239A1 (en) * | 2024-02-15 | 2025-08-21 | Microsoft Technology Licensing, Llc | Generative communication session event effects |
Family Cites Families (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20160134840A1 (en) * | 2014-07-28 | 2016-05-12 | Alexa Margaret McCulloch | Avatar-Mediated Telepresence Systems with Enhanced Filtering |
| US10559111B2 (en) * | 2016-06-23 | 2020-02-11 | LoomAi, Inc. | Systems and methods for generating computer ready animation models of a human head from captured data images |
| EP3797404A4 (en) * | 2018-05-22 | 2022-02-16 | Magic Leap, Inc. | SKELETAL SYSTEMS FOR ANIMATION OF VIRTUAL AVATARS |
| US10755463B1 (en) * | 2018-07-20 | 2020-08-25 | Facebook Technologies, Llc | Audio-based face tracking and lip syncing for natural facial animation and lip movement |
| US11568645B2 (en) * | 2019-03-21 | 2023-01-31 | Samsung Electronics Co., Ltd. | Electronic device and controlling method thereof |
| US10949715B1 (en) * | 2019-08-19 | 2021-03-16 | Neon Evolution Inc. | Methods and systems for image and voice processing |
| EP4081985A1 (en) * | 2020-01-29 | 2022-11-02 | Google LLC | Photorealistic talking faces from audio |
| US11127225B1 (en) | 2020-06-01 | 2021-09-21 | Microsoft Technology Licensing, Llc | Fitting 3D models of composite objects |
| WO2022103877A1 (en) * | 2020-11-13 | 2022-05-19 | Innopeak Technology, Inc. | Realistic audio driven 3d avatar generation |
| US11734888B2 (en) * | 2021-04-23 | 2023-08-22 | Meta Platforms Technologies, Llc | Real-time 3D facial animation from binocular video |
2022
- 2022-01-31: EP application EP22154373.9A, publication EP4220565A1 (en), not active (Withdrawn)
2023
- 2023-01-06: KR application KR1020247025950A, publication KR20240142448A (ko), active (Pending)
- 2023-01-06: EP application EP23704845.9A, publication EP4473490A1 (en), active (Pending)
- 2023-01-06: CN application CN202380019541.6A, publication CN118648026A (zh), active (Pending)
- 2023-01-06: JP application JP2024536188A, publication JP2025505340A (ja), active (Pending)
- 2023-01-06: US application US18/726,789, publication US20250069308A1 (en), active (Pending)
- 2023-01-06: WO application PCT/US2023/010261, publication WO2023146741A1 (en), not active (Ceased)
Also Published As
| Publication number | Publication date |
|---|---|
| US20250069308A1 (en) | 2025-02-27 |
| KR20240142448A (ko) | 2024-09-30 |
| EP4220565A1 (en) | 2023-08-02 |
| EP4473490A1 (en) | 2024-12-11 |
| WO2023146741A1 (en) | 2023-08-03 |
| JP2025505340A (ja) | 2025-02-26 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12277640B2 (en) | | Photorealistic real-time portrait animation |
| US20250069308A1 (en) | | Method, apparatus and computer program |
| CN113287118B (zh) | | Systems and methods for face reenactment |
| US11410364B2 (en) | | Systems and methods for realistic head turns and face animation synthesis on mobile device |
| US11114086B2 (en) | | Text and audio-based real-time face reenactment |
| KR102863164B1 (ko) | | Methods and systems for realistic head turns and face animation synthesis on a mobile device |
| CN113228163B (zh) | | Text and audio-based real-time face reenactment |
| CN118842975A (zh) | | Digital human video generation method, apparatus, device and medium |
| WO2008087621A1 (en) | | An apparatus and method for animating emotionally driven virtual objects |
| KR102809162B1 (ko) | | System and method for generating emotional face talking video using identity image, text and audio |
| CN120976376A (zh) | | Character animation generation method, apparatus, device, storage medium and program product |
| CN120956978A (zh) | | Portrait video generation method, apparatus and electronic device |
| CN119359872A (zh) | | Virtual human expression synthesis method, apparatus, electronic device and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |