WO2008087621A1 - Apparatus and method for animating emotion-responsive virtual objects - Google Patents
Apparatus and method for animating emotion-responsive virtual objects
- Publication number
- WO2008087621A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- scene
- animation
- animating
- module
- optionally
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/205—3D [Three Dimensional] animation driven by audio data
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L12/00—Data switching networks
- H04L12/02—Details
- H04L12/16—Arrangements for providing special services to substations
- H04L12/18—Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
- H04L12/1813—Arrangements for providing special services to substations for broadcast or conference, e.g. multicast for computer conferences, e.g. chat rooms
- H04L12/1827—Network arrangements for conference optimisation or adaptation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72403—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
- H04M1/72427—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality for supporting games or graphical animations
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transforming into visible information
- G10L2021/105—Synthesis of the lips movements from speech, e.g. for talking heads
Definitions
- the scene-animating module comprises a directing sub-module configured for automatically adjusting a point of view (POV) of the animated scene according to the at least one identified emotional expression, tracking the movement of the virtual object in the scene.
- the at least one virtual object comprises a plurality of body parts, each body part being separately animated according to the at least one identified expression.
- the method further comprises receiving a plurality of communication streams from a plurality of participants and identifying the respective at least one emotional expression from each communication stream, wherein the managing comprises managing a scene having a plurality of virtual objects and the animating comprises animating each virtual object according to the respective at least one emotional expression.
- Fig. 1 is a schematic illustration of a mobile communication terminal, such as a cellular phone, for animating a scene having one or more virtual objects, such as avatars, during a call, according to a preferred embodiment of the present invention
- Fig. 6 is a schematic illustration of a list of possible animations of the mouth of the virtual object, according to one embodiment of the present invention.
- the mobile communication terminal 1 comprises an emotion recognition module 2, a scene-animating module 3, and an output module 6 that is optionally connected to an integrated display of the mobile communication terminal 1.
- the emotion recognition module 2 receives the communication stream, optionally during an audio call, a video call, and/or an audio-video call, and identifies and/or estimates one or more emotional expressions therein.
- the emotion recognition module 2 generates a vector, optionally weighted, that represents the identified emotional expressions and, optionally, the mouth movements of the communication session participant 7. This process is performed immediately or substantially immediately after the communication stream is received.
- the emotion recognition module 2 forwards the vector to the scene-animating module 3.
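The weighted vector described above can be illustrated with a minimal sketch. The emotion labels, the normalization scheme, and the `build_emotion_vector` helper are all hypothetical; the patent does not specify a concrete encoding:

```python
# Hypothetical emotion labels; the patent does not enumerate a concrete set.
EMOTIONS = ["joy", "anger", "sadness", "surprise", "neutral"]

def build_emotion_vector(scores):
    """Normalize raw per-emotion scores into a weighted vector whose
    components sum to 1, so downstream animation code can blend
    expressions proportionally."""
    total = sum(scores.get(e, 0.0) for e in EMOTIONS)
    if total == 0:
        # No emotion detected: fall back to a purely neutral vector.
        return {e: (1.0 if e == "neutral" else 0.0) for e in EMOTIONS}
    return {e: scores.get(e, 0.0) / total for e in EMOTIONS}

vec = build_emotion_vector({"joy": 0.6, "surprise": 0.2})
```

A vector in this shape is what the recognition module would forward to the scene-animating module for conversion into animation instructions.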
- the movements of the virtual objects in the scene are sequentially animated.
- the scene-animating module 3 uses the decision manager 9 that converts the identified emotional expressions in the weighted vector to a set of instructions that defines how to animate the virtual object in a manner that does not discontinue the animation that is described in the last status.
- the decision manager 9 defines a sound state.
- Fig. 2 and Fig. 4 are schematic illustrations of a state machine defining the sound modes of the decision manager 9, according to one embodiment of the present invention.
- the decision manager 9 is designed to convert a set of identified emotional expressions to a set of animation instructions.
- the decision manager 9 is designed to animate the virtual object in a manner that simulates a character that dances to the sound of music.
- the decision manager 9 is designed to intercept at least a portion of an audio stream from the aforementioned communication stream, an audio stream that is currently played by a music player of the hosting mobile communication device 1, or an audio stream that is locally added to the animation.
- the animation that applies to each one of the body parts is determined according to the sound mode of the decision manager 9.
- the sound mode is determined automatically according to the state machine that is depicted in Fig. 4. For example, if no voice or no music is applied, the decision manager 9 is in an idle mode 300. However, if voice is intercepted, the decision manager 9 switches to voice mode 301. If music is identified or added, the decision manager 9 switches to music and voice mode 302. If the interception of voice stops, the mode switches to music mode 303.
- records of the animation lists of one or more of the body parts are divided according to the sound mode of the decision manager 9.
- different instructions for the same identified emotional expression may be chosen when the decision manager is in a different sound mode.
- the scene-animating module 3 may generate animation instructions that animate the virtual object as it dances to the sound of music and/or as it dubs the words of a simultaneously played song.
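The sound-mode transitions described above can be sketched as a small table-driven state machine. State names mirror modes 300-303; the music-mode-to-music-and-voice transition is an assumed symmetric addition not stated explicitly in the text:

```python
# Transition table for the decision manager's sound modes (Fig. 4).
TRANSITIONS = {
    ("idle", "voice_detected"): "voice",
    ("voice", "music_detected"): "music_and_voice",
    ("music_and_voice", "voice_stopped"): "music",
    ("music", "voice_detected"): "music_and_voice",  # assumed symmetric transition
}

def next_mode(mode, event):
    """Return the next sound mode; stay in the current mode on events
    with no defined transition."""
    return TRANSITIONS.get((mode, event), mode)

mode = "idle"
for event in ["voice_detected", "music_detected", "voice_stopped"]:
    mode = next_mode(mode, event)
```

After intercepting voice, then music, then the end of voice, the manager ends in music mode, matching the sequence 300 → 301 → 302 → 303 in the text.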
- Delay belts: optionally, one or more body parts may be animated at variable rates.
- the mesh is divided into a number of animation rate areas, each defining a rate at which the body parts in that area should be animated.
- the mesh is divided into belts, some of which define areas in which the movement of the body part is animated with a delay.
- body parts that cannot be animated at the rate defined in their current area are filtered out.
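The rate-area filtering described in the bullets above can be sketched as follows. The belt boundaries, the per-part rate caps, and the `filter_by_rate` helper are illustrative assumptions, not values from the patent:

```python
# Hypothetical rate belts: each body part sits in a mesh area whose
# belt caps the rate (frames per second) at which it may be animated.
RATE_BELTS = {"head": 30, "torso": 15, "legs": 10}

def filter_by_rate(requests):
    """Keep only (part, rate) animation requests whose requested rate
    does not exceed the cap of the part's rate area; others are
    filtered out, as described above."""
    return [(part, rate) for part, rate in requests
            if rate <= RATE_BELTS.get(part, 0)]

kept = filter_by_rate([("head", 24), ("torso", 20), ("legs", 8)])
# torso is dropped: its requested 20 fps exceeds the belt's 15 fps cap
```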
- Current POV: as described above, the animated scene can be portrayed as if it were captured from different POVs.
- the decision manager 9 dynamically defines the boundaries of the possible POV according to the last state of the body part and the estimated state of the body part and filters out any animation that exceeds these boundaries.
- the decision manager 9 randomly selects one of the unfiltered potential body part animations.
- each one of the animations in the animation list has a predefined weight.
- the random selection is based on the weights of the potential body part animations.
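The weighted random selection over the unfiltered candidate animations can be sketched with the standard library. The animation names and weights are illustrative placeholders:

```python
import random

# Candidate body-part animations with predefined weights, as described
# above; higher weight biases the draw toward that animation.
candidates = [("nod", 5), ("tilt", 3), ("shake", 2)]

def pick_animation(animations, rng=random):
    """Draw one animation at random, proportionally to its weight."""
    names = [name for name, _ in animations]
    weights = [w for _, w in animations]
    # random.choices performs a single weighted draw when k=1.
    return rng.choices(names, weights=weights, k=1)[0]

choice = pick_animation(candidates)
```

With these weights, "nod" would be chosen about half the time, which matches the bullet stating that selection is random but weight-biased.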
- the body parts of the virtual objects are divided into area clusters.
- the orientation of the body parts of the virtual objects is correlated with the POV.
- the head of the virtual object is directed toward the virtual camera that takes the animated scene.
- the animated scene portrays virtual objects which are focused on the virtual camera.
- the orientation of the body parts is added to the animation cluster as a set of instructions to the graphic engine.
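Correlating a body part's orientation with the POV, as in the head-toward-camera bullets above, can be sketched as a yaw computation. The 2D simplification and the coordinate convention are assumptions for illustration:

```python
import math

def yaw_toward_camera(head_xy, camera_xy):
    """Yaw angle (radians) that turns the head to face the virtual
    camera, computed from the horizontal offset between them."""
    dx = camera_xy[0] - head_xy[0]
    dy = camera_xy[1] - head_xy[1]
    return math.atan2(dy, dx)

# Camera diagonally up-right of the head: yaw of pi/4 radians.
angle = yaw_toward_camera((0.0, 0.0), (1.0, 1.0))
```

An angle like this is the kind of orientation instruction that could be added to the animation cluster for the graphic engine.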
- the aforementioned graphic engine uses a collision prevention mechanism that verifies that the animation of the visemes and the animation of the mouth do not contradict each other.
- the viseme tags are selected according to the animation cluster in order to present an animation of all or most of the animated body parts and/or to emphasize a certain body part that is animated according to an emotional expression captured with relatively high intensity.
- the mobile communication terminal 1 further includes a user interface (UI).
- the UI allows the user to control the presentation of the animated scene.
- the UI may be used to allow the user to adjust the POV, the illumination, the background, the music, and/or the animation rate of the animated scene.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- General Engineering & Computer Science (AREA)
- Processing Or Creating Images (AREA)
- Telephone Function (AREA)
Abstract
The present invention relates to a portable communication terminal that allows a user to participate in a communication session with one or more participants. The terminal comprises a recognition module that receives a communication stream from one or more participants and identifies emotional expressions in the communication stream. The terminal further comprises a scene-animating module for managing a scene having one or more virtual objects and for applying an animation to one or more of the virtual objects according to the identified emotional expressions. The mobile communication terminal further comprises a module for outputting the animated virtual scene.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US88043107P | 2007-01-16 | 2007-01-16 | |
US60/880,431 | 2007-01-16 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2008087621A1 (fr) | 2008-07-24 |
Family
ID=39165813
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IL2007/001289 WO2008087621A1 (fr) | 2007-01-16 | 2007-10-25 | Apparatus and method for animating emotion-responsive virtual objects |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2008087621A1 (fr) |
- 2007-10-25: PCT/IL2007/001289 filed as WO2008087621A1 (fr), active Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1326445A2 (fr) * | 2001-12-20 | 2003-07-09 | Matsushita Electric Industrial Co., Ltd. | Vidéophone Virtuel |
US20050172001A1 (en) * | 2004-01-30 | 2005-08-04 | Microsoft Corporation | Mobile shared group interaction |
WO2005099262A1 (fr) * | 2004-04-07 | 2005-10-20 | Matsushita Electric Industrial Co., Ltd. | Terminal de communication et méthode de communication |
US20070139512A1 (en) * | 2004-04-07 | 2007-06-21 | Matsushita Electric Industrial Co., Ltd. | Communication terminal and communication method |
GB2423905A (en) * | 2005-03-03 | 2006-09-06 | Sean Smith | Animated messaging |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120146906A1 (en) * | 2009-08-11 | 2012-06-14 | Future Robot Co., Ltd. | Promotable intelligent display device and promoting method thereof |
EP2482532A1 (fr) * | 2011-01-26 | 2012-08-01 | Alcatel Lucent | Enrichissement d'une communication |
US9525845B2 (en) | 2012-09-27 | 2016-12-20 | Dobly Laboratories Licensing Corporation | Near-end indication that the end of speech is received by the far end in an audio or video conference |
EP2838225A1 (fr) * | 2013-08-14 | 2015-02-18 | Samsung Electronics Co., Ltd | Procédé d'exécution de fonction de conversation à base de message de dispositif électronique supportant celui-ci |
US11282253B2 (en) | 2019-09-30 | 2022-03-22 | Snap Inc. | Matching audio to a state-space model for pseudorandom animation |
US11176723B2 (en) | 2019-09-30 | 2021-11-16 | Snap Inc. | Automated dance animation |
US11222455B2 (en) | 2019-09-30 | 2022-01-11 | Snap Inc. | Management of pseudorandom animation system |
WO2021067988A1 (fr) * | 2019-09-30 | 2021-04-08 | Snap Inc. | Animation de danse automatisée |
US11348297B2 (en) | 2019-09-30 | 2022-05-31 | Snap Inc. | State-space system for pseudorandom animation |
US11670027B2 (en) | 2019-09-30 | 2023-06-06 | Snap Inc. | Automated dance animation |
US11790585B2 (en) | 2019-09-30 | 2023-10-17 | Snap Inc. | State-space system for pseudorandom animation |
US11810236B2 (en) | 2019-09-30 | 2023-11-07 | Snap Inc. | Management of pseudorandom animation system |
US11816773B2 (en) | 2020-09-30 | 2023-11-14 | Snap Inc. | Music reactive animation of human characters |
CN115278041A (zh) * | 2021-04-29 | 2022-11-01 | 北京字跳网络技术有限公司 | 图像处理方法、装置、电子设备以及可读存储介质 |
CN115278041B (zh) * | 2021-04-29 | 2024-02-27 | 北京字跳网络技术有限公司 | 图像处理方法、装置、电子设备以及可读存储介质 |
CN114020144A (zh) * | 2021-09-29 | 2022-02-08 | 中孚安全技术有限公司 | 一种用于保密管理培训的剧情化教学系统及方法 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2008087621A1 (fr) | Apparatus and method for animating emotion-responsive virtual objects | |
US8830244B2 | Information processing device capable of displaying a character representing a user, and information processing method thereof | |
CN104170318B (zh) | Communication using interactive avatars | |
US9667574B2 | Animated delivery of electronic messages | |
US6909453B2 | Virtual television phone apparatus | |
US20160134840A1 | Avatar-Mediated Telepresence Systems with Enhanced Filtering | |
KR100826443B1 (ko) | Image processing method and system | |
US20060079325A1 | Avatar database for mobile video communications | |
US11005796B2 | Animated delivery of electronic messages | |
JP2009533786A (ja) | Do-it-yourself photorealistic talking head creation system and method | |
US20030163315A1 | Method and system for generating caricaturized talking heads | |
CN112652041B (zh) | Virtual avatar generation method and apparatus, storage medium, and electronic device | |
WO2022252866A1 (fr) | Interaction processing method and apparatus, terminal, and medium | |
WO2022079933A1 (fr) | Communication support program, communication support method, communication support system, terminal device, and non-verbal expression program | |
CN110794964A (zh) | Interaction method and apparatus for a virtual robot, electronic device, and storage medium | |
KR20180132364A (ko) | Character-based video display method and device | |
CN115396390A (zh) | Video-chat-based interaction method, system and apparatus, and electronic device | |
GB2510438A | Interacting with audio and animation data delivered to a mobile device | |
Ballin et al. | Personal virtual humans—inhabiting the TalkZone and beyond | |
JP2023003008A (ja) | Terminal device, virtual space providing system, and virtual space display method | |
KR101439212B1 (ko) | Terminal device and talking head display method using the same | |
TWI583198B (zh) | Communication using interactive avatars | |
TW201924321A (zh) | Communication techniques using interactive avatars (5) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 07827263 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 07827263 Country of ref document: EP Kind code of ref document: A1 |