NO317598B1 - Method and apparatus for producing visual speech synthesis (Fremgangsmåte og apparat for frembringelse av visuell talesyntese) - Google Patents
Method and apparatus for producing visual speech synthesis
- Publication number
- NO317598B1 (application number NO19995673A, also NO995673A)
- Authority
- NO
- Norway
- Prior art keywords
- acoustic
- mouth
- speaker
- speech
- constituent element
- Prior art date
Links
Concept ID | Concept | Category | Score | Sections | Count |
---|---|---|---|---|---|
230000000007 | visual effect | Effects | 0.000 | title, claims, description | 42 |
238000000034 | method | Methods | 0.000 | title, claims, description | 36 |
238000003786 | synthesis reaction | Methods | 0.000 | title, description | 4 |
230000015572 | biosynthetic process | Effects | 0.000 | title, description | 3 |
230000001815 | facial effect | Effects | 0.000 | claims, description | 63 |
239000000470 | constituent | Substances | 0.000 | claims, description | 40 |
230000009466 | transformation | Effects | 0.000 | claims, description | 17 |
238000004458 | analytical method | Methods | 0.000 | claims, description | 13 |
238000005259 | measurement | Methods | 0.000 | claims, description | 11 |
230000001360 | synchronised effect | Effects | 0.000 | claims, description | 10 |
230000001771 | impaired effect | Effects | 0.000 | claims, description | 6 |
230000006870 | function | Effects | 0.000 | claims, description | 5 |
238000012886 | linear function | Methods | 0.000 | claims, description | 3 |
238000012417 | linear regression | Methods | 0.000 | claims, description | 3 |
230000004044 | response | Effects | 0.000 | claims, description | 3 |
238000000844 | transformation | Methods | 0.000 | claims, description | 3 |
238000011084 | recovery | Methods | 0.000 | claims | 1 |
230000001755 | vocal effect | Effects | 0.000 | description | 16 |
208000016354 | hearing loss disease | Diseases | 0.000 | description | 4 |
238000001228 | spectrum | Methods | 0.000 | description | 4 |
210000001260 | vocal cord | Anatomy | 0.000 | description | 4 |
230000007246 | mechanism | Effects | 0.000 | description | 3 |
208000032041 | Hearing impaired | Diseases | 0.000 | description | 2 |
238000004891 | communication | Methods | 0.000 | description | 2 |
230000000694 | effects | Effects | 0.000 | description | 2 |
210000004704 | glottis | Anatomy | 0.000 | description | 2 |
210000001847 | jaw | Anatomy | 0.000 | description | 2 |
230000008569 | process | Effects | 0.000 | description | 2 |
238000012546 | transfer | Methods | 0.000 | description | 2 |
230000001052 | transient effect | Effects | 0.000 | description | 2 |
230000007704 | transition | Effects | 0.000 | description | 2 |
206010009269 | Cleft palate | Diseases | 0.000 | description | 1 |
208000016621 | Hearing disease | Diseases | 0.000 | description | 1 |
230000004888 | barrier function | Effects | 0.000 | description | 1 |
230000008901 | benefit | Effects | 0.000 | description | 1 |
230000002902 | bimodal effect | Effects | 0.000 | description | 1 |
238000004364 | calculation method | Methods | 0.000 | description | 1 |
230000001427 | coherent effect | Effects | 0.000 | description | 1 |
230000008602 | contraction | Effects | 0.000 | description | 1 |
230000002596 | correlated effect | Effects | 0.000 | description | 1 |
230000003247 | decreasing effect | Effects | 0.000 | description | 1 |
238000006073 | displacement reaction | Methods | 0.000 | description | 1 |
238000009826 | distribution | Methods | 0.000 | description | 1 |
238000002474 | experimental method | Methods | 0.000 | description | 1 |
230000008921 | facial expression | Effects | 0.000 | description | 1 |
238000001914 | filtration | Methods | 0.000 | description | 1 |
230000010354 | integration | Effects | 0.000 | description | 1 |
238000005304 | joining | Methods | 0.000 | description | 1 |
210000000867 | larynx | Anatomy | 0.000 | description | 1 |
210000000088 | lip | Anatomy | 0.000 | description | 1 |
210000004072 | lung | Anatomy | 0.000 | description | 1 |
238000004519 | manufacturing process | Methods | 0.000 | description | 1 |
230000004048 | modification | Effects | 0.000 | description | 1 |
238000012986 | modification | Methods | 0.000 | description | 1 |
210000000056 | organ | Anatomy | 0.000 | description | 1 |
210000003254 | palate | Anatomy | 0.000 | description | 1 |
230000007170 | pathology | Effects | 0.000 | description | 1 |
230000008447 | perception | Effects | 0.000 | description | 1 |
230000000737 | periodic effect | Effects | 0.000 | description | 1 |
238000012545 | processing | Methods | 0.000 | description | 1 |
230000003595 | spectral effect | Effects | 0.000 | description | 1 |
230000002123 | temporal effect | Effects | 0.000 | description | 1 |
210000002105 | tongue | Anatomy | 0.000 | description | 1 |
210000003437 | trachea | Anatomy | 0.000 | description | 1 |
238000012549 | training | Methods | 0.000 | description | 1 |
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/24—Speech recognition using non-acoustical features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transforming into visible information
- G10L2021/105—Synthesis of the lips movements from speech, e.g. for talking heads
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2201/00—Electronic components, circuits, software, systems or apparatus used in telephone systems
- H04M2201/40—Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/56—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
- H04M3/567—Multimedia conference systems
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Theoretical Computer Science (AREA)
- Processing Or Creating Images (AREA)
- Photoreceptors In Electrophotography (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SE9701977A SE511927C2 (sv) | 1997-05-27 | 1997-05-27 | Improvements in, or relating to, visual speech synthesis |
PCT/SE1998/000710 WO1998054696A1 (en) | 1997-05-27 | 1998-04-20 | Improvements in, or relating to, visual speech synthesis |
Publications (3)
Publication Number | Publication Date |
---|---|
NO995673D0 (no) | 1999-11-19 |
NO995673L (no) | 2000-01-25 |
NO317598B1 (no) | 2004-11-22 |
Family
ID=20407101
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
NO19995673A NO317598B1 (no) | 1997-05-27 | 1999-11-19 | Method and apparatus for producing visual speech synthesis |
Country Status (7)
Country | Link |
---|---|
EP (1) | EP0983575B1 (de) |
DE (1) | DE69816078T2 (de) |
DK (1) | DK0983575T3 (da) |
EE (1) | EE03634B1 (et) |
NO (1) | NO317598B1 (no) |
SE (1) | SE511927C2 (sv) |
WO (1) | WO1998054696A1 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101268507A (zh) | 2005-07-11 | 2008-09-17 | Koninklijke Philips Electronics N.V. | Method for communication and communication device |
CN101304655B (zh) | 2005-11-10 | 2014-12-10 | BASF SE | Fungicidal mixtures |
US9956407B2 (en) | 2014-08-04 | 2018-05-01 | Cochlear Limited | Tonal deafness compensation in an auditory prosthesis system |
US10534955B2 (en) * | 2016-01-22 | 2020-01-14 | Dreamworks Animation L.L.C. | Facial capture analysis and training system |
CN106067989B (zh) * | 2016-04-28 | 2022-05-17 | Jiangsu University | Device and method for synchronous calibration of portrait voice and video |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5621858A (en) * | 1992-05-26 | 1997-04-15 | Ricoh Corporation | Neural network acoustic and visual speech recognition system training method and apparatus |
US5482048A (en) * | 1993-06-30 | 1996-01-09 | University Of Pittsburgh | System and method for measuring and quantitating facial movements |
US5657426A (en) * | 1994-06-10 | 1997-08-12 | Digital Equipment Corporation | Method and apparatus for producing audio-visual synthetic speech |
KR960018988A (ko) * | 1994-11-07 | 1996-06-17 | M. K. Young | Sound-assisted image processing method and apparatus |
SE519244C2 (sv) * | 1995-12-06 | 2003-02-04 | Telia Ab | Device and method for speech synthesis |

1997
- 1997-05-27: SE application SE9701977A, publication SE511927C2 (sv), status unknown
1998
- 1998-04-20: EP application EP98917918A, publication EP0983575B1 (de), not active (Expired - Lifetime)
- 1998-04-20: EE application EEP199900542A, publication EE03634B1 (xx), not active (IP Right Cessation)
- 1998-04-20: DK application DK98917918T, publication DK0983575T3 (da), active
- 1998-04-20: DE application DE69816078T, publication DE69816078T2 (de), not active (Expired - Fee Related)
- 1998-04-20: WO application PCT/SE1998/000710, publication WO1998054696A1 (en), active (IP Right Grant)
1999
- 1999-11-19: NO application NO19995673A, publication NO317598B1 (no), status unknown
Also Published As
Publication number | Publication date |
---|---|
DK0983575T3 (da) | 2003-10-27 |
EP0983575B1 (de) | 2003-07-02 |
EP0983575A1 (de) | 2000-03-08 |
DE69816078D1 (de) | 2003-08-07 |
SE9701977L (sv) | 1998-11-28 |
SE511927C2 (sv) | 1999-12-20 |
EE9900542A (et) | 2000-06-15 |
NO995673L (no) | 2000-01-25 |
DE69816078T2 (de) | 2004-05-13 |
EE03634B1 (et) | 2002-02-15 |
SE9701977D0 (sv) | 1997-05-27 |
NO995673D0 (no) | 1999-11-19 |
WO1998054696A1 (en) | 1998-12-03 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Rosenblum et al. | An audiovisual test of kinematic primitives for visual speech perception. | |
Munhall et al. | The moving face during speech communication | |
Hanson et al. | Towards models of phonation | |
JP3893763B2 (ja) | Voice detection device | |
Cvejic et al. | Prosody off the top of the head: Prosodic contrasts can be discriminated by head motion | |
Livingstone et al. | Head movements encode emotions during speech and song. | |
CN107301863A (zh) | Speech disorder rehabilitation method and rehabilitation training system for deaf-mute children | |
Campbell | The lateralization of lip-read sounds: A first look | |
Vatakis et al. | Assessing the effect of physical differences in the articulation of consonants and vowels on audiovisual temporal perception | |
Freitas et al. | An introduction to silent speech interfaces | |
Smith et al. | Infant-directed visual prosody: Mothers’ head movements and speech acoustics | |
NO317598B1 (no) | Method and apparatus for producing visual speech synthesis | |
JP2007018006A (ja) | Speech synthesis system, speech synthesis method, and speech synthesis program | |
Bicevskis et al. | Effects of mouthing and interlocutor presence on movements of visible vs. non-visible articulators | |
Yip | Phonetic effects on the timing of gestural coordination in Modern Greek consonant clusters | |
Zellou | Similarity and enhancement: Nasality from Moroccan Arabic pharyngeals and nasals | |
Beskow et al. | Visualization of speech and audio for hearing impaired persons | |
Öster | Computer-based speech therapy using visual feedback with focus on children with profound hearing impairments | |
McGarr et al. | Ephphatha: Opening Inroads to Understanding Articulatory Organization in Persons with Hearing Impairment | |
WO2022024355A1 (ja) | Emotion analysis system | |
Lavagetto | Multimedia Telephone for Hearing-Impaired People | |
Wu et al. | Development and evaluation of on/off control for electrolaryngeal speech via artificial neural network based on visual information of lips | |
Dahmani et al. | Some consideration on expressive audiovisual speech corpus acquisition using a multimodal platform | |
Chen et al. | Investigating the relationship between glottal area waveform shape and harmonic magnitudes through computational modeling and laryngeal high-speed videoendoscopy. | |
Rothenberg | Rethinking nasalance and nasal emission |