SE511927C2 - Förbättringar i, eller med avseende på, visuell talsyntes (Improvements in, or relating to, visual speech synthesis) - Google Patents

Förbättringar i, eller med avseende på, visuell talsyntes (Improvements in, or relating to, visual speech synthesis)

Info

Publication number: SE511927C2
Authority: SE (Sweden)
Prior art keywords: acoustic, mouth, speaker, points, units
Priority date: 1997-05-27 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application number: SE9701977A
Filing date: 1997-05-27
Publication date: 1999-12-20
Other languages: English (en), Swedish (sv)
Other versions: SE9701977L (sv), SE9701977D0 (sv)
Inventor: Mats Ljungqvist
Original Assignee: Telia Ab

Application events:
Application filed by Telia Ab
Priority to SE9701977A (SE511927C2)
Publication of SE9701977D0
Priority to PCT/SE1998/000710 (WO1998054696A1)
Priority to DE69816078T (DE69816078T2)
Priority to DK98917918T (DK0983575T3)
Priority to EEP199900542A (EE03634B1)
Priority to EP98917918A (EP0983575B1)
Publication of SE9701977L
Priority to NO19995673A (NO317598B1)
Publication of SE511927C2

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/24 - Speech recognition using non-acoustical features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 - Feature extraction; Face representation
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 - Speech synthesis; Text to speech systems
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 - Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 - Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L21/10 - Transforming into visible information
    • G10L2021/105 - Synthesis of the lips movements from speech, e.g. for talking heads
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04M - TELEPHONIC COMMUNICATION
    • H04M2201/00 - Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/40 - Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04M - TELEPHONIC COMMUNICATION
    • H04M3/00 - Automatic or semi-automatic exchanges
    • H04M3/42 - Systems providing special services or facilities to subscribers
    • H04M3/56 - Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • H04M3/567 - Multimedia conference systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • Processing Or Creating Images (AREA)
  • Photoreceptors In Electrophotography (AREA)

Priority Applications (7)

Application Number Priority Date Filing Date Title
SE9701977A SE511927C2 (sv) 1997-05-27 1997-05-27 Improvements in, or relating to, visual speech synthesis
PCT/SE1998/000710 WO1998054696A1 (en) 1997-05-27 1998-04-20 Improvements in, or relating to, visual speech synthesis
DE69816078T DE69816078T2 (de) 1997-05-27 1998-04-20 Improvements relating to visual speech synthesis
DK98917918T DK0983575T3 (da) 1997-05-27 1998-04-20 Improvements in or relating to visual speech synthesis
EEP199900542A EE03634B1 (et) 1997-05-27 1998-04-20 Improvements in or relating to visual speech synthesis
EP98917918A EP0983575B1 (de) 1997-05-27 1998-04-20 Improvements relating to visual speech synthesis
NO19995673A NO317598B1 (no) 1997-05-27 1999-11-19 Method and apparatus for producing visual speech synthesis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
SE9701977A SE511927C2 (sv) 1997-05-27 1997-05-27 Improvements in, or relating to, visual speech synthesis

Publications (3)

Publication Number Publication Date
SE9701977D0 (sv) 1997-05-27
SE9701977L (sv) 1998-11-28
SE511927C2 (sv) 1999-12-20 (granted publication)

Family

ID=20407101

Family Applications (1)

Application Number Title Priority Date Filing Date
SE9701977A SE511927C2 (sv) 1997-05-27 1997-05-27 Improvements in, or relating to, visual speech synthesis

Country Status (7)

Country Link
EP (1) EP0983575B1 (de)
DE (1) DE69816078T2 (de)
DK (1) DK0983575T3 (de)
EE (1) EE03634B1 (de)
NO (1) NO317598B1 (de)
SE (1) SE511927C2 (de)
WO (1) WO1998054696A1 (de)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007007228A2 (en) 2005-07-11 2007-01-18 Philips Intellectual Property & Standards Gmbh Method for communication and communication device
CA2632742C (en) 2005-11-10 2013-10-15 Basf Se Fungicidal mixtures comprising a ternary combination of triticonazole, pyraclostrobin and metalaxyl-m
US9956407B2 (en) 2014-08-04 2018-05-01 Cochlear Limited Tonal deafness compensation in an auditory prosthesis system
US10534955B2 (en) * 2016-01-22 2020-01-14 Dreamworks Animation L.L.C. Facial capture analysis and training system
CN106067989B (zh) * 2016-04-28 2022-05-17 Jiangsu University Device and method for synchronized calibration of portrait speech and video

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5621858A (en) * 1992-05-26 1997-04-15 Ricoh Corporation Neural network acoustic and visual speech recognition system training method and apparatus
US5482048A (en) * 1993-06-30 1996-01-09 University Of Pittsburgh System and method for measuring and quantitating facial movements
US5657426A (en) * 1994-06-10 1997-08-12 Digital Equipment Corporation Method and apparatus for producing audio-visual synthetic speech
AU3668095A (en) * 1994-11-07 1996-05-16 At & T Corporation Acoustic-assisted image processing
SE519244C2 (sv) * 1995-12-06 2003-02-04 Telia Ab Device and method for speech synthesis

Also Published As

Publication number Publication date
SE9701977L (sv) 1998-11-28
DE69816078T2 (de) 2004-05-13
EP0983575B1 (de) 2003-07-02
EE9900542A (et) 2000-06-15
EP0983575A1 (de) 2000-03-08
WO1998054696A1 (en) 1998-12-03
DE69816078D1 (de) 2003-08-07
NO995673L (no) 2000-01-25
EE03634B1 (et) 2002-02-15
NO317598B1 (no) 2004-11-22
NO995673D0 (no) 1999-11-19
SE9701977D0 (sv) 1997-05-27
DK0983575T3 (da) 2003-10-27

Similar Documents

Publication Publication Date Title
US7676372B1 (en) Prosthetic hearing device that transforms a detected speech into a speech of a speech form assistive in understanding the semantic meaning in the detected speech
Rosenblum et al. An audiovisual test of kinematic primitives for visual speech perception.
Lavagetto Converting speech into lip movements: A multimedia telephone for hard of hearing people
Jiang et al. On the relationship between face movements, tongue movements, and speech acoustics
Tran et al. Improvement to a NAM-captured whisper-to-speech system
JP3670180B2 (ja) Hearing aid
Kim et al. Hearing speech in noise: Seeing a loud talker is better
Barker et al. Evidence of correlation between acoustic and visual features of speech
Salvi et al. SynFace—speech-driven facial animation for virtual speech-reading support
SE511927C2 (sv) Improvements in, or relating to, visual speech synthesis
JP4381404B2 (ja) Speech synthesis system, speech synthesis method, and speech synthesis program
Patel et al. Teachable interfaces for individuals with dysarthric speech and severe physical disabilities
Olives et al. Audio-visual speech synthesis for finnish
Adjoudani et al. A multimedia platform for audio-visual speech processing
Lavagetto Multimedia Telephone for Hearing-Impaired People
Bastanfard et al. A comprehensive audio-visual corpus for teaching sound persian phoneme articulation
Beskow et al. Visualization of speech and audio for hearing impaired persons
Agelfors et al. Synthetic visual speech driven from auditory speech
Beautemps et al. Telma: Telephony for the hearing-impaired people. from models to user tests
Kumar et al. Real time detection and conversion of gestures to text and speech to sign system
KR20150075502A (ko) Pronunciation learning support system and pronunciation learning support method of the system
Goecke A stereo vision lip tracking algorithm and subsequent statistical analyses of the audio-video correlation in Australian English
Hatzis et al. Optical logo-therapy (OLT): a computer-based real time visual feedback application for speech training.
Beskow et al. Analysis and synthesis of multimodal verbal and non-verbal interaction for animated interface agents
Engwall et al. Are real tongue movements easier to speech read than synthesized?