CN1556496A - Lip shape identifying sound generator - Google Patents

Lip shape identifying sound generator

Publication number
CN1556496A
Authority
CN
China
Prior art keywords
lip
unit
phonetic synthesis
acoustical generator
identification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2003101220227A
Other languages
Chinese (zh)
Inventor
李刚
解国明
林凌
任惠茹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2003-12-31
Filing date
2003-12-31
Publication date
2004-12-22
Application filed by Tianjin University
Priority to CNA2003101220227A
Publication of CN1556496A

Landscapes

  • Prostheses (AREA)

Abstract

The invention discloses a lip-shape recognition sound generator. Its components are connected as follows: a miniature camera is connected to an image acquisition unit; the output of the image acquisition unit is connected to a lip-shape image pattern recognition unit; the signal from the recognition unit is output to a speech synthesis unit; the speech synthesis unit is connected to a speech storage unit, from which it extracts speech synthesis elements to synthesize a sound signal; and the sound signal is output to a sound output unit. A loudspeaker then emits the sound corresponding to the lip shape and its sequence of changes. By recognizing the speaker's lip shape, the device determines the speech content, synthesizes speech for that content, and emits it through the loudspeaker in real time. It can help people who cannot produce sound because their larynx or vocal cords have been removed, as well as deaf-mute people who can use lip language, to speak aloud, making it easier for them to communicate with others.

Description

Lip-shape recognition sound generator
Technical field
The present invention relates to a sound generator, and in particular to a lip-shape recognition sound generator.
Background art
Clinically, many patients undergo resection of the larynx or vocal cords because of laryngeal or vocal cord disease. After surgery they cannot produce sound, which hinders their communication with other people. Deaf-mute people generally communicate with others by reading the other person's lips to determine what is being said, but they find it difficult to make others understand what they themselves mean. A sound-producing instrument based on lip image recognition and speech synthesis could help people who cannot produce sound to speak aloud and remove this barrier to communication. At present, however, no instrument or technical solution exists that helps these patients and deaf-mute people produce sound and communicate conveniently with others.
Summary of the invention
The purpose of the present invention is to provide a sound-producing instrument that helps the above patients and deaf-mute people produce sound and communicate conveniently with others. The invention recognizes the speaker's lip shape, determines the content of the speech by pattern recognition, and then produces sound by speech synthesis. Most sounds of a language have a definite lip shape when spoken. The invention maps the speaker's lip shape one-to-one to the "sound" the speaker intends to make, and uses speech synthesis to emit that sound through a loudspeaker.
The present invention is realized by the following technical solution:
1. The speaker's lip image is captured by a camera and an image acquisition unit.
2. The lip image undergoes image processing; lip features are extracted dynamically in real time, and a lip-shape pattern recognition algorithm then determines the content of the speech.
3. According to the pattern recognition result, the speech synthesis unit extracts speech from the speech storage unit, synthesizes the speech content, and emits it through the sound output unit.
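Steps 1 and 2 above can be sketched roughly as follows. The patent does not say which lip features or which pattern recognition algorithm are used, so this is only an illustration under stated assumptions: each frame is a binary lip mask, the features are the width and height of the lip opening, and recognition is nearest-template matching. The template values, the labels and all function names are hypothetical.

```python
from math import hypot

# Hypothetical templates: (width, height) of the lip opening for each sound.
# The numbers are invented for illustration; the patent gives none.
TEMPLATES = {
    "a": (6.0, 5.0),   # wide open mouth
    "i": (8.0, 1.0),   # spread lips, small opening
    "u": (2.0, 2.0),   # rounded lips
}

def lip_features(mask):
    """Return (width, height) of the bounding box of nonzero pixels in a binary mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        return (0.0, 0.0)
    return (float(max(cols) - min(cols) + 1), float(max(rows) - min(rows) + 1))

def recognize(features, templates=TEMPLATES):
    """Nearest-template classification by Euclidean distance (assumed method)."""
    return min(
        templates,
        key=lambda k: hypot(features[0] - templates[k][0],
                            features[1] - templates[k][1]),
    )

def recognize_sequence(frames):
    """Recognize each frame and collapse consecutive repeats into one sound."""
    out = []
    for mask in frames:
        label = recognize(lip_features(mask))
        if not out or out[-1] != label:
            out.append(label)
    return out
```

Collapsing consecutive repeats is one simple way to turn a stream of per-frame labels into the "lip shape and its sequence of changes" that the later stages act on; a real device would need something more robust.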
As shown in Fig. 1, the miniature camera 1 is connected to the image acquisition unit 2; the output of the image acquisition unit 2 is connected to the lip-shape image pattern recognition unit 3; the signal from the lip-shape image pattern recognition unit 3 is output to the speech synthesis unit 4; the speech synthesis unit 4 is connected to the speech storage unit 5, from which it extracts speech synthesis elements to synthesize a sound signal; the sound signal is output to the sound output unit 6; and the loudspeaker 7 then emits the sound corresponding to the lip shape and its sequence of changes.
The lip image processing and pattern recognition unit, the speech synthesis unit and the speech storage unit can be implemented with a processor 8, which may be a digital signal processor (DSP) or another microprocessor (such as an ARM processor).
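One way to picture this grouping is a single processor object that hosts units 3, 4 and 5, while the camera side (units 1 and 2) and the sound side (units 6 and 7) stay outside it. The class, method and type names below are hypothetical, and the recognizer and stored elements are trivial stand-ins rather than the patent's algorithms.

```python
from typing import Callable, Dict, List, Sequence

Frame = Sequence[Sequence[int]]   # one lip image, assumed to be rows of pixel values
Samples = List[float]             # digital audio samples

class Processor:
    """Stand-in for processor 8, which hosts units 3, 4 and 5."""

    def __init__(self, recognizer: Callable[[Frame], str], store: Dict[str, Samples]):
        self.recognizer = recognizer   # unit 3: lip image processing and pattern recognition
        self.store = store             # unit 5: speech storage

    def synthesize(self, labels: List[str]) -> Samples:
        """Unit 4: fetch stored elements and join them into one signal."""
        signal: Samples = []
        for label in labels:
            signal.extend(self.store.get(label, []))
        return signal

    def process(self, frames: List[Frame]) -> Samples:
        """Run recognition on every frame, then synthesize the result."""
        labels = [self.recognizer(f) for f in frames]
        return self.synthesize(labels)

def run_device(frames: List[Frame], processor: Processor,
               sound_output: Callable[[Samples], None]) -> None:
    """Frames arrive from units 1 and 2; sound_output stands for units 6 and 7."""
    sound_output(processor.process(frames))
```

Keeping the camera and sound stages as plain callables mirrors the block diagram: only the three middle units share the processor.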
The miniature camera 1 can be a camera with digital signal output that is integrated with the image acquisition unit, such as a CCD camera or another image sensor.
The sound output unit 6 can be composed of a digital-to-analog converter and an amplifier, or a codec can be used instead.
By recognizing the speaker's lip shape, the invention determines the content of the speech, synthesizes that content as speech, and emits it through the loudspeaker in real time. The invention can help people who cannot produce sound because their larynx or vocal cords have been removed, or deaf-mute people who can use lip language, to speak aloud, making it easier for them to communicate with others.
Description of drawings
Fig. 1 is a block diagram of the system connections of the present invention.
Fig. 2 shows a lip-shape recognition sound generator of the present invention.
Embodiment
The present invention is described in detail below with reference to the accompanying drawings:
The connections are as shown in Fig. 1: the miniature camera 1 is connected to the image acquisition unit 2; the output of the image acquisition unit 2 is connected to the lip-shape image pattern recognition unit 3; the signal from the lip-shape image pattern recognition unit 3 is output to the speech synthesis unit 4; the speech synthesis unit 4 is connected to the speech storage unit 5, from which it extracts speech synthesis elements to synthesize a sound signal; the sound signal is output to the sound output unit 6; and the loudspeaker 7 then emits the sound corresponding to the lip shape and its sequence of changes.
A miniature camera 1 is used to reduce the size of the device. The miniature camera is placed in front of the lips so that it captures only the lip image and no other part of the face; its output is connected to the image acquisition unit. The image acquisition unit 2 uses a video capture processor; its input is connected to the output of the miniature camera, and its output is connected to the image processing and pattern recognition unit 3. The image processing and pattern recognition unit is the core of the instrument. It uses a digital signal processor (DSP) or another microprocessor (such as an ARM processor) and mainly performs preprocessing, feature extraction and pattern recognition on the lip image. The speech synthesis unit 4 synthesizes speech from the result of the lip-shape pattern recognition; it is also implemented on the digital signal processor. The speech storage unit 5 is a database that stores all the basic phonemes and uses mass storage. The sound output unit 6 consists of a digital-to-analog converter and an amplifier: the digital-to-analog converter converts the digital audio signal into an analog audio signal, which is amplified by the amplifier and then drives the loudspeaker 7. The sound output unit can also use a codec. The loudspeaker emits the sound.
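The text says only that unit 5 stores the basic phonemes and that unit 4 builds the output from them. A minimal sketch of that idea, assuming concatenative synthesis with a short linear crossfade at each join, might look like this. The crossfade, the sample rate and the sine-tone "phonemes" are assumptions made for illustration; none of them come from the patent.

```python
import math

SAMPLE_RATE = 8000  # Hz, assumed

def tone(freq, ms):
    """Placeholder 'phoneme': a sine burst standing in for a recorded unit."""
    n = SAMPLE_RATE * ms // 1000
    return [math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]

# Stand-in for the speech storage unit (unit 5): label -> stored waveform.
PHONEME_DB = {"a": tone(220, 50), "i": tone(330, 50), "u": tone(440, 50)}

def crossfade_concat(units, overlap=40):
    """Join waveforms, blending up to `overlap` samples linearly at each boundary."""
    out = []
    for unit in units:
        k = min(overlap, len(out), len(unit))
        start = len(out) - k
        for j in range(k):
            w = (j + 1) / (k + 1)   # fade-in weight for the incoming unit
            out[start + j] = (1 - w) * out[start + j] + w * unit[j]
        out.extend(unit[k:])
    return out

def synthesize(labels, db=PHONEME_DB):
    """Stand-in for the speech synthesis unit (unit 4). Unknown labels are skipped."""
    return crossfade_concat([db[label] for label in labels if label in db])
```

For example, `synthesize(["a", "i"])` joins two 400-sample units with a 40-sample overlap. The crossfade only smooths the joins; it does not model coarticulation, which a practical synthesizer would have to address.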
The miniature camera and the image acquisition unit of this embodiment can use an integrated image sensor.
The processor 8 used for the lip image processing and pattern recognition unit, the speech synthesis unit and the speech storage unit of this embodiment can be a digital signal processor or a digital signal processor system, a microprocessor or a microprocessor system, or an ARM microprocessor or an ARM microprocessor system.
The sound output unit of this embodiment comprises a digital-to-analog converter and an amplifier.
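As a rough numerical model of the signal path through unit 6, the converter can be treated as an ideal mapping from integer codes to voltages and the amplifier as a gain stage that clips at its supply rails. The 16-bit word size, reference voltage, gain and rail values are all assumptions; the patent specifies none of them.

```python
def dac(codes, bits=16, v_ref=1.0):
    """Ideal DAC: map signed integer codes to voltages in [-v_ref, v_ref)."""
    full_scale = 2 ** (bits - 1)
    return [v_ref * c / full_scale for c in codes]

def amplifier(voltages, gain=2.0, rail=3.0):
    """Ideal amplifier: multiply by gain and clip at the supply rails."""
    return [max(-rail, min(rail, v * gain)) for v in voltages]

def sound_output(codes):
    """Unit 6 as described: DAC followed by amplifier, producing the loudspeaker drive signal."""
    return amplifier(dac(codes))
```

With these assumed defaults, a half-scale code of 16384 becomes 0.5 V after conversion and 1.0 V after amplification. A codec, which the text mentions as an alternative, would replace both functions with a single part.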
For ease of use, the device is shaped like a headset. The miniature camera is placed where the microphone of an ordinary headset would be, the loudspeaker is connected by a lead wire, and the other functional circuits of the instrument are housed at the ear position, as shown in Fig. 2.
The user wears the device like a headset, pulls the miniature camera down, aims it at his or her own lips, turns on the switch, and begins to speak. Although the user cannot produce sound, the device can emit the correct sound as long as the lips move as they would in normal speech. Users whose lip shapes are nonstandard when speaking need some training. For a trained user, the instrument can meet the needs of daily communication.

Claims (8)

1. A lip-shape recognition sound generator consisting of six parts: a miniature camera, an image acquisition unit, a lip image processing and pattern recognition unit, a speech synthesis unit, a speech storage unit and a sound output unit; characterized in that the miniature camera (1) is connected to the image acquisition unit (2), the output of the image acquisition unit (2) is connected to the lip-shape image pattern recognition unit (3), the signal from the lip-shape image pattern recognition unit (3) is output to the speech synthesis unit (4), the speech synthesis unit (4) is connected to the speech storage unit (5), the speech synthesis unit (4) extracts speech synthesis elements from the speech storage unit (5) to synthesize a sound signal, the sound signal is output to the sound output unit (6), and the loudspeaker (7) then emits the sound corresponding to the lip shape and its sequence of changes.
2. The lip-shape recognition sound generator according to claim 1, characterized in that the miniature camera and the image acquisition unit use an integrated image sensor.
3. The lip-shape recognition sound generator according to claim 1, characterized in that the lip image processing and pattern recognition unit, the speech synthesis unit and the speech storage unit use a digital signal processor or a digital signal processor system.
4. The lip-shape recognition sound generator according to claim 1, characterized in that the lip image processing and pattern recognition unit, the speech synthesis unit and the speech storage unit use a microprocessor or a microprocessor system.
5. The lip-shape recognition sound generator according to claim 1, characterized in that the lip image processing and pattern recognition unit, the speech synthesis unit and the speech storage unit use an ARM microprocessor or an ARM microprocessor system.
6. The lip-shape recognition sound generator according to claim 1, characterized in that the sound output unit comprises a digital-to-analog converter and an amplifier.
7. The lip-shape recognition sound generator according to claim 1, characterized in that the sound output unit uses a codec.
8. The lip-shape recognition sound generator according to claim 1, characterized in that the miniature camera is arranged in front of the lips.
CNA2003101220227A 2003-12-31 2003-12-31 Lip shape identifying sound generator Pending CN1556496A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA2003101220227A CN1556496A (en) 2003-12-31 2003-12-31 Lip shape identifying sound generator

Publications (1)

Publication Number Publication Date
CN1556496A (en) 2004-12-22

Family

ID=34338600

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2003101220227A Pending CN1556496A (en) 2003-12-31 2003-12-31 Lip shape identifying sound generator

Country Status (1)

Country Link
CN (1) CN1556496A (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007134494A1 (en) * 2006-05-16 2007-11-29 Zhongwei Huang A computer auxiliary method suitable for multi-languages pronunciation learning system for deaf-mute
CN101751692B (en) * 2009-12-24 2012-05-30 四川大学 Method for voice-driven lip animation
CN102117115A (en) * 2009-12-31 2011-07-06 上海量科电子科技有限公司 System for realizing text entry selection by using lip-language and realization method thereof
CN102117115B (en) * 2009-12-31 2016-11-23 上海量科电子科技有限公司 A kind of system utilizing lip reading to carry out word input selection and implementation method
CN102193772B (en) * 2010-03-19 2016-08-10 索尼公司 A kind of message handler and information processing method
CN102193772A (en) * 2010-03-19 2011-09-21 索尼公司 Information processor, information processing method and program
CN102542280B (en) * 2010-12-26 2016-09-28 上海量明科技发展有限公司 The recognition methods of the different lip reading shape of the mouth as one speaks and system for same content
CN102542280A (en) * 2010-12-26 2012-07-04 上海量明科技发展有限公司 Recognition method and system aiming at different lip-language mouth shapes with same content
CN103092329A (en) * 2011-10-31 2013-05-08 南开大学 Lip reading technology based lip language input method
CN105321519A (en) * 2014-07-28 2016-02-10 刘璟锋 Speech recognition system and unit
CN105321519B (en) * 2014-07-28 2019-05-14 刘璟锋 Speech recognition system and unit
CN105632497A (en) * 2016-01-06 2016-06-01 昆山龙腾光电有限公司 Voice output method, voice output system
CN108538282B (en) * 2018-03-15 2021-10-08 上海电力学院 Method for directly generating voice from lip video
CN108538282A (en) * 2018-03-15 2018-09-14 上海电力学院 A method of voice is directly generated by lip video
CN108510988A (en) * 2018-03-22 2018-09-07 深圳市迪比科电子科技有限公司 Language identification system and method for deaf-mutes
CN108446641A (en) * 2018-03-22 2018-08-24 深圳市迪比科电子科技有限公司 Mouth shape image recognition system based on machine learning and method for recognizing and sounding through facial texture
CN108831472A (en) * 2018-06-27 2018-11-16 中山大学肿瘤防治中心 A kind of artificial intelligence sonification system and vocal technique based on lip reading identification
CN109559751A (en) * 2019-01-09 2019-04-02 承德石油高等专科学校 A kind of shape of the mouth as one speaks conversion mask
CN109919127A (en) * 2019-03-20 2019-06-21 邱洵 A kind of sign language languages switching system
CN111913590A (en) * 2019-05-07 2020-11-10 北京搜狗科技发展有限公司 Input method, device and equipment
CN110351631A (en) * 2019-07-11 2019-10-18 京东方科技集团股份有限公司 Deaf-mute's alternating current equipment and its application method
CN111445912A (en) * 2020-04-03 2020-07-24 深圳市阿尔垎智能科技有限公司 Voice processing method and system
CN111916054A (en) * 2020-07-08 2020-11-10 标贝(北京)科技有限公司 Lip-based voice generation method, device and system and storage medium
CN111916054B (en) * 2020-07-08 2024-04-26 标贝(青岛)科技有限公司 Lip-based voice generation method, device and system and storage medium

Similar Documents

Publication Publication Date Title
CN1556496A (en) Lip shape identifying sound generator
KR100619215B1 (en) Microphone and communication interface system
US7676372B1 (en) Prosthetic hearing device that transforms a detected speech into a speech of a speech form assistive in understanding the semantic meaning in the detected speech
Nakajima et al. Non-audible murmur (NAM) recognition
EP1345210A3 (en) Speech recognition system, speech recognition method, speech synthesis system, speech synthesis method, and program product
EP0860811A2 (en) Automated speech alignment for image synthesis
CN201532762U (en) Simultaneous interpretation device special for individuals
WO2015090562A2 (en) Computer-implemented method, computer system and computer program product for automatic transformation of myoelectric signals into audible speech
EP1326232A3 (en) Method, apparatus and computer program for preparing an acoustic model
CN106653048B (en) Single channel sound separation method based on voice model
CN110148418B (en) Scene record analysis system, method and device
CN109346057A (en) A kind of speech processing system of intelligence toy for children
JP2000308198A (en) Hearing and
TWI222622B (en) Robotic vision-audition system
CN112232127A (en) Intelligent speech training system and method
CN110516265A (en) A kind of single identification real-time translation system based on intelligent sound
Dupont et al. Combined use of close-talk and throat microphones for improved speech recognition under non-stationary background noise
US20130035940A1 (en) Electrolaryngeal speech reconstruction method and system thereof
KR20170086233A (en) Method for incremental training of acoustic and language model using life speech and image logs
CN109300478A (en) A kind of auxiliary Interface of person hard of hearing
CN110956949B (en) Buccal type silence communication method and system
JP4011844B2 (en) Translation apparatus, translation method and medium
CN113409809B (en) Voice noise reduction method, device and equipment
CN213154723U (en) A interrogation table that role separation double-circuit audio frequency and video for interrogation
US20240267452A1 (en) Mobile communication system with whisper functions

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication