CN1556496A - Lip shape identifying sound generator - Google Patents
Lip shape identifying sound generator
- Publication number
- CN1556496A (application CNA2003101220227A / CN200310122022A)
- Authority: CN (China)
- Prior art keywords: lip; unit; speech synthesis; sound generator; identification
- Prior art date: 2003-12-31
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Prostheses (AREA)
Abstract
The invention discloses a lip-shape identifying sound generator connected as follows: a miniature camera is connected to an image acquisition unit; the output of the image acquisition unit is connected to a lip-shape image pattern recognition unit; the recognition unit outputs its signal to a speech synthesis unit; and the speech synthesis unit, which is connected to a voice storage unit, extracts speech synthesis elements from the voice storage unit, synthesizes a sound signal, and outputs it to a sound output unit, so that a loudspeaker emits the sound corresponding to the lip shape and its sequence of changes. By recognizing the speaker's lip shapes, the device determines the speech content, synthesizes it into speech, and emits the sound through a loudspeaker in real time. It can help people who cannot speak because their larynx or vocal cords have been removed, as well as deaf-mute people who can mouth lip language, to produce speech, making it convenient for them to communicate with normal persons.
Description
Technical field
The present invention relates to a sound generator, and in particular to a lip-shape identifying sound generator.
Background art
Clinically, many patients have undergone removal of the larynx or vocal cords because of laryngeal or vocal-cord disease, and being unable to speak after the operation prevents them from communicating with normal persons. Deaf-mute people generally communicate with normal persons by reading their lips to determine what the other party is saying, but it is difficult for them to make their own meaning understood. An instrument that uses lip image recognition and speech synthesis to produce sound could help people who cannot speak to be heard and remove this barrier between them and normal persons. At present, however, no instrument or technical solution exists that helps such patients and deaf-mute people to speak and to communicate conveniently with normal persons.
Summary of the invention
The purpose of the present invention is to provide a sound-producing instrument that helps the above-mentioned patients and deaf-mute people to speak, so that they can communicate conveniently with normal persons. The present invention recognizes the speaker's lip shapes, determines the speech content by pattern recognition, and then produces sound by speech synthesis. Most speech sounds of a language have a definite lip shape when spoken, so the invention can put the speaker's lip shapes in one-to-one correspondence with the intended sounds and emit them through a loudspeaker using speech synthesis technology.
The present invention is realized by the following technical solution:
1. The speaker's lip images are captured by a camera and an image acquisition unit.
2. The lip images are processed to extract lip features dynamically in real time, and a lip-shape pattern recognition algorithm then determines the speech content.
3. According to the pattern recognition result, the speech synthesis unit extracts speech elements from the voice storage unit, synthesizes the speech content, and sends it out through the sound output unit. (A simplified sketch of this pipeline is given below.)
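The patent gives no algorithmic detail for these three steps, so the following is only a minimal, hypothetical sketch in Python of how such a capture-recognize-synthesize pipeline could be organized. Every name in it (extract_lip_features, VISEME_TEMPLATES, classify_lip_shape, synthesize) and the toy nearest-neighbour matching are assumptions introduced for illustration; they are not taken from the patent.

```python
# Minimal sketch of the capture -> recognize -> synthesize pipeline.
# All names and the toy nearest-neighbour matching are hypothetical;
# the patent does not specify the algorithms used in steps 1-3.
from dataclasses import dataclass
from typing import List


@dataclass
class LipFeatures:
    """Simple geometric lip features extracted from one video frame."""
    width: float      # horizontal extent of the lip region
    height: float     # vertical extent of the lip region
    roundness: float  # height/width ratio, a rough cue for lip rounding


def extract_lip_features(frame: List[List[int]]) -> LipFeatures:
    """Step 2 (feature extraction), faked on a binary lip mask (rows of 0/1)."""
    rows = [r for r, row in enumerate(frame) if any(row)]
    cols = [c for row in frame for c, v in enumerate(row) if v]
    if not rows:
        return LipFeatures(0.0, 0.0, 0.0)
    width = float(max(cols) - min(cols) + 1)
    height = float(max(rows) - min(rows) + 1)
    return LipFeatures(width, height, height / width)


# Hypothetical viseme templates standing in for the trained recognizer.
VISEME_TEMPLATES = {
    "a": LipFeatures(width=8.0, height=6.0, roundness=0.75),
    "i": LipFeatures(width=10.0, height=2.0, roundness=0.20),
    "u": LipFeatures(width=4.0, height=4.0, roundness=1.00),
}


def classify_lip_shape(feat: LipFeatures) -> str:
    """Step 2 (pattern recognition): nearest template by squared distance."""
    def dist(t: LipFeatures) -> float:
        return ((feat.width - t.width) ** 2
                + (feat.height - t.height) ** 2
                + (feat.roundness - t.roundness) ** 2)
    return min(VISEME_TEMPLATES, key=lambda k: dist(VISEME_TEMPLATES[k]))


def synthesize(labels: List[str]) -> str:
    """Step 3, reduced here to concatenating the recognized units as text."""
    return "".join(labels)


if __name__ == "__main__":
    # Two fake binary lip masks standing in for frames from step 1.
    frame_open = [[0, 1, 1, 1, 1, 1, 1, 1, 0]] * 6  # tall, wide opening
    frame_spread = [[1] * 10, [1] * 10]              # wide, flat opening
    labels = [classify_lip_shape(extract_lip_features(f))
              for f in (frame_open, frame_spread)]
    print("recognized sequence:", synthesize(labels))
```

In the actual device this logic would run on the processor described below (a DSP or ARM microprocessor), with real image pre-processing and a trained recognizer in place of the toy lip-mask geometry.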
The present invention is arranged as shown in Fig. 1: a miniature camera 1 is connected to an image acquisition unit 2; the output of the image acquisition unit 2 is connected to a lip-shape image pattern recognition unit 3; the signal of the lip-shape image pattern recognition unit 3 is output to a speech synthesis unit 4; the speech synthesis unit 4 is connected to a voice storage unit 5, extracts speech synthesis elements from the voice storage unit 5 to synthesize a sound signal, and outputs it to a sound output unit 6, so that a loudspeaker 7 emits the sound corresponding to the lip shape and its sequence of changes.
The lip image processing and pattern recognition unit, the speech synthesis unit and the voice storage unit can be implemented with a processor 8; the processor can be a digital signal processor (DSP) or another microprocessor (such as an ARM processor).
The miniature camera 1 can also be a camera with digital signal output that integrates the image acquisition unit, such as a CCD camera or another image sensor.
By recognizing the speaker's lip shapes, the present invention determines the speech content, synthesizes it into speech, and emits the sound through a loudspeaker in real time. It can help people who cannot speak because their larynx or vocal cords have been removed, as well as deaf-mute people who can mouth lip language, to produce speech, making it convenient for them to communicate with normal persons.
Description of drawings
Fig. 1 is a block diagram of the connections of the system of the present invention.
Fig. 2 shows a lip-shape identifying sound generator of the present invention.
Embodiment
The present invention is described in detail below with reference to the accompanying drawings.
The connections are as shown in Fig. 1: a miniature camera 1 is connected to an image acquisition unit 2; the output of the image acquisition unit 2 is connected to a lip-shape image pattern recognition unit 3; the signal of the lip-shape image pattern recognition unit 3 is output to a speech synthesis unit 4; the speech synthesis unit 4 is connected to a voice storage unit 5, extracts speech synthesis elements from the voice storage unit 5 to synthesize a sound signal, and outputs it to a sound output unit 6, so that a loudspeaker 7 emits the sound corresponding to the lip shape and its sequence of changes.
A miniature camera 1 is used to keep the device small. It is placed in front of the lips so that it captures only the lip image and no other part of the face, and its output is connected to the image acquisition unit. The image acquisition unit 2 uses a video capture processor; its input is connected to the output of the miniature camera and its output is connected to the image processing and pattern recognition unit 3. The image processing and pattern recognition unit is the core of the instrument; it uses a digital signal processor (DSP) or another microprocessor (such as an ARM processor) and mainly performs pre-processing, feature extraction and pattern recognition on the lip images. The speech synthesis unit 4 synthesizes speech according to the result of lip-shape pattern recognition and is also implemented by the digital signal processor. The voice storage unit 5 is a database that stores all the basic phonemes and uses mass storage. The sound output unit 6 consists of a digital-to-analog converter and an amplifier: the digital-to-analog converter converts the digital audio signal into an analog audio signal, which is amplified by the amplifier to drive the loudspeaker 7. The sound output unit may also use a codec. The loudspeaker emits the sound.
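As an illustration of the synthesis and output chain just described (voice storage unit, speech synthesis unit, DAC plus amplifier), here is a minimal sketch in Python under the assumption that the stored "basic phonemes" are short waveforms that are concatenated and then quantized to 16-bit samples for a DAC. The sine-tone "phonemes" and all names (PHONEME_STORE, synthesize_utterance, to_dac_samples) are invented for this sketch and do not come from the patent.

```python
# Sketch of the voice storage / speech synthesis / sound output chain.
# Stored phoneme recordings and the DAC hardware are replaced by sine
# tones and a 16-bit quantization step; all names are hypothetical.
import math
from typing import Dict, List

SAMPLE_RATE = 8000  # Hz, a typical low-rate speech sampling frequency


def _tone(freq: float, duration_s: float = 0.15) -> List[float]:
    """Generate a sine tone as a stand-in for one stored basic phoneme."""
    n = int(SAMPLE_RATE * duration_s)
    return [math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]


# 'Voice storage unit 5': a database of basic speech units.
PHONEME_STORE: Dict[str, List[float]] = {
    "a": _tone(440.0),
    "i": _tone(660.0),
    "u": _tone(330.0),
}


def synthesize_utterance(labels: List[str]) -> List[float]:
    """'Speech synthesis unit 4': concatenate the stored units that match
    the recognized lip-shape sequence into one waveform."""
    samples: List[float] = []
    for label in labels:
        samples.extend(PHONEME_STORE.get(label, []))
    return samples


def to_dac_samples(samples: List[float]) -> List[int]:
    """'Sound output unit 6': quantize the waveform to signed 16-bit values
    of the kind a digital-to-analog converter would be fed before the
    amplifier drives loudspeaker 7."""
    return [max(-32768, min(32767, int(round(s * 32767)))) for s in samples]


if __name__ == "__main__":
    pcm = to_dac_samples(synthesize_utterance(["a", "i"]))
    print(len(pcm), "samples ready for the DAC; first five:", pcm[:5])
```

A real implementation on the DSP would of course stream such samples to the converter continuously rather than building the whole utterance in memory first.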
In this embodiment, the miniature camera and the image acquisition unit may use an integrated image sensor.
The processor 8 used for the lip image processing and pattern recognition unit, the speech synthesis unit and the voice storage unit of this embodiment may be a digital signal processor or digital signal processor system, a general microprocessor or microprocessor system, or an ARM microprocessor or ARM microprocessor system.
The sound output unit of this embodiment consists of a digital-to-analog converter and an amplifier.
For convenience of use, the present invention has the outward appearance of a headset, as shown in Fig. 2: the miniature camera is placed where an ordinary headset carries its microphone, the loudspeaker is led out on a wire, and the circuits of the other functional units are placed at the ear position.
The user puts the device on like a headset, pulls the miniature camera down, aims it at his or her own lips, turns on the switch and begins to speak. Although the user cannot produce any sound, as long as the lips move as they would in normal speech, the device emits the correct sound. Some users' lip movements are non-standard when they speak, so a certain amount of training is needed; for a trained user, the instrument is sufficient for daily communication.
Claims (8)
1. A lip-shape identifying sound generator, composed of six parts: a miniature camera, an image acquisition unit, a lip image processing and pattern recognition unit, a speech synthesis unit, a voice storage unit and a sound output unit; characterized in that the miniature camera (1) is connected to the image acquisition unit (2), the output of the image acquisition unit (2) is connected to the lip-shape image pattern recognition unit (3), the signal of the lip-shape image pattern recognition unit (3) is output to the speech synthesis unit (4), the speech synthesis unit (4) is connected to the voice storage unit (5), extracts speech synthesis elements from the voice storage unit (5) to synthesize a sound signal, and outputs it to the sound output unit (6), whereupon a loudspeaker (7) emits the sound corresponding to the lip shape and its sequence of changes.
2. The lip-shape identifying sound generator according to claim 1, characterized in that the miniature camera and the image acquisition unit use an integrated image sensor.
3. The lip-shape identifying sound generator according to claim 1, characterized in that the lip image processing and pattern recognition unit, the speech synthesis unit and the voice storage unit use a digital signal processor or a digital signal processor system.
4. The lip-shape identifying sound generator according to claim 1, characterized in that the lip image processing and pattern recognition unit, the speech synthesis unit and the voice storage unit use a microprocessor or a microprocessor system.
5. The lip-shape identifying sound generator according to claim 1, characterized in that the lip image processing and pattern recognition unit, the speech synthesis unit and the voice storage unit use an ARM microprocessor or an ARM microprocessor system.
6. The lip-shape identifying sound generator according to claim 1, characterized in that the sound output unit consists of a digital-to-analog converter and an amplifier.
7. The lip-shape identifying sound generator according to claim 1, characterized in that the sound output unit uses a codec.
8. The lip-shape identifying sound generator according to claim 1, characterized in that the miniature camera is arranged in front of the lips.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNA2003101220227A CN1556496A (en) | 2003-12-31 | 2003-12-31 | Lip shape identifying sound generator |
Publications (1)
Publication Number | Publication Date |
---|---|
CN1556496A (en) | 2004-12-22 |
Family
ID=34338600
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNA2003101220227A (CN1556496A, pending) | Lip shape identifying sound generator | 2003-12-31 | 2003-12-31 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN1556496A (en) |
Cited By (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2007134494A1 (en) * | 2006-05-16 | 2007-11-29 | Zhongwei Huang | A computer auxiliary method suitable for multi-languages pronunciation learning system for deaf-mute |
CN101751692B (en) * | 2009-12-24 | 2012-05-30 | 四川大学 | Method for voice-driven lip animation |
CN102117115A (en) * | 2009-12-31 | 2011-07-06 | 上海量科电子科技有限公司 | System for realizing text entry selection by using lip-language and realization method thereof |
CN102117115B (en) * | 2009-12-31 | 2016-11-23 | 上海量科电子科技有限公司 | A kind of system utilizing lip reading to carry out word input selection and implementation method |
CN102193772B (en) * | 2010-03-19 | 2016-08-10 | 索尼公司 | A kind of message handler and information processing method |
CN102193772A (en) * | 2010-03-19 | 2011-09-21 | 索尼公司 | Information processor, information processing method and program |
CN102542280B (en) * | 2010-12-26 | 2016-09-28 | 上海量明科技发展有限公司 | The recognition methods of the different lip reading shape of the mouth as one speaks and system for same content |
CN102542280A (en) * | 2010-12-26 | 2012-07-04 | 上海量明科技发展有限公司 | Recognition method and system aiming at different lip-language mouth shapes with same content |
CN103092329A (en) * | 2011-10-31 | 2013-05-08 | 南开大学 | Lip reading technology based lip language input method |
CN105321519A (en) * | 2014-07-28 | 2016-02-10 | 刘璟锋 | Speech recognition system and unit |
CN105321519B (en) * | 2014-07-28 | 2019-05-14 | 刘璟锋 | Speech recognition system and unit |
CN105632497A (en) * | 2016-01-06 | 2016-06-01 | 昆山龙腾光电有限公司 | Voice output method, voice output system |
CN108538282B (en) * | 2018-03-15 | 2021-10-08 | 上海电力学院 | Method for directly generating voice from lip video |
CN108538282A (en) * | 2018-03-15 | 2018-09-14 | 上海电力学院 | A method of voice is directly generated by lip video |
CN108510988A (en) * | 2018-03-22 | 2018-09-07 | 深圳市迪比科电子科技有限公司 | Language identification system and method for deaf-mutes |
CN108446641A (en) * | 2018-03-22 | 2018-08-24 | 深圳市迪比科电子科技有限公司 | Mouth shape image recognition system based on machine learning and method for recognizing and sounding through facial texture |
CN108831472A (en) * | 2018-06-27 | 2018-11-16 | 中山大学肿瘤防治中心 | A kind of artificial intelligence sonification system and vocal technique based on lip reading identification |
CN109559751A (en) * | 2019-01-09 | 2019-04-02 | 承德石油高等专科学校 | A kind of shape of the mouth as one speaks conversion mask |
CN109919127A (en) * | 2019-03-20 | 2019-06-21 | 邱洵 | A kind of sign language languages switching system |
CN111913590A (en) * | 2019-05-07 | 2020-11-10 | 北京搜狗科技发展有限公司 | Input method, device and equipment |
CN110351631A (en) * | 2019-07-11 | 2019-10-18 | 京东方科技集团股份有限公司 | Deaf-mute's alternating current equipment and its application method |
CN111445912A (en) * | 2020-04-03 | 2020-07-24 | 深圳市阿尔垎智能科技有限公司 | Voice processing method and system |
CN111916054A (en) * | 2020-07-08 | 2020-11-10 | 标贝(北京)科技有限公司 | Lip-based voice generation method, device and system and storage medium |
CN111916054B (en) * | 2020-07-08 | 2024-04-26 | 标贝(青岛)科技有限公司 | Lip-based voice generation method, device and system and storage medium |
Legal Events
Code | Title
---|---
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination
C12 | Rejection of a patent application after its publication
RJ01 | Rejection of invention patent application after publication