CN102324035A - Method and system of applying lip posture assisted speech recognition technique to vehicle navigation - Google Patents

Publication number
CN102324035A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201110239403A
Other languages
Chinese (zh)
Inventor
伍栋杨
王冰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Coagent Electronics S&T Co Ltd
Original Assignee
Guangdong Coagent Electronics S&T Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Coagent Electronics S&T Co Ltd
Priority to CN201110239403A
Publication of CN102324035A
Legal status: Pending

Landscapes

  • Navigation (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention relates to a method and a system for applying lip-shape-assisted speech recognition to vehicle navigation. A camera and a microphone are placed at suitable positions to capture the user's lip-shape image signals and voice signals, which are input to an image/speech recognition processing module. Speech recognition and lip-shape recognition are combined in a decision sequence in which speech recognition judges first and lip-shape recognition confirms afterwards, producing a unified result; the accurately recognized information is then mapped to the control commands of the vehicle navigation equipment to implement voice control. This effectively reduces the probability of recognition errors caused by noise interference. With the vehicle driving or idling and the windows closed, the speech recognition rate rises from about 80 percent to more than 90 percent. The higher recognition rate gives voice navigation greater practical value, makes the navigation equipment more convenient for the driver to use, and improves driving safety.

Description

Method and system for applying lip-shape-assisted speech recognition to vehicle navigation
Technical field
The present invention relates to the field of in-vehicle voice navigation, and in particular to a method and system for applying lip-shape-assisted speech recognition to in-vehicle audio-video navigation.
Background art
With the development of computers and the related software and hardware, speech recognition technology is being applied in more and more fields, and its recognition rate keeps improving. Under favorable conditions such as a quiet environment and standard pronunciation, current speech-to-text input systems reach a recognition rate above 95%. In a vehicle, however, or wherever external noise is strong or pronunciation is non-standard, the recognition rate drops so sharply that the technology is no longer practical. If another method could assist the decision and raise recognition accuracy, the practicality of speech recognition would improve significantly.
Human language perception is a multi-channel process. In everyday conversation, people perceive what others say through sound; in a noisy environment, or when the speaker's pronunciation is unclear, they also need to watch the speaker's lip shape, changes of expression and so on in order to understand accurately what is said. Existing speech recognition systems ignore this visual side of language perception and use the auditory channel alone. As a result, their recognition rate falls significantly in noisy environments or when several people are speaking, which reduces the practicality of speech recognition and limits its range of application.
As in-vehicle navigation systems become widespread, drivers operate the system's various functions while driving. Operating it only by buttons and touch is inconvenient, and the distraction can easily cause an accident. Voice control addresses this problem, but current voice-controlled navigation systems are used in vehicles with fairly severe ambient noise, where the correct recognition rate is low, accurate control suffers, and the results are unsatisfactory.
Summary of the invention
The object of the invention is to solve the problem of the low speech recognition rate of in-vehicle navigation systems in the noisy environment of a vehicle that is driving normally or idling.
To solve this problem, the invention exploits the multi-channel nature of human language perception: sensors simulate "hearing" and "vision", and lip-shape recognition assists speech recognition to raise the speech recognition rate of the in-vehicle navigation system in noise. The procedure is as follows. Sensors capture the sound and the sequence of lip-shape images to obtain the "auditory" and "visual" information. After processing such as denoising and A/D conversion, each signal is compared with the template library preset in the image/speech recognition processing module to perform speech recognition and lip-shape recognition respectively. The lip-shape recognition result is then compared with the speech recognition result; if the similarity of the two results reaches a set level, the speech recognition result is confirmed. This overcomes the influence of noise and significantly raises the speech recognition rate. Finally, the result is converted into the corresponding instruction and output to the in-vehicle navigation system to navigate or obtain information.
The method is implemented as shown in Fig. 1. First, the system preprocesses the speech input and the lip-image input separately, extracts features, and "trains" "template blocks" for use in recognition matching. In use, the speech input and the lip-image input are again preprocessed separately and their features extracted to obtain the "test" signal; a "measure and estimate" step against the trained template blocks determines the valid speech recognition information; a preset "expert knowledge" system then makes the recognition decision, and the "result" is output, completing the speech recognition process.
Note that when the template blocks are trained, template training uses both audio recording and video capture to build the template libraries for speech and lip-shape recognition, and each recording is judged and stored in one-to-one correspondence with its lip-shape video images.
Speech recognition in the invention uses template matching, which has four steps: feature extraction, template training, template classification, and judgment.
Taking speech recognition as an example:
The first step is feature extraction. The captured analog speech signals are A/D converted, and the resulting digital signals are processed and stored: the digitized signal is denoised digitally to remove spurious data and keep the feature data. The denoising method is based on the characteristics of in-cabin noise. The regular noise patterns of the car when cruising or idling are analyzed, for example the characteristic engine, air-conditioning and driving noise with the windows closed or open. A correlation operation removes these noise features from the captured raw speech data, yielding feature data close to the true speech.
The second step is template training. A speech template library is built from the voice commands commonly used to control the in-vehicle equipment and from related information, for example the words "start", "navigation", "destination" and "Shanghai". People of different ages, sexes and accents read them aloud, the recordings are processed accordingly, and the vehicle voice-control template database is built.
The third step is template classification. Templates are divided by application into a control command class and an address information class, and further classified by the size of their information scope, so as to narrow the range searched during matching and improve matching efficiency and accuracy. The control command class includes, for example, navigation commands and voice control commands; the major groups of the address information class include, for example, province-level place names, city-level place names and smaller place names.
The fourth step is judgment. A matching algorithm matches the speech features against the models in the speech template library, and the judged result is compared with the lip-shape recognition result to further confirm the accuracy of the speech recognition result.
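The denoising idea in the first step, removing stored noise characteristics from the captured signal, can be illustrated with a small sketch. The text does not specify the "correlation operation", so the sketch below swaps in plain magnitude spectral subtraction as an assumed stand-in; the function name, the profile names, the sample values and the spectral floor are all hypothetical.

```python
def subtract_noise_profile(frame_mag, noise_mag, floor=0.05):
    """Subtract a stored noise magnitude profile from one frame's magnitude spectrum.

    frame_mag and noise_mag are equal-length lists of per-frequency-bin magnitudes.
    A small spectral floor (a fraction of the input bin) keeps bins from going
    to zero or negative.
    """
    if len(frame_mag) != len(noise_mag):
        raise ValueError("spectrum and noise profile must have the same number of bins")
    return [max(s - n, floor * s) for s, n in zip(frame_mag, noise_mag)]


# Hypothetical noise profiles, one per driving condition (illustrative values only).
NOISE_PROFILES = {
    "idle_windows_closed": [0.8, 0.5, 0.2, 0.1],
    "cruise_windows_open": [1.5, 1.0, 0.6, 0.3],
}

frame = [1.0, 2.5, 0.2, 0.4]
cleaned = subtract_noise_profile(frame, NOISE_PROFILES["idle_windows_closed"])
print(cleaned)
```

Keeping one profile per driving condition mirrors the text's distinction between cruising and idling and between open and closed windows; how the active condition would be selected is left open here.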
Lip-shape recognition in the invention locates the lips accurately by a decision method that combines lip shape with lip hue. Specifically, it uses a lip-motion feature extraction and recognition method based on chroma filtering: chroma filtering of the lips yields an enhanced lip-motion image; a deformable template then describes the lip contour and extracts feature parameters; and a hidden Markov model (HMM) recognizes the lip-motion image sequence. The method is unaffected by scaling, deformation or rotation of the lip shape, is robust to different lip types, places no special requirements on illumination, and is speaker-independent. It is suitable for describing lip shape under natural conditions and meets the deformable template's requirement for high-resolution object edges. The lips are thereby located accurately and recognized with a suitable lip matching algorithm. The recognition result is compared with the speech recognition result to form a unified recognition result. Finally, the accurately recognized information is mapped to the control instructions of the in-vehicle equipment to complete the voice control function. Lip-shape recognition thus assists speech recognition and raises the speech recognition rate.
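As a rough illustration of chroma filtering, the sketch below marks lip-colored pixels with the pseudo-hue R / (R + G), a heuristic often used for lip segmentation. The patent does not name its chroma filter, so the measure, the threshold of 0.6 and the sample pixel colors are assumptions made only for this example.

```python
def pseudo_hue(r, g):
    """Return R / (R + G); lip pixels tend to score higher than skin pixels."""
    total = r + g
    return r / total if total else 0.0


def lip_mask(image, threshold=0.6):
    """Mark pixels whose pseudo-hue exceeds the (illustrative) threshold.

    image is a list of rows; each pixel is an (R, G, B) tuple of 0-255 ints.
    """
    return [[1 if pseudo_hue(r, g) > threshold else 0 for (r, g, _b) in row]
            for row in image]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the marked region, or None if empty."""
    coords = [(y, x) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return (min(ys), min(xs), max(ys), max(xs))


SKIN = (200, 160, 140)  # hypothetical skin tone, pseudo-hue about 0.56
LIP = (180, 70, 90)     # hypothetical lip tone, pseudo-hue 0.72
image = [
    [SKIN, SKIN, SKIN, SKIN],
    [SKIN, LIP, LIP, SKIN],
    [SKIN, SKIN, SKIN, SKIN],
]
mask = lip_mask(image)
print(bounding_box(mask))  # (1, 1, 1, 2)
```

The bounding box is only a coarse localization; in the text, a deformable template is then fitted to the enhanced image to describe the contour.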
The beneficial effects of this technical solution are as follows. Speech recognition and lip-shape recognition are combined through feature extraction, template training, template classification and judgment. A decision sequence in which speech recognition judges first and lip-shape recognition confirms afterwards effectively reduces the probability of recognition errors caused by noise and external sound. Experiments show that with the vehicle driving or idling (windows closed), the speech recognition rate rises from about 80% to above 90%. The higher recognition rate overcomes the weakness of voice-only navigation, makes voice navigation equipment more convenient to use, and makes it safer to use the navigation equipment while driving.
Description of the drawings
The invention and its beneficial technical effects are described in further detail below with reference to the drawings and embodiments, in which:
Fig. 1 is a schematic diagram of the main processing flow for lip-shape information and speech information in the invention.
Fig. 2 is a diagram of the lip-shape-assisted speech recognition system of the invention.
Reference numerals: 21, driver's face; 22, camera; 23, microphone; 24, image/speech recognition processing module; 25, in-vehicle navigation audio-video system.
Detailed description of the embodiments
The main processing flow for lip-shape information and speech information disclosed by the invention is shown in Fig. 1. First, the system preprocesses the speech input and the lip-image input separately, extracts features, and stores the "trained" "template blocks" for use in recognition matching. In use, the speech input and the lip-image input are again preprocessed separately and their features extracted to obtain the "test" signal; a "measure and estimate" step against the "trained" "template blocks" determines the valid speech recognition information; a preset "expert knowledge" system then makes the recognition decision, and the "result" is output, completing the speech recognition process.
Note that when the template blocks are trained, template training uses both audio recording and video capture to build the template libraries for speech and lip-shape recognition, and each recording is judged and stored in one-to-one correspondence with its lip-shape video images.
In general, the method of applying lip-shape-assisted speech recognition to vehicle navigation disclosed by the invention mainly comprises the following steps:
a. Obtain speech information through a voice recording device and perform speech recognition after processing by feature extraction, template training, template classification and judgment;
b. Obtain image information through a lip-shape camera device and perform lip-shape recognition after processing by feature extraction, template training, template classification and judgment, with the lip-shape image information in one-to-one correspondence with the speech information of step a;
c. Compare the speech recognition result with the lip-shape recognition result; when the similarity of the two results reaches a set level, the speech recognition result is confirmed as valid and is output;
d. Convert the speech recognition result into the corresponding instruction and output it to the in-vehicle navigation device to navigate or obtain information.
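Steps c and d can be sketched as a single confirm-then-dispatch function. This is a minimal illustration under stated assumptions: the command table, the control codes, the function names and the placeholder similarity measure are hypothetical, and the text does not say what happens when the two results disagree, so the sketch simply returns None.

```python
from typing import Callable, Optional

# Hypothetical command table: recognized phrase -> navigation control code.
COMMANDS = {
    "open navigation": "NAV_OPEN",
    "plan route": "NAV_PLAN_ROUTE",
}


def recognize_and_dispatch(
    speech_result: str,
    lip_result: str,
    similarity: Callable[[str, str], float],
    threshold: float,
) -> Optional[str]:
    """Step c: confirm the speech result against the lip result.
    Step d: map the confirmed result to a control code."""
    if similarity(speech_result, lip_result) < threshold:
        return None  # not confirmed; handling of this case is not specified in the text
    return COMMANDS.get(speech_result)


def exact_match(a: str, b: str) -> float:
    """Crude placeholder similarity: 1.0 when identical, else 0.0."""
    return 1.0 if a == b else 0.0


print(recognize_and_dispatch("plan route", "plan route", exact_match, 0.7))       # NAV_PLAN_ROUTE
print(recognize_and_dispatch("plan route", "open navigation", exact_match, 0.7))  # None
```

Passing the similarity function as a parameter keeps the flow independent of how similarity is actually measured, which the text leaves open.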
Further, the method for the template matches of speech recognition employing of the present invention is divided into four steps: feature extraction, template training, template classification, judgement.
With the voice recognition is example:
(a) feature extraction is carried out the A/D conversion with the various analog signal of voice of gathering, and processes and stores after converting digital signal to.Be about to this signal digital and carry out digital denoising processing, remove pseudo-data, the keeping characteristics data.The denoising method that adopts is the characteristics according to the environment inside car noise; Analyze the normal mode noise of car when cruising or idling; As close or the engine when opening vehicle window, air-conditioning and driving noise characteristic data; The primary voice data of gathering through related operation, is formed near real voice feature data after removing these noise characteristic data.
(b) template training; Control voice command commonly used and relevant information is set up the sound template storehouse according to mobile unit; Like voice such as " beginning ", " navigation ", " destination ", " Shanghai "; Look for the people of all ages and classes, sex, accent to read, and do corresponding processing, set up the automobile-used sound template database of controlling.
(c) template classification is divided into control command class, address information class according to application characteristic, and range of information is type classification by size, to dwindle coupling judgement scope, improves matching efficiency and accuracy rate.The control command class is specifically just like navigation command class, voice control class; The big group of address information is specifically just like provincial place name, city-level place name or littler place name etc.
(d) judge, utilize matching algorithm to carry out phonetic feature and sound template storehouse Model Matching,, further confirm the accuracy of voice identification result result who judges and Mouth-Shape Recognition comparison.
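The judgment step (d) only says that "a matching algorithm" compares features with templates. As one possible reading, the sketch below uses dynamic time warping (DTW), a classic template-matching distance that tolerates differences in speaking rate. DTW is an assumption made for illustration, not the method the patent names (it prefers HMMs), and the one-dimensional feature values are invented.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def best_template(features, templates):
    """Return (label, distance) of the template closest to the input features."""
    scored = ((label, dtw_distance(features, t)) for label, t in templates.items())
    return min(scored, key=lambda pair: pair[1])


# Hypothetical per-frame feature values for two command templates.
TEMPLATES = {
    "start": [1.0, 3.0, 4.0, 3.0, 1.0],
    "navigation": [2.0, 2.0, 5.0, 6.0, 5.0, 2.0],
}
spoken = [1.0, 1.0, 3.0, 4.0, 4.0, 3.0, 1.0]  # "start" spoken more slowly
print(best_template(spoken, TEMPLATES))  # ('start', 0.0)
```

The slower utterance still aligns with the "start" template at zero cost because DTW lets one template frame absorb several input frames.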
Preferably, the speech recognition algorithm uses the hidden Markov model (HMM) method. On the basis of this general algorithm, the invention optimizes the related algorithms and makes them practical for the particular environment of in-vehicle voice applications. Specifically, the template library is classified rationally and arranged in order from small to large, and speech feature matching proceeds from the small classes to the large ones in turn, which effectively improves matching efficiency. The small classes contain just the specific commands that control the in-vehicle equipment and the most commonly used, key speech template data.
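The small-to-large search order can be sketched as follows. The groups, the acceptance score of 0.8 and the text-based scoring function are hypothetical stand-ins: a real system would score acoustic features against HMMs rather than compare strings.

```python
from difflib import SequenceMatcher


def text_score(utterance, template):
    """Stand-in score in [0, 1]; a real system would score acoustic features."""
    return SequenceMatcher(None, utterance.lower(), template.lower()).ratio()


def match_in_tiers(utterance, tiers, score=text_score, accept=0.8):
    """Search template groups from smallest to largest; stop at the first good match.

    Returns (template, groups_searched), or (None, groups_searched) if nothing matches.
    """
    ordered = sorted(tiers, key=len)
    for searched, group in enumerate(ordered, start=1):
        best = max(group, key=lambda t: score(utterance, t), default=None)
        if best is not None and score(utterance, best) >= accept:
            return best, searched
    return None, len(ordered)


# Hypothetical template groups: a small command group and a larger place-name group.
TIERS = [
    ["Shanghai", "Guangzhou", "Shenzhen", "Beijing"],
    ["start", "navigation", "destination"],
]
print(match_in_tiers("navigation", TIERS))  # ('navigation', 1)
print(match_in_tiers("Shanghai", TIERS))    # ('Shanghai', 2)
```

A command is found after searching only the small group, which is the efficiency gain the text describes; place names fall through to the larger group.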
For lip-shape recognition, the invention preferably uses the lip-motion feature extraction and recognition method based on chroma filtering. Chroma filtering of the lips yields an enhanced lip-motion image; a deformable template then extracts and tracks the lip contour and extracts feature parameters; and the result (curve parameters) is fed to the recognizer, an HMM, which recognizes the lip-motion image sequence.
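To show how an HMM can recognize a lip-motion sequence, the sketch below scores an observation sequence under two small discrete HMMs with the forward algorithm and picks the more likely one. Everything specific is assumed for illustration: the patent feeds curve parameters to the recognizer, whereas this sketch quantizes mouth opening into three symbols, and the model names and probabilities are invented.

```python
def forward_likelihood(obs, start, trans, emit):
    """Probability of an observation sequence under a discrete HMM (forward algorithm).

    obs: non-empty list of observation symbol indices.
    start[i]: probability of starting in state i.
    trans[i][j]: probability of moving from state i to state j.
    emit[i][k]: probability that state i emits symbol k.
    """
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbol]
            for j in range(n)
        ]
    return sum(alpha)


def classify(obs, models):
    """Return the name of the model with the highest likelihood for obs."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


# Symbols (assumed): 0 = mouth closed, 1 = half open, 2 = wide open.
LEFT_TO_RIGHT = [[0.5, 0.5], [0.0, 1.0]]
MODELS = {
    # Hypothetical word whose lips go from closed to wide open.
    "word_a": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]),
    # Hypothetical word whose lips go from wide open to closed.
    "word_b": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.1, 0.8], [0.8, 0.1, 0.1]]),
}
print(classify([0, 0, 2, 2], MODELS))  # word_a
```

In practice one model would be trained per vocabulary item from the recorded lip-image sequences, and long sequences would use log probabilities or scaling to avoid underflow.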
The structure of the lip-shape-assisted speech recognition system is shown in Fig. 2. It comprises an in-vehicle navigation audio-video system (25), an image/speech recognition processing module (24) connected to it, and a microphone (23) and camera (22) connected to the input of the image/speech recognition processing module (24). When the driver's face 21 is turned toward the microphone 23 and camera 22 and the driver speaks, the microphone 23 and camera 22 capture the speech signal and the lip-shape image signal respectively and feed them to the image/speech recognition processing module 24 for processing (denoising, preprocessing, feature extraction, judgment, recognition and so on). The recognized result is converted into the corresponding control instruction and input to the in-vehicle navigation audio-video system 25 to carry out the voice control operation.
Preferably, the microphone 23 is a high-fidelity, high-sensitivity electret condenser microphone with directional pickup, mounted on the upper part of the dashboard directly in front of the driver's seat, with the pickup opening facing the driver's face 21, so that the best speech signal is captured and the influence of noise inside and outside the car is reduced as far as possible.
Preferably, the camera 22 is a CCD video image sensor with night vision, a video resolution of 640 × 480 at 25 frames, and 32-bit true color, mounted at the upper edge of the windshield directly in front of the driver with the lens facing the driver's face 21. This ensures that a clear lip image is obtained even in dim light, so that the system's image analysis is more accurate.
Preferably, the processor of the image/speech recognition processing module 24 is a high-performance DSP, which ensures good real-time performance.
In software, control commands take fixed forms such as "open navigation", "locate target", "plan route", "make a call" and "answer", which greatly reduces the computation needed for template matching and also improves recognition efficiency. Map addresses and voice information are recognized by fuzzy matching on key words, which widens the range of what can be recognized and also raises the recognition rate. These measures provide a strong guarantee that voice commands are executed correctly.
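The split between fixed-form commands and fuzzy keyword matching for addresses might look like the sketch below. It works on recognized text rather than audio, and the place-name list, the cutoff of 0.6 and the function name are assumptions for illustration; fuzzy matching here uses Python's standard difflib module.

```python
import difflib

# Fixed command forms taken from the text (English renderings).
FIXED_COMMANDS = {"open navigation", "locate target", "plan route", "make a call", "answer"}
# Hypothetical sample of map place names.
PLACE_NAMES = ["Shanghai", "Guangzhou", "Shenzhen", "Beijing"]


def interpret(text):
    """Classify recognized text as a fixed command, a fuzzy-matched place, or unknown."""
    normalized = text.strip().lower()
    if normalized in FIXED_COMMANDS:
        return ("command", normalized)  # exact lookup, no template search needed
    matches = difflib.get_close_matches(text.strip(), PLACE_NAMES, n=1, cutoff=0.6)
    if matches:
        return ("place", matches[0])
    return ("unknown", text)


print(interpret("Plan Route"))  # ('command', 'plan route')
print(interpret("Shanghay"))    # ('place', 'Shanghai')
```

Exact lookup keeps the command path cheap, while the fuzzy path tolerates small recognition errors in the much larger set of place names.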
Preferably, the template library is built as follows. Twenty men and twenty women aged 16 to 70 are selected. Each records the vehicle navigation voice commands, map-information speech, voice-playback commands and program names, and device control commands, together with the corresponding lip-shape images. The basic template library is built after speech/lip-shape comparison and feature extraction. Once the speech recognition template library has been built, its contents are classified and stored in the corresponding template class libraries for later use.
During lip-shape-assisted speech recognition, the microphone 23 and camera 22 capture the feature data. In the image/speech recognition processing module 24, the captured raw sound is first denoised and its features are extracted; the corresponding lip-shape features are extracted as well, and both are put through a series of matching decisions against the preset template library data. The feature result judged by speech recognition is then compared with the corresponding lip-shape recognition result. Preferably, the speech content is confirmed when the similarity of the two recognition results reaches 70% or more; it is then converted into a control instruction and sent to the in-vehicle navigation audio-video system for processing.
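The 70% confirmation rule can be made concrete with a small sketch. The text does not define how similarity is measured, so the sketch assumes each recognizer outputs a sequence of syllable labels and measures the proportion of matching units with Python's difflib; the syllable sequences are invented examples.

```python
from difflib import SequenceMatcher

CONFIRM_THRESHOLD = 0.70  # the preferred value given in the text


def result_similarity(speech_units, lip_units):
    """Ratio in [0, 1] of matching units between the two recognized sequences."""
    return SequenceMatcher(None, speech_units, lip_units).ratio()


def confirm(speech_units, lip_units, threshold=CONFIRM_THRESHOLD):
    """Return the speech result when the lip result agrees closely enough, else None."""
    if result_similarity(speech_units, lip_units) >= threshold:
        return speech_units
    return None


# Hypothetical syllable-level outputs of the two recognizers.
speech = ["dao", "hang", "dao", "shang", "hai"]
lips_close = ["dao", "hang", "dao", "shang", "bai"]  # 4 of 5 units agree -> 0.8
lips_far = ["da", "kai", "dao", "hang"]              # only 2 units align -> about 0.44

print(confirm(speech, lips_close))  # ['dao', 'hang', 'dao', 'shang', 'hai']
print(confirm(speech, lips_far))    # None
```

The lip result is used only to accept or reject: the confirmed output is always the speech result, matching the "speech judges first, lip shape confirms afterwards" order described in the text.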
With lip-shape-assisted speech recognition applied in the in-vehicle navigation system, the speech recognition rate is higher, so while the vehicle is moving the voice navigation equipment can recognize and respond to the driver's voice commands in real time and navigate even in a noisy environment, avoiding as far as possible the accidents that occur when the driver operates the navigation equipment.
Based on the disclosure and teaching of the above description and specific embodiments, those skilled in the art to which the invention belongs may also change and modify the above embodiments. The invention is therefore not limited to the embodiments disclosed and described above, and modifications and changes to it should also fall within the scope of protection of the claims. In addition, although certain specific terms and concepts are used in this description, they are used only for convenience of explanation and do not limit the invention in any way.

Claims (10)

1. A method of applying lip-shape-assisted speech recognition to vehicle navigation, characterized by comprising the following steps:
a. obtaining speech information through a voice recording device and performing speech recognition after processing by feature extraction, template training, template classification and judgment;
b. obtaining image information through a lip-shape camera device and performing lip-shape recognition after processing by feature extraction, template training, template classification and judgment, with the lip-shape image information in one-to-one correspondence with the speech information of step a;
c. comparing the speech recognition result with the lip-shape recognition result and, when the similarity of the two results reaches a set level, confirming the speech recognition result as valid and outputting it;
d. converting the speech recognition result into the corresponding instruction and outputting it to the in-vehicle navigation device to navigate or obtain information.
2. The method of applying lip-shape-assisted speech recognition to vehicle navigation according to claim 1, characterized in that step a specifically comprises:
(a) feature extraction: A/D converting the captured analog speech signals, then processing and storing the resulting digital signals, that is, digitally denoising the digitized signal to remove spurious data and keep the feature data;
(b) template training: building a speech template library from the voice commands commonly used to control the in-vehicle equipment and from related information, having people of different ages, sexes and accents read them aloud, processing the recordings accordingly, and building the vehicle voice-control template database;
(c) template classification: classifying the templates by application into a control command class and an address information class, and by the size of their information scope, so as to narrow the range searched during matching and improve matching efficiency and accuracy;
(d) judgment: using a matching algorithm to match the speech features against the models in the speech template library and outputting the judged result.
3. The method of applying lip-shape-assisted speech recognition to vehicle navigation according to claim 1, characterized in that step b further comprises: using a lip-motion feature extraction and recognition method based on chroma filtering, in which chroma filtering of the lips yields an enhanced lip-motion image, a deformable template then describes the lip contour and extracts feature parameters, and an HMM recognizes the lip-motion image sequence.
4. The method of applying lip-shape-assisted speech recognition to vehicle navigation according to claim 1, characterized in that the similarity reaching a set level in step c means a similarity of 70% or more.
5. A system for applying lip-shape-assisted speech recognition to vehicle navigation, characterized by comprising: an in-vehicle navigation audio-video system (25), an image/speech recognition processing module (24) connected to it, and a microphone (23) and camera (22) connected to the input of the image/speech recognition processing module (24); the microphone (23) and camera (22) capture the speech signal and the lip-shape image signal respectively and input them to the image/speech recognition processing module (24) for processing and recognition; and the recognized result is converted into the corresponding control instruction and input to the in-vehicle navigation audio-video system (25) to carry out the voice control operation.
6. The system for applying lip-shape-assisted speech recognition to vehicle navigation according to claim 5, characterized in that the microphone (23) is a high-fidelity, high-sensitivity electret condenser microphone with directional pickup.
7. The system for applying lip-shape-assisted speech recognition to vehicle navigation according to claim 5, characterized in that the camera (22) is a CCD video image sensor with night vision, a video resolution of 640 × 480 at 25 frames, and 32-bit true color.
8. The system for applying lip-shape-assisted speech recognition to vehicle navigation according to claim 5 or 6, characterized in that the microphone (23) is mounted on the upper part of the dashboard directly in front of the driver's seat, with the pickup opening facing the driver's face (21).
9. The system for applying lip-shape-assisted speech recognition to vehicle navigation according to claim 5 or 7, characterized in that the camera (22) is mounted at the upper edge of the windshield directly in front of the driver's seat, with the lens facing the driver's face (21).
10. The system for applying lip-shape-assisted speech recognition to vehicle navigation according to claim 5, characterized in that the processor of the image/speech recognition processing module (24) is a high-performance DSP.
Application CN201110239403A, priority date 2011-08-19, filing date 2011-08-19: Method and system of applying lip posture assisted speech recognition technique to vehicle navigation. Status: Pending. Published as CN102324035A (en).


Publications (1)

Publication Number Publication Date
CN102324035A (en) 2012-01-18

Family

ID=45451774

Country Status (1)

Country Link
CN (1) CN102324035A (en)

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102800315A (en) * 2012-07-13 2012-11-28 上海博泰悦臻电子设备制造有限公司 Vehicle-mounted voice control method and system
CN103268472A (en) * 2013-04-17 2013-08-28 哈尔滨工业大学深圳研究生院 Dual-color-space-based lip detection method
CN103514875A (en) * 2012-06-29 2014-01-15 联想(北京)有限公司 Voice data matching method and electronic equipment
CN103869962A (en) * 2012-12-18 2014-06-18 联想(北京)有限公司 Data processing method and device and electronic equipment
WO2015018254A1 (en) * 2013-08-05 2015-02-12 Yuan Zhi’Xian Voice/image-controlled vehicle-mounted gps navigation information exchange device
WO2015117403A1 (en) * 2014-07-23 2015-08-13 中兴通讯股份有限公司 Noise suppression method and apparatus, computer program and computer storage medium
WO2015158082A1 (en) * 2014-04-17 2015-10-22 中兴通讯股份有限公司 Lip-reading based terminal operation method and device
CN105389097A (en) * 2014-09-03 2016-03-09 中兴通讯股份有限公司 Man-machine interaction device and method
CN105741839A (en) * 2016-02-17 2016-07-06 陆玉正 Vehicle-mounted electric appliance voice auxiliary control device
WO2016150001A1 (en) * 2015-03-24 2016-09-29 中兴通讯股份有限公司 Speech recognition method, device and computer storage medium
WO2016173132A1 (en) * 2015-04-28 2016-11-03 中兴通讯股份有限公司 Method and device for voice recognition, and user equipment
CN106463116A (en) * 2014-06-11 2017-02-22 霍尼韦尔国际公司 Plant control system using voice as a control mechanism
CN107085734A (en) * 2017-05-24 2017-08-22 南京华设科技股份有限公司 Intelligent service acceptance robot
WO2018045703A1 (en) * 2016-09-07 2018-03-15 中兴通讯股份有限公司 Voice processing method, apparatus and terminal device
CN107831684A (en) * 2016-09-16 2018-03-23 天津思博科科技发展有限公司 Mouth-shape pronunciation conversion device realized using machine vision technology
CN107945789A (en) * 2017-12-28 2018-04-20 努比亚技术有限公司 Speech recognition method, device and computer-readable storage medium
CN108227904A (en) * 2016-12-21 2018-06-29 深圳市掌网科技股份有限公司 Virtual reality language interaction system and method
CN108346427A (en) * 2018-02-05 2018-07-31 广东小天才科技有限公司 Speech recognition method, device, equipment and storage medium
CN108389573A (en) * 2018-02-09 2018-08-10 北京易真学思教育科技有限公司 Language identification method and device, training method and device, medium, terminal
WO2018210219A1 (en) * 2017-05-18 2018-11-22 刘国华 Device-facing human-computer interaction method and system
CN109377995A (en) * 2018-11-20 2019-02-22 珠海格力电器股份有限公司 Method and apparatus for controlling equipment
CN109448711A (en) * 2018-10-23 2019-03-08 珠海格力电器股份有限公司 Speech recognition method, apparatus and computer storage medium
US10366691B2 (en) 2017-07-11 2019-07-30 Samsung Electronics Co., Ltd. System and method for voice command context
CN110827823A (en) * 2019-11-13 2020-02-21 联想(北京)有限公司 Voice auxiliary recognition method and device, storage medium and electronic equipment
CN111243585A (en) * 2020-01-07 2020-06-05 百度在线网络技术(北京)有限公司 Control method, device and equipment under multi-person scene and storage medium
CN111554294A (en) * 2020-04-23 2020-08-18 苏州大学 Intelligent garbage classification method based on voice recognition
CN111898108A (en) * 2014-09-03 2020-11-06 创新先进技术有限公司 Identity authentication method and device, terminal and server
CN112927688A (en) * 2021-01-25 2021-06-08 思必驰科技股份有限公司 Voice interaction method and system for vehicle
CN113157080A (en) * 2020-01-07 2021-07-23 宝马股份公司 Instruction input method for vehicle, storage medium, system and vehicle
CN114093354A (en) * 2021-10-26 2022-02-25 惠州市德赛西威智能交通技术研究院有限公司 Method and system for improving recognition accuracy of vehicle-mounted voice assistant
CN115083428A (en) * 2022-05-30 2022-09-20 湖南中周至尚信息技术有限公司 Voice model recognition device for assisting news broadcasting and control method thereof
CN115243104A (en) * 2021-11-30 2022-10-25 广州汽车集团股份有限公司 Method and system for automatically adjusting vehicle-mounted multimedia volume
CN116028603A (en) * 2022-06-07 2023-04-28 成都成电金盘健康数据技术有限公司 Intelligent pre-consultation method, device and system based on big data, and storage medium
CN111898108B (en) * 2014-09-03 2024-06-04 创新先进技术有限公司 Identity authentication method, device, terminal and server

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1811341A (en) * 2005-01-27 2006-08-02 乐金电子(惠州)有限公司 Vehicular navigation apparatus and operating method thereof
CN102023703A (en) * 2009-09-22 2011-04-20 现代自动车株式会社 Combined lip reading and voice recognition multimodal interface system
CN202329640U (en) * 2011-08-19 2012-07-11 广东好帮手电子科技股份有限公司 System for applying auxiliary voice recognition technology by mouth shape in vehicular navigation

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1811341A (en) * 2005-01-27 2006-08-02 乐金电子(惠州)有限公司 Vehicular navigation apparatus and operating method thereof
CN102023703A (en) * 2009-09-22 2011-04-20 现代自动车株式会社 Combined lip reading and voice recognition multimodal interface system
CN202329640U (en) * 2011-08-19 2012-07-11 广东好帮手电子科技股份有限公司 System for applying auxiliary voice recognition technology by mouth shape in vehicular navigation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
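Yao Hongxun (姚鸿勋) et al.: "Lip-motion feature extraction and recognition based on chrominance analysis" (《基于色度分析的唇动特征提取与识别》), Acta Electronica Sinica (《电子学报》) *
Duan Hongmei (段红梅) et al.: "Application of hidden Markov models in speech recognition" (《隐马尔可夫模型在语音识别中的应用》), Journal of Mathematics for Technology (《工科数学》) *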

Cited By (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103514875A (en) * 2012-06-29 2014-01-15 联想(北京)有限公司 Voice data matching method and electronic equipment
CN102800315A (en) * 2012-07-13 2012-11-28 上海博泰悦臻电子设备制造有限公司 Vehicle-mounted voice control method and system
CN103869962B (en) * 2012-12-18 2016-12-28 联想(北京)有限公司 Data processing method, device and electronic equipment
CN103869962A (en) * 2012-12-18 2014-06-18 联想(北京)有限公司 Data processing method and device and electronic equipment
CN103268472A (en) * 2013-04-17 2013-08-28 哈尔滨工业大学深圳研究生院 Dual-color-space-based lip detection method
CN103268472B (en) * 2013-04-17 2017-07-18 哈尔滨工业大学深圳研究生院 Lip detection method based on dual color spaces
WO2015018254A1 (en) * 2013-08-05 2015-02-12 Yuan Zhi’Xian Voice/image-controlled vehicle-mounted gps navigation information exchange device
WO2015158082A1 (en) * 2014-04-17 2015-10-22 中兴通讯股份有限公司 Lip-reading based terminal operation method and device
CN105022470A (en) * 2014-04-17 2015-11-04 中兴通讯股份有限公司 Method and device of terminal operation based on lip reading
CN106463116A (en) * 2014-06-11 2017-02-22 霍尼韦尔国际公司 Plant control system using voice as a control mechanism
WO2015117403A1 (en) * 2014-07-23 2015-08-13 中兴通讯股份有限公司 Noise suppression method and apparatus, computer program and computer storage medium
CN105389097A (en) * 2014-09-03 2016-03-09 中兴通讯股份有限公司 Man-machine interaction device and method
CN111898108B (en) * 2014-09-03 2024-06-04 创新先进技术有限公司 Identity authentication method, device, terminal and server
CN111898108A (en) * 2014-09-03 2020-11-06 创新先进技术有限公司 Identity authentication method and device, terminal and server
WO2016150001A1 (en) * 2015-03-24 2016-09-29 中兴通讯股份有限公司 Speech recognition method, device and computer storage medium
WO2016173132A1 (en) * 2015-04-28 2016-11-03 中兴通讯股份有限公司 Method and device for voice recognition, and user equipment
CN106157957A (en) * 2015-04-28 2016-11-23 中兴通讯股份有限公司 Speech recognition method, device and user equipment
CN105741839A (en) * 2016-02-17 2016-07-06 陆玉正 Vehicle-mounted electric appliance voice auxiliary control device
WO2018045703A1 (en) * 2016-09-07 2018-03-15 中兴通讯股份有限公司 Voice processing method, apparatus and terminal device
CN107831684A (en) * 2016-09-16 2018-03-23 天津思博科科技发展有限公司 Mouth-shape pronunciation conversion device realized using machine vision technology
CN108227904A (en) * 2016-12-21 2018-06-29 深圳市掌网科技股份有限公司 Virtual reality language interaction system and method
WO2018210219A1 (en) * 2017-05-18 2018-11-22 刘国华 Device-facing human-computer interaction method and system
US11163356B2 (en) 2017-05-18 2021-11-02 Guohua Liu Device-facing human-computer interaction method and system
CN107085734A (en) * 2017-05-24 2017-08-22 南京华设科技股份有限公司 Intelligent service acceptance robot
US10366691B2 (en) 2017-07-11 2019-07-30 Samsung Electronics Co., Ltd. System and method for voice command context
CN110785735A (en) * 2017-07-11 2020-02-11 三星电子株式会社 Apparatus and method for voice command scenario
CN107945789A (en) * 2017-12-28 2018-04-20 努比亚技术有限公司 Speech recognition method, device and computer-readable storage medium
CN108346427A (en) * 2018-02-05 2018-07-31 广东小天才科技有限公司 Speech recognition method, device, equipment and storage medium
CN108389573A (en) * 2018-02-09 2018-08-10 北京易真学思教育科技有限公司 Language identification method and device, training method and device, medium, terminal
CN109448711A (en) * 2018-10-23 2019-03-08 珠海格力电器股份有限公司 Speech recognition method, apparatus and computer storage medium
CN109377995A (en) * 2018-11-20 2019-02-22 珠海格力电器股份有限公司 Method and apparatus for controlling equipment
CN110827823A (en) * 2019-11-13 2020-02-21 联想(北京)有限公司 Voice auxiliary recognition method and device, storage medium and electronic equipment
CN111243585B (en) * 2020-01-07 2022-11-22 百度在线网络技术(北京)有限公司 Control method, device and equipment under multi-user scene and storage medium
CN111243585A (en) * 2020-01-07 2020-06-05 百度在线网络技术(北京)有限公司 Control method, device and equipment under multi-person scene and storage medium
CN113157080A (en) * 2020-01-07 2021-07-23 宝马股份公司 Instruction input method for vehicle, storage medium, system and vehicle
CN111554294A (en) * 2020-04-23 2020-08-18 苏州大学 Intelligent garbage classification method based on voice recognition
CN112927688A (en) * 2021-01-25 2021-06-08 思必驰科技股份有限公司 Voice interaction method and system for vehicle
CN114093354A (en) * 2021-10-26 2022-02-25 惠州市德赛西威智能交通技术研究院有限公司 Method and system for improving recognition accuracy of vehicle-mounted voice assistant
CN115243104A (en) * 2021-11-30 2022-10-25 广州汽车集团股份有限公司 Method and system for automatically adjusting vehicle-mounted multimedia volume
CN115083428B (en) * 2022-05-30 2023-05-30 湖南中周至尚信息技术有限公司 Voice model recognition device for news broadcasting assistance and control method thereof
CN115083428A (en) * 2022-05-30 2022-09-20 湖南中周至尚信息技术有限公司 Voice model recognition device for assisting news broadcasting and control method thereof
CN116028603A (en) * 2022-06-07 2023-04-28 成都成电金盘健康数据技术有限公司 Intelligent pre-consultation method, device and system based on big data, and storage medium
CN116028603B (en) * 2022-06-07 2023-12-19 成都成电金盘健康数据技术有限公司 Intelligent pre-consultation method, device and system based on big data, and storage medium

Similar Documents

Publication Publication Date Title
CN102324035A (en) Method and system of applying lip posture assisted speech recognition technique to vehicle navigation
JP4311190B2 (en) In-vehicle device interface
US9092394B2 (en) Depth based context identification
US8560313B2 (en) Transient noise rejection for speech recognition
CN102023703B (en) Combined lip reading and voice recognition multimodal interface system
US8639508B2 (en) User-specific confidence thresholds for speech recognition
US11854550B2 (en) Determining input for speech processing engine
Chan et al. Real-time lip tracking and bimodal continuous speech recognition
CN202329640U (en) System for applying auxiliary voice recognition technology by mouth shape in vehicular navigation
CN102097096B (en) Using pitch during speech recognition post-processing to improve recognition accuracy
CN109941231B (en) Vehicle-mounted terminal equipment, vehicle-mounted interaction system and interaction method
US9564120B2 (en) Speech adaptation in speech synthesis
JP6977004B2 (en) In-vehicle devices, methods and programs for processing vocalizations
CN112397065A (en) Voice interaction method and device, computer readable storage medium and electronic equipment
US9881609B2 (en) Gesture-based cues for an automatic speech recognition system
CN108382155B (en) Air conditioner voice control device with reminding function
US20110125500A1 (en) Automated distortion classification
US20150248881A1 (en) Dynamic speech system tuning
US20150255063A1 (en) Detecting vanity numbers using speech recognition
CN113129867A (en) Training method of voice recognition model, voice recognition method, device and equipment
KR20130046759A (en) Apparatus and method for recognizing driver command in a vehicle
Ivanko et al. DAVIS: Driver's Audio-Visual Speech recognition.
CN204926573U (en) Intelligent robot for assisting Mandarin practice
CN112863485A (en) Accent voice recognition method, apparatus, device and storage medium
CN111756986A (en) Camera control method, storage medium, device and electronic equipment with camera control device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication (application publication date: 2012-01-18)