CN105022470A - Method and device of terminal operation based on lip reading - Google Patents

Method and device of terminal operation based on lip reading Download PDF

Info

Publication number
CN105022470A
CN105022470A CN201410153736.2A
Authority
CN
China
Prior art keywords
lip
user
recognition result
sequence
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201410153736.2A
Other languages
Chinese (zh)
Inventor
尚国强 (Shang Guoqiang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp filed Critical ZTE Corp
Priority to CN201410153736.2A priority Critical patent/CN105022470A/en
Priority to PCT/CN2014/084557 priority patent/WO2015158082A1/en
Publication of CN105022470A publication Critical patent/CN105022470A/en
Withdrawn legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition

Abstract

The invention discloses a lip-reading-based terminal operation method and device in the technical field of multimedia communication. The method comprises the following steps: recognizing a user's lip motion and speech separately to obtain a lip-motion recognition result and a speech recognition result for the user; matching the obtained lip-motion recognition result against the speech recognition result to obtain a matching result; and operating the terminal according to the matching result. The method and device improve the accuracy of terminal operation, bring convenience to users, and improve the user experience.

Description

Lip-reading-based terminal operation method and device
Technical field
The present invention relates to the technical field of multimedia communication, and in particular to a lip-reading-based terminal operation method and device.
Background art
With the development of hardware and software technology, fields such as 2D imaging, 3D rendering, voice processing, and digital imaging have advanced greatly: images are ever clearer, imaging modules ever smaller, and devices such as mobile terminals are in wide use. The processors of mobile terminals have likewise grown more powerful and can readily handle high-resolution video and images as well as wideband voice. Advances in speech recognition have greatly expanded how a mobile terminal can be controlled: the user interacts with the terminal by voice, freeing both hands for other tasks.
The growth of voice technology on mobile terminals has pushed terminal manufacturers to keep exploring more convenient and more accurate human-computer interaction. Voice interaction on mobile terminals still has unresolved shortcomings, however: recognition accuracy declines in noisy environments, drops quickly when multiple sound sources are present, and falls off sharply with distance, to the point where a spoken command cannot be recognized at all.
To address these problems, the present invention provides a lip-reading-based terminal operation method and device.
Summary of the invention
The object of the present invention is to provide a lip-reading-based terminal operation method and device that solve the prior-art problem of low accuracy when a terminal is operated by speech recognition in a noisy environment or over a large distance.
According to one aspect of the present invention, a lip-reading-based terminal operation method is provided, comprising the following steps:
recognizing a user's lip motion and speech separately to obtain a lip-motion recognition result and a speech recognition result for the user;
matching the obtained lip-motion recognition result against the obtained speech recognition result to obtain a matching result; and
operating the terminal according to the matching result.
Preferably, recognizing the user's lip motion to obtain the user's lip-motion recognition result comprises:
acquiring a sequence of facial images of the user;
identifying the lip region in the acquired facial image sequence to obtain a lip feature sequence for the user;
matching the obtained lip feature sequence against standard lip sequence features prestored in the terminal to find the standard lip sequence feature that matches the user's lip feature sequence; and
taking the operating instruction corresponding to the matched standard lip sequence feature as the user's lip-motion recognition result.
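The matching step above can be pictured with a minimal sketch that is not part of the patent: a user lip feature sequence is compared with each prestored standard sequence by a simple distance, and the closest standard sequence within a threshold is taken as the match. The feature values, the `standards` table, the threshold, and the command names are all invented for illustration; a real implementation would use a proper sequence-alignment model rather than a fixed-length distance.

```python
def sequence_distance(a, b):
    """Mean squared difference between two equal-length feature sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def match_lip_sequence(user_seq, standard_seqs, threshold=0.5):
    """standard_seqs maps a command name to its stored standard sequence.
    Returns the best-matching command, or None when nothing is close enough."""
    best_cmd, best_dist = None, float("inf")
    for cmd, std_seq in standard_seqs.items():
        d = sequence_distance(user_seq, std_seq)
        if d < best_dist:
            best_cmd, best_dist = cmd, d
    return best_cmd if best_dist <= threshold else None

# Invented example data: two prestored standard sequences.
standards = {"call": [0.1, 0.8, 0.3], "hang_up": [0.9, 0.2, 0.7]}
```

For instance, `match_lip_sequence([0.12, 0.79, 0.31], standards)` returns `"call"`, while a sequence far from every standard returns `None`, corresponding to the "no match" case.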
Preferably, recognizing the user's speech to obtain the user's speech recognition result comprises:
performing speech recognition on the captured user speech to obtain a voice feature sequence for the user;
matching the obtained voice feature sequence against standard voice sequence features prestored in the terminal to find the standard voice feature sequence that matches the user's voice feature sequence; and
taking the operating instruction corresponding to the matched standard voice sequence feature as the user's speech recognition result.
Preferably, matching the obtained lip-motion recognition result against the speech recognition result to obtain the matching result comprises:
judging whether the obtained lip-motion recognition result and speech recognition result match;
when the two results match, taking the matched recognition result as the matching result; and
when the two results do not match, taking either the lip-motion recognition result or the speech recognition result as the matching result.
Preferably, taking the operating instruction corresponding to the matched standard lip sequence feature as the user's lip-motion recognition result comprises:
establishing a first mapping table from standard lip sequence features to operating instructions;
looking up, in the established first mapping table, the operating instruction corresponding to the standard lip sequence feature that matches the user's lip feature sequence; and
taking that operating instruction as the user's lip-motion recognition result.
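The "first mapping table" described above is, in essence, a lookup from a matched standard-sequence identifier to an operating instruction. A minimal sketch under invented names follows; the identifiers and instruction strings are not from the patent.

```python
# Invented identifiers and instructions, for illustration only.
FIRST_MAPPING_TABLE = {
    "lip_seq_001": "MAKE_CALL",
    "lip_seq_002": "OPEN_MESSAGES",
    "lip_seq_003": "TAKE_PHOTO",
}

def lip_motion_result(matched_seq_id):
    """The instruction mapped to the matched standard sequence becomes the
    lip-motion recognition result; an unknown id yields no result."""
    return FIRST_MAPPING_TABLE.get(matched_seq_id)
```

The second mapping table, from standard voice sequence features to operating instructions, would have the same shape.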
Preferably, taking the operating instruction corresponding to the matched standard voice sequence feature as the user's speech recognition result comprises:
establishing a second mapping table from standard voice sequence features to operating instructions;
looking up, in the established second mapping table, the operating instruction corresponding to the standard voice sequence feature that matches the user's voice feature sequence; and
taking that operating instruction as the user's speech recognition result.
According to a further aspect of the present invention, a lip-reading-based terminal operation device is provided, comprising:
a recognition module, configured to recognize a user's lip motion and speech separately and obtain a lip-motion recognition result and a speech recognition result for the user;
a matching module, configured to match the obtained lip-motion recognition result against the speech recognition result and obtain a matching result; and
an operation module, configured to operate the terminal according to the matching result.
Preferably, the recognition module comprises:
an acquisition unit, configured to acquire a sequence of facial images of the user, identify the lip region in the acquired sequence, and obtain a lip feature sequence for the user;
a lip-motion matching unit, configured to match the obtained lip feature sequence against standard lip sequence features prestored in the terminal and find the standard lip sequence feature that matches the user's lip feature sequence; and
a lip-motion result unit, configured to take the operating instruction corresponding to the matched standard lip sequence feature as the user's lip-motion recognition result.
Preferably, the recognition module further comprises:
a voice feature sequence unit, configured to perform speech recognition on the captured user speech and obtain a voice feature sequence for the user;
a voice matching unit, configured to match the obtained voice feature sequence against standard voice sequence features prestored in the terminal and find the standard voice feature sequence that matches the user's voice feature sequence; and
a speech result unit, configured to take the operating instruction corresponding to the matched standard voice sequence feature as the user's speech recognition result.
Preferably, the matching module comprises:
a judging unit, configured to judge whether the obtained lip-motion recognition result and speech recognition result match; and
an operating unit, configured to take the matched recognition result as the matching result when the two results match, and to take either the lip-motion recognition result or the speech recognition result as the matching result when they do not.
Compared with the prior art, the beneficial effect of the present invention is:
by operating the terminal through both speech recognition and lip recognition, the present invention improves the accuracy of terminal operation and brings convenience to the user.
Brief description of the drawings
Fig. 1 is a flowchart of the lip-reading-based terminal operation method provided by the present invention;
Fig. 2 is a schematic diagram of the lip-reading-based terminal operation device provided by the present invention;
Fig. 3 is a flowchart of the terminal operation method using lip reading alone, provided by the first embodiment of the present invention;
Fig. 4 is a flowchart of the terminal operation method using both lip reading and speech, provided by the second embodiment of the present invention;
Fig. 5 is a schematic diagram of the lip-reading-based terminal operation device provided by the third embodiment of the present invention.
Detailed description
Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be understood that the preferred embodiments described below are intended only to illustrate and explain the present invention, not to limit it.
Fig. 1 shows a flowchart of the lip-reading-based terminal operation method provided by the present invention. As shown in Fig. 1, the method comprises the following steps:
Step S101: recognize the user's lip motion and speech separately to obtain a lip-motion recognition result and a speech recognition result for the user.
Specifically, recognizing the user's lip motion to obtain the lip-motion recognition result comprises: acquiring a sequence of facial images of the user; identifying the lip region in the acquired sequence to obtain a lip feature sequence; matching the obtained lip feature sequence against standard lip sequence features prestored in the terminal to find the matching standard lip sequence feature; and taking the operating instruction corresponding to that feature as the lip-motion recognition result. Recognizing the user's speech to obtain the speech recognition result comprises: performing speech recognition on the captured user speech to obtain a voice feature sequence; matching it against standard voice sequence features prestored in the terminal to find the matching standard voice feature sequence; and taking the operating instruction corresponding to that feature as the speech recognition result. The facial image sequence may be acquired in several ways, for example from a camera, from a video file, or from another file type such as an animation sequence.
More particularly, taking the operating instruction corresponding to the matched standard lip sequence feature as the lip-motion recognition result comprises: establishing a first mapping table from standard lip sequence features to operating instructions; looking up, in that table, the operating instruction corresponding to the standard lip sequence feature that matches the user's lip feature sequence; and taking that instruction as the lip-motion recognition result. Likewise, taking the operating instruction corresponding to the matched standard voice sequence feature as the speech recognition result comprises: establishing a second mapping table from standard voice sequence features to operating instructions; looking up, in that table, the operating instruction corresponding to the matching standard voice sequence feature; and taking that instruction as the speech recognition result.
Step S102: match the obtained lip-motion recognition result against the speech recognition result to obtain a matching result.
Specifically, first judge whether the obtained lip-motion recognition result and speech recognition result match; then, if they match, take the matched recognition result as the matching result, and if they do not, take either the lip-motion recognition result or the speech recognition result as the matching result.
Step S103: operate the terminal according to the matching result.
Specifically, when the lip-motion recognition result and the speech recognition result match, the terminal is operated according to the matched result; when they do not match, the terminal is operated according to either the lip-motion recognition result or the speech recognition result.
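Steps S102 and S103 reduce to a small decision rule, sketched below with invented command strings. Which result to prefer on a mismatch is left configurable, since the text allows either choice.

```python
def operate_terminal(lip_result, speech_result, prefer="speech"):
    """Return the command to execute: the shared result when the two
    recognition results match, otherwise the preferred single result."""
    if lip_result == speech_result:
        return lip_result
    return speech_result if prefer == "speech" else lip_result
```

For example, when both recognizers yield `"call"` the terminal executes `"call"`; on a mismatch the configured preference decides.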
Fig. 2 shows a schematic diagram of the lip-reading-based terminal operation device provided by the present invention. As shown in Fig. 2, the device comprises: a recognition module 201, configured to recognize the user's lip motion and speech separately and obtain a lip-motion recognition result and a speech recognition result; a matching module 202, configured to match the obtained lip-motion recognition result against the speech recognition result and obtain a matching result; and an operation module 203, configured to operate the terminal according to the matching result.
Specifically, the recognition module 201 comprises: an acquisition unit, configured to acquire a sequence of facial images of the user, identify the lip region in the acquired sequence, and obtain a lip feature sequence; a lip-motion matching unit, configured to match the obtained lip feature sequence against standard lip sequence features prestored in the terminal and find the matching standard lip sequence feature; and a lip-motion result unit, configured to take the operating instruction corresponding to the matched feature as the lip-motion recognition result. It further comprises: a voice feature sequence unit, configured to perform speech recognition on the captured user speech and obtain a voice feature sequence; a voice matching unit, configured to match the obtained voice feature sequence against standard voice sequence features prestored in the terminal and find the matching standard voice feature sequence; and a speech result unit, configured to take the operating instruction corresponding to the matched feature as the speech recognition result.
The matching module 202 comprises: a judging unit, configured to judge whether the obtained lip-motion recognition result and speech recognition result match; and an operating unit, configured to take the matched recognition result as the matching result when the two results match, and to take either the lip-motion recognition result or the speech recognition result as the matching result when they do not.
Fig. 3 shows a flowchart of the terminal operation method using lip reading alone, provided by the first embodiment of the present invention. As shown in Fig. 3, the method comprises the following steps:
Step S301: acquire an image sequence.
The terminal starts the lip-reading application, acquires the corresponding images containing a face, and identifies the face region; that is, the image sequence contains the face region, or at least the lip region.
The image sequence may be acquired in several ways, for example from a camera, from a video file, or from another file type such as an animation sequence.
Step S302: identify the lip region in the acquired image sequence and obtain the user's lip feature sequence.
From the image sequence acquired in step S301, the lip region is identified: feature points that chiefly affect lip reading, such as the left and right lip corners, the upper-lip apex, the lowest point of the lower lip, and the lip contour lines, are identified and marked, forming a lip-region feature sequence.
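As an illustration of the feature points just named, the sketch below turns per-frame lip landmarks (left and right lip corners, upper-lip apex, lower-lip low point) into a simple per-frame shape descriptor and stacks the descriptors in time order. The landmark format and the two derived features are assumptions for illustration; a real system would obtain landmarks from a face tracker and use richer contour features.

```python
def frame_features(landmarks):
    """landmarks: dict of (x, y) points keyed 'left', 'right', 'upper',
    'lower'. Returns (mouth width, mouth opening) for that frame."""
    width = landmarks["right"][0] - landmarks["left"][0]
    opening = landmarks["lower"][1] - landmarks["upper"][1]
    return (width, opening)

def build_lip_sequence(frames):
    # One descriptor per frame, in time order: the lip-region
    # feature sequence fed to the recognition module.
    return [frame_features(f) for f in frames]
```

Stacking these per-frame descriptors in time order yields the kind of sequence that the matching step compares against prestored standards.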
Step S303: extract the lip-region feature sequence and pass it to the lip feature recognition module to obtain a recognition result.
The lip-region feature sequence is extracted and arranged in time order to form a lip feature motion sequence chart, on which model recognition is performed; that is, the lip-region feature sequence is extracted and exchanged with the lip feature recognition module to obtain the recognition result.
Step S304: match the recognition result to the corresponding command so as to operate the terminal device.
The recognition result is matched by the interaction command module and converted into the corresponding command; the terminal performs the corresponding operation in response, and one interaction is complete.
The following specific embodiment, in which lip-reading recognition directly controls the device, illustrates the present invention in detail:
Step 1: open the device camera and start the lip-reading application module.
Step 2: track the user's head and face through the interactive interface. The device first identifies the head and face, then, based on their attributes, locates the lips, identifies the lip region, and tracks the lip movement.
Step 3: extract the features of the lip movement to form a feature sequence R; this sequence is extracted as the input of the lip recognition module, which outputs a matching result S.
Step 4: match the result S against the human-computer interaction library. If a match exists, the corresponding device operation is performed, such as 'make a phone call'; if the match fails, the user is prompted that the command is invalid.
The description above applies the lip-reading result directly to terminal interaction; that is, the lip-reading result drives the interaction on its own. Lip reading can also complement voice interaction: when no speech recognition result can be obtained, the lip-reading result is converted into a voice match, and the matched result is applied to the terminal interaction. This is illustrated by the embodiment of Fig. 4:
Fig. 4 shows a flowchart of the terminal operation method using both lip reading and speech, provided by the second embodiment of the present invention. As shown in Fig. 4, the method comprises the following steps:
Step S401: start the device's voice feature recognition module, open the device camera, and start the lip-reading application module.
Besides performing speech recognition, the device can perform lip recognition at the same time.
Step S402: identify the lip region through the interactive interface and track the lip movement.
The user's head and face are tracked through the interactive interface and the head and face region is identified; based on its attributes, the lips are located, the lip region is identified, and the lip movement is tracked.
Step S403: extract the features of the lip movement.
The features of the lip movement are extracted to form a feature sequence R; a feature set R1 extracted from this sequence serves as the input of the lip recognition module, which outputs a matching result S.
Step S404: compare the speech recognition result with the lip recognition result and operate the terminal according to the comparison.
The speech recognition result and the lip recognition result are compared. If they are identical, the human-computer interaction executes the command. If they differ, the user is prompted to choose between the speech recognition result and the lip-reading result; alternatively, the user may configure the terminal to prefer speech commands and to use the lip recognition result only when speech recognition yields no result. Speech recognition and lip recognition can thus confirm and supplement each other, with the aim of obtaining a better recognition result.
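The confirm-or-fall-back behaviour of step S404 can be sketched as follows, with invented function and command names: execute on agreement, substitute lip reading when speech yields nothing, and otherwise signal a disagreement so the user can choose.

```python
def fuse_results(speech_result, lip_result, lip_fallback=True):
    """speech_result is None when speech recognition produced no result."""
    if speech_result is None:
        # Noisy environment or distant speaker: lip reading substitutes.
        return lip_result if lip_fallback else None
    if speech_result == lip_result:
        return speech_result  # the two results confirm each other
    return None               # disagreement: prompt the user to choose
```

A caller would treat a `None` return as "ask the user", matching the prompt behaviour described above.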
Fig. 5 shows a schematic diagram of the lip-reading-based terminal operation device provided by the third embodiment of the present invention. As shown in Fig. 5, the device comprises a lip-reading application module, a lip feature recognition module, a voice feature recognition module, and an interaction command module. The lip-reading application module is any application that uses the lip-reading output, such as a messaging application driven by lip reading. The lip feature recognition module is a shared module called by the lip-reading application module; it performs face recognition, lip recognition, lip feature extraction, and lip-language recognition. The voice feature recognition module implements speech recognition. The interaction command module is the application that realizes the human-computer interaction; it has interactive interfaces to the lip-reading application module, the lip feature recognition module, and the voice feature recognition module, receives inputs from these modules, and performs the corresponding interactive actions or produces the corresponding outputs.
In summary, the present invention provides a lip-reading-based human-computer interaction device, and in particular a lip-reading application for mobile terminals, covering matters such as the execution of interactive commands and the conversion of lip reading into speech. Lip reading can be used alone to control the terminal, or as an effective supplement to speech recognition control: when voice control fails to recognize a command in time, lip reading completes it.
In summary, the present invention has the following technical effect:
by making full use of terminal capabilities such as the front and rear cameras and high-definition image processing, the present invention puts lip reading to work, raises the ceiling of human-computer interaction, and extends existing interaction capabilities to a certain extent.
Although the present invention has been described above in detail, it is not limited thereto, and those skilled in the art can make various modifications according to the principle of the present invention. Therefore, all modifications made according to the principle of the present invention shall be understood to fall within the protection scope of the present invention.

Claims (10)

1. A lip-reading-based terminal operation method, characterized by comprising the following steps:
recognizing a user's lip motion and speech separately to obtain a lip-motion recognition result and a speech recognition result for the user;
matching the obtained lip-motion recognition result against the obtained speech recognition result to obtain a matching result; and
operating the terminal according to the matching result.
2. The method according to claim 1, characterized in that recognizing the user's lip motion to obtain the user's lip-motion recognition result comprises:
acquiring a sequence of facial images of the user;
identifying the lip region in the acquired facial image sequence to obtain a lip feature sequence for the user;
matching the obtained lip feature sequence against standard lip sequence features prestored in the terminal to find the standard lip sequence feature that matches the user's lip feature sequence; and
taking the operating instruction corresponding to the matched standard lip sequence feature as the user's lip-motion recognition result.
3. The method according to claim 1, characterized in that recognizing the user's speech to obtain the user's speech recognition result comprises:
performing speech recognition on the captured user speech to obtain a voice feature sequence for the user;
matching the obtained voice feature sequence against standard voice sequence features prestored in the terminal to find the standard voice feature sequence that matches the user's voice feature sequence; and
taking the operating instruction corresponding to the matched standard voice sequence feature as the user's speech recognition result.
4. The method according to claim 2 or 3, characterized in that matching the obtained lip-motion recognition result against the speech recognition result to obtain the matching result comprises:
judging whether the obtained lip-motion recognition result and speech recognition result match;
when the two results match, taking the matched recognition result as the matching result; and
when the two results do not match, taking either the lip-motion recognition result or the speech recognition result as the matching result.
5. The method according to claim 2, characterized in that taking the operating instruction corresponding to the matched standard lip sequence feature as the user's lip-motion recognition result comprises:
establishing a first mapping table from standard lip sequence features to operating instructions;
looking up, in the established first mapping table, the operating instruction corresponding to the standard lip sequence feature that matches the user's lip feature sequence; and
taking that operating instruction as the user's lip-motion recognition result.
6. The method according to claim 3, characterized in that taking the operating instruction corresponding to the matched standard voice sequence feature as the user's speech recognition result comprises:
establishing a second mapping table from standard voice sequence features to operating instructions;
looking up, in the established second mapping table, the operating instruction corresponding to the standard voice sequence feature that matches the user's voice feature sequence; and
taking that operating instruction as the user's speech recognition result.
7. A lip-reading-based terminal operation device, characterized by comprising:
a recognition module, configured to recognize a user's lip motion and speech separately and obtain a lip-motion recognition result and a speech recognition result for the user;
a matching module, configured to match the obtained lip-motion recognition result against the speech recognition result and obtain a matching result; and
an operation module, configured to operate the terminal according to the matching result.
8. The device according to claim 7, characterized in that the recognition module comprises:
an acquisition unit, configured to acquire a sequence of facial images of the user, identify the lip region in the acquired sequence, and obtain a lip feature sequence for the user;
a lip-motion matching unit, configured to match the obtained lip feature sequence against standard lip sequence features prestored in the terminal and find the standard lip sequence feature that matches the user's lip feature sequence; and
a lip-motion result unit, configured to take the operating instruction corresponding to the matched standard lip sequence feature as the user's lip-motion recognition result.
9. The device according to claim 7, characterized in that the recognition module further comprises:
a voice feature sequence unit, configured to perform speech recognition on the captured user speech and obtain a voice feature sequence for the user;
a voice matching unit, configured to match the obtained voice feature sequence against standard voice sequence features prestored in the terminal and find the standard voice feature sequence that matches the user's voice feature sequence; and
a speech result unit, configured to take the operating instruction corresponding to the matched standard voice sequence feature as the user's speech recognition result.
10. The device according to claim 8 or claim 9, characterized in that the matching module comprises:
a judging unit, configured to judge whether the obtained lip motion recognition result and voice recognition result of the user match;
an operating unit, configured to take the matched recognition result as the matching result when the lip motion recognition result matches the voice recognition result, and to take either the lip motion recognition result or the voice recognition result as the matching result when they do not match.
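The matching and operation flow of claims 7 and 10 reduces to a small decision function: when the two modalities agree, the agreed instruction is the matching result; when they differ, the claim allows either single result to be used. The sketch below prefers the lip motion result in the unmatched case, which is one choice the claim leaves open, and the dispatch table is illustrative.

```python
def matching_module(lip_result, voice_result):
    """Judging + operating units of claim 10: produce the matching
    result from the two recognition results."""
    if lip_result == voice_result:
        return lip_result                  # results match: use the agreed one
    return lip_result or voice_result      # unmatched: fall back to either

def operation_module(terminal_actions, matching_result):
    """Operate the terminal according to the matching result by
    dispatching the instruction through a lookup table of actions."""
    action = terminal_actions.get(matching_result)
    return action() if action else None
```

For example, if both recognizers return `"open_camera"`, the matching result is `"open_camera"` and the corresponding terminal action is executed.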
CN201410153736.2A 2014-04-17 2014-04-17 Method and device of terminal operation based on lip reading Withdrawn CN105022470A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201410153736.2A CN105022470A (en) 2014-04-17 2014-04-17 Method and device of terminal operation based on lip reading
PCT/CN2014/084557 WO2015158082A1 (en) 2014-04-17 2014-08-15 Lip-reading based terminal operation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410153736.2A CN105022470A (en) 2014-04-17 2014-04-17 Method and device of terminal operation based on lip reading

Publications (1)

Publication Number Publication Date
CN105022470A (en) 2015-11-04

Family

ID=54323443

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410153736.2A Withdrawn CN105022470A (en) 2014-04-17 2014-04-17 Method and device of terminal operation based on lip reading

Country Status (2)

Country Link
CN (1) CN105022470A (en)
WO (1) WO2015158082A1 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101472066A (en) * 2007-12-27 2009-07-01 华晶科技股份有限公司 Near-end control method of image viewfinding device and image viewfinding device applying the method
CN101510256A (en) * 2009-03-20 2009-08-19 深圳华为通信技术有限公司 Mouth shape language conversion method and device
CN102023703A (en) * 2009-09-22 2011-04-20 现代自动车株式会社 Combined lip reading and voice recognition multimodal interface system
CN102298443A (en) * 2011-06-24 2011-12-28 华南理工大学 Smart home voice control system combined with video channel and control method thereof
CN202110564U (en) * 2011-06-24 2012-01-11 华南理工大学 Intelligent household voice control system combined with video channel
CN102324035A (en) * 2011-08-19 2012-01-18 广东好帮手电子科技股份有限公司 Method and system of applying lip posture assisted speech recognition technique to vehicle navigation
CN102664008A (en) * 2012-04-27 2012-09-12 上海量明科技发展有限公司 Method, terminal and system for transmitting data
EP2562746A1 (en) * 2011-08-25 2013-02-27 Samsung Electronics Co., Ltd. Apparatus and method for recognizing voice by using lip image

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105632497A (en) * 2016-01-06 2016-06-01 昆山龙腾光电有限公司 Voice output method, voice output system
CN106250829A (en) * 2016-07-22 2016-12-21 中国科学院自动化研究所 Digit recognition method based on lip texture structure
CN108227904A (en) * 2016-12-21 2018-06-29 深圳市掌网科技股份有限公司 A kind of virtual reality language interactive system and method
WO2018113649A1 (en) * 2016-12-21 2018-06-28 深圳市掌网科技股份有限公司 Virtual reality language interaction system and method
WO2018113650A1 (en) * 2016-12-21 2018-06-28 深圳市掌网科技股份有限公司 Virtual reality language interaction system and method
CN108227903A (en) * 2016-12-21 2018-06-29 深圳市掌网科技股份有限公司 A kind of virtual reality language interactive system and method
CN108227903B (en) * 2016-12-21 2020-01-10 深圳市掌网科技股份有限公司 Virtual reality language interaction system and method
CN107293300A (en) * 2017-08-01 2017-10-24 珠海市魅族科技有限公司 Audio recognition method and device, computer installation and readable storage medium storing program for executing
CN108052858A (en) * 2017-10-30 2018-05-18 珠海格力电器股份有限公司 The control method and smoke exhaust ventilator of smoke exhaust ventilator
CN107839440A (en) * 2017-11-07 2018-03-27 蔡璟 A kind of vehicular air purifier based on Intelligent Recognition
CN107911614A (en) * 2017-12-25 2018-04-13 腾讯数码(天津)有限公司 A kind of image capturing method based on gesture, device and storage medium
CN107911614B (en) * 2017-12-25 2019-09-27 腾讯数码(天津)有限公司 A kind of image capturing method based on gesture, device and storage medium
CN111201786B (en) * 2018-01-17 2022-04-08 Jvc建伍株式会社 Display control device, communication device, display control method, and storage medium
CN111201786A (en) * 2018-01-17 2020-05-26 Jvc建伍株式会社 Display control device, communication device, display control method, and program
CN108521516A (en) * 2018-03-30 2018-09-11 百度在线网络技术(北京)有限公司 Control method and device for terminal device
CN108537207A (en) * 2018-04-24 2018-09-14 Oppo广东移动通信有限公司 Lip reading recognition methods, device, storage medium and mobile terminal
CN111176430A (en) * 2018-11-13 2020-05-19 奇酷互联网络科技(深圳)有限公司 Interaction method of intelligent terminal, intelligent terminal and storage medium
CN111176430B (en) * 2018-11-13 2023-10-13 奇酷互联网络科技(深圳)有限公司 Interaction method of intelligent terminal, intelligent terminal and storage medium
CN110570862A (en) * 2019-10-09 2019-12-13 三星电子(中国)研发中心 voice recognition method and intelligent voice engine device
WO2021196802A1 (en) * 2020-03-31 2021-10-07 科大讯飞股份有限公司 Method, apparatus, and device for training multimode voice recognition model, and storage medium
CN114708642A (en) * 2022-05-24 2022-07-05 成都锦城学院 Business English simulation training device, system, method and storage medium
CN114708642B (en) * 2022-05-24 2022-11-18 成都锦城学院 Business English simulation training device, system, method and storage medium

Also Published As

Publication number Publication date
WO2015158082A1 (en) 2015-10-22

Similar Documents

Publication Publication Date Title
CN105022470A (en) Method and device of terminal operation based on lip reading
CN110349081B (en) Image generation method and device, storage medium and electronic equipment
CN105979035A (en) AR image processing method and device as well as intelligent terminal
TW201805744A (en) Control system and control processing method and apparatus capable of directly controlling a device according to the collected information with a simple operation
US11048326B2 (en) Information processing system, information processing method, and program
WO2020078319A1 (en) Gesture-based manipulation method and terminal device
CN105518579A (en) Information processing device and information processing method
CN105635776B (en) Pseudo operation graphical interface remoting control method and system
CN106155315A (en) The adding method of augmented reality effect, device and mobile terminal in a kind of shooting
CN104306118A (en) Smartphone based family monitoring system on intelligent wheelchair
US10388325B1 (en) Non-disruptive NUI command
EP3343936A3 (en) Smart electronic device
CN110349232A (en) Generation method, device, storage medium and the electronic equipment of image
CN106446861A (en) Sign language recognition system, device and method
CN111522524B (en) Presentation control method and device based on conference robot, storage medium and terminal
CN111107278A (en) Image processing method and device, electronic equipment and readable storage medium
CN112241199B (en) Interaction method and device in virtual reality scene
US20150181161A1 (en) Information Processing Method And Information Processing Apparatus
CN207718803U (en) Multiple source speech differentiation identifying system
CN111104827A (en) Image processing method and device, electronic equipment and readable storage medium
CN104104899B (en) The method and apparatus that information transmits in video conference
CN102880288B (en) The method of the man-machine interaction of a kind of 3D display, device and equipment
CN105786361A (en) 3D vehicle-mounted terminal man-machine interaction system
CN110955331A (en) Human-computer interaction system based on computer virtual interface
CN110796096A (en) Training method, device, equipment and medium for gesture recognition model

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20151104