CN110221693A - A kind of intelligent retail terminal operating system based on human-computer interaction - Google Patents
- Publication number
- CN110221693A (application CN201910436106.9A)
- Authority
- CN
- China
- Prior art keywords: information, user, capture module, module, human
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/013—Eye tracking input arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/03—Arrangements for converting the position or the displacement of a member into a coded form
- G06F3/041—Digitisers, e.g. for touch screens or touch pads, characterised by the transducing means
- G06F3/0416—Control or interface arrangements specially adapted for digitisers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/18—Eye characteristics, e.g. of the iris
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Abstract
The invention discloses an intelligent retail terminal operating system based on human-computer interaction, relating to the field of human-computer interaction. The system comprises a motion capture module, a voice capture module, a video capture module, and a touch capture module. The user's intent is determined primarily by the currently active module, with the other modules playing a supporting role. The final intent decision is made jointly by all four modules, with weights assigned according to each module's contribution rate. In scenarios requiring face-to-face human-computer interaction, such as self-service retail terminals, bank self-service terminals, and insurance self-service terminals, the invention can recognize a customer's voice commands, gesture commands, or touch commands, with the modules cross-validating one another, which substantially improves recognition accuracy.
Description
Technical field
The invention discloses an intelligent retail terminal operating system based on human-computer interaction, relating to the field of human-computer interaction.
Background technique
Current human-computer interaction technologies recognize user intent from a single information channel, for example semantics-based audio recognition, pattern-segmentation-based video recognition, gesture-based recognition, and conventional touch-screen recognition. Apart from traditional touch recognition, these methods all suffer from low recognition accuracy, and the machine frequently misunderstands the user's intent. In particular, when the user is in a noisy environment, when different people speak alternately from the same sound-source position, or when other people are nearby during acquisition, the target user's intent cannot be effectively identified.
Summary of the invention
Technical problem: to solve the above problems, the invention discloses an intelligent retail terminal operating system based on human-computer interaction.
Technical solution: an intelligent retail terminal operating system based on human-computer interaction is disclosed, relating to the field of human-computer interaction. The system comprises a motion capture module, a voice capture module, a video capture module, and a touch capture module. The user's intent is determined primarily by the currently active module, with the other modules playing a supporting role. The final intent decision is made jointly by the four modules, with weights assigned according to each module's contribution rate.
In the intelligent retail terminal operating system based on human-computer interaction, the motion capture module captures user gestures and three-dimensional depth information through a depth camera. The depth camera may be a dedicated motion-sensing camera such as a Kinect or a Leap Motion. A gesture library built into the operating system can be matched against user gestures in real time to assess user intent. Gesture information can serve as the primary information for user-intent determination; it can also assist the voice capture module as a candidate action for comparison, assist the video capture module as a precise limb action, and assist the touch capture module as a precise position.
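The real-time matching of a captured gesture against the built-in gesture library can be pictured as a nearest-neighbour lookup over feature vectors. The sketch below is only illustrative: the library contents, feature dimensions, distance threshold, and function names are assumptions, not taken from the patent.

```python
import math

# Hypothetical built-in gesture library: gesture name -> feature vector
# (e.g. normalized joint positions from the depth camera).
GESTURE_LIBRARY = {
    "swipe_left":  [1.0, 0.0, 0.2],
    "swipe_right": [-1.0, 0.0, 0.2],
    "tap":         [0.0, -0.5, 0.9],
}

def match_gesture(captured, library=GESTURE_LIBRARY, threshold=1.0):
    """Return the library gesture closest to the captured feature vector,
    or None if nothing lies within the distance threshold."""
    best_name, best_dist = None, float("inf")
    for name, template in library.items():
        dist = math.dist(captured, template)  # Euclidean distance
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else None

print(match_gesture([0.9, 0.1, 0.25]))  # nearest template is "swipe_left"
print(match_gesture([5.0, 5.0, 5.0]))   # far from every template -> None
```

A production system would replace the toy vectors with learned gesture embeddings, but the control flow (match, then fall back to "no gesture") stays the same.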
In the intelligent retail terminal operating system based on human-computer interaction, the voice capture module collects audio information in real time through an audio sensor. The collected audio is denoised, and the main sound source is selectively enhanced. One or more primary operators are determined from the main sound source, and keywords are extracted from their speech. A sound bank built into the operating system can be matched against the user's keywords in real time to assess user intent. Keyword information in the speech can serve as the primary information for user-intent determination; it can also serve as a semantic cue assisting the motion capture module, the video capture module, and the touch capture module.
In the intelligent retail terminal operating system based on human-computer interaction, the video capture module collects video information in real time through an image sensor or camera. The collected video is denoised, and the key persons are selectively enhanced. One or more primary operators are determined from the key persons, and their faces and gestures are locked onto. A person-feature database is built into the operating system. Lip-shape information in the facial expression data can serve as the primary information for user-intent determination; together with other actions not in the library, it can serve as a movement-trend cue assisting the motion capture module, and as a lip-shape comparison assisting the voice capture module. Eye-tracking information in the facial expression data can serve as the primary information for user-intent determination, and can also serve as a gaze-focus cue assisting the touch capture module.
In the intelligent retail terminal operating system based on human-computer interaction, the touch capture module collects current or voltage information on the screen in real time through sensors on the touch screen. The collected electrical information is denoised, and the key persons are selectively enhanced. One or more primary operators are determined from the key persons, and the operating position is locked. The location information on the touch screen can serve as the primary information for user-intent determination. Compared with the other information capture modules, this module has the highest information accuracy, so it also has the highest priority.
In the intelligent retail terminal operating system based on human-computer interaction, the information for each user-intent determination comes from the four modules, but not every module can always determine the user's intent. Suppose n (n ≤ 4) modules can determine the user's intent. When a touch capture module is present, the primary information and the auxiliary information of the touch capture module each account for 40% of the weight, and the other modules share the remaining weight equally. When no touch module is present, the primary information accounts for 70%, and the other modules share the remaining weight equally.
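One plausible reading of this weighting rule can be sketched as follows. The function name, the treatment of the "primary" module, and the renormalization of the leftover share are assumptions for illustration; the patent only fixes the 40%/40% and 70% figures.

```python
def module_weights(modules, primary, touch_present):
    """Assign fusion weights to the modules able to judge user intent.

    modules: list of module names that can judge intent (n <= 4).
    primary: the currently dominant (primary-information) module.
    Interpretation sketched here: with a touch module, the primary
    information and the touch information each take 40% and the rest
    share the remaining 20%; without touch, the primary takes 70% and
    the rest share the remaining 30%.
    """
    weights = {}
    if touch_present and "touch" in modules:
        weights[primary] = 0.4
        weights["touch"] = 0.4
        rest = [m for m in modules if m not in (primary, "touch")]
        leftover = 1.0 - sum(weights.values())
        for m in rest:
            weights[m] = leftover / len(rest)
    else:
        weights[primary] = 0.7
        rest = [m for m in modules if m != primary]
        for m in rest:
            weights[m] = 0.3 / len(rest)
    return weights

w = module_weights(["motion", "voice", "video", "touch"],
                   primary="voice", touch_present=True)
# voice and touch get 0.4 each; motion and video split the rest
```

A fused intent score would then be the weighted sum of each module's per-intent confidence under these weights.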
In the intelligent retail terminal operating system based on human-computer interaction, a user making a selection often gives away more than one piece of information, including speech, micro-expressions, limb actions, and deliberate touches, all of which reflect the user's intent more clearly. Acquiring multiple channels of information simultaneously and cross-validating them therefore helps the system understand the user's selection more accurately.
Beneficial effects: the present invention has the following advantages:
1. Multiple channels of information are acquired simultaneously and cross-validated, which helps the system understand the user's selection more accurately.
2. Different weights and priority levels are set, which helps ensure that primary information is adopted and secondary information is discarded as far as possible.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly described below.
Fig. 1 is a top-view illustration of the overall structure of the invention;
Fig. 2 is a schematic diagram of simultaneous multichannel information acquisition and mutual validation.
Specific embodiment
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings.
Fig. 1 shows that the system includes a motion capture module, a voice capture module, a video capture module, and a touch capture module. The user's intent is determined primarily by the currently active module, with the other modules playing a supporting role. The final intent decision is made jointly by the four modules, with weights assigned according to each module's contribution rate.
In the intelligent retail terminal operating system based on human-computer interaction, the motion capture module captures user gestures and three-dimensional depth information through a depth camera. The depth camera may be a dedicated motion-sensing camera such as a Kinect or a Leap Motion. A gesture library built into the operating system can be matched against user gestures in real time to assess user intent. Gesture information can serve as the primary information for user-intent determination; it can also assist the voice capture module as a candidate action for comparison, assist the video capture module as a precise limb action, and assist the touch capture module as a precise position.
In the intelligent retail terminal operating system based on human-computer interaction, the voice capture module collects audio information in real time through an audio sensor. The collected audio is denoised, and the main sound source is selectively enhanced. One or more primary operators are determined from the main sound source, and keywords are extracted from their speech. A sound bank built into the operating system can be matched against the user's keywords in real time to assess user intent. Keyword information in the speech can serve as the primary information for user-intent determination; it can also serve as a semantic cue assisting the motion capture module, the video capture module, and the touch capture module.
In the intelligent retail terminal operating system based on human-computer interaction, the video capture module collects video information in real time through an image sensor or camera. The collected video is denoised, and the key persons are selectively enhanced. One or more primary operators are determined from the key persons, and their faces and gestures are locked onto. A person-feature database is built into the operating system. Lip-shape information in the facial expression data can serve as the primary information for user-intent determination; together with other actions not in the library, it can serve as a movement-trend cue assisting the motion capture module, and as a lip-shape comparison assisting the voice capture module. Eye-tracking information in the facial expression data can serve as the primary information for user-intent determination, and can also serve as a gaze-focus cue assisting the touch capture module.
In the intelligent retail terminal operating system based on human-computer interaction, the touch capture module collects current or voltage information on the screen in real time through sensors on the touch screen. The collected electrical information is denoised, and the key persons are selectively enhanced. One or more primary operators are determined from the key persons, and the operating position is locked. The location information on the touch screen can serve as the primary information for user-intent determination. Compared with the other information capture modules, this module has the highest information accuracy, so it also has the highest priority.
In the intelligent retail terminal operating system based on human-computer interaction, the information for each user-intent determination comes from the four modules, but not every module can always determine the user's intent. Suppose n (n ≤ 4) modules can determine the user's intent. When a touch capture module is present, the primary information and the auxiliary information of the touch capture module each account for 40% of the weight, and the other modules share the remaining weight equally. When no touch module is present, the primary information accounts for 70%, and the other modules share the remaining weight equally.
In the intelligent retail terminal operating system based on human-computer interaction, a user making a selection often gives away more than one piece of information, including speech, micro-expressions, limb actions, and deliberate touches, all of which reflect the user's intent more clearly. Acquiring multiple channels of information simultaneously and cross-validating them therefore helps the system understand the user's selection more accurately.
In particular, the present embodiment also provides the following functional modules.
One, audio processing module. The audio processing module includes a speech preprocessing module and a speech recognition module.
1. Depending on the audio input device, the speech preprocessing module includes, but is not limited to, sound-source localization, sound-source enhancement, echo cancellation, and noise suppression, which improve the accuracy of distinguishing ambient sound from speech.
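As a toy illustration of separating speech from ambient sound, the sketch below gates audio frames against an estimated noise floor. Real preprocessing uses spectral subtraction, echo cancellation, or beamforming; the function, frame format, and `ratio` parameter here are hypothetical.

```python
def energy_gate(frames, noise_floor=None, ratio=3.0):
    """Mark frames as speech when their mean energy is well above the
    noise floor (estimated from the quietest frame if not given)."""
    energies = [sum(s * s for s in f) / len(f) for f in frames]
    if noise_floor is None:
        noise_floor = min(energies) or 1e-12  # avoid a zero floor
    return [e > ratio * noise_floor for e in energies]

quiet = [0.01, -0.01, 0.02, -0.02]   # low-amplitude ambient frame
loud = [0.5, -0.6, 0.55, -0.4]       # high-amplitude speech-like frame
flags = energy_gate([quiet, loud, quiet])
# only the middle frame passes the gate
```

A deployed terminal would apply such a gate per microphone channel before handing audio to the recognizer, so that silence and background noise never reach the cloud or device-side recognition described below.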
2. Depending on the deployment scenario of the specific product, the speech recognition module includes, but is not limited to, a cloud speech recognition system, a device-side speech recognition apparatus, or a speech recognition algorithm.
Two, image processing module. The image processing module includes a face recognition module, a lip-state recognition module, and a lip-reading recognition module.
1. The face recognition module includes, but is not limited to, a cloud face recognition system, a device-side face recognition module, and a face recognition algorithm. It mainly implements face detection, facial feature extraction and comparison, and calibration of facial feature points and of the positions and contours of the eyes, nose, and mouth.
2. The lip-state recognition module includes, but is not limited to, a cloud lip-state recognition system, a device-side lip-state recognition module, and a lip-state recognition algorithm; the algorithm may be a state recognition algorithm implemented with common classifier models such as HAAR+Cascade, HOG+SVM, or VGG, AlexNet, Inception, and ResNet. It mainly uses the information provided by the face recognition module to judge the lip state of a specified current face.
3. The lip-reading recognition module includes, but is not limited to, a lip-reading recognition system, a lip-reading recognition module, and a lip-reading recognition algorithm; the algorithm mainly uses deep-learning models for time-series recognition, such as RNN+LSTM. From the lip states in the input continuous video, it outputs the speaker's lip-reading result and the text of the spoken content.
Three, speech synthesis module. The speech synthesis module mainly includes a speech-end judgment module and a speech recognition correction module.
1. The speech-end judgment module includes, but is not limited to, a speech-end judgment system, a speech-end judgment module, and a speech-end judgment algorithm. The algorithm uses common deep-learning classification and recognition models such as VGG, AlexNet, Inception, and ResNet, adding an audio-fragment-sequence input on top of the traditional image input layer. The judgment is made from the lip-reading state recognized in the current video together with the most recent audio segment.
2. The speech recognition correction module includes, but is not limited to, a speech recognition correction system, a speech recognition correction module, and a speech recognition correction algorithm. The algorithm uses deep-learning models for time-series recognition, such as RNN+LSTM; the model's input features are the text sequence of the speech recognition result together with the corresponding lip-reading recognition sequence and lip state, and its output is the corrected speech text. The input lip-reading recognition result is compared with the speech recognition result, and the corresponding speech recognition correction is carried out, mainly by deep-learning methods.
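The comparison step of the correction module can be illustrated with a much simpler rule-based stand-in: where the two channels disagree and the recognizer's confidence is low, trust the lip-reading result. The word-level alignment, confidence values, and threshold below are assumptions; the patent itself uses an RNN/LSTM model for this step.

```python
def correct_transcript(asr_words, lip_words, confidences, threshold=0.6):
    """Replace low-confidence ASR words with the aligned lip-reading word.

    asr_words, lip_words: word sequences assumed pre-aligned one-to-one.
    confidences: per-word ASR confidence in [0, 1].
    """
    corrected = []
    for asr, lip, conf in zip(asr_words, lip_words, confidences):
        if asr != lip and conf < threshold:
            corrected.append(lip)   # channels disagree, ASR unsure
        else:
            corrected.append(asr)   # agree, or ASR is confident
    return corrected

out = correct_transcript(
    ["buy", "too", "bottles"], ["buy", "two", "bottles"], [0.9, 0.4, 0.8])
# the low-confidence "too" is replaced by the lip-reading "two"
```

The learned model in the patent plays the same role as this rule, but makes the substitution decision jointly over the whole sequence rather than word by word.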
In this embodiment, a video signal is fed into the traditional speech recognition process and recognized together with the speech signal. Face recognition and recognition of facial and lip movements assist the speech channel by judging whether the target to be recognized is speaking. At the same time, face recognition and auxiliary positioning determine the speaker's direction, and the sound-source signal from that direction is enhanced. In particular scenarios requiring face-to-face human-computer interaction, such as self-service retail terminals, bank self-service terminals, and insurance self-service terminals, this technique can effectively improve the accuracy of recognizing customer voice commands and voice input information.
Claims (7)
1. An intelligent retail terminal operating system based on human-computer interaction, relating to the field of human-computer interaction, the system comprising a motion capture module, a voice capture module, a video capture module, and a touch capture module; the user's intent is determined primarily by the currently active module, with the other modules playing a supporting role; the final intent decision is made jointly by the four modules, with weights assigned according to each module's contribution rate.
2. The intelligent retail terminal operating system based on human-computer interaction according to claim 1, characterized in that: the motion capture module captures user gestures and three-dimensional depth information through a depth camera; the depth camera may be a dedicated motion-sensing camera such as a Kinect or a Leap Motion; a gesture library built into the operating system can be matched against user gestures in real time to assess user intent; gesture information can serve as the primary information for user-intent determination, can assist the voice capture module as a candidate action for comparison, can assist the video capture module as a precise limb action, and can assist the touch capture module as a precise position.
3. The intelligent retail terminal operating system based on human-computer interaction according to claim 1, characterized in that: the voice capture module collects audio information in real time through an audio sensor; the collected audio is denoised, and the main sound source is selectively enhanced; one or more primary operators are determined from the main sound source, and keywords are extracted from their speech; a sound bank built into the operating system can be matched against the user's keywords in real time to assess user intent; keyword information in the speech can serve as the primary information for user-intent determination, and can also serve as a semantic cue assisting the motion capture module, the video capture module, and the touch capture module.
4. The intelligent retail terminal operating system based on human-computer interaction according to claim 1, characterized in that: the video capture module collects video information in real time through an image sensor or camera; the collected video is denoised, and the key persons are selectively enhanced; one or more primary operators are determined from the key persons, and their faces and gestures are locked onto; a person-feature database is built into the operating system; lip-shape information in the facial expression data can serve as the primary information for user-intent determination, can serve together with other actions not in the library as a movement-trend cue assisting the motion capture module, and can serve as a lip-shape comparison assisting the voice capture module; eye-tracking information in the facial expression data can serve as the primary information for user-intent determination, and can also serve as a gaze-focus cue assisting the touch capture module.
5. The intelligent retail terminal operating system based on human-computer interaction according to claim 1, characterized in that: the touch capture module collects current or voltage information on the screen in real time through sensors on the touch screen; the collected electrical information is denoised, and the key persons are selectively enhanced; one or more primary operators are determined from the key persons, and the operating position is locked; the location information on the touch screen can serve as the primary information for user-intent determination; compared with the other information capture modules, this module has the highest information accuracy and therefore the highest priority.
6. The intelligent retail terminal operating system based on human-computer interaction according to claim 1, characterized in that: the information for each user-intent determination comes from the four modules, but not every module can always determine the user's intent; suppose n (n ≤ 4) modules can determine the user's intent; when a touch capture module is present, the primary information and the auxiliary information of the touch capture module each account for 40% of the weight, and the other modules share the remaining weight equally; when no touch module is present, the primary information accounts for 70%, and the other modules share the remaining weight equally.
7. The intelligent retail terminal operating system based on human-computer interaction according to claim 1, characterized in that: a user making a selection often gives away more than one piece of information, including speech, micro-expressions, limb actions, and deliberate touches, all of which reflect the user's intent more clearly; multiple channels of information are therefore acquired simultaneously and cross-validated, which helps the system understand the user's selection more accurately.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910436106.9A CN110221693A (en) | 2019-05-23 | 2019-05-23 | A kind of intelligent retail terminal operating system based on human-computer interaction |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910436106.9A CN110221693A (en) | 2019-05-23 | 2019-05-23 | A kind of intelligent retail terminal operating system based on human-computer interaction |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110221693A true CN110221693A (en) | 2019-09-10 |
Family
ID=67817887
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910436106.9A Pending CN110221693A (en) | 2019-05-23 | 2019-05-23 | A kind of intelligent retail terminal operating system based on human-computer interaction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110221693A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111080927A (en) * | 2019-12-17 | 2020-04-28 | 中国建设银行股份有限公司 | Software architecture method and device for closed self-service financial equipment |
CN111968628A (en) * | 2020-08-22 | 2020-11-20 | 彭玲玲 | Signal accuracy adjusting system and method for voice instruction capture |
CN114781401A (en) * | 2022-05-06 | 2022-07-22 | 马上消费金融股份有限公司 | Data processing method, device, equipment and storage medium |
WO2022252951A1 (en) * | 2021-06-02 | 2022-12-08 | International Business Machines Corporation | Curiosity based activation and search depth |
- 2019-05-23 CN CN201910436106.9A patent/CN110221693A/en active Pending
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110302538A1 (en) * | 2010-06-03 | 2011-12-08 | Vennelakanti Ramadevi | System and method for distinguishing multimodal commands directed at a machine from ambient human communications |
CN202110564U (en) * | 2011-06-24 | 2012-01-11 | South China University of Technology | Intelligent household voice control system combined with video channel |
CN105739688A (en) * | 2016-01-21 | 2016-07-06 | Beijing Guangnian Wuxian Science and Technology Co., Ltd. | Man-machine interaction method and device based on emotion system, and man-machine interaction system |
CN107239139A (en) * | 2017-05-18 | 2017-10-10 | Liu Guohua | Human-computer interaction method and system based on the user directly facing the device |
CN107728780A (en) * | 2017-09-18 | 2018-02-23 | Beijing Guangnian Wuxian Science and Technology Co., Ltd. | Man-machine interaction method and device based on a virtual robot |
CN109410957A (en) * | 2018-11-30 | 2019-03-01 | Fujian Start Computer Equipment Co., Ltd. | Active human-computer interaction speech recognition method and system assisted by computer vision |
Non-Patent Citations (1)
Title |
---|
Chen Yuetong, Zhao Haifeng, Chen Jie: "An Integrated Human-Computer Interaction Framework Fusing Speech, Motion Sensing, and Touch", Proceedings of the 4th China Command and Control Conference * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111080927A (en) * | 2019-12-17 | 2020-04-28 | 中国建设银行股份有限公司 | Software architecture method and device for closed self-service financial equipment |
CN111968628A (en) * | 2020-08-22 | 2020-11-20 | Peng Lingling | Signal accuracy adjusting system and method for voice instruction capture |
CN111968628B (en) * | 2020-08-22 | 2021-06-25 | Nanjing Silicon Intelligence Technology Co., Ltd. | Signal accuracy adjusting system and method for voice instruction capture |
WO2022252951A1 (en) * | 2021-06-02 | 2022-12-08 | International Business Machines Corporation | Curiosity based activation and search depth |
CN114781401A (en) * | 2022-05-06 | 2022-07-22 | 马上消费金融股份有限公司 | Data processing method, device, equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110221693A (en) | A kind of intelligent retail terminal operating system based on human-computer interaction | |
US10776470B2 (en) | Verifying identity based on facial dynamics | |
US11854550B2 (en) | Determining input for speech processing engine | |
Sahoo et al. | Emotion recognition from audio-visual data using rule based decision level fusion | |
CN105739688A (en) | Man-machine interaction method and device based on emotion system, and man-machine interaction system | |
KR101214732B1 (en) | Apparatus and method for recognizing faces using a plurality of face images | |
Minotto et al. | Multimodal multi-channel on-line speaker diarization using sensor fusion through SVM | |
JP2012038131A (en) | Information processing unit, information processing method, and program | |
Paleari et al. | Features for multimodal emotion recognition: An extensive study | |
Alshamsi et al. | Automated facial expression and speech emotion recognition app development on smart phones using cloud computing | |
US20210012064A1 (en) | Recording medium recording complementary program, complementary method, and information processing device | |
Cai et al. | Visual focus of attention estimation using eye center localization | |
Dalka et al. | Visual lip contour detection for the purpose of speech recognition | |
Nath et al. | Embedded sign language interpreter system for deaf and dumb people | |
JP2019154575A (en) | Individual identification device and feature collection device | |
JP7032284B2 (en) | A device, program and method for estimating the activation timing based on the image of the user's face. | |
Nakata et al. | Lipreading method using color extraction method and eigenspace technique | |
Zheng et al. | Review of lip-reading recognition | |
Suganya et al. | Design Of a Communication aid for physically challenged | |
Sithara et al. | A survey on face recognition technique | |
Ouellet et al. | Multimodal biometric identification system for mobile robots combining human metrology to face recognition and speaker identification | |
Ivanko et al. | A novel task-oriented approach toward automated lip-reading system implementation | |
Sahu et al. | Result based analysis of various lip tracking systems | |
KR20170090872A (en) | Apparatus and Method for Recognizing User using Expression and Motion | |
Tang et al. | Multimodal emotion recognition (MER) system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190910 |