CN106547356A - Intelligent interactive method and device - Google Patents
Intelligent interactive method and device
- Publication number
- CN106547356A CN106547356A CN201611025898.3A CN201611025898A CN106547356A CN 106547356 A CN106547356 A CN 106547356A CN 201611025898 A CN201611025898 A CN 201611025898A CN 106547356 A CN106547356 A CN 106547356A
- Authority
- CN
- China
- Prior art keywords
- user
- image
- hand motion
- hand
- region
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Social Psychology (AREA)
- Psychiatry (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- User Interface Of Digital Computer (AREA)
- Image Analysis (AREA)
Abstract
The present application proposes an intelligent interaction method and device. The intelligent interaction method includes: obtaining a user hand motion image, where the user hand motion image is obtained by photographing a user hand motion; determining, according to the user hand motion image, an operation category corresponding to the user hand motion; and responding to the user hand motion according to the operation category. The method enables natural and efficient human-computer interaction.
Description
Technical field
The present application relates to the field of human-computer interaction technology, and in particular to an intelligent interaction method and device.
Background technology
As artificial intelligence technology matures, daily life is becoming increasingly intelligent: smart home devices have entered ordinary households, and augmented reality equipment is approaching practical use, so human-computer interaction is becoming a daily necessity. During human-computer interaction, what users care about most is whether they can interact with a machine naturally, ideally to a degree approaching interaction with another person. Therefore, more and more engineers are studying how to achieve natural and efficient human-computer interaction.
In the related art, when a user interacts with a machine using a hand, the user must first wear a recording device on the hand, such as a stylus or a handwriting finger sleeve; the recording device then collects two-dimensional or three-dimensional coordinate data of the user's hand motion, and the collected hand data are used to recognize the hand action or the movement trajectory of the hand, so as to determine the user's operation, to which the system provides a corresponding response.
However, this interaction mode does not conform to natural interaction habits, and the collected data are prone to inaccuracy, so the interaction effect is unsatisfactory.
Content of the invention
The present application aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, one purpose of the present application is to propose an intelligent interaction method that enables natural and efficient human-computer interaction.
Another purpose of the present application is to propose an intelligent interaction device.
To achieve the above purposes, the intelligent interaction method proposed by the embodiment of the first aspect of the present application includes: obtaining a user hand motion image, where the user hand motion image is obtained by photographing a user hand motion; determining, according to the user hand motion image, an operation category corresponding to the user hand motion; and responding to the user hand motion according to the operation category.
The intelligent interaction method proposed by the embodiment of the first aspect of the present application determines the operation category from the user hand motion image and responds accordingly, so the user does not need to wear special equipment on the hand, which conforms to natural interaction habits; in addition, processing the images improves the accuracy of the collected data, thereby achieving natural and efficient human-computer interaction.
To achieve the above purposes, the intelligent interaction device proposed by the embodiment of the second aspect of the present application includes: an acquisition module, configured to obtain a user hand motion image, where the user hand motion image is obtained by photographing a user hand motion; a determining module, configured to determine, according to the user hand motion image, an operation category corresponding to the user hand motion; and a response module, configured to respond to the user hand motion according to the operation category.
The intelligent interaction device proposed by the embodiment of the second aspect of the present application determines the operation category from the user hand motion image and responds accordingly, so the user does not need to wear special equipment on the hand, which conforms to natural interaction habits; in addition, processing the images improves the accuracy of the collected data, thereby achieving natural and efficient human-computer interaction.
Additional aspects and advantages of the present application will be set forth in part in the following description, will in part become apparent from the description, or will be learned by practice of the present application.
Description of the drawings
The above and/or additional aspects and advantages of the present application will become apparent and readily understood from the following description of the embodiments with reference to the accompanying drawings, in which:
Fig. 1 is a schematic flowchart of an intelligent interaction method proposed by one embodiment of the present application;
Fig. 2 is a schematic flowchart of an intelligent interaction method proposed by another embodiment of the present application;
Fig. 3 is a schematic diagram of a network topology of a user hand motion recognition model in an embodiment of the present application;
Fig. 4 is a schematic diagram of a group of user hand motion images in an embodiment of the present application;
Fig. 5 is a schematic diagram of an image showing skin regions in an embodiment of the present application;
Fig. 6 is a schematic diagram of an image showing a hand region in an embodiment of the present application;
Fig. 7 is a schematic diagram of the user hand motions corresponding to a click operation in an embodiment of the present application;
Fig. 8 is a schematic diagram of the user hand motion corresponding to a text selection operation in an embodiment of the present application;
Fig. 9 is a schematic diagram of the user hand motions corresponding to several other operations in an embodiment of the present application;
Fig. 10 is a schematic structural diagram of an intelligent interaction device proposed by one embodiment of the present application;
Fig. 11 is a schematic structural diagram of an intelligent interaction device proposed by another embodiment of the present application.
Specific embodiment
Embodiments of the present application are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar modules or modules having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are only used to explain the present application, and are not to be construed as limiting the present application. On the contrary, the embodiments of the present application include all changes, modifications and equivalents falling within the spirit and scope of the appended claims.
Fig. 1 is a schematic flowchart of an intelligent interaction method proposed by one embodiment of the present application.
As shown in Fig. 1, the method of this embodiment includes:
S11: Obtain a user hand motion image, where the user hand motion image is obtained by photographing a user hand motion.
A user hand motion refers to a hand action composed of the trajectory of the user's hand movement, finger activity, and the like. The hand motion is generally used to operate content displayed on a screen or in the air, such as text, images or application programs; the specific display content is not limited in this application.
The hand motion may be a single-hand motion of the user, or a motion of both hands or even multiple hands; when a multi-hand motion is received, multiple users participate in the interaction. Examples of hand motions include making a fist, opening the palm, and extending the index finger. It should be noted that a user hand motion may be performed with one hand, two hands or even more hands, the same operation may be completed with one or more hand motions, and the specific hand motions may be defined according to application requirements; they are not limited to the hand motions described in this application.
For example, a camera or video camera is arranged on a smart device, the user hand motion is photographed by the camera to obtain the user hand motion image, and the processing system obtains the user hand motion image from the camera. When photographing the user hand motion, one frame or multiple frames may be captured; for example, after the camera continuously photographs the user's hand, continuous multi-frame user hand motion images are obtained. In practice, a camera with an RGBD sensor is generally used, which directly provides RGBD data of the user hand motion, i.e., color data (RGB) and depth data (D), so that an RGB image and a depth map of the user hand motion are obtained directly. The RGB color model is an industry color standard in which a wide range of colors is obtained by varying and superimposing the three color channels red (R), green (G) and blue (B); it covers almost all colors perceivable by human vision and is one of the most widely used color systems. The distance of each point in the scene relative to the camera can be represented by a depth map, i.e., each pixel value in the depth map represents the distance between a certain point in the scene and the camera.
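For illustration, the following is a minimal sketch of this capture step. It assumes an Intel RealSense camera and the pyrealsense2 library as the RGBD source; the patent does not name a specific sensor, so these are stand-ins only.

```python
# Hypothetical RGBD capture using an Intel RealSense camera via
# pyrealsense2 -- one possible stand-in for the RGBD camera the patent assumes.
import numpy as np
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
# Request matching color (RGB) and depth (D) streams, as in step S11.
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
pipeline.start(config)

try:
    frames = pipeline.wait_for_frames()
    color_frame = frames.get_color_frame()
    depth_frame = frames.get_depth_frame()
    # One "group" of user hand motion images: a color frame plus the
    # corresponding depth map (each depth pixel encodes camera distance).
    color_image = np.asanyarray(color_frame.get_data())   # HxWx3, BGR order
    depth_image = np.asanyarray(depth_frame.get_data())   # HxW, uint16 depth
finally:
    pipeline.stop()
```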
It should be noted that the user can perform the hand motion with a bare hand, i.e., without wearing a special recording device. Moreover, the user operates the content displayed on the screen or in the air in a contactless manner, i.e., the operation on the display content can be completed without the user touching the screen.
It can be understood that, although the embodiments of the present application take hand motions as an example, operations performed by other body parts of the user, such as head motions and arm motions, can also be handled in the same way as hand motions and therefore belong to equivalent implementations of the embodiments of the present application.
S12: Determine, according to the user hand motion image, the operation category corresponding to the user hand motion.
The operation category corresponding to the user hand motion refers to the category of the operation the user performs on the content displayed on the screen or in the air, such as moving a cursor, grabbing content, dragging content, releasing content, handwriting, or clicking.
When determining the operation category, for example, the user hand motion category and the hand key point position are first recognized from the image, and the operation category is then determined from the recognized user hand motion category and key point position. Details are given in the subsequent description.
S13: Respond to the user hand motion according to the operation category.
The system responds according to a preset response mode for each operation category.
For example, when the current operation category is a handwriting operation, after determining the operation category the system switches to handwriting mode, receives the user's handwritten content, performs handwriting recognition, and displays the recognition result.
When the current operation category is a click, after determining the operation category the system provides the response result corresponding to the gesture; for example, if the user clicks an application program displayed on the screen or in the air, the corresponding operation is executed.
It can be understood that, before responding to the user hand motion according to the operation category, the system may further judge whether a preset condition is satisfied: the user hand motion is responded to according to the operation category only when the preset condition is satisfied, and no response is given otherwise. The preset condition may include, for example, that the function of responding to user hand motions is currently enabled, and that the operation category belongs to the categories the system supports responding to.
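As a concrete illustration of the dispatch in S13 and the preset-condition check above, the following is a minimal sketch; the operation category names and handler functions are illustrative assumptions, not defined in the patent.

```python
# Sketch of per-category response dispatch guarded by preset conditions.
# Category names ("handwrite", "click") and handlers are illustrative.
from typing import Callable, Dict

def enter_handwriting_mode() -> None:
    print("switched to handwriting mode; awaiting strokes")

def perform_click() -> None:
    print("executing click on the targeted display content")

RESPONSES: Dict[str, Callable[[], None]] = {
    "handwrite": enter_handwriting_mode,
    "click": perform_click,
}

def respond(operation: str, responses_enabled: bool = True) -> None:
    # Preset conditions: responding must be enabled and the category supported.
    if not responses_enabled or operation not in RESPONSES:
        return  # no response is given, as the description specifies
    RESPONSES[operation]()

respond("click")  # -> executing click on the targeted display content
```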
In this embodiment, the operation category is determined from the user hand motion image and a corresponding response is given according to the operation category, so the user does not need to wear special equipment on the hand, which conforms to natural interaction habits; processing the images also improves the accuracy of the collected data, thereby achieving natural and efficient human-computer interaction.
Fig. 2 is a schematic flowchart of an intelligent interaction method proposed by another embodiment of the present application.
As shown in Fig. 2, the method of this embodiment includes:
S21: Build a user hand motion recognition model.
This may specifically include:
(1) Obtain training data.
Each group of training data includes input data and output data. In this embodiment, the input data include an RGB image of the user hand region and the corresponding depth image of the user hand region, and the output data include the annotated user hand motion category and key point position, generally annotated by domain experts.
Specifically, a large number of user hand motion images are first collected, for example by photographing user hand motions with a camera equipped with an RGBD sensor, yielding a large number of mutually corresponding RGB images and depth images. The RGB image and depth image in each group of user hand motion images are then segmented to obtain the RGB image and depth image of the user hand region; the specific segmentation method is described below. In addition, each group of user hand motion images is annotated with the user hand motion category and the key point position.
The user hand motion category is determined from the movement of the user hand region or the activity of the fingers; examples include making a fist, opening the palm, and extending the index finger. It can be understood that the hand motion categories can be preset according to application requirements and are not limited to these examples. The key point positions can likewise be chosen according to application requirements; for example, the center point of the fist, or the position of the index finger, may serve as the key point position.
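As a sketch of how one group of training data could be represented under the description above, the following uses illustrative field names and an example category set; none of these identifiers come from the patent.

```python
# One training sample as described: segmented RGB and depth crops of the
# hand region as input, expert-annotated category and key point as output.
from dataclasses import dataclass
import numpy as np

MOTION_CATEGORIES = ["fist", "open_palm", "index_extended"]  # example set

@dataclass
class TrainingSample:
    hand_rgb: np.ndarray     # HxWx3 RGB crop of the user hand region
    hand_depth: np.ndarray   # HxW depth crop of the same region
    category: int            # index into MOTION_CATEGORIES (annotated)
    keypoint: tuple          # (x, y) annotated key point, e.g. fist center

sample = TrainingSample(
    hand_rgb=np.zeros((64, 64, 3), dtype=np.uint8),
    hand_depth=np.zeros((64, 64), dtype=np.uint16),
    category=MOTION_CATEGORIES.index("fist"),
    keypoint=(32, 30),
)
```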
(2) Determine the structure of the user hand motion recognition model.
The model structure can be set according to requirements; this embodiment takes a deep neural network structure as an example.
Fig. 3 gives a schematic diagram of a network topology for the user hand motion recognition model. As shown in Fig. 3, the model includes an input layer, a feature transformation layer, a fully connected layer and an output layer.
The input layer receives the RGB image of the user hand region and the corresponding depth image. The feature transformation layer transforms the input RGB image and depth image separately, producing the transformed image features and depth features of the user hand region; this layer is generally a convolutional neural network, and each of its layers transforms its input in the same way as the layers of a convolutional neural network. The transformed image features and depth features of the user hand region are then passed through a fully connected layer into the output layer, which outputs the probability that the current user hand motion image belongs to each hand motion category, together with the key point position of the current user hand motion.
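The following is a minimal PyTorch sketch of the Fig. 3 topology (the patent does not name a framework, and all layer sizes are illustrative assumptions): two convolutional feature-transformation branches for the RGB and depth inputs, a shared fully connected layer, and an output layer with a category head and a key point head.

```python
# Minimal PyTorch sketch of the Fig. 3 topology. Layer sizes are
# illustrative; the patent fixes only the overall structure.
import torch
import torch.nn as nn

class HandMotionNet(nn.Module):
    def __init__(self, num_categories: int = 3):
        super().__init__()
        def branch(in_ch):  # feature transformation layer (CNN), per modality
            return nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Flatten(),
            )
        self.rgb_branch = branch(3)    # RGB image of the hand region
        self.depth_branch = branch(1)  # depth image of the hand region
        feat_dim = 32 * 16 * 16 * 2    # for 64x64 inputs, both branches
        self.fc = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU())
        self.category_head = nn.Linear(128, num_categories)  # class scores
        self.keypoint_head = nn.Linear(128, 2)               # (x, y) position

    def forward(self, rgb, depth):
        feats = torch.cat([self.rgb_branch(rgb), self.depth_branch(depth)], dim=1)
        h = self.fc(feats)
        return self.category_head(h), self.keypoint_head(h)

model = HandMotionNet()
logits, keypoint = model(torch.randn(1, 3, 64, 64), torch.randn(1, 1, 64, 64))
probs = logits.softmax(dim=1)  # probability of each hand motion category
```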
(3) Train based on the training data and the structure to build the user hand motion recognition model.
For example, the input data in the training data are fed to the model as model input; after computation with the model's layer parameters, the model outputs the probability that the current user hand motion image belongs to each hand motion category, together with the key point position of the current user hand motion. The hand motion category with the highest probability is taken as the predicted user hand motion category, and the predicted category and key point position serve as the predicted values, while the output data in the training data serve as the true values. A loss function is computed from the true and predicted values, and by minimizing the loss function the model's layer parameters are obtained, thereby training the model. For specific model training methods, reference may be made to various existing or future techniques, which are not detailed here.
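Continuing the HandMotionNet sketch above, the following illustrates one plausible form of the training step just described: a joint loss over the predicted category and key point, minimized by gradient descent. The specific loss functions, optimizer and data are assumptions, not from the patent.

```python
# One plausible joint loss for the training step: cross-entropy for the
# category plus a regression loss for the key point. All choices here
# (losses, Adam, batch of random data) are illustrative assumptions.
import torch
import torch.nn as nn

category_loss = nn.CrossEntropyLoss()   # compares class scores to true label
keypoint_loss = nn.MSELoss()            # compares predicted to true key point
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

rgb = torch.randn(8, 3, 64, 64)         # batch of hand-region RGB crops
depth = torch.randn(8, 1, 64, 64)       # matching depth crops
true_cat = torch.randint(0, 3, (8,))    # annotated motion categories
true_kp = torch.rand(8, 2)              # annotated key point positions

logits, pred_kp = model(rgb, depth)
loss = category_loss(logits, true_cat) + keypoint_loss(pred_kp, true_kp)
optimizer.zero_grad()
loss.backward()                          # minimize the joint loss
optimizer.step()
```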
S22: Obtain a user hand motion image.
For example, the system receives user hand motion images sent by a camera: after the user produces a hand motion, the camera photographs it and obtains the user hand motion images.
When photographing the user hand motion, a camera with an RGBD sensor may shoot continuously, yielding multiple consecutive groups of user hand motion images, where each group includes one RGB frame and one depth frame.
Fig. 4 gives one group of user hand motion images, consisting of a mutually corresponding RGB frame and depth frame; the left side of Fig. 4 is the RGB image and the right side is the depth image. It should be noted that, owing to drawing constraints, the RGB image in Fig. 4 is shown as a grayscale image, but in actual implementation the RGB image is a color image.
S23: Determine the user hand region in the user hand motion image, and segment the user hand motion image according to the user hand region to obtain a user hand region image.
Determining the user hand region may include:
determining the skin regions in the user hand motion image according to the RGB image;
clustering the pixels in the skin regions to obtain distinct skin regions;
obtaining the depth value corresponding to each skin region according to the depth image, and determining the user hand region in the user hand motion image according to the depth values.
When determining the skin regions, the RGB image is first converted to a CrCb image, for example using an existing mapping from RGB space to CrCb space; the CrCb image is then ANDed with a preset CrCb-space skin mask to determine the skin regions in the user hand motion image. The skin mask can be built in advance from a large number of collected skin images, such as hand skin images. Fig. 5 shows the image of the skin regions corresponding to the RGB image of Fig. 4. It can be understood that, to facilitate image processing, the image can be converted to a binary image in which skin pixels are shown in white and non-skin pixels in black.
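As an illustration of this skin-detection step, the following OpenCV sketch converts the image to YCrCb and gates the Cr/Cb channels. The patent's skin mask is built from collected skin images; the fixed Cr/Cb bounds below are a commonly used approximation, serving only as a stand-in for that learned mask, and the file name is illustrative.

```python
# Sketch of the skin-region step: convert to YCrCb, then keep only pixels
# whose Cr/Cb values fall in a skin range, producing a binary image as in
# Fig. 5 (skin pixels white, non-skin pixels black).
import cv2
import numpy as np

bgr = cv2.imread("hand_frame.png")           # illustrative file name
ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)

# Stand-in for ANDing with the preset CrCb skin mask: fixed Cr/Cb bounds
# (a common published approximation, not the patent's learned mask).
lower = np.array([0, 133, 77], dtype=np.uint8)     # (Y, Cr, Cb) lower bounds
upper = np.array([255, 173, 127], dtype=np.uint8)  # (Y, Cr, Cb) upper bounds
skin_mask = cv2.inRange(ycrcb, lower, upper)       # 255 = skin, 0 = non-skin
```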
After the skin regions are determined, the pixels in the skin regions can be clustered to obtain distinct skin regions. The clustering method is not limited; for example, k-means clustering can be used.
After the distinct skin regions are obtained, the depth value corresponding to each skin region can be obtained from the depth image, and the user hand region in the user hand motion image is determined according to the depth values. For example, the depth value of the pixel in the depth image corresponding to the cluster center of each skin region, or the mean of the depth values in the depth image of all pixels in a cluster, can be taken as the depth value corresponding to that skin region, so that each skin region has a corresponding depth value. Since the hand is usually located in front of the body, the skin region with the smallest depth value can be determined to be the user hand region. Fig. 6 shows the user hand region determined from the skin regions of Fig. 5.
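The clustering and depth-selection steps might look like the following sketch, which continues the skin_mask and depth_image variables from the snippets above; k-means is the clustering option the description names, while the cluster count of 3 is an illustrative assumption.

```python
# Cluster skin pixels with k-means, then pick the cluster whose mean depth
# is smallest (the hand is usually closest to the camera). k=3 is assumed.
from sklearn.cluster import KMeans
import numpy as np

ys, xs = np.nonzero(skin_mask)               # coordinates of skin pixels
coords = np.stack([xs, ys], axis=1).astype(np.float32)

kmeans = KMeans(n_clusters=3, n_init=10).fit(coords)
labels = kmeans.labels_

# Mean depth per skin region, from the depth values of its member pixels.
region_depths = [
    depth_image[ys[labels == k], xs[labels == k]].mean() for k in range(3)
]
hand_cluster = int(np.argmin(region_depths))  # hand = closest skin region
hand_pixels = coords[labels == hand_cluster]  # pixels of the hand region
```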
After the user hand region is determined, the originally captured user hand motion image can be segmented to obtain the user hand region image. For example, when the user hand motion image includes an RGB image and a depth image, the user hand region is cut directly out of the captured RGB image and depth image, yielding the RGB image of the user hand region and the depth image of the user hand region.
S24: Recognize, according to the user hand region image and the pre-built user hand motion recognition model, the user hand motion category and key point position corresponding to the user hand motion image.
Assume the user hand region image includes the RGB image and the depth image of the user hand region. Then, corresponding to the model structure shown in Fig. 3, the segmented RGB image and depth image of the user hand region are used as the inputs of the user hand motion recognition model, and the model output is obtained; the model output includes the probability of each user hand motion category and the key point position. The category with the highest probability is taken as the recognized user hand motion category, and the key point position output by the model is taken as the recognized key point position.
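A short inference sketch for S24, reusing the HandMotionNet sketch from S21 (still under the same illustrative assumptions):

```python
# Feed the segmented hand crops to the model and take the highest-probability
# category and the output key point, as S24 describes.
import torch

rgb_crop = torch.randn(1, 3, 64, 64)    # segmented RGB hand-region image
depth_crop = torch.randn(1, 1, 64, 64)  # segmented depth hand-region image

model.eval()
with torch.no_grad():
    logits, keypoint = model(rgb_crop, depth_crop)
probs = logits.softmax(dim=1)
category = int(probs.argmax(dim=1))     # recognized motion category
kp_xy = keypoint[0].tolist()            # recognized key point position
```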
S25: Determine, according to the user hand motion categories and key point positions corresponding to multiple consecutive groups of user hand motion images, the operation category corresponding to the user hand motion.
Because the hand motions that are convenient to use are relatively limited, some user operation categories must be realized by combining multiple user hand motions; for example, a click operation requires the palm to open first and then form a fist. Fig. 7 shows the user hand motions corresponding to a click operation: as shown in Fig. 7, they include the open palm shown on the left of Fig. 7 and the fist shown on the right of Fig. 7. Therefore, to determine the operation category of a user hand motion, multiple groups of user hand motion images are first obtained and the user hand motion category and key point position corresponding to each group are determined, after which the user's operation category is determined. How many groups of user hand motion images to obtain can be decided according to application requirements, e.g., 15 consecutive groups of user hand motion images.
When specifically determining the user's operation category, the determination can be made according to the predefined course of actions of each operation category, the key point positions, and the content currently displayed on the screen or in the air.
For example, suppose the screen or the air currently displays an interface for the user to query information and the user has entered the query information. If the recognized user action categories are first an open palm and then a fist, i.e., in the obtained groups of user hand motion images, the first several groups correspond to an open palm and the later groups correspond to a fist, the user's operation category can be judged to be a click.
If the screen or the air currently displays multi-line text content, the user hand motion categories of all obtained groups are a fist, and the position of the user hand key point moves in the same direction, the user's operation category can be determined to be continuous text selection; Fig. 8 is a schematic diagram of the text selection hand motion.
Further, when the combination of user hand motions over multiple frames matches no operation category, the current user hand motion is considered an invalid action, and the system gives no response or prompts the user that the hand motion is erroneous.
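One hedged way to implement the sequence matching of S25 is sketched below; the pattern table and the collapse-then-compare rule are illustrative assumptions rather than the patent's specification.

```python
# Compare the recognized per-frame motion categories against predefined
# action patterns; no match means an invalid action (no response).
from typing import List, Optional

# Predefined course of actions per operation category (simplified).
OPERATION_PATTERNS = {
    "click": ["open_palm", "fist"],   # palm opens first, then a fist
    "select_text": ["fist"],          # sustained fist + moving key point
}

def classify_operation(frame_categories: List[str]) -> Optional[str]:
    # Collapse consecutive duplicates: e.g. 15 frames of
    # [open_palm x7, fist x8] become ["open_palm", "fist"].
    collapsed = [c for i, c in enumerate(frame_categories)
                 if i == 0 or c != frame_categories[i - 1]]
    for operation, pattern in OPERATION_PATTERNS.items():
        if collapsed == pattern:
            return operation
    return None  # invalid action: no response, or prompt the user

print(classify_operation(["open_palm"] * 7 + ["fist"] * 8))  # -> click
```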
Of course, besides the user hand motions corresponding to the operation categories above, the user hand motions corresponding to other operation categories can also be predefined according to requirements, as shown in Fig. 9: figure (a) grabs display content, e.g., grabbing text or an image; figure (b) moves the cursor — if a large amount of text is currently displayed and the cursor sits before a certain character and needs to be moved to another position, this operation can be used; figure (c) releases content, an operation generally used together with grabbing or other operations, e.g., after grabbing and dragging text, the moved text is released; figure (d) is a handwriting operation, generally used to open handwriting mode when the user needs to input content on the screen or in the air. This application does not limit the user hand motions predefined for each operation category; each operation category may correspond to a combination of user hand motions, or each operation category may correspond directly to a single user hand motion.
S26: Respond to the user hand motion according to the operation category.
The system responds according to the preset response mode for each operation category.
For example, when the current operation category is a handwriting operation, after determining the operation category the system switches to handwriting mode, receives the user's handwritten content, performs handwriting recognition, and displays the recognition result.
When the current operation category is a click, after determining the operation category the system provides the response result corresponding to the gesture; for example, if the user clicks an application program displayed on the screen or in the air, the corresponding operation is executed.
In this embodiment, the operation category is determined from the user hand motion image and a corresponding response is given according to the operation category, so the user does not need to wear special equipment on the hand, which conforms to natural interaction habits; processing the images also improves the accuracy of the collected data, thereby achieving natural and efficient human-computer interaction. Determining the operation category from the user hand motion categories and key point positions of multiple consecutive groups of user hand motion images improves accuracy. Segmenting the user hand region image out of the user hand motion image improves processing efficiency. Building the user hand motion recognition model with deep neural network training improves the recognition accuracy of user hand motions.
Fig. 10 is a schematic structural diagram of an intelligent interaction device proposed by one embodiment of the present application.
As shown in Fig. 10, the device 100 of this embodiment includes an acquisition module 101, a determining module 102 and a response module 103.
The acquisition module 101 obtains a user hand motion image, where the user hand motion image is obtained by photographing a user hand motion.
The determining module 102 is configured to determine, according to the user hand motion image, the operation category corresponding to the user hand motion.
The response module 103 is configured to respond to the user hand motion according to the operation category.
In some embodiments, referring to Fig. 11, the determining module 102 includes:
a segmentation submodule 1021, configured to determine the user hand region in the user hand motion image and segment the user hand motion image according to the user hand region to obtain a user hand region image;
a recognition submodule 1022, configured to recognize, according to the user hand region image and the pre-built user hand motion recognition model, the user hand motion category and key point position corresponding to the user hand motion image;
a determination submodule 1023, configured to determine, according to the user hand motion categories and key point positions corresponding to multiple consecutive groups of user hand motion images, the operation category corresponding to the user hand motion.
In some embodiments, a single group of user hand motion images includes a mutually corresponding RGB frame and depth frame.
In some embodiments, the segmentation submodule 1021 determines the user hand region in the user hand motion image by:
determining the skin regions in the user hand motion image according to the RGB image;
clustering the skin regions to obtain the distinct clustered skin regions;
obtaining the depth value corresponding to each skin region according to the depth image, and determining the user hand region in the user hand motion image according to the depth values.
In some embodiments, the segmentation submodule 1021 determines the skin regions in the user hand motion image according to the RGB image by:
converting the RGB image to a CrCb image;
ANDing the preset CrCb-space skin mask with the CrCb image to determine the skin regions in the user hand motion image.
In some embodiments, the user hand region image includes the RGB image and the depth image of the user hand region, and the recognition submodule 1022 is specifically configured to:
use the RGB image and depth image of the user hand region as the inputs of the user hand motion recognition model and obtain the model output, where the model output includes the probability of each user hand motion category and the key point position;
take the category with the highest probability as the recognized user hand motion category, and take the key point position output by the model as the recognized key point position.
In some embodiments, referring to Fig. 11, the device 100 further includes a building module 104 for building the user hand motion recognition model, the building module 104 being specifically configured to:
obtain training data, the training data including the RGB image and depth image of the user hand region and annotation information, where the RGB image and depth image of the user hand region are obtained by segmenting the collected user hand motion images, and the annotation information corresponds to the collected user hand motion images and includes the user hand motion category and key point position;
determine the structure of the user hand motion recognition model;
train based on the training data and the structure to build the user hand motion recognition model.
In some embodiments, the structure includes a deep neural network structure.
It can be understood that the device of this embodiment corresponds to the above method embodiments; for details, reference may be made to the related description of the method embodiments, which is not repeated here.
It can be understood that the same or similar parts in the above embodiments may refer to one another, and content not detailed in some embodiments may refer to the same or similar content in other embodiments.
It should be noted that, in the description of the present application, the terms "first", "second", etc. are only used for descriptive purposes and are not to be understood as indicating or implying relative importance. In addition, in the description of the present application, unless otherwise stated, "multiple" means at least two.
Any process or method description in a flowchart, or otherwise described herein, can be understood as representing a module, fragment or portion of code including one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes other implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application belong.
It should be understood that each part of the present application can be implemented in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods can be implemented with software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they can be implemented with any one or a combination of the following techniques known in the art: discrete logic circuits with logic gate circuits for implementing logic functions on data signals, application-specific integrated circuits with suitable combinational logic gate circuits, programmable gate arrays (PGA), field programmable gate arrays (FPGA), and the like.
Those skilled in the art can understand that all or part of the steps carried by the above embodiment methods can be completed by instructing the relevant hardware through a program, the program can be stored in a computer-readable storage medium, and when executed, the program includes one of the steps of the method embodiments or a combination thereof.
In addition, each functional unit in each embodiment of the present application can be integrated in one processing module, or each unit can exist alone physically, or two or more units can be integrated in one module. The above integrated module can be implemented in the form of hardware or in the form of a software function module. If the integrated module is implemented in the form of a software function module and sold or used as an independent product, it can also be stored in a computer-readable storage medium.
The storage medium mentioned above can be a read-only memory, a magnetic disk, an optical disc, or the like.
In the description of this specification, descriptions referring to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" mean that the specific features, structures, materials or characteristics described in combination with the embodiment or example are included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Moreover, the described specific features, structures, materials or characteristics can be combined in a suitable manner in any one or more embodiments or examples.
Although the embodiments of the present application have been shown and described above, it can be understood that the above embodiments are exemplary and are not to be construed as limiting the present application; those of ordinary skill in the art can change, modify, replace and vary the above embodiments within the scope of the present application.
Claims (16)
1. An intelligent interaction method, characterized by comprising:
obtaining a user hand motion image, the user hand motion image being obtained by photographing a user hand motion;
determining, according to the user hand motion image, an operation category corresponding to the user hand motion;
responding to the user hand motion according to the operation category.
2. The method according to claim 1, characterized in that determining, according to the user hand motion image, the operation category corresponding to the user hand motion comprises:
determining a user hand region in the user hand motion image, and segmenting the user hand motion image according to the user hand region to obtain a user hand region image;
recognizing, according to the user hand region image and a pre-built user hand motion recognition model, a user hand motion category and key point position corresponding to the user hand motion image;
determining, according to the user hand motion categories and key point positions corresponding to multiple consecutive groups of user hand motion images, the operation category corresponding to the user hand motion.
3. The method according to claim 2, characterized in that a single group of user hand motion images comprises a mutually corresponding RGB frame and depth frame.
4. The method according to claim 3, characterized in that determining the user hand region in the user hand motion image comprises:
determining skin regions in the user hand motion image according to the RGB image;
clustering the skin regions to obtain distinct clustered skin regions;
obtaining a depth value corresponding to each skin region according to the depth image, and determining the user hand region in the user hand motion image according to the depth values.
5. The method according to claim 4, characterized in that determining the skin regions in the user hand motion image according to the RGB image comprises:
converting the RGB image to a CrCb image;
ANDing a preset CrCb-space skin mask with the CrCb image to determine the skin regions in the user hand motion image.
6. The method according to claim 2, characterized in that the user hand region image comprises an RGB image and a depth image of the user hand region, and recognizing, according to the user hand region image and the pre-built user hand motion recognition model, the user hand motion category and key point position corresponding to the user hand motion image comprises:
using the RGB image and depth image of the user hand region as the inputs of the user hand motion recognition model to obtain a model output, the model output comprising the probability of each user hand motion category and the key point position;
taking the category with the highest probability as the recognized user hand motion category, and taking the key point position output by the model as the recognized key point position.
7. The method according to claim 2, characterized by further comprising: building the user hand motion recognition model, wherein building the user hand motion recognition model comprises:
obtaining training data, the training data comprising an RGB image and a depth image of the user hand region and annotation information, the RGB image and depth image of the user hand region being obtained by segmenting collected user hand motion images, the annotation information corresponding to the collected user hand motion images and comprising a user hand motion category and key point position;
determining the structure of the user hand motion recognition model;
training based on the training data and the structure to build the user hand motion recognition model.
8. The method according to claim 7, characterized in that the structure comprises a deep neural network structure.
9. An intelligent interaction device, characterized by comprising:
an acquisition module, configured to obtain a user hand motion image, the user hand motion image being obtained by photographing a user hand motion;
a determining module, configured to determine, according to the user hand motion image, an operation category corresponding to the user hand motion;
a response module, configured to respond to the user hand motion according to the operation category.
10. The device according to claim 9, characterized in that the determining module comprises:
a segmentation submodule, configured to determine a user hand region in the user hand motion image and segment the user hand motion image according to the user hand region to obtain a user hand region image;
a recognition submodule, configured to recognize, according to the user hand region image and a pre-built user hand motion recognition model, a user hand motion category and key point position corresponding to the user hand motion image;
a determination submodule, configured to determine, according to the user hand motion categories and key point positions corresponding to multiple consecutive groups of user hand motion images, the operation category corresponding to the user hand motion.
11. The device according to claim 10, characterized in that a single group of user hand motion images comprises a mutually corresponding RGB frame and depth frame.
12. The device according to claim 11, characterized in that the segmentation submodule determines the user hand region in the user hand motion image by:
determining skin regions in the user hand motion image according to the RGB image;
clustering the skin regions to obtain distinct clustered skin regions;
obtaining a depth value corresponding to each skin region according to the depth image, and determining the user hand region in the user hand motion image according to the depth values.
13. The device according to claim 12, characterized in that the segmentation submodule determines the skin regions in the user hand motion image according to the RGB image by:
converting the RGB image to a CrCb image;
ANDing a preset CrCb-space skin mask with the CrCb image to determine the skin regions in the user hand motion image.
14. The device according to claim 10, characterized in that the user hand region image comprises an RGB image and a depth image of the user hand region, and the recognition submodule is specifically configured to:
use the RGB image and depth image of the user hand region as the inputs of the user hand motion recognition model to obtain a model output, the model output comprising the probability of each user hand motion category and the key point position;
take the category with the highest probability as the recognized user hand motion category, and take the key point position output by the model as the recognized key point position.
15. The device according to claim 10, characterized by further comprising a building module for building the user hand motion recognition model, the building module being specifically configured to:
obtain training data, the training data comprising an RGB image and a depth image of the user hand region and annotation information, the RGB image and depth image of the user hand region being obtained by segmenting collected user hand motion images, the annotation information corresponding to the collected user hand motion images and comprising a user hand motion category and key point position;
determine the structure of the user hand motion recognition model;
train based on the training data and the structure to build the user hand motion recognition model.
16. The device according to claim 15, characterized in that the structure comprises a deep neural network structure.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611025898.3A CN106547356B (en) | 2016-11-17 | 2016-11-17 | Intelligent interaction method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611025898.3A CN106547356B (en) | 2016-11-17 | 2016-11-17 | Intelligent interaction method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106547356A | 2017-03-29 |
CN106547356B CN106547356B (en) | 2020-09-11 |
Family
ID=58394834
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611025898.3A Active CN106547356B (en) | 2016-11-17 | 2016-11-17 | Intelligent interaction method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106547356B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107291221A (en) * | 2017-05-04 | 2017-10-24 | 浙江大学 | Across screen self-adaption accuracy method of adjustment and device based on natural gesture |
CN108257139A (en) * | 2018-02-26 | 2018-07-06 | 中国科学院大学 | RGB-D three-dimension object detection methods based on deep learning |
CN108733287A (en) * | 2018-05-15 | 2018-11-02 | 东软集团股份有限公司 | Detection method, device, equipment and the storage medium of physical examination operation |
CN109117746A (en) * | 2018-07-23 | 2019-01-01 | 北京华捷艾米科技有限公司 | Hand detection method and machine readable storage medium |
CN110414393A (en) * | 2019-07-15 | 2019-11-05 | 福州瑞芯微电子股份有限公司 | A kind of natural interactive method and terminal based on deep learning |
WO2020020146A1 (en) * | 2018-07-25 | 2020-01-30 | 深圳市商汤科技有限公司 | Method and apparatus for processing laser radar sparse depth map, device, and medium |
WO2020252918A1 (en) * | 2019-06-20 | 2020-12-24 | 平安科技(深圳)有限公司 | Human body-based gesture recognition method and apparatus, device, and storage medium |
CN112383805A (en) * | 2020-11-16 | 2021-02-19 | 四川长虹电器股份有限公司 | Method for realizing man-machine interaction at television end based on human hand key points |
CN112686231A (en) * | 2021-03-15 | 2021-04-20 | 南昌虚拟现实研究院股份有限公司 | Dynamic gesture recognition method and device, readable storage medium and computer equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102737235A (en) * | 2012-06-28 | 2012-10-17 | 中国科学院自动化研究所 | Head posture estimation method based on depth information and color image |
CN102854983A (en) * | 2012-09-10 | 2013-01-02 | 中国电子科技集团公司第二十八研究所 | Man-machine interaction method based on gesture recognition |
US20140253429A1 (en) * | 2013-03-08 | 2014-09-11 | Fastvdo Llc | Visual language for human computer interfaces |
CN104598915A (en) * | 2014-01-24 | 2015-05-06 | 深圳奥比中光科技有限公司 | Gesture recognition method and gesture recognition device |
2016-11-17: Application CN201611025898.3A filed in China; granted as patent CN106547356B (status: Active)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102737235A (en) * | 2012-06-28 | 2012-10-17 | 中国科学院自动化研究所 | Head posture estimation method based on depth information and color image |
CN102854983A (en) * | 2012-09-10 | 2013-01-02 | 中国电子科技集团公司第二十八研究所 | Man-machine interaction method based on gesture recognition |
US20140253429A1 (en) * | 2013-03-08 | 2014-09-11 | Fastvdo Llc | Visual language for human computer interfaces |
CN104598915A (en) * | 2014-01-24 | 2015-05-06 | 深圳奥比中光科技有限公司 | Gesture recognition method and gesture recognition device |
Non-Patent Citations (1)
Title |
---|
Huang Xiaolin, Dong Hongwei: "Real-time gesture recognition and virtual writing system based on depth information" (基于深度信息的实时手势识别和虚拟书写系统), Computer Engineering and Applications (《计算机工程与应用》) *
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107291221A (en) * | 2017-05-04 | 2017-10-24 | 浙江大学 | Across screen self-adaption accuracy method of adjustment and device based on natural gesture |
CN107291221B (en) * | 2017-05-04 | 2019-07-16 | 浙江大学 | Across screen self-adaption accuracy method of adjustment and device based on natural gesture |
CN108257139A (en) * | 2018-02-26 | 2018-07-06 | 中国科学院大学 | RGB-D three-dimension object detection methods based on deep learning |
CN108257139B (en) * | 2018-02-26 | 2020-09-08 | 中国科学院大学 | RGB-D three-dimensional object detection method based on deep learning |
CN108733287A (en) * | 2018-05-15 | 2018-11-02 | 东软集团股份有限公司 | Detection method, device, equipment and the storage medium of physical examination operation |
CN109117746A (en) * | 2018-07-23 | 2019-01-01 | 北京华捷艾米科技有限公司 | Hand detection method and machine readable storage medium |
WO2020020146A1 (en) * | 2018-07-25 | 2020-01-30 | 深圳市商汤科技有限公司 | Method and apparatus for processing laser radar sparse depth map, device, and medium |
WO2020252918A1 (en) * | 2019-06-20 | 2020-12-24 | 平安科技(深圳)有限公司 | Human body-based gesture recognition method and apparatus, device, and storage medium |
CN110414393A (en) * | 2019-07-15 | 2019-11-05 | 福州瑞芯微电子股份有限公司 | A kind of natural interactive method and terminal based on deep learning |
CN112383805A (en) * | 2020-11-16 | 2021-02-19 | 四川长虹电器股份有限公司 | Method for realizing man-machine interaction at television end based on human hand key points |
CN112686231A (en) * | 2021-03-15 | 2021-04-20 | 南昌虚拟现实研究院股份有限公司 | Dynamic gesture recognition method and device, readable storage medium and computer equipment |
CN112686231B (en) * | 2021-03-15 | 2021-06-01 | 南昌虚拟现实研究院股份有限公司 | Dynamic gesture recognition method and device, readable storage medium and computer equipment |
WO2022193453A1 (en) * | 2021-03-15 | 2022-09-22 | 南昌虚拟现实研究院股份有限公司 | Dynamic gesture recognition method and apparatus, and readable storage medium and computer device |
Also Published As
Publication number | Publication date |
---|---|
CN106547356B (en) | 2020-09-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106547356A (en) | Intelligent interactive method and device | |
CN102081918B (en) | Video image display control method and video image display device | |
CN103941866B (en) | Three-dimensional gesture recognizing method based on Kinect depth image | |
CN103530613B (en) | Target person hand gesture interaction method based on monocular video sequence | |
US10671841B2 (en) | Attribute state classification | |
CN104573706B (en) | A kind of subject image recognition methods and its system | |
CN104395856B (en) | For recognizing the computer implemented method and system of dumb show | |
Qi et al. | Computer vision-based hand gesture recognition for human-robot interaction: a review | |
CN107578023A (en) | Man-machine interaction gesture identification method, apparatus and system | |
CN104838337A (en) | Touchless input for a user interface | |
CN106598227A (en) | Hand gesture identification method based on Leap Motion and Kinect | |
CN106200971A (en) | Man-machine interactive system device based on gesture identification and operational approach | |
CN110135497B (en) | Model training method, and method and device for estimating strength of facial action unit | |
Jin et al. | Real-time action detection in video surveillance using sub-action descriptor with multi-cnn | |
CN107654406A (en) | Fan air supply control equipment, fan air supply control method and device | |
Desai et al. | Human Computer Interaction through hand gestures for home automation using Microsoft Kinect | |
CN106503619B (en) | Gesture recognition method based on BP neural network | |
CN111857334A (en) | Human body gesture letter recognition method and device, computer equipment and storage medium | |
Dardas | Real-time hand gesture detection and recognition for human computer interaction | |
CN103201706A (en) | Method for driving virtual mouse | |
Zhang et al. | Emotion recognition from body movements with as-lstm | |
Lee et al. | Recognition of hand gesture to human-computer interaction | |
US10095308B2 (en) | Gesture based human machine interface using marker | |
CN110134241A (en) | Dynamic gesture exchange method based on monocular cam | |
Liu et al. | Recognizing object manipulation activities using depth and visual cues |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |