CN111913584B - Mouse cursor control method and system based on gesture recognition - Google Patents

Mouse cursor control method and system based on gesture recognition

Info

Publication number
CN111913584B
Authority
CN
China
Prior art keywords
gesture
skin color
hog
mouse
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010834455.9A
Other languages
Chinese (zh)
Other versions
CN111913584A (en)
Inventor
陈康
易金
王俊
林瑞全
欧明敏
武义
邢新华
赵显煜
李振嘉
郑炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN202010834455.9A priority Critical patent/CN111913584B/en
Publication of CN111913584A publication Critical patent/CN111913584A/en
Application granted granted Critical
Publication of CN111913584B publication Critical patent/CN111913584B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/50Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language

Abstract

The invention relates to a mouse cursor control method and system based on gesture recognition. A gesture image recognition method is provided that fuses PHOG (Pyramid Histogram of Oriented Gradients) features with improved LBP (Local Binary Pattern) features and classifies them with a K-NN (K-nearest neighbor) classifier. To improve the real-time performance of the system, skin color detection is first used to determine whether the current frame contains a human hand. When a hand is detected, the PHOG features and the improved LBP features are extracted, fused, and classified by the K-NN classifier to recognize the gesture. The invention achieves fast and accurate recognition of user gestures under complex background conditions such as different viewing angles and lighting, and controls the mouse cursor accurately and in real time according to the recognition result.

Description

Mouse cursor control method and system based on gesture recognition
Technical Field
The invention relates to the technical field of image processing and the field of man-machine interaction, in particular to a mouse cursor control method and system based on gesture recognition.
Background
Gesture recognition technology is one of the important research topics in the field of human-computer interaction. It allows computer equipment and robots to be controlled more naturally and effectively, breaks through the limitations of traditional input devices such as the keyboard, mouse and remote control, and greatly improves the friendliness of human-computer interaction.
The development of gesture recognition has gone through two stages. The first stage was gesture recognition based on wearable devices fitted with many sensors. The second stage is gesture recognition based on computer vision, in which electronic equipment such as a camera replaces the human eye to capture, identify and track the target, and the captured data are processed and analyzed to obtain the final result. In vision-based gesture recognition, the common approach is to extract gesture features and then classify them with a classifier. LBP and HOG are among the most widely used gesture feature descriptors. HOG features have good geometric and photometric invariance and are usually used to extract the edge and contour information of a target image, while LBP is often used to describe the local texture information of an image. As research on LBP features has deepened, many variants have appeared and their applications have broadened.
Although gesture recognition has been studied extensively, some inherent problems remain in this field. Research has shown that variations in gesture scale, angle, grayscale and position all reduce the recognition rate.
Disclosure of Invention
In view of the above, the present invention provides a mouse cursor control method and system based on gesture recognition, offering a new way of controlling the mouse that makes operation of the computer more flexible.
The invention is realized by adopting the following scheme: a mouse cursor control method based on gesture recognition comprises the following steps:
step S1: calling a computer camera or an external USB camera on an MATLAB platform to acquire user gesture video data;
step S2: carrying out median filtering image preprocessing on the collected user gesture video data;
step S3: carrying out skin color detection on the filtered image based on an elliptical clustering model, and carrying out binarization processing on video data according to the elliptical skin color clustering model so as to segment a gesture area of the current video frame;
step S4: setting a threshold β for the number of skin color pixels, where 50 < β < 100, and counting the number of detected skin color pixels; if the number of skin color pixels exceeds the threshold β, a gesture is considered to be present in the current frame and the feature extraction of step S5 is performed; otherwise, the current frame is considered to contain no gesture and the next frame is processed;
step S5: extracting PHOG characteristics of the gesture area;
step S6: improving the traditional LBP characteristics to obtain MLBP characteristics, and fusing the MLBP characteristics and the PHOG characteristics, namely connecting an MLBP characteristic vector and a PHOG characteristic vector in series to obtain a fused characteristic vector;
step S7: making a characteristic template database;
step S8: comparing the fusion features extracted in real time in the step S6 with the features in the feature template database by using a K-NN classification algorithm to obtain an identification result;
step S9: controlling the motion of the mouse cursor in real time according to the recognition result, namely, after calling the java.awt.Robot class and initializing it in the MATLAB platform, the mouse can be flexibly moved to any position on the screen by setting the position of the mouse cursor.
Further, the step S2 specifically includes the following steps:
step S21: converting the collected RGB video data into YCbCr video data;
step S22: and performing median filtering on the YCbCr video data to obtain clearer video data.
Further, in step S3, based on the elliptical clustering model skin color detection, the elliptical skin color clustering model formula is as follows:
x = cosθ·(Cb − cx) + sinθ·(Cr − cy)
y = −sinθ·(Cb − cx) + cosθ·(Cr − cy)
(x − ecx)^2 / a^2 + (y − ecy)^2 / b^2 ≤ 1
and carrying out binarization processing on the video data according to the oval skin color clustering model, wherein a binarization formula is as follows:
B(Cb, Cr) = 1, if (x − ecx)^2 / a^2 + (y − ecy)^2 / b^2 ≤ 1; otherwise B(Cb, Cr) = 0
If the binarized value is 1, the pixel belongs to a skin color area, namely the gesture area; if the binarized value is 0, it belongs to a non-skin color area, namely a non-gesture area. In these formulas, θ = 2.53, cx = 109.38, cy = 152.02, ecx = 1.60, ecy = 2.41, a = 25.39, b = 14.03.
Further, the step S5 specifically includes the following steps:
step S51: gamma correction is carried out on the gesture area image after normalization processing;
step S52: calculating the gradient amplitude and the gradient direction of the corrected gesture area pixel point (x, y), wherein the calculation formula is as follows:
Gx(x,y)=H(x+1,y)-H(x-1,y)
Gy(x,y)=H(x,y+1)-H(x,y-1)
G(x, y) = sqrt(Gx(x, y)^2 + Gy(x, y)^2)
θ(x, y) = arctan(Gy(x, y) / Gx(x, y))
where Gx(x, y) is the horizontal gradient, Gy(x, y) is the vertical gradient, G(x, y) is the gradient magnitude, and θ(x, y) is the gradient direction;
step S53: dividing the normalized gesture area image into a number of cells and computing the gradient information of each cell with a 9-bin histogram, i.e. the gradient directions within a cell are quantized into 9 orientations, and each pixel in the cell casts a vote weighted by its gradient magnitude into the bin of its gradient direction, giving the cell's histogram of oriented gradients, namely the HOG feature of the cell; the HOG features of several cells are concatenated to form the HOG feature of a larger block; the HOG features of all blocks in the gesture area are concatenated to form the HOG feature of the image;
step S54: dividing the gesture area into three block scales, and extracting the HOG characteristics of the gesture area on the three scales respectively:
the HOG features extracted from the original gesture area are HOG features of a first scale, the HOG features extracted after the original gesture area image is divided into 2 x 2 sub-blocks are HOG features of a second scale, and the HOG features extracted after the original gesture area image is divided into 4 x 4 sub-blocks are HOG features of a third scale;
and connecting HOG features of three scales in series to obtain the PHOG feature.
Further, the step S6 of improving the conventional LBP feature to obtain the MLBP feature specifically includes the following steps:
step Sa: calculating the coordinates of the neighborhood pixels from the coordinates of the current center pixel, where (Xc, Yc) are the coordinates of the center pixel, R is the sampling radius, P is the number of sampling points, i.e. the number of neighborhood pixels, and (Xp, Yp) are the coordinates of the p-th neighborhood pixel;
Xp = Xc + R·cos(2πp / P)
Yp = Yc − R·sin(2πp / P)
step Sb: performing bilinear interpolation at the calculated neighborhood pixel coordinates, which are in general not integers, to obtain the corresponding pixel values; the interpolation formula is as follows:
f(x, y) ≈ f(0,0)(1 − x)(1 − y) + f(1,0)·x·(1 − y) + f(0,1)(1 − x)·y + f(1,1)·x·y
step Sc: sorting the P neighborhood pixel values around the center pixel, removing the minimum and the maximum, and taking the mean of the remaining values as the binarization threshold; the MLBP feature is calculated as follows, where gm is the mean of the neighborhood pixel values after the minimum and maximum have been removed:
MLBP(P, R) = Σ_{p=0}^{P−1} s(gp − gm)·2^p,  where gp is the value of the p-th neighborhood pixel and s(x) = 1 if x ≥ 0, otherwise s(x) = 0
further, the specific content of step S7 is: under the environment close to the actual operation of the system, 50 pictures of various predefined gestures are collected, and the PHOG and MLBP fusion characteristics of each picture are extracted according to the steps from S1 to S6 and stored in a template database.
Further, the specific content of step S8 is:
calculating the Euclidean distance between the target feature vector and each feature vector in the database; the Euclidean distance is calculated as follows, where T = (t1, t2, …, tn) is the feature vector of the input frame image and Q = (q1, q2, …, qn) is a feature vector stored in the template database;
D(T, Q) = sqrt(Σ_{i=1}^{n} (ti − qi)^2)
the K template features closest to the input feature are selected according to the calculated distances, and the category occurring most frequently among these K features is the recognition result; the value of K is determined by K-fold cross validation. When the recognized gesture is gesture 1, the mouse moves left; when it is gesture 2, the mouse moves right; when it is gesture 3, the mouse moves up; when it is gesture 4, the mouse moves down; when it is a fist, a left click of the mouse is performed; when it is an open palm, a right click of the mouse is performed. Gesture 1 is the hand with one finger extended, gesture 2 with two fingers extended, gesture 3 with three fingers extended, and gesture 4 with four fingers extended.
Further, the present invention also provides a mouse cursor control system based on gesture recognition, comprising a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein when the computer program is run by the processor, the steps of the method as described above are implemented.
Compared with the prior art, the invention has the following beneficial effects:
the invention directly collects the user video data by using the MATLAB software platform, can accurately detect and recognize the gesture of the user, and can control the movement of the mouse cursor according to the recognized result. The invention provides a new mouse control mode, which enables the operation of a computer to be more flexible and solves some problems in human-computer interaction.
Drawings
FIG. 1 is a flow chart of a method according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating image preprocessing according to an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As shown in fig. 1, the present embodiment provides a mouse cursor control method based on gesture recognition, including the following steps:
step S1: calling a computer camera or an external USB camera on an MATLAB platform to acquire user gesture video data;
step S2: carrying out median-filtering image preprocessing on the collected user gesture video data, as shown in FIG. 2;
step S3: carrying out skin color detection on the filtered image based on an elliptical clustering model, and carrying out binarization processing on the video data according to the elliptical skin color clustering model so as to segment the gesture area of the current video frame;
step S4: setting a threshold β for the number of skin color pixels, where 50 < β < 100, and counting the number of detected skin color pixels; if the number of skin color pixels exceeds the threshold β, a gesture is considered to be present in the current frame and the feature extraction of step S5 is performed; otherwise, the current frame is considered to contain no gesture and the next frame is processed;
step S5: extracting PHOG characteristics of the gesture area;
step S6: improving the traditional LBP characteristics to obtain MLBP characteristics, and fusing the MLBP characteristics and the PHOG characteristics, namely connecting an MLBP characteristic vector and a PHOG characteristic vector in series to obtain a fused characteristic vector;
step S7: making a characteristic template database;
step S8: comparing the fusion features extracted in real time in the step S6 with the features in the feature template database by using a K-NN classification algorithm to obtain an identification result;
step S9: controlling the motion of the mouse cursor in real time according to the recognition result, namely, after calling the java.awt.Robot class and initializing it in the MATLAB platform, the mouse can be flexibly moved to any position on the screen by setting the position of the mouse cursor.
In this embodiment, the step S2 specifically includes the following steps:
step S21: converting the collected RGB video data into YCbCr video data;
step S22: and performing median filtering on the YCbCr video data to obtain clearer video data.
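As an illustrative sketch of steps S21-S22 only (the patent performs this in MATLAB), the snippet below uses OpenCV as a stand-in; the function name preprocess_frame is ours, and note that OpenCV labels the converted space YCrCb with channel order Y, Cr, Cb.

```python
# Hedged sketch of step S2: color-space conversion followed by median filtering.
import cv2
import numpy as np

def preprocess_frame(frame_bgr: np.ndarray) -> np.ndarray:
    """Convert a captured BGR frame to YCbCr-type space and median-filter it."""
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)  # channels: Y, Cr, Cb
    # A 3x3 median filter suppresses salt-and-pepper noise while preserving edges.
    return cv2.medianBlur(ycrcb, 3)
```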
In this embodiment, in step S3, based on the elliptical cluster model skin color detection, the elliptical skin color cluster model formula is as follows:
x = cosθ·(Cb − cx) + sinθ·(Cr − cy)
y = −sinθ·(Cb − cx) + cosθ·(Cr − cy)
(x − ecx)^2 / a^2 + (y − ecy)^2 / b^2 ≤ 1
and carrying out binarization processing on the video data according to the oval skin color clustering model, wherein a binarization formula is as follows:
B(Cb, Cr) = 1, if (x − ecx)^2 / a^2 + (y − ecy)^2 / b^2 ≤ 1; otherwise B(Cb, Cr) = 0
If the binarized value is 1, the pixel belongs to a skin color area, namely the gesture area; if the binarized value is 0, it belongs to a non-skin color area, namely a non-gesture area.
In these formulas, θ = 2.53, cx = 109.38, cy = 152.02, ecx = 1.60, ecy = 2.41, a = 25.39, b = 14.03.
In this embodiment, the step S5 specifically includes the following steps:
step S51: gamma correction is carried out on the gesture area image after normalization processing;
step S52: calculating the gradient amplitude and the gradient direction of the corrected gesture area pixel point (x, y), wherein the calculation formula is as follows:
Gx(x,y)=H(x+1,y)-H(x-1,y)
Gy(x,y)=H(x,y+1)-H(x,y-1)
G(x, y) = sqrt(Gx(x, y)^2 + Gy(x, y)^2)
θ(x, y) = arctan(Gy(x, y) / Gx(x, y))
where Gx(x, y) is the horizontal gradient, Gy(x, y) is the vertical gradient, G(x, y) is the gradient magnitude, and θ(x, y) is the gradient direction;
step S53: dividing the normalized gesture area image into a number of cells and computing the gradient information of each cell with a 9-bin histogram, i.e. the gradient directions within a cell are quantized into 9 orientations, and each pixel in the cell casts a vote weighted by its gradient magnitude into the bin of its gradient direction, giving the cell's histogram of oriented gradients, namely the HOG feature of the cell; the HOG features of several cells are concatenated to form the HOG feature of a larger block; the HOG features of all blocks in the gesture area are concatenated to form the HOG feature of the image;
dividing the gesture area into a plurality of cells, and counting a gradient direction histogram of each cell; combining the histograms of the cells into a block histogram, and combining the histograms of all the blocks into an image histogram;
step S54: dividing the gesture area into three block scales, and extracting the HOG characteristics of the gesture area on the three scales respectively:
the HOG features extracted from the original gesture area are HOG features of a first scale, the HOG features extracted after the original gesture area image is divided into 2 x 2 sub-blocks are HOG features of a second scale, and the HOG features extracted after the original gesture area image is divided into 4 x 4 sub-blocks are HOG features of a third scale;
and connecting HOG features of three scales in series to obtain the PHOG feature.
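As an illustration of steps S51-S54, the sketch below uses scikit-image's hog() in place of the cell/block computation described above; the 8×8 cell size and the assumption that the normalized gesture image is at least 64×64 pixels are ours, and the gamma correction of step S51 is omitted for brevity.

```python
# Hedged sketch of the three-scale PHOG feature: HOG on the full region,
# on 2x2 sub-blocks and on 4x4 sub-blocks, all concatenated.
import numpy as np
from skimage.feature import hog

def phog_feature(gray: np.ndarray) -> np.ndarray:
    """Concatenate 9-bin HOG features computed at pyramid levels 1, 2x2 and 4x4."""
    feats = []
    for level in (1, 2, 4):
        h, w = gray.shape
        bh, bw = h // level, w // level
        for i in range(level):
            for j in range(level):
                block = gray[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
                feats.append(hog(block, orientations=9,
                                 pixels_per_cell=(8, 8),
                                 cells_per_block=(2, 2)))
    return np.concatenate(feats)
```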
In this embodiment, the step S6 of improving the conventional LBP feature to obtain the MLBP feature specifically includes the following steps:
step Sa: calculating the coordinates of the neighborhood pixels from the coordinates of the current center pixel, where (Xc, Yc) are the coordinates of the center pixel, R is the sampling radius, P is the number of sampling points, i.e. the number of neighborhood pixels, and (Xp, Yp) are the coordinates of the p-th neighborhood pixel;
Xp = Xc + R·cos(2πp / P)
Yp = Yc − R·sin(2πp / P)
step Sb: performing bilinear interpolation at the calculated neighborhood pixel coordinates, which are in general not integers, to obtain the corresponding pixel values; the interpolation formula is as follows:
f(x, y) ≈ f(0,0)(1 − x)(1 − y) + f(1,0)·x·(1 − y) + f(0,1)(1 − x)·y + f(1,1)·x·y
step Sc: sorting the P neighborhood pixel values around the center pixel, removing the minimum and the maximum, and taking the mean of the remaining values as the binarization threshold; the MLBP feature is calculated as follows, where gm is the mean of the neighborhood pixel values after the minimum and maximum have been removed:
MLBP(P, R) = Σ_{p=0}^{P−1} s(gp − gm)·2^p,  where gp is the value of the p-th neighborhood pixel and s(x) = 1 if x ≥ 0, otherwise s(x) = 0
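The following is a sketch of the MLBP operator of steps Sa-Sc, assuming P = 8 and R = 1 (values the description does not fix); each pixel's binarization threshold is the mean of its circular neighbors after discarding their minimum and maximum, and the non-integer neighbor coordinates are handled by bilinear interpolation.

```python
# Hedged sketch of the MLBP code image; a histogram over these codes would
# typically serve as the MLBP feature vector.
import numpy as np

def mlbp_image(gray: np.ndarray, P: int = 8, R: float = 1.0) -> np.ndarray:
    gray = gray.astype(np.float64)
    h, w = gray.shape
    out = np.zeros((h, w), dtype=np.uint8)
    m = int(np.ceil(R)) + 1                       # keep interpolation inside the image
    for yc in range(m, h - m):
        for xc in range(m, w - m):
            neighbors = []
            for p in range(P):
                xp = xc + R * np.cos(2 * np.pi * p / P)
                yp = yc - R * np.sin(2 * np.pi * p / P)
                x0, y0 = int(np.floor(xp)), int(np.floor(yp))
                dx, dy = xp - x0, yp - y0
                # Bilinear interpolation on the unit square around (x0, y0).
                val = (gray[y0, x0] * (1 - dx) * (1 - dy)
                       + gray[y0, x0 + 1] * dx * (1 - dy)
                       + gray[y0 + 1, x0] * (1 - dx) * dy
                       + gray[y0 + 1, x0 + 1] * dx * dy)
                neighbors.append(val)
            vals = np.sort(np.asarray(neighbors))
            gm = vals[1:-1].mean()                # mean after dropping min and max
            out[yc, xc] = sum((1 << p) for p, g in enumerate(neighbors) if g - gm >= 0)
    return out
```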
in this embodiment, the specific content of step S7 is: under the environment close to the actual operation of the system, 50 pictures of various predefined gestures are collected, and the PHOG and MLBP fusion characteristics of each picture are extracted according to the steps from S1 to S6 and stored in a template database.
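A sketch of the template database of step S7 built on the hypothetical helpers above; it assumes the MLBP feature vector is a 256-bin histogram of MLBP codes, which the description does not state explicitly.

```python
# Hedged sketch of step S7: extract and store fused features for ~50 images per gesture.
import numpy as np

def build_template_database(images_by_gesture: dict):
    """images_by_gesture maps a gesture label to a list of grayscale gesture-area images."""
    templates, labels = [], []
    for label, images in images_by_gesture.items():
        for img in images:
            mlbp_hist = np.bincount(mlbp_image(img).ravel(), minlength=256)
            fused = np.concatenate([phog_feature(img), mlbp_hist])  # step S6 fusion
            templates.append(fused)
            labels.append(label)
    return np.vstack(templates), np.array(labels)
```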
In this embodiment, the specific content of step S8 is:
calculating the Euclidean distance between the target feature vector and each feature vector in the database; the Euclidean distance is calculated as follows, where T = (t1, t2, …, tn) is the feature vector of the input frame image and Q = (q1, q2, …, qn) is a feature vector stored in the template database;
D(T, Q) = sqrt(Σ_{i=1}^{n} (ti − qi)^2)
the K template features closest to the input feature are selected according to the calculated distances, and the category occurring most frequently among these K features is the recognition result; the value of K is determined by K-fold cross validation. When the recognized gesture is gesture 1, the mouse moves left; when it is gesture 2, the mouse moves right; when it is gesture 3, the mouse moves up; when it is gesture 4, the mouse moves down; when it is a fist, a left click of the mouse is performed; when it is an open palm, a right click of the mouse is performed. Gesture 1 is the hand with one finger extended, gesture 2 with two fingers extended, gesture 3 with three fingers extended, and gesture 4 with four fingers extended.
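A sketch of the K-NN matching of step S8 and the gesture-to-action mapping of step S9 follows; K = 5 and the label strings are placeholders only (the patent selects K by K-fold cross validation), and the cursor movement itself, which the patent performs through the java.awt.Robot class called from MATLAB, is reduced here to returning an action name.

```python
# Hedged sketch of Euclidean-distance K-NN voting and the gesture-to-mouse-action map.
import numpy as np
from collections import Counter

ACTIONS = {  # hypothetical labels for the six predefined gestures
    "gesture1": "move left", "gesture2": "move right",
    "gesture3": "move up",   "gesture4": "move down",
    "fist": "left click",    "palm": "right click",
}

def knn_classify(feature: np.ndarray, templates: np.ndarray,
                 labels: np.ndarray, k: int = 5) -> str:
    """Return the label occurring most often among the K nearest template features."""
    dists = np.sqrt(((templates - feature) ** 2).sum(axis=1))  # Euclidean distances
    nearest = labels[np.argsort(dists)[:k]]
    return Counter(nearest).most_common(1)[0][0]

def gesture_to_action(label: str) -> str:
    return ACTIONS.get(label, "no action")
```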
Preferably, the present embodiment further provides a mouse cursor control system based on gesture recognition, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, and when the computer program is run by the processor, the method steps as described above are implemented.
Preferably, in this embodiment, the computer's built-in camera or an external USB camera is called on the MATLAB platform to acquire user video data, the acquired video data are processed in MATLAB in real time, and the mouse cursor is controlled in real time according to the processing result. The video processing converts the collected RGB video data into the YCbCr format and then applies median filtering. Skin color detection based on the elliptical skin color model is applied to the filtered data to segment the gesture area. The segmented gesture area is normalized, PHOG and MLBP features are extracted from it, and a K-NN classifier performs classification and recognition. The movement of the mouse cursor is then controlled according to the classification result. Preferably, the gesture features are described by combining the PHOG features with the improved LBP features, and K-NN is used as the classifier, recognizing the gesture of the current frame by comparing the distances between the target feature vector and the feature vectors in the database.
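Tying the sketches above together, a hypothetical top-level loop for this pipeline might look as follows; the 64×64 normalization size, the simple bounding-box cropping of the skin mask, and the printing of the action name (instead of driving the cursor via java.awt.Robot as in the patent) are all assumptions of this illustration.

```python
# Hedged sketch of the real-time pipeline (steps S1-S9) using the helpers sketched above.
import cv2
import numpy as np

def run(templates, labels, beta: int = 75):
    cap = cv2.VideoCapture(0)                      # built-in or external USB camera (step S1)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        mask = skin_mask(preprocess_frame(frame))  # steps S2-S3
        if not hand_present(mask, beta):           # step S4
            continue
        ys, xs = mask.nonzero()                    # bounding box of the segmented gesture area
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        region = cv2.resize(gray[ys.min():ys.max() + 1, xs.min():xs.max() + 1], (64, 64))
        fused = np.concatenate([phog_feature(region),                         # step S5
                                np.bincount(mlbp_image(region).ravel(),       # step S6
                                            minlength=256)])
        print(gesture_to_action(knn_classify(fused, templates, labels)))      # steps S8-S9
    cap.release()
```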
The embodiment can simply and efficiently realize the control of the mouse cursor and provide an efficient gesture recognition method for other gesture recognition occasions.
Preferably, in this embodiment, MATLAB is used to call the computer's camera or an external USB camera to acquire video data, detect and recognize the gestures of the person in the video, and control the cursor to move or click in real time according to the gestures. A gesture image recognition method is provided that fuses PHOG (Pyramid Histogram of Oriented Gradients) features with improved LBP (Local Binary Pattern) features and classifies them with a K-NN (K-nearest neighbor) classifier. To improve the real-time performance of the system, skin color detection is first used to determine whether the current frame contains a human hand. When a hand is detected, the PHOG features and the improved LBP features are further extracted, fused, and classified by the K-NN classifier to recognize the gesture. This embodiment can quickly and accurately recognize user gestures under complex background conditions such as different viewing angles and lighting, and accurately controls the mouse cursor in real time according to the recognition result.
The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.

Claims (6)

1. A mouse cursor control method based on gesture recognition is characterized in that: the method comprises the following steps:
step S1: calling a computer camera or an external USB camera on an MATLAB platform to acquire user gesture video data;
step S2: carrying out median filtering image preprocessing on the collected user gesture video data;
step S3: carrying out skin color detection on the filtered image based on an elliptical clustering model, and carrying out binarization processing on video data according to the elliptical skin color clustering model so as to segment a gesture area of the current video frame;
step S4: setting a threshold β for the number of skin color pixels, where 50 < β < 100, and counting the number of detected skin color pixels; if the number of skin color pixels exceeds the threshold β, a gesture is considered to be present in the current frame and the feature extraction of step S5 is performed; otherwise, the current frame is considered to contain no gesture and the next frame is processed;
step S5: extracting PHOG characteristics of the gesture area;
step S6: improving the traditional LBP characteristics to obtain MLBP characteristics, and fusing the MLBP characteristics and the PHOG characteristics, namely connecting an MLBP characteristic vector and a PHOG characteristic vector in series to obtain a fused characteristic vector;
step S7: making a characteristic template database;
step S8: comparing the fusion features extracted in real time in the step S6 with the features in the feature template database by using a K-NN classification algorithm to obtain an identification result;
step S9: after calling the java.awt.Robot class and initializing it in the MATLAB platform, the mouse can be flexibly controlled to move to any position on the screen by setting the position of the mouse cursor;
the step S5 specifically includes the following steps:
step S51: gamma correction is carried out on the gesture area image after normalization processing;
step S52: calculating the gradient amplitude and the gradient direction of the corrected gesture area pixel point (x, y), wherein the calculation formula is as follows:
Gx(x,y)=H(x+1,y)-H(x-1,y)
Gy(x,y)=H(x,y+1)-H(x,y-1)
G(x, y) = sqrt(Gx(x, y)^2 + Gy(x, y)^2)
θ(x, y) = arctan(Gy(x, y) / Gx(x, y))
where Gx(x, y) is the horizontal gradient, Gy(x, y) is the vertical gradient, G(x, y) is the gradient magnitude, and θ(x, y) is the gradient direction;
step S53: dividing the normalized gesture area image into a number of cells and computing the gradient information of each cell with a 9-bin histogram, i.e. the gradient directions within a cell are quantized into 9 orientations, and each pixel in the cell casts a vote weighted by its gradient magnitude into the bin of its gradient direction, giving the cell's histogram of oriented gradients, namely the HOG feature of the cell; the HOG features of several cells are concatenated to form the HOG feature of a larger block; the HOG features of all blocks in the gesture area are concatenated to form the HOG feature of the image;
step S54: dividing the gesture area into three block scales, and extracting the HOG characteristics of the gesture area on the three scales respectively:
the HOG features extracted from the original gesture area are HOG features of a first scale, the HOG features extracted after the original gesture area image is divided into 2 x 2 sub-blocks are HOG features of a second scale, and the HOG features extracted after the original gesture area image is divided into 4 x 4 sub-blocks are HOG features of a third scale;
connecting HOG characteristics of three scales in series to obtain a PHOG characteristic;
in step S6, the improving the conventional LBP feature to obtain the MLBP feature specifically includes the following steps:
step Sa: calculating the coordinates of the neighborhood pixels from the coordinates of the current center pixel, where (Xc, Yc) are the coordinates of the center pixel, R is the sampling radius, P is the number of sampling points, i.e. the number of neighborhood pixels, and (Xp, Yp) are the coordinates of the p-th neighborhood pixel;
Xp = Xc + R·cos(2πp / P)
Yp = Yc − R·sin(2πp / P)
step Sb: performing bilinear interpolation at the calculated neighborhood pixel coordinates, which are in general not integers, to obtain the corresponding pixel values; the interpolation formula is as follows:
f(x, y) ≈ f(0,0)(1 − x)(1 − y) + f(1,0)·x·(1 − y) + f(0,1)(1 − x)·y + f(1,1)·x·y
step Sc: sorting the P neighborhood pixel values around the center pixel, removing the minimum and the maximum, and taking the mean of the remaining values as the binarization threshold; the MLBP feature is calculated as follows, where gm is the mean of the neighborhood pixel values after the minimum and maximum have been removed:
MLBP(P, R) = Σ_{p=0}^{P−1} s(gp − gm)·2^p,  where gp is the value of the p-th neighborhood pixel and s(x) = 1 if x ≥ 0, otherwise s(x) = 0
2. the mouse cursor control method based on gesture recognition according to claim 1, wherein: the step S2 specifically includes the following steps:
step S21: converting the collected RGB video data into YCbCr video data;
step S22: and performing median filtering on the YCbCr video data to obtain clearer video data.
3. The mouse cursor control method based on gesture recognition according to claim 1, wherein: in step S3, the skin color detection based on the elliptical clustering model uses the following formulas:
x = cosθ·(Cb − cx) + sinθ·(Cr − cy)
y = −sinθ·(Cb − cx) + cosθ·(Cr − cy)
(x − ecx)^2 / a^2 + (y − ecy)^2 / b^2 ≤ 1
and carrying out binarization processing on the video data according to the oval skin color clustering model, wherein a binarization formula is as follows:
B(Cb, Cr) = 1, if (x − ecx)^2 / a^2 + (y − ecy)^2 / b^2 ≤ 1; otherwise B(Cb, Cr) = 0
if the binarized value is 1, the pixel belongs to a skin color area, namely the gesture area; if the binarized value is 0, it belongs to a non-skin color area, namely a non-gesture area;
in these formulas, θ = 2.53, cx = 109.38, cy = 152.02, ecx = 1.60, ecy = 2.41, a = 25.39, b = 14.03.
4. The mouse cursor control method based on gesture recognition according to claim 1, wherein: the specific content of step S7 is: under the environment close to the actual operation of the system, 50 pictures of various predefined gestures are collected, and the PHOG and MLBP fusion characteristics of each picture are extracted according to the steps from S1 to S6 and stored in a template database.
5. The mouse cursor control method based on gesture recognition according to claim 1, wherein: the specific content of step S8 is:
calculating the Euclidean distance between the target feature vector and each feature vector in the database; the Euclidean distance is calculated as follows, where T = (t1, t2, …, tn) is the feature vector of the input frame image and Q = (q1, q2, …, qn) is a feature vector stored in the template database;
D(T, Q) = sqrt(Σ_{i=1}^{n} (ti − qi)^2)
the K template features closest to the input feature are selected according to the calculated distances, and the category occurring most frequently among these K features is the recognition result; the value of K is determined by K-fold cross validation; when the recognized gesture is gesture 1, the mouse moves left; when it is gesture 2, the mouse moves right; when it is gesture 3, the mouse moves up; when it is gesture 4, the mouse moves down; when it is a fist, a left click of the mouse is performed; when it is an open palm, a right click of the mouse is performed; gesture 1 is the hand with one finger extended, gesture 2 with two fingers extended, gesture 3 with three fingers extended, and gesture 4 with four fingers extended.
6. A mouse cursor control system based on gesture recognition, characterized in that: it comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the method steps according to any one of claims 1-5 are implemented when the computer program is executed by the processor.
CN202010834455.9A 2020-08-19 2020-08-19 Mouse cursor control method and system based on gesture recognition Active CN111913584B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010834455.9A CN111913584B (en) 2020-08-19 2020-08-19 Mouse cursor control method and system based on gesture recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010834455.9A CN111913584B (en) 2020-08-19 2020-08-19 Mouse cursor control method and system based on gesture recognition

Publications (2)

Publication Number Publication Date
CN111913584A CN111913584A (en) 2020-11-10
CN111913584B true CN111913584B (en) 2022-04-01

Family

ID=73278583

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010834455.9A Active CN111913584B (en) 2020-08-19 2020-08-19 Mouse cursor control method and system based on gesture recognition

Country Status (1)

Country Link
CN (1) CN111913584B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114148838A (en) * 2021-12-29 2022-03-08 淮阴工学院 Elevator non-contact virtual button operation method

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7588771B2 (en) * 2003-06-18 2009-09-15 Genelux Corporation Microorganisms for therapy
CN103679145A (en) * 2013-12-06 2014-03-26 河海大学 Automatic gesture recognition method
CN106886751A (en) * 2017-01-09 2017-06-23 深圳数字电视国家工程实验室股份有限公司 A kind of gesture identification method and system
CN107038416A (en) * 2017-03-10 2017-08-11 华南理工大学 A kind of pedestrian detection method based on bianry image modified HOG features
CN109086687A (en) * 2018-07-13 2018-12-25 东北大学 The traffic sign recognition method of HOG-MBLBP fusion feature based on PCA dimensionality reduction
CN109189219A (en) * 2018-08-20 2019-01-11 长春理工大学 The implementation method of contactless virtual mouse based on gesture identification
CN109359549A (en) * 2018-09-20 2019-02-19 广西师范大学 A kind of pedestrian detection method based on mixed Gaussian and HOG_LBP

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Hand gesture recognition based on HOG-LBP feature";Zhang F等;《 2018 IEEE International Instrumentation and Measurement Technology Conference (I2MTC)》;20180712;第1-6页 *
"基于PCA-HOG与LBP特征融合的静态手势识别方法研究";王瑶;《中国优秀硕士学位论文全文数据库 信息科技辑》;20180215(第02期);I138-2262 *
"基于肤色特征和卷积神经网络的手势识别方法";杨文斌等;《重庆工商大学学报(自然科学版)》;20180831;第35卷(第4期);第76-81页 *

Also Published As

Publication number Publication date
CN111913584A (en) 2020-11-10

Similar Documents

Publication Publication Date Title
CN107808143B (en) Dynamic gesture recognition method based on computer vision
JP5297530B2 (en) Image processing apparatus and interface apparatus
Jin et al. A mobile application of American sign language translation via image processing algorithms
US7912253B2 (en) Object recognition method and apparatus therefor
Lahiani et al. Real time hand gesture recognition system for android devices
US20160154469A1 (en) Mid-air gesture input method and apparatus
JP2014137818A (en) Method and device for identifying opening and closing operation of palm, and man-machine interaction method and facility
Yasen Vision-based control by hand-directional gestures converting to voice
CN110232308B (en) Robot-following gesture track recognition method based on hand speed and track distribution
EP2601615A1 (en) Gesture recognition system for tv control
CN103150019A (en) Handwriting input system and method
CN110032932B (en) Human body posture identification method based on video processing and decision tree set threshold
CN110688965A (en) IPT (inductive power transfer) simulation training gesture recognition method based on binocular vision
Thabet et al. Fast marching method and modified features fusion in enhanced dynamic hand gesture segmentation and detection method under complicated background
KR20120089948A (en) Real-time gesture recognition using mhi shape information
CN111913584B (en) Mouse cursor control method and system based on gesture recognition
Thabet et al. Algorithm of local features fusion and modified covariance-matrix technique for hand motion position estimation and hand gesture trajectory tracking approach
Mahmud et al. Recognition of symbolic gestures using depth information
Hu et al. Gesture detection from RGB hand image using modified convolutional neural network
JP4929460B2 (en) Motion recognition method
Shitole et al. Dynamic hand gesture recognition using PCA, Pruning and ANN
Elsayed et al. Hybrid method based on multi-feature descriptor for static sign language recognition
Li Vision based gesture recognition system with high accuracy
CN110956095A (en) Multi-scale face detection method based on corner skin color detection
Kotha et al. Gesture Recognition System

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant