Mechanical arm remote control method based on static gesture recognition
Technical Field
The invention relates to the technical field of mechanical arm control, in particular to a mechanical arm remote control method based on static gesture recognition.
Background
With the rapid development of the robot industry, new robots emerge in an endless stream. Voice conversation robots, companion robots and remote-controlled robots are becoming popular, and ordinary people can experience the fun and convenience brought by high-tech products. As a veteran of the robot industry, the mechanical arm has long been used in factories to perform the same repetitive work.
At present, sensor-based gesture recognition for remote control relies on expensive external equipment to recognize gestures and thereby control a mechanical arm. For example, the paper "Ultrasonic detection of man-machine interaction gestures and an HMM-fused SVM recognition algorithm thereof" uses an SVM-HMM gesture recognition algorithm based on ultrasonic Doppler frequency shift to recognize and classify the extracted gesture feature sequences, reaching a total gesture recognition rate of 94.625%. Alternatively, in the mechanical arm control system of patent publication No. CN110695990A, entitled "Robot arm control system based on Kinect gesture recognition", the device on which gesture recognition depends is a Kinect. Such gesture recognition is costly; moreover, the few methods that control the mechanical arm through static gesture recognition with an ordinary camera still require further improvement in the efficiency and recognition time of the known algorithms.
Disclosure of Invention
Aiming at the above technical problems, the invention provides a mechanical arm remote control method based on static gesture recognition. Compared with a traditional mechanical arm control system based on gesture recognition, the system only needs one color camera connected to a PC (personal computer), thereby freeing the operator from heavy wearable equipment and reducing equipment cost; the above problems can thus be effectively solved. The invention is realized by the following technical scheme:
A mechanical arm remote control method based on static gesture recognition comprises a mechanical arm body, wherein a driving device of the mechanical arm body is connected with a mechanical arm controller, and the mechanical arm controller issues commands to control the operation of the mechanical arm; the mechanical arm controller is in signal connection with an upper computer through a wireless network or a signal wire for signal transmission; the upper computer is in signal connection with a camera for acquiring gestures: the camera transmits the acquired images to the upper computer, and the upper computer performs image processing and recognition and judges the command issued by the gesture. The specific operation flow is as follows:
S1: starting the camera; the camera starts to work, collects images within its field of view in the form of frame data, and sends them to the upper computer for processing;
S2: the upper computer performs preliminary processing on the images uploaded by the camera and detects them frame by frame to check whether a gesture is present; if the corresponding skin color appears in an image, a gesture is judged to be present and the flow proceeds to step S3; if not, the flow returns to S1 to continue collecting images;
S3: extracting the features of the gesture data in the picture;
the upper computer first performs gray-level processing on the image containing the gesture, then normalizes it, then calculates the gradients of the image in the horizontal and vertical directions, performs histogram of oriented gradients (HOG) feature extraction, and finally extracts the hand features in the image;
S4: comparing the extracted gesture feature information with the stored feature information; if the comparison succeeds, performing S5, otherwise returning to S1;
S5: the mechanical arm completes the corresponding operation according to the instruction mapped to the gesture.
Further, when the camera described in step S1 is installed, numerical values need to be set for the camera first.
Further, the numerical value setting combines a motion frame difference method with an HSV skin color space threshold limiting method to obtain the skin color data area of the operator: the current frame image is differenced with the adjacent frame image to obtain a motion area, and the arm is swung so that the motion area contains the skin color area of the gesture part; after the motion area is obtained, the image is converted from the RGB color space to the HSV color space, the threshold is limited by an HSV skin color model, and the ranges of hue H, saturation S and lightness V of the skin color model are determined.
Further, the extraction of the features of the gesture data in the picture in step S3 comprises the following specific operation steps:
S31: carrying out gray processing on the acquired image, involving parameter information x, y and z, wherein x and y are the pixel coordinates and z is the gray value of the pixel at (x, y);
S32: normalizing the gesture image by a gamma correction method and adjusting the contrast of the image;
S33: calculating the gradients of the image in the horizontal and vertical directions; the gradient is a vector with both magnitude and direction and is used to solve the feature vector in the HOG feature extraction; the data involved are G_h(x, y), G_v(x, y) and θ(x, y), where G_h(x, y) is the horizontal gradient value of the pixel, G_v(x, y) is the vertical gradient value of the pixel, and θ(x, y) is the gradient direction of the pixel; the gradient strength M(x, y) and direction θ(x, y) of the pixel (x, y) are calculated by the following formulas:
M(x, y) ≈ |G_h(x, y)| + |G_v(x, y)|;
θ(x, y) = arctan(G_v(x, y) / G_h(x, y));
S34: extracting the HOG features; compressing the image to 28 × 28 pixels, dividing it into 36 blocks and each block into 4 cells, dividing the gradient direction of each cell into 9 bins of 20° each, performing weighted projection of the gradient strength M(x, y) and direction θ(x, y) within each cell, performing contrast normalization on the cells within the overlapping blocks, and serially fusing the feature vectors of all cells to obtain the HOG feature vector;
forming these vectors into a feature vector X, calculating the covariance matrix C of X, solving the eigenvalues and corresponding eigenvectors of the matrix C, arranging the eigenvectors into a matrix by rows from top to bottom in descending order of their eigenvalues, taking the first 100 rows to form a matrix P, and obtaining a new matrix Y according to the following formula:
Y = PX;
wherein the matrix Y is 100-dimensional;
S35: extracting the hand features; the hand features include three parameters: n, m and L, wherein n is the number of convex hull points of the contour, m is the number of convex hull defects (pits) of the contour, and L is the circularity;
traversing each point of the contour and finding the two extreme points A and B from their coordinates; the line AB divides the remaining points into an upper and a lower set, and the hull of each set is solved separately: the point C farthest from AB is found, the segments AC and BC are drawn, the recursion then continues on AC and BC in the same way, and so on until no farther point can be found; the farthest points found in this way are the vertices of the convex hull, giving the number n of convex hull points of the contour;
every two adjacent convex hull points enclose one convex hull pit, so by marking the convex hull points while solving for n, the number of convex hull pits m = n - 1 is obtained;
the circularity L of the contour is obtained from the area and the perimeter enclosed by the contour; the area S and the perimeter C of the contour are obtained with the bwarea() and bwperim() functions in Matlab, respectively, and the circularity is calculated as:
L = 4πS / C²;
S36: normalizing the HOG features and the hand features; the feature vector α = [h1, h2, …, h100] of the matrix Y after HOG dimensionality reduction is obtained, and the vector α is serially fused with the hand feature vector β = [n, m, L] to obtain the fused feature vector R = [h1, h2, …, h100, n, m, L], i.e., the final hand features.
Further, in step S4 the extracted gesture feature information is compared with the stored feature information in the following specific manner: the hand feature information extracted in step S3 is classified and matched with the SVM classifier trained in Matlab, and if the matching succeeds, the corresponding instruction is issued to the mechanical arm.
Further, the issued instructions include M_reset, M_pause and M_start, wherein M_reset controls the resetting of the mechanical arm, M_pause controls the pausing of the mechanical arm, and M_start controls the starting of the mechanical arm.
Further, the upper computer and the mechanical arm controller are arranged in the same 5G CPE environment, and the data transmission information structures of the upper computer and the mechanical arm controller are as follows:
Byte:  0 | 1 | 2 | 3 | 4 | 5 | 6 … N+5 | N+6 | N+7
Field: STX | LEN | SEQ | SYS | COMP | MSG | PAYLOAD | CKA | CKB
STX is used to identify the message for parsing; LEN records the length of the payload information; SEQ is the message sequence number, used for communication reliability checks; SYS is the system ID of the system sending the message; COMP is the component ID of the component, within that system, sending the message; MSG identifies the kind of message; PAYLOAD is the instruction information; CKA and CKB are the message check bytes.
Advantageous effects
Compared with the prior art, the mechanical arm remote control method based on static gesture recognition has the following beneficial effects:
(1) According to the technical scheme, the image is obtained through a camera, and interference information is eliminated through gesture segmentation, morphological processing and contour extraction. In the gesture feature extraction, the HOG features of the gesture image are extracted first and reduced in dimensionality by PCA (principal component analysis); several hand features are then calculated with a fast convex hull algorithm; the HOG features and the hand features are normalized and then serially fused to form the final classification features, which are finally classified and recognized by an SVM (support vector machine). The average recognition rate reaches 96%, and both speed and precision are markedly improved over HOG + SVM and HOG + Hu + SVM.
Drawings
FIG. 1 is an overall flow chart of the present invention.
Fig. 2 is a block diagram of a hardware device architecture in the present invention.
Fig. 3 is a schematic diagram of the hardware device connection in the present invention.
Reference numbers in the drawings: mechanical arm body 1, mechanical arm controller 2, upper computer 3, camera 4.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. The described embodiments are only some embodiments of the invention, not all embodiments. Various modifications and improvements of the technical solutions of the present invention may be made by those skilled in the art without departing from the design concept of the present invention, and all of them should fall into the protection scope of the present invention.
Example 1:
As shown in figs. 1-3, a mechanical arm remote control method based on static gesture recognition comprises a mechanical arm body, wherein a driving device of the mechanical arm body is connected with a mechanical arm controller, and the mechanical arm controller issues commands to control the operation of the mechanical arm; the mechanical arm controller is in signal connection with an upper computer through a wireless network or a signal wire for signal transmission; the upper computer is in signal connection with a camera for acquiring gestures: the camera transmits the acquired images to the upper computer, and the upper computer performs image processing and recognition and judges the command issued by the gesture. The specific operation flow is as follows:
S1: starting the camera; the camera starts to work, collects images within its field of view in the form of frame data, and sends them to the upper computer for processing;
When the camera is installed, its numerical values need to be set first. The numerical value setting combines a motion frame difference method with an HSV skin color space threshold limiting method to obtain the skin color data area of the operator: the current frame image is differenced with the adjacent frame image to obtain a motion area, and the arm is swung so that the motion area contains the skin color area of the gesture part; after the motion area is obtained, the image is converted from the RGB color space to the HSV color space, the threshold is limited by an HSV skin color model, and the ranges of hue H, saturation S and lightness V of the skin color model are determined.
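A minimal Matlab sketch of this segmentation step is given below. The frame-difference threshold, the H/S/V ranges, the structuring-element size and the noise-blob limit are illustrative assumptions; the patent does not fix these values.

```matlab
% Sketch: motion frame difference combined with HSV skin-color thresholding.
% All numeric thresholds below are illustrative assumptions.
function mask = segmentGesture(prevFrame, currFrame)
    % Motion area: absolute difference between adjacent frames
    diffGray = imabsdiff(rgb2gray(prevFrame), rgb2gray(currFrame));
    motion   = imdilate(diffGray > 15, strel('disk', 5));  % assumed threshold

    % Skin area: convert RGB -> HSV and limit hue H, saturation S, lightness V
    hsv  = rgb2hsv(currFrame);
    skin = hsv(:,:,1) <= 0.10 & ...                        % hue H
           hsv(:,:,2) >= 0.23 & hsv(:,:,2) <= 0.68 & ...   % saturation S
           hsv(:,:,3) >= 0.35;                             % lightness V

    % Gesture mask: skin-colored pixels inside the motion area
    mask = bwareaopen(motion & skin, 500);                 % drop small noise blobs
end
```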
S2: the upper computer performs preliminary processing on the images uploaded by the camera and detects them frame by frame to check whether a gesture is present; if the corresponding skin color appears in an image, a gesture is judged to be present and the flow proceeds to step S3; if not, the flow returns to S1 to continue collecting images;
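The acquisition and detection loop of steps S1-S2 can be sketched as follows; webcam and snapshot come from the MATLAB Support Package for USB Webcams, and processGesture is a hypothetical handler standing in for steps S3-S5.

```matlab
% Sketch: S1 (acquisition) and S2 (frame-by-frame gesture detection).
cam  = webcam(1);                        % open the color camera
prev = snapshot(cam);                    % first frame
while true
    curr = snapshot(cam);                % collect a frame of the field of view
    mask = segmentGesture(prev, curr);   % skin/motion mask (see sketch above)
    if any(mask(:))                      % skin color present -> gesture detected
        processGesture(curr, mask);      % hypothetical handler for S3-S5
    end
    prev = curr;                         % slide the frame window
end
```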
S3: extracting the features of the gesture data in the picture;
the upper computer first performs gray-level processing on the image containing the gesture, then normalizes it, then calculates the gradients of the image in the horizontal and vertical directions, performs histogram of oriented gradients (HOG) feature extraction, and finally extracts the hand features in the image;
The extraction of the features of the gesture data in the picture comprises the following specific operation steps:
S31: carrying out gray processing on the acquired image, involving parameter information x, y and z, wherein x and y are the pixel coordinates and z is the gray value of the pixel at (x, y);
S32: normalizing the gesture image by a gamma correction method and adjusting the contrast of the image;
S33: calculating the gradients of the image in the horizontal and vertical directions; the gradient is a vector with both magnitude and direction and is used to solve the feature vector in the HOG feature extraction; the data involved are G_h(x, y), G_v(x, y) and θ(x, y), where G_h(x, y) is the horizontal gradient value of the pixel, G_v(x, y) is the vertical gradient value of the pixel, and θ(x, y) is the gradient direction of the pixel; the gradient strength M(x, y) and direction θ(x, y) of the pixel (x, y) are calculated by the following formulas:
M(x, y) ≈ |G_h(x, y)| + |G_v(x, y)|;
θ(x, y) = arctan(G_v(x, y) / G_h(x, y));
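In Matlab the gradient computation can be sketched as below; the one-dimensional [-1 0 1] derivative kernel is an assumed (HOG-typical) choice, as the patent does not specify the kernel.

```matlab
% Sketch: horizontal/vertical gradients, gradient strength and direction (S33).
img   = im2double(grayImg);                     % grayImg: normalized gesture image
Gh    = imfilter(img, [-1 0 1],  'replicate');  % horizontal gradient G_h(x, y)
Gv    = imfilter(img, [-1 0 1]', 'replicate');  % vertical gradient   G_v(x, y)
M     = abs(Gh) + abs(Gv);                      % gradient strength M(x, y)
theta = atan2(Gv, Gh);                          % gradient direction theta(x, y)
```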
S34: extracting the HOG features; compressing the image to 28 × 28 pixels, dividing it into 36 blocks and each block into 4 cells, dividing the gradient direction of each cell into 9 bins of 20° each, performing weighted projection of the gradient strength M(x, y) and direction θ(x, y) within each cell, performing contrast normalization on the cells within the overlapping blocks, and serially fusing the feature vectors of all cells to obtain the HOG feature vector;
forming these vectors into a feature vector X, calculating the covariance matrix C of X, solving the eigenvalues and corresponding eigenvectors of the matrix C, arranging the eigenvectors into a matrix by rows from top to bottom in descending order of their eigenvalues, taking the first 100 rows to form a matrix P, and obtaining a new matrix Y according to the following formula:
Y = PX;
wherein the matrix Y is 100-dimensional;
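The HOG extraction and PCA reduction can be sketched as follows. Using the toolbox function extractHOGFeatures (with 4 × 4-pixel cells and 2 × 2-cell blocks, which reproduces the 36-block, 1296-dimensional layout above) instead of a hand-written HOG, and the name trainHOG for the training matrix, are assumptions.

```matlab
% Sketch: 1296-D HOG features (S34) reduced to 100-D by PCA.
img28 = imresize(grayImg, [28 28]);                 % compress the image to 28 x 28
hog   = extractHOGFeatures(img28, 'CellSize', [4 4], ...
            'BlockSize', [2 2], 'NumBins', 9);      % 36 blocks x 4 cells x 9 bins = 1296

% PCA over the training matrix trainHOG (assumed: one 1296-D HOG vector per column)
X = trainHOG - mean(trainHOG, 2);                   % center the feature vectors
C = cov(X');                                        % 1296 x 1296 covariance matrix
[V, D]     = eig(C);                                % eigenvectors/eigenvalues of C
[~, order] = sort(diag(D), 'descend');              % rank by descending eigenvalue
P = V(:, order(1:100))';                            % first 100 eigenvectors as rows
Y = P * X;                                          % Y = PX, 100-D per sample
```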
S35: extracting the hand features; the hand features include three parameters: n, m and L, wherein n is the number of convex hull points of the contour, m is the number of convex hull defects (pits) of the contour, and L is the circularity;
traversing each point of the contour and finding the two extreme points A and B from their coordinates; the line AB divides the remaining points into an upper and a lower set, and the hull of each set is solved separately: the point C farthest from AB is found, the segments AC and BC are drawn, the recursion then continues on AC and BC in the same way, and so on until no farther point can be found; the farthest points found in this way are the vertices of the convex hull, giving the number n of convex hull points of the contour;
every two adjacent convex hull points enclose one convex hull pit, so by marking the convex hull points while solving for n, the number of convex hull pits m = n - 1 is obtained;
the circularity L of the contour is obtained from the area and the perimeter enclosed by the contour; the area S and the perimeter C of the contour are obtained with the bwarea() and bwperim() functions in Matlab, respectively, and the circularity is calculated as:
L = 4πS / C²;
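A Matlab sketch of these hand features is given below; the built-in convhull stands in for the hand-written quick-hull recursion described above, and the gesture is assumed to be the first (largest) boundary in the binary mask produced by segmentation.

```matlab
% Sketch: contour features n (hull points), m (hull pits) and L (circularity).
B   = bwboundaries(mask);              % mask: binary gesture image
pts = B{1};                            % [row col] contour points (assumed gesture blob)
k   = convhull(pts(:,2), pts(:,1));    % convex hull indices (stand-in for quick-hull)
n   = numel(k) - 1;                    % hull vertex count (last index repeats the first)
m   = n - 1;                           % one pit per pair of adjacent hull points

S = bwarea(mask);                      % contour area
C = nnz(bwperim(mask));                % contour perimeter (boundary-pixel count)
L = 4 * pi * S / C^2;                  % circularity L = 4*pi*S / C^2
```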
S36: normalizing the HOG features and the hand features; the feature vector α = [h1, h2, …, h100] of the matrix Y after HOG dimensionality reduction is obtained, and the vector α is serially fused with the hand feature vector β = [n, m, L] to obtain the fused feature vector R = [h1, h2, …, h100, n, m, L], i.e., the final hand features.
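The normalization and serial fusion of step S36 then reduce to a concatenation; min-max ('range') normalization is an assumed choice, as the text only states that the features are normalized.

```matlab
% Sketch: normalize both feature groups and fuse them serially (S36).
alpha = normalize(Y(:, 1)', 'range');  % 100-D PCA-reduced HOG features of one sample
beta  = normalize([n, m, L], 'range'); % 3-D hand features
R     = [alpha, beta];                 % R = [h1, ..., h100, n, m, L], 103-D
```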
S4: comparing the extracted gesture feature information with the stored feature information; if the comparison succeeds, performing S5, otherwise returning to S1;
In step S4, the extracted gesture feature information is compared with the stored feature information in the following specific manner: the hand feature information extracted in step S3 is classified and matched with the SVM classifier trained in Matlab, and if the matching succeeds, the corresponding instruction is issued to the mechanical arm. The issued instructions include M_reset, M_pause and M_start, wherein M_reset controls the resetting of the mechanical arm, M_pause controls the pausing of the mechanical arm, and M_start controls the starting of the mechanical arm.
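A sketch of the classification and command mapping follows; fitcecoc (a multi-class wrapper around binary SVM learners), the training variable names and the label-to-command mapping are assumptions, since the patent only states that an SVM classifier is trained in Matlab.

```matlab
% Sketch: train a multi-class SVM on fused features and map gestures to commands.
svmModel = fitcecoc(trainR, trainLabels);  % trainR: one fused 103-D vector per row
label    = predict(svmModel, R);           % classify a newly extracted feature R
switch label                               % assumed label-to-command mapping
    case 1, cmd = 'M_reset';               % reset the mechanical arm
    case 2, cmd = 'M_pause';               % pause the mechanical arm
    case 3, cmd = 'M_start';               % start the mechanical arm
end
```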
S5: the mechanical arm completes the corresponding operation according to the instruction mapped to the gesture.
The upper computer and the mechanical arm controller are arranged in the same 5G CPE environment, and the data transmission information structures of the upper computer and the mechanical arm controller are as follows:
Byte:  0 | 1 | 2 | 3 | 4 | 5 | 6 … N+5 | N+6 | N+7
Field: STX | LEN | SEQ | SYS | COMP | MSG | PAYLOAD | CKA | CKB
STX is used to identify the message for parsing; LEN records the length of the payload information; SEQ is the message sequence number, used for communication reliability checks; SYS is the system ID of the system sending the message; COMP is the component ID of the component, within that system, sending the message; MSG identifies the kind of message; PAYLOAD is the instruction information; CKA and CKB are the message check bytes.
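Packing a command into this frame can be sketched as follows; the STX value (0xFE), the ID values in the usage comment and the simple additive checksum are illustrative assumptions, as the patent does not specify them.

```matlab
% Sketch: pack one instruction into the STX|LEN|SEQ|SYS|COMP|MSG|PAYLOAD|CKA|CKB frame.
% The STX value and the additive checksum are assumptions.
function frame = packFrame(seq, sysId, compId, msgId, payload)
    header = uint8([hex2dec('FE'), numel(payload), seq, sysId, compId, msgId]);
    body   = [header, uint8(payload)];
    s      = sum(double(body(2:end)));        % checksum over LEN..PAYLOAD (assumed)
    cka    = uint8(mod(s, 256));              % low byte  -> CKA
    ckb    = uint8(mod(floor(s/256), 256));   % high byte -> CKB
    frame  = [body, cka, ckb];
end
% Usage (hypothetical IDs): frame = packFrame(0, 1, 1, 10, uint8('M_start'));
```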
The experimental environment of this embodiment is based on experimental verification under the Windows 10 operating system, using a desktop computer with an AMD 2600 3.4 GHz CPU, 16 GB of memory and a 256 GB solid-state drive, a general camera with a resolution of 640 × 480 as the acquisition device, and Matlab 2020a as the image processing software.
For each of the three gestures, 586 images were collected, 586 × 3 gesture images in total; for each gesture, 200 images were randomly selected for training the SVM model and the remaining 386 were used for testing, i.e., 200 × 3 images for training and 386 × 3 images for testing. The collected images were subjected to normalization, binarization, morphological and other image processing and stored as the training set and the test set.
In order to verify the robustness of the experiment, the training set and the test set are used to experimentally compare HOG + SVM, HOG + Hu + SVM and HOG + hand multi-feature (the algorithm of the present invention). The comparison table is shown below:
TABLE 1 comparison of performance of the three algorithms
The invention mainly develops research around static gestures and the remote control of a mechanical arm according to those gestures, and provides a mechanical arm remote control method based on static gesture recognition aimed at the defects of current gesture recognition, namely its high price and low recognition rate. In the method, images are acquired through a camera, and interference information is eliminated through gesture segmentation, morphological processing and contour extraction. In the gesture feature extraction, the HOG features of the gesture image are extracted first and reduced in dimensionality by PCA; several hand features are then calculated with a fast convex hull algorithm; the HOG features and the hand features are normalized and then serially fused to form the final classification features, which are finally classified and recognized by an SVM (support vector machine). The average recognition rate reaches 96%, and the speed and precision of the method are markedly improved over HOG + SVM and HOG + Hu + SVM.
In the experiment, the HOG + SVM algorithm and the HOG + Hu + SVM algorithm are selected for comparison with the algorithm of the present invention, with the following results:
HOG + SVM extracts the gesture features with a histogram of oriented gradients (HOG) feature extraction algorithm and uses a support vector machine (SVM) as the classifier to recognize gestures. HOG + Hu + SVM extracts the collected features by fusing histogram of oriented gradients (HOG) features with Hu moment features and recognizes the gesture with a support vector machine (SVM); the Hu moments of an image are image features invariant to translation, rotation and scale. Since a gesture recognition method based on a single feature cannot guarantee the accuracy of gesture recognition, fusing the extracted hand multi-features with the HOG features can, to a certain extent, overcome the deficiency of HOG feature extraction alone.