CN107886057B

CN107886057B - Robot hand waving detection method and system and robot

Info

Publication number: CN107886057B
Application number: CN201711042859.9A
Authority: CN
Inventors: 张帆
Original assignee: Nanjing Avatarmind Robot Technology Co ltd
Current assignee: Nanjing Avatarmind Robot Technology Co ltd
Priority date: 2017-10-30
Filing date: 2017-10-30
Publication date: 2021-03-30
Anticipated expiration: 2037-10-30
Also published as: CN107886057A; WO2019085060A1

Abstract

The invention discloses a robot hand waving detection method, a system and a robot, wherein the method comprises the following steps: detecting a palm in a standard position in a video stream by a cascade of classifiers; extracting angular points of the palm at the standard position to obtain an angular point set; tracking and detecting each corner point in the corner point set in a video stream to obtain a motion track corresponding to each corner point; and judging whether the palm waves the hand or not according to the motion trail of each angular point in the angular point set. The invention realizes the detection of the waving hand in a complex environment, has strong anti-interference capability to the dynamic background noise and has high detection accuracy.

Description

Robot hand waving detection method and system and robot

Technical Field

The invention relates to the field of image recognition, in particular to a robot hand waving detection method and system and a robot.

Background

With the development of science and technology, artificial intelligence is more and more popular, and people hope to interact with a robot like a real person. However, the current interaction mode with the robot is mainly a mouse, a keyboard and a touch screen, and people cannot meet the traditional interaction mode with many limitations. Therefore, some more intelligent interaction methods need to be designed. The vision is an important channel for understanding the interactive information between people, and the interactive intention of the other party can be understood by visually observing the limb actions of the interactive objects. Therefore, in order to increase the personifying character in the robot-human interaction process, it is necessary for the robot to understand some of the body languages of the human, such as waving a hand for a call.

The invention patent application No. 201610859376.7 discloses a waving detection method based on motion history images. The present invention first detects a rough area where a human body is located by a human body detector, and then analyzes the area based on a motion history image to determine whether the human body is waving a hand. However, the method based on motion history image analysis is only suitable for a static background, and when a person is in a complex dynamic background, the motion background can be easily judged as that the person is waving. The hand waving detection algorithm of the prior patent actually performs differential operation on three continuous frames of images of an area to be detected, then obtains a binary image from the obtained differential image by a threshold method, and finally utilizes a series of acquired binary image historical information to perform hand waving judgment. Because the invention adopts the frame difference method of continuous images, when the background is not fixed but dynamic, for example, if a person moves in the area to be detected, the dynamic background is easily interfered to identify the dynamic background as the person waving the hand.

Therefore, it is necessary to design a method capable of accurately recognizing a waving hand even under a complex background, so as to improve the intelligence of human-computer interaction.

Disclosure of Invention

The invention provides a robot hand waving detection method, a robot hand waving detection system and a robot, which realize the identification of palms under a complex background and track the palms, thereby detecting whether hands are waving or not and having strong anti-interference performance on background noise. The technical scheme is as follows:

a method of detecting a hand swing of a robot, comprising: detecting a palm in a standard position in a video stream by a cascade of classifiers; extracting angular points of the palm at the standard position to obtain an angular point set; tracking and detecting each corner point in the corner point set in a video stream to obtain a motion track corresponding to each corner point; and judging whether the palm waves the hand or not according to the motion trail of each angular point in the angular point set.

Compared with the detection method for waving hands in the prior art, the palm detection method can execute palm detection on a single video frame, and the waving hands can be detected in the waving process without combining historical data of the front video frame and the back video frame. Gesture recognition can be completed on a complex background through the cascade classifier, then the corner points of the palm are detected and tracked, hand waving detection under a complex environment is achieved, anti-interference capability on dynamic background noise is high, and accuracy on hand waving detection is high.

Preferably, the determining whether the user swings the hand according to the motion trajectory of each corner point in the corner point set specifically includes: analyzing the number of effective motion angular points in the angular point set according to the motion trail of each angular point in the angular point set between two adjacent frames in the video stream and a preset pixel distance; analyzing and obtaining the detection conditions of all corner points in the corner point set according to the number of the effective motion corner points and a first preset number; when the detection condition meets a preset requirement, analyzing and obtaining the number of effective waving angular points in the effective motion angular points according to the motion trail of the effective motion angular points; and judging whether the user waves the hands or not according to the number of the effective hand waving angular points and a second preset number.

By detecting the angular point track in the video stream, whether the angular point carries out effective motion is judged firstly, so that the accuracy of hand waving detection is improved; whether the angular point effectively moves leftwards and rightwards is judged, and whether the palm waves the hand or not can be effectively judged.

Preferably, the analyzing the number of effective motion corner points in the corner point set according to the motion trajectory of each corner point in the corner point set between two adjacent frames of the video stream and the preset pixel distance specifically includes: calculating the motion distance of the same corner point between two adjacent frames in the video stream according to the motion trail of each corner point in the corner point set between two adjacent frames in the video stream; judging whether the movement distance is larger than a preset pixel distance; when the motion distance is larger than a preset pixel distance, judging that the angular point performs effective motion, and taking the angular point as the effective motion angular point; when the motion distance is smaller than the preset pixel distance, judging the angular points to carry out invalid motion, and counting the number N of the valid motion angular points_trackSubtracting 1, finally calculating the obtained N_trackI.e. the number of effective motion corners in said set of corners, N_trackIs the total number N of corner points in the set of corner points;

the angular points which do effective motion can be effectively screened out by comparing the angular point motion distance with the preset pixel distance, and the effective motion angular points N obtained from the angular points_trackThe quantity of the palm is accurate, and preliminary preparation is made for judging whether the palm swings the hand or not.

Preferably, the detection of all corner points in the corner point set is obtained by analyzing according to the number of the effective motion corner points and a first preset numberThe situation specifically includes: judging the number N of the effective motion angular points_trackWhether the number is larger than a first preset number; when the number of effective motion corner points N_trackWhen the number of the effective motion angular points is less than the first preset number, stopping detecting the motion trail of the effective motion angular points, and judging that the user does not wave the hand; when the number of effective motion corner points N_trackAnd when the number of the effective motion corner points is larger than the first preset number, continuously detecting the motion trail of the effective motion corner points.

Judging the number N of the effective motion angular points_trackWhether the quantity of the effective motion angular points is larger than a first preset quantity or not can preliminarily judge whether the palm waves the hand, if the quantity of the effective motion angular points is too low, the palm can be judged not to move, and if the quantity N of the effective motion angular points is larger than a second preset quantity_trackIf the number is larger than the first preset number, the next judgment process is continued.

Preferably, when the number of effective motion corner points N is less than the number of effective motion corner points_trackWhen the number of the effective motion corner points is larger than the first preset number, detecting the left motion distance and the right motion distance of the effective motion corner points between two adjacent frames in the video stream; judging whether the left movement distance and the right movement distance of the effective movement corner point between two adjacent frames in the video stream are both larger than a preset distance; when the left movement distance and the right movement distance of the effective movement angular points between two adjacent frames in the video stream are larger than the preset distance, recording the effective movement angular points as effective hand-waving angular points, and counting the number N of the effective hand-waving angular points_moveAdding 1 to finally calculate the obtained N_moveThat is, the number of effective hand-waving corner points in the corner point set is N_moveIs 0.

If the number N of the effective motion corner points_trackMore than a first preset number, judging whether the effective motion angular points carry out effective motion leftwards and rightwards, and accurately obtaining the number N of effective hand waving angular points which carry out effective motion leftwards and rightwards_moveFacilitating the next decision process.

Preferably, judging whether the user swings the hand specifically according to the number of the effective hand swinging corner points and a second preset number includes: judge said hasNumber of effective hand-waving corner points N_moveWhether the number is larger than a second preset number; when the number of effective hand waving angular points is N_moveWhen the number of the hands is larger than the second preset number, judging that the palm swings the hand; otherwise, judging that the palm does not wave the hand.

Because the palm must carry out left right hand process of waving when waving the hand, consequently the movement track of palm angular point also can move left right, can detect from this and effectively wave the quantity of angular point, through effectively waving the judgement of angular point quantity, when effectively waving angular point quantity and being greater than the second and predetermineeing quantity, then can accurately appear whether the palm is waving the hand respectively.

Preferably, before detecting the palm in the standard position in the video stream by the cascade classifier, the method comprises: making a palm training sample; extracting the LBP texture characteristic vector of the palm training sample to obtain an extracted and processed palm training sample; and training a cascade classifier by extracting the processed palm training sample.

Before hand waving detection, an adaboost cascade classifier needs to be trained, and LBP texture feature extraction is carried out on palm training samples required by the training of the adaboost cascade classifier, so that the identification accuracy of the adaboost cascade classifier can be improved.

Preferably, the making of the palm training sample specifically comprises: palm samples of different users in different contexts are collected, including samples in which the palm is rotated within a preset angular range relative to a standard position.

When the palm samples are screened, in order to extract the LBP characteristic value conveniently, the selected palm samples are palm samples within a preset angle range relative to the standard position, so that the gesture recognition success rate of the cascade classifier near the standard position is improved better. The palm samples of different users under different backgrounds are collected, the diversity of the samples is improved, when the collected palm training samples have certain shapes and illumination differences, the obtained detection effect is very good, and the success rate of the cascade classifier for gesture recognition is improved.

Preferably, the process of extracting the LBP texture feature vector of the palm training sample specifically includes: dividing a detection window into N multiplied by N pixel regions, and comparing the gray value of a central pixel point in the pixel regions with the gray value of 8 adjacent pixel points, wherein N belongs to {64,32,16,8 }; if the pixel value of the adjacent pixel point is larger than that of the central pixel point, the adjacent pixel point is marked as 1, otherwise, the adjacent pixel point is 0, so that 8-bit binary number is generated and used as the LBP value of the central pixel point, and the LBP value of each pixel point in the pixel area can be obtained through calculation; calculating a statistical histogram of the pixel region according to the LBP value obtained by each pixel point in the pixel region, and carrying out normalization processing on the statistical histogram; connecting the statistical histograms of each pixel region of the palm training sample to form an LBP texture feature vector of the palm training sample;

before training the cascade classifier, the LBP characteristic detection is carried out on the palm sample, so that the trained cascade classifier has high calculation speed and can be accurately positioned to the position of the palm on the image.

Preferably, detecting the corner points of the palm region to obtain a corner point set, and tracking and detecting all corner points in the corner point set in a video stream to obtain a motion trajectory corresponding to each corner point specifically: extracting the angular points of the palm at the standard position through an angular point detection algorithm to obtain an angular point set, carrying out optical flow tracking on all the angular points in the angular point set in a video stream, detecting the positions of all the angular points in the angular point set after movement according to the optical flow tracking algorithm, and calculating a movement track corresponding to each angular point according to a sparse optical flow algorithm.

The corner points of the palm area can be effectively extracted through a corner point detection algorithm, the optical flow tracking of each corner point in the video stream can be better realized through the optical flow tracking algorithm, the motion trail of the corner points can be better detected through a sparse optical flow algorithm, and the detection process of waving a hand is facilitated later.

A robot hand-waving detection system comprising: the detection module is used for detecting the palm at the standard position in the video stream through the cascade classifier; the angular point extraction module is electrically connected with the detection module and used for extracting the angular points of the palm at the standard position to obtain an angular point set; the angular point tracking module is electrically connected with the angular point extraction module and used for tracking and detecting each angular point in the angular point set in a video stream to obtain a motion track corresponding to each angular point; and the judging module is electrically connected with the angular point tracking module and is used for judging whether the palm waves the hand or not according to the motion trail of each angular point in the angular point set.

Preferably, the judging module includes: the analysis submodule analyzes and obtains the number of effective motion angular points in the angular point set according to the motion track of each angular point in the angular point set between two adjacent frames in the video stream and a preset pixel distance; the analysis submodule is further used for analyzing and obtaining the detection conditions of all the angular points in the angular point set according to the number of the effective motion angular points and a first preset number; the analysis submodule is further used for analyzing and obtaining the number of effective hand waving angular points in the effective motion angular points according to the motion trail of the effective motion angular points when the detection condition meets the preset requirement; and the judging submodule is electrically connected with the analyzing submodule and used for judging whether the user waves the hand or not according to the number of the effective hand waving corner points and a second preset number.

Preferably, the judging module further includes: the calculation submodule is used for calculating the motion distance of the same angular point between two adjacent frames in the video stream according to the motion trail of each angular point in the angular point set between two adjacent frames in the video stream; the judgment submodule is also electrically connected with the calculation submodule and is used for judging whether the movement distance is greater than a preset pixel distance; the judgment sub-module is further used for judging that the angular point performs effective motion when the motion distance is larger than a preset pixel distance, and taking the angular point as the effective motion angular point; the judgment sub-module is further used for judging the angular points to carry out invalid motion when the motion distance is smaller than the preset pixel distance, and the number N of the valid motion angular points_trackSubtracting 1, finally calculating the obtained N_trackI.e. the number of effective motion corners in said set of corners, N_trackIs the angleThe total number of angular points N in the point set;

preferably, the determining sub-module is further configured to determine the number N of effective motion corner points_trackWhether the number is larger than a first preset number; the judgment sub-module is further used for judging the number N of the effective motion corner points_trackWhen the number of the effective motion angular points is less than the first preset number, stopping detecting the motion trail of the effective motion angular points, and judging that the user does not wave the hand; the judgment sub-module is further used for judging the number N of the effective motion corner points_trackAnd when the number of the effective motion corner points is larger than the first preset number, continuously detecting the motion trail of the effective motion corner points.

Preferably, the determining module further includes a detecting sub-module, configured to determine the number N of valid motion corners_trackWhen the number of the effective motion corner points is larger than the first preset number, detecting the left motion distance and the right motion distance of the effective motion corner points between two adjacent frames in the video stream; the judgment sub-module is further used for judging whether the left movement distance and the right movement distance of the effective movement corner point between two adjacent frames in the video stream are both larger than a preset distance; the judgment sub-module is further used for recording the effective motion angular points as effective hand waving angular points and recording the number N of the effective hand waving angular points when the leftward motion distance and the rightward motion distance of the effective motion angular points between two adjacent frames in the video stream are larger than preset distances_moveAdding 1 to finally calculate the obtained N_moveThat is, the number of effective hand-waving corner points in the corner point set is N_moveIs 0.

Preferably, the judging sub-module is further configured to judge the number N of effective hand waving corner points_moveWhether the number is larger than a second preset number; the judging submodule is also used for judging the number N of the effective hand waving angular points_moveWhen the number of the hands is larger than the second preset number, judging that the palm swings the hand; otherwise, judging that the palm does not wave the hand.

Preferably, the method further comprises the following steps: the sample making module is used for making a palm training sample; the feature extraction module is used for extracting the LBP texture feature vector of the palm training sample to obtain the extracted and processed palm training sample; and the classifier training module is electrically connected with the feature extraction module and used for training the cascade classifier through the extracted palm training sample.

Preferably, the feature extraction module includes: the processing submodule is used for dividing the detection window into N multiplied by N pixel areas and comparing a central pixel point in the pixel areas with the gray values of 8 adjacent pixel points, wherein N belongs to {64,32,16 and 8 }; the characteristic value operator module is electrically connected with the processing submodule, if the pixel value of the adjacent pixel point is greater than that of the central pixel point, the adjacent pixel point is marked as 1, otherwise, the adjacent pixel point is 0, so that 8-bit binary number is generated and is used as the LBP value of the central pixel point, and the LBP value of each pixel point in the pixel area can be obtained through calculation; the processing submodule is also used for calculating a statistical histogram of the pixel region according to the LBP value obtained by each pixel point in the pixel region and carrying out normalization processing on the statistical histogram; the processing sub-module is further configured to connect the statistical histograms of each pixel region of the palm training sample to form an LBP texture feature vector of the palm training sample;

preferably, the corner point tracking module is further configured to extract the corner points of the palm located in the standard position through a corner point detection algorithm to obtain a corner point set, perform optical flow tracking on all corner points in the corner point set in a video stream, detect positions of all corner points in the corner point set after movement according to the optical flow tracking algorithm, and calculate a movement track corresponding to each corner point according to a sparse optical flow algorithm.

A robot is integrated with the hand waving detection system of the robot.

According to the robot hand waving detection method and system and the robot provided by the invention, at least one of the following beneficial effects can be realized:

1. compared with the detection method for waving hands in the prior art, the palm detection method can execute palm detection on a single video frame, and the waving hands can be detected in the waving process without combining historical data of the front video frame and the back video frame. Gesture recognition can be completed on a complex background through the cascade classifier, then the corner points of the palm are detected and tracked, hand waving detection under a complex environment is achieved, anti-interference capability on dynamic background noise is high, and accuracy on hand waving detection is high.

2. The corner points of the palm area can be effectively extracted through a corner point detection algorithm, the optical flow tracking of each corner point in the video stream can be better realized through the optical flow tracking algorithm, the motion trail of the corner points can be better detected through a sparse optical flow algorithm, and the detection process of waving a hand is facilitated later.

3. By detecting the angular point track in the video stream, whether the angular point carries out effective motion is judged at first, and the accuracy of hand waving detection is improved. Whether the angular point effectively moves leftwards and rightwards is judged, and whether the palm waves the hand or not can be effectively judged.

4. When the palm samples are screened, in order to extract the LBP characteristic value conveniently, the selected palm samples are palm samples within a preset angle range relative to the standard position, so that the gesture recognition success rate of the cascade classifier near the standard position is improved better. The palm samples of different users under different backgrounds are collected, the diversity of the samples is improved, when the collected palm training samples have certain shapes and illumination differences, the obtained detection effect is very good, and the success rate of the cascade classifier for gesture recognition is improved.

5. Before training the cascade classifier, the LBP characteristic detection is carried out on the palm sample, so that the trained cascade classifier has high calculation speed and can be accurately positioned to the position of the palm on the image.

Drawings

The above features, technical features, advantages and implementations of a method, system and robot for detecting a waving hand of a robot will be further described in the following detailed description of preferred embodiments with reference to the accompanying drawings.

FIG. 1 is a flowchart of an embodiment of a method for detecting a waving hand of a robot according to the present invention;

FIG. 2 is a flowchart of another embodiment of a method for detecting a waving hand of a robot according to the present invention;

FIG. 3 is a schematic view of a palm swing of the present invention;

FIG. 4 is a flowchart of another embodiment of a method for detecting a waving hand of a robot according to the present invention;

FIG. 5 is a view of the palm corner point tracking inspection of the present invention;

FIG. 6 is a flowchart of another embodiment of a method for detecting a waving hand of a robot according to the present invention;

FIG. 7 is a schematic view of a hand swing detecting system of a robot according to the present invention;

FIG. 8 is another schematic view of the hand swing detecting system of the robot according to the present invention;

FIG. 9 is another schematic view of the hand swing detecting system of the robot according to the present invention;

the reference numbers illustrate:

the system comprises a sample making module-1, a feature extraction module-2, a processing sub-module-21, a feature value calculating sub-operator module-22, a classifier training module-3, a detection module-4, a corner extraction module-5, a corner tracking module-6, a judgment module-7, an analysis sub-module-71, a judgment sub-module-72 and a calculation sub-module-73.

Detailed Description

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description will be made with reference to the accompanying drawings. It is obvious that the drawings in the following description are only some examples of the invention, and that for a person skilled in the art, other drawings and embodiments can be derived from them without inventive effort.

For the sake of simplicity, the drawings only schematically show the parts relevant to the present invention, and they do not represent the actual structure as a product. In addition, in order to make the drawings concise and understandable, components having the same structure or function in some of the drawings are only schematically illustrated or only labeled. In this document, "one" means not only "only one" but also a case of "more than one".

As shown in fig. 1, the present invention provides an embodiment of a method for detecting a waving hand of a robot, including:

s4 detecting a palm in a standard position in the video stream through the cascade of classifiers;

s5, extracting the corner points of the palm at the standard position to obtain a corner point set;

s6, tracking and detecting each corner point in the corner point set in the video stream to obtain a motion trail corresponding to each corner point;

s7, judging whether the palm swings according to the motion trail of each corner point in the corner point set.

Specifically, in this embodiment, first, the adaboost cascade classifier detects a palm in a standard position in the video stream by using an adaboost algorithm, after the palm in the standard position is detected, angular points on the palm are acquired by using an angular point acquisition method, tracking detection is performed on the angular points in the video stream, a motion trajectory of each angular point is obtained, and whether the palm in the video stream is waving can be judged according to the motion trajectory.

In this embodiment, the robot can synchronously detect the motion trajectories of the corners on the palm and the palm when acquiring the video stream, the palm waving detection is not performed based on the historical image, and the adaboost cascade classifier is used to have a high recognition rate for the palm.

In this embodiment, whether the palm swings the hand or not is determined by detecting the motion trajectories of the corner points, for example, the motion trajectories of each corner point in the set of corner points are detected, and if more than half of the corner points perform effective left-to-right motion, it can be determined that the user swings the hand.

As shown in fig. 2, the present invention also provides another embodiment of a method for detecting a waving hand of a robot, including:

s1, making a palm training sample;

s2, extracting LBP texture characteristic vectors of the palm training samples to obtain extracted and processed palm training samples;

s3 trains a cascade classifier by extracting the processed palm training samples.

Preferably, the making of the palm training sample specifically comprises: s11 collects palm samples of different users in different contexts, the palm samples including samples in which the palm is rotated within a preset angle range with respect to a standard position.

Preferably, the process of extracting the LBP texture feature vector of the palm training sample specifically includes:

s21, dividing the detection window into N multiplied by N pixel regions, and comparing the gray value of a central pixel point in the pixel regions with the gray value of 8 adjacent pixel points, wherein N belongs to {64,32,16,8 };

s22, if the pixel value of the neighboring pixel is greater than the pixel value of the central pixel, the neighboring pixel is marked as 1, otherwise, the neighboring pixel is 0, thereby generating an 8-bit binary number as the LBP value of the central pixel, and thereby calculating the LBP value of each pixel in the pixel region;

s23, calculating a statistical histogram of the pixel region according to the LBP value obtained by each pixel point in the pixel region, and carrying out normalization processing on the statistical histogram;

s24, connecting the statistical histograms of each pixel region of the palm training sample to form an LBP texture feature vector of the palm training sample;

specifically, the cascade classifier needs to be trained before detecting the palm in the standard position in the video stream by using the cascade classifier, and this embodiment specifically describes a sample making process of the cascade classifier and a process how to train the cascade classifier.

As shown in FIG. 3, the palm of the user moves back and forth between the first, second and third positions during waving the hand. In the embodiment, the palm in the video frame is detected by using an Adaboost algorithm in combination with the LBP feature. Since the algorithm has no rotation invariance, only the palm with the rotation angle within the range of +/-15 degrees at a certain standard position can be identified. In the embodiment, the gesture with the palm at the position II is selected as the standard palm, and the gestures at other positions cannot be recognized.

The training process of the cascade classifier can be divided into: 1. making a palm training sample; 2. extracting the LBP characteristics of the palm training sample; 3. a cascade classifier is trained.

When a palm training sample is collected, in order to extract an LBP characteristic value, the selected palm sample is a palm sample of which the palm is within a preset angle range relative to a standard position, so as to better improve the gesture recognition success rate of the cascade classifier near the standard position, the preset angle range of the palm can be a range of ± 15 °, that is, the sample includes gesture samples of which the rotation angle relative to the standard position is within a range of ± 15 °, so as to better improve the gesture recognition rate of an algorithm near the standard position. When palm samples of different users under different backgrounds are collected, in order to improve the diversity of the samples, when the collected palm training samples have certain shape and illumination difference, the obtained detection effect is very good, and the success rate of the cascade classifier for gesture recognition can be improved to a greater extent.

The extraction process of the LBP characteristics of the palm training sample comprises the following steps: a. dividing the detection window into 16 × 16 small regions (cells), comparing the gray values of 8 adjacent pixels with one pixel in each cell, and if the peripheral pixel values are greater than the central pixel value, marking the position of the pixel as 1, otherwise, marking the position as 0. Thus, 8 points in the 3-by-3 neighborhood can generate 8-bit binary numbers through comparison, and the LBP value of the window center pixel point is obtained; b. calculating a histogram for each cell, i.e. the frequency of occurrence of each digit (assuming a decimal LBP value); normalizing the histogram; c. and finally, connecting the obtained statistical histograms of all the cells into a feature vector, namely an LBP texture feature vector of the whole graph.

The AdaBoost algorithm used for training the cascade classifier is an iterative algorithm, the core idea is to train different classifiers, namely weak classifiers, aiming at the same training set, then the weak classifiers are gathered to construct a stronger final classifier, and the algorithm is divided into the following 3 steps:

a. and initializing weight distribution of the training data. If there are N samples, each training sample is initially given the same weight: 1/N;

b. and training the weak classifier. If a sample point has been accurately classified, its weight is reduced in constructing the next training set; conversely, if a sample point is not classified accurately, its weight is increased. The sample set with updated weights is then used to train the next classifier, and the entire training process proceeds iteratively. Learning by using a training data set with weight distribution to obtain a basic classifier (selecting a threshold value with the lowest error rate to design the basic classifier):

G_m(x):x→{-1,+1}

calculate the classification error rate on the training data set:

from the above equation, the error rate on the training data set is the sum of the weights of the misclassified samples.

The calculated coefficients, representing the degree of importance in the final classifier (objective: obtaining the weight that the basic classifier occupies in the final classifier):

as can be seen from the above equation, when, and increasing with decreasing, it means that the basic classifier with smaller classification error rate has a larger role in the final classifier.

The weight distribution of the training data set is updated (in order to obtain a new weight distribution of the samples) for the next iteration

D_m+1＝(w_m+1,1,w_m+1,2,...,w_m+1,N)

So that the weight of the misclassified samples by the basic classifier is increased and the weight of the correctly classified samples is decreased. In this manner, the Adaboost method can "focus" or "focus on" those samples that are less readily separable.

c. And combining the weak classifiers obtained by training into a strong classifier. After the training process of each weak classifier is finished, the weight of the weak classifier with small classification error rate is increased to play a larger decision role in the final classification function, and the weight of the weak classifier with large classification error rate is reduced to play a smaller decision role in the final classification function. In other words, weak classifiers with low error rates take up more weight in the final classifier, and are otherwise smaller.

Combining the weak classifiers to obtain a final classifier as follows:

because the improved palm detection algorithm is adopted, the method has the following advantages:

1. the gesture detection algorithm performs palm detection on separate video frames without combining historical data of previous and subsequent video frames.

2. The detection algorithm is used for detecting through the LBP characteristics of the palm image, the calculation speed is high, and the position of the palm on the image can be accurately positioned.

3. When the collected palm training samples have certain shape and illumination difference, the obtained detection effect is very good, and the recognition rate is high.

As shown in fig. 4, the present invention also provides another embodiment of a method for detecting a waving hand of a robot, including:

s11, acquiring palm samples of different users under different backgrounds, wherein the palm samples comprise samples of a palm rotating within a preset angle range relative to a standard position;

s3 training a cascade classifier by extracting the processed palm training sample;

s51, extracting the corner points of the palm in the standard position through a corner point detection algorithm to obtain a corner point set;

s61, carrying out optical flow tracking on all the angular points in the angular point set in a video stream, detecting the positions of all the angular points in the angular point set after movement according to an optical flow tracking algorithm, and calculating to obtain a movement track corresponding to each angular point according to a sparse optical flow algorithm;

s71, analyzing and obtaining the number of effective motion corner points in the corner point set according to the motion trail of each corner point in the corner point set between two adjacent frames in the video stream and the preset pixel distance;

s72, when the number of the effective motion corner points is larger than a first preset number, continuing to track and detect all corner points in the corner point set;

s73, analyzing and obtaining the number of effective hand waving corner points in the effective motion corner points according to the motion trail of the effective motion corner points;

and S74, when the number of the effective hand waving corner points is larger than a second preset number, judging that the user is waving hands.

The embodiment specifically explains the detection and tracking of the corner points of the palm, and how to detect the process of waving the hand according to the motion tracks of the corner points.

Specifically, as shown in fig. 5, the palm movement detection and tracking process. Detecting a palm in a standard position in a video frame; extracting angular points in the palm area through an angular point detection algorithm, wherein the extracted angular points are multiple, a part of the extracted angular points are schematically shown in the figure, and black points are detected angular points; thirdly, carrying out optical flow tracking on the video frame by taking the initial position of the corner point detected when the palm is in the standard position in the video frame as the initial position; fourthly, the fifth step is the position of the corner point detected by the optical flow tracking algorithm in the next video frame after the movement; l1, L2 are motion paths traced to the hand corner points by sparse optical flow algorithms during the palm swing.

In this embodiment, the corner detection algorithm is adopted as the method for extracting the corner, so that the corner of the palm region can be effectively extracted, and in addition, other methods for extracting the feature points, such as a FAST feature point extraction method, may also be used, and details are not described here. The tracking of the angular points in the video stream adopts an optical flow tracking algorithm, each angular point can be effectively tracked in the video stream, the effective degree of tracking is improved, the motion trail of the angular points can be better detected through a sparse optical flow algorithm, and the detection process of waving a hand is facilitated later.

And when the motion distance of the angular points in the two adjacent frames is greater than a preset pixel distance, taking the angular points as effective motion angular points, and counting the number of the effective motion angular points.

Analyzing and obtaining the detection conditions of all the angular points in the angular point set according to the number of the effective motion angular points and a first preset number, wherein the specific detection conditions are as follows: the utility model provides a be that the quantity of effective motion angular point is less than first predetermined quantity, the value ratio of first predetermined quantity is less, can establish to 5 ~ 10, because the palm angular point quantity of extraction far exceeds 5 ~ 10, and when the quantity of effective motion angular point is less than first predetermined quantity, most angular points do not carry out effective motion promptly, can judge that the palm does not wave the hand this moment.

When the number of the effective motion angular points is larger than a first preset number, namely, when the detection condition meets the preset requirement, the number of the effective hand waving angular points in the effective motion angular points is analyzed and obtained according to the motion trail of the effective motion angular points, when a hand is waved, the palm can certainly move leftwards and rightwards, the angular points on the palm can also do the same motion, therefore, a corresponding motion trail can be formed, and the palm waving hand can be judged according to the motion trail of the angular points. When the effective motion corner points are tracked and detected, the target is inevitably lost, so that the number of the effective motion corner points obtained by tracking is not as large as that of the original effective motion corner points, and after the effective motion corner points are effectively moved leftwards and rightwards, the effective motion corner points are recorded as effective waving corner points. And when the number of the effective hand waving corner points is greater than a second preset number, judging that the user waves the hand.

As shown in fig. 6, the present invention also provides another embodiment of a hand swing detection method of a robot, including:

s711, calculating the motion distance of the same corner point between two adjacent frames in the video stream according to the motion trail of each corner point in the corner point set between two adjacent frames in the video stream;

s712 determining whether the movement distance is greater than a preset pixel distance;

s713, when the motion distance is larger than a preset pixel distance, judging the angular point to carry out effective motion, and taking the angular point as the effective motion angular point;

s714, when the motion distance is smaller than the preset pixel distance, judging the corner points to carry out invalid motion, and counting the number N of the valid motion corner points_trackSubtracting 1, finally calculating the obtained N_trackI.e. the number of effective motion corners in said set of corners, N_trackIs the total number N of corner points in the set of corner points;

s721, judging the number N of the effective motion corner points_trackWhether the number is larger than a first preset number;

s722 when the number N of the effective motion corner points_trackWhen the number of the effective motion angular points is less than the first preset number, stopping detecting the motion trail of the effective motion angular points, and judging that the user does not wave the hand;

s723 as the number N of effective motion corner points_trackAnd when the number of the effective motion corner points is larger than the first preset number, continuously detecting the motion trail of the effective motion corner points.

S731 when the number of effective motion corner points N_trackWhen the number of the effective motion corner points is larger than the first preset number, detecting the left motion distance and the right motion distance of the effective motion corner points between two adjacent frames in the video stream;

s732, judging whether the left movement distance and the right movement distance of the effective movement corner point between two adjacent frames in the video stream are both larger than a preset distance;

s733, when the left movement distance and the right movement distance of the effective movement angular point between two adjacent frames in the video stream are larger than the preset distance, the effective movement angular point is recorded as an effective hand waving angular point, and the number N of the effective hand waving angular points is_moveAdding 1 to finally calculate the obtained N_moveThat is, the number of effective hand-waving corner points in the corner point set is N_moveIs 0;

s741 judges the number N of effective hand waving angular points_moveWhether the number is larger than a second preset number;

s742 when the number N of effective hand waving corner points_moveWhen the number of the hands is larger than the second preset number, judging that the palm swings the hand;

and S743 otherwise, judging that the palm is not waved.

Specifically, the present embodiment describes in more detail how to determine whether the palm is waving according to the motion trajectory of each corner point in the set of corner points.

The hand waving detection process specifically comprises the following steps:

firstly, a standard gesture is detected from a video frame to start tracking the palm, and all tracking angular points A of a palm area are counted_iA motion path L of { i ═ 1, 2.. N }_iN, where N is the total number of corner points in the palm area. Stipulate as L_i> 0 indicates that the corner moves to the right of the standard gesture, when L_i< 0 indicates that the corner point moves to the left of the standard gesture.

When L is_i＜-D_leftIndicating that the corner has effectively moved to the left and the corresponding flag M of the corner_i,leftTrue, wherein D_leftA threshold value for the leftward movement distance may be set to a two-pixel distance; when L is_i＞D_rightIndicating that the corner has effectively moved to the right, and the corresponding flag M of the corner_i,rightTrue, wherein D_rightThe threshold value for the rightward movement distance may be set to a two-pixel distance;

thirdly, when the angular point A between two frames in the video stream_iIs moved along a path L_iLess than two pixel distance, the corner point tracking is regarded as invalid, and the effective corner point number N is tracked_track Minus 1, N_trackIs the total number N of corner points in the set of corner points;

fourthly, when tracking the effective angular point total number N_trackStopping tracking when the tracking is less than 5, and stopping tracking when the total number of effective corner points N is tracked_trackWhen the motion angle point is more than or equal to 5, continuously judging whether the effective motion angle point in the video stream carries out effective motion to the left and the right,

when a certain corner point A_iEffective movements both to the left and to the right, i.e. movement marks M_i,left＝true,M_i,rightTrue, the number of valid hand corner points N_moveAnd adding 1. Number N of effective hand waving angular points_moveIs 0;

sixthly, counting the number N of effective hand-waving angular points in real time_moveWhen is coming into contact with

It can be determined that the user is waving his hand, and at the same time, the corner point tracking is stopped and a new round of palm detection is started.

Sixthly, if fruit is waved, the total number of the effective motion angular points is detected

The detection is invalid, and a new round of detection is started.

The embodiment provides a method capable of accurately recognizing hand waving in a dynamic background, and a camera can detect whether a person waves the hand through continuous video frames. The invention is different from the frame difference method-based identification in nature, firstly, the palm is detected on the picture of each frame through a palm detector, once the palm is detected, the palm is tracked, and the hand waving judgment is carried out through the motion track of the palm. The palm detection is performed on a single-frame picture without combining upper and lower historical image frames, so that the palm detection is not interfered by a complex background and a dynamic background. Namely, the detection method is accurate and stable, can not be influenced by a complex background, and can accurately identify whether the hand is swung or not even if the background is interfered by the motion of people or other objects.

As shown in fig. 7, the present invention provides an embodiment of a waving detection system of a robot, including:

the detection module is used for detecting the palm at the standard position in the video stream through the cascade classifier;

the angular point extraction module is electrically connected with the detection module and used for extracting the angular points of the palm at the standard position to obtain an angular point set;

the angular point tracking module is electrically connected with the angular point extraction module and used for tracking and detecting each angular point in the angular point set in a video stream to obtain a motion track corresponding to each angular point;

and the judging module is electrically connected with the angular point tracking module and is used for judging whether the palm waves the hand or not according to the motion trail of each angular point in the angular point set.

Firstly, a robot calls a detection module to detect a palm at a standard position in a video stream through a cascade classifier; secondly, calling an angular point extraction module to extract the angular points of the palm at the standard position to obtain an angular point set; thirdly, calling an angular point tracking module to perform tracking detection on each angular point in the angular point set in the video stream to obtain a motion track corresponding to each angular point; and finally, calling a judging module, and judging whether the palm waves the hand according to the motion trail of each angular point in the angular point set.

As shown in fig. 8, the present invention provides an embodiment of a waving detection system of a robot, and based on the above embodiment, the waving detection system further includes:

the sample making module is used for making a palm training sample;

the feature extraction module is used for extracting the LBP texture feature vector of the palm training sample to obtain the extracted and processed palm training sample;

and the classifier training module is electrically connected with the feature extraction module and used for training the cascade classifier through the extracted palm training sample.

Preferably, the feature extraction module includes:

the processing submodule is used for dividing the detection window into N multiplied by N pixel areas and comparing a central pixel point in the pixel areas with the gray values of 8 adjacent pixel points, wherein N belongs to {64,32,16 and 8 };

the characteristic value operator module is electrically connected with the processing submodule, if the pixel value of the adjacent pixel point is greater than that of the central pixel point, the adjacent pixel point is marked as 1, otherwise, the adjacent pixel point is 0, so that 8-bit binary number is generated and is used as the LBP value of the central pixel point, and the LBP value of each pixel point in the pixel area can be obtained through calculation;

the processing submodule is also used for calculating a statistical histogram of the pixel region according to the LBP value obtained by each pixel point in the pixel region and carrying out normalization processing on the statistical histogram;

the processing sub-module is further configured to connect the statistical histograms of each pixel region of the palm training sample to form an LBP texture feature vector of the palm training sample;

the corner point tracking module is further used for extracting the corner points of the palm in the standard position through a corner point detection algorithm to obtain a corner point set, carrying out optical flow tracking on all the corner points in the corner point set in a video stream, detecting the positions of all the corner points in the corner point set after movement according to the optical flow tracking algorithm, and calculating a movement track corresponding to each corner point according to a sparse optical flow algorithm.

As shown in fig. 3, the palm of the user moves back and forth among the positions of (i), (ii), and (iii) in the process of waving the hand, and the palm of the video frame is detected by using the Adaboost algorithm in combination with the LBP feature in the embodiment. Since the algorithm has no rotation invariance, only the palm with the rotation angle within the range of +/-15 degrees at a certain standard position can be identified. In the embodiment, the gesture with the palm at the position II is selected as the standard palm, and the gestures at other positions cannot be recognized.

The process of extracting the LBP characteristics of the palm training sample by the characteristic extraction module is as follows: a. dividing the detection window into 16 × 16 small regions (cells), comparing the gray values of 8 adjacent pixels with one pixel in each cell, and if the peripheral pixel values are greater than the central pixel value, marking the position of the pixel as 1, otherwise, marking the position as 0. Thus, 8 points in the 3-by-3 neighborhood can generate 8-bit binary numbers through comparison, and the LBP value of the window center pixel point is obtained; b. calculating a histogram for each cell, i.e. the frequency of occurrence of each digit (assuming a decimal LBP value); normalizing the histogram; c. and finally, connecting the obtained statistical histograms of all the cells into a feature vector, namely an LBP texture feature vector of the whole graph.

G_m(x):x→{-1,+1}

calculate the classification error rate on the training data set:

D_m+1＝(w_m+1,1,w_m+1,2,...,w_m+1,N)

Combining the weak classifiers to obtain a final classifier as follows:

As shown in fig. 9, the present invention provides an embodiment of a waving detection system of a robot, including:

the judging module is electrically connected with the angular point tracking module and is used for judging whether the palm waves the hand or not according to the motion trail of each angular point in the angular point set;

the judging module comprises:

the analysis submodule analyzes and obtains the number of effective motion angular points in the angular point set according to the motion track of each angular point in the angular point set between two adjacent frames in the video stream and a preset pixel distance;

the analysis submodule is further used for analyzing and obtaining the detection conditions of all the angular points in the angular point set according to the number of the effective motion angular points and a first preset number;

the analysis submodule is further used for analyzing and obtaining the number of effective hand waving angular points in the effective motion angular points according to the motion trail of the effective motion angular points when the detection condition meets the preset requirement;

the judging submodule is electrically connected with the analyzing submodule and used for judging whether the user waves the hand or not according to the number of the effective hand waving angular points and a second preset number;

the judging module further comprises:

the calculation submodule is used for calculating the motion distance of the same angular point between two adjacent frames in the video stream according to the motion trail of each angular point in the angular point set between two adjacent frames in the video stream;

the judgment submodule is also electrically connected with the calculation submodule and is used for judging whether the movement distance is greater than a preset pixel distance;

the judgment sub-module is further used for judging that the angular point performs effective motion when the motion distance is larger than a preset pixel distance, and taking the angular point as the effective motion angular point;

the judgment sub-module is further used for judging the angular points to carry out invalid motion when the motion distance is smaller than the preset pixel distance, and the number N of the valid motion angular points_trackSubtracting 1, finally calculating the obtained N_trackI.e. the number of effective motion corners in said set of corners, N_trackIs the total number N of corner points in the set of corner points;

the judging submodule is also used for judging the number N of the effective motion angular points_trackWhether the number is larger than a first preset number;

the judgment sub-module is further used for judging the number N of the effective motion corner points_trackWhen the number of the effective motion angular points is less than the first preset number, stopping detecting the motion trail of the effective motion angular points, and judging that the user does not wave the hand;

the judgment sub-module is further used for judging the number N of the effective motion corner points_trackAnd when the number of the effective motion corner points is larger than the first preset number, continuously detecting the motion trail of the effective motion corner points.

The judging module further comprises a detection submodule for detecting the number N of the effective motion corner points_trackWhen the number of the effective motion corner points is larger than the first preset number, detecting the left motion distance and the right motion distance of the effective motion corner points between two adjacent frames in the video stream;

the judgment sub-module is further used for judging whether the left movement distance and the right movement distance of the effective movement corner point between two adjacent frames in the video stream are both larger than a preset distance;

the judgment sub-module is further used for recording the effective motion angular points as effective hand waving angular points and recording the number N of the effective hand waving angular points when the leftward motion distance and the rightward motion distance of the effective motion angular points between two adjacent frames in the video stream are larger than preset distances_moveAdding 1 to finally calculate the obtained N_moveThat is, the number of effective hand-waving corner points in the corner point set is N_moveIs 0;

the judging submodule is also used for judging the number N of the effective hand waving angular points_moveWhether the number is larger than a second preset number;

the judging submodule is also used for judging the number N of the effective hand waving angular points_moveWhen the number of the hands is larger than the second preset number, judging that the palm swings the hand; otherwise, judging that the palm does not wave the hand;

the sample making module is used for making a palm training sample;

the classifier training module is electrically connected with the feature extraction module and used for training a cascade classifier through the extracted palm training sample;

the feature extraction module includes:

Specifically, in this embodiment, how to judge whether the palm swings the hand according to the motion trajectory of each corner point in the corner point set is described in more detail, and the hand swing detection process specifically includes:

The detection is invalid, and a new round of detection is started.

It should be noted that the above embodiments can be freely combined as necessary. The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and decorations can be made without departing from the principle of the present invention, and these modifications and decorations should also be regarded as the protection scope of the present invention.

Claims

1. A method for detecting a hand swinging of a robot, comprising:

detecting a palm in a standard position in a video stream by a cascade of classifiers;

extracting angular points of the palm at the standard position to obtain an angular point set;

tracking and detecting each corner point in the corner point set in a video stream to obtain a motion track corresponding to each corner point;

analyzing the number of effective motion angular points in the angular point set according to the motion trail of each angular point in the angular point set between two adjacent frames in the video stream and a preset pixel distance;

when the number of the effective motion angular points is larger than a first preset number, continuously tracking and detecting all angular points in the angular point set;

analyzing and obtaining the number of effective waving angular points in the effective motion angular points according to the motion trail of the effective motion angular points;

and when the number of the effective hand waving corner points is larger than a second preset number, judging that the user waves the hand.

2. The method as claimed in claim 1, wherein the analyzing the number of effective motion corners in the corner set according to the motion trajectory of each corner in the corner set between two adjacent frames of the video stream and the preset pixel distance comprises:

calculating the motion distance of the same corner point between two adjacent frames in the video stream according to the motion trail of each corner point in the corner point set between two adjacent frames in the video stream;

judging whether the movement distance is larger than a preset pixel distance;

when the motion distance is larger than a preset pixel distance, judging that the angular point performs effective motion, and taking the angular point as the effective motion angular point;

when the motion distance is smaller than the preset pixel distance, judging the angular points to carry out invalid motion, and counting the number N of the valid motion angular points_trackSubtracting 1, finally calculating the obtained N_trackI.e. the number of effective motion corners in said set of corners, N_trackIs the total number N of corner points in said set of corner points.

3. The method for detecting the waving of a robot as claimed in any one of claims 1 to 2, wherein before detecting the palm in the standard position in the video stream by the cascade classifier, the method comprises:

making a palm training sample;

extracting the LBP texture characteristic vector of the palm training sample to obtain an extracted and processed palm training sample;

and training a cascade classifier by extracting the processed palm training sample.

4. The method according to claim 3, wherein the step of preparing a palm training sample comprises:

palm samples of different users in different contexts are collected, including samples in which the palm is rotated within a preset angular range relative to a standard position.

5. The method as claimed in claim 3, wherein the process of extracting the LBP texture feature vector of the palm training sample comprises:

dividing a detection window into N multiplied by N pixel regions, and comparing the gray value of a central pixel point in the pixel regions with the gray value of 8 adjacent pixel points, wherein N belongs to {64,32,16,8 };

if the pixel value of the adjacent pixel point is larger than that of the central pixel point, the adjacent pixel point is marked as 1, otherwise, the adjacent pixel point is 0, so that 8-bit binary number is generated and used as the LBP value of the central pixel point, and the LBP value of each pixel point in the pixel area can be obtained through calculation;

calculating a statistical histogram of the pixel region according to the LBP value obtained by each pixel point in the pixel region, and carrying out normalization processing on the statistical histogram;

and connecting the statistical histograms of each pixel region of the palm training sample to form an LBP texture feature vector of the palm training sample.

6. The method according to any one of claims 1 to 2, wherein the extracting angular points of the palm at a standard position to obtain an angular point set, and performing tracking detection on each angular point in the angular point set in a video stream to obtain a motion trajectory corresponding to each angular point specifically comprises:

extracting the angular points of the palm at the standard position through an angular point detection algorithm to obtain an angular point set, carrying out optical flow tracking on each angular point in the angular point set in a video stream, detecting the position of each angular point in the angular point set after movement according to the optical flow tracking algorithm, and calculating the movement track corresponding to each angular point according to a sparse optical flow algorithm.

7. A robot hand-waving detection system comprising:

the judging module is electrically connected with the angular point tracking module and is used for judging whether the palm waves the hand or not according to the motion trail of each angular point in the angular point set; wherein, the judging module comprises:

the analysis submodule is further used for continuing to track and detect all the angular points in the angular point set when the number of the effective motion angular points is larger than a first preset number;

the analysis submodule is further used for analyzing and obtaining the number of effective hand waving angular points in the effective motion angular points according to the motion trail of the effective motion angular points;

and the judging submodule is electrically connected with the analyzing submodule and used for judging that the user waves the hand when the number of the effective hand waving angular points is larger than a second preset number.

8. The system of claim 7, wherein the determining module further comprises:

the judgment sub-module is further used for judging the angular points to carry out invalid motion when the motion distance is smaller than the preset pixel distance, and the number N of the valid motion angular points_trackSubtracting 1, finally calculating the obtained N_trackI.e. the number of effective motion corners in said set of corners, N_trackIs the total number N of corner points in said set of corner points.

9. A robot hand-waving detection system according to any one of claims 7 to 8, further comprising:

the sample making module is used for making a palm training sample;

10. The system of claim 8, wherein the feature extraction module comprises:

the processing sub-module is further configured to connect the statistical histograms of each pixel region of the palm training samples to form an LBP texture feature vector of the palm training samples.

11. A robot hand-waving detection system as set forth in any one of claims 7 to 8, characterized in that:

the corner point tracking module is further used for extracting the corner points of the palm in the standard position through a corner point detection algorithm to obtain a corner point set, carrying out optical flow tracking on each corner point in the corner point set in a video stream, detecting the positions of all the corner points in the corner point set after movement according to the optical flow tracking algorithm, and calculating the movement track corresponding to each corner point according to a sparse optical flow algorithm.

12. A robot characterized by being integrated with a hand swing detection system of a robot as claimed in any one of claims 7 to 11.