WO2019158015A1 - Sample acquisition method, target detection model generation method, and target detection method - Google Patents

Sample acquisition method, target detection model generation method, and target detection method

Info

Publication number
WO2019158015A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
sample
node
positive
score
Prior art date
Application number
PCT/CN2019/074668
Other languages
English (en)
French (fr)
Inventor
唐小军 (Tang Xiaojun)
Original Assignee
京东方科技集团股份有限公司 (BOE Technology Group Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东方科技集团股份有限公司 (BOE Technology Group Co., Ltd.)
Priority to US16/605,752 (US11238296B2)
Priority to EP19753684.0A (EP3754539A4)
Publication of WO2019158015A1

Classifications

    • G — PHYSICS
      • G06 — COMPUTING; CALCULATING OR COUNTING
        • G06F — ELECTRIC DIGITAL DATA PROCESSING
          • G06F18/00 — Pattern recognition
            • G06F18/20 — Analysing
              • G06F18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
                • G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
                  • G06F18/2148 — characterised by the process organisation or structure, e.g. boosting cascade
                • G06F18/217 — Validation; Performance evaluation; Active pattern learning techniques
                  • G06F18/2193 — based on specific statistical tests
              • G06F18/285 — Selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system
        • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
          • G06V10/00 — Arrangements for image or video recognition or understanding
            • G06V10/70 — using pattern recognition or machine learning
              • G06V10/77 — Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
                • G06V10/774 — Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
                • G06V10/778 — Active pattern-learning, e.g. online learning of image or video features
                  • G06V10/7796 — based on specific statistical tests
              • G06V10/87 — using selection of the recognition techniques, e.g. of a classifier in a multiple classifier system
          • G06V20/00 — Scenes; Scene-specific elements
            • G06V20/50 — Context or environment of the image
              • G06V20/56 — Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
                • G06V20/58 — Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
          • G06V40/00 — Recognition of biometric, human-related or animal-related patterns in image or video data
            • G06V40/10 — Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
              • G06V40/103 — Static body considered as a whole, e.g. static pedestrian or occupant recognition
          • G06V2201/00 — Indexing scheme relating to image or video recognition or understanding
            • G06V2201/07 — Target detection

Definitions

  • the present disclosure relates to the field of intelligent identification technologies, and in particular, to a sample acquisition method, a target detection model generation method, a target detection method, a computing device, and a computer readable medium.
  • Target detection is an important research direction in the field of computer vision. It involves intelligently analyzing the visual image captured by the camera to automatically detect the target contained in the image. Target detection has a wide range of applications in the fields of vehicle assisted driving, intelligent monitoring, and intelligent robots.
  • in existing target detection processes, a classifier is typically first trained on pre-acquired positive and negative samples; the image to be detected is then classified by this classifier to detect whether it contains the target.
  • the number and selection quality of positive samples can affect the recognition accuracy of the classifier to some extent.
  • existing positive sample selection is usually done by manually marking positive sample identification frames in the original image, where each positive sample identification frame exactly encloses one complete target (with no background).
  • a sample acquisition method may include: adding a disturbance to a pre-marked sample original frame in the original image to obtain a sample selection frame, where the image enclosed by the sample original frame contains a target; and extracting the image enclosed by the sample selection frame as a sample.
  • the step of adding a disturbance to the sample original frame may include: adding a random perturbation to the center position of the sample original frame, and/or the width of the sample original frame, and/or the height of the sample original frame, to obtain a sample perturbation frame, and using the sample perturbation frame as the sample selection frame.
  • the image enclosed by the sample original frame may contain a complete image of the target or a partial image of the target.
  • a target detection model generation method may include: acquiring a plurality of positive samples from the original image by using the sample acquisition method described above; acquiring, from the original image, a plurality of images that do not contain the target as negative samples; and generating a target detection model from the acquired plurality of positive samples and plurality of negative samples by using a preset classification model algorithm.
  • the step of generating the target detection model by using the preset classification model algorithm according to the acquired plurality of positive samples and plurality of negative samples may include: normalizing each of the plurality of positive samples and each of the plurality of negative samples; performing feature vector extraction on each normalized positive sample and each normalized negative sample by using a preset feature extraction algorithm to obtain a positive sample feature training set and a negative sample feature training set, where the positive sample feature training set includes the extracted feature vectors of all positive samples and the negative sample feature training set includes the extracted feature vectors of all negative samples; and training a plurality of weak classifiers from the positive sample feature training set and the negative sample feature training set, all of the weak classifiers together constituting the target detection model.
  • the step of training a plurality of weak classifiers according to the positive sample feature training set and the negative sample feature training set includes:
  • in step S2, a storage structure is set up and the root node is placed into it; the depth of the root node is 1, and the root node includes all the positive sample feature vectors and all the negative sample feature vectors;
  • step S3: detect whether the storage structure has unprocessed nodes; if the storage structure has no unprocessed nodes, perform step S7; otherwise perform step S4;
  • step S4: take a node from the storage structure and determine whether the node is splittable; if the node is splittable, perform step S5; otherwise perform step S3;
  • step S9: use the weak classifier that has currently completed training to update the detection score and weight of each positive sample feature vector in the positive sample feature training set and of each negative sample feature vector in the negative sample feature training set, and perform step S2 based on the updated detection scores and weights.
  • between step S1 and step S3, the method may further include: S2a, normalizing the weight of each of the positive sample feature vectors and the weight of each of the negative sample feature vectors.
  • step S4 may include:
  • WPS is the weight sum of the positive sample feature vectors in the node;
  • WNS is the weight sum of the negative sample feature vectors in the node;
  • p is a constant greater than 0;
  • DEPTH is the depth of the node;
  • MAXDEPTH is the preset maximum node depth;
  • TH is a preset ratio threshold.
  • if it is determined in step S42 that the node satisfies condition a and condition b at the same time, it is determined that the node is splittable; otherwise, it is determined that the node is not splittable.
  • step S5 may include:
  • S51a: randomly select NF feature attributes and set corresponding feature thresholds; classify the positive sample feature vectors and negative sample feature vectors of the node for each feature attribute and its corresponding feature threshold, obtaining NF feature classification results. Each classification result includes a left subset and a right subset, where the threshold corresponding to the i-th feature attribute is TH_i, i ∈ [1, NF]. When classifying based on the i-th feature attribute, each feature vector whose value for the i-th feature attribute is smaller than TH_i is placed in the left subset, and each feature vector whose value for the i-th feature attribute is greater than or equal to TH_i is placed in the right subset.
  • WLP is the weight sum of the positive sample feature vectors in the left subset;
  • WLN is the weight sum of the negative sample feature vectors in the left subset;
  • WRP is the weight sum of the positive sample feature vectors in the right subset;
  • WRN is the weight sum of the negative sample feature vectors in the right subset.
  • the step of acquiring multiple positive samples may further include:
  • x and y are the abscissa and ordinate of the center point of the positive sample original frame, respectively, and w and h are the width and height of the positive sample original frame, respectively;
  • x' and y' are the abscissa and ordinate of the center point of the positive sample perturbation frame, respectively, and w' and h' are the width and height of the positive sample perturbation frame, respectively.
  • step S5 may include:
  • S51b: randomly select NF feature attributes and set corresponding feature thresholds; classify the node for each feature attribute and its corresponding feature threshold, obtaining NF feature classification results, each of which includes a left subset and a right subset, where the threshold corresponding to the i-th feature attribute is TH_i, i ∈ [1, NF]. When classifying based on the i-th feature attribute, each feature vector of the node whose value for the i-th feature attribute is smaller than TH_i is placed in the left subset, and each feature vector whose value for the i-th feature attribute is greater than or equal to TH_i is placed in the right subset;
  • the classification score SCORE_C of each classification result is calculated (the formula appears only as an image in the source), where:
  • WLP is the weight sum of the positive sample feature vectors in the left subset;
  • WLN is the weight sum of the negative sample feature vectors in the left subset;
  • WRP is the weight sum of the positive sample feature vectors in the right subset;
  • WRN is the weight sum of the negative sample feature vectors in the right subset.
  • S53b: determine the positive and negative attributes of the left subset and the right subset in each classification result, where, if the value of WLP − WLN − WRP + WRN is positive, the attribute of the left subset is positive and the attribute of the right subset is negative; otherwise, the attribute of the left subset is negative and the attribute of the right subset is positive;
  • N is the number of positive sample feature vectors in the positive sample subset;
  • dx_j, dy_j, dw_j and dh_j are, respectively, the abscissa change rate, ordinate change rate, width change rate and height change rate in the position parameter vector of the positive sample corresponding to the j-th positive sample feature vector in the positive sample subset;
  • the corresponding averages, taken over the position parameter vectors of the positive samples corresponding to the N positive sample feature vectors in the positive sample subset, are the average abscissa change rate, average ordinate change rate, average width change rate and average height change rate;
  • the weighting constant (its symbol appears only as an image in the source) is a constant greater than 0.
  • S56b: select the feature attribute and feature threshold corresponding to the classification result with the largest total score as the feature attribute and corresponding feature threshold to be used when the node is split.
  • in step S9,
  • NP is the number of positive samples
  • NN is the number of negative samples
  • HP_k1 is the current detection score of the k1-th positive sample feature vector;
  • HN_k2 is the current detection score of the k2-th negative sample feature vector;
  • hp_k1 is the node score output by the weak classifier that has currently completed training when the k1-th positive sample is input to it;
  • hn_k2 is the node score output by the weak classifier that has currently completed training when the k2-th negative sample is input to it.
  • the feature extraction algorithm includes at least one of a histogram of oriented gradients (HOG) feature extraction algorithm, a luminance-chrominance (LUV) color feature extraction algorithm, and a local binary patterns (LBP) feature extraction algorithm.
  • a target detection method may include: generating a target detection model by using the generation method described above; normalizing the image to be detected and performing feature vector extraction with a preset feature extraction algorithm to obtain a feature vector to be detected; and classifying the feature vector to be detected by using the target detection model, and determining whether the image to be detected contains the target based on the classification result.
  • the step of classifying the feature vector to be detected by using the target detection model may include: classifying the feature vector to be detected with each weak classifier of the target detection model and obtaining the score of the node into which the feature vector is classified in that weak classifier; summing all acquired node scores to obtain a classifier total score S; and determining whether the classifier total score S is greater than a preset threshold score S'. If it is determined that S > S', it is detected that the image to be detected contains the target; otherwise, the image to be detected does not contain the target.
  • classifying the feature vector to be detected by using the target detection model may include:
  • for each weak classifier, determining whether the attribute of the leaf node into which the feature vector to be detected is classified in that weak classifier is positive, and, if the attribute of the leaf node is positive, acquiring the average abscissa change rate, average ordinate change rate, average width change rate, and average height change rate corresponding to that leaf node;
  • calculating the position parameter vector of the image to be detected according to the acquired average abscissa change rate, average ordinate change rate, average width change rate, and average height change rate; and
  • adjusting the detection frame surrounding the image to be detected according to the position parameter vector of the image to be detected.
  • a computing device can include a processor and a memory that stores a computer program that, when executed by the processor, causes the processor to perform the sample acquisition method as previously described, the target detection model generation method as previously described, or the target detection method as previously described.
  • a computer readable medium stores a computer program that, when executed by a processor, causes the processor to perform the sample acquisition method as previously described, the target detection model generation method as previously described, or the target detection method as previously described.
  • FIG. 1 is a flowchart of a method for acquiring a positive sample according to an embodiment of the present disclosure;
  • FIG. 2a is a schematic diagram of obtaining a positive sample selection frame from a positive sample original frame in an embodiment of the present disclosure;
  • FIG. 2b is a schematic diagram of obtaining a positive sample selection frame from a positive sample original frame in another embodiment of the present disclosure;
  • FIG. 3 is a flowchart of a method for generating a pedestrian detection model according to an embodiment of the present disclosure;
  • FIG. 4 is a flowchart of one embodiment of step 203 shown in FIG. 3;
  • FIG. 5 is a flowchart of one embodiment of step 2033 shown in FIG. 4;
  • FIG. 6 is a flowchart of one embodiment of step S4 shown in FIG. 5;
  • FIG. 7a is a flowchart of one embodiment of step S5 shown in FIG. 5;
  • FIG. 7b is a flowchart of another embodiment of step S5 shown in FIG. 5;
  • FIG. 8 is a flowchart of a pedestrian detection method according to an embodiment of the present disclosure;
  • FIG. 9 is a flowchart of one embodiment of step 303 shown in FIG. 8;
  • FIG. 10 is a flowchart of a pedestrian detection method according to another embodiment of the present disclosure;
  • FIG. 11 is a schematic structural diagram of a computing device according to an embodiment of the present disclosure.
  • the present disclosure is described using a pedestrian as an example of the target.
  • the object according to the present disclosure may be any other object such as a cat, a dog, an elephant or the like.
  • the sample acquisition method according to the present application is described here only with a positive sample as an example; those skilled in the art will understand that the sample acquisition method according to the present application can also be applied to the acquisition of other samples, for example, to the acquisition of negative samples.
  • a classifier trained on the positive samples extracted with such positive sample identification frames has difficulty recognizing pedestrian images that contain background, resulting in low recognition accuracy.
  • the present disclosure proposes a sample acquisition method, a pedestrian detection model generation method, a pedestrian detection method, a computing device, and a computer readable medium.
  • the sample acquisition method, pedestrian detection model generation method, pedestrian detection method, computing device, and computer readable medium provided by the present disclosure can effectively improve the recognition accuracy of the pedestrian detection model by increasing the number of extractable positive samples and adding background to the positive samples.
  • the present disclosure can obtain the position parameter vector of the pedestrian in the positive sample and determine the feature attribute used at splitting time based on both the classification score and the positioning error during training of the weak classifiers, so that the final pedestrian detection model can not only detect whether there is a pedestrian in the image to be detected, but also accurately locate the pedestrian's position when a pedestrian is detected.
  • FIG. 1 is a flowchart of a positive sample acquisition method according to an embodiment of the present disclosure.
  • the positive sample acquisition method may include:
  • Step 101 Add a disturbance to the pre-marked positive sample original frame in the original image to obtain a positive sample selection frame.
  • the positive sample original frame in the present disclosure is the manually marked positive sample identification box of the related art, and it exactly surrounds one complete pedestrian. According to the present disclosure, a random perturbation is added to the positive sample original frame such that the frame moves within a certain region, yielding a positive sample selection frame.
  • in step 101, a random perturbation is added to the center position of the positive sample original frame to obtain a positive sample perturbation frame, and the positive sample perturbation frame is used as the positive sample selection frame.
  • in this case, the positive sample selection frame exactly matches the shape and size of the positive sample perturbation frame.
  • alternatively, the width or the height of the positive sample original frame can be perturbed to obtain a positive sample perturbation frame.
  • FIG. 2b is a schematic diagram of obtaining a positive sample selection frame according to a positive sample original frame in another embodiment of the present disclosure.
  • the center position of the positive sample disturbance frame can be kept unchanged, and the positive sample disturbance frame can be randomly enlarged or reduced to obtain a positive sample selection frame.
  • the robustness of the subsequently trained pedestrian detection model can be further improved.
  • the anti-interference ability of the detection algorithm can be improved. That is to say, even if there is an error in the actual selection frame and the target is not exactly framed, the detector can still accurately detect the target.
  • the quality of the positive sample should be guaranteed, and the image enclosed by the positive sample selection frame should contain at least part of the image of the pedestrian.
  • the image enclosed by the positive sample selection frame may contain a complete image of the pedestrian.
  • in step 101, the background around the pedestrian is captured by adding a perturbation to the positive sample original frame.
  • the positive sample selection frame thus contains not only the pedestrian image but also the background image around the pedestrian, which can greatly improve the robustness of the pedestrian detection model.
  • the method may further include step 102 of extracting the image enclosed by the positive sample selection frame as a positive sample.
  • in principle, the positive sample acquisition method according to the present disclosure can extract an essentially unlimited number of different positive samples.
  • the technical solution of the present disclosure can acquire a plurality of different positive samples based on one pedestrian, thereby effectively increasing the number of positive samples that can be extracted from the original image, which is beneficial to improving the recognition accuracy of the classifier.
  • the positive sample contains not only the complete pedestrian image but also the background image around the pedestrian, which can also improve the classifier's ability to recognize pedestrians to some extent.
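As a concrete illustration of steps 101–102, the sketch below perturbs one annotated box given by its center (x, y), width w, and height h. The patent only requires random perturbations of the center, width, and/or height; the uniform ±10% bounds and the helper name `perturb_box` are illustrative assumptions.

```python
# A minimal sketch of the box perturbation described above (assumptions noted).
import random

def perturb_box(x, y, w, h, max_shift=0.1, max_scale=0.1):
    """Turn one annotated box (center x, y, width w, height h) into a
    randomly perturbed selection box."""
    dx = random.uniform(-max_shift, max_shift) * w   # horizontal shift
    dy = random.uniform(-max_shift, max_shift) * h   # vertical shift
    dw = random.uniform(-max_scale, max_scale) * w   # width change
    dh = random.uniform(-max_scale, max_scale) * h   # height change
    return x + dx, y + dy, w + dw, h + dh

# Many different positive samples can be cut from one annotated pedestrian:
boxes = [perturb_box(120.0, 80.0, 40.0, 96.0) for _ in range(5)]
```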
  • FIG. 3 is a flowchart of a method for generating a pedestrian detection model according to an embodiment of the present disclosure. As shown in FIG. 3, the pedestrian detection model generation method may include:
  • Step 201 Acquire multiple positive samples.
  • the positive sample acquisition method provided in the above embodiment is used to obtain a plurality of different positive samples, and the specific process is not described herein again.
  • the number of positive samples may be NP.
  • the pedestrian detection model generation method may further include step 202 of extracting, from the original image, a plurality of images that do not contain a pedestrian as negative samples.
  • a plurality of images that do not contain pedestrians are randomly selected in the original image, and each of the images can be used as a negative sample.
  • the process of selecting a negative sample is the same as in the related art and will not be described in detail here.
  • the number of negative samples may be NN.
  • the number of negative samples NN is often larger than the number of positive samples NP; for example, the number of negative samples may be 20,000.
  • the pedestrian detection model generating method may further include step 203 of generating a pedestrian detection model by using a preset classification model algorithm according to the acquired plurality of positive samples and plurality of negative samples.
  • step 203 can include:
  • Step 2031 normalizing each positive sample and each negative sample.
  • each sample image X is scaled and converted into a standard image having a specific width W, height H, and fixed color channels (e.g., RGB channels).
  • the pixel data of the standard image is represented as X(m, n, c), where m is the ordinate on the image with value range [1, H], n is the abscissa with value range [1, W], and c indexes the RGB color channels, taking the values 1, 2, and 3.
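A minimal sketch of the normalization of step 2031, assuming OpenCV-style images; the standard size W × H = 64 × 128 is an illustrative choice for a pedestrian sample, not fixed by the patent.

```python
# A sketch of step 2031: scale each sample crop to the standard W x H size.
import numpy as np
import cv2

W, H = 64, 128  # illustrative standard size (assumption)

def normalize_sample(img_bgr: np.ndarray) -> np.ndarray:
    """Scale a sample crop to W x H with three color channels, matching the
    X(m, n, c) representation described above."""
    std = cv2.resize(img_bgr, (W, H), interpolation=cv2.INTER_LINEAR)
    return cv2.cvtColor(std, cv2.COLOR_BGR2RGB)  # c indexes R, G, B
```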
  • step 203 may further include step 2032: performing feature vector extraction on each normalized positive sample and each negative sample by using a preset feature extraction algorithm to obtain a positive sample feature training set and a negative sample feature training set.
  • FP_k1 denotes the feature vector corresponding to the k1-th positive sample, k1 ∈ [1, NP]; FN_k2 denotes the feature vector corresponding to the k2-th negative sample, k2 ∈ [1, NN].
  • the preset feature extraction algorithm may include: a Histogram of Oriented Gradients (HOG) feature extraction algorithm, a luminance-chrominance (LUV) color feature extraction algorithm, and a Local Binary Patterns (LBP) feature extraction algorithm.
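A sketch of step 2032 concatenating the three named feature families into one vector. The library choices (scikit-image for HOG/LBP, OpenCV for the LUV conversion) and all parameter values are assumptions; the patent names the feature families but not their parameters.

```python
# A sketch of step 2032: HOG + coarse LUV colors + an LBP histogram.
import numpy as np
import cv2
from skimage.feature import hog, local_binary_pattern

def extract_features(img_rgb: np.ndarray) -> np.ndarray:
    gray = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2GRAY)
    f_hog = hog(gray, orientations=9, pixels_per_cell=(8, 8),
                cells_per_block=(2, 2))                   # HOG features
    luv = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2LUV)
    f_luv = cv2.resize(luv, (8, 16)).ravel() / 255.0      # coarse LUV colors
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
    f_lbp, _ = np.histogram(lbp, bins=10, range=(0, 10))  # LBP histogram
    return np.concatenate([f_hog, f_luv, f_lbp.astype(float)])
```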
  • the predetermined classification model algorithm as previously described may include a preset feature extraction algorithm as previously described.
  • step 203 may further include step 2033 of training a plurality of weak classifiers according to the positive sample feature training set and the negative sample feature training set; all of the weak classifiers together constitute the pedestrian detection model.
  • the present disclosure employs the Adaboost algorithm to generate a plurality of weak classifiers, which are then combined into a strong classifier.
  • FIG. 5 is a flow diagram of one embodiment of step 2033 shown in FIG. 4. As shown in FIG. 5, step 2033 can include:
  • Step S1 Initialize the detection scores and weights of each sample feature vector, and initialize the number of weak classifiers.
  • a detection score and a weight are also initialized for each negative sample feature vector in the negative sample feature training set FN; the initialization weight of the k2-th negative sample feature vector is given by a formula that appears only as an image in the source.
  • step 2033 may further include step S2: setting up an empty storage structure and placing the root node into it; the depth of the root node is 1, and the root node includes all positive sample feature vectors and all negative sample feature vectors.
  • in the following, the storage structure is described by taking a first-in-first-out (FIFO) stack as an example.
  • the storage structure can also be other storage structures with data storage functions.
  • in step S2, an empty FIFO node stack is created, and the root node is placed in the stack.
  • the depth of the root node is set to 1, and the root node contains all NP positive sample feature vectors FP_k1 and all NN negative sample feature vectors FN_k2.
  • step S2a may be further included between step S1 and step S3 to normalize the weights of the sample feature vectors.
  • first, the sum W of the weights (the total weight) of all positive sample feature vectors and all negative sample feature vectors is calculated.
  • the weight of each positive sample feature vector and the weight of each negative sample feature vector are then divided by W to normalize the weights, so that the weights of all positive sample feature vectors and all negative sample feature vectors sum to 1.
  • step S2a may be performed before step S2, or after, or simultaneously with step S2.
  • step 2033 may further include step S3 of detecting whether there are unprocessed nodes in the storage structure.
  • step S3 if it is detected that there is no unprocessed node in the storage structure, step S7 is performed; if it is detected that there is an unprocessed node in the storage structure, step S4 is performed.
  • whether a node is processed may be indicated by setting an identifier. For example, if the value of the identifier is 1, it indicates that the node has been processed; if the value of the identifier is 0, it indicates that the node has not been processed.
  • step 2033 may further include step S4: taking an unprocessed node from the storage structure and determining whether the node is splittable.
  • step S4 may include:
  • step S41: extract an unprocessed node from the storage structure, and calculate the ratio RATIO of the weight sum of the positive sample feature vectors in the node to the total weight, as well as the node score SCORE.
  • suppose the node contains NP2 positive sample feature vectors FP_k3 and NN2 negative sample feature vectors FN_k4, where k3 ∈ [1, NP2] and k4 ∈ [1, NN2].
  • NP2 may be equal to NP
  • NN2 may be equal to NN.
  • the ratio RATIO of the positive weight sum to the total weight and the node score SCORE are given by formulas that appear only as images in the source, in which:
  • WPS is the weight sum of the positive sample feature vectors in the node;
  • WNS is the weight sum of the negative sample feature vectors in the node.
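Because the RATIO and SCORE formulas appear only as images in the source, the sketch below substitutes common choices and labels them as assumptions: RATIO as the positive share of the node's own weight, and SCORE as a log-odds smoothed by the constant p > 0 mentioned above.

```python
# Assumed forms of RATIO and SCORE; the patent's exact formulas are images.
import math

def node_ratio_and_score(wps: float, wns: float, p: float = 1e-6):
    ratio = wps / (wps + wns)                      # assumed: node-local total
    score = 0.5 * math.log((wps + p) / (wns + p))  # assumed log-odds form
    return ratio, score
```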
  • step S4 may further include:
  • Step S42 Determine whether the node satisfies the condition a and the condition b at the same time.
  • condition a can be:
  • Condition b can be:
  • if it is determined in step S42 that the node satisfies condition a and condition b at the same time, the node is judged splittable; if it is determined in step S42 that the node cannot satisfy condition a and condition b at the same time, the node is judged not splittable.
  • if it is determined in step S4 that the node is splittable, step S5 is performed; if it is determined in step S4 that the node is not splittable, step S3 is performed.
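Conditions a and b are likewise not reproduced in this text. A plausible reading, consistent with the symbols DEPTH, MAXDEPTH, and TH defined above, is sketched below; the exact inequalities are assumptions.

```python
# A sketch of the splittability test of step S42 (assumed inequalities).
def is_splittable(depth: int, ratio: float,
                  max_depth: int = 8, th: float = 0.01) -> bool:
    cond_a = depth < max_depth       # assumed form of condition a
    cond_b = th < ratio < 1.0 - th   # assumed form of condition b: node not
    return cond_a and cond_b         # already nearly pure
```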
  • Step S5 Determine a feature attribute to be selected when the node is split and a corresponding feature threshold.
  • FIG. 7a is a flowchart of one embodiment of step S5 shown in FIG. 5. As shown in FIG. 7a, step S5 may include:
  • step S51a: randomly select a plurality of feature attributes and set corresponding feature thresholds, and classify all feature vectors of the node (including the positive sample feature vectors and the negative sample feature vectors) for each feature attribute and its corresponding feature threshold, obtaining a plurality of feature vector classification results.
  • for example, NF (e.g., 1000) feature attributes are randomly selected and corresponding feature thresholds are set; the feature vectors in the node are classified for each feature attribute and its corresponding feature threshold, yielding NF feature classification results.
  • the threshold corresponding to the i-th feature attribute is TH_i, i ∈ [1, NF].
  • when classifying based on the i-th feature attribute, each feature vector in the node whose value for the i-th feature attribute is smaller than TH_i is placed in the left subset, and each feature vector whose value for the i-th feature attribute is greater than or equal to TH_i is placed in the right subset.
  • step S5 may further include step S52a: calculating the classification score SCORE_C of each classification result of the node (the formula for SCORE_C appears only as an image in the source).
  • step S5 may further include: step S53a, selecting a feature attribute and a feature threshold corresponding to the classification result with the largest classification score.
  • in step S53a, the classification scores SCORE_C of all classification results are compared, and the feature attribute and feature threshold of the classification result with the largest classification score are selected as the feature attribute and corresponding feature threshold to be used when the node is split.
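A sketch of the random candidate search in steps S51a–S53a. The SCORE_C formula is an image in the source; |WLP − WLN − WRP + WRN| is used here as an assumed purity measure, chosen because it is consistent with the sign test of step S53b.

```python
# A sketch of steps S51a-S53a: score NF random (attribute, threshold) splits.
import random

def best_split(features, labels, weights, nf=1000):
    """features: list of per-sample attribute lists; labels: +1/-1;
    weights: per-sample weights. Returns (attr, threshold, score)."""
    num_attrs = len(features[0])
    best = (None, None, float("-inf"))
    for _ in range(nf):                        # NF random candidates
        attr = random.randrange(num_attrs)
        thr = random.choice([f[attr] for f in features])
        wlp = wln = wrp = wrn = 0.0
        for f, y, w in zip(features, labels, weights):
            left = f[attr] < thr               # value < TH_i goes left
            if y > 0:
                wlp, wrp = (wlp + w, wrp) if left else (wlp, wrp + w)
            else:
                wln, wrn = (wln + w, wrn) if left else (wln, wrn + w)
        score_c = abs(wlp - wln - wrp + wrn)   # assumed SCORE_C
        if score_c > best[2]:
            best = (attr, thr, score_c)
    return best
```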
  • Figure 7b shows a flow chart of another embodiment of step S5 as shown in Figure 5.
  • the feature attribute and the feature threshold at the time of splitting can be determined based on the classification score (classification error) and the positioning error (position variance).
  • x and y are the abscissa and ordinate of the center point of the positive sample original frame, respectively, and w and h are its width and height; x' and y' are the abscissa and ordinate of the center point of the positive sample perturbation frame, respectively, and w' and h' are its width and height.
  • step S5 may include:
  • step S51b: randomly select a plurality of (for example, NF) feature attributes and set corresponding feature thresholds, and classify the feature vectors in the node for each feature attribute and its corresponding feature threshold to obtain the corresponding plurality of (for example, NF) feature classification results.
  • Step S5 may further include: step S52b, calculating a classification score SCORE_C of each classification result.
  • for the description of step S51b, reference may be made to the foregoing description of step S51a, and details are not repeated here.
  • Step S5 may further include: step S53b, determining positive and negative attributes of the left sub-set and the right sub-set in each classification result.
  • in step S53b, the sign of WLP − WLN − WRP + WRN is determined. If the value of WLP − WLN − WRP + WRN is positive, the attribute of the left subset is positive and the attribute of the right subset is negative; if the value of WLP − WLN − WRP + WRN is negative, the attribute of the left subset is negative and the attribute of the right subset is positive.
  • step S5 may further include step S54b: for each classification result, calculating the regression error ERROR_R of the positive sample feature vectors in the subset whose attribute is positive;
  • N is the number of positive sample feature vectors in the subset with positive attributes
  • dx_j, dy_j, dw_j and dh_j are, respectively, the abscissa change rate, ordinate change rate, width change rate and height change rate in the position parameter vector of the positive sample corresponding to the j-th positive sample feature vector in the subset.
  • the corresponding averages over the subset are the average abscissa change rate (the N abscissa change rates summed and then averaged), the average ordinate change rate (the N ordinate change rates summed and then averaged), the average width change rate (the N width change rates summed and then averaged), and the average height change rate (the N height change rates summed and then averaged).
  • Step S5 may further include: step S55b, calculating a total score SCORE_TOTAL of each classification result.
  • the step S5 may further include: step S56b, selecting a feature attribute and a feature threshold corresponding to the classification result with the largest total score.
  • in step S56b, the total scores SCORE_TOTAL of all classification results are compared, and the feature attribute and feature threshold of the classification result with the largest total score are selected as the feature attribute and corresponding feature threshold to be used when the node is split.
  • the selection method shown in FIG. 7b considers not only the classification score but also the positioning error. It can improve the recognition accuracy of the classifier and, to a certain extent, accurately locate pedestrians in the image to be detected.
  • the positioning principle can be seen in the subsequent description.
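A sketch of steps S54b–S55b follows. The exact ERROR_R and SCORE_TOTAL formulas are images in the source; the variance-style regression error and the linear combination with an assumed constant `lam` > 0 below are stand-ins for them.

```python
# Assumed forms of ERROR_R and SCORE_TOTAL (the patent's formulas are images).
import numpy as np

def total_score(score_c: float, pos_params: np.ndarray, lam: float = 1.0):
    """pos_params: (N, 4) array of (dx, dy, dw, dh) for the N positive
    samples in the subset whose attribute is positive."""
    means = pos_params.mean(axis=0)                 # per-column averages
    error_r = ((pos_params - means) ** 2).sum() / len(pos_params)
    return score_c - lam * error_r                  # assumed combination
```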
  • step 2033 may further include step S6: splitting all feature vectors in the node into a left subset and a right subset according to the determined feature attribute and its corresponding feature threshold, adding two new nodes containing the left subset and the right subset, respectively, as children of the current node, and adding the two new nodes to the storage structure.
  • the properties of the two new nodes are the properties of the left and right subsets. If the depth of the current node is DEPTH, the depth of the two new nodes is DEPTH+1.
  • the positive and negative attributes of the left subset and the right subset corresponding to the determined feature attribute and its feature threshold may be determined as previously described; for the subset whose attribute is positive, the average abscissa change rate, average ordinate change rate, average width change rate, and average height change rate in the position parameter vectors of the positive samples corresponding to its positive sample feature vectors are determined and used as the position-related averages of the node corresponding to that subset.
  • step 2033 jumps to step S4.
  • step 2033 may further include step S7: recording the related information of all nodes, such as node depth, child nodes, feature attributes and their corresponding feature thresholds, positive and negative attributes, the position-related averages of positive nodes (that is, the average abscissa change rate, average ordinate change rate, average width change rate, and average height change rate), the total score, and the like, thereby completing the training of one weak classifier.
  • when it is determined in step S3 that there are no unprocessed nodes in the storage structure, no node can be split further, and the decision tree corresponding to one weak classifier is complete. By recording the depth, child nodes, feature attributes, and feature thresholds corresponding to the feature attributes of all nodes of the decision tree, one weak classifier is trained.
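The recorded node information naturally maps onto a small tree structure. The sketch below is a hypothetical representation (the field names are not from the patent) together with the leaf lookup used later at detection time.

```python
# A hypothetical per-node record for the trained decision tree, plus the
# leaf lookup for one feature vector.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class TreeNode:
    depth: int
    feature_attr: Optional[int] = None     # split attribute index
    feature_thr: Optional[float] = None    # threshold TH_i for the split
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None
    is_positive: bool = False               # positive/negative attribute
    pos_means: Tuple[float, float, float, float] = (0.0, 0.0, 0.0, 0.0)
    score: float = 0.0                      # node score used at detection

def classify(node: TreeNode, feature) -> TreeNode:
    """Walk the trained decision tree down to a leaf for one feature vector."""
    while node.feature_attr is not None:
        node = node.left if feature[node.feature_attr] < node.feature_thr \
               else node.right
    return node
```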
  • t is the current number of weak classifiers (incremented and compared with the preset threshold T in step S8).
  • step S9 includes: updating the detection scores and weights of each positive sample feature vector and each negative sample feature vector by using the weak classifier that has currently completed training.
  • the updated weight of each positive sample feature vector and the updated weight of each negative sample feature vector are given by formulas that appear only as images in the source, in which:
  • NP is the number of positive samples
  • NN is the number of negative samples
  • HP_k1 is the current detection score of the k1-th positive sample feature vector, k1 ∈ [1, NP];
  • HN_k2 is the current detection score of the k2-th negative sample feature vector, k2 ∈ [1, NN];
  • hp_k1 is the node score output by the weak classifier that has currently completed training when the k1-th positive sample is input to it;
  • hn_k2 is the node score output by that weak classifier when the k2-th negative sample is input to it.
  • step 2033 can jump to step S2.
  • for the same sample set (i.e., the NP positive samples and NN negative samples described above), the Adaboost algorithm trains multiple (T) weak classifiers, which are then combined to form a strong classifier.
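A sketch of the step S9 update plus the step S2a renormalization. The update formulas appear only as images in the source; the cumulative scores and exponential reweighting below follow the usual real-AdaBoost pattern and are assumptions.

```python
# Assumed score/weight update (step S9) and renormalization (step S2a).
import numpy as np

def update_scores_and_weights(HP, HN, hp, hn):
    """HP/HN: cumulative detection scores of positive/negative feature
    vectors; hp/hn: leaf scores output for them by the newly trained weak
    classifier. All arguments are 1-D numpy arrays."""
    HP, HN = HP + hp, HN + hn               # accumulate detection scores
    wp = np.exp(-HP)                        # assumed: down-weight samples
    wn = np.exp(HN)                         # that are already well classified
    total = wp.sum() + wn.sum()
    return HP, HN, wp / total, wn / total   # weights renormalized to sum to 1
```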
  • the embodiment of the present disclosure provides a pedestrian detection model generation method, and the pedestrian detection model generated based on the generation method can accurately identify pedestrians in the image to be detected.
  • the finally generated pedestrian detection model can not only identify the pedestrian in the image to be detected, but also determine the position of the pedestrian in the image to be detected.
  • FIG. 8 is a flowchart of a pedestrian detection method according to an embodiment of the present disclosure. As shown in FIG. 8, the pedestrian detection method may include:
  • Step 301 Generate a pedestrian detection model.
  • the pedestrian detection model generation method is used to generate the pedestrian detection model.
  • the specific process refer to the content in the foregoing embodiment, and details are not described herein again.
  • the pedestrian detection method may further include: step 302: normalizing the image to be detected, and performing feature vector extraction by using a preset feature extraction algorithm to obtain a feature vector to be detected.
  • the image to be detected may be obtained from a larger image using an original detection frame formed by a preset rule.
  • the image to be detected is exactly the portion of the larger image that is enclosed by the original detection frame.
  • the coordinates of the image to be detected or the center of the original detection frame may be coordinates referring to the larger image.
  • a plurality of images to be detected may be acquired from the larger image by sliding the original detection frame.
  • the image to be detected can then be normalized.
  • normalizing the image to be detected may include converting it into a standard image having a specific width w, height h, and fixed color channels (in this case, the abscissa of the center of the standard image may be x, the ordinate may be y, and the standard image has width w and height h).
  • the feature extraction algorithm used by the weak classifier in the pedestrian detection model is used to perform feature vector extraction on the image to be detected, and the feature vector to be detected is obtained.
  • step 303: classify the feature vector to be detected by using the pedestrian detection model, and determine whether a pedestrian is contained in the image to be detected based on the classification result.
  • FIG. 9 is a flowchart of one embodiment of step 303 shown in FIG. 8.
  • the pedestrian detection model according to the present disclosure may include a plurality of weak classifiers.
  • step 303 may include:
  • step 3031: use each of the plurality of weak classifiers to classify the feature vector to be detected and output the corresponding node score, where the node score output by the k-th weak classifier is recorded as S_k, k ∈ [1, T], and T is the number of weak classifiers.
  • in the k-th weak classifier, the feature vector to be detected is finally classified into a specific leaf node (i.e., a node having no child nodes) of the decision tree corresponding to that classifier.
  • the node score output by the k-th weak classifier is the node score of that specific leaf node as recorded when the k-th weak classifier was trained.
  • step 303 may further include step 3032 of summing the node scores output by all the weak classifiers to obtain the classifier total score S, that is, S = S_1 + S_2 + … + S_T.
  • Step 303 may further include a step 3033 of determining whether the classifier total score S is greater than a preset threshold score S'.
  • in step 3033, if it is determined that S > S', it is determined that the image to be detected contains a pedestrian; otherwise, it is determined that the image to be detected does not contain a pedestrian.
  • the pedestrian detection method provided by the embodiment of the present disclosure can accurately detect whether a pedestrian is included in the image to be detected.
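Steps 3031–3033 reduce to a leaf lookup per weak classifier followed by a thresholded sum, as in this sketch (reusing the `TreeNode`/`classify` helpers from the earlier sketch; `S_prime` stands for the preset threshold score S').

```python
# A sketch of steps 3031-3033: sum leaf scores and compare against S'.
def detect(feature, weak_classifiers, S_prime: float) -> bool:
    """weak_classifiers: list of root TreeNodes, one per weak classifier.
    Returns True if the window is judged to contain a pedestrian."""
    S = sum(classify(root, feature).score for root in weak_classifiers)
    return S > S_prime   # classifier total score against threshold S'
```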
  • FIG. 10 is a flowchart of a pedestrian detection method according to another embodiment of the present disclosure. As shown in FIG. 10, the pedestrian detection method may include:
  • Step 401 Generate a pedestrian detection model.
  • here, the feature attributes used when a node splits are determined based on both the classification score and the positioning error (for example, by the method shown in FIG. 7b).
  • the pedestrian detection method may further include the step 402: normalizing the image to be detected, and performing feature vector extraction by using a preset feature extraction algorithm to obtain a feature vector to be detected.
  • the abscissa of the center point of the image to be detected may be represented as x, and the ordinate may be represented as y.
  • normalizing the image to be detected may include converting it into a standard image having a particular width w, height h, and fixed color channels, and employing the feature extraction algorithm used by the weak classifiers in the pedestrian detection model to perform feature vector extraction on the image to be detected, obtaining the feature vector to be detected.
  • the pedestrian detection method continues to step 403, and each of the plurality of weak classifiers in the pedestrian detection model is used to classify the detected feature vectors, and output corresponding node scores.
  • the specific method is as shown in, for example, step 3031, and details are not described herein again.
  • the pedestrian detection method continues to step 404: for each of the plurality of weak classifiers in the pedestrian detection model, determine whether the attribute of the leaf node into which the feature vector to be detected is classified in that weak classifier is positive, and, when the attribute of that leaf node is positive, acquire the position-related averages recorded for that leaf node in the weak classifier (that is, the average abscissa change rate, average ordinate change rate, average width change rate, and average height change rate).
  • the pedestrian detection method continues to step 405: sum the node scores output by all the weak classifiers to obtain the classifier total score S.
  • the specific method is shown in step 3032 as described above.
  • the pedestrian detection method continues to step 406 to determine if the classifier total score S is greater than a preset threshold score S'.
  • in step 406, if it is determined that S > S', it is determined that the image to be detected contains a pedestrian, and step 407 is performed; otherwise, it is determined that the image to be detected does not contain a pedestrian, and the flow ends.
  • the center of the image to be detected after normalization has an abscissa of x and an ordinate of y, and the normalized image to be detected has a width w and a height h.
  • the adjusted detection frame obtained at this point corresponds to the normalized image to be detected. Therefore, to obtain an adjusted detection frame corresponding to the larger image from which the image to be detected was taken, the adjusted detection frame corresponding to the normalized image may be processed with the inverse of the normalization. The adjusted detection frame will enclose the pedestrian more accurately.
  • the pedestrian detection method provided by the embodiment of the present disclosure can not only detect whether there is a pedestrian in the image to be detected, but also more accurately locate the position of the pedestrian when detecting the presence of a pedestrian in the image to be detected.
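A sketch of the localization in steps 404–407. How the four averaged change rates map back onto the detection frame is not spelled out in this text, so the simple linear adjustment relative to the window size below is an assumption.

```python
# An assumed mapping from averaged change rates to an adjusted detection box.
import numpy as np

def adjust_box(x, y, w, h, leaf_means_list):
    """leaf_means_list: the (dx, dy, dw, dh) averages collected from every
    weak classifier whose leaf attribute was positive."""
    dx, dy, dw, dh = np.mean(leaf_means_list, axis=0)
    return (x + dx * w,      # shift the center by the averaged change rates,
            y + dy * h,      # scaled by the current window size (assumed)
            w * (1.0 + dw),
            h * (1.0 + dh))
```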
  • FIG. 11 is a schematic structural diagram of a computing device according to an embodiment of the present disclosure.
  • the computing device is used to implement the various steps of the described method.
  • the computing device can be used to implement all or a portion of the methods shown in Figures 1 and 3-10.
  • the computing device is only one example of a suitable computing device and is not intended to suggest any limitation as to the scope of use or functionality of the presently disclosed subject matter.
  • Components of the computing device can include, but are not limited to, a processor 11, a memory 12, and a system bus 16 that couples various system components including memory to the processor 11.
  • System bus 16 may be any of several types of bus structures including a memory bus or a memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures.
  • bus architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus, also referred to as the mezzanine bus.
  • Processor 11 may include a microprocessor, controller circuitry, etc., and may be configured to execute a computer program stored in memory 12.
  • Memory 12 can include a wide variety of computer readable media.
  • Computer readable media can be any available media that can be accessed by a computing device and includes both volatile and nonvolatile media, removable and non-removable media.
  • Computer readable media may comprise computer readable storage media and communication media.
  • computer readable storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer readable instructions, data structures, program modules, or other data.
  • computer readable storage media include, but are not limited to, random access memory (RAM), read only memory (ROM), EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by the computing device.
  • communication media typically embody computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and include any information delivery media.
  • the term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
  • communication media may include a wired medium such as a wired network or a direct wired connection, and a wireless medium such as RF and other wireless medium. Combinations of any of the above are also included within the scope of computer readable media.
  • Memory 12 may include computer storage media in the form of volatile and/or nonvolatile memory such as ROM and RAM.
  • a basic input/output system (BIOS), containing the basic routines that help transfer information between elements within the computing device, such as during startup, is typically stored in ROM.
  • the RAM typically contains data and/or program modules that are immediately accessible to the processor 11 and/or are currently being operated by the processor 11.
  • data 13 that may be stored in memory 12 shown in FIG. 11 may include a BIOS, an operating system, a computer program, other program modules, and program data.
  • the computer program when executed in the processor 11, causes the processor 11 to perform the method as previously described.
  • the computing device can also include other removable/non-removable, volatile/non-volatile computer storage media.
  • removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, DVDs, digital video tapes, solid state RAM, Solid state ROM, etc.
  • the computer storage media discussed above provide a computing device with storage of computer-executable instructions, data structures, program modules, and other data.
  • a user can enter commands and information into a computing device through an input device such as a keyboard and pointing device, commonly referred to as a mouse, trackball or touch pad.
  • Other input devices may include a microphone, joystick, game pad, satellite dish, scanner, and the like.
  • processor 11 may include a user input output (I/O) interface 14 coupled to system bus 16.
  • a monitor or other type of display device can be connected to system bus 16 via a user input output (I/O) interface 14, such as a video interface.
  • the computing device can also be connected to other peripheral output devices, such as speakers and printers, through a user input output (I/O) interface 14.
  • the computing device can be coupled to one or more remote computers via a network interface 15.
  • the remote computer can be a personal computer, server, router, network PC, peer device, or other common network node, and typically includes many or all of the elements described above with respect to the computing device.
  • Embodiments of the present disclosure also provide a computer readable medium having stored thereon a computer program that, when executed on a processor, causes the processor to perform methods and functions in accordance with embodiments of the present disclosure.
  • the computer readable medium can include any of the computer readable media described above.
  • Embodiments of the present disclosure also provide a computer program product; when instructions in the computer program product are executed by a processor, the method in accordance with an embodiment of the present disclosure can be implemented.
  • The terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated.
  • Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature.
  • The meaning of "a plurality" is at least two, such as two, three, etc., unless specifically defined otherwise.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure discloses a sample acquisition method, a target detection model generation method, a target detection method, a computing device, and a computer readable medium. The sample acquisition method includes: adding a perturbation to a sample original box pre-marked in an original image to obtain a sample selection box, where the image enclosed by the sample original box contains a target; and extracting the image enclosed by the sample selection box as a sample. The technical solution of the present disclosure can effectively increase the number of samples obtainable from an original image and add background to the samples, thereby effectively improving the recognition accuracy of the trained target detection model.

Description

Sample Acquisition Method, Target Detection Model Generation Method, Target Detection Method
Technical Field
The present disclosure relates to the field of intelligent recognition technologies, and in particular to a sample acquisition method, a target detection model generation method, a target detection method, a computing device, and a computer readable medium.
Background
Target detection is an important research direction in the field of computer vision. It involves automatically detecting the targets contained in an image by intelligently analyzing visual images captured by a camera. Target detection is widely used in fields such as vehicle-assisted driving, intelligent monitoring, and intelligent robots.
In an existing target detection process, a classifier is usually first trained based on pre-acquired positive samples and negative samples, and an image to be detected is then recognized based on the classifier to detect whether the image to be detected contains a target. In such a target detection process, the number and selection quality of the positive samples can, to a certain extent, affect the recognition accuracy of the classifier.
In existing positive sample selection, a positive sample identification box is usually manually marked in an original image, and one positive sample identification box exactly encloses one complete target (without background).
Summary
In one aspect of the present disclosure, a sample acquisition method is provided. The sample acquisition method may include: adding a perturbation to a sample original box pre-marked in an original image to obtain a sample selection box, the image enclosed by the sample original box containing a target; and extracting the image enclosed by the sample selection box as a sample.
In an embodiment, the step of adding a perturbation to the sample original box may include: adding a random perturbation to the center position of the sample original box, and/or the width of the sample original box, and/or the height of the sample original box, to obtain a sample perturbation box, and using the sample perturbation box as the sample selection box.
In an embodiment, the image enclosed by the sample original box may contain a complete image of the target or a partial image of the target.
According to another aspect of the present disclosure, a target detection model generation method is provided. The target detection model generation method may include: acquiring a plurality of positive samples from an original image by using the sample acquisition method described above; acquiring, from the original image, a plurality of images that do not contain a target as negative samples; and generating a target detection model by using a preset classification model algorithm according to the acquired plurality of positive samples and plurality of negative samples.
In an embodiment, the step of generating a target detection model by using a preset classification model algorithm according to the acquired plurality of positive samples and plurality of negative samples may include: standardizing each positive sample of the plurality of positive samples and each negative sample of the plurality of negative samples; performing feature vector extraction on each standardized positive sample and each standardized negative sample by using a preset feature extraction algorithm, to obtain a positive sample feature training set and a negative sample feature training set, the positive sample feature training set including the feature vectors of all the extracted positive samples, and the negative sample feature training set including the feature vectors of all the extracted negative samples; and training a plurality of weak classifiers according to the positive sample feature training set and the negative sample feature training set, all of the plurality of weak classifiers constituting the target detection model.
In an embodiment, the step of training a plurality of weak classifiers according to the positive sample feature training set and the negative sample feature training set includes:
S1. initializing a detection score and a weight corresponding to each positive sample feature vector in the positive sample feature training set, initializing a detection score and a weight corresponding to each negative sample feature vector in the negative sample feature training set, and initializing the number of weak classifiers t=0;
S2. setting up a storage structure and putting a root node into the storage structure, the root node having a depth of 1 and containing all the positive sample feature vectors and all the negative sample feature vectors;
S3. detecting whether the storage structure has any unprocessed node; if it is detected that the storage structure has no unprocessed node, performing step S7; otherwise performing step S4;
S4. taking a node out of the storage structure and judging whether the node can be split; if it is judged that the node can be split, performing step S5; otherwise performing step S3;
S5. determining the feature attribute selected when splitting the node and its corresponding feature threshold;
S6. splitting, according to the determined feature attribute and its corresponding feature threshold, all the feature vectors in the node into a left subset and a right subset, adding two new nodes respectively containing the left subset and the right subset, the two new nodes serving as child nodes of the node, and adding the two new nodes to the storage structure; then jumping to step S4;
S7. recording the relevant information of all nodes, thereby completing the training of one weak classifier;
S8. performing t=t+1 and judging whether t is greater than a preset threshold T; if it is judged that t is greater than T, the step of training a plurality of weak classifiers ends; otherwise performing step S9;
S9. updating the detection scores and weights of each positive sample feature vector in the positive sample feature training set and each negative sample feature vector in the negative sample feature training set by using the weak classifier whose training is currently completed, and performing step S2 based on the updated detection scores and weights.
In an embodiment, between step S1 and step S3, the method may further include: S2a. normalizing the weight of each positive sample feature vector and the weight of each negative sample feature vector.
In an embodiment, in step S1, the initialization weight of the k1-th positive sample feature vector is
Figure PCTCN2019074668-appb-000001
the initialization detection score of the k1-th positive sample feature vector is HP_k1=0, and the initialization weight of the k2-th negative sample feature vector is
Figure PCTCN2019074668-appb-000002
the initialization detection score of the k2-th negative sample feature vector is HN_k2=0, NP is the number of the positive samples, NN is the number of the negative samples, k1∈[1,NP], and k2∈[1,NN].
In an embodiment, step S4 may include:
S41. taking a node out of the storage structure, and calculating the ratio RATIO of the sum of the weights of the positive sample feature vectors in the node to the total weight, as well as the node score SCORE,
the ratio being
RATIO = WPS/(WPS + WNS)
and the node score being
Figure PCTCN2019074668-appb-000004
where WPS is the sum of the weights of the positive sample feature vectors in the node, WNS is the sum of the weights of the negative sample feature vectors in the node, and p is a constant greater than 0;
S42. judging whether the node satisfies both condition a and condition b, condition a being DEPTH<MAXDEPTH, and condition b being TH≤RATIO≤1-TH,
where DEPTH is the depth of the node, MAXDEPTH is a preset maximum node depth, and TH is a preset ratio threshold;
if step S42 judges that the node satisfies both condition a and condition b, it is judged that the node can be split; otherwise, it is judged that the node cannot be split.
In an embodiment, step S5 may include:
S51a. randomly selecting NF feature attributes and setting corresponding feature thresholds, and classifying the positive sample feature vectors and negative sample feature vectors of the node with respect to each feature attribute and its corresponding feature threshold, to obtain NF feature classification results, each classification result containing a left subset and a right subset, where the threshold corresponding to the i-th feature attribute is TH_i, i∈[1,NF]; when classifying based on the i-th feature attribute, the feature vectors in the node whose feature value of the i-th feature attribute is less than TH_i are put into the left subset, and the feature vectors in the node whose feature value of the i-th feature attribute is greater than or equal to TH_i are put into the right subset;
S52a. calculating the classification score SCORE_C of each classification result:
SCORE_C=|WLP-WLN-WRP+WRN|,
where WLP is the sum of the weights of the positive sample feature vectors in the left subset, WLN is the sum of the weights of the negative sample feature vectors in the left subset, WRP is the sum of the weights of the positive sample feature vectors in the right subset, and WRN is the sum of the weights of the negative sample feature vectors in the right subset; and
S53a. selecting the feature attribute and feature threshold corresponding to the classification result with the largest classification score, as the feature attribute and corresponding feature threshold selected when splitting the node.
In an embodiment, the step of acquiring a plurality of positive samples may further include:
acquiring a position parameter vector L=(dx, dy, dw, dh) of each positive sample of the plurality of positive samples,
where the abscissa change rate is dx=(x'-x)/w, the ordinate change rate is dy=(y'-y)/h, the width change rate is dw=w'/w, and the height change rate is dh=h'/h;
x and y are respectively the abscissa and ordinate of the center point of the positive sample original box, and w and h are respectively the width and height of the positive sample original box;
x' and y' are respectively the abscissa and ordinate of the center point of the positive sample perturbation box, and w' and h' are respectively the width and height of the positive sample perturbation box.
In an embodiment, step S5 may include:
S51b. randomly selecting NF feature attributes and setting corresponding feature thresholds, and classifying the node with respect to each feature attribute and its corresponding feature threshold, to obtain NF feature classification results, each classification result containing a left subset and a right subset, where the threshold corresponding to the i-th feature attribute is TH_i, i∈[1,NF]; when classifying based on the i-th feature attribute, the feature vectors in the node whose feature value corresponding to the i-th feature attribute is less than TH_i are put into the left subset, and the feature vectors in the node whose feature value corresponding to the i-th feature attribute is greater than or equal to TH_i are put into the right subset;
S52b. calculating the classification score SCORE_C of each classification result:
SCORE_C=|WLP-WLN-WRP+WRN|;
where WLP is the sum of the weights of the positive sample feature vectors in the left subset, WLN is the sum of the weights of the negative sample feature vectors in the left subset, WRP is the sum of the weights of the positive sample feature vectors in the right subset, and WRN is the sum of the weights of the negative sample feature vectors in the right subset;
S53b. determining the positive/negative attributes of the left subset and the right subset in each classification result, where if the value of WLP-WLN-WRP+WRN is positive, the attribute of the left subset is positive and the attribute of the right subset is negative; otherwise, the attribute of the left subset is negative and the attribute of the right subset is positive;
S54b. calculating the regression error ERROR_R of the positive sample feature vectors in the positive sample subset:
ERROR_R=Var(dx)+Var(dy)+Var(dw)+Var(dh),
where Var(dx)=(1/N)·Σ_{j=1..N}(dx_j-mean_dx)^2, Var(dy)=(1/N)·Σ_{j=1..N}(dy_j-mean_dy)^2, Var(dw)=(1/N)·Σ_{j=1..N}(dw_j-mean_dw)^2, and Var(dh)=(1/N)·Σ_{j=1..N}(dh_j-mean_dh)^2,
N is the number of positive sample feature vectors in the positive sample subset, dx_j, dy_j, dw_j, dh_j are respectively the abscissa change rate, ordinate change rate, width change rate, and height change rate in the position parameter vector of the positive sample corresponding to the j-th positive sample feature vector in the positive sample subset, and
mean_dx=(1/N)·Σ_{j=1..N}dx_j, mean_dy=(1/N)·Σ_{j=1..N}dy_j, mean_dw=(1/N)·Σ_{j=1..N}dw_j, mean_dh=(1/N)·Σ_{j=1..N}dh_j
are respectively the average of the abscissa change rates, the average of the ordinate change rates, the average of the width change rates, and the average of the height change rates in the position parameter vectors of the positive samples corresponding to the N positive sample feature vectors in the positive sample subset;
S55b. calculating the total score SCORE_TOTAL of each classification result:
SCORE_TOTAL=SCORE_C-λ*ERROR_R,
where λ is a constant greater than 0;
S56b. selecting the feature attribute and feature threshold corresponding to the classification result with the largest total score, as the feature attribute and corresponding feature threshold selected when splitting the node.
In an embodiment, in step S9,
the updated detection score of a positive sample feature vector is HP_k1'=HP_k1+hp_k1;
the updated detection score of a negative sample feature vector is HN_k2'=HN_k2+hn_k2;
the updated weight of a positive sample feature vector is
Figure PCTCN2019074668-appb-000015
and the updated weight of a negative sample feature vector is
Figure PCTCN2019074668-appb-000016
where NP is the number of the positive samples, NN is the number of the negative samples, HP_k1 is the current detection score of the k1-th positive sample feature vector, HN_k2 is the current detection score of the k2-th negative sample feature vector, hp_k1 is the node score output by the weak classifier whose training is currently completed when the k1-th positive sample is input to it, and hn_k2 is the node score output by that weak classifier when the k2-th negative sample is input to it.
In an embodiment, the feature extraction algorithm includes at least one of: a histogram of oriented gradients feature extraction algorithm, a luminance-chrominance color feature extraction algorithm, and a local binary pattern feature extraction algorithm.
According to yet another aspect of the present disclosure, a target detection method is provided. The target detection method may include: generating a target detection model by using the generation method described above; standardizing an image to be detected and performing feature vector extraction by using a preset feature extraction algorithm, to obtain a feature vector to be detected; and classifying the feature vector to be detected by using the target detection model, and determining, based on the classification result, whether the image to be detected contains a target.
In an embodiment, the step of classifying the feature vector to be detected by using the target detection model may include: classifying the feature vector to be detected by using each weak classifier of the target detection model, and acquiring the score of the node to which the feature vector to be detected is classified in that weak classifier; summing all the acquired node scores to obtain a classifier total score S; judging whether the classifier total score S is greater than a preset threshold score S'; if it is judged that S>S', it is detected that the image to be detected contains a target; otherwise, it is detected that the image to be detected does not contain a target.
In an embodiment, classifying the feature vector to be detected by using the target detection model may include:
classifying the feature vector to be detected by using each weak classifier of the target detection model, and acquiring the score of the node to which the feature vector to be detected is classified in that weak classifier;
for each weak classifier, judging whether the attribute of the leaf node to which the feature vector to be detected is classified in that weak classifier is positive, and, if it is judged that the attribute of the leaf node is positive, acquiring the average abscissa change rate, average ordinate change rate, average width change rate, and average height change rate corresponding to the leaf node;
summing all the acquired node scores to obtain a classifier total score S;
judging whether the classifier total score S is greater than a preset threshold score S'; if it is judged that S>S', it is detected that the image to be detected contains a target;
after it is detected that the image to be detected contains a target, calculating the position parameter vector of the image to be detected according to the acquired average abscissa change rate, average ordinate change rate, average width change rate, and average height change rate; and
adjusting the detection box enclosing the image to be detected according to the position parameter vector of the image to be detected.
In an embodiment, the position parameter vector corresponding to the target is L'=(dx', dy', dw', dh'),
where
Figure PCTCN2019074668-appb-000017
Figure PCTCN2019074668-appb-000018
Figure PCTCN2019074668-appb-000019
Figure PCTCN2019074668-appb-000020
k denotes the index of a weak classifier, k being greater than or equal to 1 and less than T; M(k) denotes whether the attribute of the leaf node to which the feature vector to be detected is classified in the k-th weak classifier is positive, where M(k)=1 if the attribute of the leaf node is positive, and M(k)=0 if the attribute of the leaf node is negative; SCORE(k) is the node score corresponding to the leaf node to which the feature vector to be detected is classified in the k-th weak classifier;
mean_dx(k), mean_dy(k), mean_dw(k), mean_dh(k)
are respectively the average abscissa change rate, average ordinate change rate, average width change rate, and average height change rate corresponding to the leaf node to which the feature vector to be detected is classified in the k-th weak classifier;
the abscissa of the adjusted detection box is X=x+w*dx', the ordinate is Y=y+h*dy', the width is W=w*dw', and the height is H=h*dh', where x and y are respectively the abscissa and ordinate of the center of the standardized image to be detected, and w and h are respectively the width and height of the standardized image to be detected.
According to still another aspect of the present disclosure, a computing device is provided. The computing device may include a processor, and a memory storing a computer program that, when executed by the processor, causes the processor to perform the sample acquisition method as described above, or the target detection model generation method as described above, or the target detection method as described above.
According to still another aspect of the present disclosure, a computer readable medium is provided. The computer readable medium stores a computer program that, when executed by a processor, causes the processor to perform the sample acquisition method as described above, or the target detection model generation method as described above, or the target detection method as described above.
Brief Description of the Drawings
FIG. 1 is a flowchart of a positive sample acquisition method provided by an embodiment of the present disclosure;
FIG. 2a is a schematic diagram of obtaining a positive sample selection box from a positive sample original box in an embodiment of the present disclosure;
FIG. 2b is a schematic diagram of obtaining a positive sample selection box from a positive sample original box in another embodiment of the present disclosure;
FIG. 3 is a flowchart of a pedestrian detection model generation method provided by an embodiment of the present disclosure;
FIG. 4 is a flowchart of an embodiment of step 203 shown in FIG. 3;
FIG. 5 is a flowchart of an embodiment of step 2033 shown in FIG. 4;
FIG. 6 is a flowchart of an embodiment of step S4 shown in FIG. 5;
FIG. 7a is a flowchart of an embodiment of step S5 shown in FIG. 5;
FIG. 7b is a flowchart of another embodiment of step S5 shown in FIG. 5;
FIG. 8 is a flowchart of a pedestrian detection method provided by an embodiment of the present disclosure;
FIG. 9 is a flowchart of an embodiment of step 303 shown in FIG. 8;
FIG. 10 is a flowchart of a pedestrian detection method provided by another embodiment of the present disclosure; and
FIG. 11 is a schematic structural diagram of a computing device provided by an embodiment of the present disclosure.
Detailed Description
It should be noted that, to facilitate understanding of the present invention, the present invention is described in the present disclosure with a pedestrian as an example of the target. However, those skilled in the art can understand that the target according to the present disclosure may be any other object, such as a cat, a dog, or an elephant. It should also be noted that, in the following, the sample acquisition method according to the present application is described only with a positive sample as an example; however, those skilled in the art should understand that the sample acquisition method according to the present application can also be applied to the acquisition of other samples, for example, to the acquisition of negative samples.
In the related art, only one positive sample can be extracted for one pedestrian in an original image, so the number of positive samples extractable from the original image is limited. In addition, a classifier trained in the related art on positive samples extracted based on positive sample identification boxes has difficulty recognizing pedestrian images that contain background, resulting in low recognition accuracy.
In order to solve at least one of the technical problems existing in the related art, the present disclosure proposes a sample acquisition method, a pedestrian detection model generation method, a pedestrian detection method, a computing device, and a computer readable medium.
The sample acquisition method, pedestrian detection model generation method, pedestrian detection method, computing device, and computer readable medium provided by the present disclosure can effectively improve the recognition accuracy of the pedestrian detection model by increasing the number of extractable positive samples and by adding background to the positive samples. In addition, by acquiring the position parameter vector of the pedestrian in each positive sample and determining the feature attribute for splitting based on both the classification score and the localization error during the training of the weak classifiers, the present disclosure enables the finally formed pedestrian detection model not only to detect whether there is a pedestrian in the image to be detected, but also to accurately locate the position of the pedestrian when a pedestrian is detected.
To enable those skilled in the art to better understand the technical solutions of the present disclosure, the sample acquisition method, pedestrian detection model generation method, pedestrian detection method, computing device, and computer readable storage medium provided by the present disclosure are described in detail below with reference to the accompanying drawings. FIG. 1 is a flowchart of a positive sample acquisition method provided by an embodiment of the present disclosure. As shown in FIG. 1, the positive sample acquisition method may include:
Step 101: adding a perturbation to a positive sample original box pre-marked in an original image to obtain a positive sample selection box.
The positive sample original box in the present disclosure is the manually marked positive sample identification box of the related art, which exactly encloses one complete pedestrian. According to the present disclosure, a random perturbation is added to the positive sample original box so that the positive sample original box moves within a certain area, thereby obtaining a positive sample selection box.
FIG. 2a is a schematic diagram of obtaining a positive sample selection box from a positive sample original box in an embodiment of the present disclosure. As shown in FIG. 2a, as an alternative, in step 101 a random perturbation is added to the center position of the positive sample original box to obtain a positive sample perturbation box, and the positive sample perturbation box is used as the positive sample selection box. In this case, the shape and size of the positive sample selection box are exactly the same as those of the positive sample perturbation box. Similarly to the embodiment shown in FIG. 2a, in step 101 a perturbation may be added to the width of the positive sample original box or to the height of the positive sample original box to obtain a positive sample perturbation box.
FIG. 2b is a schematic diagram of obtaining a positive sample selection box from a positive sample original box in another embodiment of the present disclosure. As shown in FIG. 2b, unlike the positive sample acquisition process shown in FIG. 2a, in the acquisition process shown in FIG. 2b, after a random perturbation is added to the center position of the positive sample original box to obtain a positive sample perturbation box, the positive sample perturbation box may be randomly enlarged or reduced while keeping its center position unchanged, to obtain the positive sample selection box. Randomly enlarging or reducing the positive sample perturbation box to obtain the positive sample selection box can further improve the robustness of the subsequently trained pedestrian detection model. Thus, by improving the diversity of the positive sample selection boxes, the anti-interference capability of the detection algorithm can be improved. That is, even if the actual selection box has an error and does not exactly enclose the target, the detector can still detect the target accurately.
Taking randomly enlarging the positive sample perturbation box by a factor of α as an example, the center position of the positive sample perturbation box is kept unchanged, and the width and height of the positive sample perturbation box are simultaneously enlarged by a factor of √α, thereby obtaining the positive sample selection box. In an embodiment, different perturbations may be added to the width and the height of the positive sample perturbation box.
In practical applications, the quality of the positive samples should be ensured: the image enclosed by the positive sample selection box should contain at least a partial image of the pedestrian. In an embodiment, the image enclosed by the positive sample selection box may contain a complete image of the pedestrian.
In step 101, by adding a perturbation to the positive sample original box, the background around the pedestrian can be captured. That is, the positive sample selection box contains not only the pedestrian image but also the background image around the pedestrian. This can greatly improve the robustness of the pedestrian detection model.
As shown in FIG. 1, the method may further include step 102: extracting the image enclosed by the positive sample selection box as a positive sample.
In this embodiment, for one pedestrian in the original image, countless different positive sample perturbation boxes can be obtained by adding random perturbations to the positive sample original box, and corresponding positive sample selection boxes can then be obtained by enlarging the positive sample perturbation boxes. The number of positive sample selection boxes is, in theory, also unlimited. Therefore, the positive sample acquisition method according to the present disclosure can extract countless different positive samples.
The technical solution of the present disclosure can acquire a plurality of different positive samples based on one pedestrian, which can effectively increase the number of positive samples extractable from the original image and helps improve the recognition accuracy of the classifier. In addition, since each positive sample contains not only the complete pedestrian image but also the background image around the pedestrian, the classifier's ability to recognize pedestrians can also be improved to a certain extent.
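For illustration only, the following Python sketch mirrors steps 101-102; the perturbation ranges (max_shift, min_scale, max_scale) are hypothetical parameters, not values specified in the present disclosure.

```python
import random

def perturb_box(x, y, w, h, max_shift=0.1, min_scale=1.0, max_scale=1.3):
    """Sketch of steps 101-102: jitter a marked box (center x, y, width w,
    height h), then randomly enlarge it about its own center. Parameter
    names and ranges are assumptions, not values from the patent."""
    # Randomly perturb the center position within a fraction of the box size.
    x2 = x + random.uniform(-max_shift, max_shift) * w
    y2 = y + random.uniform(-max_shift, max_shift) * h
    # Randomly enlarge the area by a factor alpha, i.e. scale width and
    # height by sqrt(alpha), keeping the perturbed center fixed.
    alpha = random.uniform(min_scale, max_scale)
    s = alpha ** 0.5
    return x2, y2, w * s, h * s

# Many different selection boxes (hence positive samples) from one marked box:
boxes = [perturb_box(100.0, 120.0, 40.0, 80.0) for _ in range(10)]
```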
FIG. 3 is a flowchart of a pedestrian detection model generation method provided by an embodiment of the present disclosure. As shown in FIG. 3, the pedestrian detection model generation method may include:
Step 201: acquiring a plurality of positive samples.
In step 201, a plurality of different positive samples are acquired by using the positive sample acquisition method provided in the above embodiment; the specific process is not repeated here. In this embodiment, the number of positive samples may be NP.
The pedestrian detection model generation method may further include step 202: extracting, from the original image, a plurality of images that do not contain a pedestrian as negative samples.
A plurality of images that do not contain a pedestrian are randomly selected from the original image, each of which can serve as a negative sample. The process of selecting negative samples is the same as in the related art and is not described in detail here. In this embodiment, the number of negative samples may be NN.
It should be noted that, in practical applications, the number NN of negative samples is often greater than the number NP of positive samples; for example, the number NP of positive samples is 2000 and the number of negative samples is 20000.
As shown in FIG. 3, the pedestrian detection model generation method may further include step 203: generating a pedestrian detection model by using a preset classification model algorithm according to the acquired plurality of positive samples and plurality of negative samples.
FIG. 4 is a flowchart of an embodiment of step 203 shown in FIG. 3. As shown in FIG. 4, step 203 may include:
Step 2031: standardizing each positive sample and each negative sample.
First, each sample image X is scaled and converted into a standard image having a specific width W and height H and feature color channels (for example, RGB channels). The pixel data of the standard image is denoted as X(m, n, c), where m denotes the ordinate on the image, with value range [1, H]; n denotes the abscissa, with value range [1, W]; and c denotes the RGB color channel, taking the values 1, 2, 3.
As shown in FIG. 4, step 203 may further include step 2032: performing feature vector extraction on each standardized positive sample and each standardized negative sample by using a preset feature extraction algorithm, to obtain a positive sample feature training set and a negative sample feature training set.
For example, feature vector extraction is performed on each standardized positive sample and each standardized negative sample by using a preset feature extraction algorithm, to obtain a positive sample feature training set FP and a negative sample feature training set FN, where FP={FP_1, ..., FP_k1, ..., FP_NP} and FN={FN_1, ..., FN_k2, ..., FN_NN}; FP_k1 denotes the feature vector corresponding to the k1-th positive sample, k1∈[1,NP], and FN_k2 denotes the feature vector corresponding to the k2-th negative sample, k2∈[1,NN].
In an embodiment, the preset feature extraction algorithm may include at least one of: a Histogram of Oriented Gradient (HOG) feature extraction algorithm, a luminance-chrominance (LUV) color feature extraction algorithm, and a Local Binary Patterns (LBP) feature extraction algorithm. In an embodiment, the preset classification model algorithm described above may include the preset feature extraction algorithm described above.
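As an illustrative sketch of steps 2031-2032 above, the following code standardizes a sample to a fixed size and extracts a HOG feature vector with scikit-image; the image size and HOG parameters are common defaults, assumed here rather than taken from the present disclosure.

```python
import numpy as np
from skimage.transform import resize
from skimage.feature import hog

def extract_feature_vector(image, W=64, H=128):
    """Step 2031: scale a sample to a fixed H x W standard image;
    step 2032: extract a HOG feature vector from it."""
    gray = image.mean(axis=2) if image.ndim == 3 else image  # to grayscale
    std_img = resize(gray, (H, W))                           # standard image
    return hog(std_img, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))

# Feature training sets FP / FN could then be built as, e.g.:
# FP = np.stack([extract_feature_vector(s) for s in positive_samples])
```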
As shown in FIG. 4, step 203 may further include step 2033: training a plurality of weak classifiers according to the positive sample feature training set and the negative sample feature training set. All the weak classifiers together constitute the pedestrian detection model. In an embodiment, the present disclosure uses the Adaboost algorithm to generate a plurality of weak classifiers, and then forms a strong classifier based on the plurality of weak classifiers.
FIG. 5 is a flowchart of an embodiment of step 2033 shown in FIG. 4. As shown in FIG. 5, step 2033 may include:
Step S1: initializing the detection score and weight of each sample feature vector, and initializing the number of weak classifiers.
For example, the detection score and weight corresponding to each positive sample feature vector in the positive sample feature training set FP are initialized, where the initialization weight of the k1-th positive sample feature vector may be
Figure PCTCN2019074668-appb-000024
and the initialization detection score of the k1-th positive sample feature vector may be HP_k1=0.
The detection score and weight corresponding to each negative sample feature vector in the negative sample feature training set FN are initialized, where the initialization weight of the k2-th negative sample feature vector may be
Figure PCTCN2019074668-appb-000025
and the initialization detection score of the k2-th negative sample feature vector may be HN_k2=0.
The number of weak classifiers is initialized as t=0, indicating that the number t of weak classifiers currently obtained is 0.
As shown in FIG. 5, step 2033 may further include step S2: setting up an empty storage structure and putting the root node into the storage structure, the root node having a depth of 1 and containing all the positive sample feature vectors and all the negative sample feature vectors.
In this embodiment, the description is given by taking the storage structure being a first-in-first-out (FIFO) stack as an example. Of course, the storage structure may also be another storage structure having a data storage function.
In step S2, an empty first-in-first-out node stack is created, and the root node is put into the stack; the depth of the root node is set to 1, and the root node contains all NP positive sample feature vectors FP_k1 and all NN negative sample feature vectors FN_k2, k1∈[1,NP], k2∈[1,NN].
It should be noted that, to facilitate subsequent calculations, a step S2a may be further included between step S1 and step S3: normalizing the weight of each sample feature vector.
In an embodiment, the total sum W of the weights of all positive sample feature vectors and all negative sample feature vectors (the total weight) may be calculated, and the weight of each positive sample feature vector and the weight of each negative sample feature vector are divided by W to normalize the weights, so that the sum of the normalized weights of all positive sample feature vectors and all negative sample feature vectors is 1.
It should be noted that step S2a may be performed before step S2, after step S2, or simultaneously with step S2.
As shown in FIG. 5, step 2033 may further include step S3: detecting whether there is any unprocessed node in the storage structure.
In step S3, if it is detected that there is no unprocessed node in the storage structure, step S7 is performed; if it is detected that there is an unprocessed node in the storage structure, step S4 is performed. In an embodiment, an identifier may be set to indicate whether a node has been processed. For example, if the value of the identifier is 1, it indicates that the node has been processed; if the value of the identifier is 0, it indicates that the node has not been processed.
As shown in FIG. 5, step 2033 may further include step S4: taking an unprocessed node out of the storage structure and judging whether the node can be split.
FIG. 6 is a flowchart of an embodiment of step S4 shown in FIG. 5. As shown in FIG. 6, step S4 may include:
Step S41: taking an unprocessed node out of the storage structure, and calculating the ratio RATIO of the sum of the weights of the positive sample feature vectors in the node to the total weight, as well as the score SCORE of the node.
Assume that the depth of the taken-out node is DEPTH, and the node contains NP2 positive sample feature vectors FP_k3 and NN2 negative sample feature vectors FN_k4, where k3∈[1,NP2] and k4∈[1,NN2]. When the node is the root node, NP2 may be equal to NP and NN2 may be equal to NN. Further, the ratio RATIO of the sum of the weights of the positive sample feature vectors in the node to the total weight and the node score SCORE are respectively:
RATIO = WPS/(WPS + WNS)
Figure PCTCN2019074668-appb-000028
where
Figure PCTCN2019074668-appb-000029
WPS is the sum of the weights of the positive sample feature vectors in the node, WNS is the sum of the weights of the negative sample feature vectors in the node, and p is a constant greater than 0 (for example, p=5).
As shown in FIG. 6, step S4 may further include:
Step S42: judging whether the node satisfies both condition a and condition b.
In an embodiment, condition a may be:
DEPTH<MAXDEPTH;
and condition b may be:
TH≤RATIO≤1-TH;
MAXDEPTH is a preset maximum node depth (for example, MAXDEPTH=5), and TH is a preset ratio threshold (for example, TH=0.001).
If step S42 judges that the node satisfies both condition a and condition b, it is judged that the node can be split; if step S42 judges that the node cannot satisfy both condition a and condition b, it is judged that the node cannot be split.
If it is judged in step S4 that the node can be split, step S5 is performed; if it is judged in step S4 that the node cannot be split, step S3 is performed.
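As a minimal sketch of the splittability test of steps S41-S42 (the node score SCORE is defined only in a formula image and is therefore omitted here; all names are illustrative):

```python
def can_split(node_weights_pos, node_weights_neg, depth,
              max_depth=5, th=0.001):
    """Steps S41-S42 sketch: a node may be split only if it is shallower
    than MAXDEPTH and its positive-weight ratio is neither too small nor
    too close to 1 (i.e. the node is not already nearly pure)."""
    wps = sum(node_weights_pos)          # weight sum of positives in node
    wns = sum(node_weights_neg)          # weight sum of negatives in node
    ratio = wps / (wps + wns)            # RATIO = WPS / (WPS + WNS)
    return depth < max_depth and th <= ratio <= 1 - th
```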
Step S5: determining the feature attribute to be selected when splitting the node and its corresponding feature threshold.
As an alternative of the present disclosure, in step S5 the feature attribute and feature threshold for splitting may be determined based only on the classification score (classification error). FIG. 7a is a flowchart of an embodiment of step S5 shown in FIG. 5. As shown in FIG. 7a, step S5 may include:
Step S51a: randomly selecting a plurality of feature attributes and setting corresponding feature thresholds, and classifying all the feature vectors of the node (including the positive sample feature vectors and the negative sample feature vectors) with respect to each feature attribute and its corresponding feature threshold, to obtain a plurality of feature vector classification results.
For example, in step S51a, NF (for example, 1000) feature attributes are randomly selected and corresponding feature thresholds are set; the feature vectors in the node are classified with respect to each feature attribute and its corresponding feature threshold, to obtain NF feature classification results; each classification result contains a left subset and a right subset. Assuming that the threshold corresponding to the i-th feature attribute is TH_i, i∈[1,NF], then when classifying based on the i-th feature attribute, the feature vectors in the node whose feature value of the i-th feature attribute is less than TH_i may be put into the left subset, and the feature vectors in the node whose feature value of the i-th feature attribute is greater than or equal to TH_i may be put into the right subset.
As shown in FIG. 7a, step S5 may further include step S52a: calculating the classification score SCORE_C of each classification result of the node.
The classification score is SCORE_C=|WLP-WLN-WRP+WRN|, where WLP is the sum of the weights of the positive sample feature vectors in the left subset, WLN is the sum of the weights of the negative sample feature vectors in the left subset, WRP is the sum of the weights of the positive sample feature vectors in the right subset, and WRN is the sum of the weights of the negative sample feature vectors in the right subset.
As shown in FIG. 7a, step S5 may further include step S53a: selecting the feature attribute and feature threshold corresponding to the classification result with the largest classification score.
In step S53a, the classification scores SCORE_C of the classification results are compared and sorted, and the feature attribute and feature threshold of the classification result with the largest classification score are selected as the feature attribute and corresponding feature threshold to be selected when splitting the node.
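The candidate search of steps S51a-S53a might be sketched as follows; the NumPy array layout and the way candidate thresholds are sampled are assumptions made for illustration:

```python
import numpy as np

def best_split(F, y, w, nf=1000, rng=np.random.default_rng(0)):
    """Steps S51a-S53a sketch: F is an (n_samples, n_features) matrix of
    the feature vectors in the node, y holds +1/-1 labels, w holds weights.
    Try NF random (feature attribute, threshold) candidates and keep the
    one maximizing SCORE_C = |WLP - WLN - WRP + WRN|."""
    best = (None, None, -1.0)  # (feature index, threshold, score)
    for _ in range(nf):
        i = rng.integers(F.shape[1])                     # random attribute
        th = rng.uniform(F[:, i].min(), F[:, i].max())   # its threshold
        left = F[:, i] < th                              # left subset mask
        wlp = w[left & (y > 0)].sum(); wln = w[left & (y < 0)].sum()
        wrp = w[~left & (y > 0)].sum(); wrn = w[~left & (y < 0)].sum()
        score_c = abs(wlp - wln - wrp + wrn)
        if score_c > best[2]:
            best = (i, th, score_c)
    return best
```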
FIG. 7b shows a flowchart of another embodiment of step S5 shown in FIG. 5. In this embodiment, the feature attribute and feature threshold for splitting may be determined based on both the classification score (classification error) and the localization error (position variance).
In this case, while acquiring the positive samples through step 201, the position parameter vector L=(dx, dy, dw, dh) of each positive sample also needs to be acquired, where the abscissa change rate is dx=(x'-x)/w, the ordinate change rate is dy=(y'-y)/h, the width change rate is dw=w'/w, and the height change rate is dh=h'/h;
x and y are respectively the abscissa and ordinate of the center point of the positive sample original box, and w and h are respectively the width and height of the positive sample original box; x' and y' are respectively the abscissa and ordinate of the center point of the positive sample perturbation box, and w' and h' are respectively the width and height of the positive sample perturbation box.
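A sketch of these regression targets; the closed forms below are reconstructed from the inverse relations used later (X=x+w*dx', etc.) and should be read as an assumption, since the present disclosure gives the formulas only as images:

```python
def position_parameter_vector(orig_box, perturb_box):
    """Compute L = (dx, dy, dw, dh) relating a positive sample original box
    (x, y, w, h) to its perturbation box (x', y', w', h')."""
    x, y, w, h = orig_box
    x2, y2, w2, h2 = perturb_box
    return ((x2 - x) / w,   # dx: abscissa change rate
            (y2 - y) / h,   # dy: ordinate change rate
            w2 / w,         # dw: width change rate
            h2 / h)         # dh: height change rate
```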
As shown in FIG. 7b, step S5 may include:
Step S51b: randomly selecting a plurality of (for example, NF) feature attributes and setting corresponding feature thresholds, and classifying the feature vectors in the node with respect to each feature attribute and its corresponding feature threshold, to obtain a plurality of corresponding (for example, NF) feature classification results.
Step S5 may further include step S52b: calculating the classification score SCORE_C of each classification result.
For descriptions of steps S51b and S52b, reference may be made to the foregoing descriptions of steps S51a and S52a, which are not repeated here.
Step S5 may further include step S53b: determining the positive/negative attributes of the left subset and the right subset in each classification result.
In step S53b, the sign of the value of WLP-WLN-WRP+WRN is judged: if the value of WLP-WLN-WRP+WRN is positive, the attribute of the left subset is positive and the attribute of the right subset is negative; if the value of WLP-WLN-WRP+WRN is negative, the attribute of the left subset is negative and the attribute of the right subset is positive.
Step S5 may further include step S54b: for each classification result, calculating the regression error ERROR_R of the positive sample feature vectors in the subset whose attribute is positive;
the regression error is ERROR_R=Var(dx)+Var(dy)+Var(dw)+Var(dh),
where Var(dx)=(1/N)·Σ_{j=1..N}(dx_j-mean_dx)^2, Var(dy)=(1/N)·Σ_{j=1..N}(dy_j-mean_dy)^2, Var(dw)=(1/N)·Σ_{j=1..N}(dw_j-mean_dw)^2, and Var(dh)=(1/N)·Σ_{j=1..N}(dh_j-mean_dh)^2,
N is the number of positive sample feature vectors in the subset whose attribute is positive, dx_j, dy_j, dw_j, dh_j are respectively the abscissa change rate, ordinate change rate, width change rate, and height change rate in the position parameter vector of the positive sample corresponding to the j-th positive sample feature vector in the subset, and
mean_dx=(1/N)·Σ_{j=1..N}dx_j, mean_dy=(1/N)·Σ_{j=1..N}dy_j, mean_dw=(1/N)·Σ_{j=1..N}dw_j, mean_dh=(1/N)·Σ_{j=1..N}dh_j
are respectively the average of the abscissa change rates in the position parameter vectors of the positive samples corresponding to the N positive sample feature vectors in the subset (that is, the result of adding the N abscissa change rates and then averaging), the average of the ordinate change rates, the average of the width change rates, and the average of the height change rates, each obtained analogously.
Step S5 may further include step S55b: calculating the total score SCORE_TOTAL of each classification result.
In an embodiment, the total score may be SCORE_TOTAL=SCORE_C-λ*ERROR_R,
where λ is a constant greater than 0 (for example, λ=0.1), which may be selected according to practical experience.
Step S5 may further include step S56b: selecting the feature attribute and feature threshold corresponding to the classification result with the largest total score.
In step S56b, the total scores SCORE_TOTAL of the classification results are compared and sorted, and the feature attribute and feature threshold of the classification result with the largest total score are selected as the feature attribute and corresponding feature threshold to be selected when splitting the node.
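Extending the earlier split sketch, the localization-aware scoring of steps S53b-S55b can be written as follows (array names as before; L_pos is assumed to be a NumPy array, and the population variance is used, consistent with the 1/N averages defined above):

```python
import numpy as np

def score_total(wlp, wln, wrp, wrn, L_pos, lam=0.1):
    """Steps S53b-S55b sketch: L_pos is an (N, 4) array of (dx, dy, dw, dh)
    vectors of the positive samples that fell into the subset whose
    attribute is positive. Returns SCORE_TOTAL = SCORE_C - lambda * ERROR_R."""
    score_c = abs(wlp - wln - wrp + wrn)
    # ERROR_R: sum of per-component variances of the position parameters.
    error_r = L_pos.var(axis=0).sum() if len(L_pos) else 0.0
    return score_c - lam * error_r
```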
Unlike the selection of the feature attribute for splitting a node shown in FIG. 7a, the selection method shown in FIG. 7b considers not only the classification score but also the localization error. It can not only improve the recognition accuracy of the classifier, but also, to a certain extent, accurately locate the pedestrian in the image to be detected. The localization principle is described later.
Returning to FIG. 5, step 2033 may further include step S6: splitting, according to the determined feature attribute and its corresponding feature threshold, all the feature vectors in the node into a left subset and a right subset, adding two new nodes respectively containing the left subset and the right subset, the two new nodes serving as child nodes of the current node, and adding the two new nodes to the storage structure. The attributes of the two new nodes are the attributes of the left subset and the right subset. If the depth of the current node is DEPTH, the depth of the two new nodes is DEPTH+1. In an embodiment, when both the classification score and the localization error are considered, the positive/negative attributes of the left subset and the right subset corresponding to the determined feature attribute and its corresponding feature threshold may be determined as described above, and the average abscissa change rate, average ordinate change rate, average width change rate, and average height change rate in the position parameter vectors of the positive samples corresponding to the positive sample feature vectors in whichever of the left subset and the right subset has the positive attribute may be determined as the position-related averages of the node corresponding to that subset.
After step S6 is performed, step 2033 jumps to step S4.
As shown in FIG. 5, step 2033 may further include step S7: recording the relevant information of all nodes, such as the node depth, child nodes, feature attribute and its corresponding feature threshold, positive/negative attribute, the position-related averages of positive nodes (that is, the average abscissa change rate, average ordinate change rate, average width change rate, and average height change rate), the total score, and the like, thereby completing the training of one weak classifier.
When it is judged in step S3 that no node in the storage structure remains unprocessed, it indicates that no node can be split, and the decision tree corresponding to one classifier is complete. By recording information such as the depth, child nodes, feature attributes, and corresponding feature thresholds of all nodes of the decision tree, the training of one weak classifier is completed.
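Putting steps S2-S7 together, the tree-growing loop might look like the following skeleton; it reuses the hypothetical can_split and best_split helpers sketched above, assumes F, y, w are NumPy arrays, and keeps only illustrative node bookkeeping:

```python
from collections import deque

def train_weak_classifier(F, y, w, max_depth=5, th=0.001):
    """Steps S2-S7 sketch: grow one decision tree breadth-first from a
    FIFO structure; each queue entry is (sample index list, depth)."""
    root = (list(range(len(y))), 1)
    queue, recorded_nodes = deque([root]), []
    while queue:                                   # S3: unprocessed nodes?
        idx, depth = queue.popleft()               # S4: take out a node
        wp = [w[i] for i in idx if y[i] > 0]
        wn = [w[i] for i in idx if y[i] <= 0]
        if not can_split(wp, wn, depth, max_depth, th):
            recorded_nodes.append((idx, depth, None))      # leaf node
            continue
        feat, thresh, _ = best_split(F[idx], y[idx], w[idx])  # S5
        left = [i for i in idx if F[i, feat] < thresh]        # S6
        right = [i for i in idx if F[i, feat] >= thresh]
        queue.append((left, depth + 1))
        queue.append((right, depth + 1))
        recorded_nodes.append((idx, depth, (feat, thresh)))
    return recorded_nodes                          # S7: record node info
```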
In an embodiment, step 2033 may further include step S8: performing t=t+1 and judging whether t is greater than the preset threshold T.
t is the current number of weak classifiers, and the preset threshold T denotes the upper limit of the number of weak classifiers (for example, T=2000). If it is judged that t is greater than T, it indicates that the number of weak classifiers has reached the upper limit, and the step of training weak classifiers ends; otherwise, step S9 is performed.
Step S9 includes: updating the detection scores and weights of the positive sample feature vectors and the negative sample feature vectors by using the weak classifier whose training is currently completed.
In an embodiment, the updated detection score of a positive sample feature vector may be HP_k1'=HP_k1+hp_k1;
the updated detection score of a negative sample feature vector may be HN_k2'=HN_k2+hn_k2;
and the updated weight of a positive sample feature vector may be
Figure PCTCN2019074668-appb-000040
and the updated weight of a negative sample feature vector may be
Figure PCTCN2019074668-appb-000041
NP is the number of the positive samples, NN is the number of the negative samples, HP_k1 is the current detection score of the k1-th positive sample feature vector, k1∈[1,NP], HN_k2 is the current detection score of the k2-th negative sample feature vector, k2∈[1,NN], hp_k1 is the node score output by the weak classifier whose training is currently completed when the k1-th positive sample is input to it, and hn_k2 is the node score output by that weak classifier when the k2-th negative sample is input to it.
After the detection scores and weights of the sample feature vectors are updated, a new weak classifier can continue to be trained, for the same initially provided positive sample set and negative sample set, using the updated detection scores and weights. Thus, step 2033 may jump to step S2. Generally, for the Adaboost algorithm, a plurality (T) of weak classifiers need to be trained for the same sample set (that is, the NP positive samples and NN negative samples described above), and the plurality of weak classifiers can then be combined to form a strong classifier.
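The re-weighting formulas of step S9 are given only as images; the exponential rule below is a common AdaBoost-style choice consistent with the accumulated detection scores, and is explicitly an assumption rather than the formula of the present disclosure:

```python
import numpy as np

def update_scores_and_weights(HP, HN, hp, hn):
    """Step S9 sketch: accumulate the node scores output by the newly
    trained weak classifier, then re-weight the samples. The exponential
    re-weighting (w ~ exp(-H) for positives, exp(+H) for negatives)
    is an assumed standard AdaBoost-style rule, not the patent's own."""
    HP2, HN2 = HP + hp, HN + hn          # HP' = HP + hp, HN' = HN + hn
    wp, wn = np.exp(-HP2), np.exp(HN2)   # misclassified samples gain weight
    z = wp.sum() + wn.sum()              # normalize so all weights sum to 1
    return HP2, HN2, wp / z, wn / z
```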
An embodiment of the present disclosure provides a pedestrian detection model generation method, and a pedestrian detection model generated based on the generation method can accurately recognize pedestrians in an image to be detected. In addition, when the position parameter vectors are acquired together with the positive samples, the finally generated pedestrian detection model can not only recognize the pedestrian in the image to be detected, but also determine the position of the pedestrian in the image to be detected.
FIG. 8 is a flowchart of a pedestrian detection method provided by an embodiment of the present disclosure. As shown in FIG. 8, the pedestrian detection method may include:
Step 301: generating a pedestrian detection model.
In step 301, a pedestrian detection model is generated by using the pedestrian detection model generation method provided in the above embodiment; for the specific process, reference may be made to the above embodiment, which is not repeated here.
The pedestrian detection method may further include step 302: standardizing the image to be detected and performing feature vector extraction by using a preset feature extraction algorithm, to obtain a feature vector to be detected.
According to the present disclosure, the image to be detected may be obtained from a larger image by using an original detection box formed according to a preset rule. The image to be detected is exactly the portion of the larger image enclosed by the original detection box. The coordinates of the center of the image to be detected or of the original detection box may be given with reference to the larger image. In an embodiment, a plurality of images to be detected may be obtained from the larger image by sliding the original detection box.
According to the present disclosure, the image to be detected may be standardized. Standardizing the image to be detected may include converting the image to be detected into a standard image having a specific width w and height h and feature color channels (at this time, the abscissa of the center of the standard image may be x and the ordinate may be y, and the width and height of the standard image are w and h respectively), and performing feature vector extraction on the image to be detected by using the feature extraction algorithm used by the weak classifiers in the pedestrian detection model, to obtain the feature vector to be detected.
Step 303: classifying the feature vector to be detected by using the pedestrian detection model, and determining, based on the classification result, whether the image to be detected contains a pedestrian.
FIG. 9 is a flowchart of an embodiment of step 303 shown in FIG. 8. As described above, the pedestrian detection model according to the present disclosure may include a plurality of weak classifiers. In this case, as shown in FIG. 9, step 303 may include:
Step 3031: classifying the feature vector to be detected by using each of the plurality of weak classifiers, and outputting the corresponding node score, where the node score output by the k-th classifier is denoted as S_k, k∈[1,T], T being the number of weak classifiers. It should be noted that, for the k-th classifier for example, the feature vector to be detected will ultimately be classified into a specific leaf node (that is, a node without child nodes) of the decision tree corresponding to the k-th classifier. Thus, for the feature vector to be detected, the node score output by the k-th weak classifier is actually the node score of that specific leaf node recorded when training the k-th weak classifier, that is, it is calculated according to
Figure PCTCN2019074668-appb-000042
As shown in FIG. 9, step 303 may further include step 3032: summing the node scores output by all the weak classifiers to obtain the classifier total score S, that is, S = Σ_{k=1..T} S_k.
Step 303 may further include step 3033: judging whether the classifier total score S is greater than a preset threshold score S'.
In step 3033, if it is judged that S>S', it is determined that the image to be detected contains a pedestrian; otherwise, it is determined that the image to be detected does not contain a pedestrian.
The pedestrian detection method provided by this embodiment of the present disclosure can accurately recognize whether an image to be detected contains a pedestrian.
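As a minimal sketch of steps 3031-3033, assuming each trained weak classifier is stored as a nested dict of the recorded node information (an illustrative layout, not the storage format of the present disclosure):

```python
def detect(feature_vector, trees, threshold_score):
    """Route the feature vector down each weak classifier's decision tree,
    sum the leaf node scores S_k, and compare the total S with S'."""
    total = 0.0
    for tree in trees:
        node = tree  # each node: {'feat', 'th', 'left', 'right', 'score'}
        while node.get('left') is not None:          # descend to a leaf
            branch = 'left' if feature_vector[node['feat']] < node['th'] else 'right'
            node = node[branch]
        total += node['score']                       # S_k of the leaf
    return total > threshold_score                   # S > S' => pedestrian
```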
FIG. 10 is a flowchart of a pedestrian detection method provided by another embodiment of the present disclosure. As shown in FIG. 10, the pedestrian detection method may include:
Step 401: generating a pedestrian detection model.
In the process of generating each weak classifier in the pedestrian detection model, the feature attribute used when splitting a node is determined based on the classification score and the localization error (the method shown in FIG. 7b).
The pedestrian detection method may further include step 402: standardizing the image to be detected and performing feature vector extraction by using a preset feature extraction algorithm, to obtain a feature vector to be detected.
The abscissa of the center point of the image to be detected may be denoted as x, and the ordinate as y. In an embodiment, standardizing the image to be detected may include converting the image to be detected into a standard image having a specific width w and height h and feature color channels, and performing feature vector extraction on the image to be detected by using the feature extraction algorithm used by the weak classifiers in the pedestrian detection model, to obtain the feature vector to be detected.
As shown in FIG. 10, the pedestrian detection method continues to step 403: classifying the feature vector to be detected by using each of the plurality of weak classifiers in the pedestrian detection model, and outputting the corresponding node scores. The specific method is as shown, for example, in step 3031, and is not repeated here.
The pedestrian detection method continues to step 404: for each of the plurality of weak classifiers in the pedestrian detection model, judging whether the attribute of the leaf node to which the feature vector to be detected is classified in that weak classifier is positive, and, when it is judged that the attribute of the leaf node of the weak classifier is positive, acquiring the position-related averages (that is, the average abscissa change rate, average ordinate change rate, average width change rate, and average height change rate) recorded in that weak classifier of the pedestrian detection model for that leaf node.
For example, if it is judged that the attribute of the leaf node of the k-th weak classifier is positive, the position-related averages recorded in that weak classifier of the pedestrian detection model for the leaf node to which the feature vector to be detected is classified are acquired, that is, the average abscissa change rate mean_dx(k), the average ordinate change rate mean_dy(k), the average width change rate mean_dw(k), and the average height change rate mean_dh(k).
The pedestrian detection method continues to step 405: summing the node scores output by all the weak classifiers to obtain the classifier total score S, that is, S = Σ_{k=1..T} S_k. The specific method is as shown in step 3032 described above.
The pedestrian detection method continues to step 406: judging whether the classifier total score S is greater than a preset threshold score S'.
In step 406, if it is judged that S>S', it is detected that the image to be detected contains a pedestrian, and step 407 is then performed; otherwise, it is detected that the image to be detected does not contain a pedestrian, and the flow ends.
In step 407, the position parameter vector L'=(dx', dy', dw', dh') corresponding to the pedestrian in the image to be detected is calculated according to the acquired position-related averages,
where
Figure PCTCN2019074668-appb-000049
Figure PCTCN2019074668-appb-000050
Figure PCTCN2019074668-appb-000051
Figure PCTCN2019074668-appb-000052
k denotes the index of a weak classifier, k being greater than or equal to 1 and less than T; M(k) is used to denote the attribute of the leaf node to which the feature vector to be detected is classified in the k-th weak classifier: if the attribute is positive, M(k)=1, and if the attribute is negative, M(k)=0; SCORE(k) is the node score of the leaf node to which the feature vector to be detected is classified in the k-th weak classifier;
mean_dx(k), mean_dy(k), mean_dw(k), mean_dh(k)
are respectively the position-related averages recorded in the k-th weak classifier for the leaf node to which the feature vector to be detected is classified, that is, the average abscissa change rate, average ordinate change rate, average width change rate, and average height change rate.
The position parameter vector L'=(dx', dy', dw', dh') calculated in step 407 above can, to a certain extent, reflect the position information of the pedestrian in the image to be detected and can be used to adjust the detection box enclosing the image to be detected. As described above, the abscissa of the center of the standardized image to be detected is x and the ordinate is y, and the width and height of the standardized image to be detected are w and h respectively. According to the present disclosure, the detection box enclosing the image to be detected may be adjusted according to the position parameter vector L'=(dx', dy', dw', dh') calculated in step 407. In an embodiment, the abscissa of the adjusted detection box may be X=x+w*dx', the ordinate may be Y=y+h*dy', the width may be W=w*dw', and the height may be H=h*dh'. It should be noted that the adjusted detection box obtained at this time corresponds to the standardized image to be detected. Therefore, in order to obtain the adjusted detection box corresponding to the larger image from which the image to be detected was obtained, the adjusted detection box corresponding to the standardized image to be detected may be processed by a process inverse to the standardization. The adjusted detection box will enclose the pedestrian more precisely.
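A sketch of this box refinement (step 407 plus the adjustment above); the score-weighted aggregation of the per-leaf averages into (dx', dy', dw', dh') is an assumption, since the present disclosure gives those formulas only as images:

```python
def refine_box(x, y, w, h, leaf_stats):
    """leaf_stats: list of (M_k, SCORE_k, dx_k, dy_k, dw_k, dh_k) per weak
    classifier, where M_k is 1 for a positive leaf and 0 otherwise. The
    score-weighted average over positive leaves is an assumed aggregation."""
    z = sum(m * s for m, s, *_ in leaf_stats) or 1.0
    dx = sum(m * s * a[0] for m, s, *a in leaf_stats) / z
    dy = sum(m * s * a[1] for m, s, *a in leaf_stats) / z
    dw = sum(m * s * a[2] for m, s, *a in leaf_stats) / z
    dh = sum(m * s * a[3] for m, s, *a in leaf_stats) / z
    # Adjusted detection box: X = x + w*dx', Y = y + h*dy', W = w*dw', H = h*dh'
    return x + w * dx, y + h * dy, w * dw, h * dh
```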
It can thus be seen that the pedestrian detection method provided by this embodiment of the present disclosure can not only detect whether there is a pedestrian in the image to be detected, but also locate the position of the pedestrian more accurately when a pedestrian is detected in the image to be detected.
FIG. 11 is a schematic structural diagram of a computing device provided by an embodiment of the present disclosure. The computing device is used to implement the steps of the described methods. The computing device can be used to implement all or part of the methods shown in FIG. 1 and FIGS. 3-10. The computing device is merely one example of a suitable computing device and is not intended to suggest any limitation as to the scope of use or functionality of the presently disclosed subject matter.
Components of the computing device may include, but are not limited to, a processor 11, a memory 12, and a system bus 16 that couples various system components, including the memory, to the processor 11. The system bus 16 may be any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. By way of example and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus, also known as the Mezzanine bus.
The processor 11 may include a microprocessor, a controller circuit, or the like, and may be configured to execute a computer program stored in the memory 12.
The memory 12 may include a variety of computer readable media. Computer readable media can be any available media that can be accessed by the computing device and include both volatile and nonvolatile media, and both removable and non-removable media. By way of example and not limitation, computer readable media may include computer readable storage media and communication media. Computer readable storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer readable instructions, data structures, program modules, or other data. Computer readable storage media include, but are not limited to: random access memory (RAM), read-only memory (ROM), EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by the computing device. Communication media typically embody computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example and not limitation, communication media include wired media such as a wired network or direct-wired connection, and wireless media such as RF and other wireless media. Combinations of any of the above are also included within the scope of computer readable media.
The memory 12 may include computer storage media in the form of volatile and/or nonvolatile memory such as ROM and RAM. A basic input/output system (BIOS), containing the basic routines that help to transfer information between elements within the computing device, such as during startup, is typically stored in ROM. The RAM typically contains data and/or program modules that are immediately accessible to and/or currently being operated on by the processor 11. By way of example and not limitation, the data 13 that may be stored in the memory 12 shown in FIG. 11 may include a BIOS, an operating system, a computer program, other program modules, and program data. The computer program, when executed by the processor 11, causes the processor 11 to perform the methods as described above.
The computing device may also include other removable/non-removable, volatile/nonvolatile computer storage media.
Other removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, DVDs, digital video tapes, solid state RAM, solid state ROM, and the like.
The computer storage media discussed above provide the computing device with storage of computer-executable instructions, data structures, program modules, and other data.
A user can enter commands and information into the computing device through input devices such as a keyboard and a pointing device, commonly referred to as a mouse, trackball, or touch pad. Other input devices may include a microphone, joystick, game pad, satellite dish, scanner, and the like. These and other input devices are typically connected to the processor 11 through a user input/output (I/O) interface 14 that is coupled to the system bus 16. A monitor or other type of display device may be connected to the system bus 16 via a user input/output (I/O) interface 14, such as a video interface. In addition to the monitor, the computing device may also be connected to other peripheral output devices, such as speakers and printers, through the user input/output (I/O) interface 14.
The computing device may be connected to one or more remote computers via a network interface 15. The remote computer may be a personal computer, a server, a router, a network PC, a peer device, or another common network node, and typically includes many or all of the elements described above with respect to the computing device.
An embodiment of the present disclosure further provides a computer readable medium having stored thereon a computer program that, when run on a processor, causes the processor to perform the methods and functions according to the embodiments of the present disclosure. The computer readable medium may include any of the computer readable media described above.
An embodiment of the present disclosure further provides a computer program product; when instructions in the computer program product are executed by a processor, the methods according to the embodiments of the present disclosure can be implemented.
In the description of the present specification, descriptions with reference to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" mean that the specific features, structures, materials, or characteristics described in connection with the embodiment or example are included in at least one embodiment or example of the present disclosure. In this specification, the schematic representations of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the different embodiments or examples, and the features of the different embodiments or examples, described in this specification, provided they do not contradict each other.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present disclosure, "a plurality" means at least two, such as two, three, etc., unless specifically defined otherwise.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code including one or more executable instructions for implementing custom logic functions or steps of the process, and the scope of the preferred embodiments of the present disclosure includes additional implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order, depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present disclosure pertain.
It is to be understood that the above embodiments are merely exemplary embodiments employed to explain the principles of the present disclosure, but the present disclosure is not limited thereto. Various modifications and improvements can be made by those of ordinary skill in the art without departing from the spirit and essence of the present disclosure, and such modifications and improvements are also considered to be within the protection scope of the present disclosure.

Claims (20)

  1. A sample acquisition method, comprising:
    adding a perturbation to a sample original box pre-marked in an original image to obtain a sample selection box, wherein the image enclosed by the sample original box contains a target; and
    extracting the image enclosed by the sample selection box as a sample.
  2. The sample acquisition method according to claim 1, wherein the step of adding a perturbation to the sample original box comprises:
    adding a random perturbation to the center position of the sample original box, and/or the width of the sample original box, and/or the height of the sample original box, to obtain a sample perturbation box, and using the sample perturbation box as the sample selection box.
  3. The sample acquisition method according to claim 1 or 2, wherein the image enclosed by the sample original box contains a complete image of the target or a partial image of the target.
  4. A target detection model generation method, comprising:
    acquiring a plurality of positive samples from an original image by using the sample acquisition method according to any one of claims 1 to 3;
    acquiring, from the original image, a plurality of images that do not contain a target as negative samples; and
    generating a target detection model by using a preset classification model algorithm according to the acquired plurality of positive samples and plurality of negative samples.
  5. The target detection model generation method according to claim 4, wherein the step of generating a target detection model by using a preset classification model algorithm according to the acquired plurality of positive samples and plurality of negative samples comprises:
    standardizing each positive sample of the plurality of positive samples and each negative sample of the plurality of negative samples;
    performing feature vector extraction on each standardized positive sample and each standardized negative sample by using a preset feature extraction algorithm, to obtain a positive sample feature training set and a negative sample feature training set, the positive sample feature training set comprising the feature vectors of all the extracted positive samples, and the negative sample feature training set comprising the feature vectors of all the extracted negative samples; and
    training a plurality of weak classifiers according to the positive sample feature training set and the negative sample feature training set, all of the plurality of weak classifiers constituting the target detection model.
  6. The target detection model generation method according to claim 5, wherein the step of training a plurality of weak classifiers according to the positive sample feature training set and the negative sample feature training set comprises:
    S1. initializing a detection score and a weight corresponding to each positive sample feature vector in the positive sample feature training set, initializing a detection score and a weight corresponding to each negative sample feature vector in the negative sample feature training set, and initializing the number of weak classifiers t=0;
    S2. setting up a storage structure and putting a root node into the storage structure, the root node having a depth of 1 and containing all the positive sample feature vectors and all the negative sample feature vectors;
    S3. detecting whether the storage structure has any unprocessed node; if it is detected that the storage structure has no unprocessed node, performing step S7; otherwise performing step S4;
    S4. taking a node out of the storage structure and judging whether the node can be split; if it is judged that the node can be split, performing step S5; otherwise performing step S3;
    S5. determining the feature attribute selected when splitting the node and its corresponding feature threshold;
    S6. splitting, according to the determined feature attribute and its corresponding feature threshold, all the feature vectors in the node into a left subset and a right subset, adding two new nodes respectively containing the left subset and the right subset, the two new nodes serving as child nodes of the node, and adding the two new nodes to the storage structure; then jumping to step S4;
    S7. recording the relevant information of all nodes, thereby completing the training of one weak classifier;
    S8. performing t=t+1 and judging whether t is greater than a preset threshold T; if it is judged that t is greater than T, the step of training a plurality of weak classifiers ends; otherwise performing step S9;
    S9. updating the detection scores and weights of each positive sample feature vector in the positive sample feature training set and each negative sample feature vector in the negative sample feature training set by using the weak classifier whose training is currently completed, and performing step S2 based on the updated detection scores and weights.
  7. The target detection model generation method according to claim 6, further comprising, between step S1 and step S3:
    S2a. normalizing the weight of each positive sample feature vector and the weight of each negative sample feature vector.
  8. The target detection model generation method according to claim 6, wherein, in step S1, the initialization weight of the k1-th positive sample feature vector is
    Figure PCTCN2019074668-appb-100001
    the initialization detection score of the k1-th positive sample feature vector is HP_k1=0, and the initialization weight of the k2-th negative sample feature vector is
    Figure PCTCN2019074668-appb-100002
    the initialization detection score of the k2-th negative sample feature vector is HN_k2=0, NP is the number of the positive samples, NN is the number of the negative samples, k1∈[1,NP], and k2∈[1,NN].
  9. The target detection model generation method according to claim 6, wherein step S4 comprises:
    S41. taking a node out of the storage structure, and calculating the ratio RATIO of the sum of the weights of the positive sample feature vectors in the node to the total weight, as well as the node score SCORE,
    the ratio being
    RATIO = WPS/(WPS + WNS)
    and the node score being
    Figure PCTCN2019074668-appb-100004
    wherein WPS is the sum of the weights of the positive sample feature vectors in the node, WNS is the sum of the weights of the negative sample feature vectors in the node, and p is a constant greater than 0;
    S42. judging whether the node satisfies both condition a and condition b, condition a being DEPTH<MAXDEPTH, and condition b being TH≤RATIO≤1-TH,
    wherein DEPTH is the depth of the node, MAXDEPTH is a preset maximum node depth, and TH is a preset ratio threshold,
    if step S42 judges that the node satisfies both condition a and condition b, it is judged that the node can be split; otherwise, it is judged that the node cannot be split.
  10. The target detection model generation method according to claim 6, wherein step S5 comprises:
    S51a. randomly selecting NF feature attributes and setting corresponding feature thresholds, and classifying the positive sample feature vectors and negative sample feature vectors of the node with respect to each feature attribute and its corresponding feature threshold, to obtain NF feature classification results, each classification result containing a left subset and a right subset, wherein the threshold corresponding to the i-th feature attribute is TH_i, i∈[1,NF]; when classifying based on the i-th feature attribute, the feature vectors in the node whose feature value of the i-th feature attribute is less than TH_i are put into the left subset, and the feature vectors in the node whose feature value of the i-th feature attribute is greater than or equal to TH_i are put into the right subset;
    S52a. calculating the classification score SCORE_C of each classification result:
    SCORE_C=|WLP-WLN-WRP+WRN|,
    wherein WLP is the sum of the weights of the positive sample feature vectors in the left subset, WLN is the sum of the weights of the negative sample feature vectors in the left subset, WRP is the sum of the weights of the positive sample feature vectors in the right subset, and WRN is the sum of the weights of the negative sample feature vectors in the right subset;
    S53a. selecting the feature attribute and feature threshold corresponding to the classification result with the largest classification score, as the feature attribute and corresponding feature threshold selected when splitting the node.
  11. The target detection model generation method according to claim 6, wherein the step of acquiring a plurality of positive samples further comprises:
    acquiring a position parameter vector L=(dx, dy, dw, dh) of each positive sample of the plurality of positive samples,
    wherein the abscissa change rate is dx=(x'-x)/w,
    the ordinate change rate is dy=(y'-y)/h,
    the width change rate is dw=w'/w,
    and the height change rate is dh=h'/h,
    x and y are respectively the abscissa and ordinate of the center point of the positive sample original box, and w and h are respectively the width and height of the positive sample original box;
    x' and y' are respectively the abscissa and ordinate of the center point of the positive sample perturbation box, and w' and h' are respectively the width and height of the positive sample perturbation box.
  12. The target detection model generation method according to claim 11, wherein step S5 comprises:
    S51b. randomly selecting NF feature attributes and setting corresponding feature thresholds, and classifying the node with respect to each feature attribute and its corresponding feature threshold, to obtain NF feature classification results, each classification result containing a left subset and a right subset, wherein the threshold corresponding to the i-th feature attribute is TH_i, i∈[1,NF]; when classifying based on the i-th feature attribute, the feature vectors in the node whose feature value corresponding to the i-th feature attribute is less than TH_i are put into the left subset, and the feature vectors in the node whose feature value corresponding to the i-th feature attribute is greater than or equal to TH_i are put into the right subset;
    S52b. calculating the classification score SCORE_C of each classification result:
    SCORE_C=|WLP-WLN-WRP+WRN|;
    wherein WLP is the sum of the weights of the positive sample feature vectors in the left subset, WLN is the sum of the weights of the negative sample feature vectors in the left subset, WRP is the sum of the weights of the positive sample feature vectors in the right subset, and WRN is the sum of the weights of the negative sample feature vectors in the right subset;
    S53b. determining the positive/negative attributes of the left subset and the right subset in each classification result, wherein if the value of WLP-WLN-WRP+WRN is positive, the attribute of the left subset is positive and the attribute of the right subset is negative; otherwise, the attribute of the left subset is negative and the attribute of the right subset is positive;
    S54b. calculating the regression error ERROR_R of the positive sample feature vectors in the positive sample subset:
    ERROR_R=Var(dx)+Var(dy)+Var(dw)+Var(dh),
    wherein
    Var(dx)=(1/N)·Σ_{j=1..N}(dx_j-mean_dx)^2, Var(dy)=(1/N)·Σ_{j=1..N}(dy_j-mean_dy)^2, Var(dw)=(1/N)·Σ_{j=1..N}(dw_j-mean_dw)^2, Var(dh)=(1/N)·Σ_{j=1..N}(dh_j-mean_dh)^2,
    N is the number of positive sample feature vectors in the positive sample subset, dx_j, dy_j, dw_j, dh_j are respectively the abscissa change rate, ordinate change rate, width change rate, and height change rate in the position parameter vector of the positive sample corresponding to the j-th positive sample feature vector in the positive sample subset, and
    mean_dx=(1/N)·Σ_{j=1..N}dx_j, mean_dy=(1/N)·Σ_{j=1..N}dy_j, mean_dw=(1/N)·Σ_{j=1..N}dw_j, mean_dh=(1/N)·Σ_{j=1..N}dh_j
    are respectively the average of the abscissa change rates, the average of the ordinate change rates, the average of the width change rates, and the average of the height change rates in the position parameter vectors of the positive samples corresponding to the N positive sample feature vectors in the positive sample subset;
    S55b. calculating the total score SCORE_TOTAL of each classification result:
    SCORE_TOTAL=SCORE_C-λ*ERROR_R,
    wherein λ is a constant greater than 0;
    S56b. selecting the feature attribute and feature threshold corresponding to the classification result with the largest total score, as the feature attribute and corresponding feature threshold selected when splitting the node.
  13. The target detection model generation method according to claim 6, wherein, in step S9,
    the updated detection score of a positive sample feature vector is HP_k1'=HP_k1+hp_k1;
    the updated detection score of a negative sample feature vector is HN_k2'=HN_k2+hn_k2;
    the updated weight of a positive sample feature vector is
    Figure PCTCN2019074668-appb-100015
    and the updated weight of a negative sample feature vector is
    Figure PCTCN2019074668-appb-100016
    wherein NP is the number of the positive samples, NN is the number of the negative samples, HP_k1 is the current detection score of the k1-th positive sample feature vector, HN_k2 is the current detection score of the k2-th negative sample feature vector, hp_k1 is the node score output by the weak classifier whose training is currently completed when the k1-th positive sample is input to it, and hn_k2 is the node score output by that weak classifier when the k2-th negative sample is input to it.
  14. The target detection model generation method according to claim 5, wherein the feature extraction algorithm comprises at least one of: a histogram of oriented gradients feature extraction algorithm, a luminance-chrominance color feature extraction algorithm, and a local binary pattern feature extraction algorithm.
  15. A target detection method, comprising:
    generating a target detection model by using the generation method according to any one of claims 4-14;
    standardizing an image to be detected and performing feature vector extraction by using a preset feature extraction algorithm, to obtain a feature vector to be detected;
    classifying the feature vector to be detected by using the target detection model, and determining, based on the classification result, whether the image to be detected contains a target.
  16. The target detection method according to claim 15, wherein the step of classifying the feature vector to be detected by using the target detection model comprises:
    classifying the feature vector to be detected by using each weak classifier of the target detection model, and acquiring the score of the node to which the feature vector to be detected is classified in that weak classifier;
    summing all the acquired node scores to obtain a classifier total score S;
    judging whether the classifier total score S is greater than a preset threshold score S'; if it is judged that S>S', it is detected that the image to be detected contains a target; otherwise, it is detected that the image to be detected does not contain a target.
  17. The target detection method according to claim 15, wherein, when the target detection model is generated by using the generation method according to claim 11 or 12, classifying the feature vector to be detected by using the target detection model comprises:
    classifying the feature vector to be detected by using each weak classifier of the target detection model, and acquiring the score of the node to which the feature vector to be detected is classified in that weak classifier;
    for each weak classifier, judging whether the attribute of the leaf node to which the feature vector to be detected is classified in that weak classifier is positive, and, if it is judged that the attribute of the leaf node is positive, acquiring the average abscissa change rate, average ordinate change rate, average width change rate, and average height change rate corresponding to the leaf node;
    summing all the acquired node scores to obtain a classifier total score S;
    judging whether the classifier total score S is greater than a preset threshold score S'; if it is judged that S>S', it is detected that the image to be detected contains a target;
    after it is detected that the image to be detected contains a target, calculating the position parameter vector of the image to be detected according to the acquired average abscissa change rate, average ordinate change rate, average width change rate, and average height change rate; and
    adjusting the detection box enclosing the image to be detected according to the position parameter vector of the image to be detected.
  18. The target detection method according to claim 17, wherein the position parameter vector corresponding to the target is L'=(dx', dy', dw', dh'),
    wherein
    Figure PCTCN2019074668-appb-100017
    Figure PCTCN2019074668-appb-100018
    Figure PCTCN2019074668-appb-100019
    Figure PCTCN2019074668-appb-100020
    k denotes the index of a weak classifier, k being greater than or equal to 1 and less than T; M(k) is used to denote whether the attribute of the leaf node to which the feature vector to be detected is classified in the k-th weak classifier is positive, wherein M(k)=1 if the attribute of the leaf node is positive, and M(k)=0 if the attribute of the leaf node is negative; SCORE(k) is the node score corresponding to the leaf node to which the feature vector to be detected is classified in the k-th weak classifier;
    mean_dx(k), mean_dy(k), mean_dw(k), mean_dh(k)
    are respectively the average abscissa change rate, average ordinate change rate, average width change rate, and average height change rate corresponding to the leaf node to which the feature vector to be detected is classified in the k-th weak classifier;
    the abscissa of the adjusted detection box is X=x+w*dx', the ordinate is Y=y+h*dy', the width is W=w*dw', and the height is H=h*dh', wherein x and y are respectively the abscissa and ordinate of the center of the standardized image to be detected, and w and h are respectively the width and height of the standardized image to be detected.
  19. A computing device, comprising:
    a processor; and
    a memory storing a computer program that, when executed by the processor, causes the processor to perform the sample acquisition method according to any one of claims 1-3, or the target detection model generation method according to any one of claims 4-14, or the target detection method according to any one of claims 15-18.
  20. A computer readable medium storing a computer program that, when executed by a processor, causes the processor to perform the sample acquisition method according to any one of claims 1-3, or the target detection model generation method according to any one of claims 4-14, or the target detection method according to any one of claims 15-18.
PCT/CN2019/074668 2018-02-13 2019-02-03 样本获取方法、目标检测模型生成方法、目标检测方法 WO2019158015A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US16/605,752 US11238296B2 (en) 2018-02-13 2019-02-03 Sample acquisition method, target detection model generation method, target detection method, computing device and computer readable medium
EP19753684.0A EP3754539A4 (en) 2018-02-13 2019-02-03 SAMPLE ACQUISITION PROCESS, TARGET DETECTION MODEL GENERATION PROCESS, TARGET DETECTION PROCESS

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810149833.2A CN110163033B (zh) 2018-02-13 2018-02-13 正样本获取方法、行人检测模型生成方法和行人检测方法
CN201810149833.2 2018-02-13

Publications (1)

Publication Number Publication Date
WO2019158015A1 true WO2019158015A1 (zh) 2019-08-22

Family

ID=67619728

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/074668 WO2019158015A1 (zh) 2018-02-13 2019-02-03 样本获取方法、目标检测模型生成方法、目标检测方法

Country Status (4)

Country Link
US (1) US11238296B2 (zh)
EP (1) EP3754539A4 (zh)
CN (1) CN110163033B (zh)
WO (1) WO2019158015A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111353538A (zh) * 2020-02-28 2020-06-30 西安理工大学 基于深度学习的相似图像匹配方法
CN111598833A (zh) * 2020-04-01 2020-08-28 江汉大学 一种目标样本瑕疵检测的方法、装置及电子设备
CN112560992A (zh) * 2020-12-25 2021-03-26 北京百度网讯科技有限公司 优化图片分类模型的方法、装置、电子设备及存储介质
CN113283493A (zh) * 2021-05-19 2021-08-20 Oppo广东移动通信有限公司 样本的获取方法、装置、终端及存储介质
CN113344042A (zh) * 2021-05-21 2021-09-03 北京中科慧眼科技有限公司 基于辅助驾驶的路况图像模型训练方法、系统和智能终端

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111814852B (zh) * 2020-06-24 2024-07-05 理光软件研究所(北京)有限公司 图像检测方法、装置、电子设备和计算机可读存储介质
CN112257797A (zh) * 2020-10-29 2021-01-22 瓴盛科技有限公司 行人头部图像分类器的样本图像生成方法及相应训练方法
CN112200274B (zh) * 2020-12-09 2021-03-30 湖南索莱智能科技有限公司 一种目标检测方法、装置、电子设备和存储介质
CN112651458B (zh) * 2020-12-31 2024-04-02 深圳云天励飞技术股份有限公司 分类模型的训练方法、装置、电子设备及存储介质
CN112926463B (zh) * 2021-03-02 2024-06-07 普联国际有限公司 一种目标检测方法和装置
CN112836768B (zh) * 2021-03-08 2022-07-19 北京电子工程总体研究所 一种数据平衡方法和系统、计算机设备和介质
CN113111708B (zh) * 2021-03-10 2023-12-29 北京爱笔科技有限公司 车辆匹配样本生成方法、装置、计算机设备和存储介质
CN112906669A (zh) * 2021-04-08 2021-06-04 济南博观智能科技有限公司 一种交通目标检测方法、装置、设备及可读存储介质
CN113255456B (zh) * 2021-04-28 2023-08-25 平安科技(深圳)有限公司 非主动活体检测方法、装置、电子设备及存储介质
CN113159209B (zh) * 2021-04-29 2024-05-24 深圳市商汤科技有限公司 目标检测方法、装置、设备和计算机可读存储介质
CN114266945B (zh) * 2022-02-28 2022-06-14 粤港澳大湾区数字经济研究院(福田) 一种目标检测模型的训练方法、目标检测方法及相关装置

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130094701A1 (en) * 2011-09-29 2013-04-18 Nec Laboratories America, Inc. Adaptive cross partition for learning weak classifiers
CN105095857A (zh) * 2015-06-26 2015-11-25 上海交通大学 基于关键点扰动技术的人脸数据增强方法
CN105426870A (zh) * 2015-12-15 2016-03-23 北京文安科技发展有限公司 一种人脸关键点定位方法及装置
CN107463879A (zh) * 2017-07-05 2017-12-12 成都数联铭品科技有限公司 基于深度学习的人体行为识别方法

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5649023A (en) * 1994-05-24 1997-07-15 Panasonic Technologies, Inc. Method and apparatus for indexing a plurality of handwritten objects
JP4712563B2 (ja) * 2006-01-16 2011-06-29 富士フイルム株式会社 顔検出方法および装置並びにプログラム
US20070233435A1 (en) * 2006-03-31 2007-10-04 Gary Bradski Boosted linear modeling of non-linear time series
US20070237387A1 (en) * 2006-04-11 2007-10-11 Shmuel Avidan Method for detecting humans in images
CN100565523C (zh) * 2007-04-05 2009-12-02 中国科学院自动化研究所 一种基于多分类器融合的敏感网页过滤方法及系统
CN101290660A (zh) * 2008-06-02 2008-10-22 中国科学技术大学 一种用于行人检测的树状组合分类方法
CN101853389A (zh) * 2009-04-01 2010-10-06 索尼株式会社 多类目标的检测装置及检测方法
CN103366160A (zh) * 2013-06-28 2013-10-23 西安交通大学 融合肤色、人脸和敏感部位检测的不良图像判别方法
CN103902968B (zh) * 2014-02-26 2015-03-25 中国人民解放军国防科学技术大学 一种基于AdaBoost分类器的行人检测模型训练方法
JP6320112B2 (ja) 2014-03-27 2018-05-09 キヤノン株式会社 情報処理装置、情報処理方法
CN103984953B (zh) * 2014-04-23 2017-06-06 浙江工商大学 基于多特征融合与Boosting决策森林的街景图像的语义分割方法
US9443320B1 (en) * 2015-05-18 2016-09-13 Xerox Corporation Multi-object tracking with generic object proposals
CN106778452A (zh) * 2015-11-24 2017-05-31 沈阳新松机器人自动化股份有限公司 服务机器人基于双目视觉的人体检测与跟踪方法
CN106295502B (zh) * 2016-07-25 2019-07-12 厦门中控智慧信息技术有限公司 一种人脸检测方法及装置
CN107301378B (zh) * 2017-05-26 2020-03-17 上海交通大学 图像中多分类器集成的行人检测方法和系统

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130094701A1 (en) * 2011-09-29 2013-04-18 Nec Laboratories America, Inc. Adaptive cross partition for learning weak classifiers
CN105095857A (zh) * 2015-06-26 2015-11-25 上海交通大学 基于关键点扰动技术的人脸数据增强方法
CN105426870A (zh) * 2015-12-15 2016-03-23 北京文安科技发展有限公司 一种人脸关键点定位方法及装置
CN107463879A (zh) * 2017-07-05 2017-12-12 成都数联铭品科技有限公司 基于深度学习的人体行为识别方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3754539A4 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111353538A (zh) * 2020-02-28 2020-06-30 西安理工大学 基于深度学习的相似图像匹配方法
CN111353538B (zh) * 2020-02-28 2023-04-07 西安理工大学 基于深度学习的相似图像匹配方法
CN111598833A (zh) * 2020-04-01 2020-08-28 江汉大学 一种目标样本瑕疵检测的方法、装置及电子设备
CN111598833B (zh) * 2020-04-01 2023-05-26 江汉大学 一种目标样本瑕疵检测的方法、装置及电子设备
CN112560992A (zh) * 2020-12-25 2021-03-26 北京百度网讯科技有限公司 优化图片分类模型的方法、装置、电子设备及存储介质
CN112560992B (zh) * 2020-12-25 2023-09-01 北京百度网讯科技有限公司 优化图片分类模型的方法、装置、电子设备及存储介质
CN113283493A (zh) * 2021-05-19 2021-08-20 Oppo广东移动通信有限公司 样本的获取方法、装置、终端及存储介质
CN113344042A (zh) * 2021-05-21 2021-09-03 北京中科慧眼科技有限公司 基于辅助驾驶的路况图像模型训练方法、系统和智能终端

Also Published As

Publication number Publication date
EP3754539A1 (en) 2020-12-23
CN110163033B (zh) 2022-04-22
CN110163033A (zh) 2019-08-23
US11238296B2 (en) 2022-02-01
EP3754539A4 (en) 2021-11-10
US20200151484A1 (en) 2020-05-14

Similar Documents

Publication Publication Date Title
WO2019158015A1 (zh) 样本获取方法、目标检测模型生成方法、目标检测方法
Bergman et al. Deep nearest neighbor anomaly detection
Gaidon et al. Recognizing activities with cluster-trees of tracklets
US8879796B2 (en) Region refocusing for data-driven object localization
US9008429B2 (en) Label-embedding for text recognition
US7958070B2 (en) Parameter learning method, parameter learning apparatus, pattern classification method, and pattern classification apparatus
Kading et al. Active learning and discovery of object categories in the presence of unnameable instances
US20110085728A1 (en) Detecting near duplicate images
JP2016134175A (ja) ワイルドカードを用いてテキスト−画像クエリを実施するための方法およびシステム
JP5591360B2 (ja) 分類及び対象物検出の方法及び装置、撮像装置及び画像処理装置
JP2010026603A (ja) 画像処理装置、画像処理方法、及びコンピュータプログラム
US8761510B2 (en) Object-centric spatial pooling for image classification
JP2011525012A (ja) デジタルコンテンツ記録のための意味論的イベント検出
JP2017062778A (ja) 画像のオブジェクトを分類するための方法およびデバイスならびに対応するコンピュータプログラム製品およびコンピュータ可読媒体
JP7077046B2 (ja) 情報処理装置、被写体の判別方法及びコンピュータプログラム
Chen et al. Discriminative BoW framework for mobile landmark recognition
US9489593B2 (en) Information processing apparatus and training method
CN112651996B (zh) 目标检测跟踪方法、装置、电子设备和存储介质
JP5214679B2 (ja) 学習装置、方法及びプログラム
Garcia-Fidalgo et al. Vision-based topological mapping and localization by means of local invariant features and map refinement
US8498978B2 (en) Slideshow video file detection
JP7341962B2 (ja) 学習データ収集装置、学習装置、学習データ収集方法およびプログラム
CN112241470B (zh) 一种视频分类方法及系统
JP5959446B2 (ja) コンテンツをバイナリ特徴ベクトルの集合で表現することによって高速に検索する検索装置、プログラム及び方法
Mohemmed et al. Particle swarm optimisation based AdaBoost for object detection

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19753684

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019753684

Country of ref document: EP

Effective date: 20200914