CN112287769B - Face detection method, device, equipment and storage medium
- Publication number
- CN112287769B (application CN202011073284.9A)
- Authority
- CN
- China
- Prior art keywords
- pixel
- unit
- units
- segmentation
- calculating
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/50—Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
Abstract
The invention relates to the field of image recognition, and provides a face detection method, apparatus, device and storage medium. The method comprises the following steps: acquiring an input image; segmenting the input image to obtain a plurality of segmentation units; extracting the histogram of oriented gradients (HOG) feature of each segmentation unit to obtain an HOG vector; normalizing each segmentation unit to obtain a plurality of normalization units; calculating the mean, the variance and the image difference of each normalization unit to obtain the means, the variances and the image differences of the plurality of segmentation units; fusing the mean, the variance and the image difference of each segmentation unit with the HOG vector to obtain a target feature vector; and inputting the target feature vector into a trained support vector machine to obtain a face detection result. The accuracy of face detection is thereby improved.
Description
Technical Field
The present invention relates to the field of image recognition, and in particular, to a face detection method, apparatus, device, and storage medium.
Background
The face detection problem is derived from face recognition, which is one of the most effective and popular verification means at present and is widely applied to mobile phones, computers, access control and other devices. As face recognition has become widely used, face detection has also come to be treated as a separate problem. In practice, a face detection system must recognize faces in different environments, which requires that it adapt to those environments in addition to achieving a high recognition rate. At present, technologies related to face detection and face recognition are used in many fields and have important academic and application value in information retrieval, target monitoring, target tracking, automatic driving and the like. Existing algorithms based on the raw HOG descriptor use only gradient information to extract feature descriptors, which makes the model unstable and inefficient at extracting image information when facing blurred images and images with smooth edges.
Disclosure of Invention
The invention aims to solve the technical problem that face detection accuracy is too low, and provides a face detection method, which comprises the following steps:
acquiring an input image;
segmenting the input image to obtain a plurality of segmentation units;
extracting HOG characteristics of the directional gradient histogram of each segmentation unit to obtain HOG vectors;
normalizing each segmentation unit to obtain a plurality of normalization units;
calculating the mean value, the variance and the image difference of each normalization unit to obtain the mean value of a plurality of segmentation units, the variance of the plurality of segmentation units and the image difference of the plurality of segmentation units;
feature fusion is carried out on the mean value of each segmentation unit, the variance of the segmentation unit, the image difference of the segmentation unit and the HOG vector to obtain a target feature vector;
and inputting the target feature vector into a trained support vector machine to obtain a face detection result.
In some possible designs, extracting the histogram of oriented gradients (HOG) feature of each of the segmentation units to obtain a plurality of HOG vectors includes:
calculating a horizontal gradient of each pixel in each of the segmentation units by G_x(x, y) = I(x+1, y) − I(x−1, y), wherein G_x(x, y) is the horizontal gradient of the pixel at coordinate (x, y) in the segmentation unit and I(x, y) is the pixel value at coordinate (x, y);
calculating a vertical gradient of each pixel in each of the segmentation units by G_y(x, y) = I(x, y+1) − I(x, y−1), wherein G_y(x, y) is the vertical gradient of the pixel at coordinate (x, y) in the segmentation unit;
calculating the gradient magnitude of the pixel at coordinate (x, y) in each of the segmentation units by G(x, y) = √(G_x(x, y)² + G_y(x, y)²), wherein G(x, y) is the gradient magnitude of the pixel at coordinate (x, y) in the segmentation unit;
calculating the gradient direction of the pixel at coordinate (x, y) in each of the segmentation units by α(x, y) = arctan(G_y(x, y) / G_x(x, y)), wherein α(x, y) is the gradient direction of the pixel at coordinate (x, y) in the segmentation unit;
counting the gradient magnitudes of the pixels and the gradient directions of the pixels in each unit to obtain the HOG features;
and combining the HOG features to obtain the HOG vector.
In some possible designs, normalizing each of the segmentation units to obtain a plurality of normalization units includes:
taking any segmentation unit as a target segmentation unit;
obtaining a maximum pixel value in the target segmentation unit and a minimum pixel value in the target segmentation unit as a pixel upper threshold P_max and a pixel lower threshold P_min;
if P_max = P_min, setting the pixel values of all the pixel points in the target segmentation unit to 0;
if P_max ≠ P_min, calculating a normalized pixel value by P = (P_0 − P_min) / (P_max − P_min), wherein P is the normalized pixel value and P_0 is the original pixel value;
and obtaining a normalization unit through the normalized pixel values.
In some possible designs, calculating the mean, the variance and the image difference of each normalization unit to obtain the means, the variances and the image differences of the plurality of segmentation units includes:
calculating the pixel mean of the pixel points in each segmentation unit;
calculating the pixel variance of the pixel points in each segmentation unit;
and calculating the absolute value of the difference between each segmentation unit and a preset image.
In some possible designs, after the input image is acquired, the method further comprises:
graying the input image by Gray = 0.299r + 0.587g + 0.114b, wherein r is the pixel value of the red channel of the target pixel point, g is the pixel value of the green channel of the target pixel point, and b is the pixel value of the blue channel of the target pixel point.
In some possible designs, the inputting the target feature vector into a trained support vector machine to obtain a face detection result includes:
if multi-scale detection is to be performed, scaling the input image to multiple scales and merging the overlapping detection results by a non-maximum suppression algorithm.
In some possible designs, before the target feature vector is input to the trained support vector machine to obtain the face detection result, the method further includes:
acquiring a plurality of training data and labeling labels corresponding to the training data;
inputting the training data and the corresponding labeling label into the initial support vector machine;
training the initial support vector machine under a plurality of neural network model parameters through a training function to obtain a plurality of support vector machines;
calculating the loss function values of the support vector machines, and taking the support vector machine with the smallest loss function value as the target support vector machine;
and deploying the target support vector machine to obtain the trained support vector machine.
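For illustration, the method of the first aspect can be sketched end to end as follows. This is a minimal sketch, not the claimed implementation: scikit-image's hog and scikit-learn's LinearSVC stand in for the feature extraction and classifier described above, and the 64 × 64 detection window and the zero-valued standard-face statistics are assumptions.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

# Per-cell means of a "standard face"; zeros are a placeholder -- in
# practice these would be computed from a reference face image.
STANDARD_FACE_CELL_MEANS = np.zeros(64)

def extract_features(window):
    """Fuse per-cell mean, variance and mean-difference with the HOG vector.

    window: a 64x64 grayscale detection window (assumed size).
    """
    hog_vec = hog(window, orientations=9, pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2))
    # Reshape the 64x64 window into 64 cells of 8x8 pixels each.
    cells = window.reshape(8, 8, 8, 8).swapaxes(1, 2).reshape(64, -1)
    means, variances = cells.mean(axis=1), cells.var(axis=1)
    diff = np.abs(means - STANDARD_FACE_CELL_MEANS)  # difference to standard face
    return np.concatenate([means, variances, diff, hog_vec])

def train_detector(windows, labels):
    """windows: (N, 64, 64) grayscale crops; labels: 1 = face, 0 = non-face."""
    X = np.stack([extract_features(w) for w in windows])
    return LinearSVC().fit(X, labels)
```

A detector trained this way scores fixed-size windows; the multi-scale scan described later slides it over an image pyramid.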
In a second aspect, the present invention provides a face detection apparatus having a function of implementing a method corresponding to the face detection platform provided in the first aspect. The functions may be implemented by hardware, or may be implemented by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions described above, which may be software and/or hardware.
The face detection apparatus includes:
the input/output module is used for acquiring an input image;
the processing module is used for segmenting the input image to obtain a plurality of segmentation units; extracting HOG characteristics of the directional gradient histogram of each segmentation unit to obtain HOG vectors; normalizing each segmentation unit to obtain a plurality of normalization units; calculating the mean value, the variance and the image difference of each normalization unit to obtain the mean value of a plurality of segmentation units, the variance of the plurality of segmentation units and the image difference of the plurality of segmentation units; feature fusion is carried out on the mean value of each segmentation unit, the variance of the segmentation unit, the image difference of the segmentation unit and the HOG vector to obtain a target feature vector; and inputting the target feature vector into a trained support vector machine to obtain a face detection result.
In some possible designs, the processing module is further to:
calculating a horizontal gradient of each pixel in each of the segmentation units by G_x(x, y) = I(x+1, y) − I(x−1, y), wherein G_x(x, y) is the horizontal gradient of the pixel at coordinate (x, y) in the segmentation unit and I(x, y) is the pixel value at coordinate (x, y);
calculating a vertical gradient of each pixel in each of the segmentation units by G_y(x, y) = I(x, y+1) − I(x, y−1), wherein G_y(x, y) is the vertical gradient of the pixel at coordinate (x, y) in the segmentation unit;
calculating the gradient magnitude of the pixel at coordinate (x, y) in each of the segmentation units by G(x, y) = √(G_x(x, y)² + G_y(x, y)²), wherein G(x, y) is the gradient magnitude of the pixel at coordinate (x, y) in the segmentation unit;
calculating the gradient direction of the pixel at coordinate (x, y) in each of the segmentation units by α(x, y) = arctan(G_y(x, y) / G_x(x, y)), wherein α(x, y) is the gradient direction of the pixel at coordinate (x, y) in the segmentation unit;
counting the gradient magnitudes of the pixels and the gradient directions of the pixels in each unit to obtain the HOG features;
and combining the HOG features to obtain the HOG vector.
In some possible designs, the processing module is further to:
taking any segmentation unit as a target segmentation unit;
obtaining a maximum pixel value in the target segmentation unit and a minimum pixel value in the target segmentation unit as a pixel upper threshold P_max and a pixel lower threshold P_min;
if P_max = P_min, setting the pixel values of all the pixel points in the target segmentation unit to 0;
if P_max ≠ P_min, calculating a normalized pixel value by P = (P_0 − P_min) / (P_max − P_min), wherein P is the normalized pixel value and P_0 is the original pixel value;
and obtaining a normalization unit through the normalized pixel values.
In some possible designs, the processing module is further to:
calculating the pixel mean of the pixel points in each segmentation unit;
calculating the pixel variance of the pixel points in each segmentation unit;
and calculating the absolute value of the difference between each segmentation unit and a preset image.
In some possible designs, the processing module is further to:
graying the input image by Gray = 0.299r + 0.587g + 0.114b, wherein r is the pixel value of the red channel of the target pixel point, g is the pixel value of the green channel of the target pixel point, and b is the pixel value of the blue channel of the target pixel point.
In some possible designs, the processing module is further to:
if multi-scale detection is to be performed, scaling the input image to multiple scales and merging the overlapping detection results by a non-maximum suppression algorithm.
In some possible designs, the processing module is further to:
acquiring a plurality of training data and labeling labels corresponding to the training data;
inputting the training data and the corresponding labeling label into the initial support vector machine;
training the initial support vector machine under a plurality of neural network model parameters through a training function to obtain a plurality of support vector machines;
calculating the loss function values of the support vector machines, and taking the support vector machine with the smallest loss function value as the target support vector machine;
and deploying the target support vector machine to obtain the trained support vector machine.
In yet another aspect, the present invention provides a face detection device, which comprises at least one processor, a memory and an input/output unit that are connected to one another, wherein the memory is configured to store program code, and the processor is configured to invoke the program code in the memory to perform the method described in the foregoing aspects.
In yet another aspect, the invention provides a computer storage medium comprising instructions which, when run on a computer, cause the computer to perform the method of the above aspects.
Compared with the prior art, the method first extracts HOG features from the image; the original image is then globally normalized and cut into independent units, and the mean, the variance and the difference information between each unit and a standard face are extracted from them; finally, the information extracted from all units is arranged into vectors and combined with the original HOG features.
Drawings
Fig. 1-1 is a schematic flow chart of a face detection method in an embodiment of the invention;
fig. 1-2 is a schematic diagram illustrating the effect of the non-maximum suppression algorithm of a face detection method according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a face detection apparatus according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. The terms "first", "second" and the like in the description, in the claims and in the above-described figures are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate, such that the embodiments described herein may be implemented in sequences other than those illustrated or described herein. Furthermore, the terms "comprises", "comprising" and any variations thereof are intended to cover a non-exclusive inclusion, such that a process, method, system, article or apparatus that comprises a list of steps or modules is not necessarily limited to those steps or modules expressly listed, but may include other steps or modules not expressly listed or inherent to such process, method, article or apparatus. The partitioning of modules in the present invention is merely one logical partitioning; other partitions are possible in practice, for example a plurality of modules may be combined or integrated in another system, or some features may be omitted or not implemented.
Referring to fig. 1-1, the following provides a face detection method, which includes:
101. an input image is acquired.
In this embodiment, the input image may be stored in a database or in a network hard disk.
102. Segmenting the input image to obtain a plurality of segmentation units.
In this embodiment, the image is divided into units of the same size. If the original image has a height H and a width W and the side length of a segmentation unit is d, the image can be divided into (H/d) × (W/d) units; where the dimensions are not exactly divisible, the edge portion of the image is discarded so that they become divisible. The edge portions can be discarded because the edges of the image do not contain significant facial features such as the eyes, nose and mouth. For example, a 64 × 64 image divided into 8 × 8 units yields 64 units of the same size in total: 8 units in the horizontal direction and 8 units in the vertical direction.
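For illustration, this division with edge discarding can be sketched as follows (a minimal sketch assuming a NumPy grayscale image and unit side length d):

```python
import numpy as np

def split_into_cells(img, d):
    """Split a grayscale image into d x d cells, discarding the right and
    bottom edge strips that do not fill a whole cell."""
    H, W = img.shape
    img = img[:H - H % d, :W - W % d]          # make dimensions divisible by d
    rows, cols = img.shape[0] // d, img.shape[1] // d
    # Result shape: (rows, cols, d, d) -- one d x d cell per grid position.
    return img.reshape(rows, d, cols, d).swapaxes(1, 2)
```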
103. And extracting the HOG characteristics of the directional gradient histogram of each segmentation unit to obtain HOG vectors.
In this embodiment, the gradient information of each cell is counted to obtain its gradient-direction histogram, and the statistical feature vectors of adjacent segmentation units are merged.
104. And normalizing each segmentation unit to obtain a plurality of normalization units.
In this embodiment, after the image is divided, unit normalization is performed. Assume that the maximum pixel value inside each cell is P_max and the minimum pixel value is P_min.
105. And calculating the mean value, the variance and the image difference of each normalization unit to obtain the mean value of the plurality of segmentation units, the variance of the plurality of segmentation units and the image difference of the plurality of segmentation units.
In this embodiment, the mean, variance, and mean image difference of each cell are extracted.
106. And carrying out feature fusion on the mean value of each segmentation unit, the variance of the segmentation unit, the image difference of the segmentation unit and the HOG vector to obtain a target feature vector.
In this embodiment, the unit mean feature vector, the unit variance feature vector, the mean image difference feature vector and the HOG feature vector of the original image are spliced to obtain the final feature vector of the original image.
107. And inputting the target feature vector into a trained support vector machine to obtain a face detection result.
In this embodiment, face detection is implemented on an image of arbitrary size. The support vector machine can only score feature vectors of a specific length; if the detector is trained on windows of 8 × 8 cells, the SVM can only detect targets of that fixed window size. To use it to detect faces of different sizes in a picture, multi-scale detection and non-maximum suppression are also required.
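For illustration, a conventional non-maximum suppression routine of the kind referred to here can be sketched as follows; the (u, d, l, r) box format and the 0.3 overlap threshold are assumptions:

```python
import numpy as np

def non_max_suppression(boxes, scores, iou_thresh=0.3):
    """Keep the highest-scoring box and drop boxes that overlap it too much.

    boxes: (N, 4) array of (u, d, l, r) rectangles; scores: SVM decision values.
    Returns the indices of the boxes to keep."""
    order = np.argsort(scores)[::-1]           # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of box i with every remaining box.
        u = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        d = np.minimum(boxes[i, 1], boxes[order[1:], 1])
        l = np.maximum(boxes[i, 2], boxes[order[1:], 2])
        r = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, d - u) * np.maximum(0, r - l)
        area_i = (boxes[i, 1] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 2])
        areas = (boxes[order[1:], 1] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 2])
        iou = inter / (area_i + areas - inter)
        order = order[1:][iou <= iou_thresh]   # drop heavily overlapping boxes
    return keep
```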
Compared with the prior art, the method first extracts HOG features from the image; the original image is then globally normalized and cut into independent units, and the mean, the variance and the difference information between each unit and a standard face are extracted from them; finally, the information extracted from all units is arranged into vectors and combined with the original HOG features.
In some embodiments, extracting the histogram of oriented gradients (HOG) feature of each of the segmentation units to obtain a plurality of HOG vectors includes:
calculating a horizontal gradient of each pixel in each of the segmentation units by G_x(x, y) = I(x+1, y) − I(x−1, y), wherein G_x(x, y) is the horizontal gradient of the pixel at coordinate (x, y) in the segmentation unit and I(x, y) is the pixel value at coordinate (x, y);
calculating a vertical gradient of each pixel in each of the segmentation units by G_y(x, y) = I(x, y+1) − I(x, y−1), wherein G_y(x, y) is the vertical gradient of the pixel at coordinate (x, y) in the segmentation unit;
calculating the gradient magnitude of the pixel at coordinate (x, y) in each of the segmentation units by G(x, y) = √(G_x(x, y)² + G_y(x, y)²), wherein G(x, y) is the gradient magnitude of the pixel at coordinate (x, y) in the segmentation unit;
calculating the gradient direction of the pixel at coordinate (x, y) in each of the segmentation units by α(x, y) = arctan(G_y(x, y) / G_x(x, y)), wherein α(x, y) is the gradient direction of the pixel at coordinate (x, y) in the segmentation unit;
counting the gradient magnitudes of the pixels and the gradient directions of the pixels in each unit to obtain the HOG features;
and combining the HOG features to obtain the HOG vector.
In the above embodiment, the image is cut into units of the same size; for example, a 64 × 64 image divided into 8 × 8 units yields 64 units in total, 8 in the horizontal direction and 8 in the vertical direction. The gradient information of each unit is then counted to obtain its gradient-direction histogram: if the gradient directions over 0°–180° are divided into 9 intervals, the resulting vector length is 9. Initially the value of each of these 9 directions is 0; for each pixel in the cell, its gradient direction α(x, y) is calculated first to see which interval it belongs to, and its gradient magnitude G(x, y) is then added to the corresponding direction. Performing this operation for every pixel of the cell yields its gradient-direction histogram. Finally, the statistical feature vectors of adjacent cell units are merged to obtain the feature vector of each block; different cell-merging schemes result in different block feature-vector lengths. If 4 mutually adjacent units (2 × 2 cells) are merged into one block, the feature vector of each block has length 4 × 9 = 36, and the feature vector length of the whole image is the number of blocks multiplied by 36.
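For illustration, this procedure can be sketched directly from the formulas above (per-pixel gradients, 9-bin histograms over 0°–180°, and merging of 2 × 2 neighbouring cells into blocks); the overlapping block layout is an assumption:

```python
import numpy as np

def cell_histogram(cell, n_bins=9):
    """9-bin gradient-direction histogram of one cell (angles in 0..180)."""
    gx = np.zeros_like(cell, dtype=float)
    gy = np.zeros_like(cell, dtype=float)
    gx[:, 1:-1] = cell[:, 2:].astype(float) - cell[:, :-2]  # G_x = I(x+1,y) - I(x-1,y)
    gy[1:-1, :] = cell[2:, :].astype(float) - cell[:-2, :]  # G_y = I(x,y+1) - I(x,y-1)
    mag = np.sqrt(gx ** 2 + gy ** 2)                        # border pixels keep zero gradient
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0
    bins = np.minimum((ang / (180.0 / n_bins)).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    np.add.at(hist, bins, mag)      # each pixel votes its magnitude into its direction bin
    return hist

def hog_vector(cells):
    """cells: (rows, cols, d, d). Merge 2x2 neighbouring cells into blocks."""
    rows, cols = cells.shape[:2]
    hists = np.array([[cell_histogram(cells[i, j]) for j in range(cols)]
                      for i in range(rows)])
    blocks = [np.concatenate([hists[i, j], hists[i, j + 1],
                              hists[i + 1, j], hists[i + 1, j + 1]])
              for i in range(rows - 1) for j in range(cols - 1)]
    # Assuming overlapping 2x2 blocks: 8x8 cells give 7*7 blocks of 36 dims = 1764.
    return np.concatenate(blocks)
```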
In some embodiments, normalizing each of the segmentation units to obtain a plurality of normalization units includes:
taking any segmentation unit as a target segmentation unit;
obtaining a maximum pixel value in the target segmentation unit and a minimum pixel value in the target segmentation unit as a pixel upper threshold P_max and a pixel lower threshold P_min;
if P_max = P_min, setting the pixel values of all the pixel points in the target segmentation unit to 0;
if P_max ≠ P_min, calculating a normalized pixel value by P = (P_0 − P_min) / (P_max − P_min), wherein P is the normalized pixel value and P_0 is the original pixel value;
and obtaining a normalization unit through the normalized pixel values.
In the above embodiment, the normalization processing is performed on the pixel points in the above manner.
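For illustration, this per-cell min–max normalization, including the degenerate case of a constant cell, can be sketched as follows:

```python
import numpy as np

def normalize_cell(cell):
    """Min-max normalize one cell; a constant cell is set to all zeros."""
    p_max, p_min = cell.max(), cell.min()
    if p_max == p_min:                        # degenerate cell: no contrast
        return np.zeros_like(cell, dtype=float)
    return (cell.astype(float) - p_min) / (p_max - p_min)
```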
In some embodiments, calculating the mean, the variance and the image difference of each normalization unit to obtain the means, the variances and the image differences of the plurality of segmentation units includes:
calculating the pixel mean of the pixel points in each segmentation unit;
calculating the pixel variance of the pixel points in each segmentation unit;
and calculating the absolute value of the difference between each segmentation unit and a preset image.
In the above embodiment, the unit mean is extracted as follows: the pixel mean of each cell is calculated first; if the original image is divided into n cells, the unit-mean feature vector of the image is an n-dimensional feature vector. The unit variance is extracted by first calculating the variance inside each cell and then combining the variances of all cells; the unit-variance feature vector is also n-dimensional. The mean-image difference is extracted by first extracting the unit-mean feature vectors of a standard face and of the original image, then subtracting them and taking the absolute value to obtain the mean-image-difference feature vector of the original image.
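For illustration, these statistics and the feature fusion of step 106 can be sketched as follows (assuming cells shaped (rows, cols, d, d) and a standard face divided the same way):

```python
import numpy as np

def statistical_features(cells, standard_cells):
    """Per-cell mean, variance, and absolute mean difference to a standard face."""
    means = cells.mean(axis=(2, 3)).ravel()        # n-dimensional unit-mean vector
    variances = cells.var(axis=(2, 3)).ravel()     # n-dimensional unit-variance vector
    std_means = standard_cells.mean(axis=(2, 3)).ravel()
    mean_diff = np.abs(means - std_means)          # mean-image-difference vector
    return means, variances, mean_diff

def fuse_features(means, variances, mean_diff, hog_vec):
    """Step 106: concatenate the statistics with the HOG vector."""
    return np.concatenate([means, variances, mean_diff, hog_vec])
```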
In some embodiments, after the input image is acquired, the method further comprises:
graying the input image by Gray = 0.299r + 0.587g + 0.114b, wherein r is the pixel value of the red channel of the target pixel point, g is the pixel value of the green channel of the target pixel point, and b is the pixel value of the blue channel of the target pixel point.
In the above embodiment, the image is first grayed: if the RGB values of a pixel of the original image are (r, g, b), the grayed pixel value is calculated by the weighted formula above.
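For illustration, the conversion can be sketched as follows (assuming the standard luminance weights given above):

```python
import numpy as np

def to_gray(rgb):
    """Weighted-average grayscale conversion: Gray = 0.299 r + 0.587 g + 0.114 b."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b
```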
In some embodiments, the inputting the target feature vector into a trained support vector machine to obtain a face detection result includes:
if multi-scale detection is to be performed, scaling the input image to multiple scales and merging the overlapping detection results by a non-maximum suppression algorithm.
In the above embodiment, as in fig. 1-2, the image is subjected to multi-scale detection. The multi-scale detection algorithm eliminates the influence of scale differences between images: by shrinking the image and scanning each reduced version, targets of different sizes can be found. To implement multi-scale detection, the usual practice is to record the position of each rectangle in relative coordinates. If a rectangle in an image of height H and width W has absolute coordinates (u, d, l, r), its relative coordinates are (u/H, d/H, l/W, r/W). By continuously scaling the image, a face in the image can be brought to a size the detector can handle; its relative coordinates are then recorded, and finally the absolute coordinates at the (H, W) scale can be computed for all face coordinates by inverting the above conversion.
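For illustration, the pyramid scan with relative coordinates can be sketched as follows; detect_faces is an assumed helper returning (u, d, l, r) boxes at the detector's native scale, and the 0.8 scale step is an assumption:

```python
import cv2

def multi_scale_detect(img, detect_faces, scale_step=0.8, min_size=64):
    """Scan an image pyramid; record detections as relative coordinates,
    then map them back to absolute coordinates at the original (H, W) scale."""
    H, W = img.shape[:2]
    results = []
    scale = 1.0
    while min(H * scale, W * scale) >= min_size:
        h, w = int(H * scale), int(W * scale)
        scaled = cv2.resize(img, (w, h))
        for (u, d, l, r) in detect_faces(scaled):
            rel = (u / h, d / h, l / w, r / w)        # relative coordinates
            results.append((rel[0] * H, rel[1] * H,   # invert to absolute
                            rel[2] * W, rel[3] * W))
        scale *= scale_step
    return results
```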
In some embodiments, before the target feature vector is input to the trained support vector machine to obtain the face detection result, the method further includes:
acquiring a plurality of training data and labeling labels corresponding to the training data;
inputting the training data and the corresponding labeling label into the initial support vector machine;
training the initial support vector machine under a plurality of neural network model parameters through a training function to obtain a plurality of support vector machines;
calculating the loss function values of the support vector machines, and taking the support vector machine with the smallest loss function value as the target support vector machine;
and deploying the target support vector machine to obtain the trained support vector machine.
In the above embodiment, the support vector machine face detector is trained. The feature vectors of all images of the training set are obtained and then fed into the support vector machine model for training to obtain the support vector machine face detector. The function of this classifier is to recognize faces: for a feature vector extracted from an image containing a face the output is 1; otherwise the output is 0.
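For illustration, the training and model-selection steps can be sketched with scikit-learn as follows; LinearSVC with a range of C values and the hinge loss stand in for the unspecified training function, model parameters and loss function:

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import hinge_loss

def train_best_svm(X, y, C_values=(0.01, 0.1, 1.0, 10.0)):
    """Train one SVM per parameter setting and keep the one with the
    smallest loss value, as described above."""
    best_model, best_loss = None, np.inf
    for C in C_values:
        model = LinearSVC(C=C).fit(X, y)
        loss = hinge_loss(y, model.decision_function(X))
        if loss < best_loss:
            best_model, best_loss = model, loss
    return best_model
```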
The schematic structure of the face detection apparatus 20 shown in fig. 2 is applicable to face detection. The face detection apparatus according to the embodiment of the present invention can implement the steps of the face detection method performed in the embodiment corresponding to fig. 1-1. The functions of the face detection apparatus 20 may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions described above, which may be software and/or hardware. The face detection apparatus may include an input/output module 201 and a processing module 202; for the functional implementation of the processing module 202 and the input/output module 201, reference may be made to the operations performed in the embodiment corresponding to fig. 1-1, which are not repeated here. The input/output module 201 may be used to control the input, output and acquisition operations of the apparatus.
In some embodiments, the input-output module 201 may be configured to obtain an input image;
the processing module 202 may be configured to segment the input image to obtain a plurality of segmentation units; extracting HOG characteristics of the directional gradient histogram of each segmentation unit to obtain HOG vectors; normalizing each segmentation unit to obtain a plurality of normalization units; calculating the mean value, the variance and the image difference of each normalization unit to obtain the mean value of a plurality of segmentation units, the variance of the plurality of segmentation units and the image difference of the plurality of segmentation units; feature fusion is carried out on the mean value of each segmentation unit, the variance of the segmentation unit, the image difference of the segmentation unit and the HOG vector to obtain a target feature vector; and inputting the target feature vector into a trained support vector machine to obtain a face detection result.
In some embodiments, the processing module 202 is further configured to:
calculating a horizontal gradient of each pixel in each of the segmentation units by G_x(x, y) = I(x+1, y) − I(x−1, y), wherein G_x(x, y) is the horizontal gradient of the pixel at coordinate (x, y) in the segmentation unit and I(x, y) is the pixel value at coordinate (x, y);
calculating a vertical gradient of each pixel in each of the segmentation units by G_y(x, y) = I(x, y+1) − I(x, y−1), wherein G_y(x, y) is the vertical gradient of the pixel at coordinate (x, y) in the segmentation unit;
calculating the gradient magnitude of the pixel at coordinate (x, y) in each of the segmentation units by G(x, y) = √(G_x(x, y)² + G_y(x, y)²), wherein G(x, y) is the gradient magnitude of the pixel at coordinate (x, y) in the segmentation unit;
calculating the gradient direction of the pixel at coordinate (x, y) in each of the segmentation units by α(x, y) = arctan(G_y(x, y) / G_x(x, y)), wherein α(x, y) is the gradient direction of the pixel at coordinate (x, y) in the segmentation unit;
counting the gradient magnitudes of the pixels and the gradient directions of the pixels in each unit to obtain the HOG features;
and combining the HOG features to obtain the HOG vector.
In some embodiments, the processing module 202 is further configured to:
taking any segmentation unit as a target segmentation unit;
obtaining a maximum pixel value in the target segmentation unit and a minimum pixel value in the target segmentation unit as a pixel upper threshold P_max and a pixel lower threshold P_min;
if P_max = P_min, setting the pixel values of all the pixel points in the target segmentation unit to 0;
if P_max ≠ P_min, calculating a normalized pixel value by P = (P_0 − P_min) / (P_max − P_min), wherein P is the normalized pixel value and P_0 is the original pixel value;
and obtaining a normalization unit through the normalized pixel values.
In some embodiments, the processing module 202 is further configured to:
calculating the pixel mean of the pixel points in each segmentation unit;
calculating the pixel variance of the pixel points in each segmentation unit;
and calculating the absolute value of the difference between each segmentation unit and a preset image.
In some embodiments, the processing module 202 is further configured to:
graying the input image by Gray = 0.299r + 0.587g + 0.114b, wherein r is the pixel value of the red channel of the target pixel point, g is the pixel value of the green channel of the target pixel point, and b is the pixel value of the blue channel of the target pixel point.
In some embodiments, the processing module 202 is further configured to:
if multi-scale detection is to be performed, scaling the input image to multiple scales and merging the overlapping detection results by a non-maximum suppression algorithm.
In some embodiments, the processing module 202 is further configured to:
acquiring a plurality of training data and labeling labels corresponding to the training data;
inputting the training data and the corresponding labeling label into the initial support vector machine;
training the initial support vector machine under a plurality of neural network model parameters through a training function to obtain a plurality of support vector machines;
calculating the loss function values of the support vector machines, and taking the support vector machine with the smallest loss function value as the target support vector machine;
and deploying the target support vector machine to obtain the trained support vector machine.
The apparatus in the embodiments of the present invention is described above from the point of view of modularized functional entities; a computer device is described below from the point of view of hardware. As shown in fig. 3, the computer device includes: a processor, a memory, an input/output unit (which may also be a transceiver, not separately identified in fig. 3) and a computer program stored in the memory and executable on the processor. The computer program may, for example, be a program corresponding to the face detection method in the embodiment corresponding to fig. 1-1. When the computer device implements the functions of the face detection apparatus 20 shown in fig. 2, the processor, when executing the computer program, implements the steps of the face detection method performed by the face detection apparatus 20 in the embodiment corresponding to fig. 2, or alternatively implements the functions of each module in the face detection apparatus 20 of that embodiment.
The processor may be a central processing unit (Central Processing Unit, CPU), another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the computer device and connects the various parts of the overall computer device using various interfaces and lines.
The memory may be used to store the computer program and/or modules, and the processor implements the various functions of the computer device by running or executing the computer program and/or modules stored in the memory and invoking the data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and the application programs required for at least one function (such as a sound playing function or an image playing function), and the data storage area may store data created according to the use of the device (such as audio data or video data). In addition, the memory may include high-speed random access memory, and may also include non-volatile memory such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash Card, at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The input/output unit may be replaced by a receiver and a transmitter, which may be the same or different physical entities; when they are the same physical entity, they may be collectively referred to as the input/output unit. The input and output may be implemented by a transceiver.
The memory may be integrated in the processor or may be provided separately from the processor.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM), comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server or a network device, etc.) to perform the method according to the embodiments of the present invention.
While the embodiments of the present invention have been described above with reference to the drawings, the present invention is not limited to the above-described embodiments, which are merely illustrative and not restrictive. Many modifications may be made by those of ordinary skill in the art in light of the present invention without departing from the spirit of the invention and the scope of the appended claims; equivalent structures or equivalent flow changes made using the description and drawings of the present invention, whether applied directly or indirectly in other relevant technical fields, likewise fall within the scope of protection of the present invention.
Claims (3)
1. A face detection method, the method comprising:
acquiring an input image;
after the input image is acquired, graying the input image by Gray = 0.299r + 0.587g + 0.114b, wherein r is the pixel value of a red channel of a target pixel point, g is the pixel value of a green channel of the target pixel point, and b is the pixel value of a blue channel of the target pixel point;
segmenting the input image to obtain a plurality of segmentation units;
extracting HOG characteristics of the directional gradient histogram of each segmentation unit to obtain HOG vectors;
normalizing each segmentation unit to obtain a plurality of normalization units;
calculating the mean, the variance and the image difference of each normalization unit to obtain the means of the plurality of segmentation units, the variances of the plurality of segmentation units and the image differences of the plurality of segmentation units, which specifically comprises: calculating the pixel mean of the pixel points in each segmentation unit; calculating the pixel variance of the pixel points in each segmentation unit; and calculating the absolute value of the difference between each segmentation unit and a preset image;
feature fusion is carried out on the mean value of each segmentation unit, the variance of the segmentation unit, the image difference of the segmentation unit and the HOG vector to obtain a target feature vector;
acquiring a plurality of training data and labeling labels corresponding to the training data;
inputting the training data and the corresponding labeling label into an initial support vector machine;
training the initial support vector machine under a plurality of neural network model parameters through a training function to obtain a plurality of support vector machines;
calculating the loss function values of the support vector machines, and taking the support vector machine with the smallest loss function value as a target support vector machine;
deploying the target support vector machine to obtain a trained support vector machine;
inputting the target feature vector into a trained support vector machine to obtain a face detection result;
if multi-scale detection is to be performed, scaling the input image to multiple scales and merging the overlapping detection results by a non-maximum suppression algorithm.
2. The method of claim 1, wherein extracting the histogram of oriented gradients (HOG) feature of each of the segmentation units to obtain a plurality of HOG vectors comprises:
calculating a horizontal gradient of each pixel in each of the segmentation units by G_x(x, y) = I(x+1, y) − I(x−1, y), wherein G_x(x, y) is the horizontal gradient of the pixel at coordinate (x, y) in the segmentation unit and I(x, y) is the pixel value at coordinate (x, y);
calculating a vertical gradient of each pixel in each of the segmentation units by G_y(x, y) = I(x, y+1) − I(x, y−1), wherein G_y(x, y) is the vertical gradient of the pixel at coordinate (x, y) in the segmentation unit;
calculating the gradient magnitude of the pixel at coordinate (x, y) in each of the segmentation units by G(x, y) = √(G_x(x, y)² + G_y(x, y)²), wherein G(x, y) is the gradient magnitude of the pixel at coordinate (x, y) in the segmentation unit;
calculating the gradient direction of the pixel at coordinate (x, y) in each of the segmentation units by α(x, y) = arctan(G_y(x, y) / G_x(x, y)), wherein α(x, y) is the gradient direction of the pixel at coordinate (x, y) in the segmentation unit;
counting the gradient magnitudes of the pixels and the gradient directions of the pixels in each unit to obtain the HOG features;
and merging the HOG features to obtain the HOG vector.
3. The method of claim 2, wherein normalizing each of the segmentation units to obtain a plurality of normalization units comprises:
taking any segmentation unit as a target segmentation unit;
obtaining a maximum pixel value in the target segmentation unit and a minimum pixel value in the target segmentation unit as a pixel upper threshold P_max and a pixel lower threshold P_min;
if P_max = P_min, setting the pixel values of all the pixel points in the target segmentation unit to 0;
if P_max ≠ P_min, calculating a normalized pixel value by P = (P_0 − P_min) / (P_max − P_min), wherein P is the normalized pixel value and P_0 is the original pixel value;
and obtaining a normalization unit through the normalized pixel values.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011073284.9A CN112287769B (en) | 2020-10-09 | 2020-10-09 | Face detection method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112287769A CN112287769A (en) | 2021-01-29 |
CN112287769B true CN112287769B (en) | 2024-03-12 |
Family
ID=74422765
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011073284.9A Active CN112287769B (en) | 2020-10-09 | 2020-10-09 | Face detection method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112287769B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115439938B (en) * | 2022-09-09 | 2023-09-19 | 湖南智警公共安全技术研究院有限公司 | Anti-splitting face archive data merging processing method and system |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102169544A (en) * | 2011-04-18 | 2011-08-31 | 苏州市慧视通讯科技有限公司 | Face-shielding detecting method based on multi-feature fusion |
CN103778435A (en) * | 2014-01-16 | 2014-05-07 | 大连理工大学 | Pedestrian fast detection method based on videos |
CN104091157A (en) * | 2014-07-09 | 2014-10-08 | 河海大学 | Pedestrian detection method based on feature fusion |
CN105630906A (en) * | 2015-12-21 | 2016-06-01 | 苏州科达科技股份有限公司 | Person searching method, apparatus and system |
CN108428236A (en) * | 2018-03-28 | 2018-08-21 | 西安电子科技大学 | The integrated multiple target SAR image segmentation method of feature based justice |
CN109063208A (en) * | 2018-09-19 | 2018-12-21 | 桂林电子科技大学 | A kind of medical image search method merging various features information |
CN109389074A (en) * | 2018-09-29 | 2019-02-26 | 东北大学 | A kind of expression recognition method extracted based on human face characteristic point |
CN109460719A (en) * | 2018-10-24 | 2019-03-12 | 四川阿泰因机器人智能装备有限公司 | A kind of electric operating safety recognizing method |
CN110659608A (en) * | 2019-09-23 | 2020-01-07 | 河南工业大学 | Scene classification method based on multi-feature fusion |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10706294B2 (en) * | 2018-05-03 | 2020-07-07 | Volvo Car Corporation | Methods and systems for generating and using a road friction estimate based on camera image signal processing |
Non-Patent Citations (4)
Title |
---|
Application of an improved HOG algorithm to face detection; Zhu Guohua, Xu Kun; Computer Simulation; vol. 38, no. 9; pp. 185–189 *
A vehicle detection algorithm based on a multi-feature-fusion cascade classifier; Zhou Xing et al.; Modern Computer; pp. 38–43 *
Face gender recognition based on HOG and multi-scale LBP features; Yan Jingwen, Jiang Zhidong, Liu Lei; Journal of Yangzhou University (Natural Science Edition), no. 3 *
Pedestrian detection combining an SVM classifier with HOG feature extraction; Xu Yuan et al.; Computer Engineering; vol. 42, no. 1; pp. 56–65 *
Also Published As
Publication number | Publication date |
---|---|
CN112287769A (en) | 2021-01-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |