CN108446602B - Device and method for detecting human face - Google Patents

Device and method for detecting human face

Info

Publication number
CN108446602B
Authority
CN
China
Prior art keywords
layer
model
face detection
face
models
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810166110.3A
Other languages
Chinese (zh)
Other versions
CN108446602A (en)
Inventor
时学鹏
邬书哲
阚美娜
张杰
山世光
陈熙霖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Seetatech Beijing Technology Co ltd
Original Assignee
Seetatech Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Seetatech Beijing Technology Co ltd filed Critical Seetatech Beijing Technology Co ltd
Priority to CN201810166110.3A priority Critical patent/CN108446602B/en
Publication of CN108446602A publication Critical patent/CN108446602A/en
Application granted granted Critical
Publication of CN108446602B publication Critical patent/CN108446602B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172: Classification, e.g. identification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161: Detection; Localisation; Normalisation

Abstract

The invention provides a device and a method for face detection. The device comprises at least two layers of models, wherein each layer of model except the first takes the output of the previous layer of model as input. The first layer model takes an image to be detected as input, screens windows possibly containing human faces from its input, and calibrates the screened windows so that, after calibration, the rotation angle of the face in each window is within the angle interval for the first layer model. The last layer model screens windows possibly containing human faces from its input and outputs the result of face detection.

Description

Device and method for detecting human face
Technical Field
The present invention relates to image processing, and more particularly to face detection in images.
Background
Face detection is an image recognition technique that determines whether a given image contains a face. For example, when a photo or video is taken with a mobile phone or digital camera, the position and size of the faces appearing in the viewfinder are provided for face beautification and automatic focusing, or the presence of a face in a video frame is detected for further processing, such as estimating the identity, age or gender of the person. The accuracy and speed of face detection directly affect the user experience of such applications. In many practical application scenes, however, owing to the viewing angle and the posture of the human body, the faces in the image to be detected are often not upright: the top of the head may point downward, the chin may point upward, or the face may form an angle with the horizontal direction of the image, which poses a great challenge for face detection. The model or apparatus performing face detection must therefore screen out possible faces while eliminating the influence of the angle between each possible face and the reference direction used for detection, so as to accurately detect the faces present in an image or determine whether the image contains a face at all.
In view of the above problems, some prior art rotates the reference direction of the detector to different directions to avoid the influence of the face rotation angle on detection. For example, in "High-Performance Rotation Invariant Multiview Face Detection", published by Chang Huang et al. in IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, a detector trained for a single direction is run four times, once each from the up, down, left and right directions, to detect faces at any in-plane rotation. However, such a face detection method makes detection slow.
In other prior art, face detection is performed on the image to be detected by a multistage cascade of convolutional neural networks; the cascade of models speeds up the computation, and the final stage determines whether the image to be detected contains a face. In this prior art, the prediction of the cascade network contains no information about the rotation angle of the face. Although the influence of the rotation angle can be reduced by repeated iterative computation in the convolutional neural networks, the corresponding detection accuracy is bought at the cost of a very large amount of iterative computation and time.
There is also prior art such as patent document CN106529408A, which proposes a face detection scheme that allocates one calculation engine to each region to be scanned in the image to be detected and configures two threads within each engine to process in parallel, so as to increase the detection speed. It can be understood that, for an image to be detected, the proportion of the frame occupied by a face and the position of the face in the image are uncertain, so the image must be divided at a corresponding granularity into a large number of regions to be scanned. Since the above prior art allocates one calculation engine per region to be scanned, a large number of calculation engines must be arranged and implemented on dedicated FPGA hardware. Common existing hardware cannot support this face detection scheme, and the hardware must be upgraded, which inevitably increases hardware cost.
In addition, prior art such as patent documents CN106778683A and CN107368797A uses tree-shaped face detectors to handle face detection under large rotation angles. For example, a multilayer tree of multi-angle face detectors is arranged: the first layer contains one classifier that screens out possibly appearing faces; the second layer contains two classifiers A and B whose reference directions are set to opposite directions, so that classifier A detects faces rotated upward and classifier B detects faces rotated downward; the third layer sets several classifiers on the output of classifier A to divide the angles more finely, similarly sets several classifiers on the output of classifier B for further division, and so on. However, in this method the classifiers of every layer except the first take as input the output of the corresponding classifier in the previous layer, so the same region to be scanned of the image is processed repeatedly by the different classifiers of each layer after the first. For example, classifier A in the second layer must process all the data output by the first layer classifier, and classifier B in the second layer must also process all the data output by the first layer classifier. Such a process is clearly inefficient and involves a large number of repeated calculations.
Disclosure of Invention
Therefore, the present invention aims to overcome the above-mentioned drawbacks of the prior art and provides an apparatus for face detection, comprising:
at least two layers of models, wherein each layer of model except the first layer of model takes the output of the previous layer of model as input;
the first layer model takes an image to be detected as input and is used for screening windows possibly containing human faces from its input, and for calibrating the windows screened by the first layer model so that, after calibration, the rotation angle of the face in each window is within the angle interval for the first layer model;
and the last layer of model is used for screening windows possibly containing human faces from the input of the model so as to output the result of human face detection.
Preferably, according to the apparatus, each layer of models other than the first layer of model and the last layer of model is configured to screen windows possibly containing faces from its input, and to calibrate the windows screened by that model so that, after calibration, the rotation angle of the face in each window is within the angle interval for the current layer of model;
wherein the angle interval for the current layer model is within the angle interval for the previous layer model.
Preferably, according to the apparatus, the calibrating of the screened windows possibly containing faces includes:
classifying the windows according to the rotation angle of the face in each window possibly containing a face; and
rotating, by a corresponding angle, the windows of the category whose face rotation angles deviate from the reference direction adopted by the face detection algorithm more than those of the other categories;
wherein the angle of rotation is set to correspond to the range of the rotation angles of the faces in the windows of that category.
Preferably, according to the apparatus, wherein at least one of the at least two layer models employs a convolutional neural network model, or SURF features and multi-layer perceptrons, or HOG features and multi-layer perceptrons.
Preferably, according to the apparatus, each of the at least two layers of models is set to have the same or a similar processing duration.
Preferably, according to the apparatus, further comprising:
at least one data buffer unit shared by two adjacent layers of models, and a control unit;
the data buffer unit is shared by two adjacent layers of models: the former of the two adjacent layers writes its processing result into the data buffer unit, and the latter reads data from it for processing;
and the control unit is used for controlling the former of the two adjacent layers of models to read the data corresponding to the next image to be detected after its processing result has been written into the data buffer unit.
And, a method for face detection using the apparatus of any of the above, comprising:
1) the first layer model performs face detection on the input image to be detected to screen out windows possibly containing faces, and calibrates the windows screened by the first layer model so that, after calibration, the rotation angle of the face in each window is within the angle interval for the first layer model;
2) the last layer of model carries out face detection on the content provided by the previous layer of model to screen windows possibly containing faces, and outputs the result of face detection.
Preferably, according to the method, wherein step 1) further comprises:
performing face detection on the content provided by the previous layer of model using the models of the layers other than the first layer of model and the last layer of model, to screen windows possibly containing faces, and calibrating the screened windows possibly containing faces so that, after calibration, the rotation angle of the face in each window is within the angle interval for the current layer of model;
wherein the angle interval for the current layer model is within the angle interval for the previous layer model.
And a computer-readable storage medium, in which a computer program is stored, which computer program, when being executed, is adapted to carry out any of the above-mentioned methods.
And, a system for face detection, comprising:
a storage device and a processor;
wherein the storage means is adapted to store a computer program which, when executed by the processor, is adapted to carry out any of the methods described above.
Compared with the prior art, the invention has the advantages that:
according to the face detection device, each layer of model can screen whether the input content contains the face, so that the number of the reserved windows after each layer of model is gradually reduced, the situation that the same window is repeatedly processed by different classifiers does not exist, and an efficient face detection scheme is provided.
Moreover, the face detection device according to the present invention calibrates the rotation angle of each possible face step by step: after processing by each layer of model, the rotation angle of the face is adjusted to a certain extent to achieve calibration, and the calibrated windows are then passed to the next layer of model for further face recognition. The maximum rotation angle of the faces is thus gradually reduced, which helps the models discriminate faces from non-faces more accurately. With this design, the face detection device of the present invention can accurately and efficiently detect faces at any in-plane rotation angle.
The face detection device according to the present invention can be implemented in hardware or in software. When implemented in software, it is compatible with most existing processors, such as existing neural-network processors or general-purpose processors, avoiding any increase in hardware cost.
Furthermore, the face detection device according to the present invention can perform face detection on a succession of images (for example, a video file) in a pipelined manner. By assigning each layer of model to a different computing unit so that the computing units operate as a pipeline, hardware resources are fully utilized and the computation speed is further improved.
Drawings
Embodiments of the invention are further described below with reference to the accompanying drawings, in which:
FIG. 1 is a schematic structural diagram of a face detection apparatus using a three-layer face detection and calibration model according to an embodiment of the present invention;
fig. 2 shows a flow of face detection corresponding to the face detection apparatus in fig. 1;
fig. 3 shows a pipeline timing diagram of a detection apparatus using a three-layer model corresponding to fig. 1, which continuously performs face detection on a plurality of images in a pipeline manner.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
In the present invention, the detection device is likewise formed by cascading several layers of models. Differing from the prior art, each layer of model except the last both detects whether possible faces exist and classifies them according to the rotation angles of those possible faces, while the last layer of model need only detect whether possible faces exist. The rotation angle of each region to be scanned that remains after the screening of one layer of model (hereinafter referred to as a "window") is roughly calibrated, and the calibrated result is input into the next layer of model, which continues to detect whether a face exists and to classify according to the rotation angle of the face that may exist.
In this way, every layer of model removes the windows detected as non-faces, so the number of windows remaining after each layer of model decreases layer by layer. Moreover, the processing result of each layer of model is roughly calibrated, so the possible rotation angle of the face is reduced layer by layer, lowering the difficulty of deciding whether a face exists at the next layer and helping the later layers discriminate faces from non-faces more accurately for the windows they receive. It can be understood that the earlier layers of models classify coarsely according to the rotation angles of the possible faces while the later layers classify according to finer angles, so the calibration of the output of each layer of model progresses from a prediction of coarse, discrete orientations to a prediction of fine, continuous angles.
The face detection apparatus and the method of using the same according to the present invention will be described below by way of specific embodiments.
Fig. 1 is a schematic structural diagram of a face detection apparatus using a three-layer face detection and calibration model according to an embodiment of the present invention. The three layers of face detection and calibration models are connected in cascade, each layer taking the output of the previous layer of model as its input. The first layer face detection and calibration model takes the contents of the sliding windows of the image to be detected (hereinafter referred to as windows) as input, performs face detection on each window to exclude windows that are not faces, classifies the windows according to the rotation angle of the face in each, and roughly calibrates those rotation angles according to the classification result. The calibrated result is input into the second layer face detection and calibration model which, similarly, performs face detection on each window to eliminate non-face windows and classifies and calibrates according to the rotation angle of the face in each window, and so on. The rotation angle handled by the classification and calibration of the second layer is smaller than that of the first layer, and that of the third layer is smaller than that of the second layer, so the rotation angle of the face is calibrated step by step.
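To make the cascade concrete, the following minimal Python sketch shows how such a multi-layer detector could be wired together; the Window class, the Layer callable signature and the function names are illustrative assumptions of this description, not the implementation of the patent.

    from dataclasses import dataclass
    from typing import Any, Callable, List

    @dataclass
    class Window:
        # A candidate region of the image to be detected (hypothetical representation).
        x: int
        y: int
        size: int
        angle: float   # current estimate of the in-plane rotation of the face, in degrees
        patch: Any     # cropped pixel data, e.g. a numpy array

    # Each layer of face detection and calibration model is assumed to behave as one
    # callable: it removes non-face windows and returns the survivors, already rotated
    # into the angle interval handled by that layer.
    Layer = Callable[[List[Window]], List[Window]]

    def cascade_detect(candidate_windows: List[Window], layers: List[Layer]) -> List[Window]:
        # Run the cascaded detection-and-calibration layers in order.
        windows = candidate_windows
        for layer in layers:
            windows = layer(windows)   # screen + calibrate
            if not windows:            # nothing that might be a face remains
                break
        return windows                 # output of the last layer is the detection result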
Because most existing face detection algorithms achieve very accurate recognition when the angle between the rotation of the face and the reference direction is within 40-60 degrees, the present invention can adopt a multi-layer face detection and calibration model to gradually reduce the angle between the face to be detected and the reference direction adopted for face detection, so as to finally output an accurate face detection result.
Fig. 2 shows a flow of face detection corresponding to the face detection apparatus in fig. 1. Referring to fig. 2, a method for detecting a face of an image by using the face detection apparatus in fig. 1 includes:
step 1, carrying out face detection on all windows of an image to be detected by a first-layer face detection and calibration model of the face detection device, dividing the screened windows possibly containing the face into two classes according to the rotation angle of the face, and calibrating the angle range of the windows possibly containing the face according to the divided classes.
If the rotation angles of the faces in the image to be detected are assumed to be distributed randomly over all angles, the rotation angles can be roughly divided into the two categories of upward and downward. It can be understood, however, that if statistics show that the rotation angles of the faces appearing in the images of the current application scene are not randomly distributed, the categories can be set according to the statistical result so that the interval corresponding to each of the two categories matches that result.
As shown in fig. 2, seven windows possibly containing faces are retained after the screening of the first layer face detection and calibration model. Here, a window may be assigned to the "upward" category when the angle between the direction pointing from the chin to the top of the head of the recognized possible face and a vertically upward reference direction lies in the range (-90°, 90°], and to the "downward" category when that angle lies in the range (90°, 270°]. The rotation angles of the possible faces in the "downward" category can then be roughly calibrated. For example, in fig. 2 the possible faces contained in the first, fourth, fifth and sixth windows are all in the "downward" category (the rotation angle of the face is indicated by an arrow); after each of them is rotated by 180 degrees, the windows that originally belonged to the "downward" category are aligned with the "upward" category.
In this way, all the detected windows possibly containing faces fall into the same category, that is, the angle between the possible face in each window and the reference direction is within (-90°, 90°]. This single category is closer to the reference direction of the face detection algorithm, which helps the subsequent steps of face detection obtain accurate recognition results.
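As an illustration of step 1 only (a hypothetical example, not the patent's code), the sketch below classifies candidate windows into the "upward" and "downward" categories by the chin-to-vertex angle stored in each window and rotates the "downward" windows by 180 degrees; rotate_patch is an assumed helper and the Window fields are those of the earlier sketch.

    import numpy as np

    def rotate_patch(patch: np.ndarray, degrees: int) -> np.ndarray:
        # Rotate an image patch counter-clockwise by a multiple of 90 degrees (assumed helper).
        return np.rot90(patch, k=(degrees // 90) % 4)

    def calibrate_layer1(windows):
        # Coarse first-layer calibration: 'upward' faces lie in (-90°, 90°],
        # 'downward' faces lie in (90°, 270°] and are rotated by 180°.
        calibrated = []
        for w in windows:
            angle = ((w.angle + 90.0) % 360.0) - 90.0   # normalise into [-90°, 270°)
            if angle > 90.0:                            # 'downward' category
                w.patch = rotate_patch(w.patch, 180)
                w.angle = angle - 180.0                 # now within (-90°, 90°)
            else:                                       # 'upward' category, left unchanged
                w.angle = angle
            calibrated.append(w)
        return calibrated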
Step 2: the second layer face detection and calibration model of the face detection device performs face detection on the windows output by the first layer face detection and calibration model, divides the further-screened windows possibly containing faces into categories according to the rotation angle of the face, and further calibrates the angle range of the windows possibly containing faces according to those categories.
In step 2, part of the input of the second layer face detection and calibration model has been calibrated in step 1 and adjusted into a smaller rotation-angle interval; repeating face detection on these calibrated windows therefore yields higher accuracy. Face detection could be performed only on the windows calibrated in step 1 to improve computational efficiency. However, considering that the classification by rotation angle in step 1 is very coarse, it is preferable that the second layer face detection and calibration model perform face detection on both the calibrated and the uncalibrated windows obtained from the first layer face detection and calibration model. After the face detection of the second layer face detection and calibration model, six windows possibly containing faces remain, as shown in fig. 2.
Similarly to the foregoing, the range of the rotation angle of the face can be divided into three categories. If the reference direction of the face detection algorithm is vertically upward, a window whose rotation angle lies in [-90°, -45°) is assigned to the "facing left" category, a window whose rotation angle lies in [45°, 90°] is assigned to the "facing right" category, and a window whose rotation angle lies in [-45°, 45°] is assigned to the "no calibration needed for now" category.
Here, the windows in the "facing left" category may be rotated clockwise by 90° and the windows in the "facing right" category rotated counterclockwise by 90°, so that the rotation angles of the possible faces in all windows are calibrated to lie within [-45°, 45°].
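The second layer's calibration in this example can be sketched in the same style; whether "facing left" corresponds to a clockwise or a counter-clockwise correction depends on the angle convention, so the directions below are assumptions, and the rotate_patch helper is the one defined in the previous sketch.

    def calibrate_layer2(windows):
        # Second-layer calibration sketch: after the first layer the angles lie in (-90°, 90°];
        # 'facing left' and 'facing right' windows are rotated by 90° so that every remaining
        # face angle lies roughly within [-45°, 45°].
        calibrated = []
        for w in windows:
            if w.angle < -45.0:                        # 'facing left', e.g. [-90°, -45°)
                w.patch = rotate_patch(w.patch, 270)   # 270° counter-clockwise = 90° clockwise
                w.angle += 90.0
            elif w.angle > 45.0:                       # 'facing right', e.g. (45°, 90°]
                w.patch = rotate_patch(w.patch, 90)    # 90° counter-clockwise
                w.angle -= 90.0
            # otherwise: 'no calibration needed for now', already within [-45°, 45°]
            calibrated.append(w)
        return calibrated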
Step 3: the third layer face detection and calibration model of the face detection device performs face detection on the windows output by the second layer face detection and calibration model. The third layer face detection and calibration model here may directly output the result of face detection, for example identifying the windows of the image that contain faces, or providing those windows to other software or hardware for further analysis and processing.
If the face needs to be recognized or otherwise processed subsequently in the current application, then in step 3 the further-screened windows possibly containing faces may likewise be divided into categories according to the rotation angle of the face, the angle range of the windows may be further calibrated according to those categories, and the calibrated result output. As in steps 1 and 2 above, the rotation angles of the possible faces in all windows can be calibrated into a smaller range according to the rotation angle of the face in each window.
Therefore, the human face detection of the image to be detected can be realized through the steps 1-3.
The models of each layer in the above embodiments need to perform face detection and to classify by the rotation angle of the face, so the present invention preferably uses convolutional neural networks to implement the models of each layer, because convolutional neural networks give very good results for face detection and classification. It is understood, however, that other models can also determine whether a face exists and classify according to its rotation angle, for example Speeded-Up Robust Features (SURF) with a Multi-Layer Perceptron (MLP), or Histogram of Oriented Gradients (HOG) features with a Multi-Layer Perceptron (MLP). Thus, in some embodiments of the present invention each layer need not use the same type of model; for example, SURF features and a multi-layer perceptron may be used in the first layer and convolutional neural networks in the second and third layers.
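Purely as an illustration of what one layer's model might look like when a convolutional neural network is chosen, the sketch below defines a small network with two output heads, one for the face / non-face decision and one for the coarse orientation category handled by that layer. PyTorch, the 24x24 input size and the channel counts are assumptions of this example; the patent does not prescribe a framework or a network architecture.

    import torch
    import torch.nn as nn

    class DetectCalibrateNet(nn.Module):
        # Minimal per-layer model sketch: one head for face / non-face,
        # one head for the orientation category of this layer (e.g. 'up' / 'down').
        def __init__(self, num_orientation_classes: int = 2, input_size: int = 24):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
            )
            feat_dim = 32 * (input_size // 4) * (input_size // 4)
            self.face_head = nn.Linear(feat_dim, 2)                         # face vs. non-face
            self.angle_head = nn.Linear(feat_dim, num_orientation_classes)  # orientation category

        def forward(self, x: torch.Tensor):
            f = self.features(x).flatten(start_dim=1)
            return self.face_head(f), self.angle_head(f)

    # Usage sketch: a batch of eight 24x24 RGB window crops.
    windows = torch.randn(8, 3, 24, 24)
    face_logits, angle_logits = DetectCalibrateNet()(windows)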
It will be appreciated by those skilled in the art that, as described above, very accurate face detection is achievable when the rotation angle of the face is within 40-60 degrees of the reference direction. Therefore, when the number of layers of models of the face detection device is chosen, a three-layer model need not be used; it is sufficient that the rotation angle of the face is gradually reduced so that, when the last layer performs face detection, the rotation angle of the face in each window is within 40-60 degrees. The rotation angles of the faces in the windows may also take into account the distribution of rotation angles in the application scene; for example, if statistics show that more than 90% of the face rotation angles in the current application scene lie between -135° and 135°, the rotation angles can be gradually calibrated to within ±40° or ±60° by setting a corresponding number of layers of models and the classification criterion of each layer of model.
For example, according to another embodiment of the present invention, a face detection device with a two-layer model may be provided, in which the first layer model divides the windows into the four categories "up", "left", "right" and "down" according to the rotation angle of the face, corresponding respectively to the four angle intervals (-45°, 45°], (45°, 135°], (135°, 225°] and (225°, 315°], and the second layer model performs operations consistent with the third layer face detection and calibration model of the foregoing embodiment.
In the above face detection apparatus using a multilayer model, the earlier model layers are expected to perform rough face detection and rough classification, while the later model layers are expected to perform precise face detection and precise classification (classification within a smaller range requires greater refinement and a larger amount of computation). Therefore, in the present invention a relatively early layer (for example, the first layer) can be implemented with a smaller-scale model, and a relatively late layer (for example, the third layer) with a larger-scale model.
Most face detection algorithms improve their accuracy through multiple iterations; in other words, the smaller the rotation angle of the face, the easier the detection, and the faster the desired accuracy is reached. It is easy to see that the method of the present invention gradually reduces the angle between the face and the reference direction adopted by the algorithm, and thus gradually increases the speed at which each layer of model performs face recognition.
Considering the above two points together, the type and scale of each layer of model can be chosen so that the calculation time required by each layer is approximately the same, or at least of the same order of magnitude. For example, when the detection device with three layers of networks performs face detection on an image to be detected, the number of windows remaining after screening by each layer decreases progressively: say 1000 windows remain after the first layer of model and 100 after the second; if the computation of the second layer of model is set to be 10 times that of the first layer of model, the calculation time of the two layers can be substantially the same. In this case it is very advantageous to run the layer models as a pipeline, for example with each layer of model as one stage of the pipeline, to improve face detection over a succession of images, since a pipeline is most efficient when the processing times of its stages are equal.
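This balance condition can be checked with a back-of-the-envelope calculation like the one below; the 10000 windows assumed to enter the first layer and the per-window cost ratio of the third layer are illustrative figures added here, while the 1000 and 100 surviving windows and the factor of 10 between the first two layers are the numbers quoted above.

    # Rough per-stage load estimate (assumed figures, not measurements).
    windows_entering = {"layer 1": 10000, "layer 2": 1000, "layer 3": 100}
    relative_cost_per_window = {"layer 1": 1, "layer 2": 10, "layer 3": 100}

    for name, count in windows_entering.items():
        stage_time = count * relative_cost_per_window[name]
        print(name, "relative stage time:", stage_time)   # each stage: 10000 units

    # When the window count shrinks about as fast as the per-window model cost grows,
    # the pipeline stages take comparable time, which is when a pipeline is most efficient.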
Therefore, the invention also provides a method for continuously detecting the faces of a plurality of images in a pipeline mode by adopting the detection device, and the method is particularly suitable for detecting the faces in the video file. When the face detection is carried out on the video file, each layer of model can be distributed to different computing units, and computing is carried out in a pipeline mode, so that hardware resources are fully utilized, and the computing speed is improved.
Fig. 3 shows a pipeline timing diagram of a detection apparatus using a three-layer model corresponding to fig. 1, which continuously performs face detection on a plurality of images in a pipeline manner. Referring to fig. 3, the first layer face detection and calibration model, the second layer face detection and calibration model, and the third layer face detection and calibration model are respectively assigned to three different computing units for implementation, which are respectively denoted as computing units A, B and C. In the present invention, the computing unit used may be a thread, a CPU core, a GPU core, or other units commonly used to implement a pipeline.
As shown in fig. 3, at the first unit time T1, the first layer face detection and calibration model receives the first image to be detected (input one A) and processes it accordingly;
at the second unit time T2, the first layer face detection and calibration model receives the second image to be detected (input two A) and processes it, while the second layer face detection and calibration model receives the windows to be processed from the first layer face detection and calibration model (input one B) and processes them;
at the third unit time T3, the first layer face detection and calibration model receives and processes the third image to be detected (input three A), the second layer face detection and calibration model receives and processes the windows from the first layer face detection and calibration model (input two B), and the third layer face detection and calibration model receives and processes the windows from the second layer face detection and calibration model (input one C);
by analogy … …
By assigning the models of the layers to different computing units, a plurality of images can be processed in succession; because the computing units work simultaneously, the computing resources of the hardware platform are fully utilized and the calculation speed is greatly increased. Moreover, several pipelines can be arranged in parallel in the present invention to increase the calculation speed further.
When a batch of images must be processed, the amount of computation required for each image may differ, so it may happen that the computing unit of the next layer of model has not yet finished processing the current image while the computing unit of the previous layer of model has already finished processing the next image. For example, if the processing of the first image by computing units A, B and C each takes more than 30 ms while the processing of the second image takes about 10 ms per unit, computing unit A may already have finished the second image and want to hand its result to computing unit B while B is still processing the first image, so the stages of the pipeline would no longer connect properly.
In view of this situation, the present invention provides a data buffer for each computing unit of the pipeline, together with a corresponding control method, to ensure that the pipeline works normally. Each data buffer is shared by two adjacent computing units. For example, for the data buffer 1-2 between the first layer model and the second layer model, the first layer model writes its processing result into data buffer 1-2 and the second layer model reads from data buffer 1-2 the content it needs to process; writing and reading can proceed simultaneously. When the processing result of the first layer model cannot yet be written completely, the first layer model can infer that the second layer model has not finished processing the previous image, and it temporarily postpones reading the next image. This process can be implemented under the control of the system.
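A bounded blocking queue is one simple way to realize such a shared data buffer; the sketch below uses Python's standard queue module, and the capacity of four results is an arbitrary assumption of the example.

    import queue

    # Shared data buffer 1-2 between the first-layer and second-layer models.
    buffer_1_2 = queue.Queue(maxsize=4)

    def layer1_write(result):
        # Writer side (first-layer model): put() blocks while the buffer is full,
        # i.e. while the second-layer model has not yet consumed earlier results,
        # so the first-layer model naturally postpones reading the next image.
        buffer_1_2.put(result)

    def layer2_read():
        # Reader side (second-layer model): get() blocks until a result is available.
        return buffer_1_2.get()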
According to one embodiment of the invention, the workflow of the computing unit of each stage of the pipeline comprises:
The computing unit of the first stage: it reads the input first image to be detected, generates candidate regions on the image with a sliding window, performs face detection on these candidate regions (that is, windows) to screen out windows possibly containing faces, and calibrates, by the set rotation angles, the parts of those windows that need calibration. After this process is completed, it checks whether there is remaining space in the data buffer shared by the first-stage and second-stage computing units; if so, it writes its processing result into the data buffer, and if not, it waits until space becomes available and then performs the write. Only after all of its processing results have been written into the data buffer does the first-stage computing unit read the next input image to be detected, and so on until all images to be detected have been processed.
The computing units from the second stage to the second-to-last stage: each reads, from the data buffer it shares with the computing unit of the previous stage, the content provided by that unit and processes it accordingly, that is, screens out windows possibly containing faces and calibrates the screening result. As with the first-stage computing unit described above, after this process is completed it decides whether to write its result according to whether there is remaining space in the data buffer it shares with the computing unit of the next stage. After the write is completed, it again reads content from the data buffer shared with the previous computing unit and processes it, repeating this procedure until no more data arrives.
The computing unit of the last stage: similarly to the computing units above, it performs the corresponding processing to obtain the face detection result for the current image to be detected and outputs that result directly. It then reads the content of the data buffer shared with the computing unit of the previous stage for processing, repeating this procedure until no more data arrives.
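The three workflows above can be sketched as thread functions connected by such shared buffers; the STOP sentinel, the buffer sizes and all function names are assumptions of this example rather than the patent's implementation.

    import queue
    import threading

    STOP = object()   # sentinel meaning "no more data will arrive" (an assumption of the sketch)

    def first_stage(read_image, detect_calibrate_1, out_buf: queue.Queue):
        # First-stage unit: read images, generate sliding windows, screen and calibrate,
        # then write the result into the buffer shared with the second stage.
        while True:
            image = read_image()
            if image is None:                       # no more input images
                out_buf.put(STOP)
                return
            out_buf.put(detect_calibrate_1(image))  # blocks while the shared buffer is full

    def middle_stage(detect_calibrate_k, in_buf: queue.Queue, out_buf: queue.Queue):
        # Intermediate unit: read from the previous buffer, process, write to the next buffer.
        while True:
            windows = in_buf.get()
            if windows is STOP:
                out_buf.put(STOP)
                return
            out_buf.put(detect_calibrate_k(windows))

    def last_stage(detect_last, in_buf: queue.Queue, emit_result):
        # Last-stage unit: produce and directly output the detection result for each image.
        while True:
            windows = in_buf.get()
            if windows is STOP:
                return
            emit_result(detect_last(windows))

    # Wiring for the three-layer detector of Fig. 1 (threads stand in for computing units A, B and C):
    # buf12, buf23 = queue.Queue(maxsize=4), queue.Queue(maxsize=4)
    # threading.Thread(target=first_stage,  args=(read_image, layer1, buf12)).start()
    # threading.Thread(target=middle_stage, args=(layer2, buf12, buf23)).start()
    # threading.Thread(target=last_stage,   args=(layer3, buf23, emit_result)).start()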
In this way, the computing units of the pipeline can be controlled to stay synchronized in reading and processing data, data loss is avoided, and the normal operation of the pipeline is guaranteed.
In summary, the present invention provides an improved face detection scheme that can accurately and efficiently detect faces at any in-plane rotation angle. The face detection device can be implemented in hardware or in software; when implemented in software it is compatible with most existing processors, avoiding an increase in hardware cost. Moreover, the face detection device can perform face detection on a succession of images, such as video files, in a pipelined manner, making full use of hardware resources and further increasing the calculation speed.
It should be noted that not all of the steps described in the above embodiments are necessary, and those skilled in the art may make appropriate substitutions, replacements, modifications and the like according to actual needs.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention and are not limiting. Although the present invention has been described in detail with reference to the embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents substituted without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (8)

1. An apparatus for face detection, comprising:
at least two layers of models, wherein each layer of model except the first layer of model takes the output of the previous layer of model as input;
the first layer model takes an image to be detected as input and is used for screening windows possibly containing human faces from its input, and for calibrating the windows screened by the first layer model so that, after calibration, the rotation angle of the face in each window is within the angle interval for the first layer model; wherein the calibrating of the screened windows possibly containing faces comprises: classifying the windows according to the rotation angle of the face in each window possibly containing a face; and rotating, by a corresponding angle, the windows of the category whose face rotation angles deviate from the reference direction adopted by the face detection algorithm more than those of the other categories; wherein the angle of rotation is set to correspond to the range of the rotation angles of the faces in the windows of that category; and
the last layer of model is used for screening windows possibly containing human faces from the input of the model so as to output the result of human face detection;
the models of the layers other than the first layer of model and the last layer of model are used for screening windows possibly containing human faces from their input, and for calibrating the windows screened by those models so that, after calibration, the rotation angle of the face in each window is within the angle interval for the current layer of model; wherein the angle interval for the current layer model is within the angle interval for the previous layer model.
2. The apparatus of claim 1, wherein at least one of the at least two layers of models employs a convolutional neural network model, or SURF features and multi-layer perceptrons, or HOG features and multi-layer perceptrons.
3. The apparatus of claim 1, wherein each of the at least two layer models is set to have the same or similar processing duration as each other.
4. The apparatus of claim 1, further comprising:
at least one data buffer unit shared by two adjacent layers of models, and a control unit;
the data buffer unit is shared by the two adjacent layers of models: the former of the two adjacent layers writes its processing result into the data buffer unit, and the latter reads data from it for processing;
and the control unit is used for controlling the former of the two adjacent layers of models to read the data corresponding to the next image to be detected after its processing result has been written into the data buffer unit.
5. A method of face detection using the apparatus of any of claims 1-4, comprising:
1) the first layer model performs face detection on the input image to be detected to screen out windows possibly containing faces, and calibrates the windows screened by the first layer model so that, after calibration, the rotation angle of the face in each window is within the angle interval for the first layer model;
2) the last layer of model carries out face detection on the content provided by the previous layer of model to screen windows possibly containing faces, and outputs the result of face detection.
6. The method of claim 5, wherein step 1) further comprises:
performing face detection on the content provided by the previous layer of model using the models of the layers other than the first layer of model and the last layer of model, to screen windows possibly containing faces, and calibrating the screened windows possibly containing faces so that, after calibration, the rotation angle of the face in each window is within the angle interval for the current layer of model;
wherein the angle interval for the current layer model is within the angle interval for the previous layer model.
7. A computer-readable storage medium, in which a computer program is stored which, when being executed, is adapted to carry out the method of claim 5 or 6.
8. A system for face detection, comprising:
a storage device and a processor;
wherein the storage means is for storing a computer program for implementing the method as claimed in claim 5 or 6 when executed by the processor.
CN201810166110.3A 2018-02-28 2018-02-28 Device and method for detecting human face Active CN108446602B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810166110.3A CN108446602B (en) 2018-02-28 2018-02-28 Device and method for detecting human face

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810166110.3A CN108446602B (en) 2018-02-28 2018-02-28 Device and method for detecting human face

Publications (2)

Publication Number Publication Date
CN108446602A CN108446602A (en) 2018-08-24
CN108446602B true CN108446602B (en) 2021-08-20

Family

ID=63192710

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810166110.3A Active CN108446602B (en) 2018-02-28 2018-02-28 Device and method for detecting human face

Country Status (1)

Country Link
CN (1) CN108446602B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635755A (en) * 2018-12-17 2019-04-16 苏州市科远软件技术开发有限公司 Face extraction method, apparatus and storage medium
CN111382687A (en) * 2020-03-05 2020-07-07 平安科技(深圳)有限公司 Face detection method and system

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001043349A (en) * 1999-07-27 2001-02-16 Fujitsu Ltd Face posture detector
CN101639933A (en) * 2009-07-16 2010-02-03 上海合合信息科技发展有限公司 Image rotation correction method and system and electronic device
CN106485215A (en) * 2016-09-29 2017-03-08 西交利物浦大学 Face occlusion detection method based on depth convolutional neural networks
CN107506707A (en) * 2016-11-30 2017-12-22 奥瞳系统科技有限公司 Face detection using small-scale convolutional neural network modules in embedded systems
CN107368797A (en) * 2017-07-06 2017-11-21 湖南中云飞华信息技术有限公司 The parallel method for detecting human face of multi-angle, device and terminal device

Also Published As

Publication number Publication date
CN108446602A (en) 2018-08-24

Similar Documents

Publication Publication Date Title
US10262237B2 (en) Technologies for improved object detection accuracy with multi-scale representation and training
US11107222B2 (en) Video object tracking
US11176415B2 (en) Assisted image annotation
US10824916B2 (en) Weakly supervised learning for classifying images
WO2020164282A1 (en) Yolo-based image target recognition method and apparatus, electronic device, and storage medium
WO2018003212A1 (en) Object detection device and object detection method
US11132575B2 (en) Combinatorial shape regression for face alignment in images
US9619753B2 (en) Data analysis system and method
CN109598231A (en) A kind of recognition methods of video watermark, device, equipment and storage medium
JP2014215852A (en) Image process device, program and image process method
JP2011165188A (en) Apparatus and method for determining multi-angle specific object
CN110889446A (en) Face image recognition model training and face image recognition method and device
CN111582021A (en) Method and device for detecting text in scene image and computer equipment
US10210424B2 (en) Method and system for preprocessing images
US20230169554A1 (en) System and method for automated electronic catalogue management and electronic image quality assessment
US20230137337A1 (en) Enhanced machine learning model for joint detection and multi person pose estimation
WO2020190480A1 (en) Classifying an input data set within a data category using multiple data recognition tools
CN108446602B (en) Device and method for detecting human face
WO2019217562A1 (en) Aggregated image annotation
CN115048969A (en) Visual analysis system for evaluating, understanding and improving deep neural networks
JP2018106618A (en) Image data classifying apparatus, object detection apparatus, and program therefor
US8630483B2 (en) Complex-object detection using a cascade of classifiers
JP6977624B2 (en) Object detector, object detection method, and program
JP2006133941A (en) Image processing device, image processing method, image processing program, and portable terminal
KR101961462B1 (en) Object recognition method and the device thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant