CN110427859A

CN110427859A - Face detection method, device, electronic equipment and storage medium

Info

Publication number: CN110427859A
Application number: CN201910683491.7A
Authority: CN
Inventors: 苏驰; 李凯; 刘弘也; 魏胜禹
Original assignee: Beijing Kingsoft Cloud Network Technology Co Ltd; Beijing Kingsoft Cloud Technology Co Ltd
Current assignee: Beijing Kingsoft Cloud Network Technology Co Ltd; Beijing Kingsoft Cloud Technology Co Ltd
Priority date: 2019-05-31
Filing date: 2019-07-26
Publication date: 2019-11-08
Also published as: CN110378314A

Abstract

Embodiments of the invention provide a face detection method, device, electronic equipment and storage medium. The method comprises: obtaining a video to be detected and extracting frames from it to obtain a set of images to be detected; and inputting each image to be detected in the set into a pre-trained face detection model to obtain an output result that includes a face probability and location information of a face region. The face probability indicates the probability that each detection region input to the first face detection model contains a face, and a face region is a detection region whose face probability is greater than a preset probability threshold. Therefore, the at least one image to be detected in the set can be examined and the faces in it detected, so that at least one face contained in the video to be detected is detected.

Description

Face detection method, device, electronic equipment and storage medium

Technical field

The present invention relates to the technical field of image processing, and in particular to a face detection method, device, electronic equipment and storage medium.

Background art

With the rapid development of deep learning, face recognition technology has made breakthrough progress and has brought great convenience to daily life. Existing face recognition technology includes at least two steps: face detection and localization, and face attribute recognition. Face detection and localization refers to detecting the face region in an image. Face attribute recognition performs feature extraction, comparison, fusion and the like on the detected face region to obtain attributes of the face, such as identity, gender, occupation and age.

Most existing face detection identifies a face in a single, controlled setting, for example face recognition when taking the subway, in mobile payment, or in access control. In these scenarios a photo of the face to be recognized must be taken, and recognition is then performed on that photo. The photo generally contains only one face to be recognized, and the user adjusts his or her own face position according to prompts while the photo is taken, which ensures that the captured face image is of high quality.

When surveillance video is shot in monitoring scenarios, however, the people being filmed do not adjust their face positions, so the face detection methods described above are not suitable for detecting faces in video.

Summary of the invention

The purpose of the embodiments of the present invention is to provide a face detection method, device, electronic equipment and storage medium, so as to detect the faces contained in a video to be detected. The specific technical solutions are as follows:

In a first aspect, an embodiment of the present invention provides a face detection method, comprising:

obtaining a video to be detected and extracting frames from the video to be detected to obtain a set of images to be detected, wherein the set of images to be detected includes at least one image to be detected;

inputting each image to be detected in the set of images to be detected into a pre-trained face detection model to obtain an output result, wherein the output result includes a face probability and location information of a face region, the face probability indicates the probability that each detection region input to the first face detection model contains a face, and the face region is a detection region whose face probability is greater than a preset probability threshold;

wherein the face detection model is obtained by machine learning training using a training data set, and each group of training data in the training data set includes: a face image extracted from a surveillance video and location information identifying the face region in that face image.

Optionally, inputting each image to be detected in the set of images to be detected into the pre-trained face detection model to obtain the output result comprises:

inputting the image to be detected into a pre-trained first face detection model to obtain a first output result, wherein the first output result includes a first face probability and location information of a first face region, the first face probability indicates the probability that each first detection region input to the first face detection model contains a face, and the first face region is a first detection region whose first face probability is greater than a first preset probability threshold;

based on the location information of the first face region, inputting the first face region into a pre-trained second face detection model to obtain a second output result, wherein the second output result includes a second face probability and location information of a second face region, the second face probability indicates the probability that each second detection region input to the second face detection model contains a face, the size of the second detection region is greater than the size of the first detection region, and the second face region is a second detection region whose second face probability is greater than a second preset probability threshold.

Optionally, after the second output result is obtained, the face detection method further comprises:

based on the location information of the second face region, inputting the second face region into a pre-trained third face detection model to obtain a third output result, wherein the third output result includes a third face probability, location information of a third face region and location information of facial organs, the third face probability indicates the probability that each third detection region input to the third face detection model contains a face, and the third face region is a third detection region whose third face probability is greater than a third preset probability threshold.

Optionally, the first face detection model and the second face detection model are obtained by machine learning training using first training data, wherein each group of training data in the first training data includes: a face image extracted from a surveillance video and location information identifying the face region in that face image;

the third face detection model is obtained by machine learning training using second training data, wherein each group of training data in the second training data includes: a face image extracted from a surveillance video, location information identifying the face region in that face image, and location information of facial organs.

based on the location information of the second face region, inputting the second face region into a pre-trained fourth face detection model to obtain a quality level corresponding to each second face region;

wherein the fourth face detection model is obtained by machine learning training using a third training data set, and each group of training data in the third training data set includes: a face image extracted from a surveillance video and a label identifying the quality level of that face image.

Optionally, after the third output result is obtained, the face detection method further comprises:

based on the location information of the third face region, inputting the third face region into the pre-trained fourth face detection model to obtain a quality level corresponding to each third face region;

Optionally, the fourth face detection model includes: a convolutional layer, a batch normalization layer, an activation function layer, a global average pooling layer and an output layer;

the convolutional layer is used to extract local features of each face region;

the local features of each face region are input to the batch normalization layer for normalization, giving the normalized local features of each face region;

the normalized local features of each face region are input to the activation function layer, giving the non-linearized local features of each face region;

the non-linearized local features of each face region are input to the global average pooling layer for pooling, giving the pooled features of each face region;

the pooled features of each face region are input to the output layer, giving the quality level of each face region.
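The layer sequence above (convolution, batch normalization, activation, global average pooling, output) can be illustrated with a toy forward pass. This is only a sketch under simplifying assumptions that the text does not state: a single channel, a single hand-picked kernel, ReLU as the activation function, batch statistics used for normalization, and a hand-picked linear output layer. All function names are hypothetical.

```python
import math
from typing import List

Image = List[List[float]]

def conv2d(img: Image, kernel: Image) -> Image:
    """Valid 2-D cross-correlation of one single-channel image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    rows, cols = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(cols)] for r in range(rows)]

def batch_norm(maps: List[Image], eps: float = 1e-5) -> List[Image]:
    """Normalize one channel with the mean and variance over the whole batch."""
    values = [v for m in maps for row in m for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var + eps)
    return [[[(v - mean) / std for v in row] for row in m] for m in maps]

def relu(maps: List[Image]) -> List[Image]:
    """Activation function layer (ReLU is an assumption)."""
    return [[[max(0.0, v) for v in row] for row in m] for m in maps]

def global_avg_pool(maps: List[Image]) -> List[float]:
    """Collapse each feature map to the mean of all its values."""
    return [sum(v for row in m for v in row) / (len(m) * len(m[0])) for m in maps]

def output_layer(pooled: List[float], weights: List[float],
                 biases: List[float]) -> List[int]:
    """One linear score per quality level; return the best-scoring level index."""
    levels = []
    for x in pooled:
        scores = [w * x + b for w, b in zip(weights, biases)]
        levels.append(scores.index(max(scores)))
    return levels

def quality_levels(batch, kernel, weights, biases) -> List[int]:
    features = [conv2d(img, kernel) for img in batch]
    return output_layer(global_avg_pool(relu(batch_norm(features))), weights, biases)

# Two 4x4 stand-in "face regions": one all zeros, one all ones.
batch = [[[0.0] * 4 for _ in range(4)], [[1.0] * 4 for _ in range(4)]]
kernel = [[1.0] * 3 for _ in range(3)]   # hand-picked, not learned
weights, biases = [-1.0, 1.0], [0.5, 0.0]  # two hypothetical quality levels
levels = quality_levels(batch, kernel, weights, biases)
```

In a trained model the kernel, normalization parameters and output weights would all be learned from the third training data set; the values here exist only to make the data flow visible.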

In a second aspect, an embodiment of the present invention further provides a face detection device, comprising:

a frame extraction module, configured to obtain a video to be detected and extract frames from the video to be detected to obtain a set of images to be detected, wherein the set of images to be detected includes at least one image to be detected;

a first face detection module, configured to input each image to be detected in the set of images to be detected into a pre-trained face detection model to obtain an output result, wherein the output result includes a face probability and location information of a face region, the face probability indicates the probability that each detection region input to the first face detection model contains a face, and the face region is a detection region whose face probability is greater than a preset probability threshold;

Optionally, the first face detection module comprises:

a first detection submodule, configured to input the image to be detected into a pre-trained first face detection model to obtain a first output result, wherein the first output result includes a first face probability and location information of a first face region, the first face probability indicates the probability that each first detection region input to the first face detection model contains a face, and the first face region is a first detection region whose first face probability is greater than a first preset probability threshold;

a second detection submodule, configured to input, based on the location information of the first face region, the first face region into a pre-trained second face detection model to obtain a second output result, wherein the second output result includes a second face probability and location information of a second face region, the second face probability indicates the probability that each second detection region input to the second face detection model contains a face, the size of the second detection region is greater than the size of the first detection region, and the second face region is a second detection region whose second face probability is greater than a second preset probability threshold.

Optionally, the face detection device further comprises:

a second face detection module, configured to input, based on the location information of the second face region, the second face region into a pre-trained third face detection model to obtain a third output result, wherein the third output result includes a third face probability, location information of a third face region and location information of facial organs, the third face probability indicates the probability that each third detection region input to the third face detection model contains a face, and the third face region is a third detection region whose third face probability is greater than a third preset probability threshold.

Optionally, the face detection device further comprises:

a first quality level detection module, configured to input, based on the location information of the second face region, the second face region into a pre-trained fourth face detection model to obtain a quality level corresponding to each second face region;

Optionally, the face detection device further comprises:

a second quality level detection module, configured to input, based on the location information of the third face region, the third face region into the pre-trained fourth face detection model to obtain a quality level corresponding to each third face region;

In a third aspect, an embodiment of the present invention further provides electronic equipment comprising a processor and a machine-readable storage medium, wherein the machine-readable storage medium stores machine-executable instructions executable by the processor, and the machine-executable instructions cause the processor to implement the steps of the face detection method provided in the first aspect.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the face detection method provided in the first aspect.

In a fifth aspect, an embodiment of the present invention further provides a computer program product containing instructions which, when run on a computer, cause the computer to execute the steps of the face detection method provided in the first aspect.

In a sixth aspect, an embodiment of the present invention further provides a computer program which, when run on a computer, causes the computer to execute the steps of the face detection method provided in the first aspect.

With the face detection method, device, electronic equipment and storage medium provided by the embodiments of the present invention, when detecting faces, a video to be detected can first be obtained and frames extracted from it to obtain a set of images to be detected. Each image to be detected in the set is then input into a pre-trained face detection model to obtain an output result that includes a face probability and location information of a face region. Because the face detection model is trained using training data that includes face images extracted from video and location information identifying the face region in each face image, the images to be detected in the video to be detected can be examined and the faces in them detected, so that the faces contained in the video to be detected are detected. Of course, a product or method implementing the present invention does not necessarily need to achieve all of the above advantages at the same time.

Brief description of the drawings

To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of a first implementation of a face detection method according to an embodiment of the present invention;

Fig. 2 is a flowchart of a second implementation of a face detection method according to an embodiment of the present invention;

Fig. 3 is a flowchart of a third implementation of a face detection method according to an embodiment of the present invention;

Fig. 4 is a flowchart of a fourth implementation of a face detection method according to an embodiment of the present invention;

Fig. 5 is a flowchart of a face detection model training method according to an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of a face detection device according to an embodiment of the present invention;

Fig. 7 is a schematic structural diagram of electronic equipment according to an embodiment of the present invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

To solve the problems of the prior art, the embodiments of the present invention provide a face detection method, device, electronic equipment and storage medium, so as to detect the faces contained in a video to be detected.

A face detection method according to an embodiment of the present invention is introduced first. Fig. 1 is a flowchart of a first implementation of the face detection method according to an embodiment of the present invention. The method may include:

S110: obtain a video to be detected and extract frames from the video to be detected to obtain a set of images to be detected.

The set of images to be detected includes at least one image to be detected.

In some examples, the video to be detected may be video of a monitored scene shot by a monitoring device, and at least one person may be present in the monitored scene. The video to be detected may therefore contain face images of at least one person. When performing face detection on video shot by a monitoring device, video frames can first be extracted from the video to be detected, and face detection is then performed on the extracted video frames.

For example, frames can be extracted from the video to be detected frame by frame to obtain every video frame of the video, and these frames form the set of images to be detected corresponding to the video. Alternatively, frames can be extracted from all the video frames of the video to be detected at intervals of a preset number of video frames, and the frames extracted in this way form the set of images to be detected corresponding to the video.

In some examples, one video frame or multiple video frames may be extracted. The set of images to be detected may therefore contain one image to be detected or multiple images to be detected, each of which is a video frame extracted from the video to be detected.
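The two extraction strategies described above (frame by frame, or at a fixed interval) can be sketched as a single function. This is a minimal illustration, not the patented implementation: the function name `extract_frames` is hypothetical, and the preset number is interpreted here as a stride, which the text does not specify.

```python
from typing import Iterable, List, TypeVar

Frame = TypeVar("Frame")

def extract_frames(frames: Iterable[Frame], interval: int = 1) -> List[Frame]:
    """Keep every `interval`-th frame, starting with the first one.

    interval=1 keeps every frame (frame-by-frame extraction);
    interval=N keeps one frame out of every N (an assumed reading of
    "at intervals of a preset number of video frames").
    """
    if interval < 1:
        raise ValueError("interval must be >= 1")
    return [frame for index, frame in enumerate(frames) if index % interval == 0]

# Stand-in "video": ten frames labelled by index.
video = [f"frame-{i}" for i in range(10)]
every_frame = extract_frames(video)          # frame-by-frame extraction
sampled = extract_frames(video, interval=4)  # one frame out of every four
```

With a real video the `frames` iterable would come from a decoder; the sampling logic itself does not depend on how frames are decoded.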

S120: input each image to be detected in the set of images to be detected into a pre-trained face detection model to obtain an output result.

After the set of images to be detected is obtained, each image to be detected in the set can be input into the pre-trained face detection model in order to detect the faces in the set. Based on each image to be detected, the pre-trained face detection model can output the location information of each face region in that image that contains a face, together with the probability that the face region contains a face.

In some examples, a face region is a detection region whose face probability is greater than a preset probability threshold.
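The threshold rule can be written as a short filter. This is a sketch only: the `Detection` type, its `(x, y, width, height)` box layout and the threshold value 0.6 are assumptions made for illustration and are not given in the text.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class Detection:
    box: Tuple[int, int, int, int]  # (x, y, width, height); layout is assumed
    face_probability: float

def select_face_regions(detections: List[Detection], threshold: float) -> List[Detection]:
    """Keep detection regions whose face probability is strictly greater than the threshold."""
    return [d for d in detections if d.face_probability > threshold]

candidates = [
    Detection((10, 20, 40, 40), 0.95),
    Detection((200, 35, 32, 32), 0.40),
    Detection((90, 60, 48, 48), 0.70),
]
faces = select_face_regions(candidates, threshold=0.6)  # keeps the 0.95 and 0.70 regions
```

The comparison is strict because the text says "greater than"; a region whose probability equals the threshold is not treated as a face region.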

After receiving an image to be detected, the pre-trained face detection model can first rescale the image to different sizes, obtaining multiple sub-images to be detected of different sizes. It can then perform preliminary convolution and filtering on these sub-images to quickly generate candidate detection regions. The candidate detection regions are then further selected and adjusted, filtering them with higher precision and generating detection regions. Finally, the detection regions whose face probability is greater than the preset probability threshold are selected and output as face regions. The pre-trained face detection model can therefore output the location information of each face region that contains a face and the probability that the face region contains a face.
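The first step, rescaling the image to several sizes, is commonly done with an image pyramid. The sketch below only computes the sizes of the rescaled copies; the scaling factor of 0.5 and the minimum side of 12 pixels are illustrative assumptions, and `pyramid_sizes` is a hypothetical helper, not something named in the text.

```python
from typing import List, Tuple

def pyramid_sizes(width: int, height: int,
                  min_side: int = 12, factor: float = 0.5) -> List[Tuple[int, int]]:
    """List the (width, height) of each rescaled copy, largest first.

    Scaling stops once the shorter side would drop below `min_side`.
    """
    if not 0.0 < factor < 1.0:
        raise ValueError("factor must be between 0 and 1")
    sizes = []
    scale = 1.0
    while min(width, height) * scale >= min_side:
        sizes.append((int(width * scale), int(height * scale)))
        scale *= factor
    return sizes

sizes = pyramid_sizes(96, 48)  # three levels: full size, half, quarter
```

Each size would then be fed to the candidate-generation stage, so that faces of different apparent sizes can be matched by a fixed-size detector.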

In other examples, the pre-trained face detection model is obtained by machine learning training using a training data set, and each group of training data in the training data set may include: a face image extracted from a surveillance video and location information identifying the face region in that face image.

That is, the training data consists of face images extracted from video together with the location information of the face region corresponding to each face image.
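One group of training data as described above could be represented as follows. This is a hypothetical layout for illustration: the field names, the `(x, y, width, height)` box convention, the file path and the validity check are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height); layout is an assumption

@dataclass
class TrainingSample:
    image_path: str        # face image extracted from a surveillance video
    face_boxes: List[Box]  # location information labelling each face region

def validate(sample: TrainingSample) -> bool:
    """A sample is usable if it has at least one box and every box has positive size."""
    return bool(sample.face_boxes) and all(
        w > 0 and h > 0 for _, _, w, h in sample.face_boxes
    )

# Hypothetical file name, used only to show the structure.
sample = TrainingSample("frames/cam01_000120.jpg", [(34, 50, 64, 64)])
```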

With the face detection method provided by the embodiments of the present invention, when detecting faces, a video to be detected can first be obtained and frames extracted from it to obtain a set of images to be detected. Each image to be detected in the set is then input into a pre-trained face detection model to obtain an output result that includes a face probability and location information of a face region. Because the face probability indicates the probability that each detection region input to the first face detection model contains a face, and a face region is a detection region whose face probability is greater than a preset probability threshold, the at least one image to be detected in the set can be examined and the faces in it detected, so that at least one face contained in the video to be detected is detected.

On the basis of the face detection method shown in Fig. 1, an embodiment of the present invention further provides a possible implementation. Fig. 2 is a flowchart of a second implementation of the face detection method according to an embodiment of the present invention. The method may include:

S210: obtain a video to be detected and extract frames from the video to be detected to obtain a set of images to be detected, wherein the set of images to be detected includes at least one image to be detected;

S220: input the image to be detected into a pre-trained first face detection model to obtain a first output result, wherein the first output result includes a first face probability and location information of a first face region, the first face probability indicates the probability that each first detection region input to the first face detection model contains a face, and the first face region is a first detection region whose first face probability is greater than a first preset probability threshold;

S230: based on the location information of the first face region, input the first face region into a pre-trained second face detection model to obtain a second output result.

The second output result includes a second face probability and location information of a second face region. The second face probability indicates the probability that each second detection region input to the second face detection model contains a face. The size of the second detection region is greater than the size of the first detection region. The second face region is a second detection region whose second face probability is greater than a second preset probability threshold.

In some examples, to improve the accuracy of face detection, the face detection model described above may include two face detection models, namely the first face detection model and the second face detection model.

After the set of images to be detected is obtained, each image to be detected in the set can first be input into the pre-trained first face detection model. The pre-trained first face detection model can examine each region of the image to be detected and output the location information of each first detection region in the image together with the probability that the first detection region contains a face. The first detection regions whose probability of containing a face is greater than the first preset threshold can then be taken as first face regions, giving the location information of each first face region and its corresponding first face probability.

In some examples, the pre-trained first face detection model is obtained by machine learning training using a first training data set, and each group of training data in the first training data set may include: a face image extracted from a surveillance video and location information identifying the face region in that face image.

After the location information of the first face region and the first face probability are obtained, the first face region can be obtained from its corresponding video frame based on that location information. The first face region can then be input into the pre-trained second face detection model, which can examine each detection region within the first face region and output the location information of each second detection region that contains a face, together with the probability that the second detection region contains a face. The second detection regions whose probability is greater than the second preset probability threshold can then be taken as second face regions, giving the location information of each second face region and the second face probability that it contains a face.

In some examples, the pre-trained second face detection model may also be obtained by machine learning training using the first training data set, and each group of training data in the first training data set may include: a face image extracted from a surveillance video and location information identifying the face region in that face image.

In some examples, because the size of the second detection region is greater than the size of the first detection region, the first face detection model can be used to detect faces in the image to be detected at a smaller size, and the second face detection model can then be used to detect faces in the image to be detected at a larger size.

Screening the detection regions with the first preset threshold ensures that the face regions input to the second face detection model have a relatively high probability of containing a face. Detecting these regions again with the second face detection model makes the detected second face regions more accurate. Finally, screening the second detection regions output by the second face detection model with the second preset threshold further raises the probability that the retained second face regions contain a face. The accuracy of face detection using this embodiment of the present invention is thereby improved.
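The two-stage screening can be sketched as a small cascade in which each stage filters by its own threshold. The models below are stubs that return fixed values, standing in for the trained networks; the function names, box layout, stub outputs and threshold values are all illustrative assumptions.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]                # (x, y, width, height); layout assumed
Scored = Tuple[Box, float]                     # (box, face probability)
Model = Callable[[object, Box], List[Scored]]  # (image, region to search) -> scored boxes

def cascade(image, full_frame: Box, first: Model, second: Model,
            first_threshold: float, second_threshold: float) -> List[Scored]:
    """Run the first model on the whole image, then the second on each surviving region."""
    first_faces = [(b, p) for b, p in first(image, full_frame) if p > first_threshold]
    second_faces: List[Scored] = []
    for box, _ in first_faces:
        second_faces += [(b, p) for b, p in second(image, box) if p > second_threshold]
    return second_faces

# Stub models standing in for the trained networks.
def stub_first(image, region):
    return [((10, 10, 12, 12), 0.8), ((50, 50, 12, 12), 0.3)]

def stub_second(image, region):
    x, y, w, h = region
    # The second detection region is larger than the first one.
    return [((x - 6, y - 6, w + 12, h + 12), 0.9)]

result = cascade(None, (0, 0, 100, 100), stub_first, stub_second, 0.6, 0.7)
```

Only the region that passed the first threshold is ever shown to the second model, which is what lets the second stage spend its effort on likely faces.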

It can be understood that step S210 in this embodiment is the same as step S110 in the first implementation of the embodiments of the present invention, and is not described again here.

On the basis of the face detection method shown in Fig. 2, an embodiment of the present invention further provides a possible implementation. Fig. 3 is a flowchart of a third implementation of the face detection method according to an embodiment of the present invention. The method may include:

S310: obtain a video to be detected and extract frames from the video to be detected to obtain a set of images to be detected, wherein the set of images to be detected includes at least one image to be detected, and each image to be detected includes at least one face;

S320: input the image to be detected into a pre-trained first face detection model to obtain a first output result, wherein the first output result includes a first face probability and location information of a first face region, the first face probability indicates the probability that each first detection region input to the first face detection model contains a face, and the first face region is a first detection region whose first face probability is greater than a first preset probability threshold;

S330: based on the location information of the first face region, input the first face region into a pre-trained second face detection model to obtain a second output result, wherein the second output result includes a second face probability and location information of a second face region, the second face probability indicates the probability that each second detection region input to the second face detection model contains a face, the size of the second detection region is greater than the size of the first detection region, and the second face region is a second detection region whose second face probability is greater than a second preset probability threshold.

S340: based on the location information of the second face region, input the second face region into a pre-trained third face detection model to obtain a third output result, wherein the third output result includes a third face probability, location information of a third face region and location information of facial organs, the third face probability indicates the probability that each third detection region input to the third face detection model contains a face, and the third face region is a third detection region whose third face probability is greater than a third preset probability threshold.

In some instances, after obtaining the second human face region, in order to more accurately obtain people in second human face region The facial characteristics of face, can location information first based on second human face region, get the second people in the image to be detected Then second human face region is input in the third face detection model that training obtains in advance by face region.

In some examples, the third face detection model is obtained by machine learning training using second training data. Each group of training data in the second training data includes: a face image extracted from surveillance video, location information identifying the face region in the face image, and location information of the facial organs.

Because every group of training data in the second training data includes location information of facial organs, the trained third face detection model can detect the facial organs of a face in the image to be detected. After the second face region is input into the pre-trained third face detection model, the model can output, for each third detection region, the probability that the region contains a face, the location information of the region, and the location information of the facial organs of the face contained in the region.

Finally, each third detection region whose probability of containing a face is greater than the third preset probability threshold may be taken as a third face region, thereby obtaining the location information of the third face region, the third face probability corresponding to the third face region, and the location information of the facial organs of the face in the third face region.
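As a non-limiting illustration, the selection of third face regions described above can be sketched as follows. The record type `ThirdDetection`, the function `select_third_face_regions`, the box layout (x, y, width, height), the organ names, and all numeric values are assumptions introduced for this sketch and are not specified in the source.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]   # (x, y, width, height): an assumed layout
Point = Tuple[int, int]


@dataclass
class ThirdDetection:
    """Hypothetical record for one third detection region."""
    box: Box
    face_probability: float
    organs: Dict[str, Point] = field(default_factory=dict)


def select_third_face_regions(detections: List[ThirdDetection],
                              threshold: float) -> List[ThirdDetection]:
    """Keep regions whose face probability is strictly greater than the threshold."""
    return [d for d in detections if d.face_probability > threshold]


# Illustrative values only; they are not taken from the source.
candidates = [
    ThirdDetection((10, 12, 48, 48), 0.97, {"left_eye": (24, 30), "nose": (34, 40)}),
    ThirdDetection((80, 15, 48, 48), 0.42),
]
third_face_regions = select_third_face_regions(candidates, threshold=0.9)
```

Each retained record carries the three items named in the third output result: the region's location information, its face probability, and the location information of the facial organs.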

With this embodiment of the present invention, not only faces but also the facial organs of a face can be detected, so that finer detection is achieved and the granularity of face detection performed with the face detection method of the embodiment of the present invention is improved.

It can be understood that steps S310 to S330 in this embodiment of the present invention are the same as steps S210 to S230 in the second implementation of the embodiment of the present invention, and are not described again here.

On the basis of the face detection method shown in Fig. 3, an embodiment of the present invention further provides a possible implementation. Fig. 4 is a flowchart of a fourth implementation of a face detection method according to an embodiment of the present invention. The method may include:

S410: Obtain a video to be detected, and perform frame extraction on the video to be detected to obtain a set of images to be detected, where the set of images to be detected includes at least one image to be detected, and each image to be detected includes at least one face;

S420: Input the image to be detected into a pre-trained first face detection model to obtain a first output result. The first output result includes a first face probability and location information of a first face region, where the first face probability indicates the probability that each first detection region input into the first face detection model contains a face, and the first face region is a first detection region whose first face probability is greater than a first preset probability threshold;

S430: Based on the location information of the first face region, input the first face region into a pre-trained second face detection model to obtain a second output result. The second output result includes a second face probability and location information of a second face region, where the second face probability indicates the probability that each second detection region input into the second face detection model contains a face; the size of the second detection region is larger than the size of the first detection region; and the second face region is a second detection region whose second face probability is greater than a second preset probability threshold.

S440: Based on the location information of the second face region, input the second face region into a pre-trained third face detection model to obtain a third output result. The third output result includes a third face probability, location information of a third face region, and location information of facial organs, where the third face probability indicates the probability that each third detection region input into the third face detection model contains a face, and the third face region is a third detection region whose third face probability is greater than a third preset probability threshold.

S450: Based on the location information of the third face region, input the third face region into a pre-trained fourth face detection model to obtain a quality level corresponding to each third face region.
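As a non-limiting illustration, the way each stage passes its retained regions on to the next stage can be sketched as follows. This is a simplification under stated assumptions: the trained models are replaced by stand-in lookup tables, the same boxes are passed between stages (the source states that the second detection region is larger than the first), and the names `filter_stage` and `run_cascade`, the box layout, and the thresholds are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]     # assumed (x, y, width, height)
Scorer = Callable[[Box], float]     # stands in for one trained model


def filter_stage(regions: List[Box], scorer: Scorer, threshold: float) -> List[Box]:
    """Score each region and keep those strictly above the stage threshold."""
    return [box for box in regions if scorer(box) > threshold]


def run_cascade(regions: List[Box],
                stages: Sequence[Tuple[Scorer, float]]) -> List[Box]:
    """Pass the regions retained by each stage on to the next stage."""
    for scorer, threshold in stages:
        regions = filter_stage(regions, scorer, threshold)
        if not regions:   # nothing left to refine
            break
    return regions


# Stand-in scorers: real models would be learned, these are lookup tables.
stage1 = {(0, 0, 12, 12): 0.8, (5, 5, 12, 12): 0.3, (9, 9, 12, 12): 0.7}.get
stage2 = {(0, 0, 12, 12): 0.9, (9, 9, 12, 12): 0.2}.get
stage3 = {(0, 0, 12, 12): 0.95}.get

faces = run_cascade(
    [(0, 0, 12, 12), (5, 5, 12, 12), (9, 9, 12, 12)],
    [(stage1, 0.5), (stage2, 0.6), (stage3, 0.7)],
)
```

Using a separate threshold per stage mirrors the first, second, and third preset probability thresholds of steps S420 to S440.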

In some examples, after the third face region is obtained, the face detection method of the embodiment of the present invention may further detect the quality level of the third face region. For example, the third face region may be obtained from the image to be detected based on the location information of the third face region, and then input into the pre-trained fourth face detection model.
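As a non-limiting illustration, obtaining a face region from the image to be detected based on its location information can be sketched as a crop. The sketch assumes that the image is a row-major list of pixel rows and that the location information is an (x, y, width, height) box; neither representation is specified in the source, and `crop_region` is a hypothetical name.

```python
from typing import List, Tuple

Image = List[List[int]]             # rows of grayscale pixel values (assumed)
Box = Tuple[int, int, int, int]     # assumed (x, y, width, height)


def crop_region(image: Image, box: Box) -> Image:
    """Return the part of the image covered by the box, clipped to the image bounds."""
    x, y, w, h = box
    height = len(image)
    width = len(image[0]) if image else 0
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(width, x + w), min(height, y + h)
    return [row[x0:x1] for row in image[y0:y1]]


# A 4 x 5 test image whose pixel value encodes its position (row * 10 + column).
image = [[r * 10 + c for c in range(5)] for r in range(4)]
face_patch = crop_region(image, (1, 1, 2, 2))
```

Clipping to the image bounds is a design choice of this sketch, so that a box extending past the edge still yields a valid (smaller) patch.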

In some examples, the fourth face detection model is obtained by machine learning training using a third training data set. Each group of training data in the third training data set includes: a face image extracted from surveillance video and a label identifying the quality level of the face image.

Because the fourth face detection model is trained using training data that includes face images and labels identifying the quality levels of those face images, the fourth face detection model can detect the quality level of each image to be detected. Therefore, after the third face region is input into the pre-trained fourth face detection model, the model can output the quality level of the third face region.

In some examples, the fourth face detection model may include: a convolutional layer, a batch normalization layer, an activation function layer, a global average pooling layer, and an output layer.

The convolutional layer is used to extract the local features of each face region;

The batch normalization layer is used to normalize the local features to obtain the normalized local features of each face region;

The activation function layer is used to apply a non-linearity to the normalized local features of each face region to obtain the non-linearized local features of each face region;

The global average pooling layer is used to pool the non-linearized local features of each face region to obtain the pooled features of each face region;

The output layer is used to obtain the quality level of each face region based on the pooled features of each face region.

When each third face region is input into the pre-trained fourth face detection model, each third face region may first be input into the convolutional layer to extract the local features of the image of that third face region. The local features of each third face region may then be input into the batch normalization layer for normalization. After the normalized local features of each third face region are obtained, they may be input into the activation function layer to obtain the non-linearized local features of each third face region. Next, the non-linearized local features of each third face region may be input into the global average pooling layer for pooling to obtain the pooled features of each third face region. Finally, the pooled features of each third face region may be input into the output layer to obtain the quality level corresponding to each third face region.
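As a non-limiting illustration, the layer sequence described above can be sketched in plain Python for a single grayscale face region. Several choices here are assumptions not stated in the source: ReLU as the activation function, batch normalization applied with fixed per-channel statistics, an output layer formed by a linear map followed by softmax, and all weights, which are arbitrary illustrative numbers rather than trained parameters.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Convolutional layer: slide the kernel over the image without padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)] for i in range(out_h)]


def batch_norm(fmap: Matrix, mean: float, var: float, eps: float = 1e-5) -> Matrix:
    """Batch normalization layer using fixed (assumed) channel statistics."""
    scale = 1.0 / math.sqrt(var + eps)
    return [[(v - mean) * scale for v in row] for row in fmap]


def relu(fmap: Matrix) -> Matrix:
    """Activation function layer (ReLU is an assumption of this sketch)."""
    return [[max(0.0, v) for v in row] for row in fmap]


def global_average_pool(fmap: Matrix) -> float:
    """Global average pooling layer: one number per feature map."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


def output_layer(features: Sequence[float], weights: Matrix,
                 biases: Sequence[float]) -> List[float]:
    """Output layer: linear map followed by softmax over quality levels."""
    logits = [sum(w * f for w, f in zip(ws, features)) + b
              for ws, b in zip(weights, biases)]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def quality_level(region: Matrix, kernels: List[Matrix],
                  bn_stats: List[Tuple[float, float]],
                  weights: Matrix, biases: Sequence[float]) -> int:
    """Run the five layers in order and return the most probable level index."""
    pooled = [global_average_pool(relu(batch_norm(conv2d_valid(region, k), m, v)))
              for k, (m, v) in zip(kernels, bn_stats)]
    probs = output_layer(pooled, weights, biases)
    return max(range(len(probs)), key=probs.__getitem__)


# Arbitrary illustrative inputs: a 4 x 4 region, two 3 x 3 kernels, three levels.
region = [[float((r + c) % 3) for c in range(4)] for r in range(4)]
kernels = [[[1.0] * 3 for _ in range(3)], [[0.0, 1.0, 0.0]] * 3]
level = quality_level(region, kernels, [(8.0, 4.0), (3.0, 1.0)],
                      [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6]], [0.0, 0.1, -0.1])
```

Because global average pooling reduces each feature map to a single value, the output layer sees a fixed-length vector regardless of the size of the face region.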

In other examples, the quality level corresponding to each third face region may also be marked on that third face region, to facilitate subsequent selection of third face regions by quality level.

With this embodiment of the present invention, the quality level of each third face region can be identified after the third face regions are obtained, which facilitates subsequent selection of third face regions by quality level and thus broadens the range of uses of the detected third face regions.
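As a non-limiting illustration, marking each third face region with its quality level and later selecting regions by that level can be sketched as follows. The record type `LabeledRegion`, the function `select_by_quality`, and the convention that a larger integer means better quality are assumptions of this sketch; the source does not define how quality levels are encoded.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LabeledRegion:
    """Hypothetical record: a face region marked with its quality level."""
    box: Tuple[int, int, int, int]   # assumed (x, y, width, height)
    quality_level: int               # assumed: larger means better quality


def select_by_quality(regions: List[LabeledRegion], min_level: int) -> List[LabeledRegion]:
    """Keep regions at or above min_level, best quality first."""
    kept = [r for r in regions if r.quality_level >= min_level]
    return sorted(kept, key=lambda r: r.quality_level, reverse=True)


# Illustrative values only.
labeled = [
    LabeledRegion((0, 0, 40, 40), 1),
    LabeledRegion((50, 10, 40, 40), 3),
    LabeledRegion((20, 60, 40, 40), 2),
]
selected = select_by_quality(labeled, min_level=2)
```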

It can be understood that steps S410 to S440 in this embodiment of the present invention are the same as steps S310 to S340 in the third implementation of the embodiment of the present invention, and are not described again here.

In other examples, after the second output result is obtained in step S230 of the second implementation of the embodiment of the present invention, a step identical or similar to step S450 may also be used to obtain the quality level corresponding to each second face region. For example, based on the location information of the second face region, the second face region is input into the pre-trained fourth face detection model to obtain a quality level corresponding to each second face region.

It can be understood that, in the embodiments of the present invention, the detection method used to obtain the quality level corresponding to each second face region is the same as or similar to that used to obtain the quality level corresponding to each third face region, and is not described again here.

In some examples, an embodiment of the present invention further provides a method for training the four face detection models described above. Fig. 5 is a flowchart of a face detection model training method according to an embodiment of the present invention. The method may include:

S510: Obtain a first training data set, a second training data set, and a third training data set from surveillance video.

In some examples, each group of training data in the first training data set may include: a face image extracted from surveillance video and location information identifying the face region in the face image. Each group of training data in the second training data set may include: a face image extracted from surveillance video, location information identifying the face region in the face image, and location information of the facial organs. Each group of training data in the third training data set includes: a face image extracted from surveillance video and a label identifying the quality level of the face image.

In some examples, the face images obtained from surveillance video may include: face images with different face sizes, face images with different resolutions, face images under different illumination, and face images with occlusion. The quality level of a face image may be a level divided in advance according to the sharpness of the image, a level divided in advance according to the brightness of the image, or a level divided in advance according to how much of the face in the image is occluded; all of these are possible.
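As a non-limiting illustration, a quality level divided in advance according to image brightness can be sketched as a lookup of the mean brightness against preset boundaries. The boundary values (64, 128, 192), the 0 to 255 grayscale range, and the function names are assumptions of this sketch; the source does not state how the levels are divided.

```python
from bisect import bisect_right
from typing import List, Sequence

Image = List[List[int]]   # rows of grayscale values in 0..255 (assumed)


def mean_brightness(gray: Image) -> float:
    """Average pixel value over the whole image."""
    pixels = [v for row in gray for v in row]
    return sum(pixels) / len(pixels)


def brightness_level(gray: Image, boundaries: Sequence[int] = (64, 128, 192)) -> int:
    """Map mean brightness to a level index using preset, ascending boundaries."""
    return bisect_right(boundaries, mean_brightness(gray))


dark = [[0, 0], [0, 0]]
medium = [[100, 100], [100, 100]]
bright = [[255, 255], [255, 255]]
levels = [brightness_level(img) for img in (dark, medium, bright)]
```

The same lookup could serve for levels based on sharpness or occlusion if the mean brightness were replaced by a corresponding measurement.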

S520: Input the first training data set into a preset first face detection model and a preset second face detection model, and train the preset first face detection model and second face detection model to obtain the pre-trained first face detection model and the pre-trained second face detection model.

S530: Input the second training data set into a preset third face detection model, and train the preset third face detection model to obtain the pre-trained third face detection model.

S540: Input the third training data set into a preset fourth face detection model, and train the preset fourth face detection model to obtain the pre-trained fourth face detection model.

After the above training data sets are obtained, different data sets may be used to train different face detection models. For example, the first training data set is used to train the preset first face detection model and second face detection model, the second training data set is used to train the preset third face detection model, and the third training data set is used to train the preset fourth face detection model. The four pre-trained face detection models described above are thereby obtained.
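As a non-limiting illustration, the assignment of training data sets to models in steps S520 to S540 can be sketched as a lookup table driving a training loop. The dictionary keys, the function `train_models`, and the placeholder training routine are hypothetical; actual training of each model is outside the scope of this sketch.

```python
from typing import Callable, Dict, List

# Which data set trains which model, following steps S520 to S540.
TRAINING_PLAN: Dict[str, str] = {
    "first_face_detection_model": "first_training_data_set",
    "second_face_detection_model": "first_training_data_set",
    "third_face_detection_model": "second_training_data_set",
    "fourth_face_detection_model": "third_training_data_set",
}


def train_models(datasets: Dict[str, List[object]],
                 train_fn: Callable[[str, List[object]], object]) -> Dict[str, object]:
    """Train every model on its assigned data set and return the trained models."""
    return {model: train_fn(model, datasets[name])
            for model, name in TRAINING_PLAN.items()}


def placeholder_train(model_name: str, samples: List[object]) -> str:
    """Stand-in for real training: only records what would have been trained."""
    return f"{model_name} trained on {len(samples)} samples"


datasets = {
    "first_training_data_set": ["img_a", "img_b"],
    "second_training_data_set": ["img_c"],
    "third_training_data_set": ["img_d", "img_e", "img_f"],
}
trained = train_models(datasets, placeholder_train)
```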

In this way, the face detection models can be trained before face detection is performed, which reduces the time overhead of the face detection process.

Corresponding to the above method embodiment, an embodiment of the present invention further provides a face detection device. Fig. 6 is a schematic structural diagram of a face detection device according to an embodiment of the present invention. The device may include:

A frame extraction module 610, configured to obtain a video to be detected and perform frame extraction on the video to be detected to obtain a set of images to be detected, where the set of images to be detected includes at least one image to be detected, and each image to be detected includes at least one face.

A first face detection module 620, configured to input each image to be detected in the set of images to be detected into a pre-trained face detection model to obtain an output result, where the output result includes a face probability and location information of a face region, the face probability indicates the probability that each detection region input into the first face detection model contains a face, and the face region is a detection region whose face probability is greater than a preset probability threshold.

When performing face detection, the face detection device provided by the embodiment of the present invention may first obtain a video to be detected and perform frame extraction on it to obtain a set of images to be detected. Each image to be detected in the set is then input into a pre-trained face detection model to obtain an output result that includes a face probability and location information of a face region. Because the face probability indicates the probability that each detection region input into the first face detection model contains a face, and the face region is a detection region whose face probability is greater than a preset probability threshold, the at least one image to be detected included in the set can be examined and the faces in it detected, so that at least one face included in the video to be detected is detected.
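As a non-limiting illustration, frame extraction can be sketched as keeping every N-th frame of the video. The source does not state how frames are chosen, so fixed-interval sampling, the function names, and the representation of the video as an in-memory list of frames are all assumptions of this sketch.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def sample_frame_indices(total_frames: int, step: int) -> List[int]:
    """Indices of the frames kept when every step-th frame is extracted."""
    if step <= 0:
        raise ValueError("step must be a positive integer")
    return list(range(0, total_frames, step))


def extract_frames(frames: Sequence[Frame], step: int) -> List[Frame]:
    """Build the set of images to be detected from a decoded frame sequence."""
    return [frames[i] for i in sample_frame_indices(len(frames), step)]


# Frames are represented by labels here; a real decoder would supply images.
video = [f"frame_{i}" for i in range(10)]
images_to_detect = extract_frames(video, step=4)
```

A larger step lowers the number of images passed to the face detection model at the cost of possibly missing faces that appear only briefly.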

In some examples, the first face detection module 620 includes:

In some examples, the face detection device may further include:

In some examples, the first face detection model and the second face detection model are obtained by machine learning training using first training data, where each group of training data in the first training data includes: a face image extracted from surveillance video and location information identifying the face region in the face image;

In some examples, the face detection device may further include:

In some examples, the fourth face detection model includes: a convolutional layer, a batch normalization layer, an activation function layer, a global average pooling layer, and an output layer;

An embodiment of the present invention further provides an electronic device. Fig. 7 is a schematic structural diagram of an electronic device to which a face region image archiving method according to an embodiment of the present invention is applied. The electronic device may include a processor 701 and a machine-readable storage medium 702. The machine-readable storage medium 702 stores machine-executable instructions executable by the processor 701, and the machine-executable instructions cause the processor 701 to implement the steps of the above face archiving method. For example, the following steps may be implemented:

Obtain a video to be detected, and perform frame extraction on the video to be detected to obtain a set of images to be detected, where the set of images to be detected includes at least one image to be detected, and each image to be detected includes at least one face;

The machine-readable storage medium 702 may include random access memory (Random Access Memory, RAM), and may also include non-volatile memory (Non-Volatile Memory, NVM), for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.

The processor 701 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), and the like; it may also be a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

When performing face detection, the electronic device provided by the embodiment of the present invention may first obtain a video to be detected and perform frame extraction on it to obtain a set of images to be detected. Each image to be detected in the set is then input into a pre-trained face detection model to obtain an output result that includes a face probability and location information of a face region. Because the face probability indicates the probability that each detection region input into the first face detection model contains a face, and the face region is a detection region whose face probability is greater than a preset probability threshold, the at least one image to be detected included in the set can be examined and the faces in it detected, so that at least one face included in the video to be detected is detected.

An embodiment of the present invention further provides a computer-readable storage medium. A computer program is stored in the computer-readable storage medium, and the steps of the above face archiving method are implemented when the computer program is executed by a processor. For example, the following steps may be implemented:

When performing face detection, the computer-readable storage medium provided by the embodiment of the present invention may first obtain a video to be detected and perform frame extraction on it to obtain a set of images to be detected. Each image to be detected in the set is then input into a pre-trained face detection model to obtain an output result that includes a face probability and location information of a face region. Because the face probability indicates the probability that each detection region input into the first face detection model contains a face, and the face region is a detection region whose face probability is greater than a preset probability threshold, the at least one image to be detected included in the set can be examined and the faces in it detected, so that at least one face included in the video to be detected is detected.

An embodiment of the present invention further provides a computer program product containing instructions which, when run on a computer, cause the computer to execute the steps of the above face archiving method. For example, the following steps may be executed:

When performing face detection, the computer program product containing instructions provided by the embodiment of the present invention may first obtain a video to be detected and perform frame extraction on it to obtain a set of images to be detected. Each image to be detected in the set is then input into a pre-trained face detection model to obtain an output result that includes a face probability and location information of a face region. Because the face probability indicates the probability that each detection region input into the first face detection model contains a face, and the face region is a detection region whose face probability is greater than a preset probability threshold, the at least one image to be detected included in the set can be examined and the faces in it detected, so that at least one face included in the video to be detected is detected.

An embodiment of the present invention further provides a computer program which, when run on a computer, causes the computer to execute the steps of the above face archiving method. For example, the following steps may be executed:

When performing face detection, the computer program provided by the embodiment of the present invention may first obtain a video to be detected and perform frame extraction on it to obtain a set of images to be detected. Each image to be detected in the set is then input into a pre-trained face detection model to obtain an output result that includes a face probability and location information of a face region. Because the face probability indicates the probability that each detection region input into the first face detection model contains a face, and the face region is a detection region whose face probability is greater than a preset probability threshold, the at least one image to be detected included in the set can be examined and the faces in it detected, so that at least one face included in the video to be detected is detected.

It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes that element.

The embodiments in this specification are described in a related manner; for the same or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the description of the method embodiment.

The above are merely preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention falls within the scope of protection of the present invention.

Claims

1. A face detection method, characterized in that the method comprises:

obtaining a video to be detected, and performing frame extraction on the video to be detected to obtain a set of images to be detected, wherein the set of images to be detected includes at least one image to be detected;

inputting each image to be detected in the set of images to be detected into a pre-trained face detection model to obtain an output result, wherein the output result includes a face probability and location information of a face region, the face probability indicates the probability that each detection region input into the first face detection model contains a face, and the face region is a detection region whose face probability is greater than a preset probability threshold;

wherein the face detection model is obtained by machine learning training using a training data set, and each group of training data in the training data set includes: a face image extracted from surveillance video and location information identifying the face region in the face image.

2. The method according to claim 1, characterized in that inputting each image to be detected in the set of images to be detected into the pre-trained face detection model to obtain the output result comprises:

inputting the image to be detected into a pre-trained first face detection model to obtain a first output result, wherein the first output result includes a first face probability and location information of a first face region, the first face probability indicates the probability that each first detection region input into the first face detection model contains a face, and the first face region is a first detection region whose first face probability is greater than a first preset probability threshold;

based on the location information of the first face region, inputting the first face region into a pre-trained second face detection model to obtain a second output result, wherein the second output result includes a second face probability and location information of a second face region, the second face probability indicates the probability that each second detection region input into the second face detection model contains a face, the size of the second detection region is larger than the size of the first detection region, and the second face region is a second detection region whose second face probability is greater than a second preset probability threshold.

3. The method according to claim 2, characterized in that, after the second output result is obtained, the method further comprises:

based on the location information of the second face region, inputting the second face region into a pre-trained third face detection model to obtain a third output result, wherein the third output result includes a third face probability, location information of a third face region, and location information of facial organs, the third face probability indicates the probability that each third detection region input into the third face detection model contains a face, and the third face region is a third detection region whose third face probability is greater than a third preset probability threshold.

4. The method according to claim 3, characterized in that the first face detection model and the second face detection model are obtained by machine learning training using first training data, wherein each group of training data in the first training data includes: a face image extracted from surveillance video and location information identifying the face region in the face image;

the third face detection model is obtained by machine learning training using second training data, wherein each group of training data in the second training data includes: a face image extracted from surveillance video, location information identifying the face region in the face image, and location information of the facial organs.

5. The method according to claim 2, characterized in that, after the second output result is obtained, the method further comprises:

based on the location information of the second face region, inputting the second face region into a pre-trained fourth face detection model to obtain a quality level corresponding to each second face region;

wherein the fourth face detection model is obtained by machine learning training using a third training data set, and each group of training data in the third training data set includes: a face image extracted from surveillance video and a label identifying the quality level of the face image.

6. The method according to claim 3, characterized in that, after the third output result is obtained, the method further comprises:

based on the location information of the third face region, inputting the third face region into a pre-trained fourth face detection model to obtain a quality level corresponding to each third face region;

7. The method according to claim 5 or 6, characterized in that the fourth face detection model includes: a convolutional layer, a batch normalization layer, an activation function layer, a global average pooling layer, and an output layer;

the convolutional layer is used to extract the local features of each face region;

the local features of each face region are input into the batch normalization layer for normalization to obtain the normalized local features of each face region;

the normalized local features of each face region are input into the activation function layer to obtain the non-linearized local features of each face region;

the non-linearized local features of each face region are input into the global average pooling layer for pooling to obtain the pooled features of each face region;

the pooled features of each face region are input into the output layer to obtain the quality level of each face region.

8. A face detection device, characterized in that the device comprises:

a frame extraction module, configured to obtain a video to be detected and perform frame extraction on the video to be detected to obtain a set of images to be detected, wherein the set of images to be detected includes at least one image to be detected;

a first face detection module, configured to input each image to be detected in the set of images to be detected into a pre-trained face detection model to obtain an output result, wherein the output result includes a face probability and location information of a face region, the face probability indicates the probability that each detection region input into the first face detection model contains a face, and the face region is a detection region whose face probability is greater than a preset probability threshold;

9. device according to claim 8, which is characterized in that first face detection module, comprising:

First detection sub-module, for described image to be detected to be inputted in the first face detection model that training obtains in advance, Obtain the first output result；The first output result includes the location information of the first face probability and the first human face region；Its In, the first face probability is for indicating that inputting the first detection zone of each of the first face detection model includes people The probability of face；First human face region is the first detection zone that the first face probability is greater than the first predetermined probabilities threshold value Domain；

a second detection sub-module, configured to input the first face region, based on the position information of the first face region, into a pre-trained second face detection model to obtain a second output result; the second output result includes a second face probability and position information of a second face region; wherein the second face probability indicates the probability that each second detection region input to the second face detection model contains a face; the size of the second detection region is greater than the size of the first detection region; the second face region is a second detection region whose second face probability is greater than a second preset probability threshold.
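The two sub-modules form a cascade: the second model only examines regions that passed the first threshold. A minimal sketch follows, assuming each model is a callable that returns `(box, probability)` pairs; the callable signatures and the name `detect_two_stage` are hypothetical and stand in for the trained models.

```python
from typing import Any, Callable, List, Tuple

Box = Tuple[int, int, int, int]          # (x, y, width, height), assumed layout
Scored = Tuple[Box, float]               # a region and its face probability
FirstModel = Callable[[Any], List[Scored]]
SecondModel = Callable[[Any, List[Box]], List[Scored]]


def detect_two_stage(image: Any,
                     first_model: FirstModel,
                     second_model: SecondModel,
                     first_threshold: float,
                     second_threshold: float) -> List[Box]:
    """Run the first model, filter, then refine the survivors with the second."""
    # First stage: score detection regions over the whole image.
    first_regions = [box for box, p in first_model(image) if p > first_threshold]
    # Second stage: re-score only the regions kept by the first stage.
    second_output = second_model(image, first_regions)
    return [box for box, p in second_output if p > second_threshold]
```

Filtering before the second stage means the second model is applied only to the candidate regions that survived the first threshold.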

10. The device according to claim 9, characterized in that the device further comprises:

a second face detection module, configured to input the second face region, based on the position information of the second face region, into a pre-trained third face detection model to obtain a third output result; the third output result includes a third face probability, position information of a third face region, and position information of facial organs; wherein the third face probability indicates the probability that each third detection region input to the third face detection model contains a face; the third face region is a third detection region whose third face probability is greater than a third preset probability threshold.

11. The device according to claim 10, characterized in that the first face detection model and the second face detection model are obtained through machine learning training using first training data; wherein each group of training data in the first training data includes: a face image extracted from a surveillance video and position information identifying the face region in the face image;
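One group of the first training data pairs a face image with the position of its face region. The sketch below shows one possible record layout with a basic sanity check; the field names, the `(x, y, width, height)` box layout, and the function `is_valid_sample` are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), assumed layout


@dataclass(frozen=True)
class TrainingSample:
    """One group of training data: an image and its labeled face regions."""
    image_path: str          # face image extracted from a surveillance video
    face_boxes: List[Box]    # position information of the face region(s)


def is_valid_sample(sample: TrainingSample,
                    image_width: int,
                    image_height: int) -> bool:
    """Check that at least one box exists and every box lies inside the image."""
    return bool(sample.face_boxes) and all(
        x >= 0 and y >= 0 and w > 0 and h > 0
        and x + w <= image_width and y + h <= image_height
        for x, y, w, h in sample.face_boxes
    )
```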

12. The device according to claim 9, characterized in that the device further comprises:

a first quality level detection module, configured to input the second face region, based on the position information of the second face region, into a pre-trained fourth face detection model to obtain a quality level corresponding to each second face region;

13. The device according to claim 10, characterized in that the device further comprises:

a second quality level detection module, configured to input the third face region, based on the position information of the third face region, into a pre-trained fourth face detection model to obtain a quality level corresponding to each third face region;
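The quality level detection modules of claims 12 and 13 apply the same idea to different inputs: each detected face region is passed to the fourth model and paired with the level it returns. A minimal sketch, assuming the model is a callable that maps one region to an integer level (the callable and the name `assign_quality_levels` are hypothetical):

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), assumed layout


def assign_quality_levels(regions: List[Box],
                          quality_model: Callable[[Box], int]
                          ) -> List[Tuple[Box, int]]:
    """Pair every face region with the quality level the model returns for it."""
    return [(region, quality_model(region)) for region in regions]
```

The same function serves both modules: it receives second face regions in claim 12 and third face regions in claim 13.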

14. The device according to claim 12 or 13, characterized in that the fourth face detection model comprises: a convolutional layer, a batch normalization layer, an activation function layer, a global average pooling layer, and an output layer;

15. An electronic device, characterized by comprising a processor and a machine-readable storage medium, wherein the machine-readable storage medium stores machine-executable instructions executable by the processor, and the machine-executable instructions cause the processor to implement the method steps of any one of claims 1 to 7.

16. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, and the computer program, when executed by a processor, implements the method steps of any one of claims 1 to 7.