CN108509894A - Face detection method and apparatus - Google Patents

Face detection method and apparatus

Info

Publication number
CN108509894A
Authority
CN
China
Prior art keywords
face
image
face frame
frame
key point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810265189.5A
Other languages
Chinese (zh)
Inventor
周舒岩
封红霞
蒲雪
钱晨
王飞
王权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sensetime Technology Development Co Ltd
Original Assignee
Beijing Sensetime Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sensetime Technology Development Co Ltd filed Critical Beijing Sensetime Technology Development Co Ltd
Priority to CN201810265189.5A priority Critical patent/CN108509894A/en
Publication of CN108509894A publication Critical patent/CN108509894A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)

Abstract

The present disclosure relates to a face detection method and apparatus. The method includes: detecting face frames in an image to be detected using at least two different face detection models; and determining a face detection result of the image to be detected according to at least one face frame detected by the at least two different face detection models. By using different face detection models, the disclosure can improve the recall rate for faces occupying different proportions of the image, thereby improving the accuracy of face detection.

Description

Face detection method and apparatus
Technical field
The present disclosure relates to the field of computer vision technology, and in particular to a face detection method and apparatus.
Background
When performing face detection on an image, the proportion of the image occupied by a face usually varies, while face detection techniques in the related art can only detect faces within a certain range of face-to-image proportions, resulting in low face detection accuracy and recall.
Summary of the invention
In view of this, the present disclosure proposes a face detection method and apparatus.
According to one aspect of the present disclosure, a face detection method is provided, including:
detecting face frames in an image to be detected using at least two different face detection models;
determining a face detection result of the image to be detected according to at least one face frame detected by the at least two different face detection models.
In a possible implementation, the at least two different face detection models correspond to different face-to-image proportions.
In a possible implementation, the at least one face frame is a plurality of face frames;
determining the face detection result of the image to be detected according to the at least one face frame detected by the at least two face detection models includes:
merging the plurality of face frames based on the positions of the plurality of face frames detected by the at least two face detection models, to obtain a face frame merging result;
determining the face detection result of the image to be detected according to the face frame merging result.
In a possible implementation, merging the plurality of face frames based on the positions of the plurality of face frames detected by the at least two face detection models to obtain a face frame merging result includes:
if the degree of overlap between a first face frame and a second face frame in the plurality of face frames is greater than or equal to a preset threshold, merging the first face frame and the second face frame to obtain a merged face frame.
In a possible implementation, merging the first face frame and the second face frame to obtain the merged face frame includes:
determining the merged face frame as one of the first face frame and the second face frame.
In a possible implementation, before detecting face frames in the image to be detected using the at least two different face detection models, the method further includes:
performing grayscale-based equalization on an image to be processed, where the grayscale-based equalization is used to enlarge the gray-level range over which the pixel values of the image to be processed are distributed;
determining the image to be detected according to the equalized image to be processed.
In a possible implementation, performing the grayscale-based equalization on the image to be processed includes:
performing histogram equalization on the image to be processed.
In a possible implementation, after determining the face detection result of the image to be detected, the method further includes:
cropping the face region in the image to be detected to obtain at least one organ image block;
extracting key point information of the organ in the at least one organ image block.
In a possible implementation, extracting the key point information of the organ in the at least one organ image block includes:
inputting the at least one organ image block into at least one neural network respectively, where different organ image blocks correspond to different neural networks;
extracting, by the at least one neural network, the key point information of the organ in each input organ image block.
According to another aspect of the present disclosure, a face detection apparatus is provided, including:
a detection module, configured to detect face frames in an image to be detected using at least two different face detection models;
a first determination module, configured to determine a face detection result of the image to be detected according to at least one face frame detected by the at least two different face detection models.
In a possible implementation, the at least two different face detection models correspond to different face-to-image proportions.
In a possible implementation, the at least one face frame is a plurality of face frames;
the first determination module includes:
a merging submodule, configured to merge the plurality of face frames based on the positions of the plurality of face frames detected by the at least two face detection models, to obtain a face frame merging result;
a determination submodule, configured to determine the face detection result of the image to be detected according to the face frame merging result.
In a possible implementation, the merging submodule is configured to:
if the degree of overlap between a first face frame and a second face frame in the plurality of face frames is greater than or equal to a preset threshold, merge the first face frame and the second face frame to obtain a merged face frame.
In a possible implementation, the merging submodule is configured to:
determine the merged face frame as one of the first face frame and the second face frame.
In a possible implementation, the apparatus further includes:
an equalization module, configured to perform grayscale-based equalization on an image to be processed, where the grayscale-based equalization is used to enlarge the gray-level range over which the pixel values of the image to be processed are distributed;
a second determination module, configured to determine the image to be detected according to the equalized image to be processed.
In a possible implementation, the equalization module is configured to:
perform histogram equalization on the image to be processed.
In a possible implementation, the apparatus further includes:
a cropping module, configured to crop the face region in the image to be detected to obtain at least one organ image block;
an extraction module, configured to extract key point information of the organ in the at least one organ image block.
In a possible implementation, the extraction module includes:
an input submodule, configured to input the at least one organ image block into at least one neural network respectively, where different organ image blocks correspond to different neural networks;
an extraction submodule, configured to extract, by the at least one neural network, the key point information of the organ in each input organ image block.
According to another aspect of the present disclosure, an electronic device is provided, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to perform the above method.
According to another aspect of the present disclosure, a computer-readable storage medium is provided, on which computer program instructions are stored, wherein the computer program instructions, when executed by a processor, implement the above method.
The face detection method and apparatus of the aspects of the present disclosure detect face frames in an image using at least two different face detection models, and determine the face detection result of the image to be detected according to at least one face frame detected by the at least two different face detection models. The different face detection models can thereby improve the recall rate for faces of different face-to-image proportions, which in turn improves the accuracy of face detection.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features and aspects of the present disclosure together with the specification, and serve to explain the principles of the present disclosure.
Fig. 1 shows a flowchart of a face detection method according to an embodiment of the present disclosure.
Fig. 2 shows an exemplary flowchart of step S12 in the face detection method according to an embodiment of the present disclosure.
Fig. 3 shows an exemplary flowchart of a face detection method according to an embodiment of the present disclosure.
Fig. 4 shows another exemplary flowchart of a face detection method according to an embodiment of the present disclosure.
Fig. 5 shows an exemplary flowchart of step S14 in the face detection method according to an embodiment of the present disclosure.
Fig. 6 shows a block diagram of a face detection apparatus according to an embodiment of the present disclosure.
Fig. 7 shows an exemplary block diagram of a face detection apparatus according to an embodiment of the present disclosure.
Fig. 8 is a block diagram of an apparatus 800 for face detection according to an exemplary embodiment.
Fig. 9 is a block diagram of an apparatus 1900 for face detection according to an exemplary embodiment.
Detailed description
Various exemplary embodiments, features and aspects of the present disclosure are described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings indicate elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless specifically noted.
The word "exemplary" is used herein to mean "serving as an example, embodiment or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
In addition, numerous specific details are given in the following detailed description in order to better illustrate the present disclosure. Those skilled in the art will understand that the present disclosure may equally be practiced without certain specific details. In some instances, methods, means, elements and circuits well known to those skilled in the art are not described in detail in order to highlight the gist of the present disclosure.
Fig. 1 shows a flowchart of a face detection method according to an embodiment of the present disclosure. As shown in Fig. 1, the method includes step S11 and step S12.
In step S11, face frames in an image to be detected are detected using at least two different face detection models.
The image to be detected in this embodiment may be a static image such as a picture or a photo, or a video frame in a dynamic video. The face in the image to be detected may be a frontal face, or a side face with a certain angular deflection. The embodiments of the present disclosure do not limit the specific form of the image to be detected.
In the embodiments of the present disclosure, different face detection models may have different network parameters. In a possible implementation, the at least two different face detection models correspond to different face-to-image proportions. For example, different face detection models may be trained with images of different face-to-image proportions, or the images to which different face detection models are best suited (for example, on which their recall rate is highest) may have different face-to-image proportions, and so on, where the face-to-image proportion may refer to the proportion of the whole image occupied by the face region; however, the embodiments of the present disclosure are not limited to this.
As an example of this implementation, different face detection models may perform face detection on image regions of different sizes in the image to be detected. For example, face frames in the image to be detected may be detected by a first face detection model and a second face detection model, where the first face detection model performs face detection on image regions of 24×24 pixels in the image to be detected, and the second face detection model performs face detection on image regions of 64×64 pixels in the image to be detected. As another example, face frames in the image may be detected by a first face detection model, a second face detection model and a third face detection model. The embodiments of the present disclosure do not limit the number of face detection models, or the specific network parameters and input image sizes corresponding to the multiple face detection models.
As another example of this implementation, different face detection models may correspond to input images of different sizes. In this case, size adjustment such as scaling or sampling may be performed on the image to be detected to obtain the input image corresponding to each face detection model. For example, face frames in the image to be detected may be detected by a first face detection model, a second face detection model and a third face detection model, where the input image of the first face detection model is obtained by scaling the image to be detected down by half, the input image of the second face detection model is the image to be detected itself, and the input image of the third face detection model is obtained by scaling the image to be detected up by a factor of two; however, the embodiments of the present disclosure are not limited to this.
In step S12, the face detection result of the image to be detected is determined according to at least one face frame detected by the at least two different face detection models.
For example, if face frame B1 is detected by the first face detection model and no face frame is detected by the second face detection model, the face detection result of the image to be detected includes face frame B1. As another example, if face frames B1 and B2 are detected by the first face detection model and face frames B3, B4 and B5 are detected by the second face detection model, the face detection result of the image to be detected may be determined according to face frames B1, B2, B3, B4 and B5; for example, it may be determined that the face detection result of the image to be detected includes face frames B1, B2, B3, B4 and B5, but the embodiments of the present disclosure are not limited to this.
This embodiment detects face frames in an image using at least two different face detection models and determines the face detection result of the image to be detected according to at least one face frame detected by the at least two different face detection models. The different face detection models can thereby improve the recall rate for faces in images with various face-to-image proportions, which in turn improves the accuracy of face detection.
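As a minimal sketch of steps S11 and S12, the snippet below assumes two hypothetical detector objects that each expose a detect() method returning face frames as (x1, y1, x2, y2) tuples; it only illustrates pooling the frames of several models and is not the patent's implementation.

```python
def detect_faces(image, models):
    """Run every face detection model on the image and pool the face frames.

    models: an iterable of detector objects (assumed interface: detect(image)
    returns a list of (x1, y1, x2, y2) face frames), e.g. one model tuned to
    small face-to-image proportions and one tuned to large proportions.
    """
    all_frames = []
    for model in models:
        all_frames.extend(model.detect(image))  # frames from one model
    # Without merging, the detection result is simply the pooled frames
    # (e.g. B1, B2 from the first model plus B3, B4, B5 from the second).
    return all_frames
```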
In a possible implementation, the at least one face frame is a plurality of face frames. Fig. 2 shows an exemplary flowchart of step S12 in the face detection method according to an embodiment of the present disclosure. As shown in Fig. 2, step S12 may include step S121 and step S122.
In step S121, the plurality of face frames detected by the at least two face detection models are merged based on their positions, to obtain a face frame merging result.
In a possible implementation, merging the plurality of face frames based on the positions of the plurality of face frames detected by the at least two face detection models to obtain a face frame merging result includes: if the degree of overlap between a first face frame and a second face frame in the plurality of face frames is greater than or equal to a preset threshold, merging the first face frame and the second face frame to obtain a merged face frame.
For example, face frames B1 and B2 are detected by the first face detection model, and face frames B3, B4 and B5 are detected by the second face detection model. If the degree of overlap between face frame B2 and face frame B4 is greater than or equal to the preset threshold, face frames B2 and B4 may be merged into face frame B2', and face frame B2' may be taken as the merging result of face frames B2 and B4, that is, face frame B2' may be taken as the merged face frame of face frames B2 and B4.
As an example of this implementation, merging the first face frame and the second face frame to obtain a merged face frame includes: determining the merged face frame as one of the first face frame and the second face frame.
For example, if the degree of overlap between face frames B2 and B4 is greater than or equal to the preset threshold, the merged face frame of face frames B2 and B4 may be determined as face frame B2 or face frame B4.
As another example of this implementation, merging the first face frame and the second face frame to obtain a merged face frame includes: determining the vertex coordinates of the merged face frame according to the vertex coordinates of the first face frame and the vertex coordinates of the second face frame.
For example, the midpoints of the corresponding vertex coordinates of the first face frame and the second face frame may be determined as the vertex coordinates of the merged face frame. In this example, if the degree of overlap between the first face frame and the second face frame is greater than or equal to the preset threshold, the midpoint of the top-left vertex coordinates of the first face frame and the second face frame may be taken as the top-left vertex coordinate of the merged face frame, the midpoint of their top-right vertex coordinates as the top-right vertex coordinate of the merged face frame, the midpoint of their bottom-left vertex coordinates as the bottom-left vertex coordinate of the merged face frame, and the midpoint of their bottom-right vertex coordinates as the bottom-right vertex coordinate of the merged face frame, thereby determining the size and position of the merged face frame. Here, the merged face frame refers to the face frame obtained by merging the first face frame and the second face frame. For example, if the degree of overlap between face frames B2 and B4 is greater than or equal to the preset threshold, the midpoint of the top-left vertex coordinates of B2 and B4 may be taken as the top-left vertex coordinate of merged face frame B2', the midpoint of their top-right vertex coordinates as its top-right vertex coordinate, the midpoint of their bottom-left vertex coordinates as its bottom-left vertex coordinate, and the midpoint of their bottom-right vertex coordinates as its bottom-right vertex coordinate, thereby determining the size and position of merged face frame B2'.
As another example, the vertex coordinates of the merged face frame may be determined according to the vertex coordinates of the first face frame, the vertex coordinates of the second face frame, the probability that the first face frame contains a face, and the probability that the second face frame contains a face. In this example, the vertex coordinates of the merged face frame may be determined by a formula in terms of the following quantities,
where p1 denotes the probability that the first face frame contains a face, p2 denotes the probability that the second face frame contains a face, (x1LT, y1LT), (x1RT, y1RT), (x1LB, y1LB) and (x1RB, y1RB) denote the top-left, top-right, bottom-left and bottom-right vertex coordinates of the first face frame, (x2LT, y2LT), (x2RT, y2RT), (x2LB, y2LB) and (x2RB, y2RB) denote the corresponding vertex coordinates of the second face frame, and (x3LT, y3LT), (x3RT, y3RT), (x3LB, y3LB) and (x3RB, y3RB) denote the corresponding vertex coordinates of the merged face frame.
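The formula itself is not recoverable from this text. Purely as an illustration, the sketch below assumes a probability-weighted average of corresponding vertices, which is consistent with the variable definitions above but should not be read as the patent's exact formula.

```python
def merge_by_probability(frame1, p1, frame2, p2):
    """Assumed probability-weighted vertex merge (illustrative only).

    frame1, frame2: dicts mapping 'LT', 'RT', 'LB', 'RB' to (x, y) vertices.
    p1, p2: probabilities that each frame contains a face.
    """
    merged = {}
    for corner in ('LT', 'RT', 'LB', 'RB'):
        x1, y1 = frame1[corner]
        x2, y2 = frame2[corner]
        # Weighted average of corresponding vertices (assumption).
        merged[corner] = ((p1 * x1 + p2 * x2) / (p1 + p2),
                          (p1 * y1 + p2 * y2) / (p1 + p2))
    return merged
```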
As an example of this implementation, when the distance between the geometric centers of two face frames is less than a first threshold, it may be determined that the degree of overlap between the two face frames is greater than or equal to the preset threshold.
As another example of this implementation, when the overlapping area of two face frames is greater than a second threshold, it may be determined that the degree of overlap between the two face frames is greater than or equal to the preset threshold.
As another example of this implementation, when the proportion of the area of the second face frame that is covered by the first face frame is greater than a third threshold, it may be determined that the degree of overlap between the two face frames is greater than or equal to the preset threshold.
Here, the first face frame refers to one of the two face frames and the second face frame refers to the other of the two face frames; "first" and "second" are used merely for convenience of description and reference, and do not mean that a corresponding "first face frame" and "second face frame" necessarily exist in a specific implementation of the present disclosure.
It should be noted that although several ways of determining that the degree of overlap between two face frames is greater than or equal to the preset threshold have been described above, those skilled in the art will understand that the present disclosure is not limited to them. Those skilled in the art may flexibly choose how to determine that the degree of overlap between two face frames is greater than or equal to the preset threshold according to the requirements of the actual application scenario and/or personal preference.
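The three example overlap criteria above can be expressed directly in code. The sketch below is an illustrative formulation with made-up threshold parameter names, not the patent's implementation.

```python
def frames_overlap(f1, f2, center_dist_thresh, area_thresh, cover_ratio_thresh):
    """Return True if two frames (x1, y1, x2, y2) are considered overlapping
    under any of the three example criteria described above."""
    # Criterion 1: distance between geometric centers below a first threshold.
    c1 = ((f1[0] + f1[2]) / 2, (f1[1] + f1[3]) / 2)
    c2 = ((f2[0] + f2[2]) / 2, (f2[1] + f2[3]) / 2)
    center_dist = ((c1[0] - c2[0]) ** 2 + (c1[1] - c2[1]) ** 2) ** 0.5

    # Intersection rectangle, used by criteria 2 and 3.
    ix = max(0.0, min(f1[2], f2[2]) - max(f1[0], f2[0]))
    iy = max(0.0, min(f1[3], f2[3]) - max(f1[1], f2[1]))
    inter_area = ix * iy

    # Criterion 3: fraction of the second frame covered by the first frame.
    area2 = (f2[2] - f2[0]) * (f2[3] - f2[1])
    cover_ratio = inter_area / area2 if area2 > 0 else 0.0

    return (center_dist < center_dist_thresh
            or inter_area > area_thresh
            or cover_ratio > cover_ratio_thresh)
```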
In step S122, the face detection result of the image to be detected is determined according to the face frame merging result.
For example, if the degree of overlap between any two of face frames B1, B3 and B5 is less than the preset threshold, and the merging result of face frames B2 and B4 is face frame B2', then it may be determined that the face detection result of the image to be detected includes face frames B1, B2', B3 and B5.
Fig. 3 shows an exemplary flowchart of a face detection method according to an embodiment of the present disclosure. As shown in Fig. 3, the method may include steps S31 to S34.
In step S31, grayscale-based equalization is performed on an image to be processed, where the grayscale-based equalization is used to enlarge the gray-level range over which the pixel values of the image to be processed are distributed.
The image to be processed in this embodiment may be a grayscale image or a non-grayscale image. The non-grayscale image may be an RGB (red-green-blue) image or the like. Of course, the non-grayscale image may also be an image in a form other than RGB, such as a YUV (luminance-chrominance) image. This embodiment does not limit the specific form of the image to be processed.
In practical applications, factors such as dim light, backlight, overexposure or other complex illumination often degrade the quality of the image to be processed, and the presence of these factors is therefore generally unfavorable to face detection. For example, in an image to be processed under a dim-light scene (including a partially dim-light scene), the pixel values (such as the pixel values of a grayscale image) in at least part of the image are usually concentrated in a range of lower values, so that the texture gradient in at least part of the image to be processed is small and the features of the target object (such as a face) become blurred; as a result, when image processing is performed on such an image, it often happens that no detection result can be obtained or the accuracy of the obtained detection result is low. As another example, since the overall lighting in an overexposure scene is very bright, the background lighting in a backlight scene is very bright, and the lighting in a complex-illumination scene is highly varied, at least some features such as the contour and detail texture of the target object in the image to be processed are often blurred; therefore, when image processing is performed directly on an image captured under overexposure, backlight or complex illumination, it often happens that no detection result can be obtained or the accuracy of the detection result is low. By performing grayscale-based equalization on the image to be processed, this embodiment can bring indicators such as the contrast and/or brightness of the image to a more reasonable level, so that the influence of dim light, overexposure, backlight, complex illumination and similar factors on the image quality can be avoided to a certain extent, which in turn helps to improve the accuracy of the face detection result.
In this embodiment, the grayscale-based equalization may be performed on the entire region of the image to be processed, or on a partial region of the image to be processed; for example, the grayscale-based equalization may be performed on the regions of the image to be processed other than its outer edge.
In step S32, the image to be detected is determined according to the equalized image to be processed.
For example, the equalized image to be processed may be determined as the image to be detected.
In step S33, face frames in the image to be detected are detected using at least two different face detection models.
For step S33, see the description of step S11 above.
In step S34, the face detection result of the image to be detected is determined according to at least one face frame detected by the at least two different face detection models.
For step S34, see the description of step S12 above.
In a possible implementation, performing the grayscale-based equalization on the image to be processed includes: performing histogram equalization on the image to be processed.
As an example of this implementation, performing the grayscale-based equalization on the image to be processed includes: if the image to be processed is a grayscale image, performing histogram equalization on the image to be processed.
In this example, if the image to be processed is a grayscale image, then for an image captured under a dim-light or overexposure scene, after equalization the pixel values of the image are no longer confined to a small range (such as a relatively low or relatively high pixel-value range) but are spread proportionally across a wide range of gray levels (such as the range 0-255) and distributed evenly, so that the equalized image usually has contrast, brightness and similar indicators that are closer to those of a normal image. For example, compared with the original image captured under a dim-light scene, the equalized image usually has higher contrast and brightness; compared with the original image captured under an overexposure scene, the equalized image usually has lower contrast and brightness.
In this example, if the image to be processed is a grayscale image, then for an image captured under a backlight or complex-illumination scene, the pixel values of the excessively dark and excessively bright regions of the image are no longer confined to a small range but are spread proportionally across a wide range of gray levels (such as the range 0-255) and distributed evenly, so that after equalization the excessively dark or excessively bright regions of the image have contrast and brightness closer to those of a normal image.
Here, a normal image may refer to an image of good quality, for example an image in which the facial contour and the facial texture details are clearly visible.
It can be seen from the above that the image to be detected determined in this embodiment can be clearer, and features such as the contour and detail texture of the face can be more distinct, which facilitates face detection and in turn helps to improve the accuracy of the face detection result.
In addition to histogram equalization, this embodiment may also use other approaches to perform the grayscale-based equalization on the image to be processed; for example, the grayscale-based equalization may be performed by directly adjusting indicators such as contrast and/or brightness.
In a possible implementation, when the image to be processed is an RGB image, the image to be processed may first be converted into a YUV image, histogram equalization may then be performed on the Y-channel pixel values of the YUV image, and the equalized YUV image may then be converted back into an RGB image; the converted RGB image is the equalized image to be processed in this embodiment. Since the Y channel represents luminance, i.e. the gray value, this implementation, by performing histogram equalization on the Y-channel pixel values of the YUV image, gives the equalized YUV image contrast, brightness and similar indicators that are closer to those of a normal image in terms of grayscale, and after the equalized YUV image is converted back into an RGB image, the overall contrast and brightness of the converted RGB image are likewise closer to those of a normal image. It follows that the equalized RGB image can be clearer and features such as the facial contour and texture details can be more distinct, which facilitates face detection and in turn helps to improve the accuracy of the face detection result.
In a possible implementation, when the image to be processed is a non-grayscale image in a form other than RGB, the image to be processed may likewise first be converted into a YUV image (for example converted directly into a YUV image, or first converted into an RGB image and then from RGB into a YUV image), histogram equalization may then be performed on the Y-channel pixel values of the YUV image, and the equalized YUV image may then be converted back into a non-grayscale image of the corresponding form, so as to obtain the equalized image to be processed. The process of performing grayscale-based equalization on non-grayscale images in forms other than RGB is not enumerated here.
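A minimal sketch of the grayscale-based equalization using OpenCV, assuming cv2 and a NumPy image array are available: a grayscale input is histogram-equalized directly, while a colour input (assumed BGR here) is equalized on the Y channel of its YUV conversion, as described above.

```python
import cv2

def equalize(image):
    """Grayscale-based equalization sketch (assumes an OpenCV/NumPy image)."""
    if image.ndim == 2:
        # Grayscale image: histogram-equalize the pixel values directly.
        return cv2.equalizeHist(image)
    # Colour image (assumed BGR): equalize only the luminance (Y) channel.
    yuv = cv2.cvtColor(image, cv2.COLOR_BGR2YUV)
    yuv[:, :, 0] = cv2.equalizeHist(yuv[:, :, 0])
    return cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR)
```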
In a possible implementation, when the image to be processed is an RGB image, the image to be processed may first be converted into a grayscale image, and histogram equalization may then be performed on the converted grayscale image; the grayscale image after histogram equalization is the equalized image to be processed in this embodiment. Since the image to be processed is converted into a grayscale image, and the grayscale image corresponding to the equalized image to be processed can be clearer with more distinct facial contour and detail texture, this embodiment facilitates face detection and in turn helps to improve the accuracy of the face detection result.
In a possible implementation, when the image to be processed is a non-grayscale image in a form other than RGB, the image to be processed may likewise first be converted into a grayscale image and histogram equalization performed on the converted grayscale image, so as to obtain the equalized image to be processed. The process of performing grayscale-based equalization on non-grayscale images in forms other than RGB is not enumerated here.
In a possible implementation, before performing the grayscale-based equalization on the image to be processed, this embodiment may first judge whether the gray-level range over which the pixel values of the image to be processed are distributed is smaller than a fourth threshold. If it is smaller than the fourth threshold, the operation of performing the grayscale-based equalization on the image to be processed is executed; if it is not smaller than the fourth threshold, the operation of performing the grayscale-based equalization on the image to be processed need not be executed. The above gray-level range can reflect whether conditions such as dim light, backlight, overexposure and complex illumination are present in the image to be processed. Of course, this embodiment may also skip the above judgment and perform the grayscale-based equalization on the image to be processed directly.
In a possible implementation, the gray-level range over which the pixel values of the image to be processed are distributed may be determined as follows: judge whether the number of pixels corresponding to a given pixel value in the image to be processed (for example, the gray value of a pixel, or the average of a pixel's R-channel, G-channel and B-channel values) reaches a fifth threshold; if it reaches the fifth threshold, the pixel value is included in the gray-level range, and if it does not reach the fifth threshold, the pixel value is not included in the gray-level range.
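The range check above can be sketched as follows for a grayscale NumPy array; the parameter names count_threshold (playing the role of the fifth threshold) and range_threshold (the fourth threshold) are illustrative, and measuring the range as the span of occupied gray levels is one possible reading.

```python
import numpy as np

def needs_equalization(gray, count_threshold, range_threshold):
    """Decide whether to equalize, per the fourth/fifth-threshold check above."""
    hist = np.bincount(gray.ravel(), minlength=256)
    # A gray level counts toward the distributed range only if enough pixels
    # take that value (the fifth threshold).
    occupied_levels = np.flatnonzero(hist >= count_threshold)
    if occupied_levels.size == 0:
        return True
    spread = occupied_levels.max() - occupied_levels.min()
    # Equalize only when the occupied gray-level range is narrow
    # (smaller than the fourth threshold).
    return spread < range_threshold
```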
Fig. 4 shows another exemplary flowchart of a face detection method according to an embodiment of the present disclosure. As shown in Fig. 4, the method may include steps S11 to S14.
In step S11, face frames in an image to be detected are detected using at least two different face detection models.
In step S12, the face detection result of the image to be detected is determined according to at least one face frame detected by the at least two different face detection models.
In step S13, the face region in the image to be detected is cropped to obtain at least one organ image block.
In a possible implementation, one, two or more organ image blocks may be cropped from the face region of the image to be detected. The cropped organ image blocks may include one or more of an eye image block, an eyebrow image block and a mouth image block. The eye image block may be a left-eye image block, a right-eye image block or a two-eye image block; the eyebrow image block may be a left-eyebrow image block, a right-eyebrow image block or a two-eyebrow image block; and the mouth image block may be an upper-lip image block, a lower-lip image block or a two-lip image block (an image block containing both the upper lip and the lower lip).
In a possible implementation, the face region in the image to be detected may be cropped according to initial face key point information of the image, to obtain at least one organ image block of the face region in the image to be detected. For example, a left-eye region, a right-eye region or a two-eye region may be determined according to the initial eye key points in the initial face key point information of the image to be detected, and the face region of the image to be detected may be cropped according to the left-eye region, the right-eye region or the two-eye region to obtain an eye image block. As another example, a left-eyebrow region, a right-eyebrow region or a two-eyebrow region may be determined according to the initial eyebrow key points in the initial face key point information of the image to be detected, and the face region may be cropped accordingly to obtain an eyebrow image block. As another example, an upper-lip region, a lower-lip region or a two-lip region may be determined according to the initial mouth key points in the initial face key point information of the image to be detected, and the face region may be cropped accordingly to obtain a mouth image block. In this implementation, the cropped organ image blocks may be enlarged or reduced so that the adjusted organ image blocks have a predetermined size, where the predetermined size may be determined according to the neural network's requirements on its input image blocks.
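A sketch of cropping an organ image block around its initial key points and resizing it to the predetermined size; the 20% margin and the default output size are illustrative assumptions, and cv2.resize is only used to match the downstream network's expected input size.

```python
import cv2
import numpy as np

def crop_organ_block(image, organ_points, out_size=(48, 48), margin=0.2):
    """Crop a rectangle around one organ's initial key points and resize it.

    organ_points: array of (x, y) initial key points of a single organ,
    e.g. the initial eye key points when cropping an eye image block.
    """
    pts = np.asarray(organ_points, dtype=np.float32)
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    # Expand the tight bounding box by an illustrative margin.
    w, h = x_max - x_min, y_max - y_min
    x_min = max(0, int(x_min - margin * w))
    y_min = max(0, int(y_min - margin * h))
    x_max = min(image.shape[1], int(x_max + margin * w))
    y_max = min(image.shape[0], int(y_max + margin * h))
    block = image[y_min:y_max, x_min:x_max]
    # Resize to the predetermined size required by the organ's neural network.
    return cv2.resize(block, out_size)
```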
In a possible implementation, a left-eye image block, a right-eye image block and a mouth image block may be cropped from the face region of the image to be detected.
In a possible implementation, the initial face key point information of the image may include the number (index) information of each initial face key point and the coordinate information of each initial face key point in the image to be detected. The number of initial face key points included in the initial face key point information may be less than or equal to a certain set value; for example, the initial face key point information may include 21, 68 or 106 initial face key points.
In a possible implementation, a neural network may be used to extract the initial face key points of the image to obtain the initial face key point information. The neural network may be a neural network from the related art, and may include a face detection deep neural network for detecting the face position and a face key point deep neural network for detecting face key points. In this implementation, the image may first be input into the face detection deep neural network, which outputs the face position information (such as the face bounding box information) of the image to be detected; the image to be detected and the face position information may then be input into the face key point deep neural network, which determines the region of the image to be detected that needs to be examined according to the face position information and performs face key point detection on the image in that region, so that the face key point deep neural network can output the initial face key point information for the image to be detected.
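The two-stage flow described above (a face detection network followed by a face key point network) can be sketched as below; face_det_net and keypoint_net are hypothetical objects with assumed detect/predict methods, not an actual library API.

```python
def initial_face_keypoints(image, face_det_net, keypoint_net):
    """Two-stage sketch: detect the face position, then locate key points."""
    face_boxes = face_det_net.detect(image)          # e.g. face bounding boxes
    keypoints_per_face = []
    for box in face_boxes:
        # The key point network only examines the region given by the box.
        keypoints_per_face.append(keypoint_net.predict(image, box))
    return keypoints_per_face
```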
In step S14, the key point information of the organ in the at least one organ image block is extracted.
In a possible implementation, one, two or more neural networks may be provided, and each neural network is used to extract the corresponding organ key point information from the organ image block input into it.
Fig. 5 shows an exemplary flowchart of step S14 in the face detection method according to an embodiment of the present disclosure. As shown in Fig. 5, step S14 may include step S141 and step S142.
In step S141, the at least one organ image block is input into at least one neural network respectively, where different organ image blocks correspond to different neural networks.
In a possible implementation, organ image blocks of different categories correspond to different neural networks. That is, if the categories of the organs in two organ image blocks differ (for example, the organ in one image block is an eye and the organ in the other image block is a mouth), the two organ image blocks are provided to two different neural networks; if the categories of the organs in two organ image blocks are the same (for example, the organ in one image block is the left eye and the organ in the other image block is the right eye), the two organ image blocks may be provided to the same neural network. The neural network may be one that has been pre-trained for the key point localization task of a face organ in a supervised, semi-supervised or unsupervised manner; this embodiment does not limit the specific training method. For example, the neural network may be pre-trained in a supervised manner, such as with annotated data of face organs. The network structure of the neural network can be designed flexibly according to the needs of the key point localization task, and this embodiment does not limit it; for example, the neural network may include, but is not limited to, convolutional layers, ReLU (rectified linear unit) layers, pooling layers and fully connected layers, and the more layers the network has, the deeper it is. As another example, the network structure of the neural network may adopt, but is not limited to, structures such as AlexNet, Deep Residual Network (ResNet) or VGGNet (Visual Geometry Group Network).
In a possible implementation, a neural network for eye key point localization may be provided for the eye image block. This neural network for eye key point localization may be pre-trained for the eye key point localization task in a supervised, semi-supervised or unsupervised manner; for example, annotated data related to eye key point information and/or eyelid lines may be used to pre-train it. A neural network for lip line localization may be provided for the mouth image block. The neural network for lip line localization may be pre-trained for the lip key point localization task in a supervised, semi-supervised or unsupervised manner; for example, annotated data related to lip lines may be used to pre-train it. For example, the left-eye image block and the right-eye image block cropped from the face region of the image may each be input into the neural network for eye key point localization, and the mouth image block cropped from the face region of the image may be input into the neural network for lip line localization.
It will be appreciated that this embodiment may also crop image blocks of other face organs, such as a left-eyebrow image block, a right-eyebrow image block or a nose image block, from the face region of the image, and may extract key point information from these image blocks using neural networks pre-trained with eyebrow annotation data or nose annotation data respectively, which is not described in detail here.
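A sketch of steps S141 and S142 that routes each organ image block to the network for its organ category; the category names and the predict() interface are assumptions made for illustration.

```python
def extract_organ_keypoints(organ_blocks, organ_networks):
    """organ_blocks: dict mapping a category name to an image block,
    e.g. {'left_eye': ..., 'right_eye': ..., 'mouth': ...}.
    organ_networks: dict mapping a category name to a key point network;
    both eye blocks may map to the same eye network, while the mouth block
    maps to the lip-line network.
    """
    keypoints = {}
    for category, block in organ_blocks.items():
        net = organ_networks[category]        # routing by organ category
        keypoints[category] = net.predict(block)
    return keypoints
```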
In step S142, the key point information of the organ in each input organ image block is extracted by the at least one neural network.
In a possible implementation, the organ key point information extracted from the eye image block by the neural network for eye key point localization may include one or both of eye key point information and eyelid line information. The eye key point information may be key point information such as the eye corners and the eye center. The eyelid line information may include trajectory information represented by multiple key points of the eyelid. It should be noted that the trajectory information in this embodiment may be a trajectory formed by connecting multiple key points, or a fitted line obtained by fitting multiple key points. For example, the eyelid line information may include trajectory information represented by 10-15 key points on the upper eyelid or on the lower eyelid of a single eye, and the eyelid line information may include upper eyelid line information and/or lower eyelid line information.
In practical applications, a larger number of key points does not necessarily mean better system performance. For example, more key points help to improve the accuracy of describing the eye shape to a certain extent, but also bring a larger computational cost and reduce the running speed. After weighing the efficiency of the neural network for eye key point localization against the accuracy of describing the eye shape, the trajectory information represented by 10-15 key points on the upper eyelid or on the lower eyelid of a single eye can be used as the eyelid line information. The accuracy of the eye shape described by this eyelid line information can meet the needs of current applications that have precise requirements on eye shape, and also facilitates detecting eye states such as the eye-open state and the eye-closed state.
It should be noted that, in this embodiment, the junction of the upper eyelid line and the lower eyelid line at the inner eye corner may be the inner-eye-corner key point, and the junction of the upper eyelid line and the lower eyelid line at the outer eye corner may be the outer-eye-corner key point. The inner-eye-corner key point may be counted as a key point of the upper eyelid line or as a key point of the lower eyelid line; likewise, the outer-eye-corner key point may be counted as a key point of the upper eyelid line or of the lower eyelid line. Of course, the inner-eye-corner key point and the outer-eye-corner key point may also belong to neither the upper eyelid line nor the lower eyelid line and exist independently. In addition, the number of organ key points extracted from the left-eye image block by the neural network for eye key point localization may be the same as or different from the number of organ key points extracted from the right-eye image block.
In this embodiment, the number of key points included in the eyelid line information extracted by the neural network for eye key point localization can be greater than the number of key points at the eye position included in the initial face key point information. From the eyelid line key points extracted by the neural network for eye key point localization, a curve A representing the eye shape can be fitted, such as an upper eyelid line curve or a lower eyelid line curve; from the key points at the eye position in the initial face key point information (for example, among 106 initial face key points), a curve B representing the eyelid can be fitted. Actual computation shows that the error between curve A and the curve of the actual eyelid line shape is 1/5 to 1/10 of the error between curve B and the curve of the actual eyelid line shape. It follows that, by extracting the eyelid line information separately for the eye image block, this embodiment can effectively improve the accuracy with which the extracted key point information describes the eye shape. In general, the eyes change noticeably as a person's face changes (for example, with expression changes). The technical solution of this embodiment therefore helps to improve the precision and accuracy of eye key point extraction and the accuracy of subsequent applications based on this key point information. For example, when determining a person's facial expression, the eyelid line information can be used as an important reference factor, which helps to improve the accuracy of expression determination; when rendering an image, rendering information such as stickers can be drawn on the eyes of the image by computer drawing based on the eyelid lines, improving the accuracy with which the rendering information is drawn; and beautification or makeup processing can be applied to the image based on the eyelid lines, improving the beautification or makeup effect.
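To illustrate the curve-fitting comparison above, a simple polynomial fit through the extracted eyelid key points can be used; the quadratic degree is an illustrative choice, not taken from the patent.

```python
import numpy as np

def fit_eyelid_curve(eyelid_points, degree=2):
    """Fit a curve through eyelid key points (e.g. the 10-15 points of one
    upper eyelid) and return a callable y = f(x)."""
    pts = np.asarray(eyelid_points, dtype=np.float64)
    coeffs = np.polyfit(pts[:, 0], pts[:, 1], degree)
    return np.poly1d(coeffs)
```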
In one possible implementation, the device extracted from face image block for the neural network of Lip contour location Official's key point may include the key point of lip line.For example, the neural network for Lip contour location is extracted from face image block The organ key point gone out may include the key point of the key point and lower lip line of upper lip line.Wherein, the key of upper lip line Point may include the key point of upper lip upper lip line, can also include the key point and upper lip lower lip line of upper lip upper lip line Key point.The key point of lower lip line may include the key point of lower lip upper lip line, can also include lower lip upper lip line The key point of key point and lower lip lower lip line.Lip line information in the realization method may include multiple passes by lip outline The trace information that key point indicates.For example, lip line information may include by the respective 16-21 key point table of bottom profiled on single lip The trace information shown.For example, the lip line of upper lip can be the 16-21 key point and upper lip by profile on upper lip The trace information that 16-21 key point at bottom profiled indicates together.In another example the lip line of lower lip can be by lower lip The trace information that 16-21 key point at 16-21 key point and lower lip bottom profiled at profile indicates together.
In actual application, not keypoint quantity more multisystem performance is more superior.For example, keypoint quantity is more Be conducive to improve the accuracy of the shape of description lip to a certain extent, but also bring larger computing cost, reduces operation speed Degree.After the accuracy factor to the efficiency of deep neural network and the shape for describing lip considers, it can incite somebody to action The trace information that respective 16-21 key point indicates at bottom profiled on single lip is expressed as lip line information, uses lip line information institute The accuracy of the shape for the lip being depicted disclosure satisfy that at present to lip shape or mouth states detection have precisely require it is more The demand of kind application, also helps to face opening and closing state, such as state of opening one's mouth, yawn state, state of shutting up detection.
It should be noted that upper lip upper lip line, upper lip lower lip line, lower lip upper lip line in the present embodiment and under Joint of the lip lower lip line at the corners of the mouth of both sides can be two corners of the mouth key points, any corners of the mouth key point can be incorporated into for Upper lip upper lip line, upper lip lower lip line, lower lip upper lip line or lower lip lower lip line.Certainly, two corners of the mouth key points It can be both not belonging to upper lip upper lip line and upper lip lower lip line, be also not belonging to lower lip upper lip line and lower lip lower lip line, and It is individually present.
In this embodiment, the number of lip line key points extracted by the neural network used for lip line localization may be greater than the number of key points located at the mouth position in the initial face key point information. From the lip line key points extracted by this neural network, a curve C representing the upper lip shape can be fitted, for example an upper-lip upper contour curve and an upper-lip lower contour curve, the two curves together forming curve C; from the key points located at the mouth position in the initial face key point information (for example, 106 initial face key points), a curve D representing the upper lip shape can be fitted. Actual measurement shows that the error of curve C relative to the true lip shape is about 1/5 to 1/10 of the error of curve D relative to the true lip shape. It can be seen that, by extracting the lip line key points separately from the mouth image block, this embodiment can effectively improve the accuracy with which the extracted key points describe the lip shape. In general, the mouth changes noticeably with changes of the person's facial expression. The technical solution of this embodiment therefore helps to improve the precision and accuracy of lip key point extraction and the accuracy of subsequent applications based on these key points. For example, when determining a person's facial expression, the lip line can serve as an important reference factor, which helps to improve the accuracy of expression determination; when performing image rendering, decoration information such as stickers can be drawn at the lips of the image by computer drawing based on the lip line, improving the accuracy with which the decoration is drawn; and beautification or makeup processing can be applied to the image based on the lip line, improving the beautification or makeup effect.
In addition to the eyelid line information and the lip line information described above, the organ key point information may also include nose key point information, eyebrow key points, eye centers and so on. The organ key point information of the face obtained in this embodiment can be used in applications such as face image rendering, face swapping, beautification, makeup, face recognition, face state detection, expression detection and face attribute detection (such as gender, age or ethnicity attributes); this embodiment does not limit the specific application range of the obtained organ key point information.
In one possible implementation, the method may further include: integrating the initial face key point information of the image to be detected and the key point information of each corresponding organ to obtain the face key point information of the image to be detected. In this implementation, the initial face key point information and the key point information of each corresponding organ may be integrated in a variety of ways.
As one example of this implementation, the initial face key point information and the key point information of each corresponding organ may be integrated by taking their union.
In this example, the initial face key points of the image extracted by a neural network may first be obtained; for example, the obtained initial face key points may be 21, 68 or 106 initial face key points.
Next, the key points of the eyelid line, the key points of the lip line and the key points of the eyebrow line may each undergo renumbering and coordinate conversion. The eyelid line key point information output by the neural network used for eye key point localization may include: numbers assigned to the eyelid line key points in a predetermined order, and coordinates of the eyelid line key points within the eye image block. The renumbering may be performed according to a preset ordering of the face key points. The coordinate conversion in this embodiment may map the coordinates of the eyelid line key points within the eye image block to their coordinates in the whole image. The renumbering and coordinate conversion of the lip line key points and the eyebrow line key points may be carried out in the same way as described for the eyelid line key points and are not detailed again here. In addition, this embodiment may also convert the numbers of some or all of the initial face key points.
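The following is a minimal sketch (not the patent's own code) of the coordinate conversion just described: key points predicted inside a cropped, resized organ image block are mapped back to whole-image coordinates and renumbered. The crop-box format (x, y, w, h), the resize handling and the function name are illustrative assumptions.

```python
def map_block_keypoints_to_image(block_keypoints, crop_box, resized_size, start_index):
    """block_keypoints: iterable of (x, y) in the resized organ image block.
    crop_box: (x0, y0, w, h) of the block within the original image.
    resized_size: (rw, rh) the block was resized to before inference.
    start_index: first number assigned to these keypoints in the face-level set."""
    x0, y0, w, h = crop_box
    rw, rh = resized_size
    sx, sy = w / rw, h / rh                      # undo the resize applied to the block
    converted = {}
    for i, (x, y) in enumerate(block_keypoints):
        converted[start_index + i] = (x0 + x * sx, y0 + y * sy)
    return converted
```

For example, eyelid line key points predicted in a 64x64 eye block could be mapped with `map_block_keypoints_to_image(eyelid_pts, eye_crop_box, (64, 64), 106)` so that their numbers continue after 106 initial face key points.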
Finally, the eyelid line key point information, the lip line key point information and the eyebrow line key point information after renumbering and coordinate conversion may be merged with the initial face key point information after renumbering, forming the face key point information of the image to be detected.
As another example of this implementation, the initial face key point information and the key point information of each corresponding organ may be integrated by partial or complete replacement.
First, the initial face key points of the image to be detected extracted by a neural network may be obtained; for example, the obtained initial face key points may be 21, 68 or 106 initial face key points.
Next, the key points of the eyelid line, the key points of the lip line and the key points of the eyebrow line may each undergo renumbering and coordinate conversion, and the numbers of some or all of the initial face key points may be converted, as described in the previous example; this is not detailed again here.
Finally, the eyelid line key point information after renumbering and coordinate conversion may be used to replace some of the key points located at the eye positions in the initial face key point information. Alternatively, this embodiment may use the converted eyelid line key point information to replace all of the key points located at the eye positions in the initial face key point information. This embodiment may likewise use the converted lip line key point information and eyebrow line key point information to replace all of the key points located at the mouth and eyebrow positions in the renumbered initial face key point information, forming the face key point information of the image to be detected.
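A small sketch of the replacement-style integration described above, assuming both key point sets are already in whole-image coordinates and renumbered. The index set and function name are hypothetical; the real indices depend on the key point layout in use.

```python
# Hypothetical example: which of the 106 initial keypoints lie on the eyes.
EYE_INDICES_106 = set(range(52, 74))

def integrate_by_replacement(initial_kps, organ_kps, replaced_indices):
    """initial_kps / organ_kps: dicts {index: (x, y)} in full-image coordinates.
    replaced_indices: initial keypoint indices to drop, e.g. EYE_INDICES_106."""
    merged = {i: p for i, p in initial_kps.items() if i not in replaced_indices}
    merged.update(organ_kps)      # organ keypoints are already renumbered and converted
    return merged
```

The union-style integration of the previous example corresponds to calling this with an empty `replaced_indices` set, i.e. keeping all initial key points and simply adding the organ key points.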
In one possible implementation, the final face key point information extracted from the image may contain more than 106 key points; for example, 186, 240, 220 or 274 key points may be extracted according to the needs of the service application. For example, the face key point information may include eye key point information, mouth key point information, key points of the eyebrow region, key points of the nose region and key points of the face contour. The eye key point information may include 48 to 72 eye key points: 4 to 24 key points for locating the eye position and 44 to 48 key points of the eyelid line information. The 4 to 24 key points for locating the eye position may include key points for locating the eye region and key points for locating the eyeball position. The 44 to 48 key points of the eyelid line information may correspond to the four eyelid lines of the two eyes. The mouth key point information may include 0 to 20 key points for locating the mouth position and 60 to 64 key points of the lip lines (for example, two lip lines for the two lips). The eyebrow region may include 26 to 58 key points, the nose region 15 to 27 key points, and the face contour 33 to 37 key points. Face key point information with the above numbers and proportions achieves a good balance among computational cost, localization precision and accuracy, and meets the precision requirements of most service applications.
In one possible implementation, the neural network may include an input layer, multiple convolutional layers for extracting features, at least one fully connected layer for determining the coordinates of the organ key points within the organ image block, and an output layer. The sample data set used to train the neural network may include multiple image samples, each containing a face image. Each image sample may be annotated with face key point annotation data; for example, each image sample may be annotated with the numbers of more than 106 key points and the coordinates of each key point in the image sample. During training, an image sample is first selected from the sample data set and input into the neural network. The input layer of the neural network may crop the image sample into organ image blocks such as an eye image block, an eyebrow image block or a mouth image block according to the annotation data of the image sample, and adjust the size of the cropped organ image blocks. The input layer may then convert the numbers and coordinates of the key points within each organ image block, so that the number of each corresponding organ key point among all key points of the image sample and its coordinates in the image sample are converted into a number among the key points of the organ image block and coordinates within the organ image block. The cropped and resized organ image block may be supplied to the convolutional layers, which extract the image features of the organ image block. The fully connected layer for determining the coordinates of the organ key points within the organ image block may then determine, from the extracted image features, the number and coordinates of each key point in the organ image block, and the output layer of the neural network may output multiple groups of data, each group containing the number of a key point and the coordinates of that key point. The multiple groups of data output by the output layer may be supervised using the numbers and coordinates of the key points after conversion by the input layer. The above training process is repeated; when the error of the coordinates of each key point output by the neural network meets a preset error requirement, the neural network is successfully trained.
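Below is a minimal PyTorch-style sketch of such an organ key point network: stacked convolutions for feature extraction followed by a fully connected layer that regresses (x, y) within the organ image block for each key point. The input size, channel widths and key point count are illustrative assumptions, not values taken from the patent.

```python
import torch
import torch.nn as nn

class OrganKeypointNet(nn.Module):
    """Convolutional feature extractor plus a fully connected regressor that
    outputs (x, y) coordinates within the organ image block per keypoint."""
    def __init__(self, num_keypoints=22, in_size=64):
        super().__init__()
        self.num_keypoints = num_keypoints
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        feat = in_size // 8                            # three stride-2 convolutions
        self.regressor = nn.Linear(64 * feat * feat, num_keypoints * 2)

    def forward(self, x):                              # x: (N, 3, in_size, in_size)
        f = self.features(x).flatten(1)
        return self.regressor(f).view(-1, self.num_keypoints, 2)

# Supervision as described above: regress toward the converted block coordinates.
# loss = nn.functional.mse_loss(net(block_batch), target_block_coords)
```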
In one possible implementation, the organ key point annotation data of each image sample in this embodiment may be produced by the following process. First, curve control points of the corresponding facial organ are determined (such as control points of the upper/lower eyelid lines of the eyes, or control points of the upper/lower contours of the upper/lower lips of the mouth). Secondly, a curve is formed from the curve control points. Thirdly, multiple points are inserted on the curve by interpolation (such as uniform or non-uniform interpolation); for example, if the curve is the upper or lower eyelid line of a single eye, 10 to 15 points (such as 11) may be inserted; if the curve is the upper or lower contour of the upper lip, 16 to 21 points (such as 17) may be inserted; and if the curve is the upper or lower contour of the lower lip, 16 to 21 points (such as 16) may be inserted. The coordinates of the inserted points in the image sample become the coordinates in the corresponding organ key point annotation data, and the sequence numbers of the inserted points on the curve can be converted into the numbers in the corresponding organ key point annotation data of the image sample.
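A sketch of this annotation step under stated assumptions: the patent only says a curve is formed from control points and points are inserted by interpolation, so the parametric spline used here (SciPy's splprep/splev) is just one possible curve model, and the function name is illustrative.

```python
import numpy as np
from scipy.interpolate import splprep, splev

def interpolate_curve(control_points, num_points):
    """control_points: manually marked (x, y) control points, e.g. on an upper eyelid.
    num_points: e.g. 11 for a single-eye eyelid line, 17 for an upper-lip contour."""
    pts = np.asarray(control_points, dtype=float)
    k = min(3, len(pts) - 1)                       # spline degree limited by point count
    tck, _ = splprep([pts[:, 0], pts[:, 1]], s=0, k=k)
    xs, ys = splev(np.linspace(0.0, 1.0, num_points), tck)
    # The order of the returned points gives their numbers; the (x, y) pairs give
    # the coordinates recorded in the organ keypoint annotation data.
    return list(zip(xs, ys))
```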
It should be noted that the number of points inserted on a curve in this embodiment can be determined according to actual requirements. For example, the number of inserted points needs to ensure that the error of the curve fitted through the inserted points relative to the true organ curve of the face is about 1/5 to 1/10 of the error of the curve formed by the curve control points relative to the true organ curve. It can be seen that the shape expressed by the organ key point annotation data produced for the image sample in this embodiment can be closer to the actual organ shape, which is more conducive to training the neural network.
In one possible implementation, the method may further include: extracting eyelid line information from the image to be detected by a neural network, where the eyelid line information includes trace information represented by multiple key points of the eyelid.
In one possible implementation, the face region in the image to be detected may include a left-eye region, a right-eye region or a both-eyes region. In this implementation, the face region of the image to be detected can be cropped according to the initial face key point information of the image to be detected to obtain a single-eye image block or a both-eyes image block, which can serve as the input image of the neural network.
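A minimal sketch of cropping an eye image block from the face using the initial face key points that fall on the chosen eye region. The set of eye indices, the margin ratio and the returned crop-box format are assumptions for illustration only.

```python
import numpy as np

def crop_eye_block(image, initial_kps, eye_indices, margin=0.3):
    """image: H x W x C array; initial_kps: {index: (x, y)};
    eye_indices: indices of the initial keypoints lying on the chosen eye(s)."""
    pts = np.array([initial_kps[i] for i in eye_indices], dtype=float)
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    mx, my = (x1 - x0) * margin, (y1 - y0) * margin   # small margin around the eye
    h, w = image.shape[:2]
    xa, ya = max(int(x0 - mx), 0), max(int(y0 - my), 0)
    xb, yb = min(int(x1 + mx) + 1, w), min(int(y1 + my) + 1, h)
    block = image[ya:yb, xa:xb]
    return block, (xa, ya, xb - xa, yb - ya)          # block plus its crop box
```

The returned crop box is what the coordinate conversion shown earlier needs to map the predicted key points back into whole-image coordinates.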
In one possible implementation, the eyelid line information may include trace information represented by 10 to 15 key points of the upper eyelid or the lower eyelid of a single eye.
In one possible implementation, the neural network may be obtained by training on a sample data set that includes eye key point annotation data. The eye key point annotation data may be produced as follows: first, curve control points of the eyelid line are determined; secondly, a first curve is formed from the curve control points; finally, multiple points are inserted on the first curve by interpolation, and the information of the inserted points is the eye key point annotation data. The error of the second curve fitted through the inserted points relative to the true eyelid line is about 1/5 to 1/10 of the error of the first curve relative to the true eyelid line.
Specific descriptions of the eyelid line information, the training process of the neural network and other related content are given above and are not detailed again here.
In one possible implementation, the method may further include: extracting lip line information from the image to be detected by a neural network, where the lip line information includes trace information represented by multiple key points of the lip contour.
In one possible implementation, the face region in the image to be detected may include an upper lip region and/or a lower lip region. In this implementation, the face region of the image to be detected can be cropped according to the initial face key point information of the image to be detected to obtain an upper lip image block, a lower lip image block or a both-lips image block, which can serve as the input image of the neural network.
In one possible implementation, the lip line information may include trace information represented by 16 to 21 key points on each of the upper and lower contours of a single lip.
In one possible implementation, the neural network may be obtained by training on a sample data set that includes lip key point annotation data. The lip key point annotation data may be produced as follows: first, curve control points of the lip line are determined; secondly, a first curve is formed from the curve control points; finally, multiple points are inserted on the first curve by interpolation, and the information of the inserted points is the lip key point annotation data. The error of the second curve fitted through the inserted points relative to the true lip line is about 1/5 to 1/10 of the error of the first curve relative to the true lip line.
Specific descriptions of the lip line information, the training process of the neural network and other related content are given above and are not detailed again here.
Fig. 6 shows a block diagram of a face detection device according to an embodiment of the disclosure. As shown in Fig. 6, the device includes: a detection module 61, configured to detect face frames in an image to be detected by at least two different face detection models; and a first determining module 62, configured to determine the face detection result of the image to be detected according to at least one face frame detected by the at least two different face detection models.
In one possible implementation, the at least two different face detection models correspond to different face image proportions.
Fig. 7 shows an exemplary block diagram of the face detection device according to an embodiment of the disclosure. As shown in Fig. 7:
In one possible implementation, the at least one face frame is specifically multiple face frames; the first determining module 62 includes: a merging submodule 621, configured to merge the multiple face frames based on the positions of the multiple face frames detected by the at least two face detection models to obtain a face frame merging result; and a determining submodule 622, configured to determine the face detection result of the image to be detected according to the face frame merging result.
In one possible implementation, the merging submodule 621 is configured to: if the overlap between a first face frame and a second face frame among the multiple face frames is greater than or equal to a predetermined threshold, merge the first face frame and the second face frame to obtain a merged face frame.
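A minimal sketch of this merging rule. Measuring the overlap as intersection-over-union (IoU) is an assumption (the text only speaks of an overlap reaching a threshold), and keeping one of the two frames as the merged frame follows the implementation described in the next paragraph. Frames are assumed to be (x, y, w, h) tuples.

```python
def iou(a, b):
    """a, b: face frames as (x, y, w, h)."""
    ax0, ay0, aw, ah = a
    bx0, by0, bw, bh = b
    ix0, iy0 = max(ax0, bx0), max(ay0, by0)
    ix1, iy1 = min(ax0 + aw, bx0 + bw), min(ay0 + ah, by0 + bh)
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def merge_face_frames(frames, threshold=0.5):
    """Keep a single frame for any pair whose overlap reaches the threshold."""
    merged = []
    for frame in frames:
        for kept in merged:
            if iou(frame, kept) >= threshold:
                break                              # the kept frame stands for both
        else:
            merged.append(frame)
    return merged
```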
In one possible implementation, the merging submodule 621 is configured to: determine the merged face frame as one of the first face frame and the second face frame.
In one possible implementation, the device further includes: an equalization processing module 63, configured to perform gray-scale-based equalization processing on a pending image, where the gray-scale-based equalization processing is used to widen the range of gray levels over which the pixel values of the pending image are distributed; and a second determining module 64, configured to determine the image to be detected according to the pending image after the equalization processing.
In one possible implementation, the equalization processing module 63 is configured to perform histogram equalization on the pending image.
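A short sketch of the gray-scale histogram equalization step using OpenCV. Converting to grayscale before equalizing is an assumption; the patent only specifies gray-scale-based equalization of the pending image.

```python
import cv2

def equalize_pending_image(image_bgr):
    """Spread the pixel-value distribution of the pending image before detection."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)   # grayscale conversion assumed
    return cv2.equalizeHist(gray)                        # used as the image to be detected
```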
In one possible implementation, the device further includes: a cropping module 65, configured to crop the face region in the image to be detected to obtain at least one organ image block; and an extraction module 66, configured to extract the organ key point information of the at least one organ image block.
In one possible implementation, the extraction module 66 includes: an input submodule 661, configured to input the at least one organ image block into at least one neural network respectively, where different organ image blocks correspond to different neural networks; and an extraction submodule 662, configured to extract, by the at least one neural network respectively, the organ key point information of the respectively input organ image blocks.
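A brief sketch of this routing: each organ image block is fed to its own dedicated network and the per-organ key points are collected. The dictionary keys and the callable network objects are placeholders for whatever organ models are actually used.

```python
def extract_organ_keypoints(organ_blocks, organ_networks):
    """organ_blocks / organ_networks: dicts keyed by organ name, e.g.
    {'eye': eye_block, 'mouth': mouth_block, 'eyebrow': brow_block}."""
    results = {}
    for organ, block in organ_blocks.items():
        results[organ] = organ_networks[organ](block)   # one dedicated network per organ
    return results
```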
Fig. 8 is a block diagram of a device 800 for face detection according to an exemplary embodiment. For example, the device 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant or the like.
Referring to Fig. 8, the device 800 may include one or more of the following components: a processing component 802, a memory 804, a power supply component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814 and a communication component 816.
The processing component 802 typically controls the overall operations of the device 800, such as operations associated with display, telephone calls, data communication, camera operation and recording. The processing component 802 may include one or more processors 820 to execute instructions so as to perform all or part of the steps of the above methods. In addition, the processing component 802 may include one or more modules that facilitate interaction between the processing component 802 and other components; for example, the processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation of the device 800. Examples of such data include instructions for any application or method operated on the device 800, contact data, phonebook data, messages, pictures, videos and so on. The memory 804 may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disc.
The power supply component 806 provides power for the various components of the device 800. The power supply component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing and distributing power for the device 800.
The multimedia component 808 includes a screen providing an output interface between the device 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operation mode, such as a photographing mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focusing and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC) configured to receive external audio signals when the device 800 is in an operation mode, such as a call mode, a recording mode or a speech recognition mode. The received audio signal may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, the audio component 810 further includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, a click wheel, buttons and the like. These buttons may include but are not limited to: a home button, a volume button, a start button and a lock button.
The sensor component 814 includes one or more sensors for providing state assessments of various aspects of the device 800. For example, the sensor component 814 may detect the open/closed state of the device 800 and the relative positioning of components, for example the display and keypad of the device 800; the sensor component 814 may also detect a change in the position of the device 800 or of a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in the temperature of the device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the device 800 and other devices. The device 800 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
In an exemplary embodiment, the device 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components, for performing the above methods.
In an exemplary embodiment, a non-volatile computer-readable storage medium is also provided, such as the memory 804 including computer program instructions, which can be executed by the processor 820 of the device 800 to perform the above methods.
Fig. 9 is a block diagram of a device 1900 for face detection according to an exemplary embodiment. For example, the device 1900 may be provided as a server. Referring to Fig. 9, the device 1900 includes a processing component 1922, which further includes one or more processors, and memory resources represented by a memory 1932 for storing instructions executable by the processing component 1922, such as applications. The application stored in the memory 1932 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 1922 is configured to execute the instructions to perform the above methods.
The device 1900 may also include a power supply component 1926 configured to perform power management of the device 1900, a wired or wireless network interface 1950 configured to connect the device 1900 to a network, and an input/output (I/O) interface 1958. The device 1900 may operate based on an operating system stored in the memory 1932, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM or the like.
In an exemplary embodiment, a non-volatile computer-readable storage medium is also provided, such as the memory 1932 including computer program instructions, which can be executed by the processing component 1922 of the device 1900 to perform the above methods.
The disclosure may be a system, a method and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement various aspects of the disclosure.
The computer-readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (for example, light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
Computer-readable program instructions described herein can be downloaded from the computer-readable storage medium to respective computing/processing devices, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium within the respective computing/processing device.
Computer program instructions for carrying out operations of the disclosure may be assembly instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk or C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA) or a programmable logic array (PLA), can be personalized by utilizing state information of the computer-readable program instructions, and the electronic circuit can execute the computer-readable program instructions to implement various aspects of the disclosure.
Aspects of the disclosure are described herein with reference to flowcharts and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, a programmable data processing apparatus and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions that implement aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus or other devices to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus or other devices so as to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus or other devices implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality and operation of possible implementations of systems, methods and computer program products according to multiple embodiments of the disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment or a part of an instruction, which comprises one or more executable instructions for implementing the specified logical functions. In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the drawings; for example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
The embodiments of the disclosure have been described above. The foregoing description is exemplary rather than exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used herein were chosen to best explain the principles of the embodiments, their practical applications or technical improvements over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A face detection method, characterized by comprising:
detecting face frames in an image to be detected by at least two different face detection models;
determining a face detection result of the image to be detected according to at least one face frame detected by the at least two different face detection models.
2. The method according to claim 1, characterized in that the at least two different face detection models correspond to different face image proportions.
3. The method according to claim 1, characterized in that the at least one face frame is specifically multiple face frames;
the determining the face detection result of the image to be detected according to the at least one face frame detected by the at least two face detection models comprises:
merging the multiple face frames based on positions of the multiple face frames detected by the at least two face detection models to obtain a face frame merging result;
determining the face detection result of the image to be detected according to the face frame merging result.
4. The method according to claim 3, characterized in that the merging the multiple face frames based on the positions of the multiple face frames detected by the at least two face detection models to obtain the face frame merging result comprises:
if an overlap between a first face frame and a second face frame among the multiple face frames is greater than or equal to a predetermined threshold, merging the first face frame and the second face frame to obtain a merged face frame.
5. A face detection device, characterized by comprising:
a detection module, configured to detect face frames in an image to be detected by at least two different face detection models;
a first determining module, configured to determine a face detection result of the image to be detected according to at least one face frame detected by the at least two different face detection models.
6. The device according to claim 5, characterized in that the at least two different face detection models correspond to different face image proportions.
7. The device according to claim 5, characterized in that the at least one face frame is specifically multiple face frames;
the first determining module comprises:
a merging submodule, configured to merge the multiple face frames based on positions of the multiple face frames detected by the at least two face detection models to obtain a face frame merging result;
a determining submodule, configured to determine the face detection result of the image to be detected according to the face frame merging result.
8. The device according to claim 7, characterized in that the merging submodule is configured to:
if an overlap between a first face frame and a second face frame among the multiple face frames is greater than or equal to a predetermined threshold, merge the first face frame and the second face frame to obtain a merged face frame.
9. An electronic device, characterized by comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method of any one of claims 1 to 4.
10. A computer-readable storage medium having computer program instructions stored thereon, characterized in that, when executed by a processor, the computer program instructions implement the method of any one of claims 1 to 4.
CN201810265189.5A 2018-03-28 2018-03-28 Method for detecting human face and device Pending CN108509894A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810265189.5A CN108509894A (en) 2018-03-28 2018-03-28 Method for detecting human face and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810265189.5A CN108509894A (en) 2018-03-28 2018-03-28 Method for detecting human face and device

Publications (1)

Publication Number Publication Date
CN108509894A true CN108509894A (en) 2018-09-07

Family

ID=63379011

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810265189.5A Pending CN108509894A (en) 2018-03-28 2018-03-28 Method for detecting human face and device

Country Status (1)

Country Link
CN (1) CN108509894A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6381345B1 (en) * 1997-06-03 2002-04-30 At&T Corp. Method and apparatus for detecting eye location in an image
CN101178770B (en) * 2007-12-11 2011-02-16 北京中星微电子有限公司 Image detection method and apparatus
CN103824049A (en) * 2014-02-17 2014-05-28 北京旷视科技有限公司 Cascaded neural network-based face key point detection method
CN106228137A (en) * 2016-07-26 2016-12-14 广州市维安科技股份有限公司 A kind of ATM abnormal human face detection based on key point location
CN106407908A (en) * 2016-08-31 2017-02-15 广州市百果园网络科技有限公司 Training model generation method and human face detection method and device
CN106650575A (en) * 2016-09-19 2017-05-10 北京小米移动软件有限公司 Face detection method and device
CN106778531A (en) * 2016-11-25 2017-05-31 北京小米移动软件有限公司 Face detection method and device
CN107292293A (en) * 2017-07-26 2017-10-24 北京小米移动软件有限公司 The method and device of Face datection

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109829371A (en) * 2018-12-26 2019-05-31 深圳云天励飞技术有限公司 A kind of method for detecting human face and device
CN109558864A (en) * 2019-01-16 2019-04-02 苏州科达科技股份有限公司 Face critical point detection method, apparatus and storage medium
WO2021197466A1 (en) * 2020-04-03 2021-10-07 百果园技术(新加坡)有限公司 Eyeball detection method, apparatus and device, and storage medium
CN112654999A (en) * 2020-07-21 2021-04-13 华为技术有限公司 Method and device for determining labeling information
CN112597944A (en) * 2020-12-29 2021-04-02 北京市商汤科技开发有限公司 Key point detection method and device, electronic equipment and storage medium
CN113822254A (en) * 2021-11-24 2021-12-21 腾讯科技(深圳)有限公司 Model training method and related device
CN113822254B (en) * 2021-11-24 2022-02-25 腾讯科技(深圳)有限公司 Model training method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20180907)