CN108280388A - Method and apparatus for training a face detection model, and face detection method and apparatus


Info

Publication number
CN108280388A
CN108280388A (application CN201710010709.3A)
Authority
CN
China
Prior art keywords
region
face
model
input image
face detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710010709.3A
Other languages
Chinese (zh)
Inventor
贾晓飞
刘汝杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Priority to CN201710010709.3A
Publication of CN108280388A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168: Feature extraction; Face representation
    • G06V40/171: Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

Disclosed are a method and apparatus for training a face detection model, and a face detection method and apparatus. The method of training a face detection model includes: training at least two region recognition models, each region recognition model in the at least two region recognition models being associated with a different part of the face and used to identify regions in an input image that may belong to the part associated with that region recognition model. According to embodiments of the present disclosure, face detection can be carried out rapidly and accurately.

Description

Method and apparatus for training a face detection model, and face detection method and apparatus
Technical field
The present disclosure relates to the field of image processing, and in particular to a method and apparatus for training a face detection model, and to a face detection method and apparatus, capable of rapid and accurate face detection.
Background art
Face detection is one of the important research topics in computer vision and a key step in techniques such as face recognition and face tracking. Given that convolutional neural networks (CNNs) have achieved good results on many computer vision problems, in recent years attempts have been made to perform face detection with CNN models. CNN-based face detection is broadly divided into two steps: first, extracting candidate regions that may contain a face; second, using a CNN to judge whether a face is present in each candidate region. Fig. 1 shows an example of the structure and flow of CNN-based face detection. As shown in Fig. 1, a CNN-based face detection algorithm usually first generates face candidate regions; the original image is then fed into convolutional layers to produce a feature map of the input image; the feature-map regions corresponding to the face candidate regions are fed into fully connected layers, which judge whether each candidate region contains a face, producing the face detection result. At present, face candidate regions are mainly generated by sliding a window over the original image, or over a feature map corresponding to the original image, and judging whether each window is a face-location candidate region. Fig. 2 illustrates the prior-art approach of sliding a window over the original image to judge face-location candidate regions. As shown in Fig. 2, the original image is divided into a number of image blocks, and each block is judged in turn as to whether it is a face-location candidate region. Fig. 3 illustrates the prior-art approach of sliding a window over a feature map corresponding to the original image to judge face-location candidate regions. As shown in Fig. 3, the original image is fed into a convolutional network, and a sliding-window operation is performed on the feature map produced by the network to judge face candidate regions. Both approaches are time-consuming: generating the face candidate regions takes far longer than the CNN takes to judge whether a candidate region contains a face.
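By way of illustration and not limitation, the following minimal sketch (in Python) shows the sliding-window candidate generation described above; the window size and stride are arbitrary example values, not values given in this disclosure. Because every window must subsequently be judged by the CNN, the number of windows dominates the runtime.

```python
import numpy as np

def sliding_windows(image, win=64, stride=16):
    """Enumerate every window position over the image; each window must
    then be judged by the CNN, which is why this candidate-generation
    step dominates the runtime of prior-art face detection."""
    h, w = image.shape[:2]
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            yield (x, y, x + win, y + win)
```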
Summary of the invention
A brief summary of the disclosure is given below in order to provide a basic understanding of some aspects of the disclosure. It should be understood, however, that this summary is not an exhaustive overview of the disclosure. It is not intended to identify key or critical elements of the disclosure, nor to delimit the scope of the disclosure. Its sole purpose is to present some concepts of the disclosure in a simplified form as a prelude to the more detailed description given later.
In view of the above problems, an object of the present disclosure is to provide a method and apparatus for training a face detection model, and a face detection method and apparatus, that can perform face detection rapidly and accurately.
According to one aspect of the disclosure, a method of training a face detection model is provided, including: training at least two region recognition models, each region recognition model in the at least two region recognition models being associated with a different part of the face and usable to identify regions in an input image that may belong to the part associated with that region recognition model.
According to another aspect of the disclosure, an apparatus for training a face detection model is provided, including: a training unit configured to train at least two region recognition models, each region recognition model in the at least two region recognition models being associated with a different part of the face and usable to identify regions in an input image that may belong to the part associated with that region recognition model.
According to a further aspect of the disclosure, a face detection method is provided, including: applying to an input image at least two predetermined region recognition models, each associated with a different part of the face, so as to identify in the input image regions that may belong to the part associated with each predetermined region recognition model; obtaining candidate regions of the face to be detected in the input image according to prior knowledge about face structure and the identified part regions; and judging, with a detection model, whether a face is present in the candidate regions, thereby detecting the region in which the face to be detected is present.
According to other aspects of the disclosure, there are also provided computer program code and a computer program product for implementing the above methods according to the disclosure, as well as a computer-readable storage medium on which such computer program code is recorded.
Other aspects of embodiments of the disclosure are presented in the description below, in which preferred embodiments are described in detail so as to fully disclose, without limiting, embodiments of the disclosure.
Description of the drawings
The disclosure may be better understood by reference to the detailed description given below in conjunction with the accompanying drawings, in which the same or similar reference numerals are used throughout to denote the same or similar components. The drawings, together with the detailed description below, are incorporated in and form part of this specification, and serve to further illustrate preferred embodiments of the disclosure and to explain its principles and advantages. In the drawings:
Fig. 1 shows an example of the structure and flow of CNN-based face detection;
Fig. 2 illustrates the prior-art approach of sliding a window over the original image to judge face-location candidate regions;
Fig. 3 illustrates the prior-art approach of sliding a window over a feature map corresponding to the original image to judge face-location candidate regions;
Fig. 4 is a flowchart showing an example flow of a method of training a face detection model according to an embodiment of the disclosure;
Fig. 5 illustrates segmentation annotation of facial parts according to the positions of facial keypoints, according to an embodiment of the disclosure;
Fig. 6 illustrates the use of multiple filters of different sizes in at least one layer of each region recognition model, according to an embodiment of the disclosure;
Fig. 7 is a block diagram showing an example functional configuration of an apparatus for training a face detection model according to an embodiment of the disclosure;
Fig. 8 is a flowchart showing an example flow of a face detection method according to an embodiment of the disclosure;
Fig. 9 schematically shows an initial reference region and reference regions in different orientations, according to an embodiment of the disclosure;
Fig. 10 schematically shows an example of the structure and flow of generating candidate regions of a face to be detected, according to an embodiment of the disclosure;
Fig. 11 schematically shows an example of the structure and flow of judging whether a face is present in a candidate region of the face to be detected, according to an embodiment of the disclosure;
Fig. 12 is a block diagram showing an example functional configuration of a face detection apparatus according to an embodiment of the disclosure; and
Fig. 13 is a block diagram showing an example structure of a personal computer usable as an information processing apparatus in embodiments of the disclosure.
Detailed description of embodiments
Exemplary embodiments of the disclosure are described below in conjunction with the drawings. For clarity and conciseness, not all features of an actual implementation are described in this specification. It should be understood, however, that in developing any such actual embodiment many implementation-specific decisions must be made in order to achieve the developer's specific goals, for example compliance with system- and business-related constraints, which may vary from one implementation to another. Moreover, it should be understood that, although such development work might be complex and time-consuming, it is merely a routine task for those skilled in the art having the benefit of this disclosure.
It should further be noted that, to avoid obscuring the disclosure with unnecessary detail, the drawings show only those device structures and/or processing steps closely related to the scheme of the disclosure, and other details of little relevance to the disclosure are omitted.
According to one aspect of the disclosure, a method of training a face detection model is proposed. The method can train at least two region recognition models, each associated with a different part of the face, so that the at least two region recognition models can be used to identify, quickly and automatically, regions in an input image that may belong to the part associated with each region recognition model.
Embodiments of the disclosure are described in detail below in conjunction with the drawings.
First, an example flow of a method 400 of training a face detection model according to an embodiment of the disclosure is described with reference to Fig. 4. Fig. 4 is a flowchart showing an example flow of the method 400 of training a face detection model according to an embodiment of the disclosure. As shown in Fig. 4, the method 400 includes a training step S402.
In training step S402, at least two region recognition models are trained. Each region recognition model in the at least two region recognition models is associated with a different part of the face and can be used to identify regions in an input image that may belong to the part associated with that region recognition model.
A face can be divided into different parts, such as the left eye, right eye, left cheek, right cheek, nose, mouth and chin. In training step S402, the method 400 of training a face detection model according to an embodiment of the disclosure can train at least two region recognition models respectively associated with different parts of the face, so that the at least two region recognition models can be used to quickly identify regions in an input image that may belong to the part associated with each region recognition model.
Preferably, each region recognition model can be trained individually based on the annotated region-position information of a facial part. That is, the facial parts can be segmented using facial keypoint position information, and the annotated region-position information of each facial part can be used to train the corresponding region recognition model.
Specifically, face data annotated with facial keypoint positions is selected (the number of annotated keypoints may be 5, 13, 21, 68, etc.). From the positions of the facial keypoints, at least one rectangle containing all keypoints of a specific part is computed (each rectangle containing the annotated region-position information of one facial part); the rectangles divide the face into several parts, completing the segmentation annotation of the facial parts. The annotated region-position information of each facial part can then be used to train each region recognition model individually. Using a rectangle to contain the annotated region of a facial part is only an example; an ellipse or other shape containing the annotated region of a facial part could also be used. For convenience, rectangles are used in the description below.
Fig. 5 illustrates segmentation annotation of facial parts according to the positions of facial keypoints, according to an embodiment of the disclosure. As shown in Fig. 5, there are 68 annotated facial keypoint positions in total. From the positions of these facial keypoints, 7 rectangles can be obtained, respectively containing the keypoints located in the left-eye, right-eye, left-cheek, right-cheek, nose, mouth and chin regions. The annotated region-position information in these 7 rectangles can then be used to train each of 7 region recognition models.
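By way of illustration and not limitation, the following sketch shows how such part rectangles could be derived from a 68-point keypoint annotation. The grouping of keypoint indices into parts is an assumption made for this example; the disclosure does not specify the index ranges.

```python
import numpy as np

# Hypothetical grouping of the 68 keypoint indices into seven facial parts;
# the index ranges below are assumptions for illustration only.
PART_KEYPOINTS = {
    "left_eye":    range(36, 42),
    "right_eye":   range(42, 48),
    "left_cheek":  range(0, 5),
    "right_cheek": range(12, 17),
    "nose":        range(27, 36),
    "mouth":       range(48, 68),
    "chin":        range(5, 12),
}

def part_rectangles(keypoints):
    """keypoints: (68, 2) array of annotated (x, y) positions for one face.
    Returns, for each part, the smallest axis-aligned rectangle
    (x_min, y_min, x_max, y_max) containing all of that part's keypoints."""
    rects = {}
    for part, idx in PART_KEYPOINTS.items():
        pts = keypoints[list(idx)]
        x_min, y_min = pts.min(axis=0)
        x_max, y_max = pts.max(axis=0)
        rects[part] = (x_min, y_min, x_max, y_max)
    return rects
```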
Preferably, each region recognition model is used to generate a feature map from the input image, the feature map indicating the likelihood that a pixel of the input image belongs to the part associated with that region recognition model.
Specifically, a separate region recognition model is trained for each annotated facial part. The input of each region recognition model is a complete original image together with the part-annotation map of the corresponding facial part. The part-annotation map is a binary image of the same size as the original image, in which the corresponding facial part is labeled 1 and the other facial parts and non-face regions are labeled 0. Each region recognition model outputs a feature map in which each point indicates the likelihood that the corresponding pixel of the input image belongs to the part associated with that region recognition model. When training each region recognition model, in order to improve its precision, the response of the facial part corresponding to the model is required to be much larger than the response of other facial parts and of non-face regions. Furthermore, a threshold can be set (the threshold may be determined empirically) for binarizing the response map (i.e., the feature map), so as to obtain the response region of each facial part. That is, if a pixel of the input image belongs to the part associated with the region recognition model, the point at the corresponding position in the feature map is labeled 1; if the pixel does not belong to that part, the point at the corresponding position in the feature map is labeled 0. When training each region recognition model, the generated feature map is required to be as similar as possible to the part-annotation map, so that the generated feature map accurately indicates the region of the corresponding facial part.
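A minimal sketch of the part-annotation map and of the empirical thresholding described above, assuming the rectangle format of the previous sketch; the default threshold value is an assumption:

```python
import numpy as np

def part_label_map(image_shape, rect):
    """Binary annotation map of the same size as the original image:
    1 inside the annotated part rectangle, 0 for other facial parts
    and non-face regions."""
    label = np.zeros(image_shape[:2], dtype=np.uint8)
    x_min, y_min, x_max, y_max = rect
    label[int(y_min):int(y_max) + 1, int(x_min):int(x_max) + 1] = 1
    return label

def binarize_response(feature_map, threshold=0.5):
    """Binarize a trained model's response map to obtain the part's
    response region; the disclosure states the threshold is empirical."""
    return (feature_map >= threshold).astype(np.uint8)
```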
Preferably, each region recognition model may include multiple layers; multiple filters of different sizes may be used in at least one of the multiple layers, and the result obtained by combining the responses of the multiple filters may be used as the input of the next layer, so as to improve the accuracy with which the feature map reflects the likelihood that a pixel of the input image belongs to the part associated with the region recognition model.
Specifically, in order to improve the precision of the region recognition models, each region recognition model may include multiple layers, and multiple filters of different sizes, for example filters of sizes 1×1, 3×3 and 5×5, may be used in at least one of the multiple layers. By way of illustration and not limitation, the responses of the multiple filters may be combined polynomially, with the result of the polynomial combination used as the input of the next layer, improving the accuracy with which the feature map reflects the likelihood. Besides polynomial combination, other combinations of the responses of the multiple filters may also be used; they are not described here.
By way of illustration and not limitation, each region recognition model may be a fully convolutional network. Each region recognition model may also be a model other than a fully convolutional network.
Below, for convenience of description, an example of using multiple filters of different sizes in at least one layer of each region recognition model is described, taking a fully convolutional network as the region recognition model. Fig. 6 illustrates the use of multiple filters of different sizes in at least one layer of each region recognition model according to an embodiment of the disclosure. In Fig. 6, for convenience of description, the input of the fully convolutional network is assumed to be a right-cheek annotation image, in which the black-filled rectangle indicates the right-cheek region of the face in the original image. As shown in Fig. 6, the fully convolutional network includes n convolutional layers, namely convolutional layer 1, convolutional layer 2, ..., convolutional layer n, where n is an integer greater than 1. Taking convolutional layer 1 as an example, the input is fed to 3 filters of different sizes, namely 1×1, 3×3 and 5×5, and the response results of these 3 filters are combined polynomially (α, β and γ in Fig. 6 are weighting coefficients, each of which may be determined empirically) to obtain a new response result as output, which then serves as the input of convolutional layer 2. As shown in Fig. 6, the output of the fully convolutional network (i.e., the output of convolutional layer n) is the binarized feature map corresponding to the original image, in which the value of the region corresponding to the right cheek is 1 (shown in black) and the regions corresponding to other facial parts and non-face areas are 0 (shown in white). It can be seen from Fig. 6 that the fully convolutional network is trained with the right-cheek region of the face, so that after training it can identify regions in an input image that may belong to the right cheek.
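By way of illustration and not limitation, the sketch below (in PyTorch, an implementation choice of this example) realizes one such layer as a weighted sum of 1×1, 3×3 and 5×5 convolutions, i.e. the simplest polynomial combination of the filter responses; making α, β and γ learnable is an assumption of this sketch, since the disclosure only says they are determined empirically.

```python
import torch
import torch.nn as nn

class MultiScaleConv(nn.Module):
    """One layer combining filters of several sizes, as in Fig. 6.
    The three responses are merged by a weighted sum (a first-order
    polynomial combination); alpha, beta and gamma correspond to the
    weighting coefficients of Fig. 6."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.conv3 = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.conv5 = nn.Conv2d(in_ch, out_ch, kernel_size=5, padding=2)
        self.alpha = nn.Parameter(torch.tensor(1.0))
        self.beta = nn.Parameter(torch.tensor(1.0))
        self.gamma = nn.Parameter(torch.tensor(1.0))

    def forward(self, x):
        # The combined response becomes the input of the next layer.
        return (self.alpha * self.conv1(x)
                + self.beta * self.conv3(x)
                + self.gamma * self.conv5(x))
```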
In order to further speed up the training of the region recognition models, some parameters of different region recognition models may be shared. By way of illustration and not limitation, where each region recognition model is a fully convolutional network, the different fully convolutional networks may share the parameters of identical layers; for example, the parameters of the first one or two convolutional layers of the fully convolutional networks may be shared.
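A sketch of such sharing, under the assumptions of the previous sketch; the channel sizes and the two-layer shared stem are illustrative choices:

```python
import torch.nn as nn

# Hypothetical shared stem: the first two convolutional layers are common
# to all region recognition models, matching the suggestion of sharing
# the parameters of the first one or two layers.
shared_stem = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
)

PARTS = ["left_eye", "right_eye", "left_cheek", "right_cheek",
         "nose", "mouth", "chin"]

# One part-specific head per facial part; each full region recognition
# model is shared_stem followed by its own head, ending in a one-channel
# response map squashed to (0, 1).
part_heads = {
    part: nn.Sequential(
        MultiScaleConv(64, 64),  # the multi-size filter layer sketched above
        nn.ReLU(),
        nn.Conv2d(64, 1, kernel_size=1),
        nn.Sigmoid(),
    )
    for part in PARTS
}
```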
In summary, the method 400 of training a face detection model according to an embodiment of the disclosure can identify regions in an input image that may belong to the part associated with a region recognition model without sliding a window over the original image or over a feature map of the original image, thereby enabling fast face detection. Moreover, since each region recognition model in the method 400 of training a face detection model according to an embodiment of the disclosure is associated with a different part of the face, more accurate face detection can be performed.
Corresponding to the above embodiments of the method of training a face detection model, the disclosure also provides the following embodiments of an apparatus for training a face detection model.
Fig. 7 is a block diagram showing an example functional configuration of an apparatus 700 for training a face detection model according to an embodiment of the disclosure.
As shown in Fig. 7, the apparatus 700 for training a face detection model according to an embodiment of the disclosure may include a training unit 702. An example functional configuration of this unit is described below.
In the training unit 702, at least two region recognition models are trained. Each region recognition model in the at least two region recognition models is associated with a different part of the face and can be used to identify regions in an input image that may belong to the part associated with that region recognition model.
A face can be divided into different parts, such as the left eye, right eye, left cheek, right cheek, nose, mouth and chin. In the training unit 702, the apparatus 700 for training a face detection model according to an embodiment of the disclosure can train at least two region recognition models respectively associated with different parts of the face, so that the at least two region recognition models can be used to quickly identify regions in an input image that may belong to the part associated with each region recognition model.
Preferably, each region recognition model can be trained individually based on the annotated region-position information of a facial part. That is, the facial parts can be segmented using facial keypoint position information, and the annotated region-position information of each facial part can be used to train the corresponding region recognition model.
Specifically, face data annotated with facial keypoint positions is selected. From the positions of the facial keypoints, at least one rectangle containing all keypoints of a specific part is computed (each rectangle containing the annotated region-position information of one facial part); the rectangles divide the face into several parts, completing the segmentation annotation of the facial parts. The annotated region-position information of each facial part can then be used to train each region recognition model individually.
For an example of training each region recognition model based on the annotated region-position information of a facial part, see the description at the corresponding position in the method embodiment above; it is not repeated here.
Preferably, each region recognition model is used to generate a feature map from the input image, the feature map indicating the likelihood that a pixel of the input image belongs to the part associated with that region recognition model.
For an example of generating a feature map from the input image, see the description at the corresponding position in the method embodiment above; it is not repeated here.
Preferably, each region recognition model may include multiple layers; multiple filters of different sizes may be used in at least one of the multiple layers, and the result obtained by combining the responses of the multiple filters may be used as the input of the next layer, so as to improve the accuracy with which the feature map reflects the likelihood that a pixel of the input image belongs to the part associated with the region recognition model.
Specifically, in order to improve the precision of the region recognition models, each region recognition model may include multiple layers, and multiple filters of different sizes, for example filters of sizes 1×1, 3×3 and 5×5, may be used in at least one of the multiple layers. By way of illustration and not limitation, the responses of the multiple filters may be combined polynomially, with the result of the polynomial combination used as the input of the next layer, improving the accuracy with which the feature map reflects the likelihood. Besides polynomial combination, other combinations of the responses of the multiple filters may also be used; they are not described here.
For an example of using multiple filters of different sizes in at least one layer of each region recognition model, see the description at the corresponding position in the method embodiment above; it is not repeated here.
By way of illustration and not limitation, each region recognition model may be a fully convolutional network. Each region recognition model may also be a model other than a fully convolutional network.
In order to further speed up the training of the region recognition models, some parameters of different region recognition models may be shared. By way of illustration and not limitation, where each region recognition model is a fully convolutional network, the different fully convolutional networks may share the parameters of identical layers; for example, the parameters of the first one or two convolutional layers of the fully convolutional networks may be shared.
In summary, the apparatus 700 for training a face detection model according to an embodiment of the disclosure can identify regions in an input image that may belong to the part associated with a region recognition model without sliding a window over the original image or over a feature map of the original image, thereby enabling fast face detection. Moreover, since each region recognition model in the apparatus 700 for training a face detection model according to an embodiment of the disclosure is associated with a different part of the face, more accurate face detection can be performed.
It should be noted that, although the functional configuration of the apparatus for training a face detection model according to embodiments of the disclosure has been described above, this is only an example and not a limitation; those skilled in the art may modify the above embodiments according to the principles of the disclosure, for example by adding, deleting or combining the functional modules of each embodiment, and all such modifications fall within the scope of the disclosure.
It should further be noted that the apparatus embodiments here correspond to the method embodiments above; for content not described in detail in the apparatus embodiments, see the description at the corresponding position in the method embodiments; it is not repeated here.
It should be understood that machine-executable instructions in storage media and program products according to embodiments of the disclosure may also be configured to execute the above method of training a face detection model; for content not described in detail here, see the earlier description at the corresponding position; it is not repeated here.
Accordingly, storage media carrying the above program products including machine-executable instructions are also included in the present disclosure. Such storage media include, but are not limited to, floppy disks, optical discs, magneto-optical discs, memory cards, memory sticks and the like.
According to another aspect of the disclosure, a face detection method is provided, which can perform face detection rapidly and accurately.
Next, an example flow of a face detection method 800 according to an embodiment of the disclosure is described with reference to Fig. 8. Fig. 8 is a flowchart showing an example flow of the face detection method 800 according to an embodiment of the disclosure. As shown in Fig. 8, the face detection method 800 according to an embodiment of the disclosure includes a region identification step S802, a candidate region obtaining step S804 and a judgment step S806.
In region identification step S802, at least two predetermined region recognition models, each associated with a different part of the face, may be applied to an input image, so as to identify in the input image regions that may belong to the part associated with each predetermined region recognition model.
By way of illustration and not limitation, the at least two predetermined region recognition models may be region recognition models obtained with the method 400 of training a face detection model according to an embodiment of the disclosure; more specifically, each predetermined region recognition model may be a fully convolutional network obtained with the method 400 of training a face detection model according to an embodiment of the disclosure. The at least two predetermined region recognition models may also be region recognition models obtained with other methods, as long as each predetermined region recognition model in the at least two predetermined region recognition models is associated with a different part of the face and is used to identify regions in the input image that may belong to the part associated with that predetermined region recognition model.
Specifically, in region identification step S802, the feature map of the input image under each predetermined region recognition model of the at least two predetermined region recognition models is obtained through that model, and a threshold (which may be determined empirically) is set for binarizing the feature map, so as to obtain the response region of the facial part associated with that predetermined region recognition model. From the positional correspondence between the feature map and the input image, the regions of the input image corresponding to these response regions, i.e. the locations in the input image of the facial parts associated with the predetermined region recognition models, can then be obtained.
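Continuing the illustrative sketches above, one way to turn a part model's response map into a region of the input image is sketched below; the `stride` parameter is an assumption covering the case where the feature map is smaller than the input image.

```python
import numpy as np

def response_region(feature_map, threshold, stride=1):
    """Binarize the feature map and return the bounding box of the
    responding points, scaled back to input-image coordinates via the
    positional correspondence between feature map and input image."""
    ys, xs = np.nonzero(feature_map >= threshold)
    if len(xs) == 0:
        return None  # the part was not found in the input image
    return (xs.min() * stride, ys.min() * stride,
            (xs.max() + 1) * stride, (ys.max() + 1) * stride)
```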
It can be seen that, since the at least two predetermined region recognition models are respectively associated with different parts of the face, regions that may belong to the part associated with each predetermined region recognition model can be identified automatically from the input image using each predetermined region recognition model in the at least two predetermined region recognition models.
Preferably, parameters of different region recognition models among the at least two predetermined region recognition models may be shared, so as to speed up identification. By way of illustration and not limitation, where each predetermined region recognition model is a fully convolutional network, the parameters of identical layers of the different fully convolutional networks may be shared; for example, the parameters of the first one or two convolutional layers of the fully convolutional networks may be shared, speeding up identification.
In candidate region obtaining step S804, candidate regions of the face to be detected in the input image may be obtained according to prior knowledge about face structure and the identified part regions.
Specifically, in candidate region obtaining step S804, the regions of the input image that may contain a face, i.e. the candidate regions of the face to be detected, can be estimated according to prior knowledge of the relative sizes and relative positions of facial parts with respect to the overall face region, together with the part regions identified using the at least two predetermined region recognition models.
Preferably, the candidate regions of the face to be detected in the input image may be obtained as follows: based on prior knowledge of the relative sizes and relative positions of facial parts with respect to the overall face region, and on the identified part regions, the relative position of the face to be detected may be estimated to obtain an initial reference region of the face to be detected, and the initial reference region may be taken as a candidate region of the face to be detected; the initial reference region may then be rotated successively by a predetermined angle to obtain reference regions in different orientations, which may also be taken as candidate regions of the face to be detected.
Specifically, for the region of one facial part identified using each predetermined region recognition model, the relative position of the face to be detected may be estimated, based on the prior knowledge of the relative size and relative position of that facial part with respect to the overall face region and on the identified part region, to obtain the initial reference region of the face to be detected. Since faces in real scenes may appear at any angle, the initial reference region is rotated by specific angles to obtain reference regions in different orientations. The initial reference region obtained from the region of each identified facial part, together with the reference regions in different orientations, are all taken as candidate regions of the face to be detected, allowing more accurate face detection. The predetermined angle may be determined empirically. A toy sketch of the reference-region estimation is given below.
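As mentioned above, a toy sketch of this estimation; the numeric priors are invented for illustration and are not values given in this disclosure.

```python
def face_reference_region(part_box, scale, offset):
    """Estimate the whole-face reference rectangle from one part's box.
    `scale` is the assumed face size relative to the part, and `offset`
    the assumed position of the face's top-left corner relative to the
    part's top-left corner, both in units of the part's width/height."""
    x0, y0, x1, y1 = part_box
    w, h = x1 - x0, y1 - y0
    fx0 = x0 + offset[0] * w
    fy0 = y0 + offset[1] * h
    return (fx0, fy0, fx0 + scale[0] * w, fy0 + scale[1] * h)

# Invented prior for a right-cheek part, purely illustrative: the face is
# taken as ~3x the cheek box, offset up and to the left of the cheek.
face_box = face_reference_region((120, 150, 160, 200),
                                 scale=(3.0, 3.0), offset=(-2.0, -1.5))
```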
Below, taking one identified facial part as an example, the rotation of the reference region by specific angles is schematically illustrated. Fig. 9 schematically shows the initial reference region and reference regions in different orientations according to an embodiment of the disclosure. Assume that the black-filled rectangle in Fig. 9(a) is the right-cheek region of the face identified using a predetermined region recognition model. Based on the prior knowledge of the relative size and relative position of the right cheek with respect to the overall face region, and on the identified right-cheek region, the relative position of the face to be detected can be estimated to obtain the reference region of the face to be detected. For clarity of illustration, the exact reference region is not drawn in Fig. 9; the identified right-cheek region is used to represent it. The black-filled rectangles in Fig. 9(b) to 9(h) show the right-cheek regions in different orientations obtained by rotating the right-cheek region successively by the predetermined angle (since the right-cheek region represents the reference region, the right-cheek regions in different orientations represent the reference regions in different orientations).
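A geometric sketch of the successive rotation; the 45° step matches the eight orientations of Fig. 9(a) to 9(h) but is an assumption, since the disclosure only says the angle is determined empirically.

```python
import numpy as np

def rotated_candidates(rect, center, step_deg=45):
    """Rotate the initial reference rectangle about `center` by successive
    multiples of `step_deg`; each candidate is returned as its 4 corners,
    since the rotated rectangles are no longer axis-aligned."""
    x0, y0, x1, y1 = rect
    corners = np.array([[x0, y0], [x1, y0], [x1, y1], [x0, y1]], float)
    cx, cy = center
    candidates = []
    for k in range(int(360 // step_deg)):
        theta = np.deg2rad(k * step_deg)
        rot = np.array([[np.cos(theta), -np.sin(theta)],
                        [np.sin(theta),  np.cos(theta)]])
        candidates.append((corners - [cx, cy]) @ rot.T + [cx, cy])
    return candidates
```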
As described above, each predetermined region recognition model may be a fully convolutional network. Fig. 10 schematically shows an example of the structure and flow of generating candidate regions of a face to be detected according to an embodiment of the disclosure. In Fig. 10, the leftmost image in the first row is the input image containing the face to be detected. The input image is fed separately to fully convolutional network 1, fully convolutional network 2, ..., and fully convolutional network n (where n is an integer greater than 1), yielding the response results of the input image under the different fully convolutional networks, that is, the feature response regions of the parts associated with each fully convolutional network in the input image. For simplicity, the irregular shapes in the second image of the first row of Fig. 10 schematically show the feature response regions corresponding to six facial parts, namely the left eye, right eye, left cheek, right cheek, nose and chin. From the positional correspondence between the feature response regions and the input image, the input-image regions corresponding to these response regions, i.e. the regions of the left eye, right eye, left cheek, right cheek, nose and chin of the face in the input image, can be obtained; these are schematically shown in the third image of the first row of Fig. 10. The image in the second row of Fig. 10 schematically shows the reference region of the face to be detected obtained from the identified part regions and the prior knowledge of face structure; for simplicity, the reference regions in different orientations are not shown. In addition, to show the candidate regions of the face to be detected intuitively, the fourth image of the first row of Fig. 10 schematically overlays the reference region of the face to be detected on the input image.
The above manner of obtaining the candidate regions of the face to be detected in the input image is only an example; the candidate regions of the face to be detected in the input image may also be obtained by other technical means in the art, which are not described here.
In judgment step S806, a detection model may be used to judge whether a face is present in the candidate regions of the face to be detected, so as to detect the region in which the face to be detected is present.
Specifically, in judgment step S806, the detection model may be used to judge whether a face is present in the candidate regions of the face to be detected obtained in candidate region obtaining step S804, so as to detect the region in which the face to be detected is present.
By way of illustration and not limitation, the detection model may be a convolutional neural network. The detection model may also be a model other than a convolutional neural network.
Preferably, in judgment step S806, whether a face is present in a candidate region of the face to be detected may be judged as follows: the input image is fed to the convolutional layers of a convolutional neural network to generate the feature map of the input image; the region of the feature map corresponding to the candidate region of the face to be detected is fed to the fully connected layers of the convolutional neural network; and based on the feature map of the input image and the feature-map region corresponding to the candidate region of the face to be detected, it is judged whether a face is present in the candidate region of the face to be detected.
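An illustrative sketch of this judgment step in PyTorch; pooling the candidate's feature-map region to a fixed size before the fully connected layers is an assumption of this sketch, needed to make the tensor shapes work, and the layer sizes are arbitrary example values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FaceJudge(nn.Module):
    """Judge whether a candidate region contains a face: convolve the
    whole input image once, crop the candidate's region from the feature
    map, and classify the crop with fully connected layers."""
    def __init__(self):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.Linear(128 * 7 * 7, 256), nn.ReLU(),
            nn.Linear(256, 2),  # scores for face / no face
        )

    def forward(self, image, box):
        fmap = self.convs(image)                    # feature map of input image
        x0, y0, x1, y1 = box                        # candidate region, given in
        crop = fmap[:, :, y0:y1, x0:x1]             # feature-map coordinates
        crop = F.adaptive_avg_pool2d(crop, (7, 7))  # fixed-size region
        return self.fc(crop.flatten(1))             # presence judgment
```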
Fig. 11 schematically shows an example of the structure and flow of judging whether a face is present in a candidate region of the face to be detected according to an embodiment of the disclosure.
In Fig. 11, the leftmost image in the second row is the input image containing the face to be detected; it is fed to the convolutional layers of the convolutional neural network to generate the feature map of the input image. The illustration in the first row of Fig. 11 is essentially the same as that in the first row of Fig. 10 and is not described again here. The region enclosed by the white box in the third image of the first row of Fig. 11 schematically shows a candidate region of the face to be detected. The region of the feature map corresponding to this candidate region is fed to the fully connected layers of the convolutional neural network; then, based on the feature map of the input image and the feature-map region corresponding to the candidate region of the face to be detected, it is judged whether a face is present in the candidate region of the face to be detected. The region enclosed by the white box in the image of the third row of Fig. 11 indicates the detected face.
In summary, the face detection method 800 according to an embodiment of the disclosure can screen out face candidate regions without sliding a window over the original image or over a feature map of the original image, thereby enabling fast face detection. Moreover, since each predetermined region recognition model in the face detection method 800 according to an embodiment of the disclosure is associated with a different part of the face, more accurate face detection can be performed.
Corresponding to the above embodiments of the face detection method, the disclosure also provides the following embodiments of a face detection apparatus.
Fig. 12 is a block diagram showing an example functional configuration of a face detection apparatus 1200 according to an embodiment of the disclosure.
As shown in Fig. 12, the face detection apparatus 1200 according to an embodiment of the disclosure may include a region identification unit 1202, a candidate region obtaining unit 1204 and a judging unit 1206. An example functional configuration of each unit is described below.
In the region identification unit 1202, at least two predetermined region recognition models, each associated with a different part of the face, may be applied to an input image, so as to identify in the input image regions that may belong to the part associated with each predetermined region recognition model.
By way of illustration and not limitation, the at least two predetermined region recognition models may be region recognition models obtained with the method 400 of training a face detection model according to an embodiment of the disclosure; more specifically, each predetermined region recognition model may be a fully convolutional network obtained with the method 400 of training a face detection model according to an embodiment of the disclosure. The at least two predetermined region recognition models may also be region recognition models obtained with other methods, as long as each predetermined region recognition model in the at least two predetermined region recognition models is associated with a different part of the face and is used to identify regions in the input image that may belong to the part associated with that predetermined region recognition model.
Specifically, in the region identification unit 1202, the feature map of the input image under each predetermined region recognition model of the at least two predetermined region recognition models is obtained through that model, and a threshold (which may be determined empirically) is set for binarizing the feature map, so as to obtain the response region of the facial part associated with that predetermined region recognition model. From the positional correspondence between the feature map and the input image, the regions of the input image corresponding to these response regions, i.e. the locations in the input image of the facial parts associated with the predetermined region recognition models, can then be obtained.
It can be seen that, since the at least two predetermined region recognition models are respectively associated with different parts of the face, regions that may belong to the part associated with each predetermined region recognition model can be identified automatically from the input image using each predetermined region recognition model in the at least two predetermined region recognition models.
Preferably, parameters of different region recognition models among the at least two predetermined region recognition models may be shared, so as to speed up identification. By way of illustration and not limitation, where each predetermined region recognition model is a fully convolutional network, the parameters of identical layers of the different fully convolutional networks may be shared; for example, the parameters of the first one or two convolutional layers of the fully convolutional networks may be shared, speeding up identification.
In the candidate region obtaining unit 1204, candidate regions of the face to be detected in the input image may be obtained according to prior knowledge about face structure and the identified part regions.
Specifically, in the candidate region obtaining unit 1204, the regions of the input image that may contain a face, i.e. the candidate regions of the face to be detected, can be estimated according to prior knowledge of the relative sizes and relative positions of facial parts with respect to the overall face region, together with the part regions identified using the at least two predetermined region recognition models.
Preferably, the candidate regions of the face to be detected in the input image may be obtained as follows: based on prior knowledge of the relative sizes and relative positions of facial parts with respect to the overall face region, and on the identified part regions, the relative position of the face to be detected may be estimated to obtain an initial reference region of the face to be detected, and the initial reference region may be taken as a candidate region of the face to be detected; the initial reference region may then be rotated successively by a predetermined angle to obtain reference regions in different orientations, which may also be taken as candidate regions of the face to be detected.
Specifically, for the region of one facial part identified using each predetermined region recognition model, the relative position of the face to be detected may be estimated, based on the prior knowledge of the relative size and relative position of that facial part with respect to the overall face region and on the identified part region, to obtain the initial reference region of the face to be detected. Since faces in real scenes may appear at any angle, the initial reference region is rotated by specific angles to obtain reference regions in different orientations. The initial reference region obtained from the region of each identified facial part, together with the reference regions in different orientations, are all taken as candidate regions of the face to be detected, allowing more accurate face detection. The predetermined angle may be determined empirically.
For an example of the generation of the candidate regions of the face to be detected, see the description at the corresponding position in the method embodiment above; it is not repeated here.
In the judging unit 1206, a detection model may be used to judge whether a face is present in the candidate regions of the face to be detected, so as to detect the region in which the face to be detected is present.
Specifically, in the judging unit 1206, the detection model may be used to judge whether a face is present in the candidate regions of the face to be detected obtained in the candidate region obtaining unit 1204, so as to detect the region in which the face to be detected is present.
By way of illustration and not limitation, the detection model may be a convolutional neural network. The detection model may also be a model other than a convolutional neural network.
Preferably, in the judging unit 1206, whether a face is present in a candidate region of the face to be detected may be judged as follows: the input image is fed to the convolutional layers of a convolutional neural network to generate the feature map of the input image; the region of the feature map corresponding to the candidate region of the face to be detected is fed to the fully connected layers of the convolutional neural network; and based on the feature map of the input image and the feature-map region corresponding to the candidate region of the face to be detected, it is judged whether a face is present in the candidate region of the face to be detected.
For an example of judging whether a face is present in a candidate region of the face to be detected, see the description at the corresponding position in the method embodiment above; it is not repeated here.
In summary, the face detection apparatus 1200 according to an embodiment of the disclosure can screen out face candidate regions without sliding a window over the original image or over a feature map of the original image, thereby enabling fast face detection. Moreover, since each predetermined region recognition model in the face detection apparatus 1200 according to an embodiment of the disclosure is associated with a different part of the face, more accurate face detection can be performed.
It should be noted that, although the functional configuration of the face detection apparatus according to embodiments of the disclosure has been described above, this is only an example and not a limitation; those skilled in the art may modify the above embodiments according to the principles of the disclosure, for example by adding, deleting or combining the functional modules of each embodiment, and all such modifications fall within the scope of the disclosure.
It should further be noted that the apparatus embodiments here correspond to the method embodiments above; for content not described in detail in the apparatus embodiments, see the description at the corresponding position in the method embodiments; it is not repeated here.
It should be understood that machine-executable instructions in storage media and program products according to embodiments of the disclosure may also be configured to execute the above face detection method; for content not described in detail here, see the earlier description at the corresponding position; it is not repeated here.
Accordingly, storage media carrying the above program products including machine-executable instructions are also included in the present disclosure. Such storage media include, but are not limited to, floppy disks, optical discs, magneto-optical discs, memory cards, memory sticks and the like.
In addition, the above series of processing and the apparatuses may also be implemented by software and/or firmware. Where they are implemented by software and/or firmware, a program constituting the software is installed, from a storage medium or a network, onto a computer having a dedicated hardware structure, for example the general-purpose personal computer 1300 shown in Fig. 13, which is capable of performing various functions when various programs are installed on it.
In Fig. 13, a central processing unit (CPU) 1301 performs various processing according to programs stored in a read-only memory (ROM) 1302 or loaded from a storage section 1308 into a random access memory (RAM) 1303. Data required when the CPU 1301 performs the various processing are also stored in the RAM 1303 as needed.
The CPU 1301, the ROM 1302 and the RAM 1303 are connected to one another via a bus 1304. An input/output interface 1305 is also connected to the bus 1304.
The following components are connected to the input/output interface 1305: an input section 1306, including a keyboard, a mouse and the like; an output section 1307, including a display such as a cathode-ray tube (CRT) or a liquid crystal display (LCD), a loudspeaker and the like; a storage section 1308, including a hard disk and the like; and a communication section 1309, including a network interface card such as a LAN card, a modem and the like. The communication section 1309 performs communication processing via a network such as the Internet.
A drive 1310 is also connected to the input/output interface 1305 as needed. A removable medium 1311, such as a magnetic disk, an optical disc, a magneto-optical disc or a semiconductor memory, is mounted on the drive 1310 as needed, so that a computer program read from it is installed into the storage section 1308 as needed.
Where the above series of processing is implemented by software, a program constituting the software is installed from a network such as the Internet or from a storage medium such as the removable medium 1311.
Those skilled in the art should understand that such storage media are not limited to the removable medium 1311 shown in Fig. 13, which stores the program and is distributed separately from the device to provide the program to users. Examples of the removable medium 1311 include a magnetic disk (including a floppy disk (registered trademark)), an optical disc (including a compact disc read-only memory (CD-ROM) and a digital versatile disc (DVD)), a magneto-optical disc (including a MiniDisc (MD) (registered trademark)) and a semiconductor memory. Alternatively, the storage medium may be the ROM 1302, a hard disk contained in the storage section 1308 and the like, in which the program is stored and which is distributed to users together with the device containing it.
Preferred embodiments of the disclosure have been described above with reference to the drawings, but the disclosure is of course not limited to the above examples. Those skilled in the art may arrive at various changes and modifications within the scope of the appended claims, and it should be understood that such changes and modifications will naturally fall within the technical scope of the disclosure.
For example, multiple functions realized in one unit in the above embodiments may be realized by separate devices. Alternatively, multiple functions realized by multiple units in the above embodiments may be realized by separate devices respectively. Moreover, one of the above functions may be realized by multiple units. Needless to say, such configurations are included in the technical scope of the disclosure.
In this specification, the steps described in the flowcharts include not only processing performed chronologically in the described order, but also processing performed in parallel or individually rather than necessarily chronologically. Furthermore, even in steps processed chronologically, the order may, needless to say, be changed appropriately.
In addition, the technology according to the present disclosure may also be configured as follows.
Note 1. A method of training a face detection model, comprising:
training at least two region recognition models, wherein each region recognition model in the at least two region recognition models is associated with a different part of a face and is used for identifying a region in an input image that may belong to the part associated with that region recognition model.
Note 2. The method of training a face detection model according to Note 1, wherein each region recognition model is trained based on labeled region position information of the parts of the face.
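As an illustration of Note 2, the following minimal PyTorch sketch rasterizes labeled part-region boxes into per-pixel binary masks and fits one part's region recognition model with a pixel-wise loss. The helper names, the BCE loss, the optimizer settings and the stand-in model are assumptions made for this example, not specifics taken from the disclosure; the only requirement is that the model output a score map of the same size as the input image.

```python
import torch
import torch.nn as nn

def boxes_to_mask(boxes, height, width):
    """Rasterize labeled part boxes (x1, y1, x2, y2) into a binary mask."""
    mask = torch.zeros(1, height, width)
    for x1, y1, x2, y2 in boxes:
        mask[0, y1:y2, x1:x2] = 1.0
    return mask

def train_part_model(model, samples, epochs=10, lr=1e-3):
    """samples: iterable of (image[3, H, W], part_boxes) pairs for one part."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for image, boxes in samples:
            target = boxes_to_mask(boxes, image.shape[1], image.shape[2])
            logits = model(image.unsqueeze(0))           # [1, 1, H, W]
            loss = loss_fn(logits, target.unsqueeze(0))  # pixel-wise supervision
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

# A stand-in model mapping a 3-channel image to a same-sized score map;
# one such model would be trained per labeled face part (eye, nose, mouth, ...).
eye_model = nn.Conv2d(3, 1, kernel_size=9, padding=4)
```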
Note 3. The method of training a face detection model according to Note 1, wherein:
each region recognition model is used for generating a feature map from an input image, the feature map indicating the possibility that a pixel in the input image belongs to the part associated with that region recognition model.
Note 4. The method of training a face detection model according to Note 3, wherein each region recognition model comprises multiple layers, multiple filters of different sizes are used in at least one of the multiple layers, and the result obtained by combining the responses of the multiple filters serves as the input of the next layer, so as to improve the accuracy with which the feature map reflects the possibility.
Note 5. The method of training a face detection model according to Note 1, wherein each region recognition model is a fully convolutional network.
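A minimal sketch, assuming PyTorch, of a region recognition model consistent with Notes 3 to 5: the network is fully convolutional, one of its layers applies filters of several sizes in parallel and concatenates their responses as the input of the next layer, and the output is a single-channel map scoring the possibility that each pixel belongs to the associated part. The channel counts, depth and filter sizes are illustrative choices, not taken from the disclosure.

```python
import torch
import torch.nn as nn

class MultiScaleLayer(nn.Module):
    """Parallel filters of different sizes; their responses are concatenated."""
    def __init__(self, in_ch, out_ch_per_branch, sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch_per_branch, k, padding=k // 2) for k in sizes]
        )

    def forward(self, x):
        return torch.cat([branch(x) for branch in self.branches], dim=1)

class PartRegionFCN(nn.Module):
    """Fully convolutional: no fully connected layers, so the model accepts
    input images of arbitrary size and emits a same-sized score map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            MultiScaleLayer(16, 16, sizes=(3, 5, 7)), nn.ReLU(),  # 48 channels out
            nn.Conv2d(48, 1, 1),  # per-pixel part-membership logit
        )

    def forward(self, x):
        return self.net(x)
```

Because the model contains no fully connected layers, a per-pixel possibility map for the whole image is produced in a single forward pass rather than by sliding a window.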
Note 6. A device for training a face detection model, comprising:
a training unit configured to train at least two region recognition models, wherein each region recognition model in the at least two region recognition models is associated with a different part of a face and is used for identifying a region in an input image that may belong to the part associated with that region recognition model.
Note 7. The device for training a face detection model according to Note 6, wherein each region recognition model is trained based on labeled region position information of the parts of the face.
Note 8. The device for training a face detection model according to Note 6, wherein:
each region recognition model is used for generating a feature map from an input image, the feature map indicating the possibility that a pixel in the input image belongs to the part associated with that region recognition model.
Note 9. The device for training a face detection model according to Note 8, wherein each region recognition model comprises multiple layers, multiple filters of different sizes are used in at least one of the multiple layers, and the result obtained by combining the responses of the multiple filters serves as the input of the next layer, so as to improve the accuracy with which the feature map reflects the possibility.
Note 10. The device for training a face detection model according to Note 6, wherein each region recognition model is a fully convolutional network.
Note 11. A face detection method, comprising:
applying, to an input image, at least two predetermined region recognition models each associated with a different part of a face, so as to identify, from the input image, regions that may belong to the parts associated with the respective predetermined region recognition models;
obtaining a candidate region of a face to be detected in the input image according to prior knowledge about the structure of the face and the identified regions of the parts; and
judging, using a detection model, whether a face exists in the candidate region of the face to be detected, so as to detect the region in which the face to be detected exists.
Note 12. The face detection method according to Note 11, wherein the candidate region of the face to be detected in the input image is obtained in the following manner:
estimating the relative position of the face to be detected based on prior knowledge about the relative sizes and relative positions of the facial parts and the overall face region, together with the identified regions of the parts, so as to obtain an initial reference region of the face to be detected, and taking the initial reference region as a candidate region of the face to be detected; and
rotating the initial reference region successively by a predetermined angle to obtain reference regions in different directions, and also taking the reference regions in the different directions as candidate regions of the face to be detected.
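The two steps of Note 12 can be made concrete with the sketch below. The numeric priors (an eye assumed to span roughly 0.3 of the face width and to sit at about (0.3, 0.35) of the face in relative coordinates), the square reference region and the 30-degree rotation step are all assumptions chosen for the example; the note itself only requires some prior about relative size and relative position.

```python
import math

def initial_reference_region(part_box, w_ratio=0.3, rel_x=0.3, rel_y=0.35):
    """Estimate a face reference region, as four corner points, from one
    identified part box (x1, y1, x2, y2) using assumed size/position priors."""
    px = (part_box[0] + part_box[2]) / 2.0      # part center
    py = (part_box[1] + part_box[3]) / 2.0
    face_w = (part_box[2] - part_box[0]) / w_ratio
    face_h = face_w                             # assume a square reference region
    x1 = px - rel_x * face_w                    # place the part at its relative
    y1 = py - rel_y * face_h                    # position inside the face box
    return [(x1, y1), (x1 + face_w, y1),
            (x1 + face_w, y1 + face_h), (x1, y1 + face_h)]

def rotated_candidates(corners, step_deg=30, steps=11):
    """Rotate the reference region about its center by successive angles,
    yielding the reference region plus `steps` rotated candidate regions."""
    cx = sum(x for x, _ in corners) / 4.0
    cy = sum(y for _, y in corners) / 4.0
    out = [corners]
    for i in range(1, steps + 1):
        a = math.radians(step_deg * i)
        out.append([
            (cx + (x - cx) * math.cos(a) - (y - cy) * math.sin(a),
             cy + (x - cx) * math.sin(a) + (y - cy) * math.cos(a))
            for x, y in corners
        ])
    return out
```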
Note 13. The face detection method according to Note 11, wherein parameters are shared among the different region recognition models of the at least two predetermined region recognition models, so as to accelerate the identification.
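One plausible way to realize the parameter sharing of Note 13, sketched here as an assumption rather than as the disclosed architecture, is to let the region recognition models share a convolutional trunk and differ only in a lightweight per-part head, so that the expensive layers run once per input image regardless of the number of parts.

```python
import torch.nn as nn

class SharedPartModels(nn.Module):
    def __init__(self, parts=("left_eye", "right_eye", "nose", "mouth")):
        super().__init__()
        self.trunk = nn.Sequential(              # parameters shared by all parts
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        self.heads = nn.ModuleDict(              # one cheap head per face part
            {p: nn.Conv2d(32, 1, 1) for p in parts}
        )

    def forward(self, x):
        feats = self.trunk(x)                    # computed once per image
        return {p: head(feats) for p, head in self.heads.items()}
```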
Note 14. The face detection method according to Note 11, wherein the detection model is a convolutional neural network.
Note 15. The face detection method according to Note 14, wherein whether a face exists in the candidate region of the face to be detected is judged in the following manner:
inputting the input image to the convolutional layers of the convolutional neural network to generate a feature map of the input image;
inputting the region of the feature map corresponding to the candidate region of the face to be detected to the fully connected layers of the convolutional neural network; and
judging, based on the region of the feature map of the input image corresponding to the candidate region of the face to be detected, whether a face exists in the candidate region of the face to be detected.
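A sketch of the procedure of Notes 14 and 15 in the style of Fast R-CNN, assuming PyTorch and torchvision: the convolutional layers run once over the whole input image, the region of the resulting feature map under each candidate box is pooled to a fixed size, and fully connected layers output a face / non-face score per candidate. The layer sizes are illustrative, and the sketch assumes axis-aligned candidate boxes; rotated candidate regions would first have to be rectified, for example by rotating and cropping the image patch, which the sketch leaves out.

```python
import torch
import torch.nn as nn
from torchvision.ops import roi_pool

class FaceJudge(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(               # feature map of the input image
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.fc = nn.Sequential(                 # fully connected layers
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 256), nn.ReLU(),
            nn.Linear(256, 2),                   # face / non-face score
        )

    def forward(self, image, boxes):
        """image: [1, 3, H, W]; boxes: Tensor[K, 4] in image coordinates."""
        feats = self.conv(image)
        # Map image-coordinate boxes onto the stride-4 feature map and pool
        # each candidate's feature-map region to a fixed 7x7 grid.
        rois = roi_pool(feats, [boxes], output_size=(7, 7), spatial_scale=0.25)
        return self.fc(rois)                     # one score pair per candidate
```

For example, with an image of shape [1, 3, 256, 256] and boxes = torch.tensor([[40., 40., 200., 210.]]), FaceJudge()(image, boxes) returns one face / non-face score pair for the single candidate region.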

Claims (10)

1. A method of training a face detection model, comprising:
training at least two region recognition models, wherein each region recognition model in the at least two region recognition models is associated with a different part of a face and is used for identifying a region in an input image that may belong to the part associated with that region recognition model.
2. The method of training a face detection model according to claim 1, wherein each region recognition model is trained based on labeled region position information of the parts of the face.
3. The method of training a face detection model according to claim 1, wherein:
each region recognition model is used for generating a feature map from an input image, the feature map indicating the possibility that a pixel in the input image belongs to the part associated with that region recognition model.
4. The method of training a face detection model according to claim 3, wherein each region recognition model comprises multiple layers, multiple filters of different sizes are used in at least one of the multiple layers, and the result obtained by combining the responses of the multiple filters serves as the input of the next layer, so as to improve the accuracy with which the feature map reflects the possibility.
5. The method of training a face detection model according to claim 1, wherein each region recognition model is a fully convolutional network.
6. A device for training a face detection model, comprising:
a training unit configured to train at least two region recognition models, wherein each region recognition model in the at least two region recognition models is associated with a different part of a face and is used for identifying a region in an input image that may belong to the part associated with that region recognition model.
7. A face detection method, comprising:
applying, to an input image, at least two predetermined region recognition models each associated with a different part of a face, so as to identify, from the input image, regions that may belong to the parts associated with the respective predetermined region recognition models;
obtaining a candidate region of a face to be detected in the input image according to prior knowledge about the structure of the face and the identified regions of the parts; and
judging, using a detection model, whether a face exists in the candidate region of the face to be detected, so as to detect the region in which the face to be detected exists.
8. The face detection method according to claim 7, wherein the candidate region of the face to be detected in the input image is obtained in the following manner:
estimating the relative position of the face to be detected based on prior knowledge about the relative sizes and relative positions of the facial parts and the overall face region, together with the identified regions of the parts, so as to obtain an initial reference region of the face to be detected, and taking the initial reference region as a candidate region of the face to be detected; and
rotating the initial reference region successively by a predetermined angle to obtain reference regions in different directions, and also taking the reference regions in the different directions as candidate regions of the face to be detected.
9. The face detection method according to claim 7, wherein parameters are shared among the different region recognition models of the at least two predetermined region recognition models, so as to accelerate the identification.
10. The face detection method according to claim 7, wherein the detection model is a convolutional neural network.
CN201710010709.3A 2017-01-06 2017-01-06 The method and apparatus and type of face detection method and device of training face detection model Pending CN108280388A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710010709.3A CN108280388A (en) 2017-01-06 2017-01-06 The method and apparatus and type of face detection method and device of training face detection model

Publications (1)

Publication Number Publication Date
CN108280388A 2018-07-13

Family

ID=62800910

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710010709.3A Pending CN108280388A (en) 2017-01-06 2017-01-06 The method and apparatus and type of face detection method and device of training face detection model

Country Status (1)

Country Link
CN (1) CN108280388A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101127075A (en) * 2007-09-30 2008-02-20 西北工业大学 Multi-view angle three-dimensional human face scanning data automatic registration method
CN103678315A (en) * 2012-08-31 2014-03-26 富士通株式会社 Image processing device, image processing method and electronic equipment
CN103942560A (en) * 2014-01-24 2014-07-23 北京理工大学 High-resolution video vehicle detection method in intelligent traffic monitoring system
CN105844727A (en) * 2016-03-18 2016-08-10 中兴智能视觉大数据技术(湖北)有限公司 Intelligent dynamic human face recognition attendance checking record management system
CN106250843A (en) * 2016-07-28 2016-12-21 北京师范大学 A kind of method for detecting human face based on forehead region and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Ma Lihong, "Face Localization and Feature Extraction against Complex Backgrounds", China Doctoral Dissertations Full-text Database, Information Science and Technology Series *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020019873A1 (en) * 2018-07-23 2020-01-30 腾讯科技(深圳)有限公司 Image processing method and apparatus, terminal and computer-readable storage medium
US11631275B2 (en) 2018-07-23 2023-04-18 Tencent Technology (Shenzhen) Company Limited Image processing method and apparatus, terminal, and computer-readable storage medium
CN109657583A (en) * 2018-12-10 2019-04-19 腾讯科技(深圳)有限公司 Face's critical point detection method, apparatus, computer equipment and storage medium
US11915514B2 (en) 2018-12-10 2024-02-27 Tencent Technology (Shenzhen) Company Limited Method and apparatus for detecting facial key points, computer device, and storage medium
CN109948441A (en) * 2019-02-14 2019-06-28 北京奇艺世纪科技有限公司 Model training, image processing method, device, electronic equipment and computer readable storage medium
CN113822254A (en) * 2021-11-24 2021-12-21 腾讯科技(深圳)有限公司 Model training method and related device
CN113822254B (en) * 2021-11-24 2022-02-25 腾讯科技(深圳)有限公司 Model training method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20180713