CN106599860A - Human face detection method and device - Google Patents


Info

Publication number
CN106599860A
CN106599860A (application CN201611185435.3A; granted publication CN106599860B)
Authority
CN
China
Prior art keywords: image, characteristic information, region, target, area image
Prior art date
Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Application number: CN201611185435.3A
Other languages: Chinese (zh)
Other versions: CN106599860B (en)
Inventor: 万韶华 (Wan Shaohua)
Current assignee: Beijing Xiaomi Mobile Software Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Original assignee: Beijing Xiaomi Mobile Software Co Ltd
Priority date (an assumption, not a legal conclusion) · Filing date · Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd
Priority to CN201611185435.3A
Publication of CN106599860A
Application granted
Publication of CN106599860B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168: Feature extraction; face representation
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/26: Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06V10/267: Segmentation of patterns by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06V10/40: Extraction of image or video features
    • G06V10/46: Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; salient regional features
    • G06V10/462: Salient features, e.g. scale invariant feature transforms [SIFT]

Abstract

The invention provides a face detection method and device, belonging to the field of image processing. The method comprises: performing convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image; dividing the intermediate image into a preset number of equal-size image blocks; performing region image feature extraction on each image block based on at least one candidate frame of preset size, to obtain at least one piece of region image feature information for each image block, wherein the height of each preset-size candidate frame is greater than or equal to its width; and determining the face region contained in the target image according to the at least one piece of region image feature information of each image block. The method can improve the processing speed of face detection.

Description

Face detection method and apparatus
Technical field
The present disclosure relates to the field of image processing, and in particular to a face detection method and apparatus.
Background
Face detection refers to the technology of locating the faces in an image according to facial feature information. The algorithm model typically used for face detection is Faster R-CNN (Faster Region-based Convolutional Neural Network), and its concrete process is as follows:
The image to be recognized is input into the Faster R-CNN. After convolution processing and pooling processing, an intermediate image of N*N pixels is obtained, and the intermediate image is then divided into a preset number of equal-size image blocks. For each image block, region image feature extraction is performed using nine kinds of candidate frames, with height-to-width ratios of 1:2, 1:1 and 2:1 and areas of 128², 256² and 512², to obtain the region image feature information contained in each image block. Full-connection processing is then performed on the obtained region image feature information to determine the face region contained in the image to be recognized.
In this way, the number of candidate frames used during region image feature extraction is relatively large, so a large amount of region image feature information is extracted. Since full-connection processing must be performed for each piece of region image feature information, the amount of computation is large, and the processing speed of face detection is consequently slow.
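For illustration only (not part of the patent text), the nine candidate frames described above can be enumerated from the three height-to-width ratios and three areas; the function name and the rounding to whole pixels are assumptions:

```python
import math

def make_anchors(ratios=(0.5, 1.0, 2.0), areas=(128**2, 256**2, 512**2)):
    """Enumerate (width, height) pairs for every ratio/area combination.

    `ratio` is height/width, so width = sqrt(area / ratio) and height = width * ratio.
    """
    anchors = []
    for area in areas:
        for ratio in ratios:
            w = math.sqrt(area / ratio)
            anchors.append((round(w), round(w * ratio)))
    return anchors

print(len(make_anchors()))  # 9 candidate frames per image block
```

With three ratios and three areas this yields the nine frames the background section refers to; keeping only ratios with height greater than or equal to width, as the disclosure later proposes, would drop the 1:2 entries and leave six.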
Summary of the invention
To overcome the problems in the related art, the present disclosure provides a face detection method and apparatus. The technical solution is as follows:
According to a first aspect of the embodiments of the present disclosure, a face detection method is provided, the method comprising:
performing convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image;
dividing the intermediate image into a preset number of equal-size image blocks;
performing region image feature extraction in each image block based on candidate frames of at least one preset size, to obtain at least one piece of region image feature information for each image block, wherein the height of the candidate frame of each preset size is greater than or equal to its width; and
determining the face region contained in the target image according to the at least one piece of region image feature information of each image block.
Optionally, the height-to-width ratios of the candidate frames of preset size are 1:1 and 2:1.
Optionally, determining the face region contained in the target image according to the at least one piece of region image feature information of each image block comprises:
performing full-connection processing on the at least one piece of region image feature information of each image block based on the full-connection parameters W1 = U1 Σm1 V1^T and W2 = U2 Σm2 V2^T, to determine the face region contained in the target image.
In this way, the processing speed of face detection can be improved considerably.
Optionally, U1 and Σm1 V1^T are obtained by performing singular value decomposition (SVD) on the full-connection parameter W1', and
U2 and Σm2 V2^T are obtained by performing SVD on the full-connection parameter W2'.
Optionally, performing full-connection processing on the at least one piece of region image feature information of each image block based on the full-connection parameters W1 = U1 Σm1 V1^T and W2 = U2 Σm2 V2^T, to determine the face region contained in the target image, comprises:
performing full-connection processing on the at least one piece of region image feature information of each image block based on the full-connection parameter W1 = U1 Σm1 V1^T, to obtain the target region image feature information that is confirmed as face image feature information;
performing full-connection processing on the target region image feature information based on the full-connection parameter W2 = U2 Σm2 V2^T, to obtain the image region corresponding to each piece of target region image feature information, together with the position adjustment information and scaling adjustment information of each image region; and
adjusting each image region based on its position adjustment information and scaling adjustment information, and determining the adjusted image regions as the face region contained in the target image.
In this way, the processing speed of face detection can be improved considerably.
According to a second aspect of the embodiments of the present disclosure, a face detection device is provided, the device comprising:
a processing module, configured to perform convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image;
a segmentation module, configured to divide the intermediate image into a preset number of equal-size image blocks;
an extraction module, configured to perform region image feature extraction in each image block based on candidate frames of at least one preset size, to obtain at least one piece of region image feature information for each image block, wherein the height of the candidate frame of each preset size is greater than or equal to its width; and
a determining module, configured to determine the face region contained in the target image according to the at least one piece of region image feature information of each image block.
Optionally, the height-to-width ratios of the candidate frames of preset size are 1:1 and 2:1.
Optionally, the determining module is configured to:
perform full-connection processing on the at least one piece of region image feature information of each image block based on the full-connection parameters W1 = U1 Σm1 V1^T and W2 = U2 Σm2 V2^T, to determine the face region contained in the target image.
Optionally, U1 and Σm1 V1^T are obtained by performing singular value decomposition (SVD) on the full-connection parameter W1', and
U2 and Σm2 V2^T are obtained by performing SVD on the full-connection parameter W2'.
Optionally, the determining module comprises a first processing submodule, a second processing submodule and a determination submodule, wherein:
the first processing submodule is configured to perform full-connection processing on the at least one piece of region image feature information of each image block based on the full-connection parameter W1 = U1 Σm1 V1^T, to obtain the target region image feature information that is confirmed as face image feature information;
the second processing submodule is configured to perform full-connection processing on the target region image feature information based on the full-connection parameter W2 = U2 Σm2 V2^T, to obtain the image region corresponding to each piece of target region image feature information, together with the position adjustment information and scaling adjustment information of each image region; and
the determination submodule is configured to adjust each image region based on its position adjustment information and scaling adjustment information, and to determine the adjusted image regions as the face region contained in the target image.
According to a third aspect of the embodiments of the present disclosure, a face detection device is provided, the device comprising:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
perform convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image;
divide the intermediate image into a preset number of equal-size image blocks;
perform region image feature extraction in each image block based on candidate frames of at least one preset size, to obtain at least one piece of region image feature information for each image block, wherein the height of the candidate frame of each preset size is greater than or equal to its width; and
determine the face region contained in the target image according to the at least one piece of region image feature information of each image block.
The technical solutions provided by the embodiments of the present disclosure may include the following beneficial effects:
In the embodiments of the present disclosure, a target image to be recognized is input into a Faster R-CNN. The Faster R-CNN performs convolution processing a first preset number of times and pooling processing a second preset number of times on the target image to obtain an intermediate image, divides the intermediate image into a preset number of equal-size image blocks, and, in each image block, performs region image feature extraction based on candidate frames of at least one preset size to obtain at least one piece of region image feature information for each image block, wherein the height of the candidate frame of each preset size is greater than or equal to its width. The face region contained in the target image is then determined according to the at least one piece of region image feature information of each image block. In this way, only candidate frames whose height is greater than or equal to their width are retained during region image feature extraction, so the number of candidate frames is reduced, less region image feature information is extracted, and the amount of computation in the subsequent full-connection processing is also reduced, thereby improving the processing speed of face detection.
It should be understood that the foregoing general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Brief description of the drawings
The accompanying drawings herein are incorporated into and constitute a part of this specification; they illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure. In the drawings:
Fig. 1 is a flow chart of a face detection method according to an exemplary embodiment;
Fig. 2 is a system diagram of face detection according to an exemplary embodiment;
Fig. 3 is a schematic diagram of image feature extraction according to an exemplary embodiment;
Fig. 4 is a schematic diagram of a face detection device according to an exemplary embodiment;
Fig. 5 is a schematic diagram of a face detection device according to an exemplary embodiment;
Fig. 6 is a structural schematic diagram of a server according to an exemplary embodiment.
The above drawings show specific embodiments of the present disclosure, which are described in more detail hereinafter. These drawings and the accompanying text are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the concepts of the disclosure to those skilled in the art by reference to specific embodiments.
Detailed description
Exemplary embodiments are described in detail here, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the disclosure as detailed in the appended claims.
The embodiment of the present disclosure provides a face detection method, which may be executed by a server; the server may be the background server of a face detection application. The server may be provided with a processor, a memory and the like: the processor may be used for the processing involved in face detection, and the memory may be used to store the data needed and generated during that processing.
As shown in Fig. 1, the processing flow of the method may include the following steps.
In step 101, convolution processing is performed a first preset number of times, and pooling processing a second preset number of times, on the target image to be recognized, to obtain an intermediate image.
The first preset number and the second preset number are set in advance by technicians; for example, the first preset number may be 4 and the second preset number 3. The two numbers may be equal or unequal.
In implementation, when a user wants to perform face detection on an image, the user may install a face detection application on a terminal and then open it. The face detection application provides a face detection option, a check-for-updates option and other options. When the user clicks the face detection option, the terminal receives the click instruction for that option and displays an image input box. The user may select the image to be recognized (hereinafter referred to as the target image) and click the confirm button; the terminal then receives the click instruction of the confirm button and sends the target image to the server. As shown in Fig. 2, after receiving the target image, the server may obtain the convolution kernel of the first convolutional layer and perform convolution processing on the target image to obtain the output of the first convolutional layer, use that output as the input of the first pooling layer and perform pooling processing, and so on, until convolution processing has been performed the first preset number of times and pooling processing the second preset number of times, and then output the intermediate image.
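A minimal sketch of the convolution-then-pooling pipeline of step 101, using plain NumPy on a single-channel image; the kernel, the image size and the number of rounds are illustrative assumptions, not values from the patent:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid-mode 2-D convolution (cross-correlation, as in CNNs)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(img, size=2):
    """Non-overlapping max pooling; trailing rows/columns that do not fit are dropped."""
    h, w = (img.shape[0] // size) * size, (img.shape[1] // size) * size
    return img[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

image = np.random.rand(64, 64)   # stands in for the target image
kernel = np.ones((3, 3)) / 9.0   # a trivial averaging kernel for illustration
x = image
for _ in range(2):               # alternate convolution and pooling, as in Fig. 2
    x = conv2d(x, kernel)
    x = max_pool(x)
print(x.shape)  # (14, 14)
```

The shrinking output stands in for the N*N intermediate image; a real Faster R-CNN uses many learned multi-channel kernels rather than one averaging kernel.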
In step 102, the intermediate image is divided into a preset number of equal-size image blocks.
In implementation, after obtaining the intermediate image, the server may obtain the preset number, for example 10, and divide the intermediate image into that many equal-size image blocks. For example, if the intermediate image is 1000*1000 pixels and the preset number is 1000, the 1000*1000 image may be divided into 1000 image blocks.
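The equal-size division of step 102 can be sketched as follows; the grid-based split and the toy sizes are assumptions for illustration:

```python
import numpy as np

def split_into_blocks(img, grid):
    """Split an H x W image into grid*grid equal-size blocks (H, W divisible by grid)."""
    h, w = img.shape
    bh, bw = h // grid, w // grid
    return [img[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
            for r in range(grid) for c in range(grid)]

intermediate = np.arange(36).reshape(6, 6)   # a toy 6x6 intermediate image
blocks = split_into_blocks(intermediate, 3)
print(len(blocks), blocks[0].shape)  # 9 (2, 2)
```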
In step 103, region image feature extraction is performed in each image block based on candidate frames of at least one preset size, to obtain at least one piece of region image feature information for each image block, wherein the height of the candidate frame of each preset size is greater than or equal to its width.
In implementation, as shown in Fig. 3, the server may obtain the preset candidate frames of at least one preset size, the height of each such candidate frame being greater than or equal to its width: for example, six kinds of candidate frames with height-to-width ratios of 1:1 and 2:1 and areas of 128², 256² and 512². The server then determines the center point of each image block, treats that center point as a reference point, places the at least one preset candidate frame in the image block around the reference point, and extracts the region image features inside the candidate frames, obtaining at least one piece of region image feature information for each image block. That is, when region image features are extracted in an image block, the center of the image region to which each piece of region image feature information belongs coincides with the reference point of the image block. Region image feature information may be represented by a vector.
It should be noted here that, because a face is usually taller than it is wide, using candidate frames whose height is not less than their width will not miss the faces contained in an image.
Optionally, the height-to-width ratios of the candidate frames of preset size mentioned in step 103 may be 1:1 and 2:1.
In implementation, the height of a candidate frame of preset size is greater than or equal to its width, and the height-to-width ratio may be 1:1, 2:1, and so on.
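A sketch of this reduced candidate-frame set, assuming the 1:1 and 2:1 ratios and the 128², 256² and 512² areas mentioned above, centred on a block's reference point; the function name and the (x1, y1, x2, y2) box format are hypothetical:

```python
import math

def reduced_anchors(cx, cy, ratios=(1.0, 2.0), areas=(128**2, 256**2, 512**2)):
    """Candidate frames with height >= width, centred on a block's reference point.

    `ratio` is height/width; returns (x1, y1, x2, y2) corner coordinates.
    """
    boxes = []
    for area in areas:
        for ratio in ratios:
            w = math.sqrt(area / ratio)
            h = w * ratio
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes

boxes = reduced_anchors(0, 0)
print(len(boxes))  # 6 candidate frames instead of the original 9
```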
In step 104, the face region contained in the target image is determined according to the at least one piece of region image feature information of each image block.
In implementation, after extracting the at least one piece of region image feature information of each image block, the server may input the extracted region image feature information into the full-connection processing layer and perform full-connection processing, obtaining the face region contained in the target image.
Optionally, the server may perform full-connection processing on the region image feature information to determine the face region contained in the target image. The corresponding process of step 104 may be as follows:
based on the full-connection parameters W1 = U1 Σm1 V1^T and W2 = U2 Σm2 V2^T, full-connection processing is performed on the at least one piece of region image feature information of each image block, and the face region contained in the target image is determined.
In implementation, as shown in Fig. 2, the extracted region image feature information may be represented by vectors. The server may input the extracted region image feature information into the full-connection processing layer and perform full-connection processing based on the full-connection parameters W1 = U1 Σm1 V1^T and W2 = U2 Σm2 V2^T of the fully-connected layers, obtaining the face region contained in the target image.
Optionally, U1 and Σm1 V1^T are obtained by performing singular value decomposition (SVD) on the full-connection parameter W1', and
U2 and Σm2 V2^T are obtained by performing SVD on the full-connection parameter W2'.
In implementation, the full-connection parameters W1' and W2' are the full-connection parameters used for object detection in Faster R-CNN, and they can be represented by matrices. Suppose the size of matrix W1' is u1×v1, meaning that W1' contains u1·v1 elements. Performing SVD on W1' yields U and Σm V^T, where U is a u1×u1 matrix, Σm is a u1×v1 positive semi-definite diagonal matrix, and V^T is a v1×v1 matrix. The diagonal elements of Σm are the singular values of W1', arranged in descending order, and they decrease very quickly: the sum of the first 10% or even 1% of the singular values exceeds 99% of the sum of all singular values. Therefore, the dimensions u1 and v1 of Σm can be reduced to m1, and W1' can be approximated as W1 = U1 Σm1 V1^T, where U1 is u1×m1, Σm1 is m1×m1 and V1^T is m1×v1. The number of elements in the factors is then m1·(u1+v1) instead of the u1·v1 elements of W1'. Since m1 is much smaller than either u1 or v1 (that is, m1 << (u1, v1)), m1·(u1+v1) is much smaller than u1·v1, so fewer multiplications are needed during full-connection processing, and the speed of full-connection processing is improved.
Similarly, U2 and Σm2 V2^T are obtained by performing SVD on W2'; the specific process is the same as the SVD of W1' and is not repeated here.
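The parameter saving from the truncated SVD can be illustrated with NumPy; the matrix sizes and the rank m1 are illustrative assumptions, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
u, v, m1 = 512, 512, 32                  # illustrative sizes; m1 is the truncation rank
W1_full = rng.standard_normal((u, v))    # stands in for the trained parameter W1'

# SVD, then keep only the m1 largest singular values: W1' ≈ U1 · Σm1 · V1^T.
U, s, Vt = np.linalg.svd(W1_full, full_matrices=False)
U1, S1, Vt1 = U[:, :m1], np.diag(s[:m1]), Vt[:m1, :]

x = rng.standard_normal(v)               # a region image feature vector
y_approx = U1 @ (S1 @ (Vt1 @ x))         # two small multiplications instead of one large one

params_full = u * v                      # elements of W1'
params_trunc = m1 * (u + v)              # elements of the truncated factors
print(params_full, params_trunc)  # 262144 32768
```

NumPy returns the singular values already sorted in descending order, which is exactly the ordering the truncation relies on.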
Optionally, four full-connection multiplications (two in the classification layer and two in the regression layer) may be performed to obtain the face region contained in the target image. The corresponding process of step 104 may be as follows:
Based on the full-connection parameter W1 = U1 Σm1 V1^T, full-connection processing is performed on the at least one piece of region image feature information of each image block, to obtain the target region image feature information that is confirmed as face image feature information. Based on the full-connection parameter W2 = U2 Σm2 V2^T, full-connection processing is performed on the target region image feature information, to obtain the image region corresponding to each piece of target region image feature information, together with the position adjustment information and scaling adjustment information of each image region. Based on the position adjustment information and scaling adjustment information of each image region, each image region is adjusted, and the adjusted image regions are determined as the face region contained in the target image.
In implementation, after the server extracts the region image feature information of each image block, the region image feature information may be input into the classification fully-connected layer, which contains the two full-connection parameters U1 and Σm1 V1^T. Each piece of region image feature information (represented as a vector) is multiplied by the parameter Σm1 V1^T of the classification layer, and the resulting product is multiplied by U1, yielding the probability that each piece of region image feature information belongs to the background and the probability that it belongs to a specific object. Among the at least one piece of region image feature information of each image block, those whose probability of being a specific object exceeds a preset value are determined as target region image feature information. The preset value is the minimum probability at which image feature information is considered face image feature information, for example 0.7.
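A sketch, under assumed shapes, of this classification step: each feature vector is multiplied first by Σm1 V1^T and then by U1, and candidates whose face probability exceeds the threshold are kept. All weights here are random placeholders, not trained parameters:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(1)
d, m1, n_classes = 64, 8, 2                 # feature dim, rank, {background, face}
Sm1_Vt1 = rng.standard_normal((m1, d))      # the pre-multiplied product Σm1 · V1^T
U1 = rng.standard_normal((n_classes, m1))
threshold = 0.7                             # minimum face probability, as in the text

features = [rng.standard_normal(d) for _ in range(20)]  # one vector per candidate frame
targets = [f for f in features
           if softmax(U1 @ (Sm1_Vt1 @ f))[1] > threshold]  # class 1 = specific object
print(len(targets) <= len(features))  # True
```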
After the server determines the target region image feature information, it may be input into the regression fully-connected layer, which contains the two full-connection parameters U2 and Σm2 V2^T. The target region image feature information (represented as a vector) is multiplied by the parameter Σm2 V2^T of the regression layer, and the resulting product is multiplied by U2, yielding the image region corresponding to each piece of target region image feature information, together with the position adjustment information and scaling adjustment information of each image region. The position of an image region is generally represented by the coordinates of its upper-left corner. The server then adjusts the position of each image region according to its position adjustment information and adjusts its size according to its scaling adjustment information, obtaining the adjusted image regions, which are determined as the regions of the target image that contain faces.
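The final adjustment can be sketched as follows; the region format (upper-left corner plus width and height) matches the text, but the additive position delta and multiplicative scaling are assumptions about how the adjustment information is applied:

```python
def adjust_region(region, position_delta, scale):
    """Shift a region's upper-left corner, then rescale its width and height.

    region = (x, y, w, h); position_delta = (dx, dy); scale = (sx, sy).
    """
    x, y, w, h = region
    dx, dy = position_delta
    sx, sy = scale
    return (x + dx, y + dy, w * sx, h * sy)

# A candidate region nudged right and up, then shrunk toward the actual face.
face = adjust_region((100, 120, 80, 160), position_delta=(4, -6), scale=(0.9, 0.95))
print(face)  # (104, 114, 72.0, 152.0)
```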
Alternatively, after extracting the region image feature information of each image block, the server may input it into both the classification fully-connected layer and the regression fully-connected layer. In the classification fully-connected layer, each piece of region image feature information (a vector) is multiplied by Σm1 V1^T and then by U1, yielding the probability that each piece belongs to the background and the probability that it belongs to a specific object. In the regression fully-connected layer, each piece of region image feature information is multiplied by Σm2 V2^T and then by U2, yielding the image region corresponding to each piece, together with its position adjustment information and scaling adjustment information. Combining the outputs of the two layers, the server adjusts the position of each image region according to its position adjustment information and its size according to its scaling adjustment information, obtaining the adjusted image regions. The image regions whose image feature information has a specific-object probability greater than the preset value are identified, and the corresponding adjusted image regions are determined as the face regions of the target image. The preset value is the minimum probability at which image feature information is considered face image feature information, for example 0.7.
In the embodiments of the present disclosure, a target image to be recognized is input into a Faster R-CNN. The Faster R-CNN performs convolution processing a first preset number of times and pooling processing a second preset number of times on the target image to obtain an intermediate image, divides the intermediate image into a preset number of equal-size image blocks, and, in each image block, performs region image feature extraction based on candidate frames of at least one preset size to obtain at least one piece of region image feature information for each image block, wherein the height of the candidate frame of each preset size is greater than or equal to its width. The face region contained in the target image is then determined according to the at least one piece of region image feature information of each image block. In this way, only candidate frames whose height is greater than or equal to their width are retained during region image feature extraction, so the number of candidate frames is reduced, less region image feature information is extracted, and the amount of computation in the subsequent full-connection processing is also reduced, thereby improving the processing speed of face detection.
Based on the same technical concept, an exemplary embodiment of the present disclosure provides a face detection device. As shown in Fig. 4, the device comprises:
a processing module 410, configured to perform convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image;
a segmentation module 420, configured to divide the intermediate image into a preset number of equal-size image blocks;
an extraction module 430, configured to perform region image feature extraction in each image block based on candidate frames of at least one preset size, to obtain at least one piece of region image feature information for each image block, wherein the height of the candidate frame of each preset size is greater than or equal to its width; and
a determining module 440, configured to determine the face region contained in the target image according to the at least one piece of region image feature information of each image block.
Optionally, the height-to-width ratios of the candidate frames of preset size are 1:1 and 2:1.
Optionally, the determining module 440 is configured to:
perform full-connection processing on the at least one piece of region image feature information of each image block based on the full-connection parameters W1 = U1 Σm1 V1^T and W2 = U2 Σm2 V2^T, to determine the face region contained in the target image.
Optionally, U1 and Σm1V1^T are obtained by performing singular value decomposition (SVD) on a fully connected parameter W1′; and
U2 and Σm2V2^T are obtained by performing SVD on a fully connected parameter W2′.
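The SVD factorization of a fully connected parameter can be sketched as below, assuming NumPy; the truncation rank m corresponds to the m1/m2 above, whose values the patent does not specify:

```python
import numpy as np

def compress_fc(W, m):
    """Approximate a fully connected weight matrix W by its rank-m
    truncated SVD, W ~= U_m @ (S_m V_m^T), as described above.

    A single product x @ W costs about u*v multiplications for a
    u x v matrix; the two smaller products x @ U_m and then
    @ (S_m V_m^T) cost about m*(u + v), which is cheaper whenever
    m < u*v / (u + v).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_m = U[:, :m]                      # u x m factor
    SVt_m = np.diag(s[:m]) @ Vt[:m, :]  # m x v factor (Sigma_m V_m^T)
    return U_m, SVt_m

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))
U_m, SVt_m = compress_fc(W, m=6)  # full rank: reconstruction is exact
assert np.allclose(U_m @ SVt_m, W)
```

With m smaller than the full rank, the product U_m @ SVt_m is only an approximation of W, trading a small accuracy loss for the reduced computation described in this embodiment.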
Optionally, as shown in Fig. 5, the determining module 440 includes a first processing submodule 441, a second processing submodule 442, and a determining submodule 443, where:
the first processing submodule 441 is configured to perform fully connected processing on the at least one piece of region image feature information of each image block based on the fully connected parameter W1 = U1Σm1V1^T, to obtain target region image feature information confirmed as facial image feature information;
the second processing submodule 442 is configured to perform fully connected processing on the target region image feature information based on the fully connected parameter W2 = U2Σm2V2^T, to obtain the image region corresponding to each piece of target region image feature information, together with position adjustment information and scaling adjustment information of each image region; and
the determining submodule 443 is configured to adjust each image region based on its position adjustment information and scaling adjustment information, and to determine the adjusted image regions as the face region contained in the target image.
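The adjustment performed by the determining submodule can be illustrated as follows. The (centre-x, centre-y, width, height) box representation and the additive-shift/multiplicative-scale form are assumptions for illustration; the patent does not specify the exact encoding of the position and scaling adjustment information:

```python
def adjust_region(box, dx, dy, sw, sh):
    """Apply position adjustment (dx, dy) and scaling adjustment
    (sw, sh) to an image region given as (cx, cy, w, h): the centre
    is shifted, and the width and height are rescaled about it.
    """
    cx, cy, w, h = box
    return (cx + dx, cy + dy, w * sw, h * sh)

# Shift a 20x40 region right by 3 and down by 2, then halve its size.
print(adjust_region((50, 60, 20, 40), 3, 2, 0.5, 0.5))
# (53, 62, 10.0, 20.0)
```

The adjusted regions are then taken as the face regions contained in the target image.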
With regard to the apparatus in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiment of the related method, and will not be elaborated here.
As in the method embodiment, only candidate frames whose height is greater than or equal to their width are retained during region image feature extraction, which reduces the number of candidate frames, the extracted region image feature information, and the computation of the fully connected processing, thereby improving the processing speed of face detection.
It should be noted that, when the face detection apparatus provided in the above embodiment performs face detection, the division into the above functional modules is merely illustrative. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the face detection apparatus provided in the above embodiment belongs to the same concept as the face detection method embodiments; for its specific implementation process, refer to the method embodiments, which will not be repeated here.
Another exemplary embodiment of the present disclosure provides a schematic structural diagram of a server. Referring to Fig. 6, a server 600 includes a processing component 622, which further includes one or more processors, and memory resources represented by a memory 632 for storing instructions executable by the processing component 622, such as an application program. The application program stored in the memory 632 may include one or more modules, each corresponding to a set of instructions. Further, the processing component 622 is configured to execute the instructions to perform the above-described face detection method.
The server 600 may further include a power supply component 626 configured to perform power management of the server 600, a wired or wireless network interface 660 configured to connect the server 600 to a network, and an input/output (I/O) interface 668. The server 600 may operate based on an operating system stored in the memory 632, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
The server 600 may include a memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the one or more processors, and include instructions for performing the following operations:
performing convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image;
dividing the intermediate image into a preset number of image blocks of equal size;
performing, in each image block, region image feature extraction based on candidate frames of at least one preset size, to obtain at least one piece of region image feature information of each image block, where the height of the candidate frame of each preset size is greater than or equal to its width; and
determining, according to the at least one piece of region image feature information of each image block, the face region contained in the target image.
Optionally, the height-to-width ratios of the candidate frames of the preset sizes are 1:1 and 2:1.
Optionally, the determining, according to the at least one piece of region image feature information of each image block, the face region contained in the target image includes:
performing fully connected processing on the at least one piece of region image feature information of each image block based on a fully connected parameter W1 = U1Σm1V1^T and a fully connected parameter W2 = U2Σm2V2^T, to determine the face region contained in the target image.
Optionally, U1 and Σm1V1^T are obtained by performing singular value decomposition (SVD) on a fully connected parameter W1′; and
U2 and Σm2V2^T are obtained by performing SVD on a fully connected parameter W2′.
Optionally, the performing fully connected processing on the at least one piece of region image feature information of each image block based on the fully connected parameter W1 = U1Σm1V1^T and the fully connected parameter W2 = U2Σm2V2^T, to determine the face region contained in the target image, includes:
performing fully connected processing on the at least one piece of region image feature information of each image block based on the fully connected parameter W1 = U1Σm1V1^T, to obtain target region image feature information confirmed as facial image feature information;
performing fully connected processing on the target region image feature information based on the fully connected parameter W2 = U2Σm2V2^T, to obtain the image region corresponding to each piece of target region image feature information, together with position adjustment information and scaling adjustment information of each image region; and
adjusting each image region based on its position adjustment information and scaling adjustment information, and determining the adjusted image regions as the face region contained in the target image.
As in the foregoing embodiments, only candidate frames whose height is greater than or equal to their width are retained during region image feature extraction, which reduces the number of candidate frames, the extracted region image feature information, and the computation of the fully connected processing, thereby improving the processing speed of face detection.
After considering the specification and practicing the disclosure disclosed herein, those skilled in the art will readily conceive of other embodiments of the present disclosure. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be considered exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (11)

1. A face detection method, characterized in that the method comprises:
performing convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image;
dividing the intermediate image into a preset number of image blocks of equal size;
performing, in each image block, region image feature extraction based on candidate frames of at least one preset size, to obtain at least one piece of region image feature information of each image block, wherein the height of the candidate frame of each preset size is greater than or equal to its width; and
determining, according to the at least one piece of region image feature information of each image block, the face region contained in the target image.
2. The method according to claim 1, characterized in that the height-to-width ratios of the candidate frames of the preset sizes are 1:1 and 2:1.
3. The method according to claim 1, characterized in that the determining, according to the at least one piece of region image feature information of each image block, the face region contained in the target image comprises:
performing fully connected processing on the at least one piece of region image feature information of each image block based on a fully connected parameter W1 = U1Σm1V1^T and a fully connected parameter W2 = U2Σm2V2^T, to determine the face region contained in the target image.
4. The method according to claim 3, characterized in that U1 and Σm1V1^T are obtained by performing singular value decomposition (SVD) on a fully connected parameter W1′; and
U2 and Σm2V2^T are obtained by performing SVD on a fully connected parameter W2′.
5. The method according to claim 3, characterized in that the performing fully connected processing on the at least one piece of region image feature information of each image block based on the fully connected parameter W1 = U1Σm1V1^T and the fully connected parameter W2 = U2Σm2V2^T, to determine the face region contained in the target image, comprises:
performing fully connected processing on the at least one piece of region image feature information of each image block based on the fully connected parameter W1 = U1Σm1V1^T, to obtain target region image feature information confirmed as facial image feature information;
performing fully connected processing on the target region image feature information based on the fully connected parameter W2 = U2Σm2V2^T, to obtain the image region corresponding to each piece of target region image feature information, together with position adjustment information and scaling adjustment information of each image region; and
adjusting each image region based on its position adjustment information and scaling adjustment information, and determining the adjusted image regions as the face region contained in the target image.
6. A face detection apparatus, characterized in that the apparatus comprises:
a processing module, configured to perform convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image;
a segmentation module, configured to divide the intermediate image into a preset number of image blocks of equal size;
an extraction module, configured to perform, in each image block, region image feature extraction based on candidate frames of at least one preset size, to obtain at least one piece of region image feature information of each image block, wherein the height of the candidate frame of each preset size is greater than or equal to its width; and
a determining module, configured to determine, according to the at least one piece of region image feature information of each image block, the face region contained in the target image.
7. The apparatus according to claim 6, characterized in that the height-to-width ratios of the candidate frames of the preset sizes are 1:1 and 2:1.
8. The apparatus according to claim 6, characterized in that the determining module is configured to:
perform fully connected processing on the at least one piece of region image feature information of each image block based on a fully connected parameter W1 = U1Σm1V1^T and a fully connected parameter W2 = U2Σm2V2^T, to determine the face region contained in the target image.
9. The apparatus according to claim 8, characterized in that U1 and Σm1V1^T are obtained by performing singular value decomposition (SVD) on a fully connected parameter W1′; and
U2 and Σm2V2^T are obtained by performing SVD on a fully connected parameter W2′.
10. The apparatus according to claim 8, characterized in that the determining module comprises a first processing submodule, a second processing submodule, and a determining submodule, wherein:
the first processing submodule is configured to perform fully connected processing on the at least one piece of region image feature information of each image block based on the fully connected parameter W1 = U1Σm1V1^T, to obtain target region image feature information confirmed as facial image feature information;
the second processing submodule is configured to perform fully connected processing on the target region image feature information based on the fully connected parameter W2 = U2Σm2V2^T, to obtain the image region corresponding to each piece of target region image feature information, together with position adjustment information and scaling adjustment information of each image region; and
the determining submodule is configured to adjust each image region based on its position adjustment information and scaling adjustment information, and to determine the adjusted image regions as the face region contained in the target image.
11. A face detection apparatus, characterized in that the apparatus comprises:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
perform convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image;
divide the intermediate image into a preset number of image blocks of equal size;
perform, in each image block, region image feature extraction based on candidate frames of at least one preset size, to obtain at least one piece of region image feature information of each image block, wherein the height of the candidate frame of each preset size is greater than or equal to its width; and
determine, according to the at least one piece of region image feature information of each image block, the face region contained in the target image.
CN201611185435.3A 2016-12-20 2016-12-20 Human face detection method and device Active CN106599860B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611185435.3A CN106599860B (en) 2016-12-20 2016-12-20 Human face detection method and device

Publications (2)

Publication Number Publication Date
CN106599860A true CN106599860A (en) 2017-04-26
CN106599860B CN106599860B (en) 2019-11-26

Family

ID=58600344

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611185435.3A Active CN106599860B (en) Human face detection method and device

Country Status (1)

Country Link
CN (1) CN106599860B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105243395A (en) * 2015-11-04 2016-01-13 东方网力科技股份有限公司 Human body image comparison method and device
CN105488468A (en) * 2015-11-26 2016-04-13 浙江宇视科技有限公司 Method and device for positioning target area

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LILIANG ZHANG et al.: "Is Faster R-CNN Doing Well for Pedestrian Detection?", European Conference on Computer Vision *
ROSS GIRSHICK: "Fast R-CNN", The IEEE International Conference on Computer Vision (ICCV), 2015 *
SHAOQING REN et al.: "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", IEEE Transactions on Pattern Analysis and Machine Intelligence *

Similar Documents

Publication Publication Date Title
CN108205655B (en) Key point prediction method and device, electronic equipment and storage medium
US10713532B2 (en) Image recognition method and apparatus
US9396560B2 (en) Image-based color palette generation
US10283162B2 (en) Method for triggering events in a video
US10318797B2 (en) Image processing apparatus and image processing method
WO2018153294A1 (en) Face tracking method, storage medium, and terminal device
JP2020515983A (en) Target person search method and device, device, program product and medium
CN110110118A (en) Dressing recommended method, device, storage medium and mobile terminal
CN111383232B (en) Matting method, matting device, terminal equipment and computer readable storage medium
CN110232318A (en) Acupuncture point recognition methods, device, electronic equipment and storage medium
Zhang et al. High-quality face image generation based on generative adversarial networks
CN111428671A (en) Face structured information identification method, system, device and storage medium
CN111860484B (en) Region labeling method, device, equipment and storage medium
CN111127309A (en) Portrait style transfer model training method, portrait style transfer method and device
CN111199169A (en) Image processing method and device
US20210166058A1 (en) Image generation method and computing device
CN113610864B (en) Image processing method, device, electronic equipment and computer readable storage medium
Sun et al. Location dependent Dirichlet processes
CN106599860A (en) Human face detection method and device
CN106469437B (en) Image processing method and image processing apparatus
CN112150486A (en) Image processing method and device
CN116030466B (en) Image text information identification and processing method and device and computer equipment
CN116485833A (en) Image segmentation method and device
CN117874272A (en) Image classification method, apparatus, electronic device, and readable storage medium
Tang et al. MRP-Net: A Light Multiple Region Perception Neural Network for Multi-label AU Detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant