CN106599860A - Human face detection method and device - Google Patents
- Publication number
- CN106599860A CN106599860A CN201611185435.3A CN201611185435A CN106599860A CN 106599860 A CN106599860 A CN 106599860A CN 201611185435 A CN201611185435 A CN 201611185435A CN 106599860 A CN106599860 A CN 106599860A
- Authority
- CN
- China
- Prior art keywords
- image
- characteristic information
- region
- target
- area image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
Abstract
The invention provides a face detection method and device, belonging to the field of image processing. The method comprises: performing convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image; dividing the intermediate image into a preset number of equally sized image blocks; performing region image feature extraction on each image block based on candidate boxes of at least one preset size, to obtain at least one piece of region image feature information for each image block, wherein the height of the candidate box of each preset size is greater than or equal to its width; and determining the face region contained in the target image according to the at least one piece of region image feature information of each image block. The method provided by the invention can improve the processing speed of face detection.
Description
Technical field
The present disclosure relates to the field of image processing, and in particular to a face detection method and apparatus.
Background technology
Face detection technology refers to technology for locating the faces in an image according to facial feature information. The algorithm model typically used for face detection is Faster-RCNN (Faster Region-based Convolutional Neural Network). Its processing is as follows:

The image to be recognized is input into the Faster-RCNN. After convolution processing and pooling processing, an intermediate image of N*N pixels is obtained, and the intermediate image is then divided into a preset number of equally sized image blocks. For each image block, region image feature extraction is performed within the block using 9 kinds of candidate boxes, with height-to-width ratios of 1:2, 1:1 and 2:1 and areas of 128², 256² and 512², to obtain the region image feature information contained in each image block. Full connection processing is then performed on the extracted region image feature information to determine the face region contained in the image to be recognized.

In this way, when region image feature extraction is performed, the number of candidate boxes is relatively large, so the amount of extracted region image feature information is also large. Since full connection processing must be performed for each piece of region image feature information, the amount of calculation is relatively large, and the processing speed of face detection is therefore slow.
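The candidate-box arithmetic above can be checked with a short sketch. The counts (3 ratios × 3 areas = 9 boxes per block in the conventional configuration, 6 after dropping boxes wider than they are tall) come from the text; the code itself is only an illustration.

```python
# The conventional Faster-RCNN configuration described above uses 3
# height-to-width ratios and 3 areas, i.e. 9 candidate boxes per image
# block; keeping only boxes whose height >= width leaves 2 x 3 = 6.
ratios_all = [(1, 2), (1, 1), (2, 1)]          # (height, width) proportions
ratios_tall = [(h, w) for h, w in ratios_all if h >= w]
areas = [128**2, 256**2, 512**2]

boxes_all = len(ratios_all) * len(areas)       # 9 candidate boxes per block
boxes_tall = len(ratios_tall) * len(areas)     # 6 candidate boxes per block
print(boxes_all, boxes_tall)                   # 9 6
```

This one-third reduction in candidate boxes is the source of the speed-up claimed by the disclosure.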
Summary of the invention

In order to overcome the problems present in the related art, the present disclosure provides a face detection method and apparatus. The technical solution is as follows:
According to a first aspect of the embodiments of the present disclosure, a face detection method is provided, the method comprising:

performing convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image;

dividing the intermediate image into a preset number of equally sized image blocks;

performing, in each image block, region image feature extraction based on candidate boxes of at least one preset size, to obtain at least one piece of region image feature information for each image block, wherein the height of the candidate box of each preset size is greater than or equal to its width; and

determining the face region contained in the target image according to the at least one piece of region image feature information of each image block.
Optionally, the height-to-width ratios of the candidate boxes of the preset sizes are 1:1 and 2:1.
Optionally, determining the face region contained in the target image according to the at least one piece of region image feature information of each image block comprises:

performing full connection processing on the at least one piece of region image feature information of each image block based on the full connection parameters W1 = U1·Σm1·V1^T and W2 = U2·Σm2·V2^T, to determine the face region contained in the target image.
In this way, the processing speed of face detection can be further improved.
Optionally, U1 and Σm1·V1^T are obtained by performing singular value decomposition (SVD) on the full connection parameter W1'; U2 and Σm2·V2^T are obtained by performing SVD on the full connection parameter W2'.
Optionally, performing full connection processing on the at least one piece of region image feature information of each image block based on the full connection parameters W1 = U1·Σm1·V1^T and W2 = U2·Σm2·V2^T, to determine the face region contained in the target image, comprises:

performing full connection processing on the at least one piece of region image feature information of each image block based on the full connection parameter W1 = U1·Σm1·V1^T, to obtain target region image feature information confirmed as facial image feature information;

performing full connection processing on the target region image feature information based on the full connection parameter W2 = U2·Σm2·V2^T, to obtain the image region corresponding to each piece of target region image feature information, together with position adjustment information and scaling adjustment information for each image region; and

adjusting each image region based on its position adjustment information and scaling adjustment information, and determining the adjusted image regions as the face region contained in the target image.
In this way, the processing speed of face detection can be further improved.
According to a second aspect of the embodiments of the present disclosure, a face detection apparatus is provided, the apparatus comprising:

a processing module, configured to perform convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image;

a segmentation module, configured to divide the intermediate image into a preset number of equally sized image blocks;

an extraction module, configured to perform, in each image block, region image feature extraction based on candidate boxes of at least one preset size, to obtain at least one piece of region image feature information for each image block, wherein the height of the candidate box of each preset size is greater than or equal to its width; and

a determining module, configured to determine the face region contained in the target image according to the at least one piece of region image feature information of each image block.
Optionally, the height-to-width ratios of the candidate boxes of the preset sizes are 1:1 and 2:1.
Optionally, the determining module is configured to:

perform full connection processing on the at least one piece of region image feature information of each image block based on the full connection parameters W1 = U1·Σm1·V1^T and W2 = U2·Σm2·V2^T, to determine the face region contained in the target image.
Optionally, U1 and Σm1·V1^T are obtained by performing singular value decomposition (SVD) on the full connection parameter W1'; U2 and Σm2·V2^T are obtained by performing SVD on the full connection parameter W2'.
Optionally, the determining module comprises a first processing submodule, a second processing submodule and a determination submodule, wherein:

the first processing submodule is configured to perform full connection processing on the at least one piece of region image feature information of each image block based on the full connection parameter W1 = U1·Σm1·V1^T, to obtain target region image feature information confirmed as facial image feature information;

the second processing submodule is configured to perform full connection processing on the target region image feature information based on the full connection parameter W2 = U2·Σm2·V2^T, to obtain the image region corresponding to each piece of target region image feature information, together with position adjustment information and scaling adjustment information for each image region; and

the determination submodule is configured to adjust each image region based on its position adjustment information and scaling adjustment information, and to determine the adjusted image regions as the face region contained in the target image.
According to a third aspect of the embodiments of the present disclosure, a face detection apparatus is provided, the apparatus comprising:

a processor; and

a memory for storing instructions executable by the processor;

wherein the processor is configured to:

perform convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image;

divide the intermediate image into a preset number of equally sized image blocks;

perform, in each image block, region image feature extraction based on candidate boxes of at least one preset size, to obtain at least one piece of region image feature information for each image block, wherein the height of the candidate box of each preset size is greater than or equal to its width; and

determine the face region contained in the target image according to the at least one piece of region image feature information of each image block.
The technical solutions provided by the embodiments of the present disclosure may have the following beneficial effects:

In the embodiments of the present disclosure, the target image to be recognized is input into a Faster-RCNN, which performs convolution processing a first preset number of times and pooling processing a second preset number of times on the target image to obtain an intermediate image. The intermediate image is divided into a preset number of equally sized image blocks, and in each image block, region image feature extraction is performed based on candidate boxes of at least one preset size, to obtain at least one piece of region image feature information for each image block, wherein the height of the candidate box of each preset size is greater than or equal to its width. The face region contained in the target image is then determined according to the at least one piece of region image feature information of each image block. In this way, when region image feature extraction is performed, only candidate boxes whose height is greater than or equal to their width are retained, so the number of candidate boxes is reduced, the amount of extracted region image feature information is reduced, and the amount of calculation in the subsequent full connection processing is also reduced, thereby improving the processing speed of face detection.
It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only, and do not limit the present disclosure.
Description of the drawings
The accompanying drawings herein are incorporated into and constitute a part of this specification; they illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure. In the drawings:
Fig. 1 is a flowchart of a face detection method according to an exemplary embodiment;

Fig. 2 is a system diagram of face detection according to an exemplary embodiment;

Fig. 3 is a schematic diagram of image feature extraction according to an exemplary embodiment;

Fig. 4 is a schematic diagram of a face detection apparatus according to an exemplary embodiment;

Fig. 5 is a schematic diagram of a face detection apparatus according to an exemplary embodiment;

Fig. 6 is a schematic structural diagram of a server according to an exemplary embodiment.
The above drawings show specific embodiments of the present disclosure, which are described in more detail below. These drawings and the accompanying text are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the concepts of the disclosure to those skilled in the art by reference to specific embodiments.
Detailed description of the embodiments
Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. In the following description, when reference is made to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the disclosure as recited in the appended claims.
The embodiments of the present disclosure provide a face detection method. The execution body of the method may be a server, which may be a background server of a face detection application. The server may be provided with a processor, a memory and the like: the processor may be used for the processing involved in face detection, and the memory may be used for storing the data needed during face detection and the data generated by it.
As shown in Fig. 1, the processing flow of the method may include the following steps:
In step 101, convolution processing is performed a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image.

The first preset number and the second preset number are preset by a technician; for example, the first preset number may be 4 and the second preset number 3. The first preset number may be equal to or different from the second preset number.
In implementation, when a user wants to perform face detection on an image, a face detection application may be installed in the terminal. The user opens the face detection application, which provides a face detection option together with other options such as checking for updates. When the user clicks the face detection option, the terminal receives the click instruction and displays an image input box; the user selects the image to be recognized (hereinafter referred to as the target image) and clicks the confirm button, whereupon the terminal receives the click instruction for the confirm button and sends the target image to the server. As shown in Fig. 2, after receiving the target image, the server may obtain the convolution kernel of the first convolutional layer and perform convolution processing on the target image to obtain the output of the first convolutional layer; the output of the first convolutional layer is used as the input of the first pooling layer, on which pooling processing is performed, and so on, until convolution processing has been performed the first preset number of times and pooling processing the second preset number of times, and the intermediate image is output.
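The alternating convolution/pooling pipeline of step 101 can be sketched as follows. This is a minimal single-channel illustration only: the image size, the kernel, and the preset numbers (2 convolutions, 2 pooling operations) are placeholders, not values from the disclosure.

```python
import numpy as np

def conv3x3(img, kernel):
    """Naive 'valid' 3x3 convolution over a single-channel image."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i+3, j:j+3] * kernel)
    return out

def maxpool2x2(img):
    """2x2 max pooling with stride 2 (assumes even dimensions)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

rng = np.random.default_rng(0)
target = rng.random((44, 44))            # stand-in for the target image
kernel = np.full((3, 3), 1.0 / 9.0)      # placeholder averaging kernel

x = target
for _ in range(2):                       # first preset number of times
    x = conv3x3(x, kernel)               # 44 -> 42 -> 40
for _ in range(2):                       # second preset number of times
    x = maxpool2x2(x)                    # 40 -> 20 -> 10
print(x.shape)                           # intermediate image: (10, 10)
```

In the actual Faster-RCNN each convolutional layer has many learned kernels and channels; the shrinking spatial shape is the point being illustrated here.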
In step 102, the intermediate image is divided into a preset number of equally sized image blocks.

In implementation, after obtaining the intermediate image, the server may obtain the preset number, for example 10, and then divide the intermediate image into that number of equally sized image blocks. For example, if the intermediate image is 1000*1000 pixels and the preset number is 1000, the 1000*1000 image may be divided into 1000 image blocks.
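Step 102 amounts to a regular grid split. A minimal sketch (the 40×40 size and 10×10 grid are illustrative assumptions, and `split_into_blocks` is a hypothetical helper, not a name from the disclosure):

```python
import numpy as np

def split_into_blocks(image, blocks_per_side):
    """Divide an H x W image into equally sized square blocks
    (assumes H and W are divisible by blocks_per_side)."""
    h, w = image.shape
    bh, bw = h // blocks_per_side, w // blocks_per_side
    return [image[i*bh:(i+1)*bh, j*bw:(j+1)*bw]
            for i in range(blocks_per_side)
            for j in range(blocks_per_side)]

# A 40 x 40 intermediate image split into 10 x 10 = 100 blocks of 4 x 4.
intermediate = np.arange(1600).reshape(40, 40)
blocks = split_into_blocks(intermediate, 10)
print(len(blocks), blocks[0].shape)  # 100 (4, 4)
```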
In step 103, in each image block, region image feature extraction is performed based on candidate boxes of at least one preset size, to obtain at least one piece of region image feature information for each image block, wherein the height of the candidate box of each preset size is greater than or equal to its width.

In implementation, as shown in Fig. 3, the server may obtain the preset candidate boxes of at least one preset size, the height of each being greater than or equal to its width — for example, 6 kinds of candidate boxes with height-to-width ratios of 1:1 and 2:1 and areas of 128², 256² and 512². The center point of each image block is then determined and treated as a reference point; the at least one preset candidate box is placed around this reference point in each image block, and the region image features within each candidate box are extracted, yielding at least one piece of region image feature information for each image block. That is, when region image feature extraction is performed in an image block, the center of the image region to which a piece of region image feature information belongs coincides with the reference point of that image block. Region image feature information may be represented as a vector.

It should be noted here that because a face is usually taller than it is wide, using candidate boxes whose height is greater than or equal to their width during region image feature extraction will not miss the faces contained in the image.
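The candidate boxes of step 103 can be generated from the (height:width ratio, area) pairs described above. The `candidate_boxes` helper and the (cx, cy, w, h) box encoding are illustrative assumptions; the ratios and areas are the ones from the text.

```python
import math

def candidate_boxes(cx, cy, ratios=((1, 1), (2, 1)),
                    areas=(128**2, 256**2, 512**2)):
    """Generate candidate boxes (cx, cy, w, h) centered on a block's
    reference point, keeping only shapes with height >= width."""
    boxes = []
    for rh, rw in ratios:                 # height:width proportion
        for area in areas:
            w = math.sqrt(area * rw / rh) # solve w*h = area, h/w = rh/rw
            h = area / w
            boxes.append((cx, cy, w, h))
    return boxes

boxes = candidate_boxes(100.0, 100.0)
print(len(boxes))                            # 6 candidate boxes per block
print(all(h >= w for _, _, w, h in boxes))   # True: every box is tall or square
```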
Optionally, the height-to-width ratios of the preset-size candidate boxes mentioned in step 103 may be 1:1 and 2:1.

In implementation, the height of a preset-size candidate box is greater than or equal to its width, and the height-to-width ratio may be 1:1, 2:1, or the like.
In step 104, the face region contained in the target image is determined according to the at least one piece of region image feature information of each image block.

In implementation, after extracting the at least one piece of region image feature information of each image block, the server may input the extracted region image feature information into the full connection processing layer and perform full connection processing, to obtain the face region contained in the target image.

Optionally, the server may perform full connection processing on the region image feature information to determine the face region contained in the target image. The corresponding processing of step 104 may be as follows:

full connection processing is performed on the at least one piece of region image feature information of each image block based on the full connection parameters W1 = U1·Σm1·V1^T and W2 = U2·Σm2·V2^T, to determine the face region contained in the target image.
In implementation, as shown in Fig. 2, the region image feature information extracted by the server may be represented as vectors. The server may input the extracted region image feature information into the full connection processing layer and perform full connection processing based on the full connection parameters W1 = U1·Σm1·V1^T and W2 = U2·Σm2·V2^T of the full connection layer, to obtain the face region contained in the target image.
Optionally, U1 and Σm1·V1^T are obtained by performing singular value decomposition (SVD) on the full connection parameter W1'; U2 and Σm2·V2^T are obtained by performing SVD on the full connection parameter W2'.
In implementation, W1' and W2' are the full connection parameters used for object detection in Faster-RCNN, and may be represented as matrices. Suppose matrix W1' has size u1×v1, i.e. W1' contains u1·v1 elements. Performing SVD on W1' yields U and Σm·V^T, where U is a u1×u1 matrix, Σm is a u1×v1 positive semidefinite diagonal matrix whose diagonal elements are the singular values of W1' arranged in descending order, and V^T is a v1×v1 matrix. The singular values decay rapidly: the sum of the first 10% or even 1% of them can exceed 99% of the sum of all singular values. The dimensions u1 and v1 in Σm can therefore be truncated to m1, and W1' can be approximated as W1 = U1·Σm1·V1^T, where U1 is u1×m1, Σm1 is m1×m1 and V1^T is m1×v1. The number of elements in the parameters is thus reduced from u1·v1 to m1·(u1+v1). Since m1 is much smaller than either u1 or v1 — formally, m1 << (u1, v1) — it follows that m1·(u1+v1) is much smaller than u1·v1. During full connection processing, the number of elements involved in the multiplications therefore becomes smaller, and the speed of full connection processing can be improved.

Similarly, U2 and Σm2·V2^T are obtained by performing SVD on W2'; the specific processing is identical to the SVD decomposition of W1' and is not repeated here.
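The SVD truncation above can be demonstrated directly with numpy. The sizes u1 = 200, v1 = 300, m1 = 20 are illustrative, not from the disclosure; the point is the parameter count dropping from u1·v1 to m1·(u1+v1).

```python
import numpy as np

rng = np.random.default_rng(0)
u1, v1, m1 = 200, 300, 20                 # illustrative sizes
W_full = rng.standard_normal((u1, v1))    # stand-in for W1'

# SVD, then keep only the m1 largest singular values.
U, s, Vt = np.linalg.svd(W_full, full_matrices=False)
U1 = U[:, :m1]                            # u1 x m1
SmVt1 = np.diag(s[:m1]) @ Vt[:m1]         # Sigma_m1 @ V1^T, m1 x v1
W_approx = U1 @ SmVt1                     # rank-m1 approximation of W1'

params_full = u1 * v1                     # 60000 elements in W1'
params_trunc = m1 * (u1 + v1)             # 10000 elements after truncation
print(params_full, params_trunc)          # 60000 10000
```

Applying the layer as `U1 @ (SmVt1 @ x)` instead of `W_full @ x` realizes the same multiplication saving at inference time.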
Optionally, the full connection processing may be carried out as four matrix multiplications (two for classification and two for regression) to obtain the face region contained in the target image. The corresponding processing of step 104 may be as follows:

full connection processing is performed on the at least one piece of region image feature information of each image block based on the full connection parameter W1 = U1·Σm1·V1^T, to obtain target region image feature information confirmed as facial image feature information; full connection processing is performed on the target region image feature information based on the full connection parameter W2 = U2·Σm2·V2^T, to obtain the image region corresponding to each piece of target region image feature information, together with position adjustment information and scaling adjustment information for each image region; each image region is adjusted based on its position adjustment information and scaling adjustment information, and the adjusted image regions are determined as the face region contained in the target image.
In implementation, after extracting the region image feature information in each image block, the server may input the region image feature information into the classification full connection layer, which includes the two full connection parameters U1 and Σm1·V1^T. Each piece of region image feature information (represented as a vector) is multiplied by the parameter Σm1·V1^T of the classification full connection layer, and the resulting product is multiplied by U1, giving the probability that each piece of region image feature information belongs to the background and the probability that it belongs to a specific object. Among the at least one piece of region image feature information of each image block, those whose specific-object probability exceeds a preset value are determined as target region image feature information, where the preset value is the minimum probability at which image feature information is considered facial image feature information, for example 0.7.
After determining the target region image feature information, the server may input it into the regression full connection layer, which includes the two full connection parameters U2 and Σm2·V2^T. Each piece of target region image feature information (represented as a vector) is multiplied by the parameter Σm2·V2^T of the regression full connection layer, and the resulting product is multiplied by U2, giving the image region corresponding to each piece of target region image feature information, together with position adjustment information and scaling adjustment information for each image region. The position of each image region is generally represented by the coordinates of the top-left corner of the region. The server may then adjust the position of each image region according to its position adjustment information, and adjust the size of the image region according to its scaling adjustment information, to obtain the adjusted image regions; the adjusted image regions are determined as the regions of the target image containing faces.
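The final adjustment step can be sketched as a simple shift-and-rescale. The parameterisation (additive offsets dx, dy for position; multiplicative factors sx, sy for scale) is an assumption for illustration — the disclosure specifies only that position and scaling adjustment information are applied.

```python
def adjust_region(x, y, w, h, dx, dy, sx, sy):
    """Apply position offsets (dx, dy) to the top-left corner and
    scale factors (sx, sy) to the width and height of a region."""
    return (x + dx, y + dy, w * sx, h * sy)

# A 100x200 region shifted by (5, -3) and scaled by (1.1, 0.9):
adjusted = adjust_region(10.0, 20.0, 100.0, 200.0, 5.0, -3.0, 1.1, 0.9)
print(adjusted)  # approximately (15.0, 17.0, 110.0, 180.0)
```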
Alternatively, after extracting the region image feature information in each image block, the server may input the region image feature information into both the classification full connection layer and the regression full connection layer. In the classification full connection layer, each piece of region image feature information (represented as a vector) is multiplied by the parameter Σm1·V1^T and the resulting product is multiplied by U1, giving the probability that each piece of region image feature information belongs to the background and the probability that it belongs to a specific object. In the regression full connection layer, each piece of region image feature information is multiplied by the parameter Σm2·V2^T and the resulting product is multiplied by U2, giving the image region corresponding to each piece of region image feature information, together with position adjustment information and scaling adjustment information for each image region. Combining the outputs of the classification and regression full connection layers, the server may then adjust the position of each image region according to its position adjustment information and adjust its size according to its scaling adjustment information, to obtain the adjusted image regions. The image regions whose specific-object probability exceeds the preset value are identified, and the corresponding adjusted image regions are determined as the face region of the target image, where the preset value is the minimum probability at which image feature information is considered facial image feature information, for example 0.7.
In the embodiments of the present disclosure, the target image to be recognized is input into a Faster-RCNN, which performs convolution processing a first preset number of times and pooling processing a second preset number of times on the target image to obtain an intermediate image. The intermediate image is divided into a preset number of equally sized image blocks, and in each image block, region image feature extraction is performed based on candidate boxes of at least one preset size, to obtain at least one piece of region image feature information for each image block, wherein the height of the candidate box of each preset size is greater than or equal to its width. The face region contained in the target image is then determined according to the at least one piece of region image feature information of each image block. In this way, when region image feature extraction is performed, only candidate boxes whose height is greater than or equal to their width are retained, so the number of candidate boxes is reduced, the amount of extracted region image feature information is reduced, and the amount of calculation in the subsequent full connection processing is also reduced, thereby improving the processing speed of face detection.
Based on the same technical concept, an exemplary embodiment of the present disclosure provides a face detection apparatus. As shown in Fig. 4, the apparatus includes:

a processing module 410, configured to perform convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image;

a segmentation module 420, configured to divide the intermediate image into a preset number of equally sized image blocks;

an extraction module 430, configured to perform, in each image block, region image feature extraction based on candidate boxes of at least one preset size, to obtain at least one piece of region image feature information for each image block, wherein the height of the candidate box of each preset size is greater than or equal to its width; and

a determining module 440, configured to determine the face region contained in the target image according to the at least one piece of region image feature information of each image block.
Optionally, the height-to-width ratios of the candidate boxes of the preset sizes are 1:1 and 2:1.
Optionally, the determining module 440 is configured to:

perform full connection processing on the at least one piece of region image feature information of each image block based on the full connection parameters W1 = U1·Σm1·V1^T and W2 = U2·Σm2·V2^T, to determine the face region contained in the target image.
Optionally, U1 and Σm1·V1^T are obtained by performing singular value decomposition (SVD) on the full connection parameter W1'; U2 and Σm2·V2^T are obtained by performing SVD on the full connection parameter W2'.
Optionally, as shown in Fig. 5, the determining module 440 comprises a first processing submodule 441, a second processing submodule 442 and a determination submodule 443, wherein:

the first processing submodule 441 is configured to perform full connection processing on the at least one piece of region image feature information of each image block based on the full connection parameter W1 = U1·Σm1·V1^T, to obtain target region image feature information confirmed as facial image feature information;

the second processing submodule 442 is configured to perform full connection processing on the target region image feature information based on the full connection parameter W2 = U2·Σm2·V2^T, to obtain the image region corresponding to each piece of target region image feature information, together with position adjustment information and scaling adjustment information for each image region; and

the determination submodule 443 is configured to adjust each image region based on its position adjustment information and scaling adjustment information, and to determine the adjusted image regions as the face region contained in the target image.
With regard to the apparatus in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments of the related method, and will not be elaborated here.
In the embodiments of the present disclosure, the target image to be recognized is fed into a fast region deep convolutional network. The network performs convolution processing a first preset number of times and pooling processing a second preset number of times on the target image, obtaining an intermediate image, and divides the intermediate image into a preset number of image blocks of equal size. In each image block, region image feature extraction is performed based on candidate frames of at least one preset size, obtaining at least one piece of region image feature information for each image block, where the height of each preset-size candidate frame is greater than or equal to its width; the face region contained in the target image is then determined according to the region image feature information of each image block. Because only candidate frames whose height is greater than or equal to their width are retained during region image feature extraction, the number of candidate frames is reduced, less region image feature information is extracted, and the amount of computation in the subsequent full-connection processing is reduced, thereby improving the processing speed of face detection.
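The height-greater-than-or-equal-to-width candidate-frame scheme described above can be sketched as follows. This is an illustrative reconstruction rather than the patent's reference implementation: the `candidate_frames` helper, the scales, and the frame centre are hypothetical, and only the 1:1 and 2:1 height:width ratios come from the disclosure.

```python
# Illustrative candidate-frame (anchor) generation keeping only frames
# whose height >= width. With 3 scales and the 2 retained ratios we get
# 6 frames per location instead of the usual 9 with 3 ratios.

def candidate_frames(center_x, center_y, scales=(64, 128, 256),
                     ratios=(1.0, 2.0)):  # height/width ratios: 1:1 and 2:1
    """Return (x1, y1, x2, y2) frames centred on (center_x, center_y)."""
    frames = []
    for s in scales:
        for r in ratios:  # r = height / width; r >= 1 keeps height >= width
            w = s / r ** 0.5          # width shrinks as ratio grows,
            h = s * r ** 0.5          # height grows; area stays s * s
            frames.append((center_x - w / 2, center_y - h / 2,
                           center_x + w / 2, center_y + h / 2))
    return frames

frames = candidate_frames(100, 100)
assert len(frames) == 6
assert all((y2 - y1) >= (x2 - x1) - 1e-9 for x1, y1, x2, y2 in frames)
```

Dropping the wide (width > height) ratio is a reasonable specialisation for faces, which are rarely wider than they are tall.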
It should be noted that, when the face detection apparatus provided in the above embodiments performs face detection, the division into the above functional modules is merely illustrative. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the face detection apparatus provided in the above embodiments belongs to the same conception as the face detection method embodiments; for its specific implementation, reference may be made to the method embodiments, which are not repeated here.
Another exemplary embodiment of the present disclosure provides a schematic structural diagram of a server. Referring to Fig. 6, the server 600 includes a processing component 622, which further includes one or more processors, and memory resources represented by a memory 632, for storing instructions executable by the processing component 622, such as application programs. The application programs stored in the memory 632 may include one or more modules, each corresponding to a set of instructions. The processing component 622 is configured to execute the instructions so as to perform the method described above.
The server 600 may further include a power supply component 626 configured to perform power management of the server 600, a wired or wireless network interface 660 configured to connect the server 600 to a network, and an input/output (I/O) interface 668. The server 600 may operate based on an operating system stored in the memory 632, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™ or the like.
The server 600 may include a memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the one or more processors; the one or more programs include instructions for performing the following operations:
performing convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image;
dividing the intermediate image into a preset number of image blocks of equal size;
in each image block, performing region image feature extraction based on candidate frames of at least one preset size, to obtain at least one piece of region image feature information of each image block, wherein the height of each preset-size candidate frame is greater than or equal to its width;
determining, according to the at least one piece of region image feature information of each image block, the face region contained in the target image.
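As a rough illustration of the preprocessing operations listed above (convolution, pooling, and equal-size block division), the following sketch uses one hand-written averaging convolution and one max-pooling pass; the kernel, the preset counts, and the 4x4 block grid are assumed values, not taken from the patent.

```python
import numpy as np

# Sketch: convolution + pooling produce an intermediate image, which is
# then divided into a preset number of equal-size image blocks.

def conv2d(img, kernel):
    """Valid-mode 2-D convolution (naive loop, for illustration only)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(img, size=2):
    """Non-overlapping max pooling via a reshape trick."""
    h, w = img.shape
    h, w = h - h % size, w - w % size
    return img[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

def split_blocks(img, rows, cols):
    """Divide an intermediate image into rows * cols equal-size blocks."""
    h, w = img.shape
    bh, bw = h // rows, w // cols
    return [img[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
            for r in range(rows) for c in range(cols)]

target = np.random.rand(66, 66)                  # stand-in "target image"
kernel = np.ones((3, 3)) / 9.0                   # hypothetical averaging filter
intermediate = max_pool(conv2d(target, kernel))  # 66 -> 64 -> 32
blocks = split_blocks(intermediate, 4, 4)        # hypothetical preset number: 16
assert intermediate.shape == (32, 32)
assert len(blocks) == 16 and all(b.shape == (8, 8) for b in blocks)
```

A real network would of course use learned filters and several convolution/pooling stages; only the shape bookkeeping is the point here.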
Optionally, the height-to-width ratios of the preset-size candidate frames are 1:1 and 2:1.
Optionally, the determining, according to the at least one piece of region image feature information of each image block, the face region contained in the target image includes:
performing full-connection processing on the at least one piece of region image feature information of each image block based on the fully-connected parameters W1 = U1Σm1V1ᵀ and W2 = U2Σm2V2ᵀ, to determine the face region contained in the target image.
Optionally, U1 and Σm1V1ᵀ are obtained by performing singular value decomposition (SVD) on the fully-connected parameter W1';
U2 and Σm2V2ᵀ are obtained by performing SVD on the fully-connected parameter W2'.
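The SVD factorisation just described can be illustrated numerically. The layer dimensions and the retained rank m below are hypothetical; the point is that replacing W' by the pair U, ΣmVᵀ turns one large matrix-vector product into two much smaller ones, which is exactly why the patent uses it to speed up full-connection processing.

```python
import numpy as np

# Sketch of compressing a fully-connected parameter W' by truncated SVD,
# as in W1 = U1 * (Sigma_m1 V1^T). Dimensions and rank m are hypothetical.

rng = np.random.default_rng(0)
out_dim, in_dim, m = 256, 1024, 32           # keep the m largest singular values
W_prime = rng.standard_normal((out_dim, in_dim))

U, s, Vt = np.linalg.svd(W_prime, full_matrices=False)
U_m = U[:, :m]                               # out_dim x m
SigmaVt_m = s[:m, None] * Vt[:m, :]          # m x in_dim, i.e. Sigma_m V^T

x = rng.standard_normal(in_dim)
y_full = W_prime @ x                         # out_dim * in_dim = 262144 mults
y_svd = U_m @ (SigmaVt_m @ x)                # m * (in_dim + out_dim) = 40960 mults

# y_svd is a low-rank approximation of y_full computed with far fewer
# multiplications; the two stages correspond to the factors U and Sigma_m V^T.
assert U_m.shape == (out_dim, m) and SigmaVt_m.shape == (m, in_dim)
assert y_svd.shape == y_full.shape
```

How close the approximation is depends on how quickly the singular values of W' decay; trained fully-connected layers typically tolerate aggressive truncation far better than the random matrix used here.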
Optionally, the performing full-connection processing on the at least one piece of region image feature information of each image block based on the fully-connected parameters W1 = U1Σm1V1ᵀ and W2 = U2Σm2V2ᵀ, to determine the face region contained in the target image, includes:
performing full-connection processing on the at least one piece of region image feature information of each image block based on the fully-connected parameter W1 = U1Σm1V1ᵀ, to obtain target region image feature information confirmed as facial image feature information;
performing full-connection processing on the target region image feature information based on the fully-connected parameter W2 = U2Σm2V2ᵀ, to obtain the image region corresponding to each piece of target region image feature information, together with position adjustment information and scaling adjustment information of each image region;
adjusting each image region according to its position adjustment information and scaling adjustment information, and determining the adjusted image regions as the face region contained in the target image.
In the embodiments of the present disclosure, the target image to be recognized is fed into a fast region deep convolutional network. The network performs convolution processing a first preset number of times and pooling processing a second preset number of times on the target image, obtaining an intermediate image, and divides the intermediate image into a preset number of image blocks of equal size. In each image block, region image feature extraction is performed based on candidate frames of at least one preset size, obtaining at least one piece of region image feature information for each image block, where the height of each preset-size candidate frame is greater than or equal to its width; the face region contained in the target image is then determined according to the region image feature information of each image block. Because only candidate frames whose height is greater than or equal to their width are retained during region image feature extraction, the number of candidate frames is reduced, less region image feature information is extracted, and the amount of computation in the subsequent full-connection processing is reduced, thereby improving the processing speed of face detection.
Other embodiments of the disclosure will readily occur to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following its general principles and including such departures from the present disclosure as come within common knowledge or customary practice in the art. The specification and embodiments are to be considered exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It should be appreciated that the present disclosure is not limited to the precise construction described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.
Claims (11)
1. A face detection method, characterized in that the method comprises:
performing convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image;
dividing the intermediate image into a preset number of image blocks of equal size;
in each image block, performing region image feature extraction based on candidate frames of at least one preset size, to obtain at least one piece of region image feature information of each image block, wherein the height of each preset-size candidate frame is greater than or equal to its width;
determining, according to the at least one piece of region image feature information of each image block, a face region contained in the target image.
2. The method according to claim 1, characterized in that the height-to-width ratios of the preset-size candidate frames are 1:1 and 2:1.
3. The method according to claim 1, characterized in that the determining, according to the at least one piece of region image feature information of each image block, the face region contained in the target image comprises:
performing full-connection processing on the at least one piece of region image feature information of each image block based on fully-connected parameters W1 = U1Σm1V1ᵀ and W2 = U2Σm2V2ᵀ, to determine the face region contained in the target image.
4. The method according to claim 3, characterized in that U1 and Σm1V1ᵀ are obtained by performing singular value decomposition (SVD) on a fully-connected parameter W1'; and
U2 and Σm2V2ᵀ are obtained by performing SVD on a fully-connected parameter W2'.
5. The method according to claim 3, characterized in that the performing full-connection processing on the at least one piece of region image feature information of each image block based on the fully-connected parameters W1 = U1Σm1V1ᵀ and W2 = U2Σm2V2ᵀ, to determine the face region contained in the target image, comprises:
performing full-connection processing on the at least one piece of region image feature information of each image block based on the fully-connected parameter W1 = U1Σm1V1ᵀ, to obtain target region image feature information confirmed as facial image feature information;
performing full-connection processing on the target region image feature information based on the fully-connected parameter W2 = U2Σm2V2ᵀ, to obtain the image region corresponding to each piece of target region image feature information, together with position adjustment information and scaling adjustment information of each image region;
adjusting each image region according to its position adjustment information and scaling adjustment information, and determining the adjusted image regions as the face region contained in the target image.
6. A face detection apparatus, characterized in that the apparatus comprises:
a processing module, configured to perform convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image;
a segmentation module, configured to divide the intermediate image into a preset number of image blocks of equal size;
an extraction module, configured to perform, in each image block, region image feature extraction based on candidate frames of at least one preset size, to obtain at least one piece of region image feature information of each image block, wherein the height of each preset-size candidate frame is greater than or equal to its width;
a determining module, configured to determine, according to the at least one piece of region image feature information of each image block, a face region contained in the target image.
7. The apparatus according to claim 6, characterized in that the height-to-width ratios of the preset-size candidate frames are 1:1 and 2:1.
8. The apparatus according to claim 6, characterized in that the determining module is configured to:
perform full-connection processing on the at least one piece of region image feature information of each image block based on fully-connected parameters W1 = U1Σm1V1ᵀ and W2 = U2Σm2V2ᵀ, to determine the face region contained in the target image.
9. The apparatus according to claim 8, characterized in that U1 and Σm1V1ᵀ are obtained by performing singular value decomposition (SVD) on a fully-connected parameter W1'; and
U2 and Σm2V2ᵀ are obtained by performing SVD on a fully-connected parameter W2'.
10. The apparatus according to claim 8, characterized in that the determining module comprises a first processing submodule, a second processing submodule and a determining submodule, wherein:
the first processing submodule is configured to perform full-connection processing on the at least one piece of region image feature information of each image block based on the fully-connected parameter W1 = U1Σm1V1ᵀ, to obtain target region image feature information confirmed as facial image feature information;
the second processing submodule is configured to perform full-connection processing on the target region image feature information based on the fully-connected parameter W2 = U2Σm2V2ᵀ, to obtain the image region corresponding to each piece of target region image feature information, together with position adjustment information and scaling adjustment information of each image region; and
the determining submodule is configured to adjust each image region according to its position adjustment information and scaling adjustment information, and to determine the adjusted image regions as the face region contained in the target image.
11. A face detection apparatus, characterized in that the apparatus comprises:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
perform convolution processing a first preset number of times and pooling processing a second preset number of times on a target image to be recognized, to obtain an intermediate image;
divide the intermediate image into a preset number of image blocks of equal size;
in each image block, perform region image feature extraction based on candidate frames of at least one preset size, to obtain at least one piece of region image feature information of each image block, wherein the height of each preset-size candidate frame is greater than or equal to its width; and
determine, according to the at least one piece of region image feature information of each image block, a face region contained in the target image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611185435.3A CN106599860B (en) | 2016-12-20 | 2016-12-20 | Face detection method and apparatus |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611185435.3A CN106599860B (en) | 2016-12-20 | 2016-12-20 | Face detection method and apparatus |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106599860A true CN106599860A (en) | 2017-04-26 |
CN106599860B CN106599860B (en) | 2019-11-26 |
Family
ID=58600344
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611185435.3A Active CN106599860B (en) | Face detection method and apparatus | 2016-12-20 | 2016-12-20 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106599860B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105243395A (en) * | 2015-11-04 | 2016-01-13 | 东方网力科技股份有限公司 | Human body image comparison method and device |
CN105488468A (en) * | 2015-11-26 | 2016-04-13 | 浙江宇视科技有限公司 | Method and device for positioning target area |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105243395A (en) * | 2015-11-04 | 2016-01-13 | 东方网力科技股份有限公司 | Human body image comparison method and device |
CN105488468A (en) * | 2015-11-26 | 2016-04-13 | 浙江宇视科技有限公司 | Method and device for positioning target area |
Non-Patent Citations (3)
Title |
---|
LILIANG ZHANG et al.: "Is Faster R-CNN Doing Well for Pedestrian Detection?", European Conference on Computer Vision * |
ROSS GIRSHICK: "Fast R-CNN", The IEEE International Conference on Computer Vision (ICCV), 2015 * |
SHAOQING REN et al.: "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", IEEE Transactions on Pattern Analysis and Machine Intelligence * |
Also Published As
Publication number | Publication date |
---|---|
CN106599860B (en) | 2019-11-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108205655B (en) | Key point prediction method and device, electronic equipment and storage medium | |
US10713532B2 (en) | Image recognition method and apparatus | |
US9396560B2 (en) | Image-based color palette generation | |
US10283162B2 (en) | Method for triggering events in a video | |
US10318797B2 (en) | Image processing apparatus and image processing method | |
WO2018153294A1 (en) | Face tracking method, storage medium, and terminal device | |
JP2020515983A (en) | Target person search method and device, device, program product and medium | |
CN110110118A (en) | Dressing recommended method, device, storage medium and mobile terminal | |
CN111383232B (en) | Matting method, matting device, terminal equipment and computer readable storage medium | |
CN110232318A (en) | Acupuncture point recognition methods, device, electronic equipment and storage medium | |
Zhang et al. | High-quality face image generation based on generative adversarial networks | |
CN111428671A (en) | Face structured information identification method, system, device and storage medium | |
CN111860484B (en) | Region labeling method, device, equipment and storage medium | |
CN111127309A (en) | Portrait style transfer model training method, portrait style transfer method and device | |
CN111199169A (en) | Image processing method and device | |
US20210166058A1 (en) | Image generation method and computing device | |
CN113610864B (en) | Image processing method, device, electronic equipment and computer readable storage medium | |
Sun et al. | Location dependent Dirichlet processes | |
CN106599860A (en) | Human face detection method and device | |
CN106469437B (en) | Image processing method and image processing apparatus | |
CN112150486A (en) | Image processing method and device | |
CN116030466B (en) | Image text information identification and processing method and device and computer equipment | |
CN116485833A (en) | Image segmentation method and device | |
CN117874272A (en) | Image classification method, apparatus, electronic device, and readable storage medium | |
Tang et al. | MRP-Net: A Light Multiple Region Perception Neural Network for Multi-label AU Detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||