CN106778550A - Method and apparatus for face detection - Google Patents

Method and apparatus for face detection

Info

Publication number
CN106778550A
CN106778550A (application CN201611082414.9A)
Authority
CN
China
Prior art keywords: convolution, network model, target, layer, rank
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611082414.9A
Other languages
Chinese (zh)
Other versions
CN106778550B (en)
Inventor
万韶华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd filed Critical Beijing Xiaomi Mobile Software Co Ltd
Priority to CN201611082414.9A priority Critical patent/CN106778550B/en
Publication of CN106778550A publication Critical patent/CN106778550A/en
Application granted granted Critical
Publication of CN106778550B publication Critical patent/CN106778550B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present disclosure relates to a method and apparatus for face detection, belonging to the field of computer technology. The method includes: obtaining the convolution kernels of a target convolutional layer in a deep convolutional network model to be used; performing CP decomposition on the convolution kernels of the target convolutional layer to obtain low-rank convolution kernels of the target convolutional layer; in the deep convolutional network model to be used, replacing the convolution kernels of the target convolutional layer with the corresponding low-rank convolution kernels to obtain an adjusted deep convolutional network model; and performing face detection on an image based on the adjusted deep convolutional network model. With the present disclosure, the processing speed of face detection can be improved.

Description

Face detection method and device
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for face detection.
Background
Face detection is a technology for locating a face in an image according to the feature information of the face. The algorithm model commonly used for face detection is a deep convolutional network model, and the specific processing is as follows:
The image to be detected is taken as the input of a preset deep convolutional network model, and the face position information in the image to be detected is obtained through multi-layer processing and fully-connected processing. The multi-layer processing generally includes at least one layer of convolution processing and at least one layer of pooling processing; a layer that performs convolution processing may be referred to as a convolutional layer, and a layer that performs pooling processing may be referred to as a pooling layer. In the multi-layer processing, the output of one layer serves as the input of the next layer. When a convolutional layer performs convolution, its output data is generally obtained by multiplying the output data of the previous layer (which may be a matrix or a vector) by the convolution kernels of the convolutional layer (each kernel being a matrix composed of a number of different parameters).
In general, a deep convolutional network model includes multiple convolutional layers, and each convolutional layer corresponds to multiple convolution kernels. Because the number of convolution kernels of the convolutional layers is large, a large amount of complex calculation is required during convolution processing, and the processing speed of face detection is slow.
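The convolution cost described in the background can be made concrete with a quick multiply count. The sketch below is illustrative only; the layer sizes are hypothetical, not taken from the patent:

```python
def direct_conv_multiply_count(X, Y, d, S, T):
    """Multiplications for one densely applied convolutional layer:
    each of the X*Y output positions of each of T kernels multiplies
    a d x d x S patch of the input."""
    return X * Y * d * d * S * T

# Hypothetical layer: a 224x224 input with 3 color channels, 64 kernels of size 3x3.
print(direct_conv_multiply_count(224, 224, 3, 3, 64))  # 86704128
```

Even this modest layer needs tens of millions of multiplications, which is why reducing the per-kernel work matters.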
Disclosure of Invention
In order to overcome the problems in the related art, the present disclosure provides a method and an apparatus for face detection. The technical scheme is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided a method for face detection, the method including:
acquiring a convolution kernel of a target convolution layer in a deep convolution network model to be used;
performing canonical polyadic (CP) decomposition on the convolution kernel of the target convolutional layer to obtain a low-rank convolution kernel of the target convolutional layer;
in the deep convolution network model to be used, replacing the convolution kernel of the target convolution layer with a corresponding low-rank convolution kernel to obtain an adjusted deep convolution network model;
and carrying out face detection on the image based on the adjusted depth convolution network model.
Optionally, the method further includes:
setting the value of the model parameter in the adjusted deep convolutional network model as the training initial value of the model parameter of the adjusted deep convolutional network model, and retraining the adjusted deep convolutional network model;
the performing face detection on the image based on the adjusted deep convolutional network model includes:
and carrying out face detection on the image based on the retrained deep convolutional network model.
Optionally, the retraining the adjusted deep convolutional network model includes:
determining a training value of the model parameter corresponding to each preset sample image based on an error feedback algorithm, wherein when the value of the model parameter in the adjusted depth convolution network model is the training value and the input image of the adjusted depth convolution network model is the sample image, the output value of the adjusted depth convolution network model and a preset reference output value corresponding to the sample image meet a preset matching condition;
determining the average value of the training values of the model parameters corresponding to each sample image;
and adjusting the values of the model parameters in the adjusted deep convolutional network model into corresponding average values to obtain the retrained deep convolutional network model.
Optionally, the performing CP decomposition on the convolution kernel of the target convolutional layer to obtain a low-rank convolution kernel of the target convolutional layer includes:
and performing CP decomposition on the d × d × S × T convolution kernel tensor of the target convolutional layer to obtain four low-rank convolution kernels of sizes d × R, d × R, S × R and T × R for the target convolutional layer, where d is the number of rows and columns of each convolution kernel, S is the number of color channels, T is the number of convolution kernels of the target convolutional layer, and R is the rank of the convolution kernel tensor of the target convolutional layer.
Optionally, the performing CP decomposition on the convolution kernel of the target convolutional layer to obtain a low-rank convolution kernel of the target convolutional layer includes:
and if the convolution kernel of the target convolution layer is a full-rank matrix, performing CP decomposition on the convolution kernel of the target convolution layer to obtain a low-rank convolution kernel of the target convolution layer.
According to a second aspect of the embodiments of the present disclosure, there is provided an apparatus for face detection, the apparatus comprising:
the acquisition module is used for acquiring a convolution kernel of the target convolution layer in the deep convolution network model to be used;
the decomposition module is used for performing canonical polyadic (CP) decomposition on the convolution kernel of the target convolutional layer to obtain a low-rank convolution kernel of the target convolutional layer;
the replacing module is used for replacing the convolution kernel of the target convolution layer with a corresponding low-rank convolution kernel in the deep convolution network model to be used to obtain an adjusted deep convolution network model;
and the detection module is used for carrying out face detection on the image based on the adjusted depth convolution network model.
Optionally, the apparatus further comprises:
the training module is used for setting the value of the model parameter in the adjusted deep convolutional network model as the training initial value of the model parameter of the adjusted deep convolutional network model and retraining the adjusted deep convolutional network model;
the detection module is configured to:
and carrying out face detection on the image based on the retrained deep convolutional network model.
Optionally, the training module includes a first determining sub-module, a second determining sub-module, and an adjusting sub-module, wherein:
the first determining submodule is configured to determine, for each preset sample image, a training value of the model parameter corresponding to the sample image based on an error back-transfer algorithm, where when a value of the model parameter in the adjusted deep convolutional network model is the training value and an input image of the adjusted deep convolutional network model is the sample image, an output value of the adjusted deep convolutional network model and a preset reference output value corresponding to the sample image satisfy a preset matching condition;
the second determining submodule is used for determining the average value of the training values of the model parameters corresponding to each sample image;
and the adjusting submodule is used for adjusting the values of the model parameters in the adjusted deep convolutional network model to the corresponding average values to obtain the retrained deep convolutional network model.
Optionally, the decomposition module is configured to:
and performing CP decomposition on the d × d × S × T convolution kernel tensor of the target convolutional layer to obtain four low-rank convolution kernels of sizes d × R, d × R, S × R and T × R for the target convolutional layer, where d is the number of rows and columns of each convolution kernel, S is the number of color channels, T is the number of convolution kernels of the target convolutional layer, and R is the rank of the convolution kernel tensor of the target convolutional layer.
Optionally, the decomposition module is configured to:
and if the convolution kernel of the target convolution layer is a full-rank matrix, performing CP decomposition on the convolution kernel of the target convolution layer to obtain a low-rank convolution kernel of the target convolution layer.
According to a third aspect of the embodiments of the present disclosure, there is provided an apparatus for face detection, the apparatus comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
acquiring a convolution kernel of a target convolution layer in a deep convolution network model to be used;
performing CP decomposition on the convolution kernel of the target convolution layer to obtain a low-rank convolution kernel of the target convolution layer;
in the deep convolution network model to be used, replacing the convolution kernel of the target convolution layer with a corresponding low-rank convolution kernel to obtain an adjusted deep convolution network model;
and carrying out face detection on the image based on the adjusted depth convolution network model.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
in the embodiment of the disclosure, in the deep convolutional network model to be used, the server may obtain the convolution kernels of the target convolutional layer, perform CP decomposition on the convolution kernels of the target convolutional layer to obtain the low-rank convolution kernels of the target convolutional layer, replace the convolution kernels of the target convolutional layer with the corresponding low-rank convolution kernels in the deep convolutional network model to be used to obtain an adjusted deep convolutional network model, and, in the subsequent face detection process, perform face detection on images based on the adjusted deep convolutional network model. Therefore, when the deep convolutional network model is used for face detection, the convolution kernels of the convolutional layer are low-rank convolution kernels; the low-rank convolution kernels have fewer parameters and require less data processing, so the processing speed of face detection can be improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. In the drawings:
fig. 1 is a flowchart of a method for detecting a human face according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a target convolutional layer convolution kernel provided by an embodiment of the present disclosure;
fig. 3 is a schematic diagram of a face detection process provided by an embodiment of the present disclosure;
FIG. 4 is a flowchart of a training method of a deep convolutional network model provided by an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of an apparatus for face detection according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of an apparatus for face detection according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of an apparatus for face detection according to an embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of a server according to an embodiment of the present disclosure.
With the foregoing drawings in mind, certain embodiments of the disclosure have been shown and described in more detail below. These drawings and written description are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the concepts of the disclosure to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The embodiment of the disclosure provides a face detection method, and an execution subject of the method can be a server, wherein the server can be a background server of a face detection application program. The server can be provided with a processor, a memory and the like, the processor can be used for processing in the process of face detection, and the memory can be used for storing data required in the process of face detection and generated data.
As shown in fig. 1, the processing flow of the method may include the following steps:
in step 101, the convolution kernel of the target convolution layer is obtained in the deep convolutional network model to be used.
The target convolutional layer may be one convolutional layer or a plurality of convolutional layers.
In implementation, the deep convolutional network model to be used includes at least one convolutional layer, and each convolutional layer includes a preset number of convolution kernels. The convolution kernels have the same number of parameters but different parameter values, and the parameter values of each convolution kernel have already been determined. Before performing face detection on an image, the server may obtain the convolution kernels of the target convolutional layer among the at least one convolutional layer.
In step 102, canonical CP decomposition is performed on the convolution kernel of the target convolutional layer to obtain a low-rank convolution kernel of the target convolutional layer.
In an implementation, after the server obtains the convolution kernels of the target convolutional layer, CP (canonical polyadic) decomposition may be performed on them to obtain the corresponding low-rank convolution kernels, so as to obtain the low-rank convolution kernels of the target convolutional layer.
Optionally, the convolution kernel of the target convolutional layer may be decomposed into four low-rank convolution kernels, and the corresponding processing of step 102 may be as follows:
and performing CP decomposition on the d × d × S × T convolution kernel tensor of the target convolutional layer to obtain four low-rank convolution kernels of sizes d × R, d × R, S × R and T × R for the target convolutional layer, where d is the number of rows and columns of each convolution kernel, S is the number of color channels, T is the number of convolution kernels of the target convolutional layer, and R is the rank of the convolution kernel tensor of the target convolutional layer.
Here d represents the number of rows and columns of each convolution kernel; S represents the number of color channels, i.e., RGB (Red Green Blue), so S is generally 3; T represents the number of convolution kernels of the target convolutional layer; and R represents the rank of the convolution kernel tensor of the target convolutional layer.
In an implementation, as shown in fig. 2, the convolution kernels of the target convolutional layer are of size d × d, there are T such kernels in total, and the input has S color channels, so the server obtains a d × d × S × T kernel tensor for the target convolutional layer and can decompose it, according to the CP decomposition method, into four low-rank convolution kernels of sizes d × R, d × R, S × R and T × R.
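To see the parameter saving from this decomposition, a quick count with hypothetical sizes (d, S, T and the rank R below are illustrative choices, not values from the patent):

```python
# Parameter counts before and after CP decomposition of a d x d x S x T kernel tensor.
d, S, T, R = 5, 3, 64, 8  # hypothetical layer sizes and rank

full_params = d * d * S * T      # one 4-way kernel tensor
cp_params = R * (d + d + S + T)  # four factor matrices: d x R, d x R, S x R, T x R

print(full_params, cp_params)  # 4800 616
```

When R is small relative to the kernel dimensions, the four factor matrices hold far fewer parameters than the original tensor.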
For example, let the input of the target convolutional layer be a matrix U of size X × Y × S, and let the kernel tensor of the target convolutional layer be K of size d × d × S × T. The output of the target convolutional layer is then a matrix V of size (X − d + 1) × (Y − d + 1) × T, given by formula (1):
V(x, y, t) = Σ_i Σ_j Σ_{s=1}^{S} K(i − x + δ, j − y + δ, s, t) · U(i, j, s)    (1)
where δ = (d − 1)/2 and i, j run over the d × d input neighborhood of (x, y). Decomposing the kernel tensor K by CP decomposition gives formula (2):
K(i, j, s, t) = Σ_{r=1}^{R} k_x(i, r) · k_y(j, r) · k_s(s, r) · k_t(t, r)    (2)
In formula (2), k_x, k_y, k_s and k_t are the four component matrices of sizes d × R, d × R, S × R and T × R, respectively. Substituting formula (2) into formula (1) gives formula (3):
V(x, y, t) = Σ_{r=1}^{R} k_t(t, r) [ Σ_i k_x(i − x + δ, r) [ Σ_j k_y(j − y + δ, r) [ Σ_{s=1}^{S} k_s(s, r) · U(i, j, s) ] ] ]    (3)
Thus, the output V(x, y, t) of the convolutional layer can be calculated with the low-rank convolution kernels as a chain of four small operations, formulas (4) to (7):
U_s(i, j, r) = Σ_{s=1}^{S} k_s(s, r) · U(i, j, s)    (4)
U_y(i, y, r) = Σ_j k_y(j − y + δ, r) · U_s(i, j, r)    (5)
U_x(x, y, r) = Σ_i k_x(i − x + δ, r) · U_y(i, y, r)    (6)
V(x, y, t) = Σ_{r=1}^{R} k_t(t, r) · U_x(x, y, r)    (7)
As described above, the output of the target convolutional layer can be expressed by formula (7). When the target convolutional layer performs convolution processing, the complexity of the multiplication calculation changes from X · Y · d² · S · T to X · Y · R · (2d + S + T). Because R · (2d + S + T) is far smaller than d² · S · T, the amount of calculation is reduced and the efficiency of face detection can be improved.
Optionally, if the convolution kernel of the target convolutional layer is a full-rank matrix, performing CP decomposition on the convolution kernel of the target convolutional layer to obtain a low-rank convolution kernel of the target convolutional layer.
In implementation, after the server obtains the convolution kernel of the target convolution layer, the server may determine the rank of the convolution kernel, and if the rank of the convolution kernel is equal to the number of rows of the convolution kernel or equal to the number of columns of the convolution kernel, the server determines that the convolution kernel is a full-rank matrix, and then performs CP decomposition on the target convolution kernel to obtain a low-rank convolution kernel of the target convolution layer.
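The full-rank check described above can be sketched as follows; the function name is a hypothetical illustration:

```python
import numpy as np

def is_full_rank(m):
    """A matrix is full rank when its rank equals the smaller of its row
    and column counts (equivalently, equals the rows or the columns of a
    square matrix, as in the check described above)."""
    return np.linalg.matrix_rank(m) == min(m.shape)

print(is_full_rank(np.eye(3)))        # True: rank 3
print(is_full_rank(np.ones((3, 3))))  # False: rank 1
```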
In step 103, in the deep convolutional network model to be used, the convolutional kernel of the target convolutional layer is replaced by a corresponding low-rank convolutional kernel, so as to obtain an adjusted deep convolutional network model.
In implementation, after the server determines the low-rank convolution kernel of the target convolution kernel, each convolution kernel of the target convolution layer may be replaced by a corresponding low-rank convolution kernel in the deep convolution network model to be used, and the low-rank convolution kernels are stored to obtain the adjusted deep convolution network model.
In step 104, face detection is performed on the image based on the adjusted deep convolutional network model.
In implementation, as shown in fig. 3, after the server determines the adjusted deep convolutional network model, the model may be used for face detection on an image. The processing may be as follows: the image to be detected is input into the adjusted deep convolutional network model, and an N × N feature map is obtained after convolution processing and pooling processing. The N × N feature map is then divided into a preset number of image blocks of equal size. For each image block, taking the center point of the image block as the center point of the candidate frames, candidate frames with width-to-height ratios of 1:2, 1:1 and 2:1 and areas of 128², 256² and 512² are added to the image block, the position information of each candidate frame is determined, and the image feature vector of the image inside each candidate frame is obtained. Fully-connected processing is then performed: the obtained image feature vectors are multiplied by a preset matrix W to obtain the category of the image features contained in each candidate frame and the position adjustment needed for each candidate frame, so that the position information of the face in the image can be determined.
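The candidate-frame generation step can be sketched as below. This is a hypothetical illustration; the function name and the (x_min, y_min, x_max, y_max) coordinate convention are assumptions, and only the three aspect ratios and three areas come from the description above:

```python
def candidate_frames(cx, cy):
    """Nine candidate frames centered on (cx, cy): three aspect ratios
    (1:2, 1:1, 2:1) times three areas (128^2, 256^2, 512^2)."""
    frames = []
    for area in (128**2, 256**2, 512**2):
        for w_over_h in (0.5, 1.0, 2.0):
            h = (area / w_over_h) ** 0.5  # width * height == area
            w = w_over_h * h
            frames.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return frames

frames = candidate_frames(300, 300)
print(len(frames))  # 9 candidate frames per image block
```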
The embodiment of the present disclosure further provides a process of retraining the adjusted deep convolutional network model, and the corresponding processing may be as follows:
and setting the value of the model parameter in the adjusted deep convolutional network model as the training initial value of the model parameter of the adjusted deep convolutional network model, retraining the adjusted deep convolutional network model, and performing face detection on the image based on the retrained deep convolutional network model.
The model parameters include the parameters in the convolution kernels as well as the other parameters in the deep convolutional network model, such as the parameters in the pooling kernels.
In implementation, after the adjusted deep convolutional network model is determined, values of model parameters in the adjusted deep convolutional network model can be obtained, then the obtained values of the model parameters are used as training initial values of the model parameters of the convolutional network model, the adjusted deep convolutional network model is retrained, and after the retrained deep convolutional network model is obtained, face detection can be performed on the image based on the retrained deep convolutional network model.
Optionally, the process of retraining the adjusted deep convolutional network model is the same as the training process of a general convolutional network model, as shown in fig. 4, the specific processing steps may be as follows:
in step 401, for each preset sample image, a training value of a model parameter corresponding to the sample image is determined based on an error back-transmission algorithm, where when a value of the model parameter in the adjusted deep convolutional network model is the training value and an input image of the adjusted deep convolutional network model is the sample image, an output value of the adjusted deep convolutional network model and a preset reference output value corresponding to the sample image satisfy a preset matching condition.
The preset matching condition may be that a difference between an output value of the adjusted depth convolution network model and a preset reference output value corresponding to the sample image is smaller than a preset threshold value, and the like.
In implementation, after the adjusted deep convolutional network model is determined, the values of the model parameters in the adjusted deep convolutional network model can be obtained, and preset sample images are obtained, where each sample image corresponds to a preset reference output value. In the training process, an objective function corresponding to the adjusted deep convolutional network model is determined: its independent variable is the input x of the adjusted deep convolutional network model, its dependent variable is the output y, and its parameters are the model parameters w (w denotes a number of parameters), so the objective function can be written as y = f(x; w). A certain sample image (which may be called the first sample image) is taken as the input of the adjusted deep convolutional network model and forward propagation is performed to determine the value of y. If the value of y and the preset reference output value corresponding to the sample image do not satisfy the preset matching condition, the difference between the two is taken and squared to obtain a loss function L. Then, using the error back-propagation method, a backward propagation pass is executed through the adjusted deep convolutional network model: a preset learning rate α is first obtained, then the partial derivative of the loss function with respect to each parameter is taken, and the parameter value to be used next time is calculated as w′ = w − α · ∂L/∂w, until the next parameter value of every model parameter in the adjusted deep convolutional network model has been calculated.
Then, the determined parameter values of the model parameters are updated into the adjusted deep convolutional network model, the first sample image is again taken as the input of the adjusted deep convolutional network model, and forward propagation and backward propagation are executed until the output value obtained with the first sample image as input and the preset reference output value corresponding to the sample image satisfy the preset matching condition; the parameter values of the model parameters at that point are determined as the training values. The above process is a training process based on one sample image, and it is executed for each sample image until the training values of the model parameters corresponding to each sample image have been determined.
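A minimal one-parameter sketch of the forward/backward loop described above. All values here are hypothetical; a real model has many parameters, but the update rule per parameter is the same:

```python
# Forward pass, squared-error loss L = (y - y_ref)^2, then the update
# w' = w - alpha * dL/dw, repeated until the output matches the reference.

alpha = 0.1          # small preset learning rate
w = 0.5              # training initial value taken from the adjusted model
x, y_ref = 2.0, 3.0  # sample input and its preset reference output

for _ in range(100):
    y = w * x                   # forward propagation
    loss = (y - y_ref) ** 2     # squared difference
    grad = 2 * (y - y_ref) * x  # partial derivative dL/dw
    w = w - alpha * grad        # parameter value used next time
    if loss < 1e-12:            # preset matching condition
        break

print(round(w, 4))  # 1.5  (so y = w * x matches y_ref = 3.0)
```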
In step 402, an average of the training values of the model parameters corresponding to each sample image is determined.
In implementation, the training values of the model parameters determined by using each sample image are respectively averaged to obtain the values of each model parameter in the adjusted deep convolutional network model.
In step 403, the values of the model parameters in the adjusted deep convolutional network model are adjusted to corresponding average values, so as to obtain a retrained deep convolutional network model.
In implementation, the values of the model parameters in the adjusted deep convolutional network model are respectively adjusted to corresponding average values and stored, so that the retrained deep convolutional network model is obtained.
As for the learning rate mentioned in the training process: since retraining only fine-tunes the parameter values starting from the already-determined parameter values of the adjusted deep convolutional network model, the learning rate can be set to a small value.
In the embodiment of the disclosure, in the deep convolutional network model to be used, the server may obtain the convolution kernels of the target convolutional layer, perform CP decomposition on the convolution kernels of the target convolutional layer to obtain the low-rank convolution kernels of the target convolutional layer, replace the convolution kernels of the target convolutional layer with the corresponding low-rank convolution kernels in the deep convolutional network model to be used to obtain an adjusted deep convolutional network model, and, in the subsequent face detection process, perform face detection on images based on the adjusted deep convolutional network model. Therefore, when the deep convolutional network model is used for face detection, the convolution kernels of the convolutional layer are low-rank convolution kernels; the low-rank convolution kernels have fewer parameters and require less data processing, so the processing speed of face detection can be improved.
Another embodiment of the present disclosure provides an apparatus for detecting a human face, as shown in fig. 5, the apparatus including:
an obtaining module 510, configured to obtain a convolution kernel of the target convolution layer in a deep convolution network model to be used;
a decomposition module 520, configured to perform canonical polyadic (CP) decomposition on the convolution kernel of the target convolution layer to obtain a low-rank convolution kernel of the target convolution layer;
a replacing module 530, configured to replace, in the deep convolutional network model to be used, the convolutional kernel of the target convolutional layer with a corresponding low-rank convolutional kernel, so as to obtain an adjusted deep convolutional network model;
and a detection module 540, configured to perform face detection on the image based on the adjusted deep convolutional network model.
Optionally, as shown in fig. 6, the apparatus further includes:
a training module 550, configured to set a value of a model parameter in the adjusted deep convolutional network model as a training initial value of the model parameter of the adjusted deep convolutional network model, and retrain the adjusted deep convolutional network model;
the detecting module 540 is configured to:
perform face detection on the image based on the retrained deep convolutional network model.
Optionally, as shown in fig. 7, the training module 550 includes a first determining submodule 551, a second determining submodule 552 and an adjusting submodule 553, wherein:
the first determining submodule 551 is configured to determine, for each preset sample image, a training value of the model parameter corresponding to the sample image based on an error back-propagation algorithm, where when a value of the model parameter in the adjusted deep convolutional network model is the training value and an input image of the adjusted deep convolutional network model is the sample image, an output value of the adjusted deep convolutional network model and a preset reference output value corresponding to the sample image satisfy a preset matching condition;
the second determining submodule 552 is configured to determine an average value of the training values of the model parameter corresponding to each sample image;
the adjusting submodule 553 is configured to adjust the values of the model parameters in the adjusted deep convolutional network model to corresponding average values, so as to obtain a retrained deep convolutional network model.
Optionally, the decomposition module 520 is configured to:
perform CP decomposition on the d × d × S × T convolution kernel of the target convolution layer to obtain four low-rank convolution kernels of sizes d × R, d × R, S × R and T × R of the target convolution layer, where d is the number of rows and columns of the convolution kernel, S is the number of color channels, T is the number of convolution kernels of the target convolution layer, and R is the rank of the convolution kernel of the target convolution layer.
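The decomposition described above can be checked numerically. The sketch below (a NumPy illustration under assumed sizes, not the disclosure's implementation) builds a rank-R kernel from four CP factors and verifies that applying the four small factors in sequence reproduces the direct convolution with the full d × d × S × T kernel:

```python
import numpy as np

rng = np.random.default_rng(0)
d, S, T, R = 3, 4, 5, 2          # assumed sizes for illustration
H, W_in = 8, 8
Hout, Wout = H - d + 1, W_in - d + 1

X = rng.standard_normal((d, R))   # vertical spatial factor, d x R
Y = rng.standard_normal((d, R))   # horizontal spatial factor, d x R
Z = rng.standard_normal((S, R))   # input-channel factor, S x R
Wf = rng.standard_normal((T, R))  # output-channel factor, T x R
img = rng.standard_normal((S, H, W_in))

# Direct path: reassemble the rank-R kernel and convolve ("valid" mode).
K = np.einsum('ir,jr,sr,tr->ijst', X, Y, Z, Wf)   # d x d x S x T
direct = np.zeros((T, Hout, Wout))
for i in range(d):
    for j in range(d):
        direct += np.einsum('st,shw->thw', K[i, j], img[:, i:i + Hout, j:j + Wout])

# Factorized path: 1x1 channel mix, 1 x d conv, d x 1 conv, 1x1 channel mix.
a = np.einsum('sr,shw->rhw', Z, img)                 # S -> R channels
b = np.zeros((R, H, Wout))
for j in range(d):
    b += Y[j][:, None, None] * a[:, :, j:j + Wout]   # horizontal conv per channel
c = np.zeros((R, Hout, Wout))
for i in range(d):
    c += X[i][:, None, None] * b[:, i:i + Hout, :]   # vertical conv per channel
factorized = np.einsum('tr,rhw->thw', Wf, c)         # R -> T channels

assert np.allclose(direct, factorized)
```

The factorized path costs on the order of R(2d + S + T) multiply-accumulates per output position instead of d²ST, which is where the claimed speed-up comes from.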
Optionally, the decomposition module 520 is configured to:
if the convolution kernel of the target convolution layer is a full-rank matrix, perform CP decomposition on the convolution kernel of the target convolution layer to obtain a low-rank convolution kernel of the target convolution layer.
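One plausible way to implement the full-rank test (the unfolding choice below is an assumption for illustration; the disclosure does not specify it) is to flatten the kernel tensor into a matrix and compare its numerical rank against the smaller of its two dimensions:

```python
import numpy as np

rng = np.random.default_rng(1)
d, S, T = 3, 4, 5
K = rng.standard_normal((d, d, S, T))  # a dense random kernel (full rank with seed 1)

# Unfold along the kernel axis: one row per output kernel, one column per weight.
unfolded = K.reshape(d * d * S, T).T   # shape (T, d*d*S)
is_full_rank = np.linalg.matrix_rank(unfolded) == min(unfolded.shape)
print(is_full_rank)
```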
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
In the embodiment of the disclosure, in the deep convolutional network model to be used, the server may obtain the convolution kernel of the target convolution layer, perform CP decomposition on it to obtain low-rank convolution kernels of the target convolution layer, and replace the original convolution kernel with the corresponding low-rank convolution kernels to obtain an adjusted deep convolutional network model; in the subsequent image face detection process, face detection is performed based on the adjusted model. Because the convolution kernels of the convolution layer are then low-rank kernels with fewer parameters, the amount of data to be processed is smaller, and the processing speed of face detection can be improved.
It should be noted that: in the face detection device provided in the above embodiment, when performing face detection, only the division of the functional modules is illustrated, and in practical application, the function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above. In addition, the face detection apparatus provided in the above embodiment and the face detection method embodiment belong to the same concept, and specific implementation processes thereof are described in detail in the method embodiment and are not described herein again.
Yet another exemplary embodiment of the present disclosure provides a structural diagram of a server. Referring to fig. 8, server 800 includes a processing component 1922, which in turn includes one or more processors, and memory resources, represented by memory 1932, for storing instructions, such as application programs, executable by processing component 1922. The application programs stored in memory 1932 may include one or more modules, each corresponding to a set of instructions. Further, the processing component 1922 is configured to execute the instructions to perform the above-described method of face detection.
The server 800 may also include a power component 1926 configured to perform power management for the server 800, a wired or wireless network interface 1950 configured to connect the server 800 to a network, and an input/output (I/O) interface 1958. The server 800 may operate based on an operating system stored in memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
The server 800 may include memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for:
acquiring a convolution kernel of a target convolution layer in a deep convolution network model to be used;
performing canonical polyadic (CP) decomposition on the convolution kernel of the target convolution layer to obtain a low-rank convolution kernel of the target convolution layer;
in the deep convolution network model to be used, replacing the convolution kernel of the target convolution layer with a corresponding low-rank convolution kernel to obtain an adjusted deep convolution network model;
and performing face detection on the image based on the adjusted deep convolution network model.
Optionally, the method further includes:
setting the value of the model parameter in the adjusted deep convolutional network model as the training initial value of the model parameter of the adjusted deep convolutional network model, and retraining the adjusted deep convolutional network model;
the performing face detection on the image based on the adjusted deep convolution network model includes:
performing face detection on the image based on the retrained deep convolutional network model.
Optionally, the retraining the adjusted deep convolutional network model includes:
determining, for each preset sample image, a training value of the model parameter corresponding to the sample image based on an error back-propagation algorithm, wherein when the value of the model parameter in the adjusted deep convolution network model is the training value and the input image of the adjusted deep convolution network model is the sample image, the output value of the adjusted deep convolution network model and a preset reference output value corresponding to the sample image satisfy a preset matching condition;
determining the average value of the training values of the model parameters corresponding to each sample image;
and adjusting the values of the model parameters in the adjusted deep convolutional network model into corresponding average values to obtain the retrained deep convolutional network model.
Optionally, the performing CP decomposition on the convolution kernel of the target convolutional layer to obtain a low-rank convolution kernel of the target convolutional layer includes:
performing CP decomposition on the d × d × S × T convolution kernel of the target convolution layer to obtain four low-rank convolution kernels of sizes d × R, d × R, S × R and T × R of the target convolution layer, wherein d is the number of rows and columns of the convolution kernel, S is the number of color channels, T is the number of convolution kernels of the target convolution layer, and R is the rank of the convolution kernel of the target convolution layer.
Optionally, the performing CP decomposition on the convolution kernel of the target convolutional layer to obtain a low-rank convolution kernel of the target convolutional layer includes:
if the convolution kernel of the target convolution layer is a full-rank matrix, performing CP decomposition on the convolution kernel of the target convolution layer to obtain a low-rank convolution kernel of the target convolution layer.
In the embodiment of the disclosure, in the deep convolutional network model to be used, the server may obtain the convolution kernel of the target convolution layer, perform CP decomposition on it to obtain low-rank convolution kernels of the target convolution layer, and replace the original convolution kernel with the corresponding low-rank convolution kernels to obtain an adjusted deep convolutional network model; in the subsequent image face detection process, face detection is performed based on the adjusted model. Because the convolution kernels of the convolution layer are then low-rank kernels with fewer parameters, the amount of data to be processed is smaller, and the processing speed of face detection can be improved.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (11)

1. A method of face detection, the method comprising:
acquiring a convolution kernel of a target convolution layer in a deep convolution network model to be used;
performing canonical polyadic (CP) decomposition on the convolution kernel of the target convolution layer to obtain a low-rank convolution kernel of the target convolution layer;
in the deep convolution network model to be used, replacing the convolution kernel of the target convolution layer with a corresponding low-rank convolution kernel to obtain an adjusted deep convolution network model;
and performing face detection on the image based on the adjusted deep convolution network model.
2. The method of claim 1, further comprising:
setting the value of the model parameter in the adjusted deep convolutional network model as the training initial value of the model parameter of the adjusted deep convolutional network model, and retraining the adjusted deep convolutional network model;
the performing face detection on the image based on the adjusted deep convolution network model comprises:
performing face detection on the image based on the retrained deep convolutional network model.
3. The method of claim 2, wherein the retraining the adjusted deep convolutional network model comprises:
determining, for each preset sample image, a training value of the model parameter corresponding to the sample image based on an error back-propagation algorithm, wherein when the value of the model parameter in the adjusted deep convolution network model is the training value and the input image of the adjusted deep convolution network model is the sample image, the output value of the adjusted deep convolution network model and a preset reference output value corresponding to the sample image satisfy a preset matching condition;
determining the average value of the training values of the model parameters corresponding to each sample image;
and adjusting the values of the model parameters in the adjusted deep convolutional network model into corresponding average values to obtain the retrained deep convolutional network model.
4. The method of claim 1, wherein the performing CP decomposition on the convolution kernels of the target convolutional layer to obtain low-rank convolution kernels of the target convolutional layer comprises:
performing CP decomposition on the d × d × S × T convolution kernel of the target convolution layer to obtain four low-rank convolution kernels of sizes d × R, d × R, S × R and T × R of the target convolution layer, wherein d is the number of rows and columns of the convolution kernel, S is the number of color channels, T is the number of convolution kernels of the target convolution layer, and R is the rank of the convolution kernel of the target convolution layer.
5. The method of claim 1, wherein the performing CP decomposition on the convolution kernels of the target convolutional layer to obtain low-rank convolution kernels of the target convolutional layer comprises:
if the convolution kernel of the target convolution layer is a full-rank matrix, performing CP decomposition on the convolution kernel of the target convolution layer to obtain a low-rank convolution kernel of the target convolution layer.
6. An apparatus for face detection, the apparatus comprising:
the acquisition module is used for acquiring a convolution kernel of the target convolution layer in the deep convolution network model to be used;
the decomposition module is used for performing canonical polyadic (CP) decomposition on the convolution kernel of the target convolution layer to obtain a low-rank convolution kernel of the target convolution layer;
the replacing module is used for replacing the convolution kernel of the target convolution layer with a corresponding low-rank convolution kernel in the deep convolution network model to be used to obtain an adjusted deep convolution network model;
and the detection module is used for performing face detection on the image based on the adjusted deep convolution network model.
7. The apparatus of claim 6, further comprising:
the training module is used for setting the value of the model parameter in the adjusted deep convolutional network model as the training initial value of the model parameter of the adjusted deep convolutional network model and retraining the adjusted deep convolutional network model;
the detection module is configured to:
performing face detection on the image based on the retrained deep convolutional network model.
8. The apparatus of claim 7, wherein the training module comprises a first determination sub-module, a second determination sub-module, and an adjustment sub-module, wherein:
the first determining submodule is configured to determine, for each preset sample image, a training value of the model parameter corresponding to the sample image based on an error back-propagation algorithm, where when a value of the model parameter in the adjusted deep convolutional network model is the training value and an input image of the adjusted deep convolutional network model is the sample image, an output value of the adjusted deep convolutional network model and a preset reference output value corresponding to the sample image satisfy a preset matching condition;
the second determining submodule is used for determining the average value of the training values of the model parameters corresponding to each sample image;
and the adjusting submodule is used for adjusting the values of the model parameters in the adjusted deep convolutional network model to corresponding average values to obtain the retrained deep convolutional network model.
9. The apparatus of claim 6, wherein the decomposition module is configured to:
performing CP decomposition on the d × d × S × T convolution kernel of the target convolution layer to obtain four low-rank convolution kernels of sizes d × R, d × R, S × R and T × R of the target convolution layer, wherein d is the number of rows and columns of the convolution kernel, S is the number of color channels, T is the number of convolution kernels of the target convolution layer, and R is the rank of the convolution kernel of the target convolution layer.
10. The apparatus of claim 6, wherein the decomposition module is configured to:
if the convolution kernel of the target convolution layer is a full-rank matrix, performing CP decomposition on the convolution kernel of the target convolution layer to obtain a low-rank convolution kernel of the target convolution layer.
11. An apparatus for face detection, the apparatus comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
acquiring a convolution kernel of a target convolution layer in a deep convolution network model to be used;
performing CP decomposition on the convolution kernel of the target convolution layer to obtain a low-rank convolution kernel of the target convolution layer;
in the deep convolution network model to be used, replacing the convolution kernel of the target convolution layer with a corresponding low-rank convolution kernel to obtain an adjusted deep convolution network model;
and performing face detection on the image based on the adjusted deep convolution network model.
CN201611082414.9A 2016-11-30 2016-11-30 Face detection method and device Active CN106778550B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611082414.9A CN106778550B (en) 2016-11-30 2016-11-30 Face detection method and device


Publications (2)

Publication Number Publication Date
CN106778550A true CN106778550A (en) 2017-05-31
CN106778550B CN106778550B (en) 2020-02-07

Family

ID=58898294

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611082414.9A Active CN106778550B (en) 2016-11-30 2016-11-30 Face detection method and device

Country Status (1)

Country Link
CN (1) CN106778550B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107943750A (en) * 2017-11-14 2018-04-20 华南理工大学 A kind of decomposition convolution method based on WGAN models
CN110287857A (en) * 2019-06-20 2019-09-27 厦门美图之家科技有限公司 A kind of training method of characteristic point detection model
CN110858323A (en) * 2018-08-23 2020-03-03 北京京东金融科技控股有限公司 Convolution-based image processing method, convolution-based image processing device, convolution-based image processing medium and electronic equipment
CN115719430A (en) * 2022-10-28 2023-02-28 河北舒隽科技有限公司 Method for identifying male and female of Taixing chick

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103425999A (en) * 2013-08-27 2013-12-04 西安电子科技大学 Brain cognitive state judgment method based on non-negative tensor projection operator decomposition algorithm
CN104318064A (en) * 2014-09-26 2015-01-28 大连理工大学 Three-dimensional head-related impulse response data compressing method based on canonical multi-decomposition
CN105844653A (en) * 2016-04-18 2016-08-10 深圳先进技术研究院 Multilayer convolution neural network optimization system and method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
VADIM LEBEDEV ET AL.: "Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition", arXiv:1412.6553v3 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107943750A (en) * 2017-11-14 2018-04-20 华南理工大学 A kind of decomposition convolution method based on WGAN models
CN110858323A (en) * 2018-08-23 2020-03-03 北京京东金融科技控股有限公司 Convolution-based image processing method, convolution-based image processing device, convolution-based image processing medium and electronic equipment
CN110858323B (en) * 2018-08-23 2024-07-19 京东科技控股股份有限公司 Convolution-based image processing method, convolution-based image processing device, convolution-based image processing medium and electronic equipment
CN110287857A (en) * 2019-06-20 2019-09-27 厦门美图之家科技有限公司 A kind of training method of characteristic point detection model
CN115719430A (en) * 2022-10-28 2023-02-28 河北舒隽科技有限公司 Method for identifying male and female of Taixing chick

Also Published As

Publication number Publication date
CN106778550B (en) 2020-02-07

Similar Documents

Publication Publication Date Title
TWI721510B (en) Method, apparatus and storage medium for binocular image depth estimation
CN106778550B (en) Face detection method and device
Zhang et al. Hierarchical feature fusion with mixed convolution attention for single image dehazing
CN107292352B (en) Image classification method and device based on convolutional neural network
CN104463209B (en) Method for recognizing digital code on PCB based on BP neural network
WO2019119301A1 (en) Method and device for determining feature image in convolutional neural network model
WO2016197026A1 (en) Full reference image quality assessment based on convolutional neural network
JP2021509747A (en) Hardware-based pooling system and method
CN114004754B (en) Scene depth completion system and method based on deep learning
US20220083857A1 (en) Convolutional neural network operation method and device
CN110874636A (en) Neural network model compression method and device and computer equipment
US11275966B2 (en) Calculation method using pixel-channel shuffle convolutional neural network and operating system using the same
CN111695624B (en) Updating method, device, equipment and storage medium of data enhancement strategy
Jiang et al. Learning a referenceless stereopair quality engine with deep nonnegativity constrained sparse autoencoder
US9058541B2 (en) Object detection method, object detector and object detection computer program
Sun et al. Learning local quality-aware structures of salient regions for stereoscopic images via deep neural networks
Gastaldo et al. Machine learning solutions for objective visual quality assessment
CN111814884A (en) Target detection network model upgrading method based on deformable convolution
CN116543433A (en) Mask wearing detection method and device based on improved YOLOv7 model
CN111814820A (en) Image processing method and device
CN109978928B (en) Binocular vision stereo matching method and system based on weighted voting
CN109934775B (en) Image processing, model training, method, device and storage medium
CN113888597A (en) Target tracking identification method and system based on lightweight target tracking network
CN111027670B (en) Feature map processing method and device, electronic equipment and storage medium
EP4348510A1 (en) Convolution with kernel expansion and tensor accumulation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant