CN110210329A - Face detection method, device and equipment - Google Patents
- Publication number
- CN110210329A CN201910393574.2A
- Authority
- CN
- China
- Prior art keywords
- depth
- convolution
- data
- face
- separable
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Biomedical Technology (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a face detection method, comprising: inputting face data into a pre-trained depthwise separable convolution, wherein the depthwise separable convolution uses a two-way dense structure; integrating the two-way output data of the depthwise separable convolution with the face data to output feature data; and mapping the feature data to preset anchor boxes to output the face detection result. The invention also discloses a face detection device and equipment. The embodiments of the invention can effectively reduce model size and the amount of computation during face detection.
Description
Technical field
The present invention relates to face detection technology, and in particular to a face detection method, device and equipment.
Background
A face detection algorithm accurately locates faces in image information, based on face data transmitted from image or video surveillance. The technology is widely deployed in customs checkpoint monitoring, fugitive tracking and personnel management. Its technical route can be summarized as follows: extract the basic features of a face, then decide at a certain confidence level whether a face is present in the image. Some earlier schemes used hand-crafted features as the basis for detection, for example AdaBoost and SVM feature classification. Their advantages are simple operation and fast running speed, but these methods generalize poorly across scenes, so both their detection accuracy and their recall are low. Later, as deep learning technology advanced and GPU computing power kept growing, more and more technical solutions came to be based on convolutional neural networks. The prior art commonly uses a face detection algorithm based on the Single Shot Detector (SSD), whose feature extraction network is based on the deep learning network VGG. Although accuracy and recall are effectively improved, the network model is large, real-time detection performance is low and video memory usage is high, which hinders packaging for mobile devices. Face detection with this approach also involves a large amount of computation.
Summary of the invention
The purpose of the embodiments of the present invention is to provide a face detection method, device and equipment that can effectively reduce model size and the amount of computation during face detection.
To achieve the above object, an embodiment of the present invention provides a face detection method, comprising:
inputting face data into a pre-trained depthwise separable convolution, wherein the depthwise separable convolution uses a two-way dense structure;
integrating the two-way output data of the depthwise separable convolution with the face data to output feature data;
mapping the feature data to preset anchor boxes and outputting the face detection result.
Compared with the prior art, in the face detection method disclosed by the invention, face data is first input into a pre-trained depthwise separable convolution. Using a depthwise separable convolution model with a two-way dense structure greatly reduces model size and, at the same time, the amount of computation during face detection. The two-way output data of the depthwise separable convolution is then integrated with the original input data to extract feature data. Integrating the convolved information with the original input not only passes shallow face data more directly to the deep layers while features are extracted by convolution, but also mitigates the vanishing gradient problem of deep learning networks. Finally, the feature data is mapped to preset anchor boxes to output the face detection result. This keeps the network's predictions of face information within a certain range and gives the network a reliable basis for face size, so that it learns and predicts face information better and maintains high detection performance with a limited number of network parameters.
As an improvement of the above scheme, after integrating the two-way output data of the depthwise separable convolution with the face data to output the feature data, the method further comprises:
inputting the feature data into the next depthwise separable convolution;
judging whether the next depthwise separable convolution is the preset target depthwise separable convolution;
if so, integrating the two-way output data of the next depthwise separable convolution with the feature data to output target feature data; if not, integrating the two-way output data of the next depthwise separable convolution with the feature data and continuing to input the integrated data into the following depthwise separable convolution, until the next depthwise separable convolution is the target depthwise separable convolution;
then, mapping the feature data to preset anchor boxes to obtain the face detection result comprises:
mapping the target feature data to preset anchor boxes to obtain the face detection result.
As an improvement of the above scheme, the depthwise separable convolution comprises a first output channel and a second output channel, wherein:
the first output channel comprises a 1*1 convolutional layer, a first convergence layer, a 3*3 depthwise separable convolutional layer, a second convergence layer and a non-linear layer;
the second output channel comprises a 1*1 convolutional layer, a first convergence layer, a 5*5 depthwise separable convolutional layer, a second convergence layer and a non-linear layer.
As an improvement of the above scheme, the non-linear layer is a ReLU layer or a C.ReLU layer.
As an improvement of the above scheme, the depthwise separable convolution is trained using a loss function, and the training method of the depthwise separable convolution comprises:
inputting a loss value into the loss function, wherein the loss value is the difference between predicted feature data and real feature data;
judging whether the loss value of the loss function converges to a preset value;
if so, outputting the trained depthwise separable convolution; if not, adjusting the parameters of the depthwise separable convolution until the loss value converges to the preset value.
As an improvement of the above scheme, before inputting the face data into the pre-trained depthwise separable convolution, the method further comprises:
preprocessing the face data.
An embodiment of the present invention also provides a face detection device, comprising:
a data input module, configured to input face data into a pre-trained depthwise separable convolution, wherein the depthwise separable convolution uses a two-way dense structure;
a data integration module, configured to integrate the two-way output data of the depthwise separable convolution with the face data to output feature data;
a mapping module, configured to map the feature data to preset anchor boxes and output the face detection result.
As an improvement of the above scheme, the data integration module is further configured to:
input the feature data into the next depthwise separable convolution;
judge whether the next depthwise separable convolution is the preset target depthwise separable convolution;
if so, integrate the two-way output data of the next depthwise separable convolution with the feature data to output target feature data; if not, integrate the two-way output data of the next depthwise separable convolution with the feature data and continue to input the integrated data into the following depthwise separable convolution, until the next depthwise separable convolution is the target depthwise separable convolution;
then, the mapping module is specifically configured to:
map the target feature data to preset anchor boxes to obtain the face detection result.
As an improvement of the above scheme, the depthwise separable convolution comprises a first output channel and a second output channel, wherein:
the first output channel comprises a 1*1 convolutional layer, a first convergence layer, a 3*3 depthwise separable convolutional layer, a second convergence layer and a non-linear layer;
the second output channel comprises a 1*1 convolutional layer, a first convergence layer, a 5*5 depthwise separable convolutional layer, a second convergence layer and a non-linear layer.
To achieve the above object, an embodiment of the present invention also provides face detection equipment, comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor implements the face detection method described in any of the above embodiments when executing the computer program.
Brief description of the drawings
Fig. 1 is a flow chart of a face detection method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the depthwise separable convolution in a face detection method provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of integrating the two-way output data of the depthwise separable convolution with the original input data in a face detection method provided by an embodiment of the present invention;
Fig. 4 is a flow chart of another face detection method provided by an embodiment of the present invention;
Fig. 5 is a flow chart of the training method of the depthwise separable convolution in a face detection method provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a face detection device provided by an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of face detection equipment provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Embodiment one
Referring to Fig. 1, Fig. 1 is a flow chart of a face detection method provided by an embodiment of the present invention. The method comprises:
S11, inputting face data into a pre-trained depthwise separable convolution, wherein the depthwise separable convolution uses a two-way dense structure;
S12, integrating the two-way output data of the depthwise separable convolution with the face data to output feature data;
S13, mapping the feature data to preset anchor boxes and outputting the face detection result.
It is worth noting that the basic model used in the embodiment of the present invention is designed around the structure of the depthwise separable convolution. A depthwise separable convolution decomposes a conventional convolution into two parts, a depthwise convolution and a pointwise convolution, which considerably reduces the amount of computation while retaining the performance of the conventional convolution. However, compressing a model usually also reduces its performance. The embodiment of the present invention therefore uses a two-way dense structure based on depthwise separable convolution, also called the two-way Depthwise Dense Block (DDB). While depthwise separable convolutions are stacked, the shallow-layer information of the whole deep learning network is continually passed to the deep layers. Shallow features are passed on to deep features so that contextual information is continually fused, which gives the deep convolution modules more accurate face information.
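The decomposition into a depthwise and a pointwise convolution can be illustrated with a minimal pure-Python sketch. It is not the patent's implementation: the function names, the nested-list data layout, stride 1 and the absence of padding are all assumptions chosen for readability.

```python
def depthwise_conv(channels, kernels):
    """Filter each channel with its own k x k kernel (stride 1, no padding)."""
    out = []
    for image, kernel in zip(channels, kernels):
        k = len(kernel)
        h, w = len(image) - k + 1, len(image[0]) - k + 1
        out.append([[sum(image[y + i][x + j] * kernel[i][j]
                         for i in range(k) for j in range(k))
                     for x in range(w)] for y in range(h)])
    return out


def pointwise_conv(channels, weights):
    """Mix channels with a 1 x 1 convolution; weights[o][c] maps input c to output o."""
    h, w = len(channels[0]), len(channels[0][0])
    return [[[sum(wc * channels[c][y][x] for c, wc in enumerate(row))
              for x in range(w)] for y in range(h)]
            for row in weights]


def depthwise_separable_conv(channels, kernels, weights):
    """Depthwise spatial filtering followed by pointwise channel mixing."""
    return pointwise_conv(depthwise_conv(channels, kernels), weights)
```

With C input channels, C' output channels and a k x k kernel, this factorization holds C·k·k + C·C' weights, whereas a conventional convolution holds C·C'·k·k, which is where the saving comes from.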
Preferably, before step S11 is executed, the face detection method further comprises:
S10, preprocessing the face data. Preferably, the face data includes image data and video data. The preprocessing includes random cropping, flipping, scaling and pixel distortion of the image. These steps make the training data more random, which further improves the generalization ability of the deep learning network, and hence its detection performance, without increasing the size of the network model.
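The four preprocessing operations can be sketched on a toy grayscale image stored as a list of rows. This is only an illustration under assumptions: the function names, the crop-size range, the 50% flip probability and the brightness-shift form of pixel distortion are not specified by the source.

```python
import random


def horizontal_flip(image):
    """Mirror each row of a 2-D grayscale image (list of rows)."""
    return [row[::-1] for row in image]


def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position."""
    top = rng.randint(0, len(image) - crop_h)
    left = rng.randint(0, len(image[0]) - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def resize_nearest(image, out_h, out_w):
    """Scale to out_h x out_w with nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [[image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]


def distort_pixels(image, rng, max_shift=20):
    """Add one random brightness shift and clamp values to 0..255."""
    shift = rng.randint(-max_shift, max_shift)
    return [[min(255, max(0, p + shift)) for p in row] for row in image]


def augment(image, out_size, rng):
    """Apply crop, optional flip, scaling and pixel distortion in sequence."""
    h, w = len(image), len(image[0])
    crop_h = rng.randint(max(1, h // 2), h)   # assumed crop range: half to full size
    crop_w = rng.randint(max(1, w // 2), w)
    out = random_crop(image, crop_h, crop_w, rng)
    if rng.random() < 0.5:                    # assumed flip probability
        out = horizontal_flip(out)
    out = resize_nearest(out, out_size, out_size)
    return distort_pixels(out, rng)
```

In a real training pipeline the face bounding boxes would have to be cropped, flipped and scaled together with the image; that bookkeeping is omitted here.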
Specifically, in step S11, the two-way DDB structure has a growth rate, that is, the number of channels of the convolution, which is set to 32 in the embodiment of the present invention. Referring to Fig. 2, Fig. 2 is a schematic structural diagram of the depthwise separable convolution in a face detection method provided by an embodiment of the present invention. The depthwise separable convolution comprises a first output channel and a second output channel. The first output channel comprises a 1*1 convolutional layer, a first convergence layer, a 3*3 depthwise separable convolutional layer, a second convergence layer and a non-linear layer. The second output channel comprises a 1*1 convolutional layer, a first convergence layer, a 5*5 depthwise separable convolutional layer, a second convergence layer and a non-linear layer.
Preferably, the first convergence layer is Batch Normalization, the second convergence layer is Batch Normalization, and the non-linear layer is a ReLU layer or a C.ReLU layer. Specifically, a 1*1 convolution compresses the input channels to the growth rate, followed by a Batch Normalization that improves the convergence of the network. Feature extraction is then performed with a 3*3 and a 5*5 depthwise separable convolution respectively, each followed by Batch Normalization and the non-linear function ReLU/C.ReLU, which improves the non-linear characteristics of the network.
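The layer order of the two branches can be written down as a small declarative specification. The sketch below only lists layers in the order the text gives them; it is not a trainable network, and the dictionary keys and function names are hypothetical.

```python
GROWTH_RATE = 32  # value used in the embodiment


def ddb_branch(in_channels, kernel_size, nonlinearity="ReLU", growth_rate=GROWTH_RATE):
    """List the layers of one DDB branch in the order described in the text."""
    if nonlinearity not in ("ReLU", "C.ReLU"):
        raise ValueError("nonlinearity must be 'ReLU' or 'C.ReLU'")
    return [
        # 1*1 convolution compresses the input channels to the growth rate
        {"layer": "conv", "kernel": 1, "in": in_channels, "out": growth_rate},
        {"layer": "batch_norm"},                       # first convergence layer
        {"layer": "depthwise_conv", "kernel": kernel_size, "channels": growth_rate},
        {"layer": "batch_norm"},                       # second convergence layer
        {"layer": nonlinearity},                       # non-linear layer
    ]


def two_way_ddb(in_channels, nonlinearity="ReLU"):
    """Return both branches: 3*3 for small targets, 5*5 for large targets."""
    return {
        "branch_3x3": ddb_branch(in_channels, 3, nonlinearity),
        "branch_5x5": ddb_branch(in_channels, 5, nonlinearity),
    }
```

Such a specification could be translated layer by layer into a deep learning framework of choice; the mapping to concrete framework classes is left open here.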
In this structure, shallow-layer feature information is passed to the deep convolutions more effectively. The two-way dense structure is also used to control how the receptive field of the deep learning network varies: a 3*3 depthwise separable convolution handles the receptive field of small targets and a 5*5 depthwise separable convolution handles the receptive field of large targets, so that this technical solution copes more effectively with face detection tasks at different sizes.
A depthwise separable convolution designed on the two-way dense structure greatly saves network resources. Assume the structure has an input of 64 and an output of 96. The calculation amount of one two-way DDB structure is then 64 × 32 × 2 + 32 × 3 × 3 + 32 × 5 × 5 = 5184B, whereas the VGG of the prior art, using one conventional 3*3 convolution, has a calculation amount of 64 × 3 × 3 × 96 = 55296B. The overall network size is about 1M. Compared with the VGG-based face detection algorithm, the model is about 11 times smaller, yet thanks to its dense stacked structure it still maintains very high detection performance. Because the model is small and requires little computation, this design can be used for fast face detection on a PC back end and, in the future, for fast face detection tasks on embedded front ends.
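The arithmetic above can be checked with a few lines of Python. The function names are illustrative, and the formulas are exactly the ones quoted in the text.

```python
def two_way_ddb_cost(c_in, growth_rate):
    """Cost of one two-way DDB structure, following the formula in the text."""
    pointwise = c_in * growth_rate * 2        # two 1*1 convolutions, one per branch
    depthwise_3x3 = growth_rate * 3 * 3       # 3*3 depthwise convolution
    depthwise_5x5 = growth_rate * 5 * 5       # 5*5 depthwise convolution
    return pointwise + depthwise_3x3 + depthwise_5x5


def standard_conv_cost(c_in, c_out, kernel):
    """Cost of one conventional kernel*kernel convolution."""
    return c_in * kernel * kernel * c_out


ddb = two_way_ddb_cost(c_in=64, growth_rate=32)
vgg = standard_conv_cost(c_in=64, c_out=96, kernel=3)
print(ddb, vgg, round(vgg / ddb, 2))
```

The ratio of the two figures is about 10.7, consistent with the "about 11 times" stated in the text.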
Specifically, in step S12, the two-way output data of the depthwise separable convolution is integrated with the face data to output feature data. Preferably, the feature data is face texture information and position information. Referring to Fig. 3, Fig. 3 is a schematic diagram of integrating the two-way output data of the depthwise separable convolution with the original input data in a face detection method provided by an embodiment of the present invention. The two outputs are added onto the channels of the original input to obtain the final output. Integrating the convolved information with the original input not only passes shallow face information more directly to the deep layers while features are extracted by convolution, but also mitigates the vanishing gradient problem of deep learning networks.
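One way to read this integration step is as channel-wise concatenation, as in other dense-block designs. That reading is an assumption, since the text does not spell out the operation, and the function names below are hypothetical. Feature maps are modelled as lists of 2-D channels.

```python
def concat_channels(*feature_maps):
    """Concatenate feature maps along the channel axis.

    Each feature map is a list of channels; every channel must have the
    same height and width.
    """
    sizes = {(len(ch), len(ch[0])) for fm in feature_maps for ch in fm}
    if len(sizes) > 1:
        raise ValueError("all channels must share the same height and width")
    return [ch for fm in feature_maps for ch in fm]


def integrate(original_input, branch_3x3_out, branch_5x5_out):
    """Keep the original input and append both branch outputs after it."""
    return concat_channels(original_input, branch_3x3_out, branch_5x5_out)
```

Because the original channels are carried through unchanged, later layers see the shallow features directly, which matches the motivation given above.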
Specifically, in step S13, the feature data is mapped to preset anchor boxes and the face detection result is output. The embodiment of the present invention trains a deep neural network so that it learns face texture and position information and can thereby detect face positions. Analysis of face data shows that most faces are roughly square, but their shape and size are not fixed. Therefore, so that the algorithm learns face information better, the embodiment of the present invention predicts face position information by means of preset anchor boxes. First, suitable anchor boxes are designed manually according to the face sizes in the training data. The real face detection information is then mapped onto the anchor boxes to obtain an offset, and the network learns and predicts this offset. Finally, the predicted offset is mapped back onto the anchor boxes to compute the final predicted value and the final face position information. This method keeps the network's predictions of face information within a certain range and gives the network a reliable basis for face size, so that it learns and predicts face information better and maintains high detection performance with a limited number of network parameters.
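The mapping to and from anchor boxes can be sketched as an encode/decode pair. The text does not give the offset formula, so the centre-size encoding below, which is common in SSD-style detectors, is an assumption, as are the function names and the (cx, cy, w, h) box layout.

```python
import math


def encode(box, anchor):
    """Offsets of a ground-truth box relative to an anchor box.

    Both boxes are (cx, cy, w, h). Centre shifts are scaled by the anchor
    size and size changes are expressed as log ratios (assumed encoding).
    """
    cx, cy, w, h = box
    acx, acy, aw, ah = anchor
    return ((cx - acx) / aw, (cy - acy) / ah,
            math.log(w / aw), math.log(h / ah))


def decode(offsets, anchor):
    """Map predicted offsets back onto the anchor to recover a box."""
    dx, dy, dw, dh = offsets
    acx, acy, aw, ah = anchor
    return (acx + dx * aw, acy + dy * ah,
            aw * math.exp(dw), ah * math.exp(dh))
```

During training the network would regress the output of `encode`; at inference time `decode` turns its predictions into face positions. Predicting offsets rather than absolute coordinates is what keeps the outputs within a bounded range around each anchor.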
Further, steps S11 to S13 above perform face detection with only one depthwise separable convolution. To improve the precision of face detection, the present invention can also combine multiple depthwise separable convolutions to perform face detection. In this case, referring to Fig. 4, Fig. 4 is a flow chart of another face detection method provided by an embodiment of the present invention. The method comprises:
S21, inputting face data into a pre-trained depthwise separable convolution, wherein the depthwise separable convolution uses a two-way dense structure;
S22, integrating the two-way output data of the depthwise separable convolution with the face data to output feature data;
S23, inputting the feature data into the next depthwise separable convolution;
S24, judging whether the next depthwise separable convolution is the preset target depthwise separable convolution;
S25, if so, integrating the two-way output data of the next depthwise separable convolution with the feature data to output target feature data; if not, integrating the two-way output data of the next depthwise separable convolution with the feature data and continuing to input the integrated data into the following depthwise separable convolution, until the next depthwise separable convolution is the target depthwise separable convolution;
S26, mapping the target feature data to preset anchor boxes to obtain the face detection result.
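The loop formed by steps S21 to S25 can be sketched as follows. The blocks here are hypothetical stand-ins that return labelled outputs rather than trained convolutions, and integration is modelled as list concatenation, which is an assumption.

```python
def run_stacked_blocks(face_data, blocks, integrate):
    """Pass data through the blocks in order; the last block is the target block.

    Each block returns its two branch outputs, which are integrated with the
    block's own input before being handed to the next block.
    """
    features = face_data
    for block in blocks:
        branch_a, branch_b = block(features)
        features = integrate(features, branch_a, branch_b)
    return features  # target feature data, ready for anchor mapping (S26)


def make_toy_block(index):
    """Hypothetical block that only labels its two outputs."""
    def block(features):
        return [f"block{index}_3x3"], [f"block{index}_5x5"]
    return block


blocks = [make_toy_block(i) for i in range(1, 9)]
target_features = run_stacked_blocks(["input"], blocks, lambda x, a, b: x + a + b)
print(len(target_features))
```

Treating the final element of `blocks` as the preset target block replaces the explicit "is this the target?" check of S24 with the natural end of the loop.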
Specifically, the embodiment of the present invention uses a total of 8 two-way DDB structures with a growth rate of 32 in the whole deep learning network. In this case, the depthwise separable convolution into which the face data is input is the 1st depthwise separable convolution, and the target depthwise separable convolution is the 8th depthwise separable convolution.
Preferably, to further speed up the depthwise separable convolution, the non-linear layers in the first two two-way DDB convolutional structures of the network (the shallow structures) use the Concatenated ReLU (C.ReLU) non-linear operation, while the non-linear layers in the remaining two-way DDB convolutional structures use ReLU, which further increases the detection speed of the network. C.ReLU speeds up detection because it can replicate the output of the other half of the convolution's channels. Used together with a convolution, it lets the convolution reach the performance of the original convolution with only half the parameters. C.ReLU is used only in the shallow layers because using it in the deep layers would harm the performance of the network.
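As a small illustration, concatenated ReLU is commonly defined as the concatenation of ReLU(x) and ReLU(-x), so a convolution producing n channels yields 2n activations. The sketch assumes the C.ReLU of this document follows that common definition and works on a flat list of values for simplicity.

```python
def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def crelu(values):
    """Concatenated ReLU: ReLU of the input followed by ReLU of its negation."""
    return relu(values) + relu([-v for v in values])


print(crelu([1.0, -2.0]))
```

Because the negative half is kept instead of discarded, the preceding convolution needs only half as many filters to produce the same number of output channels.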
Further, the embodiment of the present invention trains the depthwise separable convolution using a loss function. Referring to Fig. 5, Fig. 5 is a flow chart of the training method of the depthwise separable convolution in a face detection method provided by an embodiment of the present invention. The training method comprises:
S31, inputting a loss value into the loss function, wherein the loss value is the difference between predicted feature data and real feature data;
S32, judging whether the loss value of the loss function converges to a preset value;
S33, if so, outputting the trained depthwise separable convolution; if not, adjusting the parameters of the depthwise separable convolution until the loss value converges to the preset value. Preferably, the preset value is 0.
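The check-and-adjust loop of S31 to S33 can be sketched with plain gradient descent on a toy squared-error loss. The optimizer, the learning rate, the tolerance standing in for the preset value 0 (floating-point losses rarely reach exactly 0) and the step cap are all assumptions, not details from the source.

```python
def train(params, loss_fn, grad_fn, learning_rate=0.1, tolerance=1e-6, max_steps=10_000):
    """Adjust params until the loss has converged to (approximately) the preset value."""
    for step in range(max_steps):
        loss = loss_fn(params)                       # S31: compute the loss value
        if loss <= tolerance:                        # S32: converged to preset value?
            return params, loss, step                # S33 (yes): output trained params
        grads = grad_fn(params)                      # S33 (no): adjust the parameters
        params = [p - learning_rate * g for p, g in zip(params, grads)]
    return params, loss_fn(params), max_steps


# Toy problem: the "real feature data" is a fixed target vector.
target = [1.0, -2.0]
loss_fn = lambda p: sum((pi - ti) ** 2 for pi, ti in zip(p, target))
grad_fn = lambda p: [2 * (pi - ti) for pi, ti in zip(p, target)]

trained, final_loss, steps = train([0.0, 0.0], loss_fn, grad_fn)
```

The step cap keeps the loop from running forever if the loss never reaches the tolerance, a case the flow chart in the text does not address.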
Compared with the prior art, the face detection method disclosed by the embodiment of the present invention has the following beneficial effects:
Face data is input into a pre-trained depthwise separable convolution. Using a depthwise separable convolution model with a two-way dense structure greatly reduces model size, which favours packaging for mobile devices, and also reduces the amount of computation during face detection. The two-way output data of the depthwise separable convolution is integrated with the original input data to extract feature data. Integrating the convolved information with the original input not only passes shallow face data more directly to the deep layers while features are extracted by convolution, but also mitigates the vanishing gradient problem of deep learning networks. The feature data is mapped to preset anchor boxes to output the face detection result. This keeps the network's predictions of face information within a certain range and gives the network a reliable basis for face size, so that it learns and predicts face information better and maintains high detection performance with a limited number of network parameters.
Embodiment two
Referring to Fig. 6, Fig. 6 is a schematic structural diagram of a face detection device 10 provided by an embodiment of the present invention. The device comprises:
a data input module 11, configured to input face data into a pre-trained depthwise separable convolution, wherein the depthwise separable convolution uses a two-way dense structure;
a data integration module 12, configured to integrate the two-way output data of the depthwise separable convolution with the face data to output feature data;
a mapping module 13, configured to map the feature data to preset anchor boxes and output the face detection result.
Preferably, the data integration module 12 is further configured to:
input the feature data into the next depthwise separable convolution;
judge whether the next depthwise separable convolution is the preset target depthwise separable convolution;
if so, integrate the two-way output data of the next depthwise separable convolution with the feature data to output target feature data; if not, integrate the two-way output data of the next depthwise separable convolution with the feature data and continue to input the integrated data into the following depthwise separable convolution, until the next depthwise separable convolution is the target depthwise separable convolution;
then, the mapping module 13 is specifically configured to map the target feature data to preset anchor boxes to obtain the face detection result.
Preferably, the depthwise separable convolution comprises a first output channel and a second output channel. The first output channel comprises a 1*1 convolutional layer, a first convergence layer, a 3*3 depthwise separable convolutional layer, a second convergence layer and a non-linear layer. The second output channel comprises a 1*1 convolutional layer, a first convergence layer, a 5*5 depthwise separable convolutional layer, a second convergence layer and a non-linear layer.
For the specific working process of each module of the face detection device 10, refer to the face detection method described in Embodiment one above; details are not repeated here.
Compared with the prior art, the face detection device 10 disclosed by the embodiment of the present invention has the following beneficial effects:
The data input module 11 inputs face data into a pre-trained depthwise separable convolution. Using a depthwise separable convolution model with a two-way dense structure greatly reduces model size, which favours packaging for mobile devices, and also reduces the amount of computation during face detection. The data integration module 12 integrates the two-way output data of the depthwise separable convolution with the original input data to extract feature data. Integrating the convolved information with the original input not only passes shallow face data more directly to the deep layers while features are extracted by convolution, but also mitigates the vanishing gradient problem of deep learning networks. The mapping module 13 maps the feature data to preset anchor boxes and outputs the face detection result. This keeps the network's predictions of face information within a certain range and gives the network a reliable basis for face size, so that it learns and predicts face information better and maintains high detection performance with a limited number of network parameters.
Embodiment three
Referring to Fig. 7, Fig. 7 is a schematic structural diagram of face detection equipment 20 provided by an embodiment of the present invention. The face detection equipment 20 of this embodiment comprises a processor 21, a memory 22, and a computer program stored in the memory 22 and executable on the processor 21. When executing the computer program, the processor 21 implements the steps of the face detection method embodiments above, for example steps S11 to S13 shown in Fig. 1. Alternatively, when executing the computer program, the processor 21 implements the functions of the modules/units in the device embodiments above, for example the data input module 11.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 22 and executed by the processor 21 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, the instruction segments describing the execution of the computer program in the face detection equipment 20. For example, the computer program may be divided into a data input module 11, a data integration module 12 and a mapping module 13, whose specific functions are as follows:
the data input module 11 is configured to input face data into a pre-trained depthwise separable convolution, wherein the depthwise separable convolution uses a two-way dense structure;
the data integration module 12 is configured to integrate the two-way output data of the depthwise separable convolution with the face data to output feature data;
the mapping module 13 is configured to map the feature data to preset anchor boxes and output the face detection result.
The face detection equipment 20 may be a computing device such as a desktop computer, a notebook, a palmtop computer or a cloud server. The face detection equipment 20 may include, but is not limited to, the processor 21 and the memory 22. Those skilled in the art will understand that the schematic diagram is only an example of the face detection equipment 20 and does not limit it; the equipment may include more or fewer components than shown, combine certain components, or use different components. For example, the face detection equipment 20 may also include input/output devices, network access devices, buses and the like.
The processor 21 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like. The general-purpose processor may be a microprocessor or any conventional processor. The processor 21 is the control center of the face detection equipment 20 and connects the various parts of the entire face detection equipment 20 through various interfaces and lines.
The memory 22 may be used to store the computer program and/or modules. The processor 21 implements the various functions of the face detection equipment 20 by running or executing the computer program and/or modules stored in the memory 22 and calling the data stored in the memory 22. The memory 22 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and the application programs required for at least one function (such as a sound playback function or an image playback function), and the data storage area may store data created according to the use of the mobile phone (such as audio data and a phone book). In addition, the memory 22 may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
If the modules/units integrated in the face detection equipment 20 are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the present invention may implement all or part of the processes of the above method embodiments by means of a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium, and when executed by the processor 21, the computer program implements the steps of each of the above method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in the jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
It should be noted that the device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the device embodiments provided by the present invention, a connection between modules indicates that they have a communication connection, which may be implemented as one or more communication buses or signal lines. Those of ordinary skill in the art can understand and implement this without creative effort.
The above is a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications are also regarded as falling within the protection scope of the present invention.
Claims (10)
1. A face detection method, characterized by comprising:
Inputting face data into a pre-trained depthwise separable convolution, wherein the depthwise separable convolution uses a two-way dense structure;
Integrating the two-way output data of the depthwise separable convolution with the face data to output feature data;
Mapping the feature data to preset anchor boxes and outputting the face detection result.
2. The face detection method according to claim 1, characterized in that, after integrating the two-way output data of the depthwise separable convolution with the face data to output feature data, the method further comprises:
Inputting the feature data into a next depthwise separable convolution;
Judging whether the next depthwise separable convolution is a preset target depthwise separable convolution;
If so, integrating the two-way output data of the next depthwise separable convolution with the feature data to output target feature data; if not, integrating the two-way output data of the next depthwise separable convolution with the feature data and continuing to input the integrated data into the following depthwise separable convolution, until the next depthwise separable convolution is the target depthwise separable convolution;
Then, mapping the feature data to preset anchor boxes to obtain the face detection result comprises:
Mapping the target feature data to the preset anchor boxes to obtain the face detection result.
3. The face detection method according to claim 1, characterized in that the depthwise separable convolution comprises a first output channel and a second output channel, wherein:
The first output channel comprises a 1*1 convolutional layer, a first convergence layer, a 3*3 depthwise separable convolutional layer, a second convergence layer and a non-linear layer;
The second output channel comprises a 1*1 convolutional layer, a first convergence layer, a 5*5 depthwise separable convolutional layer, a second convergence layer and a non-linear layer.
4. The face detection method according to claim 3, characterized in that the non-linear layer is a ReLU layer or a C.ReLU layer.
5. The face detection method according to claim 1, characterized in that the depthwise separable convolution is trained using a loss function, and the training method of the depthwise separable convolution comprises:
Inputting a loss value into the loss function, wherein the loss value is the difference between predicted feature data and real feature data;
Judging whether the loss value of the loss function converges to a preset value;
If so, outputting the trained depthwise separable convolution; if not, adjusting the parameters of the depthwise separable convolution until the loss value converges to the preset value.
6. The face detection method according to claim 1, characterized in that, before inputting the face data into the pre-trained depthwise separable convolution, the method further comprises:
Pre-processing the face data.
7. A face detection device, characterized by comprising:
A data input module, used to input face data into a pre-trained depthwise separable convolution, wherein the depthwise separable convolution uses a two-way dense structure;
A data integration module, used to integrate the two-way output data of the depthwise separable convolution with the face data to output feature data;
A mapping module, used to map the feature data to preset anchor boxes and output the face detection result.
8. The face detection device according to claim 1, characterized in that the data integration module is further used for:
Inputting the feature data into a next depthwise separable convolution;
Judging whether the next depthwise separable convolution is a preset target depthwise separable convolution;
If so, integrating the two-way output data of the next depthwise separable convolution with the feature data to output target feature data; if not, integrating the two-way output data of the next depthwise separable convolution with the feature data and continuing to input the integrated data into the following depthwise separable convolution, until the next depthwise separable convolution is the target depthwise separable convolution;
Then, the mapping module is specifically used for:
Mapping the target feature data to the preset anchor boxes to obtain the face detection result.
9. The face detection device according to claim 8, characterized in that the depthwise separable convolution comprises a first output channel and a second output channel, wherein:
The first output channel comprises a 1*1 convolutional layer, a first convergence layer, a 3*3 depthwise separable convolutional layer, a second convergence layer and a non-linear layer;
The second output channel comprises a 1*1 convolutional layer, a first convergence layer, a 5*5 depthwise separable convolutional layer, a second convergence layer and a non-linear layer.
10. A face detection equipment, characterized by comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor implements the face detection method according to any one of claims 1 to 6 when executing the computer program.
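The control flow recited in claims 2 and 5 can be illustrated with a short Python sketch. It is a toy under stated assumptions rather than the claimed implementation: `run_stack`, `mse_loss`, `train` and the one-parameter model `w * x` are hypothetical names, integration is assumed to be concatenation, the loss is assumed to be the mean squared difference between predicted and real values, and plain gradient descent stands in for whatever parameter adjustment the actual training uses.

```python
def run_stack(face_data, blocks, target_index):
    """Claim 2 control flow: pass data through successive blocks until the preset target block."""
    features = face_data
    for index, block in enumerate(blocks):
        # Integrate the block's output with the block's input (assumed: concatenation).
        features = features + block(features)
        if index == target_index:
            # This block is the preset target: its integrated output is the target feature data.
            return features
    raise ValueError("target block is not in the stack")


def mse_loss(w, samples):
    """Mean squared difference between predicted (w * x) and real (y) values."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)


def train(samples, w=0.0, preset=1e-8, lr=0.05, max_steps=10_000):
    """Claim 5 control flow: adjust the parameter until the loss reaches the preset value."""
    for _ in range(max_steps):
        loss = mse_loss(w, samples)
        if loss <= preset:
            return w, loss  # converged: output the trained parameter
        # Not converged: adjust the parameter along the negative gradient of the loss.
        gradient = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= lr * gradient
    raise RuntimeError("loss did not reach the preset value")


# Stand-in blocks operating on flat lists of numbers (hypothetical, for illustration only).
blocks = [lambda f: [sum(f)], lambda f: [max(f)], lambda f: [0]]
target_features = run_stack([1, 2], blocks, target_index=1)

# Real data generated by y = 2 * x, so training should recover w close to 2.
trained_w, final_loss = train([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
print(target_features, round(trained_w, 3))
```

The `max_steps` guard is an addition for safety in the sketch; the claim itself only states that adjustment continues until the loss converges to the preset value.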
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910393574.2A CN110210329A (en) | 2019-05-13 | 2019-05-13 | A kind of method for detecting human face, device and equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110210329A true CN110210329A (en) | 2019-09-06 |
Family
ID=67787076
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910393574.2A Pending CN110210329A (en) | 2019-05-13 | 2019-05-13 | A kind of method for detecting human face, device and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110210329A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108764164A (en) * | 2018-05-30 | 2018-11-06 | 华中科技大学 | A kind of method for detecting human face and system based on deformable convolutional network |
CN108805216A (en) * | 2018-06-19 | 2018-11-13 | 合肥工业大学 | Face image processing process based on depth Fusion Features |
CN108830211A (en) * | 2018-06-11 | 2018-11-16 | 厦门中控智慧信息技术有限公司 | Face identification method and Related product based on deep learning |
CN109376693A (en) * | 2018-11-22 | 2019-02-22 | 四川长虹电器股份有限公司 | Method for detecting human face and system |
Non-Patent Citations (4)
Title |
---|
KYE-HYEON KIM ET AL: "PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection", 《ARXIV:1608.08021V3》 * |
SONGTAO LIU ET AL: "Receptive Field Block Net for Accurate and Fast Object Detection", 《ARXIV:1711.07767V3》 * |
YUXI LI ET AL: "Tiny-DSOD: Lightweight Object Detection for Resource-Restricted Usages", 《ARXIV:1807.11013V1》 * |
Hao Zhifeng et al.: "《Huazhong University of Science and Technology Press》", 31 January 2019 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112733665A (en) * | 2020-12-31 | 2021-04-30 | 中科院微电子研究所南京智能技术研究院 | Face recognition method and system based on lightweight network structure design |
CN112733665B (en) * | 2020-12-31 | 2024-05-28 | 中科南京智能技术研究院 | Face recognition method and system based on lightweight network structure design |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190906 |