CN111652051B - Face detection model generation method, device, equipment and storage medium - Google Patents


Info

Publication number
CN111652051B
CN111652051B
Authority
CN
China
Prior art keywords
tinydsod
network model
improved
face detection
face
Prior art date
Legal status
Active
Application number
CN202010315193.5A
Other languages
Chinese (zh)
Other versions
CN111652051A (en)
Inventor
王祥雪
林焕凯
贺迪龙
刘双广
Current Assignee
Gosuncn Technology Group Co Ltd
Original Assignee
Gosuncn Technology Group Co Ltd
Priority date
Filing date
Publication date
Application filed by Gosuncn Technology Group Co Ltd filed Critical Gosuncn Technology Group Co Ltd
Priority to CN202010315193.5A
Publication of CN111652051A
Application granted
Publication of CN111652051B
Legal status: Active
Anticipated expiration


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 - Detection; Localisation; Normalisation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a face detection model generation method applicable to an improved TinyDSOD network model, comprising the following steps: initializing parameters of the improved TinyDSOD network model; combining complete face images and incomplete face images of a plurality of different users, and inputting the combined images as a training set into the improved TinyDSOD network model so as to train it, wherein an incomplete face image is a face image of a user partially blocked by an obstruction; and, after training is completed, outputting the improved TinyDSOD network model as the face detection model. The invention also discloses a face detection model generating apparatus, a face detection model generating device and a computer readable storage medium. With the embodiments of the invention, the generated face detection model can detect incomplete faces while improving face detection efficiency and accuracy.

Description

Face detection model generation method, device, equipment and storage medium
Technical Field
The present invention relates to the field of face detection technologies, and in particular, to a method, an apparatus, a device, and a storage medium for generating a face detection model.
Background
In some application scenarios, such as hospitals and research institutions, some staff must wear masks for long periods while working. When a face is blocked by a mask, most facial features are lost, and existing face detection technologies usually require the whole face to be visible for normal detection, so staff must remove their masks to be recognized. When foot traffic is heavy, this slows face detection considerably and degrades the user experience. Detecting incomplete faces is therefore a problem in urgent need of a solution.
Disclosure of Invention
The embodiment of the invention aims to provide a method, a device, equipment and a storage medium for generating a face detection model, wherein the generated face detection model can detect incomplete faces and improve face detection efficiency and accuracy.
In order to achieve the above object, an embodiment of the present invention provides a face detection model generation method, which is applicable to an improved TinyDSOD network model, including:
initializing parameters of the improved TinyDSOD network model;
combining complete face images and incomplete face images of a plurality of different users, and inputting the combined images as a training set into the improved TinyDSOD network model so as to train it; an incomplete face image is a face image of a user partially blocked by an obstruction;
and outputting the improved TinyDSOD network model as a face detection model after training is completed.
Compared with the prior art, the face detection model generation method disclosed in the embodiments of the invention first initializes parameters of the improved TinyDSOD network model; next, complete face images and incomplete face images of a plurality of different users are combined and input as a training set into the improved TinyDSOD network model for training, where merging the two categories reduces model complexity while improving the detection rate; finally, after training is completed, the improved TinyDSOD network model is output as the face detection model. With the method disclosed by the embodiments of the invention, the generated face detection model can detect incomplete faces while improving face detection efficiency and accuracy.
As an improvement of the scheme, the improved TinyDSOD network model is obtained by replacing the three-layer convolution-and-pooling structure of the Stem module of the original TinyDSOD network model with a two-path structure; the two-path structure comprises one path that performs convolution and inversion operations and one path that performs a pooling operation.
As an improvement of the scheme, the improved TinyDSOD network model is obtained by replacing the DDB-b structure of the original TinyDSOD network model with a DDB-b-plus structure; the DDB-b-plus structure comprises one path that performs no operation and two other paths that perform convolution operations, wherein one of the two convolution paths is configured with a corresponding expansion coefficient (dilation rate).
As an improvement of the scheme, the improved TinyDSOD network model is obtained by replacing the depth separable convolution after the upsample layer of the original TinyDSOD network model with 3 parallel depth separable convolutions; wherein each of the parallel depth separable convolutions is configured with its corresponding expansion coefficient.
As an improvement of the scheme, the number of feature maps in the improved TinyDSOD network model is 4, and the aspect ratio of the anchor boxes on each feature map is 1.
In order to achieve the above object, an embodiment of the present invention further provides a face detection model generating device, which is applicable to an improved TinyDSOD network model, including:
the initialization parameter module is used for initializing parameters of the improved TinyDSOD network model;
the training module is used for combining complete face images and incomplete face images of a plurality of different users and inputting the combined images as a training set into the improved TinyDSOD network model so as to train it; an incomplete face image is a face image of a user partially blocked by an obstruction;
and the model output module is used for outputting the improved TinyDSOD network model as a face detection model after training is completed.
Compared with the prior art, in the face detection model generating apparatus disclosed by the embodiment of the invention, the initialization parameter module first initializes the parameters of the improved TinyDSOD network model; next, the training module combines the complete face images and incomplete face images of a plurality of different users and inputs the combined images as a training set into the improved TinyDSOD network model for training, where merging the two categories reduces model complexity while improving the detection rate; finally, after training is completed, the model output module outputs the improved TinyDSOD network model as the face detection model. The apparatus disclosed by the embodiment of the invention can detect incomplete faces while improving face detection efficiency and accuracy.
As an improvement of the scheme, the improved TinyDSOD network model is obtained by replacing the three-layer convolution-and-pooling structure of the Stem module of the original TinyDSOD network model with a two-path structure; the two-path structure comprises one path that performs convolution and inversion operations and one path that performs a pooling operation;
the improved TinyDSOD network model is obtained by replacing the DDB-b structure of the original TinyDSOD network model with a DDB-b-plus structure; the DDB-b-plus structure comprises one path that performs no operation and two other paths that perform convolution operations, wherein one of the two convolution paths is configured with a corresponding expansion coefficient (dilation rate).
As an improvement of the scheme, the improved TinyDSOD network model is obtained by replacing the depth separable convolution after the upsample layer of the original TinyDSOD network model with 3 parallel depth separable convolutions, wherein each of the parallel depth separable convolutions is configured with its corresponding expansion coefficient;
the number of feature maps in the improved TinyDSOD network model is 4, and the aspect ratio of the anchor boxes on each feature map is 1.
To achieve the above object, an embodiment of the present invention further provides a face detection model generating device, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, where the processor executes the computer program to implement the face detection model generating method according to any one of the above embodiments.
In order to achieve the above object, an embodiment of the present invention further provides a computer readable storage medium that stores a computer program, wherein when the computer program runs, the device on which the computer readable storage medium is located is controlled to execute the face detection model generating method according to any one of the above embodiments.
Drawings
FIG. 1 is a flowchart of a face detection model generation method provided in an embodiment of the present invention;
FIG. 2 is a schematic diagram of a Stem module according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a DDB-b structure in an original TinyDSOD network model provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of a DDB-b-plus structure in an improved TinyDSOD network model provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of the structure of 3 parallel, depth-separable convolutions in an improved TinyDSOD network model provided by an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a face detection model generating device according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of a face detection model generating device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the protection scope of the invention.
Referring to fig. 1, fig. 1 is a flowchart of a face detection model generating method according to an embodiment of the present invention; the face detection model generation method comprises the following steps:
s1, initializing parameters of the improved TinyDSOD network model;
s2, combining complete face images and incomplete face images of a plurality of different users, and inputting the combined complete face images and the incomplete face images serving as training sets into the improved TinyDSOD network model so as to train the improved TinyDSOD network model;
and S3, outputting the improved TinyDSOD network model as a face detection model after training is completed.
It is worth noting that the face detection model generation method provided by the embodiment of the invention is applicable to the improved TinyDSOD network model, and can be implemented by a face detection/recognition device. An incomplete face image is a face image of a user partially blocked by an obstruction, for example, a face image of a user wearing a mask.
The original TinyDSOD network model is based on the ideas of a backbone network and a feature pyramid: the DenseBlock from DenseNet serves as the basic component of the backbone network, and the convolution operations inside the DenseBlock are replaced with depthwise separable convolutions, which preserves the feature extraction capability of the network while increasing detection speed. In addition, the original TinyDSOD network model introduces feature fusion: high-level features are fused upwards with adjacent low-level features, which improves the detection of small targets. The backbone structure of the original TinyDSOD is shown in Table 1.
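The parameter saving from replacing a standard convolution with a depthwise separable convolution can be illustrated with a small calculation. This is an illustrative sketch rather than code from the patent; the channel counts are assumed values chosen for demonstration.

```python
def conv_params(in_ch, out_ch, k):
    """Weight count of a standard k x k convolution (biases ignored)."""
    return in_ch * out_ch * k * k

def separable_params(in_ch, out_ch, k):
    """Weight count of a depthwise k x k convolution followed by a 1x1 pointwise convolution."""
    return in_ch * k * k + in_ch * out_ch

# Assumed example: a 3x3 layer with 128 input and 128 output channels.
standard = conv_params(128, 128, 3)        # 128 * 128 * 9 = 147456 weights
separable = separable_params(128, 128, 3)  # 1152 + 16384 = 17536 weights
print(standard, separable, round(standard / separable, 1))
```

For these assumed channel counts the separable form needs roughly an eighth of the weights, which is the source of the speed-up the text describes.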
TABLE 1 original TinyDSOD network structure
(Table 1 is presented as an image in the original publication.)
Specifically, the original TinyDSOD network model is improved in advance.
The three-layer convolution and pooling structure of the Stem module in the original TinyDSOD network model is replaced by a two-path structure; the two-path structure comprises one path that performs convolution and inversion operations and one path that performs a pooling operation.
For example, referring to fig. 2, the three-layer convolution and pooling structure of the original TinyDSOD is changed into two paths: one path performs two convolutions plus an added inverse operation, and the other path directly performs a pooling operation. This design allows the Stem module to preserve feature diversity as much as possible while downsampling.
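For the two paths to be merged, both must produce the same downsampled spatial size, which can be checked with simple output-size arithmetic. The kernel, stride, and padding values below are assumptions for illustration (a stride-2 3x3 convolution with padding 1 and a stride-2 2x2 pooling), not values stated in the patent; 320 is used because the patent trains at a 320 x 320 input size.

```python
def conv_out(size, k=3, stride=2, pad=1):
    """Spatial output size of a convolution: floor((size + 2*pad - k) / stride) + 1."""
    return (size + 2 * pad - k) // stride + 1

def pool_out(size, k=2, stride=2):
    """Spatial output size of a pooling layer (no padding)."""
    return (size - k) // stride + 1

s = 320                      # assumed input resolution
conv_path = conv_out(s)      # convolution path of the Stem module
pool_path = pool_out(s)      # pooling path of the Stem module
print(conv_path, pool_path)  # both paths downsample to the same size
```

Because both paths halve the resolution, their outputs can be combined after downsampling.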
The DDB-b structure in the original TinyDSOD network model is replaced by a DDB-b-plus structure; the DDB-b-plus structure comprises a structure for performing no operation and two other structures for performing convolution operation, wherein one of the two other structures for performing convolution operation is configured with a corresponding expansion coefficient.
For example, referring to fig. 3, fig. 3 is a schematic diagram of the DDB-b structure in the original TinyDSOD network model. The DDB-b is changed into the DDB-b-plus structure shown in fig. 4: the structure grows from two paths to three, and an expansion coefficient (dilation rate) is introduced on the added third path, which enlarges the receptive field of the convolution computation.
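The receptive-field enlargement from the expansion (dilation) coefficient follows a standard formula: a k x k kernel with dilation d covers an effective extent of k + (k - 1)(d - 1) positions. A minimal sketch (the kernel size and dilation values are illustrative assumptions, since the patent does not give the exact coefficients):

```python
def effective_kernel(k, dilation):
    """Effective extent of a dilated kernel: k + (k - 1) * (dilation - 1)."""
    return k + (k - 1) * (dilation - 1)

# A 3x3 kernel at increasing dilation rates widens its receptive field
# without adding any parameters.
for d in (1, 2, 4):
    print(d, effective_kernel(3, d))
```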
The depth separable convolution after the upsample layer in the original TinyDSOD network model is replaced with 3 parallel depth separable convolutions, each configured with its own expansion coefficient. Illustratively, as shown in fig. 5, a different expansion coefficient is configured in each depth separable convolution, which further enlarges the receptive field of the convolution computation and reduces feature loss.
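For the three parallel branches to be combined, their outputs must share the same spatial size; for a 3 x 3 kernel this holds when each branch's padding equals its dilation rate. The sketch below checks this arithmetic; kernel size 3, stride 1, the 40 x 40 feature size, and the dilation rates (1, 2, 4) are assumed values for illustration, not figures from the patent.

```python
def dilated_out(size, k=3, dilation=1, stride=1):
    """Output size of a dilated convolution with padding = dilation * (k - 1) / 2."""
    pad = dilation * (k - 1) // 2
    eff = k + (k - 1) * (dilation - 1)  # effective kernel extent
    return (size + 2 * pad - eff) // stride + 1

# Three parallel branches with different dilation rates all keep the
# assumed 40 x 40 spatial size, so their outputs can be merged.
sizes = [dilated_out(40, dilation=d) for d in (1, 2, 4)]
print(sizes)
```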
The number of feature maps in the improved TinyDSOD network model is 4, and the aspect ratio of the anchor boxes on each feature map is 1. Specifically, the sizes of the complete faces and incomplete faces in the training set were counted, the feature maps of the original TinyDSOD were reduced to 4, and, because a face is approximately square, the aspect ratio of the preset anchor boxes on the 4 feature map layers is fixed at 1; the specific sizes are shown in Table 2.
TABLE 2 Anchor Box size parameters

Feature layer name               Anchor Box sizes
First_out_norm_mbox_priorbox     16, 24, 32
Second_out_norm_mbox_priorbox    48, 64
Third_out_norm_mbox_priorbox     96, 128
Fourth_norm_mbox_priorbox        192, 224, 256
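The square anchor boxes implied by Table 2 (aspect ratio fixed at 1) can be sketched as follows. The layer names and box sizes come from Table 2; the centre coordinates and the (xmin, ymin, xmax, ymax) corner format are illustrative assumptions, not details given in the patent.

```python
# Sizes taken from Table 2; keys are the feature layer names.
ANCHOR_SIZES = {
    "First_out_norm_mbox_priorbox": [16, 24, 32],
    "Second_out_norm_mbox_priorbox": [48, 64],
    "Third_out_norm_mbox_priorbox": [96, 128],
    "Fourth_norm_mbox_priorbox": [192, 224, 256],
}

def anchors_at(cx, cy, sizes):
    """Square anchor boxes (aspect ratio 1) centred at (cx, cy), as (xmin, ymin, xmax, ymax)."""
    return [(cx - s / 2, cy - s / 2, cx + s / 2, cy + s / 2) for s in sizes]

# Assumed example: anchors for the first layer at an arbitrary centre point.
boxes = anchors_at(160, 160, ANCHOR_SIZES["First_out_norm_mbox_priorbox"])
print(boxes)
```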
In the images captured by a patrol robot, a human face is a small target, so this scheme also improves the data enhancement module in the original TinyDSOD network model. The original data enhancement module expands the training set by randomly cropping, flipping, and similar operations on the original data, where the area overlap ratios between a randomly cropped image and the GroundTruth annotation box are 0.1, 0.3, 0.5, 0.7, 0.9 and 1.0. Taking 0.1 as an example, as long as the selected image region contains 10% of a GroundTruth box, the region can be cropped to expand the training set; however, such cropping produces a large number of incomplete-target images in the training set. To make the cropped images contain complete GroundTruth boxes, these overlap parameters are all set to 1.0; that is, a region image is cropped into the training set only when the selected region completely covers some GroundTruth box.
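The stricter cropping rule above (overlap parameter 1.0) can be sketched as a simple acceptance test. This is an illustrative sketch, not code from the patent; boxes are assumed to be (x1, y1, x2, y2) tuples.

```python
def contains(crop, box):
    """True if the ground-truth box lies entirely inside the crop region."""
    return (crop[0] <= box[0] and crop[1] <= box[1]
            and crop[2] >= box[2] and crop[3] >= box[3])

def accept_crop(crop, gt_boxes):
    """Overlap parameter 1.0: keep a crop only if some GroundTruth box is fully covered."""
    return any(contains(crop, b) for b in gt_boxes)

# A crop fully containing a face box is accepted; one that cuts the face is not.
print(accept_crop((0, 0, 100, 100), [(10, 10, 50, 50)]))
print(accept_crop((0, 0, 100, 100), [(90, 90, 150, 150)]))
```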
Furthermore, the embodiment of the invention also prunes the improved TinyDSOD network model to meet the lightweight application requirements of patrol robots. The number of repeated modules in each DenseBlock is changed from the original 4, 6 and 6 to 2, which greatly reduces the model parameters; because the receptive field has been enlarged, the feature extraction capability is not lost, and the lightweight requirement is met.
The embodiment of the invention adopts the loss function shown in formula (1), which realizes position regression and target classification simultaneously; the loss function L is the sum of the classification confidence loss and the position loss.
L(z, c, l, g) = (1/N) * (L_conf(z, c) + α * L_loc(z, l, g))    formula (1)
Where N is the number of DefaultBoxes matched with a GroundTruth (actual object) box; L_conf(z, c) is the classification confidence loss and L_loc(z, l, g) is the position loss of the DefaultBox; z is the matching result between the DefaultBoxes and the reference object boxes of different categories; c is the confidence of the predicted object box; l is the position information of the predicted object box; g is the position information of the GroundTruth annotation box; α is a parameter balancing the confidence loss and the position loss, and is usually set to 1, i.e., the two losses have the same weight.
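The scalar combination in formula (1) can be sketched directly. The component losses L_conf and L_loc are assumed to be computed elsewhere, and the zero-loss convention for N = 0 is an assumption following common SSD-style implementations, not a detail stated in the patent.

```python
def detection_loss(l_conf, l_loc, n_matched, alpha=1.0):
    """Formula (1): L = (1/N) * (L_conf + alpha * L_loc), with N matched DefaultBoxes."""
    if n_matched == 0:
        # Assumed convention: no matched boxes contribute no loss.
        return 0.0
    return (l_conf + alpha * l_loc) / n_matched

# With alpha = 1 (the usual setting per the text), both losses weigh equally.
print(detection_loss(2.0, 1.0, 2))
```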
Specifically, in step S1, the parameters of the improved TinyDSOD network model are randomly initialized. In step S2, the complete face images and incomplete face images of a plurality of different users are combined and input as a training set into the improved TinyDSOD network model so as to train it. In step S3, after training is completed, the improved TinyDSOD network model is output as the face detection model. During training, the operating system is Ubuntu 16.04, the GPU is a GTX1080, the training framework is caffe-ssd, the training mode is end2end, the initial learning rate is 0.1, max_iter is 100000, the learning rate policy is multistep (20000, 40000, 60000, 80000), momentum is 0.9, weight decay is 0.0005, the optimizer is SGD, and the image input size is 320 x 320.
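The multistep learning-rate policy listed above can be sketched as follows. This is a hedged illustration of a Caffe-style multistep schedule; the decay factor gamma = 0.1 is an assumed value, since the patent lists the step boundaries but not the decay factor.

```python
def multistep_lr(base_lr, iteration, steps=(20000, 40000, 60000, 80000), gamma=0.1):
    """Caffe-style 'multistep' schedule: multiply the rate by gamma at each step boundary."""
    drops = sum(1 for s in steps if iteration >= s)
    return base_lr * gamma ** drops

# With base_lr = 0.1 (from the text) the rate drops at 20k, 40k, 60k, 80k iterations.
for it in (0, 20000, 50000, 90000):
    print(it, multistep_lr(0.1, it))
```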
Further, after the face detection model is obtained through training, the evaluation indexes used in the embodiment of the invention include accuracy and recall: accuracy is the proportion of correct detections among all detected targets, and recall is the proportion of correct detections in the total detection count, where the total detection count includes the correct detections, the missed detections and the false detections, as shown in formula (2) and formula (3).
Accuracy = correct detections / (correct detections + false detections)    formula (2);
Recall = correct detections / (correct detections + missed detections + false detections)    formula (3).
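The two evaluation formulas can be written directly as functions. This sketch follows formulas (2) and (3) exactly as given in the text; note that the recall denominator as stated includes the false detections as well as the missed ones.

```python
def accuracy(correct, false):
    """Formula (2): correct detections over all detected targets."""
    return correct / (correct + false)

def recall(correct, missed, false):
    """Formula (3) as given in the text: denominator includes missed and false detections."""
    return correct / (correct + missed + false)

# Assumed example counts for illustration.
print(accuracy(90, 10), recall(90, 5, 5))
```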
Based on a deep neural network, the embodiment of the invention provides a lightweight incomplete face detection method built on TinyDSOD. Compared with the original TinyDSOD model, both the detection rate and the accuracy on the incomplete-face data set are greatly improved in a GTX1080 test environment; detailed results are shown in Table 3. Compared with the original TinyDSOD, the model size is also greatly reduced and detection is faster, meeting the lightweight front-end application requirements of the patrol robot.
TABLE 3 comparison of results of the modified TinyDSOD of the invention with the original TinyDSOD
(Table 3 is presented as an image in the original publication.)
Compared with the prior art, the face detection model generation method disclosed in the embodiments of the invention first initializes parameters of the improved TinyDSOD network model; next, complete face images and incomplete face images of a plurality of different users are combined and input as a training set into the improved TinyDSOD network model for training, where merging the two categories reduces model complexity while improving the detection rate; finally, after training is completed, the improved TinyDSOD network model is output as the face detection model. With the method disclosed by the embodiments of the invention, the generated face detection model can detect incomplete faces while improving face detection efficiency and accuracy.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a face detection model generating apparatus 10 according to an embodiment of the present invention; the face detection model generation apparatus 10 includes:
an initialization parameter module 11, configured to initialize parameters of the modified TinyDSOD network model;
the training module 12 is configured to combine complete face images and incomplete face images of a plurality of different users and input the combined images as a training set into the improved TinyDSOD network model so as to train it; an incomplete face image is a face image of a user partially blocked by an obstruction, for example, a face image of a user wearing a mask;
and the model output module 13 is configured to output the improved TinyDSOD network model as the face detection model after training is completed.
The three-layer convolution and pooling structure of the Stem module in the original TinyDSOD network model is replaced by a two-path structure; the two-path structure comprises one path that performs convolution and inversion operations and one path that performs a pooling operation.
The DDB-b structure in the original TinyDSOD network model is replaced by a DDB-b-plus structure; the DDB-b-plus structure comprises a structure for performing no operation and two other structures for performing convolution operation, wherein one of the two other structures for performing convolution operation is configured with a corresponding expansion coefficient.
The depth separable convolution after the upsample layer in the original TinyDSOD network model is replaced by 3 parallel depth separable convolutions; wherein each of the parallel depth separable convolutions is configured with its corresponding coefficient of expansion.
The number of feature maps in the improved TinyDSOD network model is 4, and the aspect ratio of the anchor boxes on each feature map is 1.
The working process of the face detection model generating device 10 is referred to the working process of the face detection model generating method in the above embodiment, and will not be described herein.
Compared with the prior art, in the face detection model generating apparatus 10 disclosed by the embodiment of the invention, the initialization parameter module 11 first initializes the parameters of the improved TinyDSOD network model; next, the training module 12 combines the complete face images and incomplete face images of a plurality of different users and inputs the combined images as a training set into the improved TinyDSOD network model for training, where merging the two categories reduces model complexity while improving the detection rate; finally, after training is completed, the model output module 13 outputs the improved TinyDSOD network model as the face detection model. The apparatus disclosed by the embodiment of the invention can detect incomplete faces while improving face detection efficiency and accuracy.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a face detection model generating apparatus 20 according to an embodiment of the present invention. The face detection model generation device 20 of this embodiment includes: a processor 21, a memory 22 and a computer program stored in said memory 22 and executable on said processor 21. The processor 21 implements the steps in the above-described face detection model generation method embodiment, such as steps S1 to S3 shown in fig. 1, when executing the computer program. Alternatively, the processor 21 may implement the functions of the modules/units in the above-described device embodiments when executing the computer program, for example, the initialization parameter module 11.
Illustratively, the computer program may be partitioned into one or more modules/units that are stored in the memory 22 and executed by the processor 21 to complete the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions for describing the execution of the computer program in the face detection model generating device 20. For example, the computer program may be divided into an initialization parameter module 11, a training module 12 and a model output module 13, and specific functions of each module refer to specific working procedures of the face detection model generating apparatus 10 described in the foregoing embodiments, which are not described herein.
The face detection model generating device 20 may be a computing device such as a desktop computer, a notebook computer, a palm computer, a cloud server, etc. The face detection model generation device 20 may include, but is not limited to, a processor 21, a memory 22. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of the face detection model generating device 20, and does not constitute a limitation of the face detection model generating device 20, and may include more or less components than illustrated, or may combine certain components, or different components, e.g., the face detection model generating device 20 may further include an input-output device, a network access device, a bus, etc.
The processor 21 may be a central processing unit (Central Processing Unit, CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), off-the-shelf programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. The general purpose processor may be a microprocessor or the processor 21 may be any conventional processor or the like, and the processor 21 is a control center of the face detection model generating apparatus 20, and connects the respective parts of the entire face detection model generating apparatus 20 using various interfaces and lines.
The memory 22 may be used to store the computer program and/or module, and the processor 21 may implement various functions of the face detection model generating device 20 by running or executing the computer program and/or module stored in the memory 22 and invoking data stored in the memory 22. The memory 22 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application program required for at least one function (such as a sound playing function, an image playing function, etc.); the storage data area may store data created according to the use of the device (such as audio data, a phonebook, etc.). In addition, the memory 22 may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card, at least one disk storage device, a flash memory device, or another non-volatile solid-state storage device.
Wherein the modules/units integrated by the face detection model generating device 20 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as a stand-alone product. Based on such understanding, the present invention may implement all or part of the flow of the above method embodiments by instructing related hardware through a computer program, and the computer program may be stored in a computer readable storage medium; when executed by the processor 21, the computer program may implement the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer readable medium may be appropriately adjusted according to the requirements of legislation and patent practice in each jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.
It should be noted that the above-described apparatus embodiments are merely illustrative; the units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the apparatus embodiments provided by the invention, the connection relation between the modules indicates that the modules have communication connections with one another, which may be specifically implemented as one or more communication buses or signal lines. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that changes and modifications may be made without departing from the principles of the invention, and such changes and modifications are also intended to fall within the scope of the invention.

Claims (6)

1. A face detection model generation method, characterized in that it is applicable to an improved TinyDSOD network model, the method comprising the following steps:
initializing parameters of the improved TinyDSOD network model;
combining complete face images and incomplete face images of a plurality of different users, and inputting the combined images as a training set into the improved TinyDSOD network model so as to train the improved TinyDSOD network model; wherein an incomplete face image is a face image of a user blocked by an obstruction;
outputting the improved TinyDSOD network model as a face detection model after training is completed;
the improved TinyDSOD network model is obtained by replacing the three-layer convolution and pooling structure of the Stem module of the original TinyDSOD network model with a two-path structure; the two-path structure comprises one path that performs a convolution operation and an inversion operation, and another path that performs a pooling operation; the improved TinyDSOD network model is further obtained by replacing the DDB-b structure of the original TinyDSOD network model with a DDB-b-plus structure; the DDB-b-plus structure comprises one branch that performs no operation and two other branches that perform convolution operations, wherein one of the two convolution branches is configured with a corresponding expansion coefficient; the improved TinyDSOD network model is further obtained by replacing the depth separable convolution behind the upsample layer of the original TinyDSOD network model with 3 parallel depth separable convolutions, wherein each of the parallel depth separable convolutions is configured with its corresponding expansion coefficient.
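As a rough illustration of the claimed two-path Stem replacement, the sketch below fuses a strided-convolution path with a pooling path. The "inversion" operation is interpreted here as a sign-flip of the convolution output, and the mean-filter convolution, the toy tensor shapes, and the fusion by stacking are all simplifying assumptions; the patent does not specify these details:

```python
import numpy as np

def conv3x3(x, stride=2):
    # crude strided 3x3 mean filter standing in for a learned convolution
    h, w = x.shape
    pad = np.pad(x, 1)
    return np.array([[pad[i:i + 3, j:j + 3].mean()
                      for j in range(0, w, stride)]
                     for i in range(0, h, stride)])

def max_pool2(x):
    # plain 2x2 max pooling
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def two_path_stem(x):
    # path 1: convolution followed by an "inversion" (sign-flip, here)
    conv_path = -conv3x3(x)
    # path 2: pooling
    pool_path = max_pool2(x)
    # fuse the two paths (stacked along a new channel axis in this sketch)
    return np.stack([conv_path, pool_path])

x = np.random.rand(8, 8).astype(np.float32)
y = two_path_stem(x)
print(y.shape)  # (2, 4, 4): two paths, each downsampled by 2
```

Both paths downsample by the same factor, so their outputs can be fused directly; a single pooled Stem layer would discard the learned-feature path entirely.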
2. The face detection model generation method according to claim 1, wherein 4 feature maps are provided in the improved TinyDSOD network model, and the aspect ratio on each feature map is 1.
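The square-anchor layout of claim 2 (aspect ratio 1 on each of 4 feature maps) can be sketched as a default-box generator in the SSD style. The feature-map sizes and scales below are illustrative guesses, not values taken from the patent:

```python
def square_priors(feature_map_sizes=(38, 19, 10, 5),
                  scales=(0.1, 0.3, 0.5, 0.7)):
    """Generate one square (aspect ratio 1) default box per cell on 4 feature maps."""
    boxes = []
    for size, scale in zip(feature_map_sizes, scales):
        step = 1.0 / size  # cell size in normalized image coordinates
        for i in range(size):
            for j in range(size):
                cx, cy = (j + 0.5) * step, (i + 0.5) * step
                boxes.append((cx, cy, scale, scale))  # width == height
    return boxes

priors = square_priors()
print(len(priors))  # 38*38 + 19*19 + 10*10 + 5*5 = 1930 square boxes
```

Restricting every feature map to a single square box per cell keeps the prior set small, which fits the claim's emphasis on a lightweight detector for faces (a roughly square object class).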
3. A face detection model generation device, characterized in that it is applicable to an improved TinyDSOD network model, the device comprising:
the initialization parameter module is used for initializing parameters of the improved TinyDSOD network model;
the training module is used for combining complete face images and incomplete face images of a plurality of different users, and inputting the combined images as a training set into the improved TinyDSOD network model so as to train the improved TinyDSOD network model; wherein an incomplete face image is a face image of a user blocked by an obstruction;
the model output module is used for outputting the improved TinyDSOD network model as a face detection model after training is completed;
the improved TinyDSOD network model is obtained by replacing the three-layer convolution and pooling structure of the Stem module of the original TinyDSOD network model with a two-path structure; the two-path structure comprises one path that performs a convolution operation and an inversion operation, and another path that performs a pooling operation; the improved TinyDSOD network model is further obtained by replacing the DDB-b structure of the original TinyDSOD network model with a DDB-b-plus structure; the DDB-b-plus structure comprises one branch that performs no operation and two other branches that perform convolution operations, wherein one of the two convolution branches is configured with a corresponding expansion coefficient; the improved TinyDSOD network model is further obtained by replacing the depth separable convolution behind the upsample layer of the original TinyDSOD network model with 3 parallel depth separable convolutions, wherein each of the parallel depth separable convolutions is configured with its corresponding expansion coefficient.
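The training-set construction performed by the training module (pairing each complete face image with an occluded variant) can be sketched as follows. The zero-block occlusion, the 32x32 toy images, and the helper names are hypothetical illustrations, not the patent's actual data pipeline:

```python
def occlude(face, top, left, h, w):
    """Simulate a blocked face by zeroing a rectangular block of pixels."""
    out = [row[:] for row in face]
    for i in range(top, top + h):
        for j in range(left, left + w):
            out[i][j] = 0
    return out

def build_training_set(faces, occlusion=(16, 0, 16, 32)):
    """Combine complete and occluded variants of each face into one training set."""
    samples = []
    for face in faces:
        samples.append((face, "complete"))
        # default occlusion covers the lower half, e.g. a mask-like obstruction
        samples.append((occlude(face, *occlusion), "occluded"))
    return samples

# three toy 32x32 "face images", every pixel set to 1
faces = [[[1] * 32 for _ in range(32)] for _ in range(3)]
train = build_training_set(faces)
print(len(train))  # 6: one complete and one occluded sample per face
```

Mixing both variants in one training set is what lets a single detector handle unobstructed and partially blocked faces.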
4. The face detection model generation device according to claim 3, wherein 4 feature maps are provided in the improved TinyDSOD network model, and the aspect ratio on each feature map is 1.
5. A face detection model generation apparatus, comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor implements the face detection model generation method according to any one of claims 1 to 2 when executing the computer program.
6. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored computer program, wherein the computer program, when run, controls a device in which the computer readable storage medium is located to perform the face detection model generation method according to any one of claims 1 to 2.
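The claimed replacement after the upsample layer — 3 parallel depth separable convolutions, each with its own expansion (dilation) coefficient — can be sketched for a single channel as below. The 3x3 kernel, the dilation rates (1, 2, 3), and the omission of the pointwise 1x1 step of a full depth-separable convolution are all assumptions made for illustration:

```python
import numpy as np

def dilated_dw_conv(x, kernel, dilation):
    """Single-channel depthwise 3x3 convolution with a given dilation rate."""
    r = dilation
    pad = np.pad(x, r)  # zero-pad so the output keeps the input size
    h, w = x.shape
    out = np.zeros_like(x)
    for i in range(h):
        for j in range(w):
            # sample a 3x3 grid with spacing r around pixel (i, j)
            patch = pad[i:i + 2 * r + 1:r, j:j + 2 * r + 1:r]
            out[i, j] = (patch * kernel).sum()
    return out

def parallel_branches(x, kernel, dilations=(1, 2, 3)):
    """Three parallel depthwise convolutions, each with its own dilation."""
    return np.stack([dilated_dw_conv(x, kernel, d) for d in dilations])

x = np.random.rand(6, 6).astype(np.float32)
k = np.full((3, 3), 1.0 / 9.0)  # uniform averaging kernel for the sketch
y = parallel_branches(x, k)
print(y.shape)  # (3, 6, 6): one output map per dilation rate
```

Running the three dilations in parallel lets the detection head see three receptive-field sizes at once from the same upsampled feature map, at the cost of a single extra stack/merge.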
CN202010315193.5A 2020-04-21 2020-04-21 Face detection model generation method, device, equipment and storage medium Active CN111652051B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010315193.5A CN111652051B (en) 2020-04-21 2020-04-21 Face detection model generation method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010315193.5A CN111652051B (en) 2020-04-21 2020-04-21 Face detection model generation method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111652051A CN111652051A (en) 2020-09-11
CN111652051B true CN111652051B (en) 2023-06-16

Family

ID=72346554

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010315193.5A Active CN111652051B (en) 2020-04-21 2020-04-21 Face detection model generation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111652051B (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110555339A (en) * 2018-05-31 2019-12-10 北京嘀嘀无限科技发展有限公司 target detection method, system, device and storage medium
CN110781784A (en) * 2019-10-18 2020-02-11 高新兴科技集团股份有限公司 Face recognition method, device and equipment based on double-path attention mechanism
CN110826519B (en) * 2019-11-14 2023-08-18 深圳华付技术股份有限公司 Face shielding detection method and device, computer equipment and storage medium
CN110909690B (en) * 2019-11-26 2023-03-31 电子科技大学 Method for detecting occluded face image based on region generation
CN110991421B (en) * 2019-12-24 2023-08-25 高新兴科技集团股份有限公司 Bayonet snap image vehicle detection method, computer storage medium and electronic equipment

Also Published As

Publication number Publication date
CN111652051A (en) 2020-09-11

Similar Documents

Publication Publication Date Title
CN110084173B (en) Human head detection method and device
Goceri Analysis of deep networks with residual blocks and different activation functions: classification of skin diseases
CN109740534B (en) Image processing method, device and processing equipment
CN108416327B (en) Target detection method and device, computer equipment and readable storage medium
JP2022534337A (en) Video target tracking method and apparatus, computer apparatus, program
CN109543549B (en) Image data processing method and device for multi-person posture estimation, mobile terminal equipment and server
WO2021022521A1 (en) Method for processing data, and method and device for training neural network model
CN111164601A (en) Emotion recognition method, intelligent device and computer readable storage medium
CN108197532A (en) The method, apparatus and computer installation of recognition of face
CN110781784A (en) Face recognition method, device and equipment based on double-path attention mechanism
CN109754359B (en) Pooling processing method and system applied to convolutional neural network
CN113326930B (en) Data processing method, neural network training method, related device and equipment
US20220198836A1 (en) Gesture recognition method, electronic device, computer-readable storage medium, and chip
US20210042501A1 (en) Method and device for processing point cloud data, electronic device and storage medium
WO2023202285A1 (en) Image processing method and apparatus, computer device, and storage medium
CN112927209A (en) CNN-based significance detection system and method
CN114925320B (en) Data processing method and related device
CN117501245A (en) Neural network model training method and device, and data processing method and device
US20220036106A1 (en) Method and apparatus for data calculation in neural network model, and image processing method and apparatus
Li et al. Findnet: Can you find me? boundary-and-texture enhancement network for camouflaged object detection
US20220044104A1 (en) Method and apparatus for forward computation of neural network, and computer-readable storage medium
CN111382839B (en) Method and device for pruning neural network
CN111652051B (en) Face detection model generation method, device, equipment and storage medium
KR20220039313A (en) Method and apparatus for processing neural network operation
KR102393761B1 (en) Method and system of learning artificial neural network model for image processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant