CN107358209B - Training method and device of face detection model and face detection method and device - Google Patents

Training method and device of face detection model and face detection method and device

Info

Publication number
CN107358209B
CN107358209B (application CN201710579552.6A)
Authority
CN
China
Prior art keywords
face
face detection
detection model
face image
candidate region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710579552.6A
Other languages
Chinese (zh)
Other versions
CN107358209A (en)
Inventor
陈志超
徐鹏飞
周剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Tongjia Youbo Technology Co Ltd
Original Assignee
Chengdu Tongjia Youbo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Tongjia Youbo Technology Co Ltd filed Critical Chengdu Tongjia Youbo Technology Co Ltd
Priority to CN201710579552.6A
Publication of CN107358209A
Application granted
Publication of CN107358209B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The embodiment of the invention provides a training method and apparatus for a face detection model and a face detection method and apparatus, and relates to the technical field of computer image processing. The training method and apparatus classify face image samples by complexity and train the neural network on the complex face image sample class first, so that convergence is accelerated and the face detection model is established faster. In the face detection method, a face detection model established by the training method and apparatus is applied, and when the deviation between a first face candidate region and a second face candidate region is smaller than a preset deviation, the first face candidate region is determined as the face recognition region; the face recognition region is therefore determined more accurately, errors in the face detection process are reduced, and the final output face detection result is more accurate.

Description

Training method and device of face detection model and face detection method and device
Technical Field
The invention relates to the technical field of computer image processing, in particular to a training method and a device of a face detection model and a face detection method and a face detection device.
Background
A complete automatic face detection and recognition system comprises three parts: face detection, feature extraction and face recognition. Face detection is the first step of automatic face recognition and the first problem an automatic face recognition system must solve. The continuing development of the economy and society makes timely and effective automatic identity verification increasingly urgent. Human biometric characteristics exhibit strong individual differences and are inherently stable, which makes them the most suitable basis for identity verification. Compared with identification methods that use other biometric characteristics such as fingerprints, retinas or voice, face recognition carries a large amount of information and is direct, friendly and convenient, so it is more readily accepted by users.
Face detection is one of the key links of an automatic face recognition system, but early research mainly targeted face images captured under strong constraints and usually assumed that the face position was known or easily obtained, so the face detection problem itself received little attention. In recent years, with the rapid development of applications such as electronic commerce, face recognition has become one of the most influential and indispensable means of biometric authentication. In this context, an automatic face recognition system must adapt well to images of general environments, and the series of problems and difficulties this raises has made face detection an independent research subject in its own right.
Disclosure of Invention
The invention aims to provide a training method and apparatus for a face detection model, so as to speed up model building and improve the accuracy of the model in face detection.
Another objective of the present invention is to provide a method and an apparatus for detecting a face, so as to enhance the accuracy of the final face detection result.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical solutions:
in a first aspect, an embodiment of the present invention provides a training method for a face detection model, where the training method for a face detection model includes:
acquiring an uploaded face image training set, wherein the face image training set comprises a plurality of different face image samples;
classifying the different face image samples into a complex face image sample class and a simple face image sample class according to the complexity of the face image;
preferentially extracting complex face image features from a plurality of face image samples contained in the complex face image sample class;
extracting simple face image features according to a plurality of face image samples contained in the simple face image sample class and the extracted complex face image features;
and training a neural network by using the extracted complex face image characteristics and the extracted simple face image characteristics so as to establish a face detection model.
In a second aspect, an embodiment of the present invention further provides a training apparatus for a face detection model, where the training apparatus for a face detection model includes:
the face image training set acquisition module is used for acquiring an uploaded face image training set, and the face image training set comprises a plurality of different face image samples;
the classification module is used for classifying the plurality of different face image samples into a complex face image sample class and a simple face image sample class according to the complexity of the face image;
the characteristic extraction module is used for preferentially extracting the characteristics of the complex face image according to a plurality of face image samples contained in the complex face image sample class;
the feature extraction module is further used for extracting the features of the simple face images according to a plurality of face image samples contained in the simple face image sample class and the extracted features of the complex face images;
and the face detection model establishing module is used for training a neural network by using the extracted complex face image characteristics and the extracted simple face image characteristics so as to establish a face detection model.
In a third aspect, an embodiment of the present invention further provides a face detection method, where the face detection method includes:
identifying a first face candidate region of a face image to be detected according to an image identification algorithm;
identifying a second face candidate region according to the face image to be detected and a plurality of face detection models established by the training method of the face detection model provided above;
and if the deviation between the first face candidate region and the second face candidate region is smaller than a preset deviation, determining the first face candidate region as a face recognition region.
In a fourth aspect, an embodiment of the present invention further provides a face detection apparatus, where the face detection apparatus includes:
the candidate region acquisition module is used for identifying a first face candidate region of the face image to be detected according to an image recognition algorithm;
the candidate region acquisition module is also used for identifying a second face candidate region according to the face image to be detected and a plurality of face detection models established by the training apparatus of the face detection model provided above;
and the face recognition area determining module is used for determining the first face candidate area as the face recognition area if the deviation between the first face candidate area and the second face candidate area is smaller than the preset deviation.
The embodiment of the invention provides a training method and apparatus for a face detection model. A plurality of different face image samples contained in a face image training set are classified into a complex face image sample class and a simple face image sample class according to the complexity of the face images. Complex face image features are extracted from the complex face image sample class first, and simple face image features are then extracted from the simple face image sample class with reference to the complex face image features. Because the complex face image features may coincide with features of the face image samples contained in the simple face image sample class, features similar to the complex face image features can be called directly instead of being computed again. Finally, a neural network is trained with the extracted complex and simple face image features to establish the face detection model. Convergence is therefore accelerated, and the face detection model is established faster by the training method and apparatus provided by the invention.
On the other hand, in the face detection method provided by the embodiment of the invention, the face detection model is established by applying the above training method and apparatus of the face detection model. When the deviation between the first face candidate region and the second face candidate region is smaller than the preset deviation, the first face candidate region is determined as the face recognition region, so the face recognition region is determined more accurately, errors in the face detection process are reduced, and the final output face detection result is more accurate.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and therefore should not be considered as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
Fig. 1 shows a functional block diagram of a server provided by an embodiment of the present invention.
Fig. 2 shows a functional block diagram of a user terminal according to an embodiment of the present invention.
Fig. 3 is a functional block diagram of a training apparatus for a face detection model according to an embodiment of the present invention.
Fig. 4 shows a functional sub-block diagram of the classification module in fig. 3.
Fig. 5 shows a functional sub-block diagram of the face region determination module of fig. 3.
Fig. 6 shows a flowchart of a training method of a face detection model according to an embodiment of the present invention.
Fig. 7 is a flow chart illustrating sub-steps of classifying the plurality of different face image samples into a complex face image sample class and a simple face image sample class according to the complexity of the face image in fig. 6.
Fig. 8 shows a flow chart of sub-steps of fig. 6 for determining a face region in the face image sample.
Fig. 9 is a functional block diagram of a face detection apparatus according to an embodiment of the present invention.
Fig. 10 shows a flowchart of a face detection method according to an embodiment of the present invention.
Reference numerals: 100-a server; 101-a first memory; 102-a first processor; 103-a communication unit; 200-training apparatus of a face detection model; 201-a face image training set acquisition module; 202-a classification module; 2021-a classification loss rate acquisition sub-module; 2022-classification submodule; 203-a feature extraction module; 204-a face region determination module; 2041-vertex acquisition submodule; 2042-coordinate acquisition submodule; 2043-face region acquisition submodule; 205-a face detection model building module; 300-a face detection device; 301-candidate region acquisition module; 302-a judgment module; 303-a face recognition area determination module; 400-a user terminal; 401-a second memory; 402-a memory controller; 403-a second processor; 404-peripheral interfaces; 405-a radio frequency unit; 406-image acquisition unit.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Referring to fig. 1, fig. 1 shows a functional block diagram of a server 100 that can be used in embodiments of the present invention. The server 100 comprises a training apparatus 200 for face detection models, a first memory 101, one or more (only one shown) first processors 102, and a communication unit 103. These components communicate with each other via one or more communication buses/signal lines. The training device 200 of the face detection model includes at least one software functional unit which can be stored in the first memory 101 in the form of software or firmware (firmware) or solidified in an Operating System (OS) of the server 100.
The first memory 101 may be configured to store software programs and units, such as the program instructions/units corresponding to the training apparatus 200 and training method of the face detection model in the embodiment of the present invention. The first processor 102 executes various functional applications and data processing, such as the training method of the face detection model provided by the embodiment of the present invention, by running the software programs and units stored in the first memory 101. The first memory 101 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The communication unit 103 is configured to establish a communication connection between the server 100 and another communication terminal through the network, and to transceive data through the network.
It should be understood that the configuration shown in fig. 1 is merely illustrative, and that server 100 may include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1. The components shown in fig. 1 may be implemented in hardware, software, or a combination thereof.
Referring to fig. 2, fig. 2 shows a functional block diagram of a user terminal 400 applicable to the embodiment of the present invention. As shown in fig. 2, the user terminal 400 includes a face detection apparatus 300, a second memory 401, a memory controller 402, one or more (only one is shown) second processors 403, a peripheral interface 404, a radio frequency unit 405, an image acquisition unit 406, and the like. These components communicate with each other via one or more communication buses/signal lines. The face detection apparatus 300 includes at least one software functional unit which may be stored in the second memory 401 in the form of software or firmware (firmware) or solidified in an Operating System (OS) of the user terminal 400.
The second memory 401 may be used to store software programs and units, such as the program instructions/units corresponding to the face detection method and apparatus in the embodiment of the present invention, and the second processor 403 executes various functional applications and data processing, such as the training method of the face detection model and the face detection method provided in the embodiment of the present invention, by running the software programs and units stored in the second memory 401. The second memory 401 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. Access to the second memory 401 by the second processor 403 and possibly other components may be under the control of the memory controller 402.
The peripheral interface 404 couples various input/output devices to the second processor 403 and to the second memory 401. In some embodiments, the peripheral interface 404, the second processor 403, and the memory controller 402 may be implemented in a single chip. In other embodiments, they may each be implemented by a separate chip.
The rf unit 405 is configured to receive and transmit electromagnetic waves, and achieve interconversion between the electromagnetic waves and electrical signals, so as to communicate with a communication network or other devices.
The image acquisition unit 406 is configured to acquire image information or video information of a user, transmit the image information or video information to the second processor 403, and process the image information or video information by the second processor 403, so as to detect a face included in the image information or video information. The image acquisition unit 406 may be, but is not limited to, a camera, a video camera, or the like.
It is to be understood that the structure shown in fig. 2 is merely illustrative; the user terminal 400 may also include more or fewer components than shown in fig. 2, or have a different configuration than that shown in fig. 2. The components shown in fig. 2 may be implemented in hardware, software, or a combination thereof.
First embodiment
Referring to fig. 3, a training apparatus 200 for a face detection model according to an embodiment of the present invention is used for training a neural network to establish a face detection model. The training apparatus 200 for the face detection model includes: the face image training set acquisition module 201, the classification module 202, the feature extraction module 203, the face region determination module 204 and the face detection model establishment module 205.
The face image training set obtaining module 201 is configured to obtain an uploaded face image training set, where the face image training set includes a plurality of different face image samples.
The face image training set is used for establishing the model. The server 100 performs deep learning on the face image training set to obtain the face detection model. The face image training set may be received directly by the server 100, or may be generated by another terminal communicatively or electrically connected to the server 100 in response to a received operation and then transmitted to the server 100.
The classification module 202 is configured to classify the plurality of different face image samples into a complex face image sample class and a simple face image sample class according to complexity of a face image.
Referring to fig. 4, specifically, the classification module 202 includes a classification loss rate obtaining sub-module 2021 and a classification sub-module 2022, each of which functions as follows:
the classification loss rate obtaining sub-module 2021 is configured to obtain the classification loss rates of the plurality of different face image samples.
In the process of classifying a plurality of face image samples, some features of the face image samples may be lost. Assuming that the number of features originally included in a face image sample is a1 and the number of features lost in the classification process is a2, the classification loss rate is the ratio a2/a1 of the number of lost features to the number of features originally included in the face image sample. It will be appreciated that the classification loss rate is relatively high when more features are lost and relatively low when fewer features are lost.
The classification submodule 2022 is configured to classify samples whose classification loss rate is greater than a preset threshold into a complex face image sample class, and classify samples whose classification loss rate is less than or equal to the preset threshold into a simple face image sample class.
After the classification loss rate of each face image sample is obtained, the face image samples whose classification loss rate is greater than a preset threshold are classified into the complex face image sample class, and those whose classification loss rate is less than or equal to the preset threshold are classified into the simple face image sample class. The face image samples are thus divided into levels, so that the server 100 can preferentially train the neural network with the complex face image sample class during deep learning and realize accelerated convergence, making convergence faster than that of a traditional training model.
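For illustration only, the following Python sketch shows this classification rule in a minimal form; the feature-counting attributes and the preset threshold of 0.3 are assumptions, since the disclosure does not specify how features are counted or what value the threshold takes.

def classification_loss_rate(original_feature_count, lost_feature_count):
    # Ratio a2/a1 of features lost during classification to features originally present.
    return lost_feature_count / original_feature_count

def split_by_complexity(samples, preset_threshold=0.3):
    # Samples whose loss rate exceeds the preset threshold form the complex class;
    # the rest form the simple class. Each sample is assumed to expose the two counts.
    complex_class, simple_class = [], []
    for sample in samples:
        rate = classification_loss_rate(sample.original_feature_count,
                                        sample.lost_feature_count)
        (complex_class if rate > preset_threshold else simple_class).append(sample)
    return complex_class, simple_class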
The feature extraction module 203 is configured to preferentially extract features of the complex face image according to a plurality of face image samples included in the class of complex face image samples.
Firstly, the complex face image features are extracted from a plurality of face image samples included in the complex face image sample class, and the extracted complex face image features are used as the basis for extracting the simple face image features according to the simple face image sample class in the feature extraction module 203.
The feature extraction module 203 is further configured to extract features of the simple face image according to a plurality of face image samples included in the class of simple face image samples and the extracted features of the complex face image.
Since the complex facial image features may coincide with the features in the plurality of facial image samples included in the simple facial image sample class, when the feature extraction module 203 extracts features similar to the complex facial image features from the simple facial image sample class, the complex facial image features can be directly called without performing calculation again.
The face detection model building module 205 is configured to train a neural network with the extracted complex face image features and the extracted simple face image features, so as to build a face detection model.
When the server 100 trains the neural network from the face image samples to establish the face detection model, the complex face image sample class is trained first. After the complex face image sample class has been trained, more and more complex features can be extracted from it, and when the server 100 then trains the neural network with the simple face image sample class, the features extracted from the complex face image sample class can be used directly as initial values for that training. Accelerated convergence is therefore realized in training the neural network, the server 100 builds the model more quickly and efficiently, and developers' time is saved.
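A minimal PyTorch sketch of this complex-first training order is given below. The disclosure does not describe the network architecture, loss function or hyperparameters, so the small placeholder network and the training settings here are assumptions made purely for illustration.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader

class FaceDetectionNet(nn.Module):
    # Placeholder backbone; the patent does not disclose the actual network.
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.classifier = nn.Linear(16, 1)  # face / non-face score

    def forward(self, x):
        return self.classifier(self.features(x))

def train_stage(model, dataset, epochs=5, lr=1e-3):
    loader = DataLoader(dataset, batch_size=32, shuffle=True)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images).squeeze(1), labels.float())
            loss.backward()
            optimizer.step()
    return model

def build_face_detection_model(complex_class, simple_class):
    model = FaceDetectionNet()
    model = train_stage(model, complex_class)  # the complex class is trained first
    # The weights learned from the complex class act as initial values when the
    # simple class is trained, which is where the claimed acceleration comes from.
    return train_stage(model, simple_class)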
In a preferred embodiment, the training apparatus 200 for the face detection model further comprises a face region determination module 204. The face region determination module 204 is configured to determine a face region in the face image sample. Referring to fig. 5, specifically, the face region determination module 204 includes a vertex acquisition sub-module 2041, a coordinate acquisition sub-module 2042, and a face region acquisition sub-module 2043.
The vertex fetch submodule 2041 is configured to fetch a first vertex and a second vertex.
A rectangular region can be determined by taking the line connecting the first vertex and the second vertex as its diagonal. It should be noted, however, that the area of the rectangle whose diagonal is the line connecting the first vertex and the second vertex is greater than or equal to the area of the face region contained in the face image sample.
The coordinate obtaining submodule 2042 is configured to obtain a first coordinate of the first vertex and a second coordinate of the second vertex.
The first coordinates of the first vertex are (x1, y1) and the second coordinates of the second vertex are (x2, y2), so that the server 100 can perform regression analysis on x1, y1, x2 and y2 respectively.
The face region obtaining sub-module 2043 is configured to respectively regress the first coordinates and the second coordinates to obtain a rectangular face region.
Obtaining the rectangular face region by regressing the first coordinates and the second coordinates separately allows the face to be framed more accurately and reduces the deviation of face detection.
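As a rough sketch of regressing the two diagonal vertices, the code below adds a small regression head producing (x1, y1, x2, y2) and applies a per-coordinate loss; smooth L1 is assumed only because it is a common choice for box regression, since the disclosure does not name a specific regression loss.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BoxRegressionHead(nn.Module):
    # Maps a feature vector to the two diagonal vertices of the rectangular face region.
    def __init__(self, in_features=256):
        super().__init__()
        self.fc = nn.Linear(in_features, 4)  # x1, y1, x2, y2

    def forward(self, features):
        return self.fc(features)

def box_regression_loss(pred, target):
    # Regress each of the four coordinates separately, as described above.
    return sum(F.smooth_l1_loss(pred[:, i], target[:, i]) for i in range(4))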
Second embodiment
Referring to fig. 6, a training method of a face detection model according to an embodiment of the present invention is applied to the server 100. It should be noted that the basic principle and the technical effects of the training method provided in this embodiment are the same as those of the embodiment described above; for the sake of brevity, parts not mentioned in this embodiment may refer to the corresponding contents in the embodiment described above. The training method of the face detection model comprises the following steps:
step S501: and acquiring an uploaded face image training set, wherein the face image training set comprises a plurality of different face image samples.
It is understood that step S501 may be performed by the face image training set acquisition module 201.
Step S502: and classifying the different face image samples into a complex face image sample class and a simple face image sample class according to the complexity of the face image.
It is understood that step S502 may be performed by the classification module 202.
Specifically, referring to fig. 7, step S502 includes the following two substeps:
substep S5021: and obtaining the classification loss rate of the plurality of different face image samples.
It is to be understood that the sub-step S5021 may be performed by the classification loss rate acquisition sub-module 2021.
Substep S5022: classifying the samples with the classification loss rate larger than a preset threshold value into a complex face image sample class, and classifying the samples with the classification loss rate smaller than or equal to the preset threshold value into a simple face image sample class.
It will be appreciated that sub-step S5022 may be performed by the classification sub-module 2022.
Step S503: and determining a face region in the face image sample.
It is understood that step S503 can be performed by the face region determination module 204.
Specifically, referring to fig. 8, step S503 includes the following three substeps:
substep S5031: and acquiring a first vertex and a second vertex.
It will be appreciated that sub-step S5031 may be performed by vertex fetch sub-module 2041.
Substep S5032: and acquiring a first coordinate of the first vertex and a second coordinate of the second vertex.
It will be appreciated that sub-step S5032 may be performed by the coordinate acquisition sub-module 2042.
Substep S5033: and respectively regressing the first coordinate and the second coordinate to obtain a rectangular face area.
It is to be understood that sub-step S5033 may be performed by the face region acquisition sub-module 2043.
Step S504: and preferentially extracting the characteristics of the complex face image according to a plurality of face image samples contained in the complex face image sample class.
It is understood that step S504 may be performed by the feature extraction module 203.
Step S505: and extracting the characteristics of the simple face images according to a plurality of face image samples contained in the simple face image sample class and the extracted characteristics of the complex face images.
It is understood that step S505 can be performed by the feature extraction module 203.
Step S506: and training a neural network by using the extracted complex face image characteristics and the extracted simple face image characteristics so as to establish a face detection model.
It is understood that step S506 can be performed by the face detection model building module 205.
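The sketch below strings steps S501 to S506 together at a high level, reusing the hypothetical helpers from the earlier sketches (split_by_complexity, build_face_detection_model) and an assumed determine_face_regions helper; it is a schematic of the flow, not the patented implementation.

def train_face_detection_model(training_set):
    # S501: obtain the uploaded training set of face image samples.
    samples = list(training_set)
    # S502: split into complex / simple sample classes by classification loss rate.
    complex_class, simple_class = split_by_complexity(samples)
    # S503: determine the rectangular face region of each sample (the regression targets).
    determine_face_regions(samples)  # hypothetical helper
    # S504-S506: extract complex features first, reuse them for the simple samples,
    # and train the neural network on both to establish the face detection model.
    return build_face_detection_model(complex_class, simple_class)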
Third embodiment
Referring to fig. 9, a face detection apparatus 300 according to an embodiment of the present invention is used for detecting a face. The face detection apparatus 300 includes: a candidate region acquisition module 301, a judgment module 302 and a face recognition region determination module 303.
The candidate region acquiring module 301 is configured to identify a first face candidate region of a face image to be detected according to an image recognition algorithm.
Firstly, the first face candidate region in the face image to be detected is preliminarily identified through an image recognition algorithm, where the image recognition algorithm may be, but is not limited to, a convolutional neural network method, an elastic-model-based method, an eigenface method and the like.
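For illustration only, the sketch below uses OpenCV's Haar cascade detector as a stand-in for the unspecified image recognition algorithm; the disclosure merely lists algorithm families, so this particular detector is an assumption.

import cv2

def first_face_candidate_region(image_bgr):
    # Stand-in for step S801: detect a face and return the first hit as an
    # (x1, y1, x2, y2) candidate region, or None if nothing is found.
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    return (int(x), int(y), int(x + w), int(y + h))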
The candidate region obtaining module 301 is further configured to obtain a face detection result according to the face image to be detected and the first face detection model.
The face image to be detected is input to the first face detection model established by applying the training method of the face detection model provided by the second embodiment, and a face detection result can be obtained.
The candidate region obtaining module 301 is further configured to obtain a third face candidate region according to the face detection result and the second face detection model.
The face detection result is input to the second face detection model established by applying the training method of the face detection model provided in the second embodiment, and a third face candidate region can be obtained.
The candidate region obtaining module 301 is further configured to obtain a fourth face candidate region according to the face detection result and the third face detection model.
A fourth face candidate region may be obtained by inputting the face detection result to the third face detection model established by applying the training method of the face detection model provided in the second embodiment.
The candidate region obtaining module 301 is further configured to obtain a fifth face candidate region according to the face detection result and the fourth face detection model.
A fifth face candidate region may be obtained by inputting the face detection result to the fourth face detection model established by applying the training method of the face detection model provided in the second embodiment.
Therefore, it can be understood that the candidate region obtaining module 301 is configured to identify a second face candidate region according to the face image to be detected and the face detection model established by the above-mentioned training method of the face detection model.
Specifically, the face detection model established by the training method of the face detection model includes a first face detection model, a second face detection model, a third face detection model and a fourth face detection model, which have different complexities: the second face detection model has a complexity corresponding to a size of 12 × 12, the third face detection model a complexity corresponding to a size of 24 × 24, and the fourth face detection model a complexity corresponding to a size of 48 × 48.
The determining module 302 is configured to determine whether deviations of the first face candidate region and the third face candidate region, the fourth face candidate region, and the fifth face candidate region are all smaller than a preset deviation.
By comprehensively judging whether the deviations of the first face candidate region from the third face candidate region, the fourth face candidate region and the fifth face candidate region are all smaller than the preset deviation, the face recognition region can be determined accurately and errors in the face detection process are reduced.
The face recognition area determination module 303 is configured to determine the first face candidate region as the face recognition region if the deviations of the first face candidate region from the third face candidate region, the fourth face candidate region, and the fifth face candidate region are all smaller than the preset deviation.
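A minimal sketch of this acceptance test is given below. The disclosure does not define how the deviation between two regions is measured, so the largest absolute coordinate offset (in pixels) is assumed here purely for illustration, and the preset deviation of 10 pixels is likewise a placeholder.

def region_deviation(region_a, region_b):
    # Assumed deviation measure: the largest absolute offset between corresponding
    # (x1, y1, x2, y2) coordinates of the two candidate regions.
    return max(abs(a - b) for a, b in zip(region_a, region_b))

def confirm_face_region(first_region, cascade_regions, preset_deviation=10):
    # Accept the first candidate region only if it deviates from every cascade output
    # (the third, fourth and fifth candidate regions) by less than the preset deviation.
    if all(region_deviation(first_region, r) < preset_deviation for r in cascade_regions):
        return first_region
    return None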
Fourth embodiment
Referring to fig. 10, a face detection method according to an embodiment of the present invention is applied to the user terminal 400. It should be noted that the basic principle and the technical effects of the face detection method provided in this embodiment are the same as those of the embodiment described above; for the sake of brevity, parts not mentioned in this embodiment may refer to the corresponding contents in the embodiment described above. The face detection method comprises the following steps:
step S801: and identifying a first face candidate region of the face image to be detected according to an image identification algorithm.
It is understood that step S801 may be performed by the candidate region acquisition module 301.
Step S802: and obtaining a face detection result according to the face image to be detected and the first face detection model.
It is understood that step S802 may be performed by the candidate region acquisition module 301.
Step S803: and acquiring a third face candidate region according to the face detection result and the second face detection model.
It is understood that step S803 may be performed by the candidate region acquisition module 301.
Step S804: and acquiring a fourth face candidate area according to the face detection result and the third face detection model.
It is understood that step S804 may be performed by the candidate region acquisition module 301.
Step S805: and acquiring a fifth face candidate region according to the face detection result and the fourth face detection model.
It is understood that step S805 may be performed by the candidate region acquisition module 301.
Step S806: judging whether the deviations of the first face candidate region, the third face candidate region, the fourth face candidate region and the fifth face candidate region are all smaller than a preset deviation, if so, executing the step S807; if not, step S801 is executed.
It is understood that step S806 can be performed by the determining module 302.
Step S807: and determining the first face candidate area as a face recognition area.
It is understood that step S807 can be performed by the face recognition area determination module 303.
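The sketch below ties steps S801 to S807 together under the same assumptions as the earlier sketches; run_model is a hypothetical wrapper around one of the cascaded face detection models, and first_face_candidate_region and confirm_face_region are the illustrative helpers defined above.

def detect_face(image, models, preset_deviation=10):
    first_model, second_model, third_model, fourth_model = models
    first_region = first_face_candidate_region(image)            # S801
    if first_region is None:
        return None
    detection_result = run_model(first_model, image)             # S802
    third_region = run_model(second_model, detection_result)     # S803
    fourth_region = run_model(third_model, detection_result)     # S804
    fifth_region = run_model(fourth_model, detection_result)     # S805
    # S806/S807: confirm the first candidate region against the cascade outputs.
    return confirm_face_region(first_region,
                               [third_region, fourth_region, fifth_region],
                               preset_deviation)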
In summary, the present invention provides a training method and apparatus for a face detection model. A plurality of different face image samples contained in a face image training set are classified into a complex face image sample class and a simple face image sample class according to the complexity of the face images. Complex face image features are extracted from the complex face image sample class first, and simple face image features are then extracted from the simple face image sample class with reference to the complex face image features. Because the complex face image features may coincide with features of the face image samples contained in the simple face image sample class, features similar to the complex face image features can be called directly rather than being computed again. Finally, a neural network is trained with the extracted complex and simple face image features to establish the face detection model. Convergence is therefore accelerated, and the face detection model is established faster by the training method and apparatus provided by the invention.
On the other hand, in the face detection method provided by the invention, the face detection model is established by applying the above training method and apparatus of the face detection model. When the deviation between the first face candidate region and the second face candidate region is smaller than the preset deviation, the first face candidate region is determined as the face recognition region, so the face recognition region is determined more accurately and errors in the face detection process are reduced.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a unit, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, each functional unit in the embodiments of the present invention may be integrated together to form an independent part, or each unit may exist separately, or two or more units may be integrated to form an independent part.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (8)

1. A training method of a face detection model is characterized in that the training method of the face detection model comprises the following steps:
acquiring an uploaded face image training set, wherein the face image training set comprises a plurality of different face image samples;
obtaining the classification loss rate of the plurality of different face image samples, wherein the classification loss rate is the ratio of the number of lost features in the face image samples to the number of features contained in the face image samples;
classifying the samples with the classification loss rate larger than a preset threshold value into a complex face image sample class, and classifying the samples with the classification loss rate smaller than or equal to the preset threshold value into a simple face image sample class;
preferentially extracting complex face image features from a plurality of face image samples contained in the complex face image sample class;
extracting simple face image features according to a plurality of face image samples contained in the simple face image sample class and the extracted complex face image features;
and training a neural network by using the extracted complex face image characteristics and the extracted simple face image characteristics so as to establish a face detection model.
2. The method for training a face detection model according to claim 1, wherein before the step of training a neural network using the extracted complex face image features and the simple face image features, the method for training a face detection model further comprises:
determining a face region in the face image sample;
the step of training a neural network by using the extracted complex face image features and the extracted simple face image features so as to establish a face detection model comprises the following steps:
and training a neural network according to the extracted complex face image characteristics and the extracted simple face image characteristics, and establishing a face detection model according to the face region.
3. The method for training a face detection model according to claim 2, wherein the face region is a rectangular region, and the step of determining the face region in the face image sample comprises:
acquiring a first vertex and a second vertex, wherein the area of a rectangle taking the connecting line of the first vertex and the second vertex as a diagonal line is larger than or equal to the area of a face region contained in the face image sample;
acquiring a first coordinate of the first vertex and a second coordinate of the second vertex;
and respectively regressing the first coordinate and the second coordinate to obtain a rectangular face area.
4. An apparatus for training a face detection model, the apparatus comprising:
the face image training set acquisition module is used for acquiring an uploaded face image training set, and the face image training set comprises a plurality of different face image samples;
the classification module is used for acquiring the classification loss rate of the plurality of different face image samples, wherein the classification loss rate is the ratio of the number of lost features in the face image samples to the number of features contained in the face image samples;
the classification module is also used for classifying the samples with the classification loss rate larger than a preset threshold value into a complex face image sample class, and classifying the samples with the classification loss rate smaller than or equal to the preset threshold value into a simple face image sample class;
the characteristic extraction module is used for preferentially extracting the characteristics of the complex face image according to a plurality of face image samples contained in the complex face image sample class;
the feature extraction module is further used for extracting the features of the simple face images according to a plurality of face image samples contained in the simple face image sample class and the extracted features of the complex face images;
and the face detection model establishing module is used for training a neural network by using the extracted complex face image characteristics and the extracted simple face image characteristics so as to establish a face detection model.
5. A face detection method, characterized in that the face detection method comprises:
identifying a first face candidate region of a face image to be detected according to an image identification algorithm;
identifying a second face candidate region according to the face image to be detected and a plurality of face detection models established by the training method of a face detection model according to any one of claims 1 to 3;
and if the deviation between the first face candidate region and the second face candidate region is smaller than a preset deviation, determining the first face candidate region as a face recognition region.
6. The face detection method of claim 5, wherein the face detection models include a first face detection model, a second face detection model, a third face detection model, and a fourth face detection model, the first face detection model, the second face detection model, the third face detection model, and the fourth face detection model have different complexities, the face detection method further comprising:
obtaining a face detection result according to the face image to be detected and the first face detection model;
acquiring a third face candidate area according to the face detection result and the second face detection model;
acquiring a fourth face candidate area according to the face detection result and the third face detection model;
acquiring a fifth face candidate region according to the face detection result and the fourth face detection model;
and if the deviations of the first face candidate region, the third face candidate region, the fourth face candidate region and the fifth face candidate region are smaller than a preset deviation, determining the first face candidate region as a face recognition region.
7. A face detection apparatus, characterized in that the face detection apparatus comprises:
the candidate region acquisition module is used for identifying a first face candidate region of the face image to be detected according to an image recognition algorithm;
the candidate region acquisition module is further configured to identify a second face candidate region according to the face image to be detected and a plurality of face detection models established by the training method according to any one of claims 1 to 3;
and the face recognition area determining module is used for determining the first face candidate area as the face recognition area if the deviation between the first face candidate area and the second face candidate area is smaller than the preset deviation.
8. The face detection apparatus of claim 7, wherein the face detection models include a first face detection model, a second face detection model, a third face detection model, and a fourth face detection model, the first face detection model, the second face detection model, the third face detection model, and the fourth face detection model having different complexities;
the candidate region acquisition module is further used for acquiring a face detection result according to the face image to be detected and the first face detection model;
the candidate region acquisition module is further used for acquiring a third face candidate region according to the face detection result and the second face detection model;
the candidate region acquisition module is further used for acquiring a fourth face candidate region according to the face detection result and the third face detection model;
the candidate region acquisition module is further used for acquiring a fifth face candidate region according to the face detection result and the fourth face detection model;
the face recognition area determination module is further configured to determine the first face candidate area as a face recognition area if the deviations of the first face candidate area from the third face candidate area, the fourth face candidate area, and the fifth face candidate area are smaller than a preset deviation.
CN201710579552.6A 2017-07-17 2017-07-17 Training method and device of face detection model and face detection method and device Active CN107358209B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710579552.6A CN107358209B (en) 2017-07-17 2017-07-17 Training method and device of face detection model and face detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710579552.6A CN107358209B (en) 2017-07-17 2017-07-17 Training method and device of face detection model and face detection method and device

Publications (2)

Publication Number Publication Date
CN107358209A CN107358209A (en) 2017-11-17
CN107358209B true CN107358209B (en) 2020-02-28

Family

ID=60292114

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710579552.6A Active CN107358209B (en) 2017-07-17 2017-07-17 Training method and device of face detection model and face detection method and device

Country Status (1)

Country Link
CN (1) CN107358209B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108305262A (en) * 2017-11-22 2018-07-20 腾讯科技(深圳)有限公司 File scanning method, device and equipment
CN108986137B (en) * 2017-11-30 2022-02-01 成都通甲优博科技有限责任公司 Human body tracking method, device and equipment
CN108229321B (en) * 2017-11-30 2021-09-21 北京市商汤科技开发有限公司 Face recognition model, and training method, device, apparatus, program, and medium therefor
CN110163032B (en) * 2018-02-13 2021-11-16 浙江宇视科技有限公司 Face detection method and device
CN109359575B (en) * 2018-09-30 2022-05-10 腾讯科技(深圳)有限公司 Face detection method, service processing method, device, terminal and medium
CN110070076B (en) * 2019-05-08 2021-05-18 北京字节跳动网络技术有限公司 Method and device for selecting training samples
CN110222724B (en) * 2019-05-15 2023-12-19 平安科技(深圳)有限公司 Picture instance detection method and device, computer equipment and storage medium
CN111882815A (en) * 2020-07-30 2020-11-03 吉林建筑大学 Intelligent security and fire protection integrated method and system

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101398893B (en) * 2008-10-10 2010-09-01 北京科技大学 Adaboost arithmetic improved robust human ear detection method
CN102141998B (en) * 2010-02-03 2013-02-27 中国科学院自动化研究所 Automatic evaluation method for webpage vision complexity
CN104252627A (en) * 2013-06-28 2014-12-31 广州华多网络科技有限公司 SVM (support vector machine) classifier training sample acquiring method, training method and training system
CN104077613B (en) * 2014-07-16 2017-04-12 电子科技大学 Crowd density estimation method based on cascaded multilevel convolution neural network
CN104268381B (en) * 2014-09-16 2017-03-29 哈尔滨工业大学 A kind of satellite failure diagnostic method based on AdaBoost algorithms
CN104268584A (en) * 2014-09-16 2015-01-07 南京邮电大学 Human face detection method based on hierarchical filtration
CN104462381B (en) * 2014-12-11 2019-03-19 中细软移动互联科技有限公司 Trademark image retrieval method
CN105023006B (en) * 2015-08-05 2018-05-04 西安电子科技大学 Face identification method based on enhanced nonparametric maximal margin criterion
CN105894032A (en) * 2016-04-01 2016-08-24 南京大学 Method of extracting effective features based on sample properties
CN106650575A (en) * 2016-09-19 2017-05-10 北京小米移动软件有限公司 Face detection method and device
CN106682731A (en) * 2017-01-13 2017-05-17 首都师范大学 Acceleration method and device for convolutional neural network

Also Published As

Publication number Publication date
CN107358209A (en) 2017-11-17

Similar Documents

Publication Publication Date Title
CN107358209B (en) Training method and device of face detection model and face detection method and device
CN108235770B (en) Image identification method and cloud system
US11188789B2 (en) Detecting poisoning attacks on neural networks by activation clustering
US11003941B2 (en) Character identification method and device
CN109034069B (en) Method and apparatus for generating information
WO2022166532A1 (en) Facial recognition method and apparatus, and electronic device and storage medium
US10423817B2 (en) Latent fingerprint ridge flow map improvement
CN111931859B (en) Multi-label image recognition method and device
CN111931548B (en) Face recognition system, method for establishing face recognition data and face recognition method
US11417129B2 (en) Object identification image device, method, and computer program product
CN111046971A (en) Image recognition method, device, equipment and computer readable storage medium
CN114494935B (en) Video information processing method and device, electronic equipment and medium
CN115546488B (en) Information segmentation method, information extraction method and training method of information segmentation model
Rehman et al. Efficient coarser‐to‐fine holistic traffic sign detection for occlusion handling
US10755074B2 (en) Latent fingerprint pattern estimation
KR102427690B1 (en) Apparatus and method for classification based on deep learning
CN113378852A (en) Key point detection method and device, electronic equipment and storage medium
CN113468017A (en) Online service state detection method applied to block chain and service server
CN110163032B (en) Face detection method and device
CN109446780B (en) Identity authentication method, device and storage medium thereof
CN111753618A (en) Image recognition method and device, computer equipment and computer readable storage medium
CN114170604A (en) Character recognition method and system based on Internet of things
CN113505716A (en) Training method of vein recognition model, and recognition method and device of vein image
CN114677691B (en) Text recognition method, device, electronic equipment and storage medium
CN111144294A (en) Target identification method and device, computer equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant