CN110610140B - Training method, device and equipment of face recognition model and readable storage medium - Google Patents

Training method, device and equipment of face recognition model and readable storage medium

Info

Publication number
CN110610140B
CN110610140B (application CN201910785159.1A)
Authority
CN
China
Prior art keywords
network layer
neural network
value
current
sample face
Prior art date
Legal status
Active
Application number
CN201910785159.1A
Other languages
Chinese (zh)
Other versions
CN110610140A (en)
Inventor
郭玲玲
李佼
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910785159.1A priority Critical patent/CN110610140B/en
Priority to PCT/CN2019/118399 priority patent/WO2021035980A1/en
Publication of CN110610140A publication Critical patent/CN110610140A/en
Application granted granted Critical
Publication of CN110610140B publication Critical patent/CN110610140B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a training method, apparatus, device and readable storage medium for a face recognition model, relates to the technical field of neural networks, and can determine, according to the actual progress of face recognition model training, whether to continue building neural network layers, thereby saving a large amount of system memory and reducing the training cost of the face recognition model. The method comprises the following steps: obtaining target sample face parameters of a current neural network layer, and performing iterative training on the target sample face parameters to obtain a predicted value of the current neural network layer, wherein the target sample face parameters are parameters of the sample face area corresponding to the current neural network layer; generating a first loss value of the current neural network layer according to the predicted value; extracting a second loss value of the previous neural network layer of the current neural network layer, and taking the difference between the first loss value and the second loss value as a descent gradient; and if the descent gradient is smaller than a loss threshold, taking the model formed by all currently established neural network layers as the face recognition model.

Description

Training method, device and equipment of face recognition model and readable storage medium
Technical Field
The present invention relates to the field of neural networks, and in particular, to a training method, apparatus, device, and readable storage medium for a face recognition model.
Background
In recent years, with the rapid development of neural network technology, intelligent recognition technology has been widely researched, developed, and applied in various fields. To realize intelligent face recognition, personalized features of the face are usually sampled and recognition is performed by comparing those features. In general, a neural network is used to learn the personalized features and train a face recognition model, and intelligent recognition is realized based on the face recognition model. The architecture of the neural network comprises multiple layers of neurons, and the gradient of each successive layer gradually decreases; to ensure that the learning process of the neurons is accurate, the descent gradient of each layer is also used as a training parameter when each layer performs iterative learning.
In the related art, when a face recognition model is trained, a neural network with a fixed depth is adopted for supervised learning. During training, each neural network layer needs to perform back propagation, the descent gradient of the layer is calculated through back propagation, the neural network layers are built one by one according to the descent gradient, and the face recognition model is obtained based on the built neural network layers.
In carrying out the present invention, the inventors have found that the related art has at least the following problems:
Because the depth of the neural network is fixed, even if the iterative learning of each established neural network layer has already become stable before the depth requirement is met, back propagation and iterative learning must continue until the established neural network layers satisfy the depth requirement of the network. The back propagation performed after the iterative learning of the neural network layers has stabilized is therefore useless work, which wastes a large amount of system memory and increases the training cost of the face recognition model.
Disclosure of Invention
In view of this, the present invention provides a training method, device, apparatus and readable storage medium for a face recognition model, which mainly aims to solve the problem that a great deal of system memory is wasted and the training cost of the face recognition model is increased.
According to a first aspect of the present invention, there is provided a training method of a face recognition model, the method comprising:
obtaining target sample face parameters of a current neural network layer, and performing iterative training on the target sample face parameters to obtain a predicted value of the current neural network layer, wherein the target sample face parameters are parameters of the sample face area corresponding to the current neural network layer;
generating a first loss value of the current neural network layer according to the predicted value;
extracting a second loss value of the previous neural network layer of the current neural network layer, and taking the difference between the first loss value and the second loss value as a descent gradient;
and if the descent gradient is smaller than a loss threshold, taking the model formed by all currently established neural network layers as the face recognition model.
In another embodiment, before the obtaining the target sample face parameters of the current neural network layer and performing iterative training on the target sample face parameters to obtain the predicted value of the current neural network layer, the method further includes:
acquiring the sample face parameters of the previous neural network layer of the current neural network layer;
and substituting the sample face parameters into a parameter association function to generate the target sample face parameters of the current neural network layer, wherein the parameter association function prescribes a functional relationship between the sample face parameters of adjacent neural network layers.
In another embodiment, the obtaining the target sample face parameter of the current neural network layer, and performing iterative training on the target sample face parameter to obtain the predicted value of the current neural network layer includes:
establishing a fixed track of the current neural network layer based on the target sample face parameters;
performing tracking simulation on the fixed track, and counting the number of times the fixed track has been tracked;
and when the tracking count reaches the iteration count of the current neural network layer, extracting the parameters of the last tracking simulation as the predicted value.
In another embodiment, the generating the first loss value of the current neural network layer according to the predicted value includes:
substituting the predicted value into a loss value association function to generate the first loss value of the current neural network layer, wherein the loss value association function prescribes a functional relationship between the predicted value and the loss value of any neural network layer; or,
acquiring a true value, computing the absolute value of the difference between the true value and the predicted value, and taking the absolute value of the difference as the first loss value of the current neural network layer.
In another embodiment, after the extracting the second loss value of the previous neural network layer of the current neural network layer and taking the difference between the first loss value and the second loss value as the descent gradient, the method further includes:
if the descent gradient is larger than the loss threshold, continuing to establish the next neural network layer after the current neural network layer, and repeatedly performing the iterative training of the sample face parameters and the comparison of the loss value with the loss threshold.
According to a second aspect of the present invention, there is provided a model training apparatus comprising:
the iteration module is used for acquiring target sample face parameters of a current neural network layer and performing iterative training on the target sample face parameters to obtain a predicted value of the current neural network layer, wherein the target sample face parameters are parameters of the sample face area corresponding to the current neural network layer;
the first generation module is used for generating a first loss value of the current neural network layer according to the predicted value;
the extraction module is used for extracting a second loss value of the previous neural network layer of the current neural network layer, and taking the difference between the first loss value and the second loss value as a descent gradient;
and the determining module is used for taking the model formed by all currently established neural network layers as a face recognition model if the descent gradient is smaller than a loss threshold.
In another embodiment, the apparatus further comprises:
the acquisition module is used for acquiring the sample face parameters of the previous neural network layer of the current neural network layer;
and the second generation module is used for substituting the sample face parameters into a parameter association function to generate the target sample face parameters of the current neural network layer, wherein the parameter association function prescribes a functional relationship between the sample face parameters of adjacent neural network layers.
In another embodiment, the iteration module includes:
the establishing unit is used for establishing a fixed track of the current neural network layer based on the target sample face parameters;
the tracking unit is used for performing tracking simulation on the fixed track and counting the number of times the fixed track has been tracked;
and the extraction unit is used for extracting the parameters of the last tracking simulation as the predicted value when the tracking count reaches the iteration count of the current neural network layer.
In another embodiment, the first generating module is configured to substitute the predicted value into a loss value association function to generate the first loss value of the current neural network layer, where the loss value association function prescribes a functional relationship between the predicted value and the loss value of any neural network layer; or, to acquire a true value, compute the absolute value of the difference between the true value and the predicted value, and take the absolute value of the difference as the first loss value of the current neural network layer.
In another embodiment, the iteration module is further configured to, if the descent gradient is larger than the loss threshold, continue to establish the next neural network layer after the current neural network layer, and repeatedly perform the iterative training of the sample face parameters and the comparison of the loss value with the loss threshold.
According to a third aspect of the present invention there is provided an apparatus comprising a memory storing a computer program and a processor implementing the steps of the method of the first aspect described above when the computer program is executed by the processor.
According to a fourth aspect of the present invention there is provided a readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the method of the first aspect described above.
By means of the above technical scheme, compared with the current approach of continuously performing back propagation and iterative learning until the established neural network layers satisfy the depth requirement of the neural network, the training method, apparatus, device and readable storage medium for a face recognition model provided by the invention obtain the target sample face parameters of the current neural network layer, perform iterative training on the target sample face parameters to obtain the predicted value of the current neural network layer, and generate the first loss value of the current neural network layer according to the predicted value, thereby obtaining the descent gradient of the current neural network layer. If the descent gradient is smaller than the loss threshold, no further neural network layers are built and all currently established neural network layers are directly combined into the face recognition model, so that whether to continue building neural network layers is determined according to the actual progress of model training, which saves a large amount of system memory and reduces the training cost of the face recognition model.
The foregoing description is only an overview of the technical solution of the present invention. In order that the technical means of the present invention may be more clearly understood and implemented in accordance with the contents of the specification, and to make the above and other objects, features and advantages of the present invention more apparent, preferred embodiments are described in detail below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
fig. 1 shows a flowchart of a training method of a face recognition model according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of a training method of a face recognition model according to an embodiment of the present invention;
fig. 3A is a schematic structural diagram of a training device for a face recognition model according to an embodiment of the present invention;
fig. 3B is a schematic structural diagram of a training device for a face recognition model according to an embodiment of the present invention;
fig. 3C is a schematic structural diagram of a training device for a face recognition model according to an embodiment of the present invention;
fig. 4 shows a schematic device structure of a computer device according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
The embodiment of the invention provides a training method for a face recognition model. The method obtains the target sample face parameters of a current neural network layer and performs iterative training on the target sample face parameters to obtain the predicted value of the current neural network layer, and then generates the first loss value of the current neural network layer according to the predicted value so as to obtain the descent gradient of the current neural network layer. If the descent gradient is smaller than the loss threshold, no further neural network layers are built, and all currently established neural network layers are directly combined into the face recognition model, so that whether to continue building neural network layers is determined according to the actual progress of model training, which saves a large amount of system memory and reduces the training cost of the face recognition model. As shown in fig. 1, the method comprises the following steps:
101. Obtaining target sample face parameters of the current neural network layer, and performing iterative training on the target sample face parameters to obtain a predicted value of the current neural network layer, wherein the target sample face parameters are parameters of the sample face area corresponding to the current neural network layer.
102. Generating a first loss value of the current neural network layer according to the predicted value.
103. Extracting a second loss value of the previous neural network layer of the current neural network layer, and taking the difference between the first loss value and the second loss value as a descent gradient.
104. If the descent gradient is smaller than the loss threshold, taking the model formed by all currently established neural network layers as the face recognition model.
According to the method provided by the embodiment of the invention, the target sample face parameters of the current neural network layer are obtained, iterative training is performed on the target sample face parameters to obtain the predicted value of the current neural network layer, and the first loss value of the current neural network layer is generated according to the predicted value, so that the descent gradient of the current neural network layer is obtained. If the descent gradient is smaller than the loss threshold, no further neural network layers are built and the face recognition model is directly formed from all currently established neural network layers, so that whether to continue building neural network layers is determined according to the actual progress of model training, which saves a large amount of system memory and reduces the training cost of the face recognition model.
The embodiment of the invention provides a training method for a face recognition model. The method obtains the target sample face parameters of a current neural network layer and performs iterative training on the target sample face parameters to obtain the predicted value of the current neural network layer, and then generates the first loss value of the current neural network layer according to the predicted value so as to obtain the descent gradient of the current neural network layer. If the descent gradient is smaller than the loss threshold, no further neural network layers are built, and all currently established neural network layers are directly combined into the face recognition model, so that whether to continue building neural network layers is determined according to the actual progress of model training, which saves a large amount of system memory and reduces the training cost of the face recognition model. As shown in fig. 2, the method comprises the following steps:
201. Acquiring the sample face parameters of the previous neural network layer of the current neural network layer, and substituting the sample face parameters into a parameter association function to generate the target sample face parameters of the current neural network layer.
The inventors have realized that, when a model is trained, because the sample face area corresponding to each neural network layer is different, the sample face parameters that each layer relies on during training are also different, so the whole neural network is filled with a large number of sample face parameters. Whether or not a given sample face parameter is actually used during training, it is carried along throughout the training of the model, which makes the training process heavy. In order to reduce the number of sample face parameters involved in training a model based on a neural network and lighten the burden of training the face recognition model, the invention associates the sample face parameters of adjacent neural network layers through a function, so that two or more sample face parameters are represented by a single sample face parameter; that is, as long as one sample face parameter is set, the sample face parameters of every neural network layer can be calculated according to the parameter association function between the layers. The parameter association function can be set by the operator according to the actual situation, so that the function set by the operator can be obtained directly when calculating the sample face parameters of each neural network layer.
The parameter association function prescribes a functional relationship between the sample face parameters of adjacent neural network layers. When the target sample face parameters of the current neural network layer are generated, the previous neural network layer associated with the current layer as prescribed by the parameter association function has already been established, so the sample face parameters of the previous neural network layer are acquired and substituted into the parameter association function, and the target sample face parameters of the current neural network layer are calculated based on the parameter association function.
For example, assume that the association function between the target sample face parameter of the current neural network layer and the sample face parameter of the previous neural network layer is y = 2x, where x represents the sample face parameter of the previous neural network layer and y represents the target sample face parameter of the current neural network layer. If the sample face parameter of the previous neural network layer is W1, the target sample face parameter of the current neural network layer can be determined to be 2*W1 according to the parameter association function.
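The following minimal sketch illustrates this parameter-propagation idea in Python; the use of NumPy arrays, the function name parameter_association, and the concrete values are illustrative assumptions and not part of the patent, which leaves the association function to be configured by the operator.

```python
import numpy as np

def parameter_association(prev_params: np.ndarray) -> np.ndarray:
    # Parameter association function between adjacent neural network layers.
    # This is the y = 2x example from the text; in practice the function is
    # configured by the operator.
    return 2.0 * prev_params

# Sample face parameters of the previous neural network layer (illustrative values).
prev_layer_params = np.array([0.5, 1.25, -0.75])

# Target sample face parameters of the current layer are derived from the
# previous layer instead of being stored as an independent set of parameters.
current_layer_params = parameter_association(prev_layer_params)
print(current_layer_params)  # [ 1.    2.5  -1.5 ]
```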
202. Obtaining target sample face parameters of the current neural network layer, and performing iterative training on the target sample face parameters to obtain a predicted value of the current neural network layer.
In the embodiment of the invention, after the target sample face parameters of the current neural network layer are determined, iterative learning can be performed on the target sample face parameters to obtain the predicted value of the current neural network layer. Specifically, when iterative learning is performed, a fixed track of the current neural network layer is first established based on the target sample face parameters; then tracking simulation is performed on the fixed track, and the number of times the fixed track has been tracked is counted. When the tracking count reaches the iteration count of the current neural network layer, the parameters of the last tracked iteration are extracted and taken as the predicted value; that is, the parameters of the last tracking simulation are extracted as the predicted value of the current neural network layer.
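As a rough illustration of this step, the sketch below builds a fixed track from the target sample face parameters, repeats a tracking simulation until the tracking count reaches the layer's iteration count, and returns the parameters of the last simulation as the predicted value. The track construction (linear interpolation) and the update rule inside the simulation are assumptions made only to keep the example runnable; the patent does not prescribe them.

```python
import numpy as np

def build_fixed_track(target_params: np.ndarray, length: int = 10) -> np.ndarray:
    # Build a fixed track (a reference sequence of states) from the target
    # sample face parameters; the linear interpolation is purely illustrative.
    steps = np.linspace(0.0, 1.0, length)[:, None]
    return steps * target_params[None, :]

def tracking_simulation(track: np.ndarray, params: np.ndarray) -> np.ndarray:
    # One tracking simulation over the fixed track: nudge the working
    # parameters toward the final track state (an assumed update rule).
    return params + 0.1 * (track[-1] - params)

def iterate_layer(target_params: np.ndarray, iteration_count: int) -> np.ndarray:
    # Repeat the tracking simulation until the tracking count reaches the
    # layer's iteration count, then return the parameters of the last run
    # as the layer's predicted value.
    track = build_fixed_track(target_params)
    params = np.zeros_like(target_params)
    tracking_count = 0
    while tracking_count < iteration_count:
        params = tracking_simulation(track, params)
        tracking_count += 1
    return params  # predicted value of the current neural network layer

predicted_value = iterate_layer(np.array([1.0, 2.5, -1.5]), iteration_count=20)
print(predicted_value)
```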
203. Generating a first loss value of the current neural network layer according to the predicted value.
In the embodiment of the invention, since iterative learning is not a copying process, complete consistency between the predicted value and the true value cannot be guaranteed: errors exist between the predicted values generated during model training and the true value, and the model can only be trained by reducing those errors through continued training. Therefore, the error between the predicted value and the true value, namely the loss value, is determined according to the obtained predicted value of the current neural network layer, and whether further neural network layers need to be built for iteration is determined according to the loss value. In addition, the true value is fixed and already exists; for example, when iterative learning is performed on a picture, the picture serves as the basis of the iterative learning and is also the true value, while the parameters of the various areas of the picture generated during the iterations are the predicted values. When determining the loss value, either of the following two methods can be adopted:
The first method is as follows: acquire the true value, compute the absolute value of the difference between the true value and the predicted value, and take the absolute value of the difference as the first loss value of the current neural network layer.
The second method is as follows: for a neural network that has been trained many times, based on accumulated experience, a loss value association function for recording and predicting loss values can be generated from the different loss values of each layer; that is, the loss value association function prescribes a functional relationship between the predicted value and the loss value of any neural network layer, so the loss value can be generated by calculation from the predicted value. Specifically, the predicted value is substituted into the loss value association function to generate the first loss value of the current neural network layer.
When setting the loss value association function, a plurality of formula templates can be preset. The predicted values and loss values of the existing neural network layers are determined, value pairs are formed from each corresponding predicted value and loss value, and the value pairs are substituted into the formula templates in turn. The formula template that still holds after the value pairs are substituted is taken as the designated formula template, each coefficient in the designated formula template is calculated, each unknown of the template is determined, and the unknowns and coefficients are combined to generate the loss value association function. For example, let the value pairs be (1, 3) and (2, 5), and let the determined designated formula template be y = a*x + b. After the value pairs are substituted into the designated formula template, a = 2 and b = 1 can be calculated, so the resulting formula is y = 2x + 1, and this formula is taken as the loss value association function. Thus, to calculate the loss value of another neural network layer, the predicted value of that layer is substituted into the formula as x, and the calculated value of y is the loss value of that layer. In practical applications the loss value association function may also take other forms; the present invention does not limit its specific form.
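The sketch below reproduces this worked example: it fits the template y = a*x + b to (predicted value, loss value) pairs and returns the resulting loss value association function. The choice of a linear template and the least-squares solver are illustrative assumptions; the patent allows arbitrary preset templates.

```python
import numpy as np

def fit_linear_loss_function(value_pairs):
    # Fit the designated template y = a*x + b from (predicted value, loss value)
    # pairs, as in the (1, 3) and (2, 5) example above.
    x = np.array([pred for pred, _ in value_pairs])
    y = np.array([loss for _, loss in value_pairs])
    # Solve for [a, b] by least squares (exact for two pairs, robust for more).
    design = np.stack([x, np.ones_like(x)], axis=1)
    (a, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    return lambda predicted: a * predicted + b

loss_fn = fit_linear_loss_function([(1.0, 3.0), (2.0, 5.0)])  # yields y = 2x + 1
print(loss_fn(4.0))  # approximately 9.0: estimated loss for a predicted value of 4
```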
204. Extracting a second loss value of the previous neural network layer of the current neural network layer, and taking the difference between the first loss value and the second loss value as a descent gradient; if the descent gradient is smaller than the loss threshold, the following step 205 is executed; if the descent gradient is larger than the loss threshold, the following step 206 is executed.
In the embodiment of the invention, after the loss value of the current neural network layer is determined, since every neural network layer has a loss value and the loss value keeps decreasing and gradually stabilizes as more layers are established, the difference between the loss values of adjacent neural network layers can be calculated as a descent gradient, a loss threshold for judging whether to continue building neural network layers is set, and whether to continue building neural network layers is then determined by comparing the descent gradient with the loss threshold.
When calculating the descent gradient, the second loss value of the previous neural network layer of the current neural network layer is first acquired; the second loss value was recorded in the previous neural network layer, so it is read from that layer. Then, the difference between the first loss value and the second loss value is calculated and taken as the descent gradient, so that whether to continue building neural network layers can subsequently be determined according to the descent gradient.
If the descent gradient is smaller than the loss threshold, the loss value has already approached a plateau, continuing to build neural network layers would hardly change the loss value, and no further iteration is needed, that is, the following step 205 is executed. If the descent gradient is larger than the loss threshold, the loss value has not yet approached a plateau, the change is still large, and neural network layers need to continue to be established, that is, the following step 206 is executed.
205. If the descent gradient is smaller than the loss threshold, taking the model formed by all currently established neural network layers as the face recognition model.
In the embodiment of the invention, if the descent gradient is smaller than the loss threshold, the loss value has approached a plateau and would not change if further neural network layers were built, so the establishment of the next neural network layer is stopped, and the model formed by all currently established neural network layers is taken as the face recognition model.
206. If the descent gradient is larger than the loss threshold, continuing to establish the next neural network layer after the current neural network layer, and repeatedly performing the iterative training of the sample face parameters and the comparison of the loss value with the loss threshold.
In the embodiment of the present invention, if the descent gradient is larger than the loss threshold, the loss value has not yet approached a steady state and the change is still large, so neural network layers need to continue to be established; that is, the sample face parameters of the next neural network layer are determined according to the processes in steps 201 to 204, and the iterative training of the sample face parameters and the comparison of the loss value with the loss threshold are repeatedly executed until the descent gradient is smaller than the loss threshold.
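Putting steps 201 to 206 together, the following sketch shows the layer-growth loop with its stopping criterion. The helper train_layer is a stand-in for the tracking-simulation training of steps 202-203 (its decaying loss exists only so the loop terminates), and the threshold value, the y = 2x parameter association, and all names are illustrative assumptions rather than the patent's prescribed implementation.

```python
import numpy as np

LOSS_THRESHOLD = 0.01  # illustrative value; chosen by the operator in practice

def train_layer(target_params, layer_index):
    # Stand-in for steps 202-203: iterative training of the layer and
    # computation of its first loss value. The loss here simply decays with
    # depth so that the example terminates; a real implementation would run
    # the tracking simulation and compare the prediction with the true value.
    predicted_value = 0.9 * target_params
    first_loss = 1.0 / (layer_index + 1) ** 2
    return predicted_value, first_loss

def build_face_recognition_model(initial_params, max_layers=50):
    layers = []                      # all currently established layers
    params = initial_params
    previous_loss = None             # second loss value (previous layer)
    for layer_index in range(max_layers):
        predicted, first_loss = train_layer(params, layer_index)  # steps 202-203
        layers.append(predicted)
        if previous_loss is not None:
            descent_gradient = abs(first_loss - previous_loss)    # step 204
            if descent_gradient < LOSS_THRESHOLD:
                break                # step 205: loss has stabilized, stop here
        previous_loss = first_loss
        params = 2.0 * params        # step 201: parameter association (y = 2x example)
    return layers                    # model formed by all current layers

model = build_face_recognition_model(np.array([1.0, 2.5, -1.5]))
print(f"model built with {len(model)} layers")
```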
According to the method provided by the embodiment of the invention, the target sample face parameters of the current neural network layer are obtained, iterative training is performed on the target sample face parameters to obtain the predicted value of the current neural network layer, and the first loss value of the current neural network layer is generated according to the predicted value, so that the descent gradient of the current neural network layer is obtained. If the descent gradient is smaller than the loss threshold, no further neural network layers are built and the face recognition model is directly formed from all currently established neural network layers, so that whether to continue building neural network layers is determined according to the actual progress of model training, which saves a large amount of system memory and reduces the training cost of the face recognition model.
Further, as a specific implementation of the method shown in fig. 1, an embodiment of the present invention provides a model training apparatus, as shown in fig. 3A, where the apparatus includes: an iteration module 301, a first generation module 302, an extraction module 303 and a determination module 304.
The iteration module 301 is configured to obtain target sample face parameters of a current neural network layer and perform iterative training on the target sample face parameters to obtain a predicted value of the current neural network layer, where the target sample face parameters are parameters of the sample face area corresponding to the current neural network layer;
the first generating module 302 is configured to generate a first loss value of the current neural network layer according to the predicted value;
the extracting module 303 is configured to extract a second loss value of the previous neural network layer of the current neural network layer, and take the difference between the first loss value and the second loss value as a descent gradient;
the determining module 304 is configured to take the model formed by all currently established neural network layers as a face recognition model if the descent gradient is smaller than a loss threshold.
In a specific application scenario, as shown in fig. 3B, the apparatus further includes: an acquisition module 305 and a second generation module 306.
The acquiring module 305 is configured to acquire the sample face parameters of the previous neural network layer of the current neural network layer;
the second generating module 306 is configured to substitute the sample face parameters into a parameter association function to generate the target sample face parameters of the current neural network layer, where the parameter association function prescribes a functional relationship between the sample face parameters of adjacent neural network layers.
In a specific application scenario, as shown in fig. 3C, the iteration module 301 includes: a setup unit 3011, a tracking unit 3012, and an extraction unit 3013.
The establishing unit 3011 is configured to establish a fixed track of the current neural network layer based on the target sample face parameters;
the tracking unit 3012 is configured to perform tracking simulation on the fixed track, and count the number of times the fixed track has been tracked;
the extracting unit 3013 is configured to extract the parameters of the last tracking simulation as the predicted value when the tracking count reaches the iteration count of the current neural network layer.
In a specific application scenario, the first generating module 302 is configured to substitute the predicted value into a loss value association function to generate the first loss value of the current neural network layer, where the loss value association function prescribes a functional relationship between the predicted value and the loss value of any neural network layer; or, to acquire a true value, compute the absolute value of the difference between the true value and the predicted value, and take the absolute value of the difference as the first loss value of the current neural network layer.
In a specific application scenario, the iteration module 301 is further configured to, if the descent gradient is larger than the loss threshold, continue to establish the next neural network layer after the current neural network layer, and repeatedly perform the iterative training of the sample face parameters and the comparison of the loss value with the loss threshold.
According to the device provided by the embodiment of the invention, the target sample face parameters of the current neural network layer are obtained, iterative training is performed on the target sample face parameters to obtain the predicted value of the current neural network layer, and the first loss value of the current neural network layer is generated according to the predicted value, so that the descent gradient of the current neural network layer is obtained. If the descent gradient is smaller than the loss threshold, no further neural network layers are built and the face recognition model is directly formed from all currently established neural network layers, so that whether to continue building neural network layers is determined according to the actual progress of model training, which saves a large amount of system memory and reduces the training cost of the face recognition model.
It should be noted that, for other corresponding descriptions of each functional unit related to the model training apparatus provided by the embodiment of the present invention, reference may be made to corresponding descriptions in fig. 1 and fig. 2, and details are not repeated here.
In an exemplary embodiment, referring to fig. 4, there is further provided a device 400 that includes a communication bus, a processor, a memory, and a communication interface, and may further include an input/output interface and a display device, wherein the functional units communicate with each other via the bus. The memory stores a computer program, and the processor is configured to execute the program stored in the memory and perform the model training method of the above embodiments.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the model training method.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented in hardware, or may be implemented by means of software plus necessary general hardware platforms. Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.), and includes several instructions for causing a computer device (may be a personal computer, a server, or a network device, etc.) to perform the methods described in various implementation scenarios of the present application.
Those skilled in the art will appreciate that the drawings are merely schematic illustrations of one preferred implementation scenario, and that the modules or flows in the drawings are not necessarily required to practice the present application.
Those skilled in the art will appreciate that modules in an apparatus in an implementation scenario may be distributed in an apparatus in an implementation scenario according to an implementation scenario description, or that corresponding changes may be located in one or more apparatuses different from the implementation scenario. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The foregoing application serial numbers are merely for description, and do not represent advantages or disadvantages of the implementation scenario.
The foregoing disclosure is merely a few specific implementations of the present application, but the present application is not limited thereto and any variations that can be considered by a person skilled in the art shall fall within the protection scope of the present application.

Claims (8)

1. A method for training a face recognition model, comprising:
acquiring the sample face parameters of a previous neural network layer of a current neural network layer, and substituting the sample face parameters into a parameter association function to obtain target sample face parameters of the current neural network layer; establishing a fixed track of the current neural network layer based on the target sample face parameters, performing tracking simulation on the fixed track, counting the number of times the fixed track has been tracked, performing iterative training on the target sample face parameters, and, when the tracking count reaches the iteration count of the current neural network layer, extracting the parameters of the last tracking simulation as a predicted value of the current neural network layer, wherein the target sample face parameters are parameters of the sample face area corresponding to the current neural network layer, and the parameter association function prescribes a functional relationship between the sample face parameters of adjacent neural network layers;
generating a first loss value of the current neural network layer according to the predicted value;
extracting a second loss value of the previous neural network layer of the current neural network layer, and taking the difference between the first loss value and the second loss value as a descent gradient;
and if the descent gradient is smaller than a loss threshold, taking the model formed by all currently established neural network layers as a face recognition model.
2. The method of claim 1, wherein the generating a first loss value of the current neural network layer according to the predicted value comprises:
substituting the predicted value into a loss value association function to generate the first loss value of the current neural network layer, wherein the loss value association function prescribes a functional relationship between the predicted value and the loss value of any neural network layer; or,
acquiring a true value, computing the absolute value of the difference between the true value and the predicted value, and taking the absolute value of the difference as the first loss value of the current neural network layer.
3. The method of claim 1, wherein after the extracting a second loss value of the previous neural network layer of the current neural network layer and taking the difference between the first loss value and the second loss value as a descent gradient, the method further comprises:
if the descent gradient is larger than the loss threshold, continuing to establish the next neural network layer after the current neural network layer, and repeatedly performing the iterative training of the sample face parameters and the comparison of the loss value with the loss threshold.
4. A model training device, comprising:
the iteration module is used for acquiring the sample face parameters of a previous neural network layer of a current neural network layer, substituting the sample face parameters into a parameter association function to obtain target sample face parameters of the current neural network layer, establishing a fixed track of the current neural network layer based on the target sample face parameters, performing tracking simulation on the fixed track, counting the number of times the fixed track has been tracked, performing iterative training on the target sample face parameters, and extracting the parameters of the last tracking simulation as a predicted value of the current neural network layer when the tracking count reaches the iteration count of the current neural network layer, wherein the target sample face parameters are parameters of the sample face area corresponding to the current neural network layer, and the parameter association function prescribes a functional relationship between the sample face parameters of adjacent neural network layers;
the first generation module is used for generating a first loss value of the current neural network layer according to the predicted value;
the extraction module is used for extracting a second loss value of the previous neural network layer of the current neural network layer, and taking the difference between the first loss value and the second loss value as a descent gradient;
and the determining module is used for taking the model formed by all currently established neural network layers as a recognition model if the descent gradient is smaller than a loss threshold.
5. The apparatus of claim 4, wherein the apparatus further comprises:
the acquisition module is used for acquiring the sample face parameters of the previous neural network layer of the current neural network layer;
and the second generation module is used for substituting the sample face parameters into a parameter association function to generate the target sample face parameters of the current neural network layer, wherein the parameter association function prescribes a functional relationship between the sample face parameters of adjacent neural network layers.
6. The apparatus of claim 4, wherein the iteration module comprises:
the establishing unit is used for establishing a fixed track of the current neural network layer based on the target sample face parameters;
the tracking unit is used for performing tracking simulation on the fixed track and counting the number of times the fixed track has been tracked;
and the extraction unit is used for extracting the parameters of the last tracking simulation as the predicted value when the tracking count reaches the iteration count of the current neural network layer.
7. An apparatus comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 3 when the computer program is executed.
8. A readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 3.
CN201910785159.1A 2019-08-23 2019-08-23 Training method, device and equipment of face recognition model and readable storage medium Active CN110610140B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910785159.1A CN110610140B (en) 2019-08-23 2019-08-23 Training method, device and equipment of face recognition model and readable storage medium
PCT/CN2019/118399 WO2021035980A1 (en) 2019-08-23 2019-11-14 Facial recognition model training method and apparatus, and device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910785159.1A CN110610140B (en) 2019-08-23 2019-08-23 Training method, device and equipment of face recognition model and readable storage medium

Publications (2)

Publication Number Publication Date
CN110610140A CN110610140A (en) 2019-12-24
CN110610140B true CN110610140B (en) 2024-01-19

Family

ID=68890932

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910785159.1A Active CN110610140B (en) 2019-08-23 2019-08-23 Training method, device and equipment of face recognition model and readable storage medium

Country Status (2)

Country Link
CN (1) CN110610140B (en)
WO (1) WO2021035980A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113605984A (en) * 2021-08-31 2021-11-05 中煤科工集团重庆研究院有限公司 Method for judging alarm threshold value for water damage microseismic
CN117557888B (en) * 2024-01-12 2024-04-12 清华大学深圳国际研究生院 Face model training method and face recognition method based on metric learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108280513A (en) * 2018-01-22 2018-07-13 百度在线网络技术(北京)有限公司 model generating method and device
CN109165725A (en) * 2018-08-10 2019-01-08 深圳前海微众银行股份有限公司 Neural network federation modeling method, equipment and storage medium based on transfer learning
CN109711358A (en) * 2018-12-28 2019-05-03 四川远鉴科技有限公司 Neural network training method, face identification method and system and storage medium
CN110073371A (en) * 2017-05-05 2019-07-30 辉达公司 For to reduce the loss scaling that precision carries out deep neural network training

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69223447T2 (en) * 1991-05-24 1998-06-04 Koninkl Philips Electronics Nv Learning method for neural network and classification system for applying this method
JP5438419B2 (en) * 2009-07-29 2014-03-12 富士フイルム株式会社 Person verification device and person verification method
CN109635927A (en) * 2018-12-05 2019-04-16 东软睿驰汽车技术(沈阳)有限公司 A kind of convolutional neural networks training method and device
CN110084216B (en) * 2019-05-06 2021-11-09 苏州科达科技股份有限公司 Face recognition model training and face recognition method, system, device and medium
CN110135582B (en) * 2019-05-09 2022-09-27 北京市商汤科技开发有限公司 Neural network training method, neural network training device, image processing method, image processing device and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110073371A (en) * 2017-05-05 2019-07-30 辉达公司 For to reduce the loss scaling that precision carries out deep neural network training
CN108280513A (en) * 2018-01-22 2018-07-13 百度在线网络技术(北京)有限公司 model generating method and device
CN109165725A (en) * 2018-08-10 2019-01-08 深圳前海微众银行股份有限公司 Neural network federation modeling method, equipment and storage medium based on transfer learning
CN109711358A (en) * 2018-12-28 2019-05-03 四川远鉴科技有限公司 Neural network training method, face identification method and system and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"优化 BP神经网络在非均衡数据分类中的应用";童威等;《长春工业大学学报》;第40卷(第3期);第263-269页 *

Also Published As

Publication number Publication date
WO2021035980A1 (en) 2021-03-04
CN110610140A (en) 2019-12-24

Similar Documents

Publication Publication Date Title
CN107358293B (en) Neural network training method and device
CN108182394B (en) Convolutional neural network training method, face recognition method and face recognition device
CN109460793B (en) Node classification method, model training method and device
TWI794157B (en) Automatic multi-threshold feature filtering method and device
US10296827B2 (en) Data category identification method and apparatus based on deep neural network
CN110276406B (en) Expression classification method, apparatus, computer device and storage medium
CN106682906B (en) Risk identification and service processing method and equipment
CN112784778B (en) Method, apparatus, device and medium for generating model and identifying age and sex
WO2018068421A1 (en) Method and device for optimizing neural network
CN110246148B (en) Multi-modal significance detection method for depth information fusion and attention learning
CN110610140B (en) Training method, device and equipment of face recognition model and readable storage medium
CN106204597B (en) A kind of video object dividing method based on from the step Weakly supervised study of formula
CN111564179B (en) Species biology classification method and system based on triple neural network
CN114490065A (en) Load prediction method, device and equipment
CN113627361B (en) Training method and device for face recognition model and computer program product
WO2015176502A1 (en) Image feature estimation method and device
CN110288026A (en) A kind of image partition method and device practised based on metric relation graphics
CN113904915A (en) Intelligent power communication fault analysis method and system based on Internet of things
CN116468967B (en) Sample image screening method and device, electronic equipment and storage medium
CN104200222B (en) Object identifying method in a kind of picture based on factor graph model
CN114694215A (en) Method, device, equipment and storage medium for training and estimating age estimation model
CN115907000A (en) Small sample learning method for optimal power flow prediction of power system
CN112561050B (en) Neural network model training method and device
CN114360032A (en) Polymorphic invariance face recognition method and system
CN111078820A (en) Edge weight prediction method based on weight symbol social network embedding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant