CN110610140A - Training method, device and equipment of face recognition model and readable storage medium - Google Patents

Training method, device and equipment of face recognition model and readable storage medium

Info

Publication number
CN110610140A
CN110610140A
Authority
CN
China
Prior art keywords
network layer
neuron network
value
sample face
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910785159.1A
Other languages
Chinese (zh)
Other versions
CN110610140B (en)
Inventor
郭玲玲
李佼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910785159.1A priority Critical patent/CN110610140B/en
Priority to PCT/CN2019/118399 priority patent/WO2021035980A1/en
Publication of CN110610140A publication Critical patent/CN110610140A/en
Application granted granted Critical
Publication of CN110610140B publication Critical patent/CN110610140B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a training method, device, equipment and readable storage medium for a face recognition model, relates to the technical field of neural networks, and can determine, according to the actual situation of face recognition model training, whether to continue establishing neuron network layers, thereby saving a large amount of system memory and reducing the training cost of the face recognition model. The method comprises the following steps: acquiring a target sample face parameter of a current neuron network layer and performing iterative training on the target sample face parameter to obtain a predicted value of the current neuron network layer, wherein the target sample face parameter is a parameter of the sample face region corresponding to the current neuron network layer; generating a first loss value of the current neuron network layer according to the predicted value; extracting a second loss value of the neuron network layer immediately preceding the current neuron network layer, and taking the difference between the first loss value and the second loss value as a descending gradient; and if the descending gradient is smaller than a loss threshold, taking the model formed by all current neuron network layers as the face recognition model.

Description

Training method, device and equipment of face recognition model and readable storage medium
Technical Field
The invention relates to the technical field of neural networks, in particular to a training method, a device and equipment of a face recognition model and a readable storage medium.
Background
In recent years, with the rapid development of neural network technology, intelligent recognition has been widely researched and applied in many fields. To recognize a face intelligently, personalized features of a sample face are generally extracted and recognition is performed by comparing these personalized features. Typically, a neural network is used to learn the personalized features and train a face recognition model, and intelligent recognition is then performed based on that model. The architecture of a neural network comprises multiple layers of neurons, and the gradient of each layer decreases gradually; because the gradients of the layers differ, the descending gradient of each layer of neurons needs to be used as a training parameter during iterative learning in order to keep the learning process of the neurons accurate.
In the related art, a neural network with a fixed depth is used for supervised learning when training a face recognition model. During training, every neuron network layer must perform back propagation, the descending gradient of each neuron network layer is calculated through back propagation, and the neuron network layers are established one by one according to the descending gradient, so that the face recognition model is obtained by training the established neuron network layers.
In the process of implementing the invention, the inventor finds that the related art has at least the following problems:
Because the depth of the neural network is fixed, even if the iterative learning of a neuron network layer has already stabilized before the depth requirement is met, back propagation and iterative learning continue to be executed until the established neuron network layers satisfy the depth requirement of the neural network. The back propagation and related work performed after the iterative learning of the neuron network layers has stabilized is therefore useless, wastes a large amount of system memory, and increases the training cost of the face recognition model.
Disclosure of Invention
In view of the above, the present invention provides a training method, an apparatus, a device and a readable storage medium for a face recognition model, and mainly aims to solve the problems that a large amount of system memory is wasted and the training cost of the face recognition model is increased.
According to a first aspect of the present invention, there is provided a training method for a face recognition model, the method comprising:
acquiring a target sample face parameter of a current neuron network layer, and performing iterative training on the target sample face parameter to obtain a predicted value of the current neuron network layer, wherein the target sample face parameter is a parameter of a sample face area corresponding to the current neuron network layer;
generating a first loss value of the current neuron network layer according to the predicted value;
extracting a second loss value of a neuron network layer previous to the current neuron network layer, and taking a difference value between the first loss value and the second loss value as a descending gradient;
and if the descending gradient is smaller than the loss threshold value, the model formed by all the current neuron network layers is used as a face recognition model.
In another embodiment, before obtaining the sample face parameters of the current neuron network layer and performing iterative training on the sample face parameters to obtain the predicted value of the current neuron network layer, the method further includes:
obtaining a sample face parameter of the neuron network layer preceding the current neuron network layer;
and substituting the sample face parameters into a parameter association function to generate target sample face parameters of the current neuron network layer, wherein the parameter association function specifies the functional relationship between the sample face parameters of the adjacent neuron network layers.
In another embodiment, the obtaining of the target sample face parameter of the current neuron network layer, and performing iterative training on the target sample face parameter to obtain the predicted value of the current neuron network layer includes:
establishing a fixed track of the current neuron network layer based on the target sample face parameters;
performing tracking simulation on the fixed track, and counting the tracking times of the tracking simulation on the fixed track;
and when the tracking times reach the iteration times of the current neuron network layer, extracting parameters in the last tracking simulation process as the predicted values.
In another embodiment, the generating a loss value of the current neuron network layer based on the predicted value comprises:
substituting the predicted value into a loss value correlation function to generate a first loss value of the current neuron network layer, wherein the loss value correlation function specifies a functional relation between the predicted value and the loss value of any one neuron network layer; or, alternatively,
and acquiring a true value, counting the absolute value of the difference between the true value and the predicted value, and taking the absolute value of the difference as a first loss value of the current neuron network layer.
In another embodiment, after the extracting a second penalty value of a neuron network layer previous to the current neuron network layer and taking a difference between the first penalty value and the second penalty value as a descending gradient, the method further includes:
and if the descending gradient is larger than the loss threshold, continuously establishing a next neuron network layer of the current neuron network layer, and repeatedly executing the iterative training of the sample face parameters and comparing the loss value with the loss threshold.
According to a second aspect of the present invention, there is provided a model training apparatus, comprising:
the iteration module is used for acquiring a target sample face parameter of a current neuron network layer, performing iteration training on the target sample face parameter to obtain a predicted value of the current neuron network layer, wherein the target sample face parameter is a parameter of a sample face area corresponding to the current neuron network layer;
a first generation module, configured to generate a first loss value of the current neuron network layer according to the predicted value;
an extracting module, configured to extract a second loss value of a neuron network layer that is previous to the current neuron network layer, and use a difference between the first loss value and the second loss value as a descending gradient;
and the determining module is used for taking the model formed by all the current neuron network layers as a face recognition model if the descending gradient is smaller than the loss threshold.
In another embodiment, the apparatus further comprises:
the acquisition module is used for acquiring sample face parameters of the neuron network layer preceding the current neuron network layer;
and the second generation module is used for substituting the sample face parameters into a parameter association function to generate target sample face parameters of the current neural network layer, and the parameter association function specifies the functional relationship between the sample face parameters of the adjacent neural network layers.
In another embodiment, the iteration module includes:
the establishing unit is used for establishing a fixed track of the current neuron network layer based on the target sample face parameters;
the tracking unit is used for performing tracking simulation on the fixed track and counting the tracking times of the tracking simulation on the fixed track;
and the extracting unit is used for extracting parameters in the last tracking simulation process as the predicted value when the tracking times reach the iteration times of the current neuron network layer.
In another embodiment, the first generating module is configured to generate a first loss value of the current neuron network layer by substituting the predicted value into a loss value correlation function, where the loss value correlation function specifies a functional relationship between the predicted value and the loss value of any one neuron network layer; or, obtaining a true value, counting an absolute value of a difference between the true value and the predicted value, and taking the absolute value of the difference as a first loss value of the current neuron network layer.
In another embodiment, the iterative module is further configured to continue to establish a next neuron network layer of the current neuron network layer if the descent gradient is greater than the loss threshold, and repeat the process of performing the iterative training of the sample face parameters and comparing the loss value with the loss threshold.
According to a third aspect of the present invention, there is provided an apparatus comprising a memory storing a computer program and a processor implementing the steps of the method of the first aspect when the processor executes the computer program.
According to a fourth aspect of the present invention, there is provided a readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the method of the first aspect described above.
By means of the above technical solution, the training method, apparatus, device and readable storage medium for a face recognition model provided by the invention obtain the target sample face parameter of the current neuron network layer, perform iterative training on the target sample face parameter to obtain the predicted value of the current neuron network layer, and generate the first loss value of the current neuron network layer according to the predicted value so as to obtain the descending gradient of the current neuron network layer. If the descending gradient is smaller than the loss threshold, no further neuron network layer is established, and all current neuron network layers directly form the face recognition model. Compared with the approach of continuing iterative learning and back propagation until the established neuron network layers meet the depth requirement of the neural network, whether the work of establishing neuron network layers continues is determined according to the actual situation of model training, which saves a large amount of system memory and reduces the training cost of the face recognition model.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
fig. 1 is a schematic flow chart illustrating a training method of a face recognition model according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart illustrating a training method of a face recognition model according to an embodiment of the present invention;
fig. 3A is a schematic structural diagram illustrating a training apparatus for a face recognition model according to an embodiment of the present invention;
fig. 3B is a schematic structural diagram illustrating a training apparatus for a face recognition model according to an embodiment of the present invention;
fig. 3C is a schematic structural diagram illustrating a training apparatus for a face recognition model according to an embodiment of the present invention;
fig. 4 shows a schematic device structure diagram of a computer apparatus according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
An embodiment of the invention provides a training method for a face recognition model. The method obtains the target sample face parameter of the current neuron network layer, performs iterative training on the target sample face parameter to obtain the predicted value of the current neuron network layer, and generates the first loss value of the current neuron network layer according to the predicted value so as to obtain the descending gradient of the current neuron network layer. If the descending gradient is smaller than the loss threshold, no further neuron network layer is established, and all current neuron network layers directly form the face recognition model. Whether the work of establishing neuron network layers continues is thus determined according to the actual situation of model training, saving a large amount of system memory and reducing the training cost of the face recognition model. As shown in fig. 1, the method comprises the following steps:
101. Acquire a target sample face parameter of the current neuron network layer, and perform iterative training on the target sample face parameter to obtain a predicted value of the current neuron network layer, where the target sample face parameter is a parameter of the sample face region corresponding to the current neuron network layer.
102. Generate a first loss value of the current neuron network layer according to the predicted value.
103. Extract a second loss value of the neuron network layer immediately preceding the current neuron network layer, and take the difference between the first loss value and the second loss value as a descending gradient.
104. If the descending gradient is smaller than the loss threshold, take the model formed by all current neuron network layers as the face recognition model.
According to the method provided by the embodiment of the invention, the target sample face parameter of the current neuron network layer is obtained and iteratively trained to obtain the predicted value of the current neuron network layer, and the first loss value of the current neuron network layer is generated according to the predicted value so as to obtain the descending gradient of the current neuron network layer. If the descending gradient is smaller than the loss threshold, no further neuron network layer is established and all current neuron network layers directly form the face recognition model, so whether the work of establishing neuron network layers continues is determined according to the actual situation of model training, saving a large amount of system memory and reducing the training cost of the face recognition model.
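The stopping rule in steps 101 to 104 can be pictured as a simple layer-by-layer loop. Below is a minimal, self-contained Python sketch of that control flow; the helper train_layer, its simulated loss curve, the initial parameter value and the threshold are illustrative assumptions, not the disclosed training computation.

```python
def train_layer(depth, sample_face_param):
    """Hypothetical stand-in for steps 201-203: iteratively train one neuron
    network layer and return its loss value. The loss here is simulated as
    decreasing and flattening with depth; it is not the patented computation."""
    return 1.0 / (depth + 1)


def grow_face_recognition_model(loss_threshold=0.02, max_layers=50):
    """Sketch of steps 101-104: build neuron network layers one by one and stop
    once the descending gradient (the drop in loss between adjacent layers)
    falls below the loss threshold."""
    layers = []
    previous_loss = None                     # second loss value (previous layer)
    sample_face_param = 1.0                  # assumed initial sample face parameter
    for depth in range(max_layers):
        current_loss = train_layer(depth, sample_face_param)    # first loss value
        layers.append(depth)
        if previous_loss is not None:
            descending_gradient = previous_loss - current_loss  # step 103
            if descending_gradient < loss_threshold:            # step 104: stop building
                break
        previous_loss = current_loss
        sample_face_param = 2 * sample_face_param  # parameter association, assuming y = 2x
    return layers


if __name__ == "__main__":
    model_layers = grow_face_recognition_model()
    print(f"stopped after {len(model_layers)} neuron network layers")
```

Run as-is, the loop stops once consecutive losses differ by less than 0.02, mirroring the choice between steps 205 and 206 described below.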
An embodiment of the invention provides a training method for a face recognition model. The method obtains the target sample face parameter of the current neuron network layer, performs iterative training on the target sample face parameter to obtain the predicted value of the current neuron network layer, and generates the first loss value of the current neuron network layer according to the predicted value so as to obtain the descending gradient of the current neuron network layer. If the descending gradient is smaller than the loss threshold, no further neuron network layer is established, and all current neuron network layers directly form the face recognition model. Whether the work of establishing neuron network layers continues is thus determined according to the actual situation of model training, saving a large amount of system memory and reducing the training cost of the face recognition model. As shown in fig. 2, the method comprises the following steps:
201. Acquire the sample face parameters of the neuron network layer preceding the current neuron network layer, and substitute the sample face parameters into a parameter association function to generate the target sample face parameters of the current neuron network layer.
The inventor realized that, during model training, the sample face region corresponding to each neuron network layer is different, so the sample face parameters that each layer relies on during training are also different. As a result, a large number of sample face parameters are spread throughout the neural network, and these parameters have to be handled throughout the whole training process, which makes the burden of model training heavy. To reduce the number of sample face parameters involved in training based on the neural network and lighten the burden of training the face recognition model, the sample face parameters of adjacent neuron network layers are related through a function, so that two or more sample face parameters can be represented by a single sample face parameter; that is, only one sample face parameter needs to be set, and the sample face parameters of every neuron network layer can be calculated from the parameter association function between the layers. Because different modeling processes involve different sample face parameters, the parameter association function can be set by a worker according to the actual situation, and the function set by the worker is obtained directly when the sample face parameters of each neuron network layer are calculated.
The parameter association function specifies the functional relationship between the sample face parameters of adjacent neuron network layers. When the target sample face parameter of the current neuron network layer is generated, the preceding neuron network layer associated with the current layer by the parameter association function has already been established, so the sample face parameter of the preceding neuron network layer is obtained and substituted into the parameter association function, and the target sample face parameter of the current neuron network layer is thereby calculated based on the parameter association function.
For example, suppose the parameter association function between the target sample face parameter of the current neuron network layer and the sample face parameter of the previous neuron network layer is y = 2x, where x denotes the sample face parameter of the previous neuron network layer and y denotes the target sample face parameter of the current neuron network layer. If the sample face parameter of the previous neuron network layer is W1, the target sample face parameter of the current neuron network layer can be determined from the parameter association function to be 2W1.
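As a concrete illustration of the y = 2x case above, the following Python snippet derives the current layer's target sample face parameter from the previous layer's parameter; the linear form, the factor 2 and the numeric value chosen for W1 are only the example's assumptions.

```python
def parameter_association(prev_sample_face_param, scale=2.0):
    """Parameter association function for the y = 2x example: map the sample
    face parameter of the previous neuron network layer (x) to the target
    sample face parameter of the current layer (y)."""
    return scale * prev_sample_face_param


w1 = 0.35                                   # assumed sample face parameter of the previous layer
target_param = parameter_association(w1)    # 2 * W1 = 0.70 for the current layer
```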
202. Acquire the target sample face parameters of the current neuron network layer, and perform iterative training on the target sample face parameters to obtain the predicted value of the current neuron network layer.
In the embodiment of the invention, after the target sample face parameters of the current neuron network layer are determined, iterative learning can be performed on the target sample face parameters to obtain the predicted value of the current neuron network layer. Specifically, when iterative learning is performed, a fixed track of the current neuron network layer is first established based on the target sample face parameters; subsequently, tracking simulation is performed on the fixed track, and the number of tracking simulations of the fixed track is counted. When the tracking count reaches the iteration count of the current neuron network layer, the parameters of the track in the last iteration are extracted and taken as the predicted value; that is, the parameters in the last tracking simulation are extracted as the predicted value of the current neuron network layer.
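Read literally, step 202 amounts to running the tracking simulation a fixed number of times and taking the parameters of the last run as the predicted value. The sketch below assumes exactly that reading; the update rule inside the loop is a placeholder, since the text does not spell out the tracking computation.

```python
def iterative_training(target_sample_face_param, iteration_count=10, step=0.1):
    """Minimal sketch of step 202: repeat the tracking simulation on the fixed
    track, count the runs, and return the parameters of the last run as the
    predicted value. The per-run update below is an assumed placeholder."""
    value = target_sample_face_param          # fixed track built from the target parameter
    for tracking_count in range(iteration_count):
        value = value - step * value          # one tracking simulation (placeholder update)
    return value                              # parameters of the last simulation = predicted value
```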
203. Generate a first loss value of the current neuron network layer according to the predicted value.
In the embodiment of the invention, because iterative learning is not an exact reproduction, complete consistency between the predicted value and the true value cannot be guaranteed: an error exists between the predicted value and the true value generated while training the model, and the model can only be trained by reducing this error through continued training. Therefore, the error between the predicted value and the true value, i.e. the loss value, needs to be determined from the obtained predicted value of the current neuron network layer, and whether another neuron network layer needs to be established for further iteration is determined according to the loss value. The true value is fixed and already exists; for example, when iterative learning is performed on a picture, the picture serves as the basis of the iterative learning and is called the true value, while the parameters of each region of the picture generated during iteration are called predicted values. When determining the loss value, the following two methods can be adopted:
the first method comprises the following steps: and acquiring a true value, counting the absolute value of the difference between the true value and the predicted value, and taking the absolute value of the difference as a first loss value of the current neuron network layer.
The second method: for a neural network that has been trained many times, based on accumulated experience, a loss correlation function for recording and predicting loss values can be generated from the different loss values of each layer; that is, the loss correlation function specifies the functional relationship between the predicted value and the loss value of any neuron network layer, so that the loss value can be computed from the predicted value. Specifically, the predicted value is substituted into the loss value correlation function to generate the first loss value of the current neuron network layer.
When the loss correlation function is set, several formula templates can be preset. The predicted values and loss values of existing neuron network layers are determined, and on the basis of the value pairs formed by corresponding predicted values and loss values, the value pairs are substituted into the formula templates. The formula template that the substituted value pairs satisfy is taken as the specified formula template; each coefficient of the specified formula template is calculated, every unknown of the template is determined, and the unknowns and coefficients are combined to generate the loss correlation function. For example, if the value pairs are (1, 3) and (2, 5) and the determined specified formula template is y = ax + b, substituting the value pairs into the template gives a = 2 and b = 1, so the resulting formula is y = 2x + 1, and this formula is taken as the loss correlation function. In practical applications the loss correlation function may also take other forms; the invention does not limit the specific form of the loss correlation function.
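The worked example above can be reproduced with a short fit: given the value pairs (1, 3) and (2, 5) and the template y = ax + b, the coefficients come out as a = 2 and b = 1. The two-point solver below is a sketch of that single case only; choosing among several preset templates, as the text describes, is not shown.

```python
def fit_loss_correlation(value_pairs):
    """Fit the template y = a*x + b to two (predicted value, loss value) pairs,
    matching the worked example in the text."""
    (x1, y1), (x2, y2) = value_pairs
    a = (y2 - y1) / (x2 - x1)       # slope: a = 2 for the pairs (1, 3) and (2, 5)
    b = y1 - a * x1                 # intercept: b = 1, giving y = 2x + 1
    return lambda predicted: a * predicted + b


loss_correlation = fit_loss_correlation([(1, 3), (2, 5)])
first_loss_value = loss_correlation(4)   # 9 under the fitted function y = 2x + 1
```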
204. Extract a second loss value of the neuron network layer immediately preceding the current neuron network layer, and take the difference between the first loss value and the second loss value as the descending gradient. If the descending gradient is smaller than the loss threshold, execute the following step 205; if the descending gradient is greater than the loss threshold, execute the following step 206.
In the embodiment of the invention, every neuron network layer has a loss value, and the loss value keeps decreasing and gradually stabilizes as more neuron network layers are established. Therefore, after the loss value of the current neuron network layer is determined, the difference between the loss values of adjacent neuron network layers can be calculated as the descending gradient, a loss threshold for judging whether to continue establishing neuron network layers is set, and whether to continue establishing neuron network layers is subsequently determined by comparing the descending gradient with the loss threshold.
When calculating the descending gradient, the second loss value of the neuron network layer preceding the current neuron network layer is obtained first; because the second loss value is recorded in that preceding neuron network layer, it only needs to be read from the preceding layer. Then, the difference between the first loss value and the second loss value is calculated and used as the descending gradient, so that whether to continue establishing neuron network layers is determined according to the descending gradient.
If the descending gradient is smaller than the loss threshold, the loss value has already approached a plateau and would hardly change even if further neuron network layers were established, so the iterative work does not need to continue; that is, the following step 205 is executed. If the descending gradient is greater than the loss threshold, the loss value has not yet approached a steady state and is still changing considerably, so neuron network layers need to continue to be established; that is, the following step 206 is executed.
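The decision between steps 205 and 206 reduces to a single comparison. The predicate below states it explicitly; the subtraction order assumes the loss decreases from the previous layer to the current one, and the sample numbers are made up for illustration.

```python
def should_stop_building(first_loss_value, second_loss_value, loss_threshold):
    """Step 204 as a predicate: the descending gradient is the difference between
    the second loss value (previous layer) and the first loss value (current
    layer); layer building stops when it falls below the loss threshold."""
    descending_gradient = second_loss_value - first_loss_value
    return descending_gradient < loss_threshold


# Example: losses 0.41 -> 0.40 with threshold 0.05 give a gradient of 0.01, so stop.
print(should_stop_building(first_loss_value=0.40, second_loss_value=0.41, loss_threshold=0.05))
```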
205. If the descending gradient is smaller than the loss threshold, take the model formed by all current neuron network layers as the face recognition model.
In the embodiment of the present invention, if the descending gradient is smaller than the loss threshold, the loss value has approached a steady state and would hardly change if further neuron network layers were established, so the establishment of the next neuron network layer is stopped and the model composed of all current neuron network layers is used as the face recognition model.
206. If the descending gradient is greater than the loss threshold, continue to establish the next neuron network layer after the current neuron network layer, and repeat the iterative training of the sample face parameters and the comparison of the loss value with the loss threshold.
In the embodiment of the present invention, if the descending gradient is greater than or equal to the loss threshold, the loss value has not yet approached a steady state and is still changing considerably, so neuron network layers need to continue to be established; that is, the sample face parameters of the next neuron network layer are determined according to the processes of step 201 to step 204, and the iterative training of the sample face parameters and the comparison of the loss value with the loss threshold are repeated until the descending gradient is smaller than the loss threshold.
According to the method provided by the embodiment of the invention, the target sample face parameter of the current neuron network layer is obtained and iteratively trained to obtain the predicted value of the current neuron network layer, and the first loss value of the current neuron network layer is generated according to the predicted value so as to obtain the descending gradient of the current neuron network layer. If the descending gradient is smaller than the loss threshold, no further neuron network layer is established and all current neuron network layers directly form the face recognition model, so whether the work of establishing neuron network layers continues is determined according to the actual situation of model training, saving a large amount of system memory and reducing the training cost of the face recognition model.
Further, as a specific implementation of the method shown in fig. 1, an embodiment of the present invention provides a model training apparatus, as shown in fig. 3A, the apparatus includes: an iteration module 301, a first generation module 302, an extraction module 303 and a determination module 304.
The iteration module 301 is configured to obtain a target sample face parameter of a current neural network layer, perform iterative training on the target sample face parameter, and obtain a predicted value of the current neural network layer, where the target sample face parameter is a parameter of a sample face area corresponding to the current neural network layer;
the first generating module 302 is configured to generate a first loss value of the current neural network layer according to the predicted value;
the extracting module 303 is configured to extract a second loss value of a neuron network layer that is previous to the current neuron network layer, and use a difference between the first loss value and the second loss value as a descending gradient;
the determining module 304 is configured to, if the descent gradient is smaller than the loss threshold, use the model composed of all current neuron network layers as a face recognition model.
In a specific application scenario, as shown in fig. 3B, the apparatus further includes: an acquisition module 305 and a second generation module 306.
The obtaining module 305 is configured to obtain the sample face parameters of the neuron network layer preceding the current neuron network layer;
the second generating module 306 is configured to bring the sample face parameters into a parameter association function, and generate target sample face parameters of the current neural network layer, where the parameter association function specifies a functional relationship between sample face parameters of adjacent neural network layers.
In a specific application scenario, as shown in fig. 3C, the iteration module 301 includes: a setup unit 3011, a trace unit 3012 and an extraction unit 3013.
The establishing unit 3011 is configured to establish a fixed track of the current neural network layer based on the target sample face parameters;
the tracking unit 3012 is configured to perform tracking simulation on the fixed track, and count the tracking times of performing tracking simulation on the fixed track;
the extracting unit 3013 is configured to, when the tracking frequency reaches the iteration frequency of the current neuron network layer, extract a parameter in a last tracking simulation process as the predicted value.
In a specific application scenario, the first generating module 302 is configured to substitute the predicted value into a loss value correlation function and generate a first loss value of the current neuron network layer, where the loss value correlation function specifies a functional relationship between the predicted value and the loss value of any one neuron network layer; or, obtain a true value, compute the absolute value of the difference between the true value and the predicted value, and take this absolute value as the first loss value of the current neuron network layer.
In a specific application scenario, the iteration module 301 is further configured to, if the descending gradient is greater than the loss threshold, continue to establish a next neuron network layer of the current neuron network layer, and repeatedly perform the iterative training of the sample face parameters and the process of comparing the loss value with the loss threshold.
According to the device provided by the embodiment of the invention, the target sample face parameter of the current neuron network layer is obtained and iteratively trained to obtain the predicted value of the current neuron network layer, and the first loss value of the current neuron network layer is generated according to the predicted value so as to obtain the descending gradient of the current neuron network layer. If the descending gradient is smaller than the loss threshold, no further neuron network layer is established and all current neuron network layers directly form the face recognition model, so whether the work of establishing neuron network layers continues is determined according to the actual situation of model training, saving a large amount of system memory and reducing the training cost of the face recognition model.
It should be noted that other corresponding descriptions of the functional units related to the model training apparatus provided in the embodiment of the present invention may refer to the corresponding descriptions in fig. 1 and fig. 2, and are not described herein again.
In an exemplary embodiment, referring to fig. 4, there is further provided a device, where the device 400 includes a communication bus, a processor, a memory, and a communication interface, and may further include an input/output interface and a display device, where the functional units may communicate with each other through the bus. The memory stores a computer program, and the processor is used for executing the program stored in the memory and executing the model training method in the embodiment.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the model training method.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented by hardware, and also by software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.), and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the implementation scenarios of the present application.
Those skilled in the art will appreciate that the figures are merely schematic representations of one preferred implementation scenario and that the blocks or flow diagrams in the figures are not necessarily required to practice the present application.
Those skilled in the art will appreciate that the modules in the devices in the implementation scenario may be distributed in the devices in the implementation scenario according to the description of the implementation scenario, or may be located in one or more devices different from the present implementation scenario with corresponding changes. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The above application serial numbers are for description purposes only and do not represent the superiority or inferiority of the implementation scenarios.
The above disclosure is only a few specific implementation scenarios of the present application, but the present application is not limited thereto, and any variations that can be made by those skilled in the art are intended to fall within the scope of the present application.

Claims (10)

1. A training method of a face recognition model is characterized by comprising the following steps:
acquiring a target sample face parameter of a current neuron network layer, and performing iterative training on the target sample face parameter to obtain a predicted value of the current neuron network layer, wherein the target sample face parameter is a parameter of a sample face area corresponding to the current neuron network layer;
generating a first loss value of the current neuron network layer according to the predicted value;
extracting a second loss value of a neuron network layer previous to the current neuron network layer, and taking a difference value between the first loss value and the second loss value as a descending gradient;
and if the descending gradient is smaller than the loss threshold value, the model formed by all the current neuron network layers is used as a face recognition model.
2. The method of claim 1, wherein before obtaining sample face parameters of a current neural network layer and performing iterative training on the sample face parameters to obtain a prediction of the current neural network layer, the method further comprises:
obtaining a sample face parameter of a neuron network layer above the current neuron network layer;
and substituting the sample face parameters into a parameter association function to generate target sample face parameters of the current neuron network layer, wherein the parameter association function specifies the functional relationship between the sample face parameters of the adjacent neuron network layers.
3. The method according to claim 1, wherein the obtaining of the target sample face parameters of the current neural network layer and the iterative training of the target sample face parameters to obtain the predicted values of the current neural network layer comprises:
establishing a fixed track of the current neuron network layer based on the target sample face parameters;
performing tracking simulation on the fixed track, and counting the tracking times of the tracking simulation on the fixed track;
and when the tracking times reach the iteration times of the current neuron network layer, extracting parameters in the last tracking simulation process as the predicted values.
4. The method of claim 1, wherein generating the loss value for the current neural network layer based on the predicted value comprises:
substituting the predicted value into a loss value correlation function to generate a first loss value of the current neuron network layer, wherein the loss value correlation function specifies a functional relation between the predicted value and the loss value of any one neuron network layer; or, alternatively,
and acquiring a true value, counting the absolute value of the difference between the true value and the predicted value, and taking the absolute value of the difference as a first loss value of the current neuron network layer.
5. The method of claim 1, wherein after extracting a second penalty value for a neuron network layer immediately above the current neuron network layer and taking a difference between the first penalty value and the second penalty value as a descending gradient, the method further comprises:
and if the descending gradient is larger than the loss threshold, continuously establishing a next neuron network layer of the current neuron network layer, and repeatedly executing the iterative training of the sample face parameters and comparing the loss value with the loss threshold.
6. A model training apparatus, comprising:
the iteration module is used for acquiring a target sample face parameter of a current neuron network layer, performing iteration training on the target sample face parameter to obtain a predicted value of the current neuron network layer, wherein the target sample face parameter is a parameter of a sample face area corresponding to the current neuron network layer;
a first generation module, configured to generate a first loss value of the current neuron network layer according to the predicted value;
an extracting module, configured to extract a second loss value of a neuron network layer that is previous to the current neuron network layer, and use a difference between the first loss value and the second loss value as a descending gradient;
and the determining module is used for taking the model formed by all the current neuron network layers as the identification model if the descending gradient is smaller than the loss threshold.
7. The apparatus of claim 6, further comprising:
the acquisition module is used for acquiring sample face parameters of a neuron network layer above the current neuron network layer;
and the second generation module is used for substituting the sample face parameters into a parameter association function to generate target sample face parameters of the current neural network layer, and the parameter association function specifies the functional relationship between the sample face parameters of the adjacent neural network layers.
8. The apparatus of claim 6, wherein the iteration module comprises:
the establishing unit is used for establishing a fixed track of the current neuron network layer based on the target sample face parameters;
the tracking unit is used for performing tracking simulation on the fixed track and counting the tracking times of the tracking simulation on the fixed track;
and the extracting unit is used for extracting parameters in the last tracking simulation process as the predicted value when the tracking times reach the iteration times of the current neuron network layer.
9. An apparatus comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 5 when executing the computer program.
10. A readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
CN201910785159.1A 2019-08-23 2019-08-23 Training method, device and equipment of face recognition model and readable storage medium Active CN110610140B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910785159.1A CN110610140B (en) 2019-08-23 2019-08-23 Training method, device and equipment of face recognition model and readable storage medium
PCT/CN2019/118399 WO2021035980A1 (en) 2019-08-23 2019-11-14 Facial recognition model training method and apparatus, and device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910785159.1A CN110610140B (en) 2019-08-23 2019-08-23 Training method, device and equipment of face recognition model and readable storage medium

Publications (2)

Publication Number Publication Date
CN110610140A true CN110610140A (en) 2019-12-24
CN110610140B CN110610140B (en) 2024-01-19

Family

ID=68890932

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910785159.1A Active CN110610140B (en) 2019-08-23 2019-08-23 Training method, device and equipment of face recognition model and readable storage medium

Country Status (2)

Country Link
CN (1) CN110610140B (en)
WO (1) WO2021035980A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111401277A (en) * 2020-03-20 2020-07-10 深圳前海微众银行股份有限公司 Face recognition model updating method, device, equipment and medium
CN113605984A (en) * 2021-08-31 2021-11-05 中煤科工集团重庆研究院有限公司 Method for judging alarm threshold value for water damage microseismic

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117557888B (en) * 2024-01-12 2024-04-12 清华大学深圳国际研究生院 Face model training method and face recognition method based on metric learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108280513A (en) * 2018-01-22 2018-07-13 百度在线网络技术(北京)有限公司 model generating method and device
CN109165725A (en) * 2018-08-10 2019-01-08 深圳前海微众银行股份有限公司 Neural network federation modeling method, equipment and storage medium based on transfer learning
CN109711358A (en) * 2018-12-28 2019-05-03 四川远鉴科技有限公司 Neural network training method, face identification method and system and storage medium
CN110073371A (en) * 2017-05-05 2019-07-30 辉达公司 For to reduce the loss scaling that precision carries out deep neural network training

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0514986B1 (en) * 1991-05-24 1997-12-10 Laboratoires D'electronique Philips S.A.S. Learning method for a neural network and classification device using this method
JP5438419B2 (en) * 2009-07-29 2014-03-12 富士フイルム株式会社 Person verification device and person verification method
CN109635927A (en) * 2018-12-05 2019-04-16 东软睿驰汽车技术(沈阳)有限公司 A kind of convolutional neural networks training method and device
CN110084216B (en) * 2019-05-06 2021-11-09 苏州科达科技股份有限公司 Face recognition model training and face recognition method, system, device and medium
CN110135582B (en) * 2019-05-09 2022-09-27 北京市商汤科技开发有限公司 Neural network training method, neural network training device, image processing method, image processing device and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110073371A (en) * 2017-05-05 2019-07-30 辉达公司 For to reduce the loss scaling that precision carries out deep neural network training
CN108280513A (en) * 2018-01-22 2018-07-13 百度在线网络技术(北京)有限公司 model generating method and device
CN109165725A (en) * 2018-08-10 2019-01-08 深圳前海微众银行股份有限公司 Neural network federation modeling method, equipment and storage medium based on transfer learning
CN109711358A (en) * 2018-12-28 2019-05-03 四川远鉴科技有限公司 Neural network training method, face identification method and system and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TONG Wei et al., "Application of Optimized BP Neural Network in Imbalanced Data Classification", Journal of Changchun University of Technology, vol. 40, no. 3, pages 263-269 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111401277A (en) * 2020-03-20 2020-07-10 深圳前海微众银行股份有限公司 Face recognition model updating method, device, equipment and medium
CN113605984A (en) * 2021-08-31 2021-11-05 中煤科工集团重庆研究院有限公司 Method for judging alarm threshold value for water damage microseismic

Also Published As

Publication number Publication date
WO2021035980A1 (en) 2021-03-04
CN110610140B (en) 2024-01-19

Similar Documents

Publication Publication Date Title
KR102170105B1 (en) Method and apparatus for generating neural network structure, electronic device, storage medium
CN110610140A (en) Training method, device and equipment of face recognition model and readable storage medium
CN105224984A (en) A kind of data category recognition methods based on deep neural network and device
US20230367934A1 (en) Method and apparatus for constructing vehicle dynamics model and method and apparatus for predicting vehicle state information
CN116166405B (en) Neural network task scheduling strategy determination method and device in heterogeneous scene
CN113705628B (en) Determination method and device of pre-training model, electronic equipment and storage medium
CN111008631B (en) Image association method and device, storage medium and electronic device
CN114490065A (en) Load prediction method, device and equipment
CN106204597A (en) A kind of based on from the VS dividing method walking the Weakly supervised study of formula
CN113449878B (en) Data distributed incremental learning method, system, equipment and storage medium
CN113904915A (en) Intelligent power communication fault analysis method and system based on Internet of things
CN116341634B (en) Training method and device for neural structure search model and electronic equipment
CN114913330B (en) Point cloud component segmentation method and device, electronic equipment and storage medium
CN113961765B (en) Searching method, searching device, searching equipment and searching medium based on neural network model
CN115577787A (en) Quantum amplitude estimation method, device, equipment and storage medium
CN115759209A (en) Neural network model quantification method and device, electronic equipment and medium
CN113158134B (en) Method, device and storage medium for constructing non-invasive load identification model
CN115544307A (en) Directed graph data feature extraction and expression method and system based on incidence matrix
CN112242959B (en) Micro-service current-limiting control method, device, equipment and computer storage medium
CN114021697A (en) End cloud framework neural network generation method and system based on reinforcement learning
CN115688873A (en) Graph data processing method, device and computer program product
CN112825143A (en) Deep convolutional neural network compression method, device, storage medium and equipment
CN117435308B (en) Modelica model simulation method and system based on parallel computing algorithm
CN116738128B (en) Method and device for solving time-containing partial differential equation by utilizing quantum circuit
Süleyman et al. Deep Learning Based Classification of Time Series of Chaotic Systems over Graphic Images

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant