WO2021139309A1 - Method, apparatus and device for training a recognition model, and storage medium - Google Patents

Method, apparatus and device for training a recognition model, and storage medium

Info

Publication number
WO2021139309A1
Authority
WO
WIPO (PCT)
Prior art keywords
loss function
classification
feature
recognition model
face recognition
Prior art date
Application number
PCT/CN2020/122376
Other languages
English (en)
Chinese (zh)
Inventor
张国辉
徐玲玲
宋晨
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021139309A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Definitions

  • This application relates to the field of artificial intelligence neural networks, and in particular to a training method, device, equipment and storage medium of a face recognition model.
  • Face recognition is a popular area within image recognition. In the related technology, deep learning is used to train a neural network that can perform face recognition, that is, a face recognition model.
  • As for recognition accuracy, a face recognition model trained on data from a single application scenario is limited by that scenario, so its recognition accuracy is low. The universality of the face recognition model is therefore optimized in order to improve its recognition accuracy.
  • At present, the universality of face recognition models is generally optimized by fine-tuning (finetune) or by mixing multiple training sets.
  • The inventor realizes that fine-tuning after model training retains few features of the original training set, which leads to a poor generalization effect in the final face recognition model; the method of mixing multiple training sets suffers from overlapping data that is difficult to clean and is introduced into training as dirty data, which degrades the training effect of the model. The recognition accuracy of existing face recognition models is therefore low.
  • The main purpose of this application is to solve the problem of low recognition accuracy of existing face recognition models.
  • The first aspect of this application provides a method for training a face recognition model, including:
  • extracting facial features from multiple preprocessed training data sets through a backbone network in a preset face recognition model to obtain multiple feature sets, the face recognition model including the backbone network and multiple classification networks;
  • iteratively updating the backbone network according to the value of the target loss function until the value of the target loss function converges, so as to obtain an updated face recognition model.
  • A second aspect of the present application provides a training device for a face recognition model. The training device includes a memory, a processor, and a training program of the face recognition model that is stored in the memory and runnable on the processor. When the processor executes the training program of the face recognition model, the following steps are implemented:
  • extracting facial features from multiple preprocessed training data sets through a backbone network in a preset face recognition model to obtain multiple feature sets, the face recognition model including the backbone network and multiple classification networks;
  • iteratively updating the backbone network according to the value of the target loss function until the value of the target loss function converges, so as to obtain an updated face recognition model.
  • A third aspect of the present application provides a computer-readable storage medium that stores computer instructions. When the computer instructions are executed on a computer, the computer executes the following steps:
  • extracting facial features from multiple preprocessed training data sets through a backbone network in a preset face recognition model to obtain multiple feature sets, the face recognition model including the backbone network and multiple classification networks;
  • iteratively updating the backbone network according to the value of the target loss function until the value of the target loss function converges, so as to obtain an updated face recognition model.
  • The fourth aspect of the present application provides a training apparatus for a face recognition model, including:
  • an obtaining module, configured to obtain multiple preprocessed training data sets, where the multiple training data sets are face training data sets corresponding to multiple application scenarios;
  • a feature extraction module, configured to extract the facial features of the multiple training data sets through the backbone network in a preset face recognition model to obtain multiple feature sets, the face recognition model including the backbone network and multiple classification networks;
  • a classification module, configured to classify the multiple feature sets through the multiple classification networks to obtain multiple classification data sets, where one classification network corresponds to one feature set;
  • a first calculation module, configured to calculate the feature vector loss function value of each feature set to obtain multiple feature vector loss function values, and to calculate the classification loss function value of each classification data set to obtain multiple classification loss function values;
  • a second calculation module, configured to calculate the target loss function value of the face recognition model according to the multiple feature vector loss function values and the multiple classification loss function values;
  • an iterative update module, configured to iteratively update the backbone network according to the target loss function value until the target loss function value converges, so as to obtain an updated face recognition model.
  • In the technical solution provided by this application, multiple preprocessed training data sets, as well as the backbone network and multiple classification networks of a preset face recognition model, are obtained, where the multiple training data sets are face training data sets corresponding to multiple application scenarios.
  • The target loss function value obtained from the multiple feature vector loss function values and the multiple classification loss function values is used to update the backbone network of the face recognition model, which gives the face recognition model better universality and thereby improves the recognition accuracy of the existing face recognition model.
  • FIG. 1 is a schematic diagram of an embodiment of a method for training a face recognition model in an embodiment of the application
  • FIG. 2 is a schematic diagram of another embodiment of a method for training a face recognition model in an embodiment of the application
  • FIG. 3 is a schematic diagram of an embodiment of a training device for a face recognition model in an embodiment of the application
  • FIG. 4 is a schematic diagram of another embodiment of a training device for a face recognition model in an embodiment of the application
  • FIG. 5 is a schematic diagram of an embodiment of a training device for a face recognition model in an embodiment of the application.
  • The embodiments of the present application provide a method, apparatus, device, and storage medium for training a face recognition model, which solve the problem of low recognition accuracy of existing face recognition models.
  • An embodiment of the training method of the face recognition model in the embodiment of the present application includes:
  • The execution subject of this application may be a training device for a face recognition model, or a terminal or a server, which is not specifically limited here.
  • The embodiment of the present application takes the server as the execution subject as an example for description.
  • A training data set corresponds to an application scenario, for example: identification scenes and natural scenes.
  • The training data sets can be face data in different dimensions, including open source data and private data, such as: face data of natural scenes, face data of Asians, attendance data, personal identification data, and competition data.
  • The server can extract multiple preprocessed training data sets from a preset database, or it can obtain face training data sets in different dimensions corresponding to multiple application scenarios from multiple channels and preprocess them to obtain the multiple preprocessed training data sets.
  • The face recognition model includes a backbone network and multiple classification networks.
  • That is, the preset face recognition model includes a backbone network and multiple classification networks, where the output of the backbone network is the input of the multiple classification networks.
  • The data processed by the backbone network is classified through the multiple classification networks, so as to realize face recognition training on the training data sets.
  • The backbone network can be a single convolutional neural network or a comprehensive framework of multiple convolutional neural networks; for example, it can be the deep residual learning framework ResNet, the target detection network framework ET-YOLOv3, or a comprehensive framework combining the deep residual learning framework ResNet with the target detection network framework ET-YOLOv3.
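A minimal sketch of this shared-backbone, multi-head layout follows; the ResNet-18 backbone, the 512-d feature size, and the head sizes are illustrative assumptions, not values taken from the patent:

```python
import torch.nn as nn
from torchvision.models import resnet18

class MultiHeadFaceModel(nn.Module):
    """One shared backbone whose output feeds several classification networks."""

    def __init__(self, num_classes_per_dataset):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()  # expose the 512-d feature vector
        self.backbone = backbone
        # One classification network per training data set.
        self.heads = nn.ModuleList(
            nn.Linear(512, n) for n in num_classes_per_dataset
        )

    def forward(self, x, dataset_idx):
        features = self.backbone(x)                 # shared facial features
        logits = self.heads[dataset_idx](features)  # per-data-set classification
        return features, logits

# Example: three training data sets with different identity counts.
model = MultiHeadFaceModel([1000, 500, 2000])
```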
  • Specifically, through the backbone network of the face recognition model, the server can perform face frame recognition, frame area division, face key point detection, and face feature vector extraction on each training data set to obtain the feature set corresponding to each training data set (i.e., multiple feature sets).
  • The convolutional network layers in the backbone network use small convolution kernels; the small kernels retain more features, reduce the amount of computation, and improve the efficiency of facial feature extraction.
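A quick, hedged illustration of that small-kernel trade-off (the channel count is assumed): two stacked 3x3 convolutions cover the same 5x5 receptive field as a single 5x5 convolution while using fewer parameters:

```python
import torch.nn as nn

c = 64  # assumed channel count
stacked_3x3 = nn.Sequential(nn.Conv2d(c, c, 3, padding=1),
                            nn.Conv2d(c, c, 3, padding=1))
single_5x5 = nn.Conv2d(c, c, 5, padding=2)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(stacked_3x3), count(single_5x5))  # 73856 vs 102464 parameters
```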
  • Then the server obtains the labels on the training data corresponding to each feature set, calls the multiple classification networks, and classifies the multiple feature sets through the classification networks and the labels to obtain multiple classification data sets.
  • One classification network classifies one feature set.
  • For example, if the multiple classification networks are A1, B1, C1, and D1, and the multiple feature sets are A2, B2, C2, and D2, then A1 classifies A2, B1 classifies B2, C1 classifies C2, and D1 classifies D2.
  • Each classification network can adopt the same network structure or a different network structure; for example, the classification networks A1, B1, C1, and D1 may all be linear classifiers, or A1, B1, C1, and D1 may all be convolutional networks.
  • The server calculates the first center vectors and the second center vector, calculates the distance value between each first center vector and the second center vector, and uses the distance value as the feature vector loss function value corresponding to each feature set, thereby obtaining multiple feature vector loss function values. The first center vector is the center vector corresponding to each feature set, or the center vector corresponding to each training data in each feature set; the second center vector can be the center vector corresponding to all feature sets, or the center vector corresponding to all the training data in all feature sets.
  • The server can obtain the number of training data corresponding to each feature set, calculate the sum of the feature vectors of all the training data in the feature set, and calculate the average of that sum according to the number of training data; the average is the first center vector corresponding to the feature set.
  • The server can also calculate the second center vector through a preset center vector formula.
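A minimal sketch of this computation, assuming the centers are plain means of the feature vectors and the distance is Euclidean (both are reasonable readings of the text, not values it fixes):

```python
import torch

def feature_vector_losses(feature_sets):
    """One loss value per feature set: distance of its center to the global center."""
    all_features = torch.cat(feature_sets, dim=0)
    second_center = all_features.mean(dim=0)   # center over all feature sets
    losses = []
    for fs in feature_sets:
        first_center = fs.mean(dim=0)          # center of this feature set
        losses.append(torch.norm(first_center - second_center))
    return losses

# Example with three feature sets of 512-d feature vectors.
sets = [torch.randn(160, 512) for _ in range(3)]
print([round(l.item(), 4) for l in feature_vector_losses(sets)])
```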
  • At the same time, the server calculates the classification loss function value of each classification data set through a preset cross-entropy loss function, thereby obtaining multiple classification loss function values.
  • The cross-entropy loss function can be a multi-class cross-entropy loss function, whose derivative is simpler; this makes convergence faster and speeds up the update of the corresponding weight matrix.
  • After the server obtains the multiple feature vector loss function values and the multiple classification loss function values, it obtains the number of data sets among the multiple training data sets, and calculates, according to the number of data sets, the average feature vector loss function value of the multiple feature vector loss function values and the average classification loss function value of the multiple classification loss function values. The sum of the average feature vector loss function value and the average classification loss function value is used as the target loss function value of the face recognition model; alternatively, a weighted sum of the average feature vector loss function value and the average classification loss function value is used as the target loss function value.
  • After each classification network calculates its classification loss function value, the corresponding classification network can be updated in reverse (by backpropagation) according to that classification loss function value.
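A small sketch of the combination step described above; the weights in the weighted variant are illustrative assumptions:

```python
import torch

def target_loss(fv_losses, cls_losses, fv_weight=1.0, cls_weight=1.0):
    n = len(fv_losses)               # number of training data sets
    avg_fv = sum(fv_losses) / n      # average feature vector loss value
    avg_cls = sum(cls_losses) / n    # average classification loss value
    return fv_weight * avg_fv + cls_weight * avg_cls

loss = target_loss([torch.tensor(0.8), torch.tensor(0.6)],
                   [torch.tensor(2.1), torch.tensor(1.7)])
print(loss.item())  # 2.6 with equal unit weights
```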
  • The server iteratively updates the network structure and/or the weight values of the backbone network according to the target loss function value and a preset number of iterations, until the target loss function value converges (that is, the training accuracy of the face recognition model meets a preset condition), and obtains the updated face recognition model.
  • The network structure of the backbone network can be updated by adding or deleting network layers of the backbone network, by adding other network frameworks, or by modifying the size and stride of the convolution kernels of the backbone network.
  • The server can also optimize the face recognition model in combination with optimization algorithms.
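A sketch of one possible training loop tying the previous sketches together. The `model`, `feature_vector_losses`, and `target_loss` names come from the earlier sketches; `batch_iterator`, `datasets`, `batch_size` (a version is sketched further below), the SGD optimizer, and the convergence test (a small change in the target loss) are all assumptions, since the text only requires iterating until the target loss function value converges:

```python
import torch
import torch.nn.functional as F

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
prev, eps = None, 1e-4

for batches in batch_iterator(datasets, batch_size):  # one target batch per data set
    feature_sets, cls_losses = [], []
    for idx, (images, labels) in enumerate(batches):
        features, logits = model(images, idx)         # backbone + per-set head
        feature_sets.append(features)
        cls_losses.append(F.cross_entropy(logits, labels))
    fv_losses = feature_vector_losses(feature_sets)
    loss = target_loss(fv_losses, cls_losses)
    optimizer.zero_grad()
    loss.backward()                                   # updates backbone and heads
    optimizer.step()
    if prev is not None and abs(prev - loss.item()) < eps:
        break                                         # target loss has converged
    prev = loss.item()
```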
  • In the embodiment of the present application, the adverse effect that overlapping data, introduced as dirty data, has on model training is avoided.
  • The backbone network of the face recognition model is updated based on the target loss function value obtained from the multiple feature vector loss function values and the multiple classification loss function values, which gives the face recognition model better universality and thereby improves the recognition accuracy of the existing face recognition model.
  • Another embodiment of the training method of the face recognition model in the embodiment of the present application includes:
  • obtaining initial training data sets corresponding to multiple application scenarios, where the initial training data sets include open source data and private data.
  • The server extracts initial training data sets (open source data) in different dimensions corresponding to multiple different application scenarios from open source databases, crawls initial training data sets (open source data) corresponding to multiple different application scenarios from network platforms, and extracts initial training data sets (private data) corresponding to multiple different application scenarios from an alliance chain or a private database.
  • The server performs missing value detection, missing value filling, and missing value cleanup on each initial training data set according to a preset missing value ratio to obtain initial training data sets after missing value processing. It then merges and de-duplicates the initial training data sets after missing value processing to obtain initial training data sets after merging and de-duplication, and determines whether the merged and de-duplicated initial training data sets contain training data that does not meet preset legality determination rules: if such data exists, the corresponding training data is deleted; if not, the merged and de-duplicated initial training data sets are determined to be candidate training data sets. The candidate training data sets are then labeled to obtain the multiple preprocessed training data sets.
  • The content of labeling may include at least one of classification labeling, frame labeling, area labeling, and point labeling. Classification labeling includes, for example: age-adult, gender-female, race-Asian, hair-long hair, facial expression-smile, face-worn item-glasses. Frame labeling, for example, marks the frame position of a face in an image; area labeling marks the area position of a face in an image; and point labeling marks face key points.
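A hedged pandas sketch of the cleaning steps named above; the column name, the missing-value ratio threshold, and the legality rule are illustrative assumptions, not values from the patent:

```python
import pandas as pd

MISSING_RATIO = 0.5  # assumed preset missing value ratio

def preprocess(initial_sets):
    frames = []
    for df in initial_sets:
        # Missing value detection and cleanup: drop columns whose missing
        # ratio exceeds the preset threshold, then fill the remaining gaps.
        df = df.loc[:, df.isna().mean() <= MISSING_RATIO]
        df = df.ffill().dropna()
        frames.append(df)
    # Merging and de-duplication across the initial training data sets.
    merged = pd.concat(frames, ignore_index=True).drop_duplicates()
    # Legality check (illustrative rule): keep rows with a non-empty image path.
    return merged[merged["image_path"].astype(str).str.len() > 0]
```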
  • The face recognition model includes a backbone network and multiple classification networks.
  • The server obtains the number of data sets among the multiple training data sets and calculates the average data volume of the training data sets according to the number of data sets; the training data corresponding to the average data volume is taken from each training data set as the target batch data.
  • For example, if the average data volume is 160, the target batch data of each training data set consists of 160 training data.
  • Face image area detection is performed on the 160 training data (that is, the target batch data) to obtain face areas; face key point detection is performed on the face areas to obtain face key point information; and face feature vector extraction is performed on the face key point information to obtain multiple feature sets.
  • Since the target batch data obtained from different training data sets differ in quantity, a training data set whose target batch data is exhausted first during processing is randomly looped, until the target batch data of the training data set with the largest number of training data has been processed.
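A sketch of this batching scheme; the helper names and the use of shuffled index pools are assumptions:

```python
import random

def average_batch_size(datasets):
    # Average data volume across the training data sets (e.g., 160).
    return sum(len(d) for d in datasets) // len(datasets)

def batch_iterator(datasets, batch_size):
    pools = [list(range(len(d))) for d in datasets]
    steps = max(len(d) for d in datasets) // batch_size  # until the largest set ends
    for _ in range(steps):
        batches = []
        for d, pool in zip(datasets, pools):
            if len(pool) < batch_size:        # smaller set exhausted:
                refill = list(range(len(d)))
                random.shuffle(refill)        # loop it again in random order
                pool.extend(refill)
            idx = [pool.pop(0) for _ in range(batch_size)]
            batches.append([d[i] for i in idx])
        yield batches
```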
  • Then the server obtains the labels on the training data corresponding to each feature set, calls the multiple classification networks, and classifies the multiple feature sets through the classification networks and the labels to obtain multiple classification data sets.
  • One classification network classifies one feature set.
  • Each classification network can use the same network structure or different network structures. Using the same structure reduces network complexity, while using different structures to process different types of training data helps improve the efficiency of classification and the universality of the face recognition model.
  • The server calculates the first feature center vector corresponding to each feature set and the second feature center vector corresponding to all the feature sets; it calculates the distance value between the first feature center vector corresponding to each feature set and the second feature center vector, and determines the distance value as the feature vector loss function value of that feature set, obtaining multiple feature vector loss function values. It then obtains the preset labels corresponding to each training data in each training data set, and calculates the classification loss function value of each classification data set according to the preset labels and a preset cross-entropy loss function, obtaining multiple classification loss function values.
  • The server obtains the first feature vectors of each feature set, the number of first training data corresponding to each feature set, and the first data number of the target batch data, and calculates the first feature center vector corresponding to each feature set through a preset first update center vector formula. The first update center vector formula for calculating the first feature center vector is a running average:

$$vc_p = \frac{n_p \cdot vc_{p-1} + \sum_{i=1}^{vn_p} v_i}{n_p + vn_p}$$

  • where p indicates the p-th iteration over the feature set, vc_p is the current first feature center vector, vc_{p-1} is the first feature center vector of the previous iteration, vn_p is the first data number of the current iteration, n_p is the first number of training data, and v_i are the current first feature vectors. The first feature center vector before the first iteration is 0.
  • Similarly, the server obtains the second feature vectors of all feature sets, the number of second training data corresponding to all feature sets, and the second data number of the target batch data corresponding to all feature sets, and calculates the second feature center vector through a preset second update center vector formula:

$$v_q = \frac{n_q \cdot v_{q-1} + \sum_{j=1}^{vk_q} v_j}{n_q + vk_q}$$

  • where q indicates the q-th iteration, v_q is the current second feature center vector, v_{q-1} is the second feature center vector of the previous iteration, vk_q is the second data number of the current iteration, n_q is the second number of training data, and v_j are the second feature vectors of all current feature sets. The second feature center vector v_q before the first iteration is zero.
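A minimal sketch of this incremental (running-average) center update, consistent with the variable definitions above; the 512-d feature size and batch shape are assumptions:

```python
import torch

def update_center(center, count, batch):
    """Running-average update: previous center, accumulated count, new feature vectors."""
    total = count + batch.shape[0]
    center = (count * center + batch.sum(dim=0)) / total
    return center, total

center, count = torch.zeros(512), 0  # the center vector is zero before the first iteration
for batch in (torch.randn(160, 512) for _ in range(3)):
    center, count = update_center(center, count, batch)
```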
  • The server obtains the dimension of the first feature vector of each feature set, and calculates the feature vector loss function value according to the dimension of the first feature vector, the first feature center vector of each feature set, and the second feature center vector. The feature vector loss function value is the distance between the two center vectors:

$$loss_p = \sqrt{\sum_{k=1}^{m} \left(vc_{p,k} - v_{q,k}\right)^2}$$

  • where p indicates the p-th feature set, m is the dimension of the first feature vector of each feature set, vc_p is the first feature center vector, and v_q is the second feature center vector.
  • The server counts the number of preset labels in each classification data set, and obtains the feature vectors of the classification data corresponding to the preset labels in each classification data set; according to the preset cross-entropy loss function, the number of labels, and the feature vectors, it calculates the classification loss function value of each classification data set to obtain multiple classification loss function values.
  • The cross-entropy loss function is as follows:

$$loss_{c_y} = -\sum_{i=1}^{n_y} \mathrm{label}_i \cdot \log\big(\operatorname{softmax}(v_i)\big)$$

  • where y indicates the y-th training data set, c_y is the classification data set corresponding to the y-th training data set, n_y is the number of labels, label_i is the i-th preset category label, and v_i is the feature vector.
  • The server classifies the features according to the preset labels on each training data to obtain the multiple classification data sets, so as to obtain the number of preset labels in each classification data set, the preset label of each category, and the feature vectors generated by the classification data corresponding to the preset labels; through the preset cross-entropy loss function, combined with the number of labels and the feature vectors, the classification loss function value of each classification data set is calculated, thereby obtaining multiple classification loss function values.
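A minimal sketch of this multi-class cross-entropy computation in the softmax form of the formula above; the class count and batch size are illustrative:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(160, 1000)          # feature vectors v_i mapped to class scores
labels = torch.randint(0, 1000, (160,))  # preset category labels
# cross_entropy applies log-softmax internally, matching -sum(label * log softmax(v)).
loss = F.cross_entropy(logits, labels)
print(loss.item())
```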
  • The server calculates the average of the multiple feature vector loss function values according to the number of data sets to obtain the average feature vector loss function value, calculates the average of the multiple classification loss function values according to the number of data sets to obtain the average classification loss function value, and then calculates the sum of the average feature vector loss function value and the average classification loss function value to obtain the target loss function value of the face recognition model.
  • If the target loss function value does not converge, the target batch data is updated to obtain updated target batch data, and the network structure of the backbone network is updated to obtain an updated backbone network. Through the updated backbone network and the multiple classification networks, facial feature extraction and classification are performed in sequence on the updated target batch data to obtain multiple target classification data sets. According to the multiple target classification data sets, the updated target loss function value is calculated, and whether the updated target loss function value converges is determined. If the updated target loss function value does not converge, the updated backbone network is iteratively updated according to the updated target loss function value until the updated target loss function value converges, and the final updated face recognition model is obtained.
  • If the server determines that the target loss function value has converged, it uses the current face recognition model as the final face recognition model.
  • If the server determines that the updated target loss function value has converged, it uses the currently updated face recognition model as the final updated face recognition model.
  • Facial feature extraction and classification are performed on the updated target batch data in sequence to obtain the multiple target classification data sets; the operation method is similar to the above steps 102, 103, 203, and 204. According to the multiple target classification data sets, the operation method of calculating the updated target loss function value is similar to that of the above steps 104, 105, 205, and 206, and will not be repeated here.
  • The data quantity of each updated target batch data differs and changes dynamically; it is equal to the sum of the target batch data of the previous iteration and the current target batch data.
  • The training method of the face recognition model in the embodiment of the application is described above, and the training device of the face recognition model in the embodiment of the application is described below. Referring to FIG. 3, one embodiment of the training device of the face recognition model in the embodiment of the application includes:
  • the obtaining module 301, configured to obtain multiple preprocessed training data sets, the multiple training data sets being face training data sets corresponding to multiple application scenarios;
  • the feature extraction module 302, configured to extract facial features from the multiple training data sets through the backbone network in the preset face recognition model to obtain multiple feature sets, the face recognition model including the backbone network and multiple classification networks;
  • the classification module 303, configured to classify the multiple feature sets through the multiple classification networks to obtain multiple classification data sets, where one classification network corresponds to one feature set;
  • the first calculation module 304, configured to calculate the feature vector loss function value of each feature set to obtain multiple feature vector loss function values, and to calculate the classification loss function value of each classification data set to obtain multiple classification loss function values;
  • the second calculation module 305, configured to calculate the target loss function value of the face recognition model according to the multiple feature vector loss function values and the multiple classification loss function values;
  • the iterative update module 306, configured to iteratively update the backbone network according to the value of the target loss function until the value of the target loss function converges, so as to obtain an updated face recognition model.
  • Each module in the above training device for the face recognition model corresponds to each step in the embodiment of the above training method for the face recognition model; their functions and implementation processes will not be repeated here.
  • In the embodiment of the present application, the adverse effect that overlapping data, introduced as dirty data, has on model training is avoided.
  • The backbone network of the face recognition model is updated based on the target loss function value obtained from the multiple feature vector loss function values and the multiple classification loss function values, which gives the face recognition model better universality and thereby improves the recognition accuracy of the existing face recognition model.
  • Another embodiment of the training device for the face recognition model in the embodiment of the present application includes:
  • the obtaining module 301, configured to obtain multiple preprocessed training data sets, the multiple training data sets being face training data sets corresponding to multiple application scenarios;
  • the obtaining module 301 specifically includes:
  • the obtaining unit 3011, configured to obtain initial training data sets corresponding to multiple application scenarios, the initial training data sets including open source data and private data;
  • the preprocessing unit 3012, configured to sequentially perform data cleaning and label labeling on each initial training data set to obtain multiple preprocessed training data sets;
  • the feature extraction module 302, configured to extract facial features from the multiple training data sets through the backbone network in the preset face recognition model to obtain multiple feature sets, the face recognition model including the backbone network and multiple classification networks;
  • the classification module 303, configured to classify the multiple feature sets through the multiple classification networks to obtain multiple classification data sets, where one classification network corresponds to one feature set;
  • the first calculation module 304, configured to calculate the feature vector loss function value of each feature set to obtain multiple feature vector loss function values, and to calculate the classification loss function value of each classification data set to obtain multiple classification loss function values;
  • the second calculation module 305, configured to calculate the target loss function value of the face recognition model according to the multiple feature vector loss function values and the multiple classification loss function values;
  • the iterative update module 306, configured to iteratively update the backbone network according to the value of the target loss function until the value of the target loss function converges, so as to obtain an updated face recognition model.
  • Optionally, the feature extraction module 302 may also be specifically configured to:
  • sequentially perform face image area detection, face key point detection, and face feature vector extraction on the target batch data to obtain multiple feature sets.
  • Optionally, the second calculation module 305 may also be specifically configured to: calculate the average of the multiple feature vector loss function values according to the number of data sets to obtain an average feature vector loss function value, calculate the average of the multiple classification loss function values according to the number of data sets to obtain an average classification loss function value, and calculate the sum of the two averages to obtain the target loss function value of the face recognition model.
  • Optionally, the first calculation module 304 includes:
  • the first calculation unit 3041, configured to calculate the first feature center vector corresponding to each feature set and the second feature center vector corresponding to the multiple feature sets;
  • the second calculation unit 3042, configured to calculate the distance value between the first feature center vector corresponding to each feature set and the second feature center vector, and determine the distance value as the feature vector loss function value of each feature set, to obtain multiple feature vector loss function values;
  • the third calculation unit 3043, configured to obtain the preset label corresponding to each training data in each training data set, and calculate the classification loss function value of each classification data set according to the preset labels and the preset cross-entropy loss function, to obtain multiple classification loss function values.
  • Optionally, the third calculation unit 3043 may also be specifically configured to:
  • count the number of preset labels in each classification data set, obtain the feature vectors of the classification data corresponding to the preset labels, and calculate the classification loss function value of each classification data set according to the preset cross-entropy loss function, the number of labels, and the feature vectors, to obtain multiple classification loss function values.
  • The cross-entropy loss function is as follows:

$$loss_{c_y} = -\sum_{i=1}^{n_y} \mathrm{label}_i \cdot \log\big(\operatorname{softmax}(v_i)\big)$$

  • where y indicates the y-th training data set, c_y is the classification data set corresponding to the y-th training data set, n_y is the number of labels, label_i is the i-th preset category label, and v_i is the feature vector.
  • Optionally, the iterative update module 306 may also be specifically configured to:
  • perform facial feature extraction and classification on the updated target batch data in sequence to obtain multiple target classification data sets;
  • iteratively update the updated backbone network according to the updated target loss function value until the updated target loss function value converges, to obtain the final updated face recognition model.
  • Each module and each unit in the above training device for the face recognition model corresponds to each step in the above embodiment of the training method for the face recognition model; their functions and implementation processes will not be repeated here.
  • FIG. 5 is a schematic structural diagram of a training device for a face recognition model provided by an embodiment of the present application.
  • The training device 500 for the face recognition model may differ considerably in configuration or performance, and may include one or more processors (central processing units, CPU) 510 (for example, one or more processors), a memory 520, and one or more storage media 530 (for example, one or more mass storage devices) storing application programs 533 or data 532.
  • The memory 520 and the storage medium 530 may be short-term storage or persistent storage.
  • The program stored in the storage medium 530 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations for the training device 500 of the face recognition model.
  • The processor 510 may be configured to communicate with the storage medium 530, and to execute, on the training device 500 of the face recognition model, the series of instruction operations in the storage medium 530.
  • The training device 500 for the face recognition model may also include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input/output interfaces 560, and/or one or more operating systems 531, for example, Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc.
  • Those skilled in the art can understand that the structure shown in FIG. 5 does not constitute a limitation on the training device for the face recognition model, which may include more or fewer components than shown in the figure, a combination of certain components, or a different arrangement of components.
  • This application also provides a training device for a face recognition model, which includes a memory and a processor. The memory stores instructions that, when executed by the processor, cause the processor to execute the steps of the training method of the face recognition model in the foregoing embodiments.
  • This application also provides a computer-readable storage medium. The computer-readable storage medium may be a non-volatile computer-readable storage medium or a volatile computer-readable storage medium.
  • The computer-readable storage medium stores instructions that, when run on a computer, cause the computer to execute the steps of the method for training the face recognition model.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of the present application, in essence, the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include: USB flash disks, mobile hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.

Abstract

The present invention relates to the field of artificial intelligence. Disclosed are a method, apparatus and device for training a face recognition model, and a storage medium, which solve the problem of the relatively low recognition accuracy of existing face recognition models. The method for training a face recognition model comprises the steps of: acquiring multiple training data sets, as well as a backbone network and multiple classification networks of a preset face recognition model; performing facial feature extraction on the multiple training data sets by means of the backbone network to obtain multiple feature sets; classifying the multiple feature sets by means of the multiple classification networks to obtain multiple classification data sets; calculating multiple feature vector loss function values of the multiple feature sets and multiple classification loss function values of the multiple classification data sets; calculating a target loss function value of the face recognition model according to the multiple feature vector loss function values and the multiple classification loss function values; and iteratively updating the backbone network according to the target loss function value until the target loss function value converges, so as to obtain an updated face recognition model.
PCT/CN2020/122376 2020-07-31 2020-10-21 Method, apparatus and device for training a recognition model, and storage medium WO2021139309A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010760772.0A CN111898547B (zh) 2020-07-31 2020-07-31 Training method, apparatus, device and storage medium for face recognition model
CN202010760772.0 2020-07-31

Publications (1)

Publication Number Publication Date
WO2021139309A1 (fr)

Family

ID=73184137

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/122376 WO2021139309A1 (fr) 2020-07-31 2020-10-21 Method, apparatus and device for training a recognition model, and storage medium

Country Status (2)

Country Link
CN (1) CN111898547B (fr)
WO (1) WO2021139309A1 (fr)


Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112561062B (zh) * 2020-12-18 2023-10-31 北京百度网讯科技有限公司 Neural network training method and apparatus, computer device, and storage medium
CN112257689A (zh) * 2020-12-18 2021-01-22 北京京东尚科信息技术有限公司 Training and recognition method for a face recognition model, storage medium, and related devices
CN113221662B (zh) * 2021-04-14 2022-09-27 上海芯翌智能科技有限公司 Training method and apparatus for a face recognition model, storage medium, and terminal
CN113239876B (zh) * 2021-06-01 2023-06-02 平安科技(深圳)有限公司 Training method for a large-angle face recognition model
CN115797732B (zh) * 2023-02-15 2023-06-09 杭州实在智能科技有限公司 Image retrieval model training method and system for open-category scenarios


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108647583B (zh) * 2018-04-19 2022-02-22 浙江大承机器人科技有限公司 Face recognition algorithm training method based on multi-objective learning
WO2019223582A1 (fr) * 2018-05-24 2019-11-28 Beijing Didi Infinity Technology And Development Co., Ltd. Target detection method and system
CN109934197B (zh) * 2019-03-21 2023-07-07 深圳力维智联技术有限公司 Training method and apparatus for a face recognition model, and computer-readable storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109583322A (zh) * 2018-11-09 2019-04-05 长沙小钴科技有限公司 Face recognition deep network training method and system
CN109815801A (zh) * 2018-12-18 2019-05-28 北京英索科技发展有限公司 Face recognition method and apparatus based on deep learning
CN110929802A (zh) * 2019-12-03 2020-03-27 北京迈格威科技有限公司 Information-entropy-based fine-grained recognition model training and image recognition method and apparatus
CN111104874A (zh) * 2019-12-03 2020-05-05 北京金山云网络技术有限公司 Face age prediction method, model training method and apparatus, and electronic device
CN111260032A (zh) * 2020-01-14 2020-06-09 北京迈格威科技有限公司 Neural network training method, image processing method, and apparatus

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113591637A (zh) * 2021-07-20 2021-11-02 北京爱笔科技有限公司 Training method and apparatus for an alignment model, computer device, and storage medium
CN113505724A (zh) * 2021-07-23 2021-10-15 上海应用技术大学 Traffic sign recognition model training method and system based on YOLOv4
CN113505724B (zh) * 2021-07-23 2024-04-19 上海应用技术大学 Traffic sign recognition model training method and system based on YOLOv4
CN114119959A (zh) * 2021-11-09 2022-03-01 盛视科技股份有限公司 Vision-based trash bin overflow detection method and device
WO2023118768A1 (fr) * 2021-12-24 2023-06-29 Unissey Device and method for processing human face image data
FR3131419A1 (fr) * 2021-12-24 2023-06-30 Unissey Device and method for processing human face image data
CN114255354A (zh) * 2021-12-31 2022-03-29 智慧眼科技股份有限公司 Face recognition model training method, face recognition method, apparatus, and related devices
CN114093011A (zh) * 2022-01-12 2022-02-25 北京新氧科技有限公司 Hair classification method, apparatus, device, and storage medium
CN114093011B (zh) * 2022-01-12 2022-05-06 北京新氧科技有限公司 Hair classification method, apparatus, device, and storage medium
CN114519757A (zh) * 2022-02-17 2022-05-20 巨人移动技术有限公司 Face-pinching processing method
WO2023193474A1 (fr) * 2022-04-08 2023-10-12 马上消费金融股份有限公司 Information processing method and apparatus, computer device, and storage medium
CN114764899A (zh) * 2022-04-12 2022-07-19 华南理工大学 Transformer-based next interactive object prediction method from a first-person perspective
CN114764899B (zh) * 2022-04-12 2024-03-22 华南理工大学 Transformer-based next interactive object prediction method from a first-person perspective
CN115130539A (zh) * 2022-04-21 2022-09-30 腾讯科技(深圳)有限公司 Classification model training and data classification method, apparatus, and computer device
CN115641637A (zh) * 2022-11-11 2023-01-24 杭州海量信息技术有限公司 Masked face recognition method and system
CN116110100B (zh) * 2023-01-14 2023-11-14 深圳市大数据研究院 Face recognition method, apparatus, computer device, and storage medium
CN116110100A (zh) * 2023-01-14 2023-05-12 深圳市大数据研究院 Face recognition method, apparatus, computer device, and storage medium
CN116994309B (zh) * 2023-05-06 2024-04-09 浙江大学 Fairness-aware face recognition model pruning method
CN116994309A (zh) * 2023-05-06 2023-11-03 浙江大学 Fairness-aware face recognition model pruning method
CN116452922B (zh) * 2023-06-09 2023-09-22 深圳前海环融联易信息科技服务有限公司 Model training method and apparatus, computer device, and readable storage medium
CN116452922A (zh) * 2023-06-09 2023-07-18 深圳前海环融联易信息科技服务有限公司 Model training method and apparatus, computer device, and readable storage medium
CN116453201B (zh) * 2023-06-19 2023-09-01 南昌大学 Face recognition method and system based on adjacent margin loss
CN116453201A (zh) * 2023-06-19 2023-07-18 南昌大学 Face recognition method and system based on adjacent margin loss
CN116484005B (zh) * 2023-06-25 2023-09-08 北京中关村科金技术有限公司 Classification model construction method, apparatus, and storage medium
CN116484005A (zh) * 2023-06-25 2023-07-25 北京中关村科金技术有限公司 Classification model construction method, apparatus, and storage medium
CN117435906A (zh) * 2023-12-18 2024-01-23 湖南行必达网联科技有限公司 Cross-entropy-based configuration feature selection method for new energy vehicles
CN117435906B (zh) * 2023-12-18 2024-03-12 湖南行必达网联科技有限公司 Cross-entropy-based configuration feature selection method for new energy vehicles

Also Published As

Publication number Publication date
CN111898547B (zh) 2024-04-16
CN111898547A (zh) 2020-11-06

Similar Documents

Publication Publication Date Title
WO2021139309A1 Method, apparatus and device for training a recognition model, and storage medium
WO2021077984A1 Object recognition method and apparatus, electronic device, and readable storage medium
CN111079639B Method, apparatus, device, and storage medium for constructing a garbage image classification model
CN107229904B Target detection and recognition method based on deep learning
Fu et al. Fast crowd density estimation with convolutional neural networks
CN109241317B Pedestrian hash retrieval method based on metric loss in a deep learning network
CN109344731B Lightweight face recognition method based on a neural network
WO2017166586A1 Image identification method and system based on a convolutional neural network, and electronic device
KR101183391B1 Image comparison by metric embedding
Tu et al. A novel graph-based k-means for nonlinear manifold clustering and representative selection
WO2018052587A1 Method and system for cell image segmentation using multi-stage convolutional neural networks
CN110929848B Training and tracking method based on a multi-challenge perception learning model
WO2021027193A1 Face clustering method and apparatus, device, and storage medium
WO2016062044A1 Model parameter training method, device, and system
CN110751027B Pedestrian re-identification method based on deep multi-instance learning
CN109711366A Pedestrian re-identification method based on a group information loss function
CN113361334A Pedestrian re-identification method and system based on key point optimization and multi-hop attention graph convolution
WO2023134084A1 Multi-label identification method and apparatus, electronic device, and storage medium
WO2021036309A1 Image recognition method and apparatus, computer apparatus, and storage medium
CN110516533A Pedestrian re-identification method based on deep metric learning
Liu et al. Learning 2d-3d correspondences to solve the blind perspective-n-point problem
CN115457332A Image multi-label classification method based on a graph convolutional neural network and class activation mapping
WO2023040195A1 Object recognition method and apparatus, network training method and apparatus, device, medium, and product
Miao et al. Evolving convolutional neural networks by symbiotic organisms search algorithm for image classification
CN113822419B Self-supervised graph representation learning method based on structural information

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20911410

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20911410

Country of ref document: EP

Kind code of ref document: A1