WO2021139309A1 - Method, apparatus and device for training facial recognition model, and storage medium - Google Patents


Info

Publication number
WO2021139309A1
Authority
WO
WIPO (PCT)
Prior art keywords
loss function
classification
feature
recognition model
face recognition
Prior art date
Application number
PCT/CN2020/122376
Other languages
French (fr)
Chinese (zh)
Inventor
张国辉
徐玲玲
宋晨
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2021139309A1 publication Critical patent/WO2021139309A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/172 Classification, e.g. identification

Definitions

  • This application relates to the field of artificial intelligence neural networks, and in particular to a training method, device, equipment and storage medium of a face recognition model.
  • Face recognition is a hot topic in the field of image recognition.
  • Deep learning is used to train a neural network that can perform face recognition, that is, a face recognition model.
  • Regarding recognition accuracy: a face recognition model trained using training data from a single application scene is limited by that scene, so its recognition accuracy is low. The universality of the face recognition model therefore needs to be optimized to improve its recognition accuracy.
  • The universality of face recognition models is generally optimized by fine-tuning (finetune) or by mixing multiple training sets.
  • The inventor realized that fine-tuning after model training retains few features of the original training set, which leads to poor generalization of the final face recognition model, while mixing multiple training sets leaves overlapping data that is difficult to clean and enters training as dirty data, degrading the training effect. The recognition accuracy of existing face recognition models is therefore low.
  • the main purpose of this application is to solve the problem of low recognition accuracy of existing face recognition models.
  • the first aspect of this application provides a method for training a face recognition model, including:
  • wherein the face recognition model includes the backbone network and multiple classification networks;
  • the backbone network is iteratively updated according to the value of the target loss function until the value of the target loss function converges to obtain an updated face recognition model.
  • a second aspect of the present application provides a training device for a face recognition model.
  • The training device for a face recognition model includes a memory, a processor, and a face recognition model training program stored in the memory and runnable on the processor;
  • when the processor executes the training program of the face recognition model, the following steps are implemented:
  • wherein the face recognition model includes the backbone network and multiple classification networks;
  • the backbone network is iteratively updated according to the value of the target loss function until the value of the target loss function converges to obtain an updated face recognition model.
  • a third aspect of the present application provides a computer-readable storage medium that stores computer instructions, and when the computer instructions are executed on a computer, the computer executes the following steps:
  • wherein the face recognition model includes the backbone network and multiple classification networks;
  • the backbone network is iteratively updated according to the value of the target loss function until the value of the target loss function converges to obtain an updated face recognition model.
  • the fourth aspect of the present application provides a training device for a face recognition model, including:
  • An obtaining module configured to obtain a plurality of preprocessed training data sets, where the plurality of training data sets are face training data sets corresponding to a plurality of application scenarios;
  • the feature extraction module is used to extract the facial features of the multiple training data sets through the backbone network in the preset face recognition model to obtain multiple feature sets.
  • the face recognition model includes the backbone network and Multiple classification networks;
  • the classification module is configured to classify the multiple feature sets through the multiple classification networks to obtain multiple classification data sets, where one classification network corresponds to one feature set;
  • the first calculation module is used to calculate the feature vector loss function value of each feature set to obtain multiple feature vector loss function values, and calculate the classification loss function value of each classification data set to obtain multiple classification loss function values;
  • the second calculation module is configured to calculate the target loss function value of the face recognition model according to the multiple feature vector loss function values and the multiple classification loss function values;
  • An iterative update module is used to iteratively update the backbone network according to the target loss function value until the target loss function value converges to obtain an updated face recognition model.
  • Multiple preprocessed training data sets, as well as the backbone network and multiple classification networks of the preset face recognition model, are obtained, where the multiple training data sets are face training data sets corresponding to multiple application scenarios.
  • The target loss function value, obtained from the multiple feature vector loss function values and the multiple classification loss function values, is used to update the backbone network of the face recognition model, which gives the model better universality and thereby improves the recognition accuracy of existing face recognition models.
  • FIG. 1 is a schematic diagram of an embodiment of a method for training a face recognition model in an embodiment of the application
  • FIG. 2 is a schematic diagram of another embodiment of a method for training a face recognition model in an embodiment of the application
  • FIG. 3 is a schematic diagram of an embodiment of a training device for a face recognition model in an embodiment of the application
  • FIG. 4 is a schematic diagram of another embodiment of a training device for a face recognition model in an embodiment of the application
  • Fig. 5 is a schematic diagram of an embodiment of a training device for a face recognition model in an embodiment of the application.
  • the embodiments of the present application provide a method, device, equipment, and storage medium for training a face recognition model, which solve the problem of low recognition accuracy of the existing face recognition model.
  • An embodiment of the training method of the face recognition model in the embodiment of the present application includes:
  • The execution subject of this application may be a training device for a face recognition model, or a terminal or server corresponding to the logistics headquarters; this is not specifically limited here.
  • The embodiments of the present application take the server corresponding to the logistics headquarters as the execution subject as an example.
  • Each training data set corresponds to one application scenario, such as an identification-photo scene or a natural scene.
  • the training data set can be face data, open source data and private data in different dimensions, such as: face data of natural scenes, face data of Asians, attendance data, personal identification data, and competition data.
  • The server can extract multiple preprocessed training data sets from a preset database, or obtain face training data sets of different dimensions corresponding to multiple application scenarios from multiple channels and preprocess them to obtain the multiple preprocessed training data sets.
  • the face recognition model includes a backbone network and multiple classification networks.
  • the output of the backbone network is the input of multiple classification networks.
  • The data processed by the backbone network is classified through the multiple classification networks to realize face recognition training on the training data sets.
  • the backbone network can be a single convolutional neural network or a comprehensive framework of multiple convolutional neural networks.
  • The backbone network can be a deep residual learning framework (ResNet), a target detection network framework (ET-YOLOv3), or a comprehensive framework combining ResNet with ET-YOLOv3.
  • The server can perform face frame recognition, frame area division, face key point detection, and face feature vector extraction on each training data set through the backbone network of the face recognition model to obtain the feature set corresponding to each training data set (i.e., multiple feature sets).
  • The convolutional layers in the backbone network use small convolution kernels, which retain more features, reduce the amount of computation, and improve the efficiency of facial feature extraction.
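The shared-backbone, multiple-heads layout described above can be sketched in a few lines. This is a minimal illustrative model, not the patent's actual networks: the single weight matrix stands in for a ResNet-style backbone, and all dimensions, scene names, and class counts are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

class Backbone:
    """Shared feature extractor; one weight matrix stands in for the
    convolutional stack (a real model would use e.g. ResNet)."""
    def __init__(self, in_dim, feat_dim):
        self.W = rng.normal(0.0, 0.1, (in_dim, feat_dim))
    def extract(self, x):
        return np.maximum(x @ self.W, 0.0)  # ReLU features

class ClassificationHead:
    """One head per training data set, as in the patent's design."""
    def __init__(self, feat_dim, n_classes):
        self.W = rng.normal(0.0, 0.1, (feat_dim, n_classes))
    def classify(self, feats):
        logits = feats @ self.W
        e = np.exp(logits - logits.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)  # softmax probabilities

# One backbone shared by several scene-specific classification heads.
backbone = Backbone(in_dim=64, feat_dim=16)
heads = {name: ClassificationHead(16, n) for name, n in
         [("natural", 100), ("id_photo", 50), ("attendance", 30)]}

batch = rng.normal(size=(8, 64))          # 8 fake face crops
feats = backbone.extract(batch)           # shared features
probs = heads["natural"].classify(feats)  # scene-specific classification
```

The key design point is that the backbone's output is the input of every head, so all scenes shape the same feature extractor.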
  • the server obtains the label on the training data corresponding to each feature set, calls multiple classification networks, and classifies the multiple feature sets through the classification network and the labels to obtain multiple classification data sets.
  • a classification network classifies a feature set.
  • multiple classification networks are A1, B1, C1, and D1, and multiple feature sets are A2, B2, C2, and D2.
  • A1 classifies A2, B1 classifies B2, C1 classifies C2, and D1 classifies D2.
  • Each classification network can adopt the same network structure or different network structure.
  • For example, the classification networks A1, B1, C1, and D1 may all be linear classifiers, or may all be convolutional classifiers.
  • The server calculates the first center vectors and the second center vector, calculates the distance value between each first center vector and the second center vector, and uses that distance value as the feature vector loss function value corresponding to each feature set, thereby obtaining multiple feature vector loss function values. Here, the first center vector is the center vector corresponding to each feature set (or to each training datum in each feature set), and the second center vector can be the center vector corresponding to all feature sets, or the center vector corresponding to all the training data across the feature sets.
  • The server can obtain the number of training data corresponding to each feature set, sum the feature vectors of all that training data, and divide the sum by the number of training data; the resulting average is the first center vector corresponding to that feature set.
  • The server can also calculate the second center vector through a preset center vector formula.
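The center-vector computation above admits a direct sketch. Assuming the second center vector is taken over all training data pooled across feature sets (one of the two options the text allows), with Euclidean distance used as the per-set loss value:

```python
import numpy as np

def first_center_vector(feature_set):
    """Center of one feature set: the sum of its feature vectors
    divided by the number of training data, as described above."""
    fs = np.asarray(feature_set)
    return fs.sum(axis=0) / len(fs)

def second_center_vector(feature_sets):
    """Global center: here taken over all training data pooled across
    the feature sets (one of the two options the text allows)."""
    pooled = np.concatenate([np.asarray(fs) for fs in feature_sets])
    return pooled.sum(axis=0) / len(pooled)

def center_distance(feature_set, global_center):
    """Distance between a set's center and the global center; the text
    uses this distance directly as the set's feature vector loss."""
    diff = first_center_vector(feature_set) - global_center
    return float(np.sqrt((diff ** 2).sum()))

sets = [np.array([[0.0, 0.0], [2.0, 2.0]]),  # center (1, 1)
        np.array([[4.0, 4.0]])]              # center (4, 4)
g = second_center_vector(sets)               # pooled center (2, 2)
losses = [center_distance(s, g) for s in sets]
```

A set whose center sits far from the global center contributes a large feature vector loss, pulling the backbone toward scene-independent features.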
  • the server calculates the classification loss function value of each classification data set through the preset cross-entropy loss function, thereby obtaining multiple classification loss function values.
  • The cross-entropy loss function can be a multi-class cross-entropy loss function, whose derivative is simpler to compute, making convergence faster and the update of the corresponding weight matrix faster.
  • After the server obtains the multiple feature vector loss function values and multiple classification loss function values, it obtains the number of training data sets and, according to that number, calculates the average feature vector loss function value and the average classification loss function value. The sum of the average feature vector loss function value and the average classification loss function value, or their weighted sum, is used as the target loss function value of the face recognition model.
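The combination step can be sketched as follows. The weight parameters are illustrative, since the text permits either a plain sum or a weighted sum of the two averages:

```python
def target_loss(vec_losses, cls_losses, vec_weight=1.0, cls_weight=1.0):
    """Target loss of the face recognition model: the averages of the
    per-set feature vector losses and classification losses, combined
    as a (optionally weighted) sum."""
    n = len(vec_losses)            # number of training data sets
    avg_vec = sum(vec_losses) / n  # average feature vector loss
    avg_cls = sum(cls_losses) / n  # average classification loss
    return vec_weight * avg_vec + cls_weight * avg_cls

# Plain sum of the two averages over two data sets:
loss = target_loss([0.2, 0.4], [1.0, 2.0])
```

Weighting lets the training trade off feature-center alignment against per-scene classification accuracy.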
  • After each classification network calculates its classification loss function value, the corresponding classification network can be updated by backpropagating that classification loss function value.
  • The server iteratively updates the network structure and/or weights of the backbone network according to the target loss function value and a preset number of iterations until the target loss function value converges (that is, the training accuracy of the face recognition model meets a preset condition), obtaining the updated face recognition model.
  • The network structure of the backbone network can be updated by adding or deleting network layers, by adding other network frameworks, or by modifying the size and stride of its convolution kernels.
  • the server can also optimize the face recognition model in combination with optimization algorithms.
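A convergence-checked update loop of the kind described above might look like this sketch. Both callables are placeholders for the real loss computation and backbone update, and the tolerance-based stopping rule is an assumed stand-in for the patent's convergence test:

```python
def train_until_converged(compute_target_loss, update_backbone,
                          tol=1e-4, max_iters=1000):
    """Iterative update sketch: recompute the target loss after each
    backbone update and stop when its change falls below `tol`."""
    prev = float("inf")
    for it in range(max_iters):
        loss = compute_target_loss()
        if abs(prev - loss) < tol:   # converged
            return it, loss
        update_backbone(loss)        # e.g. a backpropagation step
        prev = loss
    return max_iters, prev

# Toy pipeline whose loss decays geometrically toward zero.
state = {"loss": 1.0}
iters, final = train_until_converged(
    lambda: state["loss"],
    lambda loss: state.update(loss=loss * 0.5))
```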
  • This avoids the adverse effect that overlapping data, introduced into training as dirty data, has on model training. The target loss function value obtained from the multiple feature vector loss function values and the multiple classification loss function values is used to update the backbone network of the face recognition model, which gives the model better universality and thereby improves the recognition accuracy of existing face recognition models.
  • another embodiment of the training method of the face recognition model in the embodiment of the present application includes:
  • initial training data sets corresponding to multiple application scenarios, where the initial training data sets include open source data and private data.
  • The server extracts initial training data sets of different dimensions corresponding to multiple application scenarios from open source databases, crawls initial training data sets corresponding to multiple application scenarios from network platforms (open source data), and extracts initial training data sets corresponding to different application scenarios from an alliance chain or a private database (private data).
  • The server performs missing value detection, missing value filling, and missing value cleanup on each initial training data set according to a preset missing value ratio, obtaining initial training data sets after missing value processing; merges and de-duplicates them; and determines whether any training data in the merged, de-duplicated set violates preset legality determination rules. If so, the offending training data is deleted; if not, the merged, de-duplicated set is taken as a candidate training data set, which is then labeled to obtain the multiple preprocessed training data sets.
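The cleaning pipeline above (missing-value handling, merging, de-duplication, legality check) can be sketched as below. The record fields and the `is_legal` rule are hypothetical placeholders, and this simplified missing-value step drops overly sparse records rather than filling them:

```python
def preprocess(initial_sets, missing_ratio=0.5,
               is_legal=lambda rec: rec.get("face") is not None):
    """Merge initial training data sets into one cleaned candidate set:
    drop records with too many missing fields, de-duplicate, and drop
    records that fail the legality check."""
    merged, seen = [], set()
    for data_set in initial_sets:
        for rec in data_set:
            vals = list(rec.values())
            # Missing-value cleanup: discard records where more than
            # `missing_ratio` of the fields are None.
            if vals and sum(v is None for v in vals) / len(vals) > missing_ratio:
                continue
            key = tuple(sorted(rec.items()))  # de-duplication key
            if key in seen or not is_legal(rec):
                continue
            seen.add(key)
            merged.append(rec)
    return merged

sets = [
    [{"face": "a.jpg", "label": "id"}, {"face": None, "label": None}],
    [{"face": "a.jpg", "label": "id"}, {"face": "b.jpg", "label": "nat"}],
]
clean = preprocess(sets)  # sparse record and duplicate are removed
```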
  • The labeling content may include at least one of classification labeling, frame labeling, area labeling, and point labeling. Classification labeling, for example: age-adult, gender-female, race-yellow, hair-long hair, facial expression-smile, face-worn item-glasses. Frame labeling, for example: labeling the frame position of the face in the image. Area labeling, for example: labeling the area position of the face in the image. Point labeling, for example: marking the key points of the face.
  • the face recognition model includes a backbone network and multiple classification networks.
  • The server obtains the number of training data sets and calculates the average data volume per training data set according to that number; the training data corresponding to the average data volume is taken as the target batch data of each training data set.
  • For example, if the average data volume is 160, the target batch data is 160 training data.
  • Face image area detection is performed on the 160 training data (that is, the target batch data) to obtain the face area; face key point detection is performed on the face area to obtain face key point information; and face feature vector extraction is performed on the face key point information to obtain multiple feature sets.
  • During data processing, when the target batch data drawn from a training data set with fewer training data finishes processing first, it is randomly cycled until the target batch data with the largest number of training data finishes processing.
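The looping behaviour can be sketched with `itertools.cycle`; for clarity this version cycles the smaller sets deterministically rather than re-drawing them at random:

```python
from itertools import cycle, islice

def cycled_batches(data_sets, batch_size):
    """Draw batches from every data set in lockstep; smaller sets are
    cycled (restarted) until the largest set is exhausted."""
    n_rounds = max(len(ds) for ds in data_sets) // batch_size
    iters = [cycle(ds) for ds in data_sets]
    for _ in range(n_rounds):
        yield [list(islice(it, batch_size)) for it in iters]

small, large = list(range(4)), list(range(10))
rounds = list(cycled_batches([small, large], batch_size=2))
```

Every round yields one batch per data set, so each classification head receives data on every iteration even when its source set is small.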
  • the server obtains the label on the training data corresponding to each feature set, calls multiple classification networks, and classifies the multiple feature sets through the classification network and the labels to obtain multiple classification data sets.
  • a classification network classifies a feature set.
  • Each classification network can use the same network structure or a different one. Using the same network structure reduces network complexity; using different network structures to process different types of training data helps improve classification efficiency and the universality of the face recognition model.
  • The server calculates the first feature center vector corresponding to each feature set and the second feature center vector corresponding to all feature sets; calculates the distance value between each feature set's first feature center vector and the second feature center vector, and determines that distance value as the feature vector loss function value of the feature set, obtaining multiple feature vector loss function values; obtains the preset label corresponding to each training datum in each training data set; and calculates the classification loss function value of each classification data set according to the preset labels and a preset cross-entropy loss function, obtaining multiple classification loss function values.
  • The server obtains the first feature vectors of each feature set, the number of first training data in each feature set, and the first data number of the target batch data, and calculates the first feature center vector of each feature set through a preset first update center vector formula:
  • \( vc_p = \dfrac{n_p \cdot vc_{p-1} + \sum_{i=1}^{vn_p} v_i}{n_p + vn_p} \)
  • where p indicates the p-th feature set, vc_p is the current first feature center vector, vc_{p-1} is the first feature center vector of the previous iteration, vn_p is the first data number of the current iteration, n_p is the number of first training data, and v_i are the current first feature vectors. Before the first iteration, the first feature center vector is 0.
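Read as a running mean, the update step above can be implemented as follows; `n_seen` plays the role of the accumulated training-data count, and the zero start matches the before-first-iteration initialization:

```python
import numpy as np

def update_center(prev_center, n_seen, batch_vectors):
    """Running-mean update of a feature center vector:
    new = (n_seen * prev + sum of batch vectors) / (n_seen + batch size)."""
    batch = np.asarray(batch_vectors)
    vn = len(batch)  # data number of the current iteration
    new_center = (n_seen * np.asarray(prev_center)
                  + batch.sum(axis=0)) / (n_seen + vn)
    return new_center, n_seen + vn

center, n = np.zeros(2), 0                                   # zero start
center, n = update_center(center, n, [[2.0, 0.0], [4.0, 0.0]])
center, n = update_center(center, n, [[0.0, 6.0]])
```

After both updates the center equals the plain mean of all three vectors, which is the point of the incremental form: batches can arrive one at a time.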
  • The server obtains the second feature vectors of all feature sets, the number of second training data corresponding to all feature sets, and the second data number of the target batch data corresponding to all feature sets, and calculates the second feature center vector through a preset second update center vector formula:
  • \( v_q = \dfrac{n_q \cdot v_{q-1} + \sum_{j=1}^{vk_q} v_j}{n_q + vk_q} \)
  • where q indicates the q-th iteration, v_q is the current second feature center vector, v_{q-1} is the second feature center vector of the previous iteration, vk_q is the second data number of the current iteration, n_q is the number of second training data, and v_j are the second feature vectors of all current feature sets. Before the first iteration, the second feature center vector v_q is 0.
  • The server obtains the dimension of the first feature vector of each feature set and calculates the feature vector loss function value according to that dimension, the first feature center vector, and the second feature center vector of each feature set. The calculation formula is:
  • \( L_p = \dfrac{1}{m} \sum_{k=1}^{m} \left( vc_{p,k} - v_{q,k} \right)^2 \)
  • where p indicates the p-th feature set, m is the dimension of the first feature vector of each feature set, vc_p is the first feature center vector, and v_q is the second feature center vector.
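One plausible reading of this loss, the squared distance between the set's center and the global center averaged over the m vector dimensions, can be sketched as:

```python
import numpy as np

def feature_vector_loss(vc_p, v_q):
    """Per-set feature vector loss: squared difference between the
    set's first center vc_p and the global second center v_q,
    averaged over the m vector dimensions."""
    vc_p, v_q = np.asarray(vc_p), np.asarray(v_q)
    m = vc_p.shape[0]  # dimension of the feature vector
    return float(((vc_p - v_q) ** 2).sum() / m)

loss = feature_vector_loss([1.0, 3.0], [3.0, 3.0])
```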
  • The server counts the number of preset labels in each classification data set and obtains the feature vectors of the classification data corresponding to those preset labels; according to the preset cross-entropy loss function, the number of labels, and the feature vectors, the classification loss function value of each classification data set is calculated, obtaining multiple classification loss function values. The cross-entropy loss function is as follows:
  • \( L_{c_y} = -\dfrac{1}{n_y} \sum_{i=1}^{n_y} \mathrm{label}_i \cdot \log\big(\mathrm{softmax}(v_i)\big) \)
  • where y represents the y-th training data set, c_y is the classification data set corresponding to the y-th training data set, n_y is the number of labels, label_i is the i-th preset category label, and v_i is the feature vector.
  • The server classifies the features according to the preset labels on each training datum to obtain multiple classification data sets; from these it obtains the number of preset labels in each classification data set, the preset labels of each category, and the feature vectors generated from the classification data corresponding to those labels. Through the preset cross-entropy loss function, combined with the label count and the feature vectors, the classification loss function value of each classification data set is calculated, thereby obtaining multiple classification loss function values.
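A sketch of the per-data-set classification loss, assuming the standard softmax cross-entropy form averaged over the n_y labelled samples (the label vectors are one-hot encodings of the preset category labels):

```python
import numpy as np

def classification_loss(labels_onehot, logits):
    """Per-data-set classification loss: softmax the logit vectors,
    then average -label * log(prob) over the labelled samples."""
    logits = np.asarray(logits, dtype=float)
    # Numerically stable softmax per row.
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = e / e.sum(axis=1, keepdims=True)
    labels = np.asarray(labels_onehot, dtype=float)
    n_y = labels.shape[0]  # number of labelled samples
    return float(-(labels * np.log(probs)).sum() / n_y)

# Two samples, two classes; the first is confidently correct,
# the second is maximally uncertain.
loss = classification_loss([[1, 0], [0, 1]],
                           [[5.0, 0.0], [1.0, 1.0]])
```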
  • The server calculates the average of the multiple feature vector loss function values according to the number of data sets to obtain the average feature vector loss function value; calculates the average of the multiple classification loss function values according to the number of data sets to obtain the average classification loss function value; and calculates the sum of the average feature vector loss function value and the average classification loss function value to obtain the target loss function value of the face recognition model.
  • The target batch data is updated to obtain updated target batch data, and the network structure of the backbone network is updated to obtain an updated backbone network. Through the updated backbone network and the multiple classification networks, facial feature extraction and classification are performed in sequence on the updated target batch data to obtain multiple target classification data sets. The updated target loss function value is then calculated from the multiple target classification data sets, and it is determined whether that value converges. If the updated target loss function value does not converge, the updated backbone network is iteratively updated according to it until the updated target loss function value converges, obtaining the final updated face recognition model.
  • If the server determines that the target loss function value has converged, it uses the current face recognition model as the final face recognition model.
  • If the server determines that the updated target loss function value has converged, it uses the currently updated face recognition model as the final updated face recognition model.
  • Facial feature extraction and classification are performed in sequence on the updated target batch data to obtain multiple target classification data sets; this operation is similar to steps 102, 103, 203, and 204 above. Calculating the updated target loss function value from the multiple target classification data sets is similar to steps 104, 105, 205, and 206 above, and will not be repeated here.
  • The data quantity of each updated target batch data differs and changes dynamically: it equals the sum of the target batch data of the previous iteration and the current target batch data.
  • the training method of the face recognition model in the embodiment of the application is described above, and the training device of the face recognition model in the embodiment of the application is described below. Please refer to FIG. 3, the training device of the face recognition model in the embodiment of the application One embodiment includes:
  • the obtaining module 301 is configured to obtain a plurality of preprocessed training data sets, and the plurality of training data sets are face training data sets corresponding to multiple application scenarios;
  • the feature extraction module 302 is used to extract facial features from the multiple training data sets through the backbone network in the preset face recognition model to obtain multiple feature sets, where the face recognition model includes the backbone network and multiple classification networks;
  • the classification module 303 is configured to classify multiple feature sets through multiple classification networks to obtain multiple classification data sets, where one classification network corresponds to one feature set;
  • the first calculation module 304 is configured to calculate the feature vector loss function value of each feature set to obtain multiple feature vector loss function values, and calculate the classification loss function value of each classification data set to obtain multiple classification loss function values;
  • the second calculation module 305 is configured to calculate the target loss function value of the face recognition model according to multiple feature vector loss function values and multiple classification loss function values;
  • the iterative update module 306 is configured to iteratively update the backbone network according to the value of the target loss function until the value of the target loss function converges to obtain an updated face recognition model.
  • Each module in the above face recognition model training device corresponds to a step in the above embodiment of the face recognition model training method; their functions and implementation are not repeated here.
  • This avoids the adverse effect that overlapping data, introduced into training as dirty data, has on model training. The target loss function value obtained from the multiple feature vector loss function values and the multiple classification loss function values is used to update the backbone network of the face recognition model, which gives the model better universality and thereby improves the recognition accuracy of existing face recognition models.
  • another embodiment of the training device for the face recognition model in the embodiment of the present application includes:
  • the obtaining module 301 is configured to obtain a plurality of preprocessed training data sets, and the plurality of training data sets are face training data sets corresponding to multiple application scenarios;
  • the obtaining module 301 specifically includes:
  • the obtaining unit 3011 is configured to obtain initial training data sets corresponding to multiple application scenarios, and the initial training data sets include open source data and private data;
  • the preprocessing unit 3012 is configured to sequentially perform data cleaning and label labeling on each initial training data set to obtain multiple preprocessed training data sets;
  • the feature extraction module 302 is used to extract facial features from the multiple training data sets through the backbone network in the preset face recognition model to obtain multiple feature sets, where the face recognition model includes the backbone network and multiple classification networks;
  • the classification module 303 is configured to classify multiple feature sets through multiple classification networks to obtain multiple classification data sets, where one classification network corresponds to one feature set;
  • the first calculation module 304 is configured to calculate the feature vector loss function value of each feature set to obtain multiple feature vector loss function values, and calculate the classification loss function value of each classification data set to obtain multiple classification loss function values;
  • the second calculation module 305 is configured to calculate the target loss function value of the face recognition model according to multiple feature vector loss function values and multiple classification loss function values;
  • the iterative update module 306 is configured to iteratively update the backbone network according to the value of the target loss function until the value of the target loss function converges to obtain an updated face recognition model.
  • the feature extraction module 302 may also be specifically used for:
  • face image area detection, face key point detection and face feature vector extraction are sequentially performed on the target batch data to obtain multiple feature sets.
  • the second calculation module 305 may also be specifically used for:
  • the first calculation module 304 includes:
  • the first calculation unit 3041 is configured to calculate the first feature center vector corresponding to each feature set and the second feature center vectors corresponding to multiple feature sets;
  • the second calculation unit 3042 is configured to calculate the distance value between the first feature center vector and the second feature center vector corresponding to each feature set, and determine the distance value as the feature vector loss function value of each feature set, to obtain multiple feature vector loss function values;
  • the third calculation unit 3043 is configured to obtain the preset label corresponding to each piece of training data in each training data set, and calculate the classification loss function value of each classification data set according to the preset labels and a preset cross-entropy loss function, to obtain multiple classification loss function values.
  • the third calculation unit 3043 may also be specifically configured to:
  • calculate the classification loss function value of each classification data set according to the number of labels and the feature vectors, to obtain multiple classification loss function values.
  • the cross-entropy loss function is as follows:
  • y denotes the y-th training data set;
  • c_y denotes the classification data set corresponding to the y-th training data set;
  • n_y denotes the number of labels;
  • label_i denotes the i-th preset label category;
  • v_i denotes the feature vector.
  • the iterative update module 306 may also be specifically used to:
  • facial feature extraction and classification are sequentially performed on the updated target batch data to obtain multiple target classification data sets;
  • the updated backbone network is iteratively updated according to the updated target loss function value until the updated target loss function value converges, to obtain the final updated face recognition model.
  • each module and each unit in the above-mentioned face recognition model training apparatus corresponds to a step in the above-mentioned embodiments of the face recognition model training method; their functions and implementation processes are not repeated here.
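As a rough structural sketch, the modules described above can be read as stages of one pipeline. The class below is only an illustration of the orchestration order of modules 301–306 (obtain, extract, classify, per-set losses, target loss, update); the method bodies are toy placeholders, not the real networks or data described in the application:

```python
class FaceModelTrainer:
    """Skeleton mirroring modules 301-306: obtain -> extract -> classify
    -> per-set losses -> target loss -> (iterative) backbone update."""

    def obtain(self):                      # module 301: preprocessed data sets
        return [["img"] * 2, ["img"] * 3]

    def extract(self, datasets):           # module 302: backbone feature sets
        return [[[1.0]] * len(d) for d in datasets]

    def classify(self, feature_sets):      # module 303: one head per set
        return [[0] * len(fs) for fs in feature_sets]

    def losses(self, feature_sets, class_sets):   # module 304: per-set losses
        return [0.1] * len(feature_sets), [0.5] * len(class_sets)

    def target(self, feat_losses, cls_losses):    # module 305: combined value
        n = len(feat_losses)
        return sum(feat_losses) / n + sum(cls_losses) / n

    def train_step(self):                  # module 306: one update step shown
        ds = self.obtain()
        feats = self.extract(ds)
        cls = self.classify(feats)
        return self.target(*self.losses(feats, cls))

loss = FaceModelTrainer().train_step()
```

In the real apparatus, `train_step` would be repeated by the iterative update module 306 until the target loss value converges.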
  • FIG. 5 is a schematic structural diagram of a training device for a face recognition model provided by an embodiment of the present application.
  • the training device 500 for the face recognition model may differ considerably due to different configurations or performance, and may include one or more processors (central processing units, CPU) 510 (for example, one or more processors), a memory 520, and one or more storage media 530 (for example, one or more mass-storage devices) storing application programs 533 or data 532.
  • the memory 520 and the storage medium 530 may be short-term storage or persistent storage.
  • the program stored in the storage medium 530 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations in the training device 500 for the face recognition model.
  • the processor 510 may be configured to communicate with the storage medium 530, and execute a series of instruction operations in the storage medium 530 on the training device 500 of the face recognition model.
  • the face recognition model training device 500 may also include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input/output interfaces 560, and/or one or more operating systems 531, for example, Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc.
  • the structure shown in FIG. 5 does not constitute a limitation on the training device for the face recognition model, which may include more or fewer components than shown in the figure, a combination of certain components, or a different arrangement of components.
  • the training device for a face recognition model includes a memory and a processor.
  • the memory stores instructions.
  • the processor executes the steps of the method for training a face recognition model in each of the foregoing embodiments.
  • the computer-readable storage medium may be a non-volatile computer-readable storage medium, and the computer-readable storage medium may also be a volatile computer-readable storage medium.
  • the computer-readable storage medium stores instructions, and when the instructions run on a computer, the computer executes the steps of the method for training the face recognition model.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Abstract

The present application relates to the field of artificial intelligence. Provided are a method, apparatus and device for training a facial recognition model, and a storage medium, which are used for solving the problem of the relatively low recognition accuracy of an existing facial recognition model. The method for training a facial recognition model comprises: acquiring multiple training data sets, and a backbone network and multiple classification networks of a preset facial recognition model; respectively performing facial feature extraction on the multiple training data sets by means of the backbone network to obtain multiple feature sets; classifying the multiple feature sets by means of the multiple classification networks to obtain multiple classification data sets; calculating multiple feature vector loss function values of the multiple feature sets, and multiple classification loss function values of the multiple classification data sets; according to the multiple feature vector loss function values and the multiple classification loss function values, calculating a target loss function value of the facial recognition model; and iteratively updating the backbone network according to the target loss function value until the target loss function value converges, so as to obtain an updated facial recognition model.

Description

Training method, apparatus, device, and storage medium for a face recognition model
This application claims priority to Chinese patent application No. 202010760772.0, entitled "Training Method, Apparatus, Device, and Storage Medium for a Face Recognition Model" and filed with the Chinese Patent Office on July 31, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of neural networks in artificial intelligence, and in particular to a training method, apparatus, device, and storage medium for a face recognition model.
Background
Face recognition is a popular area within image recognition. Typically, a neural network capable of face recognition, i.e., a face recognition model, is obtained by deep-learning training. Regarding recognition accuracy, a face recognition model trained on data from a given application scenario is limited by that scenario, resulting in low recognition accuracy; therefore, the universality of the face recognition model is optimized as a way to improve its recognition accuracy.
At present, the universality of face recognition models is generally optimized either by fine-tuning (finetune) or by mixing multiple training sets. However, the inventors realized that with fine-tuning, very few features of the original training set are retained after model training, leading to poor generalization of the final face recognition model; and mixing multiple training sets suffers from overlapping data that is difficult to clean and is introduced into training as dirty data, degrading the training effect of the model. As a result, the recognition accuracy of existing face recognition models is low.
Summary of the Invention
The main purpose of this application is to solve the problem that existing face recognition models have low recognition accuracy.
A first aspect of this application provides a method for training a face recognition model, including:
obtaining a plurality of preprocessed training data sets, where the plurality of training data sets are face training data sets respectively corresponding to a plurality of application scenarios;

performing facial feature extraction on the plurality of training data sets respectively through the backbone network in a preset face recognition model to obtain a plurality of feature sets, where the face recognition model includes the backbone network and a plurality of classification networks;

classifying the plurality of feature sets through the plurality of classification networks to obtain a plurality of classification data sets, where one classification network corresponds to one feature set;

calculating a feature vector loss function value of each feature set to obtain a plurality of feature vector loss function values, and calculating a classification loss function value of each classification data set to obtain a plurality of classification loss function values;

calculating a target loss function value of the face recognition model according to the plurality of feature vector loss function values and the plurality of classification loss function values;

iteratively updating the backbone network according to the target loss function value until the target loss function value converges, to obtain an updated face recognition model.
A second aspect of this application provides a training device for a face recognition model, including: a memory, a processor, and a training program for the face recognition model that is stored in the memory and executable on the processor, where the processor implements the following steps when executing the training program for the face recognition model:
obtaining a plurality of preprocessed training data sets, where the plurality of training data sets are face training data sets respectively corresponding to a plurality of application scenarios;

performing facial feature extraction on the plurality of training data sets respectively through the backbone network in a preset face recognition model to obtain a plurality of feature sets, where the face recognition model includes the backbone network and a plurality of classification networks;

classifying the plurality of feature sets through the plurality of classification networks to obtain a plurality of classification data sets, where one classification network corresponds to one feature set;

calculating a feature vector loss function value of each feature set to obtain a plurality of feature vector loss function values, and calculating a classification loss function value of each classification data set to obtain a plurality of classification loss function values;

calculating a target loss function value of the face recognition model according to the plurality of feature vector loss function values and the plurality of classification loss function values;

iteratively updating the backbone network according to the target loss function value until the target loss function value converges, to obtain an updated face recognition model.
A third aspect of this application provides a computer-readable storage medium storing computer instructions which, when run on a computer, cause the computer to execute the following steps:
obtaining a plurality of preprocessed training data sets, where the plurality of training data sets are face training data sets respectively corresponding to a plurality of application scenarios;

performing facial feature extraction on the plurality of training data sets respectively through the backbone network in a preset face recognition model to obtain a plurality of feature sets, where the face recognition model includes the backbone network and a plurality of classification networks;

classifying the plurality of feature sets through the plurality of classification networks to obtain a plurality of classification data sets, where one classification network corresponds to one feature set;

calculating a feature vector loss function value of each feature set to obtain a plurality of feature vector loss function values, and calculating a classification loss function value of each classification data set to obtain a plurality of classification loss function values;

calculating a target loss function value of the face recognition model according to the plurality of feature vector loss function values and the plurality of classification loss function values;

iteratively updating the backbone network according to the target loss function value until the target loss function value converges, to obtain an updated face recognition model.
A fourth aspect of this application provides a training apparatus for a face recognition model, including:
an obtaining module, configured to obtain a plurality of preprocessed training data sets, where the plurality of training data sets are face training data sets respectively corresponding to a plurality of application scenarios;

a feature extraction module, configured to perform facial feature extraction on the plurality of training data sets respectively through the backbone network in a preset face recognition model to obtain a plurality of feature sets, where the face recognition model includes the backbone network and a plurality of classification networks;

a classification module, configured to classify the plurality of feature sets through the plurality of classification networks to obtain a plurality of classification data sets, where one classification network corresponds to one feature set;

a first calculation module, configured to calculate a feature vector loss function value of each feature set to obtain a plurality of feature vector loss function values, and calculate a classification loss function value of each classification data set to obtain a plurality of classification loss function values;

a second calculation module, configured to calculate a target loss function value of the face recognition model according to the plurality of feature vector loss function values and the plurality of classification loss function values;

an iterative update module, configured to iteratively update the backbone network according to the target loss function value until the target loss function value converges, to obtain an updated face recognition model.
In the technical solution provided by this application, a plurality of preprocessed training data sets are obtained, together with the backbone network and the plurality of classification networks of a preset face recognition model, where the plurality of training data sets are face training data sets respectively corresponding to a plurality of application scenarios; facial feature extraction is performed on the plurality of training data sets respectively through the backbone network to obtain a plurality of feature sets; the plurality of feature sets are classified through the plurality of classification networks to obtain a plurality of classification data sets, where one classification network corresponds to one feature set; a feature vector loss function value of each feature set is calculated to obtain a plurality of feature vector loss function values, and a classification loss function value of each classification data set is calculated to obtain a plurality of classification loss function values; a target loss function value of the face recognition model is calculated according to the plurality of feature vector loss function values and the plurality of classification loss function values; and the backbone network is iteratively updated according to the target loss function value until the target loss function value converges, to obtain an updated face recognition model.

In this application, performing facial feature extraction and classification on each of the plurality of training data sets separately avoids overlapping data and the adverse effect on model training of overlapping data entering training as dirty data; updating the backbone network of the face recognition model with a target loss function value derived from the plurality of feature vector loss function values and the plurality of classification loss function values gives the face recognition model better universality, thereby improving the recognition accuracy of existing face recognition models.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of an embodiment of a training method for a face recognition model in an embodiment of this application;

FIG. 2 is a schematic diagram of another embodiment of the training method for a face recognition model in an embodiment of this application;

FIG. 3 is a schematic diagram of an embodiment of a training apparatus for a face recognition model in an embodiment of this application;

FIG. 4 is a schematic diagram of another embodiment of the training apparatus for a face recognition model in an embodiment of this application;

FIG. 5 is a schematic diagram of an embodiment of a training device for a face recognition model in an embodiment of this application.
Detailed Description
The embodiments of this application provide a training method, apparatus, device, and storage medium for a face recognition model, which solve the problem that existing face recognition models have low recognition accuracy.
The terms "first", "second", "third", "fourth", etc. (if any) in the specification, claims, and drawings of this application are used to distinguish similar objects, and need not describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described herein can be implemented in an order other than that illustrated or described here. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to those steps or units clearly listed, but may include other steps or units not clearly listed or inherent to the process, method, product, or device.
For ease of understanding, the specific flow of the embodiments of this application is described below. Referring to FIG. 1, an embodiment of the training method for a face recognition model in the embodiments of this application includes:
101. Obtain a plurality of preprocessed training data sets, where the plurality of training data sets are face training data sets respectively corresponding to a plurality of application scenarios.
It can be understood that the execution subject of this application may be a training apparatus for a face recognition model, or a terminal or server corresponding to a logistics headquarters, which is not specifically limited here. The embodiments of this application are described taking the server corresponding to the logistics headquarters as the execution subject.
One training data set corresponds to one application scenario, for example, an identity verification scenario or a natural scenario. The training data sets may be face data of different dimensions, open source data, and private data, such as face data from natural scenes, face data of Asians, attendance data, identity verification data, and competition data.
The server may extract the plurality of preprocessed training data sets from a preset database, or may obtain, from multiple channels, face training data sets of different dimensions respectively corresponding to the plurality of application scenarios and preprocess them to obtain the plurality of preprocessed training data sets.
102. Perform facial feature extraction on the plurality of training data sets respectively through the backbone network in the preset face recognition model to obtain a plurality of feature sets, where the face recognition model includes the backbone network and a plurality of classification networks.
The preset face recognition model includes a backbone network and a plurality of classification networks. The output of the backbone network is the input of the plurality of classification networks, and the data processed by the backbone network is classified through the plurality of classification networks, thereby realizing face recognition training on the training data sets. The backbone network may be a single convolutional neural network or a combined framework of multiple convolutional neural networks; for example, it may be the deep residual learning framework ResNet, the object detection network framework ET-YOLOv3, or a combined framework of ResNet and ET-YOLOv3.
Through the backbone network of the face recognition model, the server may perform face bounding-box recognition, box-region division, face key point detection, and face feature vector extraction on each training data set to obtain the feature set corresponding to each training data set (i.e., the plurality of feature sets). The convolutional layers in the backbone network use small convolution kernels, which retain more features, reduce the amount of computation, and improve the efficiency of facial feature extraction.
103. Classify the plurality of feature sets through the plurality of classification networks to obtain a plurality of classification data sets, where one classification network corresponds to one feature set.
The server obtains the labels on the training data corresponding to each feature set, calls the plurality of classification networks, and classifies the plurality of feature sets through the classification networks and the labels to obtain the plurality of classification data sets. One classification network classifies one feature set; for example, if the classification networks are A1, B1, C1, and D1 and the feature sets are A2, B2, C2, and D2, then A1 classifies A2, B1 classifies B2, C1 classifies C2, and D1 classifies D2. The classification networks may share the same network structure or use different structures; for example, A1, B1, C1, and D1 may all be linear classifiers, or they may respectively be the convolutional neural network Inception-v3, a linear classifier, a nearest-neighbor classifier, and the Google network GoogLeNet. Using the same structure reduces network complexity, while using different structures to process different types of training data helps improve classification efficiency and the universality of the face recognition model.
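The one-classifier-per-feature-set arrangement can be sketched structurally as follows. This is a toy illustration only: the "backbone" and per-set "heads" below are random linear maps standing in for the real networks, and the data-set names and dimensions are invented for the example:

```python
import random

random.seed(0)

def linear(in_dim, out_dim):
    """A toy linear layer: a random weight matrix applied to an input vector."""
    w = [[random.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
    return lambda x: [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

backbone = linear(4, 8)                        # shared feature extractor
heads = {name: linear(8, n_classes)            # one classification network per data set
         for name, n_classes in [("scene_a", 3), ("scene_b", 5)]}

def classify(dataset_name, image_vec):
    features = backbone(image_vec)             # features from the shared backbone
    return heads[dataset_name](features)       # scores from that set's own head

scores_a = classify("scene_a", [0.1, 0.2, 0.3, 0.4])
scores_b = classify("scene_b", [0.1, 0.2, 0.3, 0.4])
```

The key point mirrored here is that the backbone is shared across all data sets while each data set keeps its own classification head, so overlapping identities across sets never collide in one classifier.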
104. Calculate the feature vector loss function value of each feature set to obtain a plurality of feature vector loss function values, and calculate the classification loss function value of each classification data set to obtain a plurality of classification loss function values.
The server calculates first center vectors and a second center vector, computes the distance value between each first center vector and the second center vector, and uses that distance value as the feature vector loss function value corresponding to each feature set, thereby obtaining the plurality of feature vector loss function values. The first center vector may be the center vector corresponding to each feature set, or the center vector corresponding to each piece of training data in each feature set; the second center vector may be the center vector corresponding to all feature sets, or the center vector corresponding to all training data in each feature set.
The server may obtain the number of pieces of training data corresponding to each feature set, compute the sum of the first center vectors corresponding to all the training data, and calculate the mean of that sum according to the number of pieces of training data; that mean is the second center vector corresponding to each feature set. The server may also calculate the second center vector through a preset center vector formula.
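The center-vector distance loss described above can be sketched in plain Python. This is a minimal illustration under two assumptions not fixed by the application: the first center vector is taken as the mean of one feature set, the second as the mean over all feature sets, and the distance is Euclidean; the helper names are our own:

```python
import math

def center_vector(vectors):
    """Mean of a list of equal-length feature vectors (a 'center vector')."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]

def feature_vector_loss(feature_set, all_feature_sets):
    """Distance between this set's center (first center vector) and the
    center over all sets (second center vector), used as the loss value."""
    first = center_vector(feature_set)
    second = center_vector([v for fs in all_feature_sets for v in fs])
    return math.dist(first, second)  # Euclidean distance (Python 3.8+)

# Example: two small feature sets of 2-D vectors
sets = [[[0.0, 0.0], [2.0, 0.0]],   # center (1, 0)
        [[0.0, 2.0], [2.0, 2.0]]]   # center (1, 2); global center is (1, 1)
losses = [feature_vector_loss(fs, sets) for fs in sets]
```

Pulling each set's center toward the global center penalizes feature sets whose embeddings drift far from the others, which is what lets one backbone serve several scenarios.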
The server calculates the classification loss function value of each classification data set through a preset cross-entropy loss function, thereby obtaining the plurality of classification loss function values. The cross-entropy loss function may be a multi-class cross-entropy loss function, whose derivative is simpler, enabling faster convergence and faster updates of the corresponding weight matrix.
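A standard multi-class cross-entropy of the kind referred to here can be sketched as follows. This is illustrative only and not the application's exact formula (which is defined over label counts n_y and feature vectors v_i in the specification); the softmax-over-logits formulation below is the common textbook variant:

```python
import math

def softmax(logits):
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, true_index):
    """Multi-class cross-entropy for one sample: -log p(true class)."""
    probs = softmax(logits)
    return -math.log(probs[true_index])

def classification_loss(batch_logits, batch_labels):
    """Mean cross-entropy over one classification data set."""
    total = sum(cross_entropy(l, y) for l, y in zip(batch_logits, batch_labels))
    return total / len(batch_labels)

loss = classification_loss([[2.0, 0.5, 0.1], [0.2, 3.0, 0.4]], [0, 1])
```

The loss is near zero when the true class dominates the logits and grows as probability mass moves to wrong classes, which is the property that drives the weight updates mentioned above.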
105. Calculate the target loss function value of the face recognition model according to the plurality of feature vector loss function values and the plurality of classification loss function values.
After obtaining the plurality of feature vector loss function values and the plurality of classification loss function values, the server obtains the number of training data sets and, according to that number, calculates the average feature vector loss function value of the plurality of feature vector loss function values and the average classification loss function value of the plurality of classification loss function values. The sum of the average feature vector loss function value and the average classification loss function value, or a weighted sum of the two, is used as the target loss function value of the face recognition model. When each classification network computes its classification loss function value, the corresponding classification network may be updated backward according to that classification loss function value.
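Combining the per-set losses into the target loss can be written out directly. The sketch below mirrors the two options given above (plain sum and weighted sum of the two averages); the weight values in the example are placeholders, not values from the application:

```python
def target_loss(feature_losses, classification_losses, w_feat=1.0, w_cls=1.0):
    """Weighted sum of the average feature-vector loss and the average
    classification loss over all training data sets."""
    n = len(feature_losses)                 # number of training data sets
    avg_feat = sum(feature_losses) / n
    avg_cls = sum(classification_losses) / n
    return w_feat * avg_feat + w_cls * avg_cls

# Plain sum (both weights 1) and a weighted variant with assumed weights
t1 = target_loss([0.2, 0.4], [1.0, 2.0])            # 0.3 + 1.5 = 1.8
t2 = target_loss([0.2, 0.4], [1.0, 2.0], 0.5, 1.0)  # 0.15 + 1.5 = 1.65
```

Weighting lets the training balance the pull toward a shared feature space (feature-vector term) against per-scenario classification accuracy (classification term).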
106. Iteratively update the backbone network according to the target loss function value until the target loss function value converges, to obtain an updated face recognition model.
服务器根据目标损失函数值和预置的迭代次数,对主干网络的网络结构和/或权重值进行迭代更新,直至目标损失函数值收敛(即人脸识别模型的训练精度符合预设条件),得到更新后的人脸识别模型。其中,可通过对主干网络进行网络层的增加或删减来更新主干网络的网络结构,也可通过增设其他的网络框架来更新主干网络的网络结构,也可通过修改主干网络的卷积核大小和步长等来更新主干网络的网络结构。在对主干网络进行迭代更新时,服务器也可结合优化算法对人脸识别模型进行优化。The server iteratively updates the network structure and/or weight value of the backbone network according to the target loss function value and the preset number of iterations until the target loss function value converges (that is, the training accuracy of the face recognition model meets the preset conditions), and obtains The updated face recognition model. Among them, the network structure of the backbone network can be updated by adding or deleting the network layer of the backbone network, or by adding other network frameworks to update the network structure of the backbone network, or by modifying the size of the convolution kernel of the backbone network And step size to update the network structure of the backbone network. When iteratively update the backbone network, the server can also optimize the face recognition model in combination with optimization algorithms.
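The iterative update of step 106 can be sketched generically as follows. The concrete convergence criterion (a small change in the target loss between iterations) is an assumption for illustration; the application only requires that the target loss function value converges:

```python
def train_until_converged(update_step, max_iters=1000, tol=1e-4):
    """Run one backbone update per iteration until the target loss converges.

    `update_step` performs a single backbone update (structure and/or
    weights) and returns the current target loss function value;
    convergence is declared when the loss change falls below `tol`.
    """
    prev = float("inf")
    loss = prev
    for _ in range(max_iters):
        loss = update_step()
        if abs(prev - loss) < tol:   # target loss has converged
            break
        prev = loss
    return loss
```

An optimization algorithm (e.g. SGD or Adam inside `update_step`) can be combined with this loop, as the description notes.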
In the embodiments of the present application, extracting and classifying face features separately for the multiple training data sets avoids overlapping data and the adverse effect that such overlapping data, as dirty data, would have on model training. Updating the backbone network of the face recognition model with a target loss function value obtained from multiple feature vector loss function values and multiple classification loss function values gives the face recognition model better generality, thereby improving the recognition accuracy of existing face recognition models.
Referring to FIG. 2, another embodiment of the method for training a face recognition model in the embodiments of the present application includes:
201. Obtain initial training data sets respectively corresponding to multiple application scenarios, the initial training data sets including open source data and private data.
The server extracts initial training data sets (open source data) of different dimensions corresponding to multiple different application scenarios from open source databases, crawls initial training data sets (open source data) corresponding to multiple different application scenarios from network platforms, and extracts initial training data sets (private data) corresponding to multiple different application scenarios from a consortium blockchain or private databases.
202. Perform data cleaning and label annotation on each initial training data set in sequence to obtain multiple preprocessed training data sets.
The server performs missing value detection, missing value filling, and missing value cleanup on each initial training data set in sequence according to a preset missing value ratio, obtaining initial training data sets with missing values handled; merges and de-duplicates these data sets to obtain merged and de-duplicated initial training data sets; and determines whether the merged and de-duplicated initial training data sets contain training data that does not meet preset legality determination rules. If such data exists, the corresponding training data is deleted; otherwise, the merged and de-duplicated initial training data sets are determined as candidate training data sets, which are then labeled to obtain the multiple preprocessed training data sets.
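The per-dataset cleaning pipeline of step 202 can be sketched as below. The record format (dicts of annotation fields), the drop threshold, and the fill policy (empty-string default) are all illustrative assumptions; the application leaves the concrete missing-value ratio and legality rules preset:

```python
def clean_records(records, required_keys, missing_ratio=0.5):
    """Cleaning sketch: detect missing values, drop records missing too
    many fields, fill the remaining gaps, then merge-and-de-duplicate."""
    cleaned = []
    for rec in records:
        missing = [k for k in required_keys if rec.get(k) is None]
        if len(missing) / len(required_keys) > missing_ratio:
            continue                         # too sparse: cleaned out
        # fill remaining missing values with a placeholder default
        cleaned.append({k: rec.get(k) if rec.get(k) is not None else ""
                        for k in required_keys})
    seen, deduped = set(), []
    for rec in cleaned:                      # de-duplication step
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            deduped.append(rec)
    return deduped
```

A legality check (the preset determination rules) would be one more filter applied to `deduped` before label annotation.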
The content of label annotation may include at least one of classification annotation, bounding-box annotation, region annotation, and landmark annotation. Classification annotation covers, for example, age (adult), gender (female), ethnicity, hair (long hair), facial expression (smile), and worn accessories (glasses); bounding-box annotation marks, for example, the position of the bounding box of a face in an image; region annotation marks, for example, the region occupied by a face in an image; and landmark annotation marks, for example, the key points of a face.
203. Perform face feature extraction on the multiple training data sets respectively through the backbone network in a preset face recognition model to obtain multiple feature sets, the face recognition model including the backbone network and multiple classification networks.
Specifically, the server obtains the number of data sets in the multiple training data sets and calculates, according to this number, the average data volume of each training data set; takes the training data corresponding to the average data volume as the batch data, obtaining target batch data corresponding to each training data set; and performs, through the backbone network in the preset face recognition model, face image region detection, face key point detection, and face feature vector extraction in sequence on the target batch data to obtain the multiple feature sets.
Taking one of the multiple training data sets as an example: suppose the server obtains 5 training data sets and training data set E contains 800 training data, so that its average data volume is 160 training data; the target batch data is then these 160 training data. Through the backbone network in the preset face recognition model, face image region detection is performed on the 160 training data (the target batch data) to obtain face regions, face key point detection is performed on the face regions to obtain face key point information, and face feature vector extraction is performed on the face key point information to obtain the multiple feature sets. If the numbers of training data in different training data sets are inconsistent, then during processing of the target batch data obtained from the different training data sets, when the target batch data of a set with fewer training data finishes first, it is randomly cycled through again until processing of the target batch data of the set with the most training data is complete.
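The batching scheme just described, including the random re-cycling of smaller data sets, can be sketched as follows. The per-dataset batch size (dataset size divided by the dataset count, matching the 800/5 = 160 example) and the fixed seed are illustrative assumptions:

```python
import random

def batch_iterator(datasets, seed=0):
    """Yield, per step, one batch from each dataset.

    Each dataset's batch size is its 'average data volume' (its size
    divided by the number of datasets); datasets that run out early are
    reshuffled and re-cycled until the largest dataset finishes.
    """
    rng = random.Random(seed)
    n = len(datasets)
    sizes = [max(1, len(d) // n) for d in datasets]
    steps = max(len(d) // s for d, s in zip(datasets, sizes))
    cursors = [0] * n
    for _ in range(steps):
        batches = []
        for i, d in enumerate(datasets):
            if cursors[i] + sizes[i] > len(d):   # exhausted: recycle randomly
                d = d[:]
                rng.shuffle(d)
                datasets[i] = d
                cursors[i] = 0
            batches.append(d[cursors[i]:cursors[i] + sizes[i]])
            cursors[i] += sizes[i]
        yield batches
```

With two datasets of 8 and 4 items, each step yields one batch of 4 and one batch of 2.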
204. Classify the multiple feature sets through the multiple classification networks to obtain multiple classification data sets, where one classification network corresponds to one feature set.
The server obtains the labels on the training data corresponding to each feature set, calls the multiple classification networks, and classifies the multiple feature sets through the classification networks and the labels to obtain the multiple classification data sets, where one classification network classifies one feature set. The classification networks may share the same network structure or use different structures: the same structure reduces network complexity, while different structures for different types of training data help improve classification efficiency and the generality of the face recognition model.
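The one-network-per-feature-set pairing can be expressed as a simple dispatch; the classification networks here are arbitrary callables, since the description allows them to share a structure or differ:

```python
def classify_feature_sets(feature_sets, heads):
    """Apply exactly one classification network (head) per feature set.

    Head i only ever sees feature set i, matching the correspondence
    stated above; whether the heads share a structure is the caller's
    design choice.
    """
    if len(feature_sets) != len(heads):
        raise ValueError("need exactly one classification network per feature set")
    return [head(fs) for head, fs in zip(heads, feature_sets)]
```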
205. Calculate the feature vector loss function value of each feature set to obtain multiple feature vector loss function values, and calculate the classification loss function value of each classification data set to obtain multiple classification loss function values.
Specifically, the server calculates the first feature center vector corresponding to each feature set and the second feature center vector corresponding to all of the feature sets; calculates the distance between the first feature center vector of each feature set and the second feature center vector, and determines this distance as the feature vector loss function value of that feature set, obtaining the multiple feature vector loss function values; and obtains the preset label corresponding to each piece of training data in each training data set and calculates, according to the preset labels and a preset cross-entropy loss function, the classification loss function value of each classification data set, obtaining the multiple classification loss function values.
The server obtains the first feature vectors of each feature set, the number of first training data in each feature set, and the first data count of the target batch data, and calculates, through a preset first update center vector formula, the first feature center vector corresponding to each feature set. The first update center vector formula is as follows:
Figure PCTCN2020122376-appb-000001
Here p indicates the p-th feature set, vc_p is the current first feature center vector, vc_{p-1} is the first feature center vector of the previous iteration, vn_p is the first data count of the current iteration, n_p is the number of first training data, and v_i is a current first feature vector. Before the first iteration, the first feature center vector vc_p is 0.
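The update formula itself appears only as an image in the original publication. Given the variables it describes (previous center, current batch of vectors, running counts), a standard incremental running-mean update is sketched below; this is an assumption consistent with those variables, not the patent's exact formula:

```python
def update_center(prev_center, batch_vectors, seen_count):
    """Incremental (running-mean) update of a feature center vector.

    Folds the current batch of feature vectors into the mean of all
    vectors seen so far; with seen_count=0 (before the first iteration,
    where the center is 0) this reduces to the plain batch mean.
    """
    m = len(batch_vectors[0])                 # feature dimension
    total = seen_count + len(batch_vectors)
    return [
        (prev_center[k] * seen_count + sum(v[k] for v in batch_vectors)) / total
        for k in range(m)
    ]
```

The second feature center vector (over all feature sets) described below admits the same kind of incremental update, just accumulated over every feature set's vectors.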
The server obtains the second feature vectors of all feature sets, the number of second training data in all feature sets, and the second data count of the target batch data corresponding to all feature sets, and calculates the second feature center vector through a preset second update center vector formula. The second update center vector formula is as follows:
Figure PCTCN2020122376-appb-000002
Here q indicates the q-th iteration, v_q is the current second feature center vector, v_{q-1} is the second feature center vector of the previous iteration, vk_q is the second data count of the current iteration, n_q is the number of second training data, and v_j is a current second feature vector over all feature sets. Before the first iteration, the second feature center vector v_q is 0.
The server obtains the dimension of the first feature vectors of each feature set and calculates the feature vector loss function value according to this dimension, the first feature center vector, and the second feature center vector. The feature vector loss function value is calculated as follows:
Figure PCTCN2020122376-appb-000003
Here p indicates the p-th feature set, m is the dimension of the first feature vectors of each feature set, vc_p is the first feature center vector, and v_q is the second feature center vector.
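The loss formula is likewise reproduced only as an image. From the description (a distance between vc_p and v_q, normalized by the dimension m), a dimension-averaged squared distance is one consistent reading; the exact distance measure is an assumption:

```python
def center_distance_loss(vc_p, v_q):
    """Feature vector loss for feature set p: the distance between its
    center vc_p and the global center v_q, averaged over the dimension m.
    The squared-difference form is assumed, not fixed by the original."""
    m = len(vc_p)
    return sum((a - b) ** 2 for a, b in zip(vc_p, v_q)) / m
```

Minimizing this pulls each data set's feature center toward the shared center, which is what lets one backbone serve all the application scenarios.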
Specifically, the server counts the number of preset labels in each classification data set and obtains the feature vectors of the classification data corresponding to the preset labels in each classification data set; it then calculates, according to a preset cross-entropy loss function, the label count, and the feature vectors, the classification loss function value of each classification data set, obtaining the multiple classification loss function values. The cross-entropy loss function is as follows:
Figure PCTCN2020122376-appb-000004
Here y denotes the y-th training data set, c_y is the classification loss of the classification data set corresponding to the y-th training data set, n_y is the number of labels, label_i is the preset label of the i-th class, and v_i is a feature vector.
The server classifies the features according to the preset label on each piece of training data to obtain the multiple classification data sets, from which it can obtain the number of preset labels in each classification data set, the preset label of each class, and the feature vectors produced by the classification data corresponding to each class's preset label. Through the preset cross-entropy loss function, combined with the obtained label count and feature vectors, the classification loss function value of each classification data set is calculated, thereby obtaining the multiple classification loss function values.
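Since the cross-entropy formula is given only as an image, the sketch below uses the standard multi-class form its description suggests: a softmax over each sample's class scores followed by the negative log-likelihood of its preset label, averaged over the data set. That concrete form is an assumption:

```python
import math

def dataset_cross_entropy(logits_per_sample, labels):
    """Multi-class cross-entropy for one classification data set.

    `logits_per_sample[i]` holds the class scores for sample i and
    `labels[i]` the index of its preset label.
    """
    total = 0.0
    for logits, label in zip(logits_per_sample, labels):
        mx = max(logits)                        # stabilize the softmax
        exps = [math.exp(z - mx) for z in logits]
        total += -math.log(exps[label] / sum(exps))
    return total / len(labels)
```

As the description notes for the multi-class cross-entropy, its gradient is simple, which speeds convergence and the update of the corresponding weight matrix.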
206. Calculate the target loss function value of the face recognition model according to the multiple feature vector loss function values and the multiple classification loss function values.
Specifically, the server calculates the mean of the multiple feature vector loss function values according to the number of data sets to obtain the average feature vector loss function value; calculates the mean of the multiple classification loss function values according to the number of data sets to obtain the average classification loss function value; and calculates the sum of the average feature vector loss function value and the average classification loss function value to obtain the target loss function value of the face recognition model.
For example, if the number of data sets is 6, the feature vector loss function values are L1, L2, L3, L4, and L5, and the classification loss function values are K1, K2, K3, and K4, then the average feature vector loss function value is (L1+L2+L3+L4+L5)/6 = L, the average classification loss function value is (K1+K2+K3+K4)/6 = K, and the target loss function value is LC = L+K.
207. Iteratively update the backbone network according to the target loss function value until the target loss function value converges, to obtain an updated face recognition model.
Specifically, it is determined whether the target loss function value converges. If it does not, the target batch data is updated to obtain updated target batch data, and the network structure of the backbone network is updated to obtain an updated backbone network; face feature extraction and classification are performed in sequence on the updated target batch data through the updated backbone network and the multiple classification networks to obtain multiple target classification data sets; an updated target loss function value is calculated according to the multiple target classification data sets, and it is determined whether this updated value converges; if it does not, the updated backbone network is iteratively updated according to the updated target loss function value until the updated target loss function value converges, yielding the final updated face recognition model.
When the server determines that the target loss function value has converged, it takes the current face recognition model as the final face recognition model; when it determines that the updated target loss function value has converged, it takes the currently updated face recognition model as the final updated face recognition model. The operations of performing face feature extraction and classification in sequence on the updated target batch data to obtain multiple target classification data sets are similar to steps 102, 103, 203, and 204 above, and the operations of calculating the updated target loss function value according to the multiple target classification data sets are similar to steps 104, 105, 205, and 206 above; they are not repeated here. In each iteration, the data quantity of each updated target batch data differs and changes dynamically, being equal to the sum of the target batch data of the previous iteration and the current target batch data.
In the embodiments of the present application, data cleaning and label annotation are performed separately on the multiple initial training data sets, and face feature extraction and classification are performed on the multiple training data sets, so that different data sets need only be cleaned individually rather than merged and cleaned together. This not only greatly saves data-cleaning time but also effectively avoids overlapping data and the adverse effect such overlapping data, as dirty data, would have on model training. Updating the backbone network of the face recognition model with a target loss function value obtained from multiple feature vector loss function values and multiple classification loss function values gives the face recognition model better generality, thereby improving the recognition accuracy of existing face recognition models.
The method for training a face recognition model in the embodiments of the present application has been described above; the apparatus for training a face recognition model in the embodiments of the present application is described below. Referring to FIG. 3, an embodiment of the apparatus for training a face recognition model in the embodiments of the present application includes:
an obtaining module 301, configured to obtain multiple preprocessed training data sets, the multiple training data sets being face training data sets respectively corresponding to multiple application scenarios;
a feature extraction module 302, configured to perform face feature extraction on the multiple training data sets respectively through the backbone network in a preset face recognition model to obtain multiple feature sets, the face recognition model including the backbone network and multiple classification networks;
a classification module 303, configured to classify the multiple feature sets through the multiple classification networks to obtain multiple classification data sets, where one classification network corresponds to one feature set;
a first calculation module 304, configured to calculate the feature vector loss function value of each feature set to obtain multiple feature vector loss function values, and calculate the classification loss function value of each classification data set to obtain multiple classification loss function values;
a second calculation module 305, configured to calculate the target loss function value of the face recognition model according to the multiple feature vector loss function values and the multiple classification loss function values;
an iterative update module 306, configured to iteratively update the backbone network according to the target loss function value until the target loss function value converges, to obtain an updated face recognition model.
The functional implementation of each module in the above apparatus for training a face recognition model corresponds to the steps in the embodiments of the method for training a face recognition model described above; the functions and implementation processes are not repeated here.
In the embodiments of the present application, extracting and classifying face features separately for the multiple training data sets avoids overlapping data and the adverse effect that such overlapping data, as dirty data, would have on model training. Updating the backbone network of the face recognition model with a target loss function value obtained from multiple feature vector loss function values and multiple classification loss function values gives the face recognition model better generality, thereby improving the recognition accuracy of existing face recognition models.
Referring to FIG. 4, another embodiment of the apparatus for training a face recognition model in the embodiments of the present application includes:
an obtaining module 301, configured to obtain multiple preprocessed training data sets, the multiple training data sets being face training data sets respectively corresponding to multiple application scenarios;
wherein the obtaining module 301 specifically includes:
an obtaining unit 3011, configured to obtain initial training data sets respectively corresponding to multiple application scenarios, the initial training data sets including open source data and private data;
a preprocessing unit 3012, configured to perform data cleaning and label annotation on each initial training data set in sequence to obtain multiple preprocessed training data sets;
a feature extraction module 302, configured to perform face feature extraction on the multiple training data sets respectively through the backbone network in a preset face recognition model to obtain multiple feature sets, the face recognition model including the backbone network and multiple classification networks;
a classification module 303, configured to classify the multiple feature sets through the multiple classification networks to obtain multiple classification data sets, where one classification network corresponds to one feature set;
a first calculation module 304, configured to calculate the feature vector loss function value of each feature set to obtain multiple feature vector loss function values, and calculate the classification loss function value of each classification data set to obtain multiple classification loss function values;
a second calculation module 305, configured to calculate the target loss function value of the face recognition model according to the multiple feature vector loss function values and the multiple classification loss function values;
an iterative update module 306, configured to iteratively update the backbone network according to the target loss function value until the target loss function value converges, to obtain an updated face recognition model.
Optionally, the feature extraction module 302 may be further specifically configured to:
obtain the number of data sets in the multiple training data sets, and calculate the average data volume of each training data set according to the number of data sets;
take the training data corresponding to the average data volume as batch data, obtaining the target batch data corresponding to each training data set;
perform, through the backbone network in the preset face recognition model, face image region detection, face key point detection, and face feature vector extraction in sequence on the target batch data to obtain multiple feature sets.
Optionally, the second calculation module 305 may be further specifically configured to:
calculate the mean of the multiple feature vector loss function values according to the number of data sets to obtain an average feature vector loss function value;
calculate the mean of the multiple classification loss function values according to the number of data sets to obtain an average classification loss function value;
calculate the sum of the average feature vector loss function value and the average classification loss function value to obtain the target loss function value of the face recognition model.
Optionally, the first calculation module 304 includes:
a first calculation unit 3041, configured to calculate the first feature center vector corresponding to each feature set and the second feature center vector corresponding to the multiple feature sets;
a second calculation unit 3042, configured to calculate the distance between the first feature center vector and the second feature center vector corresponding to each feature set, and determine the distance as the feature vector loss function value of each feature set, to obtain multiple feature vector loss function values;
a third calculation unit 3043, configured to obtain the preset label corresponding to each piece of training data in each training data set, and calculate the classification loss function value of each classification data set according to the preset labels and a preset cross-entropy loss function, to obtain multiple classification loss function values.
Optionally, the third calculation unit 3043 may be further specifically configured to:
count the number of preset labels in each classification data set, and obtain the feature vectors of the classification data corresponding to the preset labels in each classification data set;
calculate the classification loss function value of each classification data set according to the preset cross-entropy loss function, the label count, and the feature vectors, to obtain multiple classification loss function values, where the cross-entropy loss function is as follows:
Figure PCTCN2020122376-appb-000005
Here y denotes the y-th training data set, c_y is the classification loss of the classification data set corresponding to the y-th training data set, n_y is the number of labels, label_i is the preset label of the i-th class, and v_i is a feature vector.
Optionally, the iterative update module 306 may be further specifically configured to:
determine whether the target loss function value converges; if not, update the target batch data to obtain updated target batch data, and update the network structure of the backbone network to obtain an updated backbone network;
perform face feature extraction and classification in sequence on the updated target batch data through the updated backbone network and the multiple classification networks to obtain multiple target classification data sets;
calculate an updated target loss function value according to the multiple target classification data sets, and determine whether the updated target loss function value converges;
if the updated target loss function value does not converge, iteratively update the updated backbone network according to the updated target loss function value until the updated target loss function value converges, to obtain the final updated face recognition model.
The functional implementation of each module and unit in the above apparatus for training a face recognition model corresponds to the steps in the embodiments of the method for training a face recognition model described above; the functions and implementation processes are not repeated here.
In the embodiments of the present application, data cleaning and label annotation are performed separately on the multiple initial training data sets, and face feature extraction and classification are performed on the multiple training data sets, so that different data sets need only be cleaned individually rather than merged and cleaned together. This not only greatly saves data-cleaning time but also effectively avoids overlapping data and the adverse effect such overlapping data, as dirty data, would have on model training. Updating the backbone network of the face recognition model with a target loss function value obtained from multiple feature vector loss function values and multiple classification loss function values gives the face recognition model better generality, thereby improving the recognition accuracy of existing face recognition models.
Figures 3 and 4 above describe the apparatus for training a face recognition model in the embodiments of this application in detail from the perspective of modular functional entities; the following describes the device for training a face recognition model in the embodiments of this application in detail from the perspective of hardware processing.
FIG. 5 is a schematic structural diagram of a device for training a face recognition model provided by an embodiment of this application. The device 500 for training a face recognition model may vary considerably in configuration or performance, and may include one or more processors (central processing units, CPU) 510, a memory 520, and one or more storage media 530 (for example, one or more mass storage devices) storing application programs 533 or data 532. The memory 520 and the storage medium 530 may provide transient or persistent storage. A program stored in the storage medium 530 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations for the device 500. Further, the processor 510 may be configured to communicate with the storage medium 530 and execute the series of instruction operations in the storage medium 530 on the device 500.
The device 500 for training a face recognition model may also include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input/output interfaces 560, and/or one or more operating systems 531, such as Windows Server, Mac OS X, Unix, Linux, and FreeBSD. Those skilled in the art will understand that the device structure shown in FIG. 5 does not limit the device for training a face recognition model, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
This application also provides a device for training a face recognition model. The device includes a memory and a processor; the memory stores instructions that, when executed by the processor, cause the processor to perform the steps of the method for training a face recognition model in the foregoing embodiments.
This application also provides a computer-readable storage medium, which may be non-volatile or volatile. The computer-readable storage medium stores instructions that, when run on a computer, cause the computer to perform the steps of the method for training a face recognition model.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above embodiments are only intended to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.

Claims (20)

  1. A method for training a face recognition model, wherein the method comprises:
    acquiring a plurality of preprocessed training data sets, the plurality of training data sets being face training data sets respectively corresponding to a plurality of application scenarios;
    performing face feature extraction on each of the plurality of training data sets through a backbone network in a preset face recognition model to obtain a plurality of feature sets, the face recognition model comprising the backbone network and a plurality of classification networks;
    classifying the plurality of feature sets through the plurality of classification networks to obtain a plurality of classification data sets, wherein one classification network corresponds to one feature set;
    calculating a feature vector loss function value of each feature set to obtain a plurality of feature vector loss function values, and calculating a classification loss function value of each classification data set to obtain a plurality of classification loss function values;
    calculating a target loss function value of the face recognition model according to the plurality of feature vector loss function values and the plurality of classification loss function values;
    iteratively updating the backbone network according to the target loss function value until the target loss function value converges, to obtain an updated face recognition model.
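The overall flow of claim 1 can be sketched in code. Everything below is illustrative: the stand-in backbone, classifiers, and loss callables are hypothetical placeholders, not the patent's implementation, and the target loss is formed as the mean per-set feature loss plus the mean per-set classification loss (the combination that claim 3 later makes explicit).

```python
# Minimal sketch of one training pass over multiple data sets (claim 1).
# Each training data set has its own classification network; the shared
# backbone extracts features, and the per-set losses are combined into a
# single target loss used to update the backbone.

def train_step(datasets, extract_features, classifiers,
               feature_loss, classification_loss):
    """One pass over all data sets; returns the target loss value."""
    feature_sets = [extract_features(ds) for ds in datasets]   # backbone
    classified = [clf(fs) for clf, fs in zip(classifiers, feature_sets)]
    fv_losses = [feature_loss(fs) for fs in feature_sets]
    cls_losses = [classification_loss(c) for c in classified]
    n = len(datasets)
    # Target loss: mean feature-vector loss plus mean classification loss.
    return sum(fv_losses) / n + sum(cls_losses) / n

# Toy stand-ins so the sketch is executable (all hypothetical):
datasets = [[1.0, 2.0], [3.0, 4.0]]
target = train_step(
    datasets,
    extract_features=lambda ds: [x * 0.1 for x in ds],  # fake backbone
    classifiers=[lambda fs: fs, lambda fs: fs],         # fake classifiers
    feature_loss=lambda fs: sum(fs),
    classification_loss=lambda c: sum(c),
)
```

In practice `target` would be a differentiable tensor and the backbone parameters would be updated by gradient descent until this value converges.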
  2. The method for training a face recognition model according to claim 1, wherein performing face feature extraction on each of the plurality of training data sets through the backbone network in the preset face recognition model to obtain the plurality of feature sets comprises:
    acquiring a number of data sets of the plurality of training data sets, and calculating an average data amount of each training data set according to the number of data sets;
    taking the training data corresponding to the average data amount as batch data, to obtain target batch data corresponding to each training data set;
    sequentially performing face image region detection, face key point detection, and face feature vector extraction on the target batch data through the backbone network in the preset face recognition model, to obtain the plurality of feature sets.
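The batching step of claim 2 can be sketched as follows. How the patent actually selects the data corresponding to the average data amount is not specified, so simple truncation is assumed here purely for illustration:

```python
# Hypothetical sketch of claim 2's batching: the average data amount over
# all training data sets determines how much data each set contributes as
# target batch data.

def target_batches(training_sets):
    k = len(training_sets)                          # number of data sets
    avg = sum(len(s) for s in training_sets) // k   # average data amount
    return [s[:avg] for s in training_sets]         # assumed: truncate

batches = target_batches([list(range(4)), list(range(2))])
```

The resulting `batches` would then be passed through the backbone network for face image region detection, key point detection, and feature vector extraction.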
  3. The method for training a face recognition model according to claim 2, wherein calculating the target loss function value of the face recognition model according to the plurality of feature vector loss function values and the plurality of classification loss function values comprises:
    calculating a mean of the plurality of feature vector loss function values according to the number of data sets, to obtain an average feature vector loss function value;
    calculating a mean of the plurality of classification loss function values according to the number of data sets, to obtain an average classification loss function value;
    calculating a sum of the average feature vector loss function value and the average classification loss function value, to obtain the target loss function value of the face recognition model.
  4. The method for training a face recognition model according to claim 2, wherein calculating the feature vector loss function value of each feature set to obtain the plurality of feature vector loss function values, and calculating the classification loss function value of each classification data set to obtain the plurality of classification loss function values, comprises:
    calculating a first feature center vector corresponding to each feature set, and a second feature center vector corresponding to the plurality of feature sets;
    calculating a distance value between the first feature center vector corresponding to each feature set and the second feature center vector, and determining the distance value as the feature vector loss function value of that feature set, to obtain the plurality of feature vector loss function values;
    acquiring a preset label corresponding to each piece of training data in each training data set, and calculating the classification loss function value of each classification data set according to the preset labels and a preset cross-entropy loss function, to obtain the plurality of classification loss function values.
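The feature-vector loss of claim 4 can be sketched as below. Two details are assumptions for illustration only: the "distance value" is taken to be Euclidean distance, and the second feature center vector is taken as the mean over all feature vectors pooled from every set (the claim does not specify either choice).

```python
import math

def center(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def feature_vector_losses(feature_sets):
    # First feature center vector: per-set center.
    firsts = [center(fs) for fs in feature_sets]
    # Second feature center vector: assumed center over all pooled vectors.
    second = center([v for fs in feature_sets for v in fs])
    # Assumed: Euclidean distance to the global center as each set's loss.
    return [math.dist(f, second) for f in firsts]

losses = feature_vector_losses([[[0.0, 0.0], [2.0, 2.0]],
                                [[4.0, 4.0], [6.0, 6.0]]])
```

Intuitively, this loss pulls each data set's feature distribution toward a shared center, encouraging the backbone to embed all application scenarios in a common feature space.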
  5. The method for training a face recognition model according to claim 4, wherein calculating the classification loss function value of each classification data set according to the preset labels and the preset cross-entropy loss function to obtain the plurality of classification loss function values comprises:
    counting a number of the preset labels in each classification data set, and acquiring feature vectors of the classification data corresponding to the preset labels in each classification data set;
    calculating the classification loss function value of each classification data set according to the preset cross-entropy loss function, the number of labels, and the feature vectors, to obtain the plurality of classification loss function values, the cross-entropy loss function being as follows:
    c_y = -∑_{i=1}^{n_y} label_i · log(v_i)
    wherein y denotes the y-th training data set, c_y is the classification loss function value of the classification data set corresponding to the y-th training data set, n_y is the number of labels, label_i is the preset label of the i-th class, and v_i is the feature vector.
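The cross-entropy classification loss of claim 5 can be sketched directly. One interpretive assumption is made: label_i is treated as a one-hot preset label and v_i as the classification network's predicted probability for class i (the claim calls v_i "the feature vector"; in standard cross-entropy usage this would be a probability, e.g. a softmax output).

```python
import math

# Sketch of claim 5's classification loss, c_y = -sum_i label_i * log(v_i),
# under the assumption that labels are one-hot and probs are predicted
# class probabilities.

def classification_loss(labels, probs):
    return -sum(l * math.log(p) for l, p in zip(labels, probs))

# Example over n_y = 3 classes with the true class at index 1:
c_y = classification_loss([0, 1, 0], [0.2, 0.7, 0.1])
```

One such value is computed per classification data set, yielding the plurality of classification loss function values used in the target loss.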
  6. The method for training a face recognition model according to any one of claims 2 to 5, wherein iteratively updating the backbone network according to the target loss function value until the target loss function value converges to obtain the updated face recognition model comprises:
    determining whether the target loss function value converges; if the target loss function value does not converge, updating the target batch data to obtain updated target batch data, and updating a network structure of the backbone network to obtain an updated backbone network;
    sequentially performing face feature extraction and classification on the updated target batch data through the updated backbone network and the plurality of classification networks, to obtain a plurality of target classification data sets;
    calculating an updated target loss function value according to the plurality of target classification data sets, and determining whether the updated target loss function value converges;
    if the updated target loss function value does not converge, iteratively updating the updated backbone network according to the updated target loss function value until the updated target loss function value converges, to obtain a final updated face recognition model.
  7. The method for training a face recognition model according to claim 1, wherein acquiring the plurality of preprocessed training data sets, the plurality of training data sets being face training data sets respectively corresponding to the plurality of application scenarios, comprises:
    acquiring initial training data sets respectively corresponding to the plurality of application scenarios, the initial training data sets comprising open-source data and private data;
    sequentially performing data cleaning and label annotation on each initial training data set, to obtain the plurality of preprocessed training data sets.
  8. A device for training a face recognition model, wherein the device comprises a memory, a processor, and a training program for a face recognition model that is stored in the memory and executable on the processor, the processor implementing the following steps when executing the training program:
    acquiring a plurality of preprocessed training data sets, the plurality of training data sets being face training data sets respectively corresponding to a plurality of application scenarios;
    performing face feature extraction on each of the plurality of training data sets through a backbone network in a preset face recognition model to obtain a plurality of feature sets, the face recognition model comprising the backbone network and a plurality of classification networks;
    classifying the plurality of feature sets through the plurality of classification networks to obtain a plurality of classification data sets, wherein one classification network corresponds to one feature set;
    calculating a feature vector loss function value of each feature set to obtain a plurality of feature vector loss function values, and calculating a classification loss function value of each classification data set to obtain a plurality of classification loss function values;
    calculating a target loss function value of the face recognition model according to the plurality of feature vector loss function values and the plurality of classification loss function values;
    iteratively updating the backbone network according to the target loss function value until the target loss function value converges, to obtain an updated face recognition model.
  9. The device for training a face recognition model according to claim 8, wherein when the processor executes the training program to perform face feature extraction on each of the plurality of training data sets through the backbone network in the preset face recognition model to obtain the plurality of feature sets, the following steps are implemented:
    acquiring a number of data sets of the plurality of training data sets, and calculating an average data amount of each training data set according to the number of data sets;
    taking the training data corresponding to the average data amount as batch data, to obtain target batch data corresponding to each training data set;
    sequentially performing face image region detection, face key point detection, and face feature vector extraction on the target batch data through the backbone network in the preset face recognition model, to obtain the plurality of feature sets.
  10. The device for training a face recognition model according to claim 9, wherein when the processor executes the training program to calculate the target loss function value of the face recognition model according to the plurality of feature vector loss function values and the plurality of classification loss function values, the following steps are implemented:
    calculating a mean of the plurality of feature vector loss function values according to the number of data sets, to obtain an average feature vector loss function value;
    calculating a mean of the plurality of classification loss function values according to the number of data sets, to obtain an average classification loss function value;
    calculating a sum of the average feature vector loss function value and the average classification loss function value, to obtain the target loss function value of the face recognition model.
  11. The device for training a face recognition model according to claim 9, wherein when the processor executes the training program to calculate the feature vector loss function value of each feature set to obtain the plurality of feature vector loss function values, and to calculate the classification loss function value of each classification data set to obtain the plurality of classification loss function values, the following steps are implemented:
    calculating a first feature center vector corresponding to each feature set, and a second feature center vector corresponding to the plurality of feature sets;
    calculating a distance value between the first feature center vector corresponding to each feature set and the second feature center vector, and determining the distance value as the feature vector loss function value of that feature set, to obtain the plurality of feature vector loss function values;
    acquiring a preset label corresponding to each piece of training data in each training data set, and calculating the classification loss function value of each classification data set according to the preset labels and a preset cross-entropy loss function, to obtain the plurality of classification loss function values.
  12. The device for training a face recognition model according to claim 11, wherein when the processor executes the training program to calculate the classification loss function value of each classification data set according to the preset labels and the preset cross-entropy loss function to obtain the plurality of classification loss function values, the following steps are implemented:
    counting a number of the preset labels in each classification data set, and acquiring feature vectors of the classification data corresponding to the preset labels in each classification data set;
    calculating the classification loss function value of each classification data set according to the preset cross-entropy loss function, the number of labels, and the feature vectors, to obtain the plurality of classification loss function values, the cross-entropy loss function being as follows:
    c_y = -∑_{i=1}^{n_y} label_i · log(v_i)
    wherein y denotes the y-th training data set, c_y is the classification loss function value of the classification data set corresponding to the y-th training data set, n_y is the number of labels, label_i is the preset label of the i-th class, and v_i is the feature vector.
  13. The device for training a face recognition model according to any one of claims 9 to 12, wherein when the processor executes the training program to iteratively update the backbone network according to the target loss function value until the target loss function value converges to obtain the updated face recognition model, the following steps are implemented:
    determining whether the target loss function value converges; if the target loss function value does not converge, updating the target batch data to obtain updated target batch data, and updating a network structure of the backbone network to obtain an updated backbone network;
    sequentially performing face feature extraction and classification on the updated target batch data through the updated backbone network and the plurality of classification networks, to obtain a plurality of target classification data sets;
    calculating an updated target loss function value according to the plurality of target classification data sets, and determining whether the updated target loss function value converges;
    if the updated target loss function value does not converge, iteratively updating the updated backbone network according to the updated target loss function value until the updated target loss function value converges, to obtain a final updated face recognition model.
  14. The device for training a face recognition model according to claim 8, wherein when the processor executes the training program to acquire the plurality of preprocessed training data sets, the plurality of training data sets being face training data sets respectively corresponding to the plurality of application scenarios, the following steps are implemented:
    acquiring initial training data sets respectively corresponding to the plurality of application scenarios, the initial training data sets comprising open-source data and private data;
    sequentially performing data cleaning and label annotation on each initial training data set, to obtain the plurality of preprocessed training data sets.
  15. A computer-readable storage medium storing computer instructions, wherein when the computer instructions are run on a computer, the computer is caused to perform the following steps:
    acquiring a plurality of preprocessed training data sets, the plurality of training data sets being face training data sets respectively corresponding to a plurality of application scenarios;
    performing face feature extraction on each of the plurality of training data sets through a backbone network in a preset face recognition model to obtain a plurality of feature sets, the face recognition model comprising the backbone network and a plurality of classification networks;
    classifying the plurality of feature sets through the plurality of classification networks to obtain a plurality of classification data sets, wherein one classification network corresponds to one feature set;
    calculating a feature vector loss function value of each feature set to obtain a plurality of feature vector loss function values, and calculating a classification loss function value of each classification data set to obtain a plurality of classification loss function values;
    calculating a target loss function value of the face recognition model according to the plurality of feature vector loss function values and the plurality of classification loss function values;
    iteratively updating the backbone network according to the target loss function value until the target loss function value converges, to obtain an updated face recognition model.
  16. The computer-readable storage medium according to claim 15, wherein when the computer instructions are executed to perform face feature extraction on each of the plurality of training data sets through the backbone network in the preset face recognition model to obtain the plurality of feature sets, the following steps are performed:
    acquiring a number of data sets of the plurality of training data sets, and calculating an average data amount of each training data set according to the number of data sets;
    taking the training data corresponding to the average data amount as batch data, to obtain target batch data corresponding to each training data set;
    sequentially performing face image region detection, face key point detection, and face feature vector extraction on the target batch data through the backbone network in the preset face recognition model, to obtain the plurality of feature sets.
  17. The computer-readable storage medium according to claim 16, wherein when the computer instructions are executed to calculate the target loss function value of the face recognition model according to the plurality of feature vector loss function values and the plurality of classification loss function values, the following steps are performed:
    calculating a mean of the plurality of feature vector loss function values according to the number of data sets, to obtain an average feature vector loss function value;
    calculating a mean of the plurality of classification loss function values according to the number of data sets, to obtain an average classification loss function value;
    calculating a sum of the average feature vector loss function value and the average classification loss function value, to obtain the target loss function value of the face recognition model.
  18. The computer-readable storage medium according to claim 16, wherein when the computer instructions are executed to calculate the feature vector loss function value of each feature set to obtain the plurality of feature vector loss function values, and to calculate the classification loss function value of each classification data set to obtain the plurality of classification loss function values, the following steps are performed:
    calculating a first feature center vector corresponding to each feature set, and a second feature center vector corresponding to the plurality of feature sets;
    calculating a distance value between the first feature center vector corresponding to each feature set and the second feature center vector, and determining the distance value as the feature vector loss function value of that feature set, to obtain the plurality of feature vector loss function values;
    acquiring a preset label corresponding to each piece of training data in each training data set, and calculating the classification loss function value of each classification data set according to the preset labels and a preset cross-entropy loss function, to obtain the plurality of classification loss function values.
  19. The computer-readable storage medium according to claim 18, wherein when the computer instructions are executed to calculate the classification loss function value of each classification data set according to the preset labels and the preset cross-entropy loss function to obtain the plurality of classification loss function values, the following steps are performed:
    counting a number of the preset labels in each classification data set, and acquiring feature vectors of the classification data corresponding to the preset labels in each classification data set;
    calculating the classification loss function value of each classification data set according to the preset cross-entropy loss function, the number of labels, and the feature vectors, to obtain the plurality of classification loss function values, the cross-entropy loss function being as follows:
    Figure PCTCN2020122376-appb-100003
    Figure PCTCN2020122376-appb-100003
    其中,所述y表示第y个训练数据集,所述c y为第y个训练数据集对应的分类数据集,所述n y为所述标签个数,所述label i为第i个分类的预置标签,所述v i为所述特征向量。 Wherein, the y represents the y-th training data set, the c y is the classification data set corresponding to the y-th training data set, the n y is the number of the labels, and the label i is the i-th classification preset tag, the v i is the feature vector.
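A minimal sketch of the claim-19 cross-entropy value c_y for one classification data set follows. The one-hot form of the preset labels and the (1/n_y) normalization are assumptions used for illustration; the excerpt does not fix either choice, and the function name is hypothetical.

```python
import math

def classification_loss(preset_labels, predictions):
    """Cross-entropy for one classification data set: n_y is the number of
    preset labels, label_i the i-th preset label (one-hot, by assumption)
    and v_i the predicted probability for the i-th classification."""
    n_y = len(preset_labels)
    return -sum(label * math.log(v)
                for label, v in zip(preset_labels, predictions)) / n_y

# One-hot preset labels and softmax-like outputs for a 3-class set:
loss = classification_loss([1.0, 0.0, 0.0], [0.5, 0.3, 0.2])
```

Only the term whose preset label is 1 contributes, so here the value is -log(0.5)/3; a perfect prediction for the labelled class would drive the loss toward zero.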
  20. A training apparatus for a face recognition model, characterized in that the training apparatus for the face recognition model comprises:
    an obtaining module, configured to obtain multiple preprocessed training data sets, the multiple training data sets being face training data sets respectively corresponding to multiple application scenarios;
    a feature extraction module, configured to perform face feature extraction on the multiple training data sets respectively through a backbone network in a preset face recognition model to obtain multiple feature sets, the face recognition model comprising the backbone network and multiple classification networks;
    a classification module, configured to classify the multiple feature sets through the multiple classification networks to obtain multiple classification data sets, wherein one classification network corresponds to one feature set;
    a first calculation module, configured to calculate the feature vector loss function value of each feature set to obtain multiple feature vector loss function values, and to calculate the classification loss function value of each classification data set to obtain multiple classification loss function values;
    a second calculation module, configured to calculate a target loss function value of the face recognition model according to the multiple feature vector loss function values and the multiple classification loss function values;
    an iterative update module, configured to iteratively update the backbone network according to the target loss function value until the target loss function value converges, to obtain an updated face recognition model.
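The last two modules of claim 20 can be sketched as two small functions. Both are assumptions for illustration: the excerpt does not state how the multiple loss values are combined into the target loss (a weighted sum is used here), nor the optimizer or convergence criterion (plain gradient descent with a loss-change tolerance is used).

```python
def target_loss(feature_losses, classification_losses, weight=1.0):
    """Second calculation module (sketched): combine the multiple feature
    vector loss function values and classification loss function values
    into one target loss function value. The weighted sum and `weight`
    are assumptions."""
    return weight * sum(feature_losses) + sum(classification_losses)

def iterate_until_converged(loss_and_grad, params, lr=0.1, tol=1e-8,
                            max_iters=10000):
    """Iterative update module (sketched): update the backbone parameters
    until the target loss function value converges, i.e. until its change
    between iterations drops below tol."""
    prev = float("inf")
    for _ in range(max_iters):
        loss, grad = loss_and_grad(params)
        if abs(prev - loss) < tol:
            break
        params = [p - lr * g for p, g in zip(params, grad)]
        prev = loss
    return params, prev

# Toy stand-in for the backbone's target loss: a quadratic bowl.
quadratic = lambda ps: (sum(p * p for p in ps), [2.0 * p for p in ps])
final_params, final_loss = iterate_until_converged(quadratic, [1.0, -2.0])
```

With the quadratic stand-in the parameters shrink geometrically toward zero, so the loop terminates well before `max_iters`; a real backbone would instead backpropagate the combined target loss through the shared network.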
PCT/CN2020/122376 2020-07-31 2020-10-21 Method, apparatus and device for training facial recognition model, and storage medium WO2021139309A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010760772.0A CN111898547B (en) 2020-07-31 2020-07-31 Training method, device, equipment and storage medium of face recognition model
CN202010760772.0 2020-07-31

Publications (1)

Publication Number Publication Date
WO2021139309A1

Family

ID=73184137

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/122376 WO2021139309A1 (en) 2020-07-31 2020-10-21 Method, apparatus and device for training facial recognition model, and storage medium

Country Status (2)

Country Link
CN (1) CN111898547B (en)
WO (1) WO2021139309A1 (en)


Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112257689A (en) * 2020-12-18 2021-01-22 北京京东尚科信息技术有限公司 Training and recognition method of face recognition model, storage medium and related equipment
CN112561062B (en) * 2020-12-18 2023-10-31 北京百度网讯科技有限公司 Neural network training method, device, computer equipment and storage medium
CN113221662B (en) * 2021-04-14 2022-09-27 上海芯翌智能科技有限公司 Training method and device of face recognition model, storage medium and terminal
CN113239876B (en) * 2021-06-01 2023-06-02 平安科技(深圳)有限公司 Training method for large-angle face recognition model
CN115797732B (en) * 2023-02-15 2023-06-09 杭州实在智能科技有限公司 Image retrieval model training method and system for open class scene

Citations (5)

Publication number Priority date Publication date Assignee Title
CN109583322A (en) * 2018-11-09 2019-04-05 长沙小钴科技有限公司 A kind of recognition of face depth network training method and system
CN109815801A (en) * 2018-12-18 2019-05-28 北京英索科技发展有限公司 Face identification method and device based on deep learning
CN110929802A (en) * 2019-12-03 2020-03-27 北京迈格威科技有限公司 Information entropy-based subdivision identification model training and image identification method and device
CN111104874A (en) * 2019-12-03 2020-05-05 北京金山云网络技术有限公司 Face age prediction method, training method and device of model and electronic equipment
CN111260032A (en) * 2020-01-14 2020-06-09 北京迈格威科技有限公司 Neural network training method, image processing method and device

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN108647583B (en) * 2018-04-19 2022-02-22 浙江大承机器人科技有限公司 Face recognition algorithm training method based on multi-target learning
WO2019223582A1 (en) * 2018-05-24 2019-11-28 Beijing Didi Infinity Technology And Development Co., Ltd. Target detection method and system
CN109934197B (en) * 2019-03-21 2023-07-07 深圳力维智联技术有限公司 Training method and device for face recognition model and computer readable storage medium

Cited By (27)

Publication number Priority date Publication date Assignee Title
CN113591637A (en) * 2021-07-20 2021-11-02 北京爱笔科技有限公司 Alignment model training method and device, computer equipment and storage medium
CN113505724A (en) * 2021-07-23 2021-10-15 上海应用技术大学 Traffic sign recognition model training method and system based on YOLOv4
CN113505724B (en) * 2021-07-23 2024-04-19 上海应用技术大学 YOLOv 4-based traffic sign recognition model training method and system
CN114119959A (en) * 2021-11-09 2022-03-01 盛视科技股份有限公司 Vision-based garbage can overflow detection method and device
WO2023118768A1 (en) * 2021-12-24 2023-06-29 Unissey Device and method for processing human face image data
FR3131419A1 (en) * 2021-12-24 2023-06-30 Unissey Device and method for processing human face image data
CN114255354A (en) * 2021-12-31 2022-03-29 智慧眼科技股份有限公司 Face recognition model training method, face recognition device and related equipment
CN114093011A (en) * 2022-01-12 2022-02-25 北京新氧科技有限公司 Hair classification method, device, equipment and storage medium
CN114093011B (en) * 2022-01-12 2022-05-06 北京新氧科技有限公司 Hair classification method, device, equipment and storage medium
CN114519757A (en) * 2022-02-17 2022-05-20 巨人移动技术有限公司 Face pinching processing method
WO2023193474A1 (en) * 2022-04-08 2023-10-12 马上消费金融股份有限公司 Information processing method and apparatus, computer device, and storage medium
CN114764899A (en) * 2022-04-12 2022-07-19 华南理工大学 Method for predicting next interactive object based on transform first visual angle
CN114764899B (en) * 2022-04-12 2024-03-22 华南理工大学 Method for predicting next interaction object based on transformation first view angle
CN115130539A (en) * 2022-04-21 2022-09-30 腾讯科技(深圳)有限公司 Classification model training method, data classification device and computer equipment
CN115641637A (en) * 2022-11-11 2023-01-24 杭州海量信息技术有限公司 Face recognition method and system for mask
CN116110100B (en) * 2023-01-14 2023-11-14 深圳市大数据研究院 Face recognition method, device, computer equipment and storage medium
CN116110100A (en) * 2023-01-14 2023-05-12 深圳市大数据研究院 Face recognition method, device, computer equipment and storage medium
CN116994309B (en) * 2023-05-06 2024-04-09 浙江大学 Face recognition model pruning method for fairness perception
CN116994309A (en) * 2023-05-06 2023-11-03 浙江大学 Face recognition model pruning method for fairness perception
CN116452922A (en) * 2023-06-09 2023-07-18 深圳前海环融联易信息科技服务有限公司 Model training method, device, computer equipment and readable storage medium
CN116452922B (en) * 2023-06-09 2023-09-22 深圳前海环融联易信息科技服务有限公司 Model training method, device, computer equipment and readable storage medium
CN116453201A (en) * 2023-06-19 2023-07-18 南昌大学 Face recognition method and system based on adjacent edge loss
CN116453201B (en) * 2023-06-19 2023-09-01 南昌大学 Face recognition method and system based on adjacent edge loss
CN116484005B (en) * 2023-06-25 2023-09-08 北京中关村科金技术有限公司 Classification model construction method, device and storage medium
CN116484005A (en) * 2023-06-25 2023-07-25 北京中关村科金技术有限公司 Classification model construction method, device and storage medium
CN117435906A (en) * 2023-12-18 2024-01-23 湖南行必达网联科技有限公司 New energy automobile configuration feature selection method based on cross entropy
CN117435906B (en) * 2023-12-18 2024-03-12 湖南行必达网联科技有限公司 New energy automobile configuration feature selection method based on cross entropy

Also Published As

Publication number Publication date
CN111898547A (en) 2020-11-06
CN111898547B (en) 2024-04-16

Similar Documents

Publication Publication Date Title
WO2021139309A1 (en) Method, apparatus and device for training facial recognition model, and storage medium
WO2021077984A1 (en) Object recognition method and apparatus, electronic device, and readable storage medium
Jalilian et al. Iris segmentation using fully convolutional encoder–decoder networks
WO2023000574A1 (en) Model training method, apparatus and device, and readable storage medium
CN109344731B (en) Lightweight face recognition method based on neural network
WO2017166586A1 (en) Image identification method and system based on convolutional neural network, and electronic device
CN111079639A (en) Method, device and equipment for constructing garbage image classification model and storage medium
KR101183391B1 (en) Image comparison by metric embeddings
CN110929848B (en) Training and tracking method based on multi-challenge perception learning model
CN105608471A (en) Robust transductive label estimation and data classification method and system
CN109740679B (en) Target identification method based on convolutional neural network and naive Bayes
CN110751027B (en) Pedestrian re-identification method based on deep multi-instance learning
EP0628190A1 (en) Method of forming a template
WO2022042043A1 (en) Machine learning model training method and apparatus, and electronic device
CN109711366A (en) A kind of recognition methods again of the pedestrian based on group information loss function
CN113361334A (en) Convolutional pedestrian re-identification method and system based on key point optimization and multi-hop attention intention
WO2023134084A1 (en) Multi-label identification method and apparatus, electronic device, and storage medium
CN110516533A (en) A kind of pedestrian based on depth measure discrimination method again
Liu et al. Learning 2d-3d correspondences to solve the blind perspective-n-point problem
TW202217597A (en) Image incremental clustering method, electronic equipment, computer storage medium thereof
CN115457332A (en) Image multi-label classification method based on graph convolution neural network and class activation mapping
CN112668482A (en) Face recognition training method and device, computer equipment and storage medium
WO2023040195A1 (en) Object recognition method and apparatus, network training method and apparatus, device, medium, and product
WO2021253938A1 (en) Neural network training method and apparatus, and video recognition method and apparatus
CN110598727B (en) Model construction method based on transfer learning, image recognition method and device thereof

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20911410

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 20911410

Country of ref document: EP

Kind code of ref document: A1