WO2020093303A1 - Processing method, apparatus, device and readable storage medium based on face recognition - Google Patents
- Publication number
- WO2020093303A1 (PCT/CN2018/114537)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- picture
- euclidean distance
- feature extraction
- face
- loss function
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
Definitions
- the present disclosure relates to the technical field of face recognition, and in particular to a face-recognition-based processing method, apparatus, device, and readable storage medium.
- face recognition is divided into four processes: face detection, face alignment, feature extraction, and feature matching.
- feature extraction is the most critical step in face recognition; the extracted features should be biased toward the features "unique" to a face, which play a decisive role in feature matching.
- most face recognition models rely on deep neural network technology, which is a method that requires training data to train model parameters. Therefore, training data has a great impact on the performance of the model.
- face recognition usually uses ordinary face data sets collected from the Internet to train the face feature extraction model.
- in this kind of face data set, there are usually dozens or even hundreds to thousands of pictures corresponding to each person.
- a face recognition model trained with this type of data cannot adapt well to face recognition in the "witness scenario", that is, the scenario, common in security monitoring and protection, in which a captured picture of a person (for example, a fugitive or a suspect) is matched against the ID pictures in a database to determine the person's identity; as a result, the accuracy of face recognition is low.
- to adapt to face recognition in the witness scenario, the face recognition model must be trained on a "witness data set", whose defining characteristic is that the photos of each person generally include only one life photo and one ID photo.
- An embodiment of the present disclosure provides a processing method based on face recognition, including:
- training a face feature extraction model using a face data set and a first loss function based on Softmax; and
- retraining the face feature extraction model using a witness data set and a second loss function based on Triplet to obtain the final face feature extraction model.
- An embodiment of the present disclosure also provides a processing device based on face recognition, including:
- the first training module is used to train the face feature extraction model using the face data set and the first loss function based on Softmax;
- the second training module is used to retrain the face feature extraction model by using the witness data set and the Triplet-based second loss function to obtain the final face feature extraction model.
- An embodiment of the present disclosure also provides a computer device including the above-mentioned processing device based on face recognition.
- An embodiment of the present disclosure also provides a computer-readable storage medium that stores computer-executable instructions that are configured to perform the above-described processing method based on face recognition.
- An embodiment of the present disclosure also provides a computer program product.
- the computer program product includes a computer program stored on a computer-readable storage medium.
- the computer program includes program instructions; when the program instructions are executed by a computer, the computer is caused to execute the above-mentioned processing method based on face recognition.
- An embodiment of the present disclosure also provides an electronic device, including:
- at least one processor; and
- a memory communicatively connected to the at least one processor; wherein,
- the memory stores instructions executable by the at least one processor, and when the instructions are executed by the at least one processor, the at least one processor is caused to execute the above-described processing method based on face recognition.
- Embodiments of the present disclosure provide a processing method, apparatus, device, and readable storage medium based on face recognition. A face feature extraction model is first trained using a face data set and a Softmax-based first loss function, and is then retrained using a witness data set and a Triplet-based second loss function to obtain the final face feature extraction model.
- training first on the ordinary face data set with the Softmax-based first loss function accelerates convergence in the subsequent Triplet-based training on the witness data set and improves the efficiency of model training; moreover, the data characteristics of the ordinary face data set and the witness data set are fully exploited, so that even a small witness data set can be used well.
- FIG. 1 is a flowchart of a processing method based on face recognition provided by an embodiment of the present disclosure
- FIG. 2 is a flowchart of another processing method based on face recognition provided by an embodiment of the present disclosure
- FIG. 3 is a schematic structural diagram of a processing device based on face recognition provided by an embodiment of the present disclosure
- FIG. 4 is a schematic structural diagram of another processing device based on face recognition according to an embodiment of the present disclosure.
- FIG. 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
- FIG. 1 is a flowchart of a processing method based on face recognition provided by an embodiment of the present disclosure. As shown in FIG. 1, the method in this embodiment may include:
- Step S101 Use the face data set and the first loss function based on Softmax to train a face feature extraction model.
- modifying the loss function is another option. Optimizing the loss function can enable the model to learn more valuable information from existing data.
- the face data set includes face images of multiple people, and each person has multiple face images.
- the face pictures in the face data set include both ID pictures of the kind used for documents such as ID cards, passports, and driving licenses, and non-ID pictures, such as life photos, that contain the face area.
- the non-ID pictures can be single photos or group photos.
- the face images in the face data set may be face images downloaded from the network and subjected to data cleaning, or may be an existing training data set used for training face recognition models, such as the LFW (Labeled Faces in the Wild) data set, the MegaFace data set, or the VGGFace2 data set; this embodiment is not specifically limited here.
- the face feature extraction model may use a network structure based on the residual network (ResNet).
- the face feature extraction model may be a residual network or a network based on one of the many variants of the residual network; for example, it may be a deep residual network or a deep pyramid residual network.
- for example, the face feature extraction model can use a 100-layer residual network, abbreviated as ResNet-100.
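As a minimal, illustrative sketch (not taken from the patent; the weight matrices, activation, and function names are assumptions), the skip connection that such residual networks stack can be expressed as:

```python
import numpy as np

def relu(v):
    """Rectified linear activation."""
    return np.maximum(v, 0.0)

def residual_block(x, w1, w2):
    """One simplified residual unit: y = x + W2 * relu(W1 * x).

    A ResNet-100 stacks on the order of 100 such units (with convolutions,
    batch normalization, etc., omitted here); shapes are illustrative.
    """
    return x + w2 @ relu(w1 @ x)
```

The identity shortcut `x + F(x)` is what lets very deep feature extractors such as ResNet-100 train stably.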
- the first loss function based on Softmax may be Softmax Loss, or other loss functions improved based on Softmax Loss, such as Cosine Loss, etc.
- the face data set and the first loss function based on Softmax are used to train a face feature extraction model, which may be implemented as follows:
- the convergence condition can be set by a technician according to actual needs, and this embodiment is not specifically limited here.
- the classification model based on the first loss function of Softmax uses the Softmax regression classification algorithm for classification training.
- the cost function of the Softmax regression algorithm can be set by the technician according to experience and actual needs, which is not specifically limited here in this embodiment.
- the training of the face feature extraction model using the face data set and the Softmax-based first loss function may also be implemented by any existing method for training a face classification model with a Softmax-based loss function, which will not be repeated here in this embodiment.
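A minimal numeric sketch of a Softmax-based loss of this kind (the function name and array shapes are my own illustration, not the patent's notation):

```python
import numpy as np

def softmax_loss(logits, labels):
    """Cross-entropy over Softmax probabilities (plain Softmax Loss).

    logits: (N, C) class scores from the model's classification head;
    labels: (N,) integer identity labels.
    """
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Mean negative log-probability assigned to the correct identity.
    return float(-log_probs[np.arange(len(labels)), labels].mean())
```

Variants such as Cosine Loss modify how the logits are formed from the features while keeping this cross-entropy form.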
- Step S102 Re-train the face feature extraction model using the witness data set and the Triplet-based second loss function to obtain the final face feature extraction model.
- the witness data set includes face pictures of multiple people, and each person's face pictures include at least one ID picture and one non-ID picture.
- in this embodiment, each person's face pictures in the witness data set may include one ID picture and one non-ID picture.
- the second loss function based on Triplet may be Triplet Loss or other loss functions after modification or improvement based on Triplet Loss.
- Triplet Loss is a loss function in deep learning used for training on samples with small differences, such as faces.
- Triplet Loss calculates the loss based on a triplet composed of three pictures.
- the triplet is composed of an anchor (Anchor) example, a positive (Positive) example, and a negative (Negative) example.
- the face feature of any picture can be used as the anchor example; a second picture belonging to the same person as that picture is its positive example, and a third picture not belonging to the same person is its negative example.
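Under the definitions above, a minimal sketch of a Triplet-style loss over batches of embeddings (the margin value and function names are illustrative assumptions, not from the patent):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet Loss over (N, D) batches of face embeddings.

    anchor and positive come from the same person; negative does not.
    The margin of 0.2 is an illustrative choice.
    """
    d_pos = np.linalg.norm(anchor - positive, axis=1)  # same-person distance
    d_neg = np.linalg.norm(anchor - negative, axis=1)  # cross-person distance
    # Penalise triplets where the positive is not closer than the negative
    # by at least the margin.
    return float(np.maximum(d_pos - d_neg + margin, 0.0).mean())
```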
- the retraining of the face feature extraction model using the witness data set and the Triplet-based second loss function to obtain the final face feature extraction model may be implemented as follows:
- multiple groups of training data are selected from the witness data set, where each group includes three training pictures, two of which belong to the same person; the face feature extraction model obtained after the training in step S101 extracts the facial features of the three training pictures in each group, and the facial features of the three pictures in each group form a triplet, resulting in multiple triplets; the triplets are input into the Triplet model based on the second loss function for model retraining until the model meets the preset conditions, at which point the final parameters of the face feature extraction model are obtained and the face feature extraction model is saved.
- inputting the triplets into the Triplet model based on the Triplet-based second loss function for model retraining may specifically be implemented as follows:
- the preset condition may be set by a technician according to actual needs, for example, it may be a set convergence condition, which is not specifically limited here in this embodiment.
- the first training picture and the second training picture are the ID picture and the non-ID picture of the same person, and the third training picture does not belong to the same person as the second training picture.
- calculating the value of the second loss function corresponding to multiple triples may be implemented in the following manner:
- the first Euclidean distance refers to the Euclidean distance between the face features of the two training pictures belonging to the same person in a triplet;
- the second Euclidean distance refers to the Euclidean distance between the face features of the two training pictures that do not belong to the same person in the triplet;
- the second loss function value corresponding to the multiple triplets is calculated based on the first Euclidean distance and the second Euclidean distance of each triplet.
- the second loss function value corresponding to each triplet can be calculated from its first Euclidean distance and second Euclidean distance, and whether the preset condition is met is then judged according to the second loss function values corresponding to all triplets. If the preset condition is met, the current face feature extraction model is used as the final face feature extraction model; if the preset condition is not met, execution jumps back to the step of selecting multiple triplets from the witness data set, and training of the face feature extraction model continues.
- alternatively, the triplets can first be screened according to the first Euclidean distance and the second Euclidean distance of each triplet, keeping only the effective triplets whose first Euclidean distance is smaller than the second Euclidean distance. The second loss function values corresponding to the effective triplets are then calculated from their first and second Euclidean distances, and whether these values meet the preset condition is judged: if so, the current face feature extraction model is used as the final face feature extraction model; if not, execution jumps back to the step of selecting multiple triplets from the witness data set, and training continues. Screening the triplets yields the effective triplets that are more helpful for optimizing the face feature extraction model, so the model trained on these effective triplets is better.
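The screening step can be sketched as follows (a hedged illustration; the margin and function name are assumptions, since the patent does not fix the exact loss formula):

```python
import numpy as np

def effective_triplet_loss(d_pos, d_neg, margin=0.2):
    """Second-loss-function value over "effective" triplets only.

    d_pos: first Euclidean distances (same-person pairs) per triplet;
    d_neg: second Euclidean distances (cross-person pairs) per triplet.
    Only triplets with d_pos < d_neg are kept, as described above.
    """
    valid = d_pos < d_neg                      # screen out non-effective triplets
    if not valid.any():
        return 0.0
    losses = np.maximum(d_pos[valid] - d_neg[valid] + margin, 0.0)
    return float(losses.mean())
```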
- in this embodiment, the face feature extraction model is first trained using the face data set and the Softmax-based first loss function, and is then retrained using the witness data set and the Triplet-based second loss function. Training on the witness data set improves the accuracy with which the face feature model extracts face features when applied to the witness scenario described above, and can improve the accuracy of face recognition.
- FIG. 2 is a flowchart of another processing method based on face recognition provided by an embodiment of the present disclosure. Based on the embodiment shown in FIG. 1, in this embodiment, after training to obtain the final face feature extraction model, it also includes a process of using the face feature extraction model to perform face recognition. As shown in Figure 2, the method also includes the following steps:
- Step S201 Acquire a character picture to be processed.
- the face feature extraction model in this embodiment can be applied to a variety of face recognition system application scenarios for security monitoring and security protection.
- the picture of the person to be processed can be obtained through an image acquisition device.
- the image acquisition device can be a pre-arranged monitoring device, or a monitoring device pre-installed in a community, on a road, and so on; the picture of the person to be processed may be a picture of a monitored subject captured by the monitoring device, a picture of a suspected fugitive captured at the roadside, etc.
- the method for obtaining the person picture to be processed and the specific type of that picture are not specifically limited here.
- Step S202 Extract the facial features of the person picture through the facial feature extraction model.
- the character image to be processed is input into a facial feature extraction model to extract facial features of the character image.
- Step S203 Calculate the Euclidean distance between the facial features of the person picture and the facial features of each ID picture in the picture library.
- Step S204 According to the Euclidean distance between the facial features of the character picture and the facial features of each ID picture in the picture library, determine the ID picture matching the character picture.
- the picture library includes the identification pictures of all persons with known identities, and there is at least one identification picture for each person.
- the facial feature extraction model can be used in advance to extract and store the facial features of each ID picture in the picture library of the face recognition system, so that during face recognition the face features of each ID picture in the picture library can be read directly, which can improve the efficiency of face recognition.
- any existing technique for calculating the Euclidean distance between the feature data of two pictures can be used here, and this embodiment will not repeat it.
- this step can be implemented as follows:
- the Euclidean distance between the facial features of the person picture and those of an ID picture can reflect the difference between the faces in the two pictures.
- the ID picture with the smallest Euclidean distance from the face features of the character picture is determined as the ID picture matching the character picture, and the only ID picture matching the character picture is obtained.
- this step can also be implemented as follows:
- each ID picture whose Euclidean distance from the face features of the person picture is less than the preset distance threshold is determined as an ID picture matching the person picture, so that at least one ID picture matching the person picture is obtained.
- the preset distance threshold can be set by a technician through statistical analysis, over the last round of training or verification of the face feature extraction model in the embodiment shown in FIG. 1, of the Euclidean distances between the face features of two training pictures belonging to the same person and of the Euclidean distances between the face features of two training pictures not belonging to the same person; alternatively, the preset distance threshold can be set by the technician according to practical experience, which is not specifically limited in this embodiment.
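The two matching strategies in steps S203 and S204 can be sketched together (the function name, shapes, and threshold handling are illustrative assumptions):

```python
import numpy as np

def match_id_pictures(query_feat, gallery_feats, threshold=None):
    """Match a person picture's features against pre-extracted ID-picture features.

    query_feat: (D,) features of the person picture to be processed;
    gallery_feats: (M, D) features of the ID pictures in the picture library.
    """
    dists = np.linalg.norm(gallery_feats - query_feat, axis=1)
    if threshold is None:
        # Unique match: the ID picture with the smallest Euclidean distance.
        return int(np.argmin(dists))
    # All ID pictures whose distance is below the preset distance threshold.
    return np.flatnonzero(dists < threshold)
```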
- in this embodiment, the face feature extraction model is trained using the face data set and the Softmax-based first loss function and is then retrained using the witness data set and the Triplet-based second loss function. With the resulting face feature extraction model, the face features of the person picture to be processed and of the ID pictures in the picture library can be extracted accurately, the person picture can then be accurately matched against the ID pictures in the picture library, and the accuracy of face recognition is improved.
- FIG. 3 is a schematic structural diagram of a processing device based on face recognition provided by an embodiment of the present disclosure.
- the processing device based on face recognition provided by the embodiment of the present disclosure may execute the processing flow provided by the embodiment of the processing method based on face recognition.
- the processing device 30 based on face recognition includes: a first training module 301 and a second training module 302.
- the first training module 301 is used to train a face feature extraction model using a face data set and a first loss function based on Softmax.
- the second training module 302 is used to retrain the face feature extraction model using the witness data set and the Triplet-based second loss function to obtain the final face feature extraction model.
- the witness data set includes face pictures of multiple persons, and each person's face picture includes at least one document picture and one non-document picture.
- the second training module 302 is also used to:
- selecting multiple groups of training data from the witness data set, where each group includes three training pictures, two of which belong to the same person; extracting, by the face feature extraction model, the facial features of the three training pictures in each group, the facial features of the three pictures in each group forming a triplet, so that multiple triplets are obtained; calculating the second loss function values corresponding to the multiple triplets; if the second loss function values corresponding to the multiple triplets meet the preset condition, using the current face feature extraction model as the final face feature extraction model; if they do not meet the preset condition, jumping back to the step of selecting multiple triplets from the witness data set.
- the first training picture and the second training picture are the ID picture and the non-ID picture of the same person, and the third training picture does not belong to the same person as the second training picture.
- the second training module 302 is also used to:
- the first Euclidean distance refers to the Euclidean distance between the face features of the two training pictures belonging to the same person in a triplet;
- the second Euclidean distance refers to the Euclidean distance between the face features of the two training pictures that do not belong to the same person in the triplet;
- the second loss function value corresponding to the multiple triplets is calculated based on the first Euclidean distance and the second Euclidean distance of each triplet.
- the second training module 302 is also used to:
- the second loss function value corresponding to each triplet is calculated.
- the second training module 302 is also used to:
- selecting the effective triplets whose first Euclidean distance is smaller than the second Euclidean distance, and calculating the second loss function values corresponding to the effective triplets according to their first and second Euclidean distances.
- the first loss function is Softmax Loss or Cosine Loss.
- the face feature extraction model uses a network structure based on the residual network ResNet.
- the device provided by the embodiment of the present disclosure may be specifically used to execute the method embodiment shown in FIG. 1, and specific functions will not be repeated here.
- in this embodiment, the face feature extraction model is first trained using the face data set and the Softmax-based first loss function, and is then retrained using the witness data set and the Triplet-based second loss function. Training first with the Softmax-based first loss function accelerates convergence in the subsequent Triplet-based training on the witness data set and improves the efficiency of model training; moreover, the data characteristics of the ordinary face data set and the witness data set are fully exploited, so that even a small witness data set can be used well through the Triplet-based second loss function to train the face feature model. This improves the accuracy of face feature extraction when the model is applied to the witness scenario described above, and can improve the accuracy of face recognition.
- the processing device 30 based on face recognition further includes: a face recognition module 303.
- the face recognition module 303 is used to:
- the face recognition module 303 is also used to:
- the facial feature extraction model extracts the facial features of each ID picture in the image library.
- the face recognition module 303 is also used to:
- the ID picture with the smallest Euclidean distance from the facial features of the person picture is determined as the ID picture matching the person picture.
- the face recognition module 303 is also used to:
- each ID picture whose Euclidean distance from the face features of the person picture is less than the preset distance threshold is determined as an ID picture matching the person picture.
- the device provided by the embodiment of the present disclosure may be specifically used to execute the method embodiment shown in FIG. 2, and specific functions will not be repeated here.
- in this embodiment, the face feature extraction model is trained using the face data set and the Softmax-based first loss function and is then retrained using the witness data set and the Triplet-based second loss function. With the resulting face feature extraction model, the face features of the person picture to be processed and of the ID pictures in the picture library can be extracted accurately, the person picture can then be accurately matched against the ID pictures in the picture library, and the accuracy of face recognition is improved.
- An embodiment of the present disclosure also provides a computer device, including a processing device based on face recognition provided in any of the above embodiments.
- An embodiment of the present disclosure also provides a computer-readable storage medium that stores computer-executable instructions that are configured to perform the above-described processing method based on face recognition.
- An embodiment of the present disclosure also provides a computer program product.
- the computer program product includes a computer program stored on a computer-readable storage medium.
- the computer program includes program instructions; when the program instructions are executed by a computer, the computer is caused to execute the aforementioned processing method based on face recognition.
- the aforementioned computer-readable storage medium may be a transient computer-readable storage medium or a non-transitory computer-readable storage medium.
- An embodiment of the present disclosure also provides an electronic device, whose structure is shown in FIG. 5, the electronic device includes:
- at least one processor 100 (one processor 100 is taken as an example in FIG. 5) and a memory 101; the electronic device may further include a communication interface 102 and a bus 103.
- the processor 100, the communication interface 102, and the memory 101 can complete communication with each other through the bus 103.
- the communication interface 102 can be used for information transmission.
- the processor 100 may call logic instructions in the memory 101 to execute the processing method based on face recognition in the above embodiments.
- logic instructions in the memory 101 described above can be implemented in the form of software functional units and sold or used as independent products, and can be stored in a computer-readable storage medium.
- the memory 101 is a computer-readable storage medium and can be used to store software programs and computer-executable programs, such as program instructions / modules corresponding to the methods in the embodiments of the present disclosure.
- the processor 100 executes functional applications and data processing by running software programs, instructions, and modules stored in the memory 101, that is, implementing the processing method based on face recognition in the foregoing method embodiments.
- the memory 101 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and application programs required for at least one function; the storage data area may store data created according to the use of a terminal device and the like.
- the memory 101 may include a high-speed random access memory, and may also include a non-volatile memory.
- the technical solutions of the embodiments of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes one or more instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present disclosure.
- the aforementioned storage medium may be a non-transitory storage medium capable of storing program code, such as a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk, or it may also be a transient storage medium.
- although the terms first, second, etc. may be used in this application to describe various elements, these elements should not be limited by these terms; these terms are only used to distinguish one element from another.
- a first element could be called a second element, and likewise a second element could be called a first element, as long as all occurrences of the "first element" are renamed consistently and all occurrences of the "second element" are renamed consistently.
- the first element and the second element are both elements, but they may not be the same element.
- the various aspects, implementations, implementations or features in the described embodiments can be used alone or in any combination.
- Various aspects in the described embodiments may be implemented by software, hardware, or a combination of software and hardware.
- the described embodiments may also be embodied by a computer-readable medium that stores computer-readable code including instructions executable by at least one computing device.
- the computer-readable medium can be associated with any data storage device capable of storing data, which can be read by a computer system.
- examples of the computer-readable medium include read-only memory, random access memory, CD-ROM, HDD, DVD, magnetic tape, optical data storage devices, and the like.
- the computer-readable medium may also be distributed in computer systems connected through a network, so that computer-readable codes can be stored and executed in a distributed manner.
Abstract
A face-recognition-based processing method, apparatus, device, and readable storage medium. The method first trains a facial feature extraction model using a face dataset and a Softmax-based first loss function (S101), and then retrains the facial feature extraction model using an ID-photo dataset and a Triplet-based second loss function to obtain the final facial feature extraction model (S102). This accelerates convergence when the model is trained on the ID-photo dataset with the Triplet-based second loss function, improving training efficiency. Even when the ID-photo dataset is small, the Triplet-based second loss function can still make good use of it to train the facial feature model. The method fully exploits the data characteristics of both the face dataset and the ID-photo dataset, improves the accuracy with which the facial feature model extracts facial features when applied to the "person-to-ID scenario", and thereby improves face recognition accuracy.
Description
The present disclosure relates to the technical field of face recognition, for example, to a face-recognition-based processing method, apparatus, device, and readable storage medium.

In application scenarios such as security surveillance and security protection, in order to track a particular person (e.g., a fugitive or a suspect), it is often necessary to identify, from a database, the ID picture that matches a captured picture of that person so as to determine the person's identity. This kind of face recognition scenario, in which a single picture of a person is matched against ID pictures, may be called the "person-to-ID scenario" (人证场景).

Face recognition generally comprises four stages: face detection, face alignment, feature extraction, and feature matching. Feature extraction is the most critical step: the extracted features should lean toward what is "unique" to that face, and they play a decisive role in feature matching. At present, the vast majority of face recognition models rely on deep neural network technology, which requires training data to learn the model parameters; the training data therefore has a great influence on model performance.

Currently, facial feature extraction models are usually trained on ordinary face datasets collected from the Internet, in which one person typically corresponds to dozens or even hundreds or thousands of pictures. A face recognition model trained on such data does not adapt well to face recognition in the person-to-ID scenario, resulting in low recognition accuracy. To adapt to this scenario, the face recognition model must be trained on an "ID-photo dataset" (人证数据集), whose characteristic is that each person generally has only one everyday photo and one ID photo.
Summary of the Invention
An embodiment of the present disclosure provides a face-recognition-based processing method, comprising:

training a facial feature extraction model using a face dataset and a Softmax-based first loss function; and

retraining the facial feature extraction model using an ID-photo dataset and a Triplet-based second loss function to obtain a final facial feature extraction model.

An embodiment of the present disclosure further provides a face-recognition-based processing apparatus, comprising:

a first training module, configured to train a facial feature extraction model using a face dataset and a Softmax-based first loss function; and

a second training module, configured to retrain the facial feature extraction model using an ID-photo dataset and a Triplet-based second loss function to obtain a final facial feature extraction model.

An embodiment of the present disclosure further provides a computer device comprising the above face-recognition-based processing apparatus.

An embodiment of the present disclosure further provides a computer-readable storage medium storing computer-executable instructions configured to perform the above face-recognition-based processing method.

An embodiment of the present disclosure further provides a computer program product comprising a computer program stored on a computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the above face-recognition-based processing method.

An embodiment of the present disclosure further provides an electronic device, comprising:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to perform the above face-recognition-based processing method.

The embodiments of the present disclosure provide a face-recognition-based processing method, apparatus, device, and readable storage medium. A facial feature extraction model is first trained using a face dataset and a Softmax-based first loss function, and then retrained using an ID-photo dataset and a Triplet-based second loss function to obtain the final facial feature extraction model. Pretraining on the ordinary face dataset with the Softmax-based first loss function accelerates the convergence of the subsequent Triplet-based retraining on the ID-photo dataset, improving training efficiency. Moreover, the data characteristics of both the ordinary face dataset and the ID-photo dataset are fully exploited: even when the ID-photo dataset is small, the Triplet-based second loss function can still make good use of it to train the facial feature model, improving the accuracy with which the model extracts facial features in the above "person-to-ID scenario" and thereby improving face recognition accuracy.
One or more embodiments are illustrated by way of example in the corresponding accompanying drawings; these illustrations and drawings do not limit the embodiments. Elements with the same reference numerals in the drawings denote similar elements, and the drawings are not drawn to scale.

FIG. 1 is a flowchart of a face-recognition-based processing method provided by an embodiment of the present disclosure;

FIG. 2 is a flowchart of another face-recognition-based processing method provided by an embodiment of the present disclosure;

FIG. 3 is a schematic structural diagram of a face-recognition-based processing apparatus provided by an embodiment of the present disclosure;

FIG. 4 is a schematic structural diagram of another face-recognition-based processing apparatus provided by an embodiment of the present disclosure;

FIG. 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.

To provide a more detailed understanding of the features and technical content of the embodiments of the present disclosure, their implementation is described in detail below with reference to the accompanying drawings, which are for reference only and are not intended to limit the embodiments. In the following technical description, numerous details are given for ease of explanation to provide a thorough understanding of the disclosed embodiments; however, one or more embodiments may still be practiced without these details. In other cases, well-known structures and devices may be shown in simplified form to simplify the drawings.
An embodiment of the present disclosure provides a face-recognition-based processing method. FIG. 1 is a flowchart of the face-recognition-based processing method provided by an embodiment of the present disclosure. As shown in FIG. 1, the method in this embodiment may include:

Step S101: train a facial feature extraction model using a face dataset and a Softmax-based first loss function.

Face recognition generally comprises four stages: face detection, face alignment, feature extraction, and feature matching. Feature extraction is the most critical step: the extracted features should lean toward what is "unique" to that face and play a decisive role in feature matching. Besides optimizing the network structure, modifying the loss function is another way to improve the performance of a face recognition model; an optimized loss function allows the model to learn more valuable information from the available data.

In this embodiment, the face dataset comprises face pictures of multiple persons, with multiple pictures per person. The face pictures include both ID pictures used for documents such as identity cards, passports, and driver's licenses, and non-ID pictures, such as everyday photos containing a face region. A non-ID picture may be an individual or a group photo. In addition, the face pictures may be downloaded from the Internet and cleaned, or may come from existing training datasets for face recognition models, such as the LFW (Labeled Faces in the Wild), MegaFace, or VGGFace2 datasets; this embodiment imposes no specific limitation here.

Optionally, the facial feature extraction model may adopt a network structure based on a residual network (ResNet). Specifically, it may be a residual network or any of its variants, e.g., a deep residual network or a deep pyramidal residual network; this embodiment imposes no specific limitation here.

For example, the facial feature extraction model may use a 100-layer residual network, abbreviated ResNet-100.
Optionally, the Softmax-based first loss function may be Softmax Loss, or another loss function improved from Softmax Loss, such as Cosine Loss.
In this embodiment, training the facial feature extraction model using the face dataset and the Softmax-based first loss function may specifically be implemented as follows:

For the face pictures in the face dataset, extract their facial features with the facial feature extraction model to be trained, and feed the extracted features into a classification model with the Softmax-based first loss function for classification training until the classification results satisfy a preset convergence condition, thereby obtaining the parameters of the facial feature extraction model.
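The classification training just described can be sketched minimally as follows. This is not the disclosure's implementation: the feature dimension, the linear classifier `weights`, and the function name `softmax_loss` are illustrative assumptions, and the backbone features are taken as already extracted.

```python
import numpy as np

def softmax_loss(features, labels, weights):
    """Softmax Loss (cross-entropy) on top of extracted facial features.

    features: (batch, dim) facial features from the backbone (assumed given).
    labels:   (batch,) integer identity labels.
    weights:  (dim, num_classes) classifier weights, one column per identity.
    """
    logits = features @ weights
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    # negative log-likelihood of each sample's true identity, averaged
    return -np.log(probs[np.arange(len(labels)), labels]).mean()
```

During training, this value would be minimized with respect to both the backbone and the classifier parameters until the preset convergence condition is met.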
The convergence condition may be set by a technician according to actual needs and is not specifically limited in this embodiment.

The classification model with the Softmax-based first loss function performs classification training using the Softmax regression classification algorithm; the cost function of the Softmax regression algorithm may be set by a technician according to experience and actual needs and is not specifically limited in this embodiment.

It should be noted that, in this embodiment, training the facial feature extraction model using the face dataset and the Softmax-based first loss function may also be implemented by any existing method that trains a face classification model with a Softmax-based loss function, which is not repeated here.
Step S102: retrain the facial feature extraction model using an ID-photo dataset and a Triplet-based second loss function to obtain the final facial feature extraction model.

The ID-photo dataset comprises face pictures of multiple persons; each person's face pictures include at least one ID picture and one non-ID picture.

Preferably, to suit scenarios in which only one ID picture is available (e.g., the face picture on an identity card), each person's face pictures in the ID-photo dataset of this embodiment may consist of one ID picture and one non-ID picture.

The Triplet-based second loss function may be Triplet Loss, or another loss function derived or improved from Triplet Loss.

Triplet Loss is a loss function in deep learning used for training on samples with small inter-class differences, such as faces. It computes the loss from a triplet formed from three pictures: an anchor example, a positive example, and a negative example. The facial features of any picture can serve as the anchor; a second picture belonging to the same person is its positive example, and a third picture belonging to a different person is its negative example. By optimizing so that the anchor-positive distance is smaller than the anchor-negative distance, picture similarity is computed.
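The triplet computation just described can be sketched as follows. This is a generic Triplet Loss over Euclidean distances; the margin value of 0.2 and the function name are illustrative assumptions, not values stated in the disclosure.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet Loss over batches of facial feature vectors.

    anchor/positive/negative: (batch, dim) arrays of facial features.
    Per-triplet loss: max(0, d(a, p) - d(a, n) + margin), where d is the
    Euclidean distance; the batch loss is the mean over triplets.
    """
    d_ap = np.linalg.norm(anchor - positive, axis=1)  # same-person distance
    d_an = np.linalg.norm(anchor - negative, axis=1)  # different-person distance
    return np.maximum(0.0, d_ap - d_an + margin).mean()
```

A loss of zero means every anchor is already closer to its positive than to its negative by at least the margin, which is exactly the ordering the retraining step optimizes for.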
Specifically, retraining the facial feature extraction model using the ID-photo dataset and the Triplet-based second loss function to obtain the final facial feature extraction model may be implemented as follows:

For the face pictures in the ID-photo dataset, select multiple groups of training data from the dataset, each group comprising three training pictures, two of which belong to the same person; extract, with the facial feature extraction model obtained after the training in step S101, the facial features of the three training pictures in each group, the facial features of the three pictures in each group forming one triplet, so as to obtain multiple triplets; feed the triplets into the Triplet model with the Triplet-based second loss function for retraining until the model satisfies a preset condition, obtain the final parameters of the facial feature extraction model, and save the facial feature extraction model.
Specifically, feeding the triplets into the Triplet model with the Triplet-based second loss function for retraining may be implemented as follows:

Compute the second loss function values corresponding to the multiple triplets, and judge whether these values satisfy a preset condition. If they do, take the current facial feature extraction model as the final facial feature extraction model; if they do not, return to the step of selecting multiple triplets from the ID-photo dataset and continue training the facial feature extraction model.

The preset condition may be set by a technician according to actual needs; for example, it may be a predefined convergence condition, and this embodiment imposes no specific limitation here.
Optionally, among the three training pictures of each group, the first and second training pictures are an ID picture and a non-ID picture of the same person, while the third training picture does not belong to the same person as the second training picture.
Further, computing the second loss function values corresponding to the multiple triplets may specifically be implemented as follows:

Compute a first Euclidean distance and a second Euclidean distance for each triplet, where the first Euclidean distance is the Euclidean distance between the facial features of the two training pictures in the triplet that belong to the same person, and the second Euclidean distance is the Euclidean distance between the facial features of the two training pictures in the triplet that do not belong to the same person; then compute the second loss function values of the multiple triplets from each triplet's first and second Euclidean distances.
Optionally, after the first and second Euclidean distances of each triplet have been computed, the second loss function value of each triplet may be computed from them, and whether the values of all triplets satisfy the preset condition is then judged. If so, the current facial feature extraction model is taken as the final facial feature extraction model; if not, the process returns to the step of selecting multiple triplets from the ID-photo dataset and training continues.
Optionally, after the first and second Euclidean distances of each triplet have been computed, the triplets may first be screened according to these distances: the valid triplets whose first Euclidean distance is smaller than their second Euclidean distance are selected, the second loss function values of these valid triplets are computed from their first and second Euclidean distances, and whether these values satisfy the preset condition is judged. If so, the current facial feature extraction model is taken as the final facial feature extraction model; if not, the process returns to the step of selecting multiple triplets from the ID-photo dataset and training continues. Screening the triplets yields valid triplets that are more helpful for optimizing the facial feature extraction model, and a model trained on these valid triplets performs better.
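A minimal sketch of this screening step, assuming the per-triplet first and second Euclidean distances have already been computed; the margin value and function name are illustrative assumptions, not part of the disclosure.

```python
import numpy as np

def valid_triplet_loss(d_first, d_second, margin=0.2):
    """Second-loss value computed only over 'valid' triplets, i.e. those
    whose first Euclidean distance (same-person pair) is smaller than the
    second Euclidean distance (different-person pair).

    d_first, d_second: (num_triplets,) arrays of Euclidean distances.
    Returns (loss, mask); loss is 0.0 when no triplet is valid.
    """
    mask = d_first < d_second                      # screen out other triplets
    if not mask.any():
        return 0.0, mask
    loss = np.maximum(0.0, d_first[mask] - d_second[mask] + margin).mean()
    return loss, mask
```

The boolean mask identifies which sampled triplets survive the screening and so would contribute to the retraining round.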
In this embodiment of the present disclosure, the facial feature extraction model is first trained using a face dataset and a Softmax-based first loss function, and then retrained using an ID-photo dataset and a Triplet-based second loss function to obtain the final facial feature extraction model. Pretraining on an ordinary face dataset with the Softmax-based first loss function accelerates the convergence of the subsequent Triplet-based retraining on the ID-photo dataset and improves training efficiency. Moreover, the data characteristics of both the ordinary face dataset and the ID-photo dataset are fully exploited: even when the ID-photo dataset is small, the Triplet-based second loss function can still make good use of it to train the facial feature model, improving the accuracy with which the model extracts facial features in the above "person-to-ID scenario" and thereby improving face recognition accuracy.
FIG. 2 is a flowchart of another face-recognition-based processing method provided by an embodiment of the present disclosure. On the basis of the embodiment shown in FIG. 1, this embodiment further includes, after the final facial feature extraction model has been trained, a process of performing face recognition with that model. As shown in FIG. 2, the method further includes the following steps:

Step S201: acquire a person picture to be processed.

The facial feature extraction model of this embodiment can be applied in various face recognition systems for security surveillance and security protection. The person picture to be processed may be acquired by an image capture device, which may be a pre-deployed surveillance device or one installed in places such as residential areas and roads; the person picture may be a picture of a monitored target taken by a surveillance device, a roadside snapshot of a fleeing suspect, and so on. This embodiment does not specifically limit the acquisition method or the type of the person picture to be processed.
Step S202: extract the facial features of the person picture with the facial feature extraction model.

The person picture to be processed is input into the facial feature extraction model, which extracts the facial features of the person picture.
Step S203: compute the Euclidean distance between the facial features of the person picture and the facial features of each ID picture in the picture library.

Step S204: determine, according to the Euclidean distances between the facial features of the person picture and those of each ID picture in the picture library, the ID picture matching the person picture.
The picture library contains the ID pictures of all persons of known identity, with at least one ID picture per person.

Optionally, after the final facial feature extraction model is obtained, it may be used to extract and store the facial features of every ID picture in the picture library of the face recognition system, so that these features can be read directly whenever face recognition is needed, improving recognition efficiency.

In this embodiment, the Euclidean distance between the facial features of the person picture and those of any ID picture may be computed by any existing method for computing the Euclidean distance between the feature data of two pictures, which is not repeated here.
Specifically, this step may be implemented as follows:

Since the Euclidean distance between the facial features of two pictures reflects the difference between the two faces, the ID picture whose facial features have the smallest Euclidean distance to those of the person picture is determined as the ID picture matching the person picture, yielding a unique matching ID picture.
Optionally, this step may also be implemented as follows:

According to the Euclidean distances between the facial features of the person picture and those of each ID picture in the picture library, the ID pictures whose Euclidean distance to the facial features of the person picture is smaller than a preset distance threshold are determined as ID pictures matching the person picture, yielding at least one matching ID picture.
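Both matching strategies (the single closest ID picture, and all ID pictures within a preset distance threshold) can be sketched as follows; the function name, feature shapes, and threshold value are illustrative assumptions.

```python
import numpy as np

def match_id_photos(query_feat, gallery_feats, threshold=None):
    """Match a person picture against an ID-picture gallery by Euclidean distance.

    query_feat:    (dim,) facial feature of the captured person picture.
    gallery_feats: (num_photos, dim) facial features of the library's ID pictures.
    With threshold=None, return the index of the single closest ID picture;
    otherwise return the indices of all ID pictures closer than the threshold.
    """
    dists = np.linalg.norm(gallery_feats - query_feat, axis=1)
    if threshold is None:
        return int(dists.argmin())                 # unique best match
    return np.flatnonzero(dists < threshold)       # all sufficiently close matches
```

The threshold variant corresponds to the preset-distance-threshold matching above and may return several candidate identities for manual review.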
The preset distance threshold may be obtained by a technician by statistically analyzing, from the final round of training or validation of the facial feature extraction model in the embodiment shown in FIG. 1, the Euclidean distances between the facial features of the same-person training picture pairs and of the different-person training picture pairs across the multiple triplets; alternatively, the preset distance threshold may be set by a technician based on practical experience. This embodiment imposes no specific limitation here.

In this embodiment of the present disclosure, the facial feature extraction model obtained by first training on a face dataset with the Softmax-based first loss function and then retraining on an ID-photo dataset with the Triplet-based second loss function can accurately extract the facial features of the person picture to be processed and of the ID pictures in the picture library, so that the person picture can be accurately matched against the ID pictures in the picture library, improving the accuracy of face recognition.
An embodiment of the present disclosure further provides a face-recognition-based processing apparatus. FIG. 3 is a schematic structural diagram of the face-recognition-based processing apparatus provided by an embodiment of the present disclosure. The apparatus can execute the processing flow provided by the method embodiments. As shown in FIG. 3, the face-recognition-based processing apparatus 30 includes a first training module 301 and a second training module 302.

Specifically, the first training module 301 is configured to train a facial feature extraction model using a face dataset and a Softmax-based first loss function.

The second training module 302 is configured to retrain the facial feature extraction model using an ID-photo dataset and a Triplet-based second loss function to obtain the final facial feature extraction model.
Optionally, the ID-photo dataset comprises face pictures of multiple persons, and each person's face pictures include at least one ID picture and one non-ID picture.

Optionally, the second training module 302 is further configured to:

select multiple groups of training data from the ID-photo dataset, each group comprising three training pictures, two of which belong to the same person; extract, with the facial feature extraction model, the facial features of the three training pictures in each group, the facial features of the three pictures in each group forming one triplet, so as to obtain multiple triplets; compute the second loss function values corresponding to the multiple triplets; if the values satisfy a preset condition, take the current facial feature extraction model as the final facial feature extraction model; if not, return to the step of selecting multiple triplets from the ID-photo dataset.

Optionally, among the three training pictures of each group, the first and second training pictures are an ID picture and a non-ID picture of the same person, and the third training picture does not belong to the same person as the second training picture.

Optionally, the second training module 302 is further configured to:

compute a first Euclidean distance and a second Euclidean distance for each triplet, where the first Euclidean distance is the Euclidean distance between the facial features of the two training pictures in the triplet that belong to the same person, and the second Euclidean distance is that between the facial features of the two training pictures in the triplet that do not belong to the same person; and compute the second loss function values of the multiple triplets from each triplet's first and second Euclidean distances.

Optionally, the second training module 302 is further configured to:

compute each triplet's second loss function value from its first and second Euclidean distances.

Optionally, the second training module 302 is further configured to:

screen out, according to each triplet's first and second Euclidean distances, the valid triplets whose first Euclidean distance is smaller than their second Euclidean distance; and compute the second loss function values of the valid triplets from their first and second Euclidean distances.

Optionally, the first loss function is Softmax Loss or Cosine Loss.

Optionally, the facial feature extraction model adopts a network structure based on the residual network ResNet.
The apparatus provided by this embodiment of the present disclosure can specifically be used to execute the method embodiment shown in FIG. 1 above; its specific functions are not repeated here.

In this embodiment of the present disclosure, the facial feature extraction model is first trained using a face dataset and a Softmax-based first loss function, and then retrained using an ID-photo dataset and a Triplet-based second loss function to obtain the final facial feature extraction model. Pretraining on an ordinary face dataset with the Softmax-based first loss function accelerates the convergence of the subsequent Triplet-based retraining on the ID-photo dataset and improves training efficiency. Moreover, the data characteristics of both the ordinary face dataset and the ID-photo dataset are fully exploited: even when the ID-photo dataset is small, the Triplet-based second loss function can still make good use of it to train the facial feature model, improving the accuracy with which the model extracts facial features in the above "person-to-ID scenario" and thereby improving face recognition accuracy.
FIG. 4 is a schematic structural diagram of another face-recognition-based processing apparatus provided by an embodiment of the present disclosure. On the basis of the above embodiments, in this embodiment, as shown in FIG. 4, the face-recognition-based processing apparatus 30 further includes a face recognition module 303.

Specifically, the face recognition module 303 is configured to:

acquire a person picture to be processed; extract the facial features of the person picture with the facial feature extraction model; compute the Euclidean distance between the facial features of the person picture and the facial features of each ID picture in the picture library; and determine, according to those Euclidean distances, the ID picture matching the person picture.

Optionally, the face recognition module 303 is further configured to:

extract, with the facial feature extraction model, the facial features of each ID picture in the picture library.

Optionally, the face recognition module 303 is further configured to:

determine, according to the Euclidean distances between the facial features of the person picture and those of each ID picture in the picture library, the ID picture whose facial features have the smallest Euclidean distance to those of the person picture as the ID picture matching the person picture.

Optionally, the face recognition module 303 is further configured to:

determine, according to the Euclidean distances between the facial features of the person picture and those of each ID picture in the picture library, the ID pictures whose Euclidean distance to the facial features of the person picture is smaller than a preset distance threshold as ID pictures matching the person picture.

The apparatus provided by this embodiment of the present disclosure can specifically be used to execute the method embodiment shown in FIG. 2 above; its specific functions are not repeated here.

In this embodiment of the present disclosure, the facial feature extraction model obtained by first training on a face dataset with the Softmax-based first loss function and then retraining on an ID-photo dataset with the Triplet-based second loss function can accurately extract the facial features of the person picture to be processed and of the ID pictures in the picture library, so that the person picture can be accurately matched against the ID pictures in the picture library, improving the accuracy of face recognition.
An embodiment of the present disclosure further provides a computer device comprising the face-recognition-based processing apparatus provided by any of the above embodiments.

An embodiment of the present disclosure further provides a computer-readable storage medium storing computer-executable instructions configured to perform the above face-recognition-based processing method.

An embodiment of the present disclosure further provides a computer program product comprising a computer program stored on a computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the above face-recognition-based processing method.

The above computer-readable storage medium may be a transient computer-readable storage medium or a non-transient computer-readable storage medium.
An embodiment of the present disclosure further provides an electronic device whose structure is shown in FIG. 5. The electronic device includes:

at least one processor 100 (one processor 100 is taken as an example in FIG. 5) and a memory 101, and may further include a communication interface 102 and a bus 103. The processor 100, the communication interface 102, and the memory 101 communicate with one another via the bus 103. The communication interface 102 may be used for information transmission. The processor 100 may invoke logic instructions in the memory 101 to perform the face-recognition-based processing method of the above embodiments.

In addition, when implemented as software functional units and sold or used as independent products, the logic instructions in the memory 101 may be stored in a computer-readable storage medium.

As a computer-readable storage medium, the memory 101 may be used to store software programs and computer-executable programs, such as the program instructions/modules corresponding to the methods in the embodiments of the present disclosure. By running the software programs, instructions, and modules stored in the memory 101, the processor 100 executes functional applications and data processing, i.e., implements the face-recognition-based processing method of the above method embodiments.

The memory 101 may include a program storage area and a data storage area: the program storage area may store the operating system and the applications required by at least one function, and the data storage area may store data created according to the use of the terminal device. In addition, the memory 101 may include high-speed random access memory and may also include non-volatile memory.
The technical solutions of the embodiments of the present disclosure may be embodied in the form of a software product stored in a storage medium and including one or more instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or some of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium may be a non-transient storage medium, including a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or any other medium that can store program code; it may also be a transient storage medium.
Although the terms "first", "second", etc. may be used in this application to describe various elements, these elements should not be limited by these terms, which are only used to distinguish one element from another. For example, without changing the meaning of the description, a first element may be called a second element and, likewise, a second element may be called a first element, as long as all occurrences of the "first element" are renamed consistently and all occurrences of the "second element" are renamed consistently. The first element and the second element are both elements, but they may not be the same element.

The terminology used in this application is for describing the embodiments only and is not intended to limit the claims. As used in the description of the embodiments and the claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Similarly, the term "and/or" as used in this application refers to and encompasses any and all possible combinations of one or more of the associated listed items. In addition, the terms "comprise", "comprises", and/or "comprising", when used in this application, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

The aspects, implementations, or features of the described embodiments can be used alone or in any combination. Aspects of the described embodiments may be implemented by software, hardware, or a combination of software and hardware. The described embodiments may also be embodied by a computer-readable medium storing computer-readable code that includes instructions executable by at least one computing device. The computer-readable medium may be associated with any data storage device capable of storing data that can be read by a computer system. Examples of computer-readable media include read-only memory, random access memory, CD-ROMs, HDDs, DVDs, magnetic tape, and optical data storage devices. The computer-readable medium may also be distributed over network-connected computer systems so that the computer-readable code is stored and executed in a distributed manner.

The above technical description may refer to the accompanying drawings, which form part of this application and show, by way of description, implementations in accordance with the described embodiments. Although these embodiments are described in sufficient detail to enable those skilled in the art to implement them, they are non-limiting; other embodiments may be used, and changes may be made without departing from the scope of the described embodiments. For example, the order of operations described in the flowcharts is non-limiting, so the order of two or more operations illustrated in and described according to a flowchart may be changed according to several embodiments. As another example, in several embodiments, one or more operations illustrated in and described according to a flowchart are optional or may be deleted. In addition, certain steps or functions may be added to the disclosed embodiments, or the order of two or more steps may be permuted. All such variations are considered to be encompassed by the disclosed embodiments and the claims.

In addition, terminology is used in the above technical description to provide a thorough understanding of the described embodiments. However, excessive detail is not required to implement the described embodiments. Accordingly, the above description of the embodiments is presented for purposes of illustration and description. The embodiments presented above, and the examples disclosed according to them, are provided to add context and aid understanding of the described embodiments. The above description is not intended to be exhaustive or to limit the described embodiments to the precise form of this disclosure. Several modifications, alternatives, and variations are possible in light of the above teachings. In some cases, well-known processing steps have not been described in detail to avoid unnecessarily obscuring the described embodiments.
Claims (18)
- A face-recognition-based processing method, comprising: training a facial feature extraction model using a face dataset and a Softmax-based first loss function; and retraining the facial feature extraction model using an ID-photo dataset and a Triplet-based second loss function to obtain a final facial feature extraction model.
- The method according to claim 1, wherein the ID-photo dataset comprises face pictures of a plurality of persons, and each person's face pictures comprise at least one ID picture and one non-ID picture.
- The method according to claim 2, wherein retraining the facial feature extraction model using the ID-photo dataset and the Triplet-based second loss function to obtain the final facial feature extraction model comprises: selecting a plurality of groups of training data from the ID-photo dataset, each group of training data comprising three training pictures, two of which belong to the same person; extracting, by the facial feature extraction model, the facial features of the three training pictures in each group, the facial features of the three pictures in each group forming one triplet, so as to obtain a plurality of triplets; computing second loss function values corresponding to the plurality of triplets; if the second loss function values corresponding to the plurality of triplets satisfy a preset condition, taking the current facial feature extraction model as the final facial feature extraction model; and if the second loss function values corresponding to the plurality of triplets do not satisfy the preset condition, returning to the step of selecting a plurality of triplets from the ID-photo dataset.
- The method according to claim 3, wherein, among the three training pictures of each group of training data, the first training picture and the second training picture are an ID picture and a non-ID picture of the same person, and the third training picture does not belong to the same person as the second training picture.
- The method according to claim 3 or 4, wherein computing the second loss function values corresponding to the plurality of triplets comprises: computing a first Euclidean distance and a second Euclidean distance for each triplet, wherein the first Euclidean distance is the Euclidean distance between the facial features of the two training pictures in the triplet that belong to the same person, and the second Euclidean distance is the Euclidean distance between the facial features of the two training pictures in the triplet that do not belong to the same person; and computing the second loss function values corresponding to the plurality of triplets according to the first Euclidean distance and the second Euclidean distance of each triplet.
- The method according to claim 5, wherein computing the second loss function values corresponding to the plurality of triplets according to the first and second Euclidean distances of each triplet comprises: computing the second loss function value corresponding to each triplet according to its first Euclidean distance and second Euclidean distance.
- The method according to claim 5, wherein computing the second loss function values corresponding to the plurality of triplets according to the first and second Euclidean distances of each triplet comprises: screening out, according to the first and second Euclidean distances of each triplet, the valid triplets whose first Euclidean distance is smaller than their second Euclidean distance; and computing the second loss function values corresponding to the valid triplets according to the first and second Euclidean distances of the valid triplets.
- The method according to claim 1, wherein the first loss function is Softmax Loss or Cosine Loss.
- The method according to claim 1, wherein the facial feature extraction model adopts a network structure based on a residual network.
- The method according to claim 1, further comprising, after obtaining the final facial feature extraction model: acquiring a person picture to be processed; extracting the facial features of the person picture by the facial feature extraction model; computing the Euclidean distance between the facial features of the person picture and the facial features of each ID picture in a picture library; and determining, according to the Euclidean distances between the facial features of the person picture and those of each ID picture in the picture library, the ID picture matching the person picture.
- The method according to claim 10, further comprising, before computing the Euclidean distance between the facial features of the person picture and the facial features of each ID picture in the picture library: extracting, by the facial feature extraction model, the facial features of each ID picture in the picture library.
- The method according to claim 10, wherein determining the ID picture matching the person picture comprises: determining, according to the Euclidean distances between the facial features of the person picture and those of each ID picture in the picture library, the ID picture whose facial features have the smallest Euclidean distance to the facial features of the person picture as the ID picture matching the person picture.
- The method according to claim 10, wherein determining the ID picture matching the person picture comprises: determining, according to the Euclidean distances between the facial features of the person picture and those of each ID picture in the picture library, the ID pictures whose Euclidean distance to the facial features of the person picture is smaller than a preset distance threshold as ID pictures matching the person picture.
- A face-recognition-based processing apparatus, comprising: a first training module, configured to train a facial feature extraction model using a face dataset and a Softmax-based first loss function; and a second training module, configured to retrain the facial feature extraction model using an ID-photo dataset and a Triplet-based second loss function to obtain a final facial feature extraction model.
- A computer device, comprising the apparatus according to claim 14.
- An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to perform the method according to any one of claims 1-13.
- A computer-readable storage medium storing computer-executable instructions configured to perform the method according to any one of claims 1-13.
- A computer program product, comprising a computer program stored on a computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the method according to any one of claims 1-13.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201880098348.5A CN112912887A (zh) | 2018-11-08 | 2018-11-08 | 基于人脸识别的处理方法、装置、设备及可读存储介质 |
PCT/CN2018/114537 WO2020093303A1 (zh) | 2018-11-08 | 2018-11-08 | 基于人脸识别的处理方法、装置、设备及可读存储介质 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2018/114537 WO2020093303A1 (zh) | 2018-11-08 | 2018-11-08 | 基于人脸识别的处理方法、装置、设备及可读存储介质 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020093303A1 true WO2020093303A1 (zh) | 2020-05-14 |
Family
ID=70611313
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/114537 WO2020093303A1 (zh) | 2018-11-08 | 2018-11-08 | 基于人脸识别的处理方法、装置、设备及可读存储介质 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112912887A (zh) |
WO (1) | WO2020093303A1 (zh) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111738157A (zh) * | 2020-06-23 | 2020-10-02 | 平安科技(深圳)有限公司 | 面部动作单元数据集的构建方法、装置和计算机设备 |
CN113158852A (zh) * | 2021-04-08 | 2021-07-23 | 浙江工业大学 | 一种基于人脸与非机动车协同识别的交通卡口监控系统 |
CN113971824A (zh) * | 2021-04-28 | 2022-01-25 | 安徽科力信息产业有限责任公司 | 一种基于人脸识别的特殊行为定点识别查控方法及系统 |
CN116129227A (zh) * | 2023-04-12 | 2023-05-16 | 合肥的卢深视科技有限公司 | 模型训练方法、装置、电子设备及计算机可读存储介质 |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111598004B (zh) * | 2020-05-18 | 2023-12-08 | 江苏星闪世图科技(集团)有限公司 | 一种渐进增强自学习的无监督跨领域行人再识别方法 |
CN114663965B (zh) * | 2022-05-24 | 2022-10-21 | 之江实验室 | 一种基于双阶段交替学习的人证比对方法和装置 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106203533A (zh) * | 2016-07-26 | 2016-12-07 | 厦门大学 | 基于混合训练的深度学习人脸验证方法 |
CN107103281A (zh) * | 2017-03-10 | 2017-08-29 | 中山大学 | 基于聚集损失深度度量学习的人脸识别方法 |
CN107871100A (zh) * | 2016-09-23 | 2018-04-03 | 北京眼神科技有限公司 | 人脸模型的训练方法和装置、人脸认证方法和装置 |
CN108197561A (zh) * | 2017-12-29 | 2018-06-22 | 北京智慧眼科技股份有限公司 | 人脸识别模型优化控制方法、装置、设备及存储介质 |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105975959B (zh) * | 2016-06-14 | 2019-09-03 | 广州视源电子科技股份有限公司 | 基于神经网络的人脸特征提取建模、人脸识别方法及装置 |
CN106934346B (zh) * | 2017-01-24 | 2019-03-15 | 北京大学 | 一种目标检测性能优化的方法 |
-
2018
- 2018-11-08 WO PCT/CN2018/114537 patent/WO2020093303A1/zh active Application Filing
- 2018-11-08 CN CN201880098348.5A patent/CN112912887A/zh active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106203533A (zh) * | 2016-07-26 | 2016-12-07 | 厦门大学 | 基于混合训练的深度学习人脸验证方法 |
CN107871100A (zh) * | 2016-09-23 | 2018-04-03 | 北京眼神科技有限公司 | 人脸模型的训练方法和装置、人脸认证方法和装置 |
CN107103281A (zh) * | 2017-03-10 | 2017-08-29 | 中山大学 | 基于聚集损失深度度量学习的人脸识别方法 |
CN108197561A (zh) * | 2017-12-29 | 2018-06-22 | 北京智慧眼科技股份有限公司 | 人脸识别模型优化控制方法、装置、设备及存储介质 |
Non-Patent Citations (1)
Title |
---|
SONG, YILONG ET AL.: "Deep Learning Facial Feature Extraction Method Based on Mixed Training", NEW TECHNOLOGY & NEW PROCESS, 25 March 2018 (2018-03-25), pages 39 - 42, ISSN: 1003-5311 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111738157A (zh) * | 2020-06-23 | 2020-10-02 | 平安科技(深圳)有限公司 | 面部动作单元数据集的构建方法、装置和计算机设备 |
CN111738157B (zh) * | 2020-06-23 | 2023-07-21 | 平安科技(深圳)有限公司 | 面部动作单元数据集的构建方法、装置和计算机设备 |
CN113158852A (zh) * | 2021-04-08 | 2021-07-23 | 浙江工业大学 | 一种基于人脸与非机动车协同识别的交通卡口监控系统 |
CN113158852B (zh) * | 2021-04-08 | 2024-03-29 | 浙江工业大学 | 一种基于人脸与非机动车协同识别的交通卡口监控系统 |
CN113971824A (zh) * | 2021-04-28 | 2022-01-25 | 安徽科力信息产业有限责任公司 | 一种基于人脸识别的特殊行为定点识别查控方法及系统 |
CN116129227A (zh) * | 2023-04-12 | 2023-05-16 | 合肥的卢深视科技有限公司 | 模型训练方法、装置、电子设备及计算机可读存储介质 |
CN116129227B (zh) * | 2023-04-12 | 2023-09-01 | 合肥的卢深视科技有限公司 | 模型训练方法、装置、电子设备及计算机可读存储介质 |
Also Published As
Publication number | Publication date |
---|---|
CN112912887A (zh) | 2021-06-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020093303A1 (zh) | 基于人脸识别的处理方法、装置、设备及可读存储介质 | |
CN109766872B (zh) | 图像识别方法和装置 | |
CN110851835B (zh) | 图像模型检测方法、装置、电子设备及存储介质 | |
CN108111489B (zh) | Url攻击检测方法、装置以及电子设备 | |
CN112381775B (zh) | 一种图像篡改检测方法、终端设备及存储介质 | |
US20210019519A1 (en) | Detection of fraudulently generated and photocopied credential documents | |
CN107577945B (zh) | Url攻击检测方法、装置以及电子设备 | |
CN109543548A (zh) | 一种人脸识别方法、装置及存储介质 | |
WO2020164278A1 (zh) | 一种图像处理方法、装置、电子设备和可读存储介质 | |
CN111861240A (zh) | 可疑用户识别方法、装置、设备及可读存储介质 | |
CN103824055A (zh) | 一种基于级联神经网络的人脸识别方法 | |
CN108710893B (zh) | 一种基于特征融合的数字图像相机源模型分类方法 | |
CN109389098B (zh) | 一种基于唇语识别的验证方法以及系统 | |
US11182468B1 (en) | Methods and systems for facilitating secure authentication of user based on known data | |
TW201917636A (zh) | 一種基於線上學習的人臉辨識方法與系統 | |
Mazumdar et al. | Universal image manipulation detection using deep siamese convolutional neural network | |
CN111241873A (zh) | 图像翻拍检测方法及其模型的训练方法、支付方法及装置 | |
CN111291780B (zh) | 一种跨域网络训练及图像识别方法 | |
US20230267709A1 (en) | Dataset-aware and invariant learning for face recognition | |
CN116959075B (zh) | 基于深度学习的身份识别机器人迭代优化方法 | |
CN112906676A (zh) | 人脸图像来源的识别方法、装置、存储介质及电子设备 | |
CN110457877B (zh) | 用户认证方法和装置、电子设备、计算机可读存储介质 | |
WO2020113563A1 (zh) | 人脸图像质量评估方法、装置、设备及存储介质 | |
WO2023071180A1 (zh) | 真伪识别方法、装置、电子设备以及存储介质 | |
CN116543181A (zh) | 一种基于图像背景特征识别的反团伙欺诈方法及系统 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18939575 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 09.09.2021) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 18939575 Country of ref document: EP Kind code of ref document: A1 |