WO2020253127A1 - Facial feature extraction model training method, facial feature extraction method, apparatus, device and storage medium - Google Patents

Facial feature extraction model training method, facial feature extraction method, apparatus, device and storage medium

Info

Publication number
WO2020253127A1
WO2020253127A1 (PCT/CN2019/122165, CN2019122165W)
Authority
WO
WIPO (PCT)
Prior art keywords
facial feature
image set
face
data
candidate
Prior art date
Application number
PCT/CN2019/122165
Other languages
English (en)
French (fr)
Inventor
姚旭峰
徐国强
Original Assignee
深圳壹账通智能科技有限公司
Priority date
Filing date
Publication date
Priority to US16/981,192 priority Critical patent/US20230119593A1/en
Application filed by 深圳壹账通智能科技有限公司 filed Critical 深圳壹账通智能科技有限公司
Priority to JP2020560486A priority patent/JP6994588B2/ja
Priority to EP19919545.4A priority patent/EP3989104A4/en
Priority to KR1020207026394A priority patent/KR102385463B1/ko
Publication of WO2020253127A1 publication Critical patent/WO2020253127A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/469Contour-based spatial representations, e.g. vector-coding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776Validation; Performance evaluation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Definitions

  • This application relates to the field of biometrics, and in particular to a facial feature extraction model training method, facial feature extraction method, device, equipment and storage medium.
  • Face recognition technology refers to extracting facial features through a feature extraction model, and performing identity recognition or target detection through the facial features.
  • This application provides a facial feature extraction model training method, facial feature extraction method, device, equipment, and storage medium, which improve the feature extraction accuracy of the facial feature extraction model and provide an important reference for human facial action recognition.
  • this application provides a method for training a facial feature extraction model, the method including:
  • Input face training data into multiple original student networks for model training, so as to obtain candidate student networks corresponding to each of the original student networks;
  • a plurality of the candidate facial feature data are screened to obtain target sample features, and the candidate student network corresponding to the target sample feature is determined as the facial feature extraction model.
  • this application also provides a facial feature extraction method, including:
  • the target processed image is input into a facial feature extraction model to output target facial features; wherein the facial feature extraction model is trained using the aforementioned facial feature extraction model training method.
  • the present application also provides a facial feature extraction model training device, the device includes:
  • the model training unit is configured to input the face training data into multiple original student networks for model training, so as to obtain candidate student networks corresponding to each of the original student networks;
  • a data output unit for inputting face verification data into each of the candidate student networks to output verification facial feature data corresponding to each of the candidate student networks;
  • a data input unit configured to input each of the verified facial feature data into a preset teacher network to output candidate facial feature data corresponding to each of the verified facial feature data;
  • the model determining unit is used to screen multiple candidate facial feature data based on preset feature screening rules to obtain target sample features, and determine the candidate student network corresponding to the target sample feature as the facial feature extraction model .
  • the present application also provides a facial feature extraction device, which includes:
  • An image processing unit configured to perform image processing on the target image to obtain a target processed image
  • the image input unit is used to input the target processed image into a facial feature extraction model to output target facial features; wherein the facial feature extraction model is a model trained using the aforementioned facial feature extraction model training method.
  • the present application also provides a computer device, the computer device includes a memory and a processor; the memory is used to store a computer program; the processor is used to execute the computer program and, when executing the computer program, implement the aforementioned facial feature extraction model training method or facial feature extraction method.
  • the present application also provides a computer-readable storage medium, the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the processor implements the aforementioned facial feature extraction model training method or facial feature extraction method.
  • This application discloses a facial feature extraction model training method, facial feature extraction method, device, equipment and storage medium.
  • The face training data is input into multiple original student networks for model training to obtain the candidate student network corresponding to each original student network; face verification data is input into each candidate student network to output the verification facial feature data corresponding to each candidate student network; each set of verification facial feature data is input into the preset teacher network to output candidate facial feature data corresponding to each set of verification facial feature data; and, based on preset feature screening rules, multiple candidate facial feature data are screened to obtain target sample features,
  • and the candidate student network corresponding to the target sample feature is determined as the facial feature extraction model. This in turn improves the feature extraction accuracy of the facial feature extraction model.
  • FIG. 1 is a schematic flowchart of a method for marking an image set provided by an embodiment of the present application
  • FIG. 2 is a schematic flowchart of sub-steps of the method for marking the image set in FIG. 1;
  • Fig. 3 is a schematic flowchart of sub-steps of the method for marking the image set in Fig. 1;
  • FIG. 4 is a schematic flowchart of steps for acquiring the first screening image set in FIG. 1;
  • FIG. 5 is a schematic flowchart of a method for training a facial feature extraction model provided by an embodiment of the present application
  • FIG. 6 is a schematic flowchart of sub-steps of the facial feature extraction model training method in FIG. 5;
  • Fig. 7 is a schematic flowchart of sub-steps of the facial feature extraction model training method in Fig. 5;
  • Figure 8 is a schematic flow chart of the steps for determining the loss value
  • FIG. 9 is a schematic flowchart of steps of a facial feature extraction method provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of an application scenario of a facial feature extraction method provided by an embodiment of the present application.
  • FIG. 11 is a schematic block diagram of a facial feature extraction model training device provided by an embodiment of the present application.
  • FIG. 12 is a schematic block diagram of a subunit of the facial feature extraction model training device in FIG. 11;
  • FIG. 13 is a schematic block diagram of a facial feature extraction device according to an embodiment of the present application.
  • FIG. 14 is a schematic block diagram of the structure of a computer device according to an embodiment of the application.
  • the embodiments of the present application provide a facial feature extraction model training method, facial feature extraction method, device, computer equipment, and storage medium.
  • the facial feature extraction model can be used to extract facial features of human facial actions in human facial action recognition scenes.
  • FIG. 1 is a schematic flowchart of steps of a method for marking an image set according to an embodiment of the present application.
  • the present application provides a method for labeling image sets for labeling face images, so as to perform model training on the facial feature extraction model.
  • the method specifically includes: step S110 to step S150.
  • S110 Select unlabeled images from multiple original face images according to a preset selection rule to obtain a set of face sample images.
  • the original face image refers to a large number of images obtained from the Internet. These images are not processed images.
  • Machine learning can perform recognition processing on these images to obtain recognition results, or select corresponding images for test training processing to obtain data more suitable for the machine learning, so that the machine learning can be performed according to the preset goals, so as to obtain a better machine Learning model.
  • the preset selection rule can select an image of a specific image generation source as a face sample image, for example, select a face image in a Yale face database through a preset channel as a face sample image. Of course, it can also be selected according to the output time of the image, for example, the face image during the legal holidays is selected as the face sample image.
  • the set of all face sample images is constructed as a set of face sample images.
  • the human face sample image set includes several unlabeled images.
  • Performing uncertainty analysis on the face sample image set to obtain the analysis result specifically includes:
  • performing at least one of minimum confidence analysis, margin sampling analysis, and information entropy analysis on the images in the face sample image set to obtain the uncertainty value of each image in the face sample image set.
  • The uncertainty can be measured by at least one of minimum confidence, margin sampling, and information entropy.
  • The analysis results can be expressed numerically; for example, the higher the value, the higher the uncertainty. Of course, the analysis results can also express uncertainty in other forms, such as dividing it into multiple levels for comparison.
  • the specific process of performing uncertainty analysis on each of the face sample image sets specifically includes sub-steps S121, S122, and S123.
  • S121 Perform a minimum confidence analysis on the images in the face sample image set to obtain a first uncertainty value corresponding to each of the images.
  • The uncertainty of an image is also called the labeling value of the image.
  • Minimum confidence analysis (least confidence) measures uncertainty as 1 minus the largest probability in the model's predicted probability distribution for the sample.
  • S122 Perform edge sampling analysis on the images in the face sample image set to obtain a second uncertainty value corresponding to each of the images.
  • Margin sampling analysis considers the difference between the largest and second-largest probabilities in the predicted probability distribution of sample x, where the corresponding categories are those predicted by the model θ with the largest and second-largest probabilities, and P_θ denotes the probability distribution predicted by the model.
  • The smaller this margin, the higher the uncertainty of the sample x, that is, the more the sample requires labeling.
  • Information entropy analysis measures uncertainty as the entropy of the model's predicted probability distribution for the sample; the larger the entropy, the higher the uncertainty.
  • the analysis result includes the uncertainty value corresponding to each image in the face sample image set.
  • step S130 specifically includes:
  • the face sample image set is filtered to obtain the image set to be labeled.
  • step S130 specifically includes substeps S131 to S134.
  • Based on the minimum confidence analysis, the first uncertainty value corresponding to that analysis method can be obtained.
  • According to the first uncertainty value, the images in the face sample image set are screened to obtain the corresponding first screened image set.
  • Based on the margin sampling analysis, the second uncertainty value corresponding to that analysis method can be obtained.
  • According to the second uncertainty value, the images in the face sample image set are screened to obtain the corresponding second screened image set.
  • S133 Perform screening processing on the images in the face sample image set according to the third uncertainty value to obtain a third screened image set.
  • Based on the information entropy analysis, the third uncertainty value corresponding to that analysis method can be obtained.
  • According to the third uncertainty value, the images in the face sample image set are screened to obtain the corresponding third screened image set.
  • S134 Construct the image set to be labeled according to the first screening image set, the second screening image set, and the third screening image set.
  • the first screening image set, the second screening image set, and the third screening image set are combined to form the image set to be labeled.
  • This increases the number of images in the image set to be labeled and at the same time increases its diversity, ultimately enriching the images in the image set to be labeled, improving the training efficiency of the model, effectively reducing training time, and making the model more robust.
  • step S131 includes:
  • The images in the face sample image set are sorted in descending order according to their corresponding uncertainty values, and the first preset number of images are set as images to be labeled. All the images to be labeled are set as the image set to be labeled.
  • By sorting in this way, the images at the front of the sequence are those with higher uncertainty, so the selection preserves the uncertainty of the data and in turn ensures that the images used for training the model are highly efficient for training.
  • The preset number can be selected according to the application environment, or set according to a certain ratio. For example, 85% of the total number of images in the face sample image set can be selected as images to be labeled, in which case the preset number is 85% of the total.
  • step S131 specifically includes sub-steps S1311, S1312 and S1313.
  • The uncertainty threshold is set according to the actual working environment.
  • If the uncertainty value of an image in the face sample image set is greater than the uncertainty threshold, the image conforms to the labeling rule and is then labeled.
  • S140 Perform annotation processing on the images in the image set to be annotated to obtain an already annotated face image set.
  • Annotation processing refers to forming a one-to-one correspondence between each image to be annotated and its corresponding category, so that the corresponding annotated image is obtained; the annotated images are the face verification data.
  • Annotating the image to be annotated may include: receiving annotation information input for the image to be annotated; and, according to a review result, associating the annotation information with the image to be annotated to obtain the annotated face image set,
  • where the review result is obtained by a reviewer reviewing the annotation information.
  • FIG. 5 is a schematic flowchart of steps of a facial feature extraction model training method provided by an embodiment of the present application.
  • the training method can select multiple original student networks for model training to obtain corresponding candidate student networks.
  • The number of original student networks can be two, three, or more.
  • The original student networks can be networks such as YOLO9000, AlexNet, or VGGNet.
  • The following description takes two original student networks, a YOLO9000 network and a VGGNet network, as an example.
  • the method for training a facial feature extraction model specifically includes: step S210 to step S240.
  • S210 Input the face training data into multiple original student networks for model training, so as to obtain candidate student networks corresponding to each of the original student networks.
  • the face training data is used to perform model training on the original student network, which includes training sub-data and test sub-data.
  • the test sub-data is the data obtained by marking using the above-mentioned image marking method, and is used to test the candidate student network to determine whether the candidate student network meets the learning requirements.
  • the face training data is input into the YOLO9000 network for model training, and the first candidate student network is obtained.
  • the face training data is input into the VGGNet network for model training, and the second candidate student network is obtained.
  • S220 Input face verification data into each of the candidate student networks to output verification facial feature data corresponding to each of the candidate student networks.
  • the face verification data may also use the data obtained by the above-mentioned image labeling method.
  • the face verification data is input into the first candidate student network to obtain the first verification facial feature data.
  • the face verification data is input into the second candidate student network to obtain the second verification facial feature data.
  • S230 Input each of the verified facial feature data into a preset teacher network to output candidate facial feature data corresponding to each of the verified facial feature data.
  • the teacher network can be pre-trained YOLO9000.
  • the first verification facial feature data is input to the teacher network, and the first candidate facial feature data is output.
  • S240 Filter a plurality of candidate facial feature data based on preset feature screening rules to obtain target sample features, and determine a candidate student network corresponding to the target sample feature as a facial feature extraction model.
  • the preset feature screening rules can be set according to specific application scenarios.
  • screening a plurality of candidate facial feature data based on preset feature screening rules to obtain target sample features includes sub-steps S241 and S242.
  • the verification face image can be set according to a specific scene.
  • the first accuracy rate is calculated according to the first candidate facial feature data.
  • the second accuracy rate is calculated.
  • The first candidate facial feature data and the verification facial feature data of the preset verification face image are input into a pre-trained neural network model to output the first accuracy rate corresponding to the first candidate facial feature data.
  • The second candidate facial feature data and the verification facial feature data of the preset verification face image are input into the neural network model to output the second accuracy rate corresponding to the second candidate facial feature data.
  • the neural network model may specifically be a pre-trained GoogLeNet model, of course, it may also be other network models.
  • S242 Determine the candidate facial feature data corresponding to the maximum accuracy rate as the target sample feature.
  • For example, if the first accuracy rate is less than the second accuracy rate, the second candidate facial feature data corresponding to the second accuracy rate is determined as the target sample feature.
  • the second candidate student network corresponding to the second candidate facial feature data is determined as the facial feature extraction model.
  • determining the candidate student network corresponding to the target sample feature as the facial feature extraction model includes sub-steps S243, S244, and S245.
  • step S243 includes S2431 and S2432.
  • the first sub-loss value of the candidate student network corresponding to the target sample feature is determined according to the target sample feature.
  • the second sub-loss value of the candidate student network corresponding to the target sample characteristic is determined according to the target sample characteristic.
  • In the first loss function, J_s is the first sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the face training data, u_j is the label vector of the k-th picture in the face training data, and m is the number of images in each batch of face training data.
  • In the second loss function, J_c is the second sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the face training data, c_k is the center of the k-th picture in the face training data, and m is the number of images in each batch of face training data.
  • The loss value fusion formula is J = w_1·J_s + w_2·J_c, where J is the loss value and w_1 and w_2 are weights.
  • The combination of the first loss function and the second loss function is used as the loss function for retraining the facial feature extraction model, so that the trained facial feature extraction model has cohesion; even when a massive, high-quality face training data set is lacking, it can accurately extract feature data, while also avoiding the slow convergence and over-fitting that occur when the facial feature extraction model is retrained.
  • the loss value is less than the preset loss threshold, it indicates that the candidate student network corresponding to the target sample feature has converged, and the candidate student network is determined as the facial feature extraction model.
  • the loss value is not less than a preset loss threshold, adjust the parameters of the candidate student network according to the loss value.
  • If the loss value is not less than the preset loss threshold, it indicates that the candidate student network corresponding to the target sample feature has not converged, and the candidate student network needs to be trained further; the parameters of the candidate student network are adjusted according to the loss value until the loss value is less than the loss threshold, and the candidate student network corresponding to the target sample feature is then determined as the facial feature extraction model, that is, step S244 is executed.
  • In the above facial feature extraction model training method, multiple original student networks are trained with the face training data labeled by the above labeling method, and the results are then screened through the teacher network and the face verification data, so that the candidate student network with the best feature extraction accuracy is obtained as the facial feature extraction model.
  • This training method enriches the diversity of part of the face training data and the face verification data, improves the efficiency of model training, and improves the accuracy of the facial features extracted by the model, providing an important reference for human facial action recognition.
  • FIG. 9 is a schematic flowchart of steps of a facial feature extraction method provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of an application scenario of a facial feature extraction method provided by an embodiment of the present application.
  • the facial feature extraction method can be applied to a system including terminal devices 610 and 620, a network 630 and a server 640.
  • The network 630 is used to provide the medium for communication links between the terminal devices 610, 620 and the server 640.
  • the network 630 may include various connection types, such as wired, wireless communication links, or fiber optic cables.
  • the user can use the terminal devices 610 and 620 to interact with the server 640 via the network 630 to receive or send request instructions and the like.
  • Various communication client applications such as image processing applications, web browser applications, search applications, instant messaging tools, etc., may be installed on the terminal devices 610 and 620.
  • the facial feature extraction method specifically includes: step S310 to step S330.
  • The image to be recognized includes a face target to be recognized and may be a visible light image, such as an image in RGB (Red Green Blue) mode.
  • the aforementioned image to be recognized may also be a near infrared (Near Infrared, NIR) image.
  • the execution subject of this embodiment may be installed with a camera for collecting visible light images or a camera for collecting near infrared images.
  • The user can select which camera to turn on and then use the selected camera to take a picture (for example, a selfie of the user's head or face) to obtain the image to be recognized.
  • S320 Perform image processing on the target image to obtain a target processed image.
  • image processing operations include: size adjustment, cropping processing, rotation processing, image algorithm processing and so on.
  • Image algorithm processing includes: adjusting color temperature algorithm, adjusting exposure algorithm, etc. These image processing operations can make the target image closer to the real picture.
  • picture parameters include size information, pixel size, contrast, sharpness, and natural saturation.
  • the facial feature extraction model is a model obtained by training using the aforementioned facial feature extraction model training method.
  • In the above facial feature extraction method, a target image is acquired, image processing is performed on the target image, and the target processed image is input into the facial feature extraction model; the accuracy of facial feature extraction is high, making the method easy to apply in practical application scenarios.
  • Figure 11 is a schematic block diagram of a facial feature extraction model training device provided by an embodiment of the present application.
  • the facial feature extraction model training device is used to perform any of the aforementioned facial feature extraction model training methods.
  • the facial feature extraction model training device can be configured in a server or a terminal.
  • the server can be an independent server or a server cluster.
  • the terminal can be an electronic device such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, and a wearable device.
  • the facial feature extraction model training device 400 includes a model training unit 410, a data output unit 420, a data input unit 430, and a model determination unit 440.
  • the model training unit 410 is configured to input face training data into multiple original student networks for model training, so as to obtain candidate student networks corresponding to each of the original student networks;
  • the data output unit 420 is configured to input face verification data into each of the candidate student networks to output verification facial feature data corresponding to each of the candidate student networks;
  • the data input unit 430 is configured to input each of the verified facial feature data into a preset teacher network to output candidate facial feature data corresponding to each of the verified facial feature data;
  • the model determining unit 440 is configured to screen multiple candidate facial feature data based on preset feature screening rules to obtain target sample features, and determine the candidate student network corresponding to the target sample feature as facial feature extraction model.
  • the facial feature extraction model training device 400 further includes a result obtaining unit 450, an image screening unit 460, and an image labeling unit 470.
  • the result obtaining unit 450 is configured to perform uncertainty analysis on a face sample image set to obtain an analysis result, the face sample image set including a number of unlabeled images;
  • the image screening unit 460 is configured to perform screening processing on the face sample image set according to the analysis result to obtain a to-be-annotated image set;
  • the image tagging unit 470 is configured to perform tagging processing on the images in the image set to be annotated to obtain face verification data.
  • the model determination unit 440 includes an accuracy rate calculation subunit 441 and a feature determination subunit 442.
  • the accuracy rate calculation subunit 441 is configured to calculate the accuracy rate corresponding to each candidate facial feature data according to each of the candidate facial feature data and the verification facial feature data of the preset verification facial image.
  • the feature determination subunit 442 is configured to determine the candidate facial feature data corresponding to the maximum accuracy rate as the target sample feature.
  • the model determination unit 440 further includes a loss value determination subunit 443 and a model determination subunit 444.
  • the loss value determining subunit 443 is configured to calculate the loss value of the candidate student network corresponding to the target sample feature according to the target sample feature;
  • the model determination subunit 444 is configured to determine the candidate student network corresponding to the target sample feature as a facial feature extraction model if the loss value is less than a preset loss threshold.
  • the loss value determination subunit 443 includes a sub-loss value determination subunit 4431 and a loss value fusion subunit 4432.
  • the sub-loss value determining sub-unit 4431 is configured to determine, based on the first loss function and the second loss function, the first sub-loss value and the second sub-loss value of the candidate student network corresponding to the target sample feature according to the target sample feature .
  • the loss value fusion subunit 4432 is configured to determine the loss value of the candidate student network corresponding to the target sample feature based on the loss value fusion formula and according to the first sub-loss value and the second sub-loss value.
  • In the first loss function, J_s is the first sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the face training data, u_j is the label vector of the k-th picture in the face training data, and m is the number of pictures in each batch of face training data.
  • In the second loss function, J_c is the second sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the face training data, c_k is the center of the k-th picture in the face training data, and m is the number of pictures in each batch of face training data.
  • The loss value fusion formula is J = w_1·J_s + w_2·J_c, where J is the loss value and w_1 and w_2 are weights.
  • FIG. 13 is a schematic block diagram of a facial feature extraction device according to an embodiment of the present application.
  • the facial feature extraction device is used to execute the aforementioned facial feature extraction method.
  • the facial feature extraction device can be configured in a server or a terminal.
  • the facial feature extraction device 500 includes: an image acquisition unit 510, an image processing unit 520, and an image input unit 530.
  • the image acquisition unit 510 is used to acquire a target image.
  • the image processing unit 520 is configured to perform image processing on the target image to obtain a target processed image.
  • the image input unit 530 is configured to input the target processed image into the facial feature extraction model to output the target facial feature.
  • the facial feature extraction model is a model trained by the above-mentioned facial feature extraction model training method.
  • the above-mentioned apparatus may be implemented in the form of a computer program, and the computer program may be run on a computer device as shown in FIG. 14.
  • FIG. 14 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • the computer equipment can be a server or a terminal.
  • the computer device includes a processor, a memory, and a network interface connected through a system bus, where the memory may include a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium can store an operating system and a computer program.
  • the computer program includes program instructions, and when the program instructions are executed, the processor can execute a facial feature extraction model training method.
  • the processor is used to provide computing and control capabilities and support the operation of the entire computer equipment.
  • The internal memory provides an environment for running the computer program stored in the non-volatile storage medium; when the computer program is executed by the processor, the processor can execute a facial feature extraction model training method.
  • The network interface is used for network communication, such as sending assigned tasks.
  • FIG. 14 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied.
  • The specific computer device may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
  • The processor may be a central processing unit (CPU); the processor may also be another general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor.
  • The processor is used to run a computer program stored in the memory to implement the following steps:
  • inputting face training data into multiple original student networks for model training, so as to obtain candidate student networks corresponding to each of the original student networks; inputting face verification data into each of the candidate student networks to output verification facial feature data corresponding to each of the candidate student networks; inputting each set of verification facial feature data into a preset teacher network to output candidate facial feature data corresponding to each set of verification facial feature data; and screening multiple candidate facial feature data based on preset feature screening rules to obtain target sample features, and determining the candidate student network corresponding to the target sample feature as the facial feature extraction model.
  • In one embodiment, before inputting the face verification data into each of the candidate student networks to output the verification facial feature data corresponding to each of the candidate student networks, the processor is further configured to implement:
  • performing uncertainty analysis on a face sample image set to obtain an analysis result, the face sample image set including a number of unlabeled images;
  • screening the face sample image set according to the analysis result to obtain an image set to be annotated; and annotating the images in the image set to be annotated to obtain face verification data.
  • In one embodiment, when screening multiple candidate facial feature data based on preset feature screening rules to obtain target sample features, the processor is configured to implement:
  • calculating the accuracy rate corresponding to each candidate facial feature data according to each candidate facial feature data and the verification facial feature data of the preset verification face image; and determining the candidate facial feature data corresponding to the maximum accuracy rate as the target sample feature.
  • In one embodiment, when determining the candidate student network corresponding to the target sample feature as the facial feature extraction model, the processor is configured to implement: calculating the loss value of the candidate student network corresponding to the target sample feature according to the target sample feature; and, if the loss value is less than a preset loss threshold, determining the candidate student network corresponding to the target sample feature as the facial feature extraction model.
  • In one embodiment, when calculating the loss value of the candidate student network corresponding to the target sample feature according to the target sample feature, the processor is configured to implement:
  • determining, based on the first loss function and the second loss function, the first sub-loss value and the second sub-loss value of the candidate student network corresponding to the target sample feature according to the target sample feature; and determining, based on the loss value fusion formula, the loss value of the candidate student network corresponding to the target sample feature according to the first sub-loss value and the second sub-loss value;
  • In the first loss function, J_s is the first sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the face training data, u_j is the label vector of the k-th picture in the face training data, and m is the number of pictures in each batch of face training data.
  • In the second loss function, J_c is the second sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the face training data, c_k is the center of the k-th picture in the face training data, and m is the number of pictures in each batch of face training data.
  • The loss value fusion formula is J = w_1·J_s + w_2·J_c, where J is the loss value and w_1 and w_2 are weights.
  • In another embodiment, the processor is used to run a computer program stored in the memory to implement the following steps:
  • acquiring a target image; performing image processing on the target image to obtain a target processed image; and inputting the target processed image into a facial feature extraction model to output target facial features, where the facial feature extraction model is a model trained by the above-mentioned facial feature extraction model training method.
  • The embodiments of the present application also provide a computer-readable storage medium; the computer-readable storage medium stores a computer program, the computer program includes program instructions, and the processor executes the program instructions to implement the facial feature extraction model training method or facial feature extraction method provided by the embodiments of the present application.
  • the computer-readable storage medium may be the internal storage unit of the computer device described in the foregoing embodiment, such as the hard disk or memory of the computer device.
  • The computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the computer device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A facial feature extraction model training method, a facial feature extraction method, an apparatus, a device and a storage medium. The training method includes: inputting face training data into multiple original student networks respectively for model training; inputting face verification data into each candidate student network respectively; inputting each set of verification facial feature data into a preset teacher network respectively; and screening multiple sets of candidate facial feature data to determine the facial feature extraction model.

Description

Facial feature extraction model training method, facial feature extraction method, apparatus, device and storage medium
This application claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on July 5, 2019, with application number 201910606508.9 and invention title "Facial feature extraction model training method, facial feature extraction method, apparatus, device and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of biometrics, and in particular to a facial feature extraction model training method, a facial feature extraction method, an apparatus, a device and a storage medium.
Background
In recent years, biometric detection and recognition represented by the human face has been widely used in identity recognition, smart education and many other fields. Face recognition technology extracts facial features through a feature extraction model and performs identity recognition or target detection based on those features. When existing feature extraction models extract features from a human face, the extraction accuracy is not high, making it difficult to meet the requirements of practical application scenarios.
Summary
This application provides a facial feature extraction model training method, a facial feature extraction method, an apparatus, a device and a storage medium, which improve the feature extraction accuracy of the facial feature extraction model and provide an important reference for human facial action recognition.
In a first aspect, this application provides a facial feature extraction model training method, the method including:
inputting face training data into multiple original student networks respectively for model training, to obtain a candidate student network corresponding to each original student network;
inputting face verification data into each candidate student network respectively, to output verification facial feature data corresponding to each candidate student network;
inputting each set of verification facial feature data into a preset teacher network respectively, to output candidate facial feature data corresponding to each set of verification facial feature data;
screening multiple sets of candidate facial feature data based on a preset feature screening rule to obtain a target sample feature, and determining the candidate student network corresponding to the target sample feature as the facial feature extraction model.
In a second aspect, this application further provides a facial feature extraction method, including:
acquiring a target image;
performing image processing on the target image to obtain a target processed image;
inputting the target processed image into a facial feature extraction model to output target facial features, wherein the facial feature extraction model is a model trained using the above facial feature extraction model training method.
In a third aspect, this application further provides a facial feature extraction model training apparatus, the apparatus including:
a model training unit, configured to input face training data into multiple original student networks respectively for model training, to obtain a candidate student network corresponding to each original student network;
a data output unit, configured to input face verification data into each candidate student network respectively, to output verification facial feature data corresponding to each candidate student network;
a data input unit, configured to input each set of verification facial feature data into a preset teacher network respectively, to output candidate facial feature data corresponding to each set of verification facial feature data;
a model determination unit, configured to screen multiple sets of candidate facial feature data based on a preset feature screening rule to obtain a target sample feature, and determine the candidate student network corresponding to the target sample feature as the facial feature extraction model.
In a fourth aspect, this application further provides a facial feature extraction apparatus, the apparatus including:
an image acquisition unit, configured to acquire a target image;
an image processing unit, configured to perform image processing on the target image to obtain a target processed image;
an image input unit, configured to input the target processed image into a facial feature extraction model to output target facial features, wherein the facial feature extraction model is a model trained using the above facial feature extraction model training method.
In a fifth aspect, this application further provides a computer device, the computer device including a memory and a processor; the memory is configured to store a computer program; the processor is configured to execute the computer program and, when executing the computer program, implement the above facial feature extraction model training method or facial feature extraction method.
In a sixth aspect, this application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the above facial feature extraction model training method or facial feature extraction method.
This application discloses a facial feature extraction model training method, a facial feature extraction method, an apparatus, a device and a storage medium. Face training data is input into multiple original student networks respectively for model training to obtain a candidate student network corresponding to each original student network; face verification data is input into each candidate student network respectively to output verification facial feature data corresponding to each candidate student network; each set of verification facial feature data is input into a preset teacher network respectively to output candidate facial feature data corresponding to each set of verification facial feature data; multiple sets of candidate facial feature data are screened based on a preset feature screening rule to obtain a target sample feature, and the candidate student network corresponding to the target sample feature is determined as the facial feature extraction model. This improves the feature extraction accuracy of the facial feature extraction model.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of this application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show some embodiments of this application; for a person of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic flowchart of an image set labeling method provided by an embodiment of this application;
FIG. 2 is a schematic flowchart of sub-steps of the image set labeling method in FIG. 1;
FIG. 3 is a schematic flowchart of sub-steps of the image set labeling method in FIG. 1;
FIG. 4 is a schematic flowchart of the steps for obtaining the first screened image set in FIG. 1;
FIG. 5 is a schematic flowchart of a facial feature extraction model training method provided by an embodiment of this application;
FIG. 6 is a schematic flowchart of sub-steps of the facial feature extraction model training method in FIG. 5;
FIG. 7 is a schematic flowchart of sub-steps of the facial feature extraction model training method in FIG. 5;
FIG. 8 is a schematic flowchart of the steps for determining the loss value;
FIG. 9 is a schematic flowchart of the steps of a facial feature extraction method provided by an embodiment of this application;
FIG. 10 is a schematic diagram of an application scenario of a facial feature extraction method provided by an embodiment of this application;
FIG. 11 is a schematic block diagram of a facial feature extraction model training apparatus provided by an embodiment of this application;
FIG. 12 is a schematic block diagram of subunits of the facial feature extraction model training apparatus in FIG. 11;
FIG. 13 is a schematic block diagram of a facial feature extraction apparatus provided by an embodiment of this application;
FIG. 14 is a schematic block diagram of the structure of a computer device provided by an embodiment of this application.
Detailed Description
The technical solutions in the embodiments of this application will be described clearly and completely below with reference to the drawings in the embodiments of this application. Obviously, the described embodiments are some, but not all, of the embodiments of this application. Based on the embodiments of this application, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of this application.
The flowcharts shown in the drawings are only illustrative; they do not have to include all contents and operations/steps, nor do the operations/steps have to be executed in the described order. For example, some operations/steps may be decomposed, combined or partially merged, so the actual execution order may change according to the actual situation.
The embodiments of this application provide a facial feature extraction model training method, a facial feature extraction method, an apparatus, a computer device and a storage medium. The facial feature extraction model can be used to extract facial features of human facial actions in facial action recognition scenarios.
Some embodiments of this application are described in detail below with reference to the drawings. The following embodiments and the features in the embodiments may be combined with each other without conflict.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of the steps of an image set labeling method provided by an embodiment of this application.
In a machine learning process, face sample images need to be annotated before the model can be tested and trained. Usually the face sample images are annotated directly to obtain the corresponding face training data. However, because some face sample images can already be recognized easily by the current model, annotating and training on such images usually does not achieve the desired effect, wastes a great deal of annotators' labor, and lowers the overall efficiency of machine learning.
As shown in FIG. 1, to improve the efficiency of model training and the overall efficiency of machine learning, this application provides an image set labeling method for labeling face images, so as to perform model training on the facial feature extraction model. The method specifically includes steps S110 to S150.
S110: Select unlabeled images from multiple original face images according to a preset selection rule to obtain a face sample image set.
Specifically, the original face images are a large number of images obtained from the Internet that have not been processed. Machine learning can perform recognition processing on these images to obtain recognition results, or select suitable images for test-training processing to obtain data better suited to the learning task, so that learning proceeds toward the preset goal and a better machine learning model is obtained.
Therefore, unlabeled face sample images need to be selected from the large number of original face images, and all the face sample images constitute the face sample image set. The preset selection rule may select images from a specific image source as face sample images, for example selecting face images from the Yale face database through a preset channel. Of course, images may also be selected according to their production time, for example selecting face images taken during legal holidays as face sample images. The set of all face sample images is constructed as the face sample image set.
S120: Perform uncertainty analysis on the face sample image set to obtain an analysis result.
The face sample image set includes a number of unlabeled images. Performing uncertainty analysis on the face sample image set to obtain the analysis result specifically includes:
performing at least one of minimum confidence analysis, margin sampling analysis and information entropy analysis on the images in the face sample image set to obtain the uncertainty value of each image in the face sample image set.
Specifically, uncertainty can be measured by at least one of minimum confidence, margin sampling and information entropy. The analysis result can be expressed numerically; for example, the higher the value, the higher the uncertainty. Of course, the analysis result can also express uncertainty in other forms, such as dividing it into multiple levels for comparison.
As shown in FIG. 2, in one embodiment, the specific process of performing uncertainty analysis on the face sample image set, namely step S120, includes sub-steps S121, S122 and S123.
S121: Perform minimum confidence analysis on the images in the face sample image set to obtain a first uncertainty value corresponding to each image.
Specifically, the uncertainty of an image is also called the labeling value of the image. Minimum confidence analysis (least confidence) can be defined as
x_LC = 1 − P_θ(ŷ | x),
where x_LC denotes the first uncertainty value, ŷ denotes the category with the largest probability in the predicted probability distribution of sample x, P_θ(ŷ | x) denotes that probability, and P_θ denotes the probability distribution predicted by the model. The larger x_LC is, the higher the uncertainty of sample x, that is, the more the sample needs to be labeled.
S122: Perform margin sampling analysis on the images in the face sample image set to obtain a second uncertainty value corresponding to each image.
Specifically, margin sampling analysis can be defined as
x_M = 1 − (P_θ(ŷ1 | x) − P_θ(ŷ2 | x)),
where x_M denotes the second uncertainty value, P_θ(ŷ1 | x) denotes the largest probability in the predicted probability distribution of sample x, P_θ(ŷ2 | x) denotes the second-largest probability, ŷ1 and ŷ2 are the categories corresponding to the largest and second-largest probabilities predicted by the model θ, and P_θ denotes the probability distribution predicted by the model. The larger x_M is (that is, the smaller the margin between the two probabilities), the higher the uncertainty of sample x, that is, the more the sample needs to be labeled.
S123: Perform information entropy analysis on the images in the face sample image set to obtain a third uncertainty value corresponding to each image.
Specifically, information entropy analysis can be defined as
x_H = − Σ_i P_θ(y_i | x) log P_θ(y_i | x),
where x_H denotes the third uncertainty value, P_θ(y_i | x) denotes the predicted probability of sample x for category y_i, and P_θ denotes the probability distribution predicted by the model. The larger x_H is, the higher the uncertainty of sample x, that is, the more the sample needs to be labeled.
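A minimal sketch of the three uncertainty measures described above (least confidence, margin sampling, information entropy) is given below. It assumes `probs` is an (N, C) array of per-class probabilities predicted by the current model for N unlabeled face images; function and variable names are illustrative only, and the margin score is expressed as 1 minus the top-two probability gap so that larger values always mean higher uncertainty, consistent with the text.

```python
import numpy as np

def uncertainty_scores(probs: np.ndarray, eps: float = 1e-12):
    """Return the first, second and third uncertainty values for each sample."""
    sorted_p = np.sort(probs, axis=1)[:, ::-1]       # probabilities, descending per sample
    least_confidence = 1.0 - sorted_p[:, 0]          # 1 - largest probability
    margin = sorted_p[:, 0] - sorted_p[:, 1]         # top-1 minus top-2 probability
    margin_uncertainty = 1.0 - margin                # smaller margin -> larger score
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    return least_confidence, margin_uncertainty, entropy

# Example: three images, three classes; each returned array has shape (3,),
# and larger values indicate samples that are more worth labeling.
probs = np.array([[0.90, 0.05, 0.05],
                  [0.40, 0.35, 0.25],
                  [0.34, 0.33, 0.33]])
lc, ms, ent = uncertainty_scores(probs)
```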
S130: Screen the face sample image set according to the analysis result to obtain an image set to be labeled.
Specifically, the analysis result includes the uncertainty value corresponding to each image in the face sample image set.
In one embodiment, step S130 specifically includes:
screening the face sample image set according to the first uncertainty value, the second uncertainty value and the third uncertainty value to obtain the image set to be labeled.
As shown in FIG. 3, screening the face sample image set according to the analysis result to obtain the image set to be labeled, namely step S130, specifically includes sub-steps S131 to S134.
S131: Screen the images in the face sample image set according to the first uncertainty value to obtain a first screened image set.
Specifically, the minimum confidence analysis yields the first uncertainty value corresponding to that analysis method. The images in the face sample image set are screened according to the first uncertainty value to obtain the corresponding first screened image set.
S132: Screen the images in the face sample image set according to the second uncertainty value to obtain a second screened image set.
Specifically, the margin sampling analysis yields the second uncertainty value corresponding to that analysis method. The images in the face sample image set are screened according to the second uncertainty value to obtain the corresponding second screened image set.
S133: Screen the images in the face sample image set according to the third uncertainty value to obtain a third screened image set.
Specifically, the information entropy analysis yields the third uncertainty value corresponding to that analysis method. The images in the face sample image set are screened according to the third uncertainty value to obtain the corresponding third screened image set.
S134: Construct the image set to be labeled from the first screened image set, the second screened image set and the third screened image set.
Specifically, the first, second and third screened image sets together form the image set to be labeled. This increases the number of images in the image set to be labeled and at the same time increases its diversity, ultimately enriching the images to be labeled, improving the training efficiency of the model, effectively reducing training time and making the model more robust.
The following takes the first uncertainty value as an example. In one embodiment, screening the images in the face sample image set according to the first uncertainty value to obtain the first screened image set, namely step S131, includes:
sorting the images in the face sample image set in descending order of their corresponding uncertainty values, setting the first preset number of images as images to be labeled, and setting all the images to be labeled as the image set to be labeled.
Specifically, by sorting the images in the face sample image set in descending order of their uncertainty values, the images at the front of the sequence are those with higher uncertainty, so the selection preserves the uncertainty of the data and ensures that the images used for training the model are highly efficient for training.
The preset number can be chosen according to the application environment, or set according to a certain ratio. For example, 85% of the total number of images in the face sample image set may be selected as images to be labeled, in which case the preset number is 85% of the total.
In another embodiment, as shown in FIG. 4, screening the face sample image set according to the first uncertainty value to obtain the first screened image set, namely step S131, specifically includes sub-steps S1311, S1312 and S1313.
S1311: Determine whether the uncertainty value corresponding to each image is greater than a preset uncertainty threshold.
Specifically, the uncertainty threshold is set according to the actual working environment.
S1312: If the uncertainty value is greater than the uncertainty threshold, set the corresponding image as an image to be labeled.
If the uncertainty value of an image in the face sample image set is greater than the uncertainty threshold, the image conforms to the labeling rule and is then labeled.
S1313: Set all the images to be labeled as the image set to be labeled.
It can be understood that when the uncertainty value is the second or third uncertainty value, the above steps can be followed, which will not be repeated here.
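A sketch of the two screening strategies just described follows: keeping the top fraction of images ranked by an uncertainty value (for example 85%), or keeping every image whose value exceeds a threshold, and then taking the union of the three screened sets. Names are illustrative; `lc`, `ms` and `ent` come from the previous snippet, and `image_ids` is an assumed parallel list of image identifiers.

```python
import numpy as np

def top_fraction(values, image_ids, fraction=0.85):
    """Identifiers of the highest-uncertainty images (descending order)."""
    k = int(len(values) * fraction)
    order = np.argsort(values)[::-1][:k]
    return {image_ids[i] for i in order}

def above_threshold(values, image_ids, threshold):
    """Identifiers of images whose uncertainty value exceeds the threshold."""
    return {image_ids[i] for i, v in enumerate(values) if v > threshold}

image_ids = ["img_0", "img_1", "img_2"]
first_set = top_fraction(lc, image_ids)             # from least confidence
second_set = top_fraction(ms, image_ids)            # from margin sampling
third_set = above_threshold(ent, image_ids, 0.9)    # from information entropy
to_be_labeled = first_set | second_set | third_set  # union -> image set to label
```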
S140: Annotate the images in the image set to be labeled to obtain an annotated face image set.
Annotation processing means establishing a one-to-one correspondence between each image to be annotated and its category, so as to obtain the corresponding annotated image; the annotated images are the face verification data.
In one embodiment, annotating the images to be annotated, namely step S140, may include: receiving annotation information input for the image to be annotated; and, according to a review result, associating the annotation information with the image to be annotated to obtain the annotated face image set, where the review result is obtained by a reviewer reviewing the annotation information.
In the above image labeling method, minimum confidence analysis, margin sampling analysis and information entropy analysis are performed on the face sample image set to obtain the corresponding uncertainty results, and the three uncertainty analysis results are then fused, so that the uncertainty of the images can be analyzed from different angles, increasing both the number and the diversity of the images to be labeled. Annotating the images to be labeled improves model training efficiency, achieves better results with less data, and improves the overall efficiency of machine learning.
Referring to FIG. 5, FIG. 5 is a schematic flowchart of the steps of a facial feature extraction model training method provided by an embodiment of this application.
It should be noted that the training method can select multiple original student networks for model training to obtain corresponding candidate student networks. "Multiple" can be two, three or more. The original student networks can be networks such as YOLO9000, AlexNet or VGGNet. The following description takes two original student networks, a YOLO9000 network and a VGGNet network, as an example.
As shown in FIG. 5, the facial feature extraction model training method specifically includes steps S210 to S240.
S210: Input the face training data into multiple original student networks respectively for model training, to obtain a candidate student network corresponding to each original student network.
Specifically, the face training data is used to perform model training on the original student networks and includes training sub-data and test sub-data. The test sub-data is data labeled using the above image labeling method and is used to test a candidate student network to determine whether it meets the learning requirements. The face training data is input into the YOLO9000 network for model training to obtain a first candidate student network, and into the VGGNet network for model training to obtain a second candidate student network.
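A minimal PyTorch-style sketch of step S210 is shown below, assuming the two original student networks (for example a YOLO9000-style and a VGG-style backbone) are trained on the same face training data. The model constructors, data loader and loss function are placeholders for whatever the actual implementation uses, not interfaces defined by the filing.

```python
import torch
import torch.nn as nn

def train_student(model, loader, epochs=10, lr=1e-3):
    """Train one original student network; the result is a candidate student network."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()  # placeholder training objective
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model

# Hypothetical usage with two student architectures trained on the same data:
# candidate_students = [train_student(build_yolo_student(), train_loader),
#                       train_student(build_vgg_student(), train_loader)]
```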
S220: Input the face verification data into each candidate student network respectively, to output verification facial feature data corresponding to each candidate student network.
Specifically, the face verification data may also be data labeled using the above image labeling method. The face verification data is input into the first candidate student network to obtain first verification facial feature data, and into the second candidate student network to obtain second verification facial feature data.
S230: Input each set of verification facial feature data into a preset teacher network respectively, to output candidate facial feature data corresponding to each set of verification facial feature data.
The teacher network can be a pre-trained YOLO9000. Specifically, the first verification facial feature data is input into the teacher network to output first candidate facial feature data, and the second verification facial feature data is input into the teacher network to output second candidate facial feature data.
S240: Screen multiple sets of candidate facial feature data based on a preset feature screening rule to obtain a target sample feature, and determine the candidate student network corresponding to the target sample feature as the facial feature extraction model.
Specifically, the preset feature screening rule can be set according to the specific application scenario. In one embodiment, as shown in FIG. 6, screening multiple sets of candidate facial feature data based on the preset feature screening rule to obtain the target sample feature includes sub-steps S241 and S242.
S241: Calculate the accuracy rate corresponding to each set of candidate facial feature data according to each set of candidate facial feature data and the verification facial feature data of a preset verification face image.
Specifically, the verification face images can be set according to the specific scenario. A first accuracy rate is calculated from the first candidate facial feature data, and a second accuracy rate is calculated from the second candidate facial feature data.
More specifically, the first candidate facial feature data and the verification facial feature data of the preset verification face image are input into a pre-trained neural network model to output the first accuracy rate corresponding to the first candidate facial feature data. The second candidate facial feature data and the verification facial feature data of the preset verification face image are input into the same neural network model to output the second accuracy rate corresponding to the second candidate facial feature data. The neural network model may specifically be a pre-trained GoogLeNet model, though other network models may also be used.
S242: Determine the candidate facial feature data corresponding to the largest accuracy rate as the target sample feature.
For example, if the first accuracy rate is less than the second accuracy rate, the second candidate facial feature data corresponding to the second accuracy rate is determined as the target sample feature, and the second candidate student network corresponding to the second candidate facial feature data is determined as the facial feature extraction model.
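A sketch of sub-steps S241 and S242 follows: score each candidate's facial feature data against the verification features of the preset verification face images, then keep the candidate with the highest accuracy. `accuracy_model` stands in for the pre-trained network (for example a GoogLeNet-style verifier) mentioned above; its exact interface is an assumption, not one given in the filing.

```python
def select_best_candidate(candidate_features, verification_features,
                          candidate_students, accuracy_model):
    """Return the target sample feature and the network kept as the extraction model."""
    accuracies = [accuracy_model(feats, verification_features)
                  for feats in candidate_features]          # one accuracy per candidate
    best = max(range(len(accuracies)), key=lambda i: accuracies[i])
    target_sample_feature = candidate_features[best]
    facial_feature_extractor = candidate_students[best]     # candidate student network kept
    return target_sample_feature, facial_feature_extractor
```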
As shown in FIG. 7, in one embodiment, determining the candidate student network corresponding to the target sample feature as the facial feature extraction model includes sub-steps S243, S244 and S245.
S243: Calculate the loss value of the candidate student network corresponding to the target sample feature according to the target sample feature.
The specific process of calculating this loss value according to the target sample feature is shown in FIG. 8; step S243 includes S2431 and S2432.
S2431: Based on a first loss function and a second loss function, determine a first sub-loss value and a second sub-loss value of the candidate student network corresponding to the target sample feature according to the target sample feature.
Specifically, based on the first loss function, the first sub-loss value of the candidate student network corresponding to the target sample feature is determined according to the target sample feature. Based on the second loss function, the second sub-loss value of the candidate student network corresponding to the target sample feature is determined according to the target sample feature.
In the first loss function, J_s is the first sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the face training data, u_j is the label vector of the k-th picture in the face training data, and m is the number of pictures in each batch of face training data.
In the second loss function, J_c is the second sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the face training data, c_k is the center of the k-th picture in the face training data, and m is the number of pictures in each batch of face training data.
S2432: Based on a loss value fusion formula, determine the loss value of the candidate student network corresponding to the target sample feature according to the first sub-loss value and the second sub-loss value.
Specifically, the loss value fusion formula is
J = w_1·J_s + w_2·J_c,
where J is the loss value and w_1 and w_2 are weights.
Using the combination of the first loss function and the second loss function as the loss function for retraining the facial feature extraction model gives the trained model cohesion, so that feature data can be extracted accurately even without a massive, high-quality face training data set, while also avoiding slow convergence and over-fitting when the facial feature extraction model is retrained.
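A hedged sketch of the fused loss J = w_1·J_s + w_2·J_c is shown below. The per-term formulas appear only as images in the published text, so squared-distance forms are assumed here: J_s compares each feature vector u_k with its label vector u_j, and J_c compares it with its center c_k (a center-loss-style term). The weights are illustrative.

```python
import torch

def fused_loss(features, label_vectors, centers, w1=1.0, w2=0.01):
    """features, label_vectors, centers: (m, d) tensors for one batch of m pictures."""
    j_s = ((features - label_vectors) ** 2).sum(dim=1).mean()  # assumed form of first sub-loss J_s
    j_c = ((features - centers) ** 2).sum(dim=1).mean()        # assumed form of second sub-loss J_c
    return w1 * j_s + w2 * j_c                                  # loss value fusion formula

m, d = 32, 128
u = torch.randn(m, d)
loss_value = fused_loss(u, torch.randn(m, d), torch.randn(m, d))
```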
S244: If the loss value is less than a preset loss threshold, determine the candidate student network corresponding to the target sample feature as the facial feature extraction model.
Specifically, if the loss value is less than the preset loss threshold, the candidate student network corresponding to the target sample feature has converged, and that candidate student network is determined as the facial feature extraction model.
S245: If the loss value is not less than the preset loss threshold, adjust the parameters of the candidate student network according to the loss value.
Specifically, if the loss value is not less than the preset loss threshold, the candidate student network corresponding to the target sample feature has not converged and needs to be trained further; its parameters are adjusted according to the loss value until the loss value is less than the loss threshold, at which point the candidate student network corresponding to the target sample feature is determined as the facial feature extraction model, that is, step S244 is executed.
In the above facial feature extraction model training method, multiple original student networks are trained with face training data labeled by the above labeling method, and the results are then screened through the teacher network and the face verification data, so that the candidate student network with the best feature extraction accuracy is obtained as the facial feature extraction model. This training method enriches the diversity of part of the face training data and the face verification data, improves the efficiency of model training, improves the accuracy of the facial features extracted by the model, and provides an important reference for human facial action recognition.
Referring to FIG. 9 and FIG. 10, FIG. 9 is a schematic flowchart of the steps of a facial feature extraction method provided by an embodiment of this application, and FIG. 10 is a schematic diagram of an application scenario of the facial feature extraction method. The facial feature extraction method can be applied in a system including terminal devices 610 and 620, a network 630 and a server 640.
The network 630 is used to provide a medium for communication links between the terminal devices 610, 620 and the server 640. The network 630 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
Users can use the terminal devices 610 and 620 to interact with the server 640 via the network 630 to receive or send request instructions and the like. Various communication client applications may be installed on the terminal devices 610 and 620, such as image processing applications, web browser applications, search applications and instant messaging tools.
Specifically, the facial feature extraction method includes steps S310 to S330.
S310: Acquire a target image.
Specifically, the image to be recognized contains a face target to be recognized and may be a visible light image, for example an image in RGB (Red Green Blue) mode. The image to be recognized may also be a near infrared (NIR) image.
The execution subject of this embodiment may be equipped with a camera for capturing visible light images or a camera for capturing near infrared images. The user can select which camera to turn on and then use the selected camera to take a picture (for example, a selfie of the user's head or face) to obtain the image to be recognized.
S320: Perform image processing on the target image to obtain a target processed image.
To improve the accuracy of the facial feature extraction model, after the target image is acquired, image processing operations are performed on it to change its image parameters.
The image processing operations include size adjustment, cropping, rotation, image algorithm processing and so on. Image algorithm processing includes color temperature adjustment algorithms, exposure adjustment algorithms, etc. These image processing operations can make the target image closer to a real captured picture.
Correspondingly, the picture parameters include size information, pixel size, contrast, sharpness, natural saturation and so on.
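A sketch of the image processing in step S320 follows: resize, scale and normalize a target image before feeding it to the facial feature extraction model. The exact sizes and normalization constants are assumptions for illustration, not values from the filing.

```python
from PIL import Image
import numpy as np

def preprocess(path, size=(224, 224)):
    """Load a target image and produce a target processed image array."""
    img = Image.open(path).convert("RGB")
    img = img.resize(size)                          # size adjustment
    arr = np.asarray(img, dtype=np.float32) / 255.0 # scale pixel values to [0, 1]
    arr = (arr - 0.5) / 0.5                         # simple normalization
    return arr.transpose(2, 0, 1)                   # HWC -> CHW for the model

# Hypothetical usage:
# target_processed_image = preprocess("face.jpg")
# target_facial_features = facial_feature_extractor(target_processed_image)
```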
S330: Input the target processed image into the facial feature extraction model to output the target facial features.
The facial feature extraction model is a model trained using the above facial feature extraction model training method.
In the above facial feature extraction method, a target image is acquired, image processing is performed on the target image, and the target processed image is input into the facial feature extraction model; the accuracy of the extracted facial features is high, making the method easy to apply in practical application scenarios.
Referring to FIG. 11, FIG. 11 is a schematic block diagram of a facial feature extraction model training apparatus provided by an embodiment of this application; the apparatus is used to execute any of the aforementioned facial feature extraction model training methods. The facial feature extraction model training apparatus can be configured in a server or a terminal.
The server can be an independent server or a server cluster. The terminal can be an electronic device such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant or a wearable device.
As shown in FIG. 11, the facial feature extraction model training apparatus 400 includes a model training unit 410, a data output unit 420, a data input unit 430 and a model determination unit 440.
The model training unit 410 is configured to input face training data into multiple original student networks respectively for model training, to obtain a candidate student network corresponding to each original student network.
The data output unit 420 is configured to input face verification data into each candidate student network respectively, to output verification facial feature data corresponding to each candidate student network.
The data input unit 430 is configured to input each set of verification facial feature data into a preset teacher network respectively, to output candidate facial feature data corresponding to each set of verification facial feature data.
The model determination unit 440 is configured to screen multiple sets of candidate facial feature data based on a preset feature screening rule to obtain a target sample feature, and determine the candidate student network corresponding to the target sample feature as the facial feature extraction model.
As shown in FIG. 11, in one embodiment, the facial feature extraction model training apparatus 400 further includes a result acquisition unit 450, an image screening unit 460 and an image annotation unit 470.
The result acquisition unit 450 is configured to perform uncertainty analysis on a face sample image set to obtain an analysis result, the face sample image set including a number of unlabeled images.
The image screening unit 460 is configured to screen the face sample image set according to the analysis result to obtain an image set to be labeled.
The image annotation unit 470 is configured to annotate the images in the image set to be labeled to obtain face verification data.
As shown in FIG. 12, in one embodiment, the model determination unit 440 includes an accuracy rate calculation subunit 441 and a feature determination subunit 442.
The accuracy rate calculation subunit 441 is configured to calculate the accuracy rate corresponding to each set of candidate facial feature data according to each set of candidate facial feature data and the verification facial feature data of a preset verification face image.
The feature determination subunit 442 is configured to determine the candidate facial feature data corresponding to the largest accuracy rate as the target sample feature.
As shown in FIG. 12, in one embodiment, the model determination unit 440 further includes a loss value determination subunit 443 and a model determination subunit 444.
The loss value determination subunit 443 is configured to calculate the loss value of the candidate student network corresponding to the target sample feature according to the target sample feature.
The model determination subunit 444 is configured to determine the candidate student network corresponding to the target sample feature as the facial feature extraction model if the loss value is less than a preset loss threshold.
As shown in FIG. 12, in one embodiment, the loss value determination subunit 443 includes a sub-loss value determination subunit 4431 and a loss value fusion subunit 4432.
The sub-loss value determination subunit 4431 is configured to determine, based on a first loss function and a second loss function, the first sub-loss value and the second sub-loss value of the candidate student network corresponding to the target sample feature according to the target sample feature.
The loss value fusion subunit 4432 is configured to determine, based on a loss value fusion formula, the loss value of the candidate student network corresponding to the target sample feature according to the first sub-loss value and the second sub-loss value.
In the first loss function, J_s is the first sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the face training data, u_j is the label vector of the k-th picture in the face training data, and m is the number of pictures in each batch of face training data.
In the second loss function, J_c is the second sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the face training data, c_k is the center of the k-th picture in the face training data, and m is the number of pictures in each batch of face training data.
The loss value fusion formula is J = w_1·J_s + w_2·J_c, where J is the loss value and w_1 and w_2 are weights.
Referring to FIG. 13, FIG. 13 is a schematic block diagram of a facial feature extraction apparatus provided by an embodiment of this application; the apparatus is used to execute the aforementioned facial feature extraction method. The facial feature extraction apparatus can be configured in a server or a terminal.
As shown in FIG. 13, the facial feature extraction apparatus 500 includes an image acquisition unit 510, an image processing unit 520 and an image input unit 530.
The image acquisition unit 510 is configured to acquire a target image.
The image processing unit 520 is configured to perform image processing on the target image to obtain a target processed image.
The image input unit 530 is configured to input the target processed image into the facial feature extraction model to output the target facial features, where the facial feature extraction model is a model trained using the above facial feature extraction model training method.
It should be noted that, as those skilled in the art can clearly understand, for convenience and brevity of description, the specific working processes of the facial feature extraction model training apparatus and its units described above can refer to the corresponding processes in the foregoing embodiments of the facial feature extraction model training method, and are not repeated here.
The above apparatus can be implemented in the form of a computer program, and the computer program can run on a computer device as shown in FIG. 14.
Referring to FIG. 14, FIG. 14 is a schematic block diagram of a computer device provided by an embodiment of this application. The computer device can be a server or a terminal.
Referring to FIG. 14, the computer device includes a processor, a memory and a network interface connected through a system bus, where the memory may include a non-volatile storage medium and an internal memory.
The non-volatile storage medium can store an operating system and a computer program. The computer program includes program instructions which, when executed, can cause the processor to execute a facial feature extraction model training method.
The processor is used to provide computing and control capabilities and support the operation of the entire computer device.
The internal memory provides an environment for running the computer program in the non-volatile storage medium; when the computer program is executed by the processor, the processor can execute a facial feature extraction model training method.
The network interface is used for network communication, such as sending assigned tasks. Those skilled in the art can understand that the structure shown in FIG. 14 is only a block diagram of part of the structure related to the solution of this application and does not constitute a limitation on the computer device to which the solution of this application is applied; the specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
It should be understood that the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The processor is configured to run a computer program stored in the memory to implement the following steps:
inputting facial training data into multiple original student networks separately for model training to obtain a candidate student network corresponding to each original student network; inputting facial verification data into each candidate student network separately to output verification facial feature data corresponding to each candidate student network; inputting each piece of verification facial feature data into a preset teacher network separately to output candidate facial feature data corresponding to each piece of verification facial feature data; and screening the multiple pieces of candidate facial feature data based on a preset feature screening rule to obtain a target sample feature, and determining the candidate student network corresponding to the target sample feature as the facial feature extraction model.
In one embodiment, before implementing the step of inputting the facial verification data into each candidate student network separately to output the verification facial feature data corresponding to each candidate student network, the processor is further configured to implement:
performing uncertainty analysis on a face sample image set to obtain an analysis result, the face sample image set including a number of unannotated images; screening the face sample image set according to the analysis result to obtain an image set to be annotated; and annotating the images in the image set to be annotated to obtain the facial verification data.
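For illustration, the sketch below combines the three uncertainty measures named in the embodiments (least confidence, margin sampling, and information entropy) to build the image set to be annotated; the per-image class probabilities and the selection budget are assumptions for the example.

```python
import numpy as np

def build_images_to_annotate(probabilities, top_n=100):
    """Score each unannotated face image with three uncertainty measures and merge
    the most uncertain images from each measure into the set to be annotated.
    `probabilities` is an (N, C) array of class probabilities from any classifier;
    `top_n` is an illustrative selection budget per measure."""
    probs = np.asarray(probabilities, dtype=float)
    sorted_probs = np.sort(probs, axis=1)[:, ::-1]            # descending per image

    least_confidence = 1.0 - sorted_probs[:, 0]                # first uncertainty value
    margin = -(sorted_probs[:, 0] - sorted_probs[:, 1])        # second uncertainty value
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)     # third uncertainty value

    def most_uncertain(scores):
        # Higher score means more uncertain; keep the top_n indices.
        return set(np.argsort(scores)[-top_n:].tolist())

    selected = (most_uncertain(least_confidence)
                | most_uncertain(margin)
                | most_uncertain(entropy))
    return sorted(selected)                                    # indices to annotate
```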
In one embodiment, when implementing the step of screening the multiple pieces of candidate facial feature data based on the preset feature screening rule to obtain the target sample feature, the processor is configured to implement:
calculating the accuracy corresponding to each piece of candidate facial feature data according to each piece of candidate facial feature data and the check facial feature data of a preset check face image; and determining the candidate facial feature data with the highest accuracy as the target sample feature.
In one embodiment, when implementing the step of taking the candidate student network corresponding to the target sample feature as the facial feature extraction model, the processor is configured to implement:
calculating, according to the target sample feature, the loss value of the candidate student network corresponding to the target sample feature; and determining the candidate student network corresponding to the target sample feature as the facial feature extraction model if the loss value is less than the preset loss threshold.
In one embodiment, when implementing the step of calculating, according to the target sample feature, the loss value of the candidate student network corresponding to the target sample feature, the processor is configured to implement:
determining, based on a first loss function and a second loss function and according to the target sample feature, a first sub-loss value and a second sub-loss value of the candidate student network corresponding to the target sample feature; and determining, based on a loss value fusion formula and according to the first sub-loss value and the second sub-loss value, the loss value of the candidate student network corresponding to the target sample feature;
the first loss function being:
    [first loss function, given as formula image PCTCN2019122165-appb-000019 in the original]
where J_s is the first sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the facial training data, u_j is the label vector of the k-th picture in the facial training data, and m is the number of pictures in each batch of the facial training data;
the second loss function being:
    [second loss function, given as formula image PCTCN2019122165-appb-000020 in the original]
where J_c is the second sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the facial training data, c_k is the center of the k-th picture in the facial training data, and m is the number of pictures in each batch of the facial training data;
and the loss value fusion formula being:
    J = w_1·J_s + w_2·J_c
where J is the loss value, and w_1 and w_2 are weights.
In another embodiment, the processor is configured to run a computer program stored in the memory to implement the following steps:
acquiring a target image; performing image processing on the target image to obtain a processed target image; and inputting the processed target image into the facial feature extraction model to output target facial features, where the facial feature extraction model is a model trained with the above training method for a facial feature extraction model.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program, the computer program including program instructions which, when executed by a processor, implement the training method for a facial feature extraction model or the facial feature extraction method provided by the embodiments of the present application.
The computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiments, such as a hard disk or memory of the computer device. The computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the computer device.
The above are only specific implementations of the present application, but the scope of protection of the present application is not limited thereto. Any equivalent modification or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in the present application shall fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.

Claims (20)

  1. A training method for a facial feature extraction model, comprising:
    inputting facial training data into multiple original student networks separately for model training to obtain a candidate student network corresponding to each original student network;
    inputting facial verification data into each candidate student network separately to output verification facial feature data corresponding to each candidate student network;
    inputting each piece of verification facial feature data into a preset teacher network separately to output candidate facial feature data corresponding to each piece of verification facial feature data;
    screening the multiple pieces of candidate facial feature data based on a preset feature screening rule to obtain a target sample feature;
    determining, based on a first loss function and a second loss function and according to the target sample feature, a first sub-loss value and a second sub-loss value of the candidate student network corresponding to the target sample feature;
    determining, based on a loss value fusion formula and according to the first sub-loss value and the second sub-loss value, a loss value of the candidate student network corresponding to the target sample feature; and
    determining the candidate student network corresponding to the target sample feature as the facial feature extraction model if the loss value is less than a preset loss threshold;
    wherein the first loss function is:
    [first loss function, given as formula image PCTCN2019122165-appb-100001 in the original]
    where J_s is the first sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the facial training data, u_j is the label vector of the k-th picture in the facial training data, and m is the number of pictures in each batch of the facial training data;
    the second loss function is:
    [second loss function, given as formula image PCTCN2019122165-appb-100002 in the original]
    where J_c is the second sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the facial training data, c_k is the center of the k-th picture in the facial training data, and m is the number of pictures in each batch of the facial training data; and
    the loss value fusion formula is:
    J = w_1·J_s + w_2·J_c
    where J is the loss value, and w_1 and w_2 are weights.
  2. The training method for a facial feature extraction model according to claim 1, wherein before the inputting of the facial verification data into each candidate student network separately to output the verification facial feature data corresponding to each candidate student network, the method further comprises:
    performing uncertainty analysis on a face sample image set to obtain an analysis result, the face sample image set including a number of unannotated images;
    screening the face sample image set according to the analysis result to obtain an image set to be annotated; and
    annotating the images in the image set to be annotated to obtain the facial verification data.
  3. The training method for a facial feature extraction model according to claim 2, wherein the performing of the uncertainty analysis on the face sample image set to obtain the analysis result, the face sample image set including a number of unannotated images, comprises:
    performing least-confidence analysis on the images in the face sample image set to obtain a first uncertainty value corresponding to each image;
    performing margin-sampling analysis on the images in the face sample image set to obtain a second uncertainty value corresponding to each image; and
    performing information-entropy analysis on the images in the face sample image set to obtain a third uncertainty value corresponding to each image, thereby obtaining the analysis result.
  4. The training method for a facial feature extraction model according to claim 2, wherein the analysis result includes a first uncertainty value, a second uncertainty value, and a third uncertainty value; and the screening of the face sample image set according to the analysis result to obtain the image set to be annotated comprises:
    screening the images in the face sample image set according to the first uncertainty value to obtain a first screened image set;
    screening the images in the face sample image set according to the second uncertainty value to obtain a second screened image set;
    screening the images in the face sample image set according to the third uncertainty value to obtain a third screened image set; and
    constructing the image set to be annotated from the first screened image set, the second screened image set, and the third screened image set.
  5. The training method for a facial feature extraction model according to claim 1, wherein the screening of the multiple pieces of candidate facial feature data based on the preset feature screening rule to obtain the target sample feature comprises:
    calculating the accuracy corresponding to each piece of candidate facial feature data according to each piece of candidate facial feature data and check facial feature data of a preset check face image; and
    determining the candidate facial feature data with the highest accuracy as the target sample feature.
  6. A facial feature extraction method, comprising:
    acquiring a target image;
    performing image processing on the target image to obtain a processed target image; and
    inputting the processed target image into a facial feature extraction model to output target facial features, wherein the facial feature extraction model is a model trained with the training method for a facial feature extraction model according to any one of claims 1 to 5.
  7. A training apparatus for a facial feature extraction model, comprising:
    a model training unit configured to input facial training data into multiple original student networks separately for model training to obtain a candidate student network corresponding to each original student network;
    a data output unit configured to input facial verification data into each candidate student network separately to output verification facial feature data corresponding to each candidate student network;
    a data input unit configured to input each piece of verification facial feature data into a preset teacher network separately to output candidate facial feature data corresponding to each piece of verification facial feature data; and
    a model determination unit configured to screen the multiple pieces of candidate facial feature data based on a preset feature screening rule to obtain a target sample feature;
    determine, based on a first loss function and a second loss function and according to the target sample feature, a first sub-loss value and a second sub-loss value of the candidate student network corresponding to the target sample feature;
    determine, based on a loss value fusion formula and according to the first sub-loss value and the second sub-loss value, a loss value of the candidate student network corresponding to the target sample feature; and
    determine the candidate student network corresponding to the target sample feature as the facial feature extraction model if the loss value is less than a preset loss threshold;
    wherein the first loss function is:
    [first loss function, given as formula image PCTCN2019122165-appb-100003 in the original]
    where J_s is the first sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the facial training data, u_j is the label vector of the k-th picture in the facial training data, and m is the number of pictures in each batch of the facial training data;
    the second loss function is:
    [second loss function, given as formula image PCTCN2019122165-appb-100004 in the original]
    where J_c is the second sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the facial training data, c_k is the center of the k-th picture in the facial training data, and m is the number of pictures in each batch of the facial training data; and
    the loss value fusion formula is:
    J = w_1·J_s + w_2·J_c
    where J is the loss value, and w_1 and w_2 are weights.
  8. A facial feature extraction apparatus, comprising:
    an image acquisition unit configured to acquire a target image;
    an image processing unit configured to perform image processing on the target image to obtain a processed target image; and
    an image input unit configured to input the processed target image into a facial feature extraction model to output target facial features, wherein the facial feature extraction model is a model trained with the training method for a facial feature extraction model according to any one of claims 1 to 5.
  9. A computer device, comprising a memory and a processor;
    the memory being configured to store a computer program; and
    the processor being configured to execute the computer program and, when executing the computer program, to implement the following steps:
    inputting facial training data into multiple original student networks separately for model training to obtain a candidate student network corresponding to each original student network;
    inputting facial verification data into each candidate student network separately to output verification facial feature data corresponding to each candidate student network;
    inputting each piece of verification facial feature data into a preset teacher network separately to output candidate facial feature data corresponding to each piece of verification facial feature data;
    screening the multiple pieces of candidate facial feature data based on a preset feature screening rule to obtain a target sample feature;
    determining, based on a first loss function and a second loss function and according to the target sample feature, a first sub-loss value and a second sub-loss value of the candidate student network corresponding to the target sample feature;
    determining, based on a loss value fusion formula and according to the first sub-loss value and the second sub-loss value, a loss value of the candidate student network corresponding to the target sample feature; and
    determining the candidate student network corresponding to the target sample feature as a facial feature extraction model if the loss value is less than a preset loss threshold;
    wherein the first loss function is:
    [first loss function, given as formula image PCTCN2019122165-appb-100005 in the original]
    where J_s is the first sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the facial training data, u_j is the label vector of the k-th picture in the facial training data, and m is the number of pictures in each batch of the facial training data;
    the second loss function is:
    [second loss function, given as formula image PCTCN2019122165-appb-100006 in the original]
    where J_c is the second sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the facial training data, c_k is the center of the k-th picture in the facial training data, and m is the number of pictures in each batch of the facial training data; and
    the loss value fusion formula is:
    J = w_1·J_s + w_2·J_c
    where J is the loss value, and w_1 and w_2 are weights.
  10. The computer device according to claim 9, wherein before the inputting of the facial verification data into each candidate student network separately to output the verification facial feature data corresponding to each candidate student network, the steps further comprise:
    performing uncertainty analysis on a face sample image set to obtain an analysis result, the face sample image set including a number of unannotated images;
    screening the face sample image set according to the analysis result to obtain an image set to be annotated; and
    annotating the images in the image set to be annotated to obtain the facial verification data.
  11. The computer device according to claim 10, wherein the performing of the uncertainty analysis on the face sample image set to obtain the analysis result, the face sample image set including a number of unannotated images, comprises:
    performing least-confidence analysis on the images in the face sample image set to obtain a first uncertainty value corresponding to each image;
    performing margin-sampling analysis on the images in the face sample image set to obtain a second uncertainty value corresponding to each image; and
    performing information-entropy analysis on the images in the face sample image set to obtain a third uncertainty value corresponding to each image, thereby obtaining the analysis result.
  12. The computer device according to claim 10, wherein the analysis result includes a first uncertainty value, a second uncertainty value, and a third uncertainty value; and the screening of the face sample image set according to the analysis result to obtain the image set to be annotated comprises:
    screening the images in the face sample image set according to the first uncertainty value to obtain a first screened image set;
    screening the images in the face sample image set according to the second uncertainty value to obtain a second screened image set;
    screening the images in the face sample image set according to the third uncertainty value to obtain a third screened image set; and
    constructing the image set to be annotated from the first screened image set, the second screened image set, and the third screened image set.
  13. The computer device according to claim 9, wherein the screening of the multiple pieces of candidate facial feature data based on the preset feature screening rule to obtain the target sample feature comprises:
    calculating the accuracy corresponding to each piece of candidate facial feature data according to each piece of candidate facial feature data and check facial feature data of a preset check face image; and
    determining the candidate facial feature data with the highest accuracy as the target sample feature.
  14. A computer device, comprising a memory and a processor;
    the memory being configured to store a computer program; and
    the processor being configured to execute the computer program and, when executing the computer program, to implement the following steps:
    acquiring a target image;
    performing image processing on the target image to obtain a processed target image; and
    inputting the processed target image into a facial feature extraction model to output target facial features, wherein the facial feature extraction model is a model trained with the training method for a facial feature extraction model according to any one of claims 1 to 5.
  15. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the following steps:
    inputting facial training data into multiple original student networks separately for model training to obtain a candidate student network corresponding to each original student network;
    inputting facial verification data into each candidate student network separately to output verification facial feature data corresponding to each candidate student network;
    inputting each piece of verification facial feature data into a preset teacher network separately to output candidate facial feature data corresponding to each piece of verification facial feature data;
    screening the multiple pieces of candidate facial feature data based on a preset feature screening rule to obtain a target sample feature;
    determining, based on a first loss function and a second loss function and according to the target sample feature, a first sub-loss value and a second sub-loss value of the candidate student network corresponding to the target sample feature;
    determining, based on a loss value fusion formula and according to the first sub-loss value and the second sub-loss value, a loss value of the candidate student network corresponding to the target sample feature; and
    determining the candidate student network corresponding to the target sample feature as a facial feature extraction model if the loss value is less than a preset loss threshold;
    wherein the first loss function is:
    [first loss function, given as formula image PCTCN2019122165-appb-100007 in the original]
    where J_s is the first sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the facial training data, u_j is the label vector of the k-th picture in the facial training data, and m is the number of pictures in each batch of the facial training data;
    the second loss function is:
    [second loss function, given as formula image PCTCN2019122165-appb-100008 in the original]
    where J_c is the second sub-loss value, u_k is the feature vector of the target sample feature of the k-th picture in the facial training data, c_k is the center of the k-th picture in the facial training data, and m is the number of pictures in each batch of the facial training data; and
    the loss value fusion formula is:
    J = w_1·J_s + w_2·J_c
    where J is the loss value, and w_1 and w_2 are weights.
  16. The computer-readable storage medium according to claim 15, wherein before the inputting of the facial verification data into each candidate student network separately to output the verification facial feature data corresponding to each candidate student network, the steps further comprise:
    performing uncertainty analysis on a face sample image set to obtain an analysis result, the face sample image set including a number of unannotated images;
    screening the face sample image set according to the analysis result to obtain an image set to be annotated; and
    annotating the images in the image set to be annotated to obtain the facial verification data.
  17. The computer-readable storage medium according to claim 16, wherein the performing of the uncertainty analysis on the face sample image set to obtain the analysis result, the face sample image set including a number of unannotated images, comprises:
    performing least-confidence analysis on the images in the face sample image set to obtain a first uncertainty value corresponding to each image;
    performing margin-sampling analysis on the images in the face sample image set to obtain a second uncertainty value corresponding to each image; and
    performing information-entropy analysis on the images in the face sample image set to obtain a third uncertainty value corresponding to each image, thereby obtaining the analysis result.
  18. The computer-readable storage medium according to claim 16, wherein the analysis result includes a first uncertainty value, a second uncertainty value, and a third uncertainty value; and the screening of the face sample image set according to the analysis result to obtain the image set to be annotated comprises:
    screening the images in the face sample image set according to the first uncertainty value to obtain a first screened image set;
    screening the images in the face sample image set according to the second uncertainty value to obtain a second screened image set;
    screening the images in the face sample image set according to the third uncertainty value to obtain a third screened image set; and
    constructing the image set to be annotated from the first screened image set, the second screened image set, and the third screened image set.
  19. The computer-readable storage medium according to claim 15, wherein the screening of the multiple pieces of candidate facial feature data based on the preset feature screening rule to obtain the target sample feature comprises:
    calculating the accuracy corresponding to each piece of candidate facial feature data according to each piece of candidate facial feature data and check facial feature data of a preset check face image; and
    determining the candidate facial feature data with the highest accuracy as the target sample feature.
  20. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the following steps:
    acquiring a target image;
    performing image processing on the target image to obtain a processed target image; and
    inputting the processed target image into a facial feature extraction model to output target facial features, wherein the facial feature extraction model is a model trained with the training method for a facial feature extraction model according to any one of claims 1 to 5.
PCT/CN2019/122165 2019-06-21 2019-11-29 脸部特征提取模型训练方法、脸部特征提取方法、装置、设备及存储介质 WO2020253127A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US16/981,192 US20230119593A1 (en) 2019-06-21 2019-07-05 Method and apparatus for training facial feature extraction model, method and apparatus for extracting facial features, device, and storage medium
JP2020560486A JP6994588B2 (ja) 2019-06-21 2019-11-29 顔特徴抽出モデル訓練方法、顔特徴抽出方法、装置、機器および記憶媒体
EP19919545.4A EP3989104A4 (en) 2019-06-21 2019-11-29 FACIAL FEATURE EXTRACTION MODEL TRAINING METHOD AND APPARATUS, FACIAL FEATURE EXTRACTION METHOD AND APPARATUS, DEVICE AND INFORMATION MEDIA
KR1020207026394A KR102385463B1 (ko) 2019-06-21 2019-11-29 얼굴 특징 추출 모델 학습 방법, 얼굴 특징 추출 방법, 장치, 디바이스 및 저장 매체

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201910542712 2019-06-21
CN201910542712.9 2019-06-21
CN201910606508.9 2019-07-05
CN201910606508.9A CN110472494A (zh) 2019-06-21 2019-07-05 脸部特征提取模型训练方法、脸部特征提取方法、装置、设备及存储介质

Publications (1)

Publication Number Publication Date
WO2020253127A1 true WO2020253127A1 (zh) 2020-12-24

Family

ID=68507127

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/122165 WO2020253127A1 (zh) 2019-06-21 2019-11-29 脸部特征提取模型训练方法、脸部特征提取方法、装置、设备及存储介质

Country Status (6)

Country Link
US (1) US20230119593A1 (zh)
EP (1) EP3989104A4 (zh)
JP (1) JP6994588B2 (zh)
KR (1) KR102385463B1 (zh)
CN (1) CN110472494A (zh)
WO (1) WO2020253127A1 (zh)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110472494A (zh) * 2019-06-21 2019-11-19 深圳壹账通智能科技有限公司 脸部特征提取模型训练方法、脸部特征提取方法、装置、设备及存储介质
US11443235B2 (en) * 2019-11-14 2022-09-13 International Business Machines Corporation Identifying optimal weights to improve prediction accuracy in machine learning techniques
CN111062995B (zh) * 2019-11-28 2024-02-23 重庆中星微人工智能芯片技术有限公司 生成人脸图像的方法、装置、电子设备和计算机可读介质
CN112561080B (zh) * 2020-12-18 2023-03-03 Oppo(重庆)智能科技有限公司 样本筛选方法、样本筛选装置及终端设备
CN113065512A (zh) * 2021-04-21 2021-07-02 深圳壹账通智能科技有限公司 人脸微表情识别方法、装置、设备及存储介质
CN113128522B (zh) * 2021-05-11 2024-04-05 四川云从天府人工智能科技有限公司 目标识别方法、装置、计算机设备和存储介质
CN112990161A (zh) * 2021-05-17 2021-06-18 江苏数兑科技有限公司 电子证照识别方法及装置
CN116543426A (zh) * 2022-01-21 2023-08-04 小米科技(武汉)有限公司 图像处理方法、装置、电子设备以及存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150206064A1 (en) * 2014-01-19 2015-07-23 Jacob Levman Method for supervised machine learning
CN108334934A (zh) * 2017-06-07 2018-07-27 北京深鉴智能科技有限公司 基于剪枝和蒸馏的卷积神经网络压缩方法
CN109241988A (zh) * 2018-07-16 2019-01-18 北京市商汤科技开发有限公司 特征提取方法和装置、电子设备、存储介质、程序产品
CN109447169A (zh) * 2018-11-02 2019-03-08 北京旷视科技有限公司 图像处理方法及其模型的训练方法、装置和电子系统
CN109711544A (zh) * 2018-12-04 2019-05-03 北京市商汤科技开发有限公司 模型压缩的方法、装置、电子设备及计算机存储介质
CN110472494A (zh) * 2019-06-21 2019-11-19 深圳壹账通智能科技有限公司 脸部特征提取模型训练方法、脸部特征提取方法、装置、设备及存储介质

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2014240213B2 (en) * 2014-09-30 2016-12-08 Canon Kabushiki Kaisha System and Method for object re-identification
KR102477190B1 (ko) * 2015-08-10 2022-12-13 삼성전자주식회사 얼굴 인식 방법 및 장치
CN106372581B (zh) * 2016-08-25 2020-09-04 中国传媒大学 构建及训练人脸识别特征提取网络的方法
CN107704871A (zh) * 2017-09-08 2018-02-16 第四范式(北京)技术有限公司 生成机器学习样本的组合特征的方法及系统
CN108062394A (zh) * 2017-12-18 2018-05-22 北京中关村科金技术有限公司 一种数据集的标注方法及相关装置
CN108764024B (zh) * 2018-04-09 2020-03-24 平安科技(深圳)有限公司 人脸识别模型的生成装置、方法及计算机可读存储介质
CN108875912A (zh) * 2018-05-29 2018-11-23 天津科技大学 一种用于图像识别的神经网络模型
CN110598840B (zh) * 2018-06-13 2023-04-18 富士通株式会社 知识迁移方法、信息处理设备以及存储介质
CN108875833B (zh) * 2018-06-22 2021-07-16 北京儒博科技有限公司 神经网络的训练方法、人脸识别方法及装置
CN109409196A (zh) * 2018-08-30 2019-03-01 深圳壹账通智能科技有限公司 基于人脸的性格预测方法、装置、电子设备
CN109145854A (zh) * 2018-08-31 2019-01-04 东南大学 一种基于级联卷积神经网络结构的人脸检测方法
CN109118119A (zh) * 2018-09-06 2019-01-01 多点生活(成都)科技有限公司 风控模型生成方法及装置
US11494637B2 (en) * 2019-03-28 2022-11-08 International Business Machines Corporation Layer-wise distillation for protecting pre-trained neural network models

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469977A (zh) * 2021-07-06 2021-10-01 浙江霖研精密科技有限公司 一种基于蒸馏学习机制的瑕疵检测装置、方法、存储介质
CN113469977B (zh) * 2021-07-06 2024-01-12 浙江霖研精密科技有限公司 一种基于蒸馏学习机制的瑕疵检测装置、方法、存储介质
CN113657468A (zh) * 2021-07-29 2021-11-16 北京百度网讯科技有限公司 预训练模型的生成方法、装置、电子设备和存储介质
CN113610028A (zh) * 2021-08-13 2021-11-05 广联达科技股份有限公司 行人重识别方法、装置、计算机设备及存储介质
CN113902899A (zh) * 2021-09-29 2022-01-07 北京百度网讯科技有限公司 训练方法、目标检测方法、装置、电子设备以及存储介质
CN114049289A (zh) * 2021-11-10 2022-02-15 合肥工业大学 基于对比学习与StyleGAN2的近红外-可见光人脸图像合成方法
CN114049289B (zh) * 2021-11-10 2024-03-05 合肥工业大学 基于对比学习与StyleGAN2的近红外-可见光人脸图像合成方法
CN114842457A (zh) * 2022-06-29 2022-08-02 小米汽车科技有限公司 模型训练及特征提取方法、装置、电子设备和介质
CN118116087A (zh) * 2024-04-29 2024-05-31 广东康软科技股份有限公司 应用于数字化医疗服务的活体身份验证方法及系统

Also Published As

Publication number Publication date
EP3989104A4 (en) 2022-10-26
EP3989104A1 (en) 2022-04-27
JP2021532434A (ja) 2021-11-25
US20230119593A1 (en) 2023-04-20
KR102385463B1 (ko) 2022-04-12
CN110472494A (zh) 2019-11-19
JP6994588B2 (ja) 2022-01-14
KR20200145827A (ko) 2020-12-30

Similar Documents

Publication Publication Date Title
WO2020253127A1 (zh) 脸部特征提取模型训练方法、脸部特征提取方法、装置、设备及存储介质
CN108898186B (zh) 用于提取图像的方法和装置
WO2021012526A1 (zh) 人脸识别模型的训练方法、人脸识别方法、装置、设备及存储介质
CN109284729B (zh) 基于视频获取人脸识别模型训练数据的方法、装置和介质
WO2020119350A1 (zh) 视频分类方法、装置、计算机设备和存储介质
WO2020228525A1 (zh) 地点识别及其模型训练的方法和装置以及电子设备
CN107679466B (zh) 信息输出方法和装置
CN108197618B (zh) 用于生成人脸检测模型的方法和装置
US11455831B2 (en) Method and apparatus for face classification
CN110321845B (zh) 一种从视频中提取表情包的方法、装置及电子设备
WO2023040506A1 (zh) 一种基于模型的数据处理方法、装置、电子设备、计算机可读存储介质及计算机程序产品
WO2021083069A1 (zh) 用于训练换脸模型的方法和设备
WO2020238353A1 (zh) 数据处理方法和装置、存储介质及电子装置
Chen et al. No-reference screen content image quality assessment with unsupervised domain adaptation
CN111126347B (zh) 人眼状态识别方法、装置、终端及可读存储介质
CN112581355B (zh) 图像处理方法、装置、电子设备和计算机可读介质
WO2022227218A1 (zh) 药名识别方法、装置、计算机设备和存储介质
JP6997369B2 (ja) プログラム、測距方法、及び測距装置
WO2021184754A1 (zh) 视频对比方法、装置、计算机设备和存储介质
WO2023123923A1 (zh) 人体重识别方法、人体重识别装置、计算机设备及介质
TWI803243B (zh) 圖像擴增方法、電腦設備及儲存介質
CN110321778B (zh) 一种人脸图像处理方法、装置和存储介质
WO2020135054A1 (zh) 视频推荐方法、装置、设备及存储介质
JP2012048624A (ja) 学習装置、方法及びプログラム
Dvoršak et al. Kinship verification from ear images: An explorative study with deep learning models

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2020560486

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19919545

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2019919545

Country of ref document: EP