WO2021184898A1 - Facial feature extraction method, apparatus and device - Google Patents

Facial feature extraction method, apparatus and device

Info

Publication number
WO2021184898A1
WO2021184898A1 (PCT/CN2020/140574)
Authority
WO
WIPO (PCT)
Prior art keywords
feature extraction
user
extraction model
vector
face
Prior art date
Application number
PCT/CN2020/140574
Other languages
French (fr)
Chinese (zh)
Inventor
Xu Wei (徐崴)
Original Assignee
Alipay (Hangzhou) Information Technology Co., Ltd. (支付宝(杭州)信息技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay (Hangzhou) Information Technology Co., Ltd.
Publication of WO2021184898A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • One or more embodiments of this specification relate to the field of computer technology, and in particular to a facial feature extraction method, apparatus, and device.
  • One or more embodiments of this specification provide a facial feature extraction method, apparatus, and device for extracting a user's facial features while ensuring the privacy of the user's facial information.
  • An embodiment of this specification provides a facial feature extraction method that uses a privacy-protecting user feature extraction model. The user feature extraction model includes an encoder and a facial feature extraction model, where the facial feature extraction model is obtained by locking together a decoder and a feature extraction model based on a convolutional neural network; the encoder and the decoder constitute an autoencoder, the encoder is connected to the decoder in the facial feature extraction model, and the decoder is connected to the feature extraction model. The method includes: inputting the face image of the user to be identified into the encoder to obtain the encoding vector of the face image output by the encoder, the encoding vector being vector data obtained by performing characterization processing on the face image; and having the decoder in the facial feature extraction model, after receiving the encoding vector, output reconstructed face image data to the feature extraction model, so that the feature extraction model performs characterization processing on the reconstructed face image data and outputs the facial feature vector of the user to be identified.
  • An embodiment of this specification provides a training method for a privacy-protecting user feature extraction model. The method includes: obtaining a first training sample set whose training samples are face images; training the initial autoencoder with the first training sample set to obtain the trained autoencoder; obtaining a second training sample set whose training samples are encoding vectors, each encoding vector being vector data obtained by characterizing a face image with the encoder of the trained autoencoder; inputting the training samples of the second training sample set into the decoder of the initial facial feature extraction model, so as to use the reconstructed face image data output by the decoder to train the initial CNN-based feature extraction model in the initial facial feature extraction model and obtain the trained facial feature extraction model, the initial facial feature extraction model being obtained by locking the decoder together with the initial feature extraction model, where the decoder is the decoder of the trained autoencoder; and generating a privacy-protecting user feature extraction model from the encoder and the trained facial feature extraction model.
  • An embodiment of this specification provides a facial feature extraction apparatus that uses a privacy-protecting user feature extraction model. The user feature extraction model includes an encoder and a facial feature extraction model, where the facial feature extraction model is obtained by locking together a decoder and a CNN-based feature extraction model; the encoder and the decoder constitute an autoencoder, the encoder is connected to the decoder in the facial feature extraction model, and the decoder is connected to the feature extraction model. The apparatus includes: an input module for inputting the face image of the user to be identified into the encoder to obtain the encoding vector of the face image output by the encoder, the encoding vector being vector data obtained by characterizing the face image; and a facial feature vector generation module for having the decoder in the facial feature extraction model, after receiving the encoding vector, output reconstructed face image data to the feature extraction model, so that the feature extraction model performs characterization processing on the reconstructed face image data and then outputs the facial feature vector of the user to be identified.
  • An embodiment of this specification provides a training apparatus for a privacy-protecting user feature extraction model. The apparatus includes: a first acquisition module configured to acquire a first training sample set whose training samples are face images; a first training module used to train the initial autoencoder with the first training sample set to obtain the trained autoencoder; a second acquisition module used to obtain a second training sample set whose training samples are encoding vectors, each encoding vector being vector data obtained by characterizing a face image with the encoder in the trained autoencoder; a second training module used to input the training samples of the second training sample set into the decoder of the initial facial feature extraction model, so as to use the reconstructed face image data output by the decoder to train the initial CNN-based feature extraction model in the initial facial feature extraction model and obtain the trained facial feature extraction model, the initial facial feature extraction model being obtained by locking the decoder together with the initial feature extraction model, where the decoder is the decoder of the trained autoencoder; and a user feature extraction model generation module used to generate a privacy-protecting user feature extraction model from the encoder and the trained facial feature extraction model.
  • An embodiment of this specification provides a client device, including: at least one processor; and a memory communicatively connected to the at least one processor. The memory stores an image encoder and instructions executable by the at least one processor, the image encoder being the encoder in an autoencoder. When the instructions are executed by the at least one processor, the at least one processor can: input the face image of the user to be identified into the image encoder to obtain the encoding vector of the face image output by the image encoder, the encoding vector being vector data obtained by characterizing the face image; and send the encoding vector to the server device, so that the server device uses the facial feature extraction model to generate the facial feature vector of the user to be identified from the encoding vector, the facial feature extraction model being obtained by locking together the decoder and a CNN-based feature extraction model.
  • An embodiment of this specification provides a server device, including: at least one processor; and a memory communicatively connected to the at least one processor. The memory stores a facial feature extraction model obtained by locking together the decoder in an autoencoder and a CNN-based feature extraction model, and also stores instructions executable by the at least one processor. When the instructions are executed by the at least one processor, the at least one processor can: obtain the encoding vector of the face image of the user to be identified, the encoding vector being vector data obtained by characterizing the face image with the encoder in the autoencoder; input the encoding vector into the decoder in the facial feature extraction model so that the decoder outputs reconstructed face image data to the feature extraction model; and have the feature extraction model perform characterization processing on the reconstructed face image data and output the facial feature vector of the user to be identified.
  • An embodiment of this specification provides a training device for a privacy-protecting facial feature extraction model, including: at least one processor; and a memory communicatively connected to the at least one processor. The memory stores instructions executable by the at least one processor, and when the instructions are executed by the at least one processor, the at least one processor can: obtain a first training sample set whose training samples are face images; train the initial autoencoder with the first training sample set to obtain the trained autoencoder; obtain a second training sample set whose training samples are encoding vectors, each encoding vector being vector data obtained by characterizing a face image with the encoder in the trained autoencoder; input the training samples of the second training sample set into the decoder of the initial facial feature extraction model, so as to use the reconstructed face image data output by the decoder to train the initial CNN-based feature extraction model in the initial facial feature extraction model and obtain the trained facial feature extraction model; and generate a privacy-protecting user feature extraction model from the encoder and the trained facial feature extraction model.
  • Embodiments of this specification achieve the following beneficial effects: transmitting, storing, or using the encoding vector of a face image generated by the encoder in the autoencoder does not affect the privacy or security of the user's facial information. The service provider can therefore obtain and process the encoding vector of the face image of the user to be identified and generate that user's facial feature vector without obtaining the original face image, extracting the user's facial feature vector while ensuring the privacy and security of the facial information.
  • Because the facial feature extraction model used to extract the facial feature vector is obtained by locking together the decoder in the autoencoder and the CNN-based feature extraction model, the reconstructed face image data generated by the decoder cannot leak while the model extracts the user's facial feature vector, ensuring the privacy and security of the user's facial information.
  • FIG. 1 is a schematic flowchart of a method for extracting facial features according to an embodiment of this specification
  • FIG. 2 is a schematic structural diagram of a face feature extraction model for privacy protection provided by an embodiment of this specification
  • FIG. 3 is a schematic flowchart of a training method for a facial feature extraction model for privacy protection provided by an embodiment of this specification
  • FIG. 4 is a schematic structural diagram of a face feature extraction device corresponding to FIG. 1 provided by an embodiment of this specification;
  • FIG. 5 is a schematic structural diagram, corresponding to FIG. 3, of a training apparatus for a privacy-protecting facial feature extraction model provided by an embodiment of this specification.
  • At present, the user's face image is usually preprocessed before the facial feature vector is extracted: principal component information is first extracted from the face picture, part of the detailed information is discarded, and the facial feature vector is generated from the principal component information. A facial feature vector generated this way therefore suffers from loss of facial feature information, so the accuracy of currently extracted facial feature vectors is poor.
  • FIG. 1 is a schematic flowchart of a method for extracting facial features provided by an embodiment of this specification.
  • the method uses a face feature extraction model for privacy protection to extract face feature vectors.
  • FIG. 2 is a schematic structural diagram of a face feature extraction model for privacy protection provided by an embodiment of this specification.
  • As shown in FIG. 2, the privacy-protecting user feature extraction model 201 includes an encoder 202 and a facial feature extraction model 203. The facial feature extraction model 203 is obtained by locking together a decoder 204 and a feature extraction model 205 based on a convolutional neural network, where the encoder 202 and the decoder 204 form an autoencoder. The encoder 202 is connected to the decoder 204 in the facial feature extraction model 203, and the decoder 204 is connected to the feature extraction model 205.
  • The process shown in FIG. 1 may be executed by a user facial feature extraction system or by a program carried on that system. The user facial feature extraction system may include a client device and a server device: the client device may carry the encoder of the privacy-protecting user feature extraction model, and the server device may carry its facial feature extraction model.
  • The process may include steps 102 to 104.
  • Step 102: Input the face image of the user to be identified into the encoder to obtain the encoding vector of the face image output by the encoder, where the encoding vector is vector data obtained by performing characterization processing on the face image.
  • In practice, when a user uses an application, he or she usually needs to register an account with it. When the user logs in to or unlocks the registered account, or makes a payment with it, user identification usually has to be performed on the operating user of the account (that is, the user to be identified), and the user to be identified is allowed to perform subsequent operations only after being confirmed as the authenticated user (that is, the designated user) of the account. Similarly, where a user needs to pass through an access control system, the user (the user to be identified) is usually identified first and allowed through only after being confirmed as a whitelisted user (the designated user) of the system.
  • When performing user recognition based on face recognition technology, the client device usually needs to collect the face image of the user to be identified and extract the encoding vector of that face image with the encoder carried on the device. The client device may then send the encoding vector to the server device, so that the server device generates the facial feature vector of the user to be identified from the encoding vector and performs user identification based on it.
  • The encoder in step 102 may be the encoder in an autoencoder (AE). An autoencoder is a network model structure in deep learning whose characteristic is that the input image itself serves as the supervision information: the network is trained with the goal of reconstructing the input image, thereby encoding it. Because the autoencoder needs no supervision information other than the input image, its training cost is low, making it economical and practical.
  • An autoencoder usually includes two parts: an encoder and a decoder. The encoder can encode a face image to obtain the encoding vector of the face image, and the decoder can reconstruct the face image from that encoding vector to obtain a reconstructed face image.
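  • As a concrete illustration, the following is a minimal sketch of such an autoencoder. PyTorch is assumed, and all layer types, layer sizes, and the 64x64 input resolution are illustrative choices rather than details fixed by this specification.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Encoder half of the autoencoder: face image -> encoding vector."""
    def __init__(self, latent_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 64x64 -> 32x32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 32x32 -> 16x16
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, latent_dim),                   # bottleneck: the encoding vector
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

class Decoder(nn.Module):
    """Decoder half: encoding vector -> reconstructed face image data."""
    def __init__(self, latent_dim: int = 128):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 64 * 16 * 16)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),    # 16x16 -> 32x32
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),  # 32x32 -> 64x64
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(self.fc(z).view(-1, 64, 16, 16))
```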
  • In this way, the service provider can transmit, store, and process the encoding vector of the face image without affecting the security and privacy of the identified user's facial information.
  • An autoencoder is an artificial neural network that learns its input data through unsupervised learning and can represent that data efficiently and accurately. The facial feature information contained in the encoding vector generated by the encoder in the autoencoder is therefore comprehensive and low in noise, so extracting the facial feature vector from this encoding vector improves the accuracy of the resulting facial feature vector, which in turn helps improve the accuracy of user recognition results generated from it.
  • In this embodiment of the specification, the face image of the user to be identified may be a multi-channel face image. Specifically, single-channel image data of the user to be identified can be determined first, and the multi-channel face image of the user to be identified is then generated from that single-channel data, with the image data of every channel identical to the single-channel image data.
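  • A minimal sketch of this channel replication step follows, assuming NumPy and an (H, W) array of single-channel data; the function name is illustrative.

```python
import numpy as np

def to_multichannel(gray: np.ndarray, channels: int = 3) -> np.ndarray:
    """Replicate single-channel (H, W) image data across every channel."""
    assert gray.ndim == 2, "expected single-channel (H, W) image data"
    return np.repeat(gray[..., np.newaxis], channels, axis=-1)  # (H, W, channels)

gray = np.random.rand(64, 64).astype(np.float32)   # stand-in for real image data
multi = to_multichannel(gray)
assert all(np.array_equal(multi[..., c], gray) for c in range(multi.shape[-1]))
```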
  • Step 104: After the decoder in the facial feature extraction model receives the encoding vector, it outputs reconstructed face image data to the feature extraction model, so that the feature extraction model performs characterization processing on the reconstructed face image data and then outputs the facial feature vector of the user to be identified.
  • Because the training goal of the autoencoder is to minimize the difference between the reconstructed face image and the original face image, it is not trained to classify users' faces. If the encoding vector extracted by the encoder in the autoencoder were used directly as the facial feature vector of the user to be identified, the accuracy of the user recognition result would therefore be poor.
  • For this reason, a facial feature extraction model obtained by locking together the decoder in the autoencoder and a CNN-based feature extraction model can be deployed on the server device. The decoder can generate reconstructed face image data from the encoding vector of the face image of the user to be identified, and the CNN-based feature extraction model can classify that reconstructed data, so the output vector of the CNN-based feature extraction model can serve as the facial feature vector of the user to be identified, improving the accuracy of the user recognition result generated from it.
  • The CNN-based feature extraction model in the facial feature extraction model is used to extract the facial feature vector from the reconstructed face image, and it can be implemented with existing CNN-based face recognition models such as DeepFace, FaceNet, MTCNN, or RetinaFace. The facial feature extraction model therefore has good compatibility.
  • Because the decoder in the facial feature extraction model decrypts the encoding vector of the face image of the user to be identified, the reconstructed face image data obtained after this decryption is highly similar to the face image of the user to be identified, so the facial feature vector extracted from it by the CNN-based feature extraction model is accurate.
  • In practice, encryption software can be used to lock the decoder in the autoencoder together with the CNN-based feature extraction model, or the decoder and the feature extraction model can be stored in a secure hardware module of the device, so that users cannot read the reconstructed face image data output by the decoder, thereby ensuring the privacy of the user's facial information. There are many ways to lock the decoder and the feature extraction model, and this is not specifically limited; it suffices to ensure the security of the reconstructed face image data output by the decoder in the autoencoder.
  • In addition, after the service provider or another user obtains read permission for the reconstructed face image data of the user to be identified, they may also read the reconstructed face image data output by the decoder in the facial feature extraction model on the basis of that permission, which helps improve data utilization.
  • With the method in FIG. 1, the service provider can extract the facial feature vector from the encoding vector of the face image of the user to be identified and therefore never needs to obtain the face image itself. This avoids the transmission, storage, and use of the face image of the user to be identified and ensures the privacy and security of that user's facial information. Moreover, because the CNN-based feature extraction model extracts the facial feature vector from the reconstructed face image, the accuracy of the extracted facial feature vector of the user to be identified is good.
  • In this embodiment of the specification, the encoder may include the input layer, first hidden layer, and bottleneck layer of the autoencoder, and the decoder may include the autoencoder's second hidden layer and output layer. The input layer of the encoder is connected to the first hidden layer, the first hidden layer to the bottleneck layer, the bottleneck layer of the encoder to the second hidden layer of the decoder, the second hidden layer to the output layer, and the output layer to the feature extraction model.
  • The input layer may be used to receive the face image of the user to be identified. The first hidden layer may encode the face image to obtain a first feature vector. The bottleneck layer may perform dimensionality reduction on the first feature vector to obtain the encoding vector of the face image, where the dimensionality of the encoding vector is smaller than that of the first feature vector. The second hidden layer may decode the encoding vector to obtain a second feature vector, and the output layer may generate reconstructed face image data from the second feature vector. The first and second hidden layers may each comprise multiple convolutional layers and may also include pooling layers and fully connected layers. The bottleneck layer serves to reduce the feature dimension: the feature vector output by the hidden layer connected to the bottleneck layer has a higher dimensionality than the feature vector output by the bottleneck layer.
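  • Continuing the autoencoder sketch above, the following usage example traces these layer roles and the bottleneck's dimensionality reduction; the dimensions are the illustrative ones chosen earlier.

```python
# Reuses the Encoder/Decoder classes from the sketch above.
import torch

enc, dec = Encoder(latent_dim=128), Decoder(latent_dim=128)
x = torch.rand(1, 3, 64, 64)   # face image received by the input layer
z = enc(x)                     # bottleneck output: the low-dimensional encoding vector
x_rec = dec(z)                 # output layer: reconstructed face image data
print(z.shape, x_rec.shape)    # torch.Size([1, 128]) torch.Size([1, 3, 64, 64])
```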
  • The CNN-based feature extraction model may include an input layer, a convolutional layer, a fully connected layer, and an output layer, where the input layer is connected to the output of the decoder and to the convolutional layer, the convolutional layer to the fully connected layer, and the fully connected layer to the output layer. The input layer may receive the reconstructed face image data output by the decoder; the convolutional layer may perform local feature extraction on the reconstructed face image data to obtain the local facial feature vector of the user to be identified; the fully connected layer may generate the facial feature vector of the user to be identified from the local facial feature vector; and the output layer may generate a face classification result from the facial feature vector output by the fully connected layer.
  • The facial feature vector of the user to be identified may be the output vector of the fully connected layer adjacent to the output layer; alternatively, when the CNN-based feature extraction model contains multiple fully connected layers, it may be the output vector of a fully connected layer separated from the output layer by N network layers. This is not specifically limited.
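  • The sketch below illustrates such a CNN-based feature extraction model, taking the facial feature vector from the fully connected layer adjacent to the classification output layer. The architecture, class name, and dimensions are assumptions for illustration; as noted above, an existing model such as FaceNet could fill this role instead.

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """CNN feature extractor: reconstructed face image -> facial feature vector."""
    def __init__(self, num_classes: int = 1000, feat_dim: int = 256):
        super().__init__()
        self.conv = nn.Sequential(                     # local feature extraction
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 64 -> 32
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
            nn.Flatten(),
        )
        self.fc = nn.Linear(64 * 16 * 16, feat_dim)    # facial feature vector
        self.out = nn.Linear(feat_dim, num_classes)    # face classification output

    def forward(self, x_rec: torch.Tensor, return_features: bool = False) -> torch.Tensor:
        feat = self.fc(self.conv(x_rec))
        return feat if return_features else self.out(feat)
```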
  • In this embodiment of the specification, the facial feature extraction model may further include a user matching model, whose input may be connected to the output of the CNN-based feature extraction model in the facial feature extraction model.
  • After step 104, the method may further include: having the user matching model receive the facial feature vector of the user to be identified and the facial feature vector of the designated user, and generate, from the vector distance between the two facial feature vectors, output information indicating whether the user to be identified is the designated user, where the facial feature vector of the designated user is obtained by processing the designated user's face image with the encoder and the facial feature extraction model.
  • The vector distance between the facial feature vector of the user to be identified and that of the designated user can be used to indicate the similarity between the two vectors. Specifically, when the vector distance is less than or equal to a threshold, the user to be identified and the designated user can be determined to be the same user; when the vector distance is greater than the threshold, they can be determined to be different users.
  • The threshold can be determined according to actual needs and is not specifically limited.
  • Both the facial feature vector of the user to be identified and that of the designated user may be generated with the method in FIG. 1. Since facial feature vectors generated by the method in FIG. 1 are more accurate, this helps improve the accuracy of the user recognition result.
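  • A minimal sketch of this matching rule follows. The Euclidean metric and the example threshold value are assumptions, since the specification fixes neither.

```python
import torch

def same_user(feat_a: torch.Tensor, feat_b: torch.Tensor, threshold: float = 1.0) -> bool:
    """Same user iff the vector distance is at or below the threshold."""
    return torch.dist(feat_a, feat_b).item() <= threshold
```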
  • FIG. 3 is a schematic flowchart of a method for training a face recognition model provided by an embodiment of this specification. From a program perspective, the process may be executed by a server or by a program carried on a server. As shown in FIG. 3, the process may include steps 302 to 310.
  • Step 302: Obtain a first training sample set whose training samples are face images.
  • The training samples in the first training sample set are face images for which usage rights have been obtained, for example, public face images in face databases or face pictures authorized by their users, so that the training process of the face recognition model does not affect the privacy of users' facial information.
  • The training samples in the first training sample set may be multi-channel face images. Specifically, the single-channel image data of a face image may be determined first, and a multi-channel image generated from that single-channel data to serve as a training sample in the first training sample set, with the image data of every channel identical to the single-channel image data, thereby ensuring the consistency of the training samples in the first training sample set.
  • Step 304: Use the first training sample set to train the initial autoencoder to obtain the trained autoencoder.
  • Step 304 may specifically include: for each training sample in the first training sample set, inputting the training sample into the initial autoencoder to obtain reconstructed face image data, and optimizing the model parameters of the initial autoencoder with the goal of minimizing the image reconstruction loss, so as to obtain the trained autoencoder; the image reconstruction loss is the difference between the reconstructed face image data and the training sample.
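  • A minimal training-loop sketch for step 304 follows, reusing the Encoder/Decoder classes from the earlier sketch. The optimizer, learning rate, and the use of mean squared error as the image reconstruction loss are assumptions, and first_training_sample_set is a hypothetical stand-in for a real loader of face image batches.

```python
import torch
import torch.nn as nn

autoencoder = nn.Sequential(Encoder(), Decoder())   # initial autoencoder
optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
recon_loss = nn.MSELoss()   # image reconstruction loss (difference to the sample)

for face in first_training_sample_set:      # hypothetical iterable of image batches
    optimizer.zero_grad()
    reconstructed = autoencoder(face)
    loss = recon_loss(reconstructed, face)  # minimize the reconstruction difference
    loss.backward()
    optimizer.step()
```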
  • In the autoencoder, the input layer, the first hidden layer, and the bottleneck layer constitute the encoder, and the second hidden layer and the output layer constitute the decoder. The encoder can encode a face image to obtain its encoding vector, and the decoder can decode the encoding vector generated by the encoder to obtain a reconstructed face image.
  • The function of each layer of the autoencoder may be the same as described for the embodiment of the method in FIG. 1, and is not repeated here.
  • Step 306: Obtain a second training sample set whose training samples are encoding vectors, where each encoding vector is vector data obtained by performing characterization processing on a face image with the encoder in the trained autoencoder.
  • In this embodiment of the specification, the training samples in the second training sample set may be vector data obtained by using the encoder in the trained autoencoder to characterize the face images of users who need privacy protection. Which users need privacy protection can be determined according to actual needs, for example, the operating user and the authenticated user of a registered account at an application, or the users to be identified and the whitelisted users at a face-recognition-based access control point.
  • In practice, the encoder in the trained autoencoder can be used in advance to generate and store the training samples of the second training sample set, so that in step 306 the pre-generated training samples only need to be retrieved from the database. Because the training samples stored in the database are encoding vectors of users' face images, and an encoding vector cannot reveal the appearance of the user, the service provider's transmission, storage, and processing of these training samples does not affect the privacy of users' facial information.
  • Step 308: Input the training samples in the second training sample set into the decoder of the initial facial feature extraction model, so that the reconstructed face image data output by the decoder can be used to train the initial CNN-based feature extraction model in the initial facial feature extraction model and obtain the trained facial feature extraction model; the initial facial feature extraction model is obtained by locking the decoder together with the initial feature extraction model, the decoder being the decoder in the trained autoencoder.
  • During this training, the model parameters of the decoder in the initial facial feature extraction model are not optimized; only the model parameters of the initial CNN-based feature extraction model are optimized.
  • Using the reconstructed face image data output by the decoder to train the initial CNN-based feature extraction model in the initial facial feature extraction model may specifically include: classifying the reconstructed face image data with the initial feature extraction model to obtain the predicted value of the category label of the reconstructed face image data; obtaining the preset value of the category label for the reconstructed face image data; and optimizing the model parameters of the initial feature extraction model with the goal of minimizing the classification loss, where the classification loss is the difference between the predicted value and the preset value of the category label.
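  • The sketch below illustrates step 308, reusing the Decoder and FeatureExtractor classes from the earlier sketches. Freezing the decoder's parameters captures only the "not optimized" aspect; the cryptographic or hardware locking of the deployed model is a separate measure. Cross-entropy as the classification loss, and second_training_sample_set as a hypothetical loader of (encoding vector, category label) pairs, are assumptions.

```python
import torch
import torch.nn as nn

decoder = Decoder()                      # decoder from the trained autoencoder
for p in decoder.parameters():
    p.requires_grad = False              # decoder parameters are never optimized

extractor = FeatureExtractor()           # initial CNN-based feature extraction model
optimizer = torch.optim.Adam(extractor.parameters(), lr=1e-3)
cls_loss = nn.CrossEntropyLoss()         # classification loss (predicted vs. preset label)

for z, label in second_training_sample_set:   # hypothetical (vector, label) loader
    optimizer.zero_grad()
    with torch.no_grad():
        x_rec = decoder(z)                    # reconstructed face image data
    loss = cls_loss(extractor(x_rec), label)
    loss.backward()
    optimizer.step()
```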
  • Step 310: Generate a privacy-protecting user feature extraction model from the encoder and the trained facial feature extraction model.
  • The input of the encoder receives the face image of the user to be identified; the output of the encoder is connected to the input of the decoder in the trained facial feature extraction model; the output of the decoder is connected to the input of the CNN-based feature extraction model in the trained facial feature extraction model; and the output of the CNN-based feature extraction model is the facial feature vector of the user to be identified.
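  • Putting the trained pieces together for step 310, a minimal composition sketch follows (the class name is illustrative). In deployment, the encoder would run on the client device while the locked decoder and feature extractor run on the server device.

```python
import torch
import torch.nn as nn

class UserFeatureExtractionModel(nn.Module):
    """Privacy-protecting pipeline: face image -> facial feature vector."""
    def __init__(self, encoder: nn.Module, decoder: nn.Module, extractor: nn.Module):
        super().__init__()
        self.encoder, self.decoder, self.extractor = encoder, decoder, extractor

    def forward(self, face: torch.Tensor) -> torch.Tensor:
        z = self.encoder(face)      # encoding vector (client side in deployment)
        x_rec = self.decoder(z)     # reconstructed image, kept inside the locked model
        return self.extractor(x_rec, return_features=True)

model = UserFeatureExtractionModel(Encoder(), Decoder(), FeatureExtractor())
features = model(torch.rand(1, 3, 64, 64))   # facial feature vector, shape (1, 256)
```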
  • With the method in FIG. 3, the autoencoder and the initial facial feature extraction model are trained, and a privacy-protecting facial feature extraction model is built from the trained autoencoder and the trained facial feature extraction model. Because the autoencoder needs no supervision information other than the input image during training, the training cost of the privacy-protecting facial feature extraction model is low, making it economical and practical.
  • The user feature extraction model generated by the method in FIG. 3 can be applied to user recognition scenarios. After it is used to extract a user's facial feature vector, the facial feature vector usually needs to be compared to generate the final user recognition result.
  • Before the privacy-protecting facial feature extraction model is generated in step 310, the method may further include: establishing a user matching model used to generate, from the vector distance between the first facial feature vector of the user to be identified and the second facial feature vector of the designated user, an output result indicating whether the user to be identified is the designated user; the first facial feature vector is obtained by processing the face image of the user to be identified with the encoder and the trained facial feature extraction model, and the second facial feature vector is obtained by processing the designated user's face image with the encoder and the trained facial feature extraction model. Step 310 may then specifically include: generating a privacy-protecting user feature extraction model composed of the encoder, the trained facial feature extraction model, and the user matching model.
  • FIG. 4 is a schematic structural diagram of a facial feature extraction device corresponding to FIG. 1 provided by an embodiment of this specification.
  • The apparatus uses a privacy-protecting user feature extraction model. The user feature extraction model may include an encoder and a facial feature extraction model, the facial feature extraction model being a model obtained by locking together a decoder and a CNN-based feature extraction model, where the encoder and the decoder form an autoencoder; the encoder is connected to the decoder in the facial feature extraction model, and the decoder is connected to the feature extraction model.
  • The apparatus may include: an input module 402, used to input the face image of the user to be identified into the encoder to obtain the encoding vector of the face image output by the encoder, the encoding vector being vector data obtained by characterizing the face image; and a facial feature vector generation module 404, used to have the decoder in the facial feature extraction model, after receiving the encoding vector, output reconstructed face image data to the feature extraction model, so that the feature extraction model performs characterization processing on the reconstructed face image data and then outputs the facial feature vector of the user to be identified.
  • The encoder may include the input layer, first hidden layer, and bottleneck layer of the autoencoder, and the decoder may include the autoencoder's second hidden layer and output layer. The input layer of the encoder is connected to the first hidden layer, the first hidden layer to the bottleneck layer, the bottleneck layer of the encoder to the second hidden layer of the decoder, the second hidden layer to the output layer, and the output layer to the feature extraction model.
  • The input layer of the autoencoder may be used to receive the face image of the user to be identified; the first hidden layer may encode the face image to obtain a first feature vector; the bottleneck layer may perform dimensionality reduction on the first feature vector to obtain the encoding vector of the face image, where the dimensionality of the encoding vector is less than that of the first feature vector; the second hidden layer may decode the encoding vector to obtain a second feature vector; and the output layer may generate reconstructed face image data from the second feature vector.
  • The CNN-based feature extraction model may include an input layer, a convolutional layer, and a fully connected layer, where the input layer is connected to the output of the decoder and to the convolutional layer, and the convolutional layer is connected to the fully connected layer. The input layer of the CNN-based feature extraction model may receive the reconstructed face image data output by the decoder; the convolutional layer may perform local feature extraction on the reconstructed face image data to obtain the local facial feature vector of the user to be identified; and the fully connected layer is used to generate the facial feature vector of the user to be identified from the local facial feature vector.
  • The user feature extraction model may further include a user matching model connected to the feature extraction model. In that case the apparatus may further include a user matching module configured to have the user matching model receive the facial feature vector of the user to be identified and the facial feature vector of the designated user, and generate, from the vector distance between the two facial feature vectors, output information indicating whether the user to be identified is the designated user, where the facial feature vector of the designated user is obtained by processing the designated user's face image with the encoder and the facial feature extraction model.
  • FIG. 5 is a schematic structural diagram, corresponding to FIG. 3, of a training apparatus for a privacy-protecting facial feature extraction model provided by an embodiment of this specification. As shown in FIG. 5, the apparatus may include the following modules.
  • the first obtaining module 502 is configured to obtain a first training sample set, and the training samples in the first training sample set are face images.
  • the first training module 504 is configured to use the first training sample set to train the initial autoencoder to obtain the trained autoencoder.
  • The second acquisition module 506 is configured to acquire a second training sample set whose training samples are encoding vectors, each encoding vector being vector data obtained by characterizing a face image with the encoder in the trained autoencoder.
  • The second training module 508 is configured to input the training samples in the second training sample set into the decoder of the initial facial feature extraction model, so that the reconstructed face image data output by the decoder can be used to train the initial CNN-based feature extraction model in the initial facial feature extraction model and obtain the trained facial feature extraction model; the initial facial feature extraction model is obtained by locking the decoder together with the initial feature extraction model, the decoder being the decoder in the trained autoencoder.
  • the user feature extraction model generation module 510 is configured to generate a user feature extraction model for privacy protection according to the encoder and the trained face feature extraction model.
  • The first training module 504 may be specifically configured to: for each training sample in the first training sample set, input the training sample into the initial autoencoder to obtain reconstructed face image data, and optimize the model parameters of the initial autoencoder with the goal of minimizing the image reconstruction loss to obtain the trained autoencoder; the image reconstruction loss is the difference between the reconstructed face image data and the training sample.
  • Using the reconstructed face image data output by the decoder to train the initial CNN-based feature extraction model in the initial facial feature extraction model may specifically include: classifying the reconstructed face image data with the initial feature extraction model to obtain the predicted value of the category label of the reconstructed face image data; obtaining the preset value of the category label for the reconstructed face image data; and optimizing the model parameters of the initial feature extraction model with the goal of minimizing the classification loss, where the classification loss is the difference between the predicted value and the preset value of the category label.
  • The apparatus in FIG. 5 may further include a user matching model establishment module configured to establish a user matching model, the user matching model being used to generate, from the vector distance between the first facial feature vector of the user to be identified and the second facial feature vector of the designated user, output information indicating whether the user to be identified is the designated user; the first facial feature vector is obtained by processing the face image of the user to be identified with the encoder and the trained facial feature extraction model, and the second facial feature vector is obtained by processing the designated user's face image with the encoder and the trained facial feature extraction model.
  • the user feature extraction model generation module 510 may be specifically used to generate a user feature extraction model for privacy protection composed of the encoder, the trained facial feature extraction model, and the user matching model.
  • the embodiment of this specification also provides a client device corresponding to the above method.
  • The client device may include: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores an image encoder and instructions executable by the at least one processor, the image encoder being the encoder in an autoencoder. When the instructions are executed by the at least one processor, the at least one processor can: input the face image of the user to be identified into the image encoder to obtain the encoding vector of the face image output by the image encoder, the encoding vector being vector data obtained by characterizing the face image; and send the encoding vector to the server device, so that the server device uses the facial feature extraction model to generate the facial feature vector of the user to be identified from the encoding vector, the facial feature extraction model being a model obtained by locking together the decoder in the autoencoder and a CNN-based feature extraction model.
  • With this client device, the encoder of the autoencoder carried on the device generates the encoding vector of the face image of the user to be identified, so the client device can send that encoding vector to the server device for user identification without sending the face image itself. This avoids transmitting the face image of the user to be identified and ensures the privacy and security of that user's facial information.
  • the embodiment of this specification also provides a server device corresponding to the above method.
  • The server device may include: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores a facial feature extraction model obtained by locking together the decoder in an autoencoder and a CNN-based feature extraction model. The memory also stores instructions executable by the at least one processor; when the instructions are executed, the at least one processor can obtain the encoding vector of the face image of the user to be identified, input the encoding vector into the decoder in the facial feature extraction model so that the decoder outputs reconstructed face image data to the feature extraction model, and have the feature extraction model perform characterization processing on the reconstructed face image data and output the facial feature vector of the user to be identified.
  • With this server device, the facial feature vector of the user to be identified can be generated from the encoding vector of that user's face image by the facial feature extraction model carried on the device, so the server device can perform user identification without obtaining the face image of the user to be identified. This not only avoids the transmission of the face image of the user to be identified but also prevents the server device from storing and processing it, improving the privacy and security of the facial information of the user to be identified.
  • the embodiment of this specification also provides a training device for the facial feature extraction model for privacy protection corresponding to the method in FIG. 3.
  • The device may include: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor. When the instructions are executed by the at least one processor, the at least one processor can: obtain a first training sample set whose training samples are face images.
  • the initial autoencoder is trained by using the first training sample set to obtain the trained autoencoder.
  • A second training sample set is obtained, whose training samples are encoding vectors; each encoding vector is vector data obtained by characterizing a face image with the encoder in the trained autoencoder.
  • The training samples in the second training sample set are input into the decoder of the initial facial feature extraction model, so that the reconstructed face image data output by the decoder can be used to train the initial CNN-based feature extraction model in the initial facial feature extraction model and obtain the trained facial feature extraction model; the initial facial feature extraction model is obtained by locking the decoder together with the initial feature extraction model, the decoder being the decoder in the trained autoencoder.
  • Finally, a privacy-protecting user feature extraction model is generated from the encoder and the trained facial feature extraction model.
  • An improvement to a technology can be clearly distinguished as a hardware improvement (for example, an improvement to circuit structures such as diodes, transistors, or switches) or a software improvement (an improvement to a method flow). With the development of technology, however, improvements to many of today's method flows can be regarded as direct improvements to hardware circuit structures: designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. It therefore cannot be said that an improvement to a method flow cannot be realized by a hardware entity module.
  • For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device. The programming is written in a specific programming language called a hardware description language (HDL), of which there is not just one kind but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), HDCal, JHDL, Lava, Lola, MyHDL, PALASM, RHDL, VHDL (Very-High-Speed Integrated Circuit Hardware Description Language), and Verilog.
  • The controller can be implemented in any suitable manner. For example, the controller can take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320; a memory controller can also be implemented as part of the memory's control logic. In addition to implementing a controller purely as computer-readable program code, it is entirely possible to program the method steps so that the controller realizes the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller can therefore be regarded as a hardware component, and the devices included in it for realizing various functions can also be regarded as structures within the hardware component; or the devices for realizing various functions can even be regarded as both software modules implementing the method and structures within the hardware component.
  • a typical implementation device is a computer.
  • The computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or any combination of these devices.
  • One or more embodiments of this specification can be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, one or more embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
  • These computer program instructions can be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing equipment to produce a machine, so that the instructions executed by that processor produce a device that realizes the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram. These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in that memory produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram. These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps are executed on it to produce computer-implemented processing, the instructions executed there providing steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • In a typical configuration, the computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory. The memory may include non-persistent storage in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, and can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape storage, or other magnetic storage devices, or any other non-transmission media that can store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media such as modulated data signals and carrier waves.
  • One or more embodiments of this specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules.
  • Program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types.
  • One or more embodiments of this specification can also be practiced in distributed computing environments, where tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Bioethics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

Facial feature extraction method, apparatus, and device for privacy protection. The method comprises: inputting a facial image of a user to be identified into an encoder to obtain an encoding vector of the facial image output by the encoder, the encoding vector being vector data obtained by performing feature processing on the facial image (102); and, after a decoder in a facial feature extraction model receives the encoding vector, outputting reconstructed facial image data to a feature extraction model in the facial feature extraction model, so that the feature extraction model performs feature processing on the reconstructed facial image data and then outputs a facial feature vector of the user to be identified (104).

Description

Facial feature extraction method, apparatus and device
Technical Field
One or more embodiments of this specification relate to the field of computer technology, and in particular to a facial feature extraction method, apparatus, and device.
Background
With the development of computer technology and optical imaging technology, user identification based on face recognition technology is becoming increasingly popular. At present, the face image of a user to be identified, collected by a client device, usually needs to be sent to a server device, so that the server device can extract a facial feature vector from that face image and generate a user identification result based on the facial feature vector. Since the face image of the user to be identified is sensitive user information, this approach of sending the face image to another device for feature extraction carries the risk of leaking the user's sensitive information.
Based on this, how to extract a user's facial features while ensuring the privacy of the user's facial information has become an urgent technical problem to be solved.
Summary of the Invention
In view of this, one or more embodiments of this specification provide a facial feature extraction method, apparatus, and device for extracting a user's facial features while ensuring the privacy of the user's facial information.
To solve the above technical problem, the embodiments of this specification are implemented as follows.
An embodiment of this specification provides a facial feature extraction method. The method uses a user feature extraction model for privacy protection. The user feature extraction model includes an encoder and a facial feature extraction model, where the facial feature extraction model is obtained by locking a decoder and a feature extraction model based on a convolutional neural network, and the encoder and the decoder constitute an autoencoder. The encoder is connected to the decoder in the facial feature extraction model, and the decoder is connected to the feature extraction model. The method includes: inputting a face image of a user to be identified into the encoder to obtain an encoding vector of the face image output by the encoder, the encoding vector being vector data obtained by performing feature processing on the face image; and, after the decoder in the facial feature extraction model receives the encoding vector, outputting reconstructed face image data to the feature extraction model, so that the feature extraction model performs feature processing on the reconstructed face image data and then outputs a facial feature vector of the user to be identified.
An embodiment of this specification provides a training method for a user feature extraction model for privacy protection. The method includes: obtaining a first training sample set, where the training samples in the first training sample set are face images; training an initial autoencoder with the first training sample set to obtain a trained autoencoder; obtaining a second training sample set, where the training samples in the second training sample set are encoding vectors, each encoding vector being vector data obtained by performing feature processing on a face image with the encoder in the trained autoencoder; inputting the training samples in the second training sample set into the decoder of an initial facial feature extraction model, so that the reconstructed face image data output by the decoder is used to train the convolutional-neural-network-based initial feature extraction model in the initial facial feature extraction model, obtaining a trained facial feature extraction model, where the initial facial feature extraction model is obtained by locking the decoder and the initial feature extraction model, and the decoder is the decoder in the trained autoencoder; and generating a user feature extraction model for privacy protection according to the encoder and the trained facial feature extraction model.
An embodiment of this specification provides a facial feature extraction apparatus. The apparatus uses a user feature extraction model for privacy protection. The user feature extraction model includes an encoder and a facial feature extraction model, where the facial feature extraction model is obtained by locking a decoder and a feature extraction model based on a convolutional neural network, and the encoder and the decoder constitute an autoencoder. The encoder is connected to the decoder in the facial feature extraction model, and the decoder is connected to the feature extraction model. The apparatus includes: an input module, configured to input the face image of a user to be identified into the encoder to obtain an encoding vector of the face image output by the encoder, the encoding vector being vector data obtained by performing feature processing on the face image; and a facial feature vector generation module, configured to cause the decoder in the facial feature extraction model, after receiving the encoding vector, to output reconstructed face image data to the feature extraction model, so that the feature extraction model performs feature processing on the reconstructed face image data and then outputs a facial feature vector of the user to be identified.
An embodiment of this specification provides a training apparatus for a user feature extraction model for privacy protection. The apparatus includes: a first acquisition module, configured to obtain a first training sample set, where the training samples in the first training sample set are face images; a first training module, configured to train an initial autoencoder with the first training sample set to obtain a trained autoencoder; a second acquisition module, configured to obtain a second training sample set, where the training samples in the second training sample set are encoding vectors, each encoding vector being vector data obtained by performing feature processing on a face image with the encoder in the trained autoencoder; a second training module, configured to input the training samples in the second training sample set into the decoder of an initial facial feature extraction model, so that the reconstructed face image data output by the decoder is used to train the convolutional-neural-network-based initial feature extraction model in the initial facial feature extraction model, obtaining a trained facial feature extraction model, where the initial facial feature extraction model is obtained by locking the decoder and the initial feature extraction model, and the decoder is the decoder in the trained autoencoder; and a user feature extraction model generation module, configured to generate a user feature extraction model for privacy protection according to the encoder and the trained facial feature extraction model.
An embodiment of this specification provides a client device, including: at least one processor; and a memory communicatively connected to the at least one processor. The memory stores an image encoder and instructions executable by the at least one processor, where the image encoder is the encoder of an autoencoder, and the instructions are executed by the at least one processor to enable the at least one processor to: input the face image of a user to be identified into the image encoder to obtain an encoding vector of the face image output by the image encoder, the encoding vector being vector data obtained by performing feature processing on the face image; and send the encoding vector to a server device, so that the server device uses a facial feature extraction model to generate a facial feature vector of the user to be identified from the encoding vector, where the facial feature extraction model is obtained by locking the decoder of the autoencoder and a feature extraction model based on a convolutional neural network.
An embodiment of this specification provides a server device, including: at least one processor; and a memory communicatively connected to the at least one processor. The memory stores a facial feature extraction model obtained by locking the decoder of an autoencoder and a feature extraction model based on a convolutional neural network, and also stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to: obtain an encoding vector of the face image of a user to be identified, the encoding vector being vector data obtained by performing feature processing on the face image with the encoder of the autoencoder; and input the encoding vector into the decoder in the facial feature extraction model, whereupon the decoder outputs reconstructed face image data to the feature extraction model, so that the feature extraction model performs feature processing on the reconstructed face image data and then outputs the facial feature vector of the user to be identified.
An embodiment of this specification provides a training device for a facial feature extraction model for privacy protection, including: at least one processor; and a memory communicatively connected to the at least one processor. The memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to: obtain a first training sample set, where the training samples in the first training sample set are face images; train an initial autoencoder with the first training sample set to obtain a trained autoencoder; obtain a second training sample set, where the training samples in the second training sample set are encoding vectors, each encoding vector being vector data obtained by performing feature processing on a face image with the encoder in the trained autoencoder; input the training samples in the second training sample set into the decoder of an initial facial feature extraction model, so that the reconstructed face image data output by the decoder is used to train the convolutional-neural-network-based initial feature extraction model in the initial facial feature extraction model, obtaining a trained facial feature extraction model, where the initial facial feature extraction model is obtained by locking the decoder and the initial feature extraction model, and the decoder is the decoder in the trained autoencoder; and generate a user feature extraction model for privacy protection according to the encoder and the trained facial feature extraction model.
An embodiment of this specification can achieve the following beneficial effects. Since the transmission, storage, or use of the encoding vector of a face image generated by the encoder of the autoencoder does not affect the privacy or security of the user's facial information, a service provider can obtain and process the encoding vector of the face image of the user to be identified to generate the facial feature vector of the user to be identified, without obtaining the original face image. The user's facial feature vector can thus be extracted while the privacy and security of the user's facial information are guaranteed.
Moreover, since the facial feature extraction model used to extract the facial feature vector is obtained by locking the decoder of the autoencoder and the convolutional-neural-network-based feature extraction model, the model does not leak the reconstructed face image data generated by the decoder while extracting the user's facial feature vector, which ensures the privacy and security of the user's facial information.
Brief Description of the Drawings
The drawings described here are used to provide a further understanding of one or more embodiments of this specification and constitute a part of this specification. The exemplary embodiments of this specification and their descriptions are used to explain one or more embodiments of this specification and do not constitute an improper limitation thereof. In the drawings:
FIG. 1 is a schematic flowchart of a facial feature extraction method provided by an embodiment of this specification;
FIG. 2 is a schematic structural diagram of a facial feature extraction model for privacy protection provided by an embodiment of this specification;
FIG. 3 is a schematic flowchart of a training method for a facial feature extraction model for privacy protection provided by an embodiment of this specification;
FIG. 4 is a schematic structural diagram of a facial feature extraction apparatus corresponding to FIG. 1 provided by an embodiment of this specification;
FIG. 5 is a schematic structural diagram of a training apparatus for a facial feature extraction model for privacy protection corresponding to FIG. 3 provided by an embodiment of this specification.
Detailed Description
To make the objectives, technical solutions, and advantages of one or more embodiments of this specification clearer, the technical solutions of one or more embodiments of this specification will be described clearly and completely below in conjunction with specific embodiments of this specification and the corresponding drawings. Obviously, the described embodiments are only a part of the embodiments of this specification, rather than all of them. Based on the embodiments in this specification, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of one or more embodiments of this specification.
The technical solutions provided by the embodiments of this specification are described in detail below with reference to the accompanying drawings.
When performing user identification based on face recognition technology, the face image of the user to be identified usually needs to be sent to a service provider, so that the service provider can extract a facial feature vector from the face image and perform user identification based on that facial feature vector. Since this approach requires the service provider to obtain, store, or process the user's face image, it easily affects the privacy and security of the user's facial information.
Moreover, at present, when a facial feature vector is extracted from a user's face image, the image is usually preprocessed before extraction. For example, in a face recognition method based on principal component analysis (PCA), principal component information is first extracted from the face picture and part of the detail information is discarded, and the facial feature vector is generated from the principal component information. A facial feature vector generated in this way suffers from the loss of facial feature information, so the accuracy of the currently extracted facial feature vectors is also poor.
FIG. 1 is a schematic flowchart of a facial feature extraction method provided by an embodiment of this specification. The method uses a facial feature extraction model for privacy protection to extract facial feature vectors.
FIG. 2 is a schematic structural diagram of a facial feature extraction model for privacy protection provided by an embodiment of this specification. As shown in FIG. 2, the user feature extraction model 201 for privacy protection includes an encoder 202 and a facial feature extraction model 203. The facial feature extraction model 203 is obtained by locking a decoder 204 and a convolutional-neural-network-based feature extraction model 205, where the encoder 202 and the decoder 204 constitute an autoencoder. The encoder 202 is connected to the decoder 204 in the facial feature extraction model 203, and the decoder 204 is connected to the feature extraction model 205.
From a program perspective, the execution body of the process shown in FIG. 1 may be a user facial feature extraction system or a program carried on such a system. The user facial feature extraction system may include a client device and a server device, where the client device may carry the encoder of the facial feature extraction model for privacy protection, and the server device may carry its facial feature extraction model.
As shown in FIG. 1, the process may include steps 102 to 104.
Step 102: Input the face image of the user to be identified into the encoder to obtain the encoding vector of the face image output by the encoder, the encoding vector being vector data obtained by performing feature processing on the face image.
In the embodiments of this specification, when using various applications, a user usually needs to register an account with each application. In scenarios where the user logs in to or unlocks the registered account, or uses it to make a payment, the operating user of the registered account (i.e., the user to be identified) usually needs to be identified, and is allowed to perform subsequent operations only after being determined to be the authenticated user (i.e., the designated user) of the account. Similarly, in a scenario where a user needs to pass through an access control system, the user usually needs to be identified and is allowed to pass only after being determined to be a whitelisted user (i.e., the designated user) of the access control system.
When identifying the user to be identified based on face recognition technology, the client device usually needs to collect the face image of the user to be identified and extract the encoding vector of that image with the encoder it carries. The client device may send the encoding vector to the server device, so that the server device generates the facial feature vector of the user to be identified from the encoding vector and then performs user identification based on it.
The encoder in step 102 may be the encoder of an autoencoder (AE). An autoencoder is a network model structure in deep learning whose characteristic is that the input image itself can be used as supervision information and the network can be trained with the goal of reconstructing the input image, thereby achieving the purpose of encoding the input image. Since an autoencoder needs no information other than the input image as supervision information during training, its training cost is low, making it economical and practical.
An autoencoder usually consists of two parts: an encoder and a decoder. The encoder can be used to encode a face image to obtain the encoding vector of the face image, and the decoder can reconstruct the face image from that encoding vector to obtain a reconstructed face image.
Since the encoding vector of a face image generated by the encoder of the autoencoder is vector data obtained by performing feature processing on the face image, and the encoding vector cannot reveal the appearance of the user to be identified, the service provider's transmission, storage, and processing of the encoding vector does not affect the security or privacy of the facial information of the user to be identified.
In the embodiments of this specification, an autoencoder is an artificial neural network that can learn its input data through unsupervised learning and represent it efficiently and accurately. The encoding vector of a face image generated by its encoder therefore contains fairly comprehensive facial feature information with little noise, so extracting a facial feature vector from such an encoding vector improves the accuracy of the resulting facial feature vector, which in turn helps improve the accuracy of the user identification result generated from it.
In the embodiments of this specification, the face image of the user to be identified may be a multi-channel face image. In practical applications, when the face image collected by the user device is a single-channel face image, the single-channel image data of the user to be identified may be determined first, and a multi-channel image may be generated from the single-channel image data so that the encoder of the autoencoder can process the multi-channel face image, where the image data of each channel of the multi-channel face image is identical to the single-channel image data.
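As an illustration only (the patent prescribes no implementation; Python with NumPy and the 112x112 crop size are assumptions introduced here), the single-channel-to-multi-channel conversion described above could be sketched as follows:

```python
import numpy as np

def to_multi_channel(single_channel: np.ndarray, num_channels: int = 3) -> np.ndarray:
    """Replicate single-channel face image data (H, W) into a multi-channel
    image (H, W, C) whose channels are all identical to the input."""
    assert single_channel.ndim == 2, "expected a single-channel (H, W) image"
    return np.repeat(single_channel[:, :, np.newaxis], num_channels, axis=2)

# Example: a grayscale 112x112 face crop becomes a 3-channel encoder input.
gray_face = np.zeros((112, 112), dtype=np.uint8)
multi_face = to_multi_channel(gray_face)  # shape (112, 112, 3)
```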
Step 104: After receiving the encoding vector, the decoder in the facial feature extraction model outputs reconstructed face image data to the feature extraction model, so that the feature extraction model performs feature processing on the reconstructed face image data and then outputs the facial feature vector of the user to be identified.
In the embodiments of this specification, since the training goal of the autoencoder is to minimize the difference between the reconstructed face image and the original face image, rather than to classify user faces, directly using the encoding vector extracted by the encoder of the autoencoder as the facial feature vector of the user to be identified for user identification would yield poor identification accuracy.
In the embodiments of this specification, a facial feature extraction model obtained by locking the decoder of the autoencoder and a feature extraction model based on a convolutional neural network may be deployed on the server device. Since the decoder of the autoencoder can generate reconstructed face image data from the encoding vector of the face image of the user to be identified, and the convolutional-neural-network-based feature extraction model can classify that reconstructed face image data, the output vector of the convolutional-neural-network-based feature extraction model can be used as the facial feature vector of the user to be identified, improving the accuracy of the user identification result generated from it.
In the embodiments of this specification, since the convolutional-neural-network-based feature extraction model in the facial feature extraction model is used to extract facial feature vectors from reconstructed face images, it can be implemented with existing convolutional-neural-network-based face recognition models such as DeepFace, FaceNet, MTCNN, or RetinaFace. The facial feature extraction model therefore has good compatibility.
Moreover, the reconstructed face image data obtained after the decoder in the facial feature extraction model decodes the encoding vector of the face image of the user to be identified is highly similar to that face image, so the facial feature vector of the user to be identified extracted by the convolutional-neural-network-based feature extraction model is quite accurate.
In the embodiments of this specification, encryption software may be used to lock the decoder of the autoencoder and the convolutional-neural-network-based feature extraction model; alternatively, the decoder and the feature extraction model may be stored in a secure hardware module of the device, so that users cannot read the reconstructed face image data output by the decoder, thereby ensuring the privacy of the user's facial information. There are many ways to lock the decoder of the autoencoder and the feature extraction model, which are not specifically limited here, as long as the security of the reconstructed face image data output by the decoder is guaranteed.
In practical applications, once a service provider or another user has obtained read permission for the reconstructed face image data of the user to be identified, it may also use that permission to obtain the reconstructed face image data output by the decoder in the facial feature extraction model, which helps improve data utilization.
It should be understood that the order of some steps of the method described in one or more embodiments of this specification may be exchanged according to actual needs, or some of the steps may be omitted or deleted.
With the method in FIG. 1, since the service provider can extract the facial feature vector from the encoding vector of the face image of the user to be identified, it does not need to obtain the face image itself. This avoids the transmission, storage, and use of the face image by the service provider and ensures the privacy and security of the facial information of the user to be identified.
Moreover, since the reconstructed face image data generated by the decoder in the facial feature extraction model is highly similar to the face image of the user to be identified, the facial feature vector extracted from the reconstructed image by the convolutional-neural-network-based feature extraction model is quite accurate.
Based on the method in FIG. 1, the embodiments of this specification also provide some specific implementations of the method, which are described below.
In the embodiments of this specification, the encoder may include the input layer, first hidden layer, and bottleneck layer of the autoencoder, and the decoder may include the second hidden layer and output layer of the autoencoder.
The input layer of the encoder is connected to the first hidden layer, the first hidden layer is connected to the bottleneck layer, the bottleneck layer of the encoder is connected to the second hidden layer of the decoder, the second hidden layer is connected to the output layer, and the output layer is connected to the feature extraction model.
The input layer may be used to receive the face image of the user to be identified.
The first hidden layer may be used to encode the face image to obtain a first feature vector.
The bottleneck layer may be used to perform dimensionality reduction on the first feature vector to obtain the encoding vector of the face image, where the number of dimensions of the encoding vector is smaller than that of the first feature vector.
The second hidden layer may be used to decode the encoding vector to obtain a second feature vector.
The output layer may be used to generate reconstructed face image data from the second feature vector.
In the embodiments of this specification, since the encoder of the autoencoder needs to encode images and the decoder needs to generate reconstructed face images, the first hidden layer and the second hidden layer may each include multiple convolutional layers to guarantee the encoding and decoding effects, and may also include pooling layers and fully connected layers. The bottleneck layer can be used to reduce the feature dimension: the feature vectors output by the hidden layers connected to the bottleneck layer all have more dimensions than the feature vector output by the bottleneck layer.
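A minimal sketch of this layout is given below; the framework (PyTorch), channel counts, 112x112 input size, and 128-dimensional bottleneck are all assumptions made for illustration rather than the patent's own specification. The same assumed components are reused by the later sketches.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Input layer + first hidden layer + bottleneck layer (hypothetical sizes)."""
    def __init__(self, code_dim: int = 128):
        super().__init__()
        self.hidden = nn.Sequential(                         # first hidden layer
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.bottleneck = nn.Linear(64 * 28 * 28, code_dim)  # dimensionality reduction

    def forward(self, face: torch.Tensor) -> torch.Tensor:
        return self.bottleneck(self.hidden(face))            # encoding vector

class Decoder(nn.Module):
    """Second hidden layer + output layer: rebuilds the face from the code."""
    def __init__(self, code_dim: int = 128):
        super().__init__()
        self.hidden = nn.Sequential(                         # second hidden layer
            nn.Linear(code_dim, 64 * 28 * 28), nn.ReLU(),
            nn.Unflatten(1, (64, 28, 28)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.output = nn.Sequential(                         # output layer
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, code: torch.Tensor) -> torch.Tensor:
        return self.output(self.hidden(code))                # reconstructed face image data
```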
In the embodiments of this specification, the feature extraction model based on the convolutional neural network may include an input layer, a convolutional layer, a fully connected layer, and an output layer, where the input layer is connected to the output of the decoder, the input layer is also connected to the convolutional layer, the convolutional layer is connected to the fully connected layer, and the fully connected layer is connected to the output layer.
The input layer may be used to receive the reconstructed face image data output by the decoder; the convolutional layer may be used to perform local feature extraction on the reconstructed face image data to obtain local facial feature vectors of the user to be identified; and the fully connected layer may be used to generate the facial feature vector of the user to be identified from the local facial feature vectors.
The output layer may be used to generate a face classification result from the facial feature vector of the user to be identified output by the fully connected layer.
In the embodiments of this specification, the facial feature vector of the user to be identified may be the output vector of the fully connected layer adjacent to the output layer; alternatively, when the convolutional-neural-network-based feature extraction model has multiple fully connected layers, it may be the output vector of a fully connected layer separated from the output layer by N network layers. This is not specifically limited.
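As a sketch only (the patent allows any CNN face recognition backbone, such as FaceNet, so the layer sizes and the 512-dimensional feature vector here are assumptions), a feature extraction model that exposes the output of the fully connected layer adjacent to the output layer as the facial feature vector could look like this:

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """CNN that classifies faces; the penultimate FC output is the feature vector."""
    def __init__(self, num_identities: int, feat_dim: int = 512):
        super().__init__()
        self.conv = nn.Sequential(                       # local feature extraction
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        self.fc = nn.Linear(128 * 4 * 4, feat_dim)       # facial feature vector
        self.out = nn.Linear(feat_dim, num_identities)   # classification output layer

    def forward(self, recon_face: torch.Tensor):
        feature = self.fc(self.conv(recon_face))
        return feature, self.out(feature)                # (feature vector, logits)
```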
In the embodiments of this specification, the facial feature vector of the user to be identified generated in step 104 can be used in user identification scenarios. Therefore, the facial feature extraction model may further include a user matching model, whose input may be connected to the output of the convolutional-neural-network-based feature extraction model in the facial feature extraction model.
After step 104, the method may further include: causing the user matching model to receive the facial feature vector of the user to be identified and the facial feature vector of a designated user, and to generate, from the vector distance between the two facial feature vectors, output information indicating whether the user to be identified is the designated user, where the facial feature vector of the designated user is obtained by processing the face image of the designated user with the encoder and the facial feature extraction model.
In the embodiments of this specification, the vector distance between the facial feature vector of the user to be identified and that of the designated user can represent the similarity between the two. Specifically, when the vector distance is less than or equal to a threshold, it can be determined that the user to be identified and the designated user are the same user; when the vector distance is greater than the threshold, it can be determined that they are different users. The threshold can be set according to actual needs and is not specifically limited here.
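The patent fixes neither the distance metric nor the threshold; assuming Euclidean distance on L2-normalized feature vectors and a purely hypothetical threshold of 1.0, the matching step could be sketched as:

```python
import torch
import torch.nn.functional as F

def is_same_user(feat_a: torch.Tensor, feat_b: torch.Tensor,
                 threshold: float = 1.0) -> bool:
    """Decide whether two facial feature vectors belong to the same user
    by comparing their vector distance against a threshold."""
    a = F.normalize(feat_a, dim=-1)
    b = F.normalize(feat_b, dim=-1)
    return torch.dist(a, b).item() <= threshold  # same user iff distance <= threshold
```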
In the embodiments of this specification, the method in FIG. 1 may be used to generate the facial feature vector of the user to be identified and that of the designated user. Since facial feature vectors generated with the method in FIG. 1 are relatively accurate, this helps improve the accuracy of the user identification result.
FIG. 3 is a schematic flowchart of a training method for a face recognition model provided by an embodiment of this specification. From a program perspective, the execution body of the process may be a server or a program carried on a server. As shown in FIG. 3, the process may include steps 302 to 310.
Step 302: Obtain a first training sample set, where the training samples in the first training sample set are face images.
In the embodiments of this specification, the training samples in the first training sample set are face images for which usage rights have been obtained, for example, face images from public face databases or face pictures authorized by users, to ensure that the training process of the face recognition model does not affect the privacy of users' facial information.
In the embodiments of this specification, the training samples in the first training sample set may be multi-channel face images. When a face image from a public face database or a user-authorized face picture is a single-channel face image, the single-channel image data of that face image may be determined first, and a multi-channel image may be generated from the single-channel image data and used as a training sample in the first training sample set, where the image data of each channel of the multi-channel face image is identical to the single-channel image data, thereby ensuring the consistency of the training samples in the first training sample set.
Step 304: Train an initial autoencoder with the first training sample set to obtain a trained autoencoder.
In the embodiments of this specification, step 304 may specifically include: for each training sample in the first training sample set, inputting the training sample into the initial autoencoder to obtain reconstructed face image data, and optimizing the model parameters of the initial autoencoder with the goal of minimizing the image reconstruction loss to obtain the trained autoencoder, where the image reconstruction loss is the difference between the reconstructed face image data and the training sample.
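As an illustrative sketch of step 304 (the mean-squared-error loss and the Adam optimizer are assumptions; the patent only requires minimizing the difference between the reconstruction and the input), reusing the Encoder and Decoder sketched earlier:

```python
import torch
import torch.nn as nn

def train_autoencoder(encoder, decoder, loader, epochs: int = 10):
    """Optimize encoder + decoder to minimize the image reconstruction loss."""
    params = list(encoder.parameters()) + list(decoder.parameters())
    opt = torch.optim.Adam(params, lr=1e-3)
    loss_fn = nn.MSELoss()                   # image reconstruction loss
    for _ in range(epochs):
        for faces in loader:                 # faces: (B, 3, H, W) training samples
            recon = decoder(encoder(faces))
            loss = loss_fn(recon, faces)     # difference from the input itself
            opt.zero_grad()
            loss.backward()
            opt.step()
    return encoder, decoder
```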
In the embodiments of this specification, the input layer, first hidden layer, and bottleneck layer of the autoencoder constitute the encoder, and the second hidden layer and output layer constitute the decoder. The encoder can be used to encode a face image to obtain the encoding vector of the face image, and the decoder can decode the encoding vector generated by the encoder to obtain a reconstructed face image. The functions of the layers of the autoencoder may be the same as those described in the embodiment of the method in FIG. 1 and are not repeated here.
Step 306: Obtain a second training sample set, where the training samples in the second training sample set are encoding vectors, each encoding vector being vector data obtained by performing feature processing on a face image with the encoder of the trained autoencoder.
In the embodiments of this specification, the training samples in the second training sample set may be vector data obtained by using the encoder of the trained autoencoder to perform feature processing on the face images of users requiring privacy protection. The users requiring privacy protection can be determined according to actual needs, for example, the operating users and authenticated users of accounts registered with an application, or the users to be identified and whitelisted users at an access control point based on face recognition technology.
In the embodiments of this specification, the encoder of the trained autoencoder may be used in advance to generate and store the training samples of the second training sample set. When step 306 is performed, the pre-generated training samples only need to be retrieved from the database. Since the training samples of the second training sample set stored in the database are encoding vectors of user face images, and such encoding vectors cannot reveal the appearance of the users to be identified, the service provider's transmission, storage, and processing of these training samples does not affect the privacy of the users' facial information.
Step 308: Input the training samples in the second training sample set into the decoder of the initial facial feature extraction model, so that the reconstructed face image data output by the decoder is used to train the convolutional-neural-network-based initial feature extraction model in the initial facial feature extraction model, obtaining a trained facial feature extraction model; the initial facial feature extraction model is obtained by locking the decoder and the initial feature extraction model, and the decoder is the decoder of the trained autoencoder.
In the embodiments of this specification, when training the initial facial feature extraction model, there is no need to optimize the model parameters of the decoder in the model; only the model parameters of the convolutional-neural-network-based initial feature extraction model need to be optimized.
Using the reconstructed face image data output by the decoder to train the convolutional-neural-network-based initial feature extraction model in the initial facial feature extraction model may specifically include: classifying the reconstructed face image data with the initial feature extraction model to obtain a predicted category label for the reconstructed face image data; obtaining a preset category label for the reconstructed face image data; and optimizing the model parameters of the initial feature extraction model with the goal of minimizing the classification loss, where the classification loss is the difference between the predicted category label and the preset category label.
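The following sketch of step 308 is again assumption-laden (cross-entropy as the classification loss; freezing the decoder's parameters stands in for the "locking", which in practice is an access-control measure rather than just a gradient switch), reusing the Decoder and FeatureExtractor sketched earlier:

```python
import torch
import torch.nn as nn

def train_feature_extractor(decoder, extractor, loader, epochs: int = 10):
    """Train only the CNN feature extraction model on reconstructed faces."""
    for p in decoder.parameters():           # decoder stays fixed (locked)
        p.requires_grad_(False)
    opt = torch.optim.Adam(extractor.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()          # classification loss
    for _ in range(epochs):
        for codes, labels in loader:         # codes: encoding vectors; labels: preset category labels
            recon = decoder(codes)           # reconstructed face image data
            _, logits = extractor(recon)     # predicted category labels
            loss = loss_fn(logits, labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return extractor
```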
Step 310: Generate a user feature extraction model for privacy protection according to the encoder and the trained facial feature extraction model.
In the embodiments of this specification, the input of the encoder is used to receive the face image of the user to be identified, the output of the encoder is connected to the input of the decoder in the trained facial feature extraction model, the output of the decoder is connected to the input of the convolutional-neural-network-based feature extraction model in the trained facial feature extraction model, and the output of the convolutional-neural-network-based feature extraction model is the facial feature vector of the user to be identified.
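Putting the pieces together, a hypothetical end-to-end wiring of the generated user feature extraction model (reusing the Encoder, Decoder, and FeatureExtractor sketches above; the identity count of 1000 is invented for the example) might read:

```python
import torch

encoder = Encoder()
decoder = Decoder()
extractor = FeatureExtractor(num_identities=1000)

@torch.no_grad()
def extract_face_feature(face_image: torch.Tensor) -> torch.Tensor:
    code = encoder(face_image)       # runs on the client device
    recon = decoder(code)            # stays inside the locked server-side model
    feature, _ = extractor(recon)    # facial feature vector of the user
    return feature
```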
In the embodiments of this specification, the autoencoder and the initial facial feature extraction model are trained, and the facial feature extraction model for privacy protection is built from the trained autoencoder and the trained initial facial feature extraction model. Since the autoencoder needs no information other than the input image as supervision information during network training, the training cost of this facial feature extraction model for privacy protection can be reduced, making it economical and practical.
Based on the method in FIG. 3, the embodiments of this specification also provide some specific implementations of the method, which are described below.
In the embodiments of this specification, the user feature extraction model generated by the method in FIG. 3 can be applied to user identification scenarios. After the user feature extraction model is used to extract users' facial feature vectors, the facial feature vectors usually still need to be compared to generate the final user identification result.
Therefore, before the facial feature extraction model for privacy protection is generated in step 310, the method may further include: establishing a user matching model, where the user matching model is used to generate, from the vector distance between a first facial feature vector of the user to be identified and a second facial feature vector of a designated user, an output result indicating whether the user to be identified is the designated user; the first facial feature vector is obtained by processing the face image of the user to be identified with the encoder and the trained facial feature extraction model, and the second facial feature vector is obtained by processing the face image of the designated user with the encoder and the trained facial feature extraction model.
Step 310 may then specifically include: generating a user feature extraction model for privacy protection composed of the encoder, the trained facial feature extraction model, and the user matching model.
基于同样的思路,本说明书实施例还提供了上述方法对应的装置。图4为本说明书实施例提供的对应于图1的一种人脸特征提取装置的结构示意图。所述装置使用了用 于隐私保护的用户特征提取模型,所述用户特征提取模型可以包括:编码器及人脸特征提取模型,所述人脸特征提取模型是通过对解码器及基于卷积神经网络的特征提取模型进行锁定而得到的模型,其中,所述编码器和所述解码器组成自编码器;所述编码器与所述人脸特征提取模型中的解码器连接,所述解码器与所述特征提取模型连接;所述装置可以包括:输入模块402,可以用于将待识别用户的人脸图像输入所述编码器,得到所述编码器输出的所述人脸图像的编码向量,所述编码向量为对所述人脸图像进行特征化处理后得到的向量数据;人脸特征向量生成模块404,可以用于令所述人脸特征提取模型中的解码器接收所述编码向量后,向所述特征提取模型输出重建人脸图像数据;以便于所述特征提取模型对所述重建人脸图像数据进行特征化处理后,输出所述待识别用户的人脸特征向量。Based on the same idea, the embodiment of this specification also provides a device corresponding to the above method. FIG. 4 is a schematic structural diagram of a facial feature extraction device corresponding to FIG. 1 provided by an embodiment of this specification. The device uses a user feature extraction model for privacy protection. The user feature extraction model may include an encoder and a face feature extraction model. The face feature extraction model is based on a decoder and a convolutional neural network. A model obtained by locking the feature extraction model of the network, wherein the encoder and the decoder form a self-encoder; the encoder is connected to the decoder in the face feature extraction model, and the decoder Connected to the feature extraction model; the device may include: an input module 402, which may be used to input the face image of the user to be identified into the encoder to obtain the encoding vector of the face image output by the encoder The encoding vector is vector data obtained after characterizing the face image; the face feature vector generating module 404 may be used to make the decoder in the face feature extraction model receive the encoding vector Then, output the reconstructed face image data to the feature extraction model; so that the feature extraction model performs characterization processing on the reconstructed face image data, and then outputs the face feature vector of the user to be identified.
Optionally, the encoder may include the input layer, a first hidden layer, and a bottleneck layer of the autoencoder, and the decoder may include a second hidden layer and the output layer of the autoencoder; the input layer of the encoder is connected to the first hidden layer, the first hidden layer is connected to the bottleneck layer, the bottleneck layer of the encoder is connected to the second hidden layer of the decoder, the second hidden layer is connected to the output layer, and the output layer is connected to the feature extraction model.
The input layer of the autoencoder may be configured to receive the face image of the user to be identified; the first hidden layer may be configured to encode the face image to obtain a first feature vector; the bottleneck layer may be configured to perform dimensionality reduction on the first feature vector to obtain the encoding vector of the face image, the encoding vector having fewer dimensions than the first feature vector; the second hidden layer may be configured to decode the encoding vector to obtain a second feature vector; and the output layer may be configured to generate reconstructed face image data from the second feature vector.
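To make the layer structure above concrete, the following is a minimal PyTorch sketch of such an autoencoder. It is an illustrative reading of the specification, not the patented implementation: the 112×112 input size, the 512-unit hidden layers, the 64-dimensional bottleneck, and the use of fully connected layers are all assumptions.

    import torch
    import torch.nn as nn

    class Encoder(nn.Module):
        """Input layer -> first hidden layer -> bottleneck layer."""
        def __init__(self, in_dim=112 * 112, hidden_dim=512, code_dim=64):  # assumed sizes
            super().__init__()
            self.hidden = nn.Sequential(nn.Flatten(), nn.Linear(in_dim, hidden_dim), nn.ReLU())
            self.bottleneck = nn.Linear(hidden_dim, code_dim)

        def forward(self, face_image):
            first_feature = self.hidden(face_image)    # first feature vector
            return self.bottleneck(first_feature)      # encoding vector with fewer dimensions

    class Decoder(nn.Module):
        """Second hidden layer -> output layer (reconstructed face image data)."""
        def __init__(self, code_dim=64, hidden_dim=512, out_dim=112 * 112):
            super().__init__()
            self.hidden = nn.Sequential(nn.Linear(code_dim, hidden_dim), nn.ReLU())
            self.output = nn.Linear(hidden_dim, out_dim)

        def forward(self, code):
            second_feature = self.hidden(code)         # second feature vector
            return self.output(second_feature)         # reconstructed face image data

Because the bottleneck output has far fewer dimensions than the input image, the encoding vector does not by itself reveal the face image, which is what allows it to be transmitted in place of the image.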
Optionally, the feature extraction model based on a convolutional neural network may include an input layer, a convolutional layer, and a fully connected layer, where the input layer is connected to the output of the decoder, the input layer is also connected to the convolutional layer, and the convolutional layer is connected to the fully connected layer.
The input layer of the convolutional-neural-network-based feature extraction model may be configured to receive the reconstructed face image data output by the decoder; the convolutional layer may be configured to perform local feature extraction on the reconstructed face image data to obtain a local face feature vector of the user to be identified; and the fully connected layer may be configured to generate the face feature vector of the user to be identified from the local face feature vector.
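The sketch below shows one way the convolutional feature extraction model could look in PyTorch, continuing the autoencoder sketch above; the two-block convolutional stack, the channel counts, and the 128-dimensional feature vector are assumptions made for illustration.

    import torch.nn as nn

    class FeatureExtractor(nn.Module):
        """Input layer -> convolutional layers -> fully connected layer."""
        def __init__(self, feat_dim=128):              # assumed feature dimensionality
            super().__init__()
            self.conv = nn.Sequential(                 # local feature extraction
                nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.fc = nn.Linear(64 * 28 * 28, feat_dim)  # face feature vector

        def forward(self, reconstructed):
            x = reconstructed.view(-1, 1, 112, 112)    # decoder output reshaped to an image
            local_features = self.conv(x).flatten(1)   # local face feature vector
            return self.fc(local_features)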
Optionally, the user feature extraction model may further include a user matching model connected to the feature extraction model; and the apparatus may further include:
a user matching module, configured to cause the user matching model, after receiving the face feature vector of the user to be identified and the face feature vector of a specified user, to generate, according to the vector distance between the two vectors, output information indicating whether the user to be identified is the specified user, where the face feature vector of the specified user is obtained by processing the face image of the specified user with the encoder and the face feature extraction model.
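A minimal sketch of such a matching step follows; the specification only says that a vector distance is compared, so the Euclidean metric and the threshold value of 1.0 used here are assumptions.

    import torch

    def match_user(candidate_vec, reference_vec, threshold=1.0):  # threshold is assumed
        """True if the user to be identified matches the specified user."""
        distance = torch.dist(candidate_vec, reference_vec, p=2)  # Euclidean vector distance
        return bool(distance < threshold)

In practice the threshold would be tuned on a validation set so that the false accept and false reject rates meet the application's requirements.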
Based on the same idea, the embodiments of this specification further provide an apparatus corresponding to the above method. FIG. 5 shows a training apparatus, corresponding to FIG. 3, for a face feature extraction model for privacy protection according to an embodiment of this specification. As shown in FIG. 5, the apparatus may include the following modules.
A first acquisition module 502, configured to acquire a first training sample set, where the training samples in the first training sample set are face images.
A first training module 504, configured to train an initial autoencoder with the first training sample set to obtain a trained autoencoder.
A second acquisition module 506, configured to acquire a second training sample set, where the training samples in the second training sample set are encoding vectors, each encoding vector being vector data obtained by characterizing a face image with the encoder of the trained autoencoder.
A second training module 508, configured to input the training samples in the second training sample set into the decoder of an initial face feature extraction model, so that the reconstructed face image data output by the decoder is used to train the initial feature extraction model, based on a convolutional neural network, in the initial face feature extraction model, thereby obtaining the trained face feature extraction model; the initial face feature extraction model is obtained by locking the decoder and the initial feature extraction model, the decoder being the decoder of the trained autoencoder.
A user feature extraction model generation module 510, configured to generate a user feature extraction model for privacy protection according to the encoder and the trained face feature extraction model.
Optionally, the first training module 504 may be specifically configured to: for each training sample in the first training sample set, input the training sample into the initial autoencoder to obtain reconstructed face image data; and optimize the model parameters of the initial autoencoder with the goal of minimizing the image reconstruction loss, thereby obtaining the trained autoencoder, where the image reconstruction loss is the difference between the reconstructed face image data and the training sample.
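A hedged sketch of this training step, using the Encoder and Decoder classes above: mean squared error stands in for the unspecified "difference value" between reconstruction and training sample, and the Adam optimizer and learning rate are assumptions.

    import torch
    import torch.nn as nn

    def train_autoencoder(encoder, decoder, batches, epochs=10, lr=1e-3):  # assumed schedule
        params = list(encoder.parameters()) + list(decoder.parameters())
        optimizer = torch.optim.Adam(params, lr=lr)
        for _ in range(epochs):
            for face in batches:                       # each training sample is a face image
                reconstructed = decoder(encoder(face))
                loss = nn.functional.mse_loss(         # image reconstruction loss: difference
                    reconstructed, face.flatten(1))    # between reconstruction and sample
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
        return encoder, decoder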
Optionally, training the initial feature extraction model based on a convolutional neural network in the initial face feature extraction model with the reconstructed face image data output by the decoder may specifically include: classifying the reconstructed face image data with the initial feature extraction model to obtain a predicted category label for the reconstructed face image data; acquiring a preset category label for the reconstructed face image data; and optimizing the model parameters of the initial feature extraction model with the goal of minimizing the classification loss, where the classification loss is the difference between the predicted category label and the preset category label.
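The sketch below illustrates this second training step under the same assumptions as the sketches above: the decoder's parameters are locked (frozen), and a separate linear classification head, standing in here for the output layer described in the specification, produces the category label prediction scored with cross-entropy loss.

    import torch
    import torch.nn as nn

    def train_extractor(decoder, extractor, batches, num_identities, epochs=10):
        for p in decoder.parameters():
            p.requires_grad = False                    # the decoder is locked during training
        head = nn.Linear(128, num_identities)          # assumed classification output layer
        optimizer = torch.optim.Adam(
            list(extractor.parameters()) + list(head.parameters()), lr=1e-3)
        for _ in range(epochs):
            for code, label in batches:                # encoding vectors and preset labels
                reconstructed = decoder(code)          # reconstructed face image data
                logits = head(extractor(reconstructed))  # category label prediction
                loss = nn.functional.cross_entropy(logits, label)  # classification loss
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
        return extractor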
Optionally, the apparatus in FIG. 5 may further include a user matching model establishment module, configured to establish a user matching model, where the user matching model is configured to generate, according to the vector distance between a first face feature vector of a user to be identified and a second face feature vector of a specified user, an output result indicating whether the user to be identified is the specified user; the first face feature vector is obtained by processing the face image of the user to be identified with the encoder and the trained face feature extraction model, and the second face feature vector is obtained by processing the face image of the specified user with the encoder and the trained face feature extraction model.
The user feature extraction model generation module 510 may be specifically configured to generate a user feature extraction model for privacy protection composed of the encoder, the trained face feature extraction model, and the user matching model.
Based on the same idea, the embodiments of this specification further provide a client device corresponding to the above method. The client device may include: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores an image encoder and instructions executable by the at least one processor, the image encoder being the encoder of an autoencoder, and the instructions being executed by the at least one processor to enable the at least one processor to: input the face image of the user to be identified into the image encoder to obtain the encoding vector of the face image output by the image encoder, the encoding vector being vector data obtained by characterizing the face image;
and send the encoding vector to a server device, so that the server device generates the face feature vector of the user to be identified from the encoding vector with a face feature extraction model, the face feature extraction model being a model obtained by locking the decoder of the autoencoder and a feature extraction model based on a convolutional neural network.
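One possible shape of this client-side flow is sketched below; the HTTP/JSON transport and the endpoint URL are hypothetical, since the specification only requires that the encoding vector, and not the face image, reach the server.

    import torch
    import requests  # hypothetical transport; any channel carrying the vector would do

    def identify_on_server(encoder, face_image, url="https://example.com/identify"):
        with torch.no_grad():
            code = encoder(face_image)                 # encoding vector, not the raw image
        payload = {"encoding_vector": code.squeeze(0).tolist()}
        return requests.post(url, json=payload).json()  # server returns the match result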
In the embodiments of this specification, the client device uses the encoder of the autoencoder it carries to generate the encoding vector of the face image of the user to be identified, so the client device can send that encoding vector to the server device for user identification without sending the face image itself. Transmission of the face image of the user to be identified is thereby avoided, ensuring the privacy and security of the user's face information.
Based on the same idea, the embodiments of this specification further provide a server device corresponding to the above method. The server device may include: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores a face feature extraction model obtained by locking the decoder of an autoencoder and a feature extraction model based on a convolutional neural network, and further stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to: acquire the encoding vector of the face image of the user to be identified, the encoding vector being vector data obtained by characterizing the face image with the encoder of the autoencoder; and input the encoding vector into the decoder of the face feature extraction model, whereupon the decoder outputs reconstructed face image data to the feature extraction model, so that the feature extraction model characterizes the reconstructed face image data and then outputs the face feature vector of the user to be identified.
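A corresponding server-side sketch, under the same assumptions as the client sketch above: the server receives only the encoding vector and never sees the raw face image.

    import torch

    def extract_face_features(decoder, extractor, encoding_vector):
        """Encoding vector -> reconstructed face image data -> face feature vector."""
        with torch.no_grad():
            code = torch.tensor(encoding_vector).unsqueeze(0)
            reconstructed = decoder(code)              # reconstruction happens server-side
            return extractor(reconstructed).squeeze(0)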
In the embodiments of this specification, the server device generates the face feature vector of the user to be identified from the encoding vector of the user's face image, based on the face feature extraction model it carries. The server device can therefore perform user identification without acquiring the face image of the user to be identified, which not only avoids transmitting that face image but also avoids storing and processing it on the server device, improving the privacy and security of the face information of the user to be identified.
Based on the same idea, the embodiments of this specification further provide a training device, corresponding to the method in FIG. 3, for a face feature extraction model for privacy protection. The device may include: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to: acquire a first training sample set, where the training samples in the first training sample set are face images;
train an initial autoencoder with the first training sample set to obtain a trained autoencoder;
acquire a second training sample set, where the training samples in the second training sample set are encoding vectors, each encoding vector being vector data obtained by characterizing a face image with the encoder of the trained autoencoder;
input the training samples in the second training sample set into the decoder of an initial face feature extraction model, so that the reconstructed face image data output by the decoder is used to train the initial feature extraction model, based on a convolutional neural network, in the initial face feature extraction model, thereby obtaining the trained face feature extraction model, where the initial face feature extraction model is obtained by locking the decoder and the initial feature extraction model, the decoder being the decoder of the trained autoencoder;
and generate a user feature extraction model for privacy protection according to the encoder and the trained face feature extraction model.
The foregoing describes specific embodiments of this specification. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that of the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.
In the 1990s, an improvement to a technology could be clearly distinguished as a hardware improvement (for example, an improvement to a circuit structure such as a diode, transistor, or switch) or a software improvement (an improvement to a method flow). With the development of technology, however, improvements to many of today's method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be realized by a hardware entity module. For example, a programmable logic device (PLD) (for example, a field-programmable gate array (FPGA)) is an integrated circuit whose logic functions are determined by the user's programming of the device. Designers program to "integrate" a digital system onto a single PLD themselves, without needing a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compilers used in program development, and the source code to be compiled must be written in a specific programming language called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. Those skilled in the art will also appreciate that a hardware circuit implementing a logic method flow can easily be obtained merely by slightly logically programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.
A controller may be implemented in any suitable manner. For example, a controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320; a memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art also know that, in addition to implementing a controller purely as computer-readable program code, the method steps can be logically programmed so that the controller achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller can therefore be regarded as a hardware component, and the means included within it for realizing various functions can also be regarded as structures within the hardware component. Or, the means for realizing various functions can even be regarded as both software modules implementing the method and structures within the hardware component.
The systems, apparatuses, modules, or units illustrated in the above embodiments may be implemented by computer chips or entities, or by products having certain functions. A typical implementation device is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or any combination of these devices.
For convenience of description, the above apparatus is described with its functions divided into various units. Of course, when implementing one or more embodiments of this specification, the functions of the units may be implemented in one or more pieces of software and/or hardware.
Those skilled in the art should understand that one or more embodiments of this specification may be provided as a method, a system, or a computer program product. Therefore, one or more embodiments of this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, one or more embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) containing computer-usable program code.
One or more embodiments of this specification are described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to one or more embodiments of this specification. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in computer-readable media, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include persistent and non-persistent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprise", "include", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device including a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, commodity, or device. In the absence of further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, commodity, or device that includes the element.
One or more embodiments of this specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. One or more embodiments of this specification may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in local and remote computer storage media, including storage devices.
The embodiments in this specification are described in a progressive manner; for identical or similar parts among the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the system embodiment is substantially similar to the method embodiment, it is described relatively simply; for relevant parts, refer to the description of the method embodiment.
The above descriptions are merely embodiments of this specification and are not intended to limit one or more embodiments of this specification. For those skilled in the art, one or more embodiments of this specification may have various modifications and variations. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of one or more embodiments of this specification shall be included within the scope of the claims of one or more embodiments of this specification.

Claims (22)

1. A face feature extraction method, the method using a user feature extraction model for privacy protection, the user feature extraction model comprising an encoder and a face feature extraction model, the face feature extraction model being a model obtained by locking a decoder and a feature extraction model based on a convolutional neural network, wherein the encoder and the decoder constitute an autoencoder;
    the encoder is connected to the decoder in the face feature extraction model, and the decoder is connected to the feature extraction model; the method comprising:
    inputting a face image of a user to be identified into the encoder to obtain an encoding vector of the face image output by the encoder, the encoding vector being vector data obtained by characterizing the face image;
    and, after the decoder in the face feature extraction model receives the encoding vector, outputting reconstructed face image data to the feature extraction model, so that the feature extraction model characterizes the reconstructed face image data and then outputs a face feature vector of the user to be identified.
2. The method according to claim 1, wherein the encoder comprises an input layer, a first hidden layer, and a bottleneck layer, and the decoder comprises a second hidden layer and an output layer;
    wherein the input layer of the encoder is connected to the first hidden layer, the first hidden layer is connected to the bottleneck layer, the bottleneck layer of the encoder is connected to the second hidden layer of the decoder, the second hidden layer is connected to the output layer, and the output layer is connected to the feature extraction model;
    the input layer is configured to receive the face image of the user to be identified;
    the first hidden layer is configured to encode the face image to obtain a first feature vector;
    the bottleneck layer is configured to perform dimensionality reduction on the first feature vector to obtain the encoding vector of the face image, the encoding vector having fewer dimensions than the first feature vector;
    the second hidden layer is configured to decode the encoding vector to obtain a second feature vector;
    and the output layer is configured to generate reconstructed face image data from the second feature vector.
3. The method according to claim 1, wherein the feature extraction model based on a convolutional neural network comprises an input layer, a convolutional layer, and a fully connected layer;
    wherein the input layer is connected to the output of the decoder, the input layer is also connected to the convolutional layer, and the convolutional layer is connected to the fully connected layer;
    the input layer is configured to receive the reconstructed face image data output by the decoder;
    the convolutional layer is configured to perform local feature extraction on the reconstructed face image data to obtain a local face feature vector of the user to be identified;
    and the fully connected layer is configured to generate the face feature vector of the user to be identified from the local face feature vector.
4. The method according to claim 3, wherein the feature extraction model based on a convolutional neural network further comprises an output layer connected to the fully connected layer, the output layer being configured to generate a face classification result from the face feature vector of the user to be identified output by the fully connected layer;
    and the face feature vector of the user to be identified is the output vector of the fully connected layer adjacent to the output layer.
5. The method according to claim 1, wherein the user feature extraction model further comprises a user matching model connected to the feature extraction model; the method further comprising:
    receiving, by the user matching model, the face feature vector of the user to be identified and a face feature vector of a specified user, and generating, according to the vector distance between the face feature vector of the user to be identified and the face feature vector of the specified user, output information indicating whether the user to be identified is the specified user, wherein the face feature vector of the specified user is obtained by processing a face image of the specified user with the encoder and the face feature extraction model.
6. A training method for a user feature extraction model for privacy protection, the method comprising:
    acquiring a first training sample set, the training samples in the first training sample set being face images;
    training an initial autoencoder with the first training sample set to obtain a trained autoencoder;
    acquiring a second training sample set, the training samples in the second training sample set being encoding vectors, each encoding vector being vector data obtained by characterizing a face image with the encoder of the trained autoencoder;
    inputting the training samples in the second training sample set into the decoder of an initial face feature extraction model, so that reconstructed face image data output by the decoder is used to train an initial feature extraction model, based on a convolutional neural network, in the initial face feature extraction model, thereby obtaining a trained face feature extraction model, the initial face feature extraction model being obtained by locking the decoder and the initial feature extraction model, the decoder being the decoder of the trained autoencoder;
    and generating a user feature extraction model for privacy protection according to the encoder and the trained face feature extraction model.
7. The method according to claim 6, wherein training the initial autoencoder with the first training sample set to obtain the trained autoencoder specifically comprises:
    for each training sample in the first training sample set, inputting the training sample into the initial autoencoder to obtain reconstructed face image data;
    and optimizing the model parameters of the initial autoencoder with the goal of minimizing an image reconstruction loss to obtain the trained autoencoder, the image reconstruction loss being the difference between the reconstructed face image data and the training sample.
8. The method according to claim 6, wherein training the initial feature extraction model based on a convolutional neural network in the initial face feature extraction model with the reconstructed face image data output by the decoder specifically comprises:
    classifying the reconstructed face image data with the initial feature extraction model to obtain a predicted category label for the reconstructed face image data;
    acquiring a preset category label for the reconstructed face image data;
    and optimizing the model parameters of the initial feature extraction model with the goal of minimizing a classification loss, the classification loss being the difference between the predicted category label and the preset category label.
9. The method according to claim 6, wherein the training samples in the first training sample set are face images for which usage rights have been obtained.
10. The method according to claim 6, wherein the training samples in the second training sample set are vector data obtained by characterizing, with the encoder, face images of users who require privacy protection.
11. The method according to claim 6, before generating the user feature extraction model for privacy protection, further comprising:
    establishing a user matching model, the user matching model being configured to generate, according to the vector distance between a first face feature vector of a user to be identified and a second face feature vector of a specified user, an output result indicating whether the user to be identified is the specified user, the first face feature vector being obtained by processing a face image of the user to be identified with the encoder and the trained face feature extraction model, and the second face feature vector being obtained by processing a face image of the specified user with the encoder and the trained face feature extraction model;
    wherein generating the user feature extraction model for privacy protection specifically comprises:
    generating a user feature extraction model for privacy protection composed of the encoder, the trained face feature extraction model, and the user matching model.
12. A face feature extraction apparatus, the apparatus using a user feature extraction model for privacy protection, the user feature extraction model comprising an encoder and a face feature extraction model, the face feature extraction model being a model obtained by locking a decoder and a feature extraction model based on a convolutional neural network, wherein the encoder and the decoder constitute an autoencoder; the encoder is connected to the decoder in the face feature extraction model, and the decoder is connected to the feature extraction model; the apparatus comprising:
    an input module, configured to input a face image of a user to be identified into the encoder to obtain an encoding vector of the face image output by the encoder, the encoding vector being vector data obtained by characterizing the face image;
    and a face feature vector generation module, configured to cause the decoder in the face feature extraction model, after receiving the encoding vector, to output reconstructed face image data to the feature extraction model, so that the feature extraction model characterizes the reconstructed face image data and then outputs a face feature vector of the user to be identified.
13. The apparatus according to claim 12, wherein the encoder comprises an input layer, a first hidden layer, and a bottleneck layer, and the decoder comprises a second hidden layer and an output layer;
    wherein the input layer of the encoder is connected to the first hidden layer, the first hidden layer is connected to the bottleneck layer, the bottleneck layer of the encoder is connected to the second hidden layer of the decoder, the second hidden layer is connected to the output layer, and the output layer is connected to the feature extraction model;
    the input layer is configured to receive the face image of the user to be identified;
    the first hidden layer is configured to encode the face image to obtain a first feature vector;
    the bottleneck layer is configured to perform dimensionality reduction on the first feature vector to obtain the encoding vector of the face image, the encoding vector having fewer dimensions than the first feature vector;
    the second hidden layer is configured to decode the encoding vector to obtain a second feature vector;
    and the output layer is configured to generate reconstructed face image data from the second feature vector.
14. The apparatus according to claim 12, wherein the feature extraction model based on a convolutional neural network comprises an input layer, a convolutional layer, and a fully connected layer;
    wherein the input layer is connected to the output of the decoder, the input layer is also connected to the convolutional layer, and the convolutional layer is connected to the fully connected layer;
    the input layer is configured to receive the reconstructed face image data output by the decoder;
    the convolutional layer is configured to perform local feature extraction on the reconstructed face image data to obtain a local face feature vector of the user to be identified;
    and the fully connected layer is configured to generate the face feature vector of the user to be identified from the local face feature vector.
15. The apparatus according to claim 12, wherein the user feature extraction model further comprises a user matching model connected to the feature extraction model; the apparatus further comprising:
    a user matching module, configured to cause the user matching model, after receiving the face feature vector of the user to be identified and a face feature vector of a specified user, to generate, according to the vector distance between the face feature vector of the user to be identified and the face feature vector of the specified user, output information indicating whether the user to be identified is the specified user, wherein the face feature vector of the specified user is obtained by processing a face image of the specified user with the encoder and the face feature extraction model.
16. A training apparatus for a user feature extraction model for privacy protection, the apparatus comprising:
    a first acquisition module, configured to acquire a first training sample set, the training samples in the first training sample set being face images;
    a first training module, configured to train an initial autoencoder with the first training sample set to obtain a trained autoencoder;
    a second acquisition module, configured to acquire a second training sample set, the training samples in the second training sample set being encoding vectors, each encoding vector being vector data obtained by characterizing a face image with the encoder of the trained autoencoder;
    a second training module, configured to input the training samples in the second training sample set into the decoder of an initial face feature extraction model, so that reconstructed face image data output by the decoder is used to train an initial feature extraction model, based on a convolutional neural network, in the initial face feature extraction model, thereby obtaining a trained face feature extraction model, the initial face feature extraction model being obtained by locking the decoder and the initial feature extraction model, the decoder being the decoder of the trained autoencoder;
    and a user feature extraction model generation module, configured to generate a user feature extraction model for privacy protection according to the encoder and the trained face feature extraction model.
17. The apparatus according to claim 16, wherein the first training module is specifically configured to:
    for each training sample in the first training sample set, input the training sample into the initial autoencoder to obtain reconstructed face image data;
    and optimize the model parameters of the initial autoencoder with the goal of minimizing an image reconstruction loss to obtain the trained autoencoder, the image reconstruction loss being the difference between the reconstructed face image data and the training sample.
18. The apparatus according to claim 16, wherein training the initial feature extraction model based on a convolutional neural network in the initial face feature extraction model with the reconstructed face image data output by the decoder specifically comprises:
    classifying the reconstructed face image data with the initial feature extraction model to obtain a predicted category label for the reconstructed face image data;
    acquiring a preset category label for the reconstructed face image data;
    and optimizing the model parameters of the initial feature extraction model with the goal of minimizing a classification loss, the classification loss being the difference between the predicted category label and the preset category label.
19. The apparatus according to claim 16, further comprising:
    a user matching model establishment module, configured to establish a user matching model, the user matching model being configured to generate, according to the vector distance between a first face feature vector of a user to be identified and a second face feature vector of a specified user, an output result indicating whether the user to be identified is the specified user, the first face feature vector being obtained by processing a face image of the user to be identified with the encoder and the trained face feature extraction model, and the second face feature vector being obtained by processing a face image of the specified user with the encoder and the trained face feature extraction model;
    wherein the user feature extraction model generation module is specifically configured to:
    generate a user feature extraction model for privacy protection composed of the encoder, the trained face feature extraction model, and the user matching model.
20. A client device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor, wherein
    the memory stores an image encoder and instructions executable by the at least one processor, the image encoder being the encoder of an autoencoder, and the instructions being executed by the at least one processor to enable the at least one processor to:
    input a face image of a user to be identified into the image encoder to obtain an encoding vector of the face image output by the image encoder, the encoding vector being vector data obtained by characterizing the face image;
    and send the encoding vector to a server device, so that the server device generates a face feature vector of the user to be identified from the encoding vector with a face feature extraction model, the face feature extraction model being a model obtained by locking the decoder of the autoencoder and a feature extraction model based on a convolutional neural network.
21. A server device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor, wherein
    the memory stores a face feature extraction model, the face feature extraction model being a model obtained by locking the decoder of an autoencoder and a feature extraction model based on a convolutional neural network, and the memory further stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to:
    acquire an encoding vector of a face image of a user to be identified, the encoding vector being vector data obtained by characterizing the face image with the encoder of the autoencoder;
    and input the encoding vector into the decoder of the face feature extraction model, whereupon the decoder outputs reconstructed face image data to the feature extraction model, so that the feature extraction model characterizes the reconstructed face image data and then outputs a face feature vector of the user to be identified.
  22. A training device for a facial feature extraction model for privacy protection, comprising:
    at least one processor; and
    a memory communicatively connected with the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to:
    acquire a first training sample set, the training samples in the first training sample set being face images;
    train an initial autoencoder with the first training sample set to obtain a trained autoencoder;
    acquire a second training sample set, the training samples in the second training sample set being encoding vectors, each encoding vector being vector data obtained by characterizing a face image with the encoder of the trained autoencoder;
    input the training samples in the second training sample set into the decoder of an initial facial feature extraction model, so that the reconstructed face image data output by the decoder is used to train the convolutional-neural-network-based initial feature extraction model in the initial facial feature extraction model, obtaining a trained facial feature extraction model; the initial facial feature extraction model being obtained by locking the decoder and the initial feature extraction model, the decoder being the decoder of the trained autoencoder; and
    generate a user feature extraction model for privacy protection according to the encoder and the trained facial feature extraction model.
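The two training stages of claim 22 might be sketched as follows, assuming PyTorch; the reconstruction loss, the optimizer, and the use of identity labels with a cross-entropy loss in the second stage are assumptions added for illustration.

import torch
import torch.nn as nn

def train_autoencoder(encoder, decoder, face_images, epochs=10, lr=1e-3):
    # Stage 1: fit the autoencoder on the first training sample set (face images).
    params = list(encoder.parameters()) + list(decoder.parameters())
    opt = torch.optim.Adam(params, lr=lr)
    mse = nn.MSELoss()
    for _ in range(epochs):
        for x in face_images:                   # x: a batch of face images
            opt.zero_grad()
            loss = mse(decoder(encoder(x)), x)  # reconstruction loss
            loss.backward()
            opt.step()

def train_feature_extractor(decoder, cnn_extractor, encoding_vectors, labels,
                            epochs=10, lr=1e-3):
    # Stage 2: the decoder is locked (frozen); only the CNN-based extractor is
    # trained, on reconstructions produced from the second sample set.
    decoder.eval()
    for p in decoder.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(cnn_extractor.parameters(), lr=lr)
    ce = nn.CrossEntropyLoss()                  # assumes identity labels per vector
    for _ in range(epochs):
        for code, y in zip(encoding_vectors, labels):
            opt.zero_grad()
            with torch.no_grad():
                reconstructed = decoder(code)   # reconstructed face image data
            logits = cnn_extractor(reconstructed)
            loss = ce(logits, y)
            loss.backward()
            opt.step()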
PCT/CN2020/140574 2020-03-19 2020-12-29 Facial feature extraction method, apparatus and device WO2021184898A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010197694.8 2020-03-19
CN202010197694.8A CN111401272B (en) 2020-03-19 2020-03-19 Face feature extraction method, device and equipment

Publications (1)

Publication Number Publication Date
WO2021184898A1 (en) 2021-09-23

Family ID

71432637

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/140574 WO2021184898A1 (en) 2020-03-19 2020-12-29 Facial feature extraction method, apparatus and device

Country Status (2)

Country Link
CN (2) CN111401272B (en)
WO (1) WO2021184898A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111401272B (en) * 2020-03-19 2021-08-24 支付宝(杭州)信息技术有限公司 Face feature extraction method, device and equipment
CN111401273B (en) * 2020-03-19 2022-04-29 支付宝(杭州)信息技术有限公司 User feature extraction system and device for privacy protection
CN111783965A (en) * 2020-08-14 2020-10-16 支付宝(杭州)信息技术有限公司 Biometric identification method, apparatus and system, and electronic device
CN112949545B (en) * 2021-03-17 2022-12-30 中国工商银行股份有限公司 Method, apparatus, computing device and medium for recognizing face image
CN112926559B (en) * 2021-05-12 2021-07-30 支付宝(杭州)信息技术有限公司 Face image processing method and device
CN113657498B (en) * 2021-08-17 2023-02-10 展讯通信(上海)有限公司 Biological feature extraction method, training method, authentication method, device and equipment
CN113946858B (en) * 2021-12-20 2022-03-18 湖南丰汇银佳科技股份有限公司 Identity security authentication method and system based on data privacy computing
CN115190217B (en) * 2022-07-07 2024-03-26 国家计算机网络与信息安全管理中心 Data security encryption method and device integrating an autoencoder network

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104866900B (en) * 2015-01-29 2018-01-19 北京工业大学 Deconvolution neural network training method
JP6318211B2 (en) * 2016-10-03 2018-04-25 株式会社Preferred Networks Data compression apparatus, data reproduction apparatus, data compression method, data reproduction method, and data transfer method
CN107220594B (en) * 2017-05-08 2020-06-12 桂林电子科技大学 Face pose reconstruction and recognition method based on a similarity-preserving stacked autoencoder
US11171977B2 (en) * 2018-02-19 2021-11-09 Nec Corporation Unsupervised spoofing detection from traffic data in mobile networks
CN108537120A (en) * 2018-03-06 2018-09-14 安徽电科恒钛智能科技有限公司 Face recognition method and system based on deep learning
CN108664967B (en) * 2018-04-17 2020-08-25 上海媒智科技有限公司 Method and system for predicting visual saliency of multimedia page
WO2020033900A1 (en) * 2018-08-10 2020-02-13 L3 Security & Detection Systems, Inc. Systems and methods for image processing
CN109117801A (en) * 2018-08-20 2019-01-01 深圳壹账通智能科技有限公司 Face recognition method, apparatus, terminal and computer-readable storage medium
CN109495476B (en) * 2018-11-19 2020-11-20 中南大学 Data stream differential privacy protection method and system based on edge computing
CN110147721B (en) * 2019-04-11 2023-04-18 创新先进技术有限公司 Three-dimensional face recognition method, model training method and device
CN110321777B (en) * 2019-04-25 2023-03-28 重庆理工大学 Face recognition method based on stacked convolution sparse denoising autoencoder
CN110310351B (en) * 2019-07-04 2023-07-21 北京信息科技大学 Sketch-based three-dimensional human skeleton animation automatic generation method
CN110766048A (en) * 2019-09-18 2020-02-07 平安科技(深圳)有限公司 Image content identification method and device, computer equipment and storage medium
CN110826056B (en) * 2019-11-11 2024-01-30 南京工业大学 Recommender system attack detection method based on an attention convolutional autoencoder

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190171908A1 (en) * 2017-12-01 2019-06-06 The University Of Chicago Image Transformation with a Hybrid Autoencoder and Generative Adversarial Network Machine Learning Architecture
CN109769080A (en) * 2018-12-06 2019-05-17 西北大学 Encrypted image cracking method and system based on deep learning
CN110598580A (en) * 2019-08-25 2019-12-20 南京理工大学 Face liveness detection method
CN111401272A (en) * 2020-03-19 2020-07-10 支付宝(杭州)信息技术有限公司 Face feature extraction method, device and equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU JINGJING: "Image Encryption Based on Variational Auto-encoder Generative Models", CHINESE MASTER'S THESES FULL-TEXT DATABASE, TIANJIN POLYTECHNIC UNIVERSITY, CN, 15 January 2019 (2019-01-15), XP055852584, ISSN: 1674-0246 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114821751A (en) * 2022-06-27 2022-07-29 北京瑞莱智慧科技有限公司 Image recognition method, device, system and storage medium
CN114842544A (en) * 2022-07-04 2022-08-02 江苏布罗信息技术有限公司 Intelligent face recognition method and system suitable for facial paralysis patients
CN114842544B (en) * 2022-07-04 2022-09-06 江苏布罗信息技术有限公司 Intelligent face recognition method and system suitable for facial paralysis patients
CN116844217A (en) * 2023-08-30 2023-10-03 成都睿瞳科技有限责任公司 Image processing system and method for generating face data
CN116844217B (en) * 2023-08-30 2023-11-14 成都睿瞳科技有限责任公司 Image processing system and method for generating face data

Also Published As

Publication number Publication date
CN111401272B (en) 2021-08-24
CN113657352A (en) 2021-11-16
CN111401272A (en) 2020-07-10

Similar Documents

Publication Publication Date Title
WO2021184898A1 (en) Facial feature extraction method, apparatus and device
WO2021184976A1 (en) User characteristics extraction system and device for privacy protection
WO2021238956A1 (en) Identity verification method, apparatus and device based on privacy protection
US10984225B1 (en) Masked face recognition
US20220172518A1 (en) Image recognition method and apparatus, computer-readable storage medium, and electronic device
CN112000940B (en) User identification method, device and equipment under privacy protection
CN111368795B (en) Face feature extraction method, device and equipment
CN115359219A (en) Virtual image processing method and device of virtual world
CN112084476A (en) Biological identification identity verification method, client, server, equipment and system
WO2020220212A1 (en) Biological feature recognition method and electronic device
CN116994188A (en) Action recognition method and device, electronic equipment and storage medium
CN113221717B (en) Model construction method, device and equipment based on privacy protection
CN116630480B (en) Interactive text-driven image editing method and device and electronic equipment
Wang et al. Multi-format speech biohashing based on energy to zero ratio and improved lp-mmse parameter fusion
CN112395448A (en) Face retrieval method and device
CN113239852B (en) Privacy image processing method, device and equipment based on privacy protection
CN115618375A (en) Service execution method, device, storage medium and electronic equipment
CN111860212B (en) Super-division method, device, equipment and storage medium for face image
CN115048661A (en) Model processing method, device and equipment
CN114662144A (en) Biological detection method, device and equipment
An et al. Verifiable speech retrieval algorithm based on KNN secure hashing
CN117874706B (en) Multi-modal knowledge distillation learning method and device
CN117539452B (en) Face recognition method and device and electronic equipment
CN117612269A (en) Biological attack detection method, device and equipment
CN114882290A (en) Authentication method, training method, device and equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20926282

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20926282

Country of ref document: EP

Kind code of ref document: A1