WO2019237846A1 - Image processing method, face recognition method, device and computer equipment - Google Patents

Image processing method, face recognition method, device and computer equipment

Info

Publication number
WO2019237846A1
Authority
WO
WIPO (PCT)
Prior art keywords
generated
sample set
network
network model
glasses
Prior art date
Application number
PCT/CN2019/085031
Other languages
English (en)
French (fr)
Inventor
邰颖
曹赟
丁守鸿
李绍欣
汪铖杰
李季檩
Original Assignee
腾讯科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 filed Critical 腾讯科技(深圳)有限公司
Publication of WO2019237846A1 publication Critical patent/WO2019237846A1/zh
Priority to US16/991,878 priority Critical patent/US11403876B2/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/77Retouching; Inpainting; Scratch removal
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/20Image enhancement or restoration using local operators
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/60Image enhancement or restoration using machine learning, e.g. neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • the present application relates to the field of image processing technologies, and in particular, to an image processing method, a face recognition method, a device, a computer device, and a readable storage medium.
  • The traditional glasses removal model has limited network learning ability, so it is difficult to ensure that the face image obtained after removing the glasses effectively preserves the relevant features of the original image, which further reduces the restoration fidelity of the glasses-removed face image.
  • An image processing method, a face recognition method, a device, a computer device, and a readable storage medium are provided, which address the technical problem of the low restoration fidelity of traditional glasses removal models.
  • an image processing method includes:
  • the object in the target image is wearing glasses
  • the glasses removal model includes a plurality of convolution-contraction excitation networks connected in sequence;
  • a glasses-removed image corresponding to the target image is generated according to the weighted feature map.
  • a face recognition method includes:
  • the glasses removal model includes a plurality of convolution-contraction excitation networks connected in sequence;
  • an image processing apparatus includes:
  • An image acquisition module configured to acquire a target image, and an object in the target image is wearing glasses;
  • the glasses removal model includes a plurality of convolution-contraction excitation networks connected in sequence;
  • a convolution module configured to obtain a feature map of each feature channel of the target image through a convolution layer of the convolution-shrink excitation network
  • a weight learning module configured to obtain global information of each feature channel according to the feature map through a shrinking excitation layer in the convolution-shrink excitation network, and learn the global information to generate a weight of each feature channel ;
  • a weighting module configured to use the weighting layer of the convolution-shrink excitation network to weight the feature maps of each feature channel according to the weights to generate a weighted feature map
  • a generating module is configured to generate a de-spectacled image corresponding to the target image according to the weighted feature map through the glasses removal model.
  • a face recognition device includes:
  • a target image acquisition module configured to acquire a target image from a face image to be identified, where the face in the target image is wearing glasses;
  • a target image input module configured to input the target image to a glasses removal model trained based on a generative adversarial network;
  • the glasses removal model includes a plurality of convolution-contraction excitation networks connected in sequence;
  • a feature convolution module configured to obtain a feature map of each feature channel of the target image through a convolution layer of the convolution-shrink excitation network
  • a feature weight learning module configured to obtain global information of each feature channel from the feature map through the shrink excitation layer in the convolution-shrink excitation network, and learn the global information to generate the weight of each feature channel;
  • a feature weighting module configured to perform weighting processing on the feature maps of each feature channel according to the weights through the weighting layer of the convolution-shrink excitation network to generate a weighted feature map
  • a face image generation module configured to obtain a de-glassed face image corresponding to the target image according to the weighted feature map through the glasses removal model;
  • a matching module is configured to match the de-spectacled face image with a preset face image database, and generate a face recognition result according to the matching result.
  • a computer device including a memory and a processor.
  • the memory stores a computer program
  • the processor implements the following steps when executing the computer program:
  • the object in the target image is wearing glasses
  • the glasses removal model includes a plurality of convolution-contraction excitation networks connected in sequence;
  • a glasses-removed image corresponding to the target image is generated according to the weighted feature map.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
  • the object in the target image is wearing glasses
  • the glasses removal model includes a plurality of convolution-contraction excitation networks connected in sequence;
  • a glasses-removed image corresponding to the target image is generated according to the weighted feature map.
  • a computer device including a memory and a processor.
  • the memory stores a computer program
  • the processor implements the following steps when executing the computer program:
  • the glasses removal model includes a plurality of convolution-contraction excitation networks connected in sequence;
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
  • the glasses removal model includes a plurality of convolution-contraction excitation networks connected in sequence;
  • the foregoing image processing method, face recognition method, device, computer device, and readable storage medium obtain a target image and input the target image to a pre-trained glasses removal model.
  • Because the glasses removal model includes a plurality of sequentially connected convolution-shrink excitation networks, the feature map of each feature channel of the target image can be obtained through the convolution layer of the convolution-shrink excitation network; the shrink excitation layer in the convolution-shrink excitation network then obtains the global information of each feature channel from the feature maps and learns this global information to generate a weight for each feature channel; and the weighting layer in the convolution-shrink excitation network weights the feature map of each feature channel according to its weight to generate a weighted feature map.
  • The glasses removal model then generates the corresponding glasses-removed image from the weighted feature map.
  • In this way the glasses removal model maintains a high learning ability and can fully learn the importance of different feature channels to obtain the corresponding weights.
  • Through the weighting processing, effective features are enhanced while invalid or weakly effective features are suppressed, so the glasses are effectively removed from the target image.
  • At the same time, it is ensured that the glasses-removed image recovers the key features of the target image, which improves the restoration fidelity and authenticity of the glasses-removed image.
  • FIG. 1 is an application environment diagram of an image processing method and / or a face recognition method in an embodiment
  • FIG. 2 is a schematic flowchart of an image processing method according to an embodiment
  • FIG. 3 is a schematic structural diagram of a convolution-shrink excitation network according to an embodiment
  • FIG. 4 is a schematic diagram of shrinking excitation processing and weighting processing in a convolution-shrinking excitation network in an embodiment
  • FIG. 5 is a schematic diagram of performing shrinking processing on a feature map in an embodiment
  • FIG. 6 is a schematic flowchart of an image processing method according to an embodiment
  • FIG. 7 is a schematic flowchart of a glasses removal model training method according to an embodiment
  • FIG. 8 is a schematic flowchart of a network loss coefficient generation step in an embodiment
  • FIG. 9 is a schematic structural diagram of a network model in a glasses removal model training method according to an embodiment
  • FIG. 10 is a schematic flowchart of steps of updating and generating a network model and iterating in an embodiment
  • FIG. 11 is a schematic flowchart of a face recognition method according to an embodiment
  • FIG. 12 is a schematic flowchart of a step of obtaining a target image through glasses recognition detection in an embodiment
  • FIG. 13 is a schematic flowchart of a face recognition method according to an embodiment
  • FIG. 14 is a structural block diagram of an image processing apparatus according to an embodiment
  • FIG. 15 is a structural block diagram of a face recognition device in an embodiment
  • FIG. 16 is a structural block diagram of a computer device in one embodiment.
  • FIG. 1 is an application environment diagram of an image processing method and / or a face recognition method in an embodiment.
  • the image processing method is applied to an image processing system.
  • the image processing system includes a terminal or a server 110.
  • The terminal may be a desktop terminal or a mobile terminal, and the mobile terminal may be at least one of a mobile phone, a tablet computer, a notebook computer, and the like; the server may be an independent server or a server cluster.
  • the terminal or server 110 is configured with a glasses removal model trained based on a generative adversarial network.
  • the glasses removal model includes a plurality of convolution-contraction excitation networks connected in sequence.
  • The convolution-shrink excitation networks in the glasses removal model process the target image so that the glasses in it are removed.
  • The terminal or server 110 can also perform face recognition, that is, after obtaining a glasses-removed face image based on the glasses removal model, it matches the glasses-removed face image with a preset face image database and generates a face recognition result according to the matching result.
  • an image processing method is provided. This embodiment is mainly described by using the method applied to the terminal or server 110 in FIG. 1 described above. Referring to FIG. 2, the image processing method includes the following steps:
  • S201 Obtain a target image, and the object in the target image is wearing glasses.
  • The target image refers to an image that carries glasses-wearing information and requires glasses removal processing; that is, the object in the target image is wearing glasses and glasses removal processing is required.
  • When the object is a face, the target image may be a face image with glasses worn; when the object is an eye region, the target image may be an eye image segmented from a face image with glasses worn.
  • For example, when image processing software is used to perform glasses removal processing, the acquired target image is the face image input to the image processing software or an eye image segmented from it.
  • the target image is input to a glasses removal model trained based on a generative adversarial network; the glasses removal model includes a plurality of convolution-contraction excitation networks connected in sequence.
  • The glasses removal model is trained in advance based on the generative adversarial network.
  • The glasses removal model may be a model that removes glasses from a global face image, or a model that removes glasses from a local eye image. It can be understood that when the glasses removal model removes glasses from a global face image, the target image is a global face image; when the glasses removal model removes glasses from a local eye image, the target image is a local eye image.
  • A generative adversarial network includes a generative network model and a discriminative network model.
  • The generative network model is used to generate, from the input data, a fake picture that is as realistic as possible.
  • Generative adversarial training means that the generative network model generates a picture to deceive the discriminative network model, and the discriminative network model then determines whether this picture and the corresponding real picture are real or fake.
  • a convolution-shrink excitation network refers to a structure composed of a convolutional layer, a contraction excitation layer, and a weighting layer of a convolutional neural network.
  • the contraction excitation layer includes a contraction module and an excitation module.
  • the contraction module is used to process the feature map of each feature channel to obtain global information of each feature channel.
  • The excitation module is used to learn the global information to generate the weight of each feature channel.
  • FIG. 3 provides a convolution-shrink excitation network obtained by introducing a shrinkage excitation layer in the residual network.
  • The residual layer 302 is connected to the shrinkage layer 304 and the weighting layer 308 respectively; the shrinkage layer 304 and the excitation layer 306 are connected, and the excitation layer 306 is also connected to the weighting layer 308.
  • the convolution-shrink excitation network is used to perform the following steps:
  • S203 A feature map of each feature channel of the target image is obtained through a convolution layer of the convolution-shrink excitation network.
  • the input target image is subjected to convolution processing through a convolution layer of a convolution-shrink excitation network to obtain a feature map of each feature channel of the target image, and the feature map is input to the Shrinking excitation layer in a convolution-shrinking excitation network.
  • In each convolutional layer, the data exists in three-dimensional form and can be thought of as a stack of two-dimensional pictures across multiple feature channels, each of which is called a feature map. As shown in FIG. 4, after the target image is transformed by convolution, a three-dimensional matrix U of size W × H × C is obtained, which can also be regarded as C feature maps of size W × H, where C denotes the number of feature channels.
  • the global information refers to the numerical distribution of the feature map of each feature channel.
  • the feature map is compressed by the shrinking layer 304 to obtain global information of each feature channel.
  • As shown in FIG. 5, the two-dimensional matrix corresponding to a 6 × 6 feature map is compressed into a 1 × 1 feature map that represents the global information of that channel.
  • the calculation method is shown in formula (1):
  • where z_c denotes the global information of the c-th feature channel; F_sq denotes the global-information (squeeze) function; u_c denotes the two-dimensional matrix (feature map) corresponding to the c-th feature channel in the matrix U; i denotes the row index and j the column index in the W × H two-dimensional matrix; and u_c(i, j) denotes the value at row i, column j of the two-dimensional matrix corresponding to the c-th feature channel.
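  • Reconstructed from the symbol definitions above (the formula itself appears only as an image in the original publication), formula (1) is the global average over each feature map: $z_c = F_{sq}(u_c) = \frac{1}{W \times H} \sum_{i=1}^{W} \sum_{j=1}^{H} u_c(i, j)$.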
  • Obtaining the global information is in fact taking the arithmetic mean of the feature values of each feature map, transforming each two-dimensional matrix into a single real number, so that this number characterizes the whole feature map of a channel rather than a single position; this avoids the problem that the local receptive field determined by the convolution kernel size is too small and provides insufficient reference information, which would make the evaluation inaccurate.
  • the global information is input to the excitation layer 306, and the global information is learned through the excitation layer 306 to generate weights for each feature channel.
  • the weight is used to indicate the importance of each feature channel.
  • the calculation method of the weight is shown in formula (2):
  • where s denotes the weights of the C feature channels, with dimension 1 × 1 × C; z denotes the global-information vector formed by the C values z_c, with dimension 1 × 1 × C; F_ex denotes the weight-learning function; σ denotes the sigmoid function; δ denotes the linear activation function; W_1 denotes the parameters of the dimensionality-reduction layer, with reduction ratio r; and W_2 denotes the parameters of the dimensionality-restoration layer.
  • the shrinking layer 304 compresses the feature map to obtain z.
  • The dimension of W_2 is C × (C / r), so the dimension of the output is 1 × 1 × C.
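  • Consistent with these symbol definitions and layer dimensions, formula (2) can be reconstructed (the original formula appears only as an image) as the two-fully-connected-layer gating used in squeeze-and-excitation networks: $s = F_{ex}(z, W) = \sigma(W_2 \, \delta(W_1 z))$, where $W_1 \in \mathbb{R}^{(C/r) \times C}$ and $W_2 \in \mathbb{R}^{C \times (C/r)}$.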
  • The result then passes through the sigmoid function to obtain s. Since the compression in the preceding shrinking layer 304 is performed on the feature map of a single feature channel, the two fully connected layers in the excitation layer 306 fuse the feature map information of all feature channels, and the weight of each feature channel is learned based on the dependencies between channels; the weights therefore accurately characterize the importance of the feature map of each channel, so that valid feature maps receive larger weights and invalid or weakly effective feature maps receive smaller weights.
  • the feature maps of each feature channel are weighted according to the weights to generate a weighted feature map.
  • the weighted layer of the convolution-shrink excitation network is used to multiply the feature map of each feature channel by the corresponding weight to generate a weighted feature map. As shown in the following formula (3):
  • where F_scale denotes the weighting function, and s_c denotes the weight of the c-th feature channel.
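  • Reconstructed from these definitions, formula (3) is the channel-wise scaling $\tilde{x}_c = F_{scale}(u_c, s_c) = s_c \cdot u_c$, i.e., each feature map is multiplied by the weight of its feature channel.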
  • The weighted feature map is generated and input to the next network layer for processing. Because the weighted feature map is obtained according to the weight of each feature channel, it suppresses invalid or weakly effective features while enhancing effective ones and strengthens the learning ability of the network, so that the glasses removal model can complete the glasses removal process with fewer convolution kernels (each convolutional layer uses only 64 or 128 kernels), which reduces the model size and complexity.
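  • As a concrete illustration of one convolution-shrink excitation network, the following is a minimal sketch in PyTorch; the framework choice, the two 3 × 3 convolutions in the residual part, and the reduction ratio r = 16 are assumptions for illustration, not details taken from this application.

```python
import torch
import torch.nn as nn

class ConvShrinkExcitation(nn.Module):
    """Residual convolution block followed by shrink (squeeze), excitation and channel weighting."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Residual (convolution) layer: two 3x3 convolutions that keep the spatial size.
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )
        # Shrink layer: global average pooling, one value per feature channel (formula (1)).
        self.shrink = nn.AdaptiveAvgPool2d(1)
        # Excitation layer: two fully connected layers with reduction ratio r (formula (2)).
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        u = self.conv(x)                     # feature maps U of size W x H x C
        b, c, _, _ = u.shape
        z = self.shrink(u).view(b, c)        # global information z_c for each channel
        s = self.excite(z).view(b, c, 1, 1)  # channel weights s_c in (0, 1)
        weighted = u * s                     # weighting layer (formula (3))
        return x + weighted                  # residual connection back to the input
```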
  • the glasses removal model is used to generate a glasses-removed image corresponding to the target image according to the weighted feature map.
  • The glasses removal model is a trained model with a glasses removal effect. After processing through the multiple convolution-shrink excitation networks and other network layers in the glasses removal model, the glasses-removed image corresponding to the target image is generated based on the weighted feature map.
  • In the above image processing method, the target image is input into the pre-trained glasses removal model. Because the glasses removal model includes a plurality of sequentially connected convolution-shrink excitation networks, the convolution layer of the convolution-shrink excitation network obtains the feature map of each feature channel of the target image; the shrink excitation layer then obtains the global information of each feature channel from the feature maps and learns this global information to generate a weight for each feature channel; and the weighting layer in the convolution-shrink excitation network weights the feature map of each feature channel according to its weight to generate a weighted feature map.
  • The glasses removal model then obtains the corresponding glasses-removed image from the weighted feature map.
  • The glasses removal model maintains a high learning ability and can fully learn the importance of different feature channels to obtain the corresponding weights; through the weighting processing, effective features are enhanced while invalid or weakly effective features are suppressed, so the glasses are effectively removed from the target image while the glasses-removed image is ensured to recover its key features, improving the restoration fidelity and authenticity of the glasses-removed image.
  • an image processing method is provided.
  • the eyeglass removal model is a model for removing eyeglasses for a local eye image. As shown in Figure 6, the method includes:
  • the face image refers to a picture including the entire face information.
  • S602 Segment the eye image according to the position of the eye in the face image to obtain a target image.
  • the target image is input to a glasses removal model trained based on a generative adversarial network; the glasses removal model includes a plurality of convolution-contraction excitation networks connected in sequence.
  • a feature map of each feature channel of the target image is obtained through a convolution layer of the convolution-shrink excitation network.
  • the feature maps of each feature channel are weighted according to the weights to generate a weighted feature map.
  • the glasses-removed image is an eye image after the glasses are removed.
  • The position of the eyes in the face image is determined, and the eye image at that position is replaced by the glasses-removed eye image to obtain the face image with the glasses removed.
  • Building the glasses removal model on the eye image strengthens the model's processing of the eye region and improves the glasses removal effect.
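  • The segment-process-replace flow described above can be sketched as follows; the eye_box coordinates and the callable glasses_removal_model are hypothetical placeholders (in practice the eye position would come from a landmark detector), so this is only an illustrative outline.

```python
import numpy as np

def remove_glasses_from_face(face_img: np.ndarray, eye_box: tuple, glasses_removal_model) -> np.ndarray:
    """Segment the eye region, remove the glasses from it, and paste the result back.

    face_img: H x W x 3 face image.
    eye_box: (top, bottom, left, right) bounds of the eye region in the face image.
    glasses_removal_model: callable mapping an eye image to a glasses-removed eye image.
    """
    top, bottom, left, right = eye_box
    eye_img = face_img[top:bottom, left:right].copy()   # segmented eye image (the target image)
    eye_no_glasses = glasses_removal_model(eye_img)      # glasses-removed eye image
    restored = face_img.copy()
    restored[top:bottom, left:right] = eye_no_glasses    # replace the original eye region
    return restored
```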
  • In one embodiment, before the step of inputting the target image to the glasses removal model trained based on the generative adversarial network, the method further includes: performing normalization processing on the target image. After the step of generating the glasses-removed image corresponding to the target image from the weighted feature map through the glasses removal model, the method further includes: performing restoration processing on the glasses-removed image to restore it to the target image size.
  • the step of inputting the target image to the glasses removal model trained based on the generation of the adversarial network refers to inputting the normalized processing target image to the glasses removal model trained based on the generation of the adversarial network.
  • Normalization processing refers to processing that normalizes the original image to the same size and the same pixel value range.
  • the reduction processing refers to the inverse processing as opposed to the normalization processing, that is, the reduction of the image size to the original image size and the reduction of the pixel value range to the pixel value range of the original image.
  • For example, in the normalization process the original image is resized to 256 × 256 and the pixel values are normalized to [-1, 1]; in the restoration process, assuming the pixel value range of the original image is [0, 255], the image is resized back to the original image size and the pixel values are mapped back to [0, 255].
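  • A minimal sketch of the normalization and restoration steps, assuming OpenCV is available for resizing (the library choice and helper names are illustrative, not prescribed by the application):

```python
import cv2  # assumed dependency for resizing
import numpy as np

def normalize(img: np.ndarray) -> np.ndarray:
    """Resize to 256 x 256 and map pixel values from [0, 255] to [-1, 1]."""
    resized = cv2.resize(img, (256, 256))
    return resized.astype(np.float32) / 127.5 - 1.0

def restore(img: np.ndarray, original_size: tuple) -> np.ndarray:
    """Map pixel values back to [0, 255] and resize to the original image size (width, height)."""
    pixels = np.clip((img + 1.0) * 127.5, 0.0, 255.0).astype(np.uint8)
    return cv2.resize(pixels, original_size)
```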
  • a method for training a glasses removal model in an image processing method including the following steps:
  • the first training sample set is composed of a plurality of normalized first training images (first training samples).
  • The second training sample set is composed of a plurality of normalized second training images (second training samples).
  • The training samples in the first training sample set correspond one-to-one with the training samples in the second training sample set; the only difference between corresponding samples is whether glasses are worn, where the glasses worn are framed glasses. For example, in the normalization process the original image is resized to 256 × 256 and the pixel values are normalized to [-1, 1].
  • The second training samples may be second training images obtained through various image acquisition channels, or obtained by duplicating acquired second training images, and the first training samples may then be obtained by adding glasses to the second training samples; alternatively, the first and second training samples may be large numbers of image samples collected by face image acquisition devices such as cameras. It can be understood that when the trained glasses removal model removes glasses from a global face image, the training samples are global face images; when the trained glasses removal model removes glasses from a local eye image, the training samples are local eye images. Training the model on eye images strengthens the model's processing of the eye region and improves the glasses removal effect.
  • the first training sample set is input to a generation network model in the generation adversarial network to obtain a generation sample set with glasses removed.
  • the generation network model includes a plurality of convolution-contraction excitation networks connected in sequence.
  • the generated sample set refers to a set composed of generated samples corresponding to each first training sample. Further, the generated sample refers to a face image generated after the de-spectacle processing is performed on the first training sample by the generated network model.
  • Specifically, the first training samples in the first training sample set are input in turn to the generative network model in the generative adversarial network, and the convolution layer of each convolution-shrink excitation network in the generative network model obtains the feature map of each feature channel of the first training sample.
  • Through the shrink excitation layer in the convolution-shrink excitation network, the global information of each feature channel is obtained from the feature maps and learned to generate the weight of each feature channel, and the weighting layer of the convolution-shrink excitation network then weights the feature map of each feature channel according to its weight to generate a weighted feature map corresponding to the first training sample.
  • the weighted feature map corresponding to the first training sample is further processed based on the generated network model, and finally generated samples corresponding to the first training sample are generated, and all the generated samples form a generated sample set.
  • the generated sample set and the second training sample set are respectively input to a discriminative network model in the generation adversarial network, and a generation network loss coefficient is obtained according to the output of the discriminative network model.
  • the loss coefficient refers to a parameter used to evaluate the prediction effect of the network model.
  • the smaller the loss coefficient the better the prediction effect of the network model.
  • the generated network loss coefficient refers to a parameter used to evaluate the effect of generating a network model on removing glasses. Based on the generated network loss coefficient, various parameters in the generated network model are adjusted to achieve a better effect of removing glasses. In this embodiment, a corresponding generated network loss coefficient is generated based on different generated samples.
  • Generative adversarial training means that the generative network model generates a picture to deceive the discriminative network model, and the discriminative network model then determines whether this picture and the corresponding real picture are real or fake.
  • The purpose of generative adversarial training is to make the samples produced by the generative network model realistic enough to pass as real; in other words, it becomes difficult for the discriminative network model to tell whether a generated sample is a generated image or a real image.
  • The generated sample set and the second training sample set are input to the discriminative network model in the generative adversarial network, and the parameters of the discriminative network model are adjusted according to its output to obtain an updated discriminative network model; the generated sample set is then input to the updated discriminative network model, and the generated network loss coefficient is obtained according to the output of the updated discriminative network model, so that the parameters of the generative network model can be adjusted according to the generated network loss coefficient.
  • the parameters for generating the network model refer to the connection weights between the neurons in the network model.
  • S708 Update the parameters of the generated network model according to the generated network loss coefficient, obtain the updated generated network model, and return to step S704 until the iteration end condition is satisfied, and use the updated generated network model as the glasses removal model.
  • Specifically, the parameters of the generative network model are adjusted according to the generated network loss coefficient and a preset parameter adjustment method to obtain an updated generative network model. Whether the preset iteration end condition is met is then determined; if it is met, the iterative training ends and the updated generative network model is used as the glasses removal model; otherwise the process returns to step S704 until the preset iteration end condition is satisfied, at which point the updated generative network model is used as the glasses removal model.
  • The method for adjusting the parameters of the generative network model includes, but is not limited to, error correction algorithms such as gradient descent and back propagation, for example the Adam (Adaptive Moment Estimation) algorithm.
  • the iteration end condition may be that the number of iterations reaches the threshold of the number of iterations, or that the network model is generated to achieve a preset glasses removal effect, which is not limited herein.
  • In this embodiment, a generative network model including a plurality of sequentially connected convolution-shrink excitation networks and a discriminative network model form a generative adversarial network, and adversarial training is performed to obtain a generative network model that can effectively remove glasses, which is then used as the glasses removal model.
  • During training, the global information of each feature channel corresponding to the input training sample is learned to generate the weight of each feature channel, and the feature map of each feature channel is weighted according to its weight to generate the corresponding weighted feature map.
  • The weighting thus enhances effective features while suppressing invalid or weakly effective ones, effectively removing the glasses in each first training sample of the first training sample set while enabling the generated samples to recover the key features of the corresponding first training samples, which improves the restoration fidelity and authenticity of the generated samples.
  • In one embodiment, the step of respectively inputting the generated sample set and the second training sample set to the discriminative network model in the generative adversarial network and obtaining the generated network loss coefficient according to the output of the discriminative network model includes the following steps:
  • the generated sample set and the second training sample set are respectively input to a discriminative network model in the generation adversarial network, and a discriminative network loss coefficient is obtained according to an output of the discriminant network model.
  • the discriminative network loss coefficient refers to a parameter used to evaluate the classification effect of the discriminative network model. Based on the discriminant network loss coefficient, various parameters in the discriminative network model are adjusted to achieve a more accurate classification effect. In this embodiment, a corresponding discriminant network loss coefficient is generated based on different generated samples.
  • Each generated sample in the generated sample set and each second training sample in the second training sample set are input in turn to the discriminative network model in the generative adversarial network, an output is obtained for each generated sample and each second training sample, and the discriminative network loss coefficient is obtained from the outputs of each generated sample and its corresponding second training sample.
  • the number of discriminant network loss coefficients is the same as the number of generated samples.
  • S804 Update the parameters of the discriminant network model according to the discriminant network loss coefficient to obtain an updated discriminant network model.
  • the parameters of the discriminative network model refer to discriminating the connection weights between the neurons in the network model.
  • the parameters of the discriminative network model are adjusted according to the discriminative network loss coefficient and the preset method of adjusting the parameters of the discriminative network model to obtain an updated discriminant network model.
  • The method for adjusting the parameters of the discriminative network model includes, but is not limited to, gradient correction methods and back-propagation error correction algorithms, for example the Adam algorithm, which optimizes a stochastic objective function based on first-order gradients.
  • the generated sample set is input to the updated discrimination network model, and a generated network loss coefficient is obtained according to the output of the updated discrimination network model.
  • After the updated discriminative network model is obtained, the current discriminative network model has a better classification effect than before the update. Therefore, once the discriminative network model is determined to have a good classification effect, its parameters are fixed and the generative network model is then trained.
  • When training the generative network model, each generated sample in the generated sample set is input in turn to the updated discriminative network model, each generated sample corresponds to an output of the updated discriminative network model, and the generated network loss coefficient is obtained based on these outputs.
  • In this embodiment, the parameters of the generative network model are first fixed while the discriminative network model is trained and updated, so that the discriminative network model after training has the required classification ability. After the discriminative network model is trained, the generative network model is trained and updated; at this time the parameters of the discriminative network model are fixed, and only the loss or error produced for the generative network model is passed back to it: the generated network loss coefficient is obtained from the output of the updated discriminative network model, and the parameters of the generative network model are updated based on it. Through the adversarial game between the discriminative network model and the generative network model, the two network models eventually reach a steady state.
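  • The alternating update described above can be sketched as follows in PyTorch; the binary cross-entropy loss (matching formulas (4) and (5) below), the assumption that the discriminative model D ends with a sigmoid and outputs one probability per sample, and the optimizer handling are illustrative assumptions rather than the application's exact implementation.

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()

def train_step(G, D, opt_G, opt_D, first_batch, second_batch):
    """One adversarial update; first_batch: samples with glasses, second_batch: paired samples without."""
    real_label = torch.ones(second_batch.size(0), 1)
    fake_label = torch.zeros(first_batch.size(0), 1)

    # 1) Update the discriminative network model with the generative model's parameters fixed.
    generated = G(first_batch).detach()        # generated sample set; detach blocks gradients to G
    d_loss = bce(D(second_batch), real_label) + bce(D(generated), fake_label)
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # 2) Update the generative network model; only its optimizer steps, so D stays fixed.
    generated = G(first_batch)
    g_loss = bce(D(generated), torch.ones(first_batch.size(0), 1))  # generated samples labelled as 1
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()
    return d_loss.item(), g_loss.item()
```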
  • In one embodiment, the step of inputting the generated sample set and the second training sample set to the discriminative network model and obtaining the discriminative network loss coefficient according to the output of the discriminative network model includes: respectively inputting the generated sample set and the second training sample set to the discriminative network model to obtain the first probability corresponding to the generated sample set and the second probability corresponding to the second training sample set; and obtaining the discriminative network loss coefficient according to the first probability, the second probability, and the discriminative network loss function.
  • the first probability refers to the probability that the generated sample belongs to the training sample instead of the generated sample
  • the second probability refers to the probability that the second training sample belongs to the training sample instead of the generated sample.
  • The output of the discriminative network model is a probability value between 0 and 1, that is, both the first probability and the second probability lie in the range 0 to 1.
  • the purpose of discriminative network model training is to make the first probability corresponding to the generated sample tend to 0 as much as possible, so that the corresponding second probability of the second training sample tends to 1 as much as possible, so as to obtain accurate classification ability.
  • the discriminant network loss function refers to a function for calculating a loss coefficient of a discriminant network model based on an output of the discriminant network model.
  • the discriminative network loss function can be a cross-entropy loss function, a function that maximizes the discriminative degree of the discriminative network shown in formula (4)
  • where D denotes the discriminative network model; G denotes the generative network model; x denotes any second training sample; p_data(x) denotes the category identifier of the second training samples; D(x) denotes the probability corresponding to a second training sample, which in this embodiment is the second probability; y denotes any first training sample; p_y(y) denotes the category identifier of the generated samples; G(y) denotes the generated sample corresponding to any first training sample; and D(G(y)) denotes the probability corresponding to any generated sample, which in this embodiment is the first probability.
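  • Based on these symbol definitions, formula (4) is presumably the standard adversarial objective that maximizes the discrimination ability of the discriminative network: $\max_D \; \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{y \sim p_y(y)}[\log(1 - D(G(y)))]$.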
  • Specifically, each generated sample in the generated sample set together with its category identifier, and each second training sample in the second training sample set together with its category identifier, are input in turn to the discriminative network model to obtain the first probability corresponding to the generated sample set and the second probability corresponding to the second training sample set.
  • the step of inputting the generated sample set to the updated discriminant network model and obtaining the network loss coefficient according to the output of the updated discriminant network model includes: inputting the generated sample set to the updated discriminant network model. To obtain a third probability corresponding to the generated sample set; and obtain a generated network loss coefficient according to the third probability and the generated network loss function.
  • the third probability refers to a probability that the generated sample belongs to the training sample instead of the generated sample.
  • the generated network loss function is a function that calculates the loss coefficient of the generated network model based on the output of the generated network model.
  • The generated network loss function can be a cross-entropy loss function, namely a function that minimizes the difference between the distributions of the generated samples and the training samples, as shown in formula (5).
  • D (G (y)) represents the probability corresponding to any one of the generated samples, and in this embodiment is the third probability.
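  • Based on these definitions, formula (5) is presumably the corresponding generator objective that minimizes the gap between the distributions of the generated samples and the training samples: $\min_G \; \mathbb{E}_{y \sim p_y(y)}[\log(1 - D(G(y)))]$; in practice, as described below, the category identifier of the generated sample is set to 1, which corresponds to maximizing $\log D(G(y))$.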
  • Specifically, each generated sample in the generated sample set together with its category identifier is input in turn to the updated discriminative network model to obtain the third probability corresponding to the generated sample set; the generated network loss coefficient is then obtained according to the third probability and the generated network loss function.
  • the category identifier of the generated sample is set to 1 to confuse the discriminator, so that the generated sample can gradually approach the real second training sample.
  • In one embodiment, the architecture used for training the glasses removal model further includes a feature network model.
  • Before the parameters of the generative network model are updated according to the generated network loss coefficient to obtain the updated generative network model, the method further includes: inputting the generated sample set and the second training sample set respectively to the feature network model to obtain the feature error between the generated sample set and the second training sample set. Updating the parameters of the generative network model according to the generated network loss coefficient to obtain the updated generative network model then includes: updating the parameters of the generative network model according to the generated network loss coefficient and the feature error to obtain the updated generative network model.
  • the feature error refers to the difference between the generated sample and its corresponding second training sample in the feature space. It can be understood that the feature error between the generated sample set and the second training sample set refers to the difference in the feature space between each generated sample in the generated sample set and its corresponding second training sample.
  • Specifically, each generated sample in the generated sample set and its corresponding second training sample are input in turn to the feature network model, and the feature network model is used to extract the features of the generated sample and of the corresponding second training sample, from which the feature error is obtained.
  • the parameters of the generated network model are adjusted according to the generated network loss coefficient and characteristic error, and the preset method of adjusting the parameters of the generated network model to obtain an updated generated network model. For example, according to the generated network loss coefficient and feature error, Adam algorithm is used to adjust the parameters of the generated network model to obtain an updated generated network model.
  • This further ensures that the glasses-removed image finally recovered by the glasses removal model retains discriminative information, that is, the key features of the target image can be recovered more accurately, which improves the restoration fidelity of the glasses-removed image and ensures the accuracy of face recognition in face recognition applications.
  • In one embodiment, before the parameters of the generative network model are updated according to the generated network loss coefficient to obtain the updated generative network model, the method further includes: analyzing the pixels of the generated sample set and the second training sample set to obtain the pixel error between the generated sample set and the second training sample set. Updating the parameters of the generative network model according to the generated network loss coefficient to obtain the updated generative network model then includes: updating the parameters of the generative network model according to the generated network loss coefficient and the pixel error to obtain the updated generative network model.
  • the pixel error refers to a difference between each pixel of the generated sample and the corresponding second training sample. It can be understood that the pixel error between the generated sample set and the second training sample set refers to the difference in pixels between each generated sample in the generated sample set and its corresponding second training sample.
  • an error analysis is performed on the pixel points of each generated sample and its corresponding second training sample in the generated sample set in order to obtain the pixel error between each generated sample and its corresponding second training sample.
  • the parameters of the generated network model are adjusted according to the generated network loss coefficient and pixel error, and a preset method of adjusting the parameters of the generated network model to obtain an updated generated network model. For example, according to the generated network loss coefficient and pixel error, Adam algorithm is used to adjust the parameters of the generated network model to obtain the updated generated network model.
  • In one embodiment, before the parameters of the generative network model are updated according to the generated network loss coefficient to obtain the updated generative network model, the method further includes: analyzing the pixels of the generated sample set and the second training sample set to obtain the pixel error between them; and inputting the generated sample set and the second training sample set respectively to the feature network model to obtain the feature error between them. Updating the parameters of the generative network model according to the generated network loss coefficient to obtain the updated generative network model then includes: updating the parameters of the generative network model according to the generated network loss coefficient, the pixel error, and the feature error to obtain the updated generative network model.
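  • One hedged sketch of how the generated network loss coefficient, the pixel error, and the feature error might be combined when updating the generative network model; the L1 distances and the weighting coefficients are assumptions, since the application does not specify them.

```python
import torch
import torch.nn.functional as F

def generator_total_loss(D, feature_net, generated, second_batch,
                         w_adv=1.0, w_pix=100.0, w_feat=10.0):
    """Adversarial loss + pixel error + feature error for a batch of generated samples."""
    adv_loss = F.binary_cross_entropy(
        D(generated), torch.ones(generated.size(0), 1))          # generated samples labelled as 1
    pixel_loss = F.l1_loss(generated, second_batch)               # pixel error to the paired real sample
    feature_loss = F.l1_loss(feature_net(generated),
                             feature_net(second_batch))           # feature error in feature space
    return w_adv * adv_loss + w_pix * pixel_loss + w_feat * feature_loss
```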
  • This further improves the restoration fidelity of the glasses-removed image recovered by the final glasses removal model.
  • step S708 further includes the following steps:
  • S1002 Update the parameters of the generated network model according to the generated network loss coefficient to obtain the updated generated network model.
  • Specifically, each time one round of training of the generative adversarial network is completed, the number of iterations is incremented by one and the current number of iterations is obtained; it is then determined whether the current number of iterations reaches the iteration threshold. If it does not, the training steps continue; otherwise, the updated generative network model is used as the glasses removal model and the training step is exited.
  • In one embodiment, a step of testing the glasses removal model is further included.
  • The step includes: obtaining a test sample set consisting of test images in which the objects are wearing glasses; inputting the test sample set to the glasses removal model obtained by training; and obtaining test results based on the output of the glasses removal model.
  • the test sample set is composed of a plurality of normalized test images (test samples), and the test image is different from the first training image.
  • the performance of the trained glasses removal model is further tested to determine whether the currently obtained glasses removal model satisfies a preset glasses removal effect.
  • A face recognition method that applies the glasses removal model to face recognition includes the following steps:
  • S1101 Obtain a target image in a face image to be identified, and the face in the target image is wearing glasses.
  • the face image to be identified refers to a global face image that currently needs to be identified. For example, during identity verification during the security check, a global face image collected by an image acquisition device.
  • the face image to be identified may be a face image of a face with glasses, or a face image of a face without glasses.
  • The target image refers to an image that is obtained by analyzing and processing the face image to be recognized and that carries glasses-wearing information requiring glasses removal processing; that is, the face in the target image is wearing glasses and glasses removal processing is required.
  • the target image may be a face image of a face wearing glasses, or an eye image segmented from a face image of a face wearing glasses.
  • When the glasses removal model removes glasses from a global face image, the target image is a global face image; when the glasses removal model removes glasses from a local eye image, the target image is a local eye image.
  • Specifically, a target image obtained through glasses recognition detection on the face image to be recognized, or a selected target image, is acquired so that the target image can be input to the glasses removal model for glasses removal processing.
  • S1102 Input a target image into a glasses removal model trained based on a generative adversarial network; the glasses removal model includes a plurality of convolution-contraction excitation networks connected in sequence.
  • S1103 Obtain a feature map of each feature channel of the target image through a convolution layer of the convolution-shrink excitation network.
  • the input target image is subjected to convolution processing through a convolution layer of a convolution-shrink excitation network to obtain a feature map of each feature channel of the target image, and the feature map is input to the Shrinking excitation layer in a convolution-shrinking excitation network.
  • step S1104 the global information of each characteristic channel is obtained according to the feature map through the contraction excitation layer in the convolution-contraction excitation network, and the global information is learned to generate the weight of each characteristic channel.
  • the feature map is compressed by the shrinking layer in the shrinking excitation layer to obtain global information of each feature channel, and the global information is learned by the excitation layer in the shrinking excitation layer to generate each feature channel. the weight of.
  • the weighted layer of the convolution-shrink excitation network is used to multiply the feature map of each feature channel by the corresponding weight to generate a weighted feature map.
  • step S1106 the glasses-removed face image corresponding to the target image is obtained according to the weighted feature map through the glasses removal model.
  • the de-spectacled face image refers to a global face image that corresponds to the target image and has glasses removed.
  • When the target image is a global face image, the glasses-removed face image is the target image with the glasses removed; when the target image is a local eye image, the glasses-removed face image is the face image obtained after the corresponding eye region of the face image to be recognized is replaced with the glasses-removed eye image.
  • S1107 Match the de-spectacled face image with a preset face image database, and generate a face recognition result according to the matching result.
  • the preset face image library stores registered or verified face images.
  • the face recognition result includes one or more of: recognition success, recognition failure, and related information of the matched face image, and can be set according to the recognition requirements, which is not limited herein. For example, in a security verification system or a face access control system for public transportation, it is only necessary to determine whether the person to be recognized is legitimate, so the face recognition result is recognition success or recognition failure. When information is retrieved in a public security verification system, the face recognition result also includes related information of the matched face image.
  • a traditional face recognition model is used to match the de-spectacled face image with a preset face image database to obtain a matching result, and a face recognition result is generated according to the matching result. For example, when a face image in the preset face image database is matched, a face recognition result of recognition success is generated; or, when a face image in the preset face image database is matched, related information of the matched face image is obtained and a face recognition result is generated based on the related information. When no face image in the preset face image database is matched, a face recognition result of recognition failure is generated.
  • traditional face recognition models include, but are not limited to, Bruce-Young models, interactive activation competition models, and the like.
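  • As an illustration of the matching step, the sketch below compares an embedding of the de-spectacled face with stored embeddings in a preset face image database using cosine similarity; the embedding extractor, the gallery format, and the decision threshold are assumptions for illustration and are not specified by this document:

```python
import numpy as np

def match_face(embedding, gallery, threshold=0.6):
    """Match a de-spectacled face embedding against a preset face image database.

    `embedding` is a feature vector for the de-spectacled face; `gallery` maps
    identity -> stored embedding. Both the embedding model and the threshold
    are illustrative assumptions.
    """
    best_id, best_score = None, -1.0
    for identity, ref in gallery.items():
        score = float(np.dot(embedding, ref) /
                      (np.linalg.norm(embedding) * np.linalg.norm(ref) + 1e-8))
        if score > best_score:
            best_id, best_score = identity, score
    if best_score >= threshold:
        return {"result": "recognition success", "identity": best_id, "score": best_score}
    return {"result": "recognition failure"}
```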
  • through the glasses removal model, the glasses in the face image to be recognized are removed, so there is no need to manually take off the glasses before performing face image collection and face recognition, which improves the efficiency of face recognition and avoids recognition failures caused by glasses interference.
  • the glasses removal model composed of multiple convolution-shrink excitation networks can enhance the effective features of the target image and suppress features that are ineffective or have little effect, effectively removing the glasses in the target image while ensuring that the de-glassed image recovers the key features of the target image, which improves the fidelity and authenticity of the de-glassed image and further ensures the accuracy of the face recognition results.
  • the step of obtaining a target image in the face image to be identified includes:
  • S1202 Acquire a face image to be identified.
  • S1204 Perform glasses recognition detection on the face image to be recognized.
  • S1206 Obtain a target image according to a result of the glasses recognition detection.
  • glasses recognition detection is first performed on the face image to be identified to determine whether the face in the face image to be recognized is wearing glasses.
  • if the face in the face image to be recognized is wearing glasses, a target image is obtained and input to the glasses removal model for glasses removal processing, and the result is then input to the face recognition model for recognition; if the face in the face image to be recognized is not wearing glasses, the face image is directly input to the face recognition model for recognition.
  • glasses recognition detection can be performed by conventional object detection models, such as deep-learning-based object detection models, region-based convolutional neural networks, and the like; a minimal classifier sketch is given below.
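  • The small CNN below is only an assumed stand-in for such a detector, since the document allows any conventional object detection or classification model for this step:

```python
import torch.nn as nn

class GlassesDetector(nn.Module):
    """A minimal glasses/no-glasses classifier used to decide whether removal is needed."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, 2)   # classes: wearing glasses / not wearing glasses

    def forward(self, x):
        h = self.features(x).flatten(1)
        return self.classifier(h)
```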
  • obtaining a target image wearing glasses according to the result of the glasses recognition detection includes: when it is detected that the face in the face image to be recognized is wearing glasses, segmenting the eye image according to the position of the eyes in the face image to be recognized, to obtain the target image.
  • obtaining, through the glasses removal model, the de-spectacled face image corresponding to the target image according to the weighted feature map includes: generating, through the glasses removal model, the de-glassed image corresponding to the target image based on the weighted feature map; and fusing the face image to be recognized with the de-glassed image to obtain the de-spectacled face image.
  • target detection is performed on the face image to determine the position of the eyes in the face image, the eye image is segmented based on the determined position, and the segmented eye image is used as the target image for glasses removal processing.
  • after the de-glassed image corresponding to the target image is generated, the eye image at the determined position is replaced with the de-glassed image to obtain the face image after the glasses are removed, as sketched below.
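  • A minimal sketch of this segment-and-fuse procedure follows; the eye bounding box is assumed to come from a separate landmark or target detection step, and `glasses_removal_model` stands for the trained model applied to the eye patch:

```python
def remove_glasses_from_face(face_img, eye_box, glasses_removal_model):
    """Segment the eye region, remove glasses from it, and paste the result back.

    `face_img` is an H x W x 3 array; `eye_box` = (top, left, height, width) is
    assumed to come from a face/eye landmark detector (not defined here).
    """
    top, left, h, w = eye_box
    eye_patch = face_img[top:top + h, left:left + w].copy()   # segmented eye image (target image)
    deglassed_patch = glasses_removal_model(eye_patch)        # de-glassed eye image
    fused = face_img.copy()
    fused[top:top + h, left:left + w] = deglassed_patch       # replace eye region with de-glassed patch
    return fused                                              # face image after removing glasses
```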
  • before the step of inputting the target image to the glasses removal model trained based on a generative adversarial network, the method further includes: performing normalization processing on the target image.
  • the step of obtaining the de-spectacled face image corresponding to the target image according to the weighted feature map through the glasses removal model includes: generating, through the glasses removal model, the de-glassed image corresponding to the target image based on the weighted feature map; and performing restoration processing on the de-glassed image to restore it to the size of the target image, obtaining the de-spectacled face image corresponding to the target image.
  • the step of inputting the target image into the glasses removal model trained based on the generated adversarial network refers to inputting the normalized target image into the glasses removal model trained based on the generated adversarial network.
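  • Following the example values used elsewhere in this description (resizing to 256×256 and scaling pixel values to [-1, 1]), a minimal normalization/restoration sketch could look like the following; OpenCV is used here purely as an assumed convenience:

```python
import cv2
import numpy as np

def normalize_for_model(img):
    """Normalize a target image to the model input format (size and value range from the example embodiment)."""
    resized = cv2.resize(img, (256, 256))                      # unify size
    return resized.astype(np.float32) / 127.5 - 1.0            # pixel values to [-1, 1]

def restore_from_model(output, original_shape):
    """Restore a de-glassed image to the original target image size and pixel range."""
    restored = ((output + 1.0) * 127.5).clip(0, 255).astype(np.uint8)    # back to [0, 255]
    return cv2.resize(restored, (original_shape[1], original_shape[0]))  # back to original size
```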
  • the following uses the target image as a global face image as an example to provide a face recognition method in a complete embodiment.
  • the method includes the steps of training a glasses removal model. As shown in Figure 13, the method includes:
  • the first training sample in the first training sample set and the second training sample in the second training sample set are global face images.
  • the second training sample may be a second training image obtained through each image acquisition channel, or may be obtained by copying the obtained second training image, and the first training sample may be obtained by adding glasses to the second training sample;
  • the first training sample and the second training sample may also be a large number of image samples collected by a face image acquisition device, for example, image samples acquired by devices such as a still camera or a video camera.
  • the first training sample set is input to a generation network model in the generation adversarial network to obtain a generation sample set with glasses removed.
  • the generative network model includes a plurality of sequentially connected convolution-shrink excitation networks.
  • the first training samples in the first training sample set are sequentially input to the generative network model in the generative adversarial network, the feature maps of each feature channel of the first training sample are obtained through the convolution layer of the convolution-shrink excitation network in the generative network model, and the feature maps are input to the shrink excitation layer in the convolution-shrink excitation network.
  • through the shrink excitation layer, the global information of each feature channel is obtained according to the feature maps, and the global information is learned to generate the weight of each feature channel.
  • the weighting layer of the convolution-shrink excitation network is then used to weight the feature map of each feature channel according to the corresponding weight, generating a weighted feature map corresponding to the first training sample.
  • the weighted feature map corresponding to the first training sample is further processed based on the generated network model, and finally generated samples corresponding to the first training sample are generated, and all the generated samples form a generated sample set.
  • the generated sample set and the second training sample set are respectively input to the discriminant network model in the generative adversarial network, and a first probability corresponding to the generated sample set and a second probability corresponding to the second training sample set are obtained.
  • each generated sample and its category label in the generated sample set, and each second training sample and its category label in the second training sample set, are sequentially input to the discriminant network model to obtain the first probability corresponding to the generated sample set and the second probability corresponding to the second training sample set.
  • S1304 Obtain a discriminant network loss coefficient according to the first probability, the second probability, and a discriminant network loss function.
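  • For reference, a reconstruction of the adversarial objectives consistent with the standard generative adversarial formulation (the discriminant objective referred to below as formula (4), together with its generator counterpart) is:

$$\max_D \; \mathbb{E}_{x \sim p_{data}(x)}\big[\log D(x)\big] + \mathbb{E}_{y \sim p_y(y)}\big[\log\big(1 - D(G(y))\big)\big]$$

$$\min_G \; \mathbb{E}_{y \sim p_y(y)}\big[\log\big(1 - D(G(y))\big)\big]$$

  • Here D is the discriminant network model, G is the generative network model, x is a second training sample (without glasses), and y is a first training sample (with glasses); D(x) and D(G(y)) correspond to the second and first probabilities described above.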
  • S1305 Update the parameters of the discriminant network model according to the discriminant network loss coefficient to obtain an updated discriminant network model.
  • the function shown in formula (4), which maximizes the discrimination capability of the discriminant network, is adopted as the discriminant network loss function; the discriminant network loss coefficient is calculated accordingly, and the parameters of the discriminant network model are updated using the Adam algorithm, so that the first probability output by the updated discriminant network model tends toward 0 and the second probability tends toward 1, giving the model accurate classification capability. A training-step sketch follows.
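  • A minimal PyTorch-style sketch of one discriminant-network update is given below (not from the original disclosure; the binary cross-entropy form and optimizer setup are assumptions consistent with the description that the first probability should tend to 0 and the second to 1):

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()

def discriminator_step(discriminator, d_optimizer, generated_batch, real_batch):
    """One discriminant-network update: push D(generated) toward 0 and D(real) toward 1.

    `discriminator` is assumed to output a probability in [0, 1]; `d_optimizer`
    is assumed to be torch.optim.Adam over the discriminator parameters.
    """
    d_optimizer.zero_grad()
    first_prob = discriminator(generated_batch.detach())     # first probability (generated samples)
    second_prob = discriminator(real_batch)                   # second probability (second training samples)
    d_loss = bce(first_prob, torch.zeros_like(first_prob)) + \
             bce(second_prob, torch.ones_like(second_prob))   # discriminant network loss coefficient
    d_loss.backward()
    d_optimizer.step()                                        # Adam update of discriminator parameters
    return d_loss.item()
```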
  • the generated sample set is input to the updated discriminant network model to obtain a third probability corresponding to the generated sample set.
  • each generated sample and its category identifier in the generated sample set are sequentially input to a discriminant network model to obtain a third probability corresponding to the generated sample set.
  • the generated sample set and the second training sample set are respectively input to a feature network model, and a feature error between the generated sample set and the second training sample set is obtained.
  • each generated sample in the generated sample set and its corresponding second training sample are sequentially input to the feature network model; the feature network model extracts the features of the generated sample and of the corresponding second training sample, which are compared and analyzed to obtain the feature error between each generated sample and its corresponding second training sample.
  • S1309 Analyze the pixels of the generated sample set and the second training sample set to obtain pixel errors between the generated sample set and the second training sample set.
  • error analysis is performed sequentially on the pixels of each generated sample in the generated sample set and its corresponding second training sample, to obtain the pixel error between each generated sample and its corresponding second training sample.
  • S1310 Update the parameters of the generated network model according to the generated network loss coefficient, feature error, and pixel error to obtain an updated generated network model.
  • according to the generated network loss coefficient, the feature error, and the pixel error, the Adam algorithm is used to adjust and update the parameters of the generative network model, obtaining the updated generative network model, as sketched below.
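  • A minimal sketch of one generative-network update combining the generated network loss coefficient (adversarial term), the feature error, and the pixel error follows; the L1 form of the error terms and the weights w_feat and w_pix are illustrative assumptions:

```python
import torch
import torch.nn as nn

bce, l1 = nn.BCELoss(), nn.L1Loss()

def generator_step(generator, discriminator, feature_net, g_optimizer,
                   first_samples, second_samples, w_feat=1.0, w_pix=10.0):
    """One generative-network update using adversarial, feature, and pixel errors.

    `feature_net` stands in for the feature network model; the loss weights are
    not values taken from the document.
    """
    g_optimizer.zero_grad()
    generated = generator(first_samples)                       # generated samples with glasses removed
    third_prob = discriminator(generated)                      # third probability from updated discriminator
    adv_loss = bce(third_prob, torch.ones_like(third_prob))    # generated network loss coefficient
    feat_loss = l1(feature_net(generated), feature_net(second_samples))   # feature error
    pix_loss = l1(generated, second_samples)                   # pixel error
    total = adv_loss + w_feat * feat_loss + w_pix * pix_loss
    total.backward()
    g_optimizer.step()                                         # Adam update of generator parameters
    return total.item()
```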
  • each time a round of generative adversarial network training is completed, the iteration count is incremented by one and the current iteration count is obtained; it is then determined whether the current iteration count reaches the iteration threshold. If not, the training steps are continued; otherwise, the updated generative network model is used as the glasses removal model and the training step is exited. A minimal loop sketch follows.
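  • Tying the two update steps together, a minimal iteration-control sketch (reusing the `discriminator_step` and `generator_step` functions sketched above, with an assumed iteration threshold and paired data loader) is:

```python
def train_glasses_removal(generator, discriminator, feature_net, loader,
                          g_optimizer, d_optimizer, max_iterations=100000):
    """Alternate discriminator and generator updates until the iteration threshold is reached.

    The loader is assumed to yield paired (with-glasses, without-glasses) batches.
    """
    iteration = 0
    while iteration < max_iterations:
        for first_samples, second_samples in loader:
            generated = generator(first_samples)
            discriminator_step(discriminator, d_optimizer, generated, second_samples)
            generator_step(generator, discriminator, feature_net, g_optimizer,
                           first_samples, second_samples)
            iteration += 1                                    # add one to the iteration count
            if iteration >= max_iterations:
                return generator                              # updated generator is the glasses removal model
    return generator
```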
  • S1314 Perform glasses recognition detection on the face image to be recognized.
  • S1315 When it is detected that the face in the face image to be recognized is wearing glasses, a target image is obtained; otherwise, step S1322 is performed directly.
  • glasses recognition detection is first performed on the face image to be recognized, and it is determined whether the face in the face image to be recognized is wearing glasses.
  • if the face in the face image to be recognized is wearing glasses, a target image is obtained and input to the glasses removal model for glasses removal processing, after which the result is input to the face recognition model for recognition; if the face in the face image to be recognized is not wearing glasses, the face image is directly input to the face recognition model for recognition.
  • the target image is input to the glasses removal model trained based on a generative adversarial network; the glasses removal model includes a plurality of sequentially connected convolution-shrink excitation networks.
  • a feature map of each feature channel of the target image is obtained through a convolution layer of the convolution-shrink excitation network.
  • the feature maps of each feature channel are weighted according to the weights to generate a weighted feature map.
  • specifically, the input target image is subjected to convolution processing through the convolution layer of the convolution-shrink excitation network to obtain the feature map of each feature channel of the target image; the feature maps are compressed by the shrink layer in the shrink excitation layer to obtain the global information of each feature channel; the global information is learned by the excitation layer in the shrink excitation layer to generate the weight of each feature channel; and the weighting layer multiplies the feature map of each feature channel by the corresponding weight to generate a weighted feature map, which is then input to the next network layer for processing.
  • S1320 Obtain, through the glasses removal model, the de-spectacled face image corresponding to the target image according to the weighted feature map.
  • the glasses-free face image corresponding to the target image is generated according to the weighted feature map.
  • S1321 Match the de-spectacled face image with a preset face image database, and generate a face recognition result according to the matching result.
  • S1322 Match the face image to be recognized with a preset face image database, and generate a face recognition result according to the matching result.
  • the traditional face recognition model is used to match the de-spectacled face image or the face image to be recognized with a preset face image database to obtain a matching result, and a face recognition result is generated based on the matching result. For example, when a face image in the preset face image database is matched, a face recognition result of recognition success is generated; or, when a face image in the preset face image database is matched, related information of the matched face image is obtained and a face recognition result is generated based on the related information. When no face image in the preset face image database is matched, a face recognition result of recognition failure is generated.
  • through the glasses removal model, the glasses in the face image to be recognized are removed, so there is no need to manually take off the glasses before performing face image collection and face recognition, which improves the efficiency of face recognition and avoids recognition failures caused by glasses interference.
  • the glasses removal model composed of multiple convolution-shrink excitation networks can enhance the effective features of the target image and suppress features that are ineffective or have little effect, effectively removing the glasses in the target image while ensuring that the de-glassed image recovers the key features of the target image, which improves the fidelity and authenticity of the de-glassed image and further ensures the accuracy of the face recognition results.
  • FIG. 13 is a schematic flowchart of a face recognition method according to an embodiment. It should be understood that although the steps in the flowchart of FIG. 13 are displayed sequentially in the direction of the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated in this document, the execution order of these steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in FIG. 13 may include multiple sub-steps or stages, which are not necessarily executed at the same time and may be executed at different times; the execution order of these sub-steps or stages is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
  • an image processing apparatus includes an image acquisition module 1401, an input module 1402, a convolution module 1403, a weight learning module 1404, a weighting module 1405, and a generation module 1406.
  • the image acquisition module 1401 is configured to acquire a target image, and an object in the target image is wearing glasses.
  • the target image refers to an image that carries glasses-wearing information and requires glasses removal processing; that is, the object in the target image is wearing glasses and glasses removal processing is required.
  • when the object is a face, the target image may be a face image wearing glasses; when the object is an eye, the target image may be an eye image segmented from a face image wearing glasses.
  • for example, when image processing software is used to perform glasses removal processing, the acquired target image is a face image input to the image processing software or a segmented eye image.
  • an input module 1402 is configured to input the target image to a glasses removal model trained based on a generative adversarial network; the glasses removal model includes a plurality of sequentially connected convolution-shrink excitation networks.
  • a convolution module 1403 is configured to obtain a feature map of each feature channel of a target image through a convolution layer of a convolution-shrink excitation network.
  • the convolution module 1403 is configured to perform convolution processing on the input target image through a convolution layer of the convolution-shrink excitation network to obtain a feature map of each feature channel of the target image, and The feature map is input to a shrinking excitation layer in the convolution-shrinking excitation network.
  • a weight learning module 1404 is configured to obtain the global information of each feature channel according to the feature map through the shrinking excitation layer in the convolution-shrink excitation network, and learn the global information to generate the weight of each feature channel.
  • the feature maps are compressed by the shrink layer in the shrink excitation layer to obtain the global information of each feature channel; the global information is learned by the excitation layer in the shrink excitation layer to generate the weight of each feature channel.
  • a weighting module 1405 is configured to perform weighting processing on the feature maps of each feature channel according to the weights through a weighting layer of the convolution-shrink excitation network to generate a weighted feature map.
  • the weighting module 1405 uses the weighting layer to multiply the feature maps of each feature channel by corresponding weights to generate weighted feature maps, and the weighted feature maps are continuously input to the next layer of network for processing.
  • a generation module 1406 is configured to generate, through the glasses removal model, the de-glassed image corresponding to the target image according to the weighted feature map.
  • the glasses removal model is a trained model with a glasses removal effect. After processing by the multiple convolution-shrink excitation networks and other network layers in the glasses removal model, the de-glassed image corresponding to the target image is generated based on the weighted feature map.
  • the above image processing apparatus obtains a target image and inputs it to a pre-trained glasses removal model. Since the glasses removal model includes a plurality of sequentially connected convolution-shrink excitation networks, the feature map of each feature channel of the target image can be obtained through the convolution layer of the convolution-shrink excitation network; the global information of each feature channel is obtained from the feature maps through the shrink excitation layer, and the global information is learned to generate the weight of each feature channel; the feature maps of each feature channel are then weighted according to the weights through the weighting layer of the convolution-shrink excitation network to generate a weighted feature map; finally, the de-glassed image corresponding to the target image is obtained from the weighted feature map through the glasses removal model. In this way, the glasses removal model maintains a high learning ability and can fully learn the importance of different feature channels to obtain the corresponding weights; the weighting process enhances effective features while suppressing features that are ineffective or have little effect, effectively removes the glasses in the target image, and at the same time ensures that the de-glassed image recovers the key features of the target image, improving the fidelity and authenticity of the de-glassed image.
  • the image processing apparatus further includes an image fusion module.
  • the image acquisition module 1401 is further configured to obtain a face image in which the face is wearing glasses, and to segment the eye image according to the position of the eyes in the face image to obtain the target image.
  • An image fusion module is used to fuse a face image and a de-glasses image to obtain a face image after removing the glasses.
  • the image acquisition module 1401 determines the position of the eyes in the face image by performing target detection on the face image, segments the eye image based on the determined position, and uses the segmented eye image as the target image. After the de-glassed image corresponding to the target image is generated by the glasses removal model, the image fusion module fuses the face image and the de-glassed image by replacing the eye image at the determined position with the de-glassed image, obtaining the complete face image after the glasses are removed.
  • the image processing apparatus further includes a model training module.
  • the model training module further includes a sample acquisition module, a sample generation module, a network loss coefficient generation module, and an update iteration module, wherein:
  • a sample acquisition module configured to acquire a first training sample set composed of first training images and a second training sample set composed of second training images; the object in the first training image is wearing glasses, and the object in the second training image is not wearing glasses.
  • the first training sample in the first training sample set and the second training sample in the second training sample set are global face images.
  • the second training sample may be a second training image obtained through each image acquisition channel, or may be obtained by copying the obtained second training image, and the first training sample may be obtained by adding glasses to the second training sample;
  • the first training sample and the second training sample may also be a large number of image samples collected by a face image acquisition device, for example, image samples acquired by devices such as a still camera or a video camera.
  • the sample generation module is configured to input the first training sample set to the generative network model in the generative adversarial network, to obtain a generated sample set with glasses removed.
  • the generative network model includes a plurality of sequentially connected convolution-shrink excitation networks.
  • the first training samples in the first training sample set are sequentially input to the generative network model in the generative adversarial network, the feature maps of each feature channel of the first training sample are obtained through the convolution layer of the convolution-shrink excitation network in the generative network model, and the feature maps are input to the shrink excitation layer in the convolution-shrink excitation network.
  • through the shrink excitation layer, the global information of each feature channel is obtained according to the feature maps, and the global information is learned to generate the weight of each feature channel.
  • the weighting layer of the convolution-shrink excitation network is then used to weight the feature map of each feature channel according to the corresponding weight, generating a weighted feature map corresponding to the first training sample.
  • the weighted feature map corresponding to the first training sample is further processed based on the generated network model, and finally generated samples corresponding to the first training sample are generated, and all the generated samples form a generated sample set.
  • the generating network loss coefficient generating module is configured to input the generated sample set and the second training sample set to the discriminant network model in the generation adversarial network, and obtain the generated network loss coefficient according to the output of the discriminative network model.
  • the generated sample set and the second training sample set are respectively input to the discriminant network model in the generative adversarial network, and the parameters of the discriminant network model are adjusted according to its output to obtain an updated discriminant network model; the generated sample set is then input to the updated discriminant network model, and the generated network loss coefficient is obtained according to the output of the updated discriminant network model, so that the parameters of the generative network model can be adjusted according to the generated network loss coefficient.
  • the update iteration module is used to update the parameters of the generative network model according to the generated network loss coefficient, obtain the updated generative network model, and return to the sample generation module until the iteration end condition is met, at which point the updated generative network model is used as the glasses removal model.
  • the parameters of the generative network model are adjusted according to the generated network loss coefficient and the preset parameter adjustment method of the generative network model to obtain the updated generative network model. It is determined whether the preset iteration end condition is met; if it is met, the iterative training is ended and the updated generative network model is used as the glasses removal model; if not, the sample generation module is triggered to continue performing the related operations.
  • the update iteration module is further configured to update the parameters of the generative network model according to the generated network loss coefficient to obtain the updated generative network model, and to obtain the current iteration count; when the iteration count is less than a preset iteration threshold, the sample generation module is triggered to continue performing the related operations; when the iteration count reaches the preset iteration threshold, the updated generative network model is used as the glasses removal model.
  • the generated network loss coefficient generation module includes a discriminant network loss coefficient generation module, a discriminant network update module, and a generated network loss coefficient determination module, wherein:
  • the discriminative network loss coefficient generating module is configured to input the generated sample set and the second training sample set to the discriminative network model in the generation adversarial network, and obtain the discriminant network loss coefficient according to the output of the discriminative network model.
  • the discriminant network loss coefficient generating module is configured to input the generated sample set and the second training sample set to the discriminant network model, respectively, to obtain a first probability corresponding to the generated sample set and a second probability corresponding to the second training sample set; According to the first probability, the second probability, and the discriminant network loss function, a discriminative network loss coefficient is obtained.
  • the discriminant network update module is used to update the parameters of the discriminant network model according to the discriminant network loss coefficient to obtain an updated discriminant network model.
  • the discriminant network update module adjusts the parameters of the discriminant network model according to the discriminative network loss coefficient and the preset method of discriminant network parameter adjustment to obtain an updated discriminant network model.
  • the method of adjusting the parameters of the discriminative network model includes, but is not limited to, gradient correction methods, back-propagation algorithms, and other error correction algorithms.
  • for example, the Adam algorithm, which optimizes a stochastic objective function based on first-order gradients, may be used.
  • the generating network loss coefficient determining module is configured to input the generated sample set to the updated discrimination network model, and obtain the generated network loss coefficient according to the output of the updated discrimination network model.
  • the generating network loss coefficient determination module is configured to input the generated sample set to the updated discriminant network model to obtain a third probability corresponding to the generated sample set; and obtain the generated network according to the third probability and the generated network loss function. Loss factor.
  • the image processing apparatus further includes a feature error generating module, configured to input the generated sample set and the second training sample set to the feature network model to obtain the feature error between the generated sample set and the second training sample set.
  • the update iteration module is further configured to update the parameters of the generated network model according to the generated network loss coefficient and the characteristic error to obtain the updated generated network model.
  • by analyzing the feature error between the generated samples and their corresponding second training samples, the de-glassed image restored by the final glasses removal model better preserves discriminative information; that is, the key features of the target image are restored more accurately, which improves the fidelity of the de-glassed image and, in face recognition applications, ensures the accuracy of face recognition.
  • the image processing apparatus further includes a pixel error generating module, configured to analyze pixels of the generated sample set and the second training sample set to obtain a pixel error between the generated sample set and the second training sample set.
  • the update iteration module is further configured to update the parameters of the generated network model according to the generated network loss coefficient and the pixel error to obtain the updated generated network model.
  • the above image processing apparatus uses the glasses removal model to fully learn the importance of different feature channels and obtain the corresponding weights; the weighting process enhances effective features while suppressing features that are ineffective or have little effect, effectively removes the glasses in the target image, and ensures that the de-glassed image recovers the key features of the target image, improving the fidelity and authenticity of the de-glassed image.
  • a face recognition device includes a target image acquisition module 1501, a target image input module 1502, a feature convolution module 1503, a feature weight learning module 1504, a feature weighting module 1505, a face image generation module 1506, and a matching module 1507, wherein:
  • a target image acquisition module 1501 is configured to acquire a target image from a face image to be recognized, where the face in the target image is wearing glasses.
  • the face image to be identified refers to a global face image that currently needs to be identified.
  • the target image acquisition module 1501 acquires a target image or a selected target image obtained through glasses recognition detection in a face image to be recognized, so as to input the target image to a glasses removal model for glasses removal processing.
  • a target image input module 1502 is configured to input the target image to a glasses removal model trained based on a generative adversarial network; the glasses removal model includes a plurality of sequentially connected convolution-shrink excitation networks.
  • a feature convolution module 1503 is configured to obtain a feature map of each feature channel of a target image through a convolution layer of a convolution-shrink excitation network.
  • the feature convolution module 1503 is configured to perform convolution processing on the input target image through a convolution layer of a convolution-shrink excitation network to obtain a feature map of each feature channel of the target image, and The feature map is input to a shrinking excitation layer in the convolution-shrinking excitation network.
  • a feature weight learning module 1504 is configured to obtain the global information of each feature channel according to the feature map through the shrinking excitation layer in the convolution-shrink excitation network, and learn the global information to generate the weight of each feature channel.
  • the feature maps are compressed by the shrink layer in the shrink excitation layer to obtain the global information of each feature channel; the global information is learned by the excitation layer in the shrink excitation layer to generate the weight of each feature channel.
  • a feature weighting module 1505 is configured to weight the feature maps of each feature channel by using a weighting layer of the convolution-shrink excitation network to generate weighted feature maps.
  • the feature weighting module 1505 uses the weighting layer to multiply the feature map of each feature channel by the corresponding weight, to generate a weighted feature map, and the weighted feature map is further input to the next layer of network for processing.
  • a face image generation module 1506 is configured to obtain, through the glasses removal model, the de-spectacled face image corresponding to the target image according to the weighted feature map.
  • the de-spectacled face image refers to a global face image that corresponds to the target image and has the glasses removed. When the target image is a global face image, the de-spectacled face image is the target image with the glasses removed; when the target image is a local eye image, the de-spectacled face image is obtained by fusing the de-glassed eye image back into the face image to be recognized.
  • the matching module 1507 is configured to match the de-spectacled face image with a preset face image database, and generate a face recognition result according to the matching result.
  • the matching module 1507 matches the de-glasses face image with a preset face image database through a traditional face recognition model to obtain a matching result, and generates a face recognition result based on the matching result.
  • the above face recognition device removes the glasses in the face image to be recognized through the glasses removal model, and there is no need to manually take off the glasses before performing face image collection and face recognition, which improves face recognition efficiency and avoids recognition failures caused by glasses interference.
  • the glasses removal model composed of multiple convolution-shrink excitation networks can enhance the effective features of the target image and suppress features that are ineffective or have little effect, effectively removing the glasses in the target image while ensuring that the de-glassed image recovers the key features of the target image, which improves the fidelity and authenticity of the de-glassed image and further ensures the accuracy of the face recognition results.
  • the target image acquisition module 1501 includes a face image acquisition module, a glasses detection module, and a target image determination module.
  • a face image acquisition module is used to obtain a face image to be identified
  • a glasses detection module is used to perform eyeglass recognition detection on the face image to be identified
  • a target image determination module is used to obtain a target image with glasses according to the result of the glasses recognition detection.
  • the target image determination module includes an eye segmentation module.
  • when it is detected that the face in the face image to be recognized is wearing glasses, the eye segmentation module is configured to segment the eye image according to the position of the eyes in the face image to be recognized, to obtain a target image with glasses.
  • the face image generation module 1506 is further configured to generate, through the glasses removal model, the de-glassed image corresponding to the target image, and to fuse the face image to be recognized with the de-glassed image to obtain the de-spectacled face image.
  • the face recognition device further includes a model training module.
  • the model training module further includes a sample acquisition module, a sample generation module, a network loss coefficient generation module, and an update iteration module; for details, refer to the description of the embodiment shown in FIG. 14, which is not repeated here.
  • the network loss coefficient generation module includes a discriminant network loss coefficient generation module, a discriminant network update module, and a generated network loss coefficient determination module, which are described in detail in the embodiment shown in FIG. 14 and are not repeated here.
  • the face recognition device further includes at least one of a feature error generation module and a pixel error generation module, which are described in detail in the embodiment shown in FIG. 14 and will not be repeated here.
  • the above face recognition device removes the glasses in the face image to be recognized through the glasses removal model, and there is no need to manually take off the glasses before performing face image collection and face recognition, which improves the efficiency of face recognition and avoids recognition failures caused by glasses interference.
  • FIG. 16 shows an internal structure diagram of a computer device in one embodiment.
  • the computer device may be the terminal or server 110 in FIG. 1.
  • the computer device includes a processor, a memory, and a network interface connected through a system bus.
  • the memory includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium of the computer device stores an operating system and a computer program. When the computer program is executed by the processor, the processor is caused to implement the image processing method and/or the face recognition method. A computer program may also be stored in the internal memory; when this computer program is executed by the processor, the processor is caused to perform the image processing method and/or the face recognition method.
  • FIG. 16 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer device to which the solution of the present application is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • the image processing apparatus and the face recognition apparatus provided in this application may be implemented in the form of a computer program, and the computer program may be run on a computer device as shown in FIG. 16.
  • the memory of the computer device may store various program modules constituting the image processing apparatus and / or the face recognition apparatus, for example, an image acquisition module 1401, an input module 1402, a convolution module 1403, a weight learning module 1404, shown in FIG. 14, Weighting module 1405 and generation module 1406.
  • the computer program constituted by each program module causes the processor to execute the steps in the image processing method of each embodiment of the present application described in this specification.
  • a computer device which includes a memory and a processor.
  • a computer program is stored in the memory, and the processor executes the computer program to implement the image processing method described in the foregoing embodiment.
  • the face recognition method described in the above embodiment is also implemented.
  • a computer-readable storage medium on which a computer program is stored.
  • the computer program is executed by a processor, the image processing method described in the above embodiment is implemented.
  • the face recognition method described in the above embodiment is also implemented.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), dual data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous chain (Synchlink) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present application relates to an image processing method, a face recognition method, an apparatus, a computer device, and a readable storage medium. The image processing method includes: obtaining a target image in which the object is wearing glasses; inputting the target image into a glasses removal model, the glasses removal model including a plurality of sequentially connected convolution-shrink excitation networks; obtaining, through the convolution layer of the convolution-shrink excitation network, the feature map of each feature channel of the target image; obtaining, through the shrink excitation layer in the convolution-shrink excitation network, the global information of each feature channel according to the feature maps, and learning the global information to generate the weight of each feature channel; weighting the feature maps of each feature channel according to the weights through the weighting layer of the convolution-shrink excitation network to generate a weighted feature map; and generating, through the glasses removal model, the de-glassed image corresponding to the target image according to the weighted feature map. This method can effectively remove glasses from an image.

Description

图像处理方法、人脸识别方法、装置和计算机设备
本申请要求于2018年06月11日提交的申请号为201810594760.8、发明名称为“图像处理方法、人脸识别方法、装置和计算机设备”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及图像处理技术领域,特别是涉及一种图像处理方法、人脸识别方法、装置、计算机设备和可读存储介质。
背景技术
随着图像处理技术的应用范围越来越广,利用图像处理技术去除图像中的无用信息,得到所需的图像也成为当前图像处理的研究热点。比如,在公共交通的安全验证系统、信用卡验证系统等身份验证系统中,当被验证人员佩戴有眼镜时,通常需要对采集的人脸图像去除眼镜之后再进行验证。
然而,传统的眼镜去除模型的网络学习能力较低,难以保证去除眼镜后的人脸图像有效表征原始图像的相关特征,进而使得对去除眼镜后的人脸图像还原度低。
发明内容
基于此,提供了一种图像处理方法、人脸识别方法、装置、计算机设备和可读存储介质,可以解决基于传统的眼镜去除模型的还原度低的技术问题。
一方面,提供了一种图像处理方法,所述方法包括:
获取目标图像,所述目标图像中的对象佩戴有眼镜;
将所述目标图像输入至基于生成对抗网络训练的眼镜去除模型;所述眼镜去除模型包括多个依次连接的卷积-收缩激励网络;
通过所述卷积-收缩激励网络的卷积层,得到所述目标图像的各特征通道的特征图;
通过所述卷积-收缩激励网络中的收缩激励层,根据所述特征图得到各特征通道的全局信息,对所述全局信息进行学习,生成所述各特征通道的权重;
通过所述卷积-收缩激励网络的加权层,根据所述权重分别对所述各特征通道的特征图进行加权处理,生成加权特征图;
通过所述眼镜去除模型,根据所述加权特征图生成与所述目标图像对应的去眼镜图像。
一方面,提供了一种人脸识别方法,所述方法包括:
获取待识别人脸图像中的目标图像,所述目标图像中的人脸佩戴有眼镜;
将所述目标图像输入至基于生成对抗网络训练的眼镜去除模型;所述眼镜去除模型包括多个依次连接的卷积-收缩激励网络;
通过所述卷积-收缩激励网络的卷积层,得到所述目标图像的各特征通道的特征图;
通过所述卷积-收缩激励网络中的收缩激励层,根据所述特征图得到各特征通道的全局信息,对所述全局信息进行学习,生成所述各特征通道的权重;
通过所述卷积-收缩激励网络的加权层,根据所述权重分别对所述各特征通道的特征图进 行加权处理,生成加权特征图;
通过所述眼镜去除模型,根据所述加权特征图得到与所述目标图像对应的去眼镜人脸图像;
将所述去眼镜人脸图像与预设人脸图像库进行匹配,根据匹配结果生成人脸识别结果。
一方面,提供了一种图像处理装置,所述装置包括:
图像获取模块,用于获取目标图像,所述目标图像中的对象佩戴有眼镜;
输入模块,用于将所述目标图像输入至基于生成对抗网络训练的眼镜去除模型;所述眼镜去除模型包括多个依次连接的卷积-收缩激励网络;
卷积模块,用于通过所述卷积-收缩激励网络的卷积层,得到所述目标图像的各特征通道的特征图;
权重学习模块,用于通过所述卷积-收缩激励网络中的收缩激励层,根据所述特征图得到各特征通道的全局信息,对所述全局信息进行学习,生成所述各特征通道的权重;
加权模块,用于通过所述卷积-收缩激励网络的加权层,根据所述权重分别对所述各特征通道的特征图进行加权处理,生成加权特征图;
生成模块,用于通过所述眼镜去除模型,根据所述加权特征图生成与所述目标图像对应的去眼镜图像。
一方面,提供了一种人脸识别装置,所述装置包括:
目标图像获取模块,用于获取待识别人脸图像中的目标图像,所述目标图像中的人脸佩戴有眼镜;
目标图像输入模块,用于将所述目标图像输入至基于生成对抗网络训练的眼镜去除模型;所述眼镜去除模型包括多个依次连接的卷积-收缩激励网络;
特征卷积模块,用于通过所述卷积-收缩激励网络的卷积层,得到所述目标图像的各特征通道的特征图;
特征权重学习模块,用于通过所述卷积-收缩激励网络中的收缩激励层,根据所述特征图得到各特征通道的全局信息,对所述全局信息进行学习,生成所述各特征通道的权重;
特征加权模块,用于通过所述卷积-收缩激励网络的加权层,根据所述权重分别对所述各特征通道的特征图进行加权处理,生成加权特征图;
人脸图像生成模块,用于通过所述眼镜去除模型,根据所述加权特征图得到与所述目标图像对应的去眼镜人脸图像;
匹配模块,用于将所述去眼镜人脸图像与预设人脸图像库进行匹配,根据匹配结果生成人脸识别结果。
一方面,提供了一种计算机设备,包括存储器和处理器,所述存储器存储有计算机程序,所述处理器执行所述计算机程序时实现如下步骤:
获取目标图像,所述目标图像中的对象佩戴有眼镜;
将所述目标图像输入至基于生成对抗网络训练的眼镜去除模型;所述眼镜去除模型包括多个依次连接的卷积-收缩激励网络;
通过所述卷积-收缩激励网络的卷积层,得到所述目标图像的各特征通道的特征图;
通过所述卷积-收缩激励网络中的收缩激励层,根据所述特征图得到各特征通道的全局信息,对所述全局信息进行学习,生成所述各特征通道的权重;
通过所述卷积-收缩激励网络的加权层,根据所述权重分别对所述各特征通道的特征图进 行加权处理,生成加权特征图;
通过所述眼镜去除模型,根据所述加权特征图生成与所述目标图像对应的去眼镜图像。
一方面,提供了一种计算机可读存储介质,其上存储有计算机程序,所述计算机程序被处理器执行时实现如下步骤:
获取目标图像,所述目标图像中的对象佩戴有眼镜;
将所述目标图像输入至基于生成对抗网络训练的眼镜去除模型;所述眼镜去除模型包括多个依次连接的卷积-收缩激励网络;
通过所述卷积-收缩激励网络的卷积层,得到所述目标图像的各特征通道的特征图;
通过所述卷积-收缩激励网络中的收缩激励层,根据所述特征图得到各特征通道的全局信息,对所述全局信息进行学习,生成所述各特征通道的权重;
通过所述卷积-收缩激励网络的加权层,根据所述权重分别对所述各特征通道的特征图进行加权处理,生成加权特征图;
通过所述眼镜去除模型,根据所述加权特征图生成与所述目标图像对应的去眼镜图像。
一方面,提供了一种计算机设备,包括存储器和处理器,所述存储器存储有计算机程序,所述处理器执行所述计算机程序时实现如下步骤:
获取待识别人脸图像中的目标图像,所述目标图像中的人脸佩戴有眼镜;
将所述目标图像输入至基于生成对抗网络训练的眼镜去除模型;所述眼镜去除模型包括多个依次连接的卷积-收缩激励网络;
通过所述卷积-收缩激励网络的卷积层,得到所述目标图像的各特征通道的特征图;
通过所述卷积-收缩激励网络中的收缩激励层,根据所述特征图得到各特征通道的全局信息,对所述全局信息进行学习,生成所述各特征通道的权重;
通过所述卷积-收缩激励网络的加权层,根据所述权重分别对所述各特征通道的特征图进行加权处理,生成加权特征图;
通过所述眼镜去除模型,根据所述加权特征图得到与所述目标图像对应的去眼镜人脸图像;
将所述去眼镜人脸图像与预设人脸图像库进行匹配,根据匹配结果生成人脸识别结果。
一方面,提供了一种计算机可读存储介质,其上存储有计算机程序,所述计算机程序被处理器执行时实现如下步骤:
获取待识别人脸图像中的目标图像,所述目标图像中的人脸佩戴有眼镜;
将所述目标图像输入至基于生成对抗网络训练的眼镜去除模型;所述眼镜去除模型包括多个依次连接的卷积-收缩激励网络;
通过所述卷积-收缩激励网络的卷积层,得到所述目标图像的各特征通道的特征图;
通过所述卷积-收缩激励网络中的收缩激励层,根据所述特征图得到各特征通道的全局信息,对所述全局信息进行学习,生成所述各特征通道的权重;
通过所述卷积-收缩激励网络的加权层,根据所述权重分别对所述各特征通道的特征图进行加权处理,生成加权特征图;
通过所述眼镜去除模型,根据所述加权特征图得到与所述目标图像对应的去眼镜人脸图像;
将所述去眼镜人脸图像与预设人脸图像库进行匹配,根据匹配结果生成人脸识别结果。
上述图像处理方法、人脸识别方法、装置、计算机设备和可读存储介质,通过获取目标图像,将该目标图像输入至预先训练得到的眼镜去除模型,由于该眼镜去除模型包括多个依次连接的卷积-收缩激励网络,所以,可以通过卷积-收缩激励网络的卷积层来得到目标图像的各特征通道的特征图,再通过卷积-收缩激励网络中的收缩激励层根据特征图得到各特征通道的全局信息,对全局信息进行学习以生成各特征通道的权重,再通过卷积-收缩激励网络中的加权层,根据权重分别对各特征通道的特征图进行加权处理,生成加权特征图,最后,通过眼镜去除模型根据该加权特征图得到对应的去眼镜图像。这样,该眼镜去除模型能够保持较高的学习能力,从而能够充分学习不同特征通道的重要性得到对应权重,通过加权处理来在增强有效特征的同时抑制无效或效果小的特征,有效去除目标图像中的眼镜,同时确保去眼镜图像能够恢复出目标图像的关键特征,提高去眼镜图像的还原度和真实性。
附图说明
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为一个实施例中图像处理方法和/或人脸识别方法的应用环境图;
图2为一个实施例中图像处理方法的流程示意图;
图3为一个实施例中卷积-收缩激励网络的结构示意图;
图4为一个实施例中卷积-收缩激励网络中收缩激励处理和加权处理的示意图;
图5为一个实施例中对特征图进行收缩处理的示意图;
图6为一个实施例中图像处理方法的流程示意图;
图7为一个实施例中眼镜去除模型训练方法的流程示意图;
图8为一个实施例中生成网络损失系数生成步骤的流程示意图;
图9为一个实施例中眼镜去除模型训练方法中网络模型的结构示意图;
图10为一个实施例中更新生成网络模型并迭代的步骤的流程示意图;
图11为一个实施例中人脸识别方法的流程示意图;
图12为一个实施例中通过眼镜识别检测得到目标图像的步骤的流程示意图;
图13为一个实施例中人脸识别方法的流程示意图;
图14为一个实施例中图像处理装置的结构框图;
图15为一个实施例中人脸识别装置的结构框图;
图16为一个实施例中计算机设备的结构框图。
具体实施方式
为了使本申请的目的、技术方案及优点更加清楚明白,以下结合附图及实施例,对本申请进行进一步详细说明。应当理解,此处所描述的具体实施例仅仅用以解释本申请,并不用于限定本申请。
图1为一个实施例中图像处理方法和/或人脸识别方法的应用环境图。参照图1,该图像处理方法应用于图像处理系统。该图像处理系统包括终端或服务器110。终端可以是台式终端或移动终端,该移动终端可以是手机、平板电脑、笔记本电脑等中的至少一种;服务器可 以是一个服务器或者服务器集群。终端或服务器110配置有基于生成对抗网络训练的眼镜去除模型,该眼镜去除模型包括多个依次连接的卷积-收缩激励网络,通过该眼镜去除模型中的卷积-收缩激励网络能够对目标图像进行眼镜去除处理。进一步地,终端或服务器110还能够进行人脸识别,即在基于眼镜去除模型得到去眼镜人脸图像后,将该去眼镜人脸图像与预设人脸图像库进行匹配,根据匹配结果生成人脸识别结果。
如图2所示,在一个实施例中,提供了一种图像处理方法。本实施例主要以该方法应用于上述图1中的终端或服务器110来举例说明。参照图2,该图像处理方法包括如下步骤:
S201,获取目标图像,该目标图像中的对象佩戴有眼镜。
其中,目标图像是指携带有眼镜佩戴信息,需要进行眼镜去除处理的图像。即,目标对象中的对象佩戴有眼镜,且需要进行眼镜去除处理。当对象为人脸时,目标图像可以是佩戴有眼镜的人脸图像;当对象为眼部时,目标图像可以是从佩戴有眼镜的人脸图像中分割出的眼部图像。比如,当采用图像处理软件进行眼镜去除处理时,所获取的目标图像为输入至图像处理软件的人脸图像或分割出的眼部图像。
S202,将目标图像输入至基于生成对抗网络训练的眼镜去除模型;该眼镜去除模型包括多个依次连接的卷积-收缩激励网络。
在本实施例中,眼镜去除模型预先基于生成对抗网络训练得到,该眼镜去除模型可以是针对全局的人脸图像去除眼镜的模型,也可以是针对局部的眼部图像去除眼镜的模型。可以理解,当眼镜去除模型为针对全局的人脸图像去除眼镜的模型时,目标图像为全局的人脸图像;当眼镜去除模型为针对局部的眼部图像去除眼镜的模型时,目标图像为局部的眼部图像。
其中,生成对抗网络包括生成网络模型和判别网络模型,生成网络模型用于根据输入数据生成一张尽可能真实的假图片,判别网络模型用于判别出输入的一张图片属于真实图片还是假图片。生成对抗网络训练是指由生成网络模型生成一张图片去欺骗判别网络模型,然后判别网络模型去判断这张图片以及对应的真实图片是真是假,在这两个模型训练的过程中,使得两个模型的能力越来越强,最终达到稳态的过程。卷积-收缩激励网络是指由卷积神经网络的卷积层、收缩激励层以及加权层构成的一种结构。收缩激励层包括收缩模块和激励模块,该收缩模块用于对各特征通道的特征图进行处理以得到各特征通道的全局信息,该激励模块用于对该全局信息进行学习以生成各特征通道的权重。
如图3所示,图3提供了一种在残差网络中引入收缩激励层得到的卷积-收缩激励网络,残差层302分别与收缩层304和加权层308连接,收缩层304还与激励层306连接,激励层306还与加权层308连接。
继续参照图2,卷积-收缩激励网络用于执行以下步骤:
S203,通过卷积-收缩激励网络的卷积层,得到目标图像的各特征通道的特征图。
在一种可能的实现方式中,通过卷积-收缩激励网络的卷积层,对输入的目标图像进行卷积处理,得到目标图像的各特征通道的特征图,并将该特征图输入至该卷积-收缩激励网络中的收缩激励层。
在每个卷积层中,数据均以三维形式存在,把它看成由多个特征通道的二维图片叠在一起组成,其中每一张二维图片即称为一张特征图。如图4所示,目标图像通过卷积变换后,得到一个大小为W×H×C的三维矩阵U,也可称之为C个大小为W×H的特征图,C表示特征通道数量。
S204,通过卷积-收缩激励网络中的收缩激励层,根据特征图得到各特征通道的全局信 息,对该全局信息进行学习,生成各特征通道的权重。
其中,全局信息是指各特征通道的特征图的数值分布情况。在一种可能的实现方式中,通过收缩层304对特征图进行压缩,得到各特征通道的全局信息。如图5所示,图5中示出的是一个6×6大小的特征图对应的二维矩阵,通过压缩处理,得到一个表示全局信息的1×1大小的特征图。计算方式如公式(1)所示:
$$z_c = F_{sq}(u_c) = \frac{1}{W \times H}\sum_{i=1}^{W}\sum_{j=1}^{H} u_c(i, j) \qquad (1)$$
其中,$z_c$表示第C个特征通道的全局信息;$F_{sq}$表示全局信息求取函数;$u_c$表示矩阵U中第C个特征通道对应的二维矩阵(特征图);i表示W×H二维矩阵中的行标号;j表示W×H二维矩阵中的列标号;$u_c(i,j)$表示第C个特征通道对应的二维矩阵中第i行第j列的数值。
在本实施例中,全局信息的求取实际上就是对每张特征图的特征数值求算术平均,将每个二维矩阵转变为一个实数,使得一个通道特征图中的整个图中位置的信息相融合,避免对通道进行权值评估时,由于卷积核尺寸问题造成的局部感受野提取信息范围太小,参考信息量不足,使得评估不准确的问题。
在收缩层304得到全局信息后,将该全局信息输入至激励层306,通过激励层306对该全局信息进行学习,生成各特征通道的权重。其中,权重用于表示各特征通道的重要性。其中,权重的计算方法如公式(2)所示:
$$s = F_{ex}(z, W_1, W_2) = \sigma(W_2\,\delta(W_1 z)) \qquad (2)$$
其中,s表示C个特征通道的权重,维度为1×1×C;z表示由C个$z_c$组成的全局信息矩阵,维度为1×1×C;$F_{ex}$表示权重求取函数;σ表示sigmoid函数;δ表示线性激活函数;$W_1$表示降维层参数,降维比例为r,$W_1 \in \mathbb{R}^{\frac{C}{r} \times C}$;$W_2$表示升维层参数,$W_2 \in \mathbb{R}^{C \times \frac{C}{r}}$。
收缩层304对特征图进行压缩得到z,参照公式(2),这里先用W 1乘以z,进行一个全连接层操作,其中,W 1的维度是(C/r)×C,r是一个缩放参数,这个参数的目的是为了减少特征通道的个数从而降低计算量。又因为z的维度是1×1×C,所以W 1z的维度就是1×1×C/r。然后再经过一个线性激活层,输出的维度不变。线性激活层的输出再和W 2相乘,进行一个全连接层操作,W 2的维度是C×(C/r),因此输出的维度就是1×1×C。最后再经过sigmoid函数,得到s。由于前面收缩层304的压缩处理都是针对单个特征通道的特征图进行处理的,因此,通过激励层306中的两个全连接层融合各特征通道的特征图信息,基于各特征通道之间的依赖关系学习,得到各特征通道的权重,以精确刻画出各特征通道对应的特征图的重要性,使得有效的特征图权重更大,无效或者效果小的特征图权重更小。
S205,通过卷积-收缩激励网络的加权层,根据权重分别对各特征通道的特征图进行加权处理,生成加权特征图。
在一种可能的实现方式中,通过卷积-收缩激励网络的加权层,将各特征通道的特征图分别乘以对应的权重,生成加权特征图。如以下公式(3)所示:
$$\tilde{u}_c = F_{scale}(u_c, s_c) = s_c \cdot u_c \qquad (3)$$
其中,$\tilde{u}_c$表示第C个特征通道的加权特征图;$F_{scale}$表示加权函数;$s_c$表示第C个特征通道的权重。
基于上述收缩激励操作,生成加权特征图并输入至下一层网络进行处理。由于加权特征图为根据各特征通道的权重得到的,因此,能够在增强有效特征的同时抑制无效或效果小的特征,加强网络的学习能力,以使得眼镜去除模型使用较少的卷积核(卷积层仅使用64或者 128个卷积核)即能够完成眼镜去除处理,从而可以减小模型规格,并降低了模型的复杂度。
S206,通过眼镜去除模型,根据加权特征图生成与目标图像对应的去眼镜图像。
眼镜去除模型为已经训练好的模型,具有眼镜去除效果,通过述眼镜去除模型中的多个卷积-收缩激励网络以及其他网络层处理后,根据加权特征图生成与目标图像对应的去眼镜图像。
上述图像处理方法,通过获取目标图像,将该目标图像输入至预先训练得到的眼镜去除模型,由于该眼镜去除模型包括多个依次连接的卷积-收缩激励网络,所以,可以通过卷积-收缩激励网络的卷积层来得到目标图像的各特征通道的特征图,再通过该卷积-收缩激励网络中的收缩激励层根据特征图得到各特征通道的全局信息,对全局信息进行学习以生成各特征通道的权重,再通过卷积-收缩激励网络中的加权层,根据权重分别对各特征通道的特征图进行加权处理,生成加权特征图,最后,通过眼镜去除模型根据该加权特征图得到对应的去眼镜图像。这样,该眼镜去除模型能够保持较高的学习能力,从而能够充分学习不同特征通道的重要性得到对应权重,通过加权处理来在增强有效特征的同时抑制无效或效果小的特征,有效去除目标图像中的眼镜,同时确保去眼镜图像能够恢复出目标图像的关键特征,提高去眼镜图像的还原度和真实性。
在一实施例中,提供一种图像处理方法,该实施例中,眼镜去除模型为针对局部的眼部图像去除眼镜的模型。如图6所示,该方法包括:
S601,获取人脸图像,该人脸图像中的人脸佩戴有眼镜。
在本实施例中,人脸图像是指包括整张人脸信息的图片。
S602,根据眼部在人脸图像中的位置,分割出眼部图像,得到目标图像。
在一种可能的实现方式中,通过对人脸图像进行目标检测,确定眼部所处人脸图像中的位置,基于所确定的位置分割出眼部图像,将分割出的眼部图像作为目标图像。
S603,将目标图像输入至基于生成对抗网络训练的眼镜去除模型;眼镜去除模型包括多个依次连接的卷积-收缩激励网络。
S604,通过卷积-收缩激励网络的卷积层,得到目标图像的各特征通道的特征图。
S605,通过卷积-收缩激励网络中的收缩激励层,根据特征图得到各特征通道的全局信息,对该全局信息进行学习,生成各特征通道的权重。
S606,通过卷积-收缩激励网络的加权层,根据权重分别对各特征通道的特征图进行加权处理,生成加权特征图。
S607,通过眼镜去除模型,根据加权特征图生成与目标图像对应的去眼镜图像。
S608,融合人脸图像和去眼镜图像,得到去除眼镜后的人脸图像。
在本实施例中,去眼镜图像为去除眼镜后的眼部图像。在生成去眼镜后的人脸图像时,通过对人脸图像进行目标检测,确定眼部所处人脸图像中的位置,将去眼镜图像替换所确定位置的眼部图像,得到去除眼镜后的人脸图像。基于眼部图像的眼镜去除模型,能够增强模型对眼部区域的处理,提升眼镜去除效果。
在一实施例中,将目标图像输入至基于生成对抗网络训练的眼镜去除模型的步骤之前,该方法还包括:对目标图像进行归一化处理。通过眼镜去除模型,根据加权特征图生成与目标图像对应的去眼镜图像的步骤之后,该方法还包括:对去眼镜图像进行还原处理,将去眼镜图像的还原至目标图像大小。可以理解,在本实施中,将目标图像输入至基于生成对抗网络训练的眼镜去除模型的步骤是指将归一化处理后的目标图像输入至基于生成对抗网络训练 的眼镜去除模型。
归一化处理是指将原始图像归一化成同一大小、同一像素值范围的处理。还原处理是指与归一化处理相对的逆处理,也即把图像大小还原成原始图像大小,把像素值范围还原至原始图像的像素值范围。比如,在归一化处理中,将原始图像大小归一化至256*256,然后将图像像素值归一化至[-1,1];在还原处理中,假设原始图像的像素值范围为[0,255],则将图像还原至原始图像大小,并将像素值归一化至[0,255]。
在一实施例中,如图7所示,提供一种图像处理方法中训练眼镜去除模型的方式,包括以下步骤:
S702,获取由第一训练图像组成的第一训练样本集和由第二训练图像组成的第二训练样本集,该第一训练图像中的对象佩戴有眼镜,该第二训练图像中的对象未佩戴眼镜。
其中,第一训练样本集由多个经归一化处理的第一训练图像(第一训练样本)组成的,对应地,第二训练样本集由多个经归一化处理的第二训练图像(第二训练样本)组成,第一训练样本集中的训练样本和第二训练样本集中的训练样本一一对应,其区别仅在于是否佩戴有眼镜。其中,所佩戴的眼镜为框架眼镜。比如,在归一化处理中,将原始图像大小归一化至256*256,然后将图像像素值归一化至[-1,1]。
进一步地,第二训练样本可以是通过各图像获取途径获取得到的第二训练图像,或者由已得到的第二训练图像进行复制得到,第一训练样本可以是通过对第二训练样本进行加眼镜处理得到;第一训练样本和第二训练样本还可以是通过人脸图像采集设备采集的大量图像样本,比如通过照相机、摄像头等采集设备采集得到对应图像样本。可以理解,当训练的眼镜去除模型为针对全局的人脸图像去除眼镜的模型时,训练样本为全局的人脸图像;当训练的眼镜去除模型为针对局部的眼部图像去除眼镜的模型时,训练样本为局部的眼部图像。基于眼部图像的模型训练,能够增强模型对眼部区域的处理,提升眼镜去除效果。
S704,将第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集,该生成网络模型包括多个依次连接的卷积-收缩激励网络。
其中,生成样本集是指由与各第一训练样本对应的生成样本组成的集合。进一步地,生成样本是指由生成网络模型对第一训练样本进行去眼镜处理后,生成的人脸图像。
在得到生成样本集时,将第一训练样本集中的第一训练样本依次输入至生成对抗网络中的生成网络模型,通过该生成网络模型中的卷积-收缩激励网络的卷积层,依次得到第一训练样本的各特征通道的特征图。通过该卷积-收缩激励网络中的收缩激励层,根据该特征图得到各特征通道的全局信息,对该全局信息进行学习,生成各特征通道的权重,进一步通过卷积-收缩激励网络的加权层,根据权重分别对各特征通道的特征图进行加权处理,生成与第一训练样本对应的加权特征图。基于生成网络模型对第一训练样本对应的加权特征图进一步处理,最终生成与第一训练样本对应的生成样本,所有生成样本即组成了生成样本集。
S706,分别将生成样本集和第二训练样本集输入至生成对抗网络中的判别网络模型,根据判别网络模型的输出得到生成网络损失系数。
其中,损失系数是指用于评价网络模型预测效果的一个参数,通常损失系数越小,代表网络模型预测效果越好。对应地,生成网络损失系数是指用于评价生成网络模型去除眼镜效果的一个参数,基于生成网络损失系数来调整生成网络模型中的各项参数,以达到更好的眼镜去除效果。在本实施例中,基于不同的生成样本均会产生一个对应的生成网络损失系数。
如上,生成对抗网络训练是指由生成网络模型生成一张图片去欺骗判别网络模型,然后 判别网络模型去判断这张图片以及对应的真实图片是真是假。可以理解,在本实施例中,生成对抗网络训练的目的在于使得生成网络模型生成的生成样本,能够达到以假乱真的效果。换而言之,也就是使判别网络模型难以辨别生成样本是生成的图像还是真实的图像。
在训练生成对抗网络时,分别将生成样本集和第二训练样本集输入至生成对抗网络中的判别网络模型,根据该判别网络模型的输出调整判别网络模型的参数,得到更新后的判别网络模型;再将该生成样本集输入至更新后的判别网络模型,根据更新后的判别网络模型的输出得到生成网络损失系数,以根据该生成网络损失系数调整该生成网络模型的参数。其中,生成网络模型的参数是指生成网络模型中各神经元之间的连接权重。
S708,根据生成网络损失系数更新生成网络模型的参数,得到更新后的生成网络模型,并返回至步骤S704,直至满足迭代结束条件,将更新后的生成网络模型作为眼镜去除模型。
在本实施例中,根据生成网络损失系数以及预设的生成网络模型参数调整方法,调整生成网络模型的参数,得到更新后的生成网络模型。判断是否满足预设的迭代结束条件,若满足,则结束迭代训练,将更新后的生成网络模型作为眼镜去除模型;否则返回至步骤S704,直到满足预设的迭代结束条件时,将更新后的生成网络模型作为眼镜去除模型。
其中,生成网络模型参数调整方法包括但不限于梯度下降法、反向传播算法等误差修正算法。比如,基于一阶梯度来优化随机目标函数的Adam(Adaptive Moment Estimation,自适应矩估计)算法。迭代结束条件可以是迭代次数达到迭代次数阈值,也可以是生成网络模型达到预设的眼镜去除效果,在此不作限定。
通过上述眼镜去除模型的训练方式,采用包括多个依次连接的卷积-收缩激励网络的生成网络模型以及一个判别网络模型构成生成对抗网络,并进行生成对抗训练,以得到可有效去除眼镜的生成网络模型作为眼镜去除模型。并且,基于卷积-收缩激励网络,对输入训练样本对应的各特征通道的全局信息进行学习,生成各特征通道的权重,根据权重分别对各特征通道的特征图进行加权处理,生成对应的加权特征图,从而能够通过加权处理来在增强有效特征的同时抑制无效或效果小的特征,有效去除第一训练样本集中各第一训练样本中的眼镜,同时使得生成样本恢复出对应第一训练样本的关键特征,提高生成样本的还原度和真实性。
在一实施例中,如图8所示,分别将生成样本集和第二训练样本集输入至生成对抗网络中的判别网络模型,根据判别网络模型的输出得到生成网络损失系数的步骤,包括以下步骤:
S802,分别将生成样本集和第二训练样本集输入至生成对抗网络中的判别网络模型,根据判别网络模型的输出得到判别网络损失系数。
其中,判别网络损失系数是指用于评价判别网络模型分类效果的一个参数,基于判别网络损失系数来调整判别网络模型中的各项参数,以达到更准确的分类效果。在本实施例中,基于不同的生成样本均会产生一个对应的判别网络损失系数。
在得到判别网络损失系数时,将生成样本集中的各生成样本和第二训练样本集中的各第二训练样本依次输入至生成对抗网络中的判别网络模型,分别得到与各生成样本和各第二训练样本对应的输出,根据生成样本及其对应的第二训练样本的输出得到判别网络损失系数,该判别网络损失系数的个数与生成样本的个数相同。
S804,根据判别网络损失系数更新判别网络模型的参数,得到更新后的判别网络模型。
其中,判别网络模型的参数是指判别网络模型中各神经元之间的连接权重。在本实施例中,根据判别网络损失系数以及预设的判别网络模型参数调整方法,调整判别网络模型的参数,得到更新后的判别网络模型。其中,判别网络模型参数调整方法包括但不限于梯度下降 法、反向传播算法等误差修正算法。比如,基于一阶梯度来优化随机目标函数的Adam算法。
S806,将生成样本集输入至更新后的判别网络模型,根据更新后的判别网络模型的输出得到生成网络损失系数。
在得到更新后的判别网络模型后,当前的判别网络模型相较于更新之前的判别网络模型,具有更好的分类效果。因此,在判别网络模型具有较好的分类效果之后,固定判别网络模型的参数,再对生成网络模型进行训练。
在训练生成网络模型时,将生成样本集中各生成样本依次输入至更新后的判别网络模型,每一生成样本对应一个更新后的判别网络模型的输出,根据更新后的判别网络模型的输出得到生成网络损失系数。
在本实施例中,首先固定生成网络模型的参数,对判别网络模型进行训练更新,使得通过训练后的判别网络模型保持分类能力。在训练完判别网络模型之后,再对生成网络模型进行训练更新,此时判别网络模型的参数固定不变,而仅将生成网络模型产生的损失或误差传递给生成网络模型,即根据更新后的判别网络模型的输出得到生成网络损失系数,基于生成网络损失系数更新生成网络模型的参数。通过判别网络模型和生成网络模型之间的对抗博弈,以使得两个网络模型最终达到稳态。
在一实施例中,分别将生成样本集和第二训练样本集输入至判别网络模型,根据判别网络模型的输出得到判别网络损失系数的步骤,包括:分别将生成样本集和第二训练样本集输入至判别网络模型,得到生成样本集对应的第一概率和第二训练样本集的第二概率;根据第一概率、第二概率和判别网络损失函数,得到判别网络损失系数。
其中,第一概率是指生成样本属于训练样本而非生成样本的概率,第二概率是指第二训练样本属于训练样本而非生成样本的概率。假设将生成样本的类别标识设置为0,第二训练样本的类别标识设置为1,则判别网络模型的输出为一个0-1之间的概率值,也就是说第一概率和第二概率的范围为0-1。判别网络模型训练的目的是使得生成样本对应的第一概率尽可能趋向于0,使得第二训练样本的对应的第二概率尽可能趋向于1,从而获得准确的分类能力。
判别网络损失函数是指根据判别网络模型的输出,计算判别网络模型的损失系数的函数。比如,判别网络损失函数可以是交叉熵损失函数、公式(4)所示的最大化判别网络区分度的函数
$$\max_{D}\; \mathbb{E}_{x\sim p_{data}(x)}\big[\log D(x)\big]+\mathbb{E}_{y\sim p_{y}(y)}\big[\log\big(1-D(G(y))\big)\big] \qquad (4)$$
其中，D表示判别网络模型，G表示生成网络模型，x表示任一个第二训练样本，p_data(x)表示第二训练样本的类别标识，D(x)表示任一个第二训练样本对应的概率，在本实施例中是指第二概率，y表示任一个第一训练样本，p_y(y)表示生成样本的类别标识，G(y)表示任一个第一训练样本对应的生成样本，D(G(y))表示任一个生成样本对应的概率，在本实施例中是指第一概率。
在得到判别网络损失系数时,依次将生成样本集中各生成样本及其类别标识、第二训练样本集中各第二训练样本及其类别标识输入至判别网络模型,得到生成样本集对应的第一概率和第二训练样本集的第二概率;根据第一概率、第二概率和判别网络损失函数,得到判别网络损失系数。
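与公式(4)对应，判别网络损失系数的计算可示意如下（以 PyTorch 为例，d_real、d_fake 为假设的占位变量，分别表示第二概率 D(x) 与第一概率 D(G(y))；将最大化目标取负作为需要最小化的损失只是一种常见写法，并非本申请限定的实现）：

```python
import torch

def discriminator_loss(d_real, d_fake, eps=1e-8):
    """按公式(4)的思想计算判别网络损失：
    d_real 为第二训练样本对应的第二概率 D(x)，d_fake 为生成样本对应的第一概率 D(G(y))。
    最大化 log D(x) + log(1 - D(G(y)))，等价于最小化其相反数。"""
    return -(torch.log(d_real + eps) + torch.log(1.0 - d_fake + eps)).mean()

# 用法示意：概率为 0-1 之间的值
d_real = torch.tensor([0.9, 0.8])   # 第二概率，训练目标是趋向 1
d_fake = torch.tensor([0.2, 0.3])   # 第一概率，训练目标是趋向 0
loss_d = discriminator_loss(d_real, d_fake)
```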
在一实施例中,将生成样本集输入至更新后的判别网络模型,根据更新后的判别网络模型的输出得到生成网络损失系数的步骤,包括:将生成样本集输入至更新后的判别网络模型, 得到生成样本集对应的第三概率;根据第三概率和生成网络损失函数,得到生成网络损失系数。
其中,第三概率是指生成样本属于训练样本而非生成样本的概率。生成网络损失函数是指根据生成网络模型的输出,计算生成网络模型的损失系数的函数。比如,生成网络损失函数可以是交叉熵损失函数、公式(5)所示的最小化生成样本与训练样本数据分布的函数
$$\min_{G}\; \mathbb{E}_{y\sim p_{y}(y)}\big[\log\big(1-D(G(y))\big)\big] \qquad (5)$$
其中,D(G(y))表示任一个生成样本对应的概率,在本实施例中指第三概率。
在得到生成网络损失系数时,依次将生成样本集中各生成样本及其类别标识输入至判别网络模型,得到生成样本集对应的第三概率;根据第三概率和生成网络损失函数,得到生成网络损失系数。
与判别网络模型训练时相反,在本实施例中,将生成样本的类别标识设置为1,以起到迷惑判别器的目的,从而才能使得生成样本逐渐逼近为真实的第二训练样本。
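与公式(5)对应，生成网络损失系数的计算可示意如下（d_fake 为假设的占位变量，表示第三概率 D(G(y))；最小化 log(1 - D(G(y))) 会促使第三概率趋向 1，与将生成样本类别标识置为 1 的做法目的一致）：

```python
import torch

def generator_loss(d_fake, eps=1e-8):
    """按公式(5)的思想计算生成网络损失：
    d_fake 为生成样本输入更新后的判别网络模型得到的第三概率 D(G(y))。
    最小化 log(1 - D(G(y)))，即促使生成样本被判别为真实样本。"""
    return torch.log(1.0 - d_fake + eps).mean()

# 用法示意
d_fake = torch.tensor([0.4, 0.6])   # 第三概率
loss_g = generator_loss(d_fake)
```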
在一实施例中，如图9所示，训练眼镜去除模型的架构中还包括特征网络模型。进一步地，根据生成网络损失系数更新生成网络模型的参数，得到更新后的生成网络模型之前，该方法还包括：分别将生成样本集和第二训练样本集输入至特征网络模型，得到生成样本集和第二训练样本集之间的特征误差。根据生成网络损失系数更新生成网络模型的参数，得到更新后的生成网络模型，包括：根据生成网络损失系数和特征误差更新生成网络模型的参数，得到更新后的生成网络模型。
其中,特征误差是指生成样本及其对应的第二训练样本在特征空间存在的差异。可以理解,生成样本集和第二训练样本集之间的特征误差,也就是指生成样本集中各生成样本及其对应的第二训练样本在特征空间存在的差异。
在基于特征网络模型更新生成网络模型时,依次将生成样本集中各生成样本及其对应的第二训练样本输入至特征网络模型,由该特征网络模型提取生成样本和对应的第二训练样本的特征,并进行比较分析,得到各生成样本及其对应的第二训练样本之间的特征误差。根据生成网络损失系数和特征误差,以及预设的生成网络模型参数调整方法,调整生成网络模型的参数,得到更新后的生成网络模型。比如,根据生成网络损失系数和特征误差,采用Adam算法对生成网络模型参数进行调整,得到更新后的生成网络模型。
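特征误差的计算可示意如下（feature_net 为特征网络模型的占位，此处仅以简单卷积网络代替，实际可为任一预训练的人脸特征网络；误差形式以特征的均方误差为例，均为假设，并非本申请限定的实现）：

```python
import torch
import torch.nn as nn

# 占位的特征网络模型（实际可为预训练的人脸特征网络，此处仅为示意）
feature_net = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                            nn.AdaptiveAvgPool2d(1), nn.Flatten())

def feature_error(generated, real):
    """生成样本与其对应的第二训练样本在特征空间的差异（示意：特征的 L2 误差）。"""
    f_gen = feature_net(generated)
    f_real = feature_net(real)
    return torch.nn.functional.mse_loss(f_gen, f_real)

generated = torch.randn(2, 3, 256, 256)   # 生成样本（示意数据）
real = torch.randn(2, 3, 256, 256)        # 对应的第二训练样本（示意数据）
err_feat = feature_error(generated, real)
```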
通过对生成样本及其对应的第二训练样本进行特征误差的分析,促使最后得到的眼镜去除模型恢复的去眼镜图像进一步地保持鉴别信息,也即能够更准确地恢复出目标图像的关键特征,提高去眼镜图像的还原度,并且在人脸识别应用中,保证人脸识别的准确性。
在另一实施例中,根据生成网络损失系数更新生成网络模型的参数,得到更新后的生成网络模型之前,该方法还包括:对生成样本集和第二训练样本集的像素进行分析,得到生成样本集和第二训练样本集之间的像素误差。根据生成网络损失系数更新生成网络模型的参数,得到更新后的生成网络模型,包括:根据生成网络损失系数和像素误差更新生成网络模型的参数,得到更新后的生成网络模型。
其中,像素误差是指生成样本及其对应的第二训练样本各像素点存在的差异。可以理解,生成样本集和第二训练样本集之间的像素误差,也就是指生成样本集中各生成样本及其对应的第二训练样本在像素上存在的差异。
在基于像素误差更新生成网络模型时，依次对生成样本集中各生成样本及其对应的第二训练样本的像素点进行误差分析，得到各生成样本及其对应的第二训练样本之间的像素误差。根据生成网络损失系数和像素误差，以及预设的生成网络模型参数调整方法，调整生成网络模型的参数，得到更新后的生成网络模型。比如，根据生成网络损失系数和像素误差，采用Adam算法对生成网络模型参数进行调整，得到更新后的生成网络模型。
在一实施例中,根据生成网络损失系数更新生成网络模型的参数,得到更新后的生成网络模型之前,该方法还包括:对生成样本集和第二训练样本集的像素进行分析,得到生成样本集和第二训练样本集之间的像素误差;分别将生成样本集和第二训练样本集输入至特征网络模型,得到生成样本集和第二训练样本集之间的特征误差;根据生成网络损失系数更新生成网络模型的参数,得到更新后的生成网络模型,包括:根据生成网络损失系数、像素误差和特征误差更新生成网络模型的参数,得到更新后的生成网络模型。
通过对生成样本及其对应的第二训练样本进行特征误差、像素误差的分析,促使最后得到的眼镜去除模型恢复的去眼镜图像的还原程度高。
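像素误差的计算，以及将生成网络损失系数、特征误差和像素误差组合后用于更新生成网络模型参数的过程，可示意如下（各项数值与组合权重均为假设的示例，并非本申请限定的实现）：

```python
import torch
import torch.nn.functional as F

def pixel_error(generated, real):
    """生成样本与其对应的第二训练样本在像素上的差异（示意：逐像素 L1 误差）。"""
    return F.l1_loss(generated, real)

generated = torch.randn(2, 3, 256, 256)     # 生成样本（示意数据）
real = torch.randn(2, 3, 256, 256)          # 对应的第二训练样本（示意数据）

loss_g = torch.tensor(0.5)                  # 生成网络损失系数（示意值）
err_feat = torch.tensor(0.1)                # 特征误差（示意值）
lambda_feat, lambda_pix = 1.0, 10.0         # 组合权重（假设值）
total_loss = loss_g + lambda_feat * err_feat + lambda_pix * pixel_error(generated, real)
# 随后即可按前述 Adam 方式用 total_loss 对生成网络模型参数进行更新
```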
在一实施例中,如图10所示,步骤S708进一步包括以下步骤:
S1002,根据生成网络损失系数更新生成网络模型的参数,得到更新后的生成网络模型。
S1004,获取当前迭代次数。
S1006,当迭代次数小于预设的迭代次数阈值时,返回至将第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜的生成样本集。
S1008,当迭代次数达到预设的迭代次数阈值时,将更新后的生成网络模型作为眼镜去除模型。
在本实施例中,每完成一次生成对抗网络训练就对迭代次数执行加一操作,并获取当前迭代次数,判断当前迭代次数是否达到迭代次数阈值,若未达到,则继续执行训练的相关步骤;否则,将更新后的生成网络模型作为眼镜去除模型,并退出训练步骤。
在一实施例中，步骤S708之后还包括眼镜去除模型测试的步骤，该步骤包括：获取由测试图像组成的测试样本集，该测试图像中的对象佩戴有眼镜；将测试样本集输入至训练得到的眼镜去除模型，根据该眼镜去除模型的输出得到测试结果。其中，测试样本集由多个经归一化处理的测试图像(测试样本)组成，测试图像与第一训练图像为不同的图像。通过进一步对训练得到的眼镜去除模型的性能进行测试，以确定当前得到的眼镜去除模型是否满足预设的眼镜去除效果。
在一实施例中,提供一种应用眼镜去除模型进行人脸识别的人脸识别方法,如图11所示,该方法包括以下步骤:
S1101,获取待识别人脸图像中的目标图像,该目标图像中的人脸佩戴有眼镜。
其中，待识别人脸图像是指当前需要进行识别的全局人脸图像。比如，在安检过程中的身份验证中，由图像采集设备采集的全局人脸图像。待识别人脸图像可以是佩戴有眼镜的人脸的人脸图像，也可以是未佩戴有眼镜的人脸的人脸图像。目标图像是指通过对待识别人脸图像进行分析处理，得到的携带有眼镜佩戴信息，需要进行眼镜去除处理的图像。即，目标图像中的人脸佩戴有眼镜，且需要进行眼镜去除处理。其中，目标图像可以是佩戴有眼镜的人脸的人脸图像，也可以是从佩戴有眼镜的人脸的人脸图像中分割出的眼部图像。例如，当眼镜去除模型为针对全局的人脸图像去除眼镜的模型时，目标图像为全局的人脸图像；当眼镜去除模型为针对局部的眼部图像去除眼镜的模型时，目标图像为局部的眼部图像。
本实施例中,获取待识别人脸图像中经眼镜识别检测后得到的目标图像或者已选定的目标图像,以将目标图像输入至眼镜去除模型进行眼镜去除处理。
S1102,将目标图像输入至基于生成对抗网络训练的眼镜去除模型;眼镜去除模型包括多个依次连接的卷积-收缩激励网络。
S1103,通过卷积-收缩激励网络的卷积层,得到目标图像的各特征通道的特征图。
在一种可能的实现方式中,通过卷积-收缩激励网络的卷积层,对输入的目标图像进行卷积处理,得到目标图像的各特征通道的特征图,并将该特征图输入至该卷积-收缩激励网络中的收缩激励层。
S1104,通过卷积-收缩激励网络中的收缩激励层,根据特征图得到各特征通道的全局信息,对该全局信息进行学习,生成各特征通道的权重。
在一种可能的实现方式中,通过收缩激励层中的收缩层对特征图进行压缩,得到各特征通道的全局信息,通过收缩激励层中的激励层对该全局信息进行学习,生成各特征通道的权重。
S1105,通过卷积-收缩激励网络的加权层,根据权重分别对各特征通道的特征图进行加权处理,生成加权特征图。
在一种可能的实现方式中,通过卷积-收缩激励网络的加权层,将各特征通道的特征图分别乘以对应的权重,生成加权特征图。
S1106,通过眼镜去除模型,根据加权特征图得到与目标图像对应的去眼镜人脸图像。
其中，去眼镜人脸图像是指与目标图像对应、且去除了眼镜之后的全局的人脸图像。其中，当目标图像为全局的人脸图像时，去眼镜人脸图像是指目标图像去除眼镜后的图像；当目标图像为局部的眼部图像时，去眼镜人脸图像是指由去除眼镜后的目标图像和目标图像对应的待识别人脸图像融合得到的人脸图像。
S1107,将去眼镜人脸图像与预设人脸图像库进行匹配,根据匹配结果生成人脸识别结果。
预设人脸图像库中存储有已注册或者已验证的人脸图像。人脸识别结果包括识别成功、识别失败和所匹配的人脸图像的相关信息中的一种或几种数据,可根据识别需求设定,在此不作限定。比如,在公共交通的安全验证系统和人脸门禁系统中,仅需要识别验证待识别的人是否合法,则人脸识别结果为识别成功或者识别失败。在公安验证系统中进行信息查询时,则人脸识别结果还包括所匹配的人脸图像的相关信息。
本实施例通过传统的人脸识别模型,将去眼镜人脸图像与预设人脸图像库进行匹配,得到匹配结果,根据匹配结果生成人脸识别结果。比如,当匹配到预设人脸图像库中的人脸图像时,生成识别成功的人脸识别结果;或者,当匹配到预设人脸图像库中的人脸图像时,获取所匹配的人脸图像的相关信息,根据相关信息生成人脸识别结果。当未匹配到预设人脸图像库中的人脸图像时,生成识别失败的人脸识别结果。其中,传统的人脸识别模型包括但不限于Bruce-Young模型、交互激活竞争模型等。
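将去眼镜人脸图像与预设人脸图像库进行匹配的一种常见做法是比较人脸特征向量的相似度，下面给出一个示意性草图（该做法并非本申请所述 Bruce-Young 模型等传统人脸识别模型本身，特征向量、阈值与人员信息均为假设的占位）：

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def match_face(query_feature, face_db, threshold=0.6):
    """将去眼镜人脸图像的特征与预设人脸图像库中各特征进行匹配（阈值为假设值）。
    face_db: {人员信息: 特征向量}"""
    best_id, best_score = None, -1.0
    for person_id, feature in face_db.items():
        score = cosine_similarity(query_feature, feature)
        if score > best_score:
            best_id, best_score = person_id, score
    if best_score >= threshold:
        return {"result": "识别成功", "person": best_id, "score": best_score}
    return {"result": "识别失败"}

# 用法示意：特征向量以随机数代替实际提取到的人脸特征
face_db = {"user_001": np.random.rand(128), "user_002": np.random.rand(128)}
query = np.random.rand(128)
print(match_face(query, face_db))
```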
通过眼镜去除模型,去除待识别人脸图像中的眼镜,无需手动摘取眼镜之后再进行人脸图像采集以及人脸识别,提高了人脸识别效率,避免因眼镜干扰造成无法识别的问题。并且,通过由多个卷积-收缩激励网络构成的眼镜去除模型,能够增强目标图像的有效特征,抑制无效或效果小的特征,有效去除目标图像中的眼镜,并确保去眼镜图像能够恢复出目标图像的关键特征,提高去眼镜图像的还原度和真实性,进一步保证了人脸识别结果的准确性。
在一实施例中,如图12所示,获取待识别人脸图像中的目标图像的步骤,包括:
S1202,获取待识别人脸图像。
S1204,对待识别人脸图像进行眼镜识别检测。
S1206,根据眼镜识别检测的结果,得到目标图像。
在人脸识别时，首先对待识别人脸图像进行眼镜识别检测，判断该待识别人脸图像中的人脸是否佩戴有眼镜，当人脸佩戴有眼镜时，则得到目标图像，以输入至眼镜去除模型进行眼镜去除处理后，再输入至人脸识别模型进行识别；若该待识别人脸图像中的人脸未佩戴有眼镜，则直接输入至人脸识别模型进行识别。其中，眼镜识别检测可以通过传统的目标检测模型进行检测，比如，基于深度学习的目标检测模型、基于区域的卷积神经网络等。
在一实施例中,根据眼镜识别检测的结果,得到佩戴有眼镜的目标图像,包括:当检测到待识别人脸图像中的人脸佩戴有眼镜时,根据眼部在待识别人脸图像中的位置,分割出眼部图像,得到目标图像。通过眼镜去除模型,根据加权特征图得到与目标图像对应的去眼镜人脸图像,包括:通过眼镜去除模型,根据加权特征图生成与目标图像对应的去眼镜图像;融合待识别人脸图像和去眼镜图像,得到去眼镜人脸图像。
本实施例中,当检测到待识别人脸图像中的人脸佩戴有眼镜时,通过对人脸图像进行目标检测,确定眼部所处人脸图像中的位置,基于所确定的位置分割出眼部图像,将分割出的眼部图像作为目标图像,以对目标图像进行眼镜去除处理。通过眼镜去除模型,生成与目标图像对应的去眼镜图像,根据所确定眼部所处人脸图像中的位置,将去眼镜图像替换所确定位置的眼部图像,得到去除眼镜后的人脸图像。
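眼部图像的分割以及去眼镜图像与待识别人脸图像的融合可示意如下（眼部位置以矩形框 (x, y, w, h) 表示，坐标与图像数据均为假设的占位，眼镜去除模型的处理在此省略）：

```python
import numpy as np

def crop_eye_region(face_image, box):
    """根据眼部在待识别人脸图像中的位置(box: x, y, w, h)分割出眼部图像作为目标图像。"""
    x, y, w, h = box
    return face_image[y:y + h, x:x + w].copy()

def fuse_back(face_image, deglassed_eye, box):
    """将去眼镜图像替换所确定位置的眼部图像，得到去除眼镜后的人脸图像。"""
    x, y, w, h = box
    fused = face_image.copy()
    fused[y:y + h, x:x + w] = deglassed_eye
    return fused

# 用法示意：以随机图像代替真实人脸图像，眼部框坐标为假设值
face = np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8)
eye_box = (120, 180, 272, 96)
eye_img = crop_eye_region(face, eye_box)
deglassed_eye = eye_img                    # 此处省略眼镜去除模型的处理，仅作占位
restored_face = fuse_back(face, deglassed_eye, eye_box)
```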
在一实施例中，将目标图像输入至基于生成对抗网络训练的眼镜去除模型的步骤之前，该方法还包括：对目标图像进行归一化处理。通过眼镜去除模型，根据加权特征图得到与目标图像对应的去眼镜人脸图像的步骤，包括：通过眼镜去除模型，根据加权特征图生成与目标图像对应的去眼镜图像；对去眼镜图像进行还原处理，将去眼镜图像还原至目标图像大小，得到与目标图像对应的去眼镜人脸图像。可以理解，在本实施例中，将目标图像输入至基于生成对抗网络训练的眼镜去除模型的步骤是指将归一化处理后的目标图像输入至基于生成对抗网络训练的眼镜去除模型。
下面以目标图像为全局的人脸图像为例,提供一完整实施例中的人脸识别方法,该方法中包括训练眼镜去除模型的步骤。如图13所示,该方法包括:
S1301,获取由第一训练图像组成的第一训练样本集和由第二训练图像组成的第二训练样本集,该第一训练图像中的对象佩戴有眼镜,该第二训练图像中的对象未佩戴眼镜。
在本实施例中,第一训练样本集中的第一训练样本和第二训练样本集中的第二训练样本均为全局的人脸图像。第二训练样本可以是通过各图像获取途径获取得到的第二训练图像,或者由已得到的第二训练图像进行复制得到,第一训练样本可以是通过对第二训练样本进行加眼镜处理得到;第一训练样本和第二训练样本还可以是通过人脸图像采集设备采集的大量图像样本,比如通过照相机、摄像头等采集设备采集得到对应图像样本。
S1302,将第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集,该生成网络模型包括多个依次连接的卷积-收缩激励网络。
在得到生成样本集时，将第一训练样本集中的第一训练样本依次输入至生成对抗网络中的生成网络模型，通过该生成网络模型中的卷积-收缩激励网络的卷积层，依次得到第一训练样本的各特征通道的特征图，将该特征图输入至该卷积-收缩激励网络中的收缩激励层。通过收缩激励层，根据该特征图得到各特征通道的全局信息，对该全局信息进行学习，生成各特征通道的权重，进一步通过卷积-收缩激励网络的加权层，根据权重分别对各特征通道的特征图进行加权处理，生成与第一训练样本对应的加权特征图。基于生成网络模型对第一训练样本对应的加权特征图进一步处理，最终生成与第一训练样本对应的生成样本，所有生成样本即组成了生成样本集。
S1303,分别将生成样本集和第二训练样本集输入至生成对抗网络中的判别网络模型,得到生成样本集对应的第一概率和第二训练样本集的第二概率。
本实施例中,依次将生成样本集中各生成样本及其类别标识、第二训练样本集中各第二训练样本及其类别标识输入至判别网络模型,得到生成样本集对应的第一概率和第二训练样本集的第二概率。
S1304,根据第一概率、第二概率和判别网络损失函数,得到判别网络损失系数。
S1305,根据判别网络损失系数更新判别网络模型的参数,得到更新后的判别网络模型。
在本实施例中,采用公式(4)所示的最大化判别网络区分度的函数
$$\max_{D}\; \mathbb{E}_{x\sim p_{data}(x)}\big[\log D(x)\big]+\mathbb{E}_{y\sim p_{y}(y)}\big[\log\big(1-D(G(y))\big)\big] \qquad (4)$$
计算判别网络损失系数,并采用Adam算法更新判别网络模型的参数,使得更新后的判别网络模型输出的第一概率尽可能趋向于0,第二概率尽可能趋向于1,获得准确的分类能力。
S1306,将生成样本集输入至更新后的判别网络模型,得到生成样本集对应的第三概率。
S1307,根据第三概率和生成网络损失函数,得到生成网络损失系数。
本实施例中,依次将生成样本集中各生成样本及其类别标识输入至判别网络模型,得到生成样本集对应的第三概率。采用公式(5)所示的最小化生成样本与训练样本数据分布的函数
$$\min_{G}\; \mathbb{E}_{y\sim p_{y}(y)}\big[\log\big(1-D(G(y))\big)\big] \qquad (5)$$
计算生成网络损失系数。
S1308,分别将生成样本集和第二训练样本集输入至特征网络模型,得到生成样本集和第二训练样本集之间的特征误差。
依次将生成样本集中各生成样本及其对应的第二训练样本输入至特征网络模型,由特征网络模型提取生成样本和对应的第二训练样本的特征,并进行比较分析,得到各生成样本及其对应的第二训练样本之间的特征误差。
S1309,对生成样本集和第二训练样本集的像素进行分析,得到生成样本集和第二训练样本集之间的像素误差。
依次对生成样本集中各生成样本及其对应的第二训练样本的像素点进行误差分析,得到各生成样本及其对应的第二训练样本之间的像素误差。
S1310,根据生成网络损失系数、特征误差和像素误差更新生成网络模型的参数,得到更新后的生成网络模型。
本实施例中,根据生成网络损失系数、特征误差和像素误差,采用Adam算法对生成网络模型参数进行调整更新,得到更新后的生成网络模型。
S1311,获取当前迭代次数。
S1312,当迭代次数达到预设的迭代次数阈值时,将更新后的生成网络模型作为眼镜去除模型;否则,返回至S1302。
在本实施例中,每完成一次生成对抗网络训练就对迭代次数执行加一操作,并获取当前迭代次数,判断当前迭代次数是否达到迭代次数阈值,若未达到,则继续执行训练的相关步骤;否则,将更新后的生成网络模型作为眼镜去除模型,并退出训练步骤。
S1313,获取待识别人脸图像。
S1314,对待识别人脸图像进行眼镜识别检测。
S1315,当检测到待识别人脸图像中的人脸佩戴有眼镜时,得到目标图像;否则,直接执行步骤S1322。
本实施例中,首先对待识别人脸图像进行眼镜识别检测,判断该待识别人脸图像中的人脸是否佩戴有眼镜,当人脸佩戴有眼镜时,则得到目标图像,以输入至眼镜去除模型进行眼镜去除处理后,再输入至人脸识别模型进行识别;若该待识别人脸图像中的人脸未佩戴有眼镜,则直接输入至人脸识别模型进行识别。
S1316,将目标图像输入至基于生成对抗网络训练的眼镜去除模型;眼镜去除模型包括多个依次连接的卷积-收缩激励网络。
S1317,通过卷积-收缩激励网络的卷积层,得到目标图像的各特征通道的特征图。
S1318,通过卷积-收缩激励网络中的收缩激励层,根据特征图得到各特征通道的全局信息,对该全局信息进行学习,生成各特征通道的权重。
S1319,通过卷积-收缩激励网络的加权层,根据权重分别对各特征通道的特征图进行加权处理,生成加权特征图。
在一种可能的实现方式中,通过卷积-收缩激励网络的卷积层,对输入的目标图像进行卷积处理,得到目标图像的各特征通道的特征图;通过收缩激励层中的收缩层对特征图进行压缩,得到各特征通道的全局信息;通过收缩激励层中的激励层对该全局信息进行学习生成各特征通道的权重;由加权层将各特征通道的特征图分别乘以对应的权重,生成加权特征图,该加权特征图继续输入至下一层网络进行处理。
S1320,通过眼镜去除模型,根据加权特征图得到与目标图像对应的去眼镜人脸图像。
通过上述眼镜去除模型中的多个卷积-收缩激励网络以及其他网络层处理后，根据加权特征图生成与目标图像对应的去眼镜人脸图像。
S1321,将去眼镜人脸图像与预设人脸图像库进行匹配,根据匹配结果生成人脸识别结果。
S1322,将待识别人脸图像与预设人脸图像库进行匹配,根据匹配结果生成人脸识别结果。
本实施例通过传统的人脸识别模型,将去眼镜人脸图像或待识别人脸图像与预设人脸图像库进行匹配,得到匹配结果,根据匹配结果生成人脸识别结果。比如,当匹配到预设人脸图像库中的人脸图像时,生成识别成功的人脸识别结果;或者,当匹配到预设人脸图像库中的人脸图像时,获取所匹配的人脸图像的相关信息,根据相关信息生成人脸识别结果。当未匹配到预设人脸图像库中的人脸图像时,生成识别失败的人脸识别结果。
通过眼镜去除模型,去除待识别人脸图像中的眼镜,无需手动摘取眼镜之后再进行人脸图像采集以及人脸识别,提高了人脸识别效率,避免因眼镜干扰造成无法识别的问题。并且,通过由多个卷积-收缩激励网络构成的眼镜去除模型,能够增强目标图像的有效特征,抑制无效或效果小的特征,有效去除目标图像中的眼镜,并确保去眼镜图像能够恢复出目标图像的关键特征,提高去眼镜图像的还原度和真实性,进一步保证了人脸识别结果的准确性。
图13为一个实施例中人脸识别方法的流程示意图。应该理解的是，虽然图13的流程图中的各个步骤按照箭头的指示依次显示，但是这些步骤并不是必然按照箭头指示的顺序依次执行。除非本文中有明确的说明，这些步骤的执行并没有严格的顺序限制，这些步骤可以以其它的顺序执行。而且，图13中的至少一部分步骤可以包括多个子步骤或者多个阶段，这些子步骤或者阶段并不必然是在同一时刻执行完成，而是可以在不同的时刻执行，这些子步骤或者阶段的执行顺序也不必然是依次进行，而是可以与其它步骤或者其它步骤的子步骤或者阶段的至少一部分轮流或者交替地执行。
在一实施例中,如图14所示,提供一种图像处理装置,该装置包括:图像获取模块1401、输入模块1402、卷积模块1403、权重学习模块1404、加权模块1405和图像生成模块1406。
图像获取模块1401,用于获取目标图像,该目标图像中的对象佩戴有眼镜。
其中，目标图像是指携带有眼镜佩戴信息，需要进行眼镜去除处理的图像。即，目标图像中的对象佩戴有眼镜，且需要进行眼镜去除处理。当对象为人脸时，目标图像可以是佩戴有眼镜的人脸图像；当对象为眼部时，目标图像可以是从佩戴有眼镜的人脸图像中分割出的眼部图像。比如，当采用图像处理软件进行眼镜去除处理时，所获取的目标图像为输入至图像处理软件的人脸图像或分割出的眼部图像。
输入模块1402,用于将目标图像输入至基于生成对抗网络训练的眼镜去除模型;该眼镜去除模型包括多个依次连接的卷积-收缩激励网络。
卷积模块1403,用于通过卷积-收缩激励网络的卷积层,得到目标图像的各特征通道的特征图。
在一种可能的实现方式中,卷积模块1403用于通过卷积-收缩激励网络的卷积层,对输入的目标图像进行卷积处理,得到目标图像的各特征通道的特征图,并将该特征图输入至该卷积-收缩激励网络中的收缩激励层。
权重学习模块1404,用于通过卷积-收缩激励网络中的收缩激励层,根据特征图得到各特征通道的全局信息,对该全局信息进行学习,生成各特征通道的权重。在一种可能的实现方式中,通过收缩激励层中的收缩层对特征图进行压缩,得到各特征通道的全局信息;通过收缩激励层中的激励层对全局信息进行学习,生成各特征通道的权重。
加权模块1405,用于通过卷积-收缩激励网络的加权层,根据权重分别对各特征通道的特征图进行加权处理,生成加权特征图。加权模块1405利用加权层将各特征通道的特征图分别乘以对应的权重,生成加权特征图,加权特征图继续输入至下一层网络进行处理。
生成模块1406,用于通过眼镜去除模型,根据加权特征图生成与目标图像对应的去眼镜图像。
眼镜去除模型为已经训练好的模型，具有眼镜去除效果，通过上述眼镜去除模型中的多个卷积-收缩激励网络以及其他网络层处理后，根据加权特征图生成与目标图像对应的去眼镜图像。
上述图像处理装置，通过获取目标图像，将目标图像输入至预先训练得到的眼镜去除模型，由于该眼镜去除模型包括多个依次连接的卷积-收缩激励网络，所以，可以通过卷积-收缩激励网络的卷积层来得到目标图像的各特征通道的特征图，再通过卷积-收缩激励网络中的收缩激励层根据特征图得到各特征通道的全局信息，对全局信息进行学习以生成各特征通道的权重，再通过卷积-收缩激励网络中的加权层，根据权重分别对各特征通道的特征图进行加权处理，生成加权特征图，最后，通过眼镜去除模型根据该加权特征图得到对应的去眼镜图像。这样，该眼镜去除模型能够保持较高的学习能力，从而能够充分学习不同特征通道的重要性得到对应权重，通过加权处理来在增强有效特征的同时抑制无效或效果小的特征，有效去除目标图像中的眼镜，同时确保去眼镜图像能够恢复出目标图像的关键特征，提高去眼镜图像的还原度和真实性。
在一实施例中,图像处理装置还包括图像融合模块。在本实施例中,图像获取模块1401还用于获取人脸图像,该人脸图像中的人脸佩戴有眼镜,根据眼部在人脸图像中的位置,分割出眼部图像,得到目标图像。图像融合模块,用于融合人脸图像和去眼镜图像,得到去除眼镜后的人脸图像。
本实施例中,图像获取模块1401通过对人脸图像进行目标检测,确定眼部所处人脸图像中的位置,基于所确定的位置分割出眼部图像,将分割出的眼部图像作为目标图像。通过眼镜去除模型生成与目标图像对应的去眼镜图像后,再由图像融合模块对人脸图像和去眼镜图像进行融合,将去眼镜图像替换所确定位置的眼部图像,得到去除眼镜后的完整的人脸图像。
在一实施例中,图像处理装置还包括模型训练模块,该模型训练模块进一步包括:样本获取模块、生成样本模块、生成网络损失系数生成模块和更新迭代模块,其中:
样本获取模块,用于获取由第一训练图像组成的第一训练样本集和由第二训练图像组成的第二训练样本集,该第一训练图像中的对象佩戴有眼镜,该第二训练图像中的对象未佩戴眼镜。
在本实施例中,第一训练样本集中的第一训练样本和第二训练样本集中的第二训练样本均为全局的人脸图像。第二训练样本可以是通过各图像获取途径获取得到的第二训练图像,或者由已得到的第二训练图像进行复制得到,第一训练样本可以是通过对第二训练样本进行加眼镜处理得到;第一训练样本和第二训练样本还可以是通过人脸图像采集设备采集的大量图像样本,比如通过照相机、摄像头等采集设备采集得到对应图像样本。
生成样本模块,用于将第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集,该生成网络模型包括多个依次连接的卷积-收缩激励网络。
在一种可能的实现方式中,将第一训练样本集中的第一训练样本依次输入至生成对抗网络中的生成网络模型,通过生成网络模型中的卷积-收缩激励网络的卷积层,依次得到第一训练样本的各特征通道的特征图,将该特征图输入至该卷积-收缩激励网络中的收缩激励层。通过收缩激励层,根据该特征图得到各特征通道的全局信息,对该全局信息进行学习,生成各特征通道的权重,进一步通过卷积-收缩激励网络的加权层,根据权重分别对各特征通道的特征图进行加权处理,生成与第一训练样本对应的加权特征图。基于生成网络模型对第一训练样本对应的加权特征图进一步处理,最终生成与第一训练样本对应的生成样本,所有生成样本即组成了生成样本集。
生成网络损失系数生成模块,用于分别将生成样本集和第二训练样本集输入至生成对抗网络中的判别网络模型,根据判别网络模型的输出得到生成网络损失系数。
在一种可能的实现方式中,分别将生成样本集和第二训练样本集输入至生成对抗网络中的判别网络模型,根据判别网络模型的输出调整判别网络模型的参数,得到更新后的判别网络模型;再将生成样本集输入至更新后的判别网络模型,根据更新后的判别网络模型的输出得到生成网络损失系数,以根据生成网络损失系数调整生成网络模型的参数。
更新迭代模块,用于根据生成网络损失系数更新生成网络模型的参数,得到更新后的生成网络模型,并返回至生成样本模块,直至满足迭代结束条件时,将更新后的生成网络模型作为眼镜去除模型。
在本实施例中，根据生成网络损失系数以及预设的生成网络模型参数调整方法，调整生成网络模型的参数，得到更新后的生成网络模型。判断是否满足预设的迭代结束条件，若满足，则结束迭代训练，将更新后的生成网络模型作为眼镜去除模型；若不满足，则触发生成样本模块继续执行相关操作。
在一实施例中,更新迭代模块还用于根据生成网络损失系数更新生成网络模型的参数,得到更新后的生成网络模型;获取当前迭代次数;当迭代次数小于预设的迭代次数阈值时,触发生成样本模块继续执行相关操作;当迭代次数达到预设的迭代次数阈值时,将更新后的生成网络模型作为眼镜去除模型。
进一步地,生成网络损失系数生成模块包括:判别网络损失系数生成模块、判别网络更新模块和生成网络损失系数确定模块。其中:
判别网络损失系数生成模块,用于分别将生成样本集和第二训练样本集输入至生成对抗网络中的判别网络模型,根据判别网络模型的输出得到判别网络损失系数。
本实施例中,判别网络损失系数生成模块用于分别将生成样本集和第二训练样本集输入至判别网络模型,得到生成样本集对应的第一概率和第二训练样本集的第二概率;根据第一概率、第二概率和判别网络损失函数,得到判别网络损失系数。
判别网络更新模块,用于根据判别网络损失系数更新判别网络模型的参数,得到更新后的判别网络模型。
本实施例中,判别网络更新模块根据判别网络损失系数以及预设的判别网络模型参数调整方法,调整判别网络模型的参数,得到更新后的判别网络模型。其中,判别网络模型参数调整方法包括但不限于梯度下降法、反向传播算法等误差修正算法。比如,基于一阶梯度来优化随机目标函数的Adam算法。
生成网络损失系数确定模块,用于将生成样本集输入至更新后的判别网络模型,根据更新后的判别网络模型的输出得到生成网络损失系数。
在一实施例中,生成网络损失系数确定模块用于将生成样本集输入至更新后的判别网络模型,得到生成样本集对应的第三概率;根据第三概率和生成网络损失函数,得到生成网络损失系数。
在一实施例中,图像处理装置还包括特征误差生成模块,用于分别将生成样本集和第二训练样本集输入至特征网络模型,得到生成样本集和第二训练样本集之间的特征误差。本实施例中,更新迭代模块还用于根据生成网络损失系数和特征误差更新生成网络模型的参数,得到更新后的生成网络模型。
通过对生成样本及其对应的第二训练样本进行特征误差的分析,促使最后得到的眼镜去除模型恢复的去眼镜图像进一步地保持鉴别信息,也即更准确地恢复出目标图像的关键特征,提高去眼镜图像的还原度,并且在人脸识别应用中,保证人脸识别的准确性。
在一实施例中,图像处理装置还包括像素误差生成模块,用于对生成样本集和第二训练样本集的像素进行分析,得到生成样本集和第二训练样本集之间的像素误差。本实施例中,更新迭代模块还用于根据生成网络损失系数和像素误差更新生成网络模型的参数,得到更新后的生成网络模型。
上述图像处理装置,利用眼镜去除模型充分学习不同特征通道的重要性得到对应权重,通过加权处理增强有效特征同时抑制无效或效果小的特征,有效去除目标图像中的眼镜,并且确保去眼镜图像能够恢复出目标图像的关键特征,提高去眼镜图像的还原度和真实性。
在一实施例中，如图15所示，提供一种人脸识别装置，该装置包括：目标图像获取模块1501、目标图像输入模块1502、特征卷积模块1503、特征权重学习模块1504、特征加权模块1505、人脸图像生成模块1506和匹配模块1507。其中：
目标图像获取模块1501,用于获取待识别人脸图像中的目标图像,该目标图像中的人脸佩戴有眼镜。
其中，待识别人脸图像是指当前需要进行识别的全局人脸图像。其中，目标图像获取模块1501获取待识别人脸图像中经眼镜识别检测后得到的目标图像或者已选定的目标图像，以将目标图像输入至眼镜去除模型进行眼镜去除处理。
目标图像输入模块1502,用于将目标图像输入至基于生成对抗网络训练的眼镜去除模型;眼镜去除模型包括多个依次连接的卷积-收缩激励网络。
特征卷积模块1503,用于通过卷积-收缩激励网络的卷积层,得到目标图像的各特征通道的特征图。
在一种可能的实现方式中,特征卷积模块1503用于通过卷积-收缩激励网络的卷积层,对输入的目标图像进行卷积处理,得到目标图像的各特征通道的特征图,并将该特征图输入至该卷积-收缩激励网络中的收缩激励层。
特征权重学习模块1504,用于通过卷积-收缩激励网络中的收缩激励层,根据特征图得到各特征通道的全局信息,对该全局信息进行学习生成各特征通道的权重。在一种可能的实现方式中,通过收缩激励层中的收缩层对特征图进行压缩,得到各特征通道的全局信息;通过收缩激励层中的激励层对该全局信息进行学习,生成各特征通道的权重。
特征加权模块1505,用于通过卷积-收缩激励网络的加权层,根据权重分别对各特征通道的特征图进行加权处理,生成加权特征图。在一种可能的实现方式中,特征加权模块1505利用加权层将各特征通道的特征图分别乘以对应的权重,生成加权特征图,加权特征图继续输入至下一层网络进行处理。
人脸图像生成模块1506，用于通过眼镜去除模型，根据加权特征图得到与目标图像对应的去眼镜人脸图像。其中，去眼镜人脸图像是指与目标图像对应、且去除了眼镜之后的全局的人脸图像。其中，当目标图像为全局的人脸图像时，去眼镜人脸图像是指目标图像去除眼镜后的图像；当目标图像为局部的眼部图像时，去眼镜人脸图像是指由去除眼镜后的目标图像和目标图像对应的待识别人脸图像融合得到的人脸图像。
匹配模块1507,用于将去眼镜人脸图像与预设人脸图像库进行匹配,根据匹配结果生成人脸识别结果。本实施例匹配模块1507通过传统的人脸识别模型,将去眼镜人脸图像与预设人脸图像库进行匹配,得到匹配结果,根据匹配结果生成人脸识别结果。
上述人脸识别装置，通过眼镜去除模型，去除待识别人脸图像中的眼镜，无需手动摘取眼镜之后再进行人脸图像采集以及人脸识别，提高了人脸识别效率，避免因眼镜干扰造成无法识别的问题。并且，通过由多个卷积-收缩激励网络构成的眼镜去除模型，能够增强目标图像的有效特征，抑制无效或效果小的特征，有效去除目标图像中的眼镜，并确保去眼镜图像能够恢复出目标图像的关键特征，提高去眼镜图像的还原度和真实性，进一步保证了人脸识别结果的准确性。
在一实施例中,目标图像获取模块1501包括人脸图像获取模块、眼镜检测模块和目标图像确定模块。其中,人脸图像获取模块,用于获取待识别人脸图像;眼镜检测模块,用于对待识别人脸图像进行眼镜识别检测;目标图像确定模块,用于根据眼镜识别检测的结果,得到佩戴有眼镜的目标图像。
在一实施例中，目标图像确定模块包括眼部分割模块，眼部分割模块用于当检测到待识别人脸图像中的人脸佩戴有眼镜时，根据眼部在待识别人脸图像中的位置，分割出眼部图像，得到佩戴有眼镜的目标图像。相应地，本实施例中，人脸图像生成模块1506进一步还用于通过眼镜去除模型，生成与目标图像对应的去眼镜图像；融合待识别人脸图像和去眼镜图像，得到去眼镜人脸图像。
在一实施例中,人脸识别装置还包括模型训练模块,该模型训练模块进一步包括:样本获取模块、生成样本模块、生成网络损失系数生成模块和更新迭代模块,详见图14所示的实施例中描述,此处不作赘述。
在一实施例中,生成网络损失系数生成模块包括:判别网络损失系数生成模块、判别网络更新模块和生成网络损失系数确定模块,详见图14所示的实施例中描述,此处不作赘述。
在一实施例中,人脸识别装置还包括特征误差生成模块和像素误差生成模块中的至少一种,详见图14所示的实施例中描述,此处不作赘述。
上述人脸识别装置通过眼镜去除模型,去除待识别人脸图像中的眼镜,无需手动摘取眼镜之后再进行人脸图像采集以及人脸识别,提高了人脸识别效率,避免因眼镜干扰造成无法识别的问题。
图16示出了一个实施例中计算机设备的内部结构图。该计算机设备可以是图1中的终端或服务器110。如图16所示，该计算机设备包括通过系统总线连接的处理器、存储器、网络接口。其中，存储器包括非易失性存储介质和内存储器。该计算机设备的非易失性存储介质存储有操作系统，还可存储有计算机程序，该计算机程序被处理器执行时，可使得处理器实现图像处理方法和/或人脸识别方法。该内存储器中也可储存有计算机程序，该计算机程序被处理器执行时，可使得处理器执行图像处理方法和/或人脸识别方法。
本领域技术人员可以理解,图16中示出的结构,仅仅是与本申请方案相关的部分结构的框图,并不构成对本申请方案所应用于其上的计算机设备的限定,具体的计算机设备可以包括比图中所示更多或更少的部件,或者组合某些部件,或者具有不同的部件布置。
在一个实施例中,本申请提供的图像处理装置和人脸识别装置可以实现为一种计算机程序的形式,计算机程序可在如图16所示的计算机设备上运行。计算机设备的存储器中可存储组成该图像处理装置和/或人脸识别装置的各个程序模块,比如,图14所示的图像获取模块1401、输入模块1402、卷积模块1403、权重学习模块1404、加权模块1405和生成模块1406。各个程序模块构成的计算机程序使得处理器执行本说明书中描述的本申请各个实施例的图像处理方法中的步骤。
在一个实施例中,提供了一种计算机设备,包括存储器和处理器,存储器中存储有计算机程序,该处理器执行计算机程序时实现上述实施例中所述的图像处理方法。
在一个实施例中,处理器执行计算机程序时还实现上述实施例中所述的人脸识别方法。
在一个实施例中,提供了一种计算机可读存储介质,其上存储有计算机程序,计算机程序被处理器执行时实现上述实施例中所述的图像处理方法。
在一个实施例中,计算机程序被处理器执行时还实现上述实施例中所述的人脸识别方法。
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,是可以通过计算机程序来指令相关的硬件来完成,所述的程序可存储于一非易失性计算机可读取存储介质中,该程序在执行时,可包括如上述各方法的实施例的流程。其中,本申请所提供的各实施例中所使用的对存储器、存储、数据库或其它介质的任何引用,均可包括非易失性和/或易失性存储器。非易失性存储器可包括只读存储器(ROM)、可编程ROM(PROM)、电可编程ROM(EPROM)、电可擦除可编程ROM(EEPROM)或闪存。易失性存储器可包括随机存取存储器(RAM)或者外部高速缓冲存储器。作为说明而非局限,RAM以多种形式可得,诸如静态RAM(SRAM)、动态RAM(DRAM)、同步DRAM(SDRAM)、双数据率SDRAM(DDRSDRAM)、增强型SDRAM(ESDRAM)、同步链路(Synchlink)DRAM(SLDRAM)、存储器总线(Rambus)直接RAM(RDRAM)、直接存储器总线动态RAM(DRDRAM)、以及存储器总线动态RAM(RDRAM)等。
以上实施例的各技术特征可以进行任意的组合,为使描述简洁,未对上述实施例中的各个技术特征所有可能的组合都进行描述,然而,只要这些技术特征的组合不存在矛盾,都应当认为是本说明书记载的范围。
以上所述实施例仅表达了本申请的几种实施方式,其描述较为具体和详细,但并不能因此而理解为对本申请专利范围的限制。应当指出的是,对于本领域的普通技术人员来说,在不脱离本申请构思的前提下,还可以做出若干变形和改进,这些都属于本申请的保护范围。因此,本申请专利的保护范围应以所附权利要求为准。

Claims (40)

  1. 一种图像处理方法,其特征在于,所述方法包括:
    获取目标图像,所述目标图像中的对象佩戴有眼镜;
    将所述目标图像输入至基于生成对抗网络训练的眼镜去除模型;所述眼镜去除模型包括多个依次连接的卷积-收缩激励网络;
    通过所述卷积-收缩激励网络的卷积层,得到所述目标图像的各特征通道的特征图;
    通过所述卷积-收缩激励网络中的收缩激励层,根据所述特征图得到各特征通道的全局信息,对所述全局信息进行学习,生成所述各特征通道的权重;
    通过所述卷积-收缩激励网络的加权层,根据所述权重分别对所述各特征通道的特征图进行加权处理,生成加权特征图;
    通过所述眼镜去除模型,根据所述加权特征图生成与所述目标图像对应的去眼镜图像。
  2. 根据权利要求1所述的方法,其特征在于,训练所述眼镜去除模型的方式,包括:
    获取由第一训练图像组成的第一训练样本集和由第二训练图像组成的第二训练样本集,所述第一训练图像中的对象佩戴有眼镜,所述第二训练图像中的对象未佩戴眼镜;
    将所述第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集,所述生成网络模型包括多个依次连接的卷积-收缩激励网络;
    分别将所述生成样本集和所述第二训练样本集输入至所述生成对抗网络中的判别网络模型,根据所述判别网络模型的输出得到生成网络损失系数;
    根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型,并返回至所述将所述第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集,直至满足迭代结束条件,将所述更新后的生成网络模型作为所述眼镜去除模型。
  3. 根据权利要求2所述的方法,其特征在于,所述分别将所述生成样本集和所述第二训练样本集输入至所述生成对抗网络中的判别网络模型,根据所述判别网络模型的输出得到生成网络损失系数,包括:
    分别将所述生成样本集和所述第二训练样本集输入至所述生成对抗网络中的判别网络模型,根据所述判别网络模型的输出得到判别网络损失系数;
    根据所述判别网络损失系数更新所述判别网络模型的参数,得到更新后的判别网络模型;
    将所述生成样本集输入至所述更新后的判别网络模型,根据所述更新后的判别网络模型的输出得到所述生成网络损失系数。
  4. 根据权利要求3所述的方法,其特征在于,所述分别将所述生成样本集和所述第二训练样本集输入至所述生成对抗网络中的判别网络模型,根据所述判别网络模型的输出得到判别网络损失系数,包括:
    分别将所述生成样本集和所述第二训练样本集输入至所述判别网络模型,得到所述生成样本集对应的第一概率和所述第二训练样本集的第二概率;
    根据所述第一概率、所述第二概率和判别网络损失函数,得到所述判别网络损失系数。
  5. 根据权利要求3所述的方法,其特征在于,所述将所述生成样本集输入至所述更新后的判别网络模型,根据所述更新后的判别网络模型的输出得到所述生成网络损失系数,包括:
    将所述生成样本集输入至所述更新后的判别网络模型,得到所述生成样本集对应的第三概率;
    根据所述第三概率和生成网络损失函数,得到所述生成网络损失系数。
  6. 根据权利要求2所述的方法,其特征在于,在所述根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型之前,所述方法还包括:
    分别将所述生成样本集和所述第二训练样本集输入至特征网络模型,得到所述生成样本集和所述第二训练样本集之间的特征误差;
    所述根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型,包括:
    根据所述生成网络损失系数和所述特征误差更新所述生成网络模型的参数,得到更新后的生成网络模型。
  7. 根据权利要求2所述的方法,其特征在于,在所述根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型之前,所述方法还包括:
    对所述生成样本集和所述第二训练样本集的像素进行分析,得到所述生成样本集和所述第二训练样本集之间的像素误差;
    所述根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型,包括:
    根据所述生成网络损失系数和所述像素误差更新所述生成网络模型的参数,得到更新后的生成网络模型。
  8. 根据权利要求2至7中任一项所述的方法,其特征在于,所述根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型,并返回至所述将所述第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集,直至满足迭代结束条件,将所述更新后的生成网络模型作为所述眼镜去除模型,包括:
    根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型;
    获取当前迭代次数;
    当所述迭代次数小于预设的迭代次数阈值时,返回至所述将所述第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集;
    当所述迭代次数达到预设的迭代次数阈值时,将所述更新后的生成网络模型作为所述眼镜去除模型。
  9. 根据权利要求1所述的方法,其特征在于,所述获取目标图像,包括:
    获取人脸图像,所述人脸图像中的人脸佩戴有眼镜;
    根据眼部在所述人脸图像中的位置,分割出眼部图像,得到所述目标图像;
    所述方法还包括:融合所述人脸图像和所述去眼镜图像,得到去除眼镜后的人脸图像。
  10. 一种人脸识别方法,其特征在于,所述方法包括:
    获取待识别人脸图像中的目标图像,所述目标图像中的人脸佩戴有眼镜;
    将所述目标图像输入至基于生成对抗网络训练的眼镜去除模型;所述眼镜去除模型包括多个依次连接的卷积-收缩激励网络;
    通过所述卷积-收缩激励网络的卷积层,得到所述目标图像的各特征通道的特征图;
    通过所述卷积-收缩激励网络中的收缩激励层，根据所述特征图得到各特征通道的全局信息，对所述全局信息进行学习，生成所述各特征通道的权重；
    通过所述卷积-收缩激励网络的加权层,根据所述权重分别对所述各特征通道的特征图进行加权处理,生成加权特征图;
    通过所述眼镜去除模型,根据所述加权特征图得到与所述目标图像对应的去眼镜人脸图像;
    将所述去眼镜人脸图像与预设人脸图像库进行匹配,根据匹配结果生成人脸识别结果。
  11. 根据权利要求10所述的方法,其特征在于,所述获取待识别人脸图像中的目标图像,包括:
    获取待识别人脸图像;
    对所述待识别人脸图像进行眼镜识别检测;
    根据所述眼镜识别检测的结果,得到所述目标图像。
  12. 根据权利要求11所述的方法,其特征在于,所述根据所述眼镜识别检测的结果,得到所述目标图像,包括:
    当检测到所述待识别人脸图像中的人脸佩戴有眼镜时,根据眼部在所述待识别人脸图像中的位置,分割出眼部图像,得到所述目标图像;
    所述通过所述眼镜去除模型,根据所述加权特征图得到与所述目标图像对应的去眼镜人脸图像,包括:
    通过所述眼镜去除模型,根据所述加权特征图生成与所述目标图像对应的去眼镜图像;
    融合所述待识别人脸图像和所述去眼镜图像,得到所述去眼镜人脸图像。
  13. 根据权利要求10所述的方法,其特征在于,训练所述眼镜去除模型的方式,包括:
    获取由第一训练图像组成的第一训练样本集和由第二训练图像组成的第二训练样本集,所述第一训练图像中的对象佩戴有眼镜,所述第二训练图像中的对象未佩戴眼镜;
    将所述第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集,所述生成网络模型包括多个依次连接的卷积-收缩激励网络;
    分别将所述生成样本集和所述第二训练样本集输入至所述生成对抗网络中的判别网络模型,根据所述判别网络模型的输出得到生成网络损失系数;
    根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型,并返回至所述将所述第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集,直至满足迭代结束条件,将所述更新后的生成网络模型作为所述眼镜去除模型。
  14. 根据权利要求13所述的方法,其特征在于,所述分别将所述生成样本集和所述第二训练样本集输入至所述生成对抗网络中的判别网络模型,根据所述判别网络模型的输出得到生成网络损失系数,包括:
    分别将所述生成样本集和所述第二训练样本集输入至所述生成对抗网络中的判别网络模型,根据所述判别网络模型的输出得到判别网络损失系数;
    根据所述判别网络损失系数更新所述判别网络模型的参数,得到更新后的判别网络模型;
    将所述生成样本集输入至所述更新后的判别网络模型,根据所述更新后的判别网络模型的输出得到所述生成网络损失系数。
  15. 根据权利要求14所述的方法，其特征在于，所述分别将所述生成样本集和所述第二训练样本集输入至所述生成对抗网络中的判别网络模型，根据所述判别网络模型的输出得到判别网络损失系数，包括：
    分别将所述生成样本集和所述第二训练样本集输入至所述判别网络模型,得到所述生成样本集对应的第一概率和所述第二训练样本集的第二概率;
    根据所述第一概率、所述第二概率和判别网络损失函数,得到所述判别网络损失系数。
  16. 根据权利要求14所述的方法,其特征在于,所述将所述生成样本集输入至所述更新后的判别网络模型,根据所述更新后的判别网络模型的输出得到所述生成网络损失系数,包括:
    将所述生成样本集输入至所述更新后的判别网络模型,得到所述生成样本集对应的第三概率;
    根据所述第三概率和生成网络损失函数,得到所述生成网络损失系数。
  17. 根据权利要求13所述的方法,其特征在于,在所述根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型之前,所述方法还包括:
    分别将所述生成样本集和所述第二训练样本集输入至特征网络模型,得到所述生成样本集和所述第二训练样本集之间的特征误差;
    所述根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型,包括:
    根据所述生成网络损失系数和所述特征误差更新所述生成网络模型的参数,得到更新后的生成网络模型。
  18. 根据权利要求13所述的方法,其特征在于,在所述根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型之前,所述方法还包括:
    对所述生成样本集和所述第二训练样本集的像素进行分析,得到所述生成样本集和所述第二训练样本集之间的像素误差;
    所述根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型,包括:
    根据所述生成网络损失系数和所述像素误差更新所述生成网络模型的参数,得到更新后的生成网络模型。
  19. 根据权利要求13至18中任一项所述的方法,其特征在于,所述根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型,并返回至所述将所述第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集,直至满足迭代结束条件,将所述更新后的生成网络模型作为所述眼镜去除模型,包括:
    根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型;
    获取当前迭代次数;
    当所述迭代次数小于预设的迭代次数阈值时,返回至所述将所述第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集;
    当所述迭代次数达到预设的迭代次数阈值时,将所述更新后的生成网络模型作为所述眼镜去除模型。
  20. 一种图像处理装置,其特征在于,所述装置包括:
    图像获取模块,用于获取目标图像,所述目标图像中的对象佩戴有眼镜;
    输入模块,用于将所述目标图像输入至基于生成对抗网络训练的眼镜去除模型;所述眼镜去除模型包括多个依次连接的卷积-收缩激励网络;
    卷积模块,用于通过所述卷积-收缩激励网络的卷积层,得到所述目标图像的各特征通道的特征图;
    权重学习模块,用于通过所述卷积-收缩激励网络中的收缩激励层,根据所述特征图得到各特征通道的全局信息,对所述全局信息进行学习,生成所述各特征通道的权重;
    加权模块,用于通过所述卷积-收缩激励网络的加权层,根据所述权重分别对所述各特征通道的特征图进行加权处理,生成加权特征图;
    生成模块,用于通过所述眼镜去除模型,根据所述加权特征图生成与所述目标图像对应的去眼镜图像。
  21. 根据权利要求20所述的装置,其特征在于,所述装置还包括模型训练模块,所述模型训练模块,包括:
    样本获取模块,用于获取由第一训练图像组成的第一训练样本集和由第二训练图像组成的第二训练样本集,所述第一训练图像中的对象佩戴有眼镜,所述第二训练图像中的对象未佩戴眼镜;
    生成样本模块,用于将所述第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集,所述生成网络模型包括多个依次连接的卷积-收缩激励网络;
    生成网络损失系数生成模块,用于分别将所述生成样本集和所述第二训练样本集输入至所述生成对抗网络中的判别网络模型,根据所述判别网络模型的输出得到生成网络损失系数;
    更新迭代模块,用于根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型,并返回至所述将所述第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集,直至满足迭代结束条件,将所述更新后的生成网络模型作为所述眼镜去除模型。
  22. 根据权利要求21所述的装置,其特征在于,所述生成网络损失系数生成模块,包括:
    判别网络损失系数生成模块,用于分别将所述生成样本集和所述第二训练样本集输入至所述生成对抗网络中的判别网络模型,根据所述判别网络模型的输出得到判别网络损失系数;
    判别网络更新模块,用于根据所述判别网络损失系数更新所述判别网络模型的参数,得到更新后的判别网络模型;
    生成网络损失系数确定模块,用于将所述生成样本集输入至所述更新后的判别网络模型,根据所述更新后的判别网络模型的输出得到所述生成网络损失系数。
  23. 根据权利要求22所述的装置,其特征在于,所述生成网络损失系数生成模块,还用于:
    分别将所述生成样本集和所述第二训练样本集输入至所述判别网络模型,得到所述生成样本集对应的第一概率和所述第二训练样本集的第二概率;
    根据所述第一概率、所述第二概率和判别网络损失函数,得到所述判别网络损失系数。
  24. 根据权利要求22所述的装置,其特征在于,所述生成网络损失系数生成模块,还用于:
    将所述生成样本集输入至所述更新后的判别网络模型,得到所述生成样本集对应的第三概率;
    根据所述第三概率和生成网络损失函数,得到所述生成网络损失系数。
  25. 根据权利要求21所述的装置,其特征在于,所述装置还包括:
    特征误差生成模块,用于在所述更新迭代模块根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型之前,分别将所述生成样本集和所述第二训练样本集输入至特征网络模型,得到所述生成样本集和所述第二训练样本集之间的特征误差;
    所述更新迭代模块,还用于根据所述生成网络损失系数和所述特征误差更新所述生成网络模型的参数,得到更新后的生成网络模型。
  26. 根据权利要求21所述的装置,其特征在于,所述装置还包括:
    像素误差生成模块,用于在所述更新迭代模块根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型之前,对所述生成样本集和所述第二训练样本集的像素进行分析,得到所述生成样本集和所述第二训练样本集之间的像素误差;
    所述更新迭代模块,还用于根据所述生成网络损失系数和所述像素误差更新所述生成网络模型的参数,得到更新后的生成网络模型。
  27. 根据权利要求21至26中任一项所述的装置,其特征在于,所述更新迭代模块,还用于:
    根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型;
    获取当前迭代次数;
    当所述迭代次数小于预设的迭代次数阈值时,返回至所述将所述第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集;
    当所述迭代次数达到预设的迭代次数阈值时,将所述更新后的生成网络模型作为所述眼镜去除模型。
  28. 根据权利要求20所述的装置,其特征在于,所述图像获取模块,还用于:
    获取人脸图像,所述人脸图像中的人脸上佩戴有眼镜;
    根据眼部在所述人脸图像中的位置,分割出眼部图像,得到所述目标图像;
    所述装置还包括:图像融合模块,用于融合所述人脸图像和所述去眼镜图像,得到去除眼镜后的人脸图像。
  29. 一种人脸识别装置,其特征在于,所述装置包括:
    目标图像获取模块,用于获取待识别人脸图像中的目标图像,所述目标图像中的人脸佩戴有眼镜;
    目标图像输入模块,用于将所述目标图像输入至基于生成对抗网络训练的眼镜去除模型;所述眼镜去除模型包括多个依次连接的卷积-收缩激励网络;
    特征卷积模块,用于通过所述卷积-收缩激励网络的卷积层,得到所述目标图像的各特征通道的特征图;
    特征权重学习模块,用于通过所述卷积-收缩激励网络中的收缩激励层,根据所述特征图得到各特征通道的全局信息,对所述全局信息进行学习,生成所述各特征通道的权重;
    特征加权模块,用于通过所述卷积-收缩激励网络的加权层,根据所述权重分别对所述各特征通道的特征图进行加权处理,生成加权特征图;
    人脸图像生成模块,用于通过所述眼镜去除模型,根据所述加权特征图得到与所述目标图像对应的去眼镜人脸图像;
    匹配模块,用于将所述去眼镜人脸图像与预设人脸图像库进行匹配,根据匹配结果生成人脸识别结果。
  30. 根据权利要求29所述的装置,其特征在于,所述目标图像获取模块,包括:
    人脸图像获取模块,用于获取待识别人脸图像;
    眼镜检测模块,用于对所述待识别人脸图像进行眼镜识别检测;
    目标图像确定模块,用于根据所述眼镜识别检测的结果,得到所述目标图像。
  31. 根据权利要求30所述的装置,其特征在于,所述目标图像确定模块,包括:
    眼部分割模块,用于当检测到所述待识别人脸图像中的人脸佩戴有眼镜时,根据眼部在所述待识别人脸图像中的位置,分割出眼部图像,得到所述目标图像;
    所述人脸图像生成模块,还用于:
    通过所述眼镜去除模型,根据所述加权特征图生成与所述目标图像对应的去眼镜图像;
    融合所述待识别人脸图像和所述去眼镜图像,得到所述去眼镜人脸图像。
  32. 根据权利要求29所述的装置,其特征在于,所述装置还包括模型训练模块,所述模型训练模块,包括:
    样本获取模块,用于获取由第一训练图像组成的第一训练样本集和由第二训练图像组成的第二训练样本集,所述第一训练图像中的对象佩戴有眼镜,所述第二训练图像中的对象未佩戴眼镜;
    生成样本模块,用于将所述第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集,所述生成网络模型包括多个依次连接的卷积-收缩激励网络;
    生成网络损失系数生成模块,用于分别将所述生成样本集和所述第二训练样本集输入至所述生成对抗网络中的判别网络模型,根据所述判别网络模型的输出得到生成网络损失系数;
    更新迭代模块,用于根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型,并返回至所述将所述第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集,直至满足迭代结束条件,将所述更新后的生成网络模型作为所述眼镜去除模型。
  33. 根据权利要求32所述的装置,其特征在于,所述生成网络损失系数生成模块,包括:
    判别网络损失系数生成模块,用于分别将所述生成样本集和所述第二训练样本集输入至所述生成对抗网络中的判别网络模型,根据所述判别网络模型的输出得到判别网络损失系数;
    判别网络更新模块,用于根据所述判别网络损失系数更新所述判别网络模型的参数,得到更新后的判别网络模型;
    生成网络损失系数确定模块,用于将所述生成样本集输入至所述更新后的判别网络模型,根据所述更新后的判别网络模型的输出得到所述生成网络损失系数。
  34. 根据权利要求33所述的装置,其特征在于,所述生成网络损失系数生成模块,还用于:
    分别将所述生成样本集和所述第二训练样本集输入至所述判别网络模型,得到所述生成样本集对应的第一概率和所述第二训练样本集的第二概率;
    根据所述第一概率、所述第二概率和判别网络损失函数,得到所述判别网络损失系数。
  35. 根据权利要求33所述的装置，其特征在于，所述生成网络损失系数生成模块，还用于：
    将所述生成样本集输入至所述更新后的判别网络模型，得到所述生成样本集对应的第三概率；
    根据所述第三概率和生成网络损失函数,得到所述生成网络损失系数。
  36. 根据权利要求32所述的装置,其特征在于,所述装置还包括:
    特征误差生成模块,用于在所述更新迭代模块根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型之前,分别将所述生成样本集和所述第二训练样本集输入至特征网络模型,得到所述生成样本集和所述第二训练样本集之间的特征误差;
    所述更新迭代模块,还用于根据所述生成网络损失系数和所述特征误差更新所述生成网络模型的参数,得到更新后的生成网络模型。
  37. 根据权利要求32所述的装置,其特征在于,所述装置还包括:
    像素误差生成模块,用于在所述更新迭代模块根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型之前,对所述生成样本集和所述第二训练样本集的像素进行分析,得到所述生成样本集和所述第二训练样本集之间的像素误差;
    所述更新迭代模块,还用于根据所述生成网络损失系数和所述像素误差更新所述生成网络模型的参数,得到更新后的生成网络模型。
  38. 根据权利要求32至37中任一项所述的装置,其特征在于,所述更新迭代模块,还用于:
    根据所述生成网络损失系数更新所述生成网络模型的参数,得到更新后的生成网络模型;
    获取当前迭代次数;
    当所述迭代次数小于预设的迭代次数阈值时,返回至所述将所述第一训练样本集输入至生成对抗网络中的生成网络模型,得到去除眼镜后的生成样本集;
    当所述迭代次数达到预设的迭代次数阈值时,将所述更新后的生成网络模型作为所述眼镜去除模型。
  39. 一种计算机设备,包括存储器和处理器,所述存储器存储有计算机程序,其特征在于,所述处理器执行所述计算机程序时实现权利要求1至19中任一项所述方法的步骤。
  40. 一种计算机可读存储介质,其上存储有计算机程序,其特征在于,所述计算机程序被处理器执行时实现权利要求1至19中任一项所述的方法的步骤。
PCT/CN2019/085031 2018-06-11 2019-04-29 图像处理方法、人脸识别方法、装置和计算机设备 WO2019237846A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/991,878 US11403876B2 (en) 2018-06-11 2020-08-12 Image processing method and apparatus, facial recognition method and apparatus, and computer device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810594760.8 2018-06-11
CN201810594760.8A CN108846355B (zh) 2018-06-11 2018-06-11 图像处理方法、人脸识别方法、装置和计算机设备

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/991,878 Continuation US11403876B2 (en) 2018-06-11 2020-08-12 Image processing method and apparatus, facial recognition method and apparatus, and computer device

Publications (1)

Publication Number Publication Date
WO2019237846A1 true WO2019237846A1 (zh) 2019-12-19

Family

ID=64211497

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/085031 WO2019237846A1 (zh) 2018-06-11 2019-04-29 图像处理方法、人脸识别方法、装置和计算机设备

Country Status (3)

Country Link
US (1) US11403876B2 (zh)
CN (1) CN108846355B (zh)
WO (1) WO2019237846A1 (zh)


Families Citing this family (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106407912B (zh) * 2016-08-31 2019-04-02 腾讯科技(深圳)有限公司 一种人脸验证的方法及装置
CN108846355B (zh) * 2018-06-11 2020-04-28 腾讯科技(深圳)有限公司 图像处理方法、人脸识别方法、装置和计算机设备
CN111274855B (zh) * 2018-12-05 2024-03-26 北京猎户星空科技有限公司 图像处理方法、装置、机器学习模型训练方法及装置
CN111414922B (zh) * 2019-01-07 2022-11-15 阿里巴巴集团控股有限公司 特征提取方法、图像处理方法、模型训练方法及装置
CN110021052B (zh) * 2019-04-11 2023-05-30 北京百度网讯科技有限公司 用于生成眼底图像生成模型的方法和装置
CN110135301B (zh) * 2019-04-30 2022-02-22 百度在线网络技术(北京)有限公司 交通牌识别方法、装置、设备和计算机可读介质
US11494616B2 (en) * 2019-05-09 2022-11-08 Shenzhen Malong Technologies Co., Ltd. Decoupling category-wise independence and relevance with self-attention for multi-label image classification
CN111986278B (zh) 2019-05-22 2024-02-06 富士通株式会社 图像编码装置、概率模型生成装置和图像压缩系统
CN110008940B (zh) * 2019-06-04 2020-02-11 深兰人工智能芯片研究院(江苏)有限公司 一种图像中移除目标物体的方法、装置及电子设备
CN110321805B (zh) * 2019-06-12 2021-08-10 华中科技大学 一种基于时序关系推理的动态表情识别方法
CN110569826B (zh) * 2019-09-18 2022-05-24 深圳市捷顺科技实业股份有限公司 一种人脸识别方法、装置、设备及介质
CN110675312B (zh) * 2019-09-24 2023-08-29 腾讯科技(深圳)有限公司 图像数据处理方法、装置、计算机设备以及存储介质
CN110752028A (zh) * 2019-10-21 2020-02-04 腾讯科技(深圳)有限公司 一种图像处理方法、装置、设备以及存储介质
KR20210059060A (ko) * 2019-11-13 2021-05-25 삼성디스플레이 주식회사 검출 장치
CN112836554A (zh) * 2019-11-25 2021-05-25 广东博智林机器人有限公司 图像校验模型的构建方法、图像校验方法和装置
CN110991325A (zh) * 2019-11-29 2020-04-10 腾讯科技(深圳)有限公司 一种模型训练的方法、图像识别的方法以及相关装置
CN113688840A (zh) * 2020-05-19 2021-11-23 武汉Tcl集团工业研究院有限公司 图像处理模型的生成方法、处理方法、存储介质及终端
CN111556278B (zh) * 2020-05-21 2022-02-01 腾讯科技(深圳)有限公司 一种视频处理的方法、视频展示的方法、装置及存储介质
CN111709497B (zh) * 2020-08-20 2020-11-20 腾讯科技(深圳)有限公司 一种信息处理方法、装置及计算机可读存储介质
US11263436B1 (en) 2020-08-27 2022-03-01 The Code Dating LLC Systems and methods for matching facial images to reference images
US11482041B2 (en) * 2020-10-21 2022-10-25 Adobe Inc. Identity obfuscation in images utilizing synthesized faces
CN112711984B (zh) * 2020-12-09 2022-04-12 北京航空航天大学 注视点定位方法、装置和电子设备
CN112561785B (zh) * 2020-12-21 2021-11-16 东华大学 基于风格迁移的丝绸文物图像数据扩充方法
CN112836623B (zh) * 2021-01-29 2024-04-16 北京农业智能装备技术研究中心 设施番茄农事决策辅助方法及装置
CN112991171B (zh) * 2021-03-08 2023-07-28 Oppo广东移动通信有限公司 图像处理方法、装置、电子设备及存储介质
CN113052068B (zh) * 2021-03-24 2024-04-30 深圳威富云数科技有限公司 图像处理方法、装置、计算机设备和存储介质
CN113191193B (zh) * 2021-03-30 2023-08-04 河海大学 一种基于图和格子的卷积方法
CN113205131A (zh) * 2021-04-28 2021-08-03 阿波罗智联(北京)科技有限公司 图像数据的处理方法、装置、路侧设备和云控平台
CN113033518B (zh) * 2021-05-25 2021-08-31 北京中科闻歌科技股份有限公司 图像检测方法、装置、电子设备及存储介质
CN113382243B (zh) * 2021-06-11 2022-11-01 上海壁仞智能科技有限公司 图像压缩方法、装置、电子设备和存储介质
CN113255831A (zh) * 2021-06-23 2021-08-13 长沙海信智能系统研究院有限公司 样本处理方法、装置、设备及计算机存储介质
CN113421191A (zh) * 2021-06-28 2021-09-21 Oppo广东移动通信有限公司 图像处理方法、装置、设备及存储介质
CN113486807B (zh) * 2021-07-08 2024-02-27 网易(杭州)网络有限公司 脸部的检测模型训练方法、识别方法、装置、介质和设备
CN113688873B (zh) * 2021-07-28 2023-08-22 华东师范大学 一种具有直观交互能力的矢量路网生成方法
CN114301850B (zh) * 2021-12-03 2024-03-15 成都中科微信息技术研究院有限公司 一种基于生成对抗网络与模型压缩的军用通信加密流量识别方法
CN114495222A (zh) * 2022-01-20 2022-05-13 杭州登虹科技有限公司 图像处理模型的构建方法与系统、图像处理方法及系统
CN114627005B (zh) * 2022-02-16 2024-04-12 武汉大学 一种雨密度分类引导的双阶段单幅图像去雨方法
CN115082288B (zh) * 2022-05-16 2023-04-07 西安电子科技大学 基于偏微分方程启发的sar图像到光学图像的转换方法
CN114758136B (zh) * 2022-06-13 2022-10-18 深圳比特微电子科技有限公司 目标去除模型建立方法、装置及可读存储介质
CN115272136B (zh) * 2022-09-27 2023-05-05 广州卓腾科技有限公司 基于大数据的证件照眼镜反光消除方法、装置、介质及设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107085716A (zh) * 2017-05-24 2017-08-22 复旦大学 基于多任务生成对抗网络的跨视角步态识别方法
CN107220600A (zh) * 2017-05-17 2017-09-29 清华大学深圳研究生院 一种基于深度学习的图片生成方法及生成对抗网络
CN107392973A (zh) * 2017-06-06 2017-11-24 中国科学院自动化研究所 像素级手写体汉字自动生成方法、存储设备、处理装置
CN107679483A (zh) * 2017-09-27 2018-02-09 北京小米移动软件有限公司 号牌识别方法及装置
CN107945118A (zh) * 2017-10-30 2018-04-20 南京邮电大学 一种基于生成式对抗网络的人脸图像修复方法
CN108846355A (zh) * 2018-06-11 2018-11-20 腾讯科技(深圳)有限公司 图像处理方法、人脸识别方法、装置和计算机设备

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8363951B2 (en) * 2007-03-05 2013-01-29 DigitalOptics Corporation Europe Limited Face recognition training method and apparatus
CN109934062A (zh) * 2017-12-18 2019-06-25 比亚迪股份有限公司 眼镜摘除模型的训练方法、人脸识别方法、装置和设备


Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111241614A (zh) * 2019-12-30 2020-06-05 浙江大学 基于条件生成对抗网络模型的工程结构荷载反演方法
CN111241614B (zh) * 2019-12-30 2022-08-23 浙江大学 基于条件生成对抗网络模型的工程结构荷载反演方法
CN111210409B (zh) * 2019-12-30 2022-08-23 浙江大学 一种基于条件生成对抗网络的结构损伤识别方法
CN111210409A (zh) * 2019-12-30 2020-05-29 浙江大学 一种基于条件生成对抗网络的结构损伤识别方法
CN111241985A (zh) * 2020-01-08 2020-06-05 腾讯科技(深圳)有限公司 一种视频内容识别方法、装置、存储介质、以及电子设备
CN111241985B (zh) * 2020-01-08 2022-09-09 腾讯科技(深圳)有限公司 一种视频内容识别方法、装置、存储介质、以及电子设备
CN111325317B (zh) * 2020-01-21 2023-12-12 北京空间机电研究所 一种基于生成对抗网络的波前像差确定方法及装置
CN111325317A (zh) * 2020-01-21 2020-06-23 北京空间机电研究所 一种基于生成对抗网络的波前像差确定方法及装置
CN111340137A (zh) * 2020-03-26 2020-06-26 上海眼控科技股份有限公司 图像识别方法、装置及存储介质
CN111861952A (zh) * 2020-06-05 2020-10-30 北京嘀嘀无限科技发展有限公司 一种图像生成模型、卡号识别模型的训练方法及装置
CN111861952B (zh) * 2020-06-05 2023-11-24 北京嘀嘀无限科技发展有限公司 一种图像生成模型、卡号识别模型的训练方法及装置
CN112102193A (zh) * 2020-09-15 2020-12-18 北京金山云网络技术有限公司 图像增强网络的训练方法、图像处理方法及相关设备
CN112102193B (zh) * 2020-09-15 2024-01-23 北京金山云网络技术有限公司 图像增强网络的训练方法、图像处理方法及相关设备
CN112215180A (zh) * 2020-10-20 2021-01-12 腾讯科技(深圳)有限公司 一种活体检测方法及装置
CN112215180B (zh) * 2020-10-20 2024-05-07 腾讯科技(深圳)有限公司 一种活体检测方法及装置
CN112215840A (zh) * 2020-10-30 2021-01-12 上海商汤临港智能科技有限公司 图像检测、行驶控制方法、装置、电子设备及存储介质
CN112465115A (zh) * 2020-11-25 2021-03-09 科大讯飞股份有限公司 Gan网络压缩方法、装置、设备及存储介质
CN112465115B (zh) * 2020-11-25 2024-05-31 科大讯飞股份有限公司 Gan网络压缩方法、装置、设备及存储介质
CN112364827A (zh) * 2020-11-30 2021-02-12 腾讯科技(深圳)有限公司 人脸识别方法、装置、计算机设备和存储介质
CN112364827B (zh) * 2020-11-30 2023-11-10 腾讯科技(深圳)有限公司 人脸识别方法、装置、计算机设备和存储介质
CN113191404A (zh) * 2021-04-16 2021-07-30 深圳数联天下智能科技有限公司 发型迁移模型训练方法、发型迁移方法及相关装置
CN113191404B (zh) * 2021-04-16 2023-12-12 深圳数联天下智能科技有限公司 发型迁移模型训练方法、发型迁移方法及相关装置
CN113820693B (zh) * 2021-09-20 2023-06-23 西北工业大学 基于生成对抗网络的均匀线列阵阵元失效校准方法
CN113820693A (zh) * 2021-09-20 2021-12-21 西北工业大学 基于生成对抗网络的均匀线列阵阵元失效校准方法
CN113688799A (zh) * 2021-09-30 2021-11-23 合肥工业大学 一种基于改进深度卷积生成对抗网络的人脸表情识别方法
CN113888443A (zh) * 2021-10-21 2022-01-04 福州大学 一种基于自适应层实例归一化gan的演唱会拍摄方法
CN116863279B (zh) * 2023-09-01 2023-11-21 南京理工大学 用于移动端模型轻量化的基于可解释指导的模型蒸馏方法
CN116863279A (zh) * 2023-09-01 2023-10-10 南京理工大学 用于移动端模型轻量化的基于可解释指导的模型蒸馏方法

Also Published As

Publication number Publication date
CN108846355A (zh) 2018-11-20
US11403876B2 (en) 2022-08-02
US20200372243A1 (en) 2020-11-26
CN108846355B (zh) 2020-04-28

Similar Documents

Publication Publication Date Title
WO2019237846A1 (zh) 图像处理方法、人脸识别方法、装置和计算机设备
CN110569721B (zh) 识别模型训练方法、图像识别方法、装置、设备及介质
CN110399799B (zh) 图像识别和神经网络模型的训练方法、装置和系统
CN110222573B (zh) 人脸识别方法、装置、计算机设备及存储介质
CN110503076B (zh) 基于人工智能的视频分类方法、装置、设备和介质
EP3975039A1 (en) Masked face recognition
CN110765860A (zh) 摔倒判定方法、装置、计算机设备及存储介质
WO2020098257A1 (zh) 一种图像分类方法、装置及计算机可读存储介质
CN108985190B (zh) 目标识别方法和装置、电子设备、存储介质
CN110489951A (zh) 风险识别的方法、装置、计算机设备和存储介质
CN111709313B (zh) 基于局部和通道组合特征的行人重识别方法
CN111611873A (zh) 人脸替换检测方法及装置、电子设备、计算机存储介质
CN111191568A (zh) 翻拍图像识别方法、装置、设备及介质
CN113255557B (zh) 一种基于深度学习的视频人群情绪分析方法及系统
CN116580257A (zh) 特征融合模型训练及样本检索方法、装置和计算机设备
US20220327189A1 (en) Personalized biometric anti-spoofing protection using machine learning and enrollment data
CN112232397A (zh) 图像分类模型的知识蒸馏方法、装置和计算机设备
CN112232971A (zh) 反欺诈检测方法、装置、计算机设备和存储介质
Jain et al. An efficient image forgery detection using biorthogonal wavelet transform and improved relevance vector machine
CN113205002A (zh) 非受限视频监控的低清人脸识别方法、装置、设备及介质
CN112001285B (zh) 一种美颜图像的处理方法、装置、终端和介质
CN110717407A (zh) 基于唇语密码的人脸识别方法、装置及存储介质
CN116975828A (zh) 一种人脸融合攻击检测方法、装置、设备及存储介质
CN113762249A (zh) 图像攻击检测、图像攻击检测模型训练方法和装置
CN115797990A (zh) 图像分类、图像处理方法、装置和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19818906

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19818906

Country of ref document: EP

Kind code of ref document: A1