CN113435334B - Small target face recognition method based on deep learning - Google Patents

Small target face recognition method based on deep learning

Info

Publication number
CN113435334B
CN113435334B CN202110718863.2A CN202110718863A CN113435334B CN 113435334 B CN113435334 B CN 113435334B CN 202110718863 A CN202110718863 A CN 202110718863A CN 113435334 B CN113435334 B CN 113435334B
Authority
CN
China
Prior art keywords
face image
teacher
pixel
network part
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110718863.2A
Other languages
Chinese (zh)
Other versions
CN113435334A (en)
Inventor
宋尧哲
童官军
李宝清
袁晓兵
吴萌萌
舒子婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Institute of Microsystem and Information Technology of CAS
Original Assignee
Shanghai Institute of Microsystem and Information Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Institute of Microsystem and Information Technology of CAS
Priority to CN202110718863.2A
Publication of CN113435334A
Application granted
Publication of CN113435334B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a small-target face recognition method based on deep learning, comprising the following steps: constructing a high-to-low generative adversarial network, and inputting a first pixel face image into the trained generative adversarial network to obtain a second pixel face image close to a real scene; constructing a teacher-student network, training it with the first pixel face image and the second pixel face image, and inputting the second pixel face image to be recognized into the trained teacher-student network to obtain a recognition result. The invention improves the recognition of small-target face images.

Description

Small target face recognition method based on deep learning
Technical Field
The invention relates to the technical field of face recognition, in particular to a small-target face recognition method based on deep learning.
Background
Driven by advances in deep learning, face recognition has developed rapidly in recent years. The best algorithms for frontal, high-pixel face images now exceed 99% accuracy. However, most of these algorithms only work in constrained settings such as identity-document verification. One active research topic is the recognition of small-target face images captured in real environments. Such images mainly arise in several ways: blur caused by defocus or by relative motion between the subject and the lens; a large camera-to-face distance combined with a camera sensor of low spatial resolution; low-quality compression settings, interlacing, or similar conditions; and increased image noise (for example, under reduced illumination at acquisition time).
In real environments such as surveillance video, algorithms that perform well on frontal, high-pixel face images often suffer a significant loss in performance. The main reason is that the model is trained mostly on high-pixel face images, so applying it directly to face images from surveillance video causes a domain shift problem.
At present there are two main approaches to small-target face recognition: super-resolution (see Fig. 1) and the common feature subspace (see Fig. 2). The super-resolution approach first converts the small-target face image into a high-pixel face image and then performs face recognition. The common feature subspace approach projects the high-pixel face image and the small-target face image into the same space and decides from their distance whether they belong to the same identity.
The super-resolution approach tends to enhance visual quality during reconstruction and can introduce artificial features that harm identity recognition. Current common feature subspace methods train on high-pixel and low-pixel face images simultaneously, which improves recognition of low-pixel face images but degrades recognition of high-pixel face images.
In addition, most current small-target face recognition algorithms build high/low-pixel image pairs by direct downsampling. The model is therefore trained on downsampled face images but tested on small-target face images from real environments, so the domain shift problem remains, and the downsampled face images become hard samples during training.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a small-target face recognition method based on deep learning that improves the recognition of small-target face images.
The technical solution adopted to solve this problem is a small-target face recognition method based on deep learning, comprising the following steps:
(1) Constructing a high-to-low generative adversarial network, and inputting a first pixel face image into the trained generative adversarial network to obtain a second pixel face image close to a real scene, wherein the pixel resolution of the first pixel face image is higher than that of the second pixel face image;
(2) Constructing a teacher-student network, and training the teacher-student network with the first pixel face image and the second pixel face image to obtain a trained teacher-student network, wherein the teacher-student network comprises a teacher network part and a student network part, and the teacher network part trains the student network part by knowledge distillation, so that the student network part attains an accuracy close to that of the teacher network part;
(3) Inputting the second pixel face image to be recognized into the trained teacher-student network to obtain a recognition result.
The generative adversarial network in step (1) comprises a generator and a discriminator. The generator generates the second pixel face image from the first pixel face image and a random vector; the discriminator learns, through deep learning, to judge whether an image input to it is a small-target face image collected in a natural surveillance environment.
The loss function of the generative adversarial network in step (1) consists of a pixel-level loss and a discriminator loss.
The teacher network part and the student network part in step (2) use the same model.
The teacher network part in step (2) is pre-trained on the first pixel face image and its parameters are then fixed; the student network part takes a mixed face image as input, the mixed face image being the corresponding first pixel face image and second pixel face image; the first pixel face image that has passed through the teacher network part and the mixed face image that has passed through the student network part are input together into a trainable classification layer for training, and the student network part attains an accuracy close to that of the teacher network part by minimizing the distance between the outputs of the teacher network part and the student network part and by sharing the parameters of the trainable classification layer.
The loss function of the teacher-student network is the sum of the distance between the vector $q_{pre}$ and the vector $p_{pre}$ and the distance between the vector $q_{pre}$ and the vector $p_t$, where $q_{pre}$ is the vector obtained when the first pixel face image passes through the teacher network part with fixed parameters, and $p_{pre}$ and $p_t$ are the vectors obtained when the first pixel face image that has passed through the teacher network part and the mixed face image that has passed through the student network part are input together into the trainable classification layer.
The teacher-student network in step (2) is cascaded with the generative adversarial network, and the two are trained simultaneously.
Advantageous effects
By adopting the above technical solution, the invention has the following advantages and positive effects compared with the prior art. The invention establishes a high-to-low generative adversarial network that learns to convert high-pixel face images into low-pixel small-target face images. This replaces the direct downsampling method in common use and removes the hard-sample and domain shift problems encountered during training: the subsequent model learns from small-target face images resembling the real environment and is tested in the same domain, which raises its recognition rate. In addition, by minimizing the cosine distance between the outputs of the teacher and student networks and sharing the Softmax layer parameters between the two networks, the teacher-student network learns a common feature subspace across pixel resolutions, which addresses the drop in recognition rate on high-pixel face images seen in current small-target face recognition algorithms. The invention offers a high recognition rate, strong environmental adaptability, and reliable performance.
Drawings
Fig. 1 is a flowchart of a small-target face recognition method based on the conventional super-resolution method;
Fig. 2 is a flowchart of a small-target face recognition method based on the conventional common feature subspace method;
Fig. 3 is a flowchart of an embodiment of the present invention;
Fig. 4 is a flowchart of the GAN network constructed in an embodiment of the present invention;
Fig. 5 is a flowchart of the teacher-student network constructed in an embodiment of the present invention.
Detailed Description
The invention will be further illustrated with reference to specific examples. It is to be understood that these examples are illustrative of the present invention and are not intended to limit the scope of the present invention. Further, it is understood that various changes and modifications may be made by those skilled in the art after reading the teachings of the present invention, and such equivalents are intended to fall within the scope of the claims appended hereto.
The embodiment of the invention relates to a small-target face recognition method based on deep learning which, as shown in Fig. 3, comprises the following steps: constructing a high-to-low generative adversarial network (GAN network for short), and inputting a first pixel face image into the trained GAN network to obtain a second pixel face image close to a real scene, wherein the pixel resolution of the first pixel face image is higher than that of the second pixel face image; constructing a teacher-student network, and training the teacher-student network with the first pixel face image and the second pixel face image to obtain a trained teacher-student network, wherein the teacher-student network comprises a teacher network part and a student network part, and the teacher network part trains the student network part by knowledge distillation, so that the student network part attains an accuracy close to that of the teacher network part; and inputting the second pixel face image to be recognized into the trained teacher-student network to obtain a recognition result.
In this embodiment, the CasiaWebFace training set and the TinyFace training set are used for training, and the TinyFace test set is used for testing. The CasiaWebFace training images are cropped to 64 × 64.
A high-to-low-pixel GAN network is established and trained to produce small-target face images close to a real scene; the network flow is shown in Fig. 4. The GAN network of this embodiment includes a generator and a discriminator. The generator generates the second pixel face image from the first pixel face image and a random vector; the discriminator learns, through deep learning, to judge whether an image input to it is a small-target face image collected in a natural surveillance environment.
Specifically, the cropped training set and a random vector are input into the generator. The random vector is a 1 × 64 Gaussian random vector that passes through a fully connected layer, is resized to 64 × 64, and is then concatenated with the input picture. The generator outputs the corresponding 16 × 16 pixel small-target face image, which is then input into the discriminator together with the TinyFace training set, so that the discriminator learns to distinguish small-target face images from the real scene.
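The way the generator input is assembled can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the patent text does not say how the noise is "stitched" to the picture, so channel-wise concatenation with a 3-channel image is assumed here, and the names `noise_plane` and `build_generator_input` as well as the randomly initialised layer parameters are hypothetical.

```python
import random

def noise_plane(z, weights, bias, side=64):
    """Map a noise vector to a side x side plane with one fully connected layer."""
    flat = [sum(w * v for w, v in zip(row, z)) + b for row, b in zip(weights, bias)]
    # Reshape the flat output (side * side values) into rows of length `side`.
    return [flat[r * side:(r + 1) * side] for r in range(side)]

def build_generator_input(image, z, weights, bias):
    """Append the noise plane as an extra channel (assumed channel-wise concatenation)."""
    return image + [noise_plane(z, weights, bias)]

rng = random.Random(0)
side, z_dim = 64, 64

# 1 x 64 Gaussian random vector, as described in the embodiment.
z = [rng.gauss(0.0, 1.0) for _ in range(z_dim)]

# Hypothetical fully connected layer parameters: (side * side) x z_dim weights.
weights = [[rng.gauss(0.0, 0.02) for _ in range(z_dim)] for _ in range(side * side)]
bias = [0.0] * (side * side)

# Dummy 3-channel 64 x 64 face crop standing in for a training image.
image = [[[0.5] * side for _ in range(side)] for _ in range(3)]

x = build_generator_input(image, z, weights, bias)
print(len(x), len(x[0]), len(x[0][0]))  # channels, height, width
```

In a real implementation the fully connected layer would be trained jointly with the rest of the generator rather than left at its random initialisation.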
The loss function of the GAN network is a weighted sum of a pixel-level loss and a discriminator loss; in this embodiment α is set to 1 and β is set to 0.05.
$$l_{GAN} = \alpha\, l_{pixel} + \beta\, l_{g} \tag{1}$$
The loss of the discriminator is as follows:
where $P_r$ denotes small-target face images in the natural environment, i.e. the TinyFace training set; $P_g$ denotes pictures produced by the generator; $D(x)$ is the feature the discriminator extracts from a small-target face image in the natural environment; the corresponding term for a generated picture is the feature the discriminator extracts from that picture; and $E[\cdot]$ denotes expectation.
The pixel level loss is:
where $W$ and $H$ are the width and height of the picture produced by the generator. $F$ is a function that converts the high-pixel face image $I_{hr}$ (64 × 64) to the same size as the picture produced by the generator; this embodiment uses average pooling for $F$. $g_0$ is the mapping that gives the high-pixel face image $I_{hr}$ the same probability distribution as small-target face images in the natural environment.
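The pooling step and the weighted loss of equation (1) can be illustrated with a small sketch. Several things here are assumptions rather than the patent's stated method: the exact form of the pixel-level loss is not reproduced in the text, so a mean squared difference over the W × H generated picture is assumed, single-channel images are used for brevity, and the function names are hypothetical.

```python
def average_pool(img, k):
    """Non-overlapping k x k average pooling on a 2-D list of floats."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(img[r + dr][c + dc] for dr in range(k) for dc in range(k)) / (k * k)
            for c in range(0, w, k)
        ]
        for r in range(0, h, k)
    ]

def pixel_loss(hr, generated, k=4):
    """Assumed pixel-level loss: mean squared difference between F(I_hr) and the generated picture."""
    target = average_pool(hr, k)  # F: 64 x 64 -> 16 x 16 when k = 4
    h, w = len(generated), len(generated[0])
    total = sum((target[r][c] - generated[r][c]) ** 2 for r in range(h) for c in range(w))
    return total / (w * h)

def gan_loss(l_pixel, l_g, alpha=1.0, beta=0.05):
    """Weighted sum from equation (1), with the embodiment's alpha and beta as defaults."""
    return alpha * l_pixel + beta * l_g

# Toy data: a 64 x 64 checkerboard pools to a constant 0.5 image.
hr = [[float((r + c) % 2) for c in range(64)] for r in range(64)]
gen = [[0.5] * 16 for _ in range(16)]

pooled = average_pool(hr, 4)
print(len(pooled), len(pooled[0]))      # 16 16
print(pixel_loss(hr, gen))              # 0.0 for this toy pair
print(gan_loss(pixel_loss(hr, gen), 2.0))
```

The small β keeps the adversarial term from overwhelming the pixel-level term, which matches the embodiment's choice of α = 1 and β = 0.05.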
Step three: a teacher-student network is established; the network flow is shown in Fig. 5. The network comprises a teacher network part and a student network part. The teacher network part is pre-trained on the first pixel face image and its parameters are then fixed; the student network part takes a mixed face image as input, the mixed face image being the corresponding first pixel face image and second pixel face image; the first pixel face image that has passed through the teacher network part and the mixed face image that has passed through the student network part are input together into a trainable classification layer for training, and the student network part attains an accuracy close to that of the teacher network part by minimizing the distance between the outputs of the teacher network part and the student network part and by sharing the parameters of the trainable classification layer.
In this embodiment, both the teacher network part and the student network part are ResNet34. The weights of the teacher network part and of the fixed Softmax layer are pre-trained on the CasiaWebFace training set; these weights stay unchanged during training, so the teacher network part serves only as a feature extractor.
During training, 64 × 64 high-pixel face images are input to the teacher network part, and the corresponding mixed face images, consisting of 16 × 16 low-pixel and 64 × 64 high-pixel face images, are input to the student network part; the 16 × 16 low-pixel face images are generated by the high-to-low GAN network.
After passing through the teacher network part, the high-pixel face image is fed to the fixed Softmax layer to obtain the vector $q_{pre}$. The high-pixel face image that has passed through the teacher network part and the mixed face image that has passed through the student network part are then input together into a trainable Softmax layer to obtain the vectors $p_{pre}$ and $p_t$, respectively.
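How the three vectors relate to the fixed and the trainable Softmax layers can be shown with toy numbers. This is a minimal sketch, not the embodiment's ResNet34 pipeline: the feature vectors and weight matrices below are made-up placeholders, the layers are assumed to be a bias-free linear map followed by softmax, and all names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(weights, feature):
    """Assumed Softmax layer: linear map (no bias) followed by softmax."""
    logits = [sum(w * v for w, v in zip(row, feature)) for row in weights]
    return softmax(logits)

# Placeholder features: teacher output for a high-pixel image (frozen network)
# and student output for a mixed image (trainable network).
teacher_feat = [1.0, 0.0, 0.5]
student_feat = [0.8, 0.1, 0.4]

# Placeholder layer weights for a 2-identity toy problem.
w_fixed = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # pre-trained, never updated
w_shared = [[0.5, 0.0, 0.5], [0.0, 0.5, 0.0]]  # trainable, shared by both branches

q_pre = classify(w_fixed, teacher_feat)    # teacher features -> fixed layer
p_pre = classify(w_shared, teacher_feat)   # teacher features -> trainable layer
p_t = classify(w_shared, student_feat)     # student features -> trainable layer

print(q_pre, p_pre, p_t)
```

Because `p_pre` and `p_t` come from the same trainable layer, any update to that layer affects both branches, which is the parameter sharing the embodiment describes.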
The loss function of the teacher-student network is the sum of the distance between $q_{pre}$ and $p_{pre}$ and the distance between $q_{pre}$ and $p_t$. This embodiment measures distance with a cosine distance function, denoted $\mathrm{sim}(\cdot,\cdot)$.
The teacher-student network loss function is: $$l_{ts} = \mathrm{sim}(q_{pre}, p_{pre}) + \mathrm{sim}(q_{pre}, p_t)$$
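A small numeric sketch of this loss follows. The definition of the cosine distance function is not reproduced in the text, so the common form of one minus cosine similarity is assumed here; the function names are hypothetical.

```python
import math

def cosine_distance(a, b):
    """Assumed cosine distance: 1 - (a . b) / (|a| |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def teacher_student_loss(q_pre, p_pre, p_t):
    """Sum of the two distances to the frozen teacher output q_pre."""
    return cosine_distance(q_pre, p_pre) + cosine_distance(q_pre, p_t)

# Toy vectors: p_pre matches q_pre exactly, p_t is orthogonal to it.
q_pre = [1.0, 0.0]
p_pre = [1.0, 0.0]
p_t = [0.0, 1.0]

print(teacher_student_loss(q_pre, p_pre, p_t))  # 1.0 (0.0 + 1.0)
```

Under this assumption the loss reaches zero only when both trainable outputs point in the same direction as the frozen teacher output.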
During training, the teacher-student network is cascaded with the GAN network of step two, and the two are trained simultaneously.
After the GAN network and the teacher-student network have been trained, only the small-target face image is input to the student network part, and the original-pixel image to be compared is input to the teacher network part, which yields the final recognition result.
In summary, the invention establishes a high-to-low generative adversarial network that learns to convert high-pixel face images into low-pixel small-target face images. This replaces the direct downsampling method in common use and removes the hard-sample and domain shift problems encountered during training: the subsequent model learns from small-target face images resembling the real environment and is tested in the same domain, which raises its recognition rate. In addition, by minimizing the cosine distance between the outputs of the teacher and student networks and sharing the Softmax layer parameters between the two networks, the teacher-student network learns a common feature subspace across pixel resolutions, which addresses the drop in recognition rate on high-pixel face images seen in current small-target face recognition algorithms. The invention offers a high recognition rate, strong environmental adaptability, and reliable performance.
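The comparison at inference time can be pictured as a nearest-neighbour search. The patent text does not specify how the student output for the probe is compared with the teacher output for the reference images, so this sketch assumes cosine similarity over output vectors; the gallery contents, vector values, and function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(probe_vector, gallery):
    """Return (identity, score) of the gallery entry most similar to the probe."""
    scored = ((name, cosine_similarity(probe_vector, vec)) for name, vec in gallery.items())
    return max(scored, key=lambda pair: pair[1])

# Placeholder teacher outputs for original-pixel reference images.
gallery = {
    "person_a": [0.9, 0.1, 0.0],
    "person_b": [0.0, 0.2, 0.9],
}

# Placeholder student output for a small-target probe image.
probe = [0.8, 0.2, 0.1]

name, score = identify(probe, gallery)
print(name, round(score, 3))
```

A deployed system would typically also apply a similarity threshold so that probes matching no enrolled identity can be rejected.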

Claims (5)

1. A small-target face recognition method based on deep learning, characterized by comprising the following steps:
(1) Constructing a high-to-low generative adversarial network, and inputting a first pixel face image into the trained generative adversarial network to obtain a second pixel face image close to a real scene, wherein the pixel resolution of the first pixel face image is higher than that of the second pixel face image;
(2) Constructing a teacher-student network, and training the teacher-student network with the first pixel face image and the second pixel face image to obtain a trained teacher-student network, wherein the teacher-student network comprises a teacher network part and a student network part, and the teacher network part trains the student network part by knowledge distillation, so that the student network part attains an accuracy close to that of the teacher network part; the teacher network part is pre-trained on the first pixel face image and its parameters are then fixed; the student network part takes a mixed face image as input, the mixed face image being the corresponding first pixel face image and second pixel face image; the first pixel face image that has passed through the teacher network part and the mixed face image that has passed through the student network part are input together into a trainable classification layer for training, and the student network part attains an accuracy close to that of the teacher network part by minimizing the distance between the outputs of the teacher network part and the student network part and by sharing the parameters of the trainable classification layer; the loss function of the teacher-student network is the sum of the distance between the vector $q_{pre}$ and the vector $p_{pre}$ and the distance between the vector $q_{pre}$ and the vector $p_t$, wherein $q_{pre}$ is the vector obtained when the first pixel face image passes through the teacher network part with fixed parameters, and $p_{pre}$ and $p_t$ are the vectors obtained when the first pixel face image that has passed through the teacher network part and the mixed face image that has passed through the student network part are input together into the trainable classification layer;
(3) Inputting the second pixel face image to be recognized into the trained teacher-student network to obtain a recognition result.
2. The small-target face recognition method based on deep learning of claim 1, wherein the generative adversarial network in step (1) includes a generator and a discriminator, the generator generating the second pixel face image from the first pixel face image and a random vector; and the discriminator judges, through deep learning, whether an image input to it is a small-target face image collected in a natural surveillance environment.
3. The small-target face recognition method based on deep learning of claim 2, wherein the loss function of the generative adversarial network in step (1) consists of a pixel-level loss and a discriminator loss.
4. The small-target face recognition method based on deep learning of claim 1, wherein the teacher network part and the student network part in step (2) use the same model.
5. The small-target face recognition method based on deep learning of claim 1, wherein the teacher-student network in step (2) is cascaded with the generative adversarial network and the two are trained simultaneously.
CN202110718863.2A 2021-06-28 2021-06-28 Small target face recognition method based on deep learning Active CN113435334B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110718863.2A CN113435334B (en) 2021-06-28 2021-06-28 Small target face recognition method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110718863.2A CN113435334B (en) 2021-06-28 2021-06-28 Small target face recognition method based on deep learning

Publications (2)

Publication Number Publication Date
CN113435334A (en) 2021-09-24
CN113435334B (en) 2024-02-27

Family

ID=77754911

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110718863.2A Active CN113435334B (en) 2021-06-28 2021-06-28 Small target face recognition method based on deep learning

Country Status (1)

Country Link
CN (1) CN113435334B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108805840A (en) * 2018-06-11 2018-11-13 Oppo(重庆)智能科技有限公司 Method, apparatus, terminal and the computer readable storage medium of image denoising
CN109063565A (en) * 2018-06-29 2018-12-21 中国科学院信息工程研究所 A kind of low resolution face identification method and device
CN109145958A (en) * 2018-07-27 2019-01-04 哈尔滨工业大学 A kind of real scene wisp detection method generating confrontation network based on multitask
CN110689482A (en) * 2019-09-18 2020-01-14 中国科学技术大学 Face super-resolution method based on supervised pixel-by-pixel generation countermeasure network
CN111160533A (en) * 2019-12-31 2020-05-15 中山大学 Neural network acceleration method based on cross-resolution knowledge distillation
CN111340708A (en) * 2020-03-02 2020-06-26 北京理工大学 Method for rapidly generating high-resolution complete face image according to prior information
CN111461226A (en) * 2020-04-01 2020-07-28 深圳前海微众银行股份有限公司 Countermeasure sample generation method, device, terminal and readable storage medium
CN112163998A (en) * 2020-09-24 2021-01-01 肇庆市博士芯电子科技有限公司 Single-image super-resolution analysis method matched with natural degradation conditions
CN112368719A (en) * 2018-05-17 2021-02-12 奇跃公司 Gradient antagonism training of neural networks

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112368719A (en) * 2018-05-17 2021-02-12 奇跃公司 Gradient antagonism training of neural networks
CN108805840A (en) * 2018-06-11 2018-11-13 Oppo(重庆)智能科技有限公司 Method, apparatus, terminal and the computer readable storage medium of image denoising
CN109063565A (en) * 2018-06-29 2018-12-21 中国科学院信息工程研究所 A kind of low resolution face identification method and device
CN109145958A (en) * 2018-07-27 2019-01-04 哈尔滨工业大学 A kind of real scene wisp detection method generating confrontation network based on multitask
CN110689482A (en) * 2019-09-18 2020-01-14 中国科学技术大学 Face super-resolution method based on supervised pixel-by-pixel generation countermeasure network
CN111160533A (en) * 2019-12-31 2020-05-15 中山大学 Neural network acceleration method based on cross-resolution knowledge distillation
CN111340708A (en) * 2020-03-02 2020-06-26 北京理工大学 Method for rapidly generating high-resolution complete face image according to prior information
CN111461226A (en) * 2020-04-01 2020-07-28 深圳前海微众银行股份有限公司 Countermeasure sample generation method, device, terminal and readable storage medium
CN112163998A (en) * 2020-09-24 2021-01-01 肇庆市博士芯电子科技有限公司 Single-image super-resolution analysis method matched with natural degradation conditions

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Knowledge Distillation of Generative Adversarial Networks;Wei Wang, Baohua Zhang;《2021 Data Compression Conference》;全文 *

Also Published As

Publication number Publication date
CN113435334A (en) 2021-09-24

Similar Documents

Publication Publication Date Title
CN108986050B (en) Image and video enhancement method based on multi-branch convolutional neural network
CN113688723B (en) Infrared image pedestrian target detection method based on improved YOLOv5
CN109635661B (en) Far-field wireless charging receiving target detection method based on convolutional neural network
CN107123091A (en) A kind of near-infrared face image super-resolution reconstruction method based on deep learning
CN110210498B (en) Digital image equipment evidence obtaining system based on residual learning convolution fusion network
CN110852964A (en) Image bit enhancement method based on deep learning
CN112465727A (en) Low-illumination image enhancement method without normal illumination reference based on HSV color space and Retinex theory
CN113610732B (en) Full-focus image generation method based on interactive countermeasure learning
CN111291669A (en) Two-channel depression angle human face fusion correction GAN network and human face fusion correction method
Zhou et al. Deep multi-scale features learning for distorted image quality assessment
CN114022823A (en) Shielding-driven pedestrian re-identification method and system and storable medium
Xue et al. Research on gan-based image super-resolution method
CN117635428A (en) Super-resolution reconstruction method for lung CT image
CN116309483A (en) DDPM-based semi-supervised power transformation equipment characterization defect detection method and system
CN113807214B (en) Small target face recognition method based on deit affiliated network knowledge distillation
CN105389820A (en) Infrared image definition evaluating method based on cepstrum
CN113435334B (en) Small target face recognition method based on deep learning
CN113378672A (en) Multi-target detection method for defects of power transmission line based on improved YOLOv3
CN110223273B (en) Image restoration evidence obtaining method combining discrete cosine transform and neural network
Li et al. An improved method for underwater image super-resolution and enhancement
CN116862779A (en) Real image self-supervision denoising method and system
CN113379001B (en) Processing method and device for image recognition model
CN114005157B (en) Micro-expression recognition method for pixel displacement vector based on convolutional neural network
CN114596219B (en) Image motion blur removing method based on condition generation countermeasure network
Luo et al. Maximum a posteriori on a submanifold: a general image restoration method with gan

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant