CN113435334B - Small target face recognition method based on deep learning - Google Patents
Small target face recognition method based on deep learning
- Publication number
- CN113435334B (application number CN202110718863.2A)
- Authority
- CN
- China
- Prior art keywords
- face image
- teacher
- pixel
- network part
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to a small-target face recognition method based on deep learning, comprising the following steps: constructing a high-to-low generative adversarial network (GAN), and inputting a first pixel face image into the trained GAN to obtain a second pixel face image close to a real scene; constructing a teacher-student network, training the teacher-student network with the first pixel face image and the second pixel face image, and inputting the second pixel face image to be recognized into the trained teacher-student network to obtain a recognition result. The invention improves the recognition of small-target face images.
Description
Technical Field
The invention relates to the technical field of face recognition, in particular to a small-target face recognition method based on deep learning.
Background
Driven by the development of deep learning, face recognition has advanced rapidly in recent years. The best current algorithms for frontal, high-pixel face images reach an accuracy above 99%. Most of these algorithms, however, are applicable only in constrained settings such as identity-document verification. One research focus in face recognition is the recognition of small-target face images captured in real environments. Such images arise mainly in the following cases: defocus or relative motion between the lens and the subject; a large camera-to-face distance combined with a camera sensor of low spatial resolution; low-quality compression settings, interlacing, or similar conditions; and increased image noise (for example, under reduced illumination at acquisition time).
In real environments such as surveillance video, algorithms that perform well on frontal, high-pixel face images often suffer a significant loss of performance. The main reason is that the model learns mostly from high-pixel face images during training, so applying it directly to face images from surveillance video causes a domain transfer (domain shift) problem.
At present there are two main approaches to small-target face recognition: super-resolution (see Fig. 1) and the common feature subspace (see Fig. 2). The super-resolution approach converts the small-target face image into a high-pixel face image and then performs face recognition. The common feature subspace approach projects the high-pixel face image and the small-target face image into the same space and decides from their distance whether they show the same identity.
The super-resolution approach tends to enhance visual quality during reconstruction and can introduce artifacts that harm identity recognition. Current common feature subspace methods train on high- and low-pixel face images simultaneously, which improves recognition of low-pixel face images but degrades recognition of high-pixel face images.
In addition, most current small-target face recognition algorithms build high/low-pixel image pairs by direct downsampling. The model is therefore trained on downsampled small-target face images but tested on small-target face images from real environments, so the domain transfer problem remains. Direct downsampling also turns the downsampled face images into hard samples during training.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a small-target face recognition method based on deep learning that improves the recognition of small-target face images.
The technical solution adopted by the invention is a small-target face recognition method based on deep learning, comprising the following steps:
(1) Constructing a high-to-low generative adversarial network, and inputting a first pixel face image into the trained generative adversarial network to obtain a second pixel face image close to a real scene, wherein the pixel resolution of the first pixel face image is higher than that of the second pixel face image;
(2) Constructing a teacher-student network, and training it with the first pixel face image and the second pixel face image to obtain a trained teacher-student network, wherein the teacher-student network comprises a teacher network part and a student network part, and the teacher network part trains the student network part by knowledge distillation, so that the student network part attains an accuracy close to that of the teacher network part;
(3) Inputting the second pixel face image to be recognized into the trained teacher-student network to obtain a recognition result.
The generative adversarial network in step (1) comprises a generator and a discriminator. The generator generates the second pixel face image from the first pixel face image and a random vector; the discriminator learns to judge whether an image input to it is a small-target face image collected in a natural surveillance environment.
The loss function of the generative adversarial network in step (1) consists of a pixel-level loss and a discriminator loss.
The teacher network part and the student network part in step (2) use the same model.
In step (2), the teacher network part is pretrained on the first pixel face image and its parameters are then fixed. The student network part takes a mixed face image as input, the mixed face image consisting of corresponding first pixel and second pixel face images. The first pixel face image passed through the teacher network part and the mixed face image passed through the student network part are input together into a trainable classification layer for training. By minimizing the distance between the outputs of the teacher network part and the student network part and sharing the parameters of the trainable classification layer, the student network part attains an accuracy close to that of the teacher network part.
The loss function of the teacher-student network is the sum of the distance between vector $q_{pre}$ and vector $p_{pre}$ and the distance between vector $q_{pre}$ and vector $p_t$, where $q_{pre}$ is the vector obtained when the first pixel face image passes through the teacher network part with fixed parameters, and $p_{pre}$ and $p_t$ are the vectors obtained when the first pixel face image passed through the teacher network part and the mixed face image passed through the student network part, respectively, are input together into the trainable classification layer.
The teacher-student network in step (2) is cascaded with the generative adversarial network, and the two are trained simultaneously.
Advantageous effects
Compared with the prior art, the above technical solution gives the invention the following advantages and positive effects. The invention builds a high-to-low generative adversarial network that learns to convert high-pixel face images into low-pixel small-target face images. This replaces the commonly used direct downsampling, removes the hard-sample and domain-transfer problems encountered during training, and lets the subsequent model learn from small-target face images resembling those in real environments and be tested in the same domain, which raises the recognition rate. In addition, the teacher-student network learns a common feature subspace across pixel resolutions: by minimizing the cosine distance between the outputs of the teacher and student networks and sharing the Softmax layer parameters between them, it addresses the drop in recognition rate on high-pixel face images seen in current small-target face recognition algorithms. The invention offers a high recognition rate, strong environmental adaptability, and reliable performance.
Drawings
Fig. 1 is a flowchart of a small-target face recognition method based on the conventional super-resolution approach;
Fig. 2 is a flowchart of a small-target face recognition method based on the conventional common feature subspace approach;
Fig. 3 is a flowchart of an embodiment of the invention;
Fig. 4 is a flowchart of the GAN constructed in an embodiment of the invention;
Fig. 5 is a flowchart of the teacher-student network constructed in an embodiment of the invention.
Detailed Description
The invention will be further illustrated with reference to specific examples. It is to be understood that these examples are illustrative of the present invention and are not intended to limit the scope of the present invention. Further, it is understood that various changes and modifications may be made by those skilled in the art after reading the teachings of the present invention, and such equivalents are intended to fall within the scope of the claims appended hereto.
The embodiment of the invention relates to a small-target face recognition method based on deep learning which, as shown in Fig. 3, comprises the following steps: constructing a high-to-low generative adversarial network (GAN for short), and inputting a first pixel face image into the trained GAN to obtain a second pixel face image close to a real scene, wherein the pixel resolution of the first pixel face image is higher than that of the second pixel face image; constructing a teacher-student network, and training it with the first pixel face image and the second pixel face image to obtain a trained teacher-student network, wherein the teacher-student network comprises a teacher network part and a student network part, and the teacher network part trains the student network part by knowledge distillation, so that the student network part attains an accuracy close to that of the teacher network part; and inputting the second pixel face image to be recognized into the trained teacher-student network to obtain a recognition result.
Step one: in this embodiment, the CasiaWebFace training set and the TinyFace training set are used for training, and the TinyFace test set is used for testing. The CasiaWebFace training images are cropped to 64 × 64.
Step two: a high-to-low pixel GAN is built and trained to produce small-target face images close to real scenes; the network flow is shown in Fig. 4. The GAN of this embodiment comprises a generator and a discriminator. The generator generates the second pixel face image from the first pixel face image and a random vector; the discriminator learns to judge whether an image input to it is a small-target face image collected in a natural surveillance environment.
Specifically, the cropped training set and a random vector are input into the generator. The random vector is a 1 × 64 Gaussian random vector that passes through a fully connected layer, is reshaped to 64 × 64, and is then concatenated with the input picture. The generator outputs the corresponding 16 × 16 small-target face image, which is input into the discriminator together with the TinyFace training set, so that the discriminator learns to discriminate small-target face images from real scenes.
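The construction of the generator input described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the three image channels, the concatenation of the noise map as one extra channel, and the randomly initialised layer parameters are all assumptions, since the text only states that a 1 × 64 Gaussian vector is expanded to 64 × 64 by a fully connected layer and joined to the input picture.

```python
import random

SIZE = 64  # side length of the cropped high-pixel face image


def noise_to_map(noise, weights, bias):
    """Fully connected layer: 64-d vector -> SIZE*SIZE values, reshaped to SIZE x SIZE."""
    flat = [sum(w * z for w, z in zip(row, noise)) + b
            for row, b in zip(weights, bias)]
    return [flat[r * SIZE:(r + 1) * SIZE] for r in range(SIZE)]


def build_generator_input(image_channels, noise_map):
    """Append the noise map to the image as one extra channel (assumed layout)."""
    return image_channels + [noise_map]


rng = random.Random(0)
# Stand-in for one cropped 64 x 64 training image with 3 channels (assumption).
image = [[[rng.random() for _ in range(SIZE)] for _ in range(SIZE)]
         for _ in range(3)]
# 1 x 64 Gaussian random vector, as in the text.
noise = [rng.gauss(0.0, 1.0) for _ in range(64)]
# Hypothetical, randomly initialised layer parameters; training would learn them.
weights = [[rng.gauss(0.0, 0.01) for _ in range(64)] for _ in range(SIZE * SIZE)]
bias = [0.0] * (SIZE * SIZE)

generator_input = build_generator_input(image, noise_to_map(noise, weights, bias))
print(len(generator_input), len(generator_input[0]), len(generator_input[0][0]))
# prints: 4 64 64
```

Feeding the noise in as an image-shaped map lets the same high-pixel face produce different low-pixel outputs, which is one plausible reason for the random vector.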
The loss function of the GAN consists of a pixel-level loss and a discriminator loss; in this embodiment, α is set to 1 and β is set to 0.05.
$$l_{GAN} = \alpha\, l_{pixel} + \beta\, l_{g} \tag{1}$$
The loss of the discriminator is as follows:
where $P_r$ denotes the small-target face images in the natural environment, i.e. the TinyFace training set; $P_g$ denotes the pictures generated by the generator; $D(x)$ is the feature the discriminator produces for a small-target face image from the natural environment, with a corresponding term for the feature it produces for a generated picture; and $E[\cdot]$ denotes expectation.
The pixel-level loss is:
where $W$ and $H$ are the length and width of the picture produced by the generator. $F$ is a function that converts the high-pixel face image $I_{hr}$ to the same size as the picture generated by the generator, i.e. from 64 × 64 to the generator's 16 × 16 output size; this embodiment uses average pooling for $F$. $g_0$ transforms the high-pixel face image $I_{hr}$ so that its probability distribution matches that of small-target face images in the natural environment.
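As a rough illustration of how the pixel-level term and the weighted sum of equation (1) fit together, the sketch below downsamples the high-pixel image by average pooling and compares it with the generator output. The mean-squared-error form of `pixel_loss` is an assumption, because the formula itself is not legible in this text; the discriminator loss is passed in as a plain number for the same reason. Function names are hypothetical.

```python
def average_pool(image, factor):
    """Downsample a square 2-D image by averaging non-overlapping factor x factor blocks."""
    size = len(image) // factor
    return [
        [
            sum(image[r * factor + i][c * factor + j]
                for i in range(factor) for j in range(factor)) / factor ** 2
            for c in range(size)
        ]
        for r in range(size)
    ]


def pixel_loss(high_res, generated):
    """Mean squared difference between F(I_hr) and the generated image (assumed form)."""
    factor = len(high_res) // len(generated)
    target = average_pool(high_res, factor)  # F: 64 x 64 -> 16 x 16
    height, width = len(generated), len(generated[0])
    total = sum((target[r][c] - generated[r][c]) ** 2
                for r in range(height) for c in range(width))
    return total / (width * height)


def gan_loss(l_pixel, l_g, alpha=1.0, beta=0.05):
    """Equation (1): weighted sum of the pixel-level and discriminator losses."""
    return alpha * l_pixel + beta * l_g


# Checkerboard 64 x 64 image: every 4 x 4 block averages to exactly 0.5.
high_res = [[float((r + c) % 2) for c in range(64)] for r in range(64)]
generated = [[0.5] * 16 for _ in range(16)]
print(pixel_loss(high_res, generated))  # 0.0
print(gan_loss(l_pixel=0.2, l_g=1.0))   # 0.25
```

With α = 1 and β = 0.05, the pixel-level term dominates, so the generated face stays close to the pooled original while the discriminator term nudges it toward the look of real small-target images.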
Step three: a teacher-student network is built; the network flow is shown in Fig. 5. The network comprises a teacher network part and a student network part. The teacher network part is pretrained on the first pixel face image and its parameters are then fixed. The student network part takes a mixed face image as input, the mixed face image consisting of corresponding first pixel and second pixel face images. The first pixel face image passed through the teacher network part and the mixed face image passed through the student network part are input together into a trainable classification layer for training. By minimizing the distance between the outputs of the teacher network part and the student network part and sharing the parameters of the trainable classification layer, the student network part attains an accuracy close to that of the teacher network part.
In this embodiment, both the teacher network part and the student network part are ResNet34. The weights of the teacher network part and of the fixed Softmax layer are pretrained on the CasiaWebFace training set; they remain unchanged during training, so the teacher network part serves only as a feature extractor.
During training, 64 × 64 high-pixel face images are input to the teacher network part, and the corresponding mixed 16 × 16 and 64 × 64 low- and high-pixel face images are input to the student network part, where the 16 × 16 low-pixel face images are generated by the high-to-low pixel GAN.
The high-pixel face image passes through the teacher network part and then the fixed Softmax layer to obtain the vector $q_{pre}$. The high-pixel face image passed through the teacher network part and the mixed high/low-pixel face image passed through the student network part are then input together into a trainable Softmax layer to obtain the vectors $p_{pre}$ and $p_t$, respectively.
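The data flow that produces the three vectors can be sketched with toy components. Everything concrete here is an assumption made for illustration: `extract_features` is a pooling-plus-linear stand-in for the ResNet34 backbones, the "Softmax layer" is modelled as a linear layer followed by softmax, and all weights are random rather than pretrained.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, vector):
    """Matrix-vector product with weights given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def extract_features(image, weights, grid=4):
    """Toy backbone: average-pool any square image to grid x grid, then one linear layer."""
    block = len(image) // grid
    pooled = [
        sum(image[r * block + i][c * block + j]
            for i in range(block) for j in range(block)) / block ** 2
        for r in range(grid) for c in range(grid)
    ]
    return linear(weights, pooled)


rng = random.Random(0)


def random_matrix(rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


FEATURES, CLASSES = 8, 5
teacher_w = random_matrix(FEATURES, 16)                 # frozen after pretraining
student_w = random_matrix(FEATURES, 16)                 # updated during training
fixed_softmax_w = random_matrix(CLASSES, FEATURES)      # frozen after pretraining
trainable_softmax_w = random_matrix(CLASSES, FEATURES)  # shared by both branches

high_res = [[rng.random() for _ in range(64)] for _ in range(64)]
low_res = [[rng.random() for _ in range(16)] for _ in range(16)]  # stands in for GAN output

teacher_feat = extract_features(high_res, teacher_w)
q_pre = softmax(linear(fixed_softmax_w, teacher_feat))
p_pre = softmax(linear(trainable_softmax_w, teacher_feat))
# The student branch receives the mixed input: both resolutions of the same face.
p_t_batch = [softmax(linear(trainable_softmax_w, extract_features(img, student_w)))
             for img in (high_res, low_res)]
print(len(q_pre), len(p_pre), len(p_t_batch))  # prints: 5 5 2
```

Because `trainable_softmax_w` is applied to both teacher and student features, gradients from either branch update the same classifier, which is the parameter sharing the text describes.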
The loss function of the teacher-student network is the sum of the distance between $q_{pre}$ and $p_{pre}$ and the distance between $q_{pre}$ and $p_t$. This embodiment uses a cosine distance metric function:
The teacher-student network loss function is: $l_{ts} = \mathrm{sim}(q_{pre}, p_{pre}) + \mathrm{sim}(q_{pre}, p_t)$.
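The teacher-student loss can be sketched as below. The definition of the cosine distance as one minus the cosine similarity is an assumption, since the metric's formula is not legible in this text; the function names are hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity (assumed form of sim); near 0 for parallel vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def teacher_student_loss(q_pre, p_pre, p_t):
    """Sum of the distances from the fixed teacher output to both trainable outputs."""
    return cosine_distance(q_pre, p_pre) + cosine_distance(q_pre, p_t)


q_pre = [0.7, 0.2, 0.1]
# Identical outputs incur (numerically) no loss.
print(teacher_student_loss(q_pre, q_pre, q_pre))
# Outputs that disagree with the teacher are penalised.
print(teacher_student_loss(q_pre, [0.1, 0.2, 0.7], [0.2, 0.7, 0.1]))
```

Both terms pull toward the frozen teacher output: the first keeps the shared trainable classifier consistent on high-pixel inputs, and the second drags the student's mixed-resolution output to the same place.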
During training, the teacher-student network is cascaded with the GAN of step two, and the two are trained simultaneously.
After the GAN and the teacher-student network have been trained, only the small-target face image is input into the student network part, and the original-pixel image to be compared against is input into the teacher network part, which yields the final recognition result.
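One way the two outputs could be compared at inference time is sketched below. The text does not say how the recognition result is derived from the two branches, so matching by highest cosine similarity against a gallery is an assumption, and the embeddings and identity names are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def identify(probe_embedding, gallery):
    """Return the gallery identity whose embedding is most similar to the probe."""
    return max(gallery, key=lambda name: cosine_similarity(probe_embedding, gallery[name]))


# Hypothetical embeddings: gallery entries would come from the teacher branch
# (original-pixel images), the probe from the student branch (small-target image).
gallery = {"person_a": [0.9, 0.1, 0.0], "person_b": [0.1, 0.8, 0.3]}
probe = [0.2, 0.7, 0.2]
print(identify(probe, gallery))  # person_b
```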
As can be seen, the invention builds a high-to-low generative adversarial network that learns to convert high-pixel face images into low-pixel small-target face images. This replaces the commonly used direct downsampling, removes the hard-sample and domain-transfer problems encountered during training, and lets the subsequent model learn from small-target face images resembling those in real environments and be tested in the same domain, which raises the recognition rate. In addition, the teacher-student network learns a common feature subspace across pixel resolutions: by minimizing the cosine distance between the outputs of the teacher and student networks and sharing the Softmax layer parameters between them, it addresses the drop in recognition rate on high-pixel face images seen in current small-target face recognition algorithms. The invention offers a high recognition rate, strong environmental adaptability, and reliable performance.
Claims (5)
1. A small-target face recognition method based on deep learning, characterized by comprising the following steps:
(1) Constructing a high-to-low generative adversarial network, and inputting a first pixel face image into the trained generative adversarial network to obtain a second pixel face image close to a real scene, wherein the pixel resolution of the first pixel face image is higher than that of the second pixel face image;
(2) Constructing a teacher-student network, and training the teacher-student network with the first pixel face image and the second pixel face image to obtain a trained teacher-student network, wherein the teacher-student network comprises a teacher network part and a student network part, and the teacher network part trains the student network part by knowledge distillation, so that the student network part attains an accuracy close to that of the teacher network part; the teacher network part is pretrained on the first pixel face image and its parameters are then fixed; the student network part takes a mixed face image as input, the mixed face image consisting of corresponding first pixel and second pixel face images; the first pixel face image passed through the teacher network part and the mixed face image passed through the student network part are input together into a trainable classification layer for training, and the student network part attains an accuracy close to that of the teacher network part by minimizing the distance between the outputs of the teacher network part and the student network part and sharing the parameters of the trainable classification layer; the loss function of the teacher-student network is the sum of the distance between vector $q_{pre}$ and vector $p_{pre}$ and the distance between vector $q_{pre}$ and vector $p_t$, wherein vector $q_{pre}$ is the vector obtained when the first pixel face image passes through the teacher network part with fixed parameters, and vector $p_{pre}$ and vector $p_t$ are the vectors obtained when the first pixel face image passed through the teacher network part and the mixed face image passed through the student network part, respectively, are input together into the trainable classification layer;
(3) Inputting the second pixel face image to be recognized into the trained teacher-student network to obtain a recognition result.
2. The small-target face recognition method based on deep learning of claim 1, wherein the generative adversarial network in step (1) comprises a generator and a discriminator, the generator generating the second pixel face image from the first pixel face image and a random vector, and the discriminator learning to judge whether an image input to it is a small-target face image collected in a natural surveillance environment.
3. The small-target face recognition method based on deep learning of claim 2, wherein the loss function of the generative adversarial network in step (1) consists of a pixel-level loss and a discriminator loss.
4. The small-target face recognition method based on deep learning of claim 1, wherein the teacher network part and the student network part in step (2) use the same model.
5. The small-target face recognition method based on deep learning of claim 1, wherein the teacher-student network in step (2) is cascaded with the generative adversarial network and the two are trained simultaneously.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110718863.2A CN113435334B (en) | 2021-06-28 | 2021-06-28 | Small target face recognition method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110718863.2A CN113435334B (en) | 2021-06-28 | 2021-06-28 | Small target face recognition method based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113435334A (en) | 2021-09-24 |
CN113435334B (en) | 2024-02-27 |
Family
ID=77754911
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110718863.2A Active CN113435334B (en) | 2021-06-28 | 2021-06-28 | Small target face recognition method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113435334B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108805840A (en) * | 2018-06-11 | 2018-11-13 | Oppo(重庆)智能科技有限公司 | Method, apparatus, terminal and the computer readable storage medium of image denoising |
CN109063565A (en) * | 2018-06-29 | 2018-12-21 | 中国科学院信息工程研究所 | A kind of low resolution face identification method and device |
CN109145958A (en) * | 2018-07-27 | 2019-01-04 | 哈尔滨工业大学 | A kind of real scene wisp detection method generating confrontation network based on multitask |
CN110689482A (en) * | 2019-09-18 | 2020-01-14 | 中国科学技术大学 | Face super-resolution method based on supervised pixel-by-pixel generation countermeasure network |
CN111160533A (en) * | 2019-12-31 | 2020-05-15 | 中山大学 | Neural network acceleration method based on cross-resolution knowledge distillation |
CN111340708A (en) * | 2020-03-02 | 2020-06-26 | 北京理工大学 | Method for rapidly generating high-resolution complete face image according to prior information |
CN111461226A (en) * | 2020-04-01 | 2020-07-28 | 深圳前海微众银行股份有限公司 | Countermeasure sample generation method, device, terminal and readable storage medium |
CN112163998A (en) * | 2020-09-24 | 2021-01-01 | 肇庆市博士芯电子科技有限公司 | Single-image super-resolution analysis method matched with natural degradation conditions |
CN112368719A (en) * | 2018-05-17 | 2021-02-12 | 奇跃公司 | Gradient antagonism training of neural networks |
Non-Patent Citations (1)
Title |
---|
Research on Knowledge Distillation of Generative Adversarial Networks; Wei Wang, Baohua Zhang; 2021 Data Compression Conference; full text * |
Also Published As
Publication number | Publication date |
---|---|
CN113435334A (en) | 2021-09-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||