CN114495210A - Posture change face recognition method based on attention mechanism - Google Patents

Posture change face recognition method based on attention mechanism

Info

Publication number
CN114495210A
Authority
CN
China
Prior art keywords
module
attention mechanism
face recognition
attention
senet
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210013502.2A
Other languages
Chinese (zh)
Inventor
张鹏
赵锋
张悦
李孟委
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nantong Institute for Advanced Study
North University of China
Original Assignee
Nantong Institute Of Intelligent Optics North China University
North University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-01-07
Filing date
2022-01-07
Publication date
2022-05-13
Application filed by Nantong Institute Of Intelligent Optics North China University, North University of China
Priority to CN202210013502.2A
Publication of CN114495210A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Abstract

The invention discloses an attention-based face recognition method that is robust to pose change. Image data containing faces in natural scenes is first labeled and divided into a training set and a test set. A dual attention module, LA-SENet, is then introduced into the face recognition system to highlight the most discriminative facial feature information under pose change: the LANet module automatically locates the most discriminative facial regions, while the SENet module highlights the more important channels. Next, a Bottleneck-attention module with a multi-scale inverted residual structure is designed on the basis of MobileNetV2 to obtain features at three different scales. Features from different layers are fused with a multi-scale feature fusion method, spliced along the channel dimension with a Concatenate operation, and finally output through a fully connected layer; the resulting local features are fused with the global information finally output by the network. The invention can effectively learn facial features under pose change and improve face recognition accuracy.

Description

Posture change face recognition method based on attention mechanism
Technical Field
The invention relates to the technical field of pattern recognition, in particular to a posture change face recognition method based on an attention mechanism.
Background
As an important direction in computer vision, face recognition uses computer technology to recognize face information in an image or video and extracts the most critical visual feature information from it as a distinguishing feature, ultimately determining identity. Although face recognition accuracy is currently high, the accuracy of most face recognition algorithms drops when they process images with large pose changes. Pose change has therefore become an important challenge, and accurately recognizing faces under different pose changes is a key problem in the field of face recognition.
Researchers have worked extensively to overcome the influence of pose change on face recognition. The most common approach is to collect face data sets that cover different pose changes, such as YouTubeFace. A face recognition model trained on a large amount of such data adapts and fits better to images, but collecting the data sets consumes a large amount of material and human resources, which wastes resources. A second approach addresses face recognition with few training samples by generating faces at different angles from frontal images. An example is the three-dimensional morphable model (3DMM) proposed by Blanz and Vetter, which is built on a three-dimensional face database and accounts for the influence of pose change and similar factors, so that the generated three-dimensional face model is highly accurate. However, this method places high demands on the precision of the three-dimensional model, the modeling time is long, and model optimization is complex. A third approach is to design a dedicated face recognition network for each pose, adaptively select a suitable network for each image at test time, and then integrate the results of the different networks. This method has a significant drawback: it requires views of each face to be acquired from multiple perspectives. That is not possible in many practical situations, where often only a single frontal view or a view in some other pose is available.
Disclosure of Invention
The invention aims to provide a posture change face recognition method based on an attention mechanism, so as to solve the problems described in the background.
To achieve this purpose, the invention provides the following technical scheme: a posture change face recognition method based on an attention mechanism, comprising the following steps:
step one: dividing the acquired data set into a training data set and a testing data set, and performing data preprocessing on the selected training data set;
step two: cropping the images of the training set from step one into 112 × 112 image blocks;
step three: designing a convolutional neural network based on an attention mechanism;
step four: training the convolutional neural network based on the attention mechanism.
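Step two can be illustrated with a minimal sketch. The text only states that training images are cut into 112 × 112 blocks; taking the central block is an assumption made here for illustration, and `center_crop` and the synthetic test image are hypothetical names and data, not part of the patent.

```python
def center_crop(image, size=112):
    """Return the central size x size block of a 2-D image stored as a list of rows."""
    height, width = len(image), len(image[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the requested crop")
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


# Hypothetical 128 x 120 grayscale image whose pixel value encodes its position.
image = [[r * 1000 + c for c in range(120)] for r in range(128)]
block = center_crop(image)
```

A real pipeline would more likely crop around a detected and aligned face; the sketch only fixes the output size named in the text.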
Preferably, the third step specifically comprises the following steps:
3.1: designing a dual attention mechanism module LA-SENet: the attention mechanism module includes the LANet spatial attention module and the SENet channel attention module, which are used to highlight the most discriminative facial feature information under pose change;
3.2: designing an inverted residual structure: designing the Bottleneck-attention module with an inverted residual structure based on MobileNetV2, which first expands the number of channels and then compresses it in order to reduce the amount of computation;
3.3: constructing a multi-scale feature fusion module: the multi-scale feature fusion module is formed by connecting a feature fusion module and an up-sampling/down-sampling module;
3.4: constructing a global average pooling module: the global average pooling module is composed of GDConv and Linear Conv;
3.5: building the attention mechanism network with the MobileFaceNet model as the backbone: the network of the attention-based posture change face recognition system consists of six parts, namely an input module, the attention mechanism module LA-SENet, the inverted residual module Bottleneck-attention, a multi-scale feature fusion module, a global average pooling module and an output module.
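The global average pooling module of step 3.4 can be sketched as follows. GDConv is read here as a global depthwise convolution (one kernel per channel, with the kernel as large as the feature map) and Linear Conv as a 1 × 1 convolution without activation; the 7 × 7 size comes from the detailed description, while the function names, the two-channel input and the weights are hypothetical.

```python
def gdconv(feature_map, kernels):
    """Global depthwise convolution: one kernel per channel, kernel size == map size."""
    out = []
    for channel, kernel in zip(feature_map, kernels):
        out.append(sum(x * w
                       for row_x, row_w in zip(channel, kernel)
                       for x, w in zip(row_x, row_w)))
    return out


def linear_conv1x1(vector, weights):
    """1 x 1 convolution on a 1 x 1 map: a matrix-vector product with no activation."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


SIZE = 7
# Two hypothetical channels holding the constants 2.0 and 4.0.
feature_map = [[[2.0] * SIZE for _ in range(SIZE)],
               [[4.0] * SIZE for _ in range(SIZE)]]
# With uniform weights 1/49 the GDConv reduces to plain global average pooling.
uniform = [[[1.0 / (SIZE * SIZE)] * SIZE for _ in range(SIZE)] for _ in range(2)]
pooled = gdconv(feature_map, uniform)
embedding = linear_conv1x1(pooled, [[1.0, 0.0], [0.5, 0.5]])
```

Unlike fixed average pooling, the GDConv weights are learned, so different spatial positions of the final feature map can contribute differently to the embedding.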
Preferably, the fourth step specifically comprises the following steps:
4.1: setting an activation function and a loss function, and estimating the network parameters from the difference between the real image and the image output by the convolutional neural network based on the attention mechanism;
SoftMax is used as the loss function of the invention; it assigns a probability to each class, and its expression is as follows:
$$L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{W_{y_i}^{T}x_i+b_{y_i}}}{\sum_{j=1}^{n}e^{W_j^{T}x_i+b_j}}$$
wherein $x_i$ represents the deep feature of the i-th sample, which belongs to the $y_i$-th class; $W_j$ represents the j-th column of the weight $W$; $b_j$ is the bias term; $n$ represents the total number of classes in the training data, and $N$ represents the batch size;
4.2: selecting an optimization function to iteratively train the convolutional neural network based on the attention mechanism;
4.3: setting the training parameters, including the important parameters of learning rate, number of iterations and Batch value;
4.4: testing the trained network, using the lfw, cplfw, agedb_30 and cfp image data sets as the test data sets of the invention.
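The SoftMax loss of step 4.1 can be sketched in plain Python. This is a minimal illustration of the standard batch-averaged SoftMax cross-entropy matching the symbol definitions in the text (features x_i, labels y_i, class weights W_j, biases b_j); the function name and the toy inputs are hypothetical.

```python
import math


def softmax_loss(features, labels, weights, biases):
    """Mean SoftMax cross-entropy over a batch.

    features: N feature vectors x_i; labels: N class indices y_i;
    weights: n class weight vectors W_j; biases: n bias terms b_j.
    """
    total = 0.0
    for x, y in zip(features, labels):
        logits = [sum(w_k * x_k for w_k, x_k in zip(w, x)) + b
                  for w, b in zip(weights, biases)]
        peak = max(logits)  # subtract the maximum for numerical stability
        log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum - logits[y]
    return total / len(features)


features = [[1.0, 0.0], [0.0, 1.0]]
labels = [0, 1]
# All-zero weights give every class probability 1/3, so the loss is log(3).
uninformed = softmax_loss(features, labels, [[0.0, 0.0]] * 3, [0.0] * 3)
# Weights aligned with the correct classes drive the loss toward zero.
confident = softmax_loss(features, labels,
                         [[10.0, 0.0], [0.0, 10.0], [0.0, 0.0]], [0.0] * 3)
```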
Preferably, in step 3.1: the attention mechanism module LA-SENet is composed of the spatial attention mechanism LANet and the channel attention mechanism SENet. LANet is formed by two consecutive 1 × 1 convolutions, the first followed by ReLU and the second by Sigmoid; SENet is composed of two consecutive FC layers, the first followed by ReLU and the second by Sigmoid.
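The two attention branches can be sketched as follows. The layer order (1 × 1 convolution, ReLU, 1 × 1 convolution, Sigmoid for LANet; FC, ReLU, FC, Sigmoid for SENet) follows the text. How the two outputs are combined is not spelled out there, so multiplying each feature value by both weights in `la_senet` is an assumption, and all names and weights are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def relu(vector):
    return [max(0.0, v) for v in vector]


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lanet_spatial_attention(features, w1, w2):
    """features[c][i][j]; w1 (hidden x C) and w2 (1 x hidden) act as 1x1 convolutions.

    Returns an H x W map with values in (0, 1).
    """
    channels, height, width = len(features), len(features[0]), len(features[0][0])
    attention = []
    for i in range(height):
        row = []
        for j in range(width):
            pixel = [features[c][i][j] for c in range(channels)]
            hidden = relu(matvec(w1, pixel))
            row.append(sigmoid(matvec(w2, hidden)[0]))
        attention.append(row)
    return attention


def senet_channel_attention(features, fc1, fc2):
    """Squeeze: global average per channel. Excitation: FC, ReLU, FC, Sigmoid."""
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in features]
    return [sigmoid(z) for z in matvec(fc2, relu(matvec(fc1, squeezed)))]


def la_senet(features, w1, w2, fc1, fc2):
    """Assumed combination: rescale each value by its spatial and channel weight."""
    spatial = lanet_spatial_attention(features, w1, w2)
    channel = senet_channel_attention(features, fc1, fc2)
    return [[[features[c][i][j] * spatial[i][j] * channel[c]
              for j in range(len(features[0][0]))]
             for i in range(len(features[0]))]
            for c in range(len(features))]


features = [[[1.0, 2.0], [3.0, 4.0]],
            [[4.0, 3.0], [2.0, 1.0]]]
# With all-zero weights both Sigmoids output 0.5, so every value is scaled by 0.25.
scaled = la_senet(features, w1=[[0.0, 0.0]], w2=[[0.0]],
                  fc1=[[0.0, 0.0]], fc2=[[0.0], [0.0]])
spatial_map = lanet_spatial_attention(features, [[1.0, 1.0]], [[1.0]])
```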
Preferably, in step 3.2: the inverted residual module consists of structures with different Stride values.
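The computational saving attributed to the inverted residual structure can be checked with a short worked count of multiply-add operations. The sketch compares a standard 3 × 3 convolution with a MobileNetV2-style block (1 × 1 expansion, 3 × 3 depthwise convolution, 1 × 1 projection) at stride 1; the 14 × 14 map, 64 channels and expansion factor 2 are hypothetical values chosen for illustration, not figures from the patent.

```python
def standard_conv_madds(height, width, c_in, c_out, kernel=3):
    """Multiply-adds of a dense kernel x kernel convolution at stride 1."""
    return height * width * c_in * c_out * kernel * kernel


def inverted_residual_madds(height, width, c_in, c_out, expansion, kernel=3):
    """Multiply-adds of expand (1x1), depthwise (kernel x kernel), project (1x1)."""
    hidden = c_in * expansion
    expand = height * width * c_in * hidden
    depthwise = height * width * hidden * kernel * kernel
    project = height * width * hidden * c_out
    return expand + depthwise + project


dense = standard_conv_madds(14, 14, 64, 64)
inverted = inverted_residual_madds(14, 14, 64, 64, expansion=2)
ratio = inverted / dense
```

Under these assumed sizes the block needs a little under half the multiply-adds of the dense convolution, because the only spatial filtering is done per channel by the depthwise step.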
Preferably, in the step 4.2: the network is iteratively trained using the SGD algorithm.
Preferably, in step 4.3: the initial learning rate is set to 0.1, the number of iterations is set to 25, and the Batch value is set to 512.
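The training settings can be illustrated with a toy mini-batch SGD loop. The learning rate, iteration count and batch size come from the text; reading "number of iterations" as passes over the data, keeping the learning rate constant, and the one-parameter objective are all assumptions made for this sketch.

```python
LEARNING_RATE = 0.1  # initial learning rate from the text
EPOCHS = 25          # "number of iterations", read here as passes over the data
BATCH_SIZE = 512     # Batch value from the text


def batches(samples, batch_size):
    """Yield consecutive mini-batches of at most batch_size samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


def sgd_fit_mean(samples, lr=LEARNING_RATE, epochs=EPOCHS, batch_size=BATCH_SIZE):
    """Toy problem: minimise the mean of (w - x)^2 / 2 with mini-batch SGD."""
    w = 0.0
    for _ in range(epochs):
        for batch in batches(samples, batch_size):
            grad = sum(w - x for x in batch) / len(batch)
            w -= lr * grad  # SGD update: step against the batch gradient
    return w


samples = [3.0] * 1024  # two mini-batches of 512 samples each
w = sgd_fit_mean(samples)
```

Since the text speaks of an initial learning rate, the actual training presumably decays it over time; the schedule is not given, so none is modeled here.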
Compared with the prior art, the invention has the following beneficial effects. The invention designs a convolutional neural network module based on an attention mechanism whose receptive field is larger than that of an ordinary convolutional neural network, so that more features can be extracted from low-resolution images and the attention module can also extract the high-frequency information in the image. A multi-scale feature fusion module is designed to fuse the features of different convolutional layers. An inverted residual module is designed; the convolutional neural network built on it addresses the problem that training becomes harder as the network deepens, and it outputs features at different scales. A global average pooling module is constructed that can reduce over-fitting by reducing the number of model parameters.
Drawings
FIG. 1 is a flow chart of feature extraction in the attention-based posture change face recognition system of the present invention;
FIG. 2 is a schematic block diagram of the dual attention mechanism LA-SENet module of the present invention;
FIG. 3 is a schematic block diagram of the Bottleneck-attention module with an inverted residual structure according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it should be noted that the terms "upper", "lower", "inner", "outer", "front", "rear", "both ends", "one end", "the other end", and the like indicate orientations or positional relationships based on those shown in the drawings, and are only for convenience of description and simplicity of description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the present invention, it is to be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "disposed," "connected," and the like are to be construed broadly, such as "connected," which may be fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
Referring to fig. 1, the present invention provides the following technical solution: a posture change face recognition method based on an attention mechanism, comprising the following steps:
step one: dividing the collected data set into a training data set and a testing data set, and performing data preprocessing on the selected training data set: the vgg2face data, a large-scale face database covering multiple poses and multiple viewing angles, is selected as the training data set of the attention-based posture change face recognition system, and the lfw, cplfw, agedb_30 and cfp image data sets are selected as the test data sets;
step two: cropping the images of the training set from step one into 112 × 112 image blocks;
step three: designing a convolutional neural network based on an attention mechanism;
step four: training the convolutional neural network based on the attention mechanism.
In the invention, the third step specifically comprises the following steps:
3.1: designing a dual attention mechanism module LA-SENet: the attention mechanism module includes the LANet spatial attention module and the SENet channel attention module, which are used to highlight the most discriminative facial feature information under pose change. As shown in fig. 3, the LANet module contains two consecutive 1 × 1 convolutional layers; a ReLU function follows the first convolution and a Sigmoid activation function follows the second, so that the spatial information across channels is aggregated into one channel. The second convolution thus outputs a single channel through the Sigmoid function, which is the spatial attention. The SENet module is divided into a squeeze part and an excitation part: the squeeze part compresses each channel of the feature map into a one-dimensional descriptor with a global receptive field, and after this descriptor is fed into the FC layers, the importance of each channel is predicted;
3.2: designing an inverted residual structure: designing the Bottleneck-attention module with an inverted residual structure based on MobileNetV2, which first expands the number of channels and then compresses it in order to reduce the amount of computation, wherein the inverted residual module consists of structures with different Stride values. As shown in fig. 4, the multi-scale inverted residual block is composed of structures with different Stride values. The input passes through three parallel 1 × 1 convolutions; two of these branches are followed by a 3 × 3 depthwise separable convolution, and one of those two is followed by a further 3 × 3 convolution. The three branches are joined with a Concat function, and a 1 × 1 convolution finally adjusts the number of channels. If Stride = 2, the channels are first expanded by a 1 × 1 convolution and the number of channels is then adjusted by a 3 × 3 depthwise separable convolution;
3.3: constructing a multi-scale feature fusion module: the multi-scale feature fusion module is formed by connecting a feature fusion module and an up-down sampling module;
as shown in fig. 1, the features of different scales output by the multi-scale inverted residual module are fused. Let $x_1$, $x_2$ and $x_3$ be the features of different layers; these features are multiplied by the weight parameters α, β and γ respectively and added to obtain a new fused feature, as shown in the following formula:
$$y_{ij}^{l} = \alpha_{ij}^{l}\,x_{ij}^{1\to l} + \beta_{ij}^{l}\,x_{ij}^{2\to l} + \gamma_{ij}^{l}\,x_{ij}^{3\to l}$$
wherein $y_{ij}^{l}$ represents the (i, j)-th vector of the output feature map; $x_{ij}^{n\to l}$ represents the feature vector at position (i, j) on the feature map adjusted from level n to level l; $\alpha_{ij}^{l}$, $\beta_{ij}^{l}$ and $\gamma_{ij}^{l}$ are the spatial importance weights, learned adaptively by the network, of the feature maps from the three different levels to level l;
the weight parameters α, β and γ are obtained by applying a 1 × 1 convolution to the resized feature map of each layer and passing the results through the SoftMax function, so that the three weights at each position sum to 1. Writing $\lambda_{\alpha,ij}^{l}$, $\lambda_{\beta,ij}^{l}$ and $\lambda_{\gamma,ij}^{l}$ for the outputs of these 1 × 1 convolutions, the weight α is expressed as follows, and β and γ are obtained in the same way:
$$\alpha_{ij}^{l} = \frac{e^{\lambda_{\alpha,ij}^{l}}}{e^{\lambda_{\alpha,ij}^{l}} + e^{\lambda_{\beta,ij}^{l}} + e^{\lambda_{\gamma,ij}^{l}}}$$
3.4: constructing a global average pooling module: the global average pooling module is composed of GDConv and Linear Conv;
3.5: building the attention mechanism network with the MobileFaceNet model as the backbone: the network of the attention-based posture change face recognition system consists of six parts, namely an input module, the attention mechanism module LA-SENet, the inverted residual module Bottleneck-attention, a multi-scale feature fusion module, a global average pooling module and an output module. The input module is formed by a 3 × 3 convolution; the attention mechanism module LA-SENet uses the attention mechanism module described in 3.1; the residual module uses the MobileNetV2 inverted residual structure described in 3.2; the multi-scale feature fusion module, shown in fig. 4, is formed by connecting 3 of the multi-channel feature extraction modules described in 3.3; the global average pooling module, also shown in fig. 4, is composed of a GDConv with a 7 × 7 kernel and a Linear Conv with a 1 × 1 kernel; and the output module is composed of a Concat function and an FC layer.
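The weighted fusion of step 3.3 can be sketched as follows. The three feature maps are assumed to be already resized to a common level, and the per-position scores that the text obtains from 1 × 1 convolutions are passed in directly as plain inputs; `fusion_weights`, `fuse` and the toy values are hypothetical.

```python
import math


def fusion_weights(scores):
    """SoftMax over the three per-position scores, so the weights sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(x1, x2, x3, score1, score2, score3):
    """x1, x2, x3: H x W x C features at one level; score*: H x W score maps."""
    fused = []
    for i in range(len(x1)):
        row = []
        for j in range(len(x1[0])):
            alpha, beta, gamma = fusion_weights(
                [score1[i][j], score2[i][j], score3[i][j]])
            row.append([alpha * a + beta * b + gamma * c
                        for a, b, c in zip(x1[i][j], x2[i][j], x3[i][j])])
        fused.append(row)
    return fused


# A 1 x 1 map with two channels per level.
x1, x2, x3 = [[[1.0, 1.0]]], [[[2.0, 2.0]]], [[[3.0, 3.0]]]
equal = fuse(x1, x2, x3, [[0.0]], [[0.0]], [[0.0]])        # weights 1/3 each
dominant = fuse(x1, x2, x3, [[50.0]], [[0.0]], [[0.0]])    # alpha close to 1
```

Because the weights are computed per position, the network can favor one level in some regions of the face and another level elsewhere.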
In the invention, the fourth step specifically comprises the following steps:
4.1: setting an activation function and a loss function, and estimating the network parameters from the difference between the real image and the image output by the convolutional neural network based on the attention mechanism;
SoftMax is used as the loss function of the invention; it assigns a probability to each class, and its expression is as follows:
$$L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{W_{y_i}^{T}x_i+b_{y_i}}}{\sum_{j=1}^{n}e^{W_j^{T}x_i+b_j}}$$
wherein $x_i$ represents the deep feature of the i-th sample, which belongs to the $y_i$-th class; $W_j$ represents the j-th column of the weight $W$; $b_j$ is the bias term; $n$ represents the total number of classes in the training data, and $N$ represents the batch size;
4.2: selecting an optimization function to iteratively train the convolutional neural network based on the attention mechanism; the network is iteratively trained using the SGD algorithm;
4.3: setting the training parameters, including the important parameters of learning rate, number of iterations and Batch value: the initial learning rate is set to 0.1, the number of iterations to 25, and the Batch value to 512;
4.4: testing the trained network, using the lfw, cplfw, agedb_30 and cfp image data sets as the test data sets of the invention.
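The testing in step 4.4 can be illustrated with a small verification sketch. The text names the test sets but not the evaluation protocol; comparing pairs of embeddings by cosine similarity against a threshold is an assumption made here, and the embeddings, the threshold of 0.5 and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def verification_accuracy(pairs, threshold):
    """pairs: list of (embedding_a, embedding_b, same_identity)."""
    correct = sum((cosine_similarity(a, b) >= threshold) == same
                  for a, b, same in pairs)
    return correct / len(pairs)


# Hypothetical 2-D embeddings; real face embeddings have far more dimensions.
pairs = [
    ([1.0, 0.0], [0.9, 0.1], True),
    ([1.0, 0.0], [0.0, 1.0], False),
    ([0.5, 0.5], [0.4, 0.6], True),
    ([1.0, 0.2], [-1.0, 0.3], False),
]
accuracy = verification_accuracy(pairs, threshold=0.5)
```

In practice the threshold would be chosen on held-out pairs rather than fixed in advance.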
In summary, the invention designs a convolutional neural network module based on an attention mechanism whose receptive field is larger than that of an ordinary convolutional neural network, so that more features can be extracted from low-resolution images and the attention module can also extract the high-frequency information in the image. A multi-scale feature fusion module is designed to fuse the features of different convolutional layers. An inverted residual module is designed; the convolutional neural network built on it addresses the problem that training becomes harder as the network deepens, and it outputs features at different scales. A global average pooling module is constructed that can reduce over-fitting by reducing the number of model parameters.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.

Claims (7)

1. A posture change face recognition method based on an attention mechanism, characterized in that the method comprises the following steps:
step one: dividing the acquired data set into a training data set and a testing data set, and performing data preprocessing on the selected training data set;
step two: cropping the images of the training set from step one into 112 × 112 image blocks;
step three: designing a convolutional neural network based on an attention mechanism;
step four: training the convolutional neural network based on the attention mechanism.
2. The posture change face recognition method based on an attention mechanism according to claim 1, wherein step three specifically comprises the following steps:
2.1: designing a dual attention mechanism module LA-SENet: the attention mechanism module comprises the LANet spatial attention module and the SENet channel attention module, which are used to highlight the most discriminative facial feature information under pose change;
2.2: designing an inverted residual structure: designing the Bottleneck-attention module with an inverted residual structure based on MobileNetV2, which first expands the number of channels and then compresses it in order to reduce the amount of computation;
2.3: constructing a multi-scale feature fusion module: the multi-scale feature fusion module is formed by connecting a feature fusion module and an up-sampling/down-sampling module;
2.4: constructing a global average pooling module: the global average pooling module is composed of GDConv and Linear Conv;
2.5: building the attention mechanism network with the MobileFaceNet model as the backbone: the network of the attention-based posture change face recognition system consists of six parts, namely an input module, the attention mechanism module LA-SENet, the inverted residual module Bottleneck-attention, a multi-scale feature fusion module, a global average pooling module and an output module.
3. The posture change face recognition method based on an attention mechanism according to claim 1, wherein step four specifically comprises the following steps:
3.1: setting an activation function and a loss function, and estimating the network parameters from the difference between the real image and the image output by the convolutional neural network based on the attention mechanism;
SoftMax is used as the loss function of the invention; it assigns a probability to each class, and its expression is as follows:
$$L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{W_{y_i}^{T}x_i+b_{y_i}}}{\sum_{j=1}^{n}e^{W_j^{T}x_i+b_j}}$$
wherein $x_i$ represents the deep feature of the i-th sample, which belongs to the $y_i$-th class; $W_j$ represents the j-th column of the weight $W$; $b_j$ is the bias term; $n$ represents the total number of classes in the training data, and $N$ represents the batch size;
3.2: selecting an optimization function to iteratively train the convolutional neural network based on the attention mechanism;
3.3: setting the training parameters, including the important parameters of learning rate, number of iterations and Batch value;
3.4: testing the trained network, using the lfw, cplfw, agedb_30 and cfp image data sets as the test data sets of the invention.
4. The posture change face recognition method based on an attention mechanism according to claim 1, wherein in step 3.1: the attention mechanism module LA-SENet is composed of the spatial attention mechanism LANet and the channel attention mechanism SENet; LANet is formed by two consecutive 1 × 1 convolutions, the first followed by ReLU and the second by Sigmoid; SENet is composed of two consecutive FC layers, the first followed by ReLU and the second by Sigmoid.
5. The posture change face recognition method based on an attention mechanism according to claim 1, wherein in step 3.2: the inverted residual module consists of structures with different Stride values.
6. The posture change face recognition method based on an attention mechanism according to claim 1, wherein in step 4.2: the network is iteratively trained using the SGD algorithm.
7. The posture change face recognition method based on an attention mechanism according to claim 1, wherein in step 4.3: the initial learning rate is set to 0.1, the number of iterations is set to 25, and the Batch value is set to 512.
CN202210013502.2A 2022-01-07 2022-01-07 Posture change face recognition method based on attention mechanism Pending CN114495210A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202210013502.2A CN114495210A (en) | 2022-01-07 | 2022-01-07 | Posture change face recognition method based on attention mechanism

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202210013502.2A CN114495210A (en) | 2022-01-07 | 2022-01-07 | Posture change face recognition method based on attention mechanism

Publications (1)

Publication Number | Publication Date
CN114495210A | 2022-05-13

Family

ID=81510085

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202210013502.2A | Pending CN114495210A (en): Posture change face recognition method based on attention mechanism | 2022-01-07 | 2022-01-07

Country Status (1)

Country | Link
CN (1) | CN114495210A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN115063685A (en) * | 2022-07-11 | 2022-09-16 | 河海大学 | Remote sensing image building feature extraction method based on attention network

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN115063685A (en) * | 2022-07-11 | 2022-09-16 | 河海大学 | Remote sensing image building feature extraction method based on attention network
CN115063685B (en) * | 2022-07-11 | 2023-10-03 | 河海大学 | Remote sensing image building feature extraction method based on attention network

Similar Documents

Publication Publication Date Title
US11783579B2 (en) Hyperspectral remote sensing image classification method based on self-attention context network
CN111563508B (en) Semantic segmentation method based on spatial information fusion
CN108875076B (en) Rapid trademark image retrieval method based on Attention mechanism and convolutional neural network
CN108520213B (en) Face beauty prediction method based on multi-scale depth
CN112597955B (en) Single-stage multi-person gesture estimation method based on feature pyramid network
CN110852393A (en) Remote sensing image segmentation method and system
CN112766229B (en) Human face point cloud image intelligent identification system and method based on attention mechanism
CN111259735B (en) Single-person attitude estimation method based on multi-stage prediction feature enhanced convolutional neural network
CN113870160B (en) Point cloud data processing method based on transformer neural network
CN111652273A (en) Deep learning-based RGB-D image classification method
CN112215157B (en) Multi-model fusion-based face feature dimension reduction extraction method
CN112949740A (en) Small sample image classification method based on multilevel measurement
CN111414875A (en) Three-dimensional point cloud head attitude estimation system based on depth regression forest
CN112733627A (en) Finger vein identification method based on fusion of local feature network and global feature network
CN114511710A (en) Image target detection method based on convolutional neural network
CN115966010A (en) Expression recognition method based on attention and multi-scale feature fusion
CN112766283A (en) Two-phase flow pattern identification method based on multi-scale convolution network
CN112507904A (en) Real-time classroom human body posture detection method based on multi-scale features
CN116310339A (en) Remote sensing image segmentation method based on matrix decomposition enhanced global features
CN110728186A (en) Fire detection method based on multi-network fusion
CN114495210A (en) Posture change face recognition method based on attention mechanism
CN113032613B (en) Three-dimensional model retrieval method based on interactive attention convolution neural network
CN113781385A (en) Joint attention-seeking convolution method for brain medical image automatic classification
CN113128560A (en) Attention module enhancement-based CNN regular script style classification method
CN109583406B (en) Facial expression recognition method based on feature attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20220915
Address after: 030000 Xueyuan Road 3, Taiyuan, Shanxi
Applicant after: NORTH University OF CHINA
Applicant after: Nantong Institute for Advanced Study
Address before: 226000 building w-9, Zilang science and Technology City, central innovation District, Nantong City, Jiangsu Province
Applicant before: Nantong Institute of intelligent optics, North China University
Applicant before: NORTH University OF CHINA