CN112115790A - Face recognition method and device, readable storage medium and electronic equipment


Info

Publication number
CN112115790A
Authority
CN
China
Prior art keywords
region
feature
face image
image
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010832409.5A
Other languages
Chinese (zh)
Inventor
林航东
张法朝
唐剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Didi Infinity Technology and Development Co Ltd filed Critical Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN202010832409.5A priority Critical patent/CN112115790A/en
Publication of CN112115790A publication Critical patent/CN112115790A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 Detection; Localisation; Normalisation
    • G06V 40/166 Detection; Localisation; Normalisation using acquisition arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 Feature extraction; Face representation
    • G06V 40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/172 Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention discloses a face recognition method and apparatus, a readable storage medium, and an electronic device. A target face image to be recognized is obtained, and feature extraction is performed on it to obtain a face image feature vector comprising a first region feature and a second region feature. A target feature vector determined from the face image feature vector is input into a first classifier and a second classifier respectively to obtain a corresponding first region state and second region state, from which the face recognition result is finally determined. When face recognition is performed, the embodiment of the invention can thus directly determine the face recognition result of a target face image comprising the whole face region through a classification model framework comprising two classifiers.

Description

Face recognition method and device, readable storage medium and electronic equipment
Technical Field
The invention relates to the field of computer technology, and in particular to a face recognition method and apparatus, a readable storage medium, and an electronic device.
Background
At present, face recognition technology is widely applied in many fields. In practical applications, because the scenes in which face recognition is performed are varied and complex, the face to be recognized is sometimes partially covered by an occluding object. For example, the eye region may be covered by a decorative article such as glasses or sunglasses, the mouth region may be covered by an article such as a mask or a face shield, or the face may be blocked by a hand. In some application scenarios, it is therefore desirable to identify what is covering a person's face.
Disclosure of Invention
In view of this, embodiments of the present invention provide a face recognition method and apparatus, a readable storage medium, and an electronic device, which are used to recognize occlusions covering the face in a target face image.
In a first aspect, an embodiment of the present invention provides a face recognition method, where the method includes:
determining a target face image to be recognized;
performing feature extraction on the target face image to determine a face image feature vector comprising a first regional feature and a second regional feature, wherein the first regional feature and the second regional feature are respectively used for representing feature information of a first region and a second region in the target face image;
determining a target feature vector corresponding to the face image feature vector;
inputting the target feature vector into a first classifier and a second classifier respectively, so as to determine a corresponding first region state and a corresponding second region state;
and determining a face recognition result according to the first region state and the second region state.
In a second aspect, an embodiment of the present invention provides a face recognition apparatus, where the apparatus includes:
the image determining module is used for determining a target face image to be recognized;
the feature extraction module is used for performing feature extraction on the target face image to determine a face image feature vector comprising a first region feature and a second region feature, wherein the first region feature and the second region feature are respectively used for representing feature information of a first region and a second region in the target face image;
the feature conversion module is used for determining a target feature vector corresponding to the face image feature vector;
the classification module is used for inputting the target feature vector into a first classifier and a second classifier respectively, so as to determine a corresponding first region state and a corresponding second region state;
and the recognition module is used for determining a face recognition result according to the first region state and the second region state.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium for storing computer program instructions, which when executed by a processor implement the method according to the first aspect.
In a fourth aspect, an embodiment of the present invention provides an electronic device, including a memory and a processor, wherein the memory is configured to store one or more computer program instructions, and wherein the one or more computer program instructions are executed by the processor to implement the method according to the first aspect.
In the embodiment of the invention, the target face image to be recognized is obtained and feature extraction is performed on it to obtain a face image feature vector comprising a first region feature and a second region feature. A target feature vector determined from the face image feature vector is input into a first classifier and a second classifier respectively to obtain a corresponding first region state and second region state, from which the face recognition result is finally determined. When face recognition is performed, the embodiment of the invention can thus directly determine the face recognition result of a target face image comprising the whole face region through a classification model framework comprising two classifiers.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent from the following description of the embodiments of the present invention with reference to the accompanying drawings, in which:
FIG. 1 is a flow chart of a face recognition method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a target face image according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a face recognition process according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a data processing procedure of a face recognition method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a face recognition apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an electronic device according to an embodiment of the invention.
Detailed Description
The present invention will be described below based on examples, but the present invention is not limited to only these examples. In the following detailed description of the present invention, certain specific details are set forth. It will be apparent to one skilled in the art that the present invention may be practiced without these specific details. Well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the present invention.
Further, those of ordinary skill in the art will appreciate that the drawings provided herein are for illustrative purposes and are not necessarily drawn to scale.
Unless the context clearly requires otherwise, throughout the description, the words "comprise", "comprising", and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is, what is meant is "including, but not limited to".
In the description of the present invention, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In addition, in the description of the present invention, "a plurality" means two or more unless otherwise specified.
The face recognition method of the embodiment of the invention can be implemented by a terminal device or a server on which a pre-trained classification model framework is deployed; that is, the face image to be processed is input into the classification model framework through the terminal device or the server to obtain the corresponding face recognition result. The terminal device may be a general data processing terminal capable of running a computer program and having a communication function, such as a smart phone, a tablet computer, or a notebook computer. The server may be a single server or a server cluster configured in a distributed manner. The face image to be processed may be acquired by an image acquisition device installed on the terminal device or connected to the server, or may be transmitted by other devices to the terminal device or server for face recognition.
Fig. 1 is a flowchart of a face recognition method according to an embodiment of the present invention, and as shown in fig. 1, the face recognition method includes the following steps:
and step S100, determining a target face image to be recognized.
Specifically, the target face image is an image on which face recognition needs to be performed, and it includes a plurality of regions to be recognized, such as the eye, mouth, and nose regions of the face. When the face recognition method is implemented by a terminal device or a server on which a pre-trained classification model framework is deployed, the target face image may be determined before being input into the classification model framework, or may be determined by the classification model framework itself.
Fig. 2 is a schematic diagram of a target face image according to an embodiment of the present invention. As shown in fig. 2, the target face image 20 includes the face region to be recognized, which can be further divided into a first region 21 for recognizing the state of the eye region of the face and a second region 22 for recognizing the state of the mouth region of the face.
In the embodiment of the present invention, the process of acquiring the target face image may include the following steps:
and step S110, acquiring a face image.
Specifically, the face image may be acquired directly by an image acquisition device installed on or connected to the terminal device that performs the face recognition processing, or by an image acquisition device connected to the server that performs the face recognition processing. For example, when the terminal device is a notebook computer, the face image may be acquired by the camera built into the notebook computer or by a connected camera device. Optionally, the face image may also be transmitted to the terminal device or server that performs the face recognition processing by other devices. For example, a stored face image may be transmitted for face recognition by an image storage device with a communication function to the terminal device or server.
Step S120, preprocessing the face image to determine a target face image to be recognized.
Specifically, the preprocessing process may comprise one or more preset preprocessing steps. In the embodiment of the present invention, the preprocessing steps may be set according to the characteristics of the classification model framework used for face recognition. For example, after the face image is input into the classification model framework, a preprocessing layer in the classification model framework preprocesses the face image to obtain the target face image. Optionally, the preprocessing step normalizes each pixel in the face image, that is, converts each channel value of each pixel into a value in a specific range according to a preset normalization condition. For example, since each channel value of each pixel in the image lies in the range 0 to 255, the normalization condition may compute (image / 255.0) × 2.0 − 1.0 for each channel value image of each pixel in the face image, yielding a value between −1 and 1.
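As an illustration, the normalization described above can be written as a short function. The following is a minimal sketch, assuming NumPy and a hypothetical helper name; it is not part of the original disclosure:

```python
import numpy as np

def normalize_face_image(image: np.ndarray) -> np.ndarray:
    """Map each channel value of each pixel from [0, 255] to [-1, 1],
    following the normalization condition (image / 255.0) * 2.0 - 1.0."""
    return (image.astype(np.float32) / 255.0) * 2.0 - 1.0
```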
Further, the preprocessing process may also include preprocessing steps performed before the face image is input into the classification model framework, for example, extracting the face region from the face image, or applying supplementary lighting to a face image captured in a night environment.
Step S200, performing feature extraction on the target face image to determine a face image feature vector comprising a first region feature and a second region feature.
Specifically, the target face image includes feature information corresponding to a plurality of regions to be recognized. In the embodiment of the invention, two regions to be recognized, namely the first region and the second region, can be determined according to their positions in the target face image. The first region feature is used to represent feature information of the first region in the target face image, and the second region feature is used to represent feature information of the second region. For example, when the region where the eyes are located in the target face image is determined to be the first region and the region where the mouth is located to be the second region, the first region feature characterizes the eyes and their vicinity, and the second region feature characterizes the mouth and its vicinity.
In the embodiment of the invention, in order to improve the accuracy of feature extraction from the target face image, the classification model framework for face recognition comprises both a conventional feature extraction layer and an attention mechanism layer. The classification model framework can therefore perform feature extraction on the specified first region and second region of the target face image in two different ways, obtaining an accurate and comprehensive face image feature vector. The feature extraction process includes the following steps:
step S210, extracting the characteristics of the first region and the second region in the target face image through a characteristic extraction layer to determine a first image characteristic vector.
Specifically, the feature extraction layer may be a pre-trained convolutional neural network (CNN) that extracts the features of the eye region and the mouth region in the target face image through a first convolution kernel corresponding to the first region and a second convolution kernel corresponding to the second region, obtaining a first feature map characterizing the eye region and a second feature map characterizing the mouth region. The first feature map and the second feature map are merged to obtain a first image feature vector comprising eye features and mouth features.
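A minimal sketch of such a two-branch extraction layer is given below, assuming PyTorch; the kernel sizes, channel counts, and class name are illustrative assumptions, since the patent does not specify the network architecture:

```python
import torch
import torch.nn as nn

class RegionFeatureExtractor(nn.Module):
    """Hypothetical sketch of the feature extraction layer: one
    convolution branch per region, whose feature maps are merged
    into the first image feature vector."""

    def __init__(self, in_channels: int = 3, out_channels: int = 32):
        super().__init__()
        # First convolution kernel, corresponding to the first (eye) region.
        self.eye_conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        # Second convolution kernel, corresponding to the second (mouth) region.
        self.mouth_conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        eye_map = self.eye_conv(x)      # first feature map
        mouth_map = self.mouth_conv(x)  # second feature map
        # Merge the two feature maps along the channel dimension.
        return torch.cat([eye_map, mouth_map], dim=1)
```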
Step S220, extracting features of the first region and the second region in the target face image through an attention mechanism layer to determine a second image feature vector.
Specifically, the attention mechanism layer further includes a plurality of attention modules, and the plurality of attention modules can collectively extract features in the target face image to determine a second image feature vector. In an embodiment of the present invention, the determining process of the second image feature vector includes the following steps:
step S221, extracting the features of the first region and the features of the second region in the target face image through a first attention module to determine a channel feature vector.
Specifically, after the target face image is input into the attention mechanism layer, a plurality of feature images are obtained through convolution and input into a first attention module in the attention mechanism layer. In an embodiment of the present invention, the first attention module may be a channel attention module. After the different feature images obtained by convolving the target face image are input into the channel attention module, global pooling and average pooling are performed on each feature image to obtain a global vector and an average vector. The global vector is used to represent the first region feature and the second region feature in the target face image, and the average vector is used to represent the overall features of the target face image. After the global vector and the average vector are determined, they are processed by a multi-layer perceptron (MLP) to obtain a global channel vector and an average channel vector. The global channel vector and the average channel vector are then combined by a preset function to obtain the channel feature vector.
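The following sketch shows one plausible reading of this channel attention module, in the spirit of CBAM. Treating "global pooling" as global max pooling, using a two-layer MLP, and combining the two channel vectors by addition followed by a sigmoid are all assumptions; the patent fixes none of these choices:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Hypothetical sketch of the first attention module."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.max_pool = nn.AdaptiveMaxPool2d(1)  # global vector per feature image
        self.avg_pool = nn.AdaptiveAvgPool2d(1)  # average vector per feature image
        # Shared multi-layer perceptron applied to both pooled vectors.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        global_channel = self.mlp(self.max_pool(x).view(b, c))  # global channel vector
        avg_channel = self.mlp(self.avg_pool(x).view(b, c))     # average channel vector
        # Preset combining function: addition followed by a sigmoid,
        # yielding one channel attention weight per feature image.
        return torch.sigmoid(global_channel + avg_channel).view(b, c, 1, 1)
```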
Step S222, extracting features of the first region and features of the second region in the target face image through a second attention module to determine a spatial feature vector.
Specifically, the target face image is convolved to obtain a plurality of feature images, which are further input into a second attention module in the attention mechanism layer. In an embodiment of the present invention, the second attention module is a spatial attention module. After the different feature images obtained by convolving the target face image are input into the spatial attention module, global pooling and average pooling are performed on the pixel values at the same position of each feature image to obtain a global vector and an average vector. The global vector is used to represent the first region feature and the second region feature in the target face image, and the average vector is used to represent the overall features of the target face image. The global vector and the average vector are merged and convolved, and the convolution result is passed through an activation function to obtain the spatial feature vector.
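A matching sketch of the spatial attention module is given below. Implementing the per-position pooling as max and mean over the channel dimension, the 7x7 convolution, and the sigmoid activation are assumptions in the style of CBAM; the patent only states that the pooled maps are merged, convolved, and activated:

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Hypothetical sketch of the second attention module."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        # Convolve the merged 2-channel map down to a single attention map.
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pool the pixel values at the same position across all feature images.
        global_map, _ = x.max(dim=1, keepdim=True)  # global vector per position
        avg_map = x.mean(dim=1, keepdim=True)       # average vector per position
        merged = torch.cat([global_map, avg_map], dim=1)
        # The activation function applied to the convolution result yields
        # the spatial feature vector (one weight per pixel position).
        return torch.sigmoid(self.conv(merged))
```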
Step S223, determining a second image feature vector according to the channel feature vector and the spatial feature vector.
Specifically, after the channel feature vector and the spatial feature vector are extracted by the first attention module and the second attention module respectively, they are merged to obtain a second image feature vector including the features of the first region and the second region in the target face image.
Step S230, determining a face image feature vector according to the first image feature vector and the second image feature vector.
Specifically, the face image feature vector may be determined by inputting the first image feature vector and the second image feature vector into a decoding layer (Decode) of the classification model framework. After the first image feature vector extracted by the feature extraction layer and the second image feature vector extracted by the attention mechanism layer are input into the decoding layer, they are merged and convolved in turn by the decoding layer to obtain a face image feature vector comprising the first region feature and the second region feature. That is, when the first region is the eye region and the second region is the mouth region, the face image feature vector includes the eye region features and the mouth region features of the face to be recognized.
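A minimal sketch of such a decoding layer follows; the single 3x3 convolution and the channel counts are illustrative assumptions:

```python
import torch
import torch.nn as nn

class DecodeLayer(nn.Module):
    """Hypothetical sketch of the decoding layer: merge the first
    image feature vector (feature extraction branch) and the second
    image feature vector (attention branch), then convolve."""

    def __init__(self, in_channels: int, out_channels: int = 64):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, first_feat: torch.Tensor, second_feat: torch.Tensor) -> torch.Tensor:
        merged = torch.cat([first_feat, second_feat], dim=1)  # merge
        return self.conv(merged)                              # convolve
```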
Step S300, determining a target feature vector corresponding to the face image feature vector.
Specifically, the target feature vector may be determined by inputting the face image feature vector into a post-processing layer (Postprocessing) of the classification model framework. The post-processing layer tiles the face image feature vector, that is, flattens the multi-dimensional vector into a one-dimensional vector, to determine the target feature vector.
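For completeness, the tiling operation itself is just a flatten; a one-line sketch assuming PyTorch and a hypothetical helper name:

```python
import torch

def tile_feature_vector(face_feature_map: torch.Tensor) -> torch.Tensor:
    """Flatten a multi-dimensional feature tensor (batch dimension first)
    into the one-dimensional target feature vector per sample."""
    return torch.flatten(face_feature_map, start_dim=1)
```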
Step S400, inputting the target feature vector into a first classifier and a second classifier respectively, so as to determine a corresponding first region state and a corresponding second region state.
Specifically, the first classifier and the second classifier are classifiers obtained by pre-training. The first classifier comprises a plurality of preset first candidate region states corresponding to the first region, and is used to judge the state of the eye region in the target face image from the input target feature vector. Depending on the preset configuration, the first candidate region states may include, for example, normal, wearing glasses, and wearing sunglasses. The second classifier comprises a plurality of preset second candidate region states corresponding to the second region, and is used to judge the state of the mouth region in the target face image from the input target feature vector. Depending on the preset configuration, the second candidate region states may include, for example, normal, wearing a mask, and covering the mouth.
After the target feature vector is determined, it is input into the trained first classifier. The first classifier corresponds to a first classification function obtained through training, which operates on the input target feature vector to output a probability value for each of the preset first candidate region states of the first classifier. The first classification function is a normalized exponential function (Softmax function); each probability value lies between 0 and 1, and the probability values sum to one. For example, when the first candidate region states of the first classifier are normal, wearing glasses, and wearing sunglasses, inputting the target feature vector into the first classifier yields the probability that the eye region of the target face image is normal, the probability that glasses are worn, and the probability that sunglasses are worn. The first candidate region state with the highest probability value is then determined to be the first region state, i.e. the state of the eye region in the target face image.
Meanwhile, the target feature vector is input into the second classifier. The second classifier corresponds to a second classification function obtained through training, which operates on the input target feature vector to output a probability value for each of the preset second candidate region states of the second classifier. The second classification function is likewise a normalized exponential function (Softmax function); each probability value lies between 0 and 1, and the probability values sum to one. For example, when the second candidate region states of the second classifier are normal, wearing a mask, and covering the mouth, inputting the target feature vector into the second classifier yields the probability that the mouth region of the target face image is normal, the probability that a mask is worn, and the probability that the mouth is covered by a hand. The second candidate region state with the highest probability value is then determined to be the second region state, i.e. the state of the mouth region in the target face image.
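The two classifier heads can be sketched as follows; the feature dimension, the class lists, and the use of linear layers are assumptions made for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoHeadFaceClassifier(nn.Module):
    """Hypothetical sketch of the two-classifier head of the
    classification model framework."""

    EYE_STATES = ["normal", "glasses", "sunglasses"]      # first candidate region states
    MOUTH_STATES = ["normal", "mask", "hand_over_mouth"]  # second candidate region states

    def __init__(self, feature_dim: int = 256):
        super().__init__()
        self.eye_head = nn.Linear(feature_dim, len(self.EYE_STATES))
        self.mouth_head = nn.Linear(feature_dim, len(self.MOUTH_STATES))

    def forward(self, target_vec: torch.Tensor):
        # Softmax turns each head's outputs into probabilities that
        # lie between 0 and 1 and sum to one.
        eye_probs = F.softmax(self.eye_head(target_vec), dim=-1)
        mouth_probs = F.softmax(self.mouth_head(target_vec), dim=-1)
        return eye_probs, mouth_probs

# The candidate state with the highest probability is taken as the
# region state, e.g. eye_state = eye_probs.argmax(dim=-1).
```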
Further, in the process of training the classification model framework, a first loss of the first classifier and a second loss of the second classifier are determined respectively, and the sum of the first loss and the second loss is computed as a third loss. The classification model framework is trained according to the first loss, the second loss, and the third loss, so as to realize adversarial training within each region and auxiliary adversarial training between different regions during the training process.
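A sketch of this three-loss scheme, assuming cross-entropy as the per-classifier loss (the patent does not name the loss function):

```python
import torch.nn.functional as F

def classification_losses(eye_logits, mouth_logits, eye_labels, mouth_labels):
    """Hypothetical sketch: per-head losses plus their sum.
    Note that F.cross_entropy expects pre-softmax logits."""
    first_loss = F.cross_entropy(eye_logits, eye_labels)       # first classifier loss
    second_loss = F.cross_entropy(mouth_logits, mouth_labels)  # second classifier loss
    third_loss = first_loss + second_loss                      # combined third loss
    return first_loss, second_loss, third_loss
```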
Step S500, determining a face recognition result according to the first region state and the second region state.
Specifically, after the first region state and the second region state are determined, the face recognition result is determined from them. Take as an example the case where the first candidate region states of the first classifier are normal, wearing glasses, and wearing sunglasses, and the second candidate region states of the second classifier are normal, wearing a mask, and covering the mouth. When the first region state determined by the first classifier is wearing glasses and the second region state determined by the second classifier is wearing a mask, the final face recognition result is that the face is wearing both glasses and a mask.
Fig. 3 is a schematic diagram of a face recognition process according to an embodiment of the present invention. As shown in fig. 3, after a face image 30 to be recognized, which includes two regions to be recognized (a first region and a second region), is obtained, the face image 30 is input into the trained classification model framework 31, and a corresponding face recognition result 32 is output. The face recognition result 32 includes the first region state and the second region state; take the first region as the eye region and the second region as the mouth region as an example. The face recognition result 32 may be, for example: eyes wearing glasses and mouth wearing a mask; eyes wearing sunglasses and mouth normal; eyes normal and mouth normal; and the like.
Fig. 4 is a schematic diagram of the data processing process of the face recognition method according to the embodiment of the present invention. As shown in fig. 4, after the face image is input into the classification model framework 40, it is preprocessed by the preprocessing layer 41 to obtain the target face image. The target face image is input into the feature extraction layer 42 and the attention mechanism layer 43 respectively for feature extraction, obtaining a first image feature vector and a second image feature vector, each including the features of the first region and the second region of the face image. In the embodiment of the present invention, the attention mechanism layer 43 includes a first attention module and a second attention module. The channel feature vector corresponding to the target face image is extracted by the first attention module, and the spatial feature vector is extracted by the second attention module. The channel feature vector and the spatial feature vector are merged and convolved to obtain the second image feature vector.
After the feature extraction layer 42 and the attention mechanism layer 43 each extract features from the target face image, the resulting first image feature vector and second image feature vector are input into the decoding layer 44, where they are merged and convolved to determine a face image feature vector comprising the first region feature and the second region feature. The post-processing layer 45 tiles the face image feature vector to obtain a target feature vector that the classifiers can process. The target feature vector is input into the trained first classifier 46 and second classifier 47 respectively. The first classifier 46 includes a plurality of preset first candidate region states, selects among them according to the input target feature vector, and outputs the first region state corresponding to the first region of the target face image. The second classifier 47 includes a plurality of preset second candidate region states, selects among them according to the input target feature vector, and outputs the second region state corresponding to the second region of the target face image. Finally, the face recognition result is obtained from the first region state and the second region state. For example, when the first region is the eye region and the second region is the mouth region, a face recognition result covering the eye region state and the mouth region state is output.
According to the face recognition method of the embodiment of the invention, during face recognition the feature extraction layer and the attention mechanism layer each extract features from the target face image, which improves the accuracy and completeness of the feature information extracted from the target face image. Meanwhile, because the classification model framework comprises two classifiers corresponding to different regions of the target face image, the target face image can be input into the classification model framework directly, without segmentation, and a face recognition result covering both regions of the target face image is obtained.
Fig. 5 is a schematic diagram of a face recognition apparatus according to an embodiment of the present invention, and as shown in fig. 5, the face recognition apparatus includes an image determination module 50, a feature extraction module 51, a feature conversion module 52, a classification module 53, and a recognition module 54.
Specifically, the image determining module 50 is configured to determine a target face image to be recognized. The feature extraction module 51 is configured to perform feature extraction on the target face image to determine a face image feature vector including a first regional feature and a second regional feature, where the first regional feature and the second regional feature are respectively used to represent feature information of a first region and a second region in the target face image. The feature conversion module 52 is configured to determine a target feature vector corresponding to the feature vector of the face image. The classification module 53 is configured to input the target feature vector into a first classifier and a second classifier, respectively, so as to determine a corresponding first region state and a corresponding second region state, respectively. The recognition module 54 is configured to determine a face recognition result according to the first region status and the second region status.
Further, the image determination module includes:
the image acquisition unit is used for acquiring a face image;
and the preprocessing unit is used for preprocessing the face image so as to determine a target face image to be recognized.
Further, the feature extraction module comprises:
the first feature extraction unit is used for extracting features of a first region and a second region in the target face image through a feature extraction layer to determine a first image feature vector;
the second feature extraction unit is used for extracting features of the first region and the second region in the target face image through an attention mechanism layer to determine a second image feature vector;
and the feature determination unit is used for determining the face image feature vector according to the first image feature vector and the second image feature vector.
Further, the second feature extraction unit includes:
the first feature extraction subunit is used for extracting the features of the first region and the features of the second region in the target face image through the first attention module so as to determine a channel feature vector;
the second feature extraction subunit is used for extracting the features of the first region and the features of the second region in the target face image by the second attention module so as to determine a spatial feature vector;
and the feature determination subunit is used for determining a second image feature vector according to the channel feature vector and the spatial feature vector.
Further, the feature determination unit specifically includes:
and the feature merging subunit is used for merging the first image feature vector and the second image feature vector to obtain the face image feature vector.
Further, the feature conversion module specifically includes:
and the vector tiling unit is used for tiling the face image feature vector to obtain the target feature vector.
Further, the first region state is an eye state of the target face image;
the second region state is a mouth state of the target face image.
During face recognition, the face recognition apparatus of the embodiment of the invention extracts features from the target face image through the feature extraction layer and the attention mechanism layer respectively, which improves the accuracy and completeness of the feature information extracted from the target face image. Meanwhile, because the classification model framework comprises two classifiers corresponding to different regions of the target face image, the target face image can be input into the classification model framework directly, without segmentation, and a face recognition result covering both regions of the target face image is obtained.
Fig. 6 is a schematic diagram of an electronic device according to an embodiment of the invention. As shown in fig. 6, the electronic device has a general computer hardware structure comprising at least a processor 60 and a memory 61, which are connected by a bus 62. The memory 61 is adapted to store instructions or programs executable by the processor 60. The processor 60 may be a stand-alone microprocessor or a collection of one or more microprocessors. Thus, the processor 60 implements the processing of data and the control of other devices by executing the instructions stored in the memory 61, thereby performing the method flows of the embodiments of the present invention described above. The bus 62 connects the above components together and also connects them to a display controller 63, a display device, and an input/output (I/O) device 64. The input/output (I/O) device 64 may be a mouse, keyboard, modem, network interface, touch input device, motion sensing input device, printer, or other device known in the art. Typically, the input/output device 64 is connected to the system through an input/output (I/O) controller 65.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, apparatus (device) or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may employ a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations of methods, apparatus (devices) and computer program products according to embodiments of the application. It will be understood that each flow in the flow diagrams can be implemented by computer program instructions.
These computer program instructions may be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows.
These computer program instructions may also be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows.
Another embodiment of the invention is directed to a non-transitory storage medium storing a computer-readable program for causing a computer to perform some or all of the above-described method embodiments.
That is, as can be understood by those skilled in the art, all or part of the steps of the methods of the embodiments described above may be implemented by a program instructing related hardware. The program is stored in a storage medium and includes several instructions to cause a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (16)

1. A face recognition method, comprising:
determining a target face image to be recognized;
performing feature extraction on the target face image to determine a face image feature vector comprising a first regional feature and a second regional feature, wherein the first regional feature and the second regional feature are respectively used for representing feature information of a first region and a second region in the target face image;
determining a target feature vector corresponding to the face image feature vector;
inputting the target feature vector into a first classifier and a second classifier respectively, so as to determine a corresponding first region state and a corresponding second region state;
and determining a face recognition result according to the first region state and the second region state.
2. The method of claim 1, wherein the determining the target face image to be recognized comprises:
acquiring a face image;
and preprocessing the face image to determine a target face image to be recognized.
3. The method of claim 1, wherein the performing feature extraction on the target face image to determine a face image feature vector comprising a first region feature and a second region feature comprises:
extracting the features of a first region and a second region in the target face image through a feature extraction layer to determine a first image feature vector;
extracting the features of the first region and the second region in the target face image through an attention mechanism layer to determine a second image feature vector;
and determining a facial image feature vector according to the first image feature vector and the second image feature vector.
4. The method of claim 3, wherein the extracting, by the attention mechanism layer, the features of the first region and the second region in the target face image to determine the second image feature vector comprises:
extracting the features of a first region and the features of a second region in the target face image through a first attention module to determine a channel feature vector;
extracting the features of the first region and the features of the second region in the target face image through a second attention module to determine a spatial feature vector;
and determining a second image feature vector according to the channel feature vector and the spatial feature vector.
5. The method according to claim 3, wherein the determining the face image feature vector from the first image feature vector and the second image feature vector is specifically:
merging the first image feature vector and the second image feature vector to obtain the face image feature vector.
6. The method according to claim 1, wherein the determining the target feature vector corresponding to the face image feature vector specifically comprises:
and tiling the face image feature vector to obtain the target feature vector.
7. The method according to claim 1, wherein the first region state is an eye state of the target face image;
the second region state is a mouth state of the target face image.
8. An apparatus for face recognition, the apparatus comprising:
the image determining module is used for determining a target face image to be recognized;
the feature extraction module is used for performing feature extraction on the target face image to determine a face image feature vector comprising a first region feature and a second region feature, wherein the first region feature and the second region feature are respectively used for representing feature information of a first region and a second region in the target face image;
the feature conversion module is used for determining a target feature vector corresponding to the face image feature vector;
the classification module is used for inputting the target feature vector into a first classifier and a second classifier respectively, so as to determine a corresponding first region state and a corresponding second region state;
and the recognition module is used for determining a face recognition result according to the first region state and the second region state.
9. The apparatus of claim 8, wherein the image determination module comprises:
the image acquisition unit is used for acquiring a face image;
and the preprocessing unit is used for preprocessing the face image so as to determine a target face image to be recognized.
10. The apparatus of claim 8, wherein the feature extraction module comprises:
the first feature extraction unit is used for extracting features of a first region and a second region in the target face image through a feature extraction layer to determine a first image feature vector;
the second feature extraction unit is used for extracting features of the first region and the second region in the target face image through an attention mechanism layer to determine a second image feature vector;
and the feature determination unit is used for determining the face image feature vector according to the first image feature vector and the second image feature vector.
11. The apparatus according to claim 10, wherein the second feature extraction unit comprises:
the first feature extraction subunit is used for extracting the features of the first region and the features of the second region in the target face image through the first attention module so as to determine a channel feature vector;
the second feature extraction subunit is used for extracting the features of the first region and the features of the second region in the target face image by the second attention module so as to determine a spatial feature vector;
and the feature determination subunit is used for determining a second image feature vector according to the channel feature vector and the spatial feature vector.
12. The apparatus according to claim 10, wherein the feature determination unit specifically includes:
and the feature merging subunit is used for merging the first image feature vector and the second image feature vector to obtain the face image feature vector.
13. The apparatus of claim 8, wherein the feature transformation module specifically comprises:
and the vector tiling unit is used for tiling the face image feature vector to obtain the target feature vector.
14. The apparatus according to claim 8, wherein the first region state is an eye state of the target face image;
the second region state is a mouth state of the target face image.
15. A computer readable storage medium storing computer program instructions which, when executed by a processor, implement the method of any one of claims 1-7.
16. An electronic device comprising a memory and a processor, wherein the memory is configured to store one or more computer program instructions, wherein the one or more computer program instructions are executed by the processor to implement the method of any of claims 1-7.
CN202010832409.5A 2020-08-18 2020-08-18 Face recognition method and device, readable storage medium and electronic equipment Pending CN112115790A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010832409.5A CN112115790A (en) 2020-08-18 2020-08-18 Face recognition method and device, readable storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010832409.5A CN112115790A (en) 2020-08-18 2020-08-18 Face recognition method and device, readable storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN112115790A 2020-12-22

Family

ID=73805037

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010832409.5A Pending CN112115790A (en) 2020-08-18 2020-08-18 Face recognition method and device, readable storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN112115790A (en)

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060072833A1 (en) * 2004-10-01 2006-04-06 Microsoft Corporation Method and system for progressive image transmission
US10594957B2 (en) * 2014-08-06 2020-03-17 Beijing Zhigu Rui Tuo Tech Co., Ltd Image acquisition control methods and apparatuses, and image acquisition devices
US20180239954A1 (en) * 2015-09-08 2018-08-23 Nec Corporation Face recognition system, face recognition method, display control apparatus, display control method, and display control program
CN105678250A (en) * 2015-12-31 2016-06-15 北京小孔科技有限公司 Face identification method in video and face identification device in video
US10706301B2 (en) * 2017-03-31 2020-07-07 Panasonic Intellectual Property Managment Co., Ltd. Detection device, learning device, detection method, learning method, and recording medium
CN107358573A (en) * 2017-06-16 2017-11-17 广东欧珀移动通信有限公司 Image U.S. face treating method and apparatus
CN107633235A (en) * 2017-09-27 2018-01-26 广东欧珀移动通信有限公司 Solve lock control method and Related product
US10613323B1 (en) * 2017-12-13 2020-04-07 Facebook Technologies, Llc Transition feature for framing multizone optics
CN108171149A (en) * 2017-12-26 2018-06-15 广东工业大学 A kind of face identification method, device, equipment and readable storage medium storing program for executing
CN109101907A (en) * 2018-07-28 2018-12-28 华中科技大学 A kind of vehicle-mounted image, semantic segmenting system based on bilateral segmentation network
CN110046652A (en) * 2019-03-18 2019-07-23 深圳神目信息技术有限公司 Face method for evaluating quality, device, terminal and readable medium
CN110222686A (en) * 2019-05-27 2019-09-10 腾讯科技(深圳)有限公司 Object detecting method, device, computer equipment and storage medium
CN110533724A (en) * 2019-09-06 2019-12-03 电子科技大学 Monocular vision Method for Calculate Mileage based on deep learning and attention mechanism
CN110569826A (en) * 2019-09-18 2019-12-13 深圳市捷顺科技实业股份有限公司 Face recognition method, device, equipment and medium
CN111414813A (en) * 2020-03-03 2020-07-14 南京领行科技股份有限公司 Dangerous driving behavior identification method, device, equipment and storage medium
CN111429447A (en) * 2020-04-03 2020-07-17 深圳前海微众银行股份有限公司 Focal region detection method, device, equipment and storage medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112560756A (en) * 2020-12-24 2021-03-26 北京嘀嘀无限科技发展有限公司 Method, device, electronic equipment and storage medium for recognizing human face
CN113591718A (en) * 2021-07-30 2021-11-02 北京百度网讯科技有限公司 Target object identification method and device, electronic equipment and storage medium
CN114327062A (en) * 2021-12-28 2022-04-12 深圳Tcl新技术有限公司 Man-machine interaction method, device, electronic equipment, storage medium and program product

Similar Documents

Publication Publication Date Title
EP3539054B1 (en) Neural network image processing apparatus
CN109558832B (en) Human body posture detection method, device, equipment and storage medium
US10019655B2 (en) Deep-learning network architecture for object detection
CN109902548B (en) Object attribute identification method and device, computing equipment and system
CN109657554B (en) Image identification method and device based on micro expression and related equipment
WO2020078119A1 (en) Method, device and system for simulating user wearing clothing and accessories
CN111797657A (en) Vehicle peripheral obstacle detection method, device, storage medium, and electronic apparatus
CN112115790A (en) Face recognition method and device, readable storage medium and electronic equipment
CN112446270A (en) Training method of pedestrian re-identification network, and pedestrian re-identification method and device
KR20210028185A (en) Human posture analysis system and method
CN110569731A (en) face recognition method and device and electronic equipment
CN111310705A (en) Image recognition method and device, computer equipment and storage medium
KR20210142177A (en) Methods and devices for detecting children's conditions, electronic devices, memory
CN111062328B (en) Image processing method and device and intelligent robot
KR102400609B1 (en) A method and apparatus for synthesizing a background and a face by using deep learning network
CN112052746A (en) Target detection method and device, electronic equipment and readable storage medium
CN113272816A (en) Whole-person correlation for face screening
CN112487844A (en) Gesture recognition method, electronic device, computer-readable storage medium, and chip
US20230298348A1 (en) Clothing standardization detection method and apparatus
CN114519877A (en) Face recognition method, face recognition device, computer equipment and storage medium
JP2019109843A (en) Classification device, classification method, attribute recognition device, and machine learning device
CN111104911A (en) Pedestrian re-identification method and device based on big data training
KR102246471B1 (en) Apparatus for detecting nose of animal in image and method thereof
Jang et al. User oriented language model for face detection
CN109858355B (en) Image processing method and related product

Legal Events

Code  Description
PB01  Publication
SE01  Entry into force of request for substantive examination
RJ01  Rejection of invention patent application after publication (application publication date: 20201222)