CN115240249B - Feature extraction classification metric learning method, system and storage medium for face recognition - Google Patents

Feature extraction classification metric learning method, system and storage medium for face recognition

Info

Publication number
CN115240249B
CN115240249B (application CN202210804354.6A)
Authority
CN
China
Prior art keywords
feature map
face
feature
mapping
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210804354.6A
Other languages
Chinese (zh)
Other versions
CN115240249A (en)
Inventor
陈侃松
姚福娟
许立君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubei University
Original Assignee
Hubei University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubei University filed Critical Hubei University
Priority to CN202210804354.6A priority Critical patent/CN115240249B/en
Publication of CN115240249A publication Critical patent/CN115240249A/en
Application granted granted Critical
Publication of CN115240249B publication Critical patent/CN115240249B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 - Feature extraction; Face representation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 - Classification, e.g. identification
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a feature extraction and classification metric learning method, system and storage medium for face recognition, wherein the method comprises the following steps: acquiring an initial face image that has been detected and aligned; performing initial feature extraction on the initial face image to obtain a face feature map to be mapped; mapping the face feature map through a ThetaMEX function to obtain a mapping feature map; and performing classification metric learning on the mapping feature map through a classification loss function. The method improves image precision, reduces the false recognition rate, and keeps the time complexity of classification metric learning low.

Description

Feature extraction classification metric learning method, system and storage medium for face recognition
Technical Field
The invention relates to the field of face recognition, in particular to a method, a system and a storage medium for learning feature extraction classification metrics of face recognition.
Background
As one of the most important identity recognition technologies, face recognition is non-contact, real-time and secure, and is now widely applied in fields such as intelligent security and mobile payment. Feature extraction from face images is a key step of face recognition; with the rapid development of deep learning and computer vision, the self-learned features of convolutional neural networks achieve higher accuracy and better generalization than hand-crafted features.
Conventional face recognition neural networks have the following defects:
Widely used face image feature extraction operations can extract face features well, but the resulting feature maps have a certain sparsity; the common global average, max and stochastic pooling operations further increase this sparsity, and an overly sparse feature map loses image features and causes a high false recognition rate.
The loss function plays a crucial role in a deep neural network. The two commonly used metric learning loss functions, the contrastive loss function and the triplet loss function, are based on metric (similarity) learning and suffer from complex sampling, which greatly increases the difficulty of training the neural network.
Disclosure of Invention
The invention provides a feature extraction and classification metric learning method, system and storage medium for face recognition, which improve image precision and reduce the false recognition rate during feature extraction while keeping the time complexity of classification metric learning low.
In a first aspect, a method for learning feature extraction classification metrics of face recognition is provided, including the following steps:
acquiring an initial face image which is detected to be aligned;
performing initial feature extraction processing on the initial face image to obtain a face feature image to be mapped;
mapping the face feature map through a ThetaMEX function to obtain a mapping feature map;
and performing classification metric learning on the mapping feature map through a classification loss function.
According to the first aspect, in a first possible implementation manner of the first aspect, the step of performing an initial feature extraction process on the initial face image to obtain a face feature map to be mapped specifically includes the following steps:
sliding the convolution kernel from an initial position to an end position on the initial face image to obtain feature extraction matrix data;
carrying out data normalization processing on the feature extraction matrix data;
and carrying out nonlinear transformation on the feature extraction matrix data subjected to normalization processing through a nonlinear ReLU function to obtain a face feature map to be mapped.
In a second possible implementation manner of the first aspect, the step of mapping the face feature map through a ThetaMEX function to obtain a mapping feature map specifically includes the following steps:
obtaining the mapping feature map feature_k according to the following formula:

$$\mathrm{feature}_k=\frac{1}{\theta_k}\ln\left(\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}e^{\theta_k x_{ij}^{(k)}}\right)\qquad\text{formula (1)}$$

In formula (1), feature_k is the kth mapping feature map output by the ThetaMEX function, M is the width of the face feature map, N is the height of the face feature map, x_ij^(k) is the feature value in the ith row and jth column of the input kth face feature map, and θ_k is the hyper-parameter of the kth ThetaMEX operator.
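Although not spelled out in the specification, the limiting behaviour of this log-mean-exp operator is standard and is what underlies the sparsity-control property claimed below:

$$\lim_{\theta_k\to+\infty}\mathrm{feature}_k=\max_{i,j}x_{ij}^{(k)},\qquad\lim_{\theta_k\to 0}\mathrm{feature}_k=\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}x_{ij}^{(k)},\qquad\lim_{\theta_k\to-\infty}\mathrm{feature}_k=\min_{i,j}x_{ij}^{(k)}$$

A single adjustable θ_k per feature map therefore lets the pooling behave anywhere between max pooling, average pooling and min pooling.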
According to the second possible implementation manner of the first aspect, in a third possible implementation manner of the first aspect, the step of performing classification metric learning on the mapping feature map through a classification loss function specifically includes the following steps:
the classification loss function comprises a classification contrastive loss function and a classification triplet loss function;
performing classification metric learning on the mapping feature map through the classification contrastive loss function; or,
performing classification metric learning on the mapping feature map through the classification triplet loss function.
In a fourth possible implementation manner of the first aspect, the step of performing classification metric learning on the mapping feature map through a classification contrastive loss function specifically includes the following steps:
performing classification metric learning on the mapping feature map through the following classification contrastive loss function:

$$L_{cc}=\frac{1}{2N}\sum_{i=1}^{N}\left[y_i\left\|x_i-W_i\right\|_2^2+(1-y_i)\max\left(margin-\left\|x_i-W_i\right\|_2,\,0\right)^2\right]\qquad\text{formula (2)}$$

In formula (2), N is the number of mapping feature maps, x_i is the feature vector of the ith mapping feature map, W_i is the column vector of the ith column of the weight matrix, y_i indicates whether x_i belongs to the class represented by W_i (y_i = 1 for a similar sample, y_i = 0 otherwise), and margin is a preset threshold.
According to the fourth possible implementation manner of the first aspect, in a fifth possible implementation manner of the first aspect, the step of performing classification metric learning on the mapping feature map through a classification triplet loss function specifically includes the following steps:
performing classification metric learning on the mapping feature map through the following classification triplet loss function:

$$L_{ct}=\sum_{i=1}^{N}\max\left(\left\|f(x_i^{a})-W_i^{p}\right\|_2^2-\left\|f(x_i^{a})-W_i^{n}\right\|_2^2+margin,\;0\right)\qquad\text{formula (3)}$$

In formula (3), x_i^a is the anchor of the mapping feature map, f(x_i^a) is the anchor feature vector, W_i^p is the column vector of the weight matrix similar to the anchor feature vector (standing in for a mapping feature map of the same class as the anchor), W_i^n is the column vector of the weight matrix dissimilar to the anchor feature vector (standing in for a mapping feature map of a different class), and margin is a preset threshold.
In a sixth possible implementation manner of the first aspect, before the step of acquiring the initial face image that has been detected and aligned, the method specifically includes the following steps:
controlling a camera to shoot a portrait target object, obtaining a portrait video stream, and detecting a face image to be adjusted in the portrait video stream;
and adjusting the face image to be adjusted to be aligned with the camera to obtain the initial face image.
In a second aspect, a feature extraction classification metric learning system for face recognition is provided, including:
the initial face image acquisition module is used for acquiring the initial face image which is detected and aligned;
a to-be-mapped face feature map acquisition module, communicatively connected with the initial face image acquisition module, configured to perform initial feature extraction processing on the initial face image to obtain a face feature map to be mapped;
a mapping feature map acquisition module, communicatively connected with the to-be-mapped face feature map acquisition module, configured to map the face feature map through a ThetaMEX function to obtain a mapping feature map;
and a classification metric learning module, communicatively connected with the mapping feature map acquisition module, configured to perform classification metric learning on the mapping feature map through a classification loss function.
According to the second aspect, in a first possible implementation manner of the second aspect, the system includes: a detection and alignment module, communicatively connected with the initial face image acquisition module, configured to control a camera to photograph a portrait target object, obtain a portrait video stream, and detect a face image to be adjusted in the portrait video stream; and to adjust the face image to be adjusted to be aligned with the camera to obtain the initial face image.
In a third aspect, a storage medium is provided, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements a feature extraction classification metric learning method for face recognition as described above.
Compared with the prior art, the invention has the following advantages. First, an initial face image that has been detected and aligned is acquired; initial feature extraction is then performed on it to obtain a face feature map to be mapped. The face feature map is mapped through the ThetaMEX function to obtain a mapping feature map; because the parameter θ of the ThetaMEX function can adaptively map the output to any value between the maximum and the minimum, the output of the ThetaMEX function, and hence the sparsity of the feature map, can be freely controlled, so the obtained mapping feature map improves image precision and reduces the false recognition rate. Classification metric learning is then performed on the mapping feature map through a classification loss function: a classification idea is introduced on the basis of metric learning, so the classification loss only needs to extract, in advance, a class center as a 'simulation class' for a group of samples and then compare each sample with the 'simulation class'. The time complexity of the function is therefore low; two samples no longer need to be drawn simultaneously at each step, and metric learning can proceed with only one sample image collected at a time.
Drawings
FIG. 1 is a flow chart of an embodiment of a feature extraction classification metric learning method for face recognition according to the present invention;
FIG. 2 is a schematic diagram of a ThetaMEX function mapping flow of the present invention;
FIG. 3 is a schematic structural diagram of the feature extraction classification metric learning system for face recognition according to the present invention;
reference numerals:
100. feature extraction classification metric learning system for face recognition; 110. initial face image acquisition module; 120. to-be-mapped face feature map acquisition module; 130. mapping feature map acquisition module; 140. classification metric learning module; 150. detection and alignment module.
Detailed Description
Reference will now be made in detail to the present embodiments of the invention, examples of which are illustrated in the accompanying drawings. While the invention will be described in conjunction with the specific embodiments, it will be understood that they are not intended to limit the invention to the described embodiments. On the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims. It should be noted that the method steps described herein may be implemented by any functional block or arrangement of functions, and any functional block or arrangement of functions may be implemented as a physical entity or a logical entity, or a combination of both.
The present invention will be described in further detail below with reference to the drawings and detailed description for the purpose of enabling those skilled in the art to understand the invention better.
Note that: the examples to be described below are only one specific example, and not as limiting the embodiments of the present invention necessarily to the following specific steps, values, conditions, data, sequences, etc. Those skilled in the art can, upon reading the present specification, make and use the concepts of the invention to construct further embodiments not mentioned in the specification.
Referring to fig. 1, an embodiment of the present invention provides a feature extraction classification metric learning method for face recognition, including the following steps:
s100, acquiring an initial face image which is detected to be aligned;
s200, carrying out initial feature extraction processing on the initial face image to obtain a face feature image to be mapped;
s300, mapping the face feature map through a theta MEX function to obtain a mapping feature map;
s400, carrying out classification metric learning on the mapping feature map through a classification loss function.
Specifically, in this embodiment, an initial face image that has been detected and aligned is first acquired; initial feature extraction processing is then performed on the initial face image to obtain a face feature map to be mapped.
because the existing face image feature extraction operation can better extract face features, but the extracted face feature images have certain sparsity, and the common global averaging, maximum and random pooling operations can aggravate the sparsity of the feature images, and the too sparse feature images can lose image features to cause high false recognition rate, so that the theta MEX function is embedded into the face verification neural network models MobileFaceneT and Light-CNN for the face feature extraction operation, the pooling layers of the two models are replaced, and the specific steps are that Global depthwise Conv in the MobileFaceneT is replaced by ThetaMEX Pool, and other network structures are not changed; similarly, the global average pooling layer in the Light-CNN is replaced by a theta MEX Pool, so that the face feature map is mapped through a theta MEX function to obtain a mapping feature map, the value of the parameter theta of the theta MEX function can be adaptively mapped to any value between a maximum value and a minimum value, the output of the theta MEX function can be freely controlled, the sparsity of the feature map is controlled, and the obtained mapping feature map can improve the image precision and reduce the false recognition rate.
Because the existing metric learning (similarity learning) loss functions compare samples with each other, their time complexity is high and sampling is difficult. After the mapping feature maps are acquired, classification metric learning is therefore performed on them through a classification loss function: a classification idea is introduced on the basis of metric learning, so the classification loss only needs to extract, in advance, a class center as a 'simulation class' for a group of samples and then compare each sample with the 'simulation class'. The time complexity of the function is thus low; two samples no longer need to be drawn simultaneously at each step, and metric learning can proceed with only one sample image collected at a time.
Preferably, in another embodiment of the present application, the step of "S200, performing an initial feature extraction process on the initial face image, and obtaining a face feature map to be mapped" specifically includes the following steps:
s210, sliding a convolution kernel from an initial position to an end position on the initial face image to obtain feature extraction matrix data;
s220, carrying out data normalization processing on the feature extraction matrix data;
s230, carrying out nonlinear transformation on the feature extraction matrix data subjected to normalization processing through a nonlinear ReLU function, and obtaining a face feature map to be mapped.
Specifically, in this embodiment, S210 performs convolution on the initial face image. The image convolution is similar to convolution in signal processing: a small square convolution kernel slides over the (two-dimensional) initial face image, the pixel values covered by the kernel are multiplied by the corresponding kernel values and summed, and this repeats until the kernel reaches the end position of the image, yielding the feature extraction matrix data and reducing the number of parameters fed into the ThetaMEX pooling layer.
S220 normalizes the feature extraction matrix data. The mean of the feature extraction matrix data is calculated by formula (4) and its standard deviation by formula (5):

$$\mu=\frac{1}{n}\sum_{i=1}^{n}x_i\qquad\text{formula (4)}$$

$$\sigma=\sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2+c}\qquad\text{formula (5)}$$

In formulas (4) and (5), n is the batch size of the feature extraction matrix data, x_i is the ith feature map in the feature extraction matrix data set, and c is a nonzero constant that keeps the denominator of formula (6) away from zero.

Data normalization is then performed through formula (6):

$$\hat{x}_i=\gamma\,\frac{x_i-\mu}{\sigma}+\beta\qquad\text{formula (6)}$$

In formula (6), γ is a scaling variable and β is a translation variable.
Therefore, by normalizing the data to a fixed interval, the original learned characteristics of the data are preserved, training is accelerated, and a better data distribution is provided for the subsequent activation function.
S230: since the convolution and the normalization are both linear transformations, a neural network built only from them has poor expressive power, equivalent to the most primitive perceptron; no matter how many layers are stacked, the whole network still computes a linear transformation. Therefore, the normalized feature extraction matrix data undergoes a nonlinear transformation through the nonlinear ReLU function, which serves as the activation function of the network, producing the face feature map to be mapped. The ReLU function not only provides a nonlinear transformation of the data but also enhances the fault tolerance of the network and improves its generalization ability.
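As an illustration of steps S220-S230, the following is a minimal PyTorch sketch of the normalization of formulas (4)-(6) followed by the ReLU activation; the function name and the per-batch tensor layout are assumptions of this sketch, not the patent's reference code.

```python
import torch


def normalize_and_activate(x: torch.Tensor, gamma: torch.Tensor,
                           beta: torch.Tensor, c: float = 1e-5) -> torch.Tensor:
    """Batch normalization per formulas (4)-(6), then ReLU (S230).

    x stacks n feature maps along dim 0; gamma (scaling) and beta
    (translation) are the learnable variables of formula (6); c is the
    nonzero constant that keeps the denominator away from zero.
    """
    mu = x.mean(dim=0)                                    # formula (4)
    sigma = torch.sqrt(x.var(dim=0, unbiased=False) + c)  # formula (5)
    x_hat = gamma * (x - mu) / sigma + beta               # formula (6)
    return torch.relu(x_hat)                              # S230: nonlinearity
```

Standard layers such as nn.BatchNorm2d followed by nn.ReLU provide an equivalent, battle-tested implementation; the explicit version above only mirrors formulas (4)-(6).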
Preferably, in another embodiment of the present application, the step of "S300, mapping the face feature map through a ThetaMEX function to obtain a mapping feature map" specifically includes the following steps:
obtaining the mapping feature map feature_k according to the following formula:

$$\mathrm{feature}_k=\frac{1}{\theta_k}\ln\left(\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}e^{\theta_k x_{ij}^{(k)}}\right)\qquad\text{formula (1)}$$

In formula (1), feature_k is the kth mapping feature map output by the ThetaMEX function, M is the width of the face feature map, N is the height of the face feature map, x_ij^(k) is the feature value in the ith row and jth column of the input kth face feature map, and θ_k is the hyper-parameter of the kth ThetaMEX operator.
Specifically, in this embodiment, referring to FIG. 2, in the pooling layer designed around the ThetaMEX function, each face feature map is sent to its own independent ThetaMEX operator and mapped by formula (1) to obtain a mapping feature map. The ThetaMEX pooling layer uses maximum sampling with a sampling size of 2×2; that is, the input face feature map is divided into non-overlapping 2×2 rectangles and a value is pooled over each rectangle, so the length and width of the output feature map are half those of the input. After replacing the pooling layer of the original network, the ThetaMEX pooling layer improves the recall rate by about 1%-2% at a false recognition rate below 0.1%, and improves the accuracy by about 2%-6% at a false recognition rate of at most 1%. The output of the ThetaMEX function can therefore be freely controlled, controlling the sparsity of the feature map, and the obtained mapping feature map improves image precision and reduces the false recognition rate.
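A minimal sketch of a global ThetaMEX pooling layer implementing formula (1) in PyTorch follows; the class name, the per-channel learnable θ and the (B, C, M, N) tensor layout are assumptions of this sketch rather than the patent's implementation.

```python
import math

import torch
import torch.nn as nn


class ThetaMEXPool(nn.Module):
    """Global ThetaMEX pooling per channel, following formula (1).

    Each of the C channels owns a learnable theta_k. The log-mean-exp
    interpolates between max pooling (theta -> +inf), average pooling
    (theta -> 0) and min pooling (theta -> -inf), so theta controls the
    sparsity of the pooled output. theta is assumed to stay away from
    zero here; the theta -> 0 limit would need a separate branch.
    """

    def __init__(self, channels: int, init_theta: float = 1.0):
        super().__init__()
        self.theta = nn.Parameter(torch.full((channels,), init_theta))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, M, N) -> (B, C), one pooled value per feature map
        b, c, m, n = x.shape
        flat = x.reshape(b, c, m * n)
        theta = self.theta.view(1, c, 1)
        # (1/theta_k) * log( (1/(M*N)) * sum_ij exp(theta_k * x_ij) )
        lse = torch.logsumexp(theta * flat, dim=2)        # (B, C)
        return (lse - math.log(m * n)) / self.theta


# usage: a drop-in replacement for a global average pooling layer
pool = ThetaMEXPool(channels=512)
embedding = pool(torch.randn(8, 512, 7, 7))               # -> (8, 512)
```

As θ_k grows the layer approaches global max pooling, and as θ_k shrinks toward zero it approaches global average pooling, which is the sparsity control described above.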
Preferably, in another embodiment of the present application, the step of "S400, performing classification metric learning on the mapping feature map through a classification loss function" specifically includes the following steps:
s410, classifying the loss function comprises classifying the contrast loss function and classifying the triplet loss function;
s420, carrying out classification measurement learning on the mapping feature map through a classification comparison loss function; or alternatively, the process may be performed,
s430, performing classification metric learning on the mapping feature map through a classification triplet loss function.
Specifically, in this embodiment, the time complexities of the contrastive loss function and the triplet loss function, the two commonly used metric learning loss functions, reach O(N²) and O(N³) respectively, while the time complexity of the classification loss function is O(N); introducing the classification idea into the contrastive loss function and the triplet loss function therefore reduces the time complexity of metric learning.
Preferably, in another embodiment of the present application, the step of "S420, performing classification metric learning on the mapping feature map through a classification contrastive loss function" specifically includes the following steps:
performing classification metric learning on the mapping feature map through the following classification contrastive loss function:
$$L_{cc}=\frac{1}{2N}\sum_{i=1}^{N}\left[y_i\left\|x_i-W_i\right\|_2^2+(1-y_i)\max\left(margin-\left\|x_i-W_i\right\|_2,\,0\right)^2\right]\qquad\text{formula (2)}$$

In formula (2), N is the number of mapping feature maps, x_i is the feature vector of the ith mapping feature map, W_i is the column vector of the ith column of the weight matrix, y_i indicates whether x_i belongs to the class represented by W_i, and margin is a preset threshold. Specifically, in this embodiment, the formula of the conventional contrastive loss function is as follows:

$$L=\frac{1}{2N}\sum_{(i,j)}\left[y\left\|X_i-X_j\right\|_2^2+(1-y)\max\left(margin-\left\|X_i-X_j\right\|_2,\,0\right)^2\right]\qquad\text{formula (7)}$$

In formula (7), N represents the number of sample images, X_i and X_j represent the feature vectors of two sample images, y = 1 indicates that the two sample images are similar and y = 0 that they are dissimilar, and margin represents a threshold, meaning that when two samples are dissimilar, the distance between their feature vectors in the high-dimensional space must be optimized to be at least the margin value.
In the classification contrastive loss function of formula (2), by contrast, W_i, the column vector of the ith column of the weight matrix, is the 'simulation class' of the mapping feature map. Whereas formula (7) requires two sample images to be acquired for every metric learning step, the classification contrastive loss function only needs the simulation class W_i to be set in advance; the mapping feature maps are then selected one by one and compared with the simulation class.
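A minimal PyTorch sketch of the classification contrastive loss of formula (2) follows; the class name, the calling convention (per-sample target columns and similarity labels) and the weight initialization are assumptions of this sketch.

```python
import torch
import torch.nn as nn


class ClassificationContrastiveLoss(nn.Module):
    """Contrastive loss against learned class centers (formula (2)).

    Each feature vector x_i is compared with a column W_i of the weight
    matrix (the 'simulation class') instead of a second sample image,
    so a single sample per step suffices and the complexity is O(N).
    """

    def __init__(self, feat_dim: int, num_classes: int, margin: float = 1.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(feat_dim, num_classes))
        self.margin = margin

    def forward(self, feats: torch.Tensor, cols: torch.Tensor,
                y: torch.Tensor) -> torch.Tensor:
        # feats: (B, D); cols: (B,) column indices; y: (B,) 1 = same class
        centers = self.weight[:, cols].t()                 # (B, D)
        d = (feats - centers).norm(dim=1)                  # Euclidean distance
        pos = y * d.pow(2)
        neg = (1 - y) * torch.clamp(self.margin - d, min=0).pow(2)
        return (pos + neg).mean() / 2
```

Only one sample image per step and the learned class centers are needed, which is the O(N) behaviour described above.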
Preferably, in another embodiment of the present application, the step of "S430, performing classification metric learning on the mapping feature map through a classification triplet loss function" specifically includes the following steps:
performing classification metric learning on the mapping feature map through the following classification triplet loss function:
$$L_{ct}=\sum_{i=1}^{N}\max\left(\left\|f(x_i^{a})-W_i^{p}\right\|_2^2-\left\|f(x_i^{a})-W_i^{n}\right\|_2^2+margin,\;0\right)\qquad\text{formula (3)}$$

In formula (3), x_i^a is the anchor of the mapping feature map, f(x_i^a) is the anchor feature vector, W_i^p is the column vector of the weight matrix similar to the anchor feature vector (standing in for a mapping feature map of the same class as the anchor), W_i^n is the column vector of the weight matrix dissimilar to the anchor feature vector (standing in for a mapping feature map of a different class), and margin is a preset threshold.
Specifically, in this embodiment, the formula of the conventional triplet loss function is as follows:

$$L=\sum_{i=1}^{N}\max\left(\left\|f(x_i^{a})-f(x_i^{p})\right\|_2^2-\left\|f(x_i^{a})-f(x_i^{n})\right\|_2^2+margin,\;0\right)\qquad\text{formula (8)}$$

In formula (8), f(x_i^a) represents the feature vector of the anchor sample image, f(x_i^p) represents a feature vector similar to the anchor sample image, and f(x_i^n) represents a feature vector dissimilar to it. The triplet loss function needs to reduce the spacing of similar vectors in the multidimensional space as much as possible while increasing the spacing of dissimilar vectors above the margin value.
In the conventional scheme, a sample called the Anchor (denoted x^a) is randomly selected from the mapping feature map data set; then a sample of the same class as the Anchor, called Positive (denoted x^p), and a sample of a different class, called Negative (denoted x^n), are randomly selected, forming an (Anchor, Positive, Negative) triplet, and the Euclidean distances between Positive and Anchor and between Negative and Anchor are measured. In the classification triplet loss function, W^p, the column vector of the weight matrix similar to the anchor feature vector, is the similar simulation class, and W^n, the column vector dissimilar to the anchor feature vector, is the dissimilar simulation class. The classification triplet loss function therefore only needs the two simulation classes W^p and W^n to be preset; the distance from each of them to the anchor of the mapping feature map is then measured, i.e. the distance measurement is carried out against the simulation classes, reducing the distance between similar tensors and enlarging the distance between dissimilar tensors as much as possible. In this way no distance measurement is required for every pair of similar and dissimilar sample images, the problems of difficult sampling and difficult optimization in metric learning are effectively avoided, and metric learning over individual samples is converted into metric learning against simulation classes, giving a lower time complexity.
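A minimal PyTorch sketch of the classification triplet loss of formula (3) follows; as before, the class name, the column-index calling convention and the squared-distance form are assumptions of this sketch.

```python
import torch
import torch.nn as nn


class ClassificationTripletLoss(nn.Module):
    """Triplet loss against simulation classes (formula (3)).

    Positive and Negative are replaced by the similar / dissimilar
    class-center columns W^p, W^n of the weight matrix, so no
    (Anchor, Positive, Negative) sampling or mining is required.
    """

    def __init__(self, feat_dim: int, num_classes: int, margin: float = 0.2):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(feat_dim, num_classes))
        self.margin = margin

    def forward(self, anchors: torch.Tensor, pos_cols: torch.Tensor,
                neg_cols: torch.Tensor) -> torch.Tensor:
        # anchors: (B, D); pos_cols/neg_cols: (B,) class-column indices
        w_p = self.weight[:, pos_cols].t()   # similar simulation class
        w_n = self.weight[:, neg_cols].t()   # dissimilar simulation class
        d_pos = (anchors - w_p).pow(2).sum(dim=1)
        d_neg = (anchors - w_n).pow(2).sum(dim=1)
        return torch.clamp(d_pos - d_neg + self.margin, min=0).mean()
```

Compared with the conventional triplet loss, each anchor is compared only against the two preset simulation classes, so no triplet mining is needed.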
Preferably, in another embodiment of the present application, before the step of "S100, acquiring the initial face image that has been detected and aligned", the method specifically includes the following steps:
S010, controlling a camera to photograph a portrait target object, obtaining a portrait video stream, and detecting a face image to be adjusted in the portrait video stream;
S011, adjusting the face image to be adjusted to be aligned with the camera to obtain the initial face image.
Specifically, in this embodiment, when a portrait moves within the range captured by the camera, the camera is controlled to automatically track the portrait and photograph the portrait target object, so as to obtain a portrait video stream, and the face image to be adjusted is detected in the portrait video stream. The specific method is as follows: a large number of face and non-face sample images are used to train a classifier that solves the 2-class classification problem, also called a face detection template. The classifier accepts an input picture of a fixed size and determines whether it is a face.
If the face in the detected face image is offset to the left or right, the face alignment module locates key landmarks such as the eyes, nose and mouth corners on the face, and corrects the face in the image to an angle facing the camera through an affine transformation matrix.
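As an illustration, the following OpenCV sketch rotates a detected face so that the eye line is horizontal; landmark detection is assumed to have been done upstream, and the function name and the two-landmark simplification are assumptions of this sketch (the patent's module uses eyes, nose and mouth corners).

```python
import cv2
import numpy as np


def align_face(image: np.ndarray, left_eye, right_eye) -> np.ndarray:
    """Rotate a face image so the eye line is horizontal (camera-facing)."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    angle = np.degrees(np.arctan2(dy, dx))           # tilt of the eye line
    center = ((left_eye[0] + right_eye[0]) / 2.0,
              (left_eye[1] + right_eye[1]) / 2.0)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)  # affine transform matrix
    h, w = image.shape[:2]
    return cv2.warpAffine(image, M, (w, h))
```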
Referring to fig. 3, the present invention further provides a feature extraction classification metric learning system 100 for face recognition, including:
an initial face image acquisition module 110, configured to acquire an initial face image that has been detected to be aligned;
the to-be-mapped face feature map acquisition module 120 is communicatively connected with the initial face image acquisition module 110 and is configured to perform initial feature extraction processing on the initial face image to obtain a face feature map to be mapped;
the mapping feature map acquisition module 130 is communicatively connected with the to-be-mapped face feature map acquisition module 120 and is configured to map the face feature map through the ThetaMEX function to obtain a mapping feature map;
the classification metric learning module 140 is communicatively connected with the mapping feature map acquisition module 130 and is configured to perform classification metric learning on the mapping feature map through a classification loss function.
Preferably, in another embodiment of the present application, the system further includes a detection and alignment module 150, communicatively connected with the initial face image acquisition module 110, configured to control a camera to photograph a portrait target object, obtain a portrait video stream, and detect a face image to be adjusted in the portrait video stream; and to adjust the face image to be adjusted to be aligned with the camera to obtain the initial face image.
The feature extraction classification metric learning system for face recognition first acquires an initial face image that has been detected and aligned, then performs initial feature extraction on it to obtain a face feature map to be mapped. The face feature map is mapped through the ThetaMEX function to obtain a mapping feature map; because the parameter θ of the ThetaMEX function can adaptively map the output to any value between the maximum and the minimum, the output of the ThetaMEX function, and hence the sparsity of the feature map, can be freely controlled, so the obtained mapping feature map improves image precision and reduces the false recognition rate. Classification metric learning is then performed on the mapping feature map through a classification loss function: a classification idea is introduced on the basis of metric learning, so the classification loss only needs to extract, in advance, a class center as a 'simulation class' for a group of samples and then compare each sample with the 'simulation class'. The time complexity of the function is therefore low; two samples no longer need to be drawn simultaneously at each step, and metric learning can proceed with only one sample image collected at a time.
Specifically, the functions of each module in this embodiment are described in detail in the corresponding method embodiment, so that a detailed description is omitted.
Based on the same inventive concept, the embodiments of the present application also provide a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements all or part of the method steps of the above method.
The present invention may be implemented by realizing all or part of the above-described method flow through a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium, and when executed by a processor it can implement the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer-readable medium can be increased or decreased as appropriate according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
Based on the same inventive concept, the embodiments of the present application further provide an electronic device, including a memory and a processor, where the memory stores a computer program running on the processor, and when the processor executes the computer program, the processor implements all or part of the method steps in the above method.
The processor may be a central processing unit (Central Processing Unit, CPU), other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), off-the-shelf programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like, the processor being a control center of the computer device, and the various interfaces and lines connecting the various parts of the overall computer device.
The memory may be used to store computer programs and/or modules, and the processor implements various functions of the computer device by running or executing the computer programs and/or modules stored in the memory, and invoking data stored in the memory. The memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function (e.g., a sound playing function, an image playing function, etc.); the storage data area may store data (e.g., audio data, video data, etc.) created according to the use of the handset. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, memory, plug-in hard disk, smart Media Card (SMC), secure Digital (SD) Card, flash Card (Flash Card), at least one disk storage device, flash memory device, or other volatile solid-state storage device.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, server, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), servers and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (9)

1. A feature extraction classification metric learning method for face recognition, characterized by comprising the following steps:
acquiring an initial face image that has been detected and aligned;
performing initial feature extraction processing on the initial face image to obtain a face feature map to be mapped;
mapping the face feature map through a ThetaMEX function to obtain a mapping feature map;
performing classification metric learning on the mapping feature map through a classification loss function;
the step of mapping the face feature map through a ThetaMEX function to obtain a mapping feature map specifically comprises the following steps:
obtaining the mapping feature map feature_k according to the following formula:

$$\mathrm{feature}_k=\frac{1}{\theta_k}\ln\left(\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}e^{\theta_k x_{ij}^{(k)}}\right)\qquad\text{formula (1)}$$

in formula (1), feature_k is the kth mapping feature map output by the ThetaMEX function, M is the width of the face feature map, N is the height of the face feature map, x_ij^(k) is the feature value in the ith row and jth column of the input kth face feature map, and θ_k is the hyper-parameter of the kth ThetaMEX operator;
sending each face feature map into a mutually independent ThetaMEX operator to obtain the mapping feature map;
each ThetaMEX function has an adjustable parameter θ, through which the output of the ThetaMEX function, and thereby the sparsity of the feature map, can be freely controlled.
2. The feature extraction classification metric learning method for face recognition according to claim 1, characterized in that the step of performing initial feature extraction processing on the initial face image to obtain a face feature map to be mapped specifically comprises the following steps:
sliding the convolution kernel from an initial position to an end position on the initial face image to obtain feature extraction matrix data;
carrying out data normalization processing on the feature extraction matrix data;
and carrying out nonlinear transformation on the feature extraction matrix data subjected to normalization processing through a nonlinear ReLU function to obtain a face feature map to be mapped.
3. The feature extraction classification metric learning method for face recognition according to claim 1, characterized in that the step of performing classification metric learning on the mapping feature map through a classification loss function comprises the following steps:
the classification loss function comprises a classification contrastive loss function and a classification triplet loss function;
performing classification metric learning on the mapping feature map through the classification contrastive loss function; or,
performing classification metric learning on the mapping feature map through the classification triplet loss function.
4. The feature extraction classification metric learning method for face recognition according to claim 3, characterized in that the step of performing classification metric learning on the mapping feature map through a classification contrastive loss function specifically comprises the following steps:
performing classification metric learning on the mapping feature map through the following classification contrastive loss function:

$$L_{cc}=\frac{1}{2N}\sum_{i=1}^{N}\left[y_i\left\|x_i-W_i\right\|_2^2+(1-y_i)\max\left(margin-\left\|x_i-W_i\right\|_2,\,0\right)^2\right]\qquad\text{formula (2)}$$

in formula (2), N is the number of mapping feature maps, x_i is the feature vector of the ith mapping feature map, W_i is the column vector of the ith column of the weight matrix, y_i indicates whether x_i belongs to the class represented by W_i, and margin is a preset threshold.
5. The feature extraction classification metric learning method for face recognition according to claim 3, characterized in that the step of performing classification metric learning on the mapping feature map through a classification triplet loss function specifically comprises the following steps:
performing classification metric learning on the mapping feature map through the following classification triplet loss function:

$$L_{ct}=\sum_{i=1}^{N}\max\left(\left\|f(x_i^{a})-W_i^{p}\right\|_2^2-\left\|f(x_i^{a})-W_i^{n}\right\|_2^2+margin,\;0\right)\qquad\text{formula (3)}$$

in formula (3), x_i^a is the anchor of the mapping feature map, f(x_i^a) is the anchor feature vector, W_i^p is the column vector of the weight matrix similar to the anchor feature vector (standing in for a mapping feature map of the same class as the anchor), W_i^n is the column vector of the weight matrix dissimilar to the anchor feature vector (standing in for a mapping feature map of a different class), and margin is a preset threshold.
6. The feature extraction classification metric learning method for face recognition according to claim 1, characterized in that before the step of acquiring the initial face image that has been detected and aligned, the method specifically comprises the following steps:
controlling a camera to photograph a portrait target object, obtaining a portrait video stream, and detecting a face image to be adjusted in the portrait video stream;
and adjusting the face image to be adjusted to be aligned with the camera to obtain the initial face image.
7. A feature extraction classification metric learning system for face recognition, comprising:
the initial face image acquisition module is used for acquiring the initial face image which is detected and aligned;
a to-be-mapped face feature map acquisition module, communicatively connected with the initial face image acquisition module, configured to perform initial feature extraction processing on the initial face image to obtain a face feature map to be mapped;
a mapping feature map acquisition module, communicatively connected with the to-be-mapped face feature map acquisition module, configured to map the face feature map through a ThetaMEX function to obtain a mapping feature map;
a classification metric learning module, communicatively connected with the mapping feature map acquisition module, configured to perform classification metric learning on the mapping feature map through a classification loss function;
the mapping feature map acquisition module is configured to obtain the mapping feature map feature_k according to the following formula:

$$\mathrm{feature}_k=\frac{1}{\theta_k}\ln\left(\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}e^{\theta_k x_{ij}^{(k)}}\right)\qquad\text{formula (1)}$$

in formula (1), feature_k is the kth mapping feature map output by the ThetaMEX function, M is the width of the face feature map, N is the height of the face feature map, x_ij^(k) is the feature value in the ith row and jth column of the input kth face feature map, and θ_k is the hyper-parameter of the kth ThetaMEX operator;
each face feature map is sent into a mutually independent ThetaMEX operator to obtain the mapping feature map;
each ThetaMEX function has an adjustable parameter θ, through which the output of the ThetaMEX function, and thereby the sparsity of the feature map, can be freely controlled.
8. The feature extraction classification metric learning system of claim 7, comprising:
a detection and alignment module, communicatively connected with the initial face image acquisition module, configured to control a camera to photograph a portrait target object, obtain a portrait video stream, and detect a face image to be adjusted in the portrait video stream; and to adjust the face image to be adjusted to be aligned with the camera to obtain the initial face image.
9. A storage medium having stored thereon a computer program which, when executed by a processor, implements the feature extraction classification metric learning method for face recognition as claimed in any one of claims 1 to 6.
CN202210804354.6A 2022-07-07 2022-07-07 Feature extraction classification metric learning method, system and storage medium for face recognition Active CN115240249B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210804354.6A CN115240249B (en) 2022-07-07 2022-07-07 Feature extraction classification metric learning method, system and storage medium for face recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210804354.6A CN115240249B (en) 2022-07-07 2022-07-07 Feature extraction classification metric learning method, system and storage medium for face recognition

Publications (2)

Publication Number Publication Date
CN115240249A CN115240249A (en) 2022-10-25
CN115240249B true CN115240249B (en) 2023-06-06

Family

ID=83672059

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210804354.6A Active CN115240249B (en) 2022-07-07 2022-07-07 Feature extraction classification metric learning method, system and storage medium for face recognition

Country Status (1)

Country Link
CN (1) CN115240249B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104408435A (en) * 2014-12-05 2015-03-11 浙江大学 Face identification method based on random pooling convolutional neural network
CN111160491A (en) * 2020-04-03 2020-05-15 北京精诊医疗科技有限公司 Pooling method and pooling model in convolutional neural network
CN113392955A (en) * 2021-05-11 2021-09-14 南方医科大学 CT reconstruction neural network structure and method based on downsampling imaging geometric modeling
CN113592018A (en) * 2021-08-10 2021-11-02 大连大学 Infrared light and visible light image fusion method based on residual dense network and gradient loss
US11270124B1 (en) * 2020-11-16 2022-03-08 Branded Entertainment Network, Inc. Temporal bottleneck attention architecture for video action recognition
CN114332796A (en) * 2021-12-24 2022-04-12 燕山大学 Multi-sensor fusion voxel characteristic map generation method and system
CN114372968A (en) * 2021-12-31 2022-04-19 江南大学 Defect detection method combining attention mechanism and adaptive memory fusion network

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5774576A (en) * 1995-07-17 1998-06-30 Nec Research Institute, Inc. Pattern recognition by unsupervised metric learning
US8873844B2 (en) * 2011-11-21 2014-10-28 Nec Laboratories America, Inc. Large-scale strongly supervised ensemble metric learning
CN107742107B (en) * 2017-10-20 2019-03-01 北京达佳互联信息技术有限公司 Facial image classification method, device and server
KR102535411B1 (en) * 2017-11-16 2023-05-23 삼성전자주식회사 Apparatus and method related to metric learning based data classification
US10706503B2 (en) * 2018-03-13 2020-07-07 Disney Enterprises, Inc. Image processing using a convolutional neural network
CN110097021B (en) * 2019-05-10 2022-09-06 电子科技大学 MTCNN-based face pose estimation method
CN113723455B (en) * 2021-07-28 2023-10-13 南京邮电大学 Strong gravitation lens system classification method and device based on metric learning
CN114067385A (en) * 2021-10-09 2022-02-18 华南理工大学 Cross-modal face retrieval Hash method based on metric learning
CN114565918A (en) * 2022-02-24 2022-05-31 阳光暖果(北京)科技发展有限公司 Face silence living body detection method and system based on multi-feature extraction module

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104408435A (en) * 2014-12-05 2015-03-11 浙江大学 Face identification method based on random pooling convolutional neural network
CN111160491A (en) * 2020-04-03 2020-05-15 北京精诊医疗科技有限公司 Pooling method and pooling model in convolutional neural network
US11270124B1 (en) * 2020-11-16 2022-03-08 Branded Entertainment Network, Inc. Temporal bottleneck attention architecture for video action recognition
CN113392955A (en) * 2021-05-11 2021-09-14 南方医科大学 CT reconstruction neural network structure and method based on downsampling imaging geometric modeling
CN113592018A (en) * 2021-08-10 2021-11-02 大连大学 Infrared light and visible light image fusion method based on residual dense network and gradient loss
CN114332796A (en) * 2021-12-24 2022-04-12 燕山大学 Multi-sensor fusion voxel characteristic map generation method and system
CN114372968A (en) * 2021-12-31 2022-04-19 江南大学 Defect detection method combining attention mechanism and adaptive memory fusion network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Image Modeling Method Based on Sparse Optimization Learning; Zhang Sibo; China Doctoral Dissertations Full-text Database (Information Science and Technology), No. 1; pp. I138-146 *
Sparse Subspace Clustering Model Incorporating Unsupervised Metric Learning; Jiang Yuyan et al.; Journal of Chinese Computer Systems; pp. 1-10 *

Also Published As

Publication number Publication date
CN115240249A (en) 2022-10-25

Similar Documents

Publication Publication Date Title
Hsu et al. Ratio-and-scale-aware YOLO for pedestrian detection
CN112528878B (en) Method and device for detecting lane line, terminal equipment and readable storage medium
US10534957B2 (en) Eyeball movement analysis method and device, and storage medium
CN111160269A (en) Face key point detection method and device
CN108764041B (en) Face recognition method for lower shielding face image
CN109344727B (en) Identity card text information detection method and device, readable storage medium and terminal
CN109960742B (en) Local information searching method and device
US11455831B2 (en) Method and apparatus for face classification
US20170061253A1 (en) Method and device for determining the shape of an object represented in an image, corresponding computer program product and computer-readable medium
CN111695462A (en) Face recognition method, face recognition device, storage medium and server
CN110838119A (en) Human face image quality evaluation method, computer device and computer readable storage medium
CN113869282B (en) Face recognition method, hyper-resolution model training method and related equipment
CN112836653A (en) Face privacy method, device and apparatus and computer storage medium
CN111709305B (en) Face age identification method based on local image block
CN115578590A (en) Image identification method and device based on convolutional neural network model and terminal equipment
CN110826534A (en) Face key point detection method and system based on local principal component analysis
JP7121132B2 (en) Image processing method, apparatus and electronic equipment
CN115240249B (en) Feature extraction classification metric learning method, system and storage medium for face recognition
CN113012030A (en) Image splicing method, device and equipment
CN116129496A (en) Image shielding method and device, computer equipment and storage medium
CN110210425B (en) Face recognition method and device, electronic equipment and storage medium
CN112084874B (en) Object detection method and device and terminal equipment
CN111860272A (en) Image processing method, chip and electronic device
CN113128427A (en) Face recognition method and device, computer readable storage medium and terminal equipment
CN112348008A (en) Certificate information identification method and device, terminal equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant