CN115240249B - Feature extraction classification metric learning method, system and storage medium for face recognition - Google Patents

Feature extraction classification metric learning method, system and storage medium for face recognition

Info

Publication number
CN115240249B
CN115240249B (application CN202210804354.6A)
Authority
CN
China
Prior art keywords
feature map
face
feature
mapping
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210804354.6A
Other languages
Chinese (zh)
Other versions
CN115240249A (en)
Inventor
陈侃松
姚福娟
许立君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubei University
Original Assignee
Hubei University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubei University filed Critical Hubei University
Priority to CN202210804354.6A priority Critical patent/CN115240249B/en
Publication of CN115240249A publication Critical patent/CN115240249A/en
Application granted granted Critical
Publication of CN115240249B publication Critical patent/CN115240249B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 - Feature extraction; Face representation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 - Classification, e.g. identification
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a feature extraction and classification metric learning method, system and storage medium for face recognition, wherein the method comprises the following steps: acquiring an initial face image that has been detected and aligned; performing initial feature extraction on the initial face image to obtain a face feature map to be mapped; mapping the face feature map through a ThetaMEX function to obtain a mapping feature map; and performing classification metric learning on the mapping feature map through a classification loss function. The method improves image precision, reduces the false recognition rate, and keeps the time complexity of classification metric learning low.

Description

Feature extraction classification metric learning method, system and storage medium for face recognition
Technical Field
The invention relates to the field of face recognition, in particular to a method, a system and a storage medium for learning feature extraction classification metrics of face recognition.
Background
As one of the most important identity recognition technologies, face recognition is non-contact, real-time and secure, and is now widely applied in fields such as intelligent security and mobile payment. Feature extraction from face images is a key step of face recognition; with the rapid development of deep learning and computer vision, the self-learned features of convolutional neural networks achieve higher accuracy and better generalization than hand-crafted features.
Conventional face recognition neural networks have the following defects:
Widely used face image feature extraction operations can extract face features well, but the resulting feature maps have a certain sparsity; the common global average, max and stochastic pooling operations further increase this sparsity, and an overly sparse feature map loses image features and causes a high false recognition rate.
The loss function plays a crucial role in a deep neural network. The two commonly used metric learning loss functions, the contrastive loss function and the triplet loss function, are based on metric (similarity) learning and suffer from complex sampling, which greatly increases the difficulty of training the neural network.
Disclosure of Invention
The invention provides a feature extraction and classification metric learning method, system and storage medium for face recognition, which improve image precision and reduce the false recognition rate during feature extraction while keeping the time complexity of classification metric learning low.
In a first aspect, a method for learning feature extraction classification metrics of face recognition is provided, including the following steps:
acquiring an initial face image which is detected to be aligned;
performing initial feature extraction processing on the initial face image to obtain a face feature image to be mapped;
mapping the face feature map through a ThetaMEX function to obtain a mapping feature map;
and performing classification metric learning on the mapping feature map through a classification loss function.
According to the first aspect, in a first possible implementation manner of the first aspect, the step of performing an initial feature extraction process on the initial face image to obtain a face feature map to be mapped specifically includes the following steps:
sliding the convolution kernel from an initial position to an end position on the initial face image to obtain feature extraction matrix data;
carrying out data normalization processing on the feature extraction matrix data;
and carrying out nonlinear transformation on the feature extraction matrix data subjected to normalization processing through a nonlinear ReLU function to obtain a face feature map to be mapped.
In a second possible implementation manner of the first aspect, the step of mapping the face feature map through a ThetaMEX function to obtain a mapping feature map specifically includes the following steps:
obtaining the mapping feature map feature_k according to the following formula:

$$\mathrm{feature}_k=\frac{1}{\theta_k}\ln\left(\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}e^{\theta_k x_{ij}^{(k)}}\right)\qquad\text{formula (1)}$$

In formula (1), feature_k is the kth mapping feature map output by the ThetaMEX function, M is the width of the face feature map, N is the height of the face feature map, x_ij^(k) is the feature value in the ith row and jth column of the input kth face feature map, and θ_k is the hyper-parameter of the kth ThetaMEX operator.
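Although not spelled out in the specification, the limiting behaviour of this log-mean-exp operator is standard and is what underlies the sparsity-control property claimed below:

$$\lim_{\theta_k\to+\infty}\mathrm{feature}_k=\max_{i,j}x_{ij}^{(k)},\qquad\lim_{\theta_k\to 0}\mathrm{feature}_k=\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}x_{ij}^{(k)},\qquad\lim_{\theta_k\to-\infty}\mathrm{feature}_k=\min_{i,j}x_{ij}^{(k)}$$

A single adjustable θ_k per feature map therefore lets the pooling behave anywhere between max pooling, average pooling and min pooling.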
According to the second possible implementation manner of the first aspect, in a third possible implementation manner of the first aspect, the step of performing classification metric learning on the mapping feature map through a classification loss function specifically includes the following steps:
the classification loss function comprises a classification contrastive loss function and a classification triplet loss function;
performing classification metric learning on the mapping feature map through the classification contrastive loss function; or,
performing classification metric learning on the mapping feature map through the classification triplet loss function.
In a fourth possible implementation manner of the first aspect, the step of performing classification metric learning on the mapping feature map through a classification contrastive loss function specifically includes the following steps:
performing classification metric learning on the mapping feature map through the following classification contrastive loss function:

$$L_{cc}=\frac{1}{2N}\sum_{i=1}^{N}\left[y_i\left\|x_i-W_i\right\|_2^2+(1-y_i)\max\left(margin-\left\|x_i-W_i\right\|_2,\,0\right)^2\right]\qquad\text{formula (2)}$$

In formula (2), N is the number of mapping feature maps, x_i is the feature vector of the ith mapping feature map, W_i is the column vector of the ith column of the weight matrix, y_i indicates whether x_i belongs to the class represented by W_i (y_i = 1 for a similar sample, y_i = 0 otherwise), and margin is a preset threshold.
According to the fourth possible implementation manner of the first aspect, in a fifth possible implementation manner of the first aspect, the step of performing classification metric learning on the mapping feature map through a classification triplet loss function specifically includes the following steps:
performing classification metric learning on the mapping feature map through the following classification triplet loss function:

$$L_{ct}=\sum_{i=1}^{N}\max\left(\left\|f(x_i^{a})-W_i^{p}\right\|_2^2-\left\|f(x_i^{a})-W_i^{n}\right\|_2^2+margin,\;0\right)\qquad\text{formula (3)}$$

In formula (3), x_i^a is the anchor of the mapping feature map, f(x_i^a) is the anchor feature vector, W_i^p is the column vector of the weight matrix similar to the anchor feature vector (standing in for a mapping feature map of the same class as the anchor), W_i^n is the column vector of the weight matrix dissimilar to the anchor feature vector (standing in for a mapping feature map of a different class), and margin is a preset threshold.
In a sixth possible implementation manner of the first aspect, before the step of acquiring the initial face image that has been detected and aligned, the method specifically includes the following steps:
controlling a camera to shoot a portrait target object, obtaining a portrait video stream, and detecting a face image to be adjusted in the portrait video stream;
and adjusting the face image to be adjusted to be aligned with the camera to obtain the initial face image.
In a second aspect, a feature extraction classification metric learning system for face recognition is provided, including:
the initial face image acquisition module is used for acquiring the initial face image which is detected and aligned;
a to-be-mapped face feature map acquisition module, communicatively connected with the initial face image acquisition module, configured to perform initial feature extraction processing on the initial face image to obtain a face feature map to be mapped;
a mapping feature map acquisition module, communicatively connected with the to-be-mapped face feature map acquisition module, configured to map the face feature map through a ThetaMEX function to obtain a mapping feature map;
and a classification metric learning module, communicatively connected with the mapping feature map acquisition module, configured to perform classification metric learning on the mapping feature map through a classification loss function.
According to the second aspect, in a first possible implementation manner of the second aspect, the system includes: a detection and alignment module, communicatively connected with the initial face image acquisition module, configured to control a camera to photograph a portrait target object, obtain a portrait video stream, and detect a face image to be adjusted in the portrait video stream; and to adjust the face image to be adjusted to be aligned with the camera to obtain the initial face image.
In a third aspect, a storage medium is provided, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements a feature extraction classification metric learning method for face recognition as described above.
Compared with the prior art, the invention has the following advantages. First, an initial face image that has been detected and aligned is acquired; initial feature extraction is then performed on it to obtain a face feature map to be mapped. The face feature map is mapped through the ThetaMEX function to obtain a mapping feature map; because the parameter θ of the ThetaMEX function can adaptively map the output to any value between the maximum and the minimum, the output of the ThetaMEX function, and hence the sparsity of the feature map, can be freely controlled, so the obtained mapping feature map improves image precision and reduces the false recognition rate. Classification metric learning is then performed on the mapping feature map through a classification loss function: a classification idea is introduced on the basis of metric learning, so the classification loss only needs to extract, in advance, a class center as a 'simulation class' for a group of samples and then compare each sample with the 'simulation class'. The time complexity of the function is therefore low; two samples no longer need to be drawn simultaneously at each step, and metric learning can proceed with only one sample image collected at a time.
Drawings
FIG. 1 is a flow chart of an embodiment of a feature extraction classification metric learning method for face recognition according to the present invention;
FIG. 2 is a schematic diagram of a ThetaMEX function mapping flow of the present invention;
FIG. 3 is a schematic structural diagram of the feature extraction classification metric learning system for face recognition according to the present invention;
reference numerals:
100. feature extraction classification metric learning system for face recognition; 110. initial face image acquisition module; 120. to-be-mapped face feature map acquisition module; 130. mapping feature map acquisition module; 140. classification metric learning module; 150. detection and alignment module.
Detailed Description
Reference will now be made in detail to the present embodiments of the invention, examples of which are illustrated in the accompanying drawings. While the invention will be described in conjunction with the specific embodiments, it will be understood that they are not intended to limit the invention to the described embodiments. On the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims. It should be noted that the method steps described herein may be implemented by any functional block or arrangement of functions, and any functional block or arrangement of functions may be implemented as a physical entity or a logical entity, or a combination of both.
The present invention will be described in further detail below with reference to the drawings and detailed description for the purpose of enabling those skilled in the art to understand the invention better.
Note that: the examples to be described below are only one specific example, and not as limiting the embodiments of the present invention necessarily to the following specific steps, values, conditions, data, sequences, etc. Those skilled in the art can, upon reading the present specification, make and use the concepts of the invention to construct further embodiments not mentioned in the specification.
Referring to fig. 1, an embodiment of the present invention provides a feature extraction classification metric learning method for face recognition, including the following steps:
s100, acquiring an initial face image which is detected to be aligned;
s200, carrying out initial feature extraction processing on the initial face image to obtain a face feature image to be mapped;
s300, mapping the face feature map through a theta MEX function to obtain a mapping feature map;
s400, carrying out classification metric learning on the mapping feature map through a classification loss function.
Specifically, in this embodiment, an initial face image that has been detected and aligned is first acquired; initial feature extraction processing is then performed on the initial face image to obtain a face feature map to be mapped.
because the existing face image feature extraction operation can better extract face features, but the extracted face feature images have certain sparsity, and the common global averaging, maximum and random pooling operations can aggravate the sparsity of the feature images, and the too sparse feature images can lose image features to cause high false recognition rate, so that the theta MEX function is embedded into the face verification neural network models MobileFaceneT and Light-CNN for the face feature extraction operation, the pooling layers of the two models are replaced, and the specific steps are that Global depthwise Conv in the MobileFaceneT is replaced by ThetaMEX Pool, and other network structures are not changed; similarly, the global average pooling layer in the Light-CNN is replaced by a theta MEX Pool, so that the face feature map is mapped through a theta MEX function to obtain a mapping feature map, the value of the parameter theta of the theta MEX function can be adaptively mapped to any value between a maximum value and a minimum value, the output of the theta MEX function can be freely controlled, the sparsity of the feature map is controlled, and the obtained mapping feature map can improve the image precision and reduce the false recognition rate.
Because the existing metric learning (similarity learning) loss functions compare samples with each other, their time complexity is high and sampling is difficult. After the mapping feature maps are acquired, classification metric learning is therefore performed on them through a classification loss function: a classification idea is introduced on the basis of metric learning, so the classification loss only needs to extract, in advance, a class center as a 'simulation class' for a group of samples and then compare each sample with the 'simulation class'. The time complexity of the function is thus low; two samples no longer need to be drawn simultaneously at each step, and metric learning can proceed with only one sample image collected at a time.
Preferably, in another embodiment of the present application, the step of "S200, performing an initial feature extraction process on the initial face image, and obtaining a face feature map to be mapped" specifically includes the following steps:
s210, sliding a convolution kernel from an initial position to an end position on the initial face image to obtain feature extraction matrix data;
s220, carrying out data normalization processing on the feature extraction matrix data;
s230, carrying out nonlinear transformation on the feature extraction matrix data subjected to normalization processing through a nonlinear ReLU function, and obtaining a face feature map to be mapped.
Specifically, in this embodiment, S210 performs convolution on the initial face image. The image convolution is similar to convolution in signal processing: a small square convolution kernel slides over the (two-dimensional) initial face image, the pixel values covered by the kernel are multiplied by the corresponding kernel values and summed, and this repeats until the kernel reaches the end position of the image, yielding the feature extraction matrix data and reducing the number of parameters fed into the ThetaMEX pooling layer.
S220 normalizes the feature extraction matrix data. The mean of the feature extraction matrix data is calculated by formula (4) and its standard deviation by formula (5):

$$\mu=\frac{1}{n}\sum_{i=1}^{n}x_i\qquad\text{formula (4)}$$

$$\sigma=\sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2+c}\qquad\text{formula (5)}$$

In formulas (4) and (5), n is the batch size of the feature extraction matrix data, x_i is the ith feature map in the feature extraction matrix data set, and c is a nonzero constant that keeps the denominator of formula (6) away from zero.

Data normalization is then performed through formula (6):

$$\hat{x}_i=\gamma\,\frac{x_i-\mu}{\sigma}+\beta\qquad\text{formula (6)}$$

In formula (6), γ is a scaling variable and β is a translation variable.
Therefore, by normalizing the data to a fixed interval, the original learned characteristics of the data are preserved, training is accelerated, and a better data distribution is provided for the subsequent activation function.
S230: since the convolution and the normalization are both linear transformations, a neural network built only from them has poor expressive power, equivalent to the most primitive perceptron; no matter how many layers are stacked, the whole network still computes a linear transformation. Therefore, the normalized feature extraction matrix data undergoes a nonlinear transformation through the nonlinear ReLU function, which serves as the activation function of the network, producing the face feature map to be mapped. The ReLU function not only provides a nonlinear transformation of the data but also enhances the fault tolerance of the network and improves its generalization ability.
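As an illustration of steps S220-S230, the following is a minimal PyTorch sketch of the normalization of formulas (4)-(6) followed by the ReLU activation; the function name and the per-batch tensor layout are assumptions of this sketch, not the patent's reference code.

```python
import torch


def normalize_and_activate(x: torch.Tensor, gamma: torch.Tensor,
                           beta: torch.Tensor, c: float = 1e-5) -> torch.Tensor:
    """Batch normalization per formulas (4)-(6), then ReLU (S230).

    x stacks n feature maps along dim 0; gamma (scaling) and beta
    (translation) are the learnable variables of formula (6); c is the
    nonzero constant that keeps the denominator away from zero.
    """
    mu = x.mean(dim=0)                                    # formula (4)
    sigma = torch.sqrt(x.var(dim=0, unbiased=False) + c)  # formula (5)
    x_hat = gamma * (x - mu) / sigma + beta               # formula (6)
    return torch.relu(x_hat)                              # S230: nonlinearity
```

Standard layers such as nn.BatchNorm2d followed by nn.ReLU provide an equivalent, battle-tested implementation; the explicit version above only mirrors formulas (4)-(6).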
Preferably, in another embodiment of the present application, the step of "S300, mapping the face feature map through a ThetaMEX function to obtain a mapping feature map" specifically includes the following steps:
obtaining the mapping feature map feature_k according to the following formula:

$$\mathrm{feature}_k=\frac{1}{\theta_k}\ln\left(\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}e^{\theta_k x_{ij}^{(k)}}\right)\qquad\text{formula (1)}$$

In formula (1), feature_k is the kth mapping feature map output by the ThetaMEX function, M is the width of the face feature map, N is the height of the face feature map, x_ij^(k) is the feature value in the ith row and jth column of the input kth face feature map, and θ_k is the hyper-parameter of the kth ThetaMEX operator.
Specifically, in this embodiment, referring to FIG. 2, in the pooling layer designed around the ThetaMEX function, each face feature map is sent to its own independent ThetaMEX operator and mapped by formula (1) to obtain a mapping feature map. The ThetaMEX pooling layer uses maximum sampling with a sampling size of 2×2; that is, the input face feature map is divided into non-overlapping 2×2 rectangles and a value is pooled over each rectangle, so the length and width of the output feature map are half those of the input. After replacing the pooling layer of the original network, the ThetaMEX pooling layer improves the recall rate by about 1%-2% at a false recognition rate below 0.1%, and improves the accuracy by about 2%-6% at a false recognition rate of at most 1%. The output of the ThetaMEX function can therefore be freely controlled, controlling the sparsity of the feature map, and the obtained mapping feature map improves image precision and reduces the false recognition rate.
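A minimal sketch of a global ThetaMEX pooling layer implementing formula (1) in PyTorch follows; the class name, the per-channel learnable θ and the (B, C, M, N) tensor layout are assumptions of this sketch rather than the patent's implementation.

```python
import math

import torch
import torch.nn as nn


class ThetaMEXPool(nn.Module):
    """Global ThetaMEX pooling per channel, following formula (1).

    Each of the C channels owns a learnable theta_k. The log-mean-exp
    interpolates between max pooling (theta -> +inf), average pooling
    (theta -> 0) and min pooling (theta -> -inf), so theta controls the
    sparsity of the pooled output. theta is assumed to stay away from
    zero here; the theta -> 0 limit would need a separate branch.
    """

    def __init__(self, channels: int, init_theta: float = 1.0):
        super().__init__()
        self.theta = nn.Parameter(torch.full((channels,), init_theta))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, M, N) -> (B, C), one pooled value per feature map
        b, c, m, n = x.shape
        flat = x.reshape(b, c, m * n)
        theta = self.theta.view(1, c, 1)
        # (1/theta_k) * log( (1/(M*N)) * sum_ij exp(theta_k * x_ij) )
        lse = torch.logsumexp(theta * flat, dim=2)        # (B, C)
        return (lse - math.log(m * n)) / self.theta


# usage: a drop-in replacement for a global average pooling layer
pool = ThetaMEXPool(channels=512)
embedding = pool(torch.randn(8, 512, 7, 7))               # -> (8, 512)
```

As θ_k grows the layer approaches global max pooling, and as θ_k shrinks toward zero it approaches global average pooling, which is the sparsity control described above.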
Preferably, in another embodiment of the present application, the step of "S400, performing classification metric learning on the mapping feature map through a classification loss function" specifically includes the following steps:
s410, classifying the loss function comprises classifying the contrast loss function and classifying the triplet loss function;
s420, carrying out classification measurement learning on the mapping feature map through a classification comparison loss function; or alternatively, the process may be performed,
s430, performing classification metric learning on the mapping feature map through a classification triplet loss function.
Specifically, in this embodiment, the time complexities of the contrastive loss function and the triplet loss function, the two commonly used metric learning loss functions, reach O(N²) and O(N³) respectively, while the time complexity of the classification loss function is O(N); introducing the classification idea into the contrastive loss function and the triplet loss function therefore reduces the time complexity of metric learning.
Preferably, in another embodiment of the present application, the step of "S420, performing classification metric learning on the mapping feature map through a classification contrastive loss function" specifically includes the following steps:
performing classification metric learning on the mapping feature map through the following classification contrastive loss function:
$$L_{cc}=\frac{1}{2N}\sum_{i=1}^{N}\left[y_i\left\|x_i-W_i\right\|_2^2+(1-y_i)\max\left(margin-\left\|x_i-W_i\right\|_2,\,0\right)^2\right]\qquad\text{formula (2)}$$

In formula (2), N is the number of mapping feature maps, x_i is the feature vector of the ith mapping feature map, W_i is the column vector of the ith column of the weight matrix, y_i indicates whether x_i belongs to the class represented by W_i, and margin is a preset threshold. Specifically, in this embodiment, the formula of the conventional contrastive loss function is as follows:

$$L=\frac{1}{2N}\sum_{(i,j)}\left[y\left\|X_i-X_j\right\|_2^2+(1-y)\max\left(margin-\left\|X_i-X_j\right\|_2,\,0\right)^2\right]\qquad\text{formula (7)}$$

In formula (7), N represents the number of sample images, X_i and X_j represent the feature vectors of two sample images, y = 1 indicates that the two sample images are similar and y = 0 that they are dissimilar, and margin represents a threshold, meaning that when two samples are dissimilar, the distance between their feature vectors in the high-dimensional space must be optimized to be at least the margin value.
In the classification contrastive loss function of formula (2), by contrast, W_i, the column vector of the ith column of the weight matrix, is the 'simulation class' of the mapping feature map. Whereas formula (7) requires two sample images to be acquired for every metric learning step, the classification contrastive loss function only needs the simulation class W_i to be set in advance; the mapping feature maps are then selected one by one and compared with the simulation class.
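A minimal PyTorch sketch of the classification contrastive loss of formula (2) follows; the class name, the calling convention (per-sample target columns and similarity labels) and the weight initialization are assumptions of this sketch.

```python
import torch
import torch.nn as nn


class ClassificationContrastiveLoss(nn.Module):
    """Contrastive loss against learned class centers (formula (2)).

    Each feature vector x_i is compared with a column W_i of the weight
    matrix (the 'simulation class') instead of a second sample image,
    so a single sample per step suffices and the complexity is O(N).
    """

    def __init__(self, feat_dim: int, num_classes: int, margin: float = 1.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(feat_dim, num_classes))
        self.margin = margin

    def forward(self, feats: torch.Tensor, cols: torch.Tensor,
                y: torch.Tensor) -> torch.Tensor:
        # feats: (B, D); cols: (B,) column indices; y: (B,) 1 = same class
        centers = self.weight[:, cols].t()                 # (B, D)
        d = (feats - centers).norm(dim=1)                  # Euclidean distance
        pos = y * d.pow(2)
        neg = (1 - y) * torch.clamp(self.margin - d, min=0).pow(2)
        return (pos + neg).mean() / 2
```

Only one sample image per step and the learned class centers are needed, which is the O(N) behaviour described above.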
Preferably, in another embodiment of the present application, the step of "S430, performing classification metric learning on the mapping feature map through a classification triplet loss function" specifically includes the following steps:
performing classification metric learning on the mapping feature map through the following classification triplet loss function:
$$L_{ct}=\sum_{i=1}^{N}\max\left(\left\|f(x_i^{a})-W_i^{p}\right\|_2^2-\left\|f(x_i^{a})-W_i^{n}\right\|_2^2+margin,\;0\right)\qquad\text{formula (3)}$$

In formula (3), x_i^a is the anchor of the mapping feature map, f(x_i^a) is the anchor feature vector, W_i^p is the column vector of the weight matrix similar to the anchor feature vector (standing in for a mapping feature map of the same class as the anchor), W_i^n is the column vector of the weight matrix dissimilar to the anchor feature vector (standing in for a mapping feature map of a different class), and margin is a preset threshold.
Specifically, in this embodiment, the formula of the conventional triplet loss function is as follows:

$$L=\sum_{i=1}^{N}\max\left(\left\|f(x_i^{a})-f(x_i^{p})\right\|_2^2-\left\|f(x_i^{a})-f(x_i^{n})\right\|_2^2+margin,\;0\right)\qquad\text{formula (8)}$$

In formula (8), f(x_i^a) represents the feature vector of the anchor sample image, f(x_i^p) represents a feature vector similar to the anchor sample image, and f(x_i^n) represents a feature vector dissimilar to it. The triplet loss function needs to reduce the spacing of similar vectors in the multidimensional space as much as possible while increasing the spacing of dissimilar vectors above the margin value.
In the conventional scheme, a sample called the Anchor (denoted x^a) is randomly selected from the mapping feature map data set; then a sample of the same class as the Anchor, called Positive (denoted x^p), and a sample of a different class, called Negative (denoted x^n), are randomly selected, forming an (Anchor, Positive, Negative) triplet, and the Euclidean distances between Positive and Anchor and between Negative and Anchor are measured. In the classification triplet loss function, W^p, the column vector of the weight matrix similar to the anchor feature vector, is the similar simulation class, and W^n, the column vector dissimilar to the anchor feature vector, is the dissimilar simulation class. The classification triplet loss function therefore only needs the two simulation classes W^p and W^n to be preset; the distance from each of them to the anchor of the mapping feature map is then measured, i.e. the distance measurement is carried out against the simulation classes, reducing the distance between similar tensors and enlarging the distance between dissimilar tensors as much as possible. In this way no distance measurement is required for every pair of similar and dissimilar sample images, the problems of difficult sampling and difficult optimization in metric learning are effectively avoided, and metric learning over individual samples is converted into metric learning against simulation classes, giving a lower time complexity.
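A minimal PyTorch sketch of the classification triplet loss of formula (3) follows; as before, the class name, the column-index calling convention and the squared-distance form are assumptions of this sketch.

```python
import torch
import torch.nn as nn


class ClassificationTripletLoss(nn.Module):
    """Triplet loss against simulation classes (formula (3)).

    Positive and Negative are replaced by the similar / dissimilar
    class-center columns W^p, W^n of the weight matrix, so no
    (Anchor, Positive, Negative) sampling or mining is required.
    """

    def __init__(self, feat_dim: int, num_classes: int, margin: float = 0.2):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(feat_dim, num_classes))
        self.margin = margin

    def forward(self, anchors: torch.Tensor, pos_cols: torch.Tensor,
                neg_cols: torch.Tensor) -> torch.Tensor:
        # anchors: (B, D); pos_cols/neg_cols: (B,) class-column indices
        w_p = self.weight[:, pos_cols].t()   # similar simulation class
        w_n = self.weight[:, neg_cols].t()   # dissimilar simulation class
        d_pos = (anchors - w_p).pow(2).sum(dim=1)
        d_neg = (anchors - w_n).pow(2).sum(dim=1)
        return torch.clamp(d_pos - d_neg + self.margin, min=0).mean()
```

Compared with the conventional triplet loss, each anchor is compared only against the two preset simulation classes, so no triplet mining is needed.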
Preferably, in another embodiment of the present application, before the step of "S100, acquiring the initial face image that has been detected and aligned", the method specifically includes the following steps:
S010, controlling a camera to photograph a portrait target object, obtaining a portrait video stream, and detecting a face image to be adjusted in the portrait video stream;
S011, adjusting the face image to be adjusted to be aligned with the camera to obtain the initial face image.
Specifically, in this embodiment, when a portrait moves within the range captured by the camera, the camera is controlled to automatically track the portrait and photograph the portrait target object, so as to obtain a portrait video stream, and the face image to be adjusted is detected in the portrait video stream. The specific method is as follows: a large number of face and non-face sample images are used to train a classifier that solves the 2-class classification problem, also called a face detection template. The classifier accepts an input picture of a fixed size and determines whether it is a face.
If the face in the detected face image is offset to the left or right, the face alignment module locates key landmarks such as the eyes, nose and mouth corners on the face, and corrects the face in the image to an angle facing the camera through an affine transformation matrix.
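As an illustration, the following OpenCV sketch rotates a detected face so that the eye line is horizontal; landmark detection is assumed to have been done upstream, and the function name and the two-landmark simplification are assumptions of this sketch (the patent's module uses eyes, nose and mouth corners).

```python
import cv2
import numpy as np


def align_face(image: np.ndarray, left_eye, right_eye) -> np.ndarray:
    """Rotate a face image so the eye line is horizontal (camera-facing)."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    angle = np.degrees(np.arctan2(dy, dx))           # tilt of the eye line
    center = ((left_eye[0] + right_eye[0]) / 2.0,
              (left_eye[1] + right_eye[1]) / 2.0)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)  # affine transform matrix
    h, w = image.shape[:2]
    return cv2.warpAffine(image, M, (w, h))
```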
Referring to fig. 3, the present invention further provides a feature extraction classification metric learning system 100 for face recognition, including:
an initial face image acquisition module 110, configured to acquire an initial face image that has been detected to be aligned;
the to-be-mapped face feature map acquisition module 120 is communicatively connected with the initial face image acquisition module 110 and is configured to perform initial feature extraction processing on the initial face image to obtain a face feature map to be mapped;
the mapping feature map acquisition module 130 is communicatively connected with the to-be-mapped face feature map acquisition module 120 and is configured to map the face feature map through the ThetaMEX function to obtain a mapping feature map;
the classification metric learning module 140 is communicatively connected with the mapping feature map acquisition module 130 and is configured to perform classification metric learning on the mapping feature map through a classification loss function.
Preferably, in another embodiment of the present application, the system further includes a detection and alignment module 150, communicatively connected with the initial face image acquisition module 110, configured to control a camera to photograph a portrait target object, obtain a portrait video stream, and detect a face image to be adjusted in the portrait video stream; and to adjust the face image to be adjusted to be aligned with the camera to obtain the initial face image.
The feature extraction classification metric learning system for face recognition first acquires an initial face image that has been detected and aligned, then performs initial feature extraction on it to obtain a face feature map to be mapped. The face feature map is mapped through the ThetaMEX function to obtain a mapping feature map; because the parameter θ of the ThetaMEX function can adaptively map the output to any value between the maximum and the minimum, the output of the ThetaMEX function, and hence the sparsity of the feature map, can be freely controlled, so the obtained mapping feature map improves image precision and reduces the false recognition rate. Classification metric learning is then performed on the mapping feature map through a classification loss function: a classification idea is introduced on the basis of metric learning, so the classification loss only needs to extract, in advance, a class center as a 'simulation class' for a group of samples and then compare each sample with the 'simulation class'. The time complexity of the function is therefore low; two samples no longer need to be drawn simultaneously at each step, and metric learning can proceed with only one sample image collected at a time.
Specifically, the functions of each module in this embodiment are described in detail in the corresponding method embodiment, so that a detailed description is omitted.
Based on the same inventive concept, the embodiments of the present application also provide a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements all or part of the method steps of the above method.
The present invention may be implemented by realizing all or part of the above-described method flow through a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium, and when executed by a processor it can implement the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer-readable medium can be increased or decreased as appropriate according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
Based on the same inventive concept, the embodiments of the present application further provide an electronic device, including a memory and a processor, where the memory stores a computer program running on the processor, and when the processor executes the computer program, the processor implements all or part of the method steps in the above method.
The processor may be a central processing unit (Central Processing Unit, CPU), other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), off-the-shelf programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like, the processor being a control center of the computer device, and the various interfaces and lines connecting the various parts of the overall computer device.
The memory may be used to store computer programs and/or modules, and the processor implements various functions of the computer device by running or executing the computer programs and/or modules stored in the memory, and invoking data stored in the memory. The memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function (e.g., a sound playing function, an image playing function, etc.); the storage data area may store data (e.g., audio data, video data, etc.) created according to the use of the handset. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, memory, plug-in hard disk, smart Media Card (SMC), secure Digital (SD) Card, flash Card (Flash Card), at least one disk storage device, flash memory device, or other volatile solid-state storage device.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, server, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), servers and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (9)

1. A feature extraction classification metric learning method for face recognition, characterized by comprising the following steps:
acquiring an initial face image that has been detected and aligned;
performing initial feature extraction processing on the initial face image to obtain a face feature map to be mapped;
mapping the face feature map through a ThetaMEX function to obtain a mapping feature map;
performing classification metric learning on the mapping feature map through a classification loss function;
the step of mapping the face feature map through a ThetaMEX function to obtain a mapping feature map specifically comprises the following steps:
obtaining the mapping feature map feature_k according to the following formula:

$$\mathrm{feature}_k=\frac{1}{\theta_k}\ln\left(\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}e^{\theta_k x_{ij}^{(k)}}\right)\qquad\text{formula (1)}$$

in formula (1), feature_k is the kth mapping feature map output by the ThetaMEX function, M is the width of the face feature map, N is the height of the face feature map, x_ij^(k) is the feature value in the ith row and jth column of the input kth face feature map, and θ_k is the hyper-parameter of the kth ThetaMEX operator;
sending each face feature map into a mutually independent ThetaMEX operator to obtain the mapping feature map;
each ThetaMEX function has an adjustable parameter θ, through which the output of the ThetaMEX function, and thereby the sparsity of the feature map, can be freely controlled.
2. The feature extraction classification metric learning method for face recognition according to claim 1, characterized in that the step of performing initial feature extraction processing on the initial face image to obtain a face feature map to be mapped specifically comprises the following steps:
sliding the convolution kernel from an initial position to an end position on the initial face image to obtain feature extraction matrix data;
carrying out data normalization processing on the feature extraction matrix data;
and carrying out nonlinear transformation on the feature extraction matrix data subjected to normalization processing through a nonlinear ReLU function to obtain a face feature map to be mapped.
3. The feature extraction classification metric learning method for face recognition according to claim 1, characterized in that the step of performing classification metric learning on the mapping feature map through a classification loss function comprises the following steps:
the classification loss function comprises a classification contrastive loss function and a classification triplet loss function;
performing classification metric learning on the mapping feature map through the classification contrastive loss function; or,
performing classification metric learning on the mapping feature map through the classification triplet loss function.
4. The feature extraction classification metric learning method for face recognition according to claim 3, characterized in that the step of performing classification metric learning on the mapping feature map through a classification contrastive loss function specifically comprises the following steps:
performing classification metric learning on the mapping feature map through the following classification contrastive loss function:

$$L_{cc}=\frac{1}{2N}\sum_{i=1}^{N}\left[y_i\left\|x_i-W_i\right\|_2^2+(1-y_i)\max\left(margin-\left\|x_i-W_i\right\|_2,\,0\right)^2\right]\qquad\text{formula (2)}$$

in formula (2), N is the number of mapping feature maps, x_i is the feature vector of the ith mapping feature map, W_i is the column vector of the ith column of the weight matrix, y_i indicates whether x_i belongs to the class represented by W_i, and margin is a preset threshold.
5. The feature extraction classification metric learning method for face recognition according to claim 3, characterized in that the step of performing classification metric learning on the mapping feature map through a classification triplet loss function specifically comprises the following steps:
performing classification metric learning on the mapping feature map through the following classification triplet loss function:

$$L_{ct}=\sum_{i=1}^{N}\max\left(\left\|f(x_i^{a})-W_i^{p}\right\|_2^2-\left\|f(x_i^{a})-W_i^{n}\right\|_2^2+margin,\;0\right)\qquad\text{formula (3)}$$

in formula (3), x_i^a is the anchor of the mapping feature map, f(x_i^a) is the anchor feature vector, W_i^p is the column vector of the weight matrix similar to the anchor feature vector (standing in for a mapping feature map of the same class as the anchor), W_i^n is the column vector of the weight matrix dissimilar to the anchor feature vector (standing in for a mapping feature map of a different class), and margin is a preset threshold.
6. The feature extraction classification metric learning method for face recognition according to claim 1, characterized in that before the step of acquiring the initial face image that has been detected and aligned, the method specifically comprises the following steps:
controlling a camera to photograph a portrait target object, obtaining a portrait video stream, and detecting a face image to be adjusted in the portrait video stream;
and adjusting the face image to be adjusted to be aligned with the camera to obtain the initial face image.
7. A feature extraction classification metric learning system for face recognition, comprising:
the initial face image acquisition module is used for acquiring the initial face image which is detected and aligned;
a to-be-mapped face feature map acquisition module, communicatively connected with the initial face image acquisition module, configured to perform initial feature extraction processing on the initial face image to obtain a face feature map to be mapped;
a mapping feature map acquisition module, communicatively connected with the to-be-mapped face feature map acquisition module, configured to map the face feature map through a ThetaMEX function to obtain a mapping feature map;
a classification metric learning module, communicatively connected with the mapping feature map acquisition module, configured to perform classification metric learning on the mapping feature map through a classification loss function;
the mapping feature map acquisition module is configured to obtain the mapping feature map feature_k according to the following formula:

$$\mathrm{feature}_k=\frac{1}{\theta_k}\ln\left(\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}e^{\theta_k x_{ij}^{(k)}}\right)\qquad\text{formula (1)}$$

in formula (1), feature_k is the kth mapping feature map output by the ThetaMEX function, M is the width of the face feature map, N is the height of the face feature map, x_ij^(k) is the feature value in the ith row and jth column of the input kth face feature map, and θ_k is the hyper-parameter of the kth ThetaMEX operator;
each face feature map is sent into a mutually independent ThetaMEX operator to obtain the mapping feature map;
each ThetaMEX function has an adjustable parameter θ, through which the output of the ThetaMEX function, and thereby the sparsity of the feature map, can be freely controlled.
8. The feature extraction classification metric learning system of claim 7, comprising:
a detection and alignment module, communicatively connected with the initial face image acquisition module, configured to control a camera to photograph a portrait target object, obtain a portrait video stream, and detect a face image to be adjusted in the portrait video stream; and to adjust the face image to be adjusted to be aligned with the camera to obtain the initial face image.
9. A storage medium having stored thereon a computer program which, when executed by a processor, implements the feature extraction classification metric learning method for face recognition as claimed in any one of claims 1 to 6.
CN202210804354.6A 2022-07-07 2022-07-07 Feature extraction classification metric learning method, system and storage medium for face recognition Active CN115240249B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210804354.6A CN115240249B (en) 2022-07-07 2022-07-07 Feature extraction classification metric learning method, system and storage medium for face recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210804354.6A CN115240249B (en) 2022-07-07 2022-07-07 Feature extraction classification metric learning method, system and storage medium for face recognition

Publications (2)

Publication Number Publication Date
CN115240249A CN115240249A (en) 2022-10-25
CN115240249B true CN115240249B (en) 2023-06-06

Family

ID=83672059

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210804354.6A Active CN115240249B (en) 2022-07-07 2022-07-07 Feature extraction classification metric learning method, system and storage medium for face recognition

Country Status (1)

Country Link
CN (1) CN115240249B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104408435A (en) * 2014-12-05 2015-03-11 浙江大学 Face identification method based on random pooling convolutional neural network
CN111160491A (en) * 2020-04-03 2020-05-15 北京精诊医疗科技有限公司 Pooling method and pooling model in convolutional neural network
CN113392955A (en) * 2021-05-11 2021-09-14 南方医科大学 CT reconstruction neural network structure and method based on downsampling imaging geometric modeling
CN113592018A (en) * 2021-08-10 2021-11-02 大连大学 Infrared light and visible light image fusion method based on residual dense network and gradient loss
US11270124B1 (en) * 2020-11-16 2022-03-08 Branded Entertainment Network, Inc. Temporal bottleneck attention architecture for video action recognition
CN114332796A (en) * 2021-12-24 2022-04-12 燕山大学 Multi-sensor fusion voxel characteristic map generation method and system
CN114372968A (en) * 2021-12-31 2022-04-19 江南大学 Defect detection method combining attention mechanism and adaptive memory fusion network

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5774576A (en) * 1995-07-17 1998-06-30 Nec Research Institute, Inc. Pattern recognition by unsupervised metric learning
US8873844B2 (en) * 2011-11-21 2014-10-28 Nec Laboratories America, Inc. Large-scale strongly supervised ensemble metric learning
CN107742107B (en) * 2017-10-20 2019-03-01 北京达佳互联信息技术有限公司 Facial image classification method, device and server
KR102535411B1 (en) * 2017-11-16 2023-05-23 삼성전자주식회사 Apparatus and method related to metric learning based data classification
US10706503B2 (en) * 2018-03-13 2020-07-07 Disney Enterprises, Inc. Image processing using a convolutional neural network
CN110097021B (en) * 2019-05-10 2022-09-06 电子科技大学 MTCNN-based face pose estimation method
CN113723455B (en) * 2021-07-28 2023-10-13 南京邮电大学 Strong gravitation lens system classification method and device based on metric learning
CN114067385A (en) * 2021-10-09 2022-02-18 华南理工大学 Cross-modal face retrieval Hash method based on metric learning
CN114565918A (en) * 2022-02-24 2022-05-31 阳光暖果(北京)科技发展有限公司 Face silence living body detection method and system based on multi-feature extraction module

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104408435A (en) * 2014-12-05 2015-03-11 浙江大学 Face identification method based on random pooling convolutional neural network
CN111160491A (en) * 2020-04-03 2020-05-15 北京精诊医疗科技有限公司 Pooling method and pooling model in convolutional neural network
US11270124B1 (en) * 2020-11-16 2022-03-08 Branded Entertainment Network, Inc. Temporal bottleneck attention architecture for video action recognition
CN113392955A (en) * 2021-05-11 2021-09-14 南方医科大学 CT reconstruction neural network structure and method based on downsampling imaging geometric modeling
CN113592018A (en) * 2021-08-10 2021-11-02 大连大学 Infrared light and visible light image fusion method based on residual dense network and gradient loss
CN114332796A (en) * 2021-12-24 2022-04-12 燕山大学 Multi-sensor fusion voxel characteristic map generation method and system
CN114372968A (en) * 2021-12-31 2022-04-19 江南大学 Defect detection method combining attention mechanism and adaptive memory fusion network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Image Modeling Method Based on Sparse Optimization Learning; Zhang Sibo; China Doctoral Dissertations Full-text Database (Information Science and Technology), No. 1; pp. I138-146 *
Sparse Subspace Clustering Model Incorporating Unsupervised Metric Learning; Jiang Yuyan et al.; Journal of Chinese Computer Systems; pp. 1-10 *

Also Published As

Publication number Publication date
CN115240249A (en) 2022-10-25

Similar Documents

Publication Publication Date Title
Hsu et al. Ratio-and-scale-aware YOLO for pedestrian detection
CN112528878B (en) Method and device for detecting lane line, terminal equipment and readable storage medium
US10534957B2 (en) Eyeball movement analysis method and device, and storage medium
CN111160269A (en) Face key point detection method and device
CN108764041B (en) Face recognition method for lower shielding face image
CN109344727B (en) Identity card text information detection method and device, readable storage medium and terminal
CN109960742B (en) Local information searching method and device
US11455831B2 (en) Method and apparatus for face classification
US20170061253A1 (en) Method and device for determining the shape of an object represented in an image, corresponding computer program product and computer-readable medium
CN111695462A (en) Face recognition method, face recognition device, storage medium and server
CN110838119A (en) Human face image quality evaluation method, computer device and computer readable storage medium
CN113869282B (en) Face recognition method, hyper-resolution model training method and related equipment
CN112836653A (en) Face privacy method, device and apparatus and computer storage medium
CN111709305B (en) Face age identification method based on local image block
CN115578590A (en) Image identification method and device based on convolutional neural network model and terminal equipment
CN110826534A (en) Face key point detection method and system based on local principal component analysis
JP7121132B2 (en) Image processing method, apparatus and electronic equipment
CN115240249B (en) Feature extraction classification metric learning method, system and storage medium for face recognition
CN113012030A (en) Image splicing method, device and equipment
CN116129496A (en) Image shielding method and device, computer equipment and storage medium
CN110210425B (en) Face recognition method and device, electronic equipment and storage medium
CN112084874B (en) Object detection method and device and terminal equipment
CN111860272A (en) Image processing method, chip and electronic device
CN113128427A (en) Face recognition method and device, computer readable storage medium and terminal equipment
CN112348008A (en) Certificate information identification method and device, terminal equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant