CN112613341A - Training method and device, fingerprint identification method and device, and electronic device - Google Patents

Training method and device, fingerprint identification method and device, and electronic device

Info

Publication number
CN112613341A
Authority
CN
China
Prior art keywords
sample
sum
pair
image set
positive
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011351546.3A
Other languages
Chinese (zh)
Inventor
魏思远
梁嘉骏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TIANJIN JIHAO TECHNOLOGY CO LTD
Original Assignee
Beijing Megvii Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Megvii Technology Co Ltd filed Critical Beijing Megvii Technology Co Ltd
Priority to CN202011351546.3A priority Critical patent/CN112613341A/en
Publication of CN112613341A publication Critical patent/CN112613341A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/12Fingerprints or palmprints
    • G06V40/1382Detecting the live character of the finger, i.e. distinguishing from a fake or cadaver finger
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/12Fingerprints or palmprints
    • G06V40/1365Matching; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a training method and device, a fingerprint identification method and device, an electronic device and a storage medium, wherein the training method comprises the following steps: performing feature extraction on an obtained sample image set to obtain sample features; preprocessing the sample features to obtain preprocessed features; calculating, according to the preprocessed features, the sum of comparison losses of all sample pairs included in the sample image set; calculating the sum of classification losses of all samples included in the sample image set; and training according to the sum of comparison losses and the sum of classification losses. Compared with the traditional way of training a neural network model, a sum of comparison losses is added during the training process, which shortens the distance between samples of the same type and enlarges the distance between samples of different types, so that several more compact feature distributions are formed and the neural network model being trained can fully learn the boundaries between the various types of samples.

Description

Training method and device, fingerprint identification method and device, and electronic device
Technical Field
The application belongs to the field of machine learning, and particularly relates to a training method and device, a fingerprint identification method and device, electronic equipment and a storage medium.
Background
After a convolutional neural network model is trained on training samples, the final effect to be achieved by the convolutional neural network model is to accurately classify positive and negative samples. Training samples generated by the same project or the same device are called samples of the same domain, or samples of the same sample type.
Currently, a convolutional neural network model is usually trained with training data accumulated from past projects that are the same as or similar to the project the convolutional neural network model will serve; that is, the training samples may include samples from different domains, and the samples of each domain include positive samples and negative samples. The training process learns the sum of the classification losses of the training samples, so as to learn a decision surface for distinguishing positive and negative samples.
However, this simple way of reusing past data for training does not consider the differences between the samples of each domain in the training set, which may cause the decision surface to overfit the samples of a certain domain, so that the feature distance between the positive and negative samples of other domains becomes too small. As a result, the trained neural network model can accurately distinguish the positive and negative samples of the overfitted domain but has difficulty accurately distinguishing the positive and negative samples of other domains whose feature distances are too small, which harms the generalization of the convolutional neural network model.
Disclosure of Invention
In view of the above, an object of the present application is to provide a training method and apparatus, a fingerprint recognition method and apparatus, an electronic device, and a storage medium, so that a neural network model obtained by training has better generalization.
The embodiment of the application is realized as follows:
in a first aspect, an embodiment of the present application provides a training method, configured to train a neural network model, where the neural network model includes a backbone network and a comparison learning loss network, and the method includes: performing feature extraction on the obtained sample image set through the backbone network to obtain sample features; preprocessing the sample characteristics through the comparison learning loss network to obtain preprocessed characteristics; according to the preprocessing characteristics, calculating and obtaining the sum of comparison losses of all sample pairs included in the sample image set; calculating the sum of the classification losses of all samples included in the sample image set according to the sample characteristics; and training the neural network model according to the sum of the comparison loss and the sum of the classification loss.
Compared with the traditional way of training a neural network model, the training method provided by this scheme adds a sum of comparison losses when the neural network model is trained. When the neural network model is trained on and learns the sum of comparison losses, parameters characterizing the comparison differences among the samples can be calculated through gradient calculation. By learning these parameters, the neural network model can learn, as fully as possible, the comparison differences between same-type samples of each domain and between different-type samples of each domain, so that when forming the feature space the model considers not only the differences between positive and negative samples but also the differences brought by the domains, thereby avoiding over-fitting the samples of certain specific domains in the training set and helping to enhance the generalization of the neural network model.
With reference to the embodiment of the first aspect, in a possible implementation manner, the training the neural network model according to the sum of the comparison losses and the sum of the classification losses includes: and carrying out weighted summation on the comparison loss sum and the classification loss sum, and carrying out back propagation on parameters obtained after weighted summation in the neural network model.
With reference to the embodiment of the first aspect, in a possible implementation manner, each sample included in the sample image set has a sample label, and the preprocessing the sample feature to obtain a preprocessed feature includes: performing global average pooling on sample features corresponding to each sample included in the sample image set to obtain global features corresponding to each sample; calculating a feature distance corresponding to each sample pair included in the sample image set according to the global feature corresponding to each sample; calculating a positive and negative sample pair mask matrix of the sample image set according to the sample label of each sample; the preprocessing features are feature distances corresponding to the sample pairs and the positive and negative sample pair mask matrix.
With reference to the embodiment of the first aspect, in a possible implementation manner, calculating a positive and negative sample pair mask matrix of the sample image set according to the sample label of each sample, includes:
based on the formula M ═ Mi,j|mi,j=(yi==yj) Calculating a positive and negative sample pair mask matrix of the sample image set; wherein, if the sample label y of the ith sampleiWith the sample label y of the jth samplejIf the positive sample pair is the positive sample pair, the parameter of the positive sample pair in the positive and negative sample pair mask matrix is 1; if the sample label y of the ith sampleiWith the sample label y of the jth samplejAnd if not, the sample pair formed by the ith sample and the jth sample is a negative sample pair, and the corresponding parameter of the negative sample pair in the positive and negative sample pair mask matrix is 0.
With reference to the embodiment of the first aspect, in a possible implementation manner, the calculating a feature distance corresponding to each sample pair included in the sample image set includes: for the sample pair formed by the ith sample x_i and the jth sample x_j, squaring the difference between each component of the global feature of x_i and the corresponding component of the global feature of x_j, summing the squared differences, and determining the resulting value as the feature distance corresponding to the sample pair formed by the ith sample x_i and the jth sample x_j.
With reference to the embodiment of the first aspect, in a possible implementation manner, the preprocessing features are the feature distance corresponding to each sample pair included in the sample image set and the positive and negative sample pair mask matrix of the sample image set, and the calculating, according to the preprocessing features, the sum of comparison losses of all sample pairs included in the sample image set includes: summing the comparison losses of all positive sample pairs included in the sample image set to obtain a sum of positive sample comparison pair losses; summing the comparison losses of all negative sample pairs included in the sample image set to obtain a sum of negative sample comparison pair losses; and summing the sum of positive sample comparison pair losses and the sum of negative sample comparison pair losses to obtain the sum of comparison losses; wherein, for each positive sample pair included in the sample image set, the comparison loss of the positive sample pair is: the product of the parameter corresponding to the positive sample pair in the positive and negative sample pair mask matrix and the feature distance corresponding to the positive sample pair; for each negative sample pair included in the sample image set, the comparison loss of the negative sample pair is: the product of the parameter corresponding to the negative sample pair in the positive and negative sample pair mask matrix and the feature distance optimization value of the negative sample pair; and the feature distance optimization value of the negative sample pair is: the larger of the difference between a preset parameter and the feature distance corresponding to the negative sample pair, and 0.
With reference to the embodiment of the first aspect, in a possible implementation manner, the calculating, according to the sample features, a sum of classification losses of all samples included in the sample image set includes: and calculating the cross entropy of each sample included in the sample image set according to the sample characteristics, and summing the obtained cross entropies to obtain the sum of the classification losses.
With reference to the embodiment of the first aspect, in one possible implementation manner, the neural network model is used for identifying a living fingerprint under a screen and a foreign object.
In a second aspect, an embodiment of the present application provides a training apparatus, configured to train a neural network model, where the neural network model includes a backbone network and a comparison learning loss network. The training apparatus includes: the device comprises a feature extraction module, a preprocessing module, a calculation module and a training module. The characteristic extraction module is used for extracting the characteristics of the obtained sample image set through the backbone network to obtain sample characteristics; the preprocessing module is used for preprocessing the sample characteristics through the comparison learning loss network to obtain preprocessing characteristics; the calculation module is used for calculating and obtaining the sum of comparison losses of all sample pairs included in the sample image set according to the preprocessing characteristics; the calculation module is further configured to calculate, according to the sample features, a sum of classification losses of all samples included in the sample image set; and the training module is used for training the neural network model according to the comparison loss sum and the classification loss sum.
With reference to the second aspect, in a possible implementation manner, the training module is configured to perform weighted summation on the comparison loss sum and the classification loss sum, and perform back propagation on parameters obtained after the weighted summation in the neural network model.
With reference to the second aspect, in a possible implementation manner, each sample included in the sample image set has a sample label, and the preprocessing module is configured to perform global average pooling on sample features corresponding to each sample included in the sample image set through the comparison learning loss network to obtain a global feature corresponding to each sample; calculating a feature distance corresponding to each sample pair included in the sample image set according to the global feature corresponding to each sample; calculating a positive and negative sample pair mask matrix of the sample image set according to the sample label of each sample; the preprocessing features are feature distances corresponding to the sample pairs and the positive and negative sample pair mask matrix.
With reference to the second aspect, in one possible implementation manner, the preprocessing module is configured to calculate the positive and negative sample pair mask matrix of the sample image set based on the formula M = {m_ij | m_ij = (y_i == y_j)}; wherein, if the sample label y_i of the ith sample is the same as the sample label y_j of the jth sample, the sample pair formed by the ith sample and the jth sample is a positive sample pair, and the parameter corresponding to the positive sample pair in the positive and negative sample pair mask matrix is 1; if the sample label y_i of the ith sample is different from the sample label y_j of the jth sample, the sample pair formed by the ith sample and the jth sample is a negative sample pair, and the parameter corresponding to the negative sample pair in the positive and negative sample pair mask matrix is 0.
With reference to the embodiment of the second aspect, in a possible implementation manner, for the sample pair formed by the ith sample x_i and the jth sample x_j, the preprocessing module is further configured to square the difference between each component of the global feature of x_i and the corresponding component of the global feature of x_j, sum the squared differences, and determine the resulting value as the feature distance corresponding to the sample pair formed by the ith sample x_i and the jth sample x_j.
With reference to the embodiment of the second aspect, in a possible implementation manner, the preprocessing features are the feature distance corresponding to each sample pair included in the sample image set and the positive and negative sample pair mask matrix of the sample image set, and the calculation module is configured to sum the comparison losses of all positive sample pairs included in the sample image set to obtain a sum of positive sample comparison pair losses; sum the comparison losses of all negative sample pairs included in the sample image set to obtain a sum of negative sample comparison pair losses; and sum the sum of positive sample comparison pair losses and the sum of negative sample comparison pair losses to obtain the sum of comparison losses; wherein, for each positive sample pair included in the sample image set, the comparison loss of the positive sample pair is: the product of the parameter corresponding to the positive sample pair in the positive and negative sample pair mask matrix and the feature distance corresponding to the positive sample pair; for each negative sample pair included in the sample image set, the comparison loss of the negative sample pair is: the product of the parameter corresponding to the negative sample pair in the positive and negative sample pair mask matrix and the feature distance optimization value of the negative sample pair; and the feature distance optimization value of the negative sample pair is: the larger of the difference between a preset parameter and the feature distance corresponding to the negative sample pair, and 0.
With reference to the second aspect, in a possible implementation manner, the calculating module is configured to calculate cross entropy of each sample included in the sample image set according to the sample features, and sum the obtained cross entropy to obtain the sum of the classification losses.
With reference to the second aspect, in one possible implementation manner, the neural network model is used for identifying the living fingerprint under the screen and the foreign object.
In a third aspect, an embodiment of the present application further provides a fingerprint identification method, where the method includes: collecting a fingerprint to be identified; and inputting the fingerprint to be identified into a neural network model obtained after training by the training method of any one of the embodiments of the first aspect, identifying, and obtaining an output result.
In a fourth aspect, an embodiment of the present application further provides a fingerprint identification device, which includes an acquisition module and an identification module. The acquisition module is used for acquiring the fingerprint to be identified; and the identification module is used for inputting the fingerprint to be identified into the neural network model obtained after training by the training method of any one of the embodiments of the first aspect for identification, and obtaining an output result.
In a fifth aspect, an embodiment of the present application further provides an electronic device, including: a memory and a processor, the memory and the processor connected; the memory is used for storing programs; the processor calls a program stored in the memory to perform the method of the first aspect embodiment and/or any possible implementation manner of the first aspect embodiment.
In a sixth aspect, the present application further provides a non-volatile computer-readable storage medium (hereinafter, referred to as a storage medium), on which a computer program is stored, where the computer program is executed by a computer to perform the method in the foregoing first aspect and/or any possible implementation manner of the first aspect.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application or of the prior art, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort. The foregoing and other objects, features and advantages of the application will be apparent from the accompanying drawings. Like reference numerals refer to like parts throughout the drawings. The drawings are not necessarily drawn to scale, emphasis instead being placed upon illustrating the subject matter of the present application.
Fig. 1 shows a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Fig. 2 shows a flowchart of a training method provided in an embodiment of the present application.
Fig. 3 shows a schematic structural diagram of a training apparatus according to an embodiment of the present application.
Fig. 4 shows a flowchart of a fingerprint identification method provided in an embodiment of the present application.
Fig. 5 is a schematic structural diagram of a fingerprint identification device according to an embodiment of the present application.
Reference numbers: 100-an electronic device; 110-a processor; 120-a memory; 500-a training device; 510-a feature extraction module; 520-a pre-processing module; 530-a calculation module; 540-a training module; 600-a fingerprint recognition device; 610-an acquisition module; 620 — identification module.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, relational terms such as "first," "second," and the like may be used solely in the description herein to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a/an" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
Further, the term "and/or" in the present application is only one kind of association relationship describing the associated object, and means that three kinds of relationships may exist, for example, a and/or B may mean: a exists alone, A and B exist simultaneously, and B exists alone.
In addition, the defects (the training result is over-fitted to the domain corresponding to a certain item, which affects the generalization of the convolutional neural network model) existing in the neural network model training method in the prior art are the results obtained after the applicant has practiced and studied carefully, and therefore, the discovery process of the above defects and the solutions proposed in the following embodiments of the present application for the above defects should be considered as contributions of the applicant to the present application.
In order to solve the above problem, embodiments of the present application provide a training method, an apparatus, an electronic device, and a storage medium, so that a trained neural network model has better generalization on recognition of images from different domains.
The technology can be realized by adopting corresponding software, hardware and a combination of software and hardware. The following describes embodiments of the present application in detail.
First, an electronic device 100 for implementing the training method and apparatus according to the embodiment of the present application is described with reference to fig. 1. A neural network model is deployed on the electronic device 100.
Optionally, the electronic device 100 may be, but is not limited to, a personal computer (PC), a tablet PC, a mobile Internet device (MID), a personal digital assistant (PDA), or a server.
Among them, the electronic device 100 may include: a processor 110, a memory 120.
It should be noted that the components and structure of electronic device 100 shown in FIG. 1 are exemplary only, and not limiting, and electronic device 100 may have other components and structures as desired.
The processor 110, memory 120, and other components that may be present in the electronic device 100 are electrically connected to each other, directly or indirectly, to enable the transfer or interaction of data. For example, the processor 110, the memory 120, and other components that may be present may be electrically coupled to each other via one or more communication buses or signal lines.
The memory 120 is used for storing a program, for example, a program corresponding to the training method or the training apparatus described later. Optionally, when the neural network model is stored in the memory 120, the neural network model includes at least one software functional module that can be stored in the memory 120 in the form of software or firmware.
Alternatively, the software function module included in the neural network model may also be solidified in an Operating System (OS) of the electronic device 100.
The processor 110 is adapted to execute executable modules stored in the memory 120, such as software functional modules or computer programs comprised by the neural network model. When the processor 110 receives the execution instruction, it may execute the computer program, for example, to perform: carrying out feature extraction on the obtained sample image set through a backbone network to obtain sample features; preprocessing the sample characteristics through a comparison learning loss network to obtain preprocessed characteristics; according to the preprocessing characteristics, calculating and obtaining the sum of comparison losses of all sample pairs included in the sample image set; calculating the sum of the classification losses of all samples included in the sample image set according to the sample characteristics; and training the neural network model according to the sum of the comparison loss and the sum of the classification loss.
Of course, the method disclosed in any of the embodiments of the present application can be applied to the processor 110, or implemented by the processor 110.
The training method provided by the present application will be described with reference to the flowchart shown in fig. 2.
Step S110: and carrying out feature extraction on the obtained sample image set through a backbone network to obtain sample features.
Taking training of a neural network model for identifying living fingerprints and foreign objects under a screen as an example, the neural network model is disposed on a hardware device (e.g., a mobile phone, a tablet, etc.), so that the hardware device has a function of identifying the living fingerprints.
Since the hardware devices come in various types and models, for each hardware device, images of living fingerprints under the screen and images of foreign objects under the screen (non-living images, such as attack images) acquired by that hardware device may be collected to form samples.
For each sample, the staff can mark the sample in advance according to the effect presented by the sample to form a sample label corresponding to the sample.
The sample label is used to characterize the nature of the sample. For example, in the above example, the sample label "positive sample" may be added to an under-screen fingerprint live image, and the sample label "negative sample" may be added to an under-screen foreign object image. For another example, in some other embodiments, the sample label "positive sample" may be added to a tumor cell image, and the sample label "negative sample" may be added to a non-tumor cell image.
Samples from the same hardware device (including positive and negative samples) belong to the same domain, and samples from different hardware devices belong to different domains. The corresponding domains of the samples are different, namely the characterization sample types are different.
In the embodiment of the present application, a positive sample and a positive sample are referred to as a homogeneous sample, a negative sample and a negative sample are referred to as a homogeneous sample, and a positive sample and a negative sample are referred to as a heterogeneous sample. In order to distinguish samples of the same type from samples of the same type and also distinguish samples of different types from samples of different types, the "sample type" is hereinafter collectively characterized by "domain", and accordingly, samples of the same type are referred to as samples from the same domain, and samples of different types are referred to as samples from different domains.
After a large number of samples are obtained, a plurality of sample image sets can be obtained by random sampling from these samples, where each sample image set includes the same number of samples.
Of course, each sample image set includes both positive samples and negative samples, and each sample image set should also include samples from all domains as far as possible.
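For illustration, a minimal sketch of how such fixed-size sample image sets could be drawn at random from a pool of labeled samples is given below; the (image, label, domain) tuple layout and the helper name are hypothetical assumptions, not details fixed by this application.

```python
import random

def draw_sample_image_sets(samples, set_size, num_sets):
    # samples: list of (image, label, domain) tuples gathered from all hardware devices.
    # Each returned sample image set contains the same number of randomly drawn samples.
    return [random.sample(samples, set_size) for _ in range(num_sets)]
```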
After the sample image set is obtained, feature extraction may be performed on each sample in the sample image set by using the trunk (backbone) portion of a conventional neural network with feature extraction capability (for example, the backbones of networks such as MobileNet, ResNet, Xception, and the like), so as to obtain the sample features X of the sample image set of the current batch.
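A minimal sketch of this feature-extraction step is given below; it assumes PyTorch with a recent torchvision, and the choice of a MobileNetV2 trunk, the input resolution, and the batch size are illustrative assumptions rather than details fixed by the application.

```python
import torch
from torchvision import models

# Convolutional trunk ("backbone") only; the classification head is dropped.
backbone = models.mobilenet_v2(weights=None).features

images = torch.randn(16, 3, 112, 112)   # hypothetical batch forming one sample image set
features = backbone(images)             # sample features X, shape (batch, n-channel, h, w)
```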
Step S120: and preprocessing the sample characteristics by comparing the learning loss network to obtain preprocessed characteristics.
Wherein the preprocessing features include: the feature distance corresponding to each sample pair and the positive and negative sample pair mask matrix corresponding to the sample image set.
It is worth pointing out that for each sample included in the sample image set, a sample pair may be formed with another sample included in the sample image set.
The characteristic distance corresponding to each sample pair is used to characterize the characteristic distance between the two samples comprised by the sample pair.
The following will describe a process of calculating the feature distance.
When preprocessing the sample features, for the sample feature corresponding to each sample included in the sample image set, the average response value over the spatial dimensions of the sample feature may be computed, that is, global average pooling is performed, so as to obtain the global feature X' = GlobalAvgPooling(X) corresponding to each sample.
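As a sketch (assuming PyTorch tensors of shape (batch, n-channel, h, w)), the global average pooling step amounts to a mean over the two spatial dimensions:

```python
import torch

def global_avg_pool(features: torch.Tensor) -> torch.Tensor:
    # features: (batch, n-channel, h, w) sample features X -> (batch, n-channel) global features X'
    return features.mean(dim=(2, 3))

features = torch.randn(4, 64, 7, 7)    # stand-in for the backbone output X
x_global = global_avg_pool(features)   # X' = GlobalAvgPooling(X)
```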
Subsequently, for the global feature X' corresponding to each sample, the feature distance corresponding to each sample pair included in the sample image set is calculated.
In some embodiments, the L2 distance may be used as the feature distance. In this case, the feature distance corresponding to each sample pair included in the sample image set can be calculated based on the formula

d_ij = Σ_{k=l=1}^{n-channel} (x'_{i,k} - x'_{j,l})^2

where d_ij is the feature distance corresponding to the sample pair formed by the ith sample x_i and the jth sample x_j, k indexes the global feature components of sample x_i and is counted from 1, l indexes the global feature components of sample x_j and is counted from 1, and n-channel is the dimension of the features included in x_i or x_j.
It is worth pointing out that, since all samples in the sample image set are subjected to feature extraction through the same backbone network, all samples in the sample image set include features with the same dimension, which are n-channels.
The meaning of the above formula is: for the sample pair formed by the ith sample x_i and the jth sample x_j, the difference between each component of the global feature of x_i and the corresponding component of the global feature of x_j is squared, the squared differences are summed, and the resulting value is determined as the feature distance corresponding to the sample pair formed by x_i and x_j, i.e., the Euclidean distance between the global feature of x_i and the global feature of x_j is calculated.
Of course, in other embodiments, other distances may be used as the characteristic distance, such as the L1 distance, and the formula for calculating the characteristic distance may be changed accordingly.
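A sketch of this feature-distance computation is given below, reading the formula above as the sum of squared component differences (squared L2 distance) between global features; the tensor shapes and this reading are assumptions.

```python
import torch

def pairwise_feature_distance(x_global: torch.Tensor) -> torch.Tensor:
    # x_global: (batch, n-channel) global features X' -> (batch, batch) matrix whose
    # entry [i, j] is the feature distance d_ij of the sample pair (x_i, x_j).
    diff = x_global.unsqueeze(1) - x_global.unsqueeze(0)   # (batch, batch, n-channel)
    return (diff ** 2).sum(dim=-1)                         # sum of squared component differences

x_global = torch.randn(4, 64)            # stand-in for the pooled global features
d = pairwise_feature_distance(x_global)
```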
The positive and negative sample pair mask matrix corresponding to the sample image set is used to characterize the relationship between the sample labels of all sample pairs included in the sample image set.
As mentioned above, a corresponding sample label is set for each sample in advance. In the embodiment of the present application, a mask matrix of positive and negative sample pairs of a sample image set may be calculated according to a sample label possessed by each sample, and a specific calculation process is as follows.
Based on the formula M = {m_ij | m_ij = (y_i == y_j)}, the positive and negative sample pair mask matrix of the sample image set is calculated; wherein, if the sample label y_i of the ith sample is the same as the sample label y_j of the jth sample, the sample pair formed by the ith sample and the jth sample is a positive sample pair, and the parameter corresponding to the positive sample pair in the positive and negative sample pair mask matrix is 1; if the sample label y_i of the ith sample is different from the sample label y_j of the jth sample, the sample pair formed by the ith sample and the jth sample is a negative sample pair, and the parameter corresponding to the negative sample pair in the positive and negative sample pair mask matrix is 0.
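A direct sketch of this mask computation, assuming integer sample labels held in a PyTorch tensor:

```python
import torch

def pair_mask(labels: torch.Tensor) -> torch.Tensor:
    # labels: (batch,) sample labels y -> (batch, batch) mask M with m_ij = 1 iff y_i == y_j
    return (labels.unsqueeze(0) == labels.unsqueeze(1)).float()

labels = torch.tensor([1, 0, 1, 0])   # hypothetical labels: 1 = positive sample, 0 = negative sample
m = pair_mask(labels)                 # positive and negative sample pair mask matrix
```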
Step S130: and calculating the sum of the comparison losses of all sample pairs included in the sample image set according to the preprocessing characteristics.
Specifically, the sum of the feature comparison losses of all the positive sample pairs included in the sample image set, that is, the sum of the positive sample comparison pair losses L_pos, may be calculated first, and the sum of the feature comparison losses of all the negative sample pairs included in the sample image set, that is, the sum of the negative sample comparison pair losses L_neg, may also be calculated; then the sum of the positive sample comparison pair losses L_pos and the sum of the negative sample comparison pair losses L_neg are summed to obtain the sum of the comparison losses of all sample pairs included in the sample image set, L_contrastive = L_pos + L_neg.

For any positive sample pair (assuming the samples included in the positive sample pair are x_i and x_j), the corresponding feature comparison loss is: the product of the parameter m_ij corresponding to the positive sample pair in the positive and negative sample pair mask matrix and the feature distance d_ij corresponding to the positive sample pair. The sum of the positive sample comparison pair losses of all positive sample pairs included in the sample image set is therefore

L_pos = Σ_{i,j} m_ij · d_ij

(since m_ij is 0 for negative sample pairs, this formula excludes negative sample pairs).

For any negative sample pair (assuming the samples included in the negative sample pair are x_i and x_j), the corresponding feature comparison loss is: the product of (1 - m_ij), where m_ij is the parameter corresponding to the negative sample pair in the positive and negative sample pair mask matrix, and the feature distance optimization value of the negative sample pair. The feature distance optimization value of a negative sample pair is: the larger of the difference between the preset parameter margin and the feature distance d_ij corresponding to the negative sample pair, and 0, i.e., Max(0, margin - d_ij). The sum of the negative sample comparison pair losses of all negative sample pairs included in the sample image set is therefore

L_neg = Σ_{i,j} (1 - m_ij) · Max(0, margin - d_ij)

(since m_ij is 1 for positive sample pairs, this formula excludes positive sample pairs).
Of course, the value of margin can be adjusted according to actual conditions.
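Putting the two sums together, a sketch of the comparison-loss computation follows; it assumes the (1 - m_ij) weighting for negative pairs described above, and the margin value is only an example.

```python
import torch

def comparison_loss_sum(d: torch.Tensor, m: torch.Tensor, margin: float = 1.0) -> torch.Tensor:
    # d: (batch, batch) feature distances, m: (batch, batch) positive/negative pair mask matrix
    l_pos = (m * d).sum()                                          # positive pairs: m_ij * d_ij
    l_neg = ((1.0 - m) * torch.clamp(margin - d, min=0.0)).sum()   # negative pairs: (1 - m_ij) * Max(0, margin - d_ij)
    return l_pos + l_neg                                           # L_contrastive = L_pos + L_neg

d = torch.rand(4, 4)                                        # stand-in feature-distance matrix
labels = torch.tensor([1, 0, 1, 0])
m = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()
loss_contrastive = comparison_loss_sum(d, m, margin=1.0)    # the margin value is an assumption
```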
Step S140: and calculating the sum of the classification losses of all samples included in the sample image set according to the sample characteristics.
In a conventional neural network model training process, the sum of the classification losses of the sample image set is generally calculated. In general, the sum of the classification losses can be obtained by calculating the cross entropy of each sample included in the sample image set and summing the obtained cross entropies.
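A sketch of this step is shown below, assuming a simple linear classification head on the pooled features; the head itself is an assumption, since the application only specifies that per-sample cross entropies are computed and summed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x_global = torch.randn(4, 64)          # stand-in for the pooled sample features
labels = torch.tensor([1, 0, 1, 0])    # sample labels (1 = positive, 0 = negative)

classifier = nn.Linear(64, 2)          # hypothetical two-class head
logits = classifier(x_global)
loss_classification = F.cross_entropy(logits, labels, reduction="sum")  # sum of per-sample cross entropies
```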
Step S150: and training the neural network model according to the sum of the comparison loss and the sum of the classification loss.
Specifically, the sum of comparison loss and the sum of classification loss may be subjected to weighted summation, and parameters obtained after weighted summation are subjected to back propagation in the neural network model, so as to achieve the purpose of training.
For example, in some embodiments, the sum of comparison losses and the sum of classification losses can be weighted and summed using the formula 0.3A + 0.7B, where A is the sum of comparison losses and B is the sum of classification losses.
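A minimal sketch of this weighted summation and back propagation, using the 0.3/0.7 weights from the example above; the stand-in loss tensors would, in a real training loop, come from the comparison-loss and classification-loss computations shown earlier.

```python
import torch

# Stand-ins for the two loss sums computed earlier in the training step.
loss_contrastive = torch.tensor(0.8, requires_grad=True)
loss_classification = torch.tensor(1.2, requires_grad=True)

total_loss = 0.3 * loss_contrastive + 0.7 * loss_classification   # weighted summation
total_loss.backward()                                              # back propagation through the model
# In a full loop: optimizer.zero_grad() before backward(), optimizer.step() after it.
```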
In the embodiment of the application, when the neural network model is trained based on the loss value, compared with the traditional way of training a neural network model, one more term, the sum of comparison losses L_contrastive, is added, and this sum of comparison losses L_contrastive plays an auxiliary role in training.
During the back propagation of the sum of comparison losses L_contrastive, the neural network model learns this information. In the learning process, the neural network model can calculate, through gradient calculation, a parameter characterizing the comparison difference between the two samples included in each sample pair (positive sample pair or negative sample pair) in the sample image set, that is, the comparison difference between every two samples.
By learning these parameters, the neural network model can learn, as fully as possible, the comparison differences between same-type samples of each domain and between different-type samples of each domain, so that when forming the feature space the model considers not only the differences between positive and negative samples but also the differences brought by different domains, thereby avoiding over-fitting the samples of certain specific domains in the training set and helping to enhance the generalization of the neural network model.
Specifically, by learning these parameters, the neural network model can fully learn the contrast differences between the same-type samples of each domain and the contrast differences between the different-type samples of each domain, so that in the formed feature space the intra-class distance between same-type samples is shortened, that is, the distance between positive samples and positive samples is shortened and the distance between negative samples and negative samples is shortened, while the inter-class distance between different-type samples is enlarged, that is, the distance between positive samples and negative samples is enlarged. In a feature space formed in this way, same-type samples from the same domain and from different domains are gathered as much as possible, and different-type samples from the same domain and from different domains are separated as far as possible, so that two sample clusters with more compact feature distributions (a positive sample cluster and a negative sample cluster) are formed. The boundary between the positive sample cluster and the negative sample cluster can therefore be made as clear as possible, and the neural network model being trained can fully learn the boundary between positive samples and negative samples.
After the training of the neural network model is completed, the boundary between the positive sample and the negative sample is fully learned, so that the positive sample and the negative sample from each domain can be accurately recognized, and compared with the traditional neural network model only learning a fuzzy boundary, the generalization of the neural network model can be improved, and the neural network model is favorable for accurately recognizing the images to be recognized from different domains (different devices or different items). When the neural network model is used to identify an underscreen fingerprint live image and an underscreen foreign object image (non-live image, such as an attack image), generalization capability of the neural network model to fingerprint live images and underscreen foreign object images from different items is facilitated.
On the basis of obtaining a neural network model for identifying an image of a living fingerprint under a screen and an image of a foreign object under the screen through the training method, please refer to fig. 3, an embodiment of the present application further provides a fingerprint identification method, where the method includes:
step S210: and collecting the fingerprint to be identified.
Step S220: and inputting the fingerprint to be identified into a neural network model for identification, and obtaining an output result.
As shown in fig. 4, an embodiment of the present application further provides a training apparatus 500 for training a neural network model. The neural network model includes a backbone network and a comparison learning loss network, and the training apparatus 500 includes: a feature extraction module 510, a pre-processing module 520, a computation module 530, and a training module 540.
A feature extraction module 510, configured to perform feature extraction on the obtained sample image set through the backbone network to obtain a sample feature;
a preprocessing module 520, configured to preprocess the sample features through the comparison learning loss network to obtain preprocessed features;
a calculating module 530, configured to calculate, according to the preprocessing feature, a sum of comparison losses of all sample pairs included in the sample image set;
the calculating module 530 is further configured to calculate, according to the sample features, a sum of classification losses of all samples included in the sample image set;
and the training module 540 is configured to train the neural network model according to the sum of the comparison losses and the sum of the classification losses.
In a possible implementation manner, the training module 540 is configured to perform weighted summation on the comparison loss sum and the classification loss sum, and perform back propagation on parameters obtained after weighted summation in the neural network model.
In a possible implementation manner, each sample included in the sample image set has a sample label, and the preprocessing module 520 is configured to perform global average pooling on sample features corresponding to each sample included in the sample image set through the comparison learning loss network to obtain a global feature corresponding to each sample; calculating a feature distance corresponding to each sample pair included in the sample image set according to the global feature corresponding to each sample; calculating a positive and negative sample pair mask matrix of the sample image set according to the sample label of each sample; the preprocessing features are feature distances corresponding to the sample pairs and the positive and negative sample pair mask matrix.
In a possible implementation manner, the preprocessing module 520 is configured to calculate the positive and negative sample pair mask matrix of the sample image set based on the formula M = {m_ij | m_ij = (y_i == y_j)}; wherein, if the sample label y_i of the ith sample is the same as the sample label y_j of the jth sample, the sample pair formed by the ith sample and the jth sample is a positive sample pair, and the parameter corresponding to the positive sample pair in the positive and negative sample pair mask matrix is 1; if the sample label y_i of the ith sample is different from the sample label y_j of the jth sample, the sample pair formed by the ith sample and the jth sample is a negative sample pair, and the parameter corresponding to the negative sample pair in the positive and negative sample pair mask matrix is 0.
In a possible implementation manner, for the sample pair formed by the ith sample x_i and the jth sample x_j, the preprocessing module 520 is further configured to square the difference between each component of the global feature of x_i and the corresponding component of the global feature of x_j, sum the squared differences, and determine the resulting value as the feature distance corresponding to the sample pair formed by the ith sample x_i and the jth sample x_j.
In a possible implementation manner, the preprocessing features are the feature distance corresponding to each sample pair included in the sample image set and the positive and negative sample pair mask matrix of the sample image set, and the calculating module 530 is configured to sum the comparison losses of all positive sample pairs included in the sample image set to obtain a sum of positive sample comparison pair losses; sum the comparison losses of all negative sample pairs included in the sample image set to obtain a sum of negative sample comparison pair losses; and sum the sum of positive sample comparison pair losses and the sum of negative sample comparison pair losses to obtain the sum of comparison losses; wherein, for each positive sample pair included in the sample image set, the comparison loss of the positive sample pair is: the product of the parameter corresponding to the positive sample pair in the positive and negative sample pair mask matrix and the feature distance corresponding to the positive sample pair; for each negative sample pair included in the sample image set, the comparison loss of the negative sample pair is: the product of the parameter corresponding to the negative sample pair in the positive and negative sample pair mask matrix and the feature distance optimization value of the negative sample pair; and the feature distance optimization value of the negative sample pair is: the larger of the difference between a preset parameter and the feature distance corresponding to the negative sample pair, and 0.
In a possible implementation manner, the calculating module 530 is configured to calculate cross entropy of each sample included in the sample image set according to the sample features, and sum each cross entropy to obtain the sum of the classification losses.
In one possible implementation, the neural network model obtained by training is used for identifying the living body of the underscreen fingerprint and the foreign body.
The training apparatus 500 provided in the embodiment of the present application has the same implementation principle and the same technical effect as the foregoing method embodiments, and for the sake of brief description, reference may be made to the corresponding contents in the foregoing method embodiments for parts that are not mentioned in the embodiment of the training apparatus 500.
As shown in fig. 5, an embodiment of the present application further provides a fingerprint identification apparatus 600, which includes an acquisition module 610 and an identification module 620.
The acquisition module 610 is used for acquiring a fingerprint to be identified;
and the identification module 620 is configured to input the fingerprint to be identified into the trained neural network model for identification, and obtain an output result.
In addition, the present application also provides a storage medium, where a computer program is stored, and when the computer program is executed by a computer, the training method as described above is executed, or the fingerprint recognition method as described above is executed.
Furthermore, an embodiment of the present invention further provides an electronic device, which includes a processor and a memory connected to the processor, where the memory stores a computer program, and when the computer program is executed by the processor, the electronic device is caused to perform the training method or the fingerprint recognition method. The structural schematic diagram of the electronic device can be seen in fig. 1.
In summary, the training method and apparatus, the fingerprint identification method and apparatus, the electronic device, and the storage medium according to the embodiments of the present invention perform feature extraction and feature processing on the samples in the sample image set to obtain the preprocessed features, then calculate the sum of comparison losses of all sample pairs included in the sample image set based on the preprocessed features, and train the neural network model with this sum together with the sum of classification losses calculated from the sample features. Compared with the traditional way of training a neural network model, a sum of comparison losses is added during training. When the neural network model is trained on and learns the sum of comparison losses, parameters characterizing the comparison differences among the samples can be calculated through gradient calculation. By learning these parameters, the neural network model can learn, as fully as possible, the comparison differences between same-type samples of each domain and between different-type samples of each domain, so that when forming the feature space the model considers not only the differences between positive and negative samples but also the differences brought by the domains, thereby avoiding over-fitting the samples of certain specific domains in the training set and helping to enhance the generalization of the neural network model.
It should be noted that, in the present specification, the embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other.
In the several embodiments provided in the present application, it should be understood that the disclosed neural network model and method may be implemented in other ways. The neural network model embodiments described above are merely illustrative, for example, the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of neural network models, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions may be stored in a storage medium if they are implemented in the form of software function modules and sold or used as separate products. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a notebook computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application.

Claims (12)

1. A training method for training a neural network model, wherein the neural network model comprises a backbone network and a comparison learning loss network, and the method comprises the following steps:
performing feature extraction on the obtained sample image set through the backbone network to obtain sample features;
preprocessing the sample features through the comparison learning loss network to obtain preprocessed features;
calculating, according to the preprocessed features, the sum of comparison losses of all sample pairs included in the sample image set;
calculating, according to the sample features, the sum of classification losses of all samples included in the sample image set;
and training the neural network model according to the sum of comparison losses and the sum of classification losses.
2. The method of claim 1, wherein training the neural network model according to the sum of comparison losses and the sum of classification losses comprises:
performing weighted summation on the sum of comparison losses and the sum of classification losses, and back-propagating the result of the weighted summation through the neural network model.
3. The method of claim 1, wherein each sample included in the sample image set has a sample label, and preprocessing the sample features to obtain the preprocessed features comprises:
performing global average pooling on the sample features corresponding to each sample included in the sample image set to obtain the global feature corresponding to each sample;
calculating, according to the global feature corresponding to each sample, the feature distance corresponding to each sample pair included in the sample image set;
calculating a positive and negative sample pair mask matrix of the sample image set according to the sample label of each sample;
wherein the preprocessed features are the feature distances corresponding to the sample pairs and the positive and negative sample pair mask matrix.
4. The method of claim 3, wherein calculating the positive and negative sample pair mask matrix of the sample image set according to the sample label of each sample comprises:
if the sample label y_i of the i-th sample is the same as the sample label y_j of the j-th sample, the sample pair formed by the i-th sample and the j-th sample is a positive sample pair, and the parameter corresponding to the positive sample pair in the positive and negative sample pair mask matrix is 1;
if the sample label y_i of the i-th sample is different from the sample label y_j of the j-th sample, the sample pair formed by the i-th sample and the j-th sample is a negative sample pair, and the parameter corresponding to the negative sample pair in the positive and negative sample pair mask matrix is 0.
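Purely as an illustrative sketch of the preprocessing described in claims 3 and 4 (not part of the claims), the following assumes convolutional sample features of shape (N, C, H, W) and integer sample labels; the names `preprocess`, `distances`, and `pair_mask` are hypothetical, and Euclidean distance is one possible choice of feature distance.

```python
import torch

def preprocess(features, labels):
    """Global average pooling, pairwise feature distances, and the
    positive/negative sample-pair mask matrix (claims 3 and 4)."""
    # Global average pooling over the spatial dimensions: (N, C, H, W) -> (N, C).
    global_feats = features.mean(dim=(2, 3))

    # Feature distance for every sample pair, here Euclidean: shape (N, N).
    distances = torch.cdist(global_feats, global_feats, p=2)

    # Mask matrix: entry (i, j) is 1 if y_i == y_j (positive pair), otherwise 0 (negative pair).
    pair_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()

    return distances, pair_mask
```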
5. The method according to claim 1, wherein the preprocessed features are the feature distances corresponding to each sample pair included in the sample image set and the positive and negative sample pair mask matrix of the sample image set, and calculating the sum of comparison losses of all sample pairs included in the sample image set according to the preprocessed features comprises:
summing the comparison losses of all positive sample pairs included in the sample image set to obtain a positive sample pair comparison loss sum;
summing the comparison losses of all negative sample pairs included in the sample image set to obtain a negative sample pair comparison loss sum;
summing the positive sample pair comparison loss sum and the negative sample pair comparison loss sum to obtain the sum of comparison losses;
wherein, for each positive sample pair included in the sample image set, the comparison loss of the positive sample pair is the product of the parameter corresponding to the positive sample pair in the positive and negative sample pair mask matrix and the feature distance corresponding to the positive sample pair; for each negative sample pair included in the sample image set, the comparison loss of the negative sample pair is the product of the parameter corresponding to the negative sample pair in the positive and negative sample pair mask matrix and the feature distance optimization value of the negative sample pair; and the feature distance optimization value of the negative sample pair is the larger of 0 and the difference between a preset parameter and the feature distance corresponding to the negative sample pair.
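As a non-authoritative sketch of the sum of comparison losses in claim 5, the following reuses the hypothetical `preprocess` helper sketched after claim 4 and treats the weight of a negative pair as the complement `1 - pair_mask` of the positive-pair mask entry (an interpretation on our part, since the mask entry of a negative pair is stated to be 0), with a preset margin parameter `margin`:

```python
import torch

def contrastive_loss_sum(features, labels, margin=1.0):
    """Sum of comparison losses over all sample pairs (claim 5)."""
    distances, pair_mask = preprocess(features, labels)

    # Positive pairs: mask entry times the feature distance of the pair.
    positive_loss = (pair_mask * distances).sum()

    # Negative pairs: complement of the mask times max(margin - distance, 0),
    # i.e. the "feature distance optimization value" of the claim.
    negative_loss = ((1.0 - pair_mask) * torch.clamp(margin - distances, min=0.0)).sum()

    return positive_loss + negative_loss
```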
6. The method of claim 1, wherein calculating the sum of classification losses of all samples included in the sample image set according to the sample features comprises:
calculating the cross entropy of each sample included in the sample image set according to the sample features, and summing the obtained cross entropies to obtain the sum of classification losses.
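For the classification-loss sum of claim 6, a minimal sketch assuming per-sample class logits produced from the sample features by a hypothetical classification head:

```python
import torch.nn.functional as F

def classification_loss_sum(logits, labels):
    """Sum of the per-sample cross-entropy losses (claim 6)."""
    # reduction="sum" adds up the cross entropy of every sample in the batch.
    return F.cross_entropy(logits, labels, reduction="sum")
```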
7. The method of any one of claims 1-6, wherein the neural network model is used to identify under-screen living fingerprints and foreign objects.
8. A method of fingerprint identification, the method comprising:
collecting a fingerprint to be identified;
inputting the fingerprint to be identified into a neural network model trained by the training method according to any one of claims 1 to 7 for identification, and obtaining an output result.
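A minimal sketch of how the trained model of claim 8 might be applied to a captured fingerprint image; `backbone`, `classifier`, and the label mapping in the comment are all assumptions, not taken from the specification:

```python
import torch

@torch.no_grad()
def identify(backbone, classifier, image):
    """Run the trained neural network model on a fingerprint to be identified (claim 8)."""
    backbone.eval()
    classifier.eval()
    logits = classifier(backbone(image.unsqueeze(0)))  # add a batch dimension
    return logits.argmax(dim=1).item()                 # hypothetical label mapping, e.g. 0 = foreign object, 1 = live fingerprint
```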
9. A training apparatus configured to train a neural network model, the neural network model including a backbone network and a comparison learning loss network, the apparatus comprising:
a feature extraction module, configured to perform feature extraction on the obtained sample image set through the backbone network to obtain sample features;
a preprocessing module, configured to preprocess the sample features through the comparison learning loss network to obtain preprocessed features;
a calculation module, configured to calculate, according to the preprocessed features, the sum of comparison losses of all sample pairs included in the sample image set;
the calculation module is further configured to calculate, according to the sample features, the sum of classification losses of all samples included in the sample image set;
and a training module, configured to train the neural network model according to the sum of comparison losses and the sum of classification losses.
10. A fingerprint recognition apparatus, the apparatus comprising:
an acquisition module, configured to acquire the fingerprint to be identified;
an identification module, configured to input the fingerprint to be identified into a neural network model trained by the training method according to any one of claims 1 to 7 for identification, and to obtain an output result.
11. An electronic device, comprising a memory and a processor, the memory being connected to the processor;
wherein the memory is configured to store a program;
and the processor calls the program stored in the memory to perform the method of any one of claims 1-8.
12. A storage medium, having stored thereon a computer program which, when executed by a computer, performs the method of any one of claims 1-8.
CN202011351546.3A 2020-11-25 2020-11-25 Training method and device, fingerprint identification method and device, and electronic device Pending CN112613341A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011351546.3A CN112613341A (en) 2020-11-25 2020-11-25 Training method and device, fingerprint identification method and device, and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011351546.3A CN112613341A (en) 2020-11-25 2020-11-25 Training method and device, fingerprint identification method and device, and electronic device

Publications (1)

Publication Number Publication Date
CN112613341A true CN112613341A (en) 2021-04-06

Family

ID=75225891

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011351546.3A Pending CN112613341A (en) 2020-11-25 2020-11-25 Training method and device, fingerprint identification method and device, and electronic device

Country Status (1)

Country Link
CN (1) CN112613341A (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016145516A1 (en) * 2015-03-13 2016-09-22 Deep Genomics Incorporated System and method for training neural networks
CN107451568A (en) * 2017-08-03 2017-12-08 重庆邮电大学 Use the attitude detecting method and equipment of depth convolutional neural networks
US20190102678A1 (en) * 2017-09-29 2019-04-04 Samsung Electronics Co., Ltd. Neural network recogntion and training method and apparatus
CN108108754A (en) * 2017-12-15 2018-06-01 北京迈格威科技有限公司 The training of identification network, again recognition methods, device and system again
CN108182394A (en) * 2017-12-22 2018-06-19 浙江大华技术股份有限公司 Training method, face identification method and the device of convolutional neural networks
CN108875907A (en) * 2018-04-23 2018-11-23 北方工业大学 A kind of fingerprint identification method and device based on deep learning
CN111311480A (en) * 2018-12-11 2020-06-19 北京京东尚科信息技术有限公司 Image fusion method and device
CN109977798A (en) * 2019-03-06 2019-07-05 中山大学 The exposure mask pond model training identified again for pedestrian and pedestrian's recognition methods again
CN111950340A (en) * 2020-06-05 2020-11-17 浙江工业大学 Face convolution neural network feature expression learning and extracting method suitable for mask wearing

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113361497A (en) * 2021-08-09 2021-09-07 北京惠朗时代科技有限公司 Intelligent tail box application method and device based on training sample fingerprint identification
CN113361497B (en) * 2021-08-09 2021-12-07 北京惠朗时代科技有限公司 Intelligent tail box application method and device based on training sample fingerprint identification

Similar Documents

Publication Publication Date Title
CN109086756B (en) Text detection analysis method, device and equipment based on deep neural network
CN111160533B (en) Neural network acceleration method based on cross-resolution knowledge distillation
Goodfellow et al. Multi-digit number recognition from street view imagery using deep convolutional neural networks
CN111401384B (en) Transformer equipment defect image matching method
CN110866530A (en) Character image recognition method and device and electronic equipment
WO2020164278A1 (en) Image processing method and device, electronic equipment and readable storage medium
Türkyılmaz et al. License plate recognition system using artificial neural networks
CN112381775A (en) Image tampering detection method, terminal device and storage medium
CN108960260B (en) Classification model generation method, medical image classification method and medical image classification device
WO2021042505A1 (en) Note generation method and apparatus based on character recognition technology, and computer device
Ashok Kumar et al. Enhanced facial emotion recognition by optimal descriptor selection with neural network
US20200134382A1 (en) Neural network training utilizing specialized loss functions
CN111694954B (en) Image classification method and device and electronic equipment
CN113255557A (en) Video crowd emotion analysis method and system based on deep learning
CN116110089A (en) Facial expression recognition method based on depth self-adaptive metric learning
CN110610131B (en) Face movement unit detection method and device, electronic equipment and storage medium
Bose et al. Light Weight Structure Texture Feature Analysis for Character Recognition Using Progressive Stochastic Learning Algorithm
CN113378609B (en) Agent proxy signature identification method and device
CN112613341A (en) Training method and device, fingerprint identification method and device, and electronic device
Jindal et al. Sign Language Detection using Convolutional Neural Network (CNN)
CN111242114A (en) Character recognition method and device
Kailash et al. Deep learning based detection of mobility aids using yolov5
CN104598866A (en) Face-based social intelligence promotion method and system
US11164035B2 (en) Neural-network-based optical character recognition using specialized confidence functions
Anggoro et al. Classification of Solo Batik patterns using deep learning convolutional neural networks algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20230419
Address after: No. S, 17/F, No. 1, Zhongguancun Street, Haidian District, Beijing 100082
Applicant after: Beijing Jigan Technology Co.,Ltd.
Address before: 316-318, block a, Rongke Information Center, No.2, South Road, Academy of Sciences, Haidian District, Beijing 100090
Applicant before: MEGVII (BEIJING) TECHNOLOGY Co.,Ltd.

TA01 Transfer of patent application right
Effective date of registration: 20230707
Address after: 300462 201-1, Floor 2, Building 4, No. 188, Rixin Road, Binhai Science Park, Binhai, Tianjin
Applicant after: Tianjin Jihao Technology Co.,Ltd.
Address before: No. S, 17/F, No. 1, Zhongguancun Street, Haidian District, Beijing 100082
Applicant before: Beijing Jigan Technology Co.,Ltd.