
CN116416666A - Face recognition method, system and storage medium based on distributed distillation

Info

Publication number: CN116416666A
Application number: CN202310409858.2A
Authority: CN
Inventors: 金炜
Original assignee: Shumei Tianxia Beijing Technology Co ltd; Beijing Nextdata Times Technology Co ltd
Current assignee: Shumei Tianxia Beijing Technology Co ltd; Beijing Nextdata Times Technology Co ltd
Priority date: 2023-04-17
Filing date: 2023-04-17
Publication date: 2023-07-11

Abstract

The invention discloses a face recognition method, system and storage medium based on distributed distillation. The method comprises: obtaining a difficult sample set and a simple sample set from a face sample set; constructing a plurality of difficult positive sample pairs and difficult negative sample pairs from the difficult sample set, and constructing simple positive sample pairs and simple negative sample pairs from the simple sample set; obtaining a difficult positive sample pair distribution characteristic, a difficult negative sample pair distribution characteristic, a simple positive sample pair distribution characteristic and a simple negative sample pair distribution characteristic by using a pre-training face recognition model, and iteratively training a face recognition student model provided with a distributed distillation loss function to obtain a target face recognition model; and inputting an image to be detected into the target face recognition model to obtain a face recognition result for the image to be detected. By dividing the face sample set and distilling the sample distribution characteristics through the distributed distillation loss function, the invention improves the recognition of faces in difficult samples.

Description

Face recognition method, system and storage medium based on distributed distillation

Technical Field

The invention relates to the technical field of face recognition, in particular to a face recognition method, a face recognition system and a storage medium based on distributed distillation.

Background

With the continuous development of deep learning technology, existing face models have achieved an accuracy of nearly 100% on public data sets. However, when a face recognition model trained on public data sets encounters mask occlusion, blurring or noise, its performance is often unsatisfactory. To evade online supervision, lawbreakers often blur or add noise to illicit face pictures or videos, or even smear regions of the face, to escape interception by a supervision system.

Therefore, a technical solution is needed to solve the above technical problems.

Disclosure of Invention

In order to solve the technical problems, the invention provides a face recognition method, a face recognition system and a storage medium based on distributed distillation.

The invention discloses a face recognition method based on distributed distillation, which comprises the following steps:

S1, determining a target difficult sample set and a target simple sample set according to a face sample set;

S2, constructing a plurality of target difficult positive sample pairs and target difficult negative sample pairs according to the target difficult sample set, and constructing a plurality of target simple positive sample pairs and target simple negative sample pairs according to the target simple sample set;

S3, acquiring difficult positive sample pair distribution characteristics of all target difficult positive sample pairs, difficult negative sample pair distribution characteristics of all target difficult negative sample pairs, simple positive sample pair distribution characteristics of all target simple positive sample pairs and simple negative sample pair distribution characteristics of all target simple negative sample pairs by utilizing a pre-training face recognition model, and performing iterative training on a face recognition student model provided with a distributed distillation loss function to obtain a target face recognition model;

S4, inputting the image to be detected into the target face recognition model to obtain a face recognition result of the image to be detected.

The face recognition method based on distributed distillation has the following beneficial effects:

the method divides the human face sample set, and performs distributed distillation on the sample distribution characteristics through the distributed distillation loss function, thereby improving the effective identification of the human face in the difficult sample.

Based on the scheme, the face recognition method based on distributed distillation can be improved as follows.

Further, step S1 includes:

S11, acquiring a face image quality score of each face sample image in the face sample set by using a face quality evaluation model, and acquiring a face deviation angle value of each face sample image in the face sample set by using a head posture evaluation model;

S12, in the face sample set, determining a face sample image with a face image quality score smaller than a first preset face image quality score or a face deviation angle value larger than a first preset face deviation angle value as a first difficult sample image, and determining a face sample image with a face image quality score larger than a second preset face image quality score and a face deviation angle value smaller than a second preset face deviation angle value as a first simple sample image;

S13, respectively carrying out data enhancement processing on each first simple sample image by utilizing a combined data enhancement mode to obtain a plurality of second difficult sample images; the combined data enhancement mode is: at least one of noise enhancement, blurring enhancement, picture compression enhancement, and mask enhancement;

S14, constructing the target simple sample set according to all the first simple sample images, and constructing the target difficult sample set according to all the first difficult sample images and all the second difficult sample images.

Further, step S2 includes:

S21, combining any two difficult sample images with the same sample attribute information in the target difficult sample set to obtain a plurality of target difficult positive sample pairs, combining any two difficult sample images with different sample attribute information in the target difficult sample set to obtain a plurality of first difficult negative sample pairs, and determining the first difficult negative sample pairs with the sample pair similarity within a preset similarity range as target difficult negative sample pairs;

S22, combining any two simple sample images with the same sample attribute information in the target simple sample set to obtain a plurality of target simple positive sample pairs, combining any two simple sample images with different sample attribute information in the target simple sample set to obtain a plurality of first simple negative sample pairs, and determining the first simple negative sample pairs with the sample pair similarity within a preset similarity range as target simple negative sample pairs.

Further, step S3 includes:

S31, acquiring difficult positive sample pair distribution characteristics of all target difficult positive sample pairs, difficult negative sample pair distribution characteristics of all target difficult negative sample pairs, simple positive sample pair distribution characteristics of all target simple positive sample pairs and simple negative sample pair distribution characteristics of all target simple negative sample pairs by utilizing a pre-training face recognition model;

S32, inputting the difficult positive sample pair distribution characteristics, the difficult negative sample pair distribution characteristics, the simple positive sample pair distribution characteristics and the simple negative sample pair distribution characteristics into the distributed distillation loss function to obtain a distillation loss value, and optimizing parameters of the pre-training face recognition model by utilizing the simple positive sample pair distribution characteristics and the simple negative sample pair distribution characteristics to obtain an optimized pre-training face recognition model;

S33, optimizing parameters of the face recognition student model based on the distillation loss value to obtain an optimized face recognition student model, taking the optimized face recognition student model as the face recognition student model, taking the optimized pre-training face recognition model as the pre-training face recognition model, and returning to step S31 until the optimized face recognition student model meets a preset iterative training condition, whereupon the optimized face recognition student model is determined as the target face recognition model; wherein the preset iterative training condition is that the maximum number of training iterations is reached.

Further, the number of target difficult positive sample pairs, the number of target difficult negative sample pairs, the number of target simple positive sample pairs and the number of target simple negative sample pairs are the same, and the distributed distillation loss function is:

$L = L_{KL} + L_{order} + L_{arc}$

[formula images for $L_{KL}$ and $L_{order}$ not reproduced]

wherein $\lambda_1 = 0.1$, $\lambda_2 = 0.02$, $\lambda_3 = 0.5$; $s$ denotes the number of sample pairs; $P^+$ is the simple positive sample pair distribution characteristic; $P^-$ is the simple negative sample pair distribution characteristic; $Q^+$ is the difficult positive sample pair distribution characteristic; $Q^-$ is the difficult negative sample pair distribution characteristic; $P^+(s)$ is the positive sample pair characteristic of the $s$-th target simple positive sample pair; $P^-(s)$ is the negative sample pair characteristic of the $s$-th target simple negative sample pair; $Q^+(s)$ is the positive sample pair characteristic of the $s$-th target difficult positive sample pair; $Q^-(s)$ is the negative sample pair characteristic of the $s$-th target difficult negative sample pair; the remaining symbols, which are formula images in the original, represent the $i$-th positive sample pair feature, the $j$-th negative sample pair feature and the mean values of those features; $p$ represents the total number of simple sample pairs; $q$ represents the total number of difficult sample pairs; and $L_{arc}$ is the loss function of the pre-training face recognition model.
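The expressions for the two distillation terms are not available in the text, so the following Python sketch shows only one plausible reading, suggested by the names of the terms and by the listed symbols: a Kullback-Leibler divergence between the simple-pair and difficult-pair distributions weighted by $\lambda_1$ and $\lambda_2$, and an order term weighted by $\lambda_3$ that rewards a larger gap between the mean positive-pair and mean negative-pair similarity. The function names, the smoothing constant `eps` and the exact form of both terms are assumptions, not the formulas of the publication.

```python
import numpy as np

LAMBDA_1, LAMBDA_2, LAMBDA_3 = 0.1, 0.02, 0.5  # values given in the text


def kl_divergence(p, q, eps=1e-8):
    """Sum over bins s of P(s) * log(P(s) / Q(s)); eps avoids log(0)."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    return float(np.sum(p * np.log(p / q)))


def distillation_loss(p_pos, p_neg, q_pos, q_neg, pos_sims, neg_sims, l_arc=0.0):
    """Assumed form of L = L_KL + L_order + L_arc.

    p_*: simple-pair distributions, q_*: difficult-pair distributions,
    pos_sims / neg_sims: similarity values of positive / negative pairs,
    l_arc: value of the pre-training model's own loss (passed in as a number).
    """
    l_kl = (LAMBDA_1 * kl_divergence(p_pos, q_pos)
            + LAMBDA_2 * kl_divergence(p_neg, q_neg))
    # Assumed order term: a larger mean gap lowers the loss.
    l_order = -LAMBDA_3 * float(np.mean(pos_sims) - np.mean(neg_sims))
    return l_kl + l_order + l_arc
```

When the difficult-pair distributions equal the simple-pair distributions the KL part vanishes, and only the order term and `l_arc` remain.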

Further, the pre-training face recognition model is a ResNet50 network model for face recognition, pre-trained on the face sample set.

The technical scheme of the face recognition system based on distributed distillation of the invention is as follows:

The system comprises: a first processing module, a second processing module, a training module and an identification module;

the first processing module is used for: according to the face sample set, determining a target difficult sample set and a target simple sample set;

the second processing module is used for: constructing a plurality of target difficult positive sample pairs and target difficult negative sample pairs according to the target difficult sample set, and constructing a plurality of target simple positive sample pairs and target simple negative sample pairs according to the target simple sample set;

the training module is used for: obtaining difficult positive sample pair distribution characteristics of all target difficult positive sample pairs, difficult negative sample pair distribution characteristics of all target difficult negative sample pairs, simple positive sample pair distribution characteristics of all target simple positive sample pairs and simple negative sample pair distribution characteristics of all target simple negative sample pairs by using a pre-training face recognition model, and carrying out iterative training on a face recognition student model provided with a distributed distillation loss function to obtain a target face recognition model;

the identification module is used for: inputting the image to be detected into the target face recognition model to obtain a face recognition result of the image to be detected.

The face recognition system based on distributed distillation has the following beneficial effects:

the system divides the human face sample set, and performs distributed distillation on sample distribution characteristics through the distributed distillation loss function, so that the effective recognition of human faces in difficult samples is improved.

Based on the scheme, the face recognition system based on distributed distillation can be improved as follows.

Further, the first processing module is specifically configured to:

acquiring a face image quality score of each face sample image in the face sample set by using a face quality evaluation model, and acquiring a face deviation angle value of each face sample image in the face sample set by using a head posture evaluation model;

in the face sample set, determining a face sample image with a face image quality score smaller than a first preset face image quality score or a face deviation angle value larger than a first preset face deviation angle value as a first difficult sample image, and determining a face sample image with a face image quality score larger than a second preset face image quality score and a face deviation angle value smaller than a second preset face deviation angle value as a first simple sample image;

respectively carrying out data enhancement processing on each first simple sample image by utilizing a combined data enhancement mode to obtain a plurality of second difficult sample images; the combined data enhancement mode is as follows: at least one of noise enhancement, blurring enhancement, picture compression enhancement, and mask enhancement;

the target simple sample set is constructed from all of the first simple sample images and the target difficult sample set is constructed from all of the first difficult sample images and all of the second difficult sample images.

Further, the second processing module is specifically configured to:

combining any two difficult sample images with the same sample attribute information in the target difficult sample set to obtain a plurality of target difficult positive sample pairs, combining any two difficult sample images with different sample attribute information in the target difficult sample set to obtain a plurality of first difficult negative sample pairs, and determining the first difficult negative sample pairs with the sample pair similarity within a preset similarity range as target difficult negative sample pairs;

combining any two simple sample images with the same sample attribute information in the target simple sample set to obtain a plurality of target simple positive sample pairs, combining any two simple sample images with different sample attribute information in the target simple sample set to obtain a plurality of first simple negative sample pairs, and determining the first simple negative sample pairs with the sample pair similarity within a preset similarity range as target simple negative sample pairs.

The technical scheme of the storage medium is as follows:

the storage medium has instructions stored therein which, when read by a computer, cause the computer to perform the steps of a distributed distillation based face recognition method as in the present invention.

Drawings

Fig. 1 is a schematic flow chart of an embodiment of a face recognition method based on distributed distillation according to the present invention;

Fig. 2 is a schematic flow chart of step S1 in an embodiment of a face recognition method based on distributed distillation according to the present invention;

Fig. 3 is a schematic flow chart of step S2 in an embodiment of a face recognition method based on distributed distillation according to the present invention;

Fig. 4 is a schematic flow chart of step S3 in an embodiment of a face recognition method based on distributed distillation according to the present invention;

Fig. 5 is a schematic overall flow chart in an embodiment of a face recognition method based on distributed distillation according to the present invention;

Fig. 6 is a schematic structural diagram of an embodiment of a face recognition system based on distributed distillation according to the present invention.

Detailed Description

Fig. 1 shows a schematic flow chart of an embodiment of a face recognition method based on distributed distillation. As shown in fig. 1, the method comprises the steps of:

S1, determining a target difficult sample set and a target simple sample set according to a face sample set.

Wherein, (1) the face sample set is a DeepGlint dataset for face training, comprising a plurality of face sample images. (2) The target difficult sample set is a data set comprising a plurality of difficult sample images as defined in this embodiment; a difficult sample image may be selected from the face sample set or obtained by processing a sample in the face sample set. (3) The target simple sample set is a data set comprising a plurality of simple sample images as defined in this embodiment; the simple sample images are screened from the face sample set.

S2, constructing a plurality of target difficult positive sample pairs and target difficult negative sample pairs according to the target difficult sample set, and constructing a plurality of target simple positive sample pairs and target simple negative sample pairs according to the target simple sample set.

Wherein, (1) a target difficult positive sample pair is a sample pair composed of two difficult sample images with identical sample attribute information in the target difficult sample set. (2) A target difficult negative sample pair is a sample pair composed of two difficult sample images with different sample attribute information in the target difficult sample set. (3) A target simple positive sample pair is a sample pair composed of two simple sample images with identical sample attribute information in the target simple sample set. (4) A target simple negative sample pair is a sample pair composed of two simple sample images with different sample attribute information in the target simple sample set.

S3, utilizing a pre-training face recognition model to obtain difficult positive sample pair distribution characteristics of all target difficult positive sample pairs, difficult negative sample pair distribution characteristics of all target difficult negative sample pairs, simple positive sample pair distribution characteristics of all target simple positive sample pairs and simple negative sample pair distribution characteristics of all target simple negative sample pairs, and carrying out iterative training on a face recognition student model provided with a distributed distillation loss function to obtain a target face recognition model.

Wherein, (1) the pre-training face recognition model is a ResNet50 network model for face recognition, pre-trained on the face sample set. (2) The difficult positive sample pair distribution feature is a distribution profile comprising the positive sample pair feature of each target difficult positive sample pair. (3) The difficult negative sample pair distribution feature is a distribution profile comprising the negative sample pair feature of each target difficult negative sample pair. (4) The simple positive sample pair distribution feature is a distribution profile comprising the positive sample pair feature of each target simple positive sample pair. (5) The simple negative sample pair distribution feature is a distribution profile comprising the negative sample pair feature of each target simple negative sample pair. (6) The face recognition student model is the student model that corresponds to the pre-training face recognition model in the knowledge distillation process; its network structure is the same as that of the pre-training face recognition model, but its loss function is different: the face recognition student model adopts the distributed distillation loss function, while the pre-training face recognition model adopts its original loss function. (7) The number of target difficult positive sample pairs, the number of target difficult negative sample pairs, the number of target simple positive sample pairs and the number of target simple negative sample pairs are the same. (8) The distributed distillation loss function is:

$L = L_{KL} + L_{order} + L_{arc}$

[formula images for $L_{KL}$ and $L_{order}$ not reproduced]

wherein the symbols shown as formula images in the original represent the $i$-th positive sample pair feature, the $j$-th negative sample pair feature and the mean values of those features.

S4, inputting the image to be detected into the target face recognition model to obtain a face recognition result of the image to be detected.

Wherein, (1) the image to be detected is an arbitrary image, which may or may not contain a face. (2) The face recognition result is obtained as follows: when the predicted value is larger than a threshold, the image to be detected is judged to contain a face; otherwise, it is judged not to contain a face.

Preferably, as shown in fig. 2, step S1 includes:

S11, acquiring the face image quality score of each face sample image in the face sample set by using a face quality evaluation model, and acquiring the face deviation angle value of each face sample image in the face sample set by using a head posture evaluation model.

Wherein, (1) the specific structure and function of the face quality evaluation model are prior art and are not repeated here. (2) The face image quality score is the score obtained by evaluating the face image quality of an image with the face quality evaluation model; a higher score indicates a clearer face in the image, and a lower score a more blurred one. (3) The specific structure and function of the head posture evaluation model are prior art and are not described in detail here. (4) The face deviation angle value is the deviation angle of the face in an image as obtained by the head posture evaluation model; the smaller the angle, the closer the face in the image is to a frontal pose.

S12, in the face sample set, face sample images with the face image quality score smaller than a first preset face image quality score or with the face deviation angle value larger than a first preset face deviation angle value are determined to be first difficult sample images, and face sample images with the face image quality score larger than a second preset face image quality score and with the face deviation angle value smaller than a second preset face deviation angle value are determined to be first simple sample images.

Wherein, (1) the first preset face image quality score defaults to 30 points. (2) The first preset face deviation angle value defaults to 40 degrees. (3) The second preset face image quality score defaults to 60 points. (4) The second preset face deviation angle value defaults to 15 degrees.

It should be noted that, the preset values may be adjusted according to the needs, and the present invention is not limited thereto.
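The selection rule of step S12 with the default thresholds above can be sketched in Python as follows. This is a minimal illustration: the `FaceSample` record, its field names and the `split_samples` function are hypothetical, and the quality score and deviation angle are assumed to have already been produced by the two evaluation models, which the text does not specify.

```python
from dataclasses import dataclass

# Default thresholds from this embodiment; all four may be adjusted.
HARD_MAX_QUALITY = 30.0  # first preset face image quality score
HARD_MIN_ANGLE = 40.0    # first preset face deviation angle value (degrees)
EASY_MIN_QUALITY = 60.0  # second preset face image quality score
EASY_MAX_ANGLE = 15.0    # second preset face deviation angle value (degrees)


@dataclass(frozen=True)
class FaceSample:
    # Hypothetical record holding precomputed model outputs.
    image_id: str
    quality: float
    angle: float


def split_samples(samples):
    """Return (first_difficult, first_simple) lists following step S12."""
    difficult, simple = [], []
    for s in samples:
        if s.quality < HARD_MAX_QUALITY or s.angle > HARD_MIN_ANGLE:
            difficult.append(s)
        elif s.quality > EASY_MIN_QUALITY and s.angle < EASY_MAX_ANGLE:
            simple.append(s)
        # Samples that satisfy neither rule are left out of both sets.
    return difficult, simple
```

With these defaults the two rules cannot both hold for one image, and images in the middle band (for example quality 50, angle 20) fall into neither set.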

S13, respectively carrying out data enhancement processing on each first simple sample image by utilizing a combined data enhancement mode to obtain a plurality of second difficult sample images.

Wherein, (1) the combined data enhancement mode is at least one of noise enhancement, blurring enhancement, picture compression enhancement and mask enhancement. (2) Noise enhancement includes Gaussian noise, salt-and-pepper noise, Laplace noise, etc. (3) Blur enhancement includes Gaussian blur, mean blur, motion blur, interpolation blur, etc. (4) Each first simple sample image corresponds to one second difficult sample image, so the number of first simple sample images is the same as the number of second difficult sample images.
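A combined enhancement of this kind might be sketched with NumPy as below. The sketch covers Gaussian noise, mean blur and a mask-like occlusion only; picture compression is omitted because it needs an image codec. Images are assumed to be `uint8` arrays of shape height x width x 3, and every function name and parameter value (`sigma`, `k`, `fraction`, the 0.5 selection probability) is an illustrative assumption rather than a value from the text.

```python
import numpy as np


def add_gaussian_noise(img, rng, sigma=10.0):
    """Add zero-mean Gaussian noise and clip back to the uint8 range."""
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def mean_blur(img, k=3):
    """Box blur with a k x k window (k odd), using edge padding."""
    pad = k // 2
    padded = np.pad(img.astype(np.float32),
                    ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    h, w = img.shape[:2]
    out = np.zeros(img.shape, dtype=np.float32)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return np.round(out / (k * k)).astype(np.uint8)


def mask_lower_face(img, fraction=0.4):
    """Cover the bottom rows with a flat block to imitate a face mask."""
    out = img.copy()
    start = int(img.shape[0] * (1.0 - fraction))
    out[start:] = 0
    return out


def make_difficult(img, rng):
    """Apply a random non-empty subset of the enhancements to one image."""
    ops = [lambda x: add_gaussian_noise(x, rng), mean_blur, mask_lower_face]
    chosen = [op for op in ops if rng.random() < 0.5]
    if not chosen:
        chosen = [ops[int(rng.integers(len(ops)))]]
    for op in chosen:
        img = op(img)
    return img
```

Calling `make_difficult(image, np.random.default_rng(0))` once per first simple sample image yields one second difficult sample image per input, matching item (4).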

Preferably, as shown in fig. 3, step S2 includes:

S21, combining any two difficult sample images with the same sample attribute information in the target difficult sample set to obtain a plurality of target difficult positive sample pairs, combining any two difficult sample images with different sample attribute information in the target difficult sample set to obtain a plurality of first difficult negative sample pairs, and determining the first difficult negative sample pairs with the sample pair similarity within a preset similarity range as target difficult negative sample pairs.

Wherein, (1) the sample attribute information is the sample image id information. (2) The preset similarity range defaults to [0, 0.2]. (3) The similarity between the two images of a sample pair is calculated using cosine similarity, whose value range is [-1, 1]; a larger value indicates greater similarity.

It should be noted that (1) different sample image id information represents face images of different persons. (2) Compared with the existing approach of taking the pairs with maximum similarity, this semi-hard strategy can mine more valuable samples: the pre-training face recognition model already distinguishes negative sample pairs with similarity below 0 well, so those pairs are discarded; negative sample pairs with similarity above 0.2 are also discarded, to limit the influence of noisy or mislabeled data.
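The pair construction and semi-hard filtering described above can be sketched in Python as follows. The sketch assumes each sample is given as an `(identity_id, embedding)` tuple whose embedding comes from the pre-training face recognition model; the text does not state which features the similarity is computed on, so that choice, like the function names, is an assumption.

```python
from itertools import combinations

import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity of two vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def build_pairs(samples, low=0.0, high=0.2):
    """Build positive and semi-hard negative pairs from one sample set.

    samples: list of (identity_id, embedding) tuples.
    Returns (positive_pairs, negative_pairs) as lists of index tuples.
    """
    positives, negatives = [], []
    for (i, (id_a, emb_a)), (j, (id_b, emb_b)) in combinations(enumerate(samples), 2):
        if id_a == id_b:
            positives.append((i, j))
        elif low <= cosine_similarity(emb_a, emb_b) <= high:
            # Keep only negatives inside the preset similarity range.
            negatives.append((i, j))
    return positives, negatives
```

The same function would be applied once to the target difficult sample set and once to the target simple sample set.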

Preferably, as shown in fig. 4, step S3 includes:

S31, acquiring difficult positive sample pair distribution characteristics of all target difficult positive sample pairs, difficult negative sample pair distribution characteristics of all target difficult negative sample pairs, simple positive sample pair distribution characteristics of all target simple positive sample pairs and simple negative sample pair distribution characteristics of all target simple negative sample pairs by utilizing a pre-training face recognition model.

The process of extracting the sample pair distribution characteristics by using the pre-training face recognition model is the prior art, and is not repeated here.

S32, inputting the difficult positive sample pair distribution characteristics, the difficult negative sample pair distribution characteristics, the simple positive sample pair distribution characteristics and the simple negative sample pair distribution characteristics into the distributed distillation loss function to obtain the distillation loss value, and optimizing parameters of the pre-training face recognition model by utilizing the simple positive sample pair distribution characteristics and the simple negative sample pair distribution characteristics to obtain the optimized pre-training face recognition model.

And S33, optimizing parameters of the face recognition student model based on the distillation loss value to obtain an optimized face recognition student model, taking the optimized face recognition student model as the face recognition student model, taking the optimized pre-training face recognition model as the pre-training face recognition model, and returning to the step S31 until the optimized face recognition student model meets the preset iteration training condition, and determining the optimized face recognition student model as the target face recognition model.

The preset iterative training condition is that the maximum number of training iterations is reached. In this embodiment, the maximum number of training iterations is 6; it may be adjusted as required and is not limited here.
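The control flow of steps S31 to S33 can be outlined in Python as below. This is only a skeleton: every callable passed in (`extract`, `loss_fn`, `update_student`, `update_teacher`) is a hypothetical stand-in for an operation the text describes but does not specify, and the dictionary keys are illustrative names.

```python
def run_distillation(teacher, student, extract, loss_fn,
                     update_student, update_teacher, max_iterations=6):
    """Iterate steps S31-S33 until the maximum iteration count is reached.

    extract(teacher) -> dict with keys 'hard_pos', 'hard_neg',
                        'easy_pos', 'easy_neg' (pair distributions)
    loss_fn(dists) -> distillation loss value
    update_student(student, loss) -> optimized student model
    update_teacher(teacher, easy_pos, easy_neg) -> optimized teacher model
    """
    for _ in range(max_iterations):
        dists = extract(teacher)                         # S31
        loss = loss_fn(dists)                            # S32: loss value
        teacher = update_teacher(teacher,                # S32: teacher update
                                 dists["easy_pos"], dists["easy_neg"])
        student = update_student(student, loss)          # S33
    return student  # target face recognition model
```

The default of 6 iterations mirrors this embodiment; the returned student plays the role of the target face recognition model.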

Further, as shown in Fig. 5, the histograms formed by the difficult positive sample pair distribution feature, the difficult negative sample pair distribution feature, the simple positive sample pair distribution feature and the simple negative sample pair distribution feature are obtained as follows:

[histogram formula image not reproduced]

wherein $s_{ij}$ is the similarity value between sample pairs; $t_1 = -1, t_2, \ldots, t_R = +1$ are distributed over $[-1, 1]$; the remaining symbols, which are formula images in the original, denote the statistical interval and the statistic for each interval; and $P$ is the resulting distribution.
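A distribution of this kind can be approximated in Python with a normalized histogram over $[-1, 1]$. The sketch below assumes evenly spaced nodes and hard binning, neither of which is confirmed by the text; `num_nodes` stands in for $R$ and the function name is hypothetical.

```python
import numpy as np


def similarity_histogram(similarities, num_nodes=11):
    """Normalized histogram of pair similarities over [-1, 1].

    Nodes t_1 = -1, ..., t_R = +1 are evenly spaced (an assumption),
    which gives R - 1 bins; the result sums to 1 when input is non-empty.
    """
    nodes = np.linspace(-1.0, 1.0, num_nodes)
    counts, _ = np.histogram(np.clip(similarities, -1.0, 1.0), bins=nodes)
    total = counts.sum()
    return counts / total if total else counts.astype(float)
```

Hard binning is not differentiable, so a training implementation would likely need a soft assignment of each similarity to neighbouring nodes; this version is only meant to show what the resulting distribution looks like.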

According to the technical scheme, the face sample set is divided, and the sample distribution characteristics are subjected to distributed distillation through the distributed distillation loss function, so that the effective identification of the faces in difficult samples is improved.

Fig. 6 shows a schematic structural diagram of an embodiment of a face recognition system based on distributed distillation according to the present invention. As shown in fig. 6, the system 200 includes: a first processing module 210, a second processing module 220, a training module 230, and an identification module 240.

The first processing module 210 is configured to: according to the face sample set, determining a target difficult sample set and a target simple sample set;

the second processing module 220 is configured to: constructing a plurality of target difficult positive sample pairs and target difficult negative sample pairs according to the target difficult sample set, and constructing a plurality of target simple positive sample pairs and target simple negative sample pairs according to the target simple sample set;

the training module 230 is configured to: obtaining difficult positive sample pair distribution characteristics of all target difficult positive sample pairs, difficult negative sample pair distribution characteristics of all target difficult negative sample pairs, simple positive sample pair distribution characteristics of all target simple positive sample pairs and simple negative sample pair distribution characteristics of all target simple negative sample pairs by using a pre-training face recognition model, and carrying out iterative training on a face recognition student model provided with a distributed distillation loss function to obtain a target face recognition model;

the identification module 240 is configured to: input the image to be detected into the target face recognition model to obtain a face recognition result of the image to be detected.

Preferably, the first processing module 210 is specifically configured to:

Preferably, the second processing module 220 is specifically configured to:

For the parameters of the face recognition system 200 based on distributed distillation in this embodiment and the steps by which its modules implement their functions, refer to the above embodiments of the face recognition method based on distributed distillation; they are not repeated here.

The storage medium provided by an embodiment of the invention stores instructions which, when read by a computer, cause the computer to perform the steps of the face recognition method based on distributed distillation. For details, refer to the parameters and steps in the above embodiments of the method, which are not repeated here.

Examples of computer storage media include flash drives and portable hard disks.

Those skilled in the art will appreciate that the present invention may be implemented as a method, system, and storage medium.

Thus, the invention may be embodied in the following forms: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software, referred to herein generally as a "circuit," "module" or "system." Furthermore, in some embodiments, the invention may also be embodied in the form of a computer program product in one or more computer-readable media, which contain computer-readable program code. Any combination of one or more computer-readable media may be employed. The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. The computer-readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. While embodiments of the present invention have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the invention, and that changes, modifications, substitutions and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the invention.

Claims

1. A face recognition method based on distributed distillation, comprising:

2. The distributed distillation-based face recognition method according to claim 1, wherein step S1 comprises:

3. The distributed distillation-based face recognition method according to claim 1, wherein step S2 comprises:

4. The distributed distillation-based face recognition method according to claim 1, wherein step S3 comprises:

5. The distributed distillation-based face recognition method according to claim 4, wherein the number of target difficult positive sample pairs, the number of target difficult negative sample pairs, the number of target simple positive sample pairs, and the number of target simple negative sample pairs are the same, and the distributed distillation loss function is:

$$L = L_{KL} + L_{order} + L_{arc}$$

wherein [formulas defining the individual terms not reproduced]

$\lambda_1 = 0.1$, $\lambda_2 = 0.02$, $\lambda_3 = 0.5$; $S$ denotes the number of sample pairs; $P^+$ is the simple positive sample pair distribution characteristic; $P^-$ is the simple negative sample pair distribution characteristic; $Q^+$ is the difficult positive sample pair distribution characteristic; $Q^-$ is the difficult negative sample pair distribution characteristic; $P^+(s)$ is the positive sample pair characteristic of the $s$-th target simple positive sample pair; $P^-(s)$ is the negative sample pair characteristic of the $s$-th target simple negative sample pair; $Q^+(s)$ is the positive sample pair characteristic of the $s$-th target difficult positive sample pair; $Q^-(s)$ is the negative sample pair characteristic of the $s$-th target difficult negative sample pair; [symbol not reproduced] represents the $i$-th positive sample pair feature; [symbol not reproduced] represents the $j$-th negative sample pair feature; [symbol not reproduced] represents the mean value of [symbol not reproduced]; and [symbol not reproduced] represents [...]
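As a rough illustration of how a KL-type term over the distribution characteristics in claim 5 might be computed, the sketch below compares the simple-pair distributions ($P^+$, $P^-$) with the difficult-pair distributions ($Q^+$, $Q^-$) over $S$ entries. The exact expression of $L_{KL}$ is not reproduced in this text, so the standard discrete Kullback-Leibler divergence, the direction D(P || Q), and the use of λ₁ = 0.1 as the weight of this term are assumptions; the function names `normalize`, `kl_divergence` and `kl_term` are hypothetical.

```python
import math

# Weight value taken from claim 5; attaching it to the KL term is an assumption.
LAMBDA_1 = 0.1


def normalize(values, eps=1e-12):
    """Scale non-negative values to sum to 1; clamp entries to eps to keep log() finite."""
    total = sum(values)
    if total <= 0:
        raise ValueError("values must have a positive sum")
    return [max(v / total, eps) for v in values]


def kl_divergence(p, q):
    """Standard discrete KL divergence D(p || q) over S matching entries."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length S")
    p, q = normalize(p), normalize(q)
    return sum(ps * math.log(ps / qs) for ps, qs in zip(p, q))


def kl_term(p_pos, p_neg, q_pos, q_neg, weight=LAMBDA_1):
    """Assumed form: weight * (D(P+ || Q+) + D(P- || Q-))."""
    return weight * (kl_divergence(p_pos, q_pos) + kl_divergence(p_neg, q_neg))
```

Under these assumptions the term is zero when the difficult-pair distributions coincide with the simple-pair distributions and grows as they diverge, which matches the stated aim of pulling the difficult-sample distributions toward the simple-sample ones.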

6. The distributed distillation-based face recognition method according to any one of claims 1-5, wherein the pre-trained face recognition model is a ResNet50 network model for face recognition that has been pre-trained on the face sample set.

7. A distributed distillation-based face recognition system, comprising: a first processing module, a second processing module, a training module and an identification module;

8. The distributed distillation based face recognition system of claim 7, wherein the first processing module is specifically configured to:

9. The distributed distillation based face recognition system of claim 7, wherein the second processing module is specifically configured to:

10. A storage medium having instructions stored therein, which when read by a computer, cause the computer to perform the distributed distillation based face recognition method according to any one of claims 1 to 6.