CN114863181A - Gender classification method and system based on prediction probability knowledge distillation - Google Patents

Gender classification method and system based on prediction probability knowledge distillation

Info

Publication number
CN114863181A
CN114863181A (application CN202210556798.2A)
Authority
CN
China
Prior art keywords
gender
network
teacher
student
network model
Prior art date
Legal status
Pending
Application number
CN202210556798.2A
Other languages
Chinese (zh)
Inventor
黄陶冶
王麒
陈帅斌
蒋泽飞
Current Assignee
Hangzhou Denghong Technology Co ltd
Original Assignee
Hangzhou Denghong Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Denghong Technology Co., Ltd.
Priority to CN202210556798.2A
Publication of CN114863181A
Legal status: Pending


Classifications

    • G06F18/2415: Pattern recognition; classification techniques based on parametric or probabilistic models, e.g. likelihood ratio or false acceptance rate versus false rejection rate
    • G06N3/045: Computing arrangements based on biological models; neural networks; combinations of networks
    • G06N3/08: Computing arrangements based on biological models; neural networks; learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a gender classification method and system based on prediction probability knowledge distillation. The method comprises the following steps: acquiring an image to be predicted, constructing a ResNet18 network as the teacher backbone network, and constructing a CNN as the student network; configuring a gender classifier for the teacher network, setting the classification categories, and constructing the weight matrix of the teacher network classifier; configuring a gender classifier for the student network, setting the same classification categories as the teacher network, and constructing the weight matrix of the student network classifier; calculating the gender classification loss and the gender distillation loss of the teacher network and the student network with a loss function; obtaining the gender prediction probability vectors that the teacher network and the student network each output for the input image; and calculating the total distillation loss from the gender prediction probability vectors, the gender classification loss, and the gender distillation loss, then updating the parameters of the teacher network and the student network through a gradient descent algorithm.

Description

Gender classification method and system based on prediction probability knowledge distillation
Technical Field
The invention relates to the technical field of image recognition, in particular to a gender classification method and system based on prediction probability knowledge distillation.
Background
In the security industry, the human face is one of the most important biometric features; a face image contains attribute information such as gender, race, age, and identity. Among these attributes, gender has received particular attention from researchers as a fundamental classification problem. In general, when training data are sufficient, a larger model yields better classification performance. However, as the model grows, so do its computational requirements: on low-compute devices a large model often needs long inference times and cannot meet practical production requirements. It is therefore valuable to design relatively small models for low-compute devices while maximizing classification accuracy.
Disclosure of Invention
One object of the invention is to provide a gender classification method and system based on prediction probability knowledge distillation, in which a larger gender recognition model serves as a teacher network, a smaller gender recognition model serves as a student network, and the teacher network guides the learning of the student network, so that the smaller gender recognition model can be deployed on low-compute security devices.
Another object of the invention is to provide a gender classification method and system based on prediction probability knowledge distillation in which, through the knowledge distillation method, the prediction accuracy of the small model approaches that of the large recognition model on the gender recognition task.
Another object of the present invention is to provide a gender classification method and system based on prediction probability knowledge distillation that requires no additional training data, no more elaborate gender recognition model, and no change to the structure of the small model on the low-cost device, thereby greatly reducing the cost of gender recognition.
To achieve at least one of the above objects, the present invention provides a gender classification method based on prediction probability knowledge distillation, the method comprising the steps of:
acquiring an image to be predicted, constructing a ResNet18 network as the teacher backbone network, and constructing a CNN as the student network;
configuring a gender classifier for the teacher network, setting the classification categories, and constructing the weight matrix of the teacher network classifier;
configuring a gender classifier for the student network, setting the same classification categories as the teacher network, and constructing the weight matrix of the student network classifier;
calculating the gender classification loss and the gender distillation loss of the teacher network and the student network with a loss function;
obtaining the gender prediction probability vectors that the teacher network and the student network each output for the input image;
and calculating the total distillation loss from the gender prediction probability vectors, the gender classification loss, and the gender distillation loss, and updating the parameters of the teacher network and the student network through a gradient descent algorithm to obtain the student network model that finally outputs the gender prediction.
According to a preferred embodiment of the present invention, the method comprises training the teacher network model as follows: acquiring images to be trained to construct a teacher network training sample set, inputting the sample set into the ResNet18 network, configuring the network parameters to obtain the teacher network model, and calculating the gender classification loss of the teacher network model:

L1 = -(1/m) Σ_{i=1}^{m} log( exp(W_t^{y_i} · f_t(x_i) + bt_{y_i}) / Σ_{j=1}^{n} exp(W_t^{j} · f_t(x_i) + bt_j) )

where the input sample set is S = {x_1, x_2, …, x_m}; m is the total number of training images and x_i is the i-th training image; y_i is the gender label of x_i; n is the number of classification categories, configured as 2 for gender; f_t denotes the backbone network of the teacher network; W_t is the teacher network classifier weight matrix and W_t^{y} its y-th column; bt_y is the bias of the y-th category. This loss function yields the gender classification loss L1 of the teacher network model.
According to another preferred embodiment of the invention, a stochastic gradient descent algorithm is adopted as the training optimizer: the teacher network model is trained according to its gender classification loss, and the classification weights of the teacher network model are thereby obtained and updated.
According to another preferred embodiment of the present invention, the method comprises constructing a student network model as follows: a CNN of depth 5 with 3×3 convolutions is adopted as the backbone network of the student network, and the student network model is constructed with

L2 = -(1/m) Σ_{i=1}^{m} log( exp(W_s^{y_i} · f_s(x_i) + bs_{y_i}) / Σ_{j=1}^{n} exp(W_s^{j} · f_s(x_i) + bs_j) )

where m is the total number of images in one training batch; x_i is the i-th training image; y_i is the gender label of x_i; f_s is the backbone network of the student network; n is the number of classification categories, configured as 2 for gender; W_s is the student network classifier weight matrix and W_s^{y} its y-th column; bs_y is the bias of the y-th category. The loss function yields the gender classification loss L2 of the student network.
According to another preferred embodiment of the invention, the teacher network model and the student network model respectively calculate the corresponding gender classification loss by adopting a Softmax loss function.
According to another preferred embodiment of the invention, after the gender prediction probability vectors that the teacher network and the student network each output for the input image are obtained, the distillation loss of the student network model with respect to the teacher network model is calculated:

L3 = -(1/m) Σ_{i=1}^{m} Σ_{j=1}^{n} σ_j(p_t(x_i)/T) · log σ_j(p_s(x_i)/T)

where σ_j(·) denotes the j-th element of the softmax function; T is a temperature coefficient set to 10; and p_s(x_i) and p_t(x_i) are the prediction vectors output by the student network classifier and the teacher network classifier, respectively, for the same input sample.
According to another preferred embodiment of the present invention, the prediction vectors output by the student network classifier and the teacher network classifier are respectively calculated as:

p_s(x_i) = W_s · f_s(x_i) + bs
p_t(x_i) = W_t · f_t(x_i) + bt

where p_s^j(x_i) is the j-th element of the student network prediction vector and p_t^j(x_i) is the j-th element of the teacher network prediction vector.
According to another preferred embodiment of the present invention, the total distillation loss is calculated from the gender prediction probability vectors, the gender classification loss, and the gender distillation loss as L4 = α·L2 + (1 - α)·L3, where α is a constant in the range (0, 1).
To achieve at least one of the above objects, the present invention further provides a gender classification system based on predictive probabilistic knowledge distillation, which performs the above gender classification method based on predictive probabilistic knowledge distillation.
The present invention further provides a computer-readable storage medium having stored thereon a computer program executable by a processor to perform the method for gender classification based on predictive probabilistic knowledge distillation.
Drawings
FIG. 1 is a schematic flow chart of a gender classification method based on predictive probability knowledge distillation according to the invention.
FIG. 2 is a schematic diagram showing the structure of a gender classification system based on predictive probability knowledge distillation according to the present invention.
Detailed Description
The following description is presented to disclose the invention so as to enable any person skilled in the art to practice the invention. The preferred embodiments in the following description are given by way of example only, and other obvious variations will occur to those skilled in the art. The basic principles of the invention, as defined in the following description, may be applied to other embodiments, variations, modifications, equivalents, and other technical solutions without departing from the spirit and scope of the invention.
It will be understood by those skilled in the art that in the present disclosure, the terms "longitudinal," "lateral," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like are used in an orientation or positional relationship indicated in the drawings for ease of description and simplicity of description, and do not indicate or imply that the referenced devices or components must be in a particular orientation, constructed and operated in a particular orientation, and thus the above terms are not to be construed as limiting the present invention.
It is to be understood that the terms "a" and "an" should be interpreted as "at least one": an element recited in the singular may appear once in one embodiment and multiple times in another, and these terms are not to be construed as limiting the number.
Referring to FIGS. 1-2, the invention discloses a gender classification method and system based on prediction probability knowledge distillation, which aims to solve the problem that existing gender recognition models in the security field are too large and unsuited to the storage and compute constraints of small devices such as cameras.
First, image samples to be trained are acquired, for example via crawling or from existing security equipment, and then preprocessed. Preprocessing includes, but is not limited to, data cleaning: existing face recognition or human body recognition technology is used to delete impurity images that contain neither facial nor body features. The cleaned samples are used to construct a training sample set, a validation set, and a test set. The training sample set is used to update and determine learnable parameters such as model weights and biases via gradient descent; the validation set is used to determine hyperparameters including, but not limited to, the number of network layers, the number of nodes, the number of iterations, and the learning rate; the test set is used to measure the predictive performance of the finalized model. It should be noted that the construction and roles of training, validation, and test sets are conventional in machine learning and are not described in detail here.
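As a concrete illustration of the dataset split described above, the following sketch partitions a cleaned list of image samples into training, validation, and test sets. The 80/10/10 ratio, the fixed seed, and the function name are illustrative assumptions; the text does not fix them.

```python
import random

def split_dataset(samples, train=0.8, val=0.1, seed=0):
    """Split a cleaned list of face-image paths into train/val/test sets.

    The 80/10/10 ratio and seed are illustrative assumptions, not from
    the text; shuffling before splitting avoids ordering bias.
    """
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])
```

The three returned lists are disjoint and together cover the input, so no sample leaks between training and evaluation.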
Specifically, the teacher network model and the student network model are established separately. The teacher network model is generally a large network; a well-performing ResNet18 is preferably selected as its backbone, and the teacher network is trained with a Softmax loss. For the pre-constructed training sample set, define S = {x_1, x_2, …, x_m} as one batch of training data, and calculate the gender classification loss L1 of the teacher network:

L1 = -(1/m) Σ_{i=1}^{m} log( exp(W_t^{y_i} · f_t(x_i) + bt_{y_i}) / Σ_{j=1}^{n} exp(W_t^{j} · f_t(x_i) + bt_j) )

where m is the total number of training images and x_i is the i-th training image; y_i is the gender label of x_i; n is the number of classification categories, which is 2 for gender; f_t denotes the backbone network of the teacher network; W_t is the teacher network classifier weight matrix and W_t^{y} its y-th column; bt_y is the bias of the y-th category. A mini-batch stochastic gradient descent algorithm (mini-batch SGD) is then adopted as the training optimizer: the teacher network model is trained according to this classification loss, and the teacher classification weights are obtained to update and determine the learnable parameters of the teacher network model.
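The teacher loss L1 above is an ordinary softmax cross-entropy over the batch. A minimal pure-Python sketch follows; the function name and list-based interface are our own, since the text only specifies the loss formula:

```python
import math

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy, the form of the losses L1/L2 in the text.

    logits: per-sample score vectors W.f(x_i) + b (length n; n = 2 for gender)
    labels: ground-truth class indices y_i
    """
    total = 0.0
    for scores, y in zip(logits, labels):
        mx = max(scores)  # shift scores for numerical stability
        log_z = mx + math.log(sum(math.exp(s - mx) for s in scores))
        total += log_z - scores[y]  # equals -log softmax(scores)[y]
    return total / len(labels)
```

With completely uninformative scores ([0, 0]) the loss is log 2 ≈ 0.693, the expected baseline for a two-class problem, and it approaches 0 as the correct class dominates.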
After the teacher network model is built and trained, a student network model for gender recognition is constructed. The student model must adopt a small network structure so that it fits small security devices including, but not limited to, cameras. A CNN of depth 5 with 3×3 convolution windows is preferred as the backbone of the student network model. The student network model is trained with the same Softmax loss function, and its learnable parameters are updated with a gradient descent algorithm. The gender classification loss L2 of the student network is calculated as

L2 = -(1/m) Σ_{i=1}^{m} log( exp(W_s^{y_i} · f_s(x_i) + bs_{y_i}) / Σ_{j=1}^{n} exp(W_s^{j} · f_s(x_i) + bs_j) )

where m is the total number of images in one training batch; x_i is the i-th training image; y_i is the gender label of x_i; f_s is the backbone network of the student network; n is the number of classification categories, configured as 2 for gender; W_s is the student network classifier weight matrix and W_s^{y} its y-th column; bs_y is the bias of the y-th category. Mini-batch SGD is again used as the training optimizer: the student network model is trained according to this classification loss, and the student classification weights are obtained to update and determine its learnable parameters. It should be noted that the classifier weight matrix is a standard component of the CNN itself, set according to the classification categories and their number, and is not described in detail here.
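To make the size of the student backbone concrete, the sketch below traces feature-map shapes through a depth-5, 3×3-convolution CNN. Only the depth and kernel size come from the text; the stride, padding, channel widths, and the 112×112 input used in the example are illustrative assumptions.

```python
def student_backbone_shapes(h, w, depth=5, k=3, stride=2,
                            channels=(16, 32, 64, 128, 256)):
    """Trace (channels, height, width) after each conv layer of the
    student CNN: depth 5 and 3x3 kernels per the text; stride, padding,
    and channel widths are assumed for illustration."""
    shapes = []
    pad = k // 2  # 'same'-style padding before stride-2 downsampling
    for c_out in channels[:depth]:
        h = (h + 2 * pad - k) // stride + 1
        w = (w + 2 * pad - k) // stride + 1
        shapes.append((c_out, h, w))
    return shapes
```

For a 112×112 input this yields a 256×4×4 feature map after the fifth layer, small enough to feed a lightweight two-class gender classifier head.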
After the teacher network model and the student network model have been constructed and trained, the same batch of training data S is input into both trained models, and the gender classification loss L1 of the teacher network model and the gender classification loss L2 of the student network model are calculated. A knowledge distillation algorithm is then adopted so that the gender prediction quality of the student network model approaches that of the large ResNet18 network. With the student network model and the knowledge distillation method, the small model can approach the large ResNet18 network on the predicted probabilities without any change to its model structure.
Referring to FIG. 2, prediction probability distillation in the present invention means that, on top of the gender classification losses, the prediction probabilities of the teacher network guide the gender probabilities output by the student network, so that the student network learns the probability distribution of the teacher network. Specifically, the losses of the student network model comprise the gender classification loss L2 and the distillation loss L3:

L3 = -(1/m) Σ_{i=1}^{m} Σ_{j=1}^{n} σ_j(p_t(x_i)/T) · log σ_j(p_s(x_i)/T)

where σ_j(·) denotes the j-th element of the softmax function and T is a temperature coefficient set to 10. Here p_s(x_i) and p_t(x_i) are the prediction vectors output by the student network classifier and the teacher network classifier for the same input sample:

p_s(x_i) = W_s · f_s(x_i) + bs
p_t(x_i) = W_t · f_t(x_i) + bt

with p_s^j(x_i) the j-th element of the student network prediction vector and p_t^j(x_i) the j-th element of the teacher network prediction vector.
After the distillation loss of the student network model with respect to the teacher network has been computed, the total knowledge distillation loss L4 is calculated from the student classification loss L2 and the distillation loss L3:

L4 = α·L2 + (1 - α)·L3

where α is a constant in the range (0, 1); α = 0.5 is preferred in the present invention. The knowledge distillation method further comprises: fixing the parameters of the teacher network model and its classifier, and updating the parameters of the student network model and its classifier according to the Softmax loss, to obtain the final student network model after knowledge distillation. This final student network model, including but not limited to its weights and biases, serves as the model for predicting the gender of persons in images and is stored on the security device.
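The distillation and total losses described above can be sketched as follows. This is a standard temperature-scaled reading of L3 (soft cross-entropy between softened teacher and student outputs); the exact formula in the original filing is an unrecoverable image, so treat the formulation as an assumption apart from T = 10 and α = 0.5, which the text states.

```python
import math

def softened(logits, T):
    """Temperature-scaled softmax sigma(z / T), numerically stabilized."""
    mx = max(logits)
    exps = [math.exp((z - mx) / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, T=10.0):
    """L3: mean soft cross-entropy between softened teacher and student
    outputs over a batch; the text sets the temperature T to 10."""
    total = 0.0
    for zs, zt in zip(student_logits, teacher_logits):
        p_t = softened(zt, T)
        p_s = softened(zs, T)
        total += -sum(t * math.log(s) for t, s in zip(p_t, p_s))
    return total / len(student_logits)

def total_loss(l2, l3, alpha=0.5):
    """L4 = alpha * L2 + (1 - alpha) * L3; the text prefers alpha = 0.5."""
    return alpha * l2 + (1 - alpha) * l3
```

When the student's logits coincide with the teacher's, L3 reduces to the entropy of the softened teacher distribution, which is its minimum over the student's parameters, so gradient descent pulls the student toward the teacher's probability distribution.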
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via a communication section, and/or installed from a removable medium. The computer program, when executed by a central processing unit (CPU), performs the functions defined in the method of the present application. It should be noted that the computer readable medium mentioned in the present application may be a computer readable signal medium, a computer readable storage medium, or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In contrast, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It will be understood by those skilled in the art that the embodiments of the present invention described above and illustrated in the drawings are given by way of example only and not by way of limitation, the objects of the invention having been fully and effectively achieved, the functional and structural principles of the present invention having been shown and described in the embodiments, and that various changes or modifications may be made in the embodiments of the present invention without departing from such principles.

Claims (10)

1. A gender classification method based on prediction probability knowledge distillation, the method comprising the steps of:
acquiring an image to be predicted, constructing a ResNet18 network as the teacher backbone network, and constructing a CNN as the student backbone network;
configuring a gender classifier for the teacher network model, setting the classification categories, and constructing the weight matrix of the teacher network classifier;
configuring a gender classifier for the student network model, setting the same classification categories as the teacher network, and constructing the weight matrix of the student network classifier;
calculating the gender classification loss and the gender distillation loss of the teacher network model and the student network model with a loss function;
obtaining the gender prediction probability vectors that the teacher network model and the student network model each output for the input image to be predicted;
and calculating the total distillation loss from the gender prediction probability vectors, the gender classification loss, and the gender distillation loss, and updating the parameters of the teacher network model and the student network model through a gradient descent algorithm to obtain the student network model that finally outputs the gender prediction.
2. The gender classification method based on prediction probability knowledge distillation as claimed in claim 1, wherein the method comprises training the teacher network model, the training comprising: acquiring images to be trained to construct a teacher network training sample set, inputting the sample set into the ResNet18 network, configuring the network parameters to obtain the teacher network model, and calculating the gender classification loss of the teacher network model:

L1 = -(1/m) Σ_{i=1}^{m} log( exp(W_t^{y_i} · f_t(x_i) + bt_{y_i}) / Σ_{j=1}^{n} exp(W_t^{j} · f_t(x_i) + bt_j) )

where the input sample set is S = {x_1, x_2, …, x_m}; m is the total number of training images and x_i is the i-th training image; y_i is the gender label of x_i; n is the number of classification categories, which is 2 for gender; f_t denotes the backbone network of the teacher network; W_t is the teacher network classifier weight matrix and W_t^{y} its y-th column; bt_y is the bias of the y-th category; the loss function yields the gender classification loss L1 of the teacher network model.
3. The gender classification method based on prediction probability knowledge distillation as claimed in claim 1, wherein a stochastic gradient descent algorithm is adopted as the training optimizer: the teacher network model is trained according to its gender classification loss, and the classification weights of the teacher network model are obtained and updated.
4. The gender classification method based on prediction probability knowledge distillation as claimed in claim 2, wherein the method comprises constructing a student network model, the construction comprising: adopting a CNN of depth 5 with 3×3 convolutions as the backbone network of the student network, and constructing the student network model with

L2 = -(1/m) Σ_{i=1}^{m} log( exp(W_s^{y_i} · f_s(x_i) + bs_{y_i}) / Σ_{j=1}^{n} exp(W_s^{j} · f_s(x_i) + bs_j) )

where m is the total number of images in one training batch; x_i is the i-th training image; y_i is the gender label of x_i; f_s is the backbone network of the student network; n is the number of classification categories, configured as 2 for gender; W_s is the student network classifier weight matrix and W_s^{y} its y-th column; bs_y is the bias of the y-th category; the loss function yields the gender classification loss L2 of the student network.
5. The gender classification method based on prediction probability knowledge distillation according to claim 4, wherein the teacher network model and the student network model each calculate their gender classification loss using a Softmax loss function.
6. The gender classification method based on prediction probability knowledge distillation according to claim 5, wherein, after the gender prediction probability vectors output by the teacher network and the student network for the same input image to be predicted are obtained, the distillation loss of the student network model with respect to the teacher network model is further calculated as

L_3 = -(1/m) Σ_{i=1}^{m} Σ_{j=1}^{n} softmax(p_t(x_i)/T)_j · log softmax(p_s(x_i)/T)_j;

wherein T is a temperature coefficient set to 10, and p_s(x_i) and p_t(x_i) are the prediction probability vectors output by the student network classifier and the teacher network classifier, respectively, after the same sample data is input.
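The temperature-scaled distillation loss of claim 6 can be sketched as follows, assuming the common softened-cross-entropy form of knowledge distillation (the softmax of each logit vector divided by T); the function and variable names are illustrative, not from the patent.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=10.0):
    """Soften both outputs with temperature T, then take the cross-entropy
    of the student's soft predictions against the teacher's soft targets."""
    p_t = softmax(teacher_logits / T)   # teacher soft targets
    p_s = softmax(student_logits / T)   # student soft predictions
    return -(p_t * np.log(p_s + 1e-12)).sum(axis=1).mean()

# toy batch of m = 3 samples over n = 2 gender classes
s = np.array([[2.0, -1.0], [0.5, 0.5], [-1.0, 2.0]])   # student logits
t = np.array([[2.2, -0.8], [0.4, 0.6], [-1.1, 2.1]])   # teacher logits
L3 = distillation_loss(s, t, T=10.0)
print(L3)
```

A high temperature such as T = 10 flattens both distributions, so the student is trained on the relative (soft) class preferences of the teacher rather than its hard decisions.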
7. The gender classification method based on prediction probability knowledge distillation according to claim 6, wherein the prediction probability vectors output by the student network classifier and the teacher network classifier are calculated respectively as:

p_s(x_i) = W_s f_s(x_i) + b_s;
p_t(x_i) = W_t f_t(x_i) + b_t;

wherein p_s^j(x_i) is the j-th element of the student network's prediction probability vector, and p_t^j(x_i) is the j-th element of the teacher network's prediction probability vector.
8. The gender classification method based on prediction probability knowledge distillation according to claim 7, wherein the total distillation loss is calculated from the gender prediction probability vectors, the gender classification loss, and the gender distillation loss as L_4 = αL_2 + (1-α)L_3, wherein α is a constant with value range (0, 1).
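The total loss of claim 8 is a convex combination of the student's classification loss L_2 and the distillation loss L_3; a one-line sketch (the value α = 0.7 is an arbitrary illustrative choice, not specified by the patent):

```python
def total_loss(L2, L3, alpha=0.7):
    """Claim 8: L4 = alpha * L2 + (1 - alpha) * L3, with 0 < alpha < 1."""
    assert 0.0 < alpha < 1.0, "alpha must lie in the open interval (0, 1)"
    return alpha * L2 + (1.0 - alpha) * L3

print(total_loss(0.40, 0.10, alpha=0.7))
```

Larger α weights the student's own hard-label supervision more heavily; smaller α leans more on the teacher's soft targets.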
9. A gender classification system based on prediction probability knowledge distillation, wherein the system performs the gender classification method based on prediction probability knowledge distillation according to any one of claims 1-8.
10. A computer-readable storage medium, characterized in that it stores a computer program executable by a processor to perform the gender classification method based on prediction probability knowledge distillation according to any one of claims 1-8.
CN202210556798.2A 2022-05-19 2022-05-19 Gender classification method and system based on prediction probability knowledge distillation Pending CN114863181A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210556798.2A CN114863181A (en) 2022-05-19 2022-05-19 Gender classification method and system based on prediction probability knowledge distillation

Publications (1)

Publication Number Publication Date
CN114863181A true CN114863181A (en) 2022-08-05

Family

ID=82639837

Country Status (1)

Country Link
CN (1) CN114863181A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115908955A (en) * 2023-03-06 2023-04-04 之江实验室 Bird classification system, method and device for small-sample learning based on gradient distillation
CN115908955B (en) * 2023-03-06 2023-06-20 之江实验室 Gradient distillation-based bird classification system, method and device with less sample learning

Similar Documents

Publication Publication Date Title
CN110674880B (en) Network training method, device, medium and electronic equipment for knowledge distillation
US11755911B2 (en) Method and apparatus for training neural network and computer server
CN111737476B (en) Text processing method and device, computer readable storage medium and electronic equipment
US11544524B2 (en) Electronic device and method of obtaining emotion information
WO2022007823A1 (en) Text data processing method and device
US10769484B2 (en) Character detection method and apparatus
US11488013B2 (en) Model training method and apparatus
US11244671B2 (en) Model training method and apparatus
WO2021103761A1 (en) Compound property analysis method and apparatus, compound property analysis model training method, and storage medium
US11803731B2 (en) Neural architecture search with weight sharing
EP3820369B1 (en) Electronic device and method of obtaining emotion information
US11449731B2 (en) Update of attenuation coefficient for a model corresponding to time-series input data
CN113344206A (en) Knowledge distillation method, device and equipment integrating channel and relation feature learning
CN112905801A (en) Event map-based travel prediction method, system, device and storage medium
WO2023280113A1 (en) Data processing method, training method for neural network model, and apparatus
WO2023231753A1 (en) Neural network training method, data processing method, and device
CN113707299A (en) Auxiliary diagnosis method and device based on inquiry session and computer equipment
CN114863181A (en) Gender classification method and system based on prediction probability knowledge distillation
CN113407820A (en) Model training method, related system and storage medium
WO2023174064A1 (en) Automatic search method, automatic-search performance prediction model training method and apparatus
US20240005129A1 (en) Neural architecture and hardware accelerator search
CN113591781B (en) Image processing method and system based on service robot cloud platform
US20240143696A1 (en) Generating differentiable order statistics using sorting networks
Fitra et al. Deep transformer model with pre-layer normalization for covid-19 growth prediction
CN116109449A (en) Data processing method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination