CN117011649B

CN117011649B - Model training method and related device

Info

Publication number: CN117011649B
Application number: CN202311284896.6A
Authority: CN
Inventors: 沈雷
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2023-10-07
Filing date: 2023-10-07
Publication date: 2024-01-30
Anticipated expiration: 2043-10-07
Also published as: CN117011649A

Abstract

The application discloses a model training method and a related device, which can be applied to various scenarios such as cloud technology, artificial intelligence, intelligent transportation, and assisted driving. A first model with a complex model structure and high classification accuracy is trained, and an initial second model with a simple model structure learns the classification capability that the first model acquired in training. The strength with which the second model learns the first model's way of classifying each piece of information to be classified is determined by the credibility of the first model's classification of that information: the higher the credibility, the greater the learning strength, and the lower the credibility, the weaker the learning strength. The classification capability learned by the first model is thus selectively transferred during training, so that the trained second model classifies more accurately while keeping a simpler model structure.

Description

Model training method and related device

Technical Field

The present disclosure relates to the field of machine learning technologies, and in particular, to a model training method and a related device.

Background

Biometric identification is a widely used identification technology; for example, face recognition and fingerprint recognition are used to unlock mobile phones. Biometric identification mainly depends on the classification result that a classification model produces for the input feature information. The classification result identifies the probability that the feature information corresponds to each category, and the category with the highest probability is taken as the actual category corresponding to the feature information.

In the related art, in order to ensure the accuracy of biometric identification, a classification model with a relatively complex model structure is used for feature classification. Such a model generally has a relatively large number of layers and a relatively large model size, so although the accuracy of feature recognition can be ensured, the requirement on the processing performance of the equipment that runs the model is relatively high.

Therefore, the classification model in the related technology is difficult to apply to equipment with lower processing performance, so that the application range of the biological feature recognition technology is limited, and the wide application and popularization of the biological feature recognition technology are not facilitated.

Disclosure of Invention

In order to solve the technical problems, the application provides a model training method, and the classification model obtained through training by the method has a simple structure, can provide a precise classification function, reduces the performance requirement on equipment for running the classification model, and is beneficial to popularization and application of the classification model.

The embodiment of the application discloses the following technical scheme:

In a first aspect, embodiments of the present application disclose a model training method, the method comprising:

acquiring a first model after training and an initial second model to be trained, wherein the model structure complexity of the first model is greater than that of the initial second model, the first model is used for determining sample classification results corresponding to a plurality of pieces of information to be classified respectively, the sample classification results corresponding to target information to be classified are used for identifying actual categories corresponding to the target information to be classified, and the target information to be classified is any one of the plurality of pieces of information to be classified;

determining pending classification results corresponding to the plurality of pieces of information to be classified respectively through the initial second model, wherein the pending classification results corresponding to the target information to be classified are used for identifying the category corresponding to the target information to be classified, which is determined by the initial second model;

determining the credibility parameters of sample classification results corresponding to the plurality of pieces of information to be classified respectively, wherein the credibility parameters are used for representing the credibility of actual categories identified by the sample classification results;

adjusting model parameters corresponding to the initial second model according to the differences between the sample classification results and the pending classification results respectively corresponding to the plurality of pieces of information to be classified, to obtain a second model, wherein the second model is used for determining the actual category corresponding to information to be classified, and the strength with which the model parameters are adjusted according to the difference between the sample classification result and the pending classification result corresponding to the target information to be classified is positively correlated with the credibility characterized by the credibility parameter of the sample classification result corresponding to the target information to be classified.

In a second aspect, an embodiment of the present application discloses a model training apparatus, where the apparatus includes a first acquisition unit, a first determination unit, a second determination unit, and a training unit:

the first obtaining unit is configured to obtain a first model after training and an initial second model to be trained, where the model structure complexity of the first model is greater than that of the initial second model, the first model is configured to determine sample classification results corresponding to a plurality of pieces of information to be classified respectively, and the sample classification result corresponding to a target piece of information to be classified is used to identify an actual category corresponding to the target piece of information to be classified, and the target piece of information to be classified is any one of the plurality of pieces of information to be classified;

the first determining unit is configured to determine, according to the initial second model, pending classification results corresponding to the plurality of to-be-classified information respectively, where the pending classification result corresponding to the target to-be-classified information is used to identify a category corresponding to the target to-be-classified information determined by the initial second model;

the second determining unit is configured to determine reliability parameters of sample classification results corresponding to the plurality of pieces of information to be classified, where the reliability parameters are used to characterize the reliability of actual categories identified by the sample classification results;

the training unit is configured to adjust model parameters corresponding to the initial second model according to the differences between the sample classification results and the pending classification results respectively corresponding to the plurality of pieces of information to be classified, to obtain a second model, where the second model is used to determine the actual category corresponding to information to be classified, and the strength with which the model parameters are adjusted according to the difference between the sample classification result and the pending classification result corresponding to the target information to be classified is positively correlated with the credibility characterized by the credibility parameter of the sample classification result corresponding to the target information to be classified.

In one possible implementation manner, the classification result includes probabilities that the information to be classified corresponds to a plurality of categories, and the training unit is specifically configured to:

adjusting the model parameters corresponding to the initial second model according to the difference between the probabilities of the plurality of categories in the sample classification result corresponding to the target information to be classified and the probabilities of the plurality of categories in the pending classification result corresponding to the target information to be classified.
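The text above does not fix how the difference between the two sets of probabilities is measured. As a minimal sketch, assuming the Kullback-Leibler divergence is used (an assumption, not something stated here), and with the function name `kl_divergence` chosen only for illustration, the difference could be computed as follows:

```python
import math


def kl_divergence(sample_probs, pending_probs, eps=1e-12):
    """KL(sample || pending) over per-category probabilities.

    sample_probs:  probabilities output by the first model.
    pending_probs: probabilities output by the initial second model.
    eps is a small constant (an assumption) that avoids log(0).
    """
    return sum(
        t * math.log((t + eps) / (s + eps))
        for t, s in zip(sample_probs, pending_probs)
    )


# Hypothetical classification results over three categories.
sample_result = [0.7, 0.2, 0.1]
pending_result = [0.5, 0.3, 0.2]
difference = kl_divergence(sample_result, pending_result)
print(difference)
```

The value is zero when the two results agree and grows as the pending classification result moves away from the sample classification result, so it can serve as the quantity that parameter adjustment tries to reduce.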

In one possible implementation manner, the actual category corresponding to the target to-be-classified information identified by the sample classification result corresponding to the target to-be-classified information is a category corresponding to the highest probability in the sample classification result corresponding to the target to-be-classified information, and the second determining unit is specifically configured to:

respectively taking each of the plurality of pieces of information to be classified as the target information to be classified, and determining the highest probability in the sample classification result corresponding to the target information to be classified as the credibility parameter corresponding to the target information to be classified, wherein the credibility characterized by the credibility parameter is positively correlated with the value of the credibility parameter.
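This implementation can be illustrated with a short sketch. The function names below are hypothetical; the logic follows the text directly: the identified category is the one with the highest probability, and that highest probability itself serves as the credibility parameter.

```python
def identified_category(sample_probs):
    """Index of the category with the highest probability."""
    return max(range(len(sample_probs)), key=sample_probs.__getitem__)


def credibility_parameter(sample_probs):
    """Highest probability in the sample classification result."""
    return max(sample_probs)


# Hypothetical sample classification results from the first model.
confident = [0.05, 0.90, 0.05]
uncertain = [0.40, 0.35, 0.25]

print(identified_category(confident), credibility_parameter(confident))
print(identified_category(uncertain), credibility_parameter(uncertain))
```

A result such as `confident` yields a high credibility parameter, while a flatter distribution such as `uncertain` yields a low one, which matches the intent that less certain results should be learned less strongly.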

In one possible implementation manner, the training unit is specifically configured to:

determining loss weights respectively corresponding to the plurality of pieces of information to be classified according to the credibility parameters of the sample classification results respectively corresponding to the plurality of pieces of information to be classified, wherein the loss weights are used for controlling the adjustment strength of the model parameters according to the difference between the sample classification results corresponding to the information to be classified and the undetermined classification results;

determining a loss parameter corresponding to the target to-be-classified information according to the difference between the sample classification result and the undetermined classification result corresponding to the target to-be-classified information and the loss weight corresponding to the target to-be-classified information;

determining a model loss parameter according to the loss parameters respectively corresponding to the plurality of pieces of information to be classified, wherein the model loss parameter is used for identifying the difference between the pending classification results determined by the initial second model and the sample classification results determined by the first model;

And adjusting model parameters corresponding to the initial second model according to the model loss parameters.
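The steps above can be sketched as a weighted loss. The sketch below makes three assumptions that are not stated in the text: the loss weight is taken to be the credibility parameter itself (the highest probability), the per-item difference is a KL divergence, and the model loss parameter is the mean of the per-item loss parameters. All function names are illustrative.

```python
import math


def difference(sample_probs, pending_probs, eps=1e-12):
    """Assumed difference measure: KL(sample || pending)."""
    return sum(
        t * math.log((t + eps) / (s + eps))
        for t, s in zip(sample_probs, pending_probs)
    )


def loss_weight(sample_probs):
    """Assumed mapping: the loss weight equals the credibility parameter."""
    return max(sample_probs)


def model_loss(sample_results, pending_results):
    """Mean of the weighted per-item loss parameters."""
    total = 0.0
    for sample_probs, pending_probs in zip(sample_results, pending_results):
        # Loss parameter of one item: its difference scaled by its weight.
        total += loss_weight(sample_probs) * difference(sample_probs, pending_probs)
    return total / len(sample_results)


# Hypothetical outputs for two pieces of information to be classified.
sample_results = [[0.9, 0.1], [0.55, 0.45]]
pending_results = [[0.6, 0.4], [0.5, 0.5]]
print(model_loss(sample_results, pending_results))
```

In an actual training loop the resulting value would be minimized by gradient descent on the parameters of the initial second model; any other mapping from credibility to weight would also fit the text as long as it is positively correlated.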

In one possible implementation manner, the training unit is specifically configured to:

sorting the plurality of pieces of information to be classified from high credibility to low credibility, according to the credibility characterized by the credibility parameters of the sample classification results respectively corresponding to the plurality of pieces of information to be classified;

the first N pieces of information to be classified in the sorting result are determined to be a first information set to be classified, the last M pieces of information to be classified in the sorting result are determined to be a third information set to be classified, and the rest of information to be classified are determined to be a second information set to be classified;

determining loss weights respectively corresponding to the first information set to be classified, the second information set to be classified and the third information set to be classified, wherein the loss weight corresponding to the first information set to be classified is greater than the loss weight corresponding to the second information set to be classified, the loss weight corresponding to the second information set to be classified is greater than the loss weight corresponding to the third information set to be classified, the loss weights are positively correlated with the adjustment strength, and the loss weight corresponding to an information set to be classified is the loss weight of each piece of information to be classified in that information set.
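The ranking-based weighting above can be sketched as follows. The values of N and M and the three weight values are not given in the text; the defaults below (1.0, 0.5, 0.1) and the function name are hypothetical and only need to satisfy first > second > third.

```python
def tiered_loss_weights(credibilities, n_top, m_bottom, weights=(1.0, 0.5, 0.1)):
    """Assign one loss weight per item from its credibility rank.

    credibilities: credibility parameter of each piece of information.
    n_top:         N, size of the first (most credible) set.
    m_bottom:      M, size of the third (least credible) set.
    weights:       hypothetical weights for the first, second and third sets.
    """
    if n_top + m_bottom > len(credibilities):
        raise ValueError("N + M exceeds the number of items")
    high, mid, low = weights
    # Indices sorted from highest to lowest credibility.
    order = sorted(range(len(credibilities)),
                   key=lambda i: credibilities[i], reverse=True)
    result = [mid] * len(credibilities)      # second set by default
    for i in order[:n_top]:                  # first N items
        result[i] = high
    for i in order[len(order) - m_bottom:]:  # last M items
        result[i] = low
    return result


print(tiered_loss_weights([0.9, 0.4, 0.7, 0.6, 0.95], n_top=2, m_bottom=1))
```

Compared with using the credibility parameter directly as the weight, this grouping depends only on the relative order of the items, so it is less sensitive to how sharply peaked the first model's probabilities happen to be.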

In one possible implementation, the first model is trained by:

acquiring a sample information set, wherein the sample information set comprises a plurality of sample information, and the plurality of sample information respectively has corresponding sample categories;

respectively taking the plurality of sample information as target sample information, determining a pending classification result corresponding to the target sample information through an initial first model, wherein the pending classification result corresponding to the target sample information is used for identifying a category corresponding to the target sample information determined by the initial first model;

and according to the difference between the sample category corresponding to the target sample information and the category identified by the pending classification result corresponding to the target sample information, adjusting the model parameter corresponding to the initial first model to obtain the first model.
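The text does not name the measure used for the difference between the labelled sample category and the category predicted by the initial first model. A minimal sketch, assuming the common choice of cross-entropy against the labelled category (an assumption) and a hypothetical function name, is:

```python
import math


def cross_entropy(pending_probs, sample_category, eps=1e-12):
    """Negative log-probability assigned to the labelled sample category.

    pending_probs:   per-category probabilities from the initial first model.
    sample_category: index of the labelled category of the sample.
    """
    return -math.log(pending_probs[sample_category] + eps)


# Hypothetical prediction for a sample whose labelled category is index 2.
print(cross_entropy([0.1, 0.2, 0.7], sample_category=2))
```

The value shrinks toward zero as the initial first model assigns more probability to the labelled category, so reducing it over the sample information set moves the model toward the first model described above.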

In one possible implementation manner, the first model is specifically configured to extract a sample information feature corresponding to the information to be classified, and determine a sample classification result corresponding to the information to be classified according to the sample information feature, and the first determining unit is specifically configured to:

determining undetermined information characteristics corresponding to the plurality of pieces of information to be classified respectively;

Determining the undetermined classification results corresponding to the plurality of pieces of information to be classified according to undetermined information characteristics corresponding to the plurality of pieces of information to be classified respectively;

the training unit is specifically used for:

and adjusting model parameters corresponding to the initial second model according to differences between sample information features and undetermined information features respectively corresponding to the plurality of pieces of information to be classified and differences between sample classification results and undetermined classification results respectively corresponding to the plurality of pieces of information to be classified.
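The joint use of feature differences and classification differences can be sketched as a single combined loss. The sketch assumes that the sample information feature and the pending information feature have the same dimensionality, that the feature difference is a mean squared error, that the classification difference is a KL divergence, and that the two are balanced by a hypothetical factor `feature_weight`; none of these choices is fixed by the text.

```python
import math


def feature_difference(sample_feature, pending_feature):
    """Assumed measure: mean squared error between the two feature vectors."""
    return sum((a - b) ** 2
               for a, b in zip(sample_feature, pending_feature)) / len(sample_feature)


def classification_difference(sample_probs, pending_probs, eps=1e-12):
    """Assumed measure: KL(sample || pending) over category probabilities."""
    return sum(t * math.log((t + eps) / (s + eps))
               for t, s in zip(sample_probs, pending_probs))


def combined_loss(sample_feature, pending_feature,
                  sample_probs, pending_probs, feature_weight=1.0):
    """Feature term plus classification term for one piece of information."""
    return (feature_weight * feature_difference(sample_feature, pending_feature)
            + classification_difference(sample_probs, pending_probs))


# Hypothetical features and classification results from the two models.
loss = combined_loss([0.2, 0.8, 0.1], [0.1, 0.6, 0.3],
                     [0.7, 0.3], [0.6, 0.4])
print(loss)
```

Adding the feature term lets the second model imitate the intermediate representation of the first model as well as its final output.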

In one possible implementation manner, the initial second model is composed of a first model part and a second model part, the first model part is used for determining the characteristics of the undetermined information corresponding to the undetermined information, the second model part is used for determining the undetermined classification result corresponding to the undetermined information according to the characteristics of the undetermined information, and the training unit is specifically used for:

according to the differences between the sample information features and the undetermined information features respectively corresponding to the plurality of pieces of information to be classified and the differences between the sample classification results and the undetermined classification results respectively corresponding to the plurality of pieces of information to be classified, adjusting model parameters corresponding to the first model part;

And adjusting model parameters corresponding to the second model part according to differences between sample classification results and undetermined classification results respectively corresponding to the plurality of pieces of information to be classified.

In a possible implementation manner, the second model is composed of a first model part and a second model part, the first model part is used for determining information features corresponding to the information, the second model part is used for determining classification results corresponding to the information according to the information features, the classification results are used for identifying actual categories corresponding to the information, and the device further comprises a second obtaining unit, a third determining unit, a fourth determining unit and a fifth determining unit:

the second obtaining unit is used for obtaining information to be identified and a plurality of category reference information, and the plurality of category reference information respectively have corresponding categories;

the third determining unit is configured to determine, through the first model portion, information features corresponding to the information to be identified and the plurality of category reference information respectively;

the fourth determining unit is configured to determine, according to the similarity between the information feature corresponding to the information to be identified and the information features corresponding to the plurality of category reference information, target category reference information with highest similarity with the information to be identified in the information feature dimension in the plurality of category reference information;

the fifth determining unit is configured to determine a category corresponding to the target category reference information as a category corresponding to the information to be identified.
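The recognition flow above can be sketched as a nearest-reference lookup. The features are assumed to have already been extracted by the first model part, and cosine similarity is assumed as the similarity measure; the text names neither, and the function names and category labels are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(query_feature, reference_features):
    """Return the category whose reference feature is most similar.

    reference_features: mapping from category to the information feature
    of that category's reference information.
    """
    return max(
        reference_features,
        key=lambda category: cosine_similarity(query_feature,
                                               reference_features[category]),
    )


# Hypothetical features produced by the first model part.
references = {"object_a": [0.9, 0.2], "object_b": [0.1, 1.0]}
print(identify([1.0, 0.1], references))
```

Because only features are compared, a new category can be supported by adding its reference information, without retraining the classification part of the second model.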

In one possible implementation manner, the information to be classified is biological characteristic information, and the category corresponding to the information to be classified is a biological object corresponding to the biological characteristic information.

In one possible implementation, the biometric information is palm print image information, which is determined by:

acquiring palm image information corresponding to a biological object;

identifying a target positioning point in the palm image information;

determining an image dividing region corresponding to the palm image information according to the target positioning point;

and dividing the palm print image information from the palm image information through the image dividing region.

In a possible implementation manner, the target positioning point includes an index finger seam point, a middle finger seam point and a ring finger seam point on the palm, and the determining, according to the target positioning point, an image dividing area corresponding to the palm image information includes:

determining, on the image plane corresponding to the palm image information, the line connecting the index finger seam point and the ring finger seam point as the x-axis of a plane coordinate system, and determining the straight line that passes through the middle finger seam point and is perpendicular to the x-axis as the y-axis of the plane coordinate system;

determining, as a division center point, the point on the y-axis that lies on the palm-center side of the palm image information and whose distance from the coordinate origin equals the distance between the index finger seam point and the ring finger seam point;

and determining the image dividing region by taking the target multiple of the distance between the index finger seam point and the ring finger seam point as the side length of the image dividing region, wherein the dividing center point is taken as the center point of the image dividing region, and the image dividing region is a square region.
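The geometric construction above can be sketched with plain vector arithmetic. Two inputs are assumptions that do not appear in the text: `palm_hint`, any point known to lie on the palm side of the seam line and used only to orient the y-axis, and `side_multiple`, a placeholder for the target multiple, whose value is not given here.

```python
import math


def palm_region(index_seam, middle_seam, ring_seam, palm_hint, side_multiple=1.5):
    """Return the division center point and the four corners of the square region."""
    ax, ay = index_seam
    cx, cy = ring_seam
    dist = math.hypot(cx - ax, cy - ay)  # index-to-ring seam distance
    if dist == 0:
        raise ValueError("index and ring seam points coincide")
    ux, uy = (cx - ax) / dist, (cy - ay) / dist  # unit vector of the x-axis
    # Origin: foot of the perpendicular from the middle seam point to the x-axis.
    t = (middle_seam[0] - ax) * ux + (middle_seam[1] - ay) * uy
    ox, oy = ax + t * ux, ay + t * uy
    vx, vy = -uy, ux  # unit vector of the y-axis
    # Flip the y-axis if it points away from the palm.
    if (palm_hint[0] - ox) * vx + (palm_hint[1] - oy) * vy < 0:
        vx, vy = -vx, -vy
    center = (ox + dist * vx, oy + dist * vy)
    half = side_multiple * dist / 2
    corners = [
        (center[0] + sx * half * ux + sy * half * vx,
         center[1] + sx * half * uy + sy * half * vy)
        for sx, sy in ((-1, -1), (1, -1), (1, 1), (-1, 1))
    ]
    return center, corners


# Hypothetical pixel coordinates; the palm lies below the finger seams.
center, corners = palm_region((0, 0), (2, 1), (4, 0), palm_hint=(2, 10),
                              side_multiple=1.0)
print(center, corners)
```

Because the axes are derived from the seam points themselves, the square follows the rotation and scale of the hand in the image, which keeps the palm print region consistent across captures.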

In a third aspect, embodiments of the present application disclose a computer device comprising a processor and a memory:

the memory is used for storing a computer program and transmitting the computer program to the processor;

the processor is configured to execute the model training method according to any one of the first aspects according to instructions in the computer program.

In a fourth aspect, embodiments of the present application disclose a computer readable storage medium for storing a computer program for performing the model training method of any one of the first aspects.

In a fifth aspect, embodiments of the present application disclose a computer program product comprising a computer program which, when run on a computer device, causes the computer device to perform the model training method of any of the first aspects.

According to the technical scheme, a first model can be obtained through training. Because the first model has a complex model structure, it can classify the information to be classified accurately and produce an accurate sample classification result, which is used for identifying the actual category corresponding to the information to be classified. The plurality of pieces of information to be classified can also be input into the initial second model to obtain the pending classification result output by the initial second model, which identifies the category that the initial second model determines for the information to be classified. In order for the initial second model to learn how to accurately analyze the category of the information to be classified, the model parameters of the initial second model can be adjusted according to the difference between the sample classification result and the pending classification result corresponding to the information to be classified. The credibility of the sample classification results determined by the first model may differ between pieces of information to be classified. Therefore, in order for the initial second model to learn the classification capability of the first model effectively, the initial second model can learn less strongly from sample classification results with lower credibility and more strongly from sample classification results with higher credibility. To this end, a credibility parameter is determined for the sample classification result of each piece of information to be classified, and the credibility parameter characterizes the credibility of the actual category identified by that sample classification result. The strength with which the parameters of the initial second model are adjusted, based on the difference between the sample classification result and the pending classification result of each piece of information to be classified, is then set according to the credibility parameter, so that the initial second model effectively learns the classification capability of the first model and a second model capable of accurate classification is obtained. Meanwhile, because the method transfers model knowledge, the second model directly learns knowledge already refined by the first model, so less is demanded of the second model's own capacity and the second model can have a simpler model structure. Since the complexity of the model structure of the second model is lower than that of the first model, the performance requirements for the device that runs the second model are lower, so that the second model can be applied to various classification scenarios, for example, various biometric recognition scenarios, and the classification technology can be effectively popularized.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic diagram of a model training method in a practical application scenario provided in an embodiment of the present application;

FIG. 2 is a flowchart of a model training method according to an embodiment of the present application;

FIG. 3 is a schematic diagram of a model training method according to an embodiment of the present disclosure;

FIG. 4 is a schematic diagram of a model training method according to an embodiment of the present disclosure;

FIG. 5 is a schematic diagram of a model training method according to an embodiment of the present disclosure;

FIG. 6 is a schematic diagram of a model training method according to an embodiment of the present disclosure;

FIG. 7 is a flowchart of a model training method in a practical application scenario provided in an embodiment of the present application;

FIG. 8 is a schematic diagram of a model training method in a practical application scenario provided in an embodiment of the present application;

FIG. 9 is a block diagram of a model training device according to an embodiment of the present disclosure;

FIG. 10 is a block diagram of a terminal according to an embodiment of the present application;

FIG. 11 is a block diagram of a server according to an embodiment of the present application.

Detailed Description

Embodiments of the present application are described below with reference to the accompanying drawings.

Biometric recognition is one application of model-based classification: it determines which object an input biometric feature belongs to, and that object is the category corresponding to the biometric feature.

In the related art, biometric recognition is usually implemented through a classification model: a neural network model with a relatively complex structure analyzes the input biometric feature to determine the object (i.e., the category) corresponding to it. However, the model training method in the related art can only teach a model with a relatively complex structure how to analyze categories accurately, so the trained classification model places high performance requirements on the equipment that runs it. Low-performance equipment either cannot run the classification model or runs it so slowly that the information to be classified cannot be classified quickly. A classification model trained in this way is therefore difficult to apply widely across classification scenarios, and the conditions for applying it are demanding. Because biometric recognition is usually deployed on devices with low processing performance, such as door locks and switches, the model training method in the related art is difficult to apply in the field of biometric recognition.

In order to solve the technical problems, the application provides a model training method in which a first model with a complex model structure and high classification accuracy is trained, and an initial second model with a simple model structure learns the classification capability that the first model acquired in training. The strength with which the second model learns the first model's way of classifying each piece of information to be classified is determined by the credibility of the first model's classification of that information: the higher the credibility, the greater the learning strength, and the lower the credibility, the weaker the learning strength. The classification capability learned by the first model is thus selectively transferred during training, so that the trained second model classifies more accurately while keeping a simpler model structure. This in turn reduces the performance requirements on the equipment that applies the model and facilitates wide application of the classification model.

It will be appreciated that the method may be applied to a computer device capable of model training, for example a terminal device or a server. The method can be executed independently by the terminal device or the server, or it can be applied in a network scenario in which the terminal device and the server communicate and be executed by the two in cooperation. The terminal device can be a mobile phone, a tablet computer, a notebook computer, a desktop computer, an intelligent voice interaction device, an intelligent household appliance, a vehicle-mounted terminal, or the like. In actual deployment, the server can be an application server, a Web server, an independent server, a cluster server, a cloud server, or the like.

The present application also contemplates artificial intelligence techniques, artificial intelligence (Artificial Intelligence, AI) is a theory, method, technique, and application system that utilizes a digital computer or digital computer-controlled machine to simulate, extend, and expand human intelligence, sense the environment, acquire knowledge, and use knowledge to obtain optimal results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a similar way to human intelligence. Artificial intelligence, i.e. research on design principles and implementation methods of various intelligent machines, enables the machines to have functions of sensing, reasoning and decision.

The artificial intelligence technology is a comprehensive subject, and relates to the technology with wide fields, namely the technology with a hardware level and the technology with a software level. Artificial intelligence infrastructure technologies generally include, for example, sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, pre-training model technologies, operation/interaction systems, mechatronics, and the like. The pre-training model is also called a large model and a basic model, and can be widely applied to all large-direction downstream tasks of artificial intelligence after fine adjustment. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.

The present application relates to machine learning and Computer Vision technology, and Computer Vision technology (CV) Computer Vision is a science of researching how to make a machine "look at", and more particularly, to replace a human eye with a camera and a Computer to recognize and measure a target, and further to perform graphic processing, so that the Computer processing becomes an image more suitable for human eye observation or transmission to an instrument for detection. As a scientific discipline, computer vision research-related theory and technology has attempted to build artificial intelligence systems that can acquire information from images or multidimensional data. The large model technology brings important innovation for the development of computer vision technology, and a pre-trained model in the vision fields of swin-transformer, viT, V-MOE, MAE and the like can be rapidly and widely applied to downstream specific tasks through fine tuning. Computer vision techniques typically include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D techniques, virtual reality, augmented reality, synchronous positioning, and map construction, among others, as well as common biometric recognition techniques such as face recognition, fingerprint recognition, and others.

Machine Learning (ML) is a multi-disciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specifically studies how a computer simulates or implements human learning behavior to acquire new knowledge or skills, and reorganizes existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied throughout the various fields of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration. The pre-training model is the latest development of deep learning and integrates the above technologies.

In the present application, the second model can be trained automatically through machine learning technology, and palm images can be acquired through computer vision technology and processed automatically to obtain palm print image information.

In order to facilitate understanding of the technical solution provided in the present application, a model training method provided in the embodiments of the present application will be described next with reference to an actual application scenario.

Referring to fig. 1, fig. 1 is a schematic diagram of a model training method in an actual application scenario provided in an embodiment of the present application, where a computer device may be a server 101 with a model training function.

The server 101 may first acquire a plurality of pieces of information to be classified, which may be, for example, biometric information or the like. The first model is a classification model obtained through training, the initial second model is a model which is not trained yet, and the model structure complexity of the first model is greater than that of the initial second model.

Taking to-be-classified information 1 among the plurality of pieces of information to be classified as an example, the server 101 can input the to-be-classified information 1 into the first model and the initial second model respectively to obtain a sample classification result and a pending classification result. Because the model structure of the first model is complex and its classification of information is accurate, the sample classification result can identify the actual category corresponding to the to-be-classified information 1, while the pending classification result identifies the category determined by the initial second model. The difference between the sample classification result and the pending classification result therefore reflects how far the classification accuracy of the initial second model falls short. The model parameters of the initial second model can be adjusted based on this difference so that the initial second model learns the model knowledge of the first model about classification, thereby transferring the model knowledge of the first model to the initial second model.

Because the first model may make errors, the reliability of the sample classification results it determines may differ. If the reliability of a certain sample classification result is low, the first model has not classified the corresponding information to be classified accurately enough; if the model parameters of the initial second model were adjusted with the same adjustment strength based on that sample classification result, the initial second model could learn invalid model knowledge. Based on this, the server 101 may determine a reliability parameter for each sample classification result, where the reliability parameter characterizes the reliability of the actual category identified by the sample classification result. Based on the reliability parameter, the server 101 may control the strength with which the model parameters of the initial second model are adjusted based on the difference: a smaller adjustment strength is used for sample classification results with lower reliability, so as to reduce the learning of invalid model knowledge in the first model, and a larger adjustment strength is used for sample classification results with higher reliability, so as to strengthen the learning of effective model knowledge in the first model.

Therefore, on one hand, the second model is trained in a model knowledge migration mode, so that the second model can learn the model knowledge which is already refined in the first model under the condition that the model structure is simpler, the model structure complexity requirement on the second model can be reduced, and a classification model with a simpler model structure is obtained; on the other hand, the parameter adjusting force is controlled by combining the credibility, so that the second model can learn the model knowledge in the first model in a targeted manner, and the classification accuracy of the second model is ensured.

Next, a model training method provided in the embodiments of the present application will be described with reference to the accompanying drawings.

Referring to fig. 2, fig. 2 is a flowchart of a model training method provided in an embodiment of the present application, in this embodiment, the computer device may be any one of the computer devices having a model training function, and the method includes:

S201: Acquire a trained first model and an initial second model to be trained.

The first model may be a classification model obtained by training in any manner, and has a more complex model structure, so that more accurate information classification capability can be learned by training.

It can be appreciated that the first model is usually trained on sample information with classification labels; that is, each piece of sample information has a corresponding actual classification label, and the model parameters are adjusted according to the difference between the classification result determined by the first model and the actual classification label, so that the first model learns how to determine an accurate category. Thus, for the first model, training is a process of building model knowledge for classification from scratch.

In the present application, so that a model with lower structural complexity can have classification capability close to that of the first model, the computer device can adopt the training technique of model distillation. That is, the model with lower structural complexity is not fitted to the training data; instead, it directly learns the model knowledge that the first model has learned, thereby migrating that model knowledge to the model with lower structural complexity. In this way, the model with lower structural complexity can learn existing model knowledge directly, without building model knowledge from samples from scratch, so the requirement on model performance is lower and the approach is suitable for models with lower structural complexity.

In the application, the computer device may obtain an initial second model to be trained, where the model structure complexity of the first model is greater than that of the initial second model, and the first model may be used to determine sample classification results corresponding to the plurality of pieces of information to be classified respectively, where the information to be classified may be any information having a corresponding class, for example, may be various biological feature information and the like. The sample classification result corresponding to the target information to be classified is used for identifying an actual category corresponding to the target information to be classified, wherein the actual category is a more accurate category obtained by classification through the first model, and the target information to be classified can be any one of a plurality of information to be classified.

S202: Determine, through the initial second model, pending classification results respectively corresponding to the plurality of pieces of information to be classified.

The pending classification result corresponding to the target to-be-classified information is used for identifying the category corresponding to the target to-be-classified information determined by the initial second model, so that the pending classification result can represent the information classification capability of the initial second model.

S203: Determine reliability parameters of the sample classification results respectively corresponding to the plurality of pieces of information to be classified.

It can be appreciated that the more fully the initial second model has learned the model knowledge of the first model, the closer the pending classification result it determines for a given piece of information to be classified should, in theory, be to the sample classification result determined by the first model for the same information. Therefore, in the present application, the model parameters of the initial second model can be adjusted through the difference between the pending classification result and the sample classification result, so that the pending classification result determined by the initial second model approaches the sample classification result and the initial second model learns model knowledge for accurate classification.

Therefore, the effectiveness of training the initial second model depends on the reliability of the sample classification results determined by the first model. If the reliability of a sample classification result is low, that is, the actual category it identifies is less accurate, the initial second model may learn invalid model knowledge as its model parameters are adjusted based on the difference and its pending classification result gradually approaches the sample classification result. Conversely, if the reliability of the sample classification result is high, the initial second model can learn model knowledge that classifies the information effectively. Based on this, to ensure accurate application of the model distillation technique, the computer device may determine, based on the reliability of each sample classification result, the adjustment strength with which the model parameters of the initial second model are adjusted according to the difference.

The computer device may determine the reliability parameter of the sample classification result corresponding to the plurality of pieces of information to be classified, where the reliability parameter is used to characterize the reliability of the actual category identified by the sample classification result, and the higher the reliability, the more accurate the actual category identified by the sample classification result, and conversely the lower the accuracy.

S204: Adjust the model parameters corresponding to the initial second model according to the differences between the sample classification results and the pending classification results respectively corresponding to the plurality of pieces of information to be classified, to obtain the second model.

The strength with which the model parameters are adjusted according to the difference between the sample classification result and the pending classification result corresponding to the target information to be classified is positively correlated with the reliability characterized by the reliability parameter of that sample classification result. That is, the higher the reliability characterized by the reliability parameter, the more accurate the actual category identified by the sample classification result, and the computer device can adjust the model parameters of the initial second model according to the difference with a larger adjustment strength, so that the pending classification result determined by the initial second model comes closer to the sample classification result and the initial second model learns, with high intensity, the model knowledge in the first model that accurately analyzes the information category. The lower the reliability characterized by the reliability parameter, the less accurate the actual category identified by the sample classification result, and the computer device can adjust the model parameters of the initial second model according to the difference with a smaller adjustment strength, so that the pending classification result determined by the initial second model comes less close to the sample classification result, which reduces how strongly the initial second model learns model knowledge in the first model that does not help to analyze the information category accurately.
In this way, selective learning of the model knowledge of the first model can be achieved, so that the second model obtained through training completes the migration of the model knowledge of the first model efficiently and accurately and has the capability of accurately classifying information. In practical applications, the second model can then be used to determine the actual category corresponding to information to be classified.
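
As an illustration only, the positive correlation described above can be written as a parameter update. The symbols below do not appear in the present application and are assumptions introduced for this sketch: \theta denotes the model parameters of the initial second model, \eta a learning rate, x_i the target information to be classified, p^{(1)}(x_i) the sample classification result, p^{(2)}_{\theta}(x_i) the pending classification result, D any measure of the difference between the two, c_i the reliability parameter, and w a non-decreasing function that maps reliability to adjustment strength.

```latex
\theta \leftarrow \theta - \eta \, w(c_i) \, \nabla_{\theta} \, D\!\left(p^{(1)}(x_i),\; p^{(2)}_{\theta}(x_i)\right),
\qquad c_i \ge c_j \;\Rightarrow\; w(c_i) \ge w(c_j)
```

Under these assumptions, a sample classification result with higher reliability moves the model parameters further for the same difference, and one with lower reliability moves them less.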

According to the technical scheme, the model training method can enable the initial second model to effectively learn the classification capacity of the first model to obtain the second model for accurate classification, and meanwhile, the second model directly learns model knowledge which is already refined by the first model due to the mode of model knowledge migration, so that the model performance requirement on the second model is low, and the second model can have a simpler model structure. Since the complexity of the model structure of the second model is lower than that of the first model, the performance requirements for the device for running the second model are lower, so that the second model can be applied to various classification scenes, for example, various biological feature recognition scenes, and the classification technology can be effectively popularized.

It can be appreciated that when determining the category, the classification model generally analyzes probabilities of the information corresponding to a plurality of categories according to the information, and determines the category corresponding to the information based on the probabilities. In one possible implementation manner, the classification result output by the model may include probabilities that the information to be classified corresponds to a plurality of categories, and the category determined by the model can be identified through the probabilities, for example, the category with the highest probability output by the model is the category corresponding to the information to be classified considered by the model.

In performing step S204, the computer device may perform step S2041 (not shown in the figure), step S2041 being one possible implementation of step S204, including:

S2041: Adjust the model parameters corresponding to the initial second model according to the difference between the probabilities, included in the sample classification result corresponding to the target information to be classified, that the target information to be classified corresponds to a plurality of categories, and the probabilities, included in the pending classification result corresponding to the target information to be classified, that the target information to be classified corresponds to the plurality of categories.

In the embodiment of the present application, since the classification result includes not only the probability corresponding to the identified category but also the probabilities corresponding to the other categories, it provides the initial second model with more learnable information than the related art, in which adjustment is based solely on the difference between categories. For example, in the related art, the information to be classified that is used for training generally has only a corresponding actual category label, which is equivalent to a probability of 100% for the actual category and a probability of 0 for every other category, so the model can learn probability information in only one dimension. In the present application, the probabilities, determined by the first model, that the information to be classified corresponds to the plurality of categories are used in calculating the difference; that is, the initial second model learns how the first model analyzes the plurality of categories, and therefore receives a larger amount of information. For example, if the probabilities of two categories are relatively close, the initial second model can learn the similarity between the two categories, and if the probabilities of two categories differ greatly, the initial second model can learn the difference between them. Therefore, adjusting the model parameters through this difference allows the initial second model to learn the model knowledge of the first model more efficiently and accurately, improving model training efficiency.
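
The contrast between a single category label and a full probability distribution can be sketched as follows. The present application does not name a specific measure of the difference at this point; the Kullback-Leibler divergence used here, the function name `kl_divergence`, and all probability values are assumptions chosen for illustration.

```python
import math

def kl_divergence(target_probs, student_probs, eps=1e-12):
    """KL(target || student) for two probability lists of equal length.

    The choice of KL divergence is an assumption; the source only speaks
    of "the difference" between the two sets of probabilities.
    """
    return sum(
        t * math.log((t + eps) / (s + eps))
        for t, s in zip(target_probs, student_probs)
        if t > 0.0
    )

# Hypothetical outputs for one piece of information over three categories.
hard_label = [1.0, 0.0, 0.0]        # related art: only the labelled category
teacher_probs = [0.70, 0.25, 0.05]  # first model: category 2 resembles category 1
student_probs = [0.50, 0.10, 0.40]  # initial second model before adjustment

# Against the hard label, only the probability of category 1 contributes.
loss_vs_hard = kl_divergence(hard_label, student_probs)

# Against the first model's output, every category contributes, so the
# second model is also told that category 2 is more plausible than category 3.
loss_vs_teacher = kl_divergence(teacher_probs, student_probs)
```

In this sketch the hard label yields a loss that depends on one probability only, whereas the loss against the first model's output depends on all three.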

Based on the above classification result including probabilities corresponding to a plurality of classes, in one possible implementation manner, the actual class corresponding to the target to-be-classified information identified by the sample classification result corresponding to the target to-be-classified information may be the class corresponding to the highest probability in the sample classification result corresponding to the target to-be-classified information, that is, the actual class corresponding to the target to-be-classified information determined by the first model is the class with the highest probability corresponding to the target to-be-classified information analyzed by the first model.

In performing step S203, the computer device may perform step S2031 (not shown in the figure), where step S2031 is one possible implementation of step S203, including:

S2031: Take each of the plurality of pieces of information to be classified as the target information to be classified, and determine the highest probability in the sample classification result corresponding to the target information to be classified as the reliability parameter corresponding to the target information to be classified.

The reliability characterized by the reliability parameter is positively correlated with the value of the reliability parameter. It can be understood that a higher reliability parameter means a higher maximum probability in the sample classification result. Because the probabilities of the plurality of categories sum to 1, a higher maximum probability means a larger gap between the probability of the actual category, which corresponds to the maximum probability, and the probabilities of the other categories, and therefore greater certainty on the part of the first model that the target information to be classified corresponds to the actual category. A higher reliability thus indicates that the first model can determine with relative certainty that the target information to be classified corresponds to the actual category. For example, if the first model determines that the probability of the target information to be classified corresponding to category 1 is 90% and the probabilities of the other categories sum to 10%, the first model can determine with relative certainty that category 1 is the actual category corresponding to the target information to be classified.

Conversely, a lower reliability parameter means that even the maximum probability in the sample classification result is low. Because the probabilities of the plurality of categories sum to 1, a lower maximum probability means a smaller gap between the probability of the actual category, which corresponds to the maximum probability, and the probabilities of the other categories; that is, the first model has determined several categories with relatively similar probabilities. The first model is therefore less certain that the target information to be classified corresponds to the actual category, and the probability of error is greater. A lower reliability thus indicates that the first model cannot determine with relative certainty that the target information to be classified corresponds to the actual category. For example, if the first model determines that the probability of the target information to be classified corresponding to category 1 is 30% and the probability corresponding to category 2 is 25%, the first model considers the probabilities of category 1 and category 2 to be relatively close, so even though the classification result identifies category 1 as the category of the target information to be classified, the reliability is relatively low.
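
Step S2031 and the two examples above can be sketched in a few lines. The function name `reliability_parameter` is hypothetical, and the probabilities beyond the 90%, 30%, and 25% mentioned in the text are assumed values filled in so that each distribution sums to 1.

```python
def reliability_parameter(sample_probs):
    """Return (reliability, category_index) for one sample classification result.

    The reliability parameter is the highest probability; the index of that
    probability is the actual category identified by the result.
    """
    best = max(range(len(sample_probs)), key=lambda i: sample_probs[i])
    return sample_probs[best], best

# Category 1 at 90%, remaining categories sharing 10% (split is assumed).
confident = [0.90, 0.04, 0.03, 0.03]
# Category 1 at 30%, category 2 at 25% (remaining values are assumed).
uncertain = [0.30, 0.25, 0.25, 0.20]

print(reliability_parameter(confident))  # (0.9, 0)
print(reliability_parameter(uncertain))  # (0.3, 0)
```

Both results identify the same category (index 0, that is, category 1), but the second carries a much lower reliability parameter.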

Next, a specific manner of adjusting the model parameters based on the differences will be described.

In one possible implementation, to achieve different parameter adjustment efforts, the computer device may assign different loss weights based on different confidence parameters.

In performing step S204, the computer device may perform steps S2042-S2045 (not shown), steps S2042-S2045 being one possible implementation of step S204, including:

S2042: Determine loss weights respectively corresponding to the plurality of pieces of information to be classified according to the reliability parameters of the sample classification results respectively corresponding to the plurality of pieces of information to be classified.

The loss weight is used to control the strength with which the model parameters are adjusted according to the difference between the sample classification result and the pending classification result corresponding to the information to be classified, and is determined based on the reliability parameter of the sample classification result corresponding to that information. For example, in the present application, a larger loss weight may correspond to a greater adjustment strength; in that case, the higher the reliability characterized by the reliability parameter, the larger the loss weight.

S2043: Determine a loss parameter corresponding to the target information to be classified according to the difference between the sample classification result and the pending classification result corresponding to the target information to be classified and the loss weight corresponding to the target information to be classified.

By combining the difference and the loss weight, the loss parameter can characterize how the model parameters of the initial second model are to be adjusted based on the difference: the loss weight controls the adjustment strength, and the difference measures whether the model parameters have been adjusted effectively. The smaller the difference, the closer the classification capability of the initial second model is to that of the first model, and the more effective the adjustment of the model parameters has been.

S2044: Determine a model loss parameter according to the loss parameters respectively corresponding to the plurality of pieces of information to be classified.

The model loss parameter can characterize the overall difference between the initial second model and the first model when determining the categories respectively corresponding to the plurality of pieces of information to be classified, so the model loss parameter can be used to identify the difference between the pending classification results determined by the initial second model and the sample classification results determined by the first model, that is, the difference between the two models in classification capability. The loss weight can be used to control the proportion that the difference corresponding to each piece of information to be classified takes up in the overall difference characterized by the model loss parameter, and the reliability is positively correlated with the proportion controlled by the loss weight. Therefore, when the model parameters are adjusted according to the overall difference, a difference with a higher reliability receives a larger adjustment strength.

S2045: Adjust the model parameters corresponding to the initial second model according to the model loss parameter.

In the adjustment process, the computer equipment can enable the overall difference represented by the model loss parameters determined through the mode to be lower and lower through adjusting the model parameters corresponding to the initial second model, so that the overall classification mode of the initial second model for a plurality of pieces of information to be classified is more and more approaching to the classification mode of the first model, the initial second model can learn model knowledge used for information classification in the first model, and the obtained second model can have classification capacity which is closer to that of the first model.
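
Steps S2043 to S2045 can be sketched with a toy example. Everything specific here is an assumption made for illustration: cross-entropy is used as the measure of difference, the per-sample losses are combined by a weighted sum, the "model parameters" are simply one vector of logits per piece of information rather than the weights of a real network, and the loss weights and learning rate are arbitrary values.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(teacher_probs, student_probs, eps=1e-12):
    """Difference between a sample result and a pending result (assumed measure)."""
    return -sum(t * math.log(s + eps) for t, s in zip(teacher_probs, student_probs))

def model_loss(teacher_batch, logits_batch, loss_weights):
    """S2043 and S2044: weight each per-sample difference, then sum them."""
    return sum(
        w * cross_entropy(t, softmax(z))
        for t, z, w in zip(teacher_batch, logits_batch, loss_weights)
    )

def adjust(teacher_batch, logits_batch, loss_weights, lr=0.5):
    """S2045: one gradient step on the toy parameters.

    For cross-entropy over a softmax, the gradient with respect to the
    logits is softmax(logits) - teacher_probs; the loss weight scales it.
    """
    updated = []
    for t, z, w in zip(teacher_batch, logits_batch, loss_weights):
        s = softmax(z)
        updated.append([zi - lr * w * (si - ti) for zi, si, ti in zip(z, s, t)])
    return updated

# Hypothetical sample classification results and loss weights.
teacher_batch = [[0.90, 0.05, 0.05], [0.40, 0.35, 0.25]]
loss_weights = [1.0, 0.2]          # higher reliability, larger weight
logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

before = model_loss(teacher_batch, logits, loss_weights)
for _ in range(20):
    logits = adjust(teacher_batch, logits, loss_weights)
after = model_loss(teacher_batch, logits, loss_weights)
```

In this sketch the weighted overall difference falls as the toy parameters are adjusted, and the low-weight sample contributes a smaller share of each update than the high-weight one.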

Specifically, the computer device may determine loss weights corresponding to the plurality of pieces of information to be classified, respectively, in the following manner.

In performing step S2042, the computer device may perform steps S20421-S20423 (not shown), steps S20421-S20423 being one possible implementation of step S2042, including:

S20421: Sort the plurality of pieces of information to be classified from high to low by the reliability characterized by the reliability parameters of their respectively corresponding sample classification results.

The higher a piece of information to be classified ranks, the higher the reliability characterized by the reliability parameter of its corresponding sample classification result.

S20422: the first N pieces of information to be classified in the sorting result are determined to be a first information set to be classified, the last M pieces of information to be classified in the sorting result are determined to be a third information set to be classified, and the rest of information to be classified are determined to be a second information set to be classified.

N and M are both positive integers whose values may be determined by specific requirements and are not limited herein. In this way, the plurality of pieces of information to be classified can be divided into three sets of information to be classified according to the reliability of their sample classification results, where the reliability of the actual categories identified by the sample classification results corresponding to the information in the first information set to be classified is higher than that for the second information set to be classified, and the reliability for the second information set to be classified is higher than that for the third information set to be classified.

S20423: Determine loss weights respectively corresponding to the first information set to be classified, the second information set to be classified, and the third information set to be classified.

The loss weight corresponding to the first information set to be classified is greater than the loss weight corresponding to the second information set to be classified, and the loss weight corresponding to the second information set to be classified is greater than the loss weight corresponding to the third information set to be classified. Because the loss weight is positively correlated with the adjustment strength, the adjustment of the model parameters based on the differences corresponding to the information in the first information set to be classified is the strongest, so that the initial second model learns with greater strength the model knowledge in the first model that can classify information effectively; the adjustment of the model parameters based on the differences corresponding to the information in the third information set to be classified is the weakest, which reduces how strongly the initial second model learns model knowledge in the first model that does not help information classification. When determining the loss weight corresponding to each piece of information to be classified, the loss weight of a piece of information to be classified is the loss weight of the information set to which it belongs.
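
Steps S20421 to S20423 can be sketched as follows. The function name `assign_loss_weights` and the weight values 1.0, 0.5, and 0.1 are assumptions; the text only requires that the first set receive a larger weight than the second, and the second a larger weight than the third.

```python
def assign_loss_weights(reliabilities, n, m, weights=(1.0, 0.5, 0.1)):
    """Return one loss weight per piece of information to be classified.

    reliabilities: reliability parameter of each sample classification result.
    n, m: sizes of the first (most reliable) and third (least reliable) sets.
    weights: assumed loss weights for the first, second, and third sets.
    """
    if n <= 0 or m <= 0 or n + m > len(reliabilities):
        raise ValueError("n and m must be positive and must not overlap")
    w_first, w_second, w_third = weights

    # S20421: sort indices from highest to lowest reliability.
    order = sorted(range(len(reliabilities)),
                   key=lambda i: reliabilities[i], reverse=True)

    # S20422 and S20423: first n -> first set, last m -> third set,
    # everything in between -> second set.
    result = [w_second] * len(reliabilities)
    for i in order[:n]:
        result[i] = w_first
    for i in order[len(order) - m:]:
        result[i] = w_third
    return result

reliabilities = [0.95, 0.40, 0.80, 0.30, 0.60]
loss_weights = assign_loss_weights(reliabilities, n=2, m=1)
print(loss_weights)  # [1.0, 0.5, 1.0, 0.1, 0.5]
```

Each piece of information keeps its original position in the returned list, so the weights can be paired directly with the per-sample differences.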

As mentioned above, the first model may be trained by an actual training sample, and in particular, in one possible implementation, the first model may be trained by:

The computer device may first obtain a sample information set, where the sample information set includes a plurality of sample information, and the plurality of sample information respectively has a corresponding sample category, where the sample category is an actual category corresponding to the sample information.

The computer device may determine, by using the plurality of sample information as target sample information, a pending classification result corresponding to the target sample information through the initial first model, where the pending classification result corresponding to the target sample information is used to identify a category corresponding to the target sample information determined by the initial first model. Therefore, the accuracy of the initial first model in classification can be represented through the difference between the sample category corresponding to the target sample information and the category identified by the pending classification result corresponding to the target sample information, and the initial first model is the first model with incomplete training. The more accurate the initial first model classification, the less the difference. Furthermore, the computer equipment can adjust model parameters corresponding to the initial first model according to the difference between the sample category corresponding to the target sample information and the category identified by the pending classification result corresponding to the target sample information, so that the category identified by the pending classification result determined by the initial first model gradually approaches to the sample category corresponding to the sample information, and the initial first model learns how to accurately classify the information, thereby obtaining the first model.
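
The difference used to train the first model can be illustrated with a short sketch. The present application does not specify the measure; the negative log-likelihood of the labelled category is assumed here, and the function name `hard_label_loss` and the probability values are hypothetical.

```python
import math

def hard_label_loss(pending_probs, sample_category, eps=1e-12):
    """Difference between a pending result and the labelled sample category.

    Only the probability assigned to the labelled category enters the loss,
    because the label places 100% on that category and 0 on all others.
    """
    return -math.log(pending_probs[sample_category] + eps)

sample_category = 0                   # index of the labelled category
early_output = [0.40, 0.35, 0.25]     # initial first model, early in training
later_output = [0.85, 0.10, 0.05]     # initial first model, after adjustment

early_loss = hard_label_loss(early_output, sample_category)
later_loss = hard_label_loss(later_output, sample_category)
```

As the category identified by the pending classification result approaches the sample category, the loss in this sketch decreases toward 0.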

It should be noted that, since the sample information has only a corresponding sample category, the probability corresponding to the sample category is 100% and the probability corresponding to any other category is 0. Compared with the sample classification result output by the first model, the sample category therefore contains less information, so learning based on sample categories is more difficult and places a higher requirement on model structure complexity. The sample information can thus be used to train the first model but is not suitable for training the second model.

Next, the information classification process of the model will be described in detail.

In one possible implementation manner, the first model is specifically configured to extract a sample information feature corresponding to the information to be classified, and determine a sample classification result corresponding to the information to be classified according to the sample information feature. In order for the second model to be able to learn the information classification capabilities of the first model effectively, the computer device may set an information classification procedure for the second model that is similar to the first model.

For example, in performing step S202, the computer device may perform steps S2021-S2022 (not shown), steps S2021-S2022 being one possible implementation of step S202, including:

S2021: Determine the undetermined information features respectively corresponding to the plurality of pieces of information to be classified.

The undetermined information features are information features extracted by the initial second model based on the to-be-classified information and are used for subsequently determining undetermined classification results corresponding to the to-be-classified information.

S2022: Determine the pending classification results respectively corresponding to the plurality of pieces of information to be classified according to the undetermined information features respectively corresponding to the plurality of pieces of information to be classified.

Based on the undetermined information features it has determined, the initial second model may determine the category corresponding to the information to be classified. It follows that the accuracy with which the model determines the category is influenced, on the one hand, by whether the information features determined by the model are helpful for classification and, on the other hand, by how accurately the model analyzes those features. Based on this, when adjusting model parameters according to differences, the computer device may use differences in two dimensions: information features and categories.

In performing step S204, the computer device may perform step S2046 (not shown in the figure), step S2046 being one possible implementation of step S204, including:

S2046: Adjust the model parameters corresponding to the initial second model according to the differences between the sample information features and the undetermined information features respectively corresponding to the plurality of pieces of information to be classified, and the differences between the sample classification results and the pending classification results respectively corresponding to the plurality of pieces of information to be classified.

Adjusting the model parameters according to the differences between the sample information features and the undetermined information features respectively corresponding to the plurality of pieces of information to be classified lets the initial second model learn how to extract information features that accurately represent the category characteristics of the information to be classified. Adjusting the model parameters according to the differences between the sample classification results and the pending classification results respectively corresponding to the plurality of pieces of information to be classified lets the initial second model learn how to classify information accurately as a whole, which covers both how to determine information features that are highly helpful for classification and how to analyze those features accurately.
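The two kinds of difference used in S2046 can be illustrated with a minimal sketch. The text does not specify how either difference is measured, so the mean squared error for features, the KL divergence for classification results, and the `feature_weight` parameter are all assumptions made for illustration only.

```python
import math

def feature_difference(sample_feature, pending_feature):
    """Mean squared difference between two feature vectors (assumed metric)."""
    return sum((t - s) ** 2 for t, s in zip(sample_feature, pending_feature)) / len(sample_feature)

def classification_difference(sample_probs, pending_probs, eps=1e-12):
    """KL divergence from the sample result to the pending result (assumed metric)."""
    return sum(p * math.log((p + eps) / (q + eps)) for p, q in zip(sample_probs, pending_probs))

def combined_difference(sample_feature, pending_feature, sample_probs, pending_probs,
                        feature_weight=1.0):
    """Sum of both differences; feature_weight is a hypothetical balancing factor."""
    return (feature_weight * feature_difference(sample_feature, pending_feature)
            + classification_difference(sample_probs, pending_probs))
```

When the second model reproduces both the features and the probabilities of the first model, the combined difference is zero; any mismatch in either dimension makes it positive.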

As can be seen from the analysis of the model classification process, the effect of different differences on model parameter adjustment is different, so that the learning efficiency of the initial second model can be improved by purposefully adjusting model parameters of different parts of the model through the different differences.

Specifically, in one possible implementation manner, the initial second model is composed of a first model part and a second model part, where the first model part is used to determine a feature of the pending information corresponding to the information to be classified, the second model part is used to determine a pending classification result corresponding to the information to be classified according to the feature of the pending information, when executing step S2046, the computer device may execute steps S20461-S20462 (not shown in the figure), and steps S20461-S20462 are one possible implementation manner of step S2046, and includes:

S20461: Adjust the model parameters corresponding to the first model part according to the differences between the sample information features and the undetermined information features respectively corresponding to the plurality of pieces of information to be classified, and the differences between the sample classification results and the pending classification results respectively corresponding to the plurality of pieces of information to be classified.

Because the extraction accuracy of the undetermined information features and the analysis accuracy of the undetermined information features can influence the accuracy of the undetermined classification result output by the final initial second model, the model parameters of the first model part and the second model part can be adjusted simultaneously through the difference between the sample classification result and the undetermined classification result. Furthermore, the difference between the sample information feature and the pending information feature may itself be used to let the initial second model learn how to determine the exact information feature, so that the model parameters of the first model part can be adjusted by this difference.

S20462: Adjust the model parameters corresponding to the second model part according to the differences between the sample classification results and the pending classification results respectively corresponding to the plurality of pieces of information to be classified.

Through the adjustment mode, on one hand, the first model part can learn how to extract the information features which help information classification greatly, and on the other hand, the second model part can learn how to accurately analyze the information features to determine the actual classification, so that the second model has the capability of determining an accurate classification result based on the information to be classified.

It can be appreciated that through the above model training process, the first model part in the second model may have the capability of extracting information features that are more effective in the class characterization effect of the information to be classified, so that to a certain extent, by comparing the information features of different information, it can be determined whether the different information corresponds to the same class.

Based on this, in one possible implementation, in order to further reduce the performance requirements for the device running the model, while guaranteeing the classification accuracy, the computer device may perform the information classification using only the first model part of the second model.

In this implementation manner, the second model may be formed by a first model part, which determines the information features corresponding to the information, and a second model part, which determines the classification result corresponding to the information according to the information features, where the classification result identifies the actual category corresponding to the information. In practical application, the computer device may acquire the information to be identified and a plurality of category reference information, where each category reference information has a corresponding category. In the actual classification process, the categories corresponding to the category reference information may be among the categories covered by the classification result, or the classification result may have no entries corresponding to those categories.

The computer device may first determine, through the first model portion, information features corresponding to the information to be identified and the plurality of category reference information, respectively, where the information features may be capable of characterizing categories of the corresponding information, respectively. It will be appreciated that if the categories corresponding to the two pieces of information are the same, the information features corresponding to the two pieces of information should be relatively similar. Therefore, the computer equipment can determine the target category reference information with the highest similarity with the information to be identified in the dimension of the information characteristics in the category reference information according to the similarity between the information characteristics corresponding to the information to be identified and the information characteristics respectively corresponding to the category reference information, wherein the target category reference information is the information closest to the information to be identified in category. Therefore, the computer equipment can determine the category corresponding to the target category reference information as the category corresponding to the information to be identified, and the classification of the information to be identified is completed.
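The matching step described above can be sketched as a nearest-reference lookup. The text only requires "the highest similarity"; the use of cosine similarity here, and the names `cosine_similarity` and `classify_by_reference`, are assumptions for illustration. Feature vectors are assumed to be non-zero.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def classify_by_reference(query_feature, reference_features):
    """Return the category whose reference feature is most similar to the query.

    reference_features maps each category to the information feature that the
    first model part produced for that category's reference information.
    """
    return max(reference_features,
               key=lambda category: cosine_similarity(query_feature, reference_features[category]))

# Usage: two enrolled categories and one query feature.
references = {"object_a": [1.0, 0.0], "object_b": [0.0, 1.0]}
print(classify_by_reference([0.9, 0.1], references))
```

Only the feature extractor runs at inference time in this scheme, which is why the second model part is not needed.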

Therefore, according to the application mode, the second model part in the second model is not needed to participate, so that the more accurate classification of the information to be identified is realized, the complexity of the model structure in the actual application is further reduced, the performance requirement on the application equipment is further reduced, the actual application and popularization of the classification technology are facilitated, and the classification technology can be applied to wider scenes.

Next, a model application process of the present application will be described with respect to an actual application scenario of the biometric classification. In one possible implementation manner, the information to be classified may be biometric information, and the biometric information may be any biometric information, such as palm print information, iris information, fingerprint information, and the like, where the category corresponding to the information to be classified is a biometric object corresponding to the biometric information, that is, the classification process for the biometric information is to determine the biometric object corresponding to the biometric information.

For example, a plurality of biological objects may each enter their biometric information in advance, and the corresponding information features are obtained through the first model part. When a biological object later inputs biometric information, the information feature corresponding to that biometric information is determined through the first model part and compared with the information features of the plurality of biological objects, and the biological object whose information feature has the highest similarity is taken as the classification result of the input biometric information.

Specifically, in one possible implementation, the biometric information is palm print image information, which may be determined by:

The computer device may first obtain palm image information corresponding to the biological object, where the biological object may be a person or a non-person object, and the palm image information includes a palm of the biological object. The computer device can identify a target positioning point in the palm image information, wherein the target positioning point is a point with a specific position on the palm and is used for dividing an image dividing region with obvious palm print characteristic.

The computer device can determine the image division area corresponding to the palm image information according to the target positioning point, and can then segment the palm print image information from the palm image information through the image division area. The palm print image information effectively represents the palm print characteristics of the biological object, so the corresponding biological object can be effectively identified based on it.

Specifically, the computer apparatus may divide the image division areas in the palm image information in the following manner.

In one possible implementation, as shown in fig. 3, the target positioning points may include an index finger seam point, a middle finger seam point, and a ring finger seam point on the palm. The index finger seam point is a point on the finger seam between the index finger and the middle finger, the middle finger seam point is a point on the finger seam between the middle finger and the ring finger, and the ring finger seam point is a point on the finger seam between the ring finger and the little finger, each seam point being the point on its finger seam that is closest to the palm.

When determining the image division area corresponding to the palm image information according to the target positioning points, the computer device may, on the image plane corresponding to the palm image information, take the line through the index finger seam point and the ring finger seam point as the x-axis of a plane coordinate system, and take the line passing through the index finger seam point and perpendicular to the x-axis as the y-axis of the plane coordinate system, as shown in fig. 4.

The computer device may then determine, as the division center point, the point on the y-axis, on the palm-center side of the palm image information, whose distance from the coordinate origin equals the distance between the index finger seam point and the ring finger seam point, as shown in fig. 5. Finally, the computer device may determine the image division area by taking a target multiple of the distance between the index finger seam point and the ring finger seam point as the side length and the division center point as the center point, where the image division area is a square area. The target multiple may be determined according to actual requirements, for example, may be set to twelve times as five times, as shown in fig. 6. As can be seen from fig. 6, the image division area captures the palm prints on the palm fairly comprehensively while effectively excluding irrelevant information such as finger information, which helps the model analyze the palm print characteristics accurately.
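The geometric construction above can be sketched in a few lines. This sketch follows the text as written: the y-axis passes through the index finger seam point, so that point serves as the coordinate origin. The `palm_hint` argument (any pixel known to lie on the palm side of the x-axis) and the function name `palm_division_area` are hypothetical additions, used only to decide which direction along the y-axis points toward the palm center.

```python
import math

def palm_division_area(index_seam, ring_seam, palm_hint, target_multiple):
    """Return (center, side_length, corners) of the square image division area."""
    ix, iy = index_seam
    rx, ry = ring_seam
    d = math.hypot(rx - ix, ry - iy)           # distance between the two seam points
    ux, uy = (rx - ix) / d, (ry - iy) / d      # unit vector along the x-axis
    vx, vy = -uy, ux                           # unit vector along the y-axis
    # Flip the y-axis direction if it points away from the palm (assumed hint).
    if (palm_hint[0] - ix) * vx + (palm_hint[1] - iy) * vy < 0:
        vx, vy = -vx, -vy
    center = (ix + d * vx, iy + d * vy)        # division center point on the y-axis
    side = target_multiple * d                 # side length of the square area
    half = side / 2
    corners = [(center[0] + sx * half * ux + sy * half * vx,
                center[1] + sx * half * uy + sy * half * vy)
               for sx, sy in ((-1, -1), (1, -1), (1, 1), (-1, 1))]
    return center, side, corners
```

The square is aligned with the seam-point coordinate system rather than with the image axes, so the cropped region rotates with the hand.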

In order to facilitate understanding of the technical scheme provided by the application, a model training method provided by the application will be described next in conjunction with an actual application scenario.

Referring to fig. 7, fig. 7 is a flowchart of a model training method in an actual application scenario provided in the embodiment of the present application, where the model training device may be any one of the computer devices with a model training function, and the model application device may be any device capable of running a first model portion of the second model, for example, a palm print recognition terminal device. The method comprises the following steps:

S701: Train a first model using training samples.

The training sample is sample palm print image information with a corresponding biological object. The first model may adopt a convolution application network concept 101 model, with the face recognition algorithm arcface used as the loss function to supervise the training process of the first model. After training, the model parameters of the first model are no longer changed while the initial second model is trained through the first model.

S702: Determine, through the first model, the sample classification results respectively corresponding to a plurality of palm print image information, and determine, through the initial second model, the pending classification results respectively corresponding to the plurality of palm print image information.

The sample classification result is used for identifying an actual biological object corresponding to the palm print image information determined by the first model, and the pending classification result is used for identifying a biological object corresponding to the palm print image information determined by the initial second model. The following formula is shown:

where the quantities in the formula are: the sample information feature extracted by the first model; the first model part in the first model; the i-th palm print image information; the probability of the i-th biological object in the sample classification result determined by the first model; and the model parameter in the second model part of the first model used to determine the probability of the i-th biological object.

The second model may adopt a recognition model mobilefacenet, and a face recognition algorithm arcface is adopted as a loss function, and the determination process of the pending classification result is shown in the following formula:

where the quantities in the formula are: the pending information feature determined by the initial second model; the first model part in the initial second model; the probability of the i-th biological object in the pending classification result determined by the initial second model; and the model parameter in the second model part of the initial second model used to determine the probability of the i-th biological object.
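The formulas for S702 are not reproduced in this text. One form consistent with the definitions given is a feature extractor followed by a softmax over the n biological objects, sketched below. All symbols are assumptions chosen for illustration: x is one palm print image, F_t and F_s are the first model parts of the first model and the initial second model, w_{t,i} and w_{s,i} are the second-model-part parameters for the i-th biological object, and any margin term that arcface applies during training is omitted.

```latex
% First model: sample information feature and sample classification result
f_t = F_t(x), \qquad
p_i = \frac{\exp\left(w_{t,i}^{\top} f_t\right)}{\sum_{j=1}^{n} \exp\left(w_{t,j}^{\top} f_t\right)}

% Initial second model: pending information feature and pending classification result
f_s = F_s(x), \qquad
q_i = \frac{\exp\left(w_{s,i}^{\top} f_s\right)}{\sum_{j=1}^{n} \exp\left(w_{s,j}^{\top} f_s\right)}
```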

S703: Determine the reliability parameter corresponding to the sample classification result according to the probabilities, in the sample classification result, that the palm print image information corresponds to each biological object.

The computer device may take the largest probability in the sample classification result output by the first model as the reliability parameter of that sample classification result.

S704: Sort the plurality of palm print image information according to the reliability parameter, and determine the loss weights respectively corresponding to the plurality of palm print image information according to the sorting result.

As shown in fig. 8, the computer device may set the 25% of palm print image information with the highest reliability parameters as simple samples, the middle 70% as middle samples, and the last 5%, with the lowest reliability parameters, as difficult samples, thereby obtaining three palm print image information sets. The sample classification result set corresponding to the simple sample set is easy_isolated_logits_teacher, that corresponding to the middle sample set is middle_isolated_logits_teacher, and that corresponding to the difficult sample set is hard_isolated_logits_teacher.

According to the grouping of the palm print image information, the pending classification results determined by the initial second model can be grouped in the same way. The pending classification result set corresponding to the simple sample set is easy_resolved_logits_student, that corresponding to the middle sample set is middle_resolved_logits_student, and that corresponding to the difficult sample set is hard_resolved_logits_student.

The simple sample set, the middle sample set, and the difficult sample set each have a corresponding loss weight. Among the three, the loss weight corresponding to the simple sample set is the largest and the loss weight corresponding to the difficult sample set is the smallest.
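The ranking and grouping in S703 and S704 can be sketched as follows. The 25% / 70% / 5% split comes from the text; the function name `group_by_reliability`, the rounding rule for set sizes, and the representation of each sample classification result as a list of probabilities are assumptions for illustration.

```python
def group_by_reliability(sample_results, easy_fraction=0.25, hard_fraction=0.05):
    """Split sample indices into simple, middle, and difficult sets.

    sample_results is a list of probability lists output by the first model.
    The reliability parameter of each result is its largest probability.
    """
    order = sorted(range(len(sample_results)),
                   key=lambda i: max(sample_results[i]),
                   reverse=True)                      # most reliable first
    total = len(order)
    n_easy = round(total * easy_fraction)             # assumed rounding rule
    n_hard = round(total * hard_fraction)
    easy = order[:n_easy]
    middle = order[n_easy:total - n_hard]
    hard = order[total - n_hard:]
    return easy, middle, hard
```

The returned index lists can then be used to slice both the sample classification results and the pending classification results, so that the two models' outputs are grouped identically.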

S705: Determine a model loss parameter according to the loss weights and the differences between the sample classification results and the pending classification results corresponding to the palm print image information.

First, the computer device determines the loss parameters respectively corresponding to the three sample sets, as shown in the following formulas:

For the simple sample set, the difference parameter corresponding to the palm print image information characterizes the difference between the sample classification result and the pending classification result of the palm print image information in that set. It is defined in terms of the probability of the i-th biological object in the sample classification result, the probability of the i-th biological object in the pending classification result, and n, the total number of biological objects.

The difference parameters corresponding to the palm print image information in the middle sample set and in the difficult sample set are defined in the same way, each characterizing the difference between the sample classification result and the pending classification result of the palm print image information in its own set.

The model loss parameter is calculated as shown in the following formula:

where the formula also includes a constant parameter and a weight for that constant parameter. In this way, the differences of the plurality of palm print image information can be integrated according to their corresponding loss weights.
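A minimal sketch of how the three set-level differences might be combined is given below. The text does not state the difference measure, so KL divergence is used here as an assumption, the weight values in the usage note are hypothetical, and the constant term mentioned above is left out because its form is not given.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL divergence between a sample result p and a pending result q (assumed measure)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def set_loss(sample_results, pending_results):
    """Mean difference over one sample set; an empty set contributes zero."""
    if not sample_results:
        return 0.0
    total = sum(kl_divergence(p, q) for p, q in zip(sample_results, pending_results))
    return total / len(sample_results)

def model_loss(groups, loss_weights):
    """Weighted sum of the set losses.

    groups maps a set name to (sample_results, pending_results);
    loss_weights maps the same names to their loss weights.
    """
    return sum(loss_weights[name] * set_loss(sample, pending)
               for name, (sample, pending) in groups.items())
```

For example, with hypothetical weights `{"easy": 1.0, "middle": 0.5, "hard": 0.1}`, a mismatch on a simple sample contributes ten times as much to the loss as the same mismatch on a difficult sample, which matches the stated goal of learning more strongly from reliable classifications.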

S706: Adjust the model parameters corresponding to the initial second model according to the model loss parameter to obtain the second model.

S707: Acquire the palm print image information respectively corresponding to a plurality of biological objects.

After the second model is obtained through training in the above manner, it may be deployed on a model application device for use. The plurality of biological objects are the biological objects that can serve as biological object recognition results.

S708: palm print information features corresponding to the plurality of biological objects respectively are determined through the first model part in the second model.

Through the first model part in the second model, palm print information features can be determined based on the palm print image information, where the palm print information features are used to characterize the corresponding biological object.

S709: Acquire the palm print image information to be identified.

The palm print image information to be identified is the information whose corresponding biological object needs to be identified. A palm image of the biological object can be acquired through the terminal device, and the palm print image information is then extracted in the manner described above and input into the model.

S710: Determine, through the first model part, the target palm print information feature corresponding to the palm print image information to be identified.

S711: Determine the biological object whose palm print information feature has the highest similarity to the target palm print information feature, among the plurality of biological objects, as the biological object corresponding to the palm print image information to be identified.

The similarity calculation formula may be as follows:

where the formula relates the palm print information feature corresponding to any one of the plurality of biological objects, the target palm print information feature, and the similarity between the two palm print information features.
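The similarity formula itself is not reproduced in this text. A common choice for comparing feature vectors from models trained with arcface is cosine similarity, shown below as an assumption rather than as the formula of the application. The symbols f_a (palm print information feature of an enrolled biological object) and f_b (target palm print information feature) are introduced here for illustration.

```latex
\mathrm{sim}(f_a, f_b) = \frac{f_a \cdot f_b}{\lVert f_a \rVert \, \lVert f_b \rVert}
```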

Practical application verifies that the model obtained through the present application has a good recognition effect on palm print recognition. In the test, 2000 biological objects were used, each with 100 palm print pictures, for a total of 200,000 test pictures. The results are shown in the following table:

As can be seen from the table, the palm print recognition pass rate is high and the false recognition rate is low, so the recognition performance is good.

Based on the model training method provided by the foregoing embodiment, the present application further provides a model training device, referring to fig. 9, fig. 9 is a structural block diagram of a model training device provided by the embodiment of the present application, where the device includes a first obtaining unit 901, a first determining unit 902, a second determining unit 903, and a training unit 904:

the first obtaining unit 901 is configured to obtain a first model after training and an initial second model to be trained, where a model structure complexity of the first model is greater than that of the initial second model, the first model is configured to determine sample classification results corresponding to a plurality of pieces of information to be classified respectively, and a sample classification result corresponding to a target piece of information to be classified is configured to identify an actual category corresponding to the target piece of information to be classified, and the target piece of information to be classified is any one of the plurality of pieces of information to be classified;

the first determining unit 902 is configured to determine, by using the initial second model, a pending classification result corresponding to each of the plurality of to-be-classified information, where the pending classification result corresponding to the target to-be-classified information is used to identify a category corresponding to the target to-be-classified information determined by using the initial second model;

The second determining unit 903 is configured to determine a reliability parameter of a sample classification result corresponding to the plurality of pieces of information to be classified, where the reliability parameter is used to characterize the reliability of an actual category identified by the sample classification result;

the training unit 904 is configured to adjust the model parameters corresponding to the initial second model according to the differences between the sample classification results and the pending classification results respectively corresponding to the plurality of pieces of information to be classified, so as to obtain a second model, where the second model is configured to determine the actual category corresponding to information to be classified, and the strength with which the model parameters are adjusted according to the difference between the sample classification result and the pending classification result corresponding to the target information to be classified is positively correlated with the reliability represented by the reliability parameter of the sample classification result corresponding to the target information to be classified.

In a possible implementation manner, the classification result includes probabilities that the information to be classified corresponds to a plurality of categories, and the training unit 904 is specifically configured to:

In a possible implementation manner, the actual category corresponding to the target to-be-classified information identified by the sample classification result corresponding to the target to-be-classified information is a category corresponding to the highest probability in the sample classification result corresponding to the target to-be-classified information, and the second determining unit 903 is specifically configured to:

In one possible implementation, the training unit 904 is specifically configured to:

In one possible implementation, the first model is trained by:

In a possible implementation manner, the first model is specifically configured to extract a sample information feature corresponding to the information to be classified, and determine a sample classification result corresponding to the information to be classified according to the sample information feature, and the first determining unit 902 is specifically configured to:

the training unit 904 is specifically configured to:

In a possible implementation manner, the initial second model is composed of a first model part and a second model part, the first model part is used for determining the characteristics of the pending information corresponding to the information to be classified, the second model part is used for determining the pending classification result corresponding to the information to be classified according to the characteristics of the pending information, and the training unit 904 is specifically configured to:

acquiring palm image information corresponding to a biological object;

identifying a target positioning point in the palm image information;

The embodiment of the application further provides a computer device. Referring to fig. 10, the computer device may be a terminal device; a mobile phone is taken as an example of the terminal device:

Fig. 10 is a block diagram showing part of the structure of a mobile phone related to the terminal device provided in an embodiment of the present application. Referring to fig. 10, the mobile phone includes: radio frequency (Radio Frequency, RF) circuitry 710, memory 720, input unit 730, display unit 740, sensor 750, audio circuitry 760, wireless fidelity (Wireless Fidelity, WiFi) module 770, processor 780, and power supply 790. It will be appreciated by those skilled in the art that the handset structure shown in fig. 10 does not limit the handset, which may include more or fewer components than shown, may combine certain components, or may use a different arrangement of components.

The following describes the components of the mobile phone in detail with reference to fig. 10:

The RF circuit 710 may be configured to receive and transmit signals while sending and receiving messages or during a call. Specifically, it receives downlink information from a base station and passes it to the processor 780 for processing; in addition, it sends uplink data to the base station. Generally, RF circuitry 710 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (Low Noise Amplifier, LNA for short), a duplexer, and the like. In addition, the RF circuitry 710 may also communicate with networks and other devices via wireless communications. The wireless communication may use any communication standard or protocol, including but not limited to global system for mobile communications (Global System of Mobile communication, GSM for short), general packet radio service (General Packet Radio Service, GPRS for short), code division multiple access (Code Division Multiple Access, CDMA for short), wideband code division multiple access (Wideband Code Division Multiple Access, WCDMA for short), long term evolution (Long Term Evolution, LTE for short), email, short message service (Short Messaging Service, SMS for short), and the like.

The memory 720 may be used to store software programs and modules, and the processor 780 performs various functional applications and data processing of the handset by running the software programs and modules stored in the memory 720. The memory 720 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, application programs required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, phonebook, etc.) created according to the use of the handset, etc. In addition, memory 720 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.

The input unit 730 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the handset. Specifically, the input unit 730 may include a touch panel 731 and other input devices 732. The touch panel 731, also referred to as a touch screen, may collect touch operations performed by a user on or near it (e.g., operations performed on or near the touch panel 731 using any suitable object or accessory such as a finger or a stylus) and drive the corresponding connection device according to a preset program. Optionally, the touch panel 731 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 780, and can receive and execute commands from the processor 780. In addition, the touch panel 731 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. The input unit 730 may include other input devices 732 in addition to the touch panel 731. Specifically, the other input devices 732 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys), a trackball, a mouse, a joystick, and the like.

The display unit 740 may be used to display information input by a user or information provided to the user and various menus of the mobile phone. The display unit 740 may include a display panel 741, and optionally, the display panel 741 may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD) or an Organic Light-Emitting Diode (OLED) or the like. Further, the touch panel 731 may cover the display panel 741, and when the touch panel 731 detects a touch operation thereon or thereabout, the touch operation is transferred to the processor 780 to determine the type of touch event, and then the processor 780 provides a corresponding visual output on the display panel 741 according to the type of touch event. Although in fig. 10, the touch panel 731 and the display panel 741 are two separate components to implement the input and output functions of the mobile phone, in some embodiments, the touch panel 731 and the display panel 741 may be integrated to implement the input and output functions of the mobile phone.

The handset may also include at least one sensor 750, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor, wherein the ambient light sensor may adjust the brightness of the display panel 741 according to the brightness of ambient light, and the proximity sensor may turn off the display panel 741 and/or the backlight when the mobile phone is moved to the ear. As one type of motion sensor, the accelerometer can detect the magnitude of acceleration in all directions (generally three axes) and can detect the magnitude and direction of gravity when stationary. It can be used in applications that recognize the posture of the mobile phone (such as switching between landscape and portrait orientation, related games, and magnetometer posture calibration) and in vibration-recognition functions (such as a pedometer and tap detection). Other sensors with which the mobile phone may also be equipped, such as gyroscopes, barometers, hygrometers, thermometers, and infrared sensors, are not described further here.

The audio circuit 760, speaker 761, and microphone 762 may provide an audio interface between the user and the mobile phone. The audio circuit 760 may convert received audio data into an electrical signal and transmit it to the speaker 761, which converts it into a sound signal for output; conversely, the microphone 762 converts collected sound signals into electrical signals, which the audio circuit 760 receives and converts into audio data. The audio data is then output to the processor 780 for processing and sent via the RF circuit 710 to, for example, another mobile phone, or output to the memory 720 for further processing.

WiFi belongs to a short-distance wireless transmission technology, and a mobile phone can help a user to send and receive emails, browse webpages, access streaming media and the like through a WiFi module 770, so that wireless broadband Internet access is provided for the user. Although fig. 10 shows the WiFi module 770, it is understood that it does not belong to the essential constitution of the mobile phone, and can be omitted entirely as required within the scope of not changing the essence of the invention.

The processor 780 is the control center of the mobile phone. It connects the various parts of the entire mobile phone using various interfaces and lines, and performs the various functions of the mobile phone and processes data by running or executing software programs and/or modules stored in the memory 720 and calling data stored in the memory 720, thereby monitoring the mobile phone as a whole. Optionally, the processor 780 may include one or more processing units; preferably, the processor 780 may integrate an application processor, which primarily handles the operating system, user interface, applications, and the like, with a modem processor, which primarily handles wireless communication. It will be appreciated that the modem processor described above may also not be integrated into the processor 780.

The handset further includes a power supply 790 (e.g., a battery) for powering the various components. Preferably, the power supply may be logically connected to the processor 780 through a power management system, so that functions such as charging, discharging, and power consumption management are handled by the power management system.

Although not shown, the mobile phone may further include a camera, a bluetooth module, etc., which will not be described herein.

In this embodiment, the processor 780 included in the terminal device further has the following functions:

The embodiment of the present application further provides a server. Referring to fig. 11, fig. 11 is a block diagram of a server 800 provided in the embodiment of the present application. The server 800 may vary considerably depending on configuration or performance, and may include one or more central processing units (Central Processing Units, CPU for short) 822 (e.g., one or more processors), a memory 832, and one or more storage media 830 (e.g., one or more mass storage devices) storing application programs 842 or data 844. The memory 832 and the storage medium 830 may provide transitory or persistent storage. The program stored in the storage medium 830 may include one or more modules (not shown), each of which may include a series of instruction operations on the server. Further, the central processing unit 822 may be configured to communicate with the storage medium 830 and to execute, on the server 800, the series of instruction operations in the storage medium 830.

The server 800 may also include one or more power supplies 826, one or more wired or wireless network interfaces 850, one or more input/output interfaces 858, and/or one or more operating systems 841, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.

The steps performed by the server in the above embodiments may be based on the server structure shown in fig. 11.

The embodiments of the present application further provide a computer readable storage medium storing a computer program for executing any one of the model training methods described in the foregoing embodiments.

The present application further provides a computer program product comprising a computer program which, when run on a computer device, causes the computer device to perform the model training method of any of the above embodiments.
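As a rough illustration of the kind of computation such a computer program might perform, the following minimal sketch weights each sample's distillation loss by the reliability of the first model's classification result. It is a sketch under stated assumptions, not the implementation of the embodiments: the reliability parameter is assumed to be the highest probability in the first model's sample classification result, the loss weight is assumed to equal that reliability, the difference between classification results is assumed to be measured by KL divergence, and all function names (softmax, kl_divergence, weighted_distillation_loss) are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into a probability distribution."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete probability distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def weighted_distillation_loss(first_model_probs, second_model_probs):
    """Reliability-weighted average of per-sample KL divergences.

    Assumption: the reliability of a sample classification result is the
    highest class probability output by the first model, and the loss
    weight of the sample equals that reliability.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sample_result, pending_result in zip(first_model_probs, second_model_probs):
        weight = max(sample_result)  # hypothetical reliability parameter
        weighted_sum += weight * kl_divergence(sample_result, pending_result)
        weight_total += weight
    # Normalize by the total weight so the loss scale is batch-independent.
    return weighted_sum / weight_total


# Example: the first sample is classified confidently by the first model,
# the second is not, so the first contributes more to the loss.
first_model = [softmax([4.0, 0.5, 0.1]), softmax([1.0, 0.9, 0.8])]
second_model = [softmax([2.0, 1.0, 0.5]), softmax([0.2, 1.5, 0.3])]
loss = weighted_distillation_loss(first_model, second_model)
```

In this sketch, samples that the first model classifies with low confidence pull the second model's parameters less strongly, which is one simple way to realize the selective capability migration described in the embodiments; other weighting functions are equally possible.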

It will be appreciated that the specific embodiments of the present application involve related data such as user information (e.g., palm print image information). When the above embodiments of the present application are applied to specific products or technologies, user permission or consent must be obtained, and the collection, use, and processing of the related data must comply with the relevant laws, regulations, and standards of the relevant countries and regions.

Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a computer readable storage medium and, when executed, performs the steps of the above method embodiments. The aforementioned storage medium may be at least one of the following media capable of storing program code: read-only memory (ROM), RAM, a magnetic disk, an optical disk, and the like.

It should be noted that the embodiments in this specification are described in a progressive manner: identical and similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the apparatus and system embodiments are substantially similar to the method embodiments, they are described relatively briefly, and the relevant parts of the method embodiment descriptions may be consulted. The apparatus and system embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present application without creative effort.

The foregoing is merely one specific embodiment of the present application, but the protection scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered in the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A method of model training, the method comprising:

acquiring a first model after training and an initial second model to be trained, wherein the model structure complexity of the first model is greater than that of the initial second model, the first model is used for determining sample classification results corresponding to a plurality of pieces of information to be classified respectively, the sample classification results corresponding to target information to be classified are used for identifying actual categories corresponding to the target information to be classified, and the target information to be classified is any one of the plurality of pieces of information to be classified; the information to be classified is palm print image information, and the category corresponding to the information to be classified is a biological object corresponding to the palm print image information; the palmprint image information is determined by: acquiring palm image information corresponding to a biological object; identifying a target positioning point in the palm image information; determining an image dividing region corresponding to the palm image information according to the target positioning point; dividing the palm print image information from the palm image information through the image dividing region;

2. The method according to claim 1, wherein the classification result includes probabilities that the information to be classified corresponds to a plurality of categories, respectively, and the adjusting the model parameters corresponding to the initial second model according to differences between the sample classification result and the pending classification result corresponding to the plurality of information to be classified, respectively, includes:

3. The method according to claim 2, wherein the actual category corresponding to the target to-be-classified information identified by the sample classification result corresponding to the target to-be-classified information is a category corresponding to a highest probability in the sample classification results corresponding to the target to-be-classified information, and the determining the reliability parameter of the sample classification results respectively corresponding to the plurality of to-be-classified information includes:

4. The method according to claim 1, wherein adjusting the model parameters corresponding to the initial second model according to the differences between the sample classification results and the pending classification results corresponding to the plurality of pieces of information to be classified, respectively, comprises:

5. The method of claim 4, wherein determining the loss weights corresponding to the plurality of pieces of information to be classified respectively according to the confidence parameters of the sample classification results corresponding to the plurality of pieces of information to be classified respectively, comprises:

6. The method of claim 1, wherein the first model is trained by:

7. The method of claim 1, wherein the first model is specifically configured to extract a sample information feature corresponding to information to be classified, and determine a sample classification result corresponding to the information to be classified according to the sample information feature, and the determining pending classification results corresponding to the plurality of information to be classified respectively includes:

the adjusting the model parameters corresponding to the initial second model according to the differences between the sample classification results and the undetermined classification results respectively corresponding to the plurality of pieces of information to be classified comprises the following steps:

8. The method according to claim 7, wherein the initial second model is composed of a first model part and a second model part, the first model part is used for determining a feature of the undetermined information corresponding to the information to be classified, the second model part is used for determining a result of undetermined classification corresponding to the information to be classified according to the feature of the undetermined information, the model parameters corresponding to the initial second model are adjusted according to differences between the feature of the sample information and the feature of the undetermined information respectively corresponding to the plurality of information to be classified and differences between the result of the sample classification and the result of undetermined classification respectively corresponding to the plurality of information to be classified, and the method comprises:

9. The method of claim 7, wherein the second model is formed of a first model portion for determining information characteristics corresponding to the information and a second model portion for determining classification results corresponding to the information based on the information characteristics, the classification results for identifying actual categories corresponding to the information, the method further comprising:

acquiring information to be identified and a plurality of category reference information, wherein the category reference information respectively has a corresponding category;

determining information characteristics respectively corresponding to the information to be identified and the plurality of category reference information through the first model part;

according to the similarity between the information features corresponding to the information to be identified and the information features corresponding to the plurality of category reference information respectively, determining target category reference information with the highest similarity with the information to be identified in the information feature dimension in the plurality of category reference information;

and determining the category corresponding to the target category reference information as the category corresponding to the information to be identified.

10. The method according to claim 1, wherein the target positioning point includes an index finger seam point, a middle finger seam point, and a ring finger seam point on a palm, and the determining the image division area corresponding to the palm image information according to the target positioning point includes:

11. A model training device, characterized in that the device comprises a first acquisition unit, a first determination unit, a second determination unit and a training unit:

the first obtaining unit is configured to obtain a first model after training and an initial second model to be trained, where the model structure complexity of the first model is greater than that of the initial second model, the first model is configured to determine sample classification results corresponding to a plurality of pieces of information to be classified respectively, and the sample classification result corresponding to a target piece of information to be classified is used to identify an actual category corresponding to the target piece of information to be classified, and the target piece of information to be classified is any one of the plurality of pieces of information to be classified; the information to be classified is palm print image information, and the category corresponding to the information to be classified is a biological object corresponding to the palm print image information; the palmprint image information is determined by: acquiring palm image information corresponding to a biological object; identifying a target positioning point in the palm image information; determining an image dividing region corresponding to the palm image information according to the target positioning point; dividing the palm print image information from the palm image information through the image dividing region;

12. The apparatus according to claim 11, wherein the classification result includes probabilities that the information to be classified corresponds to a plurality of categories, respectively, and the training unit is specifically configured to:

13. The apparatus according to claim 12, wherein the actual category corresponding to the target to-be-classified information identified by the sample classification result corresponding to the target to-be-classified information is a category corresponding to a highest probability in the sample classification result corresponding to the target to-be-classified information, and the second determining unit is specifically configured to:

14. The device according to claim 11, characterized in that the training unit is specifically configured to:

15. The device according to claim 14, characterized in that the training unit is specifically configured to:

16. The apparatus of claim 11, wherein the first model is trained by:

17. The apparatus according to claim 11, wherein the first model is specifically configured to extract a sample information feature corresponding to the information to be classified, and determine a sample classification result corresponding to the information to be classified according to the sample information feature, and the first determining unit is specifically configured to:

the training unit is specifically configured to:

18. The apparatus according to claim 17, wherein the initial second model is composed of a first model part and a second model part, the first model part is used for determining the characteristics of the pending information corresponding to the information to be classified, the second model part is used for determining the pending classification result corresponding to the information to be classified according to the characteristics of the pending information, and the training unit is specifically configured to:

19. The apparatus of claim 17, wherein the second model is formed of a first model part for determining information characteristics corresponding to the information and a second model part for determining classification results corresponding to the information based on the information characteristics, the classification results for identifying actual categories corresponding to the information, the apparatus further comprising: a second acquisition unit, a third determination unit, a fourth determination unit, and a fifth determination unit;

20. The apparatus of claim 11, wherein the target positioning point includes an index finger seam point, a middle finger seam point, and a ring finger seam point on the palm, and the determining the image division area corresponding to the palm image information according to the target positioning point includes:

21. A computer device, the computer device comprising a processor and a memory:

the processor is configured to perform the model training method of any of claims 1-10 according to instructions in the computer program.

22. A computer readable storage medium, characterized in that the computer readable storage medium is for storing a computer program for executing the model training method according to any of the claims 1-10.
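
By way of illustration only, the identification steps recited in claim 9 (extracting information features with the first model part, comparing the feature of the information to be identified with the features of the category reference information, and selecting the category of the most similar reference) can be sketched as follows. This is a minimal sketch under stated assumptions: cosine similarity is assumed as the similarity measure (the claim does not specify one), and the names cosine_similarity, identify, and extract_features are hypothetical, with extract_features standing in for the first model part of the second model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def identify(query, references, extract_features):
    """Return (category, similarity) of the best-matching reference.

    query: the information to be identified.
    references: iterable of (category, reference_information) pairs.
    extract_features: hypothetical stand-in for the first model part,
        mapping a piece of information to its feature vector.
    """
    query_feature = extract_features(query)
    best_category, best_score = None, -math.inf
    for category, reference in references:
        score = cosine_similarity(query_feature, extract_features(reference))
        if score > best_score:
            best_category, best_score = category, score
    return best_category, best_score


# Usage with an identity "feature extractor", so the inputs are already
# feature vectors; the category labels are made up for the example.
references = [("object_a", [1.0, 0.0]), ("object_b", [0.0, 1.0])]
category, score = identify([0.9, 0.1], references, lambda v: v)
```

Because only the feature-extraction part is needed at identification time, new categories can be supported in such a sketch by adding reference entries, without retraining the classification part.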