WO2021164306A1 - Training method and apparatus for image classification model, computer device, and storage medium - Google Patents


Info

Publication number
WO2021164306A1
Authority
WO
WIPO (PCT)
Prior art keywords: image, parameter, objective function, value, sample image
Application number
PCT/CN2020/124324
Other languages: English (en), French (fr)
Inventors: 曾昱为, 王健宗, 瞿晓阳
Original Assignee: 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2021164306A1


Classifications

    • G06F18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting (Physics; Computing, calculating or counting; Electric digital data processing; Pattern recognition; Analysing; Design or setup of recognition systems or techniques, extraction of features in feature space, blind source separation)
    • G06F18/2415: Classification techniques relating to the classification model, based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06T7/0012: Biomedical image inspection (Image data processing or generation, in general; Image analysis; Inspection of images, e.g. flaw detection)
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining, for computer-aided diagnosis, e.g. based on medical expert systems (Healthcare informatics)

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a training method, device, computer equipment, and storage medium of an image classification model.
  • Medical image analysis has become an indispensable tool and technical means in medical research, clinical diagnosis, and treatment.
  • Deep learning, and deep convolutional neural networks in particular, has been widely applied to this field.
  • Medical image classification can be divided into image screening and target or lesion classification.
  • Image screening was one of the earliest applications of deep learning in medical image analysis.
  • The classification of targets or lesions can assist doctors in diagnosing diseases, for example by analyzing lung CT (Computed Tomography) images to classify a disease or its severity.
  • Image classification technology has achieved good results on natural images, where accuracy can easily reach 94% on ten-class tasks.
  • The inventor realized, however, that achieving this level of accuracy requires a large number of labeled samples.
  • The cost of acquiring annotated data is very high: after an imaging device acquires an image, professional doctors must spend considerable time annotating it to produce samples usable for deep learning.
  • A second approach is transfer learning.
  • The idea is to train on another large-scale data set to obtain network parameters as initial values, and then train on the target data set to optimize those parameters.
  • Because the learned features are specific to a particular training data set or recognition task, transfer learning may not give good results.
  • the embodiments of the present application provide a training method, device, computer equipment, and storage medium for an image classification model to solve the technical problem that the prior art cannot train a high-precision image classification model with fewer labeled samples.
  • An image classification model training method includes:
  • when the calculated similar entropy value is greater than the preset value, selecting the first objective function for the labeled sample images to train the first parameter of the feature extraction layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the first parameter of the feature extraction layer;
  • when the calculated similar entropy value is less than the preset value, selecting the first objective function for the labeled sample images to train the second parameter of the classification layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the second parameter of the classification layer;
  • alternately training the first parameter and the second parameter until the gradient of the target loss function is less than the preset value, then taking the value of the first parameter as the target parameter of the feature extraction layer and the value of the second parameter as the target parameter of the classification layer.
  • a training device for an image classification model comprising:
  • the sample image acquisition module is used to acquire the marked sample image in the marked sample data set, and obtain the unmarked sample image in the unmarked sample data set;
  • the similar entropy calculation module is used to calculate the similar entropy between the unlabeled sample image and one of the image prototypes output by the classification layer of the image classification model;
  • the function acquisition module is used to acquire the first objective function and the second objective function, and determine the objective loss function according to the first objective function and the second objective function;
  • the first training module is used to, when the calculated similar entropy value is greater than the preset value, select the first objective function for the labeled sample images to train the first parameter of the feature extraction layer of the image classification model, and select the second objective function for the unlabeled sample images to train the first parameter of the feature extraction layer;
  • the second training module is used to, when the calculated similar entropy value is less than the preset value, select the first objective function for the labeled sample images to train the second parameter of the classification layer of the image classification model, and select the second objective function for the unlabeled sample images to train the second parameter of the classification layer;
  • the target parameter acquisition module is used to alternately train the first parameter and the second parameter until the gradient of the target loss function is less than a preset value, and then use the value of the first parameter as the target parameter of the feature extraction layer and the value of the second parameter as the target parameter of the classification layer.
  • A computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; the processor implements the following steps when executing the computer-readable instructions:
  • when the calculated similar entropy value is greater than the preset value, the first objective function is selected for the labeled sample images to train the first parameter of the feature extraction layer of the image classification model, and the second objective function is selected for the unlabeled sample images to train the first parameter of the feature extraction layer;
  • when the calculated similar entropy value is less than the preset value, the first objective function is selected for the labeled sample images to train the second parameter of the classification layer of the image classification model, and the second objective function is selected for the unlabeled sample images to train the second parameter of the classification layer;
  • the first parameter and the second parameter are trained alternately until the gradient of the target loss function is less than a preset value; the value of the first parameter is then used as the target parameter of the feature extraction layer, and the value of the second parameter as the target parameter of the classification layer.
  • One or more readable storage media store computer-readable instructions; when the instructions are executed by one or more processors, the one or more processors execute the following steps:
  • when the calculated similar entropy value is greater than the preset value, the first objective function is selected for the labeled sample images to train the first parameter of the feature extraction layer of the image classification model, and the second objective function is selected for the unlabeled sample images to train the first parameter of the feature extraction layer;
  • when the calculated similar entropy value is less than the preset value, the first objective function is selected for the labeled sample images to train the second parameter of the classification layer of the image classification model, and the second objective function is selected for the unlabeled sample images to train the second parameter of the classification layer;
  • the first parameter and the second parameter are trained alternately until the gradient of the target loss function is less than a preset value; the value of the first parameter is then used as the target parameter of the feature extraction layer, and the value of the second parameter as the target parameter of the classification layer.
  • The training method, device, computer equipment, and storage medium for an image classification model proposed in this application use adversarial training to bring the classification results of the trained classification layer closer to the standard image prototypes. In training the image classification model, unlabeled target sample images are combined with a small number of labeled samples, which increases the number of effective training samples and gives the trained model a better classification effect.
  • FIG. 1 is a schematic diagram of an application environment of the image classification model training method in an embodiment of the present application;
  • FIG. 2 is a flowchart of a method for training an image classification model in an embodiment of the present application;
  • FIG. 3 shows the relationship between the network structure and the target loss function in an embodiment of the present application;
  • FIG. 4 is a partial flowchart of a method for training an image classification model in an embodiment of the present application;
  • FIG. 5 is a further flowchart of step S102 in FIG. 2 in an embodiment of the present application;
  • FIG. 6 is a schematic structural diagram of a training device for an image classification model in an embodiment of the present application;
  • FIG. 7 is a schematic diagram of a computer device in an embodiment of the present application.
  • the training method of the image classification model provided in this application can be applied in the application environment as shown in FIG. 1.
  • the computer equipment includes, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, etc.
  • a method for training an image classification model is provided. Taking the method applied to the computer device in FIG. 1 as an example for description, the method includes the following steps S101 to S106.
  • The labeled sample data set includes labeled public data and a small amount of labeled target data.
  • The small amount of labeled target data is, for example, a human lung CT image that a doctor has confirmed and labeled as a prototype of a certain disease.
  • The labeled public data are, for example, human lung CT images labeled by doctors with prototypes of relevant diseases.
  • The unlabeled sample data set contains a number of unlabeled target images, such as CT images that belong to a certain type of disease prototype but carry no label.
  • FIG. 4 is a partial flowchart of the training method of the image classification model in an embodiment of the present application. After the step S101 and before the following step S102, the training method of the image classification model further includes the following Steps S301 and S302.
  • a nonlinear operation is performed on the gray value of the unlabeled sample image, so that the output gray value of the unlabeled sample image has an exponential relationship with the original gray value.
  • the non-linear operation is performed on the gray value of the labeled sample image and the gray value of the unlabeled sample image respectively, that is, Gamma transformation is performed on the image, and Gamma transformation is a nonlinear operation on the gray value of the input image.
  • V_out = A * V_in^γ
  • The exponent in this formula is the Gamma value "γ".
  • The input gray value V_in must lie in the range 0 to 1, so the gray values are normalized first and then the exponent is applied.
  • Gamma transformation enhances the detail in the dark parts of an image. Put simply, the nonlinear transformation maps the image's linear response to exposure intensity into a response closer to what the human eye perceives, correcting images that would otherwise be washed out or too dark.
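  • As an illustration, the Gamma transformation described above can be sketched in NumPy. The function name and the 8-bit (0 to 255) gray range are assumptions for the example; the operation itself is exactly V_out = A * V_in^γ applied after normalization:

```python
import numpy as np

def gamma_transform(image, gamma=0.5, A=1.0):
    """Gamma transformation: V_out = A * V_in ** gamma.

    Gray values are normalized to [0, 1] before the exponent is
    applied, then rescaled back to the 0-255 range. gamma < 1
    brightens dark regions (enhances dark detail); gamma > 1
    darkens the image.
    """
    v_in = image.astype(np.float64) / 255.0    # normalize first
    v_out = A * np.power(v_in, gamma)          # nonlinear gray-value operation
    return np.clip(v_out * 255.0, 0, 255).astype(np.uint8)

# With gamma < 1, a dark 8-bit gray value is lifted toward the mid-range:
dark = np.array([[25]], dtype=np.uint8)
brightened = gamma_transform(dark, gamma=0.5)
```

This is the dark-detail enhancement described above: low gray values gain more from the power curve than high ones, while pure white (255) maps back to itself.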
  • The classification layer module adds a K-way linear classification layer with a randomly initialized weight matrix W.
  • Each column of W is regarded as the prototype of one class: w_1 is the prototype of the first disease class, w_2 of the second, and w_n of the n-th lesion class; a prototype is the representative of its class.
  • The output of the feature extraction layer is fed into the softmax classification layer, giving the probability of each sample being assigned to each class. Different objective functions are designed according to whether the public data set and the target-domain data set are labeled.
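  • The layer just described (a K-way linear layer whose randomly initialized weight matrix W holds one prototype column per class, followed by softmax) can be sketched as below. The class name, initialization scale, and helper names are illustrative assumptions; only the forward pass is shown:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class PrototypeClassifier:
    """K-way linear classification layer; column w_k of the randomly
    initialized weight matrix W acts as the prototype of class k."""

    def __init__(self, feature_dim, num_classes, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.01, size=(feature_dim, num_classes))

    def predict_proba(self, features):
        # features: (batch, feature_dim) output of the feature extraction layer
        return softmax(features @ self.W)

clf = PrototypeClassifier(feature_dim=8, num_classes=3)
probs = clf.predict_proba(np.ones((2, 8)))   # each row sums to 1
```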
  • step S102 further includes:
  • the probability is brought into the second objective function, and the similar entropy value of the unlabeled sample image and the output of the classification layer is calculated through the second objective function.
  • The second objective function H2 is:
    H2 = -E[ sum_{k=1..n} p(y_k | x) * log p(y_k | x) ]
  • where n represents the total number of image prototypes, p(y_k | x) represents the probability that the sample image x is predicted to be the k-th image prototype, and E denotes the average over the training data batch.
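  • The conditional entropy H2 can be computed from a batch of softmax outputs as below; the function name is an assumption, and the small eps term guards the logarithm at zero probabilities:

```python
import numpy as np

def conditional_entropy(probs, eps=1e-12):
    """H2 = -E[ sum_k p(y_k|x) * log p(y_k|x) ], averaged over the batch.

    probs: (batch, n) softmax outputs, one row per unlabeled sample,
    one column per image prototype.
    """
    per_sample = -np.sum(probs * np.log(probs + eps), axis=1)
    return per_sample.mean()   # E: average over the training batch

uniform = np.full((4, 3), 1.0 / 3.0)      # maximally uncertain -> ln(3)
confident = np.eye(3)[np.zeros(4, int)]   # one-hot rows -> entropy ~ 0
```

A uniform prediction gives the maximum value ln(n); a confident one-hot prediction gives a value near zero, which is why maximizing or minimizing this quantity pushes unlabeled samples away from or toward the prototypes.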
  • FIG. 5 is a further flowchart of step S102 in FIG. 2 of the embodiment of the present application. Further, as shown in FIG. 5, the step of calculating the similar entropy between the unlabeled sample image and one of the image prototypes output by the classification layer of the image classification model includes the following steps S401 to S403:
  • S401: Extract the second feature of the unlabeled sample image through the feature extraction layer.
  • S402: Input the second feature into the classification layer to obtain the probability that the unlabeled sample image is predicted to be the k-th image prototype.
  • S403: Substitute the probability into the second objective function to calculate the similar entropy value between the unlabeled sample image and the output of the classification layer.
  • this step S103 further includes:
  • The objective loss function is calculated by combining the first and second objective functions, where H represents the objective loss function, H1 represents the first objective function, and H2 represents the second objective function.
  • The maximum correlation entropy (correntropy) is used as the objective function to train the feature extraction layer and the classification layer.
  • The correntropy value quantifies the similarity between two random variables A and B; the correntropy of A and B is shown in formula (1):
  • Applying the correntropy formula to the training of labeled data gives the first objective function H1, in which:
  • p(x_i) represents the probability that the labeled sample image x is predicted to be the i-th image prototype;
  • represents the preset value;
  • y_i represents the true value of image x belonging to the i-th image prototype;
  • n represents the total number of image prototype categories.
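  • Formula (1) itself is not reproduced in this text. Correntropy is, however, commonly defined with a Gaussian kernel, V_sigma(A, B) = E[exp(-(a - b)^2 / (2 * sigma^2))]; the sketch below uses that common definition, so the kernel choice, the name sigma, and the function name are assumptions rather than the patent's exact formula:

```python
import numpy as np

def correntropy(a, b, sigma=1.0):
    """Gaussian-kernel correntropy estimate V(A, B) = E[ k_sigma(a - b) ].

    A value near 1 means the two variables are highly similar;
    maximizing it over labeled pairs (prediction, ground truth)
    is the maximum-correntropy training criterion."""
    k = np.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))  # Gaussian kernel per sample
    return k.mean()                                    # sample estimate of E[.]

identical = correntropy(np.ones(5), np.ones(5))   # identical variables -> 1.0
```

Larger sigma makes the kernel flatter, so mismatches are penalized more gently; smaller sigma makes the criterion more sensitive to outliers.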
  • The p(x) can be obtained by the following formula (2), where F(x) is the feature extracted by the feature extraction layer and W represents the weight vector:
  • the step of obtaining the prediction result that the labeled sample image x is predicted to be the i-th type image prototype includes:
  • the extracted first feature is input to the classification layer for classification, and the prediction result that the labeled sample image is predicted to be the i-th type image prototype is obtained.
  • the preset value is 0.
  • The relationship between the network structure and the first objective function H1 is shown in FIG. 3.
  • The feature extraction layer of the inter-data adaptive-fusion model performs transfer learning on the ResNet50 network, fine-tuning the network structure and parameters so that hidden, multi-level disease classification features are learned and extracted automatically.
  • The network introduces skip connections, which make back-propagation of the gradient easier and allow deeper networks to be trained effectively.
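  • In its simplest form, a skip connection adds the input back onto the transformed branch, output = F(x) + x, so the identity path carries gradients directly to earlier layers. The following minimal forward-pass sketch is not the actual ResNet50 block (which uses convolutions and batch normalization); it only illustrates the mechanism:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """Minimal residual (skip-connection) block, forward pass only:
    the input x is added back onto the transformed branch, so the
    identity path lets gradients flow directly to earlier layers."""
    branch = relu(x @ W1) @ W2   # the learned transformation F(x)
    return relu(branch + x)      # skip connection: F(x) + x

# With zero weights the block reduces to the identity for non-negative x,
# which is exactly why stacking many such blocks stays trainable:
x = np.array([[1.0, 2.0]])
W_zero = np.zeros((2, 2))
out = residual_block(x, W_zero, W_zero)   # equals x
```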
  • Because the model is trained only on the public data and a small part of the target data, it cannot learn the distinctive features of the entire target data set. For unlabeled target instances it is therefore necessary to train the first parameter of the feature extraction layer according to the second objective function, maximizing the conditional entropy.
  • The first objective function is selected for the labeled sample images to train the second parameter of the classification layer of the image classification model, and the second objective function is selected for the unlabeled sample images to train the second parameter of the classification layer.
  • the preset value is 0.
  • In step S104, when the similar entropy value is less than the preset value, the first objective function is selected for the labeled sample images to train the second parameter of the classification layer; for unlabeled target instances, the classification layer is trained according to the second objective function by minimizing the conditional entropy.
  • Training the classifier by maximizing the conditional entropy (that is, when the similar entropy value is greater than the preset value) and training the feature extractor by minimizing the conditional entropy (that is, when the similar entropy value is less than the preset value) greatly reduces the distance between the prototypes and the unlabeled target data, so that distinctive features are extracted and the number of effective training samples increases.
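  • The alternation rule can be summarized as a small dispatch: labeled samples always use the first objective function and unlabeled samples the second, while the comparison of the similar entropy against the preset value decides which parameter group is updated. The string labels below are hypothetical names, not identifiers from the original:

```python
def select_update(similar_entropy, preset_value, labeled):
    """Decide which objective function trains which parameter group,
    following the alternation described above.

    labeled samples -> first objective function (H1);
    unlabeled samples -> second objective function (H2);
    entropy above the preset value -> update the feature extraction
    layer (first parameter); below it -> update the classification
    layer (second parameter)."""
    objective = "H1" if labeled else "H2"
    if similar_entropy > preset_value:
        params = "feature_extraction_layer"   # first parameter
    elif similar_entropy < preset_value:
        params = "classification_layer"       # second parameter
    else:
        params = None                         # boundary case: skip this batch
    return objective, params
```

A full training step would compute the gradient of the chosen objective with respect to the chosen parameter group only, leaving the other group frozen, and alternate until the gradient of the target loss function drops below the preset value.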
  • the labeled sample data set and the unlabeled sample data set are used for semi-supervised training, so that the unlabeled sample images realize adaptive fusion between data.
  • the Resnet50 deep convolutional neural network trained on the ImageNet data set is fine-tuned to extract the features of the lung CT image, and the prediction probability is obtained through the Softmax classification layer.
  • Different objective functions are designed: for labeled data, the first objective function serves as the loss function to train the feature extraction layer and classification layer of the image classification model; for unlabeled data, the method of maximizing and minimizing conditional entropy is used, with the second objective function as the loss function to train the feature extraction layer and classification layer.
  • By alternately maximizing the conditional entropy when training the classifier and minimizing it when training the feature extractor, high-precision classification of unlabeled data is finally achieved.
  • The mechanism of meta-learning is task generality: facing different tasks, there is no need to build different models; the same learning algorithm can solve many different tasks.
  • A model has a learnable parameter θ; faced with different tasks, the corresponding task can be solved by changing the value of θ.
  • The value of θ can be learned by the meta-learner.
  • The value of θ is continuously updated by gradient descent according to the loss function, so that the model steadily approaches one that can solve the task.
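  • The gradient-descent update, together with the stopping criterion used throughout this method (stop once the gradient magnitude falls below a preset value), can be sketched on a toy quadratic loss; the function and parameter names are illustrative:

```python
def gradient_descent(grad_fn, theta, lr=0.1, preset_value=1e-6, max_steps=10_000):
    """Update theta by gradient descent until the gradient magnitude
    drops below the preset value, the same kind of stopping criterion
    the training method applies to the gradient of its target loss."""
    for _ in range(max_steps):
        g = grad_fn(theta)
        if abs(g) < preset_value:   # gradient small enough: converged
            break
        theta -= lr * g             # gradient-descent parameter update
    return theta

# Toy loss L(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3);
# the iterate converges to the minimizer theta = 3:
theta_star = gradient_descent(lambda t: 2.0 * (t - 3.0), theta=0.0)
```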
  • This embodiment also establishes a cross-domain transfer method between different lung CT image sample data sets. Through cross-domain transfer from labeled public data to unlabeled clinical data, supplemented by a very small amount of labeled clinical data, the method achieves high-precision identification of large volumes of unlabeled clinical data, with high sensitivity and specificity, and the domain-adaptive model has good generalization ability.
  • The image classification model training method proposed in this embodiment calculates the similar entropy value between an unlabeled sample image and one of the image prototypes output by the classification layer of the image classification model.
  • When the calculated similar entropy value is greater than the preset value, the first objective function is selected for the labeled sample images to train the first parameters of the feature extraction layer, and the second objective function is selected for the unlabeled sample images to train the first parameters of the feature extraction layer.
  • When the calculated similar entropy value is less than the preset value, the first objective function is selected for the labeled sample images to train the second parameters of the classification layer, and the second objective function is selected for the unlabeled sample images to train the second parameters of the classification layer.
  • The first and second parameters are trained alternately until the gradient of the target loss function is less than the preset value; the value of the first parameter is then used as the target parameter of the feature extraction layer, and the value of the second parameter as the target parameter of the classification layer.
  • The objective functions used to train the feature extraction layer and the classification layer differ.
  • This adversarial training brings the classification results of the trained classification layer closer to the standard image prototypes, so that the image classification model is trained from few labeled samples combined with unlabeled target sample images, increasing the number of effective training samples and giving the trained model a better classification effect.
  • A training device for an image classification model is provided, corresponding one-to-one to the training method in the above embodiment.
  • The training device 100 for the image classification model includes a sample image acquisition module 11, a similar entropy calculation module 12, a function acquisition module 13, a first training module 14, a second training module 15, and a target parameter acquisition module 16.
  • the detailed description of each functional module is as follows:
  • the sample image acquisition module 11 is configured to acquire annotated sample images in the annotated sample data set, and acquire unannotated sample images in the unlabeled sample data set.
  • the similar entropy calculation module 12 is used to calculate the similar entropy between the unlabeled sample image and one of the image prototypes output by the classification layer of the image classification model.
  • the similar entropy value calculation module 12 further includes:
  • the second feature extraction unit is configured to extract the second feature of the unlabeled sample image through the feature extraction layer;
  • a probability prediction unit configured to input the second feature into the classification layer to obtain the probability that the sample image is predicted to be the k-th image prototype
  • the entropy value output unit is configured to substitute the probability into the second objective function and calculate, through the second objective function, the similar entropy value between the unlabeled sample image and the output of the classification layer.
  • the function acquisition module 13 is configured to acquire the first objective function and the second objective function, and determine the objective loss function according to the first objective function and the second objective function.
  • The first objective function is:
  • p(x_i) represents the probability that the labeled sample image x is predicted to be the i-th image prototype;
  • represents the preset value;
  • y_i represents the true value of image x belonging to the i-th image prototype;
  • n represents the total number of image prototype categories.
  • The second objective function H2 is:
    H2 = -E[ sum_{k=1..n} p(y_k | x) * log p(y_k | x) ]
  • where n represents the total number of image prototypes, p(y_k | x) represents the probability that the sample image x is predicted to be the k-th image prototype, and E denotes the average over the training data batch.
  • The function acquisition module 13 specifically includes:
  • the first feature extraction unit, configured to extract the first feature of the labeled sample image through the feature extraction layer of the image classification model;
  • the result prediction unit, used to input the extracted first feature into the classification layer for classification and obtain the prediction result that the labeled sample image is predicted to be the i-th image prototype.
  • The first training module 14 is used to, when the calculated similar entropy value is greater than the preset value, select the first objective function for the labeled sample images to train the first parameter of the feature extraction layer, and select the second objective function for the unlabeled sample images to train the first parameter of the feature extraction layer.
  • The second training module 15 is used to, when the calculated similar entropy value is less than the preset value, select the first objective function for the labeled sample images to train the second parameter of the classification layer, and select the second objective function for the unlabeled sample images to train the second parameter of the classification layer.
  • The target parameter acquisition module 16 is configured to alternately train the first parameter and the second parameter until the gradient of the target loss function is less than a preset value, and then use the value of the first parameter as the target parameter of the feature extraction layer and the value of the second parameter as the target parameter of the classification layer.
  • The function acquisition module 13 is specifically configured to calculate the target loss function through the following formula, where H represents the objective loss function, H1 represents the first objective function, and H2 represents the second objective function.
  • the training device 100 for the image classification model further includes:
  • the first operation unit is configured to perform a non-linear operation on the gray value of the marked sample image, so that the output gray value of the marked sample image has an exponential relationship with the original gray value;
  • the second operation unit is used to perform a non-linear operation on the gray value of the unlabeled sample image, so that the output gray value of the unlabeled sample image has an exponential relationship with the original gray value.
  • The use of "first" and "second" in the above modules/units is only to distinguish different modules/units, and is not used to indicate that any module/unit has higher priority or any other limiting meaning.
  • the terms “including” and “having” and any variations of them are intended to cover non-exclusive inclusions.
  • A process, method, system, product, or device that includes a series of steps or modules is not necessarily limited to those clearly listed; it may include other steps or modules that are not clearly listed or that are inherent to the process, method, product, or equipment.
  • The division of modules in this application is only a logical division; other division methods are possible in practical implementations.
  • each module in the training device for the image classification model described above can be implemented in whole or in part by software, hardware, and a combination thereof.
  • the above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
  • a computer device is provided.
  • the computer device may be a terminal, and its internal structure diagram may be as shown in FIG. 7.
  • the computer equipment includes a processor, a memory, a network interface, a display screen and an input device connected through a system bus.
  • the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and computer readable instructions.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the network interface of the computer device is used to communicate with an external server through a network connection. When the computer-readable instructions are executed by the processor, an image classification model training method is realized.
  • A computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor.
  • When the processor executes the computer-readable instructions, the steps of the image classification model training method in the above embodiment are implemented, for example steps S101 to S106 shown in FIG. 2, as well as extensions of the method and of the related steps.
  • Alternatively, when the processor executes the computer-readable instructions, the functions of the modules/units of the training device for the image classification model in the foregoing embodiment are implemented, for example the functions of modules 11 to 16 shown in FIG. 6.
  • When the processor executes the computer-readable instructions, the following steps are implemented:
  • When the calculated similar entropy value is greater than the preset value, the first objective function is selected for the labeled sample images to train the first parameter of the feature extraction layer of the image classification model, and the second objective function is selected for the unlabeled sample images to train the first parameter of the feature extraction layer of the image classification model.
  • When the calculated similar entropy value is less than the preset value, the first objective function is selected for the labeled sample images to train the second parameter of the classification layer of the image classification model, and the second objective function is selected for the unlabeled sample images to train the second parameter of the classification layer of the image classification model.
  • The first parameter and the second parameter are trained alternately until the gradient of the target loss function is less than a preset value, at which point the value of the first parameter is used as the target parameter of the feature extraction layer and the value of the second parameter is used as the target parameter of the classification layer.
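The alternating procedure described in the bullets above can be sketched in a few lines of code. Everything here is a hypothetical stand-in (the toy parameters, the stand-in objective functions, the learning rate, and the `entropy` field on each sample); it only illustrates the control flow of switching between feature-layer and classifier-layer updates and stopping once the gradient falls below the preset value, not the patent's actual implementation.

```python
def train_step(params, objective, lr=0.1):
    # One toy gradient step: nudge every parameter against the loss "gradient".
    grad = objective(params)
    return [p - lr * grad for p in params], abs(grad)

def alternate_training(unlabeled, preset=0.05, max_iters=100):
    feature_params = [0.5, -0.3]    # first parameter (feature extraction layer)
    classifier_params = [0.2, 0.1]  # second parameter (classification layer)
    h1 = lambda ps: sum(ps) * 0.1   # stand-in for the first objective (labeled data)
    h2 = lambda ps: sum(ps) * 0.05  # stand-in for the second objective (unlabeled data)
    for sample in unlabeled * max_iters:
        if sample["entropy"] > preset:
            # Similar entropy above the preset value: train the feature extraction layer.
            feature_params, g1 = train_step(feature_params, h1)
            feature_params, g2 = train_step(feature_params, h2)
        else:
            # Similar entropy below the preset value: train the classification layer.
            classifier_params, g1 = train_step(classifier_params, h1)
            classifier_params, g2 = train_step(classifier_params, h2)
        if max(g1, g2) < preset:
            break  # gradient of the loss fell below the preset value: stop
    return feature_params, classifier_params

features, classifier = alternate_training([{"entropy": 0.2}, {"entropy": 0.01}])
```

The point of the sketch is the branching on the entropy value and the shared stopping criterion; any real implementation would replace the toy objectives with the H1 and H2 losses computed on actual batches.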
  • The processor further implements the following steps when executing the computer-readable instructions:
  • The target loss function is calculated by the following formula:
  • H = -H1 ± H2
  • where H represents the target loss function, H1 represents the first objective function, and H2 represents the second objective function; the sign of the H2 term is positive when the similar entropy value is greater than 0 and negative when it is less than 0.
  • In the first objective function, p(x_i) represents the prediction result of the labeled sample image x being predicted as the i-th class of image prototype, σ represents the preset value, y_i represents the true value of image x belonging to the i-th class of image prototype, and n represents the total number of image-prototype classes.
  • The processor further implements the following steps when executing the computer-readable instructions:
  • The extracted first feature is input to the classification layer for classification, and the prediction result that the labeled sample image is predicted as the i-th class of image prototype is obtained.
  • In the second objective function, n represents the total number of image-prototype classes and p(y=k|x) represents the probability that the sample image x is predicted as the k-th class of image prototype.
  • The processor further implements the following steps when executing the computer-readable instructions:
  • The probability is substituted into the second objective function, and the similar entropy value between the unlabeled sample image and the output of the classification layer is calculated by the second objective function.
  • The processor further implements the following steps when executing the computer-readable instructions:
  • A non-linear operation is performed on the gray values of the labeled sample image, so that the output gray value of the labeled sample image has an exponential relationship with the original gray value; a non-linear operation is likewise performed on the gray values of the unlabeled sample image, so that its output gray value has an exponential relationship with the original gray value.
  • The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on.
  • The general-purpose processor may be a microprocessor, or the processor may be any conventional processor; the processor is the control center of the computer device and connects the various parts of the entire computer device through various interfaces and lines.
  • The memory may be used to store the computer-readable instructions and/or modules; by running or executing the computer-readable instructions and/or modules stored in the memory and calling the data stored in the memory, the processor realizes the various functions of the computer device.
  • The memory may mainly include a program storage area and a data storage area.
  • The program storage area may store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area may store data created according to the use of the device, such as a mobile phone (for example, audio data and video data).
  • The memory may be integrated in the processor or provided separately from the processor.
  • One or more readable storage media storing computer-readable instructions are provided; the media may be non-volatile or volatile. When the computer-readable instructions are executed by one or more processors, the one or more processors are caused to execute the steps of the training method for the image classification model in the above embodiment, such as steps 101 to 106 shown in FIG. 2, together with other extensions of the method and of its related steps.
  • Alternatively, when the computer-readable instructions are executed by the processor, the functions of the modules/units of the training apparatus for the image classification model in the above embodiment are realized, for example the functions of modules 11 to 16 shown in FIG. 6.
  • The one or more processors execute the following steps:
  • When the calculated similar entropy value is greater than the preset value, the first objective function is selected for the labeled sample images to train the first parameter of the feature extraction layer of the image classification model, and the second objective function is selected for the unlabeled sample images to train the first parameter of the feature extraction layer of the image classification model.
  • When the calculated similar entropy value is less than the preset value, the first objective function is selected for the labeled sample images to train the second parameter of the classification layer of the image classification model, and the second objective function is selected for the unlabeled sample images to train the second parameter of the classification layer of the image classification model.
  • The first parameter and the second parameter are trained alternately until the gradient of the target loss function is less than a preset value, at which point the value of the first parameter is used as the target parameter of the feature extraction layer and the value of the second parameter is used as the target parameter of the classification layer.
  • The one or more processors further execute the following steps:
  • The target loss function is calculated by the following formula:
  • H = -H1 ± H2
  • where H represents the target loss function, H1 represents the first objective function, and H2 represents the second objective function; the sign of the H2 term is positive when the similar entropy value is greater than 0 and negative when it is less than 0.
  • In the first objective function, p(x_i) represents the prediction result of the labeled sample image x being predicted as the i-th class of image prototype, σ represents the preset value, y_i represents the true value of image x belonging to the i-th class of image prototype, and n represents the total number of image-prototype classes.
  • The one or more processors further execute the following steps:
  • The extracted first feature is input to the classification layer for classification, and the prediction result that the labeled sample image is predicted as the i-th class of image prototype is obtained.
  • In the second objective function, n represents the total number of image-prototype classes and p(y=k|x) represents the probability that the sample image x is predicted as the k-th class of image prototype.
  • The one or more processors further execute the following steps:
  • The probability is substituted into the second objective function, and the similar entropy value between the unlabeled sample image and the output of the classification layer is calculated by the second objective function.
  • The one or more processors further execute the following steps:
  • A non-linear operation is performed on the gray values of the labeled sample image, so that the output gray value of the labeled sample image has an exponential relationship with the original gray value; a non-linear operation is likewise performed on the gray values of the unlabeled sample image, so that its output gray value has an exponential relationship with the original gray value.
  • In the training method, apparatus, computer device, and storage medium for the image classification model proposed in this embodiment, the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer is calculated. When the calculated similar entropy value is greater than the preset value, the first objective function is selected for the labeled sample images to train the first parameter of the feature extraction layer of the image classification model, and the second objective function is selected for the unlabeled sample images to train the first parameter of the feature extraction layer.
  • When the calculated similar entropy value is less than the preset value, the first objective function is selected for the labeled sample images to train the second parameter of the classification layer, and the second objective function is selected for the unlabeled sample images to train the second parameter of the classification layer. The first parameter and the second parameter are trained alternately until the gradient of the target loss function is less than the preset value, at which point the value of the first parameter is used as the target parameter of the feature extraction layer and the value of the second parameter is used as the target parameter of the classification layer.
  • Depending on whether the similar entropy value is greater than or less than the preset value, different objective functions are used to train the feature extraction layer and the classification layer. This adversarial training brings the classification results of the trained classification layer closer to the standard image prototypes, so that the image classification model can be trained from fewer labeled samples combined with unlabeled target sample images, increasing the number of effective training samples and giving the trained model a better classification effect.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Evolutionary Biology (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Quality & Reliability (AREA)
  • Radiology & Medical Imaging (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses a training method for an image classification model, applied in the field of artificial intelligence technology, for solving the technical problem that the prior art cannot train a high-accuracy image classification model from a small number of labeled samples. The method provided by this application includes: obtaining labeled sample images and unlabeled sample images; calculating the similar entropy value between an unlabeled sample image and one class of image prototype output by the classification layer; determining a target loss function according to a first objective function and a second objective function; and, according to the magnitude of the calculated similar entropy value, selecting the first objective function for the labeled sample images and the second objective function for the unlabeled sample images to alternately train the first parameter of the feature extraction layer and the second parameter of the classification layer, until the gradient of the target loss function is less than a preset value, at which point the value of the first parameter is used as the target parameter of the feature extraction layer and the value of the second parameter is used as the target parameter of the classification layer.

Description

Training method and apparatus for an image classification model, computer device, and storage medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on September 17, 2020, with application number 202010979940.5 and entitled "Training method and apparatus for an image classification model, computer device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence technology, and in particular to a training method and apparatus for an image classification model, a computer device, and a storage medium.
Background
With the continuous development of medical imaging and computer technology, medical image analysis has become an indispensable tool in medical research, clinical disease diagnosis, and treatment. In recent years, deep learning, and in particular deep convolutional neural networks, has rapidly become a research hotspot in medical image analysis; it can automatically extract implicit diagnostic features from large volumes of medical image data. Medical image classification can be divided into image screening and target or lesion classification. Image screening was one of the earliest applications of deep learning in medical image analysis; target or lesion classification can assist doctors in diagnosing diseases, for example analyzing whether a lung CT (Computed Tomography) image shows a certain disease, or grading its severity.
Image classification has achieved excellent results in the natural image domain, where accuracy on 10-class tasks can easily reach 94%. However, the inventors realized that reaching this level requires a large number of labeled samples. Especially in the medical imaging domain, the cost of obtaining labeled data is very high: after the imaging equipment acquires the images, professional doctors must spend a great deal of time labeling them before samples usable for deep learning are obtained.
When data is scarce, existing methods offer two solutions:
The first is data augmentation, which produces more images through rotation, translation, deformation, and other transformations. Because the generated images still derive from the original images, they do not differ substantially from them, so augmentation does little to increase the amount of effective sample data.
The second is transfer learning, whose idea is to train on another large-scale dataset, use the resulting network parameters as initial values, and then fine-tune the parameters on the target dataset. However, if the learned features are specific to a particular training dataset or recognition task, transfer learning with them will not necessarily work well.
Summary
Embodiments of this application provide a training method and apparatus for an image classification model, a computer device, and a storage medium, to solve the technical problem that the prior art cannot train a high-accuracy image classification model from a small number of labeled samples.
A training method for an image classification model, the method including:
obtaining labeled sample images from a labeled sample dataset, and obtaining unlabeled sample images from an unlabeled sample dataset;
calculating the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer of the image classification model;
obtaining a first objective function and a second objective function, and determining a target loss function according to the first objective function and the second objective function;
when the calculated similar entropy value is greater than a preset value, selecting the first objective function for the labeled sample images to train a first parameter of the feature extraction layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the first parameter of the feature extraction layer of the image classification model;
when the calculated similar entropy value is less than the preset value, selecting the first objective function for the labeled sample images to train a second parameter of the classification layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the second parameter of the classification layer of the image classification model;
training the first parameter and the second parameter alternately until the gradient of the target loss function is less than a preset value, then using the value of the first parameter as the target parameter of the feature extraction layer and the value of the second parameter as the target parameter of the classification layer.
A training apparatus for an image classification model, the apparatus including:
a sample image acquisition module, configured to obtain labeled sample images from a labeled sample dataset and obtain unlabeled sample images from an unlabeled sample dataset;
a similar entropy calculation module, configured to calculate the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer of the image classification model;
a function acquisition module, configured to obtain a first objective function and a second objective function and determine a target loss function according to the first objective function and the second objective function;
a first training module, configured to, when the calculated similar entropy value is greater than a preset value, select the first objective function for the labeled sample images to train a first parameter of the feature extraction layer of the image classification model, and select the second objective function for the unlabeled sample images to train the first parameter of the feature extraction layer of the image classification model;
a second training module, configured to, when the calculated similar entropy value is less than the preset value, select the first objective function for the labeled sample images to train a second parameter of the classification layer of the image classification model, and select the second objective function for the unlabeled sample images to train the second parameter of the classification layer of the image classification model;
a target parameter acquisition module, configured to train the first parameter and the second parameter alternately until the gradient of the target loss function is less than a preset value, then use the value of the first parameter as the target parameter of the feature extraction layer and the value of the second parameter as the target parameter of the classification layer.
A computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
obtaining labeled sample images from a labeled sample dataset, and obtaining unlabeled sample images from an unlabeled sample dataset;
calculating the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer of the image classification model;
obtaining a first objective function and a second objective function, and determining a target loss function according to the first objective function and the second objective function;
when the calculated similar entropy value is greater than a preset value, selecting the first objective function for the labeled sample images to train the first parameter of the feature extraction layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the first parameter of the feature extraction layer of the image classification model;
when the calculated similar entropy value is less than the preset value, selecting the first objective function for the labeled sample images to train the second parameter of the classification layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the second parameter of the classification layer of the image classification model;
training the first parameter and the second parameter alternately until the gradient of the target loss function is less than a preset value, then using the value of the first parameter as the target parameter of the feature extraction layer and the value of the second parameter as the target parameter of the classification layer.
One or more readable storage media storing computer-readable instructions, wherein when the computer-readable instructions are executed by one or more processors, the one or more processors execute the following steps:
obtaining labeled sample images from a labeled sample dataset, and obtaining unlabeled sample images from an unlabeled sample dataset;
calculating the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer of the image classification model;
obtaining a first objective function and a second objective function, and determining a target loss function according to the first objective function and the second objective function;
when the calculated similar entropy value is greater than a preset value, selecting the first objective function for the labeled sample images to train the first parameter of the feature extraction layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the first parameter of the feature extraction layer of the image classification model;
when the calculated similar entropy value is less than the preset value, selecting the first objective function for the labeled sample images to train the second parameter of the classification layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the second parameter of the classification layer of the image classification model;
training the first parameter and the second parameter alternately until the gradient of the target loss function is less than a preset value, then using the value of the first parameter as the target parameter of the feature extraction layer and the value of the second parameter as the target parameter of the classification layer.
In the training method and apparatus for an image classification model, computer device, and storage medium proposed in this application, adversarial training brings the classification results of the trained classification layer closer to the standard image prototypes, so that the image classification model can be trained from fewer labeled samples while also incorporating unlabeled target sample images, increasing the number of effective training samples and giving the trained model a better classification effect.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of this application more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application environment of the training method for an image classification model in an embodiment of this application;
FIG. 2 is a flowchart of the training method for an image classification model in an embodiment of this application;
FIG. 3 shows the relationship between the network structure and the target loss function in an embodiment of this application;
FIG. 4 is a partial flowchart of the training method for an image classification model in an embodiment of this application;
FIG. 5 is a further flowchart of step S102 of FIG. 2 in an embodiment of this application;
FIG. 6 is a schematic structural diagram of the training apparatus for an image classification model in an embodiment of this application;
FIG. 7 is a schematic diagram of a computer device in an embodiment of this application.
Detailed Description
The technical solutions in the embodiments of this application will be described clearly and completely below with reference to the drawings in the embodiments of this application. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of this application.
The training method for an image classification model provided by this application can be applied in the application environment shown in FIG. 1. The computer device includes, but is not limited to, personal computers, notebook computers, smartphones, tablet computers, and the like.
In one embodiment, as shown in FIG. 2, a training method for an image classification model is provided. Taking the application of the method to the computer device in FIG. 1 as an example, the method includes the following steps S101 to S106.
S101: Obtain labeled sample images from a labeled sample dataset, and obtain unlabeled sample images from an unlabeled sample dataset.
The labeled sample dataset includes labeled public data and a small amount of labeled target data. The small amount of labeled target data includes, for example, human lung CT images that a doctor has confirmed and marked as a certain disease prototype; the labeled public data includes, for example, human lung CT images that doctors have marked with relevant disease prototypes. The unlabeled sample dataset contains a number of unlabeled target images, for example CT images determined to belong to a certain class of disease prototype but not labeled.
Further, FIG. 4 is a partial flowchart of the training method for an image classification model in an embodiment of this application. After step S101 and before step S102 below, the training method further includes the following steps S301 and S302:
performing a non-linear operation on the gray values of the labeled sample image, so that the output gray value of the labeled sample image has an exponential relationship with the original gray value;
performing a non-linear operation on the gray values of the unlabeled sample image, so that the output gray value of the unlabeled sample image has an exponential relationship with the original gray value.
Here, performing a non-linear operation on the gray values of the labeled and unlabeled sample images means applying a Gamma transform to the images. The Gamma transform is a non-linear operation on the gray values of the input image that makes the output gray value an exponential function of the input gray value: V_out = A·V_in^γ, where the exponent is the Gamma value γ. Since γ takes values in the range 0 to 1, the gray values are first normalized and then raised to the power.
The Gamma transform enhances the dark details of an image. In short, through this non-linear transformation the image moves from a linear response to exposure intensity toward a response closer to human visual perception, correcting images that are washed out or too dark.
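The Gamma transform described above can be sketched as follows. This is a minimal illustration assuming 8-bit gray values normalized to [0, 1]; the particular gamma and A values are illustrative, not values prescribed by the text.

```python
def gamma_transform(gray_values, gamma=0.5, A=1.0):
    """Power-law (Gamma) transform: V_out = A * V_in ** gamma.

    Gray values are normalized to [0, 1] first, as the text requires,
    so each output gray value is exponentially related to the input.
    """
    normalized = [v / 255.0 for v in gray_values]
    return [A * v ** gamma for v in normalized]

# With gamma < 1 dark regions are lifted, recovering shadow detail.
out = gamma_transform([0, 64, 128, 255], gamma=0.5)
```

Applying this per pixel before training is the whole of steps S301/S302; γ > 1 would instead darken bright regions.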
S102: Calculate the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer of the image classification model.
The classification layer module adds a K-way linear classification layer with a randomly initialized weight matrix W. The weight vectors of the last linear layer are denoted W = [w1, w2, …, wn], where n is the total number of classes. Each vector in W is regarded as the prototype of one class: w1 is the prototype of the first disease class, w2 of the second, and wn of the n-th lesion class, a prototype being the representative of its class. The output of the feature extraction layer is then fed into the softmax classification layer, which yields the probability of each sample being assigned to each class. Different objective functions are designed depending on whether the public dataset and the target-domain dataset are labeled.
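The prototype-based classification layer just described can be sketched as follows: each class prototype is a weight vector w_i, the logits are dot products between the prototypes and the extracted feature, and softmax turns them into per-class probabilities, matching p(x) = softmax(W^T F(x)) in formula (2) below. The toy prototype matrix and feature vector here are illustrative only.

```python
import math

def softmax(logits):
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(feature, prototypes):
    """Probability of each class: softmax over the dot products w_i . F(x)."""
    logits = [sum(w * f for w, f in zip(proto, feature)) for proto in prototypes]
    return softmax(logits)

# One prototype (row of W) per class; values are toy numbers.
W = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
probs = classify([2.0, 0.5], W)          # highest score for the first prototype
```

A feature closest (in dot-product terms) to prototype w_i receives the largest probability for class i, which is exactly what the entropy-based training below manipulates.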
Further, step S102 further includes:
extracting a second feature of the unlabeled sample image through the feature extraction layer;
inputting the second feature into the classification layer to obtain the probability that the sample image is predicted as the k-th class of image prototype;
substituting the probability into the second objective function, and calculating through the second objective function the similar entropy value between the unlabeled sample image and the output of the classification layer.
Optionally, the second objective function H2 is:
H2 = E[-∑_{k=1}^{n} p(y=k|x)·log p(y=k|x)]
where n represents the total number of image-prototype classes, p(y=k|x) represents the probability that sample image x is predicted as the k-th class of image prototype, and E denotes the mean over the training batch (batch size).
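A minimal sketch of the conditional-entropy quantity defined above, averaged over a batch as the E term indicates; the probability vectors are illustrative.

```python
import math

def conditional_entropy(prob_batch):
    """Batch mean of -sum_k p(y=k|x) * log p(y=k|x)."""
    def entropy(p):
        # Skip zero-probability classes, whose contribution is 0 in the limit.
        return -sum(pk * math.log(pk) for pk in p if pk > 0)
    return sum(entropy(p) for p in prob_batch) / len(prob_batch)

# A confident prediction has low entropy; a uniform one is maximal (log n).
low = conditional_entropy([[0.98, 0.01, 0.01]])
high = conditional_entropy([[1 / 3, 1 / 3, 1 / 3]])
```

Low entropy means the unlabeled sample sits near one prototype; high entropy means it is ambiguous among prototypes, which is the quantity the adversarial training steps below maximize and minimize.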
FIG. 5 is a further flowchart of step S102 of FIG. 2 in an embodiment of this application. Further, as shown in FIG. 5, the step of calculating the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer of the image classification model includes the following steps S401 to S403:
S401: Extract a second feature of the unlabeled sample image through the feature extraction layer.
S402: Input the second feature into the classification layer to obtain the probability that the sample image is predicted as the k-th class of image prototype.
S403: Substitute the probability into the second objective function, and calculate through the second objective function the similar entropy value between the unlabeled sample image and the output of the classification layer.
S103: Obtain a first objective function and a second objective function, and determine a target loss function according to the first objective function and the second objective function.
Step S103 further includes:
calculating the target loss function by the following formula:
H = -H1 ± H2
where H represents the target loss function, H1 represents the first objective function, and H2 represents the second objective function; the sign of the H2 term is positive when the similar entropy value is greater than 0 and negative when it is less than 0.
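The sign rule for combining the two objectives can be written directly as a small helper. This is only a sketch of H = -H1 ± H2; the function name and the treatment of an exactly-zero entropy value (grouped with the negative branch here) are illustrative choices, not specified by the text.

```python
def target_loss(h1, h2, similar_entropy):
    """Combined loss H = -H1 + H2 when the similar entropy value is greater
    than 0, and H = -H1 - H2 otherwise (the zero case is not specified in
    the text and is grouped with the negative branch here)."""
    return -h1 + h2 if similar_entropy > 0 else -h1 - h2

loss_max = target_loss(0.8, 0.2, similar_entropy=0.5)   # -H1 + H2
loss_min = target_loss(0.8, 0.2, similar_entropy=-0.5)  # -H1 - H2
```

Flipping the sign of the H2 term is what turns the single loss into the entropy-maximizing and entropy-minimizing phases of the alternating training in steps S104 to S106.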
For the labeled public data and the small amount of labeled target data, the maximum correntropy is used as the objective function to train the feature extraction layer and the classification layer. The correntropy value quantifies the similarity between two random variables A and B; the correntropy of variables A and B is shown in formula (1):
V_σ(A, B) = E[exp(-(A - B)² / (2σ²))]      (1)
Optionally, applying the correntropy formula to training with labeled data gives, for the labeled data, the first objective function H1:
H1 = (1/n)·∑_{i=1}^{n} exp(-(y_i - p(x_i))² / (2σ²))
where p(x_i) represents the prediction result of the labeled sample image x being predicted as the i-th class of image prototype, σ represents a preset value, y_i represents the true value of image x belonging to the i-th class of image prototype, and n represents the total number of image-prototype classes.
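The exact H1 expression in the publication is rendered as an image, so the sketch below uses the standard Gaussian-kernel form of correntropy over the n classes as a hedged illustration. It uses exactly the quantities defined above (p(x_i), y_i, σ, n) but should not be read as the patent's authoritative formula.

```python
import math

def correntropy_objective(preds, targets, sigma=1.0):
    """Gaussian-kernel correntropy between predictions and one-hot labels:
    (1/n) * sum_i exp(-(y_i - p(x_i))**2 / (2 * sigma**2)).
    sigma plays the role of the preset value referred to in the text."""
    n = len(preds)
    return sum(
        math.exp(-(y - p) ** 2 / (2 * sigma ** 2)) for p, y in zip(preds, targets)
    ) / n

# Correntropy peaks at 1.0 when predictions match the labels exactly,
# so maximizing it pulls predictions toward the labels.
perfect = correntropy_objective([1.0, 0.0, 0.0], [1.0, 0.0, 0.0])
noisy = correntropy_objective([0.6, 0.3, 0.1], [1.0, 0.0, 0.0])
```

Unlike squared error, the Gaussian kernel saturates for large deviations, which is why correntropy is often preferred when some labels may be noisy.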
Here, p(x) can be obtained by the following formula (2):
p(x) = softmax(W^T F(x))      (2)
where F(x) is the feature extracted by the feature extraction layer and W represents the weight vectors.
Further, the step of obtaining the prediction result of the labeled sample image x being predicted as the i-th class of image prototype includes:
extracting a first feature of the labeled sample image through the feature extraction layer of the image classification model;
inputting the extracted first feature into the classification layer for classification to obtain the prediction result that the labeled sample image is predicted as the i-th class of image prototype.
S104: When the calculated similar entropy value is greater than the preset value, select the first objective function for the labeled sample images to train the first parameter of the feature extraction layer of the image classification model, and select the second objective function for the unlabeled sample images to train the first parameter of the feature extraction layer of the image classification model.
Optionally, the preset value is 0. The relationship between the network structure and the first target loss function H1 is shown in FIG. 3. The feature extraction layer in the inter-data adaptive fusion model performs transfer learning on the ResNet50 network, fine-tunes the network structure and parameters, and automatically learns to extract implicit, multi-level disease classification features. Optionally, the last linear layer of the ResNet50 network is removed to construct the feature extraction layer; this network introduces skip connections, which make the back-propagation of gradients easier and allow deeper networks to be trained effectively.
Using the first target loss function H1 ensures that the feature extraction layer extracts discriminative features. However, the model is trained for classification only on the public data and a small portion of the target data, which cannot learn discriminative features over the whole target data. Therefore, the unlabeled target instances need to be trained according to the second objective function, with maximization of the conditional entropy, to train the first parameter of the feature extraction layer.
S105: When the calculated similar entropy value is less than the preset value, select the first objective function for the labeled sample images to train the second parameter of the classification layer of the image classification model, and select the second objective function for the unlabeled sample images to train the second parameter of the classification layer of the image classification model.
Optionally, the preset value is 0.
Corresponding to step S104 above, when the similar entropy value is less than the preset value, the first objective function is selected for the labeled sample images to train the second parameter of the classification layer, and the unlabeled target instances are trained according to the second objective function, with minimization of the conditional entropy, to train the classification layer.
The relationship between the network structure and the second target loss function H2 is shown in FIG. 3, where θ denotes the parameters to be trained.
S106: Train the first parameter and the second parameter alternately until the gradient of the target loss function is less than the preset value, then use the value of the first parameter as the target parameter of the feature extraction layer and the value of the second parameter as the target parameter of the classification layer.
Training the classifier by maximizing the conditional entropy (i.e., when the similar entropy value is greater than a preset value) and training the feature extractor by minimizing the conditional entropy (i.e., when the similar entropy value is less than a preset value) on the one hand minimizes the distance between the class prototypes and the unlabeled target data, thereby extracting discriminative features and increasing the number of effective training samples.
On the other hand, alternating adversarial training of the feature extraction layer and the classification layer according to the magnitude of the similar entropy value assumes that each class has a domain-invariant prototype serving as a representative point for both domains. With w_i, the i-th weight vector of the last linear layer, as the prototype of the i-th disease class, each disease class corresponds to one disease prototype. Since the labeled data is mostly public data and there may be only a few target samples, the estimated disease prototypes lie close to the distribution of the public data; steps S104, S105, and S106 above move the features of the unlabeled portion of the target data so that the trained image classification model comes closer to the prototype position w_i.
This embodiment performs semi-supervised training with a labeled sample dataset and an unlabeled sample dataset, achieving inter-data adaptive fusion for the unlabeled sample images. Based on the idea of transfer learning, a ResNet50 deep convolutional neural network trained on the ImageNet dataset is fine-tuned to extract features from lung CT images, and prediction probabilities are obtained through a softmax classification layer. Depending on whether the data is labeled, different objective functions are designed. For labeled data, the first objective function is used as the loss function to train the feature extraction layer and the classification layer of the image classification model; for unlabeled data, the min-max conditional entropy method is applied with the second objective function as the loss function: training alternates between maximizing the conditional entropy of the classifier and minimizing the conditional entropy of the feature extractor, finally achieving high-accuracy classification of unlabeled data.
Since few-shot classification is developing rapidly, a single trained model can meet the requirements of a wide variety of classification tasks. The mechanism of meta-learning is task generality: faced with different tasks, there is no need to build different models; the same learning algorithm can solve many different tasks. A learnable parameter θ of the model is defined, and different tasks can be solved by changing the value of θ. The value of θ can be learned by a meta-learner: for different tasks, θ is continuously updated by gradient descent according to the loss function, moving the model ever closer to one that can solve the task at hand. When θ finally converges, the meta-learner is considered to have learned a good parameter θ, allowing the model to solve the corresponding task adaptively. This embodiment also establishes a cross-domain transfer method between different lung CT image sample datasets. By performing cross-domain transfer between labeled public data and unlabeled clinical data, supplemented by a very small amount of labeled clinically collected data, the method achieves high-accuracy recognition of large amounts of unlabeled clinical data with high sensitivity and specificity, and the domain-adaptive model generalizes well.
In the training method for an image classification model proposed in this embodiment, the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer of the image classification model is calculated. When the calculated similar entropy value is greater than the preset value, the first objective function is selected for the labeled sample images to train the first parameter of the feature extraction layer, and the second objective function is selected for the unlabeled sample images to train the first parameter of the feature extraction layer; when the calculated similar entropy value is less than the preset value, the first objective function is selected for the labeled sample images to train the second parameter of the classification layer, and the second objective function is selected for the unlabeled sample images to train the second parameter of the classification layer. The first parameter and the second parameter are trained alternately until the gradient of the target loss function is less than the preset value, at which point the value of the first parameter is used as the target parameter of the feature extraction layer and the value of the second parameter is used as the target parameter of the classification layer. Using different objective functions to train the feature extraction layer and the classification layer depending on whether the similar entropy value is greater or less than the preset value, this adversarial training brings the classification results of the trained classification layer closer to the standard image prototypes, so that during training the model relies on fewer labeled samples while incorporating unlabeled target sample images, increasing the number of effective training samples and giving the trained image classification model a better classification effect.
It should be understood that the magnitude of the step numbers in the above embodiments does not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of this application.
In one embodiment, a training apparatus for an image classification model is provided, corresponding one-to-one to the training method for an image classification model in the above embodiment. As shown in FIG. 6, the training apparatus 100 for the image classification model includes a sample image acquisition module 11, a similar entropy calculation module 12, a function acquisition module 13, a first training module 14, a second training module 15, and a target parameter acquisition module 16. The functional modules are described in detail as follows:
The sample image acquisition module 11 is configured to obtain labeled sample images from a labeled sample dataset and obtain unlabeled sample images from an unlabeled sample dataset.
The similar entropy calculation module 12 is configured to calculate the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer of the image classification model.
The similar entropy calculation module 12 further includes:
a second feature extraction unit, configured to extract a second feature of the unlabeled sample image through the feature extraction layer;
a probability prediction unit, configured to input the second feature into the classification layer to obtain the probability that the sample image is predicted as the k-th class of image prototype;
an entropy output unit, configured to substitute the probability into the second objective function and calculate, through the second objective function, the similar entropy value between the unlabeled sample image and the output of the classification layer.
The function acquisition module 13 is configured to obtain a first objective function and a second objective function and determine a target loss function according to the first objective function and the second objective function.
The first objective function is:
H1 = (1/n)·∑_{i=1}^{n} exp(-(y_i - p(x_i))² / (2σ²))
where p(x_i) represents the prediction result of the labeled sample image x being predicted as the i-th class of image prototype, σ represents a preset value, y_i represents the true value of image x belonging to the i-th class of image prototype, and n represents the total number of image-prototype classes.
Further, the second objective function is:
H2 = E[-∑_{k=1}^{n} p(y=k|x)·log p(y=k|x)]
where n represents the total number of image-prototype classes, p(y=k|x) represents the probability that sample image x is predicted as the k-th class of image prototype, and E denotes the mean over the training batch (batch size).
Further, the function acquisition module 13 specifically includes:
a first feature extraction unit, configured to extract a first feature of the labeled sample image through the feature extraction layer of the image classification model;
a result prediction unit, configured to input the extracted first feature into the classification layer for classification to obtain the prediction result that the labeled sample image is predicted as the i-th class of image prototype.
The first training module 14 is configured to, when the calculated similar entropy value is greater than the preset value, select the first objective function for the labeled sample images to train the first parameter of the feature extraction layer of the image classification model, and select the second objective function for the unlabeled sample images to train the first parameter of the feature extraction layer of the image classification model.
The second training module 15 is configured to, when the calculated similar entropy value is less than the preset value, select the first objective function for the labeled sample images to train the second parameter of the classification layer of the image classification model, and select the second objective function for the unlabeled sample images to train the second parameter of the classification layer of the image classification model.
The target parameter acquisition module 16 is configured to train the first parameter and the second parameter alternately until the gradient of the target loss function is less than the preset value, then use the value of the first parameter as the target parameter of the feature extraction layer and the value of the second parameter as the target parameter of the classification layer.
In one of the embodiments, the function acquisition module 13 is specifically configured to calculate the target loss function by the following formula:
H = -H1 ± H2
where H represents the target loss function, H1 represents the first objective function, and H2 represents the second objective function; the sign of the H2 term is positive when the similar entropy value is greater than 0 and negative when it is less than 0.
Optionally, the training apparatus 100 for the image classification model further includes:
a first operation unit, configured to perform a non-linear operation on the gray values of the labeled sample image so that the output gray value of the labeled sample image has an exponential relationship with the original gray value;
a second operation unit, configured to perform a non-linear operation on the gray values of the unlabeled sample image so that the output gray value of the unlabeled sample image has an exponential relationship with the original gray value.
The terms "first" and "second" in the above modules/units serve only to distinguish different modules/units and are not intended to define a priority between them or carry any other limiting meaning. In addition, the terms "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device comprising a series of steps or modules is not necessarily limited to those steps or modules clearly listed, but may include other steps or modules that are not clearly listed or that are inherent to such processes, methods, products, or devices. The division of modules in this application is only a logical division; other divisions are possible in practical implementations.
For the specific limitations of the training apparatus for the image classification model, reference may be made to the limitations of the training method for the image classification model above, which are not repeated here. Each module of the above training apparatus can be implemented in whole or in part by software, by hardware, or by a combination of the two. The above modules may be embedded in, or independent of, the processor of the computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a terminal, and its internal structure may be as shown in FIG. 7. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and computer-readable instructions. The internal memory provides an environment for running the operating system and the computer-readable instructions in the non-volatile storage medium. The network interface of the computer device communicates with an external server through a network connection. When the computer-readable instructions are executed by the processor, the training method for the image classification model is implemented.
In one embodiment, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When the processor executes the computer-readable instructions, it implements the steps of the training method for the image classification model in the above embodiment, for example steps 101 to 106 shown in FIG. 2, together with other extensions of the method and of its related steps. Alternatively, when the processor executes the computer-readable instructions, the functions of the modules/units of the training apparatus for the image classification model in the above embodiment are implemented, for example the functions of modules 11 to 16 shown in FIG. 6. Specifically, the processor implements the following steps when executing the computer-readable instructions:
obtaining labeled sample images from a labeled sample dataset, and obtaining unlabeled sample images from an unlabeled sample dataset;
calculating the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer of the image classification model;
obtaining a first objective function and a second objective function, and determining a target loss function according to the first objective function and the second objective function;
when the calculated similar entropy value is greater than a preset value, selecting the first objective function for the labeled sample images to train the first parameter of the feature extraction layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the first parameter of the feature extraction layer of the image classification model;
when the calculated similar entropy value is less than the preset value, selecting the first objective function for the labeled sample images to train the second parameter of the classification layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the second parameter of the classification layer of the image classification model;
training the first parameter and the second parameter alternately until the gradient of the target loss function is less than a preset value, then using the value of the first parameter as the target parameter of the feature extraction layer and the value of the second parameter as the target parameter of the classification layer.
Further, the processor implements the following steps when executing the computer-readable instructions:
calculating the target loss function by the following formula:
H = -H1 ± H2
where H represents the target loss function, H1 represents the first objective function, and H2 represents the second objective function; the sign of the H2 term is positive when the similar entropy value is greater than 0 and negative when it is less than 0.
Further, the first objective function is:
H1 = (1/n)·∑_{i=1}^{n} exp(-(y_i - p(x_i))² / (2σ²))
where p(x_i) represents the prediction result of the labeled sample image x being predicted as the i-th class of image prototype, σ represents a preset value, y_i represents the true value of image x belonging to the i-th class of image prototype, and n represents the total number of image-prototype classes.
Further, the processor implements the following steps when executing the computer-readable instructions:
extracting a first feature of the labeled sample image through the feature extraction layer of the image classification model;
inputting the extracted first feature into the classification layer for classification to obtain the prediction result that the labeled sample image is predicted as the i-th class of image prototype.
Further, the second objective function is:
H2 = E[-∑_{k=1}^{n} p(y=k|x)·log p(y=k|x)]
where n represents the total number of image-prototype classes and p(y=k|x) represents the probability that sample image x is predicted as the k-th class of image prototype.
Further, the processor implements the following steps when executing the computer-readable instructions:
extracting a second feature of the unlabeled sample image through the feature extraction layer;
inputting the second feature into the classification layer to obtain the probability that the sample image is predicted as the k-th class of image prototype;
substituting the probability into the second objective function, and calculating through the second objective function the similar entropy value between the unlabeled sample image and the output of the classification layer.
Further, the processor implements the following steps when executing the computer-readable instructions:
performing a non-linear operation on the gray values of the labeled sample image, so that the output gray value of the labeled sample image has an exponential relationship with the original gray value;
performing a non-linear operation on the gray values of the unlabeled sample image, so that the output gray value of the unlabeled sample image has an exponential relationship with the original gray value.
The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor; the processor is the control center of the computer device and connects the various parts of the entire computer device through various interfaces and lines.
The memory may be used to store the computer-readable instructions and/or modules; by running or executing the computer-readable instructions and/or modules stored in the memory and calling the data stored in the memory, the processor realizes the various functions of the computer device. The memory may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function), and the data storage area may store data created according to the use of the device, such as a mobile phone (for example, audio data and video data).
The memory may be integrated in the processor or provided separately from the processor.
In one embodiment, one or more readable storage media storing computer-readable instructions are provided; the computer-readable storage media may be non-volatile or volatile. When the computer-readable instructions are executed by one or more processors, the one or more processors execute the steps of the training method for the image classification model in the above embodiment, for example steps 101 to 106 shown in FIG. 2, together with other extensions of the method and of its related steps. Alternatively, when the computer-readable instructions are executed by the processor, the functions of the modules/units of the training apparatus for the image classification model in the above embodiment are realized, for example the functions of modules 11 to 16 shown in FIG. 6. Specifically, when the computer-readable instructions are executed by one or more processors, the one or more processors execute the following steps:
obtaining labeled sample images from a labeled sample dataset, and obtaining unlabeled sample images from an unlabeled sample dataset;
calculating the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer of the image classification model;
obtaining a first objective function and a second objective function, and determining a target loss function according to the first objective function and the second objective function;
when the calculated similar entropy value is greater than a preset value, selecting the first objective function for the labeled sample images to train the first parameter of the feature extraction layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the first parameter of the feature extraction layer of the image classification model;
when the calculated similar entropy value is less than the preset value, selecting the first objective function for the labeled sample images to train the second parameter of the classification layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the second parameter of the classification layer of the image classification model;
training the first parameter and the second parameter alternately until the gradient of the target loss function is less than a preset value, then using the value of the first parameter as the target parameter of the feature extraction layer and the value of the second parameter as the target parameter of the classification layer.
Further, when the computer-readable instructions are executed by one or more processors, the one or more processors further execute the following steps:
calculating the target loss function by the following formula:
H = -H1 ± H2
where H represents the target loss function, H1 represents the first objective function, and H2 represents the second objective function; the sign of the H2 term is positive when the similar entropy value is greater than 0 and negative when it is less than 0.
Further, the first objective function is:
H1 = (1/n)·∑_{i=1}^{n} exp(-(y_i - p(x_i))² / (2σ²))
where p(x_i) represents the prediction result of the labeled sample image x being predicted as the i-th class of image prototype, σ represents a preset value, y_i represents the true value of image x belonging to the i-th class of image prototype, and n represents the total number of image-prototype classes.
Further, when the computer-readable instructions are executed by one or more processors, the one or more processors further execute the following steps:
extracting a first feature of the labeled sample image through the feature extraction layer of the image classification model;
inputting the extracted first feature into the classification layer for classification to obtain the prediction result that the labeled sample image is predicted as the i-th class of image prototype.
Further, the second objective function is:
H2 = E[-∑_{k=1}^{n} p(y=k|x)·log p(y=k|x)]
where n represents the total number of image-prototype classes and p(y=k|x) represents the probability that sample image x is predicted as the k-th class of image prototype.
Further, when the computer-readable instructions are executed by one or more processors, the one or more processors further execute the following steps:
extracting a second feature of the unlabeled sample image through the feature extraction layer;
inputting the second feature into the classification layer to obtain the probability that the sample image is predicted as the k-th class of image prototype;
substituting the probability into the second objective function, and calculating through the second objective function the similar entropy value between the unlabeled sample image and the output of the classification layer.
Further, when the computer-readable instructions are executed by one or more processors, the one or more processors further execute the following steps:
performing a non-linear operation on the gray values of the labeled sample image, so that the output gray value of the labeled sample image has an exponential relationship with the original gray value;
performing a non-linear operation on the gray values of the unlabeled sample image, so that the output gray value of the unlabeled sample image has an exponential relationship with the original gray value.
In the training method and apparatus for an image classification model, computer device, and storage medium proposed in this embodiment, the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer is calculated. When the calculated similar entropy value is greater than the preset value, the first objective function is selected for the labeled sample images to train the first parameter of the feature extraction layer of the image classification model, and the second objective function is selected for the unlabeled sample images to train the first parameter of the feature extraction layer; when the calculated similar entropy value is less than the preset value, the first objective function is selected for the labeled sample images to train the second parameter of the classification layer, and the second objective function is selected for the unlabeled sample images to train the second parameter of the classification layer. The first parameter and the second parameter are trained alternately until the gradient of the target loss function is less than the preset value, at which point the value of the first parameter is used as the target parameter of the feature extraction layer and the value of the second parameter is used as the target parameter of the classification layer. Using different objective functions to train the feature extraction layer and the classification layer depending on whether the similar entropy value is greater or less than the preset value, this adversarial training brings the classification results of the trained classification layer closer to the standard image prototypes, so that the model can be trained from fewer labeled samples combined with unlabeled target sample images, increasing the number of effective training samples and giving the trained image classification model a better classification effect.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by instructing the relevant hardware through computer-readable instructions, which can be stored in a non-volatile computer-readable storage medium; when executed, these computer-readable instructions may include the processes of the embodiments of the above methods. Any reference to memory, storage, database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the above functional units and modules is used as an example; in practical applications, the above functions can be allocated to different functional units and modules as needed, that is, the internal structure of the apparatus can be divided into different functional units or modules to complete all or part of the functions described above.
The above embodiments are only used to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions recorded in the foregoing embodiments or substitute equivalents for some of the technical features; such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and they should all be included within the scope of protection of this application.

Claims (20)

  1. A training method for an image classification model, wherein the method comprises:
    obtaining labeled sample images from a labeled sample dataset, and obtaining unlabeled sample images from an unlabeled sample dataset;
    calculating the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer of the image classification model;
    obtaining a first objective function and a second objective function, and determining a target loss function according to the first objective function and the second objective function;
    when the calculated similar entropy value is greater than a preset value, selecting the first objective function for the labeled sample images to train a first parameter of the feature extraction layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the first parameter of the feature extraction layer of the image classification model;
    when the calculated similar entropy value is less than the preset value, selecting the first objective function for the labeled sample images to train a second parameter of the classification layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the second parameter of the classification layer of the image classification model;
    training the first parameter and the second parameter alternately until the gradient of the target loss function is less than a preset value, then using the value of the first parameter as the target parameter of the feature extraction layer and the value of the second parameter as the target parameter of the classification layer.
  2. The training method for an image classification model according to claim 1, wherein the step of obtaining a first objective function and a second objective function and determining a target loss function according to the first objective function and the second objective function comprises:
    calculating the target loss function by the following formula:
    H = -H1 ± H2
    wherein H represents the target loss function, H1 represents the first objective function, and H2 represents the second objective function; the sign of the H2 term is positive when the similar entropy value is greater than 0 and negative when it is less than 0.
  3. The training method for an image classification model according to claim 2, wherein the first objective function is:
    H1 = (1/n)·∑_{i=1}^{n} exp(-(y_i - p(x_i))² / (2σ²))
    wherein p(x_i) represents the prediction result of the labeled sample image x being predicted as the i-th class of image prototype, σ represents a preset value, y_i represents the true value of image x belonging to the i-th class of image prototype, and n represents the total number of image-prototype classes.
  4. The training method for an image classification model according to claim 3, wherein the step of obtaining the prediction result of the labeled sample image x being predicted as the i-th class of image prototype comprises:
    extracting a first feature of the labeled sample image through the feature extraction layer of the image classification model;
    inputting the extracted first feature into the classification layer for classification to obtain the prediction result that the labeled sample image is predicted as the i-th class of image prototype.
  5. The training method for an image classification model according to claim 2, wherein the second objective function is:
    H2 = E[-∑_{k=1}^{n} p(y=k|x)·log p(y=k|x)]
    wherein n represents the total number of image-prototype classes and p(y=k|x) represents the probability that sample image x is predicted as the k-th class of image prototype.
  6. The training method for an image classification model according to claim 5, wherein the step of calculating the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer of the image classification model comprises:
    extracting a second feature of the unlabeled sample image through the feature extraction layer;
    inputting the second feature into the classification layer to obtain the probability that the sample image is predicted as the k-th class of image prototype;
    substituting the probability into the second objective function, and calculating through the second objective function the similar entropy value between the unlabeled sample image and the output of the classification layer.
  7. The training method for an image classification model according to claim 1, wherein before the step of calculating the similar entropy value between the unlabeled sample image and the image prototype output by the classification layer, the method further comprises:
    performing a non-linear operation on the gray values of the labeled sample image, so that the output gray value of the labeled sample image has an exponential relationship with the original gray value;
    performing a non-linear operation on the gray values of the unlabeled sample image, so that the output gray value of the unlabeled sample image has an exponential relationship with the original gray value.
  8. A training apparatus for an image classification model, wherein the apparatus comprises:
    a sample image acquisition module, configured to obtain labeled sample images from a labeled sample dataset and obtain unlabeled sample images from an unlabeled sample dataset;
    a similar entropy calculation module, configured to calculate the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer of the image classification model;
    a function acquisition module, configured to obtain a first objective function and a second objective function and determine a target loss function according to the first objective function and the second objective function;
    a first training module, configured to, when the calculated similar entropy value is greater than a preset value, select the first objective function for the labeled sample images to train a first parameter of the feature extraction layer of the image classification model, and select the second objective function for the unlabeled sample images to train the first parameter of the feature extraction layer of the image classification model;
    a second training module, configured to, when the calculated similar entropy value is less than the preset value, select the first objective function for the labeled sample images to train a second parameter of the classification layer of the image classification model, and select the second objective function for the unlabeled sample images to train the second parameter of the classification layer of the image classification model;
    a target parameter acquisition module, configured to train the first parameter and the second parameter alternately until the gradient of the target loss function is less than a preset value, then use the value of the first parameter as the target parameter of the feature extraction layer and the value of the second parameter as the target parameter of the classification layer.
  9. A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
    obtaining labeled sample images from a labeled sample dataset, and obtaining unlabeled sample images from an unlabeled sample dataset;
    calculating the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer of the image classification model;
    obtaining a first objective function and a second objective function, and determining a target loss function according to the first objective function and the second objective function;
    when the calculated similar entropy value is greater than a preset value, selecting the first objective function for the labeled sample images to train the first parameter of the feature extraction layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the first parameter of the feature extraction layer of the image classification model;
    when the calculated similar entropy value is less than the preset value, selecting the first objective function for the labeled sample images to train the second parameter of the classification layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the second parameter of the classification layer of the image classification model;
    training the first parameter and the second parameter alternately until the gradient of the target loss function is less than a preset value, then using the value of the first parameter as the target parameter of the feature extraction layer and the value of the second parameter as the target parameter of the classification layer.
  10. The computer device according to claim 9, wherein the processor further implements the following steps when executing the computer-readable instructions:
    calculating the target loss function by the following formula:
    H = -H1 ± H2
    wherein H represents the target loss function, H1 represents the first objective function, and H2 represents the second objective function; the sign of the H2 term is positive when the similar entropy value is greater than 0 and negative when it is less than 0.
  11. The computer device according to claim 10, wherein the first objective function is:
    H1 = (1/n)·∑_{i=1}^{n} exp(-(y_i - p(x_i))² / (2σ²))
    wherein p(x_i) represents the prediction result of the labeled sample image x being predicted as the i-th class of image prototype, σ represents a preset value, y_i represents the true value of image x belonging to the i-th class of image prototype, and n represents the total number of image-prototype classes.
  12. The computer device according to claim 11, wherein the processor further implements the following steps when executing the computer-readable instructions:
    extracting a first feature of the labeled sample image through the feature extraction layer of the image classification model;
    inputting the extracted first feature into the classification layer for classification to obtain the prediction result that the labeled sample image is predicted as the i-th class of image prototype.
  13. The computer device according to claim 10, wherein the second objective function is:
    H2 = E[-∑_{k=1}^{n} p(y=k|x)·log p(y=k|x)]
    wherein n represents the total number of image-prototype classes and p(y=k|x) represents the probability that sample image x is predicted as the k-th class of image prototype.
  14. The computer device according to claim 13, wherein the processor further implements the following steps when executing the computer-readable instructions:
    extracting a second feature of the unlabeled sample image through the feature extraction layer;
    inputting the second feature into the classification layer to obtain the probability that the sample image is predicted as the k-th class of image prototype;
    substituting the probability into the second objective function, and calculating through the second objective function the similar entropy value between the unlabeled sample image and the output of the classification layer.
  15. The computer device according to claim 9, wherein the processor further implements the following steps when executing the computer-readable instructions:
    performing a non-linear operation on the gray values of the labeled sample image, so that the output gray value of the labeled sample image has an exponential relationship with the original gray value;
    performing a non-linear operation on the gray values of the unlabeled sample image, so that the output gray value of the unlabeled sample image has an exponential relationship with the original gray value.
  16. One or more readable storage media storing computer-readable instructions, wherein when the computer-readable instructions are executed by one or more processors, the one or more processors execute the following steps:
    obtaining labeled sample images from a labeled sample dataset, and obtaining unlabeled sample images from an unlabeled sample dataset;
    calculating the similar entropy value between the unlabeled sample image and one class of image prototype output by the classification layer of the image classification model;
    obtaining a first objective function and a second objective function, and determining a target loss function according to the first objective function and the second objective function;
    when the calculated similar entropy value is greater than a preset value, selecting the first objective function for the labeled sample images to train the first parameter of the feature extraction layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the first parameter of the feature extraction layer of the image classification model;
    when the calculated similar entropy value is less than the preset value, selecting the first objective function for the labeled sample images to train the second parameter of the classification layer of the image classification model, and selecting the second objective function for the unlabeled sample images to train the second parameter of the classification layer of the image classification model;
    training the first parameter and the second parameter alternately until the gradient of the target loss function is less than a preset value, then using the value of the first parameter as the target parameter of the feature extraction layer and the value of the second parameter as the target parameter of the classification layer.
  17. 根据权利要求16所述的一个或多个存储有计算机可读指令的可读存储介质,所述计算机可读指令被一个或多个处理器执行时,使得所述一个或多个处理器还执行如下步骤:
    通过以下公式计算所述目标损失函数
    H=-H 1±H 2
    其中,所述H表示所述目标损失函数,H 1表示所述第一目标函数,H 2表示所述第二目标函数,当所述相似熵值大于0时所述H 2的符号为正,当所述相似熵值小于0时所述H 2的符号为负。
  18. The one or more readable storage media storing computer-readable instructions according to claim 17, wherein the first objective function is:
    Figure PCTCN2020124324-appb-100005
    where p(x_i) denotes the prediction result that the labeled sample image x is predicted as the i-th class image prototype, σ denotes a preset value, y_i denotes the ground-truth value that image x belongs to the i-th class image prototype, and n denotes the total number of image prototype classes.
  19. The one or more readable storage media storing computer-readable instructions according to claim 18, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to further perform the following steps:
    extracting a first feature of the labeled sample image through the feature extraction layer of the image classification model;
    inputting the extracted first feature into the classification layer for classification, to obtain the prediction result that the labeled sample image is predicted as the i-th class image prototype.
  20. The one or more readable storage media storing computer-readable instructions according to claim 16, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to further perform the following steps:
    performing a nonlinear operation on the gray values of the labeled sample image, so that the output gray values of the labeled sample image have an exponential relationship with the original gray values;
    performing a nonlinear operation on the gray values of the unlabeled sample image, so that the output gray values of the unlabeled sample image have an exponential relationship with the original gray values.
PCT/CN2020/124324 2020-09-17 2020-10-28 Training method and apparatus for image classification model, computer device, and storage medium WO2021164306A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010979940.5 2020-09-17
CN202010979940.5A CN111931865B (zh) 2020-09-17 2020-09-17 Training method and apparatus for image classification model, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2021164306A1 true WO2021164306A1 (zh) 2021-08-26

Family

ID=73335325

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/124324 WO2021164306A1 (zh) 2020-09-17 2020-10-28 Training method and apparatus for image classification model, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN111931865B (zh)
WO (1) WO2021164306A1 (zh)


Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112215212B (zh) * 2020-12-02 2021-03-02 腾讯科技(深圳)有限公司 Image recognition method and apparatus, computer device, and storage medium
CN112434754A (zh) * 2020-12-14 2021-03-02 前线智能科技(南京)有限公司 Cross-modal medical image domain-adaptive classification method based on graph neural networks
CN112784879A (zh) * 2020-12-31 2021-05-11 前线智能科技(南京)有限公司 Medical image segmentation or classification method based on few-shot domain adaptation
CN113159202B (zh) * 2021-04-28 2023-09-26 平安科技(深圳)有限公司 Image classification method and apparatus, electronic device, and storage medium
CN113361543B (zh) * 2021-06-09 2024-05-21 北京工业大学 CT image feature extraction method and apparatus, electronic device, and storage medium
CN113537151B (zh) * 2021-08-12 2023-10-17 北京达佳互联信息技术有限公司 Training method and apparatus for image processing model, and image processing method and apparatus
CN113673599B (zh) * 2021-08-20 2024-04-12 大连海事大学 Hyperspectral image classification method based on corrected prototype learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10430946B1 (en) * 2019-03-14 2019-10-01 Inception Institute of Artificial Intelligence, Ltd. Medical image segmentation and severity grading using neural network architectures with semi-supervised learning techniques
CN110674854A (zh) * 2019-09-09 2020-01-10 东软集团股份有限公司 Image classification model training method, image classification method, apparatus, and device
CN110909784A (zh) * 2019-11-15 2020-03-24 北京奇艺世纪科技有限公司 Training method and apparatus for image recognition model, and electronic device
CN111310846A (zh) * 2020-02-28 2020-06-19 平安科技(深圳)有限公司 Method, apparatus, storage medium, and server for selecting sample images

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11222263B2 (en) * 2016-07-28 2022-01-11 Samsung Electronics Co., Ltd. Neural network method and apparatus
CN106250931A (zh) * 2016-08-03 2016-12-21 武汉大学 High-resolution image scene classification method based on randomized convolutional neural networks
CN106971200A (zh) * 2017-03-13 2017-07-21 天津大学 Image memorability prediction method based on adaptive transfer learning
CN107239802B (zh) * 2017-06-28 2021-06-01 广东工业大学 Image classification method and apparatus
CN110633745B (zh) * 2017-12-12 2022-11-29 腾讯科技(深圳)有限公司 Artificial-intelligence-based image classification training method, apparatus, and storage medium
CN108460758A (zh) * 2018-02-09 2018-08-28 河南工业大学 Method for constructing a pulmonary nodule detection model
CN108805160B (zh) * 2018-04-17 2020-03-24 平安科技(深圳)有限公司 Transfer learning method and apparatus, computer device, and storage medium
CN111626315A (zh) * 2019-02-28 2020-09-04 北京京东尚科信息技术有限公司 Model training method, object recognition method, apparatus, medium, and electronic device
CN110689086B (zh) * 2019-10-08 2020-09-25 郑州轻工业学院 Semi-supervised high-resolution remote sensing image scene classification method based on generative adversarial networks
CN110889332A (zh) * 2019-10-30 2020-03-17 中国科学院自动化研究所南京人工智能芯片创新研究院 Lie detection method based on micro-expressions in interviews
CN110956185B (zh) * 2019-11-21 2023-04-18 大连理工大学人工智能大连研究院 Method for detecting salient objects in images


Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113762393A (zh) * 2021-09-08 2021-12-07 杭州网易智企科技有限公司 Model training method, gaze point detection method, medium, apparatus, and computing device
CN113762393B (zh) * 2021-09-08 2024-04-30 杭州网易智企科技有限公司 Model training method, gaze point detection method, medium, apparatus, and computing device
CN114693995A (zh) * 2022-04-14 2022-07-01 北京百度网讯科技有限公司 Model training method applied to image processing, image processing method, and device
CN114821203A (zh) * 2022-06-29 2022-07-29 中国科学院自动化研究所 Fine-grained image model training and recognition method and apparatus based on consistency loss
CN115482436A (zh) * 2022-09-21 2022-12-16 北京百度网讯科技有限公司 Training method and apparatus for image screening model, and image screening method
CN116663648A (zh) * 2023-04-23 2023-08-29 北京大学 Model training method, apparatus, device, and storage medium
CN116663648B (zh) * 2023-04-23 2024-04-02 北京大学 Model training method, apparatus, device, and storage medium
CN116665135A (zh) * 2023-07-28 2023-08-29 中国华能集团清洁能源技术研究院有限公司 Thermal runaway risk early-warning method and apparatus for energy storage station battery packs, and electronic device
CN116665135B (zh) * 2023-07-28 2023-10-20 中国华能集团清洁能源技术研究院有限公司 Thermal runaway risk early-warning method and apparatus for energy storage station battery packs, and electronic device
CN117036869A (zh) * 2023-10-08 2023-11-10 之江实验室 Model training method and apparatus based on diversity and random strategies
CN117036869B (zh) * 2023-10-08 2024-01-09 之江实验室 Model training method and apparatus based on diversity and random strategies

Also Published As

Publication number Publication date
CN111931865A (zh) 2020-11-13
CN111931865B (zh) 2021-01-26

Similar Documents

Publication Publication Date Title
WO2021164306A1 (zh) Training method and apparatus for image classification model, computer device, and storage medium
Wang et al. Fully automatic wound segmentation with deep convolutional neural networks
Kim et al. Artificial intelligence in fracture detection: transfer learning from deep convolutional neural networks
CN111191791B (zh) 基于机器学习模型的图片分类方法、装置及设备
WO2019228317A1 (zh) 人脸识别方法、装置及计算机可读介质
WO2020215557A1 (zh) 医学影像解释方法、装置、计算机设备及存储介质
WO2021068323A1 (zh) 多任务面部动作识别模型训练方法、多任务面部动作识别方法、装置、计算机设备和存储介质
EP4163831A1 (en) Neural network distillation method and device
WO2019024568A1 (zh) 眼底图像处理方法、装置、计算机设备和存储介质
CN111368672A (zh) 一种用于遗传病面部识别模型的构建方法及装置
WO2022178997A1 (zh) 医学影像配准方法、装置、计算机设备及存储介质
Chen et al. Development of a computer-aided tool for the pattern recognition of facial features in diagnosing Turner syndrome: comparison of diagnostic accuracy with clinical workers
WO2022227214A1 (zh) 分类模型训练方法、装置、终端设备及存储介质
Patra et al. Learning spatio-temporal aggregation for fetal heart analysis in ultrasound video
WO2023142532A1 (zh) 一种推理模型训练方法及装置
Nigudgi et al. Lung cancer CT image classification using hybrid-SVM transfer learning approach
Tang et al. Lesion segmentation and RECIST diameter prediction via click-driven attention and dual-path connection
Wang et al. A cell phone app for facial acne severity assessment
Izumi et al. Detecting hand joint ankylosis and subluxation in radiographic images using deep learning: A step in the development of an automatic radiographic scoring system for joint destruction
Qiu et al. A novel tongue feature extraction method on mobile devices
Farhan et al. MCLSG: Multi-modal classification of lung disease and severity grading framework using consolidated feature engineering mechanisms
Madhu et al. Intelligent diagnostic model for malaria parasite detection and classification using imperative inception-based capsule neural networks
Viscaino et al. Computer-aided ear diagnosis system based on CNN-LSTM hybrid learning framework for video otoscopy examination
Al. Shawesh et al. Enhancing histopathological colorectal cancer image classification by using convolutional neural network
Nainwal et al. A comprehending deep learning approach for disease classification

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20920699

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20920699

Country of ref document: EP

Kind code of ref document: A1