CN117649565A - Model training method, training device and medical image classification method


Info

Publication number
CN117649565A
Authority
CN
China
Prior art keywords
medical image
training
medical
classification model
image
Prior art date
Legal status
Granted
Application number
CN202410122839.6A
Other languages
Chinese (zh)
Other versions
CN117649565B (en)
Inventor
黄莉莉
刘明
李成龙
江波
赵海峰
汤进
Current Assignee
Anhui University
Original Assignee
Anhui University
Priority date
Filing date
Publication date
Application filed by Anhui University
Priority to CN202410122839.6A
Publication of CN117649565A
Application granted
Publication of CN117649565B
Legal status: Active
Anticipated expiration


Classifications

    • G06V10/764: Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06N3/0464: Convolutional networks [CNN, ConvNet]
    • G06N3/084: Backpropagation, e.g. using gradient descent
    • G06T7/0012: Biomedical image inspection
    • G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis
    • G06V10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V10/82: Image or video recognition or understanding using neural networks
    • G06T2207/10116: X-ray image
    • G06T2207/20081: Training; learning
    • G06T2207/20084: Artificial neural networks [ANN]
    • G06T2207/30061: Lung
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a model training method, a training device, and a medical image classification method. A medical image training set is acquired, comprising a plurality of medical images together with their corresponding disease label information and clinical information. The training set is divided into a plurality of data blocks, and an image-level relation matrix is constructed for each data block based on the disease label information and/or clinical information of the medical images it contains. The data blocks and their corresponding image-level relation matrices are then sequentially input, batch by batch, into a pre-trained image classification model for training, yielding a trained medical image classification model. By introducing the image-level relation matrix in the training stage, the visual feature embedding is optimized and the model is helped to understand the images, improving the classification performance of the medical image classification model.

Description

Model training method, training device and medical image classification method
Technical Field
The present disclosure relates to the field of computer technology and medical image processing technology, and in particular, to a model training method, a training device, and a medical image classification method.
Background
Computer-aided diagnosis and detection have long been important research directions in the medical field. Chest X-ray examination is a common method for screening chest diseases; it is inexpensive, fast to capture, and able to reveal a variety of lesions, and is therefore widely used in clinical practice.
In early medical image processing applications, features typically had to be defined manually and then fed to a machine learning algorithm for subsequent classification. With the development of deep learning and the wide adoption of convolutional neural networks, the mainstream approach has shifted to automatically extracting image features and classifying with a neural network model: a pre-trained neural network automatically extracts features from the input image, and these visual image features are passed to a classifier for training, producing a trained classification model. This approach captures complex features in the image better and reduces the workload of manually defining features.
Typically, a classification task requires labels corresponding to the original image data, and in a multi-label classification task one image may correspond to several different labels. Traditional disease classification algorithms only associate each image with its own label information and fail to exploit the potential explicit correlations between different images for accurate disease diagnosis, which leads to longer training times and less than ideal image classification performance.
Disclosure of Invention
In view of the foregoing drawbacks of the prior art, it is an object of the present application to provide a model training method, a training apparatus and a medical image classification method, which overcome, at least in part, one or more of the problems due to the limitations and disadvantages of the related art.
To achieve the above and other related objects, the present application provides a medical image classification model training method, including:
acquiring a medical image training set, wherein the medical image training set comprises a plurality of medical images and corresponding disease label information and clinical information thereof;
dividing the medical image training set into a plurality of data blocks, and constructing an image-level relation matrix of each data block based on disease label information and/or clinical information corresponding to medical images in each data block;
and sequentially inputting the data blocks and the corresponding image-level relation matrix into a pre-training image classification model in a batch mode for model training so as to obtain a trained medical image classification model.
In an alternative embodiment of the present application, obtaining a medical image training set includes:
acquiring an original medical image data set, wherein the original medical image data set comprises a plurality of original medical images, and corresponding disease labels and clinical information thereof;
and downsampling the original medical images in the original medical image dataset, and performing a probabilistic flip or rotation operation on the downsampled original medical images to obtain the medical image training set.
In an optional embodiment of the present application, the original medical images in the original medical image dataset are downsampled by center cropping, and a probabilistic flip or rotation operation is performed on the downsampled original medical images to obtain the medical image training set.
In an optional embodiment of the present application, constructing an image-level relationship matrix of each data block based on disease label information and/or clinical information corresponding to a medical image in each data block includes:
constructing a label confusion matrix of each data block based on disease label information corresponding to the medical image in each data block; and/or
Constructing a corresponding clinical information relation matrix of the data block based on clinical information corresponding to the medical image in each data block, wherein each clinical information corresponds to one clinical information relation matrix;
the label confusion matrix and/or the clinical information relationship matrix are used as the image-level relationship matrix.
In an alternative embodiment of the present application, the clinical information includes patient ID, patient age, patient sex, and imaging position information.
In an optional embodiment of the present application, constructing a label confusion matrix of each data block based on disease label information corresponding to a medical image in each data block includes:
calculating the inner product of disease label information of all medical images in each data block;
and normalizing the inner product to be used as a label confusion matrix of each data block.
In an alternative embodiment of the present application, the pre-trained image classification model includes a VGG model, a ResNet model, or a DenseNet model.
In an alternative embodiment of the present application, the medical image is a chest X-ray image.
To achieve the above and other related objects, the present application further provides a medical image classification model training apparatus, including:
the data acquisition module is used for acquiring a medical image training set, wherein the medical image training set comprises a plurality of medical images and corresponding disease label information and clinical information thereof;
the relation matrix construction module is used for dividing the medical image training set into a plurality of data blocks and constructing an image-level relation matrix of each data block based on disease label information and/or clinical information corresponding to medical images in each data block;
and the model training module is used for sequentially inputting the data blocks and the corresponding image-level relation matrix into the pre-training image classification model in a batch mode to perform model training so as to obtain a trained medical image classification model.
To achieve the above and other related objects, the present application further provides a medical image classification method, including:
acquiring medical images to be classified;
inputting the medical images to be classified into a medical image classification model trained and acquired based on the model training method so as to acquire the disease labels of the medical images to be classified.
According to the model training method of the present application, a medical image training set is obtained, comprising a plurality of medical images and their corresponding disease label information and clinical information; the training set is divided into a plurality of data blocks, and an image-level relation matrix of each data block is constructed based on the disease label information and/or clinical information corresponding to the medical images in each data block; and the data blocks and the corresponding image-level relation matrices are sequentially input, in batches, into a pre-trained image classification model for model training, so as to obtain a trained medical image classification model. By introducing the image-level relation matrix in the model training stage, the visual feature embedding is optimized and the classification model is helped to understand the images, improving the classification performance of the medical image classification model.
Drawings
Fig. 1 is a schematic flow chart of a medical image classification model training method of the present application.
Fig. 2 shows a sub-flowchart of step S10.
Fig. 3 shows a sub-flowchart of step S20.
Fig. 4 is a functional block diagram of the medical image classification model training apparatus according to the present application.
Detailed Description
Other advantages and effects of the present application will become readily apparent to those skilled in the art from the following description of the embodiments taken in conjunction with the accompanying drawings. The present application may also be embodied or applied in other specific embodiments, and the details herein may be modified or changed from various points of view and for various applications without departing from the spirit of the present application.
Please refer to figs. 1-4. It should be noted that the illustrations provided in this embodiment merely illustrate the basic concept of the application schematically; the drawings therefore show only the components related to the application rather than the number, shape, and size of the components in an actual implementation, in which the form, quantity, and proportion of the components may change arbitrarily and the layout may be more complex.
FIG. 1 is a flow chart of a preferred embodiment of the medical image classification model training method of the present application. Referring to fig. 1, the medical image classification model training method includes the following steps:
S10: acquiring a medical image training set, wherein the medical image training set comprises a plurality of medical images and their corresponding disease label information and clinical information;
S20: dividing the medical image training set into a plurality of data blocks, and constructing an image-level relation matrix of each data block based on the disease label information and/or clinical information corresponding to the medical images in each data block;
S30: sequentially inputting the data blocks and the corresponding image-level relation matrices into a pre-trained image classification model in batches for model training, so as to obtain a trained medical image classification model.
The technical solutions of the present application will be described in detail below in connection with specific application examples.
First, step S10 is performed: acquiring a medical image training set, wherein the medical image training set comprises a plurality of medical images and their corresponding disease label information and clinical information.
Before model training, a large amount of existing medical image data needs to be collected as an original medical image dataset. The dataset comprises diagnostic labels in one-to-one correspondence with the original medical images, reflecting disease information, together with the clinical information corresponding to each medical image, such as patient ID, patient age, patient sex, and imaging position. Each diagnostic label should clearly identify the state of the corresponding medical image with respect to a specified disease, i.e. negative (0) or positive (1). The medical image dataset used for training is obtained by preprocessing the original medical images in the original medical image dataset.
It should be noted that the medical images and their corresponding diagnostic labels and clinical information may be obtained from a database of historical archives or from an open-source online database, but are not limited thereto.
The medical image may be, for example, a chest X-ray image. Assume a medical image has four corresponding diagnostic tags: normal thorax, abnormal trachea, abnormal lung parenchyma, and normal costophrenic angle. Its label vector is then 0, 1, 1, 0 (positive/1 denotes abnormal, negative/0 denotes normal).
When preprocessing the original medical images in the original medical image dataset to obtain the medical image dataset for training, the original medical images may be downsampled to obtain normalized medical images, and a probabilistic flip or rotation operation may be applied to the downsampled original medical images. Of course, in some embodiments the probabilistic flip or rotation step may be omitted.
In a specific embodiment, for example, the original medical images in the original medical image dataset may be downsampled by center cropping, and probabilistic flipping or rotation and normalization are applied to the downsampled images to obtain the training dataset containing the medical image training set, where the downsampled image size and the flip or rotation probability may be adjusted according to actual requirements. The probabilistic flip or rotation operation not only enlarges the dataset but also improves the robustness of the classification model.
As an example, the original medical images in the original medical image dataset may be downsampled to 224×224 by center cropping, each downsampled medical image may be flipped or rotated with a probability of 50%, and the matrices of the downsampled and flipped or rotated medical images may then be normalized using the parameters provided by ImageNet, yielding a standardized matrix form of each image (i.e., the initial image features) as input data for the classification model.
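As an illustration of this preprocessing, the following is a minimal NumPy sketch; the ImageNet mean and standard deviation below are the commonly published values, and the crop size and flip probability mirror the example above but are otherwise assumptions:

```python
import numpy as np

# Commonly published ImageNet normalization statistics (an assumption here).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def center_crop(img, size=224):
    """Center-crop an H x W x C image array to size x size."""
    h, w = img.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return img[top:top + size, left:left + size]

def preprocess(img, flip_prob=0.5, rng=None):
    """Center-crop, flip horizontally with the given probability,
    and normalize with the ImageNet mean and standard deviation."""
    if rng is None:
        rng = np.random.default_rng()
    img = center_crop(img)
    if rng.random() < flip_prob:   # probabilistic flip for augmentation
        img = img[:, ::-1]         # horizontal flip
    img = img.astype(np.float64) / 255.0
    return (img - IMAGENET_MEAN) / IMAGENET_STD
```

A 256×300 chest X-ray array, for example, comes out as a normalized 224×224×3 feature matrix ready to feed into the classification model.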
Before training, the medical image dataset for training needs to be divided into a medical image training set, a medical image validation set, and a medical image test set, which are used for training and testing the subsequent medical image classification model.
Next, step S20 is performed: dividing the medical image training set into a plurality of data blocks, and constructing an image-level relation matrix of each data block based on the disease label information and/or clinical information corresponding to the medical images in each data block.
In order to reduce memory consumption, speed up training, increase randomness, and improve the generalization ability of the model, this embodiment trains the model in batches. The medical image training set is divided into several data blocks of equal size according to the batch size, ensuring that each data block contains the same number of medical images, and the blocks are then fed batch by batch into the pre-trained image classification model for training.
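The division into equally sized data blocks described above can be sketched as follows; dropping any remainder samples is an assumption made here so that every block holds the same number of images:

```python
def split_into_blocks(samples, batch_size):
    """Split a dataset into equally sized data blocks, dropping any
    remainder so every block holds the same number of medical images."""
    n_blocks = len(samples) // batch_size
    return [samples[i * batch_size:(i + 1) * batch_size]
            for i in range(n_blocks)]
```

For instance, ten samples with a batch size of 3 yield three blocks of three images each, and the tenth sample is dropped.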
When training the model, the inventors use the correlations of clinical information and disease label information between medical images as an additional input to model training, so that, like a radiologist, the classification model can fully exploit the potential explicit correlations between different medical images. This helps the model understand the medical images and keeps the classification model consistent, thereby obtaining better interpretability, accelerating model training, and improving the classification ability of the model.
Before the data blocks are fed batch by batch into the pre-trained image classification model for training, a label confusion matrix of each data block is constructed based on the disease label information of the medical images in the block, and a clinical information relation matrix of the block is constructed based on the clinical information of the medical images in the block, one relation matrix per item of clinical information. The label confusion matrix and the clinical information relation matrices together serve as the image-level relation matrix; that is, the image-level relation matrix comprises the label confusion matrix and one clinical information relation matrix for each item of clinical information.
In a specific embodiment, when constructing the clinical information relation matrix for each item of clinical information, if a given item of clinical information is the same for two medical images, the corresponding entry is set to 1, and otherwise to 0. Taking clinical information consisting of patient ID, patient age, patient sex, and imaging position as an example: when the patient IDs of two medical images are identical the entry is set to 1, and otherwise to 0, forming an n-row, n-column clinical information relation matrix for patient ID. Clinical information relation matrices of n rows and n columns are likewise constructed for patient age, patient sex, and imaging position, where n is the number of medical images involved.
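The construction of one binary relation matrix per clinical attribute can be sketched as follows; the record field names (patient_id, sex, view) are illustrative, not taken from the patent:

```python
import numpy as np

def clinical_relation_matrix(records, key):
    """Build an n x n binary relation matrix for one clinical attribute:
    entry (i, j) is 1 when images i and j share the same value for `key`."""
    values = [r[key] for r in records]
    n = len(values)
    mat = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            mat[i, j] = int(values[i] == values[j])
    return mat

# One relation matrix per clinical attribute (field names are illustrative).
records = [
    {"patient_id": "p1", "sex": "F", "view": "PA"},
    {"patient_id": "p2", "sex": "F", "view": "AP"},
    {"patient_id": "p1", "sex": "M", "view": "PA"},
]
id_matrix = clinical_relation_matrix(records, "patient_id")
```

Here images 0 and 2 belong to the same patient, so the patient-ID matrix has 1 at positions (0, 2) and (2, 0); separate matrices are built the same way for sex, age, and imaging position.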
In a specific embodiment, when constructing the label confusion matrix of each data block based on the disease label information corresponding to the medical images in each data block, the inner product N of the disease label information of all medical images in each data block may be calculated based on the following formula:

N = L · L^T

where L is the label matrix corresponding to the medical images in the same batch of data blocks, a matrix of n rows and m columns, n being the number of medical images in a single data block and m the number of disease labels of a single medical image; L^T is the transpose of L. The inner product N is a symmetric matrix of n rows and n columns.
Then, since the inner product N is a symmetric matrix of n rows and n columns, the inner product may be normalized based on one of the following two equations, for example

M_{i,j} = N_{i,j} / N_{i,i} or M_{i,j} = N_{i,j} / sqrt(N_{i,i} · N_{j,j})

as the label confusion matrix M of each data block, where M_{i,j} is the value in the i-th row and j-th column of the label confusion matrix M, and N_{i,j} is the value in the i-th row and j-th column of the inner product N.
The process of obtaining the inner product N and the label confusion matrix M is described below with a specific example.

Suppose a data block contains five medical images and each medical image has four disease labels, with the label vectors of the five medical images being, in sequence, 0,1,…,0; 1,0,1,0; 1,…,0; 1,0,…,1; and 1,0,0,1. The label matrix L corresponding to the medical images of the data block is then the matrix of 5 rows and 4 columns formed by stacking these five label vectors, the inner product N = L · L^T of the disease label information of all medical images in the data block is the corresponding symmetric matrix of 5 rows and 5 columns, and the normalized label confusion matrix M is obtained by normalizing N.
finally, step S130 is performed: and sequentially inputting the data blocks and the corresponding image-level relation matrix into a pre-training image classification model in a batch mode for model training so as to obtain a trained medical image classification model.
When the data blocks are input batch by batch, each data block together with its corresponding label confusion matrix and clinical information relation matrices is input into the pre-trained image classification model as the image-level relation matrix to participate in training. The potential explicit correlations between different medical images are thus fully exploited to help the model understand the medical images, and the consistency of the model across related images is maintained, so as to obtain better interpretability, accelerate model training, and improve the classification ability of the model. During training, the average loss is calculated, backpropagation is performed, and the parameters are updated, completing the fine-tuning of the pre-trained image classification model and yielding the trained medical image classification model. In this embodiment, the pre-trained image classification model is a ResNet model; it will be appreciated that it may also be replaced with other models such as a VGG model or a DenseNet model.
Specifically, during training the data are packaged and input according to the model's requirements. The input comprises the initial image features, the disease labels corresponding to the images, and the corresponding image-level relation matrices. The initial image features are fed into the classification model, abstract image features are obtained through multiple convolution operations, and the classifier outputs the model's prediction probabilities. The prediction probability and the ground-truth label value are fed into the loss formula to obtain the per-sample model loss; all per-sample losses are accumulated and averaged to obtain the model's average loss, and the gradients are backpropagated to update the trainable parameters in the model, completing one training iteration. The process is repeated until the average loss converges below a threshold, ending model training.
The average loss value loss for each training iteration is calculated according to the following formula:

loss = -(1/n) · Σ_{i=1}^{n} [ y_i · log(p_i) + (1 - y_i) · log(1 - p_i) ]

where n is the total number of medical image samples in a batch of data blocks, y_i is the disease label of the i-th medical image, and p_i is the probability predicted by the model that the i-th medical image is positive.
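The described loss (per-sample losses accumulated and averaged, with a 0/1 disease label and a predicted probability of positive) matches standard binary cross-entropy; a minimal sketch under that assumption:

```python
import math

def average_bce_loss(labels, probs, eps=1e-12):
    """Average binary cross-entropy over a batch: for each sample,
    -(y * log(p) + (1 - y) * log(1 - p)), then the mean over the batch."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)   # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)
```

A perfectly uncertain prediction (p = 0.5 for a positive sample) gives a loss of log 2, while confident correct predictions drive the loss toward zero.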
After fine-tuning the pre-trained image classification model with the medical image training set, the resulting medical image classification model also needs to be tested with the medical image test set. Specifically, the trained medical image classification model predicts the prediction probability for each medical image in the test set; during testing, a preset prediction probability threshold is used to divide the model prediction probabilities into the two classes 0 and 1, representing negative and positive respectively, so as to evaluate the classification performance of the trained model. In this example the threshold is set to 0.5, so samples with a predicted probability greater than 0.5 are predicted positive and samples with a probability less than 0.5 are predicted negative.
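The test-stage thresholding can be sketched as follows; mapping a probability of exactly 0.5 to negative is an assumption, since the text does not specify the boundary case:

```python
def binarize(probs, threshold=0.5):
    """Map model prediction probabilities to negative (0) / positive (1)
    using a preset threshold, as in the test stage described above."""
    return [1 if p > threshold else 0 for p in probs]
```

The binarized predictions can then be compared against the ground-truth labels of the test set to measure classification performance.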
In practical application, the medical image to be classified is acquired, the preprocessing of step S10 is applied to it, and it is then input into the trained medical image classification model, which predicts the disease label of the medical image to be classified.
With the method and device of the present application, the image-level relation matrix introduced in the model training stage optimizes the visual feature embedding and uses the potential explicit correlations between different medical images to help the classification model understand the images, which improves the classification performance of the medical image classification model and accelerates training.
As shown in fig. 4, fig. 4 is a functional block diagram of a preferred embodiment of the medical image classification model training apparatus 11 of the present application. The medical image classification model training device 11 includes a data acquisition module 111, a relationship matrix construction module 112, and a model training module 113.
The data acquisition module 111 is configured to acquire a medical image training set, where the medical image training set includes a plurality of medical images and their corresponding disease label information and clinical information; the relationship matrix construction module 112 is configured to divide the medical image training set into a plurality of data blocks, and to construct an image-level relationship matrix for each data block based on the disease label information and/or clinical information corresponding to the medical images in that data block; the model training module 113 is configured to sequentially input the data blocks and their corresponding image-level relationship matrices, batch by batch, into a pre-trained image classification model for model training, so as to obtain a trained medical image classification model.
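One concrete form of the image-level relationship matrix, described for the method in claims 4 and 6 below, is a label confusion matrix built from the inner products of the disease-label vectors within a data block. The sketch below illustrates this under stated assumptions: the labels are multi-hot vectors, and the normalization scheme (dividing by the maximum inner product) is an assumption, since the patent only specifies that the inner products are normalized.

```python
import numpy as np

def label_confusion_matrix(labels):
    """Build an image-level label confusion matrix for one data block.

    `labels` is a (batch, num_labels) array of multi-hot disease-label
    vectors.  Pairwise inner products measure how many disease labels two
    images share; the result is normalized (here by its maximum entry,
    an illustrative choice) so the matrix values lie in [0, 1].
    """
    labels = np.asarray(labels, dtype=float)
    inner = labels @ labels.T            # (batch, batch) pairwise inner products
    denom = inner.max()
    return inner / denom if denom > 0 else inner
```

During training, this matrix would be passed to the model together with its data block, supplying the cross-image correlation signal that the relationship matrix construction module 112 provides.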
It should be noted that the medical image classification model training apparatus 11 of the present application is a virtual system corresponding to the medical image classification model training method described above, and that the functional modules of the apparatus 11 correspond to the respective steps of that method. The medical image classification model training apparatus 11 of the present application may be implemented in cooperation with the medical image classification model training method. The technical details described for the medical image classification model training method remain valid for the apparatus 11 and are not repeated here in order to reduce repetition. Correspondingly, the technical details described for the medical image classification model training apparatus 11 also apply to the medical image classification model training method.
It should be noted that the above functional modules may be fully or partially integrated into one physical entity, or may be physically separate. These modules may all be implemented as software invoked by a processing element, all be implemented in hardware, or be implemented partly as software invoked by a processing element and partly in hardware. In addition, all or some of the modules may be integrated together or implemented independently. The processing element described here may be an integrated circuit with signal-processing capability. In implementation, some or all of the steps of the above methods, or the above functional modules, may be carried out by integrated logic circuits in hardware within the processor element, or by instructions in the form of software.
In the description herein, numerous specific details are provided, such as examples of components and/or methods, to give a thorough understanding of embodiments of the present application. One skilled in the relevant art will recognize, however, that an embodiment of the application can be practiced without one or more of these specific details, or with other apparatus, systems, assemblies, methods, components, materials, parts, and so forth. In other instances, well-known structures, materials, or operations are not specifically shown or described in detail, to avoid obscuring aspects of embodiments of the present application.
It will also be appreciated that one or more of the elements shown in the figures may be implemented in a more separated or more integrated manner, or even removed when inoperable in certain circumstances or provided when useful, depending on the particular application.
In addition, any labeled arrows in the drawings/figures should be considered only as exemplary, and not limiting, unless otherwise specifically indicated. Furthermore, the term "or" as used herein is generally intended to mean "and/or" unless specified otherwise. Combinations of components or steps will also be considered as noted, where terminology regarding the ability to separate or combine is unclear.
The above description of illustrated embodiments of the present application, including what is described in the abstract, is not intended to be exhaustive or to limit the application to the precise forms disclosed herein. Although specific embodiments of, and examples for, the application are described herein for illustrative purposes only, various equivalent modifications are possible within the spirit and scope of the present application, as those skilled in the relevant art will recognize and appreciate. As noted, these modifications may be made to the present application in light of the foregoing description of illustrated embodiments, and are to be included within the spirit and scope of the present application.
The systems and methods have been described herein in general terms as being helpful in understanding the details of the present application. Furthermore, various specific details have been given to provide a general understanding of embodiments of the present application. One skilled in the relevant art will recognize, however, that the embodiments of the application can be practiced without one or more of the specific details, or with other apparatus, systems, assemblies, methods, components, materials, parts, and/or the like. In other instances, well-known structures, materials, and/or operations are not specifically shown or described in detail to avoid obscuring aspects of embodiments of the present application.
Thus, although the present application has been described herein with reference to particular embodiments thereof, a latitude of modification, various changes, and substitutions are intended in the foregoing disclosures, and it will be appreciated that in some instances some features of the application will be employed without a corresponding use of other features, without departing from the scope and spirit of the application as set forth. Therefore, many modifications may be made to adapt a particular situation or material to the essential scope and spirit of the present application. It is intended that the application not be limited to the particular terms used in the following claims and/or to the particular embodiment disclosed as the best mode contemplated for carrying out this application, but that the application will include any and all embodiments and equivalents falling within the scope of the appended claims. Accordingly, the scope of the present application is to be determined solely by the appended claims.

Claims (10)

1. A medical image classification model training method, comprising:
acquiring a medical image training set, wherein the medical image training set comprises a plurality of medical images and corresponding disease label information and clinical information thereof;
dividing the medical image training set into a plurality of data blocks, and constructing an image-level relation matrix of each data block based on disease label information and/or clinical information corresponding to medical images in each data block;
and sequentially inputting the data blocks and the corresponding image-level relation matrix into a pre-training image classification model in a batch mode for model training so as to obtain a trained medical image classification model.
2. The method of claim 1, wherein obtaining a training set of medical images comprises:
acquiring an original medical image data set, wherein the original medical image data set comprises a plurality of original medical images, and corresponding disease labels and clinical information thereof;
and downsampling the original medical image in the original medical image data set, and performing probability inversion or probability rotation operation on the downsampled original medical image to acquire the medical image training set.
3. The medical image classification model training method according to claim 2, wherein the original medical image in the original medical image dataset is downsampled by adopting a center clipping mode, and the downsampled original medical image is subjected to a probability inversion or probability rotation operation to obtain the medical image training set.
4. The medical image classification model training method according to claim 1, wherein constructing an image-level relationship matrix for each of the data blocks based on disease label information and/or clinical information corresponding to the medical image in each of the data blocks comprises:
constructing a label confusion matrix of each data block based on disease label information corresponding to the medical image in each data block; and/or
Constructing a corresponding clinical information relation matrix of the data block based on clinical information corresponding to the medical image in each data block, wherein each clinical information corresponds to one clinical information relation matrix;
the label confusion matrix and/or the clinical information relationship matrix are used as the image-level relationship matrix.
5. The medical image classification model training method of claim 4, wherein the clinical information includes patient ID, patient age, patient gender, and subject position information.
6. The medical image classification model training method of claim 4, wherein constructing a label confusion matrix for each of the data blocks based on disease label information corresponding to medical images in each of the data blocks comprises:
calculating the inner product of disease label information of all medical images in each data block;
and normalizing the inner product to be used as a label confusion matrix of each data block.
7. The medical image classification model training method of claim 1, wherein the pre-trained image classification model comprises a VGG model, a ResNet model, or a DenseNet model.
8. The medical image classification model training method according to any one of claims 1-7, wherein the medical image is a chest X-ray image.
9. A medical image classification model training device, comprising:
the data acquisition module is used for acquiring a medical image training set, wherein the medical image training set comprises a plurality of medical images and corresponding disease label information and clinical information thereof;
the relation matrix construction module is used for dividing the medical image training set into a plurality of data blocks and constructing an image-level relation matrix of each data block based on disease label information and/or clinical information corresponding to medical images in each data block;
and the model training module is used for sequentially inputting the data blocks and the corresponding image-level relation matrix into the pre-training image classification model in a batch mode to perform model training so as to obtain a trained medical image classification model.
10. A medical image classification method, comprising:
acquiring medical images to be classified;
inputting the medical image to be classified into a medical image classification model trained and acquired based on the method of any one of claims 1-8 to acquire a disease label of the medical image to be classified.
CN202410122839.6A 2024-01-30 2024-01-30 Model training method, training device and medical image classification method Active CN117649565B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410122839.6A CN117649565B (en) 2024-01-30 2024-01-30 Model training method, training device and medical image classification method

Publications (2)

Publication Number Publication Date
CN117649565A true CN117649565A (en) 2024-03-05
CN117649565B CN117649565B (en) 2024-05-28

Family

ID=90048114

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410122839.6A Active CN117649565B (en) 2024-01-30 2024-01-30 Model training method, training device and medical image classification method

Country Status (1)

Country Link
CN (1) CN117649565B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210295506A1 (en) * 2020-03-19 2021-09-23 Light AI Inc. Infection detection using image data analysis
WO2021229288A1 (en) * 2020-05-14 2021-11-18 Vangipuram Radhakrishna System and method for diagnosis of diseases from medical images
CN114494263A (en) * 2022-04-19 2022-05-13 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Medical image lesion detection method, system and equipment integrating clinical information
CN114611605A (en) * 2022-03-11 2022-06-10 华南理工大学 Method, system, device and medium for classifying relation network small sample images
US20230116897A1 (en) * 2021-10-08 2023-04-13 Arizona Board Of Regents On Behalf Of Arizona State University Systems, methods, and apparatuses for implementing systematic benchmarking analysis to improve transfer learning for medical image analysis
CN116433969A (en) * 2023-03-24 2023-07-14 安徽大学 Zero sample image recognition method, system and storable medium
US20230306723A1 (en) * 2022-03-25 2023-09-28 Arizona Board Of Regents On Behalf Of Arizona State University Systems, methods, and apparatuses for implementing self-supervised domain-adaptive pre-training via a transformer for use with medical image classification
CN117218094A (en) * 2023-09-21 2023-12-12 上海鸢理冠智能科技有限公司 Method, equipment and medium for acquiring characteristic information of pathological image
CN117315379A (en) * 2023-11-29 2023-12-29 中电科大数据研究院有限公司 Deep learning-oriented medical image classification model fairness evaluation method and device

Also Published As

Publication number Publication date
CN117649565B (en) 2024-05-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant