CN113888470B - Diagnosis method and device based on convolutional neural network and multi-modal medical image


Info

Publication number
CN113888470B
CN113888470B (application CN202111038996.1A)
Authority
CN
China
Prior art keywords
convolutional neural network
medical image
multiple modalities
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111038996.1A
Other languages
Chinese (zh)
Other versions
CN113888470A (en)
Inventor
徐枫
周展平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202111038996.1A
Publication of CN113888470A
Application granted
Publication of CN113888470B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G06T 7/0012 - Biomedical image inspection
    • G06F 18/214 - Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/2415 - Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio
    • G06N 3/045 - Combinations of networks
    • G06N 3/08 - Learning methods
    • G16H 30/00 - ICT specially adapted for the handling or processing of medical images
    • G16H 50/20 - ICT for computer-aided diagnosis, e.g. based on medical expert systems
    • G06T 2207/20081 - Training; learning
    • G06T 2207/20084 - Artificial neural networks [ANN]
    • G06T 2207/30004 - Biomedical image processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Public Health (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Primary Health Care (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Epidemiology (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Quality & Reliability (AREA)
  • Databases & Information Systems (AREA)

Abstract

The application provides a diagnosis method based on a convolutional neural network and multi-modal medical images, which relates to the technical fields of computer vision, deep learning, and medical image processing, and comprises the following steps: collecting training data, wherein the training data comprise medical image samples of multiple modalities and the diagnostic annotation information corresponding to those samples; optimizing the network parameters of the convolutional neural network according to the training data to obtain an optimized classification network based on the convolutional neural network; and acquiring medical images of multiple modalities and inputting them into the classification network based on the convolutional neural network, so as to output the predicted probability of a disease corresponding to the multi-modal medical images, wherein the predicted probability represents the probability of having the disease and the probability of not having the disease, respectively. The application integrates medical images of multiple modalities for diagnosis, thereby improving the accuracy of disease diagnosis.

Description

Diagnosis method and device based on convolutional neural network and multi-modal medical image
Technical Field
The application relates to the technical fields of computer vision, deep learning, and medical image processing, and in particular to a diagnosis method and device based on a convolutional neural network and multi-modal medical images.
Background
In fields such as computer vision and natural language processing, making comprehensive use of data from multiple modalities is a common way to improve algorithm performance. In clinical practice, doctors likewise often draw on data from multiple modalities when making a diagnosis. Convolutional neural networks (CNNs), as an excellent image processing method, are also widely used in medical image processing. Common medical image types include CT, MRI, ultrasound images, and pathology images. However, existing computer-aided automatic diagnosis methods based on medical images use only images of a single modality. Medical images of different modalities reflect information about physiological structures from multiple angles, so combining them can improve diagnostic accuracy.
Disclosure of Invention
The present application aims to solve at least one of the technical problems in the related art to some extent.
Therefore, a first object of the present application is to provide a diagnosis method based on a convolutional neural network and multi-modal medical images, which solves the technical problem that existing computer-aided automatic diagnosis methods use only single-modality images, achieves diagnosis over multi-modal medical images with a deep learning method, and improves the accuracy of disease diagnosis.
A second object of the present application is to provide a diagnostic device based on a convolutional neural network and multi-modal medical images.
A third object of the present application is to propose a non-transitory computer readable storage medium.
To achieve the above objects, an embodiment of the first aspect of the present application provides a diagnosis method based on a convolutional neural network and multi-modal medical images, comprising: collecting training data, wherein the training data comprise medical image samples of multiple modalities and the diagnostic annotation information corresponding to the medical image samples of the multiple modalities; optimizing the network parameters of the convolutional neural network according to the training data to obtain an optimized classification network based on the convolutional neural network; and acquiring medical images of multiple modalities and inputting them into the classification network based on the convolutional neural network, so as to output the predicted probability of a disease corresponding to the multi-modal medical images, wherein the predicted probability represents the probability of having the disease and the probability of not having the disease, respectively.
Optionally, in an embodiment of the present application, the classification network based on the convolutional neural network comprises a plurality of feature extraction networks based on the convolutional neural network and a multi-layer perceptron, with each feature extraction network corresponding to one modality, and optimizing the network parameters of the convolutional neural network according to the training data to obtain the optimized classification network based on the convolutional neural network comprises the following steps (an illustrative sketch follows this list):
inputting the medical image samples of each modality among the medical image samples of the multiple modalities into the corresponding feature extraction network for feature extraction, wherein the extracted features are vectors;
concatenating the features of all modalities into a long vector, inputting the long vector into the multi-layer perceptron to output corresponding values, and applying softmax to the corresponding values to obtain the predicted probability of each category; and
obtaining a cross-entropy loss function of the predicted probability according to the predicted probability of each category, and optimizing the network parameters of the convolutional neural network according to the cross-entropy loss function.
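The patent does not fix a concrete backbone, the number of modalities, or the layer sizes, so the following Python (PyTorch) sketch is only one illustrative realization of the structure described above; the ResNet-18 feature extractors, the two modalities, the 256-unit hidden layer, and the two-class output are assumptions made here for concreteness, not details taken from the patent.

import torch
import torch.nn as nn
import torchvision.models as models

class MultiModalClassifier(nn.Module):
    """One CNN feature extractor per modality; the per-modality features are
    concatenated into a long vector and classified by a multi-layer perceptron."""

    def __init__(self, num_modalities=2, feat_dim=512, num_classes=2):
        super().__init__()
        # One feature-extraction CNN per modality (ResNet-18 backbones are an assumption).
        self.extractors = nn.ModuleList()
        for _ in range(num_modalities):
            backbone = models.resnet18(weights=None)
            backbone.fc = nn.Identity()  # keep the 512-dimensional pooled feature vector
            self.extractors.append(backbone)
        # Multi-layer perceptron applied to the concatenated (long) feature vector.
        self.mlp = nn.Sequential(
            nn.Linear(num_modalities * feat_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, images_per_modality):
        # images_per_modality: list with one (B, 3, H, W) tensor per modality.
        feats = [net(x) for net, x in zip(self.extractors, images_per_modality)]
        long_vector = torch.cat(feats, dim=1)  # concatenate the per-modality feature vectors
        return self.mlp(long_vector)           # the "corresponding values"; softmax is applied outside

Applying torch.softmax to the returned values yields the predicted probability of each category; keeping softmax outside the module also lets a standard cross-entropy loss be applied directly to these values during training.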
To achieve the above objects, an embodiment of the second aspect of the present application provides a diagnostic device based on a convolutional neural network and multi-modal medical images, comprising:
an acquisition module, configured to collect training data, wherein the training data comprise medical image samples of multiple modalities and the diagnostic annotation information corresponding to the medical image samples of the multiple modalities;
an optimization module, configured to optimize the network parameters of the convolutional neural network according to the training data to obtain an optimized classification network based on the convolutional neural network; and
a classification module, configured to acquire medical images of multiple modalities and input them into the classification network based on the convolutional neural network, so as to output the predicted probability of a disease corresponding to the multi-modal medical images, wherein the predicted probability represents the probability of having the disease and the probability of not having the disease, respectively.
Optionally, in an embodiment of the present application, the classification network based on the convolutional neural network comprises a plurality of feature extraction networks based on the convolutional neural network and a multi-layer perceptron, with each feature extraction network corresponding to one modality, and the optimization module is specifically configured to:
input the medical image samples of each modality among the medical image samples of the multiple modalities into the corresponding feature extraction network for feature extraction, wherein the extracted features are vectors;
concatenate the features of all modalities into a long vector, input the long vector into the multi-layer perceptron to output corresponding values, and apply softmax to the corresponding values to obtain the predicted probability of each category; and
obtain a cross-entropy loss function of the predicted probability according to the predicted probability of each category, and optimize the network parameters of the convolutional neural network according to the cross-entropy loss function.
To achieve the above objects, an embodiment of the third aspect of the present application provides a non-transitory computer-readable storage medium, wherein the diagnosis method based on a convolutional neural network and multi-modal medical images can be performed when instructions in the storage medium are executed by a processor.
The diagnosis method based on a convolutional neural network and multi-modal medical images, the corresponding diagnostic device, and the non-transitory computer-readable storage medium solve the technical problem that existing computer-aided automatic diagnosis methods use only single-modality images, achieve diagnosis over multi-modal medical images with a deep learning method, and improve the accuracy of disease diagnosis.
Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flowchart of a diagnosis method based on a convolutional neural network and multi-modal medical images according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a diagnostic device based on a convolutional neural network and multi-modal medical images according to a second embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present application and should not be construed as limiting the application.
The following describes a diagnosis method and apparatus based on a convolutional neural network and a multi-modal medical image according to an embodiment of the present application with reference to the accompanying drawings.
Fig. 1 is a flowchart of a diagnosis method based on a convolutional neural network and a multi-modal medical image according to an embodiment of the application.
As shown in fig. 1, the diagnosis method based on the convolutional neural network and the multi-modal medical image comprises the following steps:
Step 101, collecting training data, wherein the training data comprise medical image samples of multiple modalities and the diagnostic annotation information corresponding to the medical image samples of the multiple modalities;
Step 102, optimizing the network parameters of the convolutional neural network according to the training data to obtain an optimized classification network based on the convolutional neural network;
Step 103, acquiring medical images of multiple modalities and inputting them into the classification network based on the convolutional neural network, so as to output the predicted probability of a disease corresponding to the multi-modal medical images, wherein the predicted probability represents the probability of having the disease and the probability of not having the disease, respectively.
According to the diagnosis method based on a convolutional neural network and multi-modal medical images, training data are collected, wherein the training data comprise medical image samples of multiple modalities and the corresponding diagnostic annotation information; the network parameters of the convolutional neural network are optimized according to the training data to obtain an optimized classification network based on the convolutional neural network; and medical images of multiple modalities are acquired and input into the classification network, so that the predicted probability of a disease, representing the probability of having the disease and the probability of not having the disease respectively, is output. The method thus solves the technical problem that existing computer-aided automatic diagnosis methods use only single-modality images, achieves comprehensive diagnosis over multi-modal medical images with a deep learning method, and improves the accuracy of disease diagnosis.
After the multi-modal medical images and diagnostic labels used for training are obtained, the images of the different modalities are input into separate convolutional neural networks to extract features, all of the features are concatenated into a single vector that serves as the comprehensive multi-modal feature, and this feature is finally classified by a multi-layer perceptron. The network parameters are optimized through training, so that automatic computer-aided diagnosis of diseases is realized. The method mainly comprises the following steps: collecting medical images of multiple modalities of patients together with diagnostic annotations; training the classification network based on the convolutional neural network with these data to optimize the network parameters; and, in use, acquiring medical images of multiple modalities of a patient and inputting them into the classification network to obtain the predicted probability.
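As a purely illustrative sketch of the "use" step, the snippet below reuses the hypothetical MultiModalClassifier defined above; the checkpoint file name, the choice of CT and MRI as the two modalities, and the 224x224 preprocessing are assumptions rather than details disclosed in the application.

import torch

model = MultiModalClassifier(num_modalities=2)
model.load_state_dict(torch.load("multimodal_classifier.pt"))  # hypothetical checkpoint file
model.eval()

# One preprocessed image tensor per modality for the same patient.
ct_image = torch.randn(1, 3, 224, 224)   # placeholder for a real preprocessed CT image
mri_image = torch.randn(1, 3, 224, 224)  # placeholder for a real preprocessed MRI image

with torch.no_grad():
    values = model([ct_image, mri_image])
    probs = torch.softmax(values, dim=1)  # length-2 vector: probability of having / not having the disease
print(probs)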
Further, in an embodiment of the present application, the classification network based on the convolutional neural network comprises a plurality of feature extraction networks based on the convolutional neural network and a multi-layer perceptron, with each feature extraction network corresponding to one modality, and optimizing the network parameters of the convolutional neural network according to the training data to obtain the optimized classification network based on the convolutional neural network comprises:
inputting the medical image samples of each modality among the medical image samples of the multiple modalities into the corresponding feature extraction network for feature extraction, wherein the extracted features are vectors;
concatenating the features of all modalities into a long vector, inputting the long vector into the multi-layer perceptron to output corresponding values, and applying softmax to the corresponding values to obtain the predicted probability of each category; and
obtaining a cross-entropy loss function of the predicted probability according to the predicted probability of each category, and optimizing the network parameters of the convolutional neural network according to the cross-entropy loss function.
For example, for a two-class problem that diagnoses whether a disease is present, the final output is a vector of length 2, where the values are the probability of having the disease and the probability of not having the disease, respectively.
The whole multi-modal ultrasound image classification network is trained end to end with the medical images of the multiple modalities, so that the network parameters are optimized through a cross-entropy loss function on the predicted probabilities.
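A minimal end-to-end training sketch under the same assumptions is given below; train_loader is a hypothetical DataLoader yielding a list of per-modality image batches together with their diagnostic labels, and the Adam optimizer, learning rate, and epoch count are placeholders rather than values disclosed in the application.

import torch
import torch.nn as nn

model = MultiModalClassifier(num_modalities=2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()  # applies softmax and cross-entropy to the network's output values

for epoch in range(10):  # placeholder epoch count
    for modality_images, labels in train_loader:  # hypothetical DataLoader
        # modality_images: list with one (B, 3, H, W) tensor per modality;
        # labels: (B,) tensor of class indices (e.g. 0 = has the disease, 1 = does not; assumed encoding).
        values = model(modality_images)
        loss = criterion(values, labels)
        optimizer.zero_grad()
        loss.backward()   # end-to-end: gradients flow into every feature extractor and the MLP
        optimizer.step()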
Fig. 2 is a schematic structural diagram of a diagnostic device based on a convolutional neural network and a multi-modal medical image according to a second embodiment of the present application.
As shown in FIG. 2, the diagnostic device based on a convolutional neural network and multi-modal medical images comprises:
an acquisition module 10, configured to collect training data, wherein the training data comprise medical image samples of multiple modalities and the diagnostic annotation information corresponding to the medical image samples of the multiple modalities;
an optimization module 20, configured to optimize the network parameters of the convolutional neural network according to the training data to obtain an optimized classification network based on the convolutional neural network; and
a classification module 30, configured to acquire medical images of multiple modalities and input them into the classification network based on the convolutional neural network, so as to output the predicted probability of a disease corresponding to the multi-modal medical images, wherein the predicted probability represents the probability of having the disease and the probability of not having the disease, respectively.
Further, in an embodiment of the present application, the classification network based on the convolutional neural network comprises a plurality of feature extraction networks based on the convolutional neural network and a multi-layer perceptron, with each feature extraction network corresponding to one modality, and the optimization module is specifically configured to:
input the medical image samples of each modality among the medical image samples of the multiple modalities into the corresponding feature extraction network for feature extraction, wherein the extracted features are vectors;
concatenate the features of all modalities into a long vector, input the long vector into the multi-layer perceptron to output corresponding values, and apply softmax to the corresponding values to obtain the predicted probability of each category; and
obtain a cross-entropy loss function of the predicted probability according to the predicted probability of each category, and optimize the network parameters of the convolutional neural network according to the cross-entropy loss function.
With the diagnostic device based on a convolutional neural network and multi-modal medical images, the acquisition module collects training data comprising medical image samples of multiple modalities and the corresponding diagnostic annotation information; the optimization module optimizes the network parameters of the convolutional neural network according to the training data to obtain an optimized classification network based on the convolutional neural network; and the classification module acquires medical images of multiple modalities and inputs them into the classification network, so as to output the predicted probability of a disease, which represents the probability of having the disease and the probability of not having the disease, respectively. The device thus solves the technical problem that existing computer-aided automatic diagnosis methods use only single-modality images, achieves comprehensive diagnosis over multi-modal medical images with a deep learning method, and improves the accuracy of disease diagnosis.
In order to implement the above embodiment, the present invention also proposes a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the diagnostic method based on a convolutional neural network and a multi-modal medical image of the above embodiment.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and additional implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order from that shown or discussed, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.
Logic and/or steps represented in the flowcharts or otherwise described herein, for example an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch the instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be electronically captured, for instance via optical scanning of the paper or other medium, and then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like. While embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the application, and that variations, modifications, substitutions, and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the application.

Claims (3)

1. A diagnosis method based on a convolutional neural network and multi-modal medical images, characterized by comprising the following steps:
collecting training data, wherein the training data comprise medical image samples of multiple modalities and the diagnostic annotation information corresponding to the medical image samples of the multiple modalities;
optimizing the network parameters of the convolutional neural network according to the training data to obtain an optimized classification network based on the convolutional neural network; and
acquiring medical images of multiple modalities and inputting the medical images of the multiple modalities into the classification network based on the convolutional neural network, so as to output the predicted probability of a disease corresponding to the multi-modal medical images, wherein the predicted probability represents the probability of having the disease and the probability of not having the disease, respectively;
wherein the classification network based on the convolutional neural network comprises a plurality of feature extraction networks based on the convolutional neural network and a multi-layer perceptron, each feature extraction network corresponding to one modality, and optimizing the network parameters of the convolutional neural network according to the training data to obtain the optimized classification network based on the convolutional neural network comprises:
inputting the medical image samples of each modality among the medical image samples of the multiple modalities into the corresponding feature extraction network for feature extraction, wherein the extracted features are vectors;
concatenating the features of all modalities into a long vector, inputting the long vector into the multi-layer perceptron to output corresponding values, and applying softmax to the corresponding values to obtain the predicted probability of each category; and
obtaining a cross-entropy loss function of the predicted probability according to the predicted probability of each category, and optimizing the network parameters of the convolutional neural network according to the cross-entropy loss function.
2. A diagnostic device based on a convolutional neural network and multi-modal medical images, comprising:
an acquisition module, configured to collect training data, wherein the training data comprise medical image samples of multiple modalities and the diagnostic annotation information corresponding to the medical image samples of the multiple modalities;
an optimization module, configured to optimize the network parameters of the convolutional neural network according to the training data to obtain an optimized classification network based on the convolutional neural network; and
a classification module, configured to acquire medical images of multiple modalities and input the medical images of the multiple modalities into the classification network based on the convolutional neural network, so as to output the predicted probability of a disease corresponding to the multi-modal medical images, wherein the predicted probability represents the probability of having the disease and the probability of not having the disease, respectively;
wherein the classification network based on the convolutional neural network comprises a plurality of feature extraction networks based on the convolutional neural network and a multi-layer perceptron, each feature extraction network corresponding to one modality, and the optimization module is specifically configured to:
input the medical image samples of each modality among the medical image samples of the multiple modalities into the corresponding feature extraction network for feature extraction, wherein the extracted features are vectors;
concatenate the features of all modalities into a long vector, input the long vector into the multi-layer perceptron to output corresponding values, and apply softmax to the corresponding values to obtain the predicted probability of each category; and
obtain a cross-entropy loss function of the predicted probability according to the predicted probability of each category, and optimize the network parameters of the convolutional neural network according to the cross-entropy loss function.
3. A non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the diagnosis method based on a convolutional neural network and multi-modal medical images as claimed in claim 1.
CN202111038996.1A 2021-09-06 2021-09-06 Diagnosis method and device based on convolutional neural network and multi-modal medical image Active CN113888470B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111038996.1A CN113888470B (en) 2021-09-06 2021-09-06 Diagnosis method and device based on convolutional neural network and multi-modal medical image

Publications (2)

Publication Number Publication Date
CN113888470A 2022-01-04
CN113888470B 2024-07-05

Family

ID=79008279

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111038996.1A Active CN113888470B (en) 2021-09-06 2021-09-06 Diagnosis method and device based on convolutional neural network and multi-modal medical image

Country Status (1)

Country Link
CN (1) CN113888470B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114511513B (en) * 2022-01-14 2024-06-28 清华大学 Brain aneurysm three-dimensional detection segmentation method based on deep convolutional neural network
CN115116594B (en) * 2022-06-06 2024-05-31 中国科学院自动化研究所 Method and device for detecting effectiveness of medical device
CN115018795B (en) * 2022-06-09 2023-04-07 北京医准智能科技有限公司 Method, device and equipment for matching focus in medical image and storage medium
CN116452851A (en) * 2023-03-17 2023-07-18 中山大学肿瘤防治中心(中山大学附属肿瘤医院、中山大学肿瘤研究所) Training method and device for disease classification model, terminal and readable storage medium
CN116705252B (en) * 2023-06-16 2024-05-31 脉得智能科技(无锡)有限公司 Construction method, image classification method, device and medium for prostate cancer diagnosis model

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109558498A (en) * 2018-11-07 2019-04-02 南京邮电大学 Multi-modal hash method based on deep learning
CN111444960A (en) * 2020-03-26 2020-07-24 上海交通大学 Skin disease image classification system based on multi-mode data input

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11361470B2 (en) * 2019-05-09 2022-06-14 Sri International Semantically-aware image-based visual localization
CN113158875B (en) * 2021-04-16 2022-07-01 重庆邮电大学 Image-text emotion analysis method and system based on multi-mode interaction fusion network


Similar Documents

Publication Publication Date Title
CN113888470B (en) Diagnosis method and device based on convolutional neural network and multi-modal medical image
KR102125127B1 (en) Method of brain disorder diagnosis via deep learning
US10902588B2 (en) Anatomical segmentation identifying modes and viewpoints with deep learning across modalities
KR102177568B1 (en) Method for semi supervised reinforcement learning using data with label and data without label together and apparatus using the same
US20230342918A1 (en) Image-driven brain atlas construction method, apparatus, device and storage medium
CN112037171B (en) Multi-mode feature fusion-based multi-task MRI brain tumor image segmentation method
US11471096B2 (en) Automatic computerized joint segmentation and inflammation quantification in MRI
CN115830017B (en) Tumor detection system, method, equipment and medium based on image-text multi-mode fusion
CN112741651B (en) Method and system for processing ultrasonic image of endoscope
CN115359066B (en) Focus detection method and device for endoscope, electronic device and storage medium
CN114511513B (en) Brain aneurysm three-dimensional detection segmentation method based on deep convolutional neural network
CN109147927A (en) A kind of man-machine interaction method, device, equipment and medium
CN112634255A (en) Method and device for establishing brain focus detection model and computer equipment
CN115937192B (en) Unsupervised retina blood vessel segmentation method and system and electronic equipment
CN116883725A (en) Three-dimensional medical image disease category classification method and device
CN115089112B (en) Post-stroke cognitive impairment risk assessment model building method and device and electronic equipment
Salih et al. A review of evaluation approaches for explainable AI with applications in cardiology
KR102616961B1 (en) Method of providing disease information by domain adaptation between heterogeneous capsule endoscopes
CN112686912B (en) Acute stroke lesion segmentation method based on gradual learning and mixed samples
CN114974560A (en) Intelligent diagnosis device, equipment and storage medium for benign and malignant tumors
CN112258564B (en) Method and device for generating fusion feature set
CN114723710A (en) Method and device for detecting ultrasonic video key frame based on neural network
CN110147830B (en) Method for training image data generation network, image data classification method and device
CN114399462B (en) Medical image anomaly detection method and device based on depth characterization network
US11998318B2 (en) System and method of using visually-descriptive words to diagnose ear pathology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant