CN117422668A - Pulmonary edema image recognition method, device, equipment and storage medium - Google Patents
Pulmonary edema image recognition method, device, equipment and storage medium Download PDFInfo
- Publication number
- CN117422668A CN117422668A CN202311191467.4A CN202311191467A CN117422668A CN 117422668 A CN117422668 A CN 117422668A CN 202311191467 A CN202311191467 A CN 202311191467A CN 117422668 A CN117422668 A CN 117422668A
- Authority
- CN
- China
- Prior art keywords
- image
- image recognition
- pulmonary edema
- feature vectors
- recognition method
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 206010037423 Pulmonary oedema Diseases 0.000 title claims abstract description 68
- 208000005333 pulmonary edema Diseases 0.000 title claims abstract description 68
- 238000000034 method Methods 0.000 title claims abstract description 62
- 238000013527 convolutional neural network Methods 0.000 claims abstract description 27
- 238000000605 extraction Methods 0.000 claims abstract description 22
- 238000013528 artificial neural network Methods 0.000 claims abstract description 17
- 238000003062 neural network model Methods 0.000 claims abstract description 8
- 239000013598 vector Substances 0.000 claims description 35
- 230000006870 function Effects 0.000 claims description 27
- 238000012549 training Methods 0.000 claims description 11
- 230000009466 transformation Effects 0.000 claims description 5
- 238000005259 measurement Methods 0.000 claims description 4
- 238000011478 gradient descent method Methods 0.000 claims description 3
- 210000002569 neuron Anatomy 0.000 description 9
- 238000004891 communication Methods 0.000 description 7
- 210000004072 lung Anatomy 0.000 description 7
- 238000004458 analytical method Methods 0.000 description 6
- 230000008569 process Effects 0.000 description 6
- 208000037170 Delayed Emergence from Anesthesia Diseases 0.000 description 5
- 230000001965 increasing effect Effects 0.000 description 5
- 230000004913 activation Effects 0.000 description 4
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 4
- 238000013135 deep learning Methods 0.000 description 4
- 238000003745 diagnosis Methods 0.000 description 4
- 229910052760 oxygen Inorganic materials 0.000 description 4
- 239000001301 oxygen Substances 0.000 description 4
- 238000011161 development Methods 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- 230000002708 enhancing effect Effects 0.000 description 3
- 208000008445 altitude sickness Diseases 0.000 description 2
- 238000004364 calculation method Methods 0.000 description 2
- 238000002405 diagnostic procedure Methods 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 230000008034 disappearance Effects 0.000 description 2
- 238000011176 pooling Methods 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 208000019693 Lung disease Diseases 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 230000005055 memory storage Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000002685 pulmonary effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30061—Lung
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Multimedia (AREA)
- Biomedical Technology (AREA)
- Databases & Information Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Quality & Reliability (AREA)
- Radiology & Medical Imaging (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Image Analysis (AREA)
Abstract
The application relates to the technical field of image recognition, and provides a pulmonary edema image recognition method, a device, equipment and a storage medium, wherein the pulmonary edema image recognition method comprises the following steps: inputting an image to be identified into one of the neural network models of a pretrained twin network model to extract characteristics of the image to be identified and obtain image characteristics, wherein the twin network model comprises two convolutional neural networks; and inputting the image features into a fully-connected neural network classifier, and outputting an image recognition result. The method solves the problems that the medical image sample size is small, the feature extraction capability is weak and the feature extraction is insufficient for effective classification due to the trained model, the robustness of the trained feature extraction network is stronger, the effective features are more, and the classifier is convenient to classify.
Description
Technical Field
The present invention relates to the field of image recognition technology, and in particular, to a pulmonary edema image recognition method, apparatus, device, and computer readable storage medium.
Background
In high altitude areas, altitude sickness, including altitude pulmonary edema, is easily caused by insufficient oxygen supply to the body due to rarefaction of oxygen. Traditional diagnostic methods for altitude pulmonary edema typically rely on manual observation and analysis of CT images of the lungs. However, this method has problems such as complicated operation and low accuracy. In recent years, with the development of deep learning technology, automatic analysis and diagnosis of lung images by using convolutional neural networks are becoming a new research direction.
However, there are some limitations to the automatic analysis and prediction of lung images using conventional convolutional neural networks, such as limitations in feature extraction, differences between different cases, and the like.
Disclosure of Invention
The invention mainly aims to provide a pulmonary edema image recognition method, a device, equipment and a computer readable storage medium, which aim to solve the technical problem that the existing pulmonary image recognition accuracy is low.
To achieve the object, the present invention provides a pulmonary edema image recognition method including the steps of:
inputting an image to be identified into one of the neural network models of a pretrained twin network model to extract characteristics of the image to be identified and obtain image characteristics, wherein the twin network model comprises two convolutional neural networks;
and inputting the image features into a fully-connected neural network classifier, and outputting an image recognition result.
In the pulmonary edema image recognition method provided by the application, the twin network model training step includes:
different data enhancement is adopted for the sample image, so that a plurality of enhancement sample images corresponding to the data enhancement are obtained;
carrying out feature extraction on each enhanced sample graph through a twin network to obtain feature vectors;
calculating the distance between the feature vectors by using a contrast loss function according to the feature vectors;
and based on the distance between the feature vectors, updating network parameters in the twin network by adopting a gradient descent method.
In the pulmonary edema image recognition method provided in the present application, the sample image includes a CT image, different data enhancement is adopted for the sample image, and an enhanced sample image with multiple corresponding data enhancement is obtained, including:
and performing one or more operations of random cutting, rotation, scaling, overturning, gray level transformation and histogram enhancement on the CT image to obtain a corresponding enhancement sample graph after the operation is finished.
In the pulmonary edema image recognition method provided in the present application, the twin network includes two identical convolutional neural networks, and the feature extraction is performed on each enhanced sample graph through the twin network to obtain feature vectors, including:
and respectively inputting the two enhanced sample images corresponding to one CT image into two convolutional neural networks to obtain two feature vectors, wherein the two convolutional neural networks share weights.
In the pulmonary edema image recognition method provided in the present application, the calculating, according to the feature vectors, a distance between the feature vectors using a contrast loss function includes:
measuring the distance between the two feature vectors, and calculating the distance between the two feature vectors by using a contrast feature loss function, wherein the contrast loss function formula is as follows:
;
wherein,is a binary variable of whether the class of the two samples is the same, < ->Is a weight parameter, ++>And->Is a feature vector of two sample images, +.>Is a distance measurement function, ++>Is a margin parameter.
In addition, in order to achieve the object, the present invention also provides a pulmonary edema image recognition apparatus including a processor, a memory, and a pulmonary edema image recognition program stored on the memory and executable by the processor, wherein the pulmonary edema image recognition program, when executed by the processor, implements the steps of the pulmonary edema image recognition method.
In addition, the invention also provides a pulmonary edema image recognition device, which comprises:
the image feature extraction device is used for inputting an image to be identified into one of the neural network models of the pretrained twin network model so as to extract the features of the image to be identified and obtain image features, wherein the twin network model comprises two convolutional neural networks;
and the result output device is used for inputting the image characteristics into the fully-connected neural network classifier and outputting an image recognition result.
In addition, to achieve the object, the present invention also provides a computer-readable storage medium having stored thereon a pulmonary edema image recognition program, wherein the pulmonary edema image recognition program, when executed by a processor, implements the steps of the pulmonary edema image recognition method as described.
The invention provides a pulmonary edema image identification method, which comprises the following steps: inputting an image to be identified into one of the neural network models of a pretrained twin network model to extract characteristics of the image to be identified and obtain image characteristics, wherein the twin network model comprises two convolutional neural networks; and inputting the image features into a fully-connected neural network classifier, and outputting an image recognition result. The method solves the problems that the medical image sample size is small, the feature extraction capability is weak and the feature extraction is insufficient for effective classification due to the trained model, the robustness of the trained feature extraction network is stronger, the effective features are more, and the classifier is convenient to classify. And taking out one convolutional neural network by using a trained twin network model, adding a classifier at last, and fixing the previous characteristic extraction network parameters. And optimizing the classifier by using a cross entropy loss function to realize the prediction of unknown CT images of the altitude pulmonary edema.
Drawings
Fig. 1 is a schematic hardware configuration diagram of a pulmonary edema image recognition apparatus according to an embodiment of the present invention;
FIG. 2 is a flow chart of a pulmonary edema image recognition method of the present invention;
FIG. 3 is a schematic diagram of a training process of the pulmonary edema image recognition model of the present invention;
fig. 4 is a flow chart of the pulmonary edema image recognition method of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
In high altitude areas, altitude sickness, including altitude pulmonary edema, is easily caused by insufficient oxygen supply to the body due to rarefaction of oxygen. Altitude pulmonary edema is a serious pulmonary disease with high morbidity and mortality. Traditional diagnostic methods for altitude pulmonary edema typically rely on manual observation and analysis of CT images of the lungs. However, this method has problems such as complicated operation and low accuracy. In recent years, with the development of deep learning technology, automatic analysis and diagnosis of lung images by using convolutional neural networks are becoming a new research direction.
However, there are some limitations to the automatic analysis and prediction of lung images using conventional convolutional neural networks, such as limitations in feature extraction, differences between different cases, and the like. Therefore, researchers have proposed a predictive diagnosis method for altitude pulmonary edema based on unsupervised contrast learning. The method adopts different data enhancement methods, constructs a sample pair, and constructs a two-branch feature learning network to learn image features. During training, a network with shared weights is used for extracting features of two sample graphs in a sample pair, and then distance measurement is carried out on the features of the two sample graphs in a contrast learning mode. By combining with the contrast feature distance function to optimize, the method can make the deep learning features of the same sample similar and the deep learning features of different samples have large difference, thereby optimizing the feature extraction capability and providing better features for the classifier to classify. Compared with the traditional convolutional neural network method, the method has better diagnosis precision.
The pulmonary edema image recognition method related to the embodiment of the invention is mainly applied to pulmonary edema image recognition equipment, and the pulmonary edema image recognition equipment can be PC, portable computer, mobile terminal and other equipment with display and processing functions.
Referring to fig. 1, fig. 1 is a schematic hardware configuration of a pulmonary edema image recognition apparatus according to an embodiment of the present invention. In an embodiment of the present invention, the pulmonary edema image recognition apparatus may include a processor 1001 (e.g., CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. Wherein the communication bus 1002 is used to enable connected communications between these components; the user interface 1003 may include a Display screen (Display), an input unit such as a Keyboard (Keyboard); the network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface); the memory 1005 may be a high-speed RAM memory or a stable memory (non-volatile memory), such as a disk memory, and the memory 1005 may alternatively be a storage device independent of the processor 1001.
Those skilled in the art will appreciate that the hardware configuration shown in fig. 1 does not constitute a limitation of the pulmonary edema image recognition apparatus, and may include more or fewer components than illustrated, or may combine certain components, or a different arrangement of components.
With continued reference to fig. 1, the memory 1005 of fig. 1, which is a computer readable storage medium, may include an operating system, a network communication module, and a pulmonary edema image recognition program.
In fig. 1, the network communication module is mainly used for connecting with a server and performing data communication with the server; the processor 1001 may invoke the pulmonary edema image recognition procedure stored in the memory 1005, and execute the pulmonary edema image recognition method according to the embodiment of the present invention.
The embodiment of the invention provides a pulmonary edema image identification method.
Referring to fig. 2-4, fig. 2 is a flowchart illustrating a first embodiment of a pulmonary edema image recognition method according to the present invention.
In this embodiment, the pulmonary edema image recognition method includes the steps of:
inputting an image to be identified into one of the neural network models of a pretrained twin network model to extract characteristics of the image to be identified and obtain image characteristics, wherein the twin network model comprises two convolutional neural networks;
and inputting the image features into a fully-connected neural network classifier, and outputting an image recognition result.
The convolutional neural network of the invention uses the convolutional neural network ResNet50, and with the continuous development of CNN, the number of layers of convolution is increased in order to obtain deep features. Initially the Le Net network has only 5 layers, then AlexNet is 8 layers, and later the vgnet network contains 19 layers and GoogleNet has 22 layers. However, a method of enhancing the learning ability of the network by increasing the number of network layers is not always feasible, because after the number of network layers reaches a certain depth, the number of network layers is increased, so that the problem of random gradient disappearance occurs in the network, and the accuracy of the network is reduced. To solve this problem, the conventional method is to use a data initialization and regularization method, which solves the problem of gradient disappearance, but the problem of network accuracy is not improved. The occurrence of the residual error network can solve the gradient problem, the expressed characteristics are better due to the increase of the network layer number, the corresponding detection or classification performance is stronger, and the residual error uses 1X 1 convolution, so that the parameter number can be reduced, and the calculated amount can be reduced to a certain extent.
The key point of the ResNet network is that a residual unit in the structure of the ResNet network comprises a cross-layer connection, input can be directly transferred in a cross-layer manner, equal mapping is carried out, and then the input is added with the result after convolution operation. Assuming that the input image is x, the output is H (x), the output after convolution in the middle is a nonlinear function of F (x), and the final output is H (x) =f (x) +x, such output can still perform nonlinear transformation, residual refers to "difference", that is, F (x), and the network is converted to find a residual function F (x) =h (x) -x, so that the residual function is easier to optimize than F (x) =h (x).
The Resnet50 network includes 49 convolutional layers, a fully-connected layer. The Resnet50 network structure may be divided into seven parts, the first part not containing residual blocks, mainly performing convolution, regularization, activation function, max pooling calculations on the inputs. The second, third, fourth and fifth part structures all contain residual blocks, and the sizes of the residual blocks are not changed and only used for changing the dimensions of the residual blocks. In the network structure of Resnet50, the residual block has three layers of convolution, and the total of the network has 1+3× (3+4+6+3) =49 convolution layers, and the total of the last full connection layer is 50 layers, which is also the origin of the name of Resnet 50. The input of the network is 224×224×3, the output is 7×7×2048 after the convolution calculation of the first five parts, the pooling layer converts the input into a feature vector, and finally the classifier calculates the feature vector and outputs the class probability.
The fully connected neural network (Fully Connected Neural Network, FCNN for short) is a most basic artificial neural network structure, also called a multi-layer perceptron (Multilayer Perceptron, MLP). In a fully-connected neural network, each neuron is connected to all neurons of the previous and subsequent layers to form a dense connection structure. The fully-connected neural network can learn complex characteristics of input data and perform tasks such as classification, regression and the like.
Structure of fully connected neural network:
input layer: the input layer is responsible for receiving the raw data and passing it on to the next layer. The number of neurons of the input layer depends on the dimension of the input data.
Hidden layer: the hidden layer is an intermediate layer in the fully connected neural network, and a plurality of hidden layers can be provided. The number of neurons in the hidden layer can be freely set, and each neuron is connected with all neurons in the previous layer and the next layer.
Output layer: the output layer is the last layer of the fully-connected neural network and is responsible for outputting the prediction result of the network. The number of neurons in the output layer depends on the type of task, e.g., a two-class task typically has one neuron in the output layer and multiple neurons in the output layer.
Activation function: the activation function is used to introduce non-linear factors that enable the neural network to fit complex non-linear relationships. Common activation functions are ReLU, sigmoid, tanh, etc.
In some embodiments, the twin network model training step comprises:
different data enhancement is adopted for the sample image, so that a plurality of enhancement sample images corresponding to the data enhancement are obtained;
carrying out feature extraction on each enhanced sample graph through a twin network to obtain feature vectors;
calculating the distance between the feature vectors by using a contrast loss function according to the feature vectors;
and based on the distance between the feature vectors, updating network parameters in the twin network by adopting a gradient descent method.
In some embodiments, the sample image includes a CT image, the enhancing the sample image with different data enhancement to obtain a plurality of enhanced sample images with corresponding data enhancement, including:
and performing one or more operations of random cutting, rotation, scaling, overturning, gray level transformation and histogram enhancement on the CT image to obtain a corresponding enhancement sample graph after the operation is finished.
And a plurality of reasonable data enhancement methods are designed, so that the scale of a training set can be expanded, the occurrence of over fitting is reduced, and the generalization capability of the model is improved.
Based on clinical features and sample differences, a variety of reasonable data enhancement methods are designed:
sample characteristics and clinical characteristics including characteristics of resolution, noise, brightness and the like of CT images are analyzed, and corresponding data enhancement methods are designed according to differences of different image devices.
And carrying out transformation operations such as random cutting, rotation, scaling, overturning and the like on the image, and enhancing the diversity of the data samples.
A plurality of sample pairs are constructed for each sample by using a plurality of data enhancement methods, so that training data can be enriched, and the diversity of training samples is increased, thereby improving the generalization capability of the model.
In some embodiments, the twin network includes two identical convolutional neural networks, and the feature extraction is performed on each enhanced sample graph through the twin network to obtain a feature vector, which includes:
and respectively inputting the two enhanced sample images corresponding to one CT image into two convolutional neural networks to obtain two feature vectors, wherein the two convolutional neural networks share weights.
The two-branch feature learning network is constructed, and a mode of unsupervised comparison learning is adopted, so that more robust and effective feature representation can be extracted, and the discrimination capability and generalization capability of the model are improved.
By learning the feature representation in a contrast learning mode, more distinguishable feature representations can be learned, and the robustness and generalization capability of the model are improved.
Two-branch feature learning network is constructed for feature extraction, feature extraction capacity is improved, and sample information utilization rate is improved through an unsupervised comparison learning mode:
and (3) constructing a characteristic extraction network based on a convolutional neural network, and respectively extracting the characteristics of the plateau pulmonary edema and the normal lung.
A contrast Loss function (contrast Loss) is used as a training target, so that the data-enhanced samples are still close in feature space.
According to the invention, different sample pairs are randomly selected for training, so that the diversity of training data is increased.
In some embodiments, the calculating the distance between the feature vectors using a contrast loss function from the feature vectors comprises:
measuring the distance between the two feature vectors, and calculating the distance between the two feature vectors by using a contrast feature loss function, wherein the contrast loss function formula is as follows:
;
wherein,is a binary variable of whether the class of the two samples is the same, < ->Is a weight parameter, ++>And->Is a feature vector of two sample images, +.>Is a distance measurement function, ++>Is a margin parameter.
A pulmonary edema image recognition apparatus, the pulmonary edema image recognition apparatus comprising:
an image feature extraction module for inputting an image to be identified into one of the neural network models of a pretrained twin network model to extract features of the image to be identified and obtain image features, wherein the twin network model includes two convolutional neural networks;
and a result output module for inputting the image features into a fully-connected neural network classifier and outputting an image recognition result.
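At inference time only one branch of the pretrained twin network is used, followed by the fully-connected classifier. A minimal sketch, again with a hypothetical linear encoder in place of the CNN and a two-class softmax head; the class count, dimensions, and parameter names are illustrative, not from the patent:

```python
import numpy as np

rng = np.random.default_rng(1)

def extract_features(image, encoder_W):
    # One branch of the pretrained twin network (hypothetical linear encoder).
    return np.maximum(encoder_W.T @ image.ravel(), 0.0)

def classify(features, fc_W, fc_b):
    # Fully-connected classifier head: logits -> softmax over two
    # illustrative classes (e.g. normal vs. pulmonary edema).
    logits = fc_W @ features + fc_b
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    return exp / exp.sum()

encoder_W = rng.standard_normal((16, 4))
fc_W, fc_b = rng.standard_normal((2, 4)), np.zeros(2)

image = rng.standard_normal((4, 4))      # toy stand-in for a CT patch
probs = classify(extract_features(image, encoder_W), fc_W, fc_b)
print(probs)
```

The output is a probability distribution over the classes; the image recognition result is simply the class with the highest probability.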
Each module in the pulmonary edema image recognition device corresponds to each step in the embodiment of the pulmonary edema image recognition method, and the functions and implementation processes of each module are not described in detail herein.
In addition, the embodiment of the invention also provides a computer readable storage medium.
The computer readable storage medium of the present invention stores thereon a pulmonary edema image recognition program, wherein the pulmonary edema image recognition program, when executed by a processor, implements the steps of the pulmonary edema image recognition method as described.
The method implemented when the pulmonary edema image recognition program is executed may refer to the embodiments of the pulmonary edema image recognition method of the present invention and will not be described herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element preceded by the phrase "comprising a … …" does not exclude the presence of other like elements in the process, method, article, or system that comprises the element.
The embodiment numbers of the present invention are merely for description and do not represent advantages or disadvantages of the embodiments.
The subject application is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
From the above description of the embodiments, it will be clear to those skilled in the art that the method of the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or by means of hardware, although in many cases the former is the preferred implementation. Based on such understanding, the technical solution of the present invention, or the part thereof contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to perform the method according to the embodiments of the present invention.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the invention; any equivalent structural or process transformation made using the contents of this description, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of protection of the present invention.
Claims (8)
1. A pulmonary edema image recognition method, characterized in that the pulmonary edema image recognition method comprises the steps of:
inputting an image to be identified into one of the neural network models of a pretrained twin network model to extract characteristics of the image to be identified and obtain image characteristics, wherein the twin network model comprises two convolutional neural networks;
and inputting the image features into a fully-connected neural network classifier, and outputting an image recognition result.
2. The pulmonary edema image recognition method of claim 1, wherein the twin network model training step includes:
applying different data enhancements to the sample image to obtain a plurality of corresponding enhanced sample images;
performing feature extraction on each enhanced sample image through the twin network to obtain feature vectors;
calculating the distance between the feature vectors using a contrastive loss function according to the feature vectors;
and updating network parameters in the twin network by a gradient descent method based on the distance between the feature vectors.
3. The pulmonary edema image recognition method of claim 2, wherein the sample image includes a CT image, and wherein the applying different data enhancements to the sample image to obtain a plurality of corresponding enhanced sample images comprises:
performing one or more of random cropping, rotation, scaling, flipping, gray-level transformation, and histogram enhancement on the CT image to obtain the corresponding enhanced sample images.
4. The pulmonary edema image recognition method of claim 3, wherein the twin network includes two identical convolutional neural networks, and the performing feature extraction on each enhanced sample image through the twin network to obtain feature vectors comprises:
inputting the two enhanced sample images corresponding to one CT image into the two convolutional neural networks respectively to obtain two feature vectors, wherein the two convolutional neural networks share weights.
5. The pulmonary edema image recognition method of claim 4, wherein the calculating the distance between the feature vectors using a contrastive loss function according to the feature vectors comprises:
measuring the distance between the two feature vectors, and calculating the distance between the two feature vectors using the contrastive loss function, wherein the contrastive loss function formula is as follows:

L = y · w · D(f1, f2)^2 + (1 − y) · max(m − D(f1, f2), 0)^2

wherein y is a binary variable indicating whether the classes of the two samples are the same, w is a weight parameter, f1 and f2 are the feature vectors of the two sample images, D is a distance measurement function, and m is a margin parameter.
6. A pulmonary edema image recognition apparatus, characterized by comprising:
an image feature extraction module for inputting an image to be identified into one of the neural network models of a pretrained twin network model to extract features of the image to be identified and obtain image features, wherein the twin network model includes two convolutional neural networks;
and a result output module for inputting the image features into a fully-connected neural network classifier and outputting an image recognition result.
7. A pulmonary edema image recognition apparatus, characterized in that the pulmonary edema image recognition apparatus includes a processor, a memory, and a pulmonary edema image recognition program stored on the memory and executable by the processor, wherein the pulmonary edema image recognition program, when executed by the processor, implements the steps of the pulmonary edema image recognition method of any one of claims 1 to 5.
8. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a pulmonary edema image recognition program, wherein the pulmonary edema image recognition program, when executed by a processor, implements the steps of the pulmonary edema image recognition method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311191467.4A CN117422668A (en) | 2023-09-15 | 2023-09-15 | Pulmonary edema image recognition method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311191467.4A CN117422668A (en) | 2023-09-15 | 2023-09-15 | Pulmonary edema image recognition method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117422668A true CN117422668A (en) | 2024-01-19 |
Family
ID=89527349
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311191467.4A Pending CN117422668A (en) | 2023-09-15 | 2023-09-15 | Pulmonary edema image recognition method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117422668A (en) |
- 2023-09-15: CN application CN202311191467.4A filed (publication CN117422668A, status: active, Pending)
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Lemhadri et al. | Lassonet: Neural networks with feature sparsity | |
Qin et al. | Imaging and fusing time series for wearable sensor-based human activity recognition | |
WO2016192612A1 (en) | Method for analysing medical treatment data based on deep learning, and intelligent analyser thereof | |
CN111582225B (en) | Remote sensing image scene classification method and device | |
CN111860588A (en) | Training method for graph neural network and related equipment | |
CN109559300A (en) | Image processing method, electronic equipment and computer readable storage medium | |
EP4322056A1 (en) | Model training method and apparatus | |
WO2022179587A1 (en) | Feature extraction method and apparatus | |
KR102338913B1 (en) | Deep learning based image segmentation method including biodegradable stent in intravascular optical tomography image | |
KR102460257B1 (en) | Method or apparatus for providing diagnostic results | |
CN113344045B (en) | Method for improving SAR ship classification precision by combining HOG characteristics | |
Chen et al. | Multi-SVM based Dempster–Shafer theory for gesture intention understanding using sparse coding feature | |
WO2021190433A1 (en) | Method and device for updating object recognition model | |
CN109754357B (en) | Image processing method, processing device and processing equipment | |
CN114091554A (en) | Training set processing method and device | |
CN115311730A (en) | Face key point detection method and system and electronic equipment | |
CN116129141A (en) | Medical data processing method, apparatus, device, medium and computer program product | |
CN113850796A (en) | Lung disease identification method and device based on CT data, medium and electronic equipment | |
CN114140841A (en) | Point cloud data processing method, neural network training method and related equipment | |
CN116189800B (en) | Pattern recognition method, device, equipment and storage medium based on gas detection | |
CN111652349A (en) | Neural network processing method and related equipment | |
CN116246110A (en) | Image classification method based on improved capsule network | |
CN117422668A (en) | Pulmonary edema image recognition method, device, equipment and storage medium | |
CN116188435A (en) | Medical image depth segmentation method based on fuzzy logic | |
WO2022052647A1 (en) | Data processing method, neural network training method, and related device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||