CN114266735A - Method for detecting pathological change abnormality of chest X-ray image - Google Patents

Method for detecting pathological change abnormality of chest X-ray image

Info

Publication number
CN114266735A
Authority
CN
China
Prior art keywords
extraction module
context information
lesion
sequence
inputting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111484958.9A
Other languages
Chinese (zh)
Inventor
巫义锐
孔其然
袁驰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hohai University HHU filed Critical Hohai University HHU
Priority to CN202111484958.9A
Publication of CN114266735A
Legal status: Pending

Abstract

The invention discloses a method for detecting lesion abnormalities in chest X-ray images, which comprises: inputting an image to be detected into a feature extraction module to obtain a first feature map; inputting the first feature map into a context information extraction module to obtain a second feature map rich in context information; unfolding the second feature map into a one-dimensional sequence, mapping the one-dimensional sequence into an embedded sequence of a set dimension, and adding position encoding information to the embedded sequence; and inputting the position-encoded sequence into a transformer network model, which outputs target boxes and categories of lesions, completing lesion abnormality detection for the X-ray image. The method effectively copes with the complexity and diversity of chest X-ray lesions, has high detection accuracy, and effectively completes chest X-ray lesion detection.

Description

Method for detecting pathological change abnormality of chest X-ray image
Technical Field
The invention relates to a method for detecting lesion abnormalities in chest X-ray images, and belongs to the technical field of image processing.
Background
In recent years, X-ray images have played a very important role in medical diagnosis. To make rapid and accurate automatic diagnosis possible, a great deal of research has been devoted to developing intelligent computer-aided detection systems that help doctors diagnose lesions in chest X-ray films. With the outbreak of COVID-19, demand for X-ray diagnosis of chest diseases has greatly increased, and effective computer-aided tools are urgently needed to reduce the burden on doctors. Identifying X-ray images is very difficult because organ structures overlap along the projection direction, compounded by the diversity of chest X-ray diseases; a diagnosis often requires a highly experienced physician. Existing X-ray image detection methods struggle to cope with complex scenes and have poor accuracy.
Disclosure of Invention
The invention aims to provide a chest X-ray image lesion abnormality detection method to solve the problems that existing X-ray image detection methods struggle to cope with complex scenes and have poor accuracy.
In order to achieve the purpose, the invention adopts the following technical scheme:
the invention provides a method for detecting lesion abnormalities in chest X-ray images, implemented based on an abnormality detection model, wherein the abnormality detection model comprises a feature extraction module, a context information extraction module, a position encoder and a transformer network model which are sequentially connected, and the method comprises the following steps:
inputting an image to be detected into a feature extraction module to obtain a first feature map;
inputting the first feature map into a context information extraction module to obtain a second feature map rich in context information;
expanding the second feature map into a one-dimensional sequence, mapping the one-dimensional sequence into an embedded sequence of a set dimension, and adding position encoding information to the embedded sequence using the position encoder;
and inputting the position-encoded sequence into the transformer network model, and outputting target boxes and categories of lesions.
Further, the feature extraction module employs a ResNet network.
Further, inputting the first feature map into the context information extraction module to obtain a second feature map rich in context information includes:
inputting the first feature map into the context information extraction module, and adding the module's output to the feature map before input to obtain a new feature map;
after pooling downsampling the new feature map, using it as the input of the context information extraction module;
and repeating the above steps multiple times until the feature map output by the context information extraction module fuses multiple layers of information.
Further, the context information extraction module comprises, connected in sequence, 2 standard convolutional layers, a plurality of bottleneck structures containing skip connections, and 1 standard convolutional layer; each bottleneck structure comprises, connected in sequence, 1 standard convolutional layer, 1 dilated convolutional layer, and 1 standard convolutional layer.
Further, the convolution kernel sizes of the 3 standard convolutional layers in the context information extraction module are 1x1, 3x3 and 3x3 respectively, with 128, 128 and 512 channels; the convolution kernel sizes of the 2 standard convolutional layers in each bottleneck structure are both 1x1, with 128 and 128 channels, and the dilated convolutional layer has a 3x3 kernel, a dilation rate of 2 and 256 channels.
Further, the adding position coding information in the embedded sequence includes:
position coding information is added using sine and cosine functions of different frequencies.
Further, the transformer network model comprises a transformer encoder and a transformer decoder applying a multi-head attention mechanism, and a multi-layer feedforward neural network.
Further, the anomaly detection model is obtained by training through the following method:
acquiring a chest lesion data set of X-ray images, wherein the data set is divided into multiple lesion category sub-datasets; each sub-dataset contains multiple sample images of the same lesion category, and each sample image contains the coordinates of the lesion and is labeled with the lesion category;
inputting each sample image of each sub-dataset into the abnormality detection model and outputting a prediction result;
and optimally matching the prediction result with the ground truth using the Hungarian algorithm to obtain a loss function, back-propagating according to the loss function, performing gradient descent, and training to obtain the abnormality detection model.
Further, the loss function is:

$$L(y,\hat{y}) = \sum_{i=1}^{N}\left[-\log \hat{p}_{\hat{\sigma}(i)}(c_i) + \mathbb{1}_{\{c_i \neq \varnothing\}}\, L_{box}(b_i, \hat{b}_{\hat{\sigma}(i)})\right]$$

wherein the loss function includes a classification loss function and a localization regression loss function, and the classification loss function adopts cross-entropy loss:

$$L_{cls} = -\sum_{i=1}^{N} \log \hat{p}_{\hat{\sigma}(i)}(c_i)$$

the localization regression loss function includes an IoU loss and a regression loss, expressed as:

$$L_{box}(b_i, \hat{b}_{\hat{\sigma}(i)}) = \lambda_{iou} L_{iou}(b_i, \hat{b}_{\hat{\sigma}(i)}) + \lambda_{reg} L_{reg}(b_i, \hat{b}_{\hat{\sigma}(i)})$$

wherein $L_{reg}$ is the smooth L1 function, which is of the form:

$$L_{reg}(x) = \begin{cases} 0.5x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}$$

$L_{iou}$ is a GIoU function of the form:

$$L_{iou}(A,B) = 1 - \left(\frac{|A \cap B|}{|A \cup B|} - \frac{|C \setminus (A \cup B)|}{|C|}\right)$$

wherein A and B denote the rectangles participating in the calculation, C denotes the smallest rectangular box containing both A and B, and |·| denotes the area of a rectangular box.
Compared with the prior art, the invention has the following beneficial effects:
The chest X-ray image lesion abnormality detection method of the invention uses a transformer structure fused with context information as the feature extractor; it effectively copes with the complexity and diversity of chest X-ray lesions, has high detection accuracy, distinguishes lesion regions of different categories, and marks lesion regions accurately.
Drawings
FIG. 1 is a network structure diagram of a method for detecting abnormal lesion in chest X-ray images according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a context information extraction module;
FIG. 3 is a chest X-ray image to be examined;
FIG. 4 is a graph showing the results of detection.
Detailed Description
The invention is further described with reference to specific examples. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
The embodiment of the invention provides a method for detecting lesion abnormalities in chest X-ray images, implemented based on the abnormality detection model shown in FIG. 1. The abnormality detection model comprises a feature extraction module, a context information extraction module, a position encoder and a transformer network model which are sequentially connected.
With reference to FIG. 1, the chest X-ray image lesion abnormality detection method specifically includes the following steps:
step 1, inputting an image to be detected into a feature extraction module to obtain a first feature map;
Using ResNet as the feature extraction module, the input image is converted into a feature map through convolution, pooling and skip connections.
Step 2, inputting the first feature map into a context information extraction module to obtain a second feature map rich in context information;
As shown in FIG. 2, the context information extraction module (DCE module) comprises, connected in sequence, a 1x1 standard convolutional layer, a 3x3 standard convolutional layer, a plurality of bottleneck structures with skip connections, and a 3x3 standard convolutional layer, where the 3 standard convolutional layers have 128, 128 and 512 channels respectively.
Each bottleneck structure with a skip connection comprises, connected in sequence, a 1x1 standard convolutional layer, a 3x3 dilated convolutional layer and a 1x1 standard convolutional layer. The 2 standard convolutional layers have 128 and 128 channels respectively, and the dilated convolutional layer has a dilation rate of 2 and 256 channels.
The feature map input to the DCE module first has its dimension reduced by the 1x1 and 3x3 standard convolutions, then passes through the bottleneck structures with skip connections, and finally has its channel dimension raised by the last standard convolution.
The context information is extracted in an iterative fusion manner.
In an embodiment, performing context feature extraction on the first feature map using the context information extraction module specifically includes:
Step 21, feeding the feature map into the DCE module and adding the result to the original feature map, where the feature map output by step 1 is denoted $F_1$;
Step 22, halving the size of the feature map using a pooling operation, where the pooling layer size is 2x2;
Step 23, repeating steps 21 and 22 several times until the feature map fuses multiple layers of information.
As shown in FIG. 2, the l-th layer feature $F_l$ passes through the DCE module and one pooling downsampling operation to obtain the smaller (l+1)-th layer feature $F_{l+1}$, formulated as:

$$\tilde{F}_l = F_l + f_{DCE}(F_l)$$
$$F_{l+1} = f_{down}(\tilde{F}_l)$$

where $F_l$ denotes the l-th layer feature map, $f_{DCE}$ denotes the context information extraction module, and $f_{down}$ denotes downsampling; here a pooling operation is used as the downsampling operation.
In this embodiment there are 4 layers of feature maps, finally yielding the feature map $F_4$.
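The iterative fusion of steps 21-23 can be sketched in NumPy. The `dce_stub` below is a hypothetical placeholder for the DCE module (a learned convolutional block in the invention), and a single-channel feature map is used for brevity:

```python
import numpy as np

def dce_stub(feature):
    """Hypothetical stand-in for the DCE module: any map that
    preserves the spatial shape of the feature map."""
    return np.tanh(feature)

def avg_pool_2x2(feature):
    """2x2 average-pooling downsampling, halving each spatial side."""
    h, w = feature.shape
    return feature.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def iterative_fusion(f1, num_layers=4):
    """Repeat residual DCE addition then 2x2 pooling (steps 21-23)."""
    feats = [f1]
    f = f1
    for _ in range(num_layers - 1):
        f = f + dce_stub(f)      # step 21: F + f_DCE(F), skip connection
        f = avg_pool_2x2(f)      # step 22: halve the spatial size
        feats.append(f)
    return feats

# F1..F4, spatial sizes 32 -> 16 -> 8 -> 4
feats = iterative_fusion(np.random.rand(32, 32))
```

Each iteration fuses context information at one scale before shrinking the map, matching the 4-layer feature hierarchy of this embodiment.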
Step 3, unfolding the second feature map into a one-dimensional sequence, mapping the one-dimensional sequence into an embedded sequence of a set dimension, and adding position encoding information to the embedded sequence using the position encoder;
The one-dimensional sequence is mapped into a $d_{model}$-dimensional embedded sequence, and position encoding information is then added using sine and cosine functions of different frequencies:

$$PE_{(pos,2i)} = \sin\left(pos / 10000^{2i/d_{model}}\right)$$
$$PE_{(pos,2i+1)} = \cos\left(pos / 10000^{2i/d_{model}}\right)$$

where $pos$ denotes the position, $i$ denotes the dimension, and $d_{model}$ denotes the total dimension of the embedded sequence; the dimension of the position-encoded sequence is still $d_{model}$.
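The sinusoidal position encoding can be computed directly; a minimal NumPy sketch (the sequence length of 100 is an illustrative choice):

```python
import numpy as np

def sinusoidal_position_encoding(seq_len, d_model):
    """PE(pos, 2i)   = sin(pos / 10000^(2i/d_model))
       PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))"""
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]      # even dimension indices
    angle = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle)                # even dims: sine
    pe[:, 1::2] = np.cos(angle)                # odd dims: cosine
    return pe

pe = sinusoidal_position_encoding(seq_len=100, d_model=256)
# adding pe to the embedded sequence keeps its dimension at d_model
```

Because the encoding is added element-wise, the sequence dimension is unchanged, as stated above.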
Step 4, inputting the position-encoded sequence into the transformer network model, and outputting target boxes and categories of lesions.
The transformer network model comprises a transformer encoder and a transformer decoder applying a multi-head attention mechanism, and a multi-layer feedforward neural network.
The multi-head self-attention mechanism may be expressed as:

$$\mathrm{MultiHead}(Q,K,V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)W^O$$

where Concat denotes concatenation of the feature tensors and $\mathrm{head}_i$ denotes the i-th single attention head:

$$\mathrm{head}_i = \mathrm{Attention}(XW_i^Q, XW_i^K, XW_i^V)$$

where $W_i^Q \in \mathbb{R}^{d_{model} \times N_q}$, $W_i^K, W_i^V \in \mathbb{R}^{d_{model} \times N_{kv}}$, and $W^O \in \mathbb{R}^{h N_{kv} \times d_{model}}$; $h$ denotes the number of heads of multi-head attention, and the dimension of $X$ is $d_{model}$.
The multi-head attention consists of $h$ single-head attention mechanisms, where the single-head self-attention mechanism is expressed as:

$$\mathrm{Attention}(Q,K,V) = \mathrm{softmax}\left(\frac{QK^T}{\sqrt{N_{kv}}}\right)V$$

where $Q$, $K$ and $V$ are obtained from the input sequence $X$ by a series of matrix multiplications and represent the query, key and value respectively. The dimension of $Q$ is $N_q$, and the dimensions of $K$ and $V$ are $N_{kv}$. The Softmax function is used to compute the attention weight $A$ from the query and key:

$$A_{ij} = \frac{\exp\left(Q_i K_j^T / \sqrt{N_{kv}}\right)}{\sum_{j} \exp\left(Q_i K_j^T / \sqrt{N_{kv}}\right)}$$

where $i$ is the index of the query and $j$ is the index of the key, and $N_{kv}$ denotes the dimension of the key $K$. The final result is the sum of the values weighted by the attention weights, i.e. the i-th row of the output of the single-head attention mechanism is:

$$\mathrm{Attention}(Q,K,V)_i = \sum_{j} A_{ij} V_j$$
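A minimal NumPy sketch of the single-head scaled dot-product attention described above; the random inputs are illustrative (in the model, Q, K and V come from learned projections of X):

```python
import numpy as np

def softmax(x, axis=-1):
    # subtract the row max for numerical stability
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def single_head_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    A = softmax(Q @ K.T / np.sqrt(d_k))   # attention weights, rows sum to 1
    return A @ V, A                       # row i of output = sum_j A_ij V_j

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 5, 32))     # 5 tokens, per-head dim N_kv = 32
out, A = single_head_attention(Q, K, V)
```

Multi-head attention runs h such heads in parallel on different learned projections and concatenates their outputs.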
The output sequence is obtained through several transformer encoders and transformer decoders.
Then the output sequence passes through a two-layer feedforward neural network whose output is $N_{obj} \times 4$ detection box coordinates and the corresponding lesion categories, finally completing detection, where $N_{obj}$ denotes the number of detection boxes and 4 is the dimension of the rectangular detection box coordinates. ReLU is used as the activation function in between, which may be expressed as:

$$\mathrm{FFN}(x) = \max(0, xW_1 + b_1)W_2 + b_2$$

where $W_1$ and $W_2$ are parameter matrices and $b_1$ and $b_2$ are biases; the dimension of the input $x$ is $d_{model}$. In the invention, $d_{model}$ is set to 256, the number of attention heads $h$ is set to 8, and $N_q$ and $N_{kv}$ are set to $d_{model}/h = 32$.
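The two-layer feedforward network with ReLU can be sketched as follows; the hidden dimension of 1024 and the random weights are illustrative assumptions, not values from the invention:

```python
import numpy as np

def ffn(x, W1, b1, W2, b2):
    """FFN(x) = max(0, x W1 + b1) W2 + b2, with ReLU in between."""
    return np.maximum(0.0, x @ W1 + b1) @ W2 + b2

d_model, d_hidden = 256, 1024            # d_hidden is an illustrative choice
rng = np.random.default_rng(0)
W1 = rng.normal(size=(d_model, d_hidden)) * 0.02
b1 = np.zeros(d_hidden)
W2 = rng.normal(size=(d_hidden, d_model)) * 0.02
b2 = np.zeros(d_model)

x = rng.normal(size=(10, d_model))       # 10 sequence positions
y = ffn(x, W1, b1, W2, b2)               # output keeps shape (10, d_model)
```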
In the embodiment of the present invention, the abnormality detection model also needs to be trained in advance, which specifically includes:
Step a, acquiring a chest lesion data set of X-ray images, wherein the data set is divided into multiple lesion category sub-datasets; each sub-dataset contains multiple sample images of the same lesion category, and each sample image contains the coordinates of the lesion and is labeled with the lesion category;
Step b, inputting each sample image of each sub-dataset into the abnormality detection model and outputting a prediction result;
Step c, optimally matching the prediction result with the ground truth using the Hungarian algorithm to obtain a loss function, back-propagating according to the loss function, performing gradient descent, and training to obtain the model.
The model outputs $N_{obj}$ target boxes as the prediction result ($N_{obj}$ is abbreviated $N$ below), denoted by the prediction set $\hat{y} = \{\hat{y}_i\}_{i=1}^{N}$.

The best match $\hat{\sigma}$ found by the Hungarian algorithm is used for training:

$$\hat{\sigma} = \arg\min_{\sigma \in \Omega_N} \sum_{i=1}^{N} L_{match}(y_i, \hat{y}_{\sigma(i)})$$

where $\Omega_N$ denotes all permutations of the ground-truth set $y$, and $L_{match}$ is a function measuring the difference between a prediction and a ground truth. Each prediction $\hat{y}_i = (\hat{b}_i, \hat{p}_i)$ comprises the coordinates and confidence of a target box; $\hat{p}_{\sigma(i)}(c_i)$ denotes the probability that predicted target box $\sigma(i)$ belongs to category $c_i$; and $\hat{\sigma}$ denotes the optimal permutation, found by the Hungarian algorithm, that minimises the overall matching loss. The matching loss is:

$$L_{match}(y_i, \hat{y}_{\sigma(i)}) = -\mathbb{1}_{\{c_i \neq \varnothing\}}\, \hat{p}_{\sigma(i)}(c_i) + \mathbb{1}_{\{c_i \neq \varnothing\}}\, L_{box}(b_i, \hat{b}_{\sigma(i)})$$
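For small N, the optimal assignment found by the Hungarian algorithm can be reproduced by exhaustive search over all permutations in Omega_N. The sketch below uses a simplified L1 box cost as the matching loss (the invention's matching loss also includes a classification term); the boxes are illustrative:

```python
from itertools import permutations

def match_cost(truth, pred):
    """Illustrative pairwise matching cost: L1 distance between
    box coordinates (x1, y1, x2, y2)."""
    return sum(abs(t - p) for t, p in zip(truth, pred))

def best_assignment(truths, preds):
    """Exhaustive search over all permutations for the assignment
    minimising the total matching loss. The Hungarian algorithm
    finds the same optimum in O(N^3) instead of O(N!)."""
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(len(preds))):
        cost = sum(match_cost(truths[i], preds[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_perm, best_cost

truths = [(0, 0, 10, 10), (50, 50, 60, 60)]
preds  = [(49, 51, 61, 60), (1, 0, 10, 9)]
perm, cost = best_assignment(truths, preds)
# truth 0 pairs with prediction 1, truth 1 with prediction 0: perm == (1, 0)
```

In practice `scipy.optimize.linear_sum_assignment` implements the Hungarian optimum efficiently.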
In training, the loss function includes a classification loss and a localization loss.

The classification loss adopts cross-entropy loss:

$$L_{cls} = -\sum_{i=1}^{N} \log \hat{p}_{\hat{\sigma}(i)}(c_i)$$

The localization regression loss function includes an IoU loss and a regression loss, expressed as:

$$L_{box}(b_i, \hat{b}_{\hat{\sigma}(i)}) = \lambda_{iou} L_{iou}(b_i, \hat{b}_{\hat{\sigma}(i)}) + \lambda_{reg} L_{reg}(b_i, \hat{b}_{\hat{\sigma}(i)})$$

where $L_{reg}$ is the smooth L1 function, which is of the form:

$$L_{reg}(x) = \begin{cases} 0.5x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}$$

$L_{iou}$ is a GIoU function of the form:

$$L_{iou}(A,B) = 1 - \left(\frac{|A \cap B|}{|A \cup B|} - \frac{|C \setminus (A \cup B)|}{|C|}\right)$$

where A and B denote the rectangular boxes participating in the calculation, C denotes the smallest rectangular box containing both A and B, and |·| denotes the area of a rectangular box.
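The GIoU and smooth L1 terms can be sketched in plain Python; boxes are assumed to be (x1, y1, x2, y2) tuples, and the loss uses 1 minus the GIoU value:

```python
def box_area(box):
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def intersection(a, b):
    # overlap rectangle; area is 0 when the boxes are disjoint
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return box_area((x1, y1, x2, y2))

def enclosing(a, b):
    # C: smallest rectangle containing both A and B
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))

def giou(a, b):
    """GIoU = IoU - |C \\ (A u B)| / |C|; ranges in (-1, 1]."""
    inter = intersection(a, b)
    union = box_area(a) + box_area(b) - inter
    c = box_area(enclosing(a, b))
    return inter / union - (c - union) / c

def smooth_l1(x):
    """0.5 x^2 if |x| < 1, else |x| - 0.5."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

# L_iou is then 1 - giou(a, b); identical boxes give loss 0
```

Unlike plain IoU, the GIoU term still provides a gradient signal when the predicted and ground-truth boxes do not overlap, via the enclosing box C.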
Training uses an Adam optimizer with $\beta_1 = 0.9$, $\beta_2 = 0.98$ and $\epsilon = 10^{-9}$. During training, the learning rate is continuously varied according to the following formula:

$$lrate = d_{model}^{-0.5} \cdot \min\left(step\_num^{-0.5},\; step\_num \cdot warmup\_steps^{-1.5}\right)$$

where $step\_num$ denotes the number of training steps taken and $warmup\_steps$ denotes the number of warmup steps, here set to 4000.
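The warmup learning-rate schedule can be sketched in Python; the formula form (the standard transformer schedule) is an assumption consistent with the stated $d_{model} = 256$ and warmup step count of 4000:

```python
def learning_rate(step_num, d_model=256, warmup_steps=4000):
    """lrate = d_model^-0.5 * min(step_num^-0.5,
                                  step_num * warmup_steps^-1.5)"""
    return d_model ** -0.5 * min(step_num ** -0.5,
                                 step_num * warmup_steps ** -1.5)

# rises linearly during warmup, peaks at step 4000, then decays as 1/sqrt(step)
lr_early = learning_rate(1000)
lr_peak  = learning_rate(4000)
lr_late  = learning_rate(16000)
```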
As shown in FIG. 3, a chest X-ray image to be detected is input into the abnormality detection model, and the model outputs the lesion regions and categories of the image. As shown in FIG. 4, the method successfully detects lung consolidation, effusion and fibrosis in the X-ray image.
Through the above embodiment, the chest X-ray image lesion abnormality detection method adopts a transformer structure fused with context information as the feature extractor; it effectively copes with the complexity and diversity of chest X-ray lesions, has high detection accuracy, distinguishes lesion regions of different categories, and marks lesion regions accurately.
The present invention has been disclosed in terms of preferred embodiments, but is not limited thereto; all technical solutions obtained by equivalent substitution or transformation fall within the protection scope of the present invention.

Claims (9)

1. A chest X-ray image lesion abnormality detection method, characterized in that: the method is implemented based on an abnormality detection model, wherein the abnormality detection model comprises a feature extraction module, a context information extraction module, a position encoder and a transformer network model which are sequentially connected, and the method comprises the following steps:
inputting an image to be detected into a feature extraction module to obtain a first feature map;
inputting the first feature map into a context information extraction module to obtain a second feature map rich in context information;
expanding the second feature map into a one-dimensional sequence, mapping the one-dimensional sequence into an embedded sequence of a set dimension, and adding position encoding information to the embedded sequence using the position encoder;
and inputting the position-encoded sequence into the transformer network model, and outputting target boxes and categories of lesions.
2. The method of claim 1, wherein the feature extraction module employs a ResNet network.
3. The method of claim 1, wherein inputting the first feature map into the context information extraction module to obtain a second feature map rich in context information comprises:
inputting the first feature map into the context information extraction module, and adding the module's output to the feature map before input to obtain a new feature map;
after pooling downsampling the new feature map, using it as the input of the context information extraction module;
and repeating the above steps multiple times until the feature map output by the context information extraction module fuses multiple layers of information.
4. The method of claim 1, wherein the context information extraction module comprises, connected in sequence, 2 standard convolutional layers, a plurality of bottleneck structures containing skip connections, and 1 standard convolutional layer; each bottleneck structure comprises, connected in sequence, 1 standard convolutional layer, 1 dilated convolutional layer, and 1 standard convolutional layer.
5. The method of claim 1, wherein the convolution kernel sizes of the 3 standard convolutional layers in the context information extraction module are 1x1, 3x3 and 3x3 respectively, with 128, 128 and 512 channels; the convolution kernel sizes of the 2 standard convolutional layers in each bottleneck structure are both 1x1, with 128 and 128 channels, and the dilated convolutional layer has a 3x3 kernel, a dilation rate of 2 and 256 channels.
6. The method of claim 1, wherein adding position-coding information to the embedded sequence comprises:
position coding information is added using sine and cosine functions of different frequencies.
7. The method of claim 1, wherein the transformer network model comprises a transformer encoder and a transformer decoder applying a multi-head attention mechanism, and a multi-layer feedforward neural network.
8. The method of claim 1, wherein the anomaly detection model is trained by:
acquiring a chest lesion data set of X-ray images, wherein the data set is divided into multiple lesion category sub-datasets; each sub-dataset contains multiple sample images of the same lesion category, and each sample image contains the coordinates of the lesion and is labeled with the lesion category;
inputting each sample image of each sub-dataset into the abnormality detection model and outputting a prediction result;
and optimally matching the prediction result with the ground truth using the Hungarian algorithm to obtain a loss function, back-propagating according to the loss function, performing gradient descent, and training to obtain the abnormality detection model.
9. The method of claim 8, wherein the loss function is:
$$L(y,\hat{y}) = \sum_{i=1}^{N}\left[-\log \hat{p}_{\hat{\sigma}(i)}(c_i) + \mathbb{1}_{\{c_i \neq \varnothing\}}\, L_{box}(b_i, \hat{b}_{\hat{\sigma}(i)})\right]$$

wherein the loss function includes a classification loss function and a localization regression loss function, and the classification loss function adopts cross-entropy loss:

$$L_{cls} = -\sum_{i=1}^{N} \log \hat{p}_{\hat{\sigma}(i)}(c_i)$$

the localization regression loss function includes an IoU loss and a regression loss, expressed as:

$$L_{box}(b_i, \hat{b}_{\hat{\sigma}(i)}) = \lambda_{iou} L_{iou}(b_i, \hat{b}_{\hat{\sigma}(i)}) + \lambda_{reg} L_{reg}(b_i, \hat{b}_{\hat{\sigma}(i)})$$

wherein $L_{reg}$ is the smooth L1 function, which is of the form:

$$L_{reg}(x) = \begin{cases} 0.5x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}$$

$L_{iou}$ is a GIoU function of the form:

$$L_{iou}(A,B) = 1 - \left(\frac{|A \cap B|}{|A \cup B|} - \frac{|C \setminus (A \cup B)|}{|C|}\right)$$

wherein A and B denote the rectangles participating in the calculation, C denotes the smallest rectangular box containing both A and B, and |·| denotes the area of a rectangular box.
CN202111484958.9A 2021-12-07 2021-12-07 Method for detecting pathological change abnormality of chest X-ray image Pending CN114266735A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111484958.9A CN114266735A (en) 2021-12-07 2021-12-07 Method for detecting pathological change abnormality of chest X-ray image


Publications (1)

Publication Number Publication Date
CN114266735A true CN114266735A (en) 2022-04-01

Family

ID=80826725

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111484958.9A Pending CN114266735A (en) 2021-12-07 2021-12-07 Method for detecting pathological change abnormality of chest X-ray image

Country Status (1)

Country Link
CN (1) CN114266735A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116965843A (en) * 2023-09-19 2023-10-31 南方医科大学南方医院 Mammary gland stereotactic system
CN117522877A (en) * 2024-01-08 2024-02-06 吉林大学 Method for constructing chest multi-disease diagnosis model based on visual self-attention

Citations (3)

Publication number Priority date Publication date Assignee Title
CN111626379A (en) * 2020-07-07 2020-09-04 中国计量大学 X-ray image detection method for pneumonia
CN113065402A (en) * 2021-03-05 2021-07-02 四川翼飞视科技有限公司 Face detection method based on deformed attention mechanism
CN113469962A (en) * 2021-06-24 2021-10-01 江苏大学 Feature extraction and image-text fusion method and system for cancer lesion detection



Similar Documents

Publication Publication Date Title
Mansilla et al. Learning deformable registration of medical images with anatomical constraints
Majid et al. Classification of stomach infections: A paradigm of convolutional neural network along with classical features fusion and selection
CN107766894B (en) Remote sensing image natural language generation method based on attention mechanism and deep learning
Yadav et al. Lung-GANs: unsupervised representation learning for lung disease classification using chest CT and X-ray images
CN114266735A (en) Method for detecting pathological change abnormality of chest X-ray image
CN111275118B (en) Chest film multi-label classification method based on self-correction type label generation network
CN115223678A (en) X-ray chest radiography diagnosis report generation method based on multi-task multi-mode deep learning
CN114170232B (en) Transformer-based X-ray chest radiography automatic diagnosis and new crown infection area distinguishing method
Sun et al. Context matters: Graph-based self-supervised representation learning for medical images
CN113312973A (en) Method and system for extracting features of gesture recognition key points
CN117058448A (en) Pulmonary CT image classification system based on domain knowledge and parallel separable convolution Swin transducer
Attallah Deep learning-based CAD system for COVID-19 diagnosis via spectral-temporal images
CN116129426A (en) Fine granularity classification method for cervical cell smear 18 category
Khademi et al. Spatio-temporal hybrid fusion of cae and swin transformers for lung cancer malignancy prediction
prasad Koyyada et al. An explainable artificial intelligence model for identifying local indicators and detecting lung disease from chest X-ray images
CN114202002A (en) Pulmonary nodule detection device based on improved FasterRCNN algorithm
CN113313699A (en) X-ray chest disease classification and positioning method based on weak supervised learning and electronic equipment
Xu et al. Identification of benign and malignant lung nodules in CT images based on ensemble learning method
Zheng et al. MA-Net: Mutex attention network for COVID-19 diagnosis on CT images
Elhanashi et al. Classification and Localization of Multi-type Abnormalities on Chest X-rays Images
JP2004188201A (en) Method to automatically construct two-dimensional statistical form model for lung area
Chaisangmongkon et al. External validation of deep learning algorithms for cardiothoracic ratio measurement
Kim et al. Severity quantification and lesion localization of covid-19 on CXR using vision transformer
CN115239740A (en) GT-UNet-based full-center segmentation algorithm
CN113989576A (en) Medical image classification method combining wavelet transformation and tensor network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination