CN115131280A - Improved YOLO v4 lung nodule detection method fused with attention mechanism

Info

Publication number: CN115131280A
Application number: CN202210321194.XA
Authority: CN
Inventors: 吴丹慧; 李铁强; 李霞; 路彤
Original assignee: China Jiliang University
Current assignee: China Jiliang University
Priority date: 2022-03-29
Filing date: 2022-03-29
Publication date: 2022-09-30

Abstract

The invention relates to an improved YOLO v4 lung nodule detection method fused with an attention mechanism, which improves both the speed and the precision of lung nodule detection. YOLO v4 is adopted as the base lung nodule detection network, and a hybrid dilated convolution module is added to the backbone network to enlarge the receptive field without losing resolution, which improves localization accuracy. An attention mechanism module is added at the end of the neck network; it increases the weights of useful features so that the network pays more attention to target regions containing important information, while suppressing the weights of ineffective features and irrelevant information, thereby improving the overall accuracy of target detection. Finally, the model is trained on the data set, the best trained weight file is loaded into the model of the invention, and feature extraction is performed to generate lung nodule prediction boxes and their prediction confidences.

Description

Improved YOLO v4 lung nodule detection method fused with attention mechanism

Technical Field

The invention relates to the technical field of medical image processing, and in particular to an improved YOLO v4 lung nodule detection method fused with an attention mechanism.

Background

Physical examination has become an effective means of screening for various diseases, and lung cancer is screened with low-dose CT (Computed Tomography) of the lungs. Lung cancer is one of the diseases with the highest mortality in the world. It is difficult to detect at an early stage, and patients who seek diagnosis only after symptoms appear are often already at a middle or late stage. Major hospitals and medical centers currently recommend low-dose lung CT examination as an effective means of screening and prevention. A single low-dose lung CT scan generally contains two-dimensional transverse lung-window images, soft-tissue-window images, and coronal or sagittal images obtained by later three-dimensional reconstruction, for a total of one to two hundred images. With so many images, the rate of missed diagnosis is high if radiologists search for micro-nodules by eye alone. With the rapid development of artificial intelligence in recent years, convolutional neural networks have been applied in many fields such as speech recognition, natural language processing, and object detection, and have achieved particularly remarkable results in image processing.

In clinical medicine, early lung cancer presents as lung nodules, and a CAD (Computer-Aided Diagnosis) system is a common way of detecting lung nodules when screening medical images of the lung. When a CAD system is used for lung nodule detection, the data are first segmented and preprocessed to extract regions of suspected malignant lung nodules; the candidates are then classified and false-positive nodules are removed; finally, the true-positive nodules are retained, so that an excessive proportion of false positives does not distort the detection result. The main difficulties in lung nodule research are low detection accuracy and a high false-positive rate. Traditional detection methods can no longer meet the demand for accurate lung cancer detection in China as science and technology continue to develop, so a method that effectively improves detection accuracy and eliminates false-positive nodules would be of great benefit.

Disclosure of Invention

1. An improved YOLO v4 lung nodule detection method fused with an attention mechanism, characterized by comprising the following specific steps:

step 1: collecting images to establish a data set;

step 2: preprocessing the lung CT images;

step 3: selecting and improving a network structure for target detection, wherein hybrid dilated convolution is added to the YOLO v4 network and an attention mechanism is fused, so that the receptive field is enlarged without losing resolution, target regions containing important information receive more attention, irrelevant information is suppressed, and localization accuracy is improved;

step 4: training the improved lung nodule detection model with the selected data set to detect lung nodules.

2. The improved YOLO v4 lung nodule detection method based on an attention mechanism as claimed in claim 1, wherein the specific steps of step 1 are as follows:

the LUNA16 dataset is used, which comprises 888 low-dose lung CT images in mhd format, each image containing a series of axial slices of the thorax.

3. The improved YOLO v4 lung nodule detection method based on an attention mechanism as claimed in claim 1, wherein the specific steps of step 2 are as follows:

the values obtained from CT acquisition are X-ray attenuation values in Hounsfield units (HU). In lung CT images, lung HU values are typically around -500; regions with HU values within [-1000, +400] are retained, and all others are considered irrelevant to lung disease detection and are discarded.

4. The improved YOLO v4 lung nodule detection method based on an attention mechanism as claimed in claim 1, wherein the specific steps of step 3 are as follows:

step 3.1: the backbone network uses CSP (Cross Stage Partial) connections, and a hybrid dilated convolution module is added to the backbone network, so that the receptive field is enlarged without losing resolution, localization accuracy is improved, the receptive range of the backbone features is enlarged, and the extraction of context information is facilitated;

step 3.2: because lung nodule targets are small, the invention integrates an attention mechanism module (CBAM), which increases the weights of useful features so that more attention is paid to target regions containing important information, while suppressing the weights of ineffective features and irrelevant information, thereby improving the overall accuracy of target detection.

5. The improved YOLO v4 lung nodule detection method based on an attention mechanism as claimed in claim 1, wherein the specific steps of step 4 are as follows:

the improved lung nodule detection model is trained with the selected data set, the best trained weight file is loaded into the model of the invention for feature extraction, and a series of candidate regions is generated; the candidate boxes are then labeled according to the positional relationship between the candidate regions and the ground-truth boxes of the objects in the image, and lung nodule prediction boxes and their prediction confidences are generated.

The beneficial effects of the invention are as follows: the invention can effectively improve the accuracy and speed of lung nodule detection, provide a convenient and fast way of reading images in clinical practice, and reduce misdiagnosis caused by human factors in the daily work of radiologists.

Drawings

Fig. 1 is a flow chart of the improved YOLO v4 lung nodule detection based on an attention mechanism.

Fig. 2 is a schematic diagram of the improved YOLO v4 lung nodule detection network structure based on an attention mechanism.

Fig. 3 is a schematic diagram of the hybrid dilated convolution module.

Fig. 4 is a schematic diagram of the attention mechanism module (CBAM).

Fig. 5 is an example lung CT image.

Fig. 6 shows the lung nodule detection result for the example lung CT image.

Detailed Description of the Preferred Embodiments

The present invention will be described in further detail with reference to the accompanying drawings.

The invention provides an improved YOLO v4 lung nodule detection method fused with an attention mechanism. The specific detection flow is shown in Fig. 1.

Step 1: Collect raw 3D CT images to build a data set.

The invention uses the LUNA16 dataset, which includes 888 low-dose lung CT images in mhd format, each image containing a series of axial slices of the thorax.

Step 2: Preprocess the lung CT images.

The data set is converted into 2D detection data in VOC format. The values obtained from CT acquisition are X-ray attenuation values in Hounsfield units (HU). In lung CT images, lung HU values are typically around -500. Regions with HU values within [-1000, +400] are retained; all others are considered irrelevant to lung disease detection and are discarded.
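The HU windowing described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes that "discarding" out-of-range values means clipping them to the window limits, and the rescaling to [0, 1] and the function name `window_hu` are likewise assumptions made for the example.

```python
HU_MIN, HU_MAX = -1000.0, 400.0  # window limits stated in the text


def window_hu(slice_hu):
    """Clip a 2D slice of HU values to [HU_MIN, HU_MAX] and rescale to [0, 1].

    Clipping and rescaling are assumptions of this sketch; the text only
    states that values outside the window are discarded.
    """
    span = HU_MAX - HU_MIN
    return [
        [(min(max(value, HU_MIN), HU_MAX) - HU_MIN) / span for value in row]
        for row in slice_hu
    ]


# A toy 2x3 "slice": air, lung tissue, soft tissue, and out-of-range values.
example_slice = [[-2000, -1000, -500], [0, 400, 3000]]
windowed = window_hu(example_slice)
```

Values below -1000 map to 0 and values above +400 map to 1, so out-of-range structures no longer dominate the intensity range.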

The criterion for a nodule in the LUNA16 dataset is that at least three of the four radiologists identified it as a nodule with a radius greater than 3 mm. Accordingly, in the data set annotations, non-nodules, nodules with a radius of less than 3 mm, and nodules judged to be larger than 3 mm by only one or two radiologists are considered irrelevant and are discarded.

Step 3: Select a network structure for target detection.

The invention selects YOLO v4 as the base target detection model, adds hybrid dilated convolution, and fuses an attention mechanism, forming the improved YOLO v4 network structure based on an attention mechanism shown in Fig. 2.

The YOLO v4 network architecture consists of a backbone (CSPDarknet53), a neck, and a head.
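The annotation rule above can be expressed as a simple filter. The sketch below is illustrative only: the `Annotation` record and its field names are hypothetical and do not correspond to the actual LUNA16 file layout.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical annotation record; field names are assumptions."""
    is_nodule: bool
    radius_mm: float
    votes: int  # radiologists (out of four) who marked it as a nodule above the size limit


def keep_annotation(ann, min_radius_mm=3.0, min_votes=3):
    """Keep only nodules above the size limit that enough radiologists agreed on."""
    return ann.is_nodule and ann.radius_mm > min_radius_mm and ann.votes >= min_votes


candidates = [
    Annotation(is_nodule=True, radius_mm=4.2, votes=4),   # kept
    Annotation(is_nodule=True, radius_mm=4.0, votes=2),   # too few radiologists
    Annotation(is_nodule=True, radius_mm=2.1, votes=4),   # too small
    Annotation(is_nodule=False, radius_mm=5.0, votes=3),  # non-nodule
]
kept = [ann for ann in candidates if keep_annotation(ann)]
```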

The specific improvement steps are as follows:

3.1 The backbone network uses CSP connections, and the invention adds a hybrid dilated convolution module to the backbone network, as shown in Fig. 3. A dilated convolution differs from an ordinary convolution in that zeros are inserted into the convolution kernel. The number of inserted zeros depends on the dilation rate, and different dilation rates give different receptive fields, so that feature information at more scales is obtained.
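To make the zero-filling concrete, the following sketch expands a one-dimensional kernel for a given dilation rate. It is a toy illustration under the assumption that a rate of r inserts r - 1 zeros between neighbouring kernel weights; the function name `dilate_kernel_1d` is hypothetical.

```python
def dilate_kernel_1d(kernel, rate):
    """Insert (rate - 1) zeros between neighbouring weights of a 1D kernel."""
    if rate < 1:
        raise ValueError("dilation rate must be at least 1")
    dilated = []
    for index, weight in enumerate(kernel):
        if index > 0:
            dilated.extend([0.0] * (rate - 1))  # the "holes" between weights
        dilated.append(weight)
    return dilated


base_kernel = [1.0, 2.0, 3.0]
rate_2 = dilate_kernel_1d(base_kernel, 2)
rate_3 = dilate_kernel_1d(base_kernel, 3)
```

The number of learnable weights stays at three in both cases; only the span covered by the kernel grows.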

The kernel size of a dilated convolution is calculated as:

$$x_n = x_k + (x_k - 1) \times (D_r - 1)$$

where $x_n$ is the kernel size of the dilated convolution, $x_k$ is the original kernel size, and $D_r$ is the dilation rate.
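The kernel size formula above can be checked numerically with a short helper. The dilation rates 1, 2, and 5 used here are illustrative values only; the text does not state which rates the module uses.

```python
def dilated_kernel_size(kernel_size, dilation_rate):
    """Effective kernel size: x_n = x_k + (x_k - 1) * (D_r - 1)."""
    if kernel_size < 1 or dilation_rate < 1:
        raise ValueError("kernel size and dilation rate must be at least 1")
    return kernel_size + (kernel_size - 1) * (dilation_rate - 1)


# Effective sizes of a 3x3 kernel for a few illustrative dilation rates.
sizes = {rate: dilated_kernel_size(3, rate) for rate in (1, 2, 5)}
```

A rate of 1 reproduces the ordinary 3x3 kernel, while larger rates widen the area covered without adding weights.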

The receptive field of a dilated convolution is calculated as:

$$y_m = y_{m-1} + (x_m - 1) \times \prod_{i=1}^{m-1} s_i$$

where $y_m$ is the receptive field of each point in the $m$-th layer, $y_{m-1}$ is the receptive field of each point in the $(m-1)$-th layer, $x_m$ is the kernel size of the $m$-th convolution layer, and $s_i$ is the stride of the $i$-th convolution layer.
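The layer-by-layer growth of the receptive field can be sketched as below. The sketch assumes the usual recurrence in which each layer adds (kernel size - 1) times the product of the strides of all earlier layers, starting from a receptive field of 1 at the input; the layer configurations are illustrative and not taken from the patent.

```python
def receptive_fields(kernel_sizes, strides):
    """Return the receptive field after each layer of a convolution stack.

    kernel_sizes holds the effective (dilated) kernel size of each layer,
    strides holds the stride of each layer.
    """
    field = 1          # receptive field of a single input pixel
    stride_product = 1  # product of strides of all earlier layers
    fields = []
    for kernel_size, stride in zip(kernel_sizes, strides):
        field += (kernel_size - 1) * stride_product
        fields.append(field)
        stride_product *= stride
    return fields


# Three stride-1 layers with effective kernel sizes 3, 5, 11
# (a 3x3 kernel at illustrative dilation rates 1, 2, 5).
hybrid = receptive_fields([3, 5, 11], [1, 1, 1])
# Three ordinary 3x3 stride-1 layers for comparison.
plain = receptive_fields([3, 3, 3], [1, 1, 1])
```

With these illustrative settings the dilated stack reaches a receptive field of 17 after three layers, compared with 7 for the ordinary stack, while the stride stays at 1 and the resolution is unchanged.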

The hybrid dilated convolution module is added to enlarge the receptive field without losing resolution and to improve localization accuracy. The neck network uses an SPP module to enlarge the receptive range of the backbone features, which helps extract context information.

3.2 Because lung nodule targets are small, the invention incorporates an attention mechanism module (CBAM), as shown in Fig. 4.

The attention mechanism module consists of two parts: a Channel Attention Mechanism (CAM) and a Spatial Attention Mechanism (SAM).

The Channel Attention Mechanism (CAM) uses adaptive average pooling and adaptive max pooling to compress the feature map along its spatial dimensions, yielding one value per channel. Adaptive average pooling gives feedback to every pixel of the feature map, whereas during gradient back-propagation adaptive max pooling passes gradient only to the position with the largest response in the feature map. The pooled results pass through two successive fully connected layers, followed by a Sigmoid activation function.

Weight of the channel attention mechanism:

$$M_c(F) = \sigma\left(W_1\left(W_0\left(F^c_{avg}\right)\right) + W_1\left(W_0\left(F^c_{max}\right)\right)\right)$$

where $F^c_{avg}$ and $F^c_{max}$ are the features after global average pooling and max pooling over each channel, respectively, $W_1$ and $W_0$ are the weight parameters of the two neural network layers, and $\sigma$ is the sigmoid activation function.
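A minimal sketch of the channel attention weights follows, written with plain Python lists so that it runs without external libraries. The ReLU between the two layers follows the common CBAM design and is an assumption here, since the text only mentions two fully connected layers and a sigmoid; the toy weights and the function names are also hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def channel_attention(feature, w0, w1):
    """Return one weight in (0, 1) per channel.

    feature: list of C channels, each an H x W nested list.
    w0, w1:  weights of the two shared fully connected layers.
    """
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature]
    mx = [max(map(max, ch)) for ch in feature]

    def shared_mlp(vector):
        hidden = [max(0.0, h) for h in matvec(w0, vector)]  # ReLU (assumed)
        return matvec(w1, hidden)

    return [sigmoid(a + m) for a, m in zip(shared_mlp(avg), shared_mlp(mx))]


# Toy input: channel 0 carries signal, channel 1 is empty.
feature = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
w0 = [[1.0, 0.0]]      # 2 channels -> 1 hidden unit
w1 = [[1.0], [-1.0]]   # 1 hidden unit -> 2 channels
weights = channel_attention(feature, w0, w1)
```

With these toy weights the informative channel receives a weight close to 1 and the empty channel a weight close to 0, which is the re-weighting behaviour the text describes.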

The Spatial Attention Mechanism (SAM) first applies max pooling and average pooling to the input features to obtain two spatial maps and concatenates the two pooled results. It then convolves the concatenated result with a convolution kernel, ensuring that the resulting feature matches the input feature map in its spatial dimensions, and finally normalizes it with a Sigmoid activation function.

Weight of the spatial attention mechanism:

$$M_s(F) = \sigma\left(f^{1 \times 1}\left(\left[F^s_{avg}; F^s_{max}\right]\right)\right)$$

where $F^s_{avg}$ and $F^s_{max}$ are the spatial features after global average pooling and max pooling, respectively, $f^{1 \times 1}$ is a learned $1 \times 1$ convolution applied to the two concatenated features, and $\sigma$ is the sigmoid activation function.
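The spatial attention map can be sketched in the same style. The sketch assumes that the two pooled maps are obtained by averaging and taking the maximum over the channels at each pixel, and that the 1 x 1 convolution over the two concatenated maps reduces to a weighted sum with two weights and a bias; the weight values and function name are hypothetical.

```python
import math


def spatial_attention(feature, w_avg, w_max, bias=0.0):
    """Return an H x W map of weights in (0, 1).

    feature: list of C channels, each an H x W nested list.
    w_avg, w_max, bias: parameters of the 1x1 convolution over the
    concatenated [average; max] maps (two input channels, one output).
    """
    channels = len(feature)
    height, width = len(feature[0]), len(feature[0][0])
    attention = []
    for i in range(height):
        row = []
        for j in range(width):
            values = [feature[c][i][j] for c in range(channels)]
            avg_pooled = sum(values) / channels
            max_pooled = max(values)
            logit = w_avg * avg_pooled + w_max * max_pooled + bias
            row.append(1.0 / (1.0 + math.exp(-logit)))
        attention.append(row)
    return attention


# Toy input: 2 channels, 1 x 2 pixels; only the second pixel has a response.
feature = [[[0.0, 2.0]], [[0.0, 4.0]]]
attention = spatial_attention(feature, w_avg=1.0, w_max=1.0)
```

The output has the same height and width as the input, so it can be multiplied element-wise with every channel of the feature map.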

The attention mechanism module increases the weights of useful features so that more attention is paid to target regions containing important information, while suppressing the weights of ineffective features and irrelevant information, thereby improving the overall precision of target detection.

Step 4: Train on the data and save the weight file.

The improved YOLO v4 lung nodule detection model fused with the attention mechanism is trained on the data set. To obtain ideal weight parameters, DIoU is selected as the loss function:

$$L_{DIoU} = 1 - \mathrm{IoU}(A, B) + \frac{\rho^2(b, b^{gt})}{c^2}$$

where $A$ is the prediction box, $B$ is the label box, $b$ and $b^{gt}$ are the center points of the anchor box and the target box, respectively, $\rho$ is the Euclidean distance between the two center points, and $c$ is the diagonal length of the smallest rectangle that covers both the anchor box and the target box.
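A DIoU loss for a single pair of boxes can be sketched as follows, assuming the standard form 1 - IoU + (squared center distance) / (squared diagonal of the smallest enclosing rectangle). The corner format (x1, y1, x2, y2) and the function name are assumptions of this example.

```python
def diou_loss(box_a, box_b):
    """DIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection over union.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared Euclidean distance between the two box centers.
    center_dist_sq = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
        + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest rectangle enclosing both boxes.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    diagonal_sq = enclose_w ** 2 + enclose_h ** 2

    penalty = center_dist_sq / diagonal_sq if diagonal_sq > 0 else 0.0
    return 1.0 - iou + penalty


perfect = diou_loss((0, 0, 2, 2), (0, 0, 2, 2))
shifted = diou_loss((0, 0, 2, 2), (1, 1, 3, 3))
disjoint = diou_loss((0, 0, 1, 1), (2, 2, 3, 3))
```

Unlike a plain IoU loss, the center-distance term still gives a useful signal for boxes that do not overlap at all, which matters for small targets such as nodules.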

With the total loss function set in step 4, the initial learning rate of the network is set to 0.001 and the batch size to 8. The network is trained on the PyTorch 1.4.0 platform with the Adam optimizer until the value of the total loss function converges within a set error range, at which point training stops. Once the loss function has converged close to 0, the best trained weight file is loaded into the model of the invention, and feature extraction is performed to generate a series of candidate regions.

Step 5: Output the result.

The weights are loaded for prediction, and the candidate boxes are labeled according to the positional relationship between the candidate regions and the ground-truth boxes of the objects in the image, generating lung nodule prediction boxes and their prediction confidences.
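One common way to label candidate boxes by their positional relationship to ground-truth boxes is an IoU threshold. The sketch below illustrates that idea only; the text does not state the rule or the threshold, so the IoU criterion, the value 0.5, and the function names are all assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_candidates(candidates, ground_truth, positive_iou=0.5):
    """Mark a candidate as positive if it overlaps any ground-truth box enough.

    positive_iou is an assumed threshold, not a value from the text.
    """
    return [
        any(iou(candidate, truth) >= positive_iou for truth in ground_truth)
        for candidate in candidates
    ]


ground_truth = [(10, 10, 20, 20)]
candidates = [(11, 11, 21, 21), (40, 40, 50, 50), (10, 10, 20, 20)]
labels = label_candidates(candidates, ground_truth)
```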

Fig. 5 is an example input lung CT image, and Fig. 6 shows the lung nodule detection result for this example image from the test set. The result shows that the improved YOLO v4 lung nodule detection method based on an attention mechanism provided by the invention can detect lung nodules accurately and quickly in real clinical CT cases, saving radiologists a large amount of image-reading time and improving work efficiency.

Claims

step 1: collecting images to establish a data set;

step 2: preprocessing a lung CT image;

step 3: selecting and improving a network structure for target detection, wherein hybrid dilated convolution is added to the YOLO v4 network and an attention mechanism is fused, so that the receptive field is enlarged without losing resolution, target regions containing important information receive more attention, irrelevant information is suppressed, and localization accuracy is improved;

2. The improved YOLO v4 lung nodule detection method based on an attention mechanism as claimed in claim 1, wherein the specific steps of step 1 are as follows:

4. The improved YOLO v4 lung nodule detection method based on an attention mechanism as claimed in claim 1, wherein the specific steps of step 3 are as follows:

step 3.2: because lung nodule targets are small, the invention integrates an attention mechanism module (CBAM), which increases the weights of useful features so that more attention is paid to target regions containing important information, while suppressing the weights of ineffective features and irrelevant information, thereby improving the overall precision of target detection.