CN111986177A - Chest rib fracture detection method based on attention convolution neural network


Info

Publication number
CN111986177A
CN111986177A (application CN202010845981.5A)
Authority
CN
China
Prior art keywords
convolution
attention
point
chest
fracture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010845981.5A
Other languages
Chinese (zh)
Other versions
CN111986177B (en)
Inventor
张雄
彭司春
上官宏
侯婷
郝雅文
王安红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taiyuan University of Science and Technology
Original Assignee
Taiyuan University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taiyuan University of Science and Technology
Priority to CN202010845981.5A
Publication of CN111986177A
Application granted
Publication of CN111986177B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/0002 - Inspection of images, e.g. flaw detection
    • G06T7/0012 - Biomedical image inspection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/60 - Analysis of geometric attributes
    • G06T7/66 - Analysis of geometric attributes of image moments or centre of gravity
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10072 - Tomographic images
    • G06T2207/10081 - Computed x-ray tomography [CT]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20084 - Artificial neural networks [ANN]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30004 - Biomedical image processing
    • G06T2207/30008 - Bone
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 - Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Geometry (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the field of CT image target detection. A chest rib fracture detection method based on an attention convolutional neural network comprises the following specific steps. First, a chest rib fracture data set is obtained. Second, the data set is used for training: 1) the data set is sent to a preprocessing module for preprocessing; 2) a feature extraction network extracts primary features; 3) a multi-scale Inception module extracts multi-scale features and recombines features of different scales; 4) a cascade corner pooling prediction module predicts keypoints and outputs the corresponding heatmaps, connection vectors and offsets; 5) the keypoint heatmaps, connection vectors and offsets are constrained by an overall loss function. Third, the chest rib fracture data set is tested: targets are classified and located according to the predicted parameters of the top-left, bottom-right and center keypoints, with high classification accuracy and high localization precision.

Description

Chest rib fracture detection method based on attention convolution neural network
Technical Field
The invention belongs to the technical field of CT image target detection, and particularly relates to a chest rib fracture detection method based on an attention convolution neural network.
Background
Rib fracture is a pathological condition in which the broken end of a rib is displaced inward or outward, or crushed, under direct or indirect external violence (such as front-and-back compression of the chest). It is a serious chest injury that frequently occurs in daily life (for example in physical exercise, falls from height, criminal cases and traffic accidents).
Rib fracture is the main cause of chest pain and hydropneumothorax after trauma and brings severe pain to patients. In addition, fracture forms are complex and varied, which makes fracture diagnosis difficult. To find an optimal treatment plan for a patient in time, a doctor needs an accurate pathological judgment. A rib fracture lesion can, to a certain extent, induce lesions of adjacent structures such as the lung, chest wall and mediastinum, so rapid and accurate diagnosis of rib fracture also has a positive effect on treating diseases of other parts. Furthermore, rib fracture diagnosis is important evidence in judicial identification, insurance claims and the like. For these reasons, accurate diagnosis of the positions and number of rib fractures is very important for judging the degree of disability, the fracture type and the injury grade, improving the level of medical diagnosis and treatment, and avoiding medical disputes.
The most widely used clinical diagnostic basis for chest trauma is Computed Tomography (CT). CT images taken by commercial or clinical CT devices now have high definition; compared with a conventional X-ray film, a CT scan can accurately reveal the details of a rib fracture, such as the number and specific locations of fractures, and can also be used to evaluate damage to adjacent tissue structures. A doctor with three years of clinical experience can accurately diagnose the fracture type and injury degree by reading a high-definition CT image. However, hospitals and judicial institutions currently depend heavily on CT image quality and physician experience when diagnosing or treating rib fracture. A reader without clinical experience may diagnose rib fracture inaccurately, while even an experienced doctor needs to spend 2-3 minutes comparing slices back and forth for each diagnosis, making the process time-consuming and tedious. In addition, manual interpretation is affected by factors such as reading fatigue, the number of rib fractures and non-standard anatomical plane distribution, leading to a high missed-diagnosis rate. Therefore, it is highly desirable to develop a rapid and accurate automatic rib fracture identification technique so that patients can undergo surgical treatment as early as possible.
In 2006, Professor Geoffrey Hinton first proposed the concept of Deep Learning (DL), a method by which computers automatically learn pattern features. Compared with traditional algorithms, deep learning has strong feature extraction capability: relying on a large amount of sample data, it can obtain deep features with stronger robustness and generalization ability, and it performs excellently in many image processing fields. In recent years it has become a research hotspot in fields such as breast cancer, lung nodule and lung tumor prediction.
CornerNet and CenterNet are among the latest deep-learning target detection methods and perform well in many detection tasks. However, applying them directly to chest rib fracture detection raises the following problems. First, chest rib fracture detection is a small-scale target detection problem (a whole chest CT image is 1176 × 1194 pixels, while a fracture is only about 50 × 50 to 100 × 100 pixels, so the fracture occupies a small area of the whole image); fracture forms are complex and varied and highly similar to the surrounding background, so directly using HourglassNet to extract fracture features does not work well. Second, CenterNet improves on CornerNet by supplementing it with the features of the geometric center point, but the geometric-center features of a chest rib fracture are often sparse; the rib fracture features extracted by CenterNet therefore have poor expressive power, and neither the detection rate nor the classification accuracy of fractures can be guaranteed.
Disclosure of Invention
In order to solve the above technical problems in the prior art, the invention provides a chest rib fracture detection method that detects a chest CT image, identifies all rib fracture types contained in it, marks the fracture positions with specific bounding boxes, and achieves high bounding-box accuracy.
To achieve this purpose, the technical scheme adopted by the invention is as follows: a chest rib fracture detection method based on an attention convolutional neural network, which adopts an attention module to capture the effective feature points in the central area that best express semantic information, thereby improving the fracture detection rate and classification accuracy, and adds a multi-scale Inception block to effectively extract rib fracture features of complex and varied forms, thereby improving the detection accuracy of ultra-small-target chest rib fractures.
The method comprises the following specific steps:
Step one, a CT scanner scans the chest to obtain chest rib fracture data, and category classification, manual labeling and format conversion are performed on the data to obtain a chest rib fracture data set.
Step two, training the data set of the fracture of the chest rib, wherein the training process comprises the following steps:
1) sending the data set of the chest rib fracture into a preprocessing module for preprocessing;
2) sending the preprocessed image into a feature extraction network for primary feature extraction;
3) sending the image after primary feature extraction into a multi-scale Inception module for multi-scale feature extraction, and recombining features of different scales;
4) sending the recombined features into a cascade corner pooling prediction module, predicting the keypoints at the top-left and bottom-right corners of the target to be detected, and outputting a heatmap, connection vectors and offsets respectively;
sending the recombined features into a center pooling prediction module, predicting the center point of the target to be detected, and outputting offsets and a heatmap respectively, wherein the center-point heatmap is processed by an attention module;
5) constraining the corner heatmaps, connection vectors and offsets, together with the center-point offsets and heatmap, through an overall loss function;
step three, testing the data set of the fracture of the chest rib:
1) sending the data set of the chest rib fracture into a preprocessing module for preprocessing;
2) sending the preprocessed image into a feature extraction network for primary feature extraction;
3) sending the image after primary feature extraction into a multi-scale Inception module for multi-scale feature extraction, and recombining features of different scales;
4) sending the recombined features into a cascade corner pooling prediction module, predicting the keypoints at the top-left and bottom-right corners of the target to be detected, and outputting a heatmap, connection vectors and offsets respectively;
sending the recombined features into a center pooling prediction module, predicting the center point of the target to be detected, and outputting offsets and a heatmap respectively, wherein the center-point heatmap is processed by an attention module;
5) processing the center-point heatmap through the attention module, and classifying and locating targets according to the predicted parameters of the top-left, bottom-right and center keypoints.
The heatmaps are constrained by a heatmap loss function, the connection vectors by a connection-vector loss function, and the offsets by an offset loss function.
The multi-scale Inception module comprises four branches. Three branches perform convolution with 1 × 1 kernels and then apply 3 × 3 convolution to the results; the fourth branch performs 3 × 3 pooling and then applies 1 × 1 convolution to the pooled result. The outputs of the four branches are fed into a convolution layer for data dimensionality reduction.
The attention module comprises three branches. The first branch convolves the input image with a 1 × 1 kernel to obtain a feature map f(x); the second branch convolves the input image with a 1 × 1 kernel to obtain a feature map g(x); the autocorrelation coefficients of f(x) and g(x) are computed to obtain a pixel-weight distribution map of the feature image. The third branch convolves the input image with a 1 × 1 kernel to obtain a feature map h(x); h(x) is multiplied by the pixel-weight distribution map to obtain the corresponding self-attention feature map, in which the point with the maximum weight, i.e. the most prominent response point, is found; finally, the self-attention feature map is superimposed on the original input feature map through a modulation factor α to obtain the output.
The overall loss function comprises the corner position loss, the attention-point position loss, the embedding loss, and the corner and attention-point compensation (offset) losses, expressed as

L = L_det^co + L_det^at + α·L_pull + β·L_push + γ·(L_off^co + L_off^at)

where α = β = 0.1 and γ = 1; L_det^co denotes the corner position loss; L_det^at denotes the attention-point position loss; L_pull and L_push form the embedding loss, with L_pull used to reduce the distance between the connection vectors of two corners belonging to the same target and L_push used to enlarge the distance between the connection vectors of two corners not belonging to the same target; L_off^co denotes the corner compensation loss; and L_off^at denotes the attention-point compensation loss.
The invention improves the inherent structure of CenterNet (a network model that performs target detection based on three keypoints) so that it becomes suitable for chest rib fracture detection:
First, in feature extraction, considering that rib fracture is an ultra-small-scale target and that fracture forms are complex, varied and highly similar to the surrounding background, the proposed detection network retains the HourglassNet structure and, after primary feature extraction with HourglassNet, feeds the feature map into an Inception module capable of multi-scale feature extraction, so that rib fracture features are effectively extracted while the detection accuracy for ultra-small targets is improved.
Second, an attention sub-network is designed in the prediction module to adaptively estimate the effective feature points of the central area of the rib fracture; the target category and the bounding-box position are determined from the estimated central feature point together with the top-left and bottom-right keypoints, yielding high localization precision.
Drawings
Fig. 1 is a schematic view of the overall architecture of a rib fracture detection network.
Fig. 2 is a schematic diagram of a multi-scale inclusion block.
FIG. 3 is a schematic diagram of an attention module.
Fig. 4 is a CT image of several fractures commonly seen in rib bone images of the chest.
FIG. 5 is a detected Broken1 rib image with the defect location within a small box.
FIG. 6 is a detected Broken2 rib image with the defect location within a small box.
FIG. 7 is a detected Broken3 rib image with the defect location within a small box.
FIG. 8 is a detected Broken4 rib image with the defect location within a small box.
FIG. 9 is a detected Broken5 rib image with the defect location within a small box.
FIG. 10 is a histogram of the detection accuracy of three defect types under different detection models.
Detailed Description
In order to make the technical problems, technical solutions and advantageous effects to be solved by the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, a method for detecting a fracture of a rib of a chest based on an attention convolution neural network includes the following steps:
scanning the chest by adopting a CT scanner to obtain the fracture data of the rib of the chest;
secondly, carrying out category classification, manual labeling and format conversion operation on the chest rib fracture data to obtain a chest rib fracture data set;
step three, training a chest rib fracture data set;
and step four, testing the data set of the chest rib fracture.
The training process is specifically divided into the following steps:
1) sending the data set of the chest rib fracture into a preprocessing module for preprocessing;
2) first, sending the rib fracture image into HourglassNet (an hourglass network with a symmetric encoder-decoder structure) for primary feature extraction;
3) then sending the result into a multi-scale Inception module (a basic block of the GoogLeNet network) to extract multi-scale features and recombine features of different scales;
4) in the cascade corner pooling prediction module, predicting the keypoints at the top-left and bottom-right corners of the target to be detected, and outputting the keypoint heatmap, connection vectors and offsets respectively;
5) in the center pooling prediction module, predicting the center point of the target to be detected; in particular, an attention subnet is adopted for center-point prediction to extract the sensitive points that best express the central features of the target, and the heatmap, connection vectors and offsets are output;
6) constraining the network with the heatmap loss, connection-vector loss and offset loss.
The testing process is essentially the same as the training process, except that in the final step the targets are classified and located according to the predicted parameters of the top-left, bottom-right and center keypoints.
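The final test step (classifying and locating targets from the predicted top-left, bottom-right and center keypoints) can be sketched as a minimal decoding routine. This is an illustrative NumPy sketch under stated assumptions, not the patent's implementation: the function name, the central-region fraction n, and the use of raw heatmap peaks are assumptions, and embedding-based corner pairing and offset refinement are omitted.

```python
import numpy as np

def decode_box(tl_heat, br_heat, ct_heat, n=3):
    """Decode one detection from top-left / bottom-right / center heatmaps.
    The corner pair is kept only if a center-point peak falls inside the
    central region of the candidate box (a CenterNet-style check)."""
    ty, tx = np.unravel_index(np.argmax(tl_heat), tl_heat.shape)
    by, bx = np.unravel_index(np.argmax(br_heat), br_heat.shape)
    if bx <= tx or by <= ty:          # corners must form a valid box
        return None
    # central region: middle 1/n of the box in each dimension
    cx1 = tx + (n - 1) * (bx - tx) / (2 * n)
    cx2 = tx + (n + 1) * (bx - tx) / (2 * n)
    cy1 = ty + (n - 1) * (by - ty) / (2 * n)
    cy2 = ty + (n + 1) * (by - ty) / (2 * n)
    cy, cx = np.unravel_index(np.argmax(ct_heat), ct_heat.shape)
    if cx1 <= cx <= cx2 and cy1 <= cy <= cy2:
        return (tx, ty, bx, by)       # confirmed bounding box
    return None                       # center check failed: reject the pair
```

A corner pair whose center check fails is discarded, which is exactly how the center keypoint suppresses low-quality boxes in the scheme described above.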
The specific structure of the multi-scale Inception module is as follows:
To improve the network's detection precision for chest rib fracture, the feature extraction module comprises a HourglassNet module and a multi-scale Inception module. HourglassNet is a feature extraction network with an encoder-decoder structure, and using it for preliminary feature extraction ensures the network's ability to detect rib fractures. In addition, the multi-scale Inception module increases the receptive field of the network at low computational cost while simultaneously extracting the low-level features and the high-level semantic features of the image, thereby improving the network's detection accuracy for ultra-small targets.
As shown in fig. 2, the multi-scale Inception structure is divided into four branches. First, to obtain features of multiple scales simultaneously, the input is convolved with 1 × 1 kernels (in three branches) and pooled with a 3 × 3 kernel (in the fourth branch); the 1 × 1 convolution results are then convolved with 3 × 3 kernels, and the pooled result is convolved with a 1 × 1 kernel. Second, the extracted features of different scales are fused and fed into a convolution layer consisting of a bottleneck layer (convolution kernel size 1 × 1), a batch normalization layer (BatchNorm, BN) and an activation layer (ReLU). The 1 × 1 convolution can be regarded as a linear transformation of the input channels; it reduces the data dimensionality, improves the feature-characterization capability of the network and increases the network's adaptability to scale.
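The branch layout just described can be illustrated with a small NumPy sketch. This is a hedged illustration rather than the patented implementation: the channel counts, random weights and the omission of BN are assumptions; what it shows is the structure itself (three 1 × 1 → 3 × 3 convolution branches, one 3 × 3-pool → 1 × 1 branch, channel concatenation, then a 1 × 1 bottleneck with ReLU).

```python
import numpy as np

def conv1x1(x, w):
    # a 1x1 convolution is a linear map over channels: x (C_in,H,W), w (C_out,C_in)
    return np.tensordot(w, x, axes=([1], [0]))

def conv3x3(x, w):
    # naive 3x3 same-padding convolution; w: (C_out, C_in, 3, 3)
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], h, wd))
    for i in range(3):
        for j in range(3):
            out += np.tensordot(w[:, :, i, j], xp[:, i:i + h, j:j + wd],
                                axes=([1], [0]))
    return out

def maxpool3x3(x):
    # 3x3 max pooling, stride 1, same padding
    c, h, w = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)), constant_values=-np.inf)
    out = np.full((c, h, w), -np.inf)
    for i in range(3):
        for j in range(3):
            out = np.maximum(out, xp[:, i:i + h, j:j + w])
    return out

def inception_block(x, rng):
    c = x.shape[0]
    branches = []
    for _ in range(3):  # three branches: 1x1 conv followed by 3x3 conv
        y = conv1x1(x, rng.standard_normal((c // 2, c)))
        branches.append(conv3x3(y, rng.standard_normal((c // 2, c // 2, 3, 3))))
    # fourth branch: 3x3 pooling followed by 1x1 conv
    branches.append(conv1x1(maxpool3x3(x), rng.standard_normal((c // 2, c))))
    cat = np.concatenate(branches, axis=0)  # recombine features of different scales
    # final 1x1 bottleneck for dimensionality reduction (BN omitted), then ReLU
    return np.maximum(conv1x1(cat, rng.standard_normal((c, cat.shape[0]))), 0.0)
```

For an 8-channel 16 × 16 input the block returns an 8-channel 16 × 16 output while internally mixing the four scale branches, matching the fuse-then-reduce description above.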
Compared with traditional feature extraction modules such as AlexNet and VGG, the multi-scale Inception module is characterized by the following:
(1) Input via a bottleneck layer (bottleneck, convolution kernel size 1 × 1): the classical multi-scale approach convolves the same input with kernels of different sizes and then merges the results, which is very computationally intensive. Introducing the bottleneck layer performs feature dimensionality reduction and reduces the number of feature-map channels; this reduces parameters and accelerates computation while increasing the nonlinear expressive capability of the model.
(2) BN and ReLU: BN normalizes each layer toward a standard normal distribution; the ReLU nonlinear activation unit propagates errors well in backpropagation, and since the gradient of ReLU is either 1 or 0, gradient explosion/vanishing is suppressed and training is accelerated. This combination is added after each convolution and repeatedly accelerates the update of the weight parameters.
(3) Each convolution with a 5 × 5 kernel is split into two convolutions with 3 × 3 kernels, which preserves the extracted image information while enriching the convolution kernels and increasing the width of the network. At the same time, a small convolution kernel can better extract the information of smaller objects.
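A quick arithmetic check of point (3): for C input and C output channels, two stacked 3 × 3 convolutions use 2·9·C² weights against 25·C² for a single 5 × 5 convolution, while covering the same 5 × 5 receptive field (the channel count C is an illustrative assumption; bias terms are ignored for simplicity):

```python
C = 64                                 # example channel count (an assumption)
params_5x5 = 5 * 5 * C * C             # one 5x5 conv layer
params_two_3x3 = 2 * (3 * 3 * C * C)   # two stacked 3x3 conv layers

# receptive field of two stacked 3x3 convs at stride 1: 3 + (3 - 1) = 5
rf = 3 + (3 - 1)

print(params_two_3x3 / params_5x5)     # 0.72: 28% fewer parameters
```

The split therefore costs less and, because each 3 × 3 convolution is followed by its own nonlinearity, adds expressive depth as well.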
In general, a conventional convolutional layer performs convolution on the input at a single scale only (for example, the kernel size is always 3 × 3), the output dimensionality is fixed, and the output features are distributed roughly uniformly over that 3 × 3 scale range, i.e. a sparsely distributed feature set is output. Because the Inception module extracts features at multiple scales (e.g. kernel sizes 1 × 1, 3 × 3, 5 × 5), the features it outputs follow a different distribution rule: instead of being uniformly distributed, strongly correlated features are gathered together (the 1 × 1 features cluster together, the 3 × 3 features cluster together, and the 5 × 5 features cluster together) and irrelevant non-critical features are weakened, i.e. several densely distributed sub-feature sets are output. For the same number of output features, the multi-scale Inception module therefore outputs less redundant information. This purer feature set is propagated layer by layer and finally used as the input of the backward computation, so convergence is naturally faster and a high-quality detection box can be obtained. In short, the multi-scale Inception module decomposes a sparse matrix into dense matrices, greatly reducing the amount of computation and accelerating network convergence.
The specific structure of the self-attention module is as follows:
Because the features of the fracture central area in a rib image are sparse, the features at the geometric center point of the target may be useless information; if that point is captured as the feature basis for classification, the network's fracture detection rate and classification accuracy are reduced. To solve this problem, the central-keypoint prediction module of CenterNet is improved: the position of the geometric center is no longer taken as the criterion for the central keypoint; instead, the feature point attended to by a self-attention mechanism is taken as the central keypoint. The self-attention mechanism automatically focuses on the feature center of the target: by computing the autocorrelation coefficients among the pixels of the feature map, the pixel with the largest weight is obtained and used as the central keypoint.
As shown in fig. 3, the self-attention module is divided into three branches. First, the input is convolved with 1 × 1 kernels to obtain the feature maps f(x) and g(x) in the 1st and 2nd branches respectively, and the autocorrelation coefficients of f(x) and g(x) are computed to obtain a pixel-weight distribution map of the feature image. Second, in the 3rd branch, the input is also convolved with a 1 × 1 kernel to obtain a feature map h(x); h(x) is multiplied by the pixel-weight distribution map obtained in the previous step to produce the corresponding self-attention feature map, in which the point with the maximum weight, i.e. the most prominent response point, is found. Finally, the self-attention feature map is superimposed on the original input feature map through a modulation factor α to obtain the output.
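The three-branch computation just described can be sketched in NumPy. This is a hedged illustration under stated assumptions: the softmax normalization of the correlation map, the reduced channel count of f and g, and the way the peak is read off are choices made here for the sketch, not details given in the text.

```python
import numpy as np

def self_attention(x, wf, wg, wh, alpha=0.5):
    """x: (C, H, W) feature map; wf, wg: (C', C) and wh: (C, C) are the
    1x1-conv weights of the f, g and h branches (a 1x1 conv over channels
    is a matrix multiplication). Returns the modulated output map and the
    position of the strongest response point (central-keypoint candidate)."""
    c, h, w = x.shape
    flat = x.reshape(c, h * w)
    f = wf @ flat                                  # branch 1: f(x), (C', N)
    g = wg @ flat                                  # branch 2: g(x), (C', N)
    hx = wh @ flat                                 # branch 3: h(x), (C, N)
    logits = f.T @ g                               # pairwise correlation, (N, N)
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)        # pixel-weight distribution map
    out = hx @ attn.T                              # attended h(x), (C, N)
    total = attn.sum(axis=0)                       # total weight received per pixel
    peak = np.unravel_index(np.argmax(total), (h, w))  # most prominent response
    return x + alpha * out.reshape(c, h, w), peak  # residual output via factor alpha
```

The returned peak position then plays the role the text assigns to the attended central keypoint, replacing the raw geometric center.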
The self-attention mechanism does not depend on the positions of the feature-map pixels: it captures the internal correlation of the data or features by computing the similarity between pixels, reducing the dependence on external feature information. In particular, it can quickly extract the important features of sparse data, parallelizes well, and greatly improves computational efficiency. Adopting a self-attention structure for rib fracture detection effectively alleviates the drop in detection accuracy and the increase in false detections that CenterNet suffers from due to the sparse features of the fracture central region, and has a positive effect on improving the rib fracture detection rate and the classification accuracy of fracture types.
The loss function applied to the attention convolutional neural network is expressed as follows.
The loss function is shown in formula (1) and includes five terms: the corner position loss, the attention-point position loss, the embedding loss, the corner offset loss and the attention-point offset loss.
Figure BDA0002642205340000081
Wherein, α ═ β ═ 0.1, γ ═ 1,
Figure BDA0002642205340000082
representing the corner location (headers) loss,
Figure BDA0002642205340000083
the position loss of the attention point is shown and is an improved version of focal loss (the improvement of a cross entropy loss function balances the imbalance problem of positive and negative sample proportions);
Figure BDA0002642205340000084
and expressing vector loss, namely, finding a pair of corner points of each target based on the distance between connecting vectors of different corner points, wherein if one upper left corner point and one lower right corner point belong to the same target, the distance between embedding vectors of the upper left corner point and the lower right corner point is small. The embedding training is carried out by
Figure BDA0002642205340000085
The two loss functions are implemented such that,
Figure BDA0002642205340000086
for reducing the distance between two corner connecting vectors belonging to the same object,
Figure BDA0002642205340000087
for enlarging purposes other thanTwo corner points of the same object connect the distance of the vector. Model training
Figure BDA0002642205340000088
The loss function groups the vertices of the same object,
Figure BDA0002642205340000089
the loss function is used for separating the vertexes of different targets;
Figure BDA00026422053400000810
the method specifically adopts learning of a smooth L1 loss function supervision parameter gamma.
The data set construction process comprises the following steps:
the invention collects image data of 30 patients who underwent chest CT examination for chest trauma; the resolution of the images is unified to 1176×1194. Because chest rib fractures vary in pathology and orientation and are difficult to distinguish, a reasonable classification of chest rib fractures is particularly important. The invention therefore sets a classification standard based on medical diagnostic practice, so that it is both suitable for implementation under a deep learning framework and close to the actual requirements of clinical medicine. Rib fractures are specifically classified into the following 5 categories: bilateral cortical bone fracture, lateral cortical bone fracture, medial cortical bone fracture, cortical bone buckling fracture and other non-classifiable cases. As shown in fig. 4, for the convenience of the experiment, these five categories are labelled broken1, broken2, broken3, broken4 and broken5 in order when constructing the data set.
Finally, a chest CT rib fracture image data set is constructed from the chest CT rib fracture images collected from the hospital by an image-cutting data set expansion method, giving 12276 images in total, all of which contain defects. Among these defective images, 11079 contain a single defect type and 1197 contain mixed defect types. The composition of the single-type defect image data set is shown in table 1.
TABLE 1 Single type Defect image dataset construction
[Table 1 appears as an image in the original document and is not reproduced here.]
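The patent names an image-cutting expansion step but not its parameters. A minimal sketch of sliding-window cropping, with patch size and stride as hypothetical values:

```python
def crop_patches(img_h, img_w, patch=512, stride=256):
    """Enumerate (x1, y1, x2, y2) coordinates of sliding-window crops covering
    an img_h x img_w image. Patch size and stride are assumptions; the patent
    only names the image-cutting expansion method, not its parameters."""
    coords = []
    for y in range(0, max(img_h - patch, 0) + 1, stride):
        for x in range(0, max(img_w - patch, 0) + 1, stride):
            coords.append((x, y, x + patch, y + patch))
    return coords
```

Each crop inherits the annotations that fall inside it, which is how one source image can yield several training samples.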
The overall image dataset is specifically constructed as shown in table 2:
TABLE 2 Overall image dataset composition
Figure BDA0002642205340000092
Figure BDA0002642205340000101
In the experiment, the training set contains 11314 pictures, the verification set contains 212 pictures, and the remaining 750 pictures are used as the test set. In addition, the invention adopts the annotation format of the COCO data set; the process is as follows: (1) all collected chest rib fracture images are unified into jpg format and named with 8-digit numbers, such as 00000001.jpg; the images are divided into three parts for training, verification and testing, stored under the train2014, minival2014 and testdev2017 folders of the images folder respectively; (2) the rib fracture images in the train2014, minival2014 and testdev2017 folders are manually annotated with graphical image annotation tool software to serve as the Ground Truth (the reference labels against which a supervised model is trained and evaluated). After annotation, an xml file is generated for each image, containing the rib category, the coordinates of the upper-left corner of the bounding box, its length and width, and other information; (3) the xml files are converted into instances_trainval2014.json, instances_minival2014.json and instances_testdev2017.json files and stored uniformly under the annotations folder; these json files are in the format required by the COCO data set.
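Step (3) above converts the per-image xml annotations to a COCO-style json file. A minimal sketch, assuming labelImg-style VOC xml fields (`filename`, `size`, `object/name`, `object/bndbox`) and the five category names used in this data set; the exact xml layout produced by the annotation tool is an assumption:

```python
import json
import xml.etree.ElementTree as ET

CATEGORIES = ["broken1", "broken2", "broken3", "broken4", "broken5"]

def voc_xml_to_coco(xml_paths, out_json):
    """Merge per-image VOC-style xml annotations into one COCO-format json file."""
    coco = {"images": [], "annotations": [],
            "categories": [{"id": i + 1, "name": c} for i, c in enumerate(CATEGORIES)]}
    ann_id = 1
    for img_id, path in enumerate(xml_paths, start=1):
        root = ET.parse(path).getroot()
        coco["images"].append({
            "id": img_id,
            "file_name": root.findtext("filename"),
            "width": int(root.findtext("size/width")),
            "height": int(root.findtext("size/height")),
        })
        for obj in root.iter("object"):
            box = obj.find("bndbox")
            x1, y1 = float(box.findtext("xmin")), float(box.findtext("ymin"))
            x2, y2 = float(box.findtext("xmax")), float(box.findtext("ymax"))
            coco["annotations"].append({
                "id": ann_id, "image_id": img_id,
                "category_id": CATEGORIES.index(obj.findtext("name")) + 1,
                "bbox": [x1, y1, x2 - x1, y2 - y1],  # COCO convention: [x, y, w, h]
                "area": (x2 - x1) * (y2 - y1), "iscrowd": 0,
            })
            ann_id += 1
    with open(out_json, "w") as fh:
        json.dump(coco, fh)
    return coco
```

Note the bbox conversion: VOC stores opposite corners, while COCO stores the top-left corner plus width and height.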
The training set in this experiment was used to train three networks: CornerNet, CenterNet and our improved network. After training, the test images in the data set are input into each model, which identifies and localizes the 5 defect types broken1, broken2, broken3, broken4 and broken5 and outputs the target categories and bounding boxes.
The results of the experiments performed on the above data sets are as follows:
in order to evaluate the performance of the improved network, the invention adopts a test set of 750 pictures, corresponding to 150 defects of each of the 5 types broken1, broken2, broken3, broken4 and broken5.
As can clearly be seen from figs. 5, 6, 7, 8 and 9, the invention accurately detects all 5 types of rib defects. Because the multi-scale Inception idea is adopted, the network model can learn upper-layer and lower-layer features at the same time, the extracted features are more comprehensive, and the detection precision of the method is improved. Combined with the advantages of the attention module, the positions of the bounding boxes are also more accurate.
The invention adopts three indexes, namely the missed detection rate, the false detection rate and the detection position accuracy, to quantitatively analyze the detection performance of CornerNet, CenterNet and the method of the invention on the defect types. The statistics of the defect detection results of the three methods are shown in table 3. The statistical results of the detection accuracy, false detection rate and missed detection rate of the three methods are shown in table 4. The accuracy histogram for the defects detected using the different detection models is shown in fig. 10.
TABLE 3 statistics of the results of the three methods of defect detection (unit: sheet)
[Table 3 appears as an image in the original document and is not reproduced here.]
TABLE 4 statistical results of detection accuracy, false detection rate and missed detection rate of the three methods
[Table 4 appears as an image in the original document and is not reproduced here.]
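The three indexes used in the quantitative analysis can be computed from simple per-class counts. The definitions below are assumptions (missed = ground-truth defects with no correct detection; false = detections matching no ground truth), since the patent does not define them formally:

```python
def detection_metrics(num_gt, num_correct, num_false):
    """Counting metrics for one defect class (assumed definitions):
    missed rate   = missed defects / ground-truth defects
    false rate    = false detections / total detections
    accuracy      = correct detections / ground-truth defects"""
    missed = num_gt - num_correct
    miss_rate = missed / num_gt
    total_det = num_correct + num_false
    false_rate = num_false / total_det if total_det else 0.0
    accuracy = num_correct / num_gt
    return miss_rate, false_rate, accuracy
```

For example, with 150 ground-truth defects of one type, 140 correct detections and 5 false detections, the missed rate is 10/150 and the accuracy 140/150 (the counts here are hypothetical, not taken from tables 3 and 4).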
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principles of the present invention are intended to be included therein.

Claims (6)

1. A chest rib fracture detection method based on an attention convolution neural network is characterized by comprising the following specific steps:
firstly, constructing a chest rib fracture data set;
secondly, training a chest rib fracture data set:
1) sending the data set of the chest rib fracture into a preprocessing module for preprocessing;
2) sending the preprocessed image into a feature extraction network for primary feature extraction;
3) sending the image after the primary feature extraction into a multi-scale Inception module for multi-scale feature extraction, and recombining features of different scales;
4) sending the recombined image into a cascade corner pooling prediction module, predicting the key points at the upper left corner and the lower right corner of the target to be detected, and respectively outputting a hot spot map, connection vectors and offsets;
sending the reconstructed image into a central pooling prediction module, predicting a central point of a target to be detected, and respectively outputting an offset and a hot spot map, wherein the central point hot spot map is processed by an attention module;
5) constraining the hot spot map, connection vectors and offsets of the corner points and the offsets and hot spot map of the central point through an overall loss function;
thirdly, testing the data set of the fracture of the chest rib:
the early stage of the test is the same as steps 1), 2), 3) and 4) of the training process in step two; the difference is as follows: the central point hot spot map is processed by the attention module, and target classification and localization are carried out on the predicted parameters of the upper-left corner, lower-right corner and central key points.
2. The method of claim 1, wherein the hotspot graph is constrained by a hotspot loss function, the connection vector is constrained by a connection vector loss function, and the offset is constrained by an offset loss function.
3. The chest rib fracture detection method based on the attention convolution neural network, characterized in that the multi-scale Inception module comprises four branches, wherein three branches first perform convolution with 1×1 convolution kernels and then perform a convolution operation on the results with 3×3 convolution kernels; the remaining branch performs pooling with a 3×3 pooling kernel and then performs a convolution operation on the pooled result with a 1×1 convolution kernel; and the convolution results of the four branches are input into a convolution layer for data dimension reduction.
4. The chest rib fracture detection method based on the attention convolution neural network, characterized in that the attention module comprises three branches: the first branch convolves the input image with a 1×1 convolution kernel to obtain a feature map f(x); the second branch convolves the input image with a 1×1 convolution kernel to obtain a feature map g(x), and the autocorrelation coefficients of f(x) and g(x) are calculated to obtain a pixel weight distribution map of the feature image; the third branch convolves the input image with a 1×1 convolution kernel to obtain a feature map h(x), h(x) is multiplied by the pixel weight distribution map to obtain the corresponding self-attention feature map, the point with the largest weight, i.e. the most prominent response point of the feature, is found, and the self-attention feature map is superimposed on the original input feature map through a modulation factor α to obtain the output.
5. The method for detecting the fracture of the rib of the chest based on the attention convolution neural network as claimed in claim 4, wherein the overall loss function includes a corner point position loss, an attention point position loss, an embedding loss, a corner point offset loss and an attention point offset loss, and is specifically expressed as follows:

L = L_det^co + L_det^at + α·L_pull + β·L_push + γ·(L_off^co + L_off^at)

wherein α = β = 0.1 and γ = 1; L_det^co represents the corner point position loss; L_det^at represents the attention point position loss; L_pull and L_push form the embedding loss, L_pull being used for reducing the distance between the embedding vectors of two corner points belonging to the same target and L_push being used for enlarging the distance between the embedding vectors of two corner points not belonging to the same target; L_off^co represents the corner point offset loss; and L_off^at represents the attention point offset loss.
6. The chest rib fracture detection method based on the attention convolution neural network as claimed in claim 1, wherein in the first step, the specific process of constructing the chest rib fracture data set is as follows: the CT scanner scans the chest to obtain the fracture data of the chest rib, and performs classification, manual labeling and format conversion on the fracture data of the chest rib to obtain a fracture data set of the chest rib.
CN202010845981.5A 2020-08-20 2020-08-20 Chest rib fracture detection method based on attention convolution neural network Active CN111986177B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010845981.5A CN111986177B (en) 2020-08-20 2020-08-20 Chest rib fracture detection method based on attention convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010845981.5A CN111986177B (en) 2020-08-20 2020-08-20 Chest rib fracture detection method based on attention convolution neural network

Publications (2)

Publication Number Publication Date
CN111986177A true CN111986177A (en) 2020-11-24
CN111986177B CN111986177B (en) 2023-06-16

Family

ID=73443498

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010845981.5A Active CN111986177B (en) 2020-08-20 2020-08-20 Chest rib fracture detection method based on attention convolution neural network

Country Status (1)

Country Link
CN (1) CN111986177B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112669312A (en) * 2021-01-12 2021-04-16 中国计量大学 Chest radiography pneumonia detection method and system based on depth feature symmetric fusion
CN112784924A (en) * 2021-02-08 2021-05-11 宁波大学 Rib fracture CT image classification method based on grouping aggregation deep learning model
CN112802484A (en) * 2021-04-12 2021-05-14 四川大学 Panda sound event detection method and system under mixed audio frequency
CN112837264A (en) * 2020-12-23 2021-05-25 南京市江宁医院 Rib positioning and fracture clinical outcome prediction device and automatic diagnosis system
CN112907537A (en) * 2021-02-20 2021-06-04 司法鉴定科学研究院 Skeleton sex identification method based on deep learning and on-site virtual simulation technology
CN113077874A (en) * 2021-03-19 2021-07-06 浙江大学 Spine disease rehabilitation intelligent auxiliary diagnosis and treatment system and method based on infrared thermography
CN113111754A (en) * 2021-04-02 2021-07-13 中国科学院深圳先进技术研究院 Target detection method, device, terminal equipment and storage medium
CN113129278A (en) * 2021-04-06 2021-07-16 华东师范大学 X-Ray picture femoral shaft fracture detection method based on non-local separation attention mechanism
CN113674261A (en) * 2021-08-26 2021-11-19 上海脊影慧智能科技有限公司 Bone detection method, system, electronic device and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140376798A1 (en) * 2013-06-20 2014-12-25 Carestream Health, Inc. Rib enhancement in radiographic images
WO2017084222A1 (en) * 2015-11-22 2017-05-26 南方医科大学 Convolutional neural network-based method for processing x-ray chest radiograph bone suppression
CN110298266A (en) * 2019-06-10 2019-10-01 天津大学 Deep neural network object detection method based on multiple dimensioned receptive field Fusion Features
CN111128348A (en) * 2019-12-27 2020-05-08 上海联影智能医疗科技有限公司 Medical image processing method, device, storage medium and computer equipment

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140376798A1 (en) * 2013-06-20 2014-12-25 Carestream Health, Inc. Rib enhancement in radiographic images
WO2017084222A1 (en) * 2015-11-22 2017-05-26 南方医科大学 Convolutional neural network-based method for processing x-ray chest radiograph bone suppression
CN110298266A (en) * 2019-06-10 2019-10-01 天津大学 Deep neural network object detection method based on multiple dimensioned receptive field Fusion Features
CN111128348A (en) * 2019-12-27 2020-05-08 上海联影智能医疗科技有限公司 Medical image processing method, device, storage medium and computer equipment

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112837264B (en) * 2020-12-23 2023-12-15 南京市江宁医院 Rib positioning and fracture clinical outcome prediction device and automatic diagnosis system
CN112837264A (en) * 2020-12-23 2021-05-25 南京市江宁医院 Rib positioning and fracture clinical outcome prediction device and automatic diagnosis system
CN112669312A (en) * 2021-01-12 2021-04-16 中国计量大学 Chest radiography pneumonia detection method and system based on depth feature symmetric fusion
CN112784924A (en) * 2021-02-08 2021-05-11 宁波大学 Rib fracture CT image classification method based on grouping aggregation deep learning model
CN112784924B (en) * 2021-02-08 2023-05-23 宁波大学 Rib fracture CT image classification method based on grouping aggregation deep learning model
CN112907537A (en) * 2021-02-20 2021-06-04 司法鉴定科学研究院 Skeleton sex identification method based on deep learning and on-site virtual simulation technology
CN113077874A (en) * 2021-03-19 2021-07-06 浙江大学 Spine disease rehabilitation intelligent auxiliary diagnosis and treatment system and method based on infrared thermography
CN113077874B (en) * 2021-03-19 2023-11-28 浙江大学 Intelligent auxiliary diagnosis and treatment system and method for rehabilitation of spine diseases based on infrared thermal images
CN113111754A (en) * 2021-04-02 2021-07-13 中国科学院深圳先进技术研究院 Target detection method, device, terminal equipment and storage medium
CN113129278A (en) * 2021-04-06 2021-07-16 华东师范大学 X-Ray picture femoral shaft fracture detection method based on non-local separation attention mechanism
CN112802484B (en) * 2021-04-12 2021-06-18 四川大学 Panda sound event detection method and system under mixed audio frequency
CN112802484A (en) * 2021-04-12 2021-05-14 四川大学 Panda sound event detection method and system under mixed audio frequency
CN113674261A (en) * 2021-08-26 2021-11-19 上海脊影慧智能科技有限公司 Bone detection method, system, electronic device and storage medium

Also Published As

Publication number Publication date
CN111986177B (en) 2023-06-16

Similar Documents

Publication Publication Date Title
CN111986177B (en) Chest rib fracture detection method based on attention convolution neural network
Sahinbas et al. Transfer learning-based convolutional neural network for COVID-19 detection with X-ray images
Li et al. Automatic lumbar spinal MRI image segmentation with a multi-scale attention network
Chen et al. Anatomy-aware siamese network: Exploiting semantic asymmetry for accurate pelvic fracture detection in x-ray images
CN111553892A (en) Lung nodule segmentation calculation method, device and system based on deep learning
Yao et al. Pneumonia detection using an improved algorithm based on faster r-cnn
CN112784856A (en) Channel attention feature extraction method and identification method of chest X-ray image
CN113706491A (en) Meniscus injury grading method based on mixed attention weak supervision transfer learning
Guan et al. Automatic detection and localization of thighbone fractures in X-ray based on improved deep learning method
WO2022110525A1 (en) Comprehensive detection apparatus and method for cancerous region
CN113743463A (en) Tumor benign and malignant identification method and system based on image data and deep learning
Lu et al. Lung cancer detection using a dilated CNN with VGG16
Pradhan et al. Lung cancer detection using 3D convolutional neural networks
Huang et al. Bone feature segmentation in ultrasound spine image with robustness to speckle and regular occlusion noise
Kaliyugarasan et al. Pulmonary nodule classification in lung cancer from 3D thoracic CT scans using fastai and MONAI
Zhang et al. An algorithm for automatic rib fracture recognition combined with nnU-Net and DenseNet
Irene et al. Segmentation and approximation of blood volume in intracranial hemorrhage patients based on computed tomography scan images using deep learning method
CN117058149B (en) Method for training and identifying medical image measurement model of osteoarthritis
Giv et al. Lung segmentation using active shape model to detect the disease from chest radiography
Wang et al. False positive reduction in pulmonary nodule classification using 3D texture and edge feature in CT images
Sha et al. The improved faster-RCNN for spinal fracture lesions detection
CN113139627B (en) Mediastinal lump identification method, system and device
Kanawade et al. A Deep Learning Approach for Pneumonia Detection from X-ray Images
Paul et al. Computer-Aided Diagnosis Using Hybrid Technique for Fastened and Accurate Analysis of Tuberculosis Detection with Adaboost and Learning Vector Quantization
Gan et al. Improved traumatic brain injury classification approach based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant