CN112419246B - Depth detection network for quantifying esophageal mucosa IPCLs blood vessel morphological distribution - Google Patents
- Publication number
- CN112419246B (application CN202011263459.2A)
- Authority
- CN
- China
- Prior art keywords
- network
- region
- cancer
- channels
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000001514 detection method Methods 0.000 title claims abstract description 49
- 210000004877 mucosa Anatomy 0.000 title claims abstract description 6
- 210000004204 blood vessel Anatomy 0.000 title claims description 25
- 230000000877 morphologic effect Effects 0.000 title claims description 3
- 206010028980 Neoplasm Diseases 0.000 claims abstract description 44
- 201000011510 cancer Diseases 0.000 claims abstract description 44
- 238000003384 imaging method Methods 0.000 claims abstract description 31
- 238000003745 diagnosis Methods 0.000 claims abstract description 27
- 230000003902 lesion Effects 0.000 claims abstract description 13
- 238000000605 extraction Methods 0.000 claims abstract description 12
- 238000011176 pooling Methods 0.000 claims abstract description 11
- 238000012800 visualization Methods 0.000 claims abstract description 8
- 239000003086 colorant Substances 0.000 claims abstract description 7
- 238000012549 training Methods 0.000 claims description 19
- 230000006870 function Effects 0.000 claims description 8
- 238000010586 diagram Methods 0.000 claims description 6
- 238000004422 calculation algorithm Methods 0.000 claims description 4
- 239000000284 extract Substances 0.000 claims description 4
- 238000005457 optimization Methods 0.000 claims description 4
- 238000002474 experimental method Methods 0.000 claims description 3
- 230000000644 propagated effect Effects 0.000 claims 1
- 238000005070 sampling Methods 0.000 claims 1
- 230000001629 suppression Effects 0.000 claims 1
- 238000000034 method Methods 0.000 abstract description 7
- 206010041823 squamous cell carcinoma Diseases 0.000 abstract description 5
- 230000002792 vascular Effects 0.000 abstract description 4
- 238000012545 processing Methods 0.000 abstract description 2
- 238000013527 convolutional neural network Methods 0.000 description 8
- 230000000694 effects Effects 0.000 description 6
- 230000008595 infiltration Effects 0.000 description 6
- 238000001764 infiltration Methods 0.000 description 6
- 230000035945 sensitivity Effects 0.000 description 4
- 238000012360 testing method Methods 0.000 description 4
- 230000000007 visual effect Effects 0.000 description 4
- 230000002159 abnormal effect Effects 0.000 description 3
- 230000011218 segmentation Effects 0.000 description 3
- 206010017993 Gastrointestinal neoplasms Diseases 0.000 description 2
- 230000002776 aggregation Effects 0.000 description 2
- 238000004220 aggregation Methods 0.000 description 2
- 238000004458 analytical method Methods 0.000 description 2
- 238000013528 artificial neural network Methods 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 230000010365 information processing Effects 0.000 description 2
- 230000001537 neural effect Effects 0.000 description 2
- 238000003909 pattern recognition Methods 0.000 description 2
- 238000012216 screening Methods 0.000 description 2
- 238000001356 surgical procedure Methods 0.000 description 2
- 208000000461 Esophageal Neoplasms Diseases 0.000 description 1
- 206010030155 Oesophageal carcinoma Diseases 0.000 description 1
- 208000005718 Stomach Neoplasms Diseases 0.000 description 1
- 208000003464 asthenopia Diseases 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 238000003759 clinical diagnosis Methods 0.000 description 1
- 238000004195 computer-aided diagnosis Methods 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 230000010339 dilation Effects 0.000 description 1
- 201000004101 esophageal cancer Diseases 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 206010017758 gastric cancer Diseases 0.000 description 1
- 230000002496 gastric effect Effects 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 238000010801 machine learning Methods 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 230000009854 mucosal lesion Effects 0.000 description 1
- 230000001575 pathological effect Effects 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- 238000004393 prognosis Methods 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 201000011549 stomach cancer Diseases 0.000 description 1
- 230000004083 survival effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10068—Endoscopic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20092—Interactive image processing based on input by user
- G06T2207/20104—Interactive definition of region of interest [ROI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30096—Tumor; Lesion
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Medical Informatics (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Quality & Reliability (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
The invention belongs to the technical field of medical image processing and specifically relates to a depth detection network for quantifying the vascular morphology distribution of esophageal mucosa IPCLs. The network comprises a feature extraction network, a feature pyramid, a region candidate network, a cancer-focus classification network with region-of-interest pooling and a self-embedded cluster-distribution prior, and a system for visualization on narrow-band imaging endoscopic images. The feature extraction network extracts a feature map from the input image; the feature pyramid fuses features of different scales; the region candidate network proposes possible lesion regions; region-of-interest pooling pools the features of suspicious lesion regions; the classification network with the self-embedded cluster-distribution prior classifies the cancer foci; finally, the results are visualized on the narrow-band imaging endoscopic image, with cancer foci framed and marked in different colors. The invention can detect and diagnose early esophageal squamous cell carcinoma foci in the image, effectively improving diagnostic efficiency and helping doctors reach higher diagnostic accuracy.
Description
Technical Field
The invention belongs to the technical field of medical image processing, and particularly relates to a depth detection network for quantifying vascular morphology distribution of esophageal mucosa IPCLs.
Background
The prognosis of esophageal cancer and gastric cancer is poor, with 5-year relative survival rates of only 20.9% and 27.4%, respectively, placing a serious burden on health care [11,13-14]. Standardized screening, treatment and follow-up of upper gastrointestinal cancer are effective means of reducing cancer morbidity and mortality, and narrow-band imaging (NBI) endoscopic screening is the first-line means of finding upper gastrointestinal cancer. Under a narrow-band imaging endoscope, the pathological type and infiltration depth of an esophageal mucosal lesion are judged mainly from the distinctive vascular morphology of the intrapapillary capillary loops (IPCLs).
According to the typing standard proposed by Inoue and Arima [15], IPCL vessels can generally be classified into types A, B1, B2 and B3. Type A means no abnormal blood vessels are observed; type B1 means loop-shaped abnormal vessels are observed, dilated, serpentine, of varying caliber and non-uniform shape, 20-30 μm in diameter, with an infiltration depth of M1-M2; type B2 means non-loop vessels are observed, irregularly dendritic or multi-layered, with an infiltration depth of M3-SM1; type B3 means large green vessels are observed, highly dilated, with an infiltration depth of SM2.
The type, number and distribution of IPCL vessels play an important guiding role in clinical treatment decisions. For example, a large aggregation of IPCLs with deep infiltration may suggest that the esophageal lesion has entered the middle or late stage, making it unsuitable for minimally invasive treatment or even surgery; conversely, if IPCLs with deeper infiltration are scattered, the patient may still have an opportunity for surgery.
Clinically, observation of IPCLs is strongly affected by subjective human factors because, unlike conventional gastrointestinal endoscopic imaging, it requires magnifying the lesion surface 10-50 times with a magnifying gastroscope in NBI mode. As with a microscope, the doctor sees close to 200 fine structures per field of view in zoom mode. Under these conditions a clinician must observe all structures and easily develops visual fatigue; with limited clinical experience, after observing 5-10 fields of view the clinician remembers only the particularly striking parts, lacks an objective and quantifiable picture, and can easily misjudge the condition and make erroneous medical decisions.
This work frees clinicians from the influence of such subjective factors (fatigue, oversights, insufficient experience caused by large amounts of fine observation): the clinician only needs to magnify the lesion, and computer analysis yields IPCL predictions for all fields of view, including the number, proportion and aggregation of each vessel type, helping the clinician judge the lesion more accurately.
Deep convolutional neural networks are a machine-learning technology that can effectively avoid human factors and automatically learn to extract rich, representative visual features from large amounts of annotated data. The technology uses the back-propagation optimization algorithm, by which the machine updates its internal parameters and learns the mapping from input image to label. In recent years, deep convolutional neural networks have greatly improved performance on various computer-vision tasks.
In 2012, Krizhevsky et al. [1] first applied a deep convolutional neural network in the ImageNet [2] image classification competition and won with a Top-5 error rate of 15.3%, setting off the wave of deep learning. In 2015, Simonyan et al. [3] proposed the 16- and 19-layer neural networks VGG-16 and VGG-19, increasing the number of network parameters and further improving results on the ImageNet classification task. In 2016, He et al. [4] used the 152-layer residual network ResNet to achieve classification performance exceeding that of the human eye.
Deep convolutional neural networks perform excellently not only on image classification but also on structured-output tasks such as object detection [5-7] and semantic segmentation [8,9]. Applied to computer-aided diagnosis, they can assist doctors in making better medical diagnoses, enabling early detection and early treatment and improving treatment outcomes.
The invention provides a detection network with a self-embedded cluster-distribution prior, which fully mines the latent cluster-distribution prior of cancer foci, extracts rich features, and simultaneously performs detection and diagnosis of early esophageal squamous cell carcinoma foci.
Disclosure of Invention
The invention aims to provide a depth detection network with a self-embedded cluster-distribution prior for quantifying the vascular morphology distribution of esophageal mucosa IPCLs, which eliminates the influence of human factors and achieves automatic diagnosis of narrow-band imaging endoscopic images.
The invention provides a detection network with a self-embedded cluster-distribution prior, based on an object-detection neural network, specifically comprising: a feature extraction backbone network, a feature pyramid network, a region candidate network, a cancer-focus classification network with region-of-interest pooling and a self-embedded cluster-distribution prior, and an auxiliary diagnosis system for visualization on narrow-band imaging endoscopic images; wherein:
(1) The feature extraction backbone network is built on ResNet-50 [4] and comprises 50 convolutional layers for extracting feature maps of the input image (i.e., it serves as the feature extractor of the feature pyramid). Specifically, feature maps are extracted at the ends of stages 1, 2, 3 and 4 of the ResNet-50 model; the extracted feature maps have 256, 512, 1024 and 2048 channels, and their sizes are 1/4, 1/8, 1/16 and 1/32 of the original image, respectively. The feature maps are fed into the feature pyramid network [12].
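For illustration, the stage-wise extraction just described can be sketched as follows. This is a sketch only: PyTorch/torchvision and the input size are assumptions, since the patent names no framework.

```python
import torch
import torchvision

# ResNet-50 with randomly initialized parameters, as in the patent's training setup
resnet = torchvision.models.resnet50(weights=None)

def extract_features(x):
    x = resnet.maxpool(resnet.relu(resnet.bn1(resnet.conv1(x))))  # stem, 1/4 resolution
    c2 = resnet.layer1(x)   # 256 channels,  1/4 of the input size
    c3 = resnet.layer2(c2)  # 512 channels,  1/8
    c4 = resnet.layer3(c3)  # 1024 channels, 1/16
    c5 = resnet.layer4(c4)  # 2048 channels, 1/32
    return c2, c3, c4, c5

if __name__ == "__main__":
    for f in extract_features(torch.randn(1, 3, 800, 1216)):
        print(tuple(f.shape))  # (1,256,200,304) (1,512,100,152) (1,1024,50,76) (1,2048,25,38)
```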
(2) The feature pyramid network fuses features of different scales. All feature maps are first unified to 256 channels with 1×1 convolutions; then, from top to bottom, the upper-layer features are upsampled to twice their size layer by layer, added to the lower-layer features, and passed through a 3×3 convolution. This yields a multi-scale feature map: 1/4, 1/8, 1/16 and 1/32 of the original image size, each with 256 channels.
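A minimal sketch of this top-down fusion, assuming nearest-neighbour 2× upsampling (the patent only says the upper features are upsampled to twice their size); `SimpleFPN` is an illustrative name:

```python
import torch.nn as nn
import torch.nn.functional as F

class SimpleFPN(nn.Module):
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        # 1x1 convolutions unify every feature map to 256 channels
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_channels, 1) for c in in_channels)
        # 3x3 convolutions smooth each fused map
        self.smooth = nn.ModuleList(nn.Conv2d(out_channels, out_channels, 3, padding=1)
                                    for _ in in_channels)

    def forward(self, feats):                      # feats = (c2, c3, c4, c5)
        laterals = [l(f) for l, f in zip(self.lateral, feats)]
        for i in range(len(laterals) - 1, 0, -1):  # top-down: upsample and add
            laterals[i - 1] = laterals[i - 1] + F.interpolate(
                laterals[i], scale_factor=2, mode="nearest")
        return [s(l) for s, l in zip(self.smooth, laterals)]  # P2..P5, all 256 channels
```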
(3) The region candidate network extracts possible lesion regions. First, an anchor generator [5] produces dense rectangular candidate boxes; the candidate boxes come in 5×3 different configurations, combining five sizes (e.g., widths of 32, 64, 128, 256, 512) with three aspect ratios (e.g., 1:1, 1:2, 2:1). The features of each pyramid level pass through a 3×3 convolution and a 1×1 convolution, and Softmax judges whether each candidate box is a positive or a negative sample; finally, a 1×1 convolution with 12 channels performs bounding-box regression for the three shapes (4 coordinates per box, so 3 × 4 = 12 channels), correcting inaccurate candidate boxes.
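The anchor grid and the two convolutional branches can be sketched as follows; `make_anchors` and `RPNHead` are illustrative names, and the aspect-ratio convention (r = w/h) is an assumption:

```python
import torch
import torch.nn as nn

def make_anchors(sizes=(32, 64, 128, 256, 512), ratios=(1.0, 0.5, 2.0)):
    """15 base anchors: five sizes x three aspect ratios, centred at the origin."""
    anchors = []
    for s in sizes:
        for r in ratios:
            w, h = s * r ** 0.5, s / r ** 0.5  # keep area ~ s*s while varying shape
            anchors.append([-w / 2, -h / 2, w / 2, h / 2])
    return torch.tensor(anchors)  # tiled over every cell of every pyramid level

class RPNHead(nn.Module):
    def __init__(self, in_ch=256, num_shapes=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, in_ch, 3, padding=1)
        self.cls = nn.Conv2d(in_ch, num_shapes * 2, 1)  # positive/negative per shape (Softmax)
        self.reg = nn.Conv2d(in_ch, num_shapes * 4, 1)  # 3 shapes x 4 coords = 12 channels

    def forward(self, x):
        x = self.conv(x).relu()
        return self.cls(x), self.reg(x)
```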
(4) The cancer-focus classification network with region-of-interest pooling and a self-embedded cluster-distribution prior. Region-of-interest pooling pools the features of suspicious lesion regions, and the classification network with the self-embedded cluster-distribution prior classifies the cancer foci. Specifically, each region of interest is framed with a rectangular bounding box parallel to the coordinate axes and given a cancer-focus classification: a normal region (type A) or a lesion region (types B1, B2, B3). The network first extracts regions of interest from the appropriate levels of the feature pyramid, aligns them, and max-pools them to 7×7, so that each region of interest corresponds to a feature of size 256×7×7. The features of each region of interest are then stacked with the features of its K nearest neighbors (i.e., the feature channels are concatenated) into a feature map of shape (256×K)×7×7, which lets the classification network exploit the latent distribution prior of the cancer foci. Two output branches are then produced through fully connected layers: the first branch outputs the position offset of each feature region, further correcting the position of the detection box; the second branch computes the classification probabilities of the features through a Softmax function, yielding the cancer-focus category of the region. The fully connected layer flattens the (256×K)×7×7 feature map into a (12544×K)×1×1 feature and outputs 1024 channels; the first branch outputs 20 channels, i.e., four bounding-box coordinates for each of the 5 categories (5×4 = 20), and the second branch outputs 5 channels, i.e., 5 categories including the negative sample.
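A minimal sketch of this head, assuming that neighbours are chosen by box-centre distance and that the region of interest itself counts among the K grouped features (the patent fixes neither detail); `ClusterPriorHead` is an illustrative name, and `roi_align` (average sampling) stands in for the alignment-plus-pooling step described above:

```python
import torch
import torch.nn as nn
from torchvision.ops import roi_align

class ClusterPriorHead(nn.Module):
    def __init__(self, k=4, num_classes=5, channels=256):
        super().__init__()
        self.k = k
        self.fc = nn.Linear(channels * k * 7 * 7, 1024)  # flatten (256*K)x7x7 -> 12544*K
        self.cls = nn.Linear(1024, num_classes)          # 5 classes incl. negative (Softmax)
        self.reg = nn.Linear(1024, num_classes * 4)      # 20 = 5 categories x 4 offsets

    def forward(self, feat, boxes, stride=4):
        # feat: (1, 256, H, W), one pyramid level; boxes: (N, 4) xyxy in image coordinates
        rois = torch.cat([boxes.new_zeros(len(boxes), 1), boxes], dim=1)  # prepend batch idx
        f = roi_align(feat, rois, output_size=7, spatial_scale=1.0 / stride)  # (N, 256, 7, 7)
        centers = (boxes[:, :2] + boxes[:, 2:]) / 2
        knn = torch.cdist(centers, centers).topk(self.k, largest=False).indices  # (N, K)
        grouped = f[knn].flatten(start_dim=2).flatten(start_dim=1)  # (N, 12544*K)
        h = self.fc(grouped).relu()
        return self.cls(h), self.reg(h)

head = ClusterPriorHead(k=4)
feat = torch.randn(1, 256, 200, 304)  # a 1/4-scale pyramid level
boxes = torch.tensor([[10., 10., 60., 60.], [30., 20., 90., 80.],
                      [200., 150., 260., 210.], [220., 160., 300., 240.],
                      [400., 300., 460., 380.]])
cls_logits, box_offsets = head(feat, boxes)  # shapes (5, 5) and (5, 20)
```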
(5) The auxiliary diagnosis system for visualization on narrow-band imaging endoscopic images displays the narrow-band imaging endoscopic image and frames and marks cancer foci in different colors. Specifically, the input is a narrow-band imaging endoscopic image; the network detects and diagnoses the cancer foci, and detection boxes of different colors represent the different cancer-focus types: green, red, purple and black for A, B1, B2 and B3, respectively, each box labeled with its classification confidence. The confidences of all detection boxes are then screened: every detection box with confidence below a threshold T1 is removed, after which non-maximum suppression removes redundant overlapping boxes whose intersection-over-union exceeds a threshold T2. T1 and T2 are swept over all values in [0, 1] with a step of 0.05, and the optimal thresholds T1, T2 are determined by comparing F1-scores.
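The threshold search can be sketched as follows; `evaluate_f1` is a hypothetical helper, not part of the patent text, that runs the detector at the given thresholds on annotated images and returns the F1-score:

```python
def search_thresholds(evaluate_f1, step=0.05):
    # all values in [0, 1] with step 0.05, as specified above
    grid = [round(i * step, 2) for i in range(int(1 / step) + 1)]
    best = max((evaluate_f1(t1, t2), t1, t2) for t1 in grid for t2 in grid)
    return best[1], best[2]  # the optimal thresholds T1, T2 by F1-score
```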
The training method of the network model comprises the following steps:
before training, network parameters of the ResNet-50 model are initialized randomly, images in a training set are scaled, the resolution of the images is not more than 800 x 1333, and corresponding bounding boxes are scaled at the same time.
During training, the three channels (R, G, B) of each image are first normalized with mean [0.485, 0.456, 0.406] and standard deviation [0.229, 0.224, 0.225]. The Adam optimization algorithm [16] is used with an initial learning rate of 10⁻⁴, the two exponential decay rates for the moment estimates set to β1 = 0.9 and β2 = 0.999, and weight decay 0; a mini-batch stochastic gradient descent strategy with batch size 8 is used to minimize the loss function, and training runs for N rounds. Because the vessel types in the training set are unevenly distributed, types B2 and B3 would otherwise be insufficiently trained, so Focal loss is used as the loss function of the cancer-focus classification network, with the weights of the negative sample and of types A, B1, B2 and B3 set to C1, C2, C3, C4 and C5, respectively, determined from the distribution of each vessel type in the training set after several experiments.
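A minimal sketch of this training configuration, again assuming PyTorch; the stand-in model, the focal-loss weights and the focusing parameter γ are placeholders (the patent determines C1-C5 experimentally and does not state γ):

```python
import torch
import torchvision.transforms as T

# channel-wise normalization with the mean/std given above; apply as img = normalize(img)
normalize = T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

model = torch.nn.Linear(256, 5)  # stand-in for the full detection network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4,
                             betas=(0.9, 0.999), weight_decay=0)  # as specified above

# placeholder weights C1..C5 for (negative, A, B1, B2, B3); tuned experimentally
class_weights = torch.tensor([1.0, 1.0, 1.0, 2.0, 2.0])

def focal_loss(logits, targets, gamma=2.0):  # gamma=2 is a common default, an assumption
    # class-weighted focal loss: down-weights easy examples so rare B2/B3 still train
    log_p = torch.log_softmax(logits, dim=-1).gather(1, targets[:, None]).squeeze(1)
    return (-class_weights[targets] * (1 - log_p.exp()) ** gamma * log_p).mean()
```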
In the invention, after a narrow-band imaging endoscopic image is input, the cancer-focus detection and diagnosis result is obtained with only one forward pass.
The invention has the beneficial effects that:
the invention designs a cluster distribution prior self-embedded detection network, which takes a narrow-band imaging endoscope image as input and simultaneously realizes the cancer focus detection and diagnosis of early esophageal squamous carcinoma. The image to be tested can obtain detection and diagnosis results only through one-time forward propagation, and detection and classification tasks share part of network parameters, so that the calculation amount is effectively reduced, and the diagnosis efficiency is improved. Experimental results show that the method can accurately detect the cancer focus area of early esophageal squamous carcinoma, provide an accurate diagnosis result based on the detection frame, reduce the influence of human factors and improve the efficiency and accuracy of clinical diagnosis.
Drawings
FIG. 1 is a network framework diagram of the present invention.
FIG. 2 is a schematic diagram of the detection and diagnosis effect of the invention after a narrow-band imaging endoscopic image is input into the network model: (a) the narrow-band imaging endoscopic image; (b) the result of detecting and classifying cancer foci in the image with the method; (c) the result of detecting and classifying cancer foci in the image by experienced doctors.
FIG. 3 is a comparison of the visualized detection and diagnosis of the invention and of a doctor on a narrow-band imaging endoscopic image.
FIG. 4 is a comparison of the recall of the invention and of a doctor for the different classifications of detection and diagnosis on narrow-band imaging endoscopic images.
FIG. 5 shows feature maps produced by the feature extraction network.
Detailed Description
The embodiments of the present invention are described in detail below, but the scope of the present invention is not limited to the examples.
The invention adopts the network framework shown in FIG. 1 and is trained on 144 narrow-band imaging endoscopic images annotated cooperatively by several highly experienced doctors, yielding a model that automatically detects and diagnoses esophageal squamous cell carcinoma foci on narrow-band imaging endoscopic images. The specific process comprises the following steps:
(1) Before training, the network parameters of the ResNet-50 model are randomly initialized, and the images in the training set are scaled so that their resolution does not exceed 800×1333, the corresponding bounding boxes being scaled in the same proportion.
(2) During training, the three channels (R, G, B) of each image are first normalized with mean [0.485, 0.456, 0.406] and standard deviation [0.229, 0.224, 0.225]; the Adam optimization algorithm [16] is used with an initial learning rate of 10⁻⁴, β1 = 0.9, β2 = 0.999 and weight decay 0; a mini-batch stochastic gradient descent strategy with batch size 8 is used to minimize the loss function, and training runs for N rounds. Because the vessel types in the training set are unevenly distributed, types B2 and B3 would otherwise be insufficiently trained, so Focal loss is used as the loss function of the cancer-focus classification network, with the weights of the negative sample and of types A, B1, B2 and B3 set to C1, C2, C3, C4 and C5, respectively, determined from the distribution of each vessel type in the training set after several experiments.
(3) During testing, the narrow-band imaging endoscopic image is scaled so that its resolution does not exceed 800×1333 and input into the trained model, which outputs the bounding boxes of all detected vessels, the corresponding cancer-focus categories (the normal category A and the abnormal categories B1, B2 and B3, four in total) and the confidence p of each category. Since a narrow-band imaging endoscopic image contains many vessels, the upper limit on the number of detection boxes per image is set to 250. The threshold T1 is set to 0.3: when p > 0.3 the bounding box is retained, otherwise it is removed. The threshold T2 is set to 0.3, and non-maximum suppression is applied to the remaining bounding boxes, keeping within each neighborhood (intersection-over-union greater than T2) only the box with the highest confidence p.
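This test-time filtering can be sketched as follows, assuming the model's raw boxes, labels and confidences p are already available as tensors:

```python
import torch
from torchvision.ops import nms

MAX_BOXES, T1, T2 = 250, 0.3, 0.3  # limits and thresholds used in this embodiment

def postprocess(boxes, labels, scores):
    # boxes: (N, 4) xyxy; labels: (N,) categories; scores: (N,) confidences p
    order = scores.argsort(descending=True)[:MAX_BOXES]  # at most 250 boxes per image
    boxes, labels, scores = boxes[order], labels[order], scores[order]
    keep = scores > T1                                   # retain only boxes with p > 0.3
    boxes, labels, scores = boxes[keep], labels[keep], scores[keep]
    keep = nms(boxes, scores, iou_threshold=T2)          # keep highest-p box per overlap group
    return boxes[keep], labels[keep], scores[keep]
```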
FIG. 2 illustrates the detection and diagnosis effect of the invention after a narrow-band imaging endoscopic image is input into the network model: (a) the original narrow-band imaging endoscopic image; (b) the bounding boxes obtained by detecting cancer foci in the image, with the corresponding classifications and confidences, where different colors represent the different cancer-focus types, i.e., green, red, purple and white represent A, B1, B2 and B3, respectively; (c) the consensus result of cancer-focus detection and classification by several doctors with many years of clinical practice and rich experience. The figure shows that the system's results are essentially consistent with the joint judgment of several experienced doctors in detecting and classifying cancer foci, demonstrating the strong practical value of the invention.
FIG. 3 compares the visualized detection and diagnosis of the invention with that of a single doctor on narrow-band imaging endoscopic images, where the reference standard for detection and diagnosis was annotated cooperatively by several highly experienced doctors. A single doctor inevitably makes mistakes and omissions and cannot reach high sensitivity, whereas the system of the invention is not only faster (under 1 second per image) but also more accurate than a single doctor.
FIG. 4 compares the per-category recall of the invention with that of a single doctor on narrow-band imaging endoscopic images, using the same cooperatively annotated reference standard. The overall recall of the invention is much higher than that of a single doctor; since recall here is the rate at which real cancer foci are detected and correctly classified, the invention misses far fewer foci than a single doctor.
FIG. 5 shows feature maps produced by the feature extraction network. After feature extraction, the feature values of vessels and non-vessels differ greatly, showing that the feature extraction network effectively extracts the key features for detection and diagnosis from a narrow-band imaging endoscopic image.
Tables 1 and 2 analyze the sensitivity, precision and recall of the invention and of a single doctor on narrow-band imaging endoscopic images. Table 1 gives the performance of the network with K = 4 (i.e., classification fuses the features of 4 neighbors); Table 2 gives the detection and diagnosis results of a single doctor. The reference standard for detection and diagnosis was annotated by several highly experienced doctors. In recall, the invention exceeds the detection and diagnosis level of a single doctor, demonstrating its clinical value.
TABLE 1
Type | TP | FP | FN | Sensitivity | Precision | Recall
---|---|---|---|---|---|---
A | 169 | 267 | 53 | 0.761 | 0.388 | 0.669
B1 | 3248 | 489 | 249 | 0.929 | 0.869 | 0.916
B2 | 98 | 40 | 70 | 0.583 | 0.710 | 0.466
B3 | 20 | 22 | 5 | 0.800 | 0.476 | 0.500
Overall | 3535 | 818 | 377 | 0.904 | 0.812 | 0.884
TABLE 2
Type | TP | FP | FN | Sensitivity | Precision | Recall
---|---|---|---|---|---|---
A | - | - | - | - | - | 0.50
B1 | - | - | - | - | - | 0.70
B2 | - | - | - | - | - | 0.93
B3 | - | - | - | - | - | 1.00
Overall | - | - | - | - | - | 0.67
Reference to the literature
[1] Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 1097-1105 (2012).
[2] Russakovsky, O., Deng, J., Su, H., et al. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision 115, 211-252 (2015).
[3] Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. International Conference on Learning Representations (2015).
[4] He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. IEEE Conference on Computer Vision and Pattern Recognition, 770-778 (2016).
[5] Girshick, R., Donahue, J., Darrell, T. & Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. IEEE Conference on Computer Vision and Pattern Recognition, 580-587 (2014).
[6] Girshick, R. Fast R-CNN. IEEE International Conference on Computer Vision, 1440-1448 (2015).
[7] Ren, S., He, K., Girshick, R. & Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. Neural Information Processing Systems (2015).
[8] Long, J., Shelhamer, E. & Darrell, T. Fully convolutional networks for semantic segmentation. IEEE Conference on Computer Vision and Pattern Recognition, 3431-3440 (2015).
[9] Chen, L., Papandreou, G., Kokkinos, I., Murphy, K. & Yuille, A. L. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence 40, 834-848 (2018).
[10] Ervik, M., L, F., Ferlay, J., et al. Cancer Today. Lyon, France: International Agency for Research on Cancer [EB/OL]. [2017-02-26].
[11] Chen Wangqing, Zheng Rongshan, Zhang Wei, et al. Analysis of the morbidity and mortality of malignant tumors in China, 2013 [J]. China Cancer, 2017, 26(1): 1-7.
[12] Lin, T.-Y., Dollár, P., Girshick, R. B., He, K., Hariharan, B. & Belongie, S. J. Feature pyramid networks for object detection. IEEE Conference on Computer Vision and Pattern Recognition, 936-944 (2017).
[13] Zeng, H., Zheng, R., Guo, Y., et al. Cancer survival in China, 2003-2005: a population-based study [J]. Int J Cancer, 2015, 136(8).
[14] Chen, W. Q., Zheng, R. S., Baade, P. D., et al. Cancer statistics in China, 2015 [J]. CA Cancer J Clin, 2016, 66(2): 115-132.
[15] Inoue, H., Kaga, M., Ikeda, H., et al. Magnification endoscopy in esophageal squamous cell carcinoma: a review of the intrapapillary capillary loop classification [J]. Ann Gastroenterol, 2015, 28(1): 41-48.
[16] Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. ICLR (Poster) (2015).
Claims (3)
1. A depth detection system with a self-embedded cluster-distribution prior for quantifying the morphological distribution of esophageal mucosa IPCLs blood vessels, characterized by comprising: a feature extraction backbone network, a feature pyramid network, a region candidate network, a cancer-focus classification network with region-of-interest pooling and a self-embedded cluster-distribution prior, and an auxiliary diagnosis system for visualization on narrow-band imaging endoscopic images; wherein:
(1) the feature extraction backbone network is built on ResNet-50 and comprises 50 convolutional layers for extracting feature maps of the input image; specifically, feature maps are extracted at the ends of stages 1, 2, 3 and 4 of the ResNet-50 model, the extracted feature maps having 256, 512, 1024 and 2048 channels and sizes of 1/4, 1/8, 1/16 and 1/32 of the original image, respectively; the feature maps are fed into the feature pyramid network;
(2) the feature pyramid network fuses features of different scales: all feature maps are first unified to 256 channels with 1×1 convolutions; then, from top to bottom, the upper-layer features are upsampled to twice their size layer by layer, added to the lower-layer features, and passed through a 3×3 convolution, yielding a multi-scale feature map with sizes of 1/4, 1/8, 1/16 and 1/32 of the original image, each with 256 channels;
(3) the region candidate network extracts possible lesion regions: first, an anchor generator produces dense rectangular candidate boxes; the candidate boxes come in 5×3 different configurations, combining five sizes with three shapes; the features of each pyramid level pass through a 3×3 convolution and a 1×1 convolution, and Softmax judges whether each candidate box is a positive or a negative sample; finally, a 1×1 convolution with 12 channels performs bounding-box regression for the three shapes, correcting inaccurate candidate boxes;
(4) in the cancer-focus classification network with region-of-interest pooling and a self-embedded cluster-distribution prior, region-of-interest pooling pools the features of suspicious lesion regions, and the classification network with the self-embedded cluster-distribution prior classifies the cancer foci; specifically, each region of interest is framed with a rectangular bounding box parallel to the coordinate axes and given a cancer-focus classification: a normal type-A region or a lesion region of type B1, B2 or B3; the network first extracts regions of interest from different levels of the feature pyramid, aligns them, and max-pools them to 7×7, so that each region of interest corresponds to a feature of size 256×7×7; the features of each region of interest are then stacked with the features of its K nearest neighbors into a feature map of shape (256×K)×7×7, which lets the classification network exploit the latent distribution prior of the cancer foci; two output branches are then produced through fully connected layers: the first branch outputs the position offset of each feature region, further correcting the position of the detection box; the second branch computes the classification probabilities of the features through a Softmax function, yielding the cancer-focus category of the region; the fully connected layer flattens the (256×K)×7×7 feature map into a (12544×K)×1×1 feature and outputs 1024 channels; the first branch outputs 20 channels, i.e., four bounding-box coordinates for each of the 5 categories (5×4 = 20), and the second branch outputs 5 channels, i.e., 5 categories including the negative sample;
(5) the auxiliary diagnosis system for visualization on narrow-band imaging endoscopic images displays the narrow-band imaging endoscopic image and frames and marks cancer foci in different colors; specifically, the input is a narrow-band imaging endoscopic image; the network detects and diagnoses the cancer foci, and detection boxes of different colors represent the different cancer-focus types, namely green, red, purple and black for A, B1, B2 and B3, respectively, each box labeled with its classification confidence; the confidences of all detection boxes are then screened: every detection box with confidence below a threshold T1 is removed, after which non-maximum suppression removes redundant overlapping boxes whose intersection-over-union exceeds a threshold T2; T1 and T2 are swept over all values in [0, 1] with a step of 0.05, and the optimal thresholds T1, T2 are determined by comparing F1-scores.
2. The depth detection system of claim 1, wherein the network model is trained as follows:
before training, the network parameters of the ResNet-50 model are randomly initialized, and the images in the training set are scaled so that their resolution does not exceed 800×1333, the corresponding bounding boxes being scaled in the same proportion;
during training, the three channels R, G, B of each image are first normalized with mean = [0.485, 0.456, 0.406] and standard deviation = [0.229, 0.224, 0.225]; the Adam optimization algorithm is used with an initial learning rate of 10⁻⁴, the two exponential decay rates for the moment estimates set to β1 = 0.9 and β2 = 0.999, and weight decay 0; a mini-batch stochastic gradient descent strategy with batch size 8 is used to minimize the loss function, and training runs for N rounds; because the vessel types in the training set are unevenly distributed, types B2 and B3 would otherwise be insufficiently trained, so Focal loss is used as the loss function of the cancer-focus classification network, with the weights of the negative sample and of types A, B1, B2 and B3 set to C1, C2, C3, C4 and C5, respectively, determined from the distribution of each vessel type in the training set after several experiments.
3. The depth detection system of claim 2, wherein a narrow-band imaging endoscopic image input into the trained network is propagated forward once to obtain the cancer-focus detection and diagnosis results.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011263459.2A CN112419246B (en) | 2020-11-12 | 2020-11-12 | Depth detection network for quantifying esophageal mucosa IPCLs blood vessel morphological distribution |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011263459.2A CN112419246B (en) | 2020-11-12 | 2020-11-12 | Depth detection network for quantifying esophageal mucosa IPCLs blood vessel morphological distribution |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112419246A CN112419246A (en) | 2021-02-26 |
CN112419246B (en) | 2022-07-22
Family
ID=74831021
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011263459.2A Active CN112419246B (en) | 2020-11-12 | 2020-11-12 | Depth detection network for quantifying esophageal mucosa IPCLs blood vessel morphological distribution |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112419246B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113643291B (en) * | 2021-10-14 | 2021-12-24 | 武汉大学 | Method and device for determining esophagus marker infiltration depth grade and readable storage medium |
CN113706533B (en) * | 2021-10-28 | 2022-02-08 | 武汉大学 | Image processing method, image processing device, computer equipment and storage medium |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109118485A (en) * | 2018-08-13 | 2019-01-01 | 复旦大学 | Digestive endoscope image classification based on multitask neural network cancer detection system early |
CN111784671B (en) * | 2020-06-30 | 2022-07-05 | 天津大学 | Pathological image focus region detection method based on multi-scale deep learning |
- 2020-11-12: CN application CN202011263459.2A granted as patent CN112419246B (status: Active)
Also Published As
Publication number | Publication date |
---|---|
CN112419246A (en) | 2021-02-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112102256B (en) | Narrow-band endoscopic image-oriented cancer focus detection and diagnosis system for early esophageal squamous carcinoma | |
Li et al. | A large-scale database and a CNN model for attention-based glaucoma detection | |
CN111985536B (en) | Gastroscopic pathology image classification method based on weakly supervised learning | |
EP2685881B1 (en) | Medical instrument for examining the cervix | |
Roth et al. | A new 2.5 D representation for lymph node detection using random sets of deep convolutional neural network observations | |
CN109858540B (en) | Medical image recognition system and method based on multi-mode fusion | |
CN110288597B (en) | Attention mechanism-based wireless capsule endoscope video saliency detection method | |
CN111899229A (en) | Advanced gastric cancer auxiliary diagnosis method based on deep learning multi-model fusion technology | |
CN108257135A (en) | Auxiliary diagnosis system for understanding medical image features based on deep learning | |
CN110276356A (en) | Fundus image aneurysm recognition method based on R-CNN | |
CN104299242B (en) | Fluoroscopic visualization eye fundus image extracting method based on NGC ACM | |
CN112419246B (en) | Depth detection network for quantifying esophageal mucosa IPCLs blood vessel morphological distribution | |
CN112419248B (en) | Ear sclerosis focus detection and diagnosis system based on small target detection neural network | |
CN114782307A (en) | Enhanced CT image colorectal cancer staging auxiliary diagnosis system based on deep learning | |
CN112102332A (en) | Cancer WSI segmentation method based on local classification neural network | |
CN112270667B (en) | TI-RADS-based integrated deep learning multi-label recognition method | |
US20230005140A1 (en) | Automated detection of tumors based on image processing | |
Sun et al. | A novel gastric ulcer differentiation system using convolutional neural networks | |
Lei et al. | Automated detection of retinopathy of prematurity by deep attention network | |
Yue et al. | Automatic acetowhite lesion segmentation via specular reflection removal and deep attention network | |
CN114398979A (en) | Ultrasonic image thyroid nodule classification method based on feature decoupling | |
CN115019049A (en) | Bone imaging bone lesion segmentation method, system and equipment based on deep neural network | |
Vallée et al. | Accurate small bowel lesions detection in wireless capsule endoscopy images using deep recurrent attention neural network | |
CN112634291A (en) | Automatic burn wound area segmentation method based on neural network | |
Oliver et al. | Automatic diagnosis of masses by using level set segmentation and shape description |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |