CN112419246A - Depth detection network for quantifying esophageal mucosa IPCLs blood vessel morphological distribution - Google Patents

Info

Publication number
CN112419246A (application CN202011263459.2A)
Authority
CN
China
Prior art keywords
network
channels
region
cancer
detection
Prior art date
Legal status
Granted
Application number
CN202011263459.2A
Other languages
Chinese (zh)
Other versions
CN112419246B (en)
Inventor
钟芸诗
颜波
蔡世伦
谭伟敏
王沛晟
李吉春
阿依木克地斯·亚力孔
Current Assignee
Fudan University
Original Assignee
Fudan University
Priority date
Filing date
Publication date
Application filed by Fudan University
Priority to CN202011263459.2A
Publication of CN112419246A
Application granted
Publication of CN112419246B
Legal status: Active
Anticipated expiration

Classifications

    • G06T 7/0012 Biomedical image inspection (under G06T 7/00 Image analysis)
    • G06N 3/045 Combinations of networks (under G06N 3/04 Neural network architectures)
    • G06N 3/08 Learning methods
    • G06T 3/40 Scaling the whole image or part thereof
    • G06T 2207/10068 Endoscopic image (image acquisition modality)
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20104 Interactive definition of region of interest [ROI]
    • G06T 2207/30096 Tumor; Lesion

Abstract

The invention belongs to the technical field of medical image processing, and specifically relates to a depth detection network for quantifying the morphological distribution of esophageal mucosa IPCLs (intrapapillary capillary loops). The network comprises a feature extraction network, a feature pyramid, a region proposal network, a cancer focus classification network with region-of-interest pooling and a self-embedded clustered-distribution prior, and a system for visualization on narrow-band imaging endoscope images. The feature extraction network extracts feature maps from the input image; the feature pyramid fuses features of different scales; the region proposal network proposes possible lesion regions; region-of-interest pooling pools the features of suspicious lesion regions; the classification network with the self-embedded clustered-distribution prior classifies the cancer foci; finally, the results are visualized on the narrow-band imaging endoscope image, with cancer foci framed and marked in different colors. The invention can detect and diagnose cancer foci of early esophageal squamous carcinoma in the image, effectively improve diagnostic efficiency, and help doctors reach higher diagnostic accuracy.

Description

Depth detection network for quantifying esophageal mucosa IPCLs blood vessel morphological distribution
Technical Field
The invention belongs to the technical field of medical image processing, and specifically relates to a depth detection network for quantifying the morphological distribution of esophageal mucosa IPCLs blood vessels.
Background
Esophageal cancer and gastric cancer are common upper gastrointestinal malignant tumors in developing countries such as China; new cases in China account for more than 40% of new cases worldwide, and both morbidity and mortality are markedly higher than the world average[10]. According to the latest statistics of the Chinese tumor registration center, new cases of esophageal cancer and gastric cancer rank sixth and second, respectively, among malignant tumors. The prognosis of both cancers is poor, with 5-year relative survival rates of 20.9% and 27.4% respectively, placing a heavy burden on health care[11,13-14]. Standardized upper gastrointestinal cancer screening, treatment and follow-up are effective means of reducing cancer incidence and mortality, and narrow-band imaging (NBI) endoscopic screening is the first-line means of finding upper gastrointestinal cancer. The pathological type and infiltration depth of esophageal mucosal lesions under a narrow-band imaging endoscope are judged mainly from the characteristic morphology of the intrapapillary capillary loops (IPCLs) in the epithelial papillae.
According to the typing standard proposed by Inoue and Arima[15], these vessels can generally be classified into types A, B1, B2 and B3. Type A means that no abnormal blood vessels are observed. Type B1 means that looped abnormal vessels are observed, showing dilation, a serpentine shape, varying caliber and non-uniform form, with a diameter of 20-30 μm and an infiltration depth of the M1-M2 layers. Type B2 means that non-looped abnormal vessels of irregular tree-like or multiple form are observed, with an infiltration depth of the M3-SM1 layers. Type B3 means that large green vessels are observed, highly dilated, with an infiltration depth of the SM2 layer.
The type, number and distribution of IPCL vessels play an important guiding role in clinical treatment decisions. For example, a large aggregation of IPCLs with a deep infiltration depth suggests that the esophageal lesion has entered the middle or late stage and is unsuitable for minimally invasive or even surgical treatment; conversely, if the IPCLs with a deeper infiltration depth are scattered, the patient may still have an opportunity for surgery.
Clinically, observation of IPCLs is strongly affected by subjective human factors because, unlike conventional gastrointestinal endoscopic imaging, observing IPCLs requires magnifying the lesion surface 10-50 times with a magnifying gastroscope in NBI mode. As with a microscope, the doctor obtains images containing close to 200 fine structures per field of view in this magnified mode. Under such conditions a clinician who must inspect every structure easily develops visual fatigue and, with limited clinical experience, after observing 5-10 fields of view tends to remember only the most striking parts (in line with Murphy's law), lacks an objective and quantifiable picture, and can easily misjudge the condition and make erroneous medical decisions.
This work frees clinicians from such subjective factors (fatigue, carelessness and insufficient experience caused by large amounts of fine observation): they only need to magnify the lesion, and computer analysis then yields IPCL predictions for all fields of view, including the number, proportion and aggregation of each vessel type, helping clinicians judge the lesion more accurately.
The deep convolutional neural network is a machine learning technique that can effectively avoid human factors by automatically learning how to extract rich, representative visual features from large amounts of annotated data. Using the back-propagation optimization algorithm, the machine updates its internal parameters and learns the mapping from an input image to its label. In recent years, deep convolutional neural networks have greatly improved performance on computer vision tasks.
In 2012, Krizhevsky et al.[1] were the first to apply a deep convolutional neural network in the ImageNet[2] image classification competition, winning with a Top-5 error rate of 15.3% and setting off the current wave of deep learning. In 2015, Simonyan et al.[3] proposed the 16-layer and 19-layer networks VGG-16 and VGG-19, increasing the number of network parameters and further improving the ImageNet classification results. In 2016, He et al.[4] used the 152-layer residual network ResNet to achieve classification performance exceeding that of the human eye.
Deep convolutional neural networks not only perform excellently in image classification, but also achieve equally strong results on structured-output tasks such as object detection[5-7] and semantic segmentation[8,9]. Applying deep convolutional neural networks to computer-aided diagnosis can therefore help doctors make better medical diagnoses, enabling early discovery and early treatment and improving therapeutic outcomes.
The invention provides a detection network with a self-embedded clustered-distribution prior, which fully exploits the latent clustered distribution of cancer foci, extracts rich features, and simultaneously performs cancer focus detection and diagnosis for early esophageal squamous carcinoma.
Disclosure of Invention
The invention aims to provide a depth detection network with a self-embedded clustered-distribution prior for quantifying the morphological distribution of esophageal mucosa IPCLs blood vessels, which removes the influence of human factors and realizes automatic diagnosis of narrow-band imaging endoscopic images.
The detection network with a self-embedded clustered-distribution prior provided by the invention is based on an object detection neural network and specifically comprises: a feature extraction backbone network, a feature pyramid network, a region proposal network, a cancer focus classification network with region-of-interest pooling and a self-embedded clustered-distribution prior, and an auxiliary diagnosis system for visualization on narrow-band imaging endoscope images; wherein:
(1) The feature extraction backbone network is built on ResNet-50[4] and contains 50 convolutional layers; it extracts the feature maps of the input image (i.e. it serves as the feature extractor of the feature pyramid). Specifically, feature maps are taken at the end of layers 1, 2, 3 and 4 of the ResNet-50 model; they have 256, 512, 1024 and 2048 channels respectively, and their sizes are 1/4, 1/8, 1/16 and 1/32 of the original image. These feature maps are fed into the feature pyramid network[12];
(2) The feature pyramid network fuses features of different scales. All feature maps are first unified to 256 channels with 1 × 1 convolutions; then, from top to bottom, the upper-level features are up-sampled to twice their size layer by layer, added to the lower-level features, and passed through a 3 × 3 convolution. This yields a multi-scale feature map whose levels are 1/4, 1/8, 1/16 and 1/32 of the original image size, each with 256 channels (together with the proposal head of paragraph (3), this step is sketched in code after paragraph (5));
(3) The region proposal network extracts possible lesion regions. An anchor generator[5] first produces dense rectangular candidate boxes; these come in 5 × 3 different sizes, combining five widths (such as 32, 64, 128, 256 and 512) with three aspect ratios (such as 1:1, 1:2 and 2:1). The features of each pyramid level pass through a 3 × 3 convolution followed by 1 × 1 convolutions, and Softmax judges whether each candidate box is a positive or a negative sample; finally, bounding-box regression for the three shapes is performed through a 1 × 1 convolution with 12 output channels (each box has 4 coordinates, so 3 × 4 = 12 channels), correcting inaccurate candidate boxes;
(4) The cancer focus classification network with region-of-interest pooling and a self-embedded clustered-distribution prior: region-of-interest pooling pools the features of suspicious lesion regions, and the classification network with the self-embedded clustered-distribution prior classifies the cancer foci. Specifically, each region of interest is framed with a rectangular bounding box parallel to the coordinate axes, and the cancer focus classification result of that region is given, i.e. a normal (type A) region or a lesion region (types B1, B2, B3). The network first extracts regions of interest from the different levels of the feature pyramid, aligns them, and pools each to at most 7 × 7, so that each region of interest corresponds to a feature of size 256 × 7 × 7. The feature of each region of interest is then concatenated along the channel dimension with the features of its K nearest neighbours, giving a feature map of shape (256 × K) × 7 × 7, so that the classification network exploits the latent clustered distribution prior of cancer foci (a sketch of this head follows paragraph (5)). Two output branches are then produced through fully connected layers: the first branch outputs the position offset of each feature region, further correcting the position of the detection box; the second branch computes the classification probabilities through a Softmax function, giving the cancer focus class of the region. The fully connected layer flattens the (256 × K) × 7 × 7 feature map into a (12544 × K) × 1 × 1 feature and outputs 1024 channels; the first branch outputs 20 channels, i.e. 5 × 4, four bounding-box coordinates for each class, and the second branch outputs 5 channels, i.e. 5 classes including the negative sample;
(5) The auxiliary diagnosis system for visualization on the narrow-band imaging endoscope image displays the final result on the narrow-band imaging endoscope image and marks the cancer foci with boxes of different colors. Specifically, the input is a narrow-band imaging endoscope image; the network detects and diagnoses the cancer foci, and detection boxes of different colors represent the different cancer focus types, i.e. green, red, purple and black represent types A, B1, B2 and B3 respectively, each box being annotated with its classification confidence. The confidences of all detection boxes are then screened: boxes with confidence below a threshold T1 are removed, and non-maximum suppression eliminates redundant overlapping boxes whose intersection-over-union exceeds a threshold T2. T1 and T2 take all values in [0, 1] with a step of 0.05, and the optimal thresholds are determined by comparing F1 scores (this search is sketched below).
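To make paragraphs (2) and (3) above concrete, the following is a minimal PyTorch sketch of the pyramid fusion and of the proposal head. It is not the patented implementation itself; the module layout, variable names and the nearest-neighbour upsampling mode are illustrative assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F

# Backbone levels C2..C5 with 256/512/1024/2048 channels are fused into a
# 256-channel pyramid as in paragraph (2): 1x1 lateral convolutions, top-down
# 2x upsampling with addition, then a 3x3 smoothing convolution per level.
# Assumes each level is exactly twice the spatial size of the next.
class SimpleFPN(nn.Module):
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList([nn.Conv2d(c, out_channels, 1) for c in in_channels])
        self.smooth = nn.ModuleList([nn.Conv2d(out_channels, out_channels, 3, padding=1)
                                     for _ in in_channels])

    def forward(self, feats):                  # feats: [C2, C3, C4, C5], strides 4/8/16/32
        laterals = [lat(f) for lat, f in zip(self.lateral, feats)]
        for i in range(len(laterals) - 1, 0, -1):              # top-down pathway
            laterals[i - 1] = laterals[i - 1] + F.interpolate(
                laterals[i], scale_factor=2, mode="nearest")
        return [s(p) for s, p in zip(self.smooth, laterals)]   # P2..P5, 256 channels each

# Proposal head as in paragraph (3): a shared 3x3 convolution, a 1x1 convolution
# giving a 2-way (positive/negative) score per anchor, and a 1x1 convolution with
# 3 * 4 = 12 channels for the bounding-box regression of the three anchor shapes
# (ratios 1:1, 1:2, 2:1; the five widths 32..512 are assigned across pyramid levels).
class SimpleRPNHead(nn.Module):
    def __init__(self, channels=256, num_anchors=3):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.cls = nn.Conv2d(channels, num_anchors * 2, 1)     # Softmax over pos/neg
        self.reg = nn.Conv2d(channels, num_anchors * 4, 1)     # 12 regression channels

    def forward(self, pyramid):
        outputs = []
        for p in pyramid:
            h = F.relu(self.conv(p))
            outputs.append((self.cls(h), self.reg(h)))
        return outputs
```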
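The cluster-distribution-prior head of paragraph (4) can be sketched in the same way. How the K nearest neighbours of a region of interest are chosen is not fixed by the text above; the sketch assumes they are the K candidate boxes with the smallest centre-to-centre distance (the region itself included), which reproduces the (256 × K) × 7 × 7 shape.

```python
import torch
import torch.nn as nn

# Sketch under assumptions: neighbours = the K boxes with the closest centres,
# the box itself included, so K features of 256x7x7 are concatenated per RoI.
class ClusterPriorHead(nn.Module):
    def __init__(self, k=4, channels=256, pool=7, num_classes=5):
        super().__init__()
        self.k = k
        flat = channels * k * pool * pool                     # (256*K)*7*7 = 12544*K
        self.fc = nn.Sequential(nn.Flatten(), nn.Linear(flat, 1024), nn.ReLU())
        self.bbox_branch = nn.Linear(1024, num_classes * 4)   # 20 channels: per-class box offsets
        self.cls_branch = nn.Linear(1024, num_classes)        # 5 channels incl. negative sample

    def forward(self, roi_feats, boxes):
        # roi_feats: [N, 256, 7, 7] RoI-aligned features; boxes: [N, 4] as (x1, y1, x2, y2); N >= K
        centres = torch.stack([(boxes[:, 0] + boxes[:, 2]) / 2,
                               (boxes[:, 1] + boxes[:, 3]) / 2], dim=1)
        dist = torch.cdist(centres, centres)                   # [N, N] pairwise centre distances
        idx = dist.topk(self.k, largest=False).indices         # each box plus its K-1 neighbours
        grouped = roi_feats[idx]                               # [N, K, 256, 7, 7]
        fused = grouped.flatten(1, 2)                          # [N, K*256, 7, 7]
        hidden = self.fc(fused)
        return self.bbox_branch(hidden), self.cls_branch(hidden)   # offsets, class logits
```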
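The threshold search at the end of paragraph (5) amounts to a simple grid search; a sketch follows, where evaluate_f1 stands for the user's own routine that matches detections to annotations and returns the F1 score. That routine is an assumed placeholder, not something defined in the patent.

```python
import numpy as np

# Grid search over the confidence threshold T1 and the NMS IoU threshold T2:
# every value in [0, 1] with step 0.05 is tried and the pair with the best
# validation F1 score is kept. `evaluate_f1` is an assumed placeholder.
def search_thresholds(predictions, ground_truth, evaluate_f1):
    best_t1, best_t2, best_f1 = 0.0, 0.0, -1.0
    for t1 in np.arange(0.0, 1.0 + 1e-9, 0.05):          # candidate confidence cut-offs
        for t2 in np.arange(0.0, 1.0 + 1e-9, 0.05):      # candidate IoU thresholds
            f1 = evaluate_f1(predictions, ground_truth, conf_thresh=t1, iou_thresh=t2)
            if f1 > best_f1:
                best_t1, best_t2, best_f1 = float(t1), float(t2), f1
    return best_t1, best_t2, best_f1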
The training method of the network model comprises the following steps:
before training, network parameters of the ResNet-50 model are initialized randomly, images in a training set are scaled, the resolution of the images is not more than 800 x 1333, and corresponding bounding boxes are scaled in the same proportion.
During training, the three channels (R, G, B) of each image are first normalized with mean [0.485, 0.456, 0.406] and standard deviation [0.229, 0.224, 0.225]. The Adam optimization algorithm[16] is used with an initial learning rate of 10⁻⁴; the two exponential decay rates for the moment estimates are set to β1 = 0.9 and β2 = 0.999, the weight decay is 0, and a mini-batch stochastic gradient descent strategy with a batch size of 8 is used to minimize the loss function. Training runs for N rounds. Because the vessel types are unevenly distributed in the training set, types B2 and B3 would otherwise not be sufficiently trained, so Focal loss is used as the loss function of the cancer focus classification network; the weights of the negative sample and of types A, B1, B2 and B3 are C1, C2, C3, C4 and C5 respectively, determined after several experiments according to the distribution of each vessel type in the training set.
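A minimal sketch of this training set-up follows. The model and dataset are passed in by the caller, and the focal-loss focusing parameter gamma is not stated above, so the common default of 2 is assumed.

```python
import torch
import torch.nn.functional as F
import torchvision.transforms as T

# Per-channel (R, G, B) normalization with the mean and standard deviation above.
normalize = T.Normalize(mean=[0.485, 0.456, 0.406],
                        std=[0.229, 0.224, 0.225])

def configure_training(model, train_set):
    """Adam with lr 1e-4, betas (0.9, 0.999), weight decay 0, mini-batches of 8."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4,
                                 betas=(0.9, 0.999), weight_decay=0.0)
    loader = torch.utils.data.DataLoader(train_set, batch_size=8, shuffle=True)
    return optimizer, loader

def focal_loss(logits, targets, alpha, gamma=2.0):
    """Class-weighted focal loss; `alpha` holds the weights C1..C5 of the
    negative, A, B1, B2 and B3 classes; gamma = 2 is an assumed default."""
    ce = F.cross_entropy(logits, targets, reduction="none")   # -log p of the true class
    pt = torch.exp(-ce)                                       # probability of the true class
    return (alpha[targets] * (1.0 - pt) ** gamma * ce).mean()
```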
In the invention, after a narrow-band imaging endoscope image is input, the cancer focus detection and diagnosis result is obtained with only a single forward pass.
The invention has the beneficial effects that:
the invention designs a cluster distribution prior self-embedded detection network, which takes a narrow-band imaging endoscope image as input and simultaneously realizes the cancer focus detection and diagnosis of early esophageal squamous cell carcinoma. The image to be tested can obtain detection and diagnosis results only through one-time forward propagation, partial network parameters are shared by detection and classification tasks, the calculated amount is effectively reduced, and the diagnosis efficiency is improved. Experimental results show that the invention can accurately detect the cancer focus area of early esophageal squamous carcinoma, provides accurate diagnosis results based on the detection frame, reduces the influence of human factors and improves the efficiency and accuracy of clinical diagnosis.
Drawings
FIG. 1 is a network framework diagram of the present invention.
FIG. 2 is a schematic diagram of the detection and diagnosis results after a narrow-band imaging endoscope image is input into the network model: (a) the narrow-band imaging endoscope image; (b) the result of detecting and classifying the cancer foci in the image with the method; (c) the result of detecting and classifying the cancer foci in the image by experienced doctors.
Fig. 3 compares the visualized detection and diagnosis results of the invention with those of a doctor on a narrow-band imaging endoscopic image.
Fig. 4 compares the per-class recall of the invention with that of a doctor's detection and diagnosis on narrow-band imaging endoscopic images.
Fig. 5 shows the feature maps obtained after feature extraction by the feature extraction network of the invention.
Detailed Description
The embodiments of the present invention are described in detail below, but the scope of the present invention is not limited to the examples.
The invention adopts the network framework shown in Fig. 1 and is trained on 144 narrow-band imaging endoscope images jointly annotated by several senior doctors, yielding a model that automatically detects and diagnoses esophageal squamous cell carcinoma foci on narrow-band imaging endoscope images. The specific process is as follows:
(1) Before training, the network parameters of the ResNet-50 model are randomly initialized, and the images in the training set are scaled so that their resolution does not exceed 800 × 1333, with the corresponding bounding boxes scaled by the same ratio.
(2) During training, the three channels (R, G, B) of each image are first normalized with mean [0.485, 0.456, 0.406] and standard deviation [0.229, 0.224, 0.225]; the Adam optimization algorithm[16] is used with an initial learning rate of 10⁻⁴, the two exponential decay rates β1 = 0.9 and β2 = 0.999, and a weight decay of 0, together with a mini-batch stochastic gradient descent strategy with a batch size of 8 to minimize the loss function; training runs for N rounds; because the vessel types are unevenly distributed in the training set and types B2 and B3 would otherwise not be sufficiently trained, Focal loss is used as the loss function of the cancer focus classification network, with the weights of the negative sample and of types A, B1, B2 and B3 set to C1, C2, C3, C4 and C5 respectively, determined after several experiments according to the distribution of each vessel type in the training set.
(3) During testing, the narrow-band imaging endoscope image is scaled so that its resolution does not exceed 800 × 1333 and is input into the trained model, which outputs the outer bounding boxes of all detected vessels, their cancer focus classes (the normal type A and the abnormal types B1, B2 and B3, four classes in total) and the corresponding confidence p. Because a narrow-band imaging endoscope image contains many vessels, the upper limit on the number of detection boxes per image is set to 250. A threshold T1 = 0.3 is set: when p > 0.3 the outer bounding box is kept, otherwise it is removed. A threshold T2 = 0.3 is set: the remaining outer bounding boxes are processed with non-maximum suppression, keeping only the box with the highest confidence p within each neighbourhood (boxes whose intersection-over-union exceeds T2). A sketch of this post-processing follows.
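The filtering in step (3) can be sketched as below. It is an illustrative reconstruction rather than the patented implementation, and relies on torchvision's standard non-maximum suppression; the tensor names are assumptions.

```python
from torchvision.ops import nms

# Test-time filtering as in step (3): keep at most 250 boxes per image,
# drop detections with confidence p <= 0.3 (T1), then suppress overlapping
# boxes with intersection-over-union above 0.3 (T2), keeping the highest p.
def postprocess(boxes, scores, labels, conf_thresh=0.3, iou_thresh=0.3, max_det=250):
    order = scores.argsort(descending=True)[:max_det]       # cap at 250 detections
    boxes, scores, labels = boxes[order], scores[order], labels[order]
    keep = scores > conf_thresh                              # threshold T1
    boxes, scores, labels = boxes[keep], scores[keep], labels[keep]
    keep = nms(boxes, scores, iou_thresh)                    # threshold T2
    return boxes[keep], scores[keep], labels[keep]
```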
FIG. 2 illustrates the detection and diagnosis results after a narrow-band imaging endoscope image is input into the network model: (a) the original narrow-band imaging endoscope image; (b) the outer bounding boxes obtained by detecting the cancer foci in the image, with their classes and confidences, where the different colors represent the different cancer focus types, i.e. green, red, purple and white represent A, B1, B2 and B3 respectively; (c) the result of cancer focus detection and classification produced jointly, after discussion, by several doctors with many years of clinical practice and rich experience. The figure shows that the detections and classifications of the system are essentially consistent with the joint judgment of several experienced doctors, demonstrating the strong practical value of the invention.
Fig. 3 compares the visual results of the invention with the detection and diagnosis of a single doctor on a narrow-band imaging endoscope image, where the reference standard is jointly annotated by several senior doctors. A single doctor inevitably makes mistakes and omissions and cannot reach high sensitivity, whereas the system of the invention is not only faster (less than 1 second per image) but also more accurate than a single doctor.
FIG. 4 compares the per-class recall of the invention with that of a single doctor on narrow-band imaging endoscope images, where the reference standard is jointly annotated by several senior doctors. The recall of the invention is much higher than that of a single doctor; since recall here means the proportion of real cancer foci that are detected and correctly classified, the invention misses or misclassifies far fewer foci than a single doctor.
Fig. 5 shows the feature maps produced by the feature extraction network of the invention. After feature extraction, the feature values of vessels and non-vessels differ greatly, indicating that the feature extraction network can effectively extract the key features needed for detection and diagnosis from the narrow-band imaging endoscope image.
Tables 1 and 2 give the sensitivity, precision and recall of the invention and of a single doctor on narrow-band imaging endoscope images. Table 1 shows the performance of the network of the invention with K = 4 (i.e. the classification uses feature fusion over 4 neighbours); Table 2 shows the detection and diagnosis results of a single doctor. The reference standard is jointly annotated by several senior doctors. The invention surpasses the detection and diagnosis level of a single doctor in recall, demonstrating its clinical value.
TABLE 1
Type      TP     FP    FN    Sensitivity   Precision   Recall
A         169    267   53    0.761         0.388       0.669
B1        3248   489   249   0.929         0.869       0.916
B2        98     40    70    0.583         0.710       0.466
B3        20     22    5     0.800         0.476       0.500
Overall   3535   818   377   0.904         0.812       0.884
TABLE 2 (single doctor; only recall was reported)
Lesion type   Recall
A             0.50
B1            0.70
B2            0.93
B3            1.00
Overall       0.67
References
[1] Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 1097-1105 (2012).
[2] Russakovsky, O., Deng, J., Su, H. et al. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision 115, 211-252 (2015).
[3] Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. International Conference on Learning Representations (2014).
[4] He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. IEEE Conference on Computer Vision and Pattern Recognition, 770-778 (2016).
[5] Girshick, R., Donahue, J., Darrell, T. & Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. IEEE Conference on Computer Vision and Pattern Recognition, 580-587 (2014).
[6] Girshick, R. Fast R-CNN. IEEE International Conference on Computer Vision, 1440-1448 (2015).
[7] Ren, S., He, K., Girshick, R. & Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. Neural Information Processing Systems (2015).
[8] Long, J., Shelhamer, E. & Darrell, T. Fully convolutional networks for semantic segmentation. IEEE Conference on Computer Vision and Pattern Recognition, 3431-3440 (2015).
[9] Chen, L., Papandreou, G., Kokkinos, I., Murphy, K. & Yuille, A. L. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence 40, 834-848 (2018).
[10] Ervik, M., L, F., Ferlay, J. et al. Cancer Today. Lyon, France: International Agency for Research on Cancer [EB/OL]. [2017-02-26].
[11] Chen Wanqing, Zheng Rongshou, Zhang Wei, et al. Analysis of malignant tumor incidence and mortality in China, 2013. 2017, 26(1): 1-7.
[12] Lin, T.-Y., Dollár, P., Girshick, R. B., He, K., Hariharan, B. & Belongie, S. J. Feature pyramid networks for object detection. IEEE Conference on Computer Vision and Pattern Recognition, 936-944 (2017).
[13] Zeng, H., Zheng, R., Guo, Y. et al. Cancer survival in China, 2003-2005: a population-based study. Int J Cancer, 2015, 136(8).
[14] Chen, W. Q., Zheng, R. S., Baade, P. D. et al. Cancer statistics in China, 2015. CA Cancer J Clin, 2016, 66(2): 115-132.
[15] Inoue, H., Kaga, M., Ikeda, H. et al. Magnification endoscopy in esophageal squamous cell carcinoma: a review of the intrapapillary capillary loop classification. Ann Gastroenterol, 2015, 28(1): 41-48.
[16] Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. ICLR (Poster) 2015.

Claims (3)

1. A depth detection network with a self-embedded clustered-distribution prior for quantifying the morphological distribution of esophageal mucosa IPCLs blood vessels, characterized by specifically comprising: a feature extraction backbone network, a feature pyramid network, a region proposal network, a cancer focus classification network with region-of-interest pooling and a self-embedded clustered-distribution prior, and an auxiliary diagnosis system for visualization on narrow-band imaging endoscope images; wherein:
(1) the feature extraction backbone network is built on ResNet-50 and comprises 50 convolutional layers for extracting the feature maps of the input image; specifically, feature maps are taken at the end of layers 1, 2, 3 and 4 of the ResNet-50 model; they have 256, 512, 1024 and 2048 channels respectively, and their sizes are 1/4, 1/8, 1/16 and 1/32 of the original image; these feature maps are fed into the feature pyramid network;
(2) the feature pyramid network fuses features of different scales: all feature maps are first unified to 256 channels with 1 × 1 convolutions; then, from top to bottom, the upper-level features are up-sampled to twice their size layer by layer, added to the lower-level features, and passed through a 3 × 3 convolution, yielding a multi-scale feature map whose levels are 1/4, 1/8, 1/16 and 1/32 of the original image size, each with 256 channels;
(3) the region proposal network extracts possible lesion regions: an anchor generator first produces dense rectangular candidate boxes; these come in 5 × 3 different sizes, combining five widths with three aspect ratios; the features of each pyramid level pass through a 3 × 3 convolution followed by 1 × 1 convolutions, and Softmax judges whether each candidate box is a positive or a negative sample; finally, bounding-box regression for the three shapes is performed through a 1 × 1 convolution with 12 output channels, correcting inaccurate candidate boxes;
(4) in the cancer focus classification network with region-of-interest pooling and a self-embedded clustered-distribution prior, region-of-interest pooling pools the features of suspicious lesion regions, and the classification network with the self-embedded clustered-distribution prior classifies the cancer foci; specifically, each region of interest is framed with a rectangular bounding box parallel to the coordinate axes, and the cancer focus classification result of that region is given, i.e. a normal (type A) region or a lesion region (types B1, B2, B3); the network first extracts regions of interest from the different levels of the feature pyramid, aligns them, and pools each to at most 7 × 7, so that each region of interest corresponds to a feature of size 256 × 7 × 7; the feature of each region of interest is then concatenated with the features of its K nearest neighbours, giving a feature map of shape (256 × K) × 7 × 7, so that the classification network exploits the latent clustered distribution prior of cancer foci; two output branches are then produced through fully connected layers: the first branch outputs the position offset of each feature region, further correcting the position of the detection box; the second branch computes the classification probabilities through a Softmax function, giving the cancer focus class of the region; the fully connected layer flattens the (256 × K) × 7 × 7 feature map into a (12544 × K) × 1 × 1 feature and outputs 1024 channels; the first branch outputs 20 channels, i.e. 5 × 4, four bounding-box coordinates for each class, and the second branch outputs 5 channels, i.e. 5 classes including the negative sample;
(5) the auxiliary diagnosis system for visualization on the narrow-band imaging endoscope image displays the result on the narrow-band imaging endoscope image and marks the cancer foci with boxes of different colors; specifically, the input is a narrow-band imaging endoscope image; the network detects and diagnoses the cancer foci, and detection boxes of different colors represent the different cancer focus types, i.e. green, red, purple and black represent types A, B1, B2 and B3 respectively, each box being annotated with its classification confidence; the confidences of all detection boxes are then screened: boxes with confidence below a threshold T1 are removed, and non-maximum suppression eliminates redundant overlapping boxes whose intersection-over-union exceeds a threshold T2; T1 and T2 take all values in [0, 1] with a step of 0.05, and the optimal thresholds T1 and T2 are determined by comparing F1 scores.
2. The depth detection network of claim 1, wherein the network model is trained as follows:
before training, the network parameters of the ResNet-50 model are randomly initialized, and the images in the training set are scaled so that their resolution does not exceed 800 × 1333, with the corresponding bounding boxes scaled by the same ratio;
during training, the three channels (R, G, B) of each image are first normalized with mean [0.485, 0.456, 0.406] and standard deviation [0.229, 0.224, 0.225]; the Adam optimization algorithm is used with an initial learning rate of 10⁻⁴, two exponential decay rates β1 = 0.9 and β2 = 0.999, and a weight decay of 0, together with a mini-batch stochastic gradient descent strategy with a batch size of 8 to minimize the loss function; training runs for N rounds; because the vessel types are unevenly distributed in the training set and types B2 and B3 would otherwise not be sufficiently trained, Focal loss is used as the loss function of the cancer focus classification network, with the weights of the negative sample and of types A, B1, B2 and B3 set to C1, C2, C3, C4 and C5 respectively, determined after several experiments according to the distribution of each vessel type in the training set.
3. The depth detection network of claim 2, wherein a narrow-band imaging endoscope image input into the trained network yields the cancer focus detection and diagnosis result with a single forward pass.
CN202011263459.2A 2020-11-12 2020-11-12 Depth detection network for quantifying esophageal mucosa IPCLs blood vessel morphological distribution Active CN112419246B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011263459.2A CN112419246B (en) 2020-11-12 2020-11-12 Depth detection network for quantifying esophageal mucosa IPCLs blood vessel morphological distribution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011263459.2A CN112419246B (en) 2020-11-12 2020-11-12 Depth detection network for quantifying esophageal mucosa IPCLs blood vessel morphological distribution

Publications (2)

Publication Number Publication Date
CN112419246A 2021-02-26
CN112419246B CN112419246B (en) 2022-07-22

Family

ID=74831021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011263459.2A Active CN112419246B (en) 2020-11-12 2020-11-12 Depth detection network for quantifying esophageal mucosa IPCLs blood vessel morphological distribution

Country Status (1)

Country Link
CN (1) CN112419246B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113643291A (en) * 2021-10-14 2021-11-12 武汉大学 Method and device for determining esophagus marker infiltration depth grade and readable storage medium
CN113706533A (en) * 2021-10-28 2021-11-26 武汉大学 Image processing method, image processing device, computer equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109118485A (en) * 2018-08-13 2019-01-01 复旦大学 Digestive endoscope image classification based on multitask neural network cancer detection system early
CN111784671A (en) * 2020-06-30 2020-10-16 天津大学 Pathological image focus region detection method based on multi-scale deep learning

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109118485A (en) * 2018-08-13 2019-01-01 复旦大学 Digestive endoscope image classification based on multitask neural network cancer detection system early
CN111784671A (en) * 2020-06-30 2020-10-16 天津大学 Pathological image focus region detection method based on multi-scale deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MOHAMED HUSSEIN ET AL.: "Role of artificial intelligence in the diagnosis of oesophageal neoplasia: 2020 an endoscopic odyssey", 《WORLD JOURNAL OF GASTROENTEROLOGY》 *
ZHAO Yuanyuan: "Application of narrow-band imaging and magnifying endoscopy images in the diagnosis of early esophageal squamous cell carcinoma, with an exploratory study of computer-aided diagnosis methods", China Master's and Doctoral Dissertations Full-text Database (Doctoral), Medicine & Health Sciences *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113643291A (en) * 2021-10-14 2021-11-12 武汉大学 Method and device for determining esophagus marker infiltration depth grade and readable storage medium
CN113643291B (en) * 2021-10-14 2021-12-24 武汉大学 Method and device for determining esophagus marker infiltration depth grade and readable storage medium
CN113706533A (en) * 2021-10-28 2021-11-26 武汉大学 Image processing method, image processing device, computer equipment and storage medium
CN113706533B (en) * 2021-10-28 2022-02-08 武汉大学 Image processing method, image processing device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN112419246B (en) 2022-07-22

Similar Documents

Publication Publication Date Title
CN112102256B (en) Narrow-band endoscopic image-oriented cancer focus detection and diagnosis system for early esophageal squamous carcinoma
Li et al. A large-scale database and a CNN model for attention-based glaucoma detection
EP2685881B1 (en) Medical instrument for examining the cervix
Roth et al. A new 2.5 D representation for lymph node detection using random sets of deep convolutional neural network observations
CN111985536B (en) Based on weak supervised learning gastroscopic pathology image Classification method
CN109858540B (en) Medical image recognition system and method based on multi-mode fusion
CN110276356A (en) Eye fundus image aneurysms recognition methods based on R-CNN
CN108765392B (en) Digestive tract endoscope lesion detection and identification method based on sliding window
Lv et al. A cascade network for detecting covid-19 using chest x-rays
Pal et al. Deep metric learning for cervical image classification
CN112419246B (en) Depth detection network for quantifying esophageal mucosa IPCLs blood vessel morphological distribution
Sun et al. A novel gastric ulcer differentiation system using convolutional neural networks
Maghsoudi et al. A computer aided method to detect bleeding, tumor, and disease regions in Wireless Capsule Endoscopy
CN112102332A (en) Cancer WSI segmentation method based on local classification neural network
Lei et al. Automated detection of retinopathy of prematurity by deep attention network
CN114398979A (en) Ultrasonic image thyroid nodule classification method based on feature decoupling
Xing et al. A saliency-aware hybrid dense network for bleeding detection in wireless capsule endoscopy images
WO2021183765A1 (en) Automated detection of tumors based on image processing
Xiong et al. Automatic cataract classification based on multi-feature fusion and SVM
CN112419248A (en) Ear sclerosis focus detection and diagnosis system based on small target detection neural network
Krak et al. Detection of early pneumonia on individual CT scans with dilated convolutions
Cao et al. Deep learning based lesion detection for mammograms
Oloumi et al. Digital image processing for ophthalmology: Detection and modeling of retinal vascular architecture
Li et al. Tongue image segmentation via thresholding and clustering
Wang et al. A ROI extraction method for wrist imaging applied in smart bone-age assessment system

Legal Events

Code: Description
PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant