CN115205520A - Intelligent target detection method, system, electronic device and storage medium for gastroscope images - Google Patents
- Publication number
- CN115205520A CN115205520A CN202210831722.6A CN202210831722A CN115205520A CN 115205520 A CN115205520 A CN 115205520A CN 202210831722 A CN202210831722 A CN 202210831722A CN 115205520 A CN115205520 A CN 115205520A
- Authority
- CN
- China
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Abstract
Description
Technical Field
The present invention relates to the technical field of target detection, and in particular to an intelligent target detection method for gastroscope images, a system, an electronic device, and a storage medium.
Background Art
At present, gastroscopy is an important means of diagnosing gastric diseases. Specifically, whether a patient suffers from diseases such as gastric cancer or gastric ulcer can be determined from images of the gastric mucosal epithelium collected by the gastroscope.
With the development of artificial intelligence, more and more medical assistance technologies are combined with AI: for example, medical images are processed, features are extracted from the processed images by deep learning, and the features are fed into a trained model for target recognition. Artificial intelligence can reduce the workload of manually screening and identifying images that may contain lesions, and improve the accuracy of detection results.
At present, gastroscope image datasets mainly include three endoscopic imaging modes: white light images, blue laser imaging (BLI) images, and linked color imaging (LCI) images. Existing deep-learning target recognition methods based on gastroscope images usually apply a series of preprocessing steps to the white light image alone and feed it into a trained model to identify the location and extent of lesions, so as to assist doctors. However, blue laser imaging and LCI images have their own unique characteristics: they can magnify subtle differences and thus reveal lesions more clearly. In addition, existing target recognition techniques do not select the target region accurately enough, and existing recognition algorithms such as YOLO and Fast-RCNN do not achieve high detection accuracy. To address these problems, an improved recognition method for gastroscope images is provided to improve recognition accuracy.
Summary of the Invention
The present invention provides an intelligent target detection method, system, electronic device, and storage medium for gastroscope images, so as to overcome defects of the prior art such as low detection accuracy.
To achieve the above object, the present invention provides the following scheme:
An intelligent target detection method for gastroscope images:
acquiring a gastroscope white light image, a blue laser imaging image, and a linked color imaging image of the same site of the same patient to be examined, wherein the three images are captured at the same angle;
segmenting the target regions of the above three images using an accurate grid division method;
inputting the target region image corresponding to the white light image into a first recognition model to obtain a first target position, a first lesion type, and a first lesion probability;
inputting the target region image corresponding to the blue laser imaging image into a second recognition model to obtain a second target position, a second lesion type, and a second lesion probability;
inputting the target region image corresponding to the linked color imaging image into a third recognition model to obtain a third target position, a third lesion type, and a third lesion probability;
adjusting the three images containing the target positions so that their viewing angles and positions are consistent, comparing the target positions in the three images, and taking the intersection of the first to third target positions as the final target position; if the first, second, and third lesion types are the same, determining the final target lesion type to be that common type, with a probability equal to the weighted sum of the first, second, and third lesion probabilities;
if the first lesion type differs from the second and third lesion types, and the second and third lesion types are the same, determining the final target lesion type to be the third lesion type, with a probability equal to the weighted sum of the second and third lesion probabilities;
if the first lesion type is the same as the second lesion type but different from the third, calculating the weighted sum M of the first and second probabilities, comparing M with the third lesion probability, and taking the lesion type corresponding to the larger value as the final target lesion type;
if the first lesion type differs from the second lesion type, and the first and third lesion types are the same, determining the final target lesion type to be the third lesion type, with a probability equal to the weighted sum of the first and third lesion probabilities;
if the first, second, and third lesion types are all different, determining the final target lesion type to be the third lesion type, with a probability equal to the third lesion probability.
Optionally, before segmenting the target regions of the above three images using the accurate grid division method, the method further includes preprocessing the images. Preprocessing includes random cropping: a patch is sampled such that the cropped part overlaps the target by at least a certain proportion, and the crop is rescaled to a fixed size; preprocessing may also include random horizontal flipping.
Optionally, segmenting the target regions of the above three images using the accurate grid division method includes: collecting a number of gastroscope white light images, blue laser imaging images, and linked color imaging images of patients, and first obtaining the target image region through target segmentation. The target segmentation adopts an accurate labeling method: the image is divided into n×n grid cells, and a search is performed for the optimal target effective region capable of containing the target image. Over the n×n grid cells, several candidate target effective regions containing the target can be found; for each candidate region, local features are extracted from each of its grid cells, and a score indicating whether it belongs to the target region is computed from these features; the region with the highest score is the segmented optimal effective region.
Optionally, the first recognition model uses Resnet18 as the base network, with an attention module added to the fifth and sixth convolutional layers of Resnet18. The feature F6 output by the sixth convolutional layer is upsampled and fused with the feature F5 output by the fifth convolutional layer to obtain feature F5'; feature F5' is upsampled again and fused with the feature F4 output by the fourth convolutional layer to obtain feature F4'. The feature output by the last convolutional layer of Resnet18 is denoted F. The features F, F6, F5', and F4' are each input into a classification-and-regression sub-network to complete the classification and detection of the target.
Optionally, the second recognition model uses an hourglass network as the backbone. The backbone contains 3 stacked hourglass networks, each composed of 3 convolutional layers with 4×4 kernels and 1 residual block with a skip connection. Before being input into the hourglass networks, the input image is preprocessed with a convolutional layer using a 6×6 kernel and 2 residual blocks. The features output by the hourglass backbone pass through a special pooling layer: the feature map output by the backbone is average-pooled horizontally from right to left to obtain feature map A, and then average-pooled from bottom to top to obtain another feature map B; the corresponding pixel values of these two feature maps are summed to obtain feature map C, which passes through an activation layer and a fully connected layer to output the target prediction.
Optionally, the third recognition model uses VGG16 as the base network, with its 6th and 7th fully connected layers replaced by ordinary convolutional layers, after which 5 convolutional layers, denoted conv_a1 to conv_a5, are added; the final output after average pooling is denoted conv_b1. Feature maps are extracted from layers conv_a1, conv_a3, conv_a5, and conv_b1 to construct detection boxes, which are filtered with NMS to obtain the final target prediction result.
An embodiment of the present invention provides an intelligent target detection system for gastroscope images, the system including:
an imaging module for acquiring a gastroscope white light image, a blue laser imaging image, and a linked color imaging image of the same site of the same patient to be examined, wherein the three images are captured at the same angle;
a target region segmentation module, which segments the target regions of the above three images using an accurate grid division method;
a first recognition module, which inputs the target region image corresponding to the white light image into a first recognition model to obtain a first target position, a first lesion type, and a first lesion probability;
a second recognition module, which inputs the target region image corresponding to the blue laser imaging image into a second recognition model to obtain a second target position, a second lesion type, and a second lesion probability;
a third recognition module, which inputs the target region image corresponding to the linked color imaging image into a third recognition model to obtain a third target position, a third lesion type, and a third lesion probability;
a target detection analysis module, which adjusts the three images containing the target positions so that their viewing angles and positions are consistent, compares the target positions in the three images, and takes the intersection of the first to third target positions as the final target position; if the first, second, and third lesion types are the same, the final target lesion type is that common type, with a probability equal to the weighted sum of the first, second, and third lesion probabilities; if the first lesion type differs from the second and third lesion types and the second and third lesion types are the same, the final target lesion type is the third lesion type, with a probability equal to the weighted sum of the second and third lesion probabilities; if the first lesion type is the same as the second but different from the third, the weighted sum M of the first and second probabilities is calculated and compared with the third lesion probability, and the lesion type corresponding to the larger value is the final target lesion type; if the first lesion type differs from the second and the first and third lesion types are the same, the final target lesion type is the third lesion type, with a probability equal to the weighted sum of the first and third lesion probabilities; if the first, second, and third lesion types are all different, the final target lesion type is the third lesion type, with a probability equal to the third lesion probability.
The present invention further provides an electronic device, including at least one processor and a memory;
the memory stores computer-executable instructions;
the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the method described in the first aspect above and in its various possible designs.
The present invention further provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the method described in the first aspect above and in its various possible designs.
Compared with the prior art, the beneficial effects of the present invention are as follows:
The present invention uses images of different imaging modes to identify targets in gastroscope images, which overcomes the low accuracy of prior-art approaches that use only one kind of image. For each of the three imaging modes, a recognition model suited to the characteristics of that mode is adopted, and the target lesion type and probability are finally obtained from the outputs of the three models according to their respective characteristics, improving the accuracy and precision of target recognition and better assisting doctors.
The target segmentation adopted by the present invention uses an accurate grid labeling method, which can extract the target region of interest more accurately and reduce the interference of the background image in subsequent recognition.
Brief Description of the Drawings
FIG. 1 is a flowchart of the intelligent target detection method for gastroscope images of the present invention.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Referring to FIG. 1, the present invention provides an intelligent target detection method for gastroscope images.
To achieve the above object, the present invention provides the following scheme:
An intelligent target detection method for gastroscope images:
acquiring a gastroscope white light image, a blue laser imaging image, and a linked color imaging image of the same site of the same patient to be examined, wherein the three images are captured at the same angle;
segmenting the target regions of the above three images using an accurate grid division method;
inputting the target region image corresponding to the white light image into a first recognition model to obtain a first target position, a first lesion type, and a first lesion probability;
inputting the target region image corresponding to the blue laser imaging image into a second recognition model to obtain a second target position, a second lesion type, and a second lesion probability;
inputting the target region image corresponding to the linked color imaging image into a third recognition model to obtain a third target position, a third lesion type, and a third lesion probability;
adjusting the three images containing the target positions so that their viewing angles and positions are consistent, comparing the target positions in the three images, and taking the intersection of the first to third target positions as the final target position; if the first, second, and third lesion types are the same, determining the final target lesion type to be that common type, with a probability equal to the weighted sum of the first, second, and third lesion probabilities;
if the first lesion type differs from the second and third lesion types, and the second and third lesion types are the same, determining the final target lesion type to be the third lesion type, with a probability equal to the weighted sum of the second and third lesion probabilities;
if the first lesion type is the same as the second lesion type but different from the third, calculating the weighted sum M of the first and second probabilities, comparing M with the third lesion probability, and taking the lesion type corresponding to the larger value as the final target lesion type;
if the first lesion type differs from the second lesion type, and the first and third lesion types are the same, determining the final target lesion type to be the third lesion type, with a probability equal to the weighted sum of the first and third lesion probabilities;
if the first, second, and third lesion types are all different, determining the final target lesion type to be the third lesion type, with a probability equal to the third lesion probability.
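The case analysis above can be sketched as a small decision function. This is an illustrative sketch, not the patent's implementation: the weighting coefficients `w` are hypothetical placeholders (the patent does not specify them), and each weighted sum is renormalized over the weights it uses so the result stays a probability.

```python
def fuse_lesion_predictions(t1, p1, t2, p2, t3, p3, w=(0.2, 0.35, 0.45)):
    """Combine the three models' (lesion type, probability) outputs.

    t1..t3: lesion types from the white light, BLI, and LCI models.
    p1..p3: the corresponding lesion probabilities.
    w: hypothetical weighting coefficients (not given in the patent).
    """
    w1, w2, w3 = w
    if t1 == t2 == t3:
        # All three models agree: weighted sum of all three probabilities.
        return t1, (w1 * p1 + w2 * p2 + w3 * p3) / (w1 + w2 + w3)
    if t1 != t2 and t2 == t3:
        # BLI and LCI agree: weighted sum of their probabilities.
        return t3, (w2 * p2 + w3 * p3) / (w2 + w3)
    if t1 == t2 and t1 != t3:
        # White light and BLI agree: compare their weighted sum with LCI.
        m = (w1 * p1 + w2 * p2) / (w1 + w2)
        return (t1, m) if m >= p3 else (t3, p3)
    if t1 != t2 and t1 == t3:
        # White light and LCI agree.
        return t3, (w1 * p1 + w3 * p3) / (w1 + w3)
    # All three disagree: fall back to the LCI model's prediction.
    return t3, p3
```

For example, if all three models report "ulcer" with probabilities 0.8, 0.9, and 0.7, the fused result is "ulcer" with the renormalized weighted-sum probability.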
Optionally, before segmenting the target regions of the above three images using the accurate grid division method, the method further includes preprocessing the images. Preprocessing includes random cropping: a patch is sampled such that the cropped part overlaps the target by at least a certain proportion, and the crop is rescaled to a fixed size; preprocessing may also include random horizontal flipping.
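A minimal sketch of this crop-with-overlap and flip preprocessing, using only NumPy. The minimum overlap ratio, output size, and crop-size range are illustrative assumptions, since the patent leaves these values unspecified.

```python
import random
import numpy as np

def preprocess(img, box, min_overlap=0.5, out_size=224, max_tries=50):
    """Randomly crop so the crop covers at least `min_overlap` of the
    target box's area, rescale to a fixed size, and randomly flip
    horizontally. `box` is (x0, y0, x1, y1) in pixel coordinates.
    """
    h, w = img.shape[:2]
    x0, y0, x1, y1 = box
    box_area = (x1 - x0) * (y1 - y0)
    crop = (0, 0, w, h)  # fall back to the full image if no crop qualifies
    for _ in range(max_tries):
        cw, ch = random.randint(w // 2, w), random.randint(h // 2, h)
        cx, cy = random.randint(0, w - cw), random.randint(0, h - ch)
        # Intersection between the candidate crop and the target box.
        ix = max(0, min(x1, cx + cw) - max(x0, cx))
        iy = max(0, min(y1, cy + ch) - max(y0, cy))
        if box_area > 0 and ix * iy / box_area >= min_overlap:
            crop = (cx, cy, cw, ch)
            break
    cx, cy, cw, ch = crop
    patch = img[cy:cy + ch, cx:cx + cw]
    # Nearest-neighbour rescale to a fixed size (no external dependencies).
    ys = np.arange(out_size) * patch.shape[0] // out_size
    xs = np.arange(out_size) * patch.shape[1] // out_size
    patch = patch[ys][:, xs]
    if random.random() < 0.5:
        patch = patch[:, ::-1]  # random horizontal flip
    return patch
```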
Optionally, segmenting the target regions of the above three images using the accurate grid division method includes: collecting a number of gastroscope white light images, blue laser imaging images, and linked color imaging images of patients, and first obtaining the target image region through target segmentation. The target segmentation adopts an accurate labeling method: the image is divided into n×n grid cells, and a search is performed for the optimal target effective region capable of containing the target image. Over the n×n grid cells, several candidate target effective regions containing the target can be found; for each candidate region, local features are extracted from each of its grid cells, and a score indicating whether it belongs to the target region is computed from these features; the region with the highest score is the segmented optimal effective region.
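One reading of this grid search can be sketched as follows. The per-cell score here is a toy stand-in (mean cell intensity) for the score computed from learned local features, which the patent does not specify; the window size of the candidate region is also an illustrative assumption.

```python
import numpy as np

def best_effective_region(img, n=8, win=3):
    """Divide `img` into an n x n grid, slide a win x win window of grid
    cells over it as candidate effective regions, score each candidate by
    the sum of its per-cell scores, and return the best candidate as pixel
    coordinates (x0, y0, x1, y1).
    """
    h, w = img.shape[:2]
    ch, cw = h // n, w // n
    # Toy per-cell score: mean intensity of the cell (a stand-in for a
    # score derived from learned local features).
    scores = np.array([[img[i * ch:(i + 1) * ch, j * cw:(j + 1) * cw].mean()
                        for j in range(n)] for i in range(n)])
    best, best_score = None, -np.inf
    for i in range(n - win + 1):
        for j in range(n - win + 1):
            s = scores[i:i + win, j:j + win].sum()
            if s > best_score:
                best, best_score = (i, j), s
    i, j = best
    return (j * cw, i * ch, (j + win) * cw, (i + win) * ch)
```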
Optionally, the first recognition model uses Resnet18 as the base network, with an attention module added to the fifth and sixth convolutional layers of Resnet18. The feature F6 output by the sixth convolutional layer is upsampled and fused with the feature F5 output by the fifth convolutional layer to obtain feature F5'; feature F5' is upsampled again and fused with the feature F4 output by the fourth convolutional layer to obtain feature F4'. The feature output by the last convolutional layer of Resnet18 is denoted F. The features F, F6, F5', and F4' are each input into a classification-and-regression sub-network to complete the classification and detection of the target.
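The upsample-and-fuse steps (F6 → F5', F5' → F4') can be illustrated in NumPy with nearest-neighbour 2× upsampling and element-wise addition. Both the fusion operator (addition) and the 2× scale factor are assumptions for illustration; the patent does not state how the feature maps are combined or matched in resolution.

```python
import numpy as np

def upsample2x(f):
    """Nearest-neighbour 2x spatial upsampling of a (C, H, W) feature map."""
    return f.repeat(2, axis=1).repeat(2, axis=2)

def fuse(top, lateral):
    """Upsample the deeper feature and add it to the shallower one
    (assumes matching channel counts, as in an FPN-style top-down path)."""
    return upsample2x(top) + lateral

# Toy feature maps standing in for Resnet18 outputs F4, F5, F6.
F4 = np.ones((16, 16, 16))
F5 = np.ones((16, 8, 8))
F6 = np.ones((16, 4, 4))

F5p = fuse(F6, F5)   # F5' = upsample(F6) + F5
F4p = fuse(F5p, F4)  # F4' = upsample(F5') + F4
```

Each fused map keeps the spatial size of its lateral input, so F, F6, F5', and F4' form a multi-scale set for the classification-and-regression sub-network.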
During training of the first recognition model, the numbers of positive and negative samples are unbalanced. To reduce this imbalance, the model is trained with an optimized cross-entropy loss as the target loss function, as follows:
y is the label: y = 1 indicates that the sample is classified as the target, and y = 0 indicates that it is not; y' is the predicted classification probability; α is a balance parameter; and γ is the rate of adjustment. The loss function is also unbalanced between accurately and inaccurately classified samples: points whose loss is greater than 1 may be noise, and the loss is more sensitive to such points, so the target loss function is further balanced as follows:
Lc(loss) = (n/m)(m|loss| + 1)ln(m|loss| + 1) − n|loss|, for |loss| < 1,
Lc(loss) = β|loss| + D, for |loss| ≥ 1,
where n·ln(m + 1) = β.
Here loss is the value of the loss function before balancing, n and m are adjustment factors, β is a parameter that adjusts the error, and D is a set parameter. By adjusting the loss function in this way, more balanced training can be obtained.
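A NumPy sketch of the two loss terms described above. The first function assumes the "optimized cross-entropy" with balance parameter α and rate γ takes the standard focal-loss form, which is consistent with the α/γ description but is an assumption (the explicit formula is not reproduced in the text here). The second follows the piecewise Lc(loss) equations above, with n and m chosen illustratively and D defaulting to the value that makes the two branches meet at |loss| = 1 (the constraint β = n·ln(m + 1) already matches their slopes there).

```python
import numpy as np

def focal_loss(y, y_pred, alpha=0.25, gamma=2.0, eps=1e-7):
    """Assumed focal-loss form of the balanced cross-entropy: alpha
    balances positives vs. negatives, gamma down-weights easy examples.
    y in {0, 1}; y_pred in (0, 1).
    """
    y_pred = np.clip(y_pred, eps, 1 - eps)
    pos = -alpha * (1 - y_pred) ** gamma * np.log(y_pred)
    neg = -(1 - alpha) * y_pred ** gamma * np.log(1 - y_pred)
    return np.where(np.asarray(y) == 1, pos, neg)

def balanced_loss(loss, n=1.0, m=np.e - 1.0, D=None):
    """Piecewise rebalancing of a raw loss value:
    Lc = (n/m)(m|loss|+1)ln(m|loss|+1) - n|loss|   for |loss| < 1
    Lc = beta*|loss| + D                           for |loss| >= 1
    with beta = n*ln(m+1), which makes the slope continuous at |loss| = 1.
    """
    a = np.abs(loss)
    beta = n * np.log(m + 1.0)
    if D is None:
        # Choose D so the two branches also agree in value at |loss| = 1.
        D = (n / m) * (m + 1.0) * np.log(m + 1.0) - n - beta
    small = (n / m) * (m * a + 1.0) * np.log(m * a + 1.0) - n * a
    large = beta * a + D
    return np.where(a < 1.0, small, large)
```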
Optionally, the second recognition model uses an hourglass network as the backbone. The backbone contains 3 stacked hourglass networks, each composed of 3 convolutional layers with 4×4 kernels and 1 residual block with a skip connection. Before being input into the hourglass networks, the input image is preprocessed with a convolutional layer using a 6×6 kernel and 2 residual blocks. The features output by the hourglass backbone pass through a special pooling layer: the feature map output by the backbone is average-pooled horizontally from right to left to obtain feature map A, and then average-pooled from bottom to top to obtain another feature map B; the corresponding pixel values of these two feature maps are summed to obtain feature map C, which passes through an activation layer and a fully connected layer to output the target prediction.
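The special pooling layer can be sketched as follows, under one reading of the description: A[i, j] is the running average of the row to the right of (and including) position j, B[i, j] the running average of the column below (and including) position i, and C = A + B. This resembles CornerNet-style corner pooling, except that the patent specifies averaging rather than the max operation.

```python
import numpy as np

def directional_avg_pool(feat):
    """feat: (H, W) feature map from the backbone.

    A[i, j] = mean of feat[i, j:]   (horizontal, right-to-left)
    B[i, j] = mean of feat[i:, j]   (vertical, bottom-to-top)
    Returns C = A + B, the pixel-wise sum described in the text.
    """
    h, w = feat.shape
    # Right-to-left running average along each row (suffix means).
    counts_w = np.arange(w, 0, -1)                      # w, w-1, ..., 1
    A = np.cumsum(feat[:, ::-1], axis=1)[:, ::-1] / counts_w
    # Bottom-to-top running average along each column.
    counts_h = np.arange(h, 0, -1)[:, None]
    B = np.cumsum(feat[::-1, :], axis=0)[::-1, :] / counts_h
    return A + B
```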
Optionally, the third recognition model uses VGG16 as its base network, replaces the sixth and seventh fully connected layers with ordinary convolutional layers, and then appends five convolutional layers, denoted conv_a1 through conv_a5; the final average-pooled output is denoted conv_b1. The feature maps of layers conv_a1, conv_a3, conv_a5, and conv_b1 are each used to construct detection boxes, and NMS filtering yields the final target prediction.
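The NMS filtering step mentioned here can be sketched with standard greedy non-maximum suppression (a generic implementation and threshold, not taken from the patent):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) < iou_thresh]
    return keep
```

Applied to the candidate boxes from conv_a1, conv_a3, conv_a5, and conv_b1, this leaves one box per detected target as the final prediction.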
An embodiment of the present invention provides a gastroscope-image intelligent target detection system, comprising:
an imaging module, which acquires a white-light gastroscope image, a blue-laser imaging image, and a linked imaging mode image of the same site of the same patient to be examined, the three images being captured at the same angle;
a target-region segmentation module, which uses a precise grid-division method to segment the target regions of the three images;
a first recognition module, which inputs the target-region image of the white-light image into the first recognition model to obtain a first target position, a first lesion type, and a first lesion probability;
a second recognition module, which inputs the target-region image of the blue-laser imaging image into the second recognition model to obtain a second target position, a second lesion type, and a second lesion probability;
a third recognition module, which inputs the target-region image of the linked imaging mode image into the third recognition model to obtain a third target position, a third lesion type, and a third lesion probability;
a target detection and analysis module, which processes the three images containing the target positions so that their viewing angles and positions are roughly aligned, compares the target positions across the three images, and takes the intersection of the first through third target positions as the final target position. If the first, second, and third lesion types are all the same, the final target lesion type is that shared type, and its probability is the weighted sum of the first, second, and third lesion probabilities. If the first lesion type differs from the second and third, and the second and third lesion types are the same, the final target lesion type is the third lesion type, and its probability is the weighted sum of the second and third lesion probabilities. If the first lesion type is the same as the second but differs from the third, the weighted sum M of the first and second probabilities is computed and compared with the third lesion probability; the lesion type with the larger probability value is the final target lesion type. If the first lesion type differs from the second, and the first and third lesion types are the same, the final target lesion type is the third lesion type, and its probability is the weighted sum of the first and third lesion probabilities. If the first, second, and third lesion types are all different, the final target lesion type is the third lesion type, and its probability is the third lesion probability.
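The decision rules above can be sketched as follows (the function name and the equal default weights are assumptions; the patent does not specify the weighting coefficients):

```python
def fuse_lesion(t1, p1, t2, p2, t3, p3, w=(1/3, 1/3, 1/3)):
    """Fuse the three models' (type, probability) outputs per the rules above."""
    if t1 == t2 == t3:            # all three agree
        return t1, w[0]*p1 + w[1]*p2 + w[2]*p3
    if t2 == t3:                  # only the first disagrees
        return t3, w[1]*p2 + w[2]*p3
    if t1 == t2:                  # only the third disagrees: compare M with p3
        M = w[0]*p1 + w[1]*p2
        return (t1, M) if M > p3 else (t3, p3)
    if t1 == t3:                  # only the second disagrees
        return t3, w[0]*p1 + w[2]*p3
    return t3, p3                 # all three differ: fall back to the third model
```

Note that in every disagreement case the rules defer to the third (linked-imaging-mode) model's type, reflecting the priority the patent gives that modality.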
An embodiment of the present invention provides an electronic device for executing the gastroscope-image target detection method of the above embodiments; its implementation follows the same principle and is not repeated here.
An embodiment of the present invention provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the gastroscope-image target detection method of any of the above embodiments.
The storage medium containing computer-executable instructions in the embodiments of the present invention can be used to store the computer-executable instructions of the gastroscope-image target detection method provided in the foregoing embodiments; its implementation follows the same principle and is not repeated here.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional modules above is only illustrative; in practice, the functions may be assigned to different functional modules as needed, i.e., the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. For the specific working process of the apparatus described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
Finally, it should be noted that the above embodiments only illustrate, and do not limit, the technical solutions of the present application. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210831722.6A CN115205520A (en) | 2022-07-14 | 2022-07-14 | Intelligent target detection method, system, electronic device and storage medium for gastroscope images |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115205520A true CN115205520A (en) | 2022-10-18 |
Family
ID=83582770
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116310282A (en) * | 2023-03-15 | 2023-06-23 | 郑州大学 | Method and system for identifying lesions in thoracoscopic images |
CN116784827A (en) * | 2023-02-14 | 2023-09-22 | 安徽省儿童医院 | Digestive tract ulcer depth and area measuring and calculating method based on endoscope |
CN117557776A (en) * | 2023-11-10 | 2024-02-13 | 朗信医疗科技(无锡)有限公司 | Multi-mode target detection method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||