CN115205520A - Intelligent target detection method, system, electronic device and storage medium for gastroscope images - Google Patents
- Publication number
- CN115205520A CN115205520A CN202210831722.6A CN202210831722A CN115205520A CN 115205520 A CN115205520 A CN 115205520A CN 202210831722 A CN202210831722 A CN 202210831722A CN 115205520 A CN115205520 A CN 115205520A
- Authority
- CN
- China
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Abstract
Description
Technical Field
The present invention relates to the technical field of target detection, and in particular to an intelligent target detection method for gastroscope images, a system, an electronic device, and a storage medium.
Background Art
At present, gastroscopy is an important means of diagnosing gastric diseases. Specifically, whether a patient suffers from diseases such as gastric cancer or gastric ulcer can be determined from images of the gastric mucosal epithelium collected by the gastroscope.
With the development of artificial intelligence, more and more medical assistance technologies are combined with AI: for example, medical images are processed, features are extracted from the processed images by deep learning, and the features are fed into a trained model for target recognition. Artificial intelligence can reduce the workload of manually screening and identifying images that may contain lesions, and improve the accuracy of detection results.
At present, gastroscope image datasets mainly include three endoscopic imaging modes: white light images, blue laser imaging (BLI) images, and linked color imaging (LCI) images. Existing deep-learning target recognition methods based on gastroscope images usually apply a series of preprocessing steps to the white light image alone and feed it into a trained model to identify the location and extent of lesions, so as to assist doctors. However, blue laser imaging and LCI images have their own unique characteristics: they can magnify subtle differences and thus reveal lesions more clearly. In addition, existing target recognition techniques do not select the target region accurately enough, and existing recognition algorithms such as YOLO and Fast-RCNN do not achieve high detection accuracy. To address these problems, an improved recognition method for gastroscope images is provided to improve recognition accuracy.
Summary of the Invention
The present invention provides an intelligent target detection method, system, electronic device, and storage medium for gastroscope images, so as to overcome defects of the prior art such as low detection accuracy.
To achieve the above object, the present invention provides the following scheme:
An intelligent target detection method for gastroscope images:
acquiring a gastroscope white light image, a blue laser imaging image, and a linked color imaging image of the same site of the same patient to be examined, wherein the three images are captured at the same angle;
segmenting the target regions of the above three images using an accurate grid division method;
inputting the target region image corresponding to the white light image into a first recognition model to obtain a first target position, a first lesion type, and a first lesion probability;
inputting the target region image corresponding to the blue laser imaging image into a second recognition model to obtain a second target position, a second lesion type, and a second lesion probability;
inputting the target region image corresponding to the linked color imaging image into a third recognition model to obtain a third target position, a third lesion type, and a third lesion probability;
adjusting the three images containing the target positions so that their viewing angles and positions are consistent, comparing the target positions in the three images, and taking the intersection of the first to third target positions as the final target position; if the first, second, and third lesion types are the same, determining the final target lesion type to be that common type, with a probability equal to the weighted sum of the first, second, and third lesion probabilities;
if the first lesion type differs from the second and third lesion types, and the second and third lesion types are the same, determining the final target lesion type to be the third lesion type, with a probability equal to the weighted sum of the second and third lesion probabilities;
if the first lesion type is the same as the second lesion type but different from the third, calculating the weighted sum M of the first and second probabilities, comparing M with the third lesion probability, and taking the lesion type corresponding to the larger value as the final target lesion type;
if the first lesion type differs from the second lesion type, and the first and third lesion types are the same, determining the final target lesion type to be the third lesion type, with a probability equal to the weighted sum of the first and third lesion probabilities;
if the first, second, and third lesion types are all different, determining the final target lesion type to be the third lesion type, with a probability equal to the third lesion probability.
Optionally, before segmenting the target regions of the above three images using the accurate grid division method, the method further includes preprocessing the images. Preprocessing includes random cropping: a patch is sampled such that the cropped part overlaps the target by at least a certain proportion, and the crop is rescaled to a fixed size; preprocessing may also include random horizontal flipping.
Optionally, segmenting the target regions of the above three images using the accurate grid division method includes: collecting a number of gastroscope white light images, blue laser imaging images, and linked color imaging images of patients, and first obtaining the target image region through target segmentation. The target segmentation adopts an accurate labeling method: the image is divided into n×n grid cells, and a search is performed for the optimal target effective region capable of containing the target image. Over the n×n grid cells, several candidate target effective regions containing the target can be found; for each candidate region, local features are extracted from each of its grid cells, and a score indicating whether it belongs to the target region is computed from these features; the region with the highest score is the segmented optimal effective region.
Optionally, the first recognition model uses Resnet18 as the base network, with an attention module added to the fifth and sixth convolutional layers of Resnet18. The feature F6 output by the sixth convolutional layer is upsampled and fused with the feature F5 output by the fifth convolutional layer to obtain feature F5'; feature F5' is upsampled again and fused with the feature F4 output by the fourth convolutional layer to obtain feature F4'. The feature output by the last convolutional layer of Resnet18 is denoted F. The features F, F6, F5', and F4' are each input into a classification-and-regression sub-network to complete the classification and detection of the target.
Optionally, the second recognition model uses an hourglass network as the backbone. The backbone contains 3 stacked hourglass networks, each composed of 3 convolutional layers with 4×4 kernels and 1 residual block with a skip connection. Before being input into the hourglass networks, the input image is preprocessed with a convolutional layer using a 6×6 kernel and 2 residual blocks. The features output by the hourglass backbone pass through a special pooling layer: the feature map output by the backbone is average-pooled horizontally from right to left to obtain feature map A, and then average-pooled from bottom to top to obtain another feature map B; the corresponding pixel values of these two feature maps are summed to obtain feature map C, which passes through an activation layer and a fully connected layer to output the target prediction.
Optionally, the third recognition model uses VGG16 as the base network, with its 6th and 7th fully connected layers replaced by ordinary convolutional layers, after which 5 convolutional layers, denoted conv_a1 to conv_a5, are added; the final output after average pooling is denoted conv_b1. Feature maps are extracted from layers conv_a1, conv_a3, conv_a5, and conv_b1 to construct detection boxes, which are filtered with NMS to obtain the final target prediction result.
An embodiment of the present invention provides an intelligent target detection system for gastroscope images, the system including:
an imaging module for acquiring a gastroscope white light image, a blue laser imaging image, and a linked color imaging image of the same site of the same patient to be examined, wherein the three images are captured at the same angle;
a target region segmentation module, which segments the target regions of the above three images using an accurate grid division method;
a first recognition module, which inputs the target region image corresponding to the white light image into a first recognition model to obtain a first target position, a first lesion type, and a first lesion probability;
a second recognition module, which inputs the target region image corresponding to the blue laser imaging image into a second recognition model to obtain a second target position, a second lesion type, and a second lesion probability;
a third recognition module, which inputs the target region image corresponding to the linked color imaging image into a third recognition model to obtain a third target position, a third lesion type, and a third lesion probability;
a target detection analysis module, which adjusts the three images containing the target positions so that their viewing angles and positions are consistent, compares the target positions in the three images, and takes the intersection of the first to third target positions as the final target position; if the first, second, and third lesion types are the same, the final target lesion type is that common type, with a probability equal to the weighted sum of the first, second, and third lesion probabilities; if the first lesion type differs from the second and third lesion types and the second and third lesion types are the same, the final target lesion type is the third lesion type, with a probability equal to the weighted sum of the second and third lesion probabilities; if the first lesion type is the same as the second but different from the third, the weighted sum M of the first and second probabilities is calculated and compared with the third lesion probability, and the lesion type corresponding to the larger value is the final target lesion type; if the first lesion type differs from the second and the first and third lesion types are the same, the final target lesion type is the third lesion type, with a probability equal to the weighted sum of the first and third lesion probabilities; if the first, second, and third lesion types are all different, the final target lesion type is the third lesion type, with a probability equal to the third lesion probability.
The present invention further provides an electronic device, including at least one processor and a memory;
the memory stores computer-executable instructions;
the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the method described in the first aspect above and in its various possible designs.
The present invention further provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the method described in the first aspect above and in its various possible designs.
Compared with the prior art, the beneficial effects of the present invention are as follows:
The present invention uses images of different imaging modes to identify targets in gastroscope images, which overcomes the low accuracy of prior-art approaches that use only one kind of image. For each of the three imaging modes, a recognition model suited to the characteristics of that mode is adopted, and the target lesion type and probability are finally obtained from the outputs of the three models according to their respective characteristics, improving the accuracy and precision of target recognition and better assisting doctors.
The target segmentation adopted by the present invention uses an accurate grid labeling method, which can extract the target region of interest more accurately and reduce the interference of the background image in subsequent recognition.
Brief Description of the Drawings
FIG. 1 is a flowchart of the intelligent target detection method for gastroscope images of the present invention.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Referring to FIG. 1, the present invention provides an intelligent target detection method for gastroscope images.
To achieve the above object, the present invention provides the following scheme:
An intelligent target detection method for gastroscope images:
acquiring a gastroscope white light image, a blue laser imaging image, and a linked color imaging image of the same site of the same patient to be examined, wherein the three images are captured at the same angle;
segmenting the target regions of the above three images using an accurate grid division method;
inputting the target region image corresponding to the white light image into a first recognition model to obtain a first target position, a first lesion type, and a first lesion probability;
inputting the target region image corresponding to the blue laser imaging image into a second recognition model to obtain a second target position, a second lesion type, and a second lesion probability;
inputting the target region image corresponding to the linked color imaging image into a third recognition model to obtain a third target position, a third lesion type, and a third lesion probability;
adjusting the three images containing the target positions so that their viewing angles and positions are consistent, comparing the target positions in the three images, and taking the intersection of the first to third target positions as the final target position; if the first, second, and third lesion types are the same, determining the final target lesion type to be that common type, with a probability equal to the weighted sum of the first, second, and third lesion probabilities;
if the first lesion type differs from the second and third lesion types, and the second and third lesion types are the same, determining the final target lesion type to be the third lesion type, with a probability equal to the weighted sum of the second and third lesion probabilities;
if the first lesion type is the same as the second lesion type but different from the third, calculating the weighted sum M of the first and second probabilities, comparing M with the third lesion probability, and taking the lesion type corresponding to the larger value as the final target lesion type;
if the first lesion type differs from the second lesion type, and the first and third lesion types are the same, determining the final target lesion type to be the third lesion type, with a probability equal to the weighted sum of the first and third lesion probabilities;
if the first, second, and third lesion types are all different, determining the final target lesion type to be the third lesion type, with a probability equal to the third lesion probability.
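The case analysis above can be sketched as a small decision function. This is an illustrative sketch, not the patent's implementation: the weighting coefficients `w` are hypothetical placeholders (the patent does not specify them), and each weighted sum is renormalized over the weights it uses so the result stays a probability.

```python
def fuse_lesion_predictions(t1, p1, t2, p2, t3, p3, w=(0.2, 0.35, 0.45)):
    """Combine the three models' (lesion type, probability) outputs.

    t1..t3: lesion types from the white light, BLI, and LCI models.
    p1..p3: the corresponding lesion probabilities.
    w: hypothetical weighting coefficients (not given in the patent).
    """
    w1, w2, w3 = w
    if t1 == t2 == t3:
        # All three models agree: weighted sum of all three probabilities.
        return t1, (w1 * p1 + w2 * p2 + w3 * p3) / (w1 + w2 + w3)
    if t1 != t2 and t2 == t3:
        # BLI and LCI agree: weighted sum of their probabilities.
        return t3, (w2 * p2 + w3 * p3) / (w2 + w3)
    if t1 == t2 and t1 != t3:
        # White light and BLI agree: compare their weighted sum with LCI.
        m = (w1 * p1 + w2 * p2) / (w1 + w2)
        return (t1, m) if m >= p3 else (t3, p3)
    if t1 != t2 and t1 == t3:
        # White light and LCI agree.
        return t3, (w1 * p1 + w3 * p3) / (w1 + w3)
    # All three disagree: fall back to the LCI model's prediction.
    return t3, p3
```

For example, if all three models report "ulcer" with probabilities 0.8, 0.9, and 0.7, the fused result is "ulcer" with the renormalized weighted-sum probability.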
Optionally, before segmenting the target regions of the above three images using the accurate grid division method, the method further includes preprocessing the images. Preprocessing includes random cropping: a patch is sampled such that the cropped part overlaps the target by at least a certain proportion, and the crop is rescaled to a fixed size; preprocessing may also include random horizontal flipping.
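A minimal sketch of this crop-with-overlap and flip preprocessing, using only NumPy. The minimum overlap ratio, output size, and crop-size range are illustrative assumptions, since the patent leaves these values unspecified.

```python
import random
import numpy as np

def preprocess(img, box, min_overlap=0.5, out_size=224, max_tries=50):
    """Randomly crop so the crop covers at least `min_overlap` of the
    target box's area, rescale to a fixed size, and randomly flip
    horizontally. `box` is (x0, y0, x1, y1) in pixel coordinates.
    """
    h, w = img.shape[:2]
    x0, y0, x1, y1 = box
    box_area = (x1 - x0) * (y1 - y0)
    crop = (0, 0, w, h)  # fall back to the full image if no crop qualifies
    for _ in range(max_tries):
        cw, ch = random.randint(w // 2, w), random.randint(h // 2, h)
        cx, cy = random.randint(0, w - cw), random.randint(0, h - ch)
        # Intersection between the candidate crop and the target box.
        ix = max(0, min(x1, cx + cw) - max(x0, cx))
        iy = max(0, min(y1, cy + ch) - max(y0, cy))
        if box_area > 0 and ix * iy / box_area >= min_overlap:
            crop = (cx, cy, cw, ch)
            break
    cx, cy, cw, ch = crop
    patch = img[cy:cy + ch, cx:cx + cw]
    # Nearest-neighbour rescale to a fixed size (no external dependencies).
    ys = np.arange(out_size) * patch.shape[0] // out_size
    xs = np.arange(out_size) * patch.shape[1] // out_size
    patch = patch[ys][:, xs]
    if random.random() < 0.5:
        patch = patch[:, ::-1]  # random horizontal flip
    return patch
```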
Optionally, segmenting the target regions of the above three images using the accurate grid division method includes: collecting a number of gastroscope white light images, blue laser imaging images, and linked color imaging images of patients, and first obtaining the target image region through target segmentation. The target segmentation adopts an accurate labeling method: the image is divided into n×n grid cells, and a search is performed for the optimal target effective region capable of containing the target image. Over the n×n grid cells, several candidate target effective regions containing the target can be found; for each candidate region, local features are extracted from each of its grid cells, and a score indicating whether it belongs to the target region is computed from these features; the region with the highest score is the segmented optimal effective region.
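One reading of this grid search can be sketched as follows. The per-cell score here is a toy stand-in (mean cell intensity) for the score computed from learned local features, which the patent does not specify; the window size of the candidate region is also an illustrative assumption.

```python
import numpy as np

def best_effective_region(img, n=8, win=3):
    """Divide `img` into an n x n grid, slide a win x win window of grid
    cells over it as candidate effective regions, score each candidate by
    the sum of its per-cell scores, and return the best candidate as pixel
    coordinates (x0, y0, x1, y1).
    """
    h, w = img.shape[:2]
    ch, cw = h // n, w // n
    # Toy per-cell score: mean intensity of the cell (a stand-in for a
    # score derived from learned local features).
    scores = np.array([[img[i * ch:(i + 1) * ch, j * cw:(j + 1) * cw].mean()
                        for j in range(n)] for i in range(n)])
    best, best_score = None, -np.inf
    for i in range(n - win + 1):
        for j in range(n - win + 1):
            s = scores[i:i + win, j:j + win].sum()
            if s > best_score:
                best, best_score = (i, j), s
    i, j = best
    return (j * cw, i * ch, (j + win) * cw, (i + win) * ch)
```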
Optionally, the first recognition model uses Resnet18 as the base network, with an attention module added to the fifth and sixth convolutional layers of Resnet18. The feature F6 output by the sixth convolutional layer is upsampled and fused with the feature F5 output by the fifth convolutional layer to obtain feature F5'; feature F5' is upsampled again and fused with the feature F4 output by the fourth convolutional layer to obtain feature F4'. The feature output by the last convolutional layer of Resnet18 is denoted F. The features F, F6, F5', and F4' are each input into a classification-and-regression sub-network to complete the classification and detection of the target.
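The upsample-and-fuse steps (F6 → F5', F5' → F4') can be illustrated in NumPy with nearest-neighbour 2× upsampling and element-wise addition. Both the fusion operator (addition) and the 2× scale factor are assumptions for illustration; the patent does not state how the feature maps are combined or matched in resolution.

```python
import numpy as np

def upsample2x(f):
    """Nearest-neighbour 2x spatial upsampling of a (C, H, W) feature map."""
    return f.repeat(2, axis=1).repeat(2, axis=2)

def fuse(top, lateral):
    """Upsample the deeper feature and add it to the shallower one
    (assumes matching channel counts, as in an FPN-style top-down path)."""
    return upsample2x(top) + lateral

# Toy feature maps standing in for Resnet18 outputs F4, F5, F6.
F4 = np.ones((16, 16, 16))
F5 = np.ones((16, 8, 8))
F6 = np.ones((16, 4, 4))

F5p = fuse(F6, F5)   # F5' = upsample(F6) + F5
F4p = fuse(F5p, F4)  # F4' = upsample(F5') + F4
```

Each fused map keeps the spatial size of its lateral input, so F, F6, F5', and F4' form a multi-scale set for the classification-and-regression sub-network.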
During training of the first recognition model, the numbers of positive and negative samples are unbalanced. To reduce this imbalance, the model is trained with an optimized cross-entropy loss as the target loss function, as follows:
y is the label: y = 1 indicates that the sample is classified as the target, and y = 0 indicates that it is not; y' is the predicted classification probability; α is a balance parameter; and γ is the rate of adjustment. The loss function is also unbalanced between accurately and inaccurately classified samples: points whose loss is greater than 1 may be noise, and the loss is more sensitive to such points, so the target loss function is further balanced as follows:
Lc(loss) = (n/m)(m|loss| + 1)ln(m|loss| + 1) − n|loss|, for |loss| < 1,
Lc(loss) = β|loss| + D, for |loss| ≥ 1,
where n·ln(m + 1) = β.
Here loss is the value of the loss function before balancing, n and m are adjustment factors, β is a parameter that adjusts the error, and D is a set parameter. By adjusting the loss function in this way, more balanced training can be obtained.
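A NumPy sketch of the two loss terms described above. The first function assumes the "optimized cross-entropy" with balance parameter α and rate γ takes the standard focal-loss form, which is consistent with the α/γ description but is an assumption (the explicit formula is not reproduced in the text here). The second follows the piecewise Lc(loss) equations above, with n and m chosen illustratively and D defaulting to the value that makes the two branches meet at |loss| = 1 (the constraint β = n·ln(m + 1) already matches their slopes there).

```python
import numpy as np

def focal_loss(y, y_pred, alpha=0.25, gamma=2.0, eps=1e-7):
    """Assumed focal-loss form of the balanced cross-entropy: alpha
    balances positives vs. negatives, gamma down-weights easy examples.
    y in {0, 1}; y_pred in (0, 1).
    """
    y_pred = np.clip(y_pred, eps, 1 - eps)
    pos = -alpha * (1 - y_pred) ** gamma * np.log(y_pred)
    neg = -(1 - alpha) * y_pred ** gamma * np.log(1 - y_pred)
    return np.where(np.asarray(y) == 1, pos, neg)

def balanced_loss(loss, n=1.0, m=np.e - 1.0, D=None):
    """Piecewise rebalancing of a raw loss value:
    Lc = (n/m)(m|loss|+1)ln(m|loss|+1) - n|loss|   for |loss| < 1
    Lc = beta*|loss| + D                           for |loss| >= 1
    with beta = n*ln(m+1), which makes the slope continuous at |loss| = 1.
    """
    a = np.abs(loss)
    beta = n * np.log(m + 1.0)
    if D is None:
        # Choose D so the two branches also agree in value at |loss| = 1.
        D = (n / m) * (m + 1.0) * np.log(m + 1.0) - n - beta
    small = (n / m) * (m * a + 1.0) * np.log(m * a + 1.0) - n * a
    large = beta * a + D
    return np.where(a < 1.0, small, large)
```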
Optionally, the second recognition model uses an hourglass network as the backbone. The backbone contains 3 stacked hourglass networks, each composed of 3 convolutional layers with 4×4 kernels and 1 residual block with a skip connection. Before being input into the hourglass networks, the input image is preprocessed with a convolutional layer using a 6×6 kernel and 2 residual blocks. The features output by the hourglass backbone pass through a special pooling layer: the feature map output by the backbone is average-pooled horizontally from right to left to obtain feature map A, and then average-pooled from bottom to top to obtain another feature map B; the corresponding pixel values of these two feature maps are summed to obtain feature map C, which passes through an activation layer and a fully connected layer to output the target prediction.
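The special pooling layer can be sketched as follows, under one reading of the description: A[i, j] is the running average of the row to the right of (and including) position j, B[i, j] the running average of the column below (and including) position i, and C = A + B. This resembles CornerNet-style corner pooling, except that the patent specifies averaging rather than the max operation.

```python
import numpy as np

def directional_avg_pool(feat):
    """feat: (H, W) feature map from the backbone.

    A[i, j] = mean of feat[i, j:]   (horizontal, right-to-left)
    B[i, j] = mean of feat[i:, j]   (vertical, bottom-to-top)
    Returns C = A + B, the pixel-wise sum described in the text.
    """
    h, w = feat.shape
    # Right-to-left running average along each row (suffix means).
    counts_w = np.arange(w, 0, -1)                      # w, w-1, ..., 1
    A = np.cumsum(feat[:, ::-1], axis=1)[:, ::-1] / counts_w
    # Bottom-to-top running average along each column.
    counts_h = np.arange(h, 0, -1)[:, None]
    B = np.cumsum(feat[::-1, :], axis=0)[::-1, :] / counts_h
    return A + B
```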
Optionally, the third recognition model uses VGG16 as its base network, replaces the sixth and seventh fully connected layers with ordinary convolutional layers, and then appends five convolutional layers, denoted conv_a1 through conv_a5; the final average-pooled output is denoted conv_b1. The feature maps of layers conv_a1, conv_a3, conv_a5, and conv_b1 are each used to construct detection boxes, and NMS filtering yields the final target prediction.
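The NMS filtering step mentioned here can be sketched with standard greedy non-maximum suppression (a generic implementation and threshold, not taken from the patent):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) < iou_thresh]
    return keep
```

Applied to the candidate boxes from conv_a1, conv_a3, conv_a5, and conv_b1, this leaves one box per detected target as the final prediction.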
An embodiment of the present invention provides a gastroscope-image intelligent target detection system, comprising:
an imaging module, which acquires a white-light gastroscope image, a blue-laser imaging image, and a linked imaging mode image of the same site of the same patient to be examined, the three images being captured at the same angle;
a target-region segmentation module, which uses a precise grid-division method to segment the target regions of the three images;
a first recognition module, which inputs the target-region image of the white-light image into the first recognition model to obtain a first target position, a first lesion type, and a first lesion probability;
a second recognition module, which inputs the target-region image of the blue-laser imaging image into the second recognition model to obtain a second target position, a second lesion type, and a second lesion probability;
a third recognition module, which inputs the target-region image of the linked imaging mode image into the third recognition model to obtain a third target position, a third lesion type, and a third lesion probability;
a target detection and analysis module, which processes the three images containing the target positions so that their viewing angles and positions are roughly aligned, compares the target positions across the three images, and takes the intersection of the first through third target positions as the final target position. If the first, second, and third lesion types are all the same, the final target lesion type is that shared type, and its probability is the weighted sum of the first, second, and third lesion probabilities. If the first lesion type differs from the second and third, and the second and third lesion types are the same, the final target lesion type is the third lesion type, and its probability is the weighted sum of the second and third lesion probabilities. If the first lesion type is the same as the second but differs from the third, the weighted sum M of the first and second probabilities is computed and compared with the third lesion probability; the lesion type with the larger probability value is the final target lesion type. If the first lesion type differs from the second, and the first and third lesion types are the same, the final target lesion type is the third lesion type, and its probability is the weighted sum of the first and third lesion probabilities. If the first, second, and third lesion types are all different, the final target lesion type is the third lesion type, and its probability is the third lesion probability.
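The decision rules above can be sketched as follows (the function name and the equal default weights are assumptions; the patent does not specify the weighting coefficients):

```python
def fuse_lesion(t1, p1, t2, p2, t3, p3, w=(1/3, 1/3, 1/3)):
    """Fuse the three models' (type, probability) outputs per the rules above."""
    if t1 == t2 == t3:            # all three agree
        return t1, w[0]*p1 + w[1]*p2 + w[2]*p3
    if t2 == t3:                  # only the first disagrees
        return t3, w[1]*p2 + w[2]*p3
    if t1 == t2:                  # only the third disagrees: compare M with p3
        M = w[0]*p1 + w[1]*p2
        return (t1, M) if M > p3 else (t3, p3)
    if t1 == t3:                  # only the second disagrees
        return t3, w[0]*p1 + w[2]*p3
    return t3, p3                 # all three differ: fall back to the third model
```

Note that in every disagreement case the rules defer to the third (linked-imaging-mode) model's type, reflecting the priority the patent gives that modality.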
An embodiment of the present invention provides an electronic device for executing the gastroscope-image target detection method of the above embodiments; its implementation follows the same principle and is not repeated here.
An embodiment of the present invention provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the gastroscope-image target detection method of any of the above embodiments.
The storage medium containing computer-executable instructions in the embodiments of the present invention can be used to store the computer-executable instructions of the gastroscope-image target detection method provided in the foregoing embodiments; its implementation follows the same principle and is not repeated here.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional modules above is only illustrative; in practice, the functions may be assigned to different functional modules as needed, i.e., the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. For the specific working process of the apparatus described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
Finally, it should be noted that the above embodiments only illustrate, and do not limit, the technical solutions of the present application. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210831722.6A CN115205520A (en) | 2022-07-14 | 2022-07-14 | Intelligent target detection method, system, electronic device and storage medium for gastroscope images |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115205520A true CN115205520A (en) | 2022-10-18 |
Family
ID=83582770
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116310282A (en) * | 2023-03-15 | 2023-06-23 | 郑州大学 | Method and system for identifying lesions in thoracoscopic images |
CN116784827A (en) * | 2023-02-14 | 2023-09-22 | 安徽省儿童医院 | Digestive tract ulcer depth and area measuring and calculating method based on endoscope |
CN117557776A (en) * | 2023-11-10 | 2024-02-13 | 朗信医疗科技(无锡)有限公司 | Multi-mode target detection method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||