CN112164026A - Endoscope polyp real-time detection method, system and terminal - Google Patents

Endoscope polyp real-time detection method, system and terminal

Info

Publication number
CN112164026A
CN112164026A (application CN202010902144.1A)
Authority
CN
China
Prior art keywords
polyp
image
endoscope
detection model
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010902144.1A
Other languages
Chinese (zh)
Other versions
CN112164026B (en)
Inventor
黄晓霖
何凡
周璐
姚乐宇
徐金田
陈思哲
彭海霞
杨杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN202010902144.1A priority Critical patent/CN112164026B/en
Publication of CN112164026A publication Critical patent/CN112164026A/en
Application granted granted Critical
Publication of CN112164026B publication Critical patent/CN112164026B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10068Endoscopic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30028Colon; Small intestine

Landscapes

  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)
  • Endoscopes (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

The invention provides a method, a system and a terminal for real-time endoscopic polyp detection, comprising: acquiring an output image from the endoscope equipment; preprocessing the input image; calling an endoscope polyp detection model to detect the preprocessed image; after manual correction and labeling, inputting the images in which polyps were found into a generator model to output simulated images, inputting the polyp images and the simulated images into an attacker to obtain perturbed polyp images and perturbed simulated images, and storing the perturbed images and simulated images that the current detection model cannot identify into a training database; and periodically executing a self-updating strategy that retrains the endoscope polyp detection model using the training database. The invention acquires endoscope video in real time, detects polyps, and outputs the video before and after lesion detection, which facilitates observation by doctors. It is simple to operate and imposes no technical threshold in use. The self-expanding data and timed model-update function make later maintenance simple while keeping accuracy high.

Description

Endoscope polyp real-time detection method, system and terminal
Technical Field
The invention relates to the technical field of medical image processing, and in particular to a method, a system and a terminal for real-time endoscopic polyp detection.
Background
Colorectal cancer currently has the third-highest incidence among men and the second-highest among women worldwide. The predominant examination method today is endoscopy. The procedure is invasive, expensive and time-consuming, and painful for the patient. In recent years, neural networks have achieved excellent performance on recognition tasks and are efficient compared with manual inspection. Applying them to endoscopic procedures could shorten examination time, relieving patient discomfort on the one hand and easing medical workload on the other.
However, detection and recognition methods based on neural networks generally depend on large volumes of labeled data, and manual labeling is costly. Polyp images pose the further problem that laypeople find them difficult to label; correct recognition requires professionals, which further increases labor costs. Efficient data enhancement algorithms are therefore urgently needed.
At present, common data enhancement methods operate on surface-level features, expanding data through rotation, stretching and similar transforms; these achieve a certain effect but cannot meet practical needs. Among patents, the invention with publication number CN110335230A discloses a real-time detection method and device for endoscopic image lesions, but develops no additional technology for data expansion. The invention with publication number CN110363751A discloses a large-intestine endoscopic polyp detection method based on a generative cooperative network, in which a cooperator model denoises an input black-and-white labeled map so that noisy intermediate black-and-white labeled maps can serve as training-set data, achieving a degree of data self-expansion. However, such denoising-based data amplification lacks physical meaning and interpretability.
Many recent studies on data enhancement for endoscopy indicate that generic data generation cannot guarantee the presence of a lesion in the generated picture and contributes little to improving a detection model. Researchers have therefore proposed data enhancement based on synthesized lesions, with some success. It is worth noting, however, that synthesis-based methods require manual intervention, including extracting the lesion and designing the synthesis location according to the angles of the lesion and the background; moreover, the lesion in the synthesized picture looks less natural and is easily detected as a foreign object. This limits their help in improving the detection network.
Disclosure of Invention
In view of the defects in the prior art, the invention aims to provide an endoscope polyp real-time detection method based on cooperation of a plurality of deep convolutional networks.
According to a first aspect of the present invention, there is provided an endoscopic polyp real-time detection method, comprising:
acquiring an output image of endoscope equipment;
preprocessing the image, cutting out irrelevant information in the image, and performing image contrast enhancement and exposure removal;
inputting the preprocessed image into an endoscope polyp detection model for polyp detection and outputting a result;
retraining the endoscope polyp detection model according to the output result, wherein the retrained endoscope polyp detection model is used as an endoscope polyp detection model for next detection;
wherein retraining the endoscope polyp detection model comprises:
the generator model outputs simulated images: N images in which the polyp position is unchanged, the background is changed, and the polyp texture differs slightly;
inputting the polyp images output by the endoscope polyp detection model, together with those simulated images that the current detection model can identify, into an attacker to obtain perturbed polyp images and perturbed simulated images, where a perturbed image is an image that the current endoscope polyp detection model cannot identify;
storing the perturbed images, and the simulated images that the current detection model cannot identify, into a training database; and periodically retraining the endoscope polyp detection model using the training database.
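Taken together, this detect, generate, attack and store cycle can be sketched as follows; `detector`, `generator` and `attacker` are hypothetical stand-ins for the three models described above, not an API defined by the invention:

```python
def self_update_cycle(frame, detector, generator, attacker, train_db):
    """One cycle of the self-updating data collection: detect polyps,
    generate N simulated variants (polyp position fixed, background
    varied), perturb any image the current detector still recognises,
    and store the samples the detector can no longer identify.
    detector/generator/attacker are illustrative stand-ins."""
    if not detector(frame):
        return
    candidates = [frame] + generator(frame)   # real image + N simulations
    for img in candidates:
        if detector(img):                     # still recognised: attack it
            img = attacker(img)
        if not detector(img):                 # now fooled: a hard example
            train_db.append(img)
```

In deployment the stand-ins would be replaced by the detection network, the generator network and the attack algorithm; the cycle only banks samples that actually defeat the current detector, which is what makes them valuable for retraining.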
Optionally, the generator model outputs a simulated view that is input to the endoscopic polyp detection model, wherein:
if no polyp is detected, the image is already a simulated image in which the polyp position is unchanged, the background is changed and the polyp texture differs slightly, and which the endoscope polyp detection model cannot identify;
if the polyp is detected, inputting the simulated image into an attacker, calculating to obtain attack disturbance, and adding the attack disturbance to the original simulated image to obtain a disturbed simulated image;
inputting the polyp image output by the endoscope polyp detection model into an attacker, calculating to obtain attack disturbance, and adding the attack disturbance to the original polyp image to obtain a disturbed polyp image;
and storing the disturbed polyp image and the corresponding simulation diagram into a training database.
Optionally, the generator model is based on a single-picture adversarial generative network, the network comprising two sub-networks:
a generator network for generating the simulated image: its input is an image, called the original image, and its output is an image of equal size, called the simulated image;
and the discriminator network is used for discriminating whether the input image and the original image belong to the same type.
Optionally, the generator network and the discriminator network both use the same pyramid full convolution network structure;
the optimization problem for the generator network and the discriminant network training is as follows:
Figure BDA0002660122840000031
wherein Dn,GnA discriminator network and a generator network, respectively, against loss functions
Figure BDA0002660122840000032
Using a WGAN-GP loss function; reconstruction loss function
Figure BDA0002660122840000033
The following were used:
Figure BDA0002660122840000034
wherein
Figure BDA0002660122840000035
The output of the n-th layer generator network, x, with the input being the reconstructed output of the n + 1-th layer and 0 noise addednAn artwork representing the nth layer; (A)ROIrepresenting the region of interest in map a, i.e., the polyp location in the input label; (A)Otherrepresenting positions in the diagram a other than the region of interest; beta is a hyperparameter.
Optionally, the simulated images from the generator model are rendered unrecognizable by the endoscopic polyp detection model through an attack, where the attack perturbation is obtained from the following optimization problem:

$$\min_{x'}\ \sum_{i\in B} C_{i,\mathrm{polyp}}(x')$$

where $B$ is the set of polyp prediction boxes detected by the endoscopic polyp detection model, $C_{i,\mathrm{polyp}}(x')$ is the confidence that the $i$-th prediction box belongs to a polyp, $x$ is the input image, i.e., the simulated image output by the generator, and $x'$ is an optimization variable of the same size as $x$, initialized at $x$ and kept close to it so that the perturbation $x'-x$ stays small.
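A minimal sketch of such a confidence-lowering attack follows; the signed-gradient steps, the L-infinity budget `eps` and the iteration count are illustrative assumptions, and `grad_fn` stands in for backpropagating the summed polyp confidence through the detection network:

```python
import numpy as np

def confidence_attack(x, grad_fn, eps=8 / 255, steps=10):
    """Iteratively perturb image x to lower the summed polyp confidence,
    keeping the perturbation inside an L-infinity ball of radius eps so
    the change stays visually imperceptible. grad_fn(x) returns the
    gradient of the summed confidence w.r.t. x (a stand-in for autograd
    through the detection network)."""
    x_adv = x.astype(float).copy()
    alpha = eps / steps
    for _ in range(steps):
        x_adv = x_adv - alpha * np.sign(grad_fn(x_adv))  # descend on confidence
        x_adv = np.clip(x_adv, x - eps, x + eps)         # stay in the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)                 # stay a valid image
    return x_adv
```

With a real detector, `grad_fn` would be obtained by automatic differentiation of $\sum_{i\in B} C_{i,\mathrm{polyp}}$; the eps-ball projection is what keeps the perturbed image visually identical to the original.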
Optionally, retraining the endoscope polyp detection model is performed periodically using the training database, where, once the number of samples in the training database has grown by a set ratio since the last training, the next retraining of the endoscope polyp detection model is performed.
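This timing rule reduces to a one-line check; the 20% growth ratio below is an assumed example, since the text leaves the ratio as a setting:

```python
def should_retrain(db_size_now, db_size_at_last_training, growth_ratio=0.2):
    """Self-updating trigger: retrain once the training database has
    grown by the set ratio since the previous training run. The 20%
    default is an illustrative assumption, not a value from the text."""
    return db_size_now >= db_size_at_last_training * (1.0 + growth_ratio)
```

Tying retraining to database growth rather than wall-clock time means the model is only refreshed when enough new hard examples have accumulated to make a training run worthwhile.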
Optionally, the endoscope polyp detection model is based on a deep convolutional neural network, is trained on a large amount of data labeled by professional doctors, and can directly predict from an image whether polyps exist and where they are.
According to a second aspect of the present invention, there is provided an endoscopic polyp real-time detection system comprising:
a real-time endoscope image acquisition and polyp detection module which acquires an output image of an endoscope device, preprocesses the image, inputs the preprocessed image into an endoscope polyp detection model for polyp detection and outputs a result;
the model upgrading module is used for retraining the endoscope polyp detection model according to the output result, and the retrained endoscope polyp detection model is used as an endoscope polyp detection model for next detection; wherein retraining the endoscope polyp detection model comprises:
a generator model, which outputs simulated images: N images in which the polyp position is unchanged, the background is changed, and the polyp texture differs slightly;
an attacker, into which the polyp images output by the endoscope polyp detection model, together with those simulated images that the current detection model can identify, are input to obtain perturbed polyp images and perturbed simulated images, where a perturbed image is an image that the current endoscope polyp detection model cannot identify;
storing the perturbed images, and the simulated images that the current detection model cannot identify, into a training database; and periodically retraining the endoscope polyp detection model using the training database.
In the system of the present invention, the retraining of the endoscopic polyp detection model employs the corresponding technique of the above-described method.
According to a third aspect of the present invention, there is provided an endoscopic polyp real-time detection terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor being operable to execute the endoscopic polyp real-time detection method when executing the program.
The invention uses the generator model to generate simulated images in which the lesion position matches the original image, saving the cost of manual re-labeling, while the generated images are more realistic, finer, and have a degree of interpretability. Meanwhile, the introduction of an attack algorithm disguises the generated images so that the existing detection network cannot identify them, and this form of data enhancement improves the robustness of the detection model.
Compared with the prior art, the invention has at least one of the following beneficial effects:
the invention can detect polyp without manual intervention for the input endoscope image and output the reference position of the polyp to assist medical personnel in diagnosis and treatment. Meanwhile, the dependence on high-dose manual marking data is a common problem of the current algorithm based on artificial intelligence, and the data self-amplification method adopts the generator model to perform data self-amplification, so that the data volume required by early-stage starting and the workload of later-stage model improvement can be greatly reduced.
The generator model in the invention generates simulated images of the input images whose polyp positions are unchanged, requiring no additional manual labeling and greatly saving labor costs.
Through attack perturbation, the amplified images are prevented from having their polyps detected by the existing detection method, and using such images better improves the robustness of the detection method.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a general flow chart of a method in an embodiment of the invention;
FIG. 2 is a flow chart illustrating retraining an endoscopic polyp detection model in accordance with an embodiment of the present invention;
FIG. 3 is a flow chart of an embodiment of the present invention;
FIG. 4 is a functional block diagram of a system used in an embodiment of the present invention;
FIG. 5 is a schematic diagram of the data amplification of a generator model according to an embodiment of the present invention;
FIG. 6 shows the result of polyp detection in an embodiment of the present invention;
fig. 7 is a diagram of a polyp and a simulation thereof in an embodiment of the present invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit it in any way. It should be noted that variations and modifications can be made by persons skilled in the art without departing from the spirit of the invention; all such variations fall within the scope of the present invention.
In order to relieve medical pressure of hospital enteroscope polyp detection, improve detection precision and shorten detection time, the embodiment of the invention provides a real-time endoscope polyp detection method, which is shown in fig. 1 and is a general flow chart of the method of the embodiment of the invention, and specifically comprises the following steps:
s1, acquiring an output image of the endoscope equipment; in the step, the endoscope image is collected in real time;
s2, preprocessing the image, cutting out irrelevant information in the image, and performing image contrast enhancement and exposure removal to make the endoscope polyp detection model focus more on the characteristics of polyps so as to improve the accuracy of subsequent detection;
s3, inputting the preprocessed image into an endoscope polyp detection model for polyp detection and outputting a result, wherein the endoscope polyp detection model is a common polyp detection model;
and S4, retraining the endoscope polyp detection model according to the output result, and using the retrained endoscope polyp detection model as the endoscope polyp detection model for next detection, thereby realizing the upgrade of the model and further improving the detection accuracy.
In order to provide a large number of training samples and to solve the problems of manual data labeling, complex data acquisition, troublesome later upgrading of the endoscope polyp detection model, and high labor cost, this embodiment designs a device that automatically stores valuable data encountered during use and effectively amplifies the labeled data without manual participation. Specifically, fig. 2 is a flowchart of retraining the endoscope polyp detection model according to an embodiment of the present invention; referring to fig. 2, retraining the endoscope polyp detection model comprises:
s401, inputting the image with the polyp in the output result into a generator model after artificial correction and labeling;
s402, outputting N simulated images with unchanged polyp positions, changed backgrounds and slightly different polyp textures by a generator model, and obtaining a polyp image and a simulated image thereof which cannot be identified by the current endoscope polyp detection model by an attack algorithm;
s403, storing a polyp image which cannot be identified by the current endoscope polyp detection model and a simulation image corresponding to the polyp image into a training database;
and S404, regularly retraining the endoscope polyp detection model by adopting the training database.
Referring to fig. 5, in S402 the obtained simulated image is input to the endoscope polyp detection model. If no polyp is detected, the image is already a simulated image in which the polyp position is unchanged, the background is changed and the polyp texture differs slightly, and which the endoscope polyp detection model cannot identify. If a polyp is detected, the simulated image is input to the attacker, the attack perturbation is computed and added to the original simulated image to obtain a perturbed simulated image. The polyp image output by the endoscope polyp detection model is likewise input to the attacker, the attack perturbation is computed and added to the original polyp image to obtain a perturbed polyp image. The perturbed polyp images and the corresponding simulated images are stored in the training database.
Real-time acquisition of endoscope images combined with polyp detection efficiently detects, in real time, whether a polyp is present in the current view and where it is, which is convenient for the doctor to observe. The simulated images generated by the generator model expand the data, and the timed-update function keeps accuracy high. The detection results can help doctors improve clinical diagnosis efficiency, relieve patient discomfort and ease the workload of medical staff. To reduce false positives, the method applies targeted data preprocessing to regions prone to spurious polyp detections, such as overexposed areas, making the network attend more to the characteristics of polyps.
In a preferred embodiment, the endoscopic polyp detection model may use a common deep convolutional neural network such as YOLO or SSD. To improve network speed while balancing the accuracy and real-time requirements of the model, the invention adopts the YOLOv3-tiny model, which removes the residual layers and uses only two YOLO output layers at different scales with a reduced number of prior boxes, ensuring that the forward pass of the network runs in real time.
In a preferred embodiment, the generator model may use an adversarial generative convolutional neural network, preferably a single-picture-based adversarial generative network. The model comprises a generator network, which generates the simulated image, and a discriminator network. The input of the generator network is a single image, called the original image, and the output is a single image of equal size, called the simulated image. The discriminator network judges whether an input image belongs to the same type as the original image. For example, in one embodiment the generator network comprises 9 convolutional networks in total; each takes as input the upsampled output of the previous convolutional network plus a scaled noise map, and outputs an image at the same scale. The topmost convolutional network outputs at the resolution of the original image, while the bottommost outputs at the original resolution down-sampled 8 times; the input of the bottommost network is a noise map obtained through training. Correspondingly, the discriminator network comprises 9 convolutional networks; each takes the output of the corresponding generator layer and outputs 0 or 1, where 0 indicates the input image is not of the same type as the original image and 1 indicates that it is.
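The scale relationship described here (9 layers, full resolution at the top, 8x down-sampling at the bottom) can be sketched as follows; geometric spacing of the intermediate scales is an assumption, since the text fixes only the two endpoints:

```python
def pyramid_resolutions(h, w, layers=9):
    """Per-layer resolutions of the generator pyramid: layer 0 works at
    the original resolution and the last layer at 1/8 of it; the
    intermediate scales are spaced geometrically (an assumption, as the
    text specifies only the endpoints)."""
    r = (1 / 8) ** (1 / (layers - 1))   # per-layer downscale factor
    return [(max(1, round(h * r ** n)), max(1, round(w * r ** n)))
            for n in range(layers)]
```

For the 352 x 384 crops used in the embodiment, this yields a coarsest scale of 44 x 48, which is where the trained noise map is injected.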
In particular, the original single-picture-based adversarial generative network is unsupervised; the embodiment of the invention turns it into supervised learning by modifying the training loss. In the embodiment, the generator model receives the polyp position information from the input label, so that the polyp position in the generated simulated image stays basically consistent with the original image.
In a preferred embodiment, the output image of the endoscope device can be acquired through a capture card: the capture card is connected to the endoscope camera and the computer, the capture-card driver is installed on the computer, and the endoscope images captured by the camera are read by calling interfaces of the capture-card SDK. The program calls the SDK interface to check the availability of the input signal and, in each execution cycle, checks whether a new video frame has arrived in the video input buffer; if so, the new frame is converted into an RGB image with 8-bit depth per channel and sent to the preprocessing stage.
In a preferred embodiment, preprocessing of the acquired endoscope images mainly includes cropping away irrelevant information, image contrast enhancement, and de-exposure. De-exposure comprises three steps: first, marking the exposed area, by iterating over every pixel in the image and marking with blue any pixel whose value in an RGB channel exceeds a specific threshold; second, expanding the exposed area, by also marking every point within a certain radius of each exposed point; third, filling the exposed area, by interpolating from the valid points to the left and right of the exposed area to obtain the value of the current exposed point, and, when no valid value exists, substituting the point above it.
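A sketch of the three de-exposure steps on a single-channel image follows; the threshold and radius values are illustrative assumptions (the patent applies the threshold per RGB channel), and `np.roll` wraps at the borders, which is acceptable for a sketch but not for production use:

```python
import numpy as np

def remove_exposure(img, thresh=240, radius=2):
    """Three-step de-exposure sketch: 1) mark pixels above the
    threshold, 2) dilate the mask by `radius`, 3) fill each masked run
    by interpolating the valid pixels to its left and right in the row,
    falling back to the row above when a row has no valid pixel."""
    img = img.astype(float)
    mask = img > thresh                      # step 1: mark exposed pixels
    grown = mask.copy()                      # step 2: expand within radius
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            grown |= np.roll(np.roll(mask, dy, axis=0), dx, axis=1)
    out = img.copy()                         # step 3: row-wise fill
    cols = np.arange(img.shape[1])
    for r in range(img.shape[0]):
        bad = grown[r]
        if bad.all():                        # no valid pixel: copy row above
            if r > 0:
                out[r] = out[r - 1]
            continue
        out[r, bad] = np.interp(cols[bad], cols[~bad], img[r, ~bad])
    return out
```

Dilating the mask before filling is what removes the dim halo around a specular highlight, not just its saturated core.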
In another embodiment of the present invention, there is provided an endoscopic polyp real-time detection system comprising:
a real-time endoscope image acquisition and polyp detection module, which acquires an output image of endoscope equipment, preprocesses the image, inputs the preprocessed image into an endoscope polyp detection model for polyp detection and outputs a result;
the model upgrading module is used for retraining the endoscope polyp detection model according to the output result, and the retrained model serves as the endoscope polyp detection model for the next detection. Retraining the endoscope polyp detection model comprises: after manual correction and labeling, inputting the images in which polyps were found into the generator model; the generator model outputs simulated images, N images in which the polyp position is unchanged, the background is changed and the polyp texture differs slightly; inputting the polyp images output by the endoscope polyp detection model, together with those simulated images that the current detection model can identify, into an attacker to obtain perturbed polyp images and perturbed simulated images, where a perturbed image is an image that the current endoscope polyp detection model cannot identify; and storing the perturbed images and the simulated images that the current detection model cannot identify into the training database.
In the real-time detection system for endoscope polyps of the above embodiment, specific implementation of each module may refer to the technology adopted in the steps of the real-time detection method for endoscope polyps, and details are not described here.
In another embodiment of the present invention, there is provided an endoscope polyp real-time detection terminal, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor being operable to execute the endoscope polyp real-time detection method of any of the above embodiments when executing the program.
For better understanding of the technical solution of the present invention, the following description is provided with a specific application example, and it should be understood that the specific application example is not intended to limit the present invention.
Referring to fig. 4, in this specific application example, the hardware used includes: the system comprises an endoscope shooting device, a collecting card, a computer and a display device. The output interface of the endoscope shooting equipment is connected to the input interface of the acquisition card, the output interface of the acquisition card is connected to the input interface of the computer, and the output interface of the computer is connected to the input interface of the display equipment. When software is not used, the output interface of the endoscope shooting device can also be directly connected to the input interface of the display device.
In the specific application example, the endoscopic polyp detection model is constructed on a deep convolutional neural network; its structure can be an improved YOLOv3-tiny (Redmon J, Farhadi A. YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767, 2018), the successor to the YOLOv2 detection network Redmon et al. published at CVPR, implemented in the C++ language with the darknet framework. It is trained on a large amount of physician-labeled data, normalized as bounding boxes generated from polyp masks. During training of the endoscope detection model, two data enhancement methods, image rotation and image brightness adjustment, diversify the training data so that the trained model detects well on images at different angles and brightness levels. Image rotation rotates the image by 90, 180 and 270 degrees to cover images taken at different angles; image brightness is changed to 33%, 66% and 133% of the initial value to accommodate different ambient brightness. The endoscope polyp detection model obtained by training this network can directly predict from an image whether polyps exist and where they are. When training the model, the number of target classes is 1, with polyp labeled 0; images obtained from the capture card are cropped to a uniform size of 352 x 384; the batch size is set to 64; the prior boxes obtained from the training data by KNN are 29,30, 53,57, 65,90, 99,74, 142,108 and 165,147; everything other than polyps is treated as background and not predicted. The number of output filters of the yolo layer is 18.
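The rotation and brightness augmentations described above can be sketched directly:

```python
import numpy as np

def augment(img):
    """Diversify a training image as described: rotations of 90/180/270
    degrees plus brightness scaled to 33%, 66% and 133% of the original
    (clipped back into the valid pixel range)."""
    rotations = [np.rot90(img, k) for k in (1, 2, 3)]        # 90/180/270
    brightness = [np.clip(img.astype(float) * s, 0, 255).astype(img.dtype)
                  for s in (0.33, 0.66, 1.33)]               # 33/66/133%
    return rotations + brightness
```

Each source image thus yields six extra training samples, covering different camera angles and ambient brightness without any relabeling, since neither transform moves the polyp annotation out of a known mapping.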
In one embodiment, the preprocessed image is input into the YOLOv3-tiny network (the endoscope polyp detection model). The network structure is shown in Table 1, where conv denotes a convolutional layer, max a max-pooling layer, route a routing (pass-through) layer, upsample an upsampling layer, and yolo a network prediction layer. If the yolo output carries a label, a polyp is present; if not, no polyp is present. The result is stored in a buffer.
Table one: YOLOv3-tiny network structure, taking a 416 × 416 × 3 input as an example (the standard darknet YOLOv3-tiny configuration; with the single polyp class, the convolutions before each yolo layer use 18 filters)

Layer  Type      Filters  Size/Stride  Input         Output
0      conv      16       3×3/1        416×416×3     416×416×16
1      max                2×2/2        416×416×16    208×208×16
2      conv      32       3×3/1        208×208×16    208×208×32
3      max                2×2/2        208×208×32    104×104×32
4      conv      64       3×3/1        104×104×32    104×104×64
5      max                2×2/2        104×104×64    52×52×64
6      conv      128      3×3/1        52×52×64      52×52×128
7      max                2×2/2        52×52×128     26×26×128
8      conv      256      3×3/1        26×26×128     26×26×256
9      max                2×2/2        26×26×256     13×13×256
10     conv      512      3×3/1        13×13×256     13×13×512
11     max                2×2/1        13×13×512     13×13×512
12     conv      1024     3×3/1        13×13×512     13×13×1024
13     conv      256      1×1/1        13×13×1024    13×13×256
14     conv      512      3×3/1        13×13×256     13×13×512
15     conv      18       1×1/1        13×13×512     13×13×18
16     yolo
17     route 13
18     conv      128      1×1/1        13×13×256     13×13×128
19     upsample           ×2           13×13×128     26×26×128
20     route 19 8                                    26×26×384
21     conv      256      3×3/1        26×26×384     26×26×256
22     conv      18       1×1/1        26×26×256     26×26×18
23     yolo
In this specific application example, the generator model is constructed based on a deep learning network. Preferably, it may be the single-image adversarial generative network published in ICCV by Shaham et al. in 2019 (Shaham T R, Dekel T, Michaeli T. SinGAN: Learning a Generative Model From a Single Natural Image [C]. International Conference on Computer Vision, 2019: 4570-4580), implemented in the Python language with the PyTorch framework. The generator model includes two sub-networks: a generator network and a discriminator network. The training data required by this adversarial generative network is a single picture, needs no labeling, and the learning is unsupervised. This example modifies the adversarial generative network in two details: a new labeling interface is added and the training loss function is changed, turning it into supervised learning, so that by learning the polyp position from the label, the generator model keeps the polyp position in the generated simulated image essentially consistent with the original image.
Preferably, the optimization problem for training the generator network and the discriminator network is:

min_{G_n} max_{D_n} L_adv(G_n, D_n) + α · L_rec(G_n)

wherein D_n and G_n are respectively the discriminator and generator models of the n-th pyramid layer; α is a hyperparameter, and α is 100 in the embodiment of the present invention. The adversarial loss function L_adv(G_n, D_n) uses the WGAN-GP loss function. The reconstruction loss function L_rec(G_n) is modified as follows:

L_rec(G_n) = β · ‖(x̃_n − x_n)_ROI‖² + ‖(x̃_n − x_n)_Other‖²

wherein x̃_n denotes the n-th layer generator output obtained by inputting the upsampled reconstructed output of the (n+1)-th layer with 0 noise added, and x_n denotes the original image at the n-th layer; (A)_ROI denotes the region of interest in map A, which in the present invention is the polyp location in the input label; (A)_Other denotes the positions in map A other than the region of interest; β is a hyperparameter, and β is 100 in the embodiment of the present invention.
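The ROI-weighted reconstruction term above can be sketched numerically as follows. This is an illustrative numpy sketch only: the function name, the mask convention, and the absence of normalization are assumptions; the patent's actual loss operates on PyTorch tensors inside the SinGAN training loop.

```python
import numpy as np

def roi_weighted_rec_loss(gen, target, roi_mask, beta=100.0):
    """Squared reconstruction error weighted by beta inside the
    labelled polyp region (ROI) and by 1 elsewhere, so the generator
    reproduces the polyp faithfully while the background may vary."""
    err = (gen - target) ** 2
    roi = np.asarray(roi_mask, dtype=bool)
    return beta * err[roi].sum() + err[~roi].sum()
```

With beta = 100, a unit error on a polyp pixel costs as much as a hundred background pixels, which is what drives the "polyp unchanged, background changed" behavior of the simulated images.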
In this specific application example, the attacker is an optimization model, wherein the attack perturbation is obtained by solving the following optimization problem:

min_{x'} Σ_{i∈B} C_{i,polyp}(x') + λ · ‖x' − x‖²

wherein B is the set of polyp prediction boxes detected by the endoscope polyp detection model, C_{i,polyp} denotes the confidence that the i-th prediction box belongs to a polyp, x is the input image, i.e., the simulated image output by the generator, x' is an optimization variable of the same size as x, and λ is a hyperparameter that keeps x' close to x while the polyp confidences are suppressed.
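A toy sketch of this attack optimization is shown below. The real objective uses the detector's box confidences C_{i,polyp}; here each "box" confidence is a stand-in sigmoid(w_i · x'), and the solver (plain gradient descent), step size, and λ are all assumptions, since the patent does not specify the optimizer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attack(x, conf_weights, steps=500, lr=0.2, lam=0.1):
    """Find x' close to x that minimises the summed 'polyp'
    confidences sum_i sigmoid(w_i . x') plus lam * ||x' - x||^2."""
    xp = x.astype(np.float64).copy()
    for _ in range(steps):
        grad = np.zeros_like(xp)
        for w in conf_weights:          # d/dx' sigmoid(w.x') = s(1-s)w
            s = sigmoid(w @ xp)
            grad += s * (1 - s) * w
        grad += 2 * lam * (xp - x)      # proximity penalty gradient
        xp -= lr * grad
    return xp
```

On a toy point with one confident "box", the attack drives the confidence down while the perturbed input stays near the original.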
Referring to fig. 3, a flowchart of the method for real-time detection of endoscope polyps in this application example, the method may be executed according to the following steps:
step 1, initializing an endoscope polyp detection model and a generator model;
step 2, checking whether a video input buffer area has an input new video frame (an output image of the endoscope device), and if so, entering step 3; if not, repeating the step 2;
step 3, preprocessing an input image;
step 4, inputting the preprocessed image into an endoscope polyp detection model;
step 5, displaying the image and the detection result through the video display window according to the return result of the step 4;
step 6, outputting the polyp position stored in the temporary storage area in step 5 together with the original image, and, after manual correction and labeling, inputting them into the generator model;
step 7, outputting N (for example, 10) simulation graphs with unchanged polyp positions and changed backgrounds by the generator model for each input image;
step 8, inputting the simulated images into the endoscope polyp detection model; for each image in which a polyp can still be detected, inputting it into the attacker, obtaining the attack perturbation through optimization, and adding the perturbation to that simulated image;
step 9, inputting the original image into the attacker, obtaining the attack perturbation through optimization, and adding the perturbation to the original polyp image;
step 10, storing the perturbed simulated images and the perturbed original image into the training database; when the number of samples in the database has grown to 110% of the number at the previous training, the endoscope polyp detection model is retrained.
In a specific application example, steps 1-5 and steps 6-9 run in different threads, with steps 6-9 triggered when the system is idle. The endoscope polyp detection model together with input and output processing in this application example achieves a real-time processing speed of 25 frames per second.
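The two-thread split described above can be sketched as follows. All names are illustrative; the patent's implementation uses C++/darknet with an acquisition-card buffer, while this sketch uses Python queues: a detection thread handles steps 1-5 on each new frame, and model-update work (steps 6-9) is deferred to a second thread.

```python
import queue
import threading

def run_pipeline(frames, detect, augment_and_store):
    """Feed frames through a detection thread; frames with detected
    polyps are handed to an update thread for steps 6-9."""
    frame_q = queue.Queue()
    pending_updates = queue.Queue()
    results = []

    def detection_thread():
        while True:
            frame = frame_q.get()
            if frame is None:              # sentinel: video input ended
                pending_updates.put(None)
                break
            hit = detect(frame)            # steps 3-4: preprocess + detect
            results.append((frame, hit))
            if hit:                        # defer steps 6-9 for later
                pending_updates.put(frame)

    def update_thread():
        while True:
            frame = pending_updates.get()
            if frame is None:
                break
            augment_and_store(frame)       # steps 6-9: simulate, attack, store

    t1 = threading.Thread(target=detection_thread)
    t2 = threading.Thread(target=update_thread)
    t1.start(); t2.start()
    for f in frames:
        frame_q.put(f)
    frame_q.put(None)
    t1.join(); t2.join()
    return results
```

Decoupling the two queues is what lets detection keep its 25 frames/second pace while the slower augmentation/attack work proceeds in the background.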
In step 2 of the above specific application example, acquisition of the output image of the endoscope apparatus may be realized by a video signal acquisition thread. Video signal acquisition runs in an independent thread. In the initialization stage, the driver interface is used to check whether the acquisition card hardware is properly connected; the acquisition card is configured by calling the low-level interface according to the signal source set by the user; the format of the video input signal is obtained by calling the low-level driver; an input buffer is reserved in memory according to the video signal format, its effective data length is initialized to zero, and its base address is passed to the driver. In subsequent operation, the low-level driver continuously transfers the video frames captured by the acquisition card into memory according to this configuration. Whenever a new video frame arrives, the signal acquisition thread stores it into the reserved input buffer and updates the buffer's effective data length.
In step 3 of the above specific application example, the obtained output image of the endoscope apparatus is preprocessed, including cropping of irrelevant information, image contrast enhancement, and image de-exposure. Cropping of irrelevant information removes the image-edge regions that record the examination time and other non-image information. Image contrast enhancement makes polyp boundaries more visible. Image de-exposure finds over-exposed areas and their neighborhoods and fills them with nearby colors, so that the convolutional network focuses on the features of the polyp itself rather than on surface reflections.
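The de-exposure step (also detailed in claim 8: mark, expand, fill) can be sketched on a single-channel image as follows. The threshold, radius, and the wrap-around behavior of the shift-based dilation are illustrative assumptions, not values from the patent.

```python
import numpy as np

def de_expose(img, thresh=240, radius=1):
    """(1) mark pixels above `thresh` as exposed, (2) grow the mask
    by `radius` via simple shifts, (3) fill each exposed pixel by
    interpolating the nearest valid pixels to its left and right."""
    mask = img > thresh
    grown = mask.copy()
    for dy in range(-radius, radius + 1):        # step 2: dilation
        for dx in range(-radius, radius + 1):
            grown |= np.roll(np.roll(mask, dy, axis=0), dx, axis=1)
    out = img.astype(np.float32)
    for y, x in zip(*np.nonzero(grown)):         # step 3: fill
        valid = np.nonzero(~grown[y])[0]
        left, right = valid[valid < x], valid[valid > x]
        if left.size and right.size:             # linear interpolation
            l, r = left[-1], right[0]
            w = (x - l) / (r - l)
            out[y, x] = (1 - w) * img[y, l] + w * img[y, r]
        elif left.size:
            out[y, x] = img[y, left[-1]]
        elif right.size:
            out[y, x] = img[y, right[0]]
        # else: whole row exposed; the text falls back to the pixel above
    return out.astype(img.dtype)
```

An RGB implementation would apply the same fill per channel after taking the union of the per-channel exposure masks.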
In the above specific application example, the endoscope polyp detection model is called by a periodically triggered polyp detection thread to perform polyp detection. In the initialization stage of the polyp detection thread, the neural network weight file, configuration file and detection metafile required by the endoscope polyp detection model are read first, and the model is initialized from their configuration information. A timer is then initialized that triggers the polyp detection thread at a fixed frequency. When the timer fires, the polyp detection thread is awakened into the running state and first checks the effective data length of the video input buffer. If the length is zero, the thread actively ends the call, returns to the ready state, and waits for the next wake-up. If the length is not zero, i.e., a valid video frame is available for detection, the frame is first copied into working memory and converted into an RGB image with 8-bit depth per channel; the image is then preprocessed: the invalid edge areas are cropped away, and contrast enhancement and de-exposure are applied. The preprocessed image is fed into the initialized endoscope polyp detection model, which outputs the detection result.
Since the time required by the detection process has some uncertainty, the frame rate of the detection result is further stabilized as follows. When detection starts, the current timestamp is obtained from the time module and recorded. The format-converted, preprocessed video frame is then sent to the endoscope polyp detection model for detection. According to the result, each detected polyp is labeled in the image with a square box (the prediction box) indicating its size and position, and the confidence given by the model is printed as a number at the top of the box. After the detection computation completes, the timestamp is obtained again and compared with the previous one to measure the time spent on detection. The computation takes less than 10 milliseconds; to stabilize the frame rate, the process enters a sleep state after detection finishes and proceeds to the next step only when the 10-millisecond mark arrives.
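The timestamp-and-sleep pacing described above can be sketched in a few lines. The function name and the exact budget are illustrative; the patent specifies a 10-millisecond budget for the detection step.

```python
import time

def paced_call(fn, frame, budget_s=0.010):
    """Run `fn(frame)` and sleep out the remainder of a fixed time
    budget, so every call takes the same wall-clock time regardless
    of detection-time jitter."""
    start = time.monotonic()
    result = fn(frame)
    elapsed = time.monotonic() - start
    if elapsed < budget_s:
        time.sleep(budget_s - elapsed)
    return result
```

Using a monotonic clock rather than wall-clock time avoids pacing errors when the system clock is adjusted.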
After the detection process and the waiting state are finished, the data in the video input buffer is released, the buffer's effective data length is reset to zero, and the labeled image is displayed in the video display window.
In the above specific application example, operations for interacting with a user may be further added, and the operations may include:
setting the size of a video display window to be the same as the video resolution of the endoscope, predefining pixels in a working memory, applying a memory space in advance for subsequent image display, creating a scene for the pixels, and adding the scene to a window view;
initializing a control window for interacting with a user, setting video input source selection, video screenshot, video recording start and stop, and polyp detection function opening and closing keys in the window for the user to operate, and binding the keys with corresponding bottom layer driving interfaces and background detection module interfaces;
when the user closes the window or exits the program, the shutdown subroutine is called: all running timers are stopped and cancelled, the connected acquisition card hardware is checked, a stop-acquisition command is sent to the acquisition card through the low-level driver, the connection with the acquisition card is terminated, and the computer's interface is released.
In the above specific application example, retraining of the endoscope polyp detection model is used for model update and refinement, which is an important innovation of the present invention; in one embodiment, it may proceed by the following steps:
and S1, if the endoscope polyp detection model considers that polyps exist in the endoscope image, storing the original image and possible polyp positions to the local, and then carrying out manual marking and correction on the original image and possible polyp positions to obtain a new sample.
S2 inputs the endoscopic image of the corrected polyp into the initialized generator model, and 10 simulated images with different backgrounds are obtained without changing the polyp position.
S3 inputs the 10 simulated images into the endoscope polyp detection model, and if a polyp cannot be detected, S4 is performed.
If polyps are detected, the perturbation is computed with the following formula and added to the simulated image:

min_{x'} Σ_{i∈B} C_{i,polyp}(x') + λ · ‖x' − x‖²

wherein B is the set of prediction boxes detected by YOLO, C_{i,polyp} denotes the confidence that the i-th prediction box belongs to a polyp, x is the input image, and x' is an optimization variable of the same size as x. With the above formula, the embodiment of the present invention suppresses the polyp confidence of the image under YOLO and obtains an adversarial image x' that is very close to the original image but in which polyps can no longer be detected.
S4, at this point the polyps in the 10 simulated images can no longer be detected by the detection network; the original image and the simulated images are added to the training database.
And S5, retraining and updating the detection model when the number of newly added samples in the database exceeds 10% of the number of samples in the last training.
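The S5 retraining trigger (also stated in step 10 above as "grown to 110%") reduces to a one-line check. The function name is illustrative:

```python
def should_retrain(current_count, last_train_count, ratio=0.10):
    """Retrain once the training database has grown by more than
    `ratio` (10%) relative to its size at the previous training."""
    return current_count - last_train_count > ratio * last_train_count
```

For example, with 100 samples at the last training, retraining fires only once the database exceeds 110 samples.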
Fig. 6 is a schematic diagram of polyp detection results on endoscope images; it can be seen that polyps of various shapes and sizes are effectively detected by the present invention, with good accuracy also when multiple polyps appear in one image.
Quantitative test results: this example uses 1833 images as the training set, with 1715 polyp-bearing images and 118 polyp-free images, and 257 images as the test set, with 192 polyp-bearing images and 65 polyp-free images. The indices used are as follows:
Precision = TP / (TP + FP)

Sensitivity = TP / (TP + FN)

Specificity = TN / (TN + FP)

wherein TP, TN, FP and FN are the numbers of true positive, true negative, false positive and false negative samples, respectively.
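The three indices above can be computed directly from the confusion counts; a minimal helper (the function name is illustrative):

```python
def metrics(tp, tn, fp, fn):
    """Precision, sensitivity (recall) and specificity from the
    true/false positive/negative counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return precision, sensitivity, specificity
```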
Table two: polyp detection results

Training set size  Test set size  Precision  Sensitivity  Specificity  IOU threshold
1833               257            0.79       0.78         0.52         0.25
Fig. 7 shows a polyp and its simulated views according to an embodiment of the present invention: the original polyp image (leftmost) and two generated images (middle and right).
The embodiment of the invention acquires the endoscope video in real time, detects polyps, and outputs the video before and after lesion detection, which facilitates observation by doctors. The invention is simple to operate, with no technical threshold in use. The self-expanding data and periodic model-update function make later maintenance simple while keeping accuracy high.
It should be noted that, the steps in the method provided by the present invention may be implemented by using corresponding modules, devices, units, and the like in the system, and those skilled in the art may refer to the technical solution of the system to implement the step flow of the method, that is, the embodiment in the system may be understood as a preferred example for implementing the method, and details are not described herein.
Those skilled in the art will appreciate that, besides implementing the system and its various devices provided by the present invention purely as computer-readable program code, the method steps can equally be implemented by realizing the system and its various devices in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system and its various devices provided by the present invention may be regarded as a hardware component, and the devices included therein for realizing various functions may be regarded as structures within that hardware component; means for performing the functions may also be regarded simultaneously as software modules implementing the method and as structures within the hardware component.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes and modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The above-described preferred features may be used in any combination without conflict with each other.

Claims (10)

1. A method for real-time detection of endoscopic polyps, comprising:
acquiring an output image of endoscope equipment;
preprocessing the image, cutting out irrelevant information in the image, and performing image contrast enhancement and exposure removal;
inputting the preprocessed image into an endoscope polyp detection model for polyp detection and outputting a result;
retraining the endoscope polyp detection model according to the output result, wherein the retrained endoscope polyp detection model is used as an endoscope polyp detection model for next detection;
wherein retraining the endoscope polyp detection model comprises:
inputting the image with the polyp in the output result into a generator model after artificial correction and labeling;
the generator model outputs a simulation diagram, and the simulation diagram is N images with unchanged polyp positions, changed backgrounds and slightly different polyp textures;
inputting the polyp image output by the endoscope polyp detection model and a picture which can be identified by the current detection model in the simulated image into an attacker to obtain a disturbed polyp image and a disturbed simulated image, wherein the disturbed image is an image which cannot be identified by the current endoscope polyp detection model;
storing the disturbed image and a simulation image which cannot be identified by the current endoscope polyp detection model into a training database;
and regularly retraining the endoscope polyp detection model by using the training database.
2. The method of claim 1, wherein the generator model outputs a simulated image which is input into the endoscope polyp detection model, wherein,
if no polyp can be detected, the simulated image is one in which the polyp position is unchanged, the background is changed, the polyp texture differs slightly, and the endoscope polyp detection model cannot recognize the polyp;
if the polyp is detected, inputting the simulated image into an attacker, calculating to obtain attack disturbance, and adding the attack disturbance to the original simulated image to obtain a disturbed simulated image;
inputting the polyp image output by the endoscope polyp detection model into an attacker, calculating to obtain attack disturbance, and adding the attack disturbance to the original polyp image to obtain a disturbed polyp image;
and storing the disturbed polyp image and the corresponding simulation diagram into a training database.
3. The method of claim 2, wherein the generator model comprises a generator network for generating a simulated image and a discriminator network for determining whether the input image is an original image, the generator network and the discriminator network using the same pyramid full convolution network structure;
the optimization problem for the generator network and the discriminator network training is as follows:

min_{G_n} max_{D_n} L_adv(G_n, D_n) + α · L_rec(G_n)

wherein D_n and G_n are respectively the discriminator network and the generator network of the n-th layer, and α is a hyperparameter; the adversarial loss function L_adv(G_n, D_n) uses a WGAN-GP loss function; the reconstruction loss function L_rec(G_n) is as follows:

L_rec(G_n) = β · ‖(x̃_n − x_n)_ROI‖² + ‖(x̃_n − x_n)_Other‖²

wherein x̃_n denotes the n-th layer generator network output obtained by inputting the upsampled reconstructed output of the (n+1)-th layer with 0 noise added, and x_n denotes the original image of the n-th layer; (A)_ROI denotes the region of interest in map A, i.e., the polyp location in the input label; (A)_Other denotes the positions in map A other than the region of interest; β is a hyperparameter.
4. The method of claim 3, wherein the simulated image output by the generator model is attacked by an attacker, the attacker being an optimization model for computing the attack perturbation, the attack perturbation being computed by solving the following optimization problem:

min_{x'} Σ_{i∈B} C_{i,polyp}(x') + λ · ‖x' − x‖²

wherein B is the set of polyp prediction boxes detected by the endoscope polyp detection model, C_{i,polyp} denotes the confidence that the i-th prediction box belongs to a polyp, x is the input image, i.e., the simulated image output by the generator network, x' is an optimization variable of the same size as x, and λ is a hyperparameter.
5. The method according to claim 1, wherein retraining of the endoscope polyp detection model is performed periodically using the training database; when the number of samples in the training database has increased by a set ratio relative to the previous training, the endoscope polyp detection model is retrained.
6. The method of any of claims 1-5, wherein the endoscopic polyp detection model is based on a deep convolutional neural network, trained from data labeled by a large number of medical professionals, and capable of directly predicting the presence and location of polyps from images.
7. The method of claim 6, wherein the endoscope polyp detection model uses a YOLOv3-tiny network, which removes the residual layers and uses yolo output layers at two different scales.
8. The method for real-time detection of endoscopic polyps according to any of claims 1-5, wherein in the preprocessing of the image, the de-exposure comprises three steps:
first, marking the exposed area: iterating over each pixel in the image and marking as exposed any pixel whose value in an RGB channel exceeds a specific threshold;
second, expanding the exposed area: marking each exposed point and all points within a certain radius of it as exposed;
third, filling the exposed area: interpolating the values of the valid points to the left and right of the exposed area to obtain the value of the current exposed point, and, if no valid value exists, replacing it with the point above.
9. An endoscopic polyp real-time detection system, comprising:
a real-time endoscope image acquisition and polyp detection module which acquires an output image of an endoscope device, preprocesses the image, inputs the preprocessed image into an endoscope polyp detection model for polyp detection and outputs a result;
the model upgrading module is used for retraining the endoscope polyp detection model according to the output result, and the retrained endoscope polyp detection model is used as an endoscope polyp detection model for next detection; wherein retraining the endoscope polyp detection model comprises:
a generator model, which outputs a simulation image, wherein the simulation image is N images with unchanged polyp positions, changed backgrounds and slightly different polyp textures;
the attacker inputs the polyp existing image output by the endoscope polyp detection model and the picture which can be identified by the current detection model in the simulated image into the attacker to obtain a disturbed polyp image and a disturbed simulated image, wherein the disturbed image is an image which cannot be identified by the current endoscope polyp detection model;
storing the disturbed image and the simulated image which cannot be identified by the current detection model into a training database; and regularly retraining the endoscope polyp detection model by using the training database.
10. An endoscopic polyp real-time detection terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the program when executed by the processor is operable to perform the method of any of claims 1-8.
CN202010902144.1A 2020-09-01 2020-09-01 Endoscope polyp real-time detection method, system and terminal Active CN112164026B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010902144.1A CN112164026B (en) 2020-09-01 2020-09-01 Endoscope polyp real-time detection method, system and terminal


Publications (2)

Publication Number Publication Date
CN112164026A true CN112164026A (en) 2021-01-01
CN112164026B CN112164026B (en) 2022-10-25

Family

ID=73858856

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010902144.1A Active CN112164026B (en) 2020-09-01 2020-09-01 Endoscope polyp real-time detection method, system and terminal

Country Status (1)

Country Link
CN (1) CN112164026B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114565586A (en) * 2022-03-02 2022-05-31 小荷医疗器械(海南)有限公司 Method for training polyp segmentation model, polyp segmentation method and related device
CN115553925A (en) * 2022-12-05 2023-01-03 珠海视新医用科技有限公司 Endoscope control model training method and device, equipment and storage medium
WO2023030427A1 (en) * 2021-09-02 2023-03-09 北京字节跳动网络技术有限公司 Training method for generative model, polyp identification method and apparatus, medium, and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7472242B1 (en) * 2006-02-14 2008-12-30 Network Appliance, Inc. Eliminating duplicate blocks during backup writes
CN110197229A (en) * 2019-05-31 2019-09-03 腾讯科技(深圳)有限公司 Training method, device and the storage medium of image processing model
CN110363751A (en) * 2019-07-01 2019-10-22 浙江大学 A kind of big enteroscope polyp detection method based on generation collaborative network
CN111105412A (en) * 2019-12-30 2020-05-05 郑州大学 Intelligent auxiliary system for intestinal polyp detection and identification
CN111475797A (en) * 2020-03-26 2020-07-31 深圳先进技术研究院 Method, device and equipment for generating confrontation image and readable storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XUANQING LIU ET AL.: "Rob-GAN:Generator, Discriminator, and Adversarial Attacker", 《2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)》 *
YOUNGHAK SHIN ET AL.: "Abnormal Colon Polyp Image Synthesis Using Conditional Adversarial Networks for Improved Detection Performance", 《IEEE ACCESS》 *


Also Published As

Publication number Publication date
CN112164026B (en) 2022-10-25

Similar Documents

Publication Publication Date Title
CN112164026B (en) Endoscope polyp real-time detection method, system and terminal
US10706333B2 (en) Medical image analysis method, medical image analysis system and storage medium
JP6947759B2 (en) Systems and methods for automatically detecting, locating, and semantic segmenting anatomical objects
US20210406591A1 (en) Medical image processing method and apparatus, and medical image recognition method and apparatus
CN107895367B (en) Bone age identification method and system and electronic equipment
US20220122263A1 (en) System and method for processing colon image data
KR20190080736A (en) Device and method for segmenting surgical image
CN113744183A (en) Pulmonary nodule detection method and system
US11475568B2 (en) Method for controlling display of abnormality in chest x-ray image, storage medium, abnormality display control apparatus, and server apparatus
WO2023207743A1 (en) Image detection method and apparatus, and computer device, storage medium and program product
WO2024087359A1 (en) Lesion detection method and apparatus for endoscope, and electronic device and storage medium
CN111161256A (en) Image segmentation method, image segmentation device, storage medium, and electronic apparatus
CN112862786B (en) CTA image data processing method, device and storage medium
CN112862785B (en) CTA image data identification method, device and storage medium
Al-Eiadeh Automatic lung field segmentation using robust deep learning criteria
CN111275719A (en) Calcification false positive recognition method, device, terminal and medium and model training method and device
CN112862787B (en) CTA image data processing method, device and storage medium
US20230098121A1 (en) System and method for prognosis management based on medical information of patient
CN113223104B (en) Cardiac MR image interpolation method and system based on causal relationship
US20240203039A1 (en) Interpretable task-specific dimensionality reduction
US20220309619A1 (en) Image processing apparatus, image processing method, and computer readable recording medium
CN115908394A (en) Tiny fracture detection method and system based on target detection and graph attention machine mechanism
WO2010035519A1 (en) Medical image processing apparatus and program
CN116883760A (en) Medical image classification and segmentation method and system
CN113192025A (en) Multi-organ segmentation method and medium for radiation particle internal radiotherapy interventional operation robot

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant