CN113436141A - Gastroscope image target detection method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN113436141A
Authority
CN
China
Prior art keywords
image
detected
gastroscope
target
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110529666.6A
Other languages
Chinese (zh)
Inventor
戴捷
李亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zidong Information Technology Suzhou Co ltd
Original Assignee
Zidong Information Technology Suzhou Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zidong Information Technology Suzhou Co ltd filed Critical Zidong Information Technology Suzhou Co ltd
Priority to CN202110529666.6A priority Critical patent/CN113436141A/en
Publication of CN113436141A publication Critical patent/CN113436141A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30092Stomach; Gastric

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Medical Informatics (AREA)
  • Quality & Reliability (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a gastroscope image target detection method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring a gastroscope image to be detected; extracting a plurality of feature layers from the gastroscope image to be detected; determining a plurality of target frames of different scales according to the plurality of feature layers; selecting a prediction frame from the target frames according to the positional relationship between the lesion point in the gastroscope image and each target frame; acquiring the category label corresponding to the prediction frame and an image recognition result of the gastroscope image; and, when the category label is the same as the image recognition result, determining the target detection result of the gastroscope image according to the prediction frame and the image recognition result. The method determines both the specific position of the lesion and its recognition result, matching the effect of manual inspection by a physician while improving detection efficiency.

Description

Gastroscope image target detection method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of image recognition technologies, and in particular, to a gastroscope image target detection method and apparatus, an electronic device, and a storage medium.
Background
At present, gastroscopy is an important means of diagnosing gastric diseases: whether a patient has gastric cancer, gastric ulcer or other conditions can be judged from the gastric mucosal epithelium images collected by the gastroscope.
In the prior art, a physician typically determines the location of a lesion and the diagnosis by viewing the gastroscopic images.
However, when there are many gastroscopic images, identifying and analyzing each one in this way occupies a large amount of a physician's working time, and detection efficiency cannot be guaranteed.
Disclosure of Invention
The application provides a gastroscope image target detection method, a gastroscope image target detection device, electronic equipment and a storage medium, and aims to overcome the defects of low detection efficiency and the like in the prior art.
A first aspect of the present application provides a gastroscopic image target detection method comprising:
acquiring a gastroscope image to be detected;
extracting a plurality of characteristic layers of the gastroscope image to be detected;
determining a plurality of target frames with different scales according to the plurality of feature layers;
selecting a prediction frame from the target frames according to the position relation between the focus point in the gastroscope image to be detected and each target frame;
acquiring a category label corresponding to the prediction frame and an image recognition result of the gastroscope image to be detected;
and when the class label is the same as the image identification result, determining a target detection result of the gastroscope image to be detected according to the prediction frame and the image identification result.
Optionally, the method further includes:
and returning to the step of selecting a prediction frame from the target frame when the class label is different from the image recognition result.
Optionally, the selecting a prediction frame from the target frames according to the position relationship between the focus point in the gastroscope image to be detected and each target frame includes:
acquiring the maximum distance from the lesion point to the borders of each target frame;
screening undetermined prediction frames according to the maximum distance and the distance interval corresponding to each target frame;
obtaining the confidence corresponding to each undetermined prediction frame;
and determining the prediction frame according to the corresponding confidence coefficient of each undetermined prediction frame.
Optionally, the acquiring an image recognition result of the gastroscope image to be detected includes:
acquiring a preset neural network model;
and inputting the gastroscope image to be detected into the neural network model so as to determine an image identification result of the gastroscope image to be detected by utilizing the neural network model.
Optionally, the extracting multiple feature layers of the gastroscope image to be detected includes:
determining a plurality of characteristic images according to the gastroscope image to be detected;
and extracting a plurality of characteristic layers from the characteristic image according to different down-sampling step lengths.
Optionally, before extracting the plurality of feature layers of the gastroscope image to be detected, the method further comprises:
and carrying out image preprocessing on the gastroscope image to be detected.
Optionally, the obtaining the confidence corresponding to each undetermined prediction frame includes:
acquiring a trained target detection model;
and determining the confidence degree of the undetermined prediction frame by using the target detection model.
A second aspect of the present application provides a gastroscopic image target detection apparatus comprising:
the first acquisition module is used for acquiring a gastroscope image to be detected;
the extraction module is used for extracting various characteristic layers of the gastroscope image to be detected;
the determining module is used for determining a plurality of target frames with different scales according to the plurality of feature layers;
the selection module is used for selecting a prediction frame from the target frames according to the position relation between the focus point in the gastroscope image to be detected and each target frame;
the second acquisition module is used for acquiring the class label corresponding to the prediction frame and the image recognition result of the gastroscope image to be detected;
and the detection module is used for determining a target detection result of the gastroscope image to be detected according to the prediction frame and the image identification result when the class label is the same as the image identification result.
Optionally, the detection module is further configured to:
and returning to the step of selecting a prediction frame from the target frame when the class label is different from the image recognition result.
Optionally, the selecting module is specifically configured to:
acquiring the maximum distance from the lesion point to the borders of each target frame;
screening undetermined prediction frames according to the maximum distance and the distance interval corresponding to each target frame;
obtaining the confidence corresponding to each undetermined prediction frame;
and determining the prediction frame according to the corresponding confidence coefficient of each undetermined prediction frame.
Optionally, the second obtaining module is specifically configured to:
acquiring a preset neural network model;
and inputting the gastroscope image to be detected into the neural network model so as to determine an image identification result of the gastroscope image to be detected by utilizing the neural network model.
Optionally, the extracting module is specifically configured to:
determining a plurality of characteristic images according to the gastroscope image to be detected;
and extracting a plurality of characteristic layers from the characteristic image according to different down-sampling step lengths.
Optionally, the extracting module is further configured to:
and carrying out image preprocessing on the gastroscope image to be detected.
Optionally, the selecting module is specifically configured to:
acquiring a trained target detection model;
and determining the confidence degree of the undetermined prediction frame by using the target detection model.
A third aspect of the present application provides an electronic device, comprising: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executes computer-executable instructions stored by the memory to cause the at least one processor to perform the method as set forth in the first aspect above and in various possible designs of the first aspect.
A fourth aspect of the present application provides a computer-readable storage medium having stored thereon computer-executable instructions that, when executed by a processor, implement a method as set forth in the first aspect and various possible designs of the first aspect.
This application technical scheme has following advantage:
according to the gastroscope image target detection method, the gastroscope image target detection device, the electronic equipment and the storage medium, the gastroscope image to be detected is obtained; extracting a plurality of characteristic layers of a gastroscope image to be detected; determining a plurality of target frames with different scales according to the plurality of feature layers; selecting a prediction frame from the target frames according to the position relation between the focus point in the gastroscope image to be detected and each target frame; acquiring a category label corresponding to the prediction frame and an image recognition result of the gastroscope image to be detected; and when the class label is the same as the image identification result, determining a target detection result of the gastroscope image to be detected according to the prediction frame and the image identification result. According to the gastroscope image target detection method provided by the scheme, the prediction frame is determined according to the obtained target frames with different scales, and then the final target detection result is determined according to the class label and the image recognition result of the prediction frame, namely the specific position information of the focus is determined, and the recognition result of the focus is also determined, so that the effect of manual detection of a doctor is achieved, and meanwhile, the detection efficiency is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings can be obtained by those skilled in the art according to these drawings.
FIG. 1 is a schematic diagram of a gastroscopic image target detection system based on an embodiment of the present application;
FIG. 2 is a schematic flow chart of a gastroscopic image target detection method provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an exemplary FCOS detection model provided in an embodiment of the present application;
FIG. 4 is a diagram of a network architecture of an exemplary gastroscopic image target detection system provided by an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a gastroscopic image target detection device provided by an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the concepts of the disclosure to those skilled in the art by reference to specific embodiments.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Furthermore, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. In the description of the following examples, "plurality" means two or more unless specifically limited otherwise.
In the prior art, a physician typically determines the location of a lesion and the diagnosis by viewing the gastroscopic images. However, when there are many gastroscopic images, identifying and analyzing each one in this way occupies a large amount of a physician's working time, and detection efficiency cannot be guaranteed.
To solve the above problems, the gastroscope image target detection method, apparatus, electronic device and storage medium provided by the embodiments of the application acquire a gastroscope image to be detected; extract a plurality of feature layers from it; determine a plurality of target frames of different scales according to the feature layers; select a prediction frame from the target frames according to the positional relationship between the lesion point in the image and each target frame; acquire the category label corresponding to the prediction frame and an image recognition result of the image; and, when the category label is the same as the image recognition result, determine the target detection result according to the prediction frame and the image recognition result. In this scheme, the prediction frame is determined from target frames of different scales, and the final target detection result is then determined from the prediction frame's category label and the image recognition result; that is, both the specific position of the lesion and its recognition result are determined, matching the effect of manual inspection by a physician while improving detection efficiency.
The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present invention will be described below with reference to the accompanying drawings.
First, a configuration of a gastroscopic image target detection system based on the present application will be explained:
the gastroscope image target detection method, the gastroscope image target detection device, the electronic equipment and the storage medium are suitable for detecting position information and focus types of focuses in gastroscope images. Fig. 1 is a schematic structural diagram of a gastroscopic image target detection system based on an embodiment of the present application, and mainly includes a gastroscopic image acquisition device and a gastroscopic image target detection device for performing target detection on a gastroscopic image. Specifically, the gastroscope image acquisition device transmits the acquired gastroscope image to the gastroscope image target detection device, and the device detects the position information and the lesion category of the lesion in the gastroscope image.
The embodiment of the application provides a gastroscope image target detection method which is used for detecting position information and focus categories of focuses in gastroscope images. The execution subject of the embodiment of the application is an electronic device, such as a server, a desktop computer, a notebook computer, a tablet computer, and other electronic devices that can be used for performing target detection on a gastroscope image.
As shown in fig. 2, a schematic flowchart of a gastroscopic image target detection method provided in an embodiment of the present application is shown, and the method includes:
step 201, acquiring a gastroscope image to be detected.
It should be noted that the gastroscope image to be detected specifically refers to the gastric mucosal epithelium image acquired by the gastroscope.
Step 202, extracting a plurality of characteristic layers of the gastroscope image to be detected.
It should be noted that different feature layers capture different features; for example, a first feature layer may show the edge features of the gastroscope image to be detected, while a second shows its color features.
Specifically, the plurality of feature layers may be extracted using different down-sampling strides.
Specifically, in one embodiment, a plurality of characteristic images may be determined according to a gastroscope image to be detected; and extracting a plurality of characteristic layers from the characteristic image according to different down-sampling step lengths.
The feature image is an image in which a certain feature is highlighted.
Fig. 3 is a schematic structural diagram of an exemplary FCOS detection model provided in an embodiment of the present application. The features of the input image can first be extracted by a backbone network to obtain three feature images {C3, C4, C5}, with down-sampling strides of 8, 16 and 32, respectively. The Feature Pyramid module of the FCOS detection model then further extracts features from {C3, C4, C5} to obtain the final feature layers {P3, P4, P5, P6, P7}. P3, P4 and P5 are obtained by convolving the feature images C3, C4 and C5 with 1 × 1 convolution kernels in a top-down pathway; P6 and P7 are obtained by convolving P5 and P6, respectively, with a layer of stride 2. The strides of the feature layers {P3, P4, P5, P6, P7} are 8, 16, 32, 64 and 128, respectively.
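As an illustration of the stride geometry described above, the following sketch computes each feature layer's spatial size for a given input resolution. The 800 × 1024 input size and the ceiling rounding are assumptions for illustration, not values from the patent:

```python
import math

# Strides of the five FCOS feature layers, as given in the text.
STRIDES = {"P3": 8, "P4": 16, "P5": 32, "P6": 64, "P7": 128}

def feature_map_sizes(height, width):
    """Spatial size (h, w) of each feature layer for a given input image."""
    return {name: (math.ceil(height / s), math.ceil(width / s))
            for name, s in STRIDES.items()}

sizes = feature_map_sizes(800, 1024)
# P3 covers the image most densely; P7 sees it at 1/128 resolution.
```

Because each layer's stride doubles, each successive layer is responsible for progressively larger target frames, which is what makes the later per-level range test possible.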
Specifically, in one embodiment, in order to ensure the feature layer extraction effect, before extracting the multiple feature layers of the gastroscopic image to be detected, image preprocessing may be performed on the gastroscopic image to be detected.
Specifically, the gastroscope image to be detected may be cropped and scaled so that the original image reaches a uniform size, and then subjected to operations such as mean removal and normalization to improve the image quality of the image to be detected.
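A minimal per-pixel sketch of the mean removal and normalization step follows; scaling to a uniform size is stubbed out (a real pipeline would use OpenCV or PIL), and the ImageNet channel statistics used here are an assumption, not values from the patent:

```python
# Assumed per-channel mean/std (ImageNet statistics, for illustration only).
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb):
    """rgb: (r, g, b) values in 0..255 -> mean-removed, normalized floats."""
    return tuple((c / 255.0 - m) / s for c, m, s in zip(rgb, MEAN, STD))
```

Applying this to every pixel (after resizing) yields inputs with roughly zero mean and unit variance per channel, which stabilizes training of the downstream network.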
And step 203, determining a plurality of target frames with different scales according to the plurality of characteristic layers.
Specifically, target frames of different scales can be predicted from the different feature layers using a preset FCOS detection model, so that most overlapping target frames can be stripped away, laying a good foundation for the subsequent image target detection work and improving the accuracy of the target detection result.
And 204, selecting a prediction frame from the target frames according to the position relation between the focus point in the gastroscope image to be detected and each target frame.
In the embodiments of the present application, the objective of performing target detection on a gastroscopic image is to detect the lesion type (e.g., gastric cancer, gastric ulcer, etc.) and the location information of the lesion (e.g., antrum, greater curvature, lesser curvature, anterior-posterior wall, etc.). Wherein the target frame specifically refers to a predicted lesion position frame.
Specifically, let (l, t, r, b) denote the distances from the lesion point to the four borders of a target frame. If max(l, t, r, b) > m_i or max(l, t, r, b) < m_{i-1}, the point is treated as a negative sample and is not regressed; otherwise, regression is continued. Here m_i, i ∈ [2, 7], may be set to 0, 64, 128, 256, 512 and ∞, respectively. Finally, a prediction frame can be determined according to the regression result. The prediction frame specifically refers to the frame predicted to match the current lesion; the specific matching process may be implemented based on the FCOS detection model provided in the foregoing embodiment.
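The per-level range test above can be sketched in a few lines of Python (the level names P3–P7 follow the FCOS convention used earlier; a point is a positive sample on level i only when m_{i-1} ≤ max(l, t, r, b) ≤ m_i):

```python
# Thresholds m_2 .. m_7 from the text.
M = [0, 64, 128, 256, 512, float("inf")]
LEVELS = ["P3", "P4", "P5", "P6", "P7"]

def assign_level(l, t, r, b):
    """Return the feature level responsible for this point, or None
    if the point is a negative sample on every level."""
    d = max(l, t, r, b)
    for i, name in enumerate(LEVELS):
        if M[i] <= d <= M[i + 1]:
            return name
    return None
```

This is what lets each feature layer specialize: small lesions regress on the fine-stride layers, large lesions on the coarse ones.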
And step 205, acquiring a class label corresponding to the prediction frame and an image identification result of the gastroscope image to be detected.
It should be noted that the image recognition result specifically refers to a lesion recognition result obtained based on another technique, specifically refers to a lesion type corresponding to the current lesion, and the category label refers to a lesion type corresponding to the prediction frame.
And step 206, when the class label is the same as the image identification result, determining a target detection result of the gastroscope image to be detected according to the prediction frame and the image identification result.
Specifically, if the category label is the same as the image recognition result, it may be determined that a reliable lesion recognition result and corresponding position information are currently obtained. The position information is specifically determined according to the position of the prediction frame. Further, integrating the focus recognition result and the position information to obtain a corresponding target detection result.
Conversely, in an embodiment, when the class label is different from the image recognition result, the step of selecting the prediction frame from the target frame is returned to.
It should be noted that the current image recognition technology is mature, that is, the accuracy of the image recognition result is higher than the accuracy of the category label corresponding to the prediction frame, so that when the category label is different from the image recognition result, it may be determined that the selection of the current prediction frame is not accurate, and the prediction frame needs to be reselected, and the specific reselection rule may refer to the above embodiment.
For example, if two undetermined prediction frames are determined according to the regression result, and the category label of the currently selected prediction frame differs from the image recognition result, the other undetermined prediction frame is determined as the new prediction frame.
Specifically, in an embodiment, in order to ensure reliability of an image recognition result, a preset neural network model may be obtained; and inputting the gastroscope image to be detected into the neural network model so as to determine an image recognition result of the gastroscope image to be detected by using the neural network model.
It should be noted that the acquired neural network model is a model that has been trained, and the image recognition result of the gastroscope image can be determined by using the neural network model. The specific training process of the neural network model may refer to the prior art, and the embodiments of the present application are not limited.
For example, if the neural network model specifically uses the ResNet-50 network, the image may be resized to 256 × 256, cropped to 224 × 224, randomly flipped horizontally, and finally normalized to obtain the image input features. The initial learning rate is set to 0.0001, an Adam optimizer is used to optimize the network parameters during training, and a binary cross-entropy function is used as the loss function. The batch size is set to 16 and dropout is set to 0.7 to prevent overfitting. During training, the ResNet-50 network is trained on the training samples so that the network learns the gastroscope image classification task, and the optimizer is used to minimize the image classification loss until the network converges.
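The binary cross-entropy loss mentioned above can be written out directly. This is a generic single-prediction sketch, not code from the patent:

```python
import math

def binary_cross_entropy(p, y, eps=1e-7):
    """Binary cross-entropy for one prediction p in (0, 1) and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))
```

For a maximally uncertain prediction p = 0.5 the loss is ln 2 ≈ 0.693 regardless of the label, and it approaches 0 as the prediction approaches the true label.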
Wherein, the loss function of the detection model is as follows:

$$L(\{p_{x,y}\}, \{t_{x,y}\}) = \frac{1}{N_{pos}} \sum_{x,y} L_{cls}(p_{x,y}, c^{*}_{x,y}) + \frac{\lambda}{N_{pos}} \sum_{x,y} \mathbb{1}_{\{c^{*}_{x,y} > 0\}} L_{reg}(t_{x,y}, t^{*}_{x,y})$$

wherein $L_{cls}$ denotes the focal loss, $L_{reg}$ denotes the IoU (intersection-over-union) regression loss, $N_{pos}$ denotes the number of positive samples, and $\lambda$ is the balance weight of $L_{reg}$, taken here as 1. The function is calculated over all pixel points of the feature image $F_i$; the indicator $\mathbb{1}_{\{c^{*}_{x,y} > 0\}}$ equals 1 only when $c^{*}_{x,y} > 0$, and 0 otherwise.
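The focal-loss classification term can likewise be sketched for a single binary prediction; the α = 0.25 and γ = 2 defaults follow the RetinaNet/FCOS papers and are assumptions here, not values stated in the patent:

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Focal loss for one prediction p in (0, 1) and binary label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p             # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The (1 - p_t)^gamma factor down-weights easy, well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

Compared with plain cross-entropy, confident correct predictions contribute almost nothing, so training focuses on the hard lesion/background pixels.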
On the basis of the above embodiments, in order to further improve the accuracy of the target detection result, as an implementable manner, in an embodiment, the selecting the prediction frame from the target frames according to the positional relationship between the lesion point and each target frame in the gastroscope image to be detected includes:
step 2041, acquiring the maximum distance from the lesion point to the borders of each target frame;
step 2042, screening undetermined prediction frames according to the maximum distance and the distance interval corresponding to each target frame;
2043, obtaining a confidence coefficient corresponding to each undetermined prediction frame;
and 2044, determining the prediction frames according to the corresponding confidence degrees of the undetermined prediction frames.
For example, in the embodiment of the present application, taking the FCOS algorithm as an example, five target frames (five feature layers) are obtained in total, and the distance intervals corresponding to the target frames are (0, 64), (64, 128), (128, 256), (256, 512) and (512, ∞), respectively. The distance may specifically refer to a pixel distance.
Specifically, each target frame whose maximum distance falls inside the corresponding distance interval is kept as an undetermined prediction frame. The undetermined prediction frame with the highest confidence is then determined as the prediction frame. If the prediction frame needs to be reselected later, the confidence of the current prediction frame can be halved, and the undetermined prediction frame now ranked highest by confidence is determined as the new prediction frame; and so on, until an accurate prediction frame is obtained.
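The select-and-reselect rule above can be sketched as follows; the dict field names and the 20-round cap are illustrative assumptions, not details from the patent:

```python
def select_prediction_box(pending, recognition_result, max_rounds=20):
    """Pick the highest-confidence pending box whose category label agrees
    with the image recognition result; on a mismatch, halve that box's
    confidence and try again with the new ranking."""
    boxes = [dict(b) for b in pending]  # work on a copy
    for _ in range(max_rounds):
        best = max(boxes, key=lambda b: b["confidence"])
        if best["label"] == recognition_result:
            return best
        best["confidence"] /= 2.0  # demote the mismatched box
    return None  # no pending box agrees with the recognition result
```

Halving rather than discarding means a mismatched box can still win later if every alternative is even less confident.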
Specifically, in an embodiment, a trained target detection model may be obtained; and determining the confidence coefficient of the frame to be predicted by using the target detection model.
The target detection model may be an FCOS prediction model, and the confidence of each pending prediction box may be determined according to a training result of the FCOS prediction model.
In order to facilitate better understanding of the gastroscope image target detection method by those skilled in the art, the FCOS detection model provided in the embodiments of the present application may be trained as follows:
the training-set images and the positions of the ground-truth target frames are read from the text file, and the images are then preprocessed, including scaling and cropping, mean subtraction and normalization. Each image is convolved to obtain 5 feature layers of different levels, and each feature layer is responsible for predicting target frames of a different scale. The output of the target detection is divided into three branches: a category branch, a centerness branch and a regression branch. The pixel points in each layer are mapped back to the original image; if a pixel point falls into a ground-truth target frame it is a positive sample, otherwise it is a negative sample. For a positive sample, the category target is the category of that target frame, and the regression target is (l, t, r, b), representing the distances from the point to the left, top, right and bottom sides of the target frame, respectively. If a point falls into a plurality of overlapping target frames, the target frame with the smaller area is selected for calculation. To make the result more accurate, a weight can be designed in this process: the centerness. The category output is multiplied by a weight map to obtain the final classification confidence; the weight map represents the distance from each point in the target frame to the center point, i.e., the centerness, and the closer the distance, the higher the weight. Thus, the final target detection result is divided into three parts: the category label, the confidence and the prediction frame. The learning rate is set to 0.002 at the beginning of training and is changed dynamically as the number of training rounds increases; the network parameters are optimized using an SGD (Stochastic Gradient Descent) optimizer together with the loss function provided by this embodiment, and the batch size is set to 4.
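The target assignment and centerness weight described above can be sketched as follows. This assumes the standard FCOS definition, centerness = sqrt((min(l,r)/max(l,r)) * (min(t,b)/max(t,b))), which the patent text describes only informally; the helper names are hypothetical.

```python
import math

def regression_target(point, box):
    # (l, t, r, b): distances from a positive-sample point to the left, top,
    # right and bottom sides of the ground-truth box.
    px, py = point
    x1, y1, x2, y2 = box
    return (px - x1, py - y1, x2 - px, y2 - py)

def centerness(l, t, r, b):
    # Standard FCOS centerness: 1.0 at the box center, falling toward 0 near
    # the edges; it multiplies the classification score.
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))

def assign_box(point, boxes):
    # A point inside several overlapping boxes is assigned to the box with
    # the smallest area, as described in the embodiment; a point inside no
    # box is a negative sample.
    inside = [bx for bx in boxes
              if all(d > 0 for d in regression_target(point, bx))]
    if not inside:
        return None  # negative sample
    return min(inside, key=lambda bx: (bx[2] - bx[0]) * (bx[3] - bx[1]))
```

A point at the exact center of a box gets centerness 1.0, while points near a side are down-weighted, which is what suppresses low-quality predictions far from object centers.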
During training, the FCOS network is trained on the training samples so that the network learns the gastroscope image target detection task; the optimizer is used to minimize the target detection loss function until the network finally converges.
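The dynamically changing learning rate mentioned above is not specified in detail in the embodiment; a step-decay schedule starting from the stated 0.002 is one common choice. The sketch below is purely illustrative, and the milestone epochs and decay factor are assumptions, not values from the patent.

```python
def learning_rate(epoch, base_lr=0.002, decay_epochs=(16, 22), gamma=0.1):
    # Step decay: start at base_lr and multiply by gamma each time a
    # milestone epoch is reached (milestones here are hypothetical).
    lr = base_lr
    for milestone in decay_epochs:
        if epoch >= milestone:
            lr *= gamma
    return lr
```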
Illustratively, fig. 4 shows a network architecture diagram of an exemplary gastroscope image target detection system provided by the embodiments of the present application, where "First Network" in the figure represents the neural network used for determining the image recognition result of the gastroscope image. The FCOS network architecture may refer to the prior art and is not described in detail here.
According to the gastroscope image target detection method provided by the embodiment of the application, a gastroscope image to be detected is acquired; a plurality of feature layers of the gastroscope image to be detected are extracted; a plurality of target frames of different scales are determined according to the plurality of feature layers; a prediction frame is selected from the target frames according to the positional relationship between the focus point in the gastroscope image to be detected and each target frame; the category label corresponding to the prediction frame and the image recognition result of the gastroscope image to be detected are acquired; and when the category label is the same as the image recognition result, the target detection result of the gastroscope image to be detected is determined according to the prediction frame and the image recognition result. With the method provided by this scheme, the prediction frame is determined from the obtained target frames of different scales, and the final target detection result is then determined according to the category label of the prediction frame and the image recognition result; that is, both the specific position information of the focus and the recognition result of the focus are determined, so that the effect of manual detection by a doctor is achieved while the detection efficiency is improved. In addition, the method provided by the embodiment of the application combines a neural network with the FCOS network for target detection, with higher accuracy and more accurate model classification, thereby improving the target detection effect.
The embodiment of the application provides a gastroscope image target detection device which is used for executing the gastroscope image target detection method provided by the embodiment.
Fig. 5 is a schematic structural diagram of a gastroscopic image target detection apparatus according to an embodiment of the present application. The gastroscopic image target detection device 50 comprises a first acquisition module 501, an extraction module 502, a determination module 503, a selection module 504, a second acquisition module 505 and a detection module 506.
The first acquisition module is used for acquiring a gastroscope image to be detected; the extraction module is used for extracting a plurality of feature layers of the gastroscope image to be detected; the determining module is used for determining a plurality of target frames of different scales according to the plurality of feature layers; the selection module is used for selecting a prediction frame from the target frames according to the positional relationship between the focus point in the gastroscope image to be detected and each target frame; the second acquisition module is used for acquiring the category label corresponding to the prediction frame and the image recognition result of the gastroscope image to be detected; and the detection module is used for determining a target detection result of the gastroscope image to be detected according to the prediction frame and the image recognition result when the category label is the same as the image recognition result.
Specifically, in an embodiment, the detection module is further configured to:
and returning to the step of selecting the prediction frame from the target frame when the class label is different from the image recognition result.
Specifically, in an embodiment, the selecting module is specifically configured to:
acquiring the maximum distance from the focus point to each side of each target frame;
screening undetermined prediction frames according to the maximum distance and the centrifugal intervals corresponding to the target frames;
obtaining the confidence corresponding to each undetermined prediction frame;
and determining the prediction frame according to the corresponding confidence coefficient of each undetermined prediction frame.
Specifically, in an embodiment, the second obtaining module is specifically configured to:
acquiring a preset neural network model;
and inputting the gastroscope image to be detected into the neural network model so as to determine an image recognition result of the gastroscope image to be detected by using the neural network model.
Specifically, in an embodiment, the extraction module is specifically configured to:
determining a plurality of feature images according to the gastroscope image to be detected;
and extracting a plurality of feature layers from the feature images according to different down-sampling strides.
Specifically, in an embodiment, the extraction module is further configured to:
and carrying out image preprocessing on the gastroscope image to be detected.
Specifically, in an embodiment, the selecting module is specifically configured to:
acquiring a trained target detection model;
and determining the confidence of each undetermined prediction frame by using the target detection model.
With regard to the gastroscopic image target detection apparatus in the present embodiment, the specific manner in which the respective modules perform the operations has been described in detail in the embodiment related to the method, and will not be explained in detail here.
The gastroscope image target detection device provided by the embodiment of the application is used for executing the gastroscope image target detection method provided by the embodiment, the implementation mode and the principle are the same, and the details are not repeated.
The embodiment of the application provides electronic equipment for executing the gastroscope image target detection method provided by the embodiment.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device 60 includes: at least one processor 61 and memory 62;
the memory stores computer-executable instructions; the at least one processor executes the computer-executable instructions stored by the memory to cause the at least one processor to perform the gastroscopic image target detection method provided by the above embodiments.
The electronic device provided by the embodiment of the application is used for executing the gastroscope image target detection method provided by the embodiment, the implementation mode and the principle are the same, and the details are not repeated.
The embodiment of the application provides a computer-readable storage medium, wherein computer-executable instructions are stored in the computer-readable storage medium, and when a processor executes the computer-executable instructions, the gastroscope image target detection method provided by any one of the above embodiments is realized.
The storage medium containing computer-executable instructions of the embodiment of the present application can be used for storing the computer-executable instructions of the gastroscope image target detection method provided in the foregoing embodiment, and the implementation manner and principle thereof are the same and are not described again.
It is obvious to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be performed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to perform all or part of the above described functions. For the specific working process of the device described above, reference may be made to the corresponding process in the foregoing method embodiment, which is not described herein again.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A gastroscopic image target detection method, comprising:
acquiring a gastroscope image to be detected;
extracting a plurality of feature layers of the gastroscope image to be detected;
determining a plurality of target frames with different scales according to the plurality of feature layers;
selecting a prediction frame from the target frames according to the position relation between the focus point in the gastroscope image to be detected and each target frame;
acquiring a category label corresponding to the prediction frame and an image recognition result of the gastroscope image to be detected;
and when the class label is the same as the image identification result, determining a target detection result of the gastroscope image to be detected according to the prediction frame and the image identification result.
2. The method of claim 1, further comprising:
and returning to the step of selecting a prediction frame from the target frame when the class label is different from the image recognition result.
3. The method according to claim 1, wherein the selecting a prediction frame from the target frames according to the position relationship between the focus point in the gastroscopic image to be detected and each target frame comprises:
acquiring the maximum distance from the focus point to each side of each target frame;
screening undetermined prediction frames according to the maximum distance and the centrifugal intervals corresponding to the target frames;
obtaining the confidence corresponding to each undetermined prediction frame;
and determining the prediction frame according to the corresponding confidence coefficient of each undetermined prediction frame.
4. The method according to claim 1, wherein the acquiring of the image recognition result of the gastroscopic image to be detected comprises:
acquiring a preset neural network model;
and inputting the gastroscope image to be detected into the neural network model so as to determine an image identification result of the gastroscope image to be detected by utilizing the neural network model.
5. The method according to claim 1, wherein the extracting a plurality of feature layers of the gastroscopic image to be detected comprises:
determining a plurality of feature images according to the gastroscope image to be detected;
and extracting a plurality of feature layers from the feature images according to different down-sampling strides.
6. The method according to claim 1, wherein prior to extracting the plurality of feature layers of the gastroscopic image to be detected, the method further comprises:
and carrying out image preprocessing on the gastroscope image to be detected.
7. The method of claim 3, wherein the obtaining the confidence level corresponding to each pending prediction box comprises:
acquiring a trained target detection model;
and determining the confidence degree of the undetermined prediction frame by using the target detection model.
8. A gastroscopic image target detection apparatus comprising:
the first acquisition module is used for acquiring a gastroscope image to be detected;
the extraction module is used for extracting a plurality of feature layers of the gastroscope image to be detected;
the determining module is used for determining a plurality of target frames with different scales according to the plurality of feature layers;
the selection module is used for selecting a prediction frame from the target frames according to the position relation between the focus point in the gastroscope image to be detected and each target frame;
the second acquisition module is used for acquiring the class label corresponding to the prediction frame and the image recognition result of the gastroscope image to be detected;
and the detection module is used for determining a target detection result of the gastroscope image to be detected according to the prediction frame and the image identification result when the class label is the same as the image identification result.
9. An electronic device, comprising: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executing the computer-executable instructions stored by the memory causes the at least one processor to perform the method of any of claims 1-7.
10. A computer-readable storage medium having computer-executable instructions stored thereon which, when executed by a processor, implement the method of any one of claims 1 to 7.
CN202110529666.6A 2021-05-14 2021-05-14 Gastroscope image target detection method and device, electronic equipment and storage medium Pending CN113436141A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110529666.6A CN113436141A (en) 2021-05-14 2021-05-14 Gastroscope image target detection method and device, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN113436141A true CN113436141A (en) 2021-09-24

Family

ID=77802376

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110529666.6A Pending CN113436141A (en) 2021-05-14 2021-05-14 Gastroscope image target detection method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113436141A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112149503A (en) * 2020-08-20 2020-12-29 北京迈格威科技有限公司 Target event detection method and device, electronic equipment and readable medium
CN112634261A (en) * 2020-12-30 2021-04-09 上海交通大学医学院附属瑞金医院 Stomach cancer focus detection method and device based on convolutional neural network
CN112633355A (en) * 2020-12-18 2021-04-09 北京迈格威科技有限公司 Image data processing method and device and target detection model training method and device
CN112767389A (en) * 2021-02-03 2021-05-07 紫东信息科技(苏州)有限公司 Gastroscope picture focus identification method and device based on FCOS algorithm


Similar Documents

Publication Publication Date Title
CN110827251B (en) Power transmission line locking pin defect detection method based on aerial image
CN110866908B (en) Image processing method, image processing apparatus, server, and storage medium
CN109685765B (en) X-ray film pneumonia result prediction device based on convolutional neural network
US20230047131A1 (en) Contour shape recognition method
CN112819821B (en) Cell nucleus image detection method
CN111008576B (en) Pedestrian detection and model training method, device and readable storage medium
CN112614133B (en) Three-dimensional pulmonary nodule detection model training method and device without anchor point frame
CN111179252B (en) Cloud platform-based digestive tract disease focus auxiliary identification and positive feedback system
CN112132166A (en) Intelligent analysis method, system and device for digital cytopathology image
CN113610118B (en) Glaucoma diagnosis method, device, equipment and method based on multitasking course learning
CN112037180B (en) Chromosome segmentation method and device
CN114581375A (en) Method, device and storage medium for automatically detecting focus of wireless capsule endoscope
CN113487610A (en) Herpes image recognition method and device, computer equipment and storage medium
CN114445356A (en) Multi-resolution-based full-field pathological section image tumor rapid positioning method
CN114663426A (en) Bone age assessment method based on key bone area positioning
CN114743195B (en) Thyroid cell pathology digital image recognizer training method and image processing method
CN115205520A (en) Gastroscope image intelligent target detection method and system, electronic equipment and storage medium
CN116168240A (en) Arbitrary-direction dense ship target detection method based on attention enhancement
CN117690128A (en) Embryo cell multi-core target detection system, method and computer readable storage medium
CN115359264A (en) Intensive distribution adhesion cell deep learning identification method
CN116681885A (en) Infrared image target identification method and system for power transmission and transformation equipment
CN114782948A (en) Global interpretation method and system for cervical liquid-based cytology smear
CN117315380B (en) Deep learning-based pneumonia CT image classification method and system
CN112991281B (en) Visual detection method, system, electronic equipment and medium
CN113177554B (en) Thyroid nodule identification and segmentation method, system, storage medium and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination