WO2023207564A1 - Method and apparatus for determining endoscope advancement and retraction time based on image recognition - Google Patents

Method and apparatus for determining endoscope advancement and retraction time based on image recognition

Info

Publication number
WO2023207564A1
Authority
WO
WIPO (PCT)
Prior art keywords
endoscope
endoscopic image
image
current
fusion
Application number
PCT/CN2023/087314
Other languages
English (en)
French (fr)
Inventor
刘威
刘腾营
边成
张志诚
Original Assignee
小荷医疗器械(海南)有限公司
Application filed by 小荷医疗器械(海南)有限公司
Publication of WO2023207564A1


Classifications

    • G06T 7/0016: Image analysis; biomedical image inspection using an image reference approach involving temporal comparison
    • G06F 18/24: Pattern recognition; classification techniques
    • G06F 18/254: Pattern recognition; fusion techniques of classification results, e.g. of results related to same input data
    • G06N 3/045: Neural networks; combinations of networks
    • G06N 3/08: Neural networks; learning methods
    • G06T 7/73: Image analysis; determining position or orientation of objects or cameras using feature-based methods
    • G06T 2207/10068: Image acquisition modality; endoscopic image
    • G06T 2207/30028: Subject of image; colon; small intestine

Definitions

  • the present disclosure relates to the field of medical image technology, and specifically to a method and device for determining the endoscope advancement and withdrawal time based on image recognition.
  • endoscopy such as colonoscopy refers to using an electronic colonoscope to enter the intestine from outside the body, carrying out the scope entry process until the end of the intestine is reached, and then starting to withdraw the scope while observing and diagnosing the intestine.
  • After the withdrawal process ends, the electronic colonoscope is withdrawn from the body; the ileocecal part (that is, the part of the human body where the end of the ileum and the cecum meet) is usually the basis for starting withdrawal of the scope.
  • On this basis, the entire process of scope entry and withdrawal can be divided into the scope entry interval, the ileocecal interval and the scope withdrawal interval; the length of the scope entry interval affects scope entry efficiency, and the length of the scope withdrawal interval affects inspection quality.
  • the present disclosure provides a method for determining the endoscope advancement and retraction time based on image recognition.
  • the endoscope advancement and retraction time determination method includes:
  • a fusion result is determined based on the recognition result of the current endoscopic image and the recognition result of the endoscopic image located a preset number of frames before the current endoscopic image, and the fusion result is used to characterize the position of the endoscope body when the current endoscopic image is acquired;
  • according to the fusion result, the position status and the time corresponding to the current endoscopic image, the time when the endoscope body reaches the target position is determined, and the target position includes the inside of the body, the ileocecal region or the outside of the body.
  • the present disclosure provides an endoscope advancement and retraction time determination device based on image recognition.
  • the endoscope advancement and retraction time determination device includes:
  • An acquisition module configured to acquire the current endoscopic image and position status, where the position status is used to characterize the position of the endoscope body before acquiring the current endoscopic image
  • a recognition module used to process the current endoscopic image according to a pre-trained endoscopic image recognition model to obtain a recognition result
  • a fusion module, configured to determine a fusion result based on the recognition result of the current endoscopic image and the recognition result of the endoscopic image located a preset number of frames before the current endoscopic image, where the fusion result is used to characterize the position of the endoscope body when the current endoscopic image is acquired;
  • a determining module, configured to determine, according to the fusion result, the position status and the time corresponding to the current endoscopic image, the time when the endoscope body reaches the target position, where the target position includes the inside of the body, the ileocecal region or the outside of the body.
  • the present disclosure provides a computer-readable medium on which a computer program is stored; when the computer program is executed by a processing device, the steps of the method for determining the endoscope advancement and retraction time in the first aspect are implemented.
  • an electronic device, including: a storage device on which a computer program is stored; and a processing device configured to execute the computer program in the storage device to implement the steps of the method for determining the endoscope advancement and retraction time in the first aspect.
  • the present disclosure provides a computer program that, when executed by a processor, implements the steps of the method for determining the endoscope advancement and retraction time described in the first aspect.
  • the present disclosure provides a computer program product, including a computer program; when the computer program is executed by a processor, the steps of the method for determining the endoscope advancement and retraction time described in the first aspect are implemented.
  • the fusion result is used to characterize the position of the endoscope body when the current endoscopic image is acquired, and the position status is used to characterize the position of the endoscope body before the current endoscopic image is acquired; therefore, the fusion result is compared with the position status to determine the moment when the endoscope body reaches the inside of the body, the ileocecal region or the outside of the body.
  • FIG. 1 is a flow chart of a method for determining the endoscope advancement and retraction time based on image recognition according to an exemplary embodiment of the present disclosure.
  • Figure 2 is a schematic diagram of a colonoscopy interval according to an exemplary embodiment of the present disclosure.
  • Figure 3 is a schematic structural diagram of an endoscope image recognition model according to an exemplary embodiment of the present disclosure.
  • FIG. 4 is another flowchart of a method for determining the endoscope advancement and retraction time based on image recognition according to an exemplary embodiment of the present disclosure.
  • FIG. 5 is a block diagram of an endoscope advancement and retraction time determination device based on image recognition according to an exemplary embodiment of the present disclosure.
  • FIG. 6 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
  • the term “include” and its variations are open-ended, ie, “including but not limited to.”
  • the term “based on” means “based at least in part on.”
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; and the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • a prompt message is sent to the user to clearly remind the user that the operation requested will require the acquisition and use of the user's personal information. Therefore, users can autonomously choose whether to provide personal information to software or hardware such as electronic devices, applications, servers or storage media that perform the operations of the technical solution of the present disclosure based on the prompt information.
  • the method of sending prompt information to the user may be, for example, a pop-up window, and the prompt information may be presented in the form of text in the pop-up window.
  • the pop-up window can also contain a selection control for the user to choose "agree” or "disagree” to provide personal information to the electronic device.
  • the identification of the scope entry interval and the scope withdrawal interval relies on the recognition of images, that is, classifying the images, and determining the location of the electronic colonoscope based on the classification results.
  • the intestinal environment is very complex, filled with feces, air bubbles and other debris, and the electronic colonoscope travels through a twisted intestine; because the colonoscope camera is unstable, a large number of blurred, overexposed and otherwise low-quality images are produced, and images collected by different electronic colonoscopes differ considerably in resolution.
  • the ileocecal valve images account for a very small proportion of the whole colonoscopy video, annotated data are relatively scarce and contain more or less noise, which makes it easy for a deep model to overfit the training data.
  • the ileocecal valve occupies only a small part of each image, and camera shake, shooting angle and similar factors affect it to varying degrees, so the image features of the ileocecal valve are not distinct and the ileocecal valve structure cannot be well represented in the image.
  • the present disclosure provides a method for determining the endoscope advancement and retraction time based on image recognition, which compares the fusion result, used to characterize the position of the endoscope body when the current endoscopic image is acquired, with the position status, used to characterize the position of the endoscope body before the current endoscopic image is acquired, so as to determine the moment when the endoscope body reaches the inside of the body, the ileocecal region or the outside of the body.
  • The moments at which the endoscope body reaches the inside of the body, the ileocecal region and the outside of the body reflect the lengths of the scope entry interval and the scope withdrawal interval, which makes it convenient to improve scope entry efficiency and inspection quality; and because the fusion result is determined from the recognition result of the current endoscopic image and the recognition results of the endoscopic images of a preset number of frames before the current endoscopic image, the current position of the endoscope body can be accurately estimated.
  • FIG. 1 is a flow chart of a method for determining the endoscope advancement and retraction time based on image recognition according to an exemplary embodiment of the present disclosure.
  • the method for determining the time for advancing and retracting an endoscope can be applied to endoscope detection equipment.
  • the method for determining the time for advancing and retracting an endoscope may include:
  • Step S101 Obtain the current endoscopic image and position status.
  • the position status is used to represent the position of the endoscope body before acquiring the current endoscopic image.
  • Step S102 Process the current endoscopic image according to the pre-trained endoscopic image recognition model to obtain a recognition result.
  • Step S103 Determine the fusion result based on the recognition result of the current endoscopic image and the recognition result of the endoscopic image located a preset number of frames before the current endoscopic image; the fusion result is used to characterize the position of the endoscope body when the current endoscopic image is acquired.
  • Step S104 Determine the time when the endoscope body reaches the target position based on the fusion result, position status, and time corresponding to the current endoscopic image.
  • the target position includes the inside of the body, the ileocecal region, or the outside of the body.
  • Before the above steps are explained, the positions involved in a colonoscopy with the endoscope body are first described by way of example.
  • Referring to Figure 2, during the examination the colonoscope body enters the intestine from outside the body (outside the intestine), the scope entry process is carried out until the end of the intestine is reached, and then withdrawal of the scope begins.
  • Both the scope entry interval and the scope withdrawal interval are located inside the body (inside the intestine), and the boundary between the scope entry interval and the scope withdrawal interval is called the ileocecal interval; therefore, the ileocecal region can also be regarded as a special position inside the body. Accordingly, the position of the endoscope body can be divided into outside the body, inside the body (the colonoscopy interval shown in Figure 2) and the ileocecal region (the ileocecal interval shown in Figure 2).
  • Of the arrows shown from left to right in Figure 2, the first arrow indicates the moment when the colonoscope body reaches the inside of the body from outside the body, the second arrow indicates the moment when the colonoscope body reaches the ileocecal region, and the third arrow indicates the moment when the colonoscope body reaches the outside of the body from inside the body; the scope entry duration can be determined from the moments indicated by the first and second arrows, and the scope withdrawal duration can be determined from the moments indicated by the second and third arrows.
  • The following explanation takes the endoscope body being a colonoscope body as an example.
  • the endoscopic image described below may be a colonoscope image.
  • the endoscopic image recognition model can be trained, wherein the endoscopic image recognition model is a classification model used to classify endoscopic images.
  • the endoscopic image recognition model can be trained in the following manner: acquiring endoscopic image samples; performing data enhancement on the endoscopic image samples to obtain endoscopic image enhanced samples; and training the endoscopic image recognition model according to the endoscopic image enhanced samples to obtain a trained endoscopic image recognition model.
  • To ensure data diversity, the endoscopic images may be sampled from different colonoscopy devices; for example, the frame sampling frequency may be one sample every 5 frames, that is, one frame is sampled every 5 frames.
  • the obtained endoscopic image can be preprocessed to obtain a preprocessed endoscopic image.
  • preprocessing may be to filter blurred images to facilitate model training.
  • the endoscopic image can be manually annotated to obtain an endoscopic image sample.
  • the endoscopic image sample carries a sample label, and the sample label is used to indicate the category of the endoscopic image sample.
  • the manual annotation process can be as follows: first, endoscopic images with an ileocecal interval are manually screened out and frames are extracted from the screened endoscopic images, and then endoscopic images containing the ileocecal valve (ileocecal valve: the upper and lower semilunar folds at the end of the ileum facing the cecum) are manually annotated from the extracted frames. In combination with the above annotation process, endoscopic images can be annotated into three categories.
  • The three categories are ileocecal valve images (images in which the endoscope is located in the ileocecal interval and the image contains the ileocecal valve), in-vivo images (images in which the endoscope is inside the intestine and there is no ileocecal valve) and in-vitro images (images in which the endoscope is outside the intestine).
  • data enhancement can be performed on the endoscopic image sample.
  • Training the endoscopic image recognition model with data-enhanced endoscopic image samples alleviates the problems that ileocecal valve images account for a very small proportion of all endoscopic image samples and that the annotated data contain some noise, both of which make the endoscopic image recognition model prone to overfitting the training data, thereby improving the robustness of the endoscopic image recognition model.
  • data enhancement may include adding random Gaussian noise, adding motion blur, adding color changes, multi-scale scaling of images, random flipping of images, etc.
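  • As a non-authoritative illustration of such an augmentation pipeline, the sketch below uses torchvision; the specific transforms, probabilities and the AddGaussianNoise helper are illustrative assumptions rather than the exact augmentation used in this disclosure (torchvision has no built-in motion blur, so GaussianBlur stands in for it here).

```python
import torch
from torchvision import transforms

class AddGaussianNoise:
    """Add zero-mean Gaussian noise to a tensor image (illustrative helper)."""
    def __init__(self, std: float = 0.02):
        self.std = std

    def __call__(self, img: torch.Tensor) -> torch.Tensor:
        return (img + torch.randn_like(img) * self.std).clamp(0.0, 1.0)

# Hypothetical augmentation pipeline mirroring the options listed above:
# color change, multi-scale scaling, random flipping, blur and Gaussian noise.
augment = transforms.Compose([
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.RandomResizedCrop(224, scale=(0.7, 1.0)),  # multi-scale scaling
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.GaussianBlur(kernel_size=5),               # stand-in for motion blur
    transforms.ToTensor(),
    AddGaussianNoise(std=0.02),
])
```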
  • the endoscopic image recognition model may include a convolutional neural network (Convolutional Neural Network, CNN), a feature aggregation layer and a fully connected layer.
  • the steps of training the endoscopic image recognition model based on the endoscopic image enhancement samples may include: inputting the endoscopic image enhancement samples into the CNN network for feature extraction processing, and obtaining the feature information output by the CNN network; Input the feature information into the feature aggregation layer for generalized mean pooling to obtain the target feature information; input the target feature information into the fully connected layer to obtain the predicted recognition result; according to the predicted recognition result and the sample label corresponding to the endoscopic image enhancement sample, Determine the loss function; adjust the parameters of the endoscopic image recognition model based on the loss function.
  • Because annotated ileocecal valve images are scarce and the annotations contain noise, the CNN network is prone to overfitting on the training data.
  • To prevent the trained endoscopic image recognition model from overfitting, a CNN network with multiple input paths (which can be understood as feature sampling paths), such as the Se-ResNet50 network, can be selected; a regularization method can then be applied to the Se-ResNet50 network to prevent overfitting of the model and improve model robustness.
  • For example, the regularization method can be drop-path (random path deactivation): drop-path randomly "disables" some of the multiple input paths in the Se-ResNet50 network, so that the Se-ResNet50 network selects different input paths to extract the feature information of the endoscopic image enhanced samples; since the feature information sampled from different input paths differs, overfitting of the model is avoided.
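  • A minimal sketch of the drop-path idea on a residual-style block is given below; the block structure and the drop probability are illustrative assumptions and do not reproduce the exact Se-ResNet50 configuration.

```python
import torch
import torch.nn as nn

def drop_path(x: torch.Tensor, drop_prob: float, training: bool) -> torch.Tensor:
    """Randomly zero a whole branch per sample (stochastic-depth style drop-path)."""
    if drop_prob == 0.0 or not training:
        return x
    keep_prob = 1.0 - drop_prob
    # One Bernoulli draw per sample in the batch, broadcast over C, H, W.
    mask = (torch.rand(x.shape[0], 1, 1, 1, device=x.device) < keep_prob).float()
    return x * mask / keep_prob  # rescale so the expected value is unchanged

class ResidualBlockWithDropPath(nn.Module):
    """Illustrative residual block whose branch can be randomly 'disabled'."""
    def __init__(self, channels: int, drop_prob: float = 0.1):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.drop_prob = drop_prob

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The branch output is dropped for some samples, so different paths
        # contribute on different iterations, which discourages overfitting.
        return x + drop_path(self.branch(x), self.drop_prob, self.training)
```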
  • In practical applications, as mentioned above, the ileocecal valve occupies only a small part of the whole endoscopic image because of its own structure, which differs greatly from the images in the ImageNet image library (a large-scale visual database used in visual object recognition software research), where objects are usually located at the center of the image; in addition, factors such as lens shake and shooting angle mean that the ileocecal valve structure cannot be well represented in endoscopic images, which in turn causes the feature information of the ileocecal valve image extracted by the CNN network to be indistinct.
  • To solve this problem, the feature aggregation layer is used to perform generalized mean pooling on the feature information output by the CNN network to obtain the target feature information, which allows the target feature information to contain more image feature information of the ileocecal valve structure.
  • In some embodiments, the feature information output by the CNN network can be characterized as $f \in \mathbb{R}^{W \times H \times K}$, where $K$ is the number of channels of the feature information and the feature information $f_k$ of the $k$-th channel has $W \times H$ activation values. The generalized mean pooling performed by the feature aggregation layer can then be written as $f_g = \left[ f_g^{(1)}, \cdots, f_g^{(k)}, \cdots, f_g^{(K)} \right]^{T}$ with $f_g^{(k)} = \left( \frac{1}{W H} \sum_{x \in f_k} x^{q_k} \right)^{1/q_k}$, where $f_g$ is the target feature information output by the feature aggregation layer, $T$ denotes the matrix transpose, $f_g^{(k)}$ is the information obtained by generalized averaging of the feature information of the $k$-th channel output by the CNN network, and $q_k$ is the pooling parameter.
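  • A minimal sketch of a GeM (generalized-mean) pooling layer consistent with the formula above, assembled with a stand-in backbone and a fully connected classification head, is shown below; using a single learnable exponent p instead of a per-channel q_k, and using torchvision's resnet50 in place of Se-ResNet50, are simplifying assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class GeMPooling(nn.Module):
    """Generalized-mean pooling: ((1/(W*H)) * sum(x^p))^(1/p) per channel."""
    def __init__(self, p: float = 3.0, eps: float = 1e-6):
        super().__init__()
        self.p = nn.Parameter(torch.ones(1) * p)  # learnable pooling parameter
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, K, H, W) feature map from the CNN backbone
        x = x.clamp(min=self.eps).pow(self.p)
        x = F.avg_pool2d(x, kernel_size=x.shape[-2:])   # mean over the H x W plane
        return x.pow(1.0 / self.p).flatten(1)           # (N, K) target feature information

class EndoscopeClassifier(nn.Module):
    """CNN backbone + GeM feature aggregation + fully connected head (3 classes).
    torchvision's resnet50 stands in for Se-ResNet50 in this sketch."""
    def __init__(self, num_classes: int = 3):
        super().__init__()
        backbone = torchvision.models.resnet50(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # conv features only
        self.pool = GeMPooling()
        self.fc = nn.Linear(2048, num_classes)  # in-vitro / in-vivo / ileocecal valve

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc(self.pool(self.features(x)))
```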
  • the fully connected layer is a classification head that is used to output the probability that the endoscopic image enhancement sample belongs to each category based on the target feature information.
  • the categories here include in vitro images, in vivo images, and ileocecal valve images.
  • in vitro images correspond to in vitro probabilities.
  • the in-vivo image corresponds to the in-vivo probability
  • the ileocecal valve image corresponds to the ileocecal probability.
  • $L_{cls}$ is the value of the loss function, and $i \in \{0, 1, 2\}$ can be used to represent the three categories of in-vitro images, in-vivo images and ileocecal valve images respectively.
  • the parameters of the endoscopic image recognition model are adjusted according to the value of the loss function.
  • backpropagation can be used to adjust the parameters involved in the fully connected layer, feature aggregation layer, and CNN network in sequence.
  • the CNN network can use regularization methods to act on some samples in a batch.
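  • A hedged sketch of one training step matching the loss-and-backpropagation description above is given below; cross-entropy is assumed as the concrete form of $L_{cls}$, and the optimizer and the class-index convention (0/1/2) are illustrative assumptions.

```python
import torch
import torch.nn as nn

def train_one_step(model: nn.Module,
                   images: torch.Tensor,   # (N, 3, H, W) batch of enhanced samples
                   labels: torch.Tensor,   # class indices 0/1/2 (in-vitro / in-vivo / ileocecal valve)
                   optimizer: torch.optim.Optimizer) -> float:
    """One parameter update: forward pass, classification loss, backpropagation
    through the fully connected head, the feature aggregation layer and the CNN."""
    model.train()
    criterion = nn.CrossEntropyLoss()   # assumed form of the classification loss L_cls
    logits = model(images)              # (N, 3) predicted recognition result
    loss = criterion(logits, labels)
    optimizer.zero_grad()
    loss.backward()                     # gradients flow back through all three parts in turn
    optimizer.step()
    return loss.item()
```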
  • the trained endoscopic image recognition model can be used to process the endoscopic image. It can be understood that when colonoscopy detection is turned on, the steps of obtaining the current endoscopic image and position status are performed.
  • the current endoscopic image may be an image stored locally in the endoscope detection device, or may be an image obtained from other devices, which is not limited in this implementation.
  • the current endoscopic image may be an image acquired in real time during the colonoscopy, whereby the advancement and withdrawal time of the colonoscope during the colonoscopy may be determined in real time based on the current endoscopic image.
  • If the position of the endoscope body when the colonoscope body captures the current endoscopic image changes compared with the position characterized before the current endoscopic image was captured, it indicates that the colonoscope body has currently reached a new position.
  • the above step S104 may include: determining, based on the position status used to characterize the position of the endoscope body before the current endoscopic image is acquired and the fusion result used to characterize the position of the endoscope body when the current endoscopic image is acquired, whether the endoscope body meets the preset condition corresponding to reaching the target position (which can be understood as the condition that the position characterized by the preceding and current images has changed); after it is determined that the endoscope body meets the preset condition corresponding to reaching the target position, the time corresponding to the current endoscopic image is determined as the time when the endoscope body reaches the target position.
  • the position status needs to be updated according to the target position.
  • the updated position status indicates that the endoscope body is at the target position.
  • the position status is used to characterize the position of the endoscope body before acquiring the current endoscopic image.
  • In some embodiments, three types of status information can be set and flagged, and the flag values are used to determine the position status, that is, whether the endoscope body has reached the inside of the body, whether it has reached the ileocecal region, and whether it has reached the outside of the body; on this basis, the position status can be determined by reading the flag of each type of status information.
  • For example, the three flags may be named inbody, inileo and outbody: inbody = false means the inside of the body has not been reached, inileo = false means the ileocecal region has not been reached, and outbody = false means the outside of the body has not been reached; correspondingly, inbody = true means the inside of the body has been reached, inileo = true means the ileocecal region has been reached, and outbody = true means the outside of the body has been reached.
  • In some embodiments, the recognition result of the current endoscopic image includes the probabilities that the current endoscopic image belongs to the in-vivo image, the in-vitro image and the ileocecal image respectively.
  • The fusion result, determined from the recognition result of the current endoscopic image and the recognition results of the endoscopic images of a preset number of frames before the current endoscopic image, may include an in-vitro fusion probability and an ileocecal fusion probability. It should be noted that the endoscopic images located a preset number of frames before the current endoscopic image are the endoscopic images of a preset number of consecutive frames immediately before the current endoscopic image.
  • For example, the in-vitro fusion probability may be the mean of the in-vitro probabilities of the endoscopic images of 5 consecutive frames including the current endoscopic image (4 of which are endoscopic images before the current endoscopic image).
  • the ileocecal fusion probability may be the sum of the ileocecal probabilities of endoscopic images of 250 consecutive frames including the current endoscopic image (249 frames are endoscopic images located before the current endoscopic image). It should be noted that the above example does not limit the preset number of frames.
  • The recognition results of the endoscopic images of historical frames can be stored in the memory of the colonoscopy device to facilitate the calculation of the fusion result; historical frames in memory that no longer participate in the calculation of the fusion result corresponding to the current endoscopic image can be automatically deleted to save memory space, as in the rolling-buffer sketch below.
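  • Under the assumption that the per-frame class probabilities are available, a minimal sketch of the rolling buffers and the two fusion values described above (the mean of the last 5 in-vitro probabilities and the sum of the last 250 ileocecal probabilities) could look as follows; the window lengths are the examples given above, not fixed requirements.

```python
from collections import deque

class FusionBuffer:
    """Keeps only the per-frame recognition results still needed for the fusion
    result; older frames fall out of the deques automatically, saving memory."""
    def __init__(self, out_window: int = 5, ileo_window: int = 250):
        self.out_probs = deque(maxlen=out_window)    # in-vitro probabilities
        self.ileo_probs = deque(maxlen=ileo_window)  # ileocecal probabilities

    def update(self, out_prob: float, ileo_prob: float):
        """Add the current frame's probabilities and return (outprob, ileoprob)."""
        self.out_probs.append(out_prob)
        self.ileo_probs.append(ileo_prob)
        out_fusion = sum(self.out_probs) / len(self.out_probs)  # mean over the window
        ileo_fusion = sum(self.ileo_probs)                      # sum over the window
        return out_fusion, ileo_fusion
```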
  • The determination of the scope advancement and withdrawal times is described below with reference to Figure 4.
  • The dotted-line box on the right side of Figure 4 illustrates how, as the colonoscope moves from outside the body into the body, then to the ileocecal region, and then exits the body, the moments of reaching the inside of the body, the ileocecal region and the outside of the body are determined.
  • When the position status indicates that the endoscope body is outside the body and the in-vitro fusion probability (outprob) is less than or equal to a first preset probability threshold (H1 shown in Figure 4), the time T corresponding to the current endoscopic image can be determined as the time when the endoscope body reaches the inside of the body, and the position status is updated according to the target position reached this time (the inside of the body).
  • Then the endoscopic image following the current endoscopic image is acquired as the new current endoscopic image, together with a new position status; when the position status indicates that the endoscope body is inside the body and not in the ileocecal region, and the ileocecal fusion probability (ileoprob) is greater than or equal to a second preset probability threshold (H2 shown in Figure 4), the corresponding time is determined as the time when the endoscope body reaches the ileocecal region.
  • Likewise, the endoscopic image following the current endoscopic image is acquired as the new current endoscopic image, together with a new position status; when the position status indicates that the endoscope body is located in the ileocecal region and the in-vitro fusion probability is greater than or equal to a third preset probability threshold (H3 shown in Figure 4), the corresponding time is determined as the time when the endoscope body reaches the outside of the body.
  • H1, H2 and H3 can be set according to the actual situation, and are not limited in this embodiment.
  • the process shown in Figure 4 above is an exemplary illustration of updating the position status and sequentially determining the times at which the colonoscope reaches the inside of the body, the ileocecal region and the outside of the body as it moves from outside the body into the body, then to the ileocecal region, and then exits the body.
  • the time T corresponding to the current endoscopic image can be determined in the following way: determine the frame number of the current endoscopic image, and determine the time T from the frame number of the current endoscopic image and the frame rate of the video corresponding to the current endoscopic image (the frame number divided by the frame rate); for example, if the frame rate is 25 frames/second and the frame number of the current endoscopic image is 25, the time T corresponding to the current endoscopic image can be determined to be 1 second, as in the sketch below.
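  • Tying the preceding pieces together, the sketch below mirrors the threshold comparisons of Figure 4 and the frame-to-time conversion just described; the flag names follow the inbody/inileo/outbody example above, while the threshold values H1, H2 and H3 are placeholders to be set according to the actual situation.

```python
def frame_to_seconds(frame_number: int, frame_rate: float = 25.0) -> float:
    """Time T of a frame, e.g. frame 25 at 25 frames/second -> 1.0 second."""
    return frame_number / frame_rate

class PositionTracker:
    """Updates the position status flags and records the times at which the
    endoscope body reaches the inside of the body, the ileocecal region and
    the outside of the body, following the threshold logic of Figure 4."""
    def __init__(self, h1: float = 0.2, h2: float = 100.0, h3: float = 0.8):
        self.h1, self.h2, self.h3 = h1, h2, h3       # H1/H2/H3: placeholder values
        self.inbody = self.inileo = self.outbody = False
        self.inbodytime = self.inileotime = self.outbodytime = None

    def update(self, frame_number: int, out_fusion: float, ileo_fusion: float) -> None:
        t = frame_to_seconds(frame_number)
        if not self.inbody and out_fusion <= self.h1:
            self.inbody, self.inbodytime = True, t          # reached the inside of the body
        elif self.inbody and not self.inileo and ileo_fusion >= self.h2:
            self.inileo, self.inileotime = True, t          # reached the ileocecal region
        elif self.inileo and not self.outbody and out_fusion >= self.h3:
            self.outbody, self.outbodytime = True, t        # withdrawn to outside the body
```

  • In use, each newly recognized frame would update a FusionBuffer such as the one sketched earlier, and the returned (outprob, ileoprob) pair together with the frame number would be passed to PositionTracker.update, so that inbodytime, inileotime and outbodytime become available in real time for display on the colonoscopy equipment.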
  • the corresponding preset conditions may also include a time judgment condition.
  • the values corresponding to inbody, inileo, outbody, inbodytime, inileotime, and outbodytime can be displayed synchronously on the colonoscopy device in real time to facilitate the doctor's viewing.
  • In this way, the time at which the endoscope body reaches each target position in a colonoscopy video with a frame rate of 25 can be determined in real time, together with the indication information of whether each target position has been reached while the colonoscope is in the human body.
  • The times of reaching each target position and the indication information of whether each target position has been reached are updated synchronously on the colonoscopy equipment in real time.
  • embodiments of the present disclosure also provide an endoscope advancement and retraction time determination device based on image recognition.
  • the endoscope advancement and retraction time determination device 500 includes:
  • the acquisition module 501 is used to acquire the current endoscopic image and position status, where the position status is used to represent the position of the endoscope body before acquiring the current endoscopic image;
  • the recognition module 502 is used to process the current endoscopic image according to a pre-trained endoscopic image recognition model to obtain a recognition result;
  • the fusion module 503 is configured to determine a fusion result based on the recognition result of the current endoscopic image and the recognition result of the endoscopic image located a preset number of frames before the current endoscopic image, and the fusion result is used for Characterize the position of the endoscope body when the current endoscopic image is acquired;
  • the determining module 504 is configured to determine, according to the fusion result, the position status and the time corresponding to the current endoscopic image, the time when the endoscope body reaches the target position, where the target position includes the inside of the body, the ileocecal region or the outside of the body.
  • the determining module 504 includes:
  • a first determination sub-module, configured to determine, according to the fusion result and the position status, whether the endoscope body meets the preset condition corresponding to reaching the target position;
  • a second determination sub-module, configured to, when it is determined that the endoscope body meets the preset condition corresponding to reaching the target position, determine the time corresponding to the current endoscopic image as the time when the endoscope body reaches the target position, and update the position status according to the target position.
  • In some embodiments, the recognition result of the current endoscopic image includes the probabilities that the current endoscopic image belongs to an in-vivo image, an in-vitro image and an ileocecal image respectively, the target position is the inside of the body, and the fusion result includes an in-vitro fusion probability; the first determination sub-module is specifically configured to determine that the endoscope body meets the preset condition corresponding to reaching the inside of the body when the position status indicates that the endoscope body is located outside the body and the in-vitro fusion probability is less than or equal to a first preset probability threshold.
  • In some embodiments, the target position is the ileocecal region and the fusion result includes an ileocecal fusion probability; the first determination sub-module is specifically configured to determine that the endoscope body meets the preset condition corresponding to reaching the ileocecal region when the position status indicates that the endoscope body is located inside the body but not in the ileocecal region and the ileocecal fusion probability is greater than or equal to a second preset probability threshold.
  • In some embodiments, the target position is the outside of the body; the first determination sub-module is specifically configured to determine that the endoscope body meets the preset condition corresponding to reaching the outside of the body when the position status indicates that the endoscope body is located in the ileocecal region and the in-vitro fusion probability is greater than or equal to a third preset probability threshold.
  • the endoscope advancement and retraction time determination device 500 also includes:
  • a sample acquisition module is used to acquire endoscopic image samples
  • a data enhancement module used to perform data enhancement on the endoscopic image sample to obtain an endoscopic image enhanced sample
  • a training module configured to train the endoscopic image recognition model according to the endoscopic image enhancement sample to obtain a trained endoscopic image recognition model.
  • the endoscopic image recognition model includes a CNN network, a feature aggregation layer and a fully connected layer
  • the training module includes:
  • An extraction submodule used to input the endoscopic image enhancement sample to the CNN network for feature extraction processing to obtain the feature information output by the CNN network;
  • the pooling submodule is used to input the feature information to the feature aggregation layer for generalized mean pooling to obtain target feature information;
  • a prediction sub-module used to input the target feature information into the fully connected layer to obtain prediction recognition results
  • the third determination sub-module is used to determine the loss function based on the predicted recognition result and the sample label corresponding to the endoscopic image enhancement sample;
  • the adjustment submodule is used to adjust the parameters of the endoscopic image recognition model according to the loss function.
  • In some embodiments, the extraction sub-module is specifically configured to input the endoscopic image enhanced sample into the CNN network and use a regularization method during feature extraction in the CNN network, so as to obtain the feature information output by the CNN network.
  • embodiments of the present disclosure also provide a computer-readable medium on which a computer program is stored, which is characterized in that when the program is executed by a processing device, the steps of the method for determining the endoscope advance and retract time are implemented.
  • an electronic device, including: a storage device on which a computer program is stored; and a processing device configured to execute the computer program in the storage device to implement the steps of the method for determining the endoscope advancement and retraction time.
  • Terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (PADs), portable multimedia players (PMPs) and vehicle-mounted terminals (such as vehicle-mounted navigation terminals), as well as fixed terminals such as digital televisions (TVs), desktop computers and colonoscopy equipment.
  • the electronic device 600 may include a processing device (such as a central processing unit, a graphics processor, etc.) 601, which may perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603.
  • In the RAM 603, various programs and data required for the operation of the electronic device 600 are also stored.
  • the processing device 601, ROM 602 and RAM 603 are connected to each other via a bus 604.
  • An input/output (I/O) interface 605 is also connected to bus 604.
  • The following may be connected to the I/O interface 605: an input device 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 607 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; a storage device 608 including a magnetic tape, a hard disk, etc.; and a communication device 609.
  • Communication device 609 may allow electronic device 600 to communicate wirelessly or wiredly with other devices to exchange data.
  • Although FIG. 6 illustrates the electronic device 600 with various means, it should be understood that it is not required to implement or provide all of the illustrated means; more or fewer means may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product including a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via communication device 609, or from storage device 608, or from ROM 602.
  • the processing device 601 When the computer program is executed by the processing device 601, the above functions defined in the method of the embodiment of the present disclosure are performed.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or any combination thereof.
  • Computer-readable storage media may include, but are not limited to: an electrical connection having one or more conductors, a portable computer disk, a hard drive, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code contained on a computer-readable medium can be transmitted using any appropriate medium, including but not limited to: wires, optical cables, radio frequency (Radio Frequency, RF), etc., or any suitable combination of the above.
  • in some embodiments, electronic devices may communicate using any currently known or future developed network protocol, such as HyperText Transfer Protocol (HTTP), and may be interconnected with digital data communication in any form or medium (for example, a communication network).
  • Examples of communication networks include local area networks (LANs), wide area networks (WANs), internetworks (for example, the Internet) and peer-to-peer networks (for example, ad hoc peer-to-peer networks), as well as any currently known or future developed network.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; it may also exist independently without being assembled into the electronic device.
  • the above computer-readable medium carries one or more programs; when the one or more programs are executed by the electronic device, the electronic device: obtains a current endoscopic image and a position status, where the position status is used to characterize the position of the endoscope body before the current endoscopic image is acquired; processes the current endoscopic image according to a pre-trained endoscopic image recognition model to obtain a recognition result; determines a fusion result based on the recognition result of the current endoscopic image and the recognition results of the endoscopic images located a preset number of frames before the current endoscopic image, where the fusion result is used to characterize the position of the endoscope body when the current endoscopic image is acquired; and determines, according to the fusion result, the position status and the time corresponding to the current endoscopic image, the time when the endoscope body reaches the target position, where the target position includes the inside of the body, the ileocecal region or the outside of the body.
  • Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, via the Internet using an Internet service provider).
  • each block in the flowchart or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures; for example, two blocks shown in succession may actually be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved.
  • Each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a special-purpose hardware-based system that performs the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.
  • the modules involved in the embodiments of the present disclosure can be implemented in software or hardware.
  • the name of the module does not constitute a limitation on the module itself under certain circumstances.
  • the acquisition module can also be described as "a module that acquires the current endoscopic image and position status.”
  • exemplary types of hardware logic components that can be used include: field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard parts (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and so on.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses or devices, or any suitable combination of the foregoing.
  • machine-readable storage media may include an electrical connection based on one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above.
  • Example 1 provides a method for determining the endoscope advancement and retraction time based on image recognition.
  • the endoscope advancement and retraction time determination method includes:
  • a fusion result is determined based on the recognition result of the current endoscopic image and the recognition result of the endoscopic image located a preset number of frames before the current endoscopic image, and the fusion result is used to characterize the position of the endoscope body when the current endoscopic image is acquired;
  • according to the fusion result, the position status and the time corresponding to the current endoscopic image, the time when the endoscope body reaches the target position is determined, and the target position includes the inside of the body, the ileocecal region or the outside of the body.
  • Example 2 provides the method of Example 1, in which determining the time when the endoscope body reaches the target position according to the fusion result, the position status and the time corresponding to the current endoscopic image includes:
  • determining, according to the fusion result and the position status, whether the endoscope body meets the preset condition corresponding to reaching the target position; and when it is determined that the endoscope body meets the preset condition corresponding to reaching the target position, determining the time corresponding to the current endoscopic image as the time when the endoscope body reaches the target position, and updating the position status according to the target position.
  • Example 3 provides the method of Example 2, wherein the recognition result of the current endoscopic image includes the probabilities that the current endoscopic image belongs to an in-vivo image, an in-vitro image and an ileocecal image respectively, the target position is the inside of the body, the fusion result includes an in-vitro fusion probability, and determining, according to the fusion result and the position status, whether the endoscope body meets the preset condition corresponding to reaching the target position includes:
  • when the position status indicates that the endoscope body is located outside the body and the in-vitro fusion probability is less than or equal to a first preset probability threshold, determining that the endoscope body meets the preset condition corresponding to reaching the inside of the body.
  • Example 4 provides the method of Example 3, wherein the target position is the ileocecal region, the fusion result includes an ileocecal fusion probability, and determining, according to the fusion result and the position status, whether the endoscope body meets the preset condition corresponding to reaching the target position includes:
  • when the position status indicates that the endoscope body is located inside the body but not in the ileocecal region and the ileocecal fusion probability is greater than or equal to a second preset probability threshold, determining that the endoscope body meets the preset condition corresponding to reaching the ileocecal region.
  • Example 5 provides the method of Example 4, wherein the target position is the outside of the body, and determining, according to the fusion result and the position status, whether the endoscope body meets the preset condition corresponding to reaching the target position includes:
  • when the position status indicates that the endoscope body is located in the ileocecal region and the in-vitro fusion probability is greater than or equal to a third preset probability threshold, determining that the endoscope body meets the preset condition corresponding to reaching the outside of the body.
  • Example 6 provides the method of any one of Examples 1-5, wherein the endoscopic image recognition model is trained in the following manner:
  • acquiring endoscopic image samples; performing data enhancement on the endoscopic image samples to obtain endoscopic image enhanced samples; and training the endoscopic image recognition model according to the endoscopic image enhanced samples to obtain a trained endoscopic image recognition model.
  • Example 7 provides the method of Example 6, wherein the endoscopic image recognition model includes a CNN network, a feature aggregation layer and a fully connected layer, and training the endoscopic image recognition model according to the endoscopic image enhanced samples includes:
  • inputting the endoscopic image enhanced samples into the CNN network for feature extraction to obtain the feature information output by the CNN network; inputting the feature information into the feature aggregation layer for generalized mean pooling to obtain target feature information; inputting the target feature information into the fully connected layer to obtain a predicted recognition result; determining a loss function according to the predicted recognition result and the sample label corresponding to the endoscopic image enhanced sample; and adjusting the parameters of the endoscopic image recognition model according to the loss function.
  • Example 8 provides the method of Example 7, wherein inputting the endoscopic image enhanced sample into the CNN network for feature extraction to obtain the feature information output by the CNN network includes:
  • the endoscopic image enhancement sample is input to the CNN network, and a regularization method is used to perform feature extraction processing in the CNN network to obtain feature information output by the CNN network.
  • Example 9 provides an endoscope advancement and retraction time determination device based on image recognition.
  • the endoscope advancement and retraction time determination device includes:
  • An acquisition module configured to acquire the current endoscopic image and position status, where the position status is used to characterize the position of the endoscope body before acquiring the current endoscopic image
  • a recognition module used to process the current endoscopic image according to a pre-trained endoscopic image recognition model to obtain a recognition result
  • a fusion module, configured to determine a fusion result based on the recognition result of the current endoscopic image and the recognition result of the endoscopic image located a preset number of frames before the current endoscopic image, where the fusion result is used to characterize the position of the endoscope body when the current endoscopic image is acquired;
  • a determining module, configured to determine, according to the fusion result, the position status and the time corresponding to the current endoscopic image, the time when the endoscope body reaches the target position, where the target position includes the inside of the body, the ileocecal region or the outside of the body.
  • Example 10 provides a computer-readable medium on which a computer program is stored; when the computer program is executed by a processing device, the steps of the method for determining the endoscope advancement and retraction time described in any one of Examples 1-8 are implemented.
  • Example 11 provides an electronic device, including: a storage device on which a computer program is stored; and a processing device configured to execute the computer program in the storage device to implement the steps of the method for determining the endoscope advancement and retraction time in any one of Examples 1-8.
  • Example 12 provides a computer program that, when executed by a processor, implements the steps of the method for determining the endoscope advancement and retraction time described in any one of Examples 1-8.
  • Example 13 provides a computer program product, including a computer program.
  • when the computer program is executed by a processor, the steps of the method for determining the endoscope advancement and retraction time described in any one of Examples 1-8 are implemented.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Endoscopes (AREA)

Abstract

The present disclosure relates to a method and apparatus for determining endoscope advancement and retraction time based on image recognition. The method includes: acquiring a current endoscopic image and a position status; processing the current endoscopic image according to a pre-trained endoscopic image recognition model to obtain a recognition result; determining a fusion result according to the recognition result of the current endoscopic image and the recognition results of the endoscopic images located a preset number of frames before the current endoscopic image; and determining, according to the fusion result, the position status and the time corresponding to the current endoscopic image, the time when the endoscope body reaches a target position, the target position including the inside of the body, the ileocecal region or the outside of the body. In this way, the moments at which the endoscope body reaches the inside of the body, the ileocecal region and the outside of the body can be accurately determined, so that the endoscope advancement and retraction time can be determined, which helps improve scope entry efficiency and inspection quality.

Description

Method and apparatus for determining endoscope advancement and retraction time based on image recognition
Cross-Reference to Related Applications
This application claims priority to Chinese patent application No. 202210472934.X, filed with the Chinese Patent Office on April 29, 2022 and entitled "Method and apparatus for determining endoscope advancement and retraction time based on image recognition", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of medical image technology, and in particular to a method and apparatus for determining endoscope advancement and retraction time based on image recognition.
Background
In the related art, endoscopy such as colonoscopy refers to using an electronic colonoscope to enter the intestine from outside the body, carrying out the scope entry process until the end of the intestine is reached, and then starting to withdraw the scope while observing and diagnosing the intestine; after the withdrawal process ends, the electronic colonoscope is withdrawn from the body. The ileocecal part (that is, the part of the human body where the end of the ileum and the cecum meet) is usually the basis for starting withdrawal of the scope. On this basis, the entire process of scope entry and withdrawal can be divided into the scope entry interval, the ileocecal interval and the scope withdrawal interval; the length of the scope entry interval affects scope entry efficiency, and the length of the scope withdrawal interval affects inspection quality.
Summary
This Summary is provided to introduce concepts in a brief form that are described in detail in the Detailed Description below. This Summary is not intended to identify key features or essential features of the claimed technical solution, nor is it intended to be used to limit the scope of the claimed technical solution.
In a first aspect, the present disclosure provides a method for determining endoscope advancement and retraction time based on image recognition, the method including:
acquiring a current endoscopic image and a position status, where the position status is used to characterize the position of the endoscope body before the current endoscopic image is acquired;
processing the current endoscopic image according to a pre-trained endoscopic image recognition model to obtain a recognition result;
determining a fusion result according to the recognition result of the current endoscopic image and the recognition results of the endoscopic images located a preset number of frames before the current endoscopic image, where the fusion result is used to characterize the position of the endoscope body when the current endoscopic image is acquired;
determining, according to the fusion result, the position status and the time corresponding to the current endoscopic image, the time when the endoscope body reaches a target position, where the target position includes the inside of the body, the ileocecal region or the outside of the body.
In a second aspect, the present disclosure provides an apparatus for determining endoscope advancement and retraction time based on image recognition, the apparatus including:
an acquisition module, configured to acquire a current endoscopic image and a position status, where the position status is used to characterize the position of the endoscope body before the current endoscopic image is acquired;
a recognition module, configured to process the current endoscopic image according to a pre-trained endoscopic image recognition model to obtain a recognition result;
a fusion module, configured to determine a fusion result according to the recognition result of the current endoscopic image and the recognition results of the endoscopic images located a preset number of frames before the current endoscopic image, where the fusion result is used to characterize the position of the endoscope body when the current endoscopic image is acquired;
a determining module, configured to determine, according to the fusion result, the position status and the time corresponding to the current endoscopic image, the time when the endoscope body reaches a target position, where the target position includes the inside of the body, the ileocecal region or the outside of the body.
In a third aspect, the present disclosure provides a computer-readable medium on which a computer program is stored; when the computer program is executed by a processing device, the steps of the method for determining endoscope advancement and retraction time in the first aspect are implemented.
In a fourth aspect, the present disclosure provides an electronic device, including:
a storage device on which a computer program is stored;
a processing device, configured to execute the computer program in the storage device to implement the steps of the method for determining endoscope advancement and retraction time in the first aspect.
In a fifth aspect, the present disclosure provides a computer program which, when executed by a processor, implements the steps of the method for determining endoscope advancement and retraction time in the first aspect.
In a sixth aspect, the present disclosure provides a computer program product, including a computer program which, when executed by a processor, implements the steps of the method for determining endoscope advancement and retraction time in the first aspect.
Through the above technical solution, since the fusion result is used to characterize the position of the endoscope body when the current endoscopic image is acquired, and the position status is used to characterize the position of the endoscope body before the current endoscopic image is acquired, the fusion result is compared with the position status to determine the moment when the endoscope body reaches the inside of the body, the ileocecal region or the outside of the body.
Other features and advantages of the present disclosure will be described in detail in the following Detailed Description.
Brief Description of the Drawings
The above and other features and advantages of the embodiments of the present disclosure will become more apparent with reference to the following detailed description taken in conjunction with the accompanying drawings. Throughout the drawings, identical or similar reference numerals denote identical or similar elements. It should be understood that the drawings are schematic and that components and elements are not necessarily drawn to scale. In the drawings:
Figure 1 is a flowchart of a method for determining endoscope advancement and retraction time based on image recognition according to an exemplary embodiment of the present disclosure.
Figure 2 is a schematic diagram of colonoscopy intervals according to an exemplary embodiment of the present disclosure.
Figure 3 is a schematic structural diagram of an endoscopic image recognition model according to an exemplary embodiment of the present disclosure.
Figure 4 is another flowchart of a method for determining endoscope advancement and retraction time based on image recognition according to an exemplary embodiment of the present disclosure.
Figure 5 is a block diagram of an apparatus for determining endoscope advancement and retraction time based on image recognition according to an exemplary embodiment of the present disclosure.
Figure 6 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although some embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure can be implemented in various forms and should not be construed as being limited to the embodiments set forth here; rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and completely. It should be understood that the drawings and embodiments of the present disclosure are for exemplary purposes only and are not intended to limit the scope of protection of the present disclosure.
It should be understood that the steps described in the method embodiments of the present disclosure may be performed in a different order and/or in parallel. In addition, the method embodiments may include additional steps and/or omit some of the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and its variations used herein are open-ended, i.e., "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms will be given in the description below.
It should be noted that concepts such as "first" and "second" mentioned in the present disclosure are only used to distinguish different apparatuses, modules or units, and are not used to limit the order or interdependence of the functions performed by these apparatuses, modules or units.
It should be noted that the modifiers "one" and "multiple" mentioned in the present disclosure are illustrative rather than restrictive; those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be understood as "one or more".
The names of messages or information exchanged between multiple apparatuses in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of these messages or information.
It can be understood that, before the technical solutions disclosed in the embodiments of the present disclosure are used, the user should be informed, in an appropriate manner in accordance with relevant laws and regulations, of the type, scope of use and usage scenario of the personal information involved in the present disclosure, and the user's authorization should be obtained.
For example, in response to receiving an active request from the user, prompt information is sent to the user to clearly remind the user that the requested operation will require acquiring and using the user's personal information, so that the user can autonomously choose, based on the prompt information, whether to provide personal information to software or hardware, such as an electronic device, application, server or storage medium, that performs the operations of the technical solution of the present disclosure.
As an optional but non-limiting implementation, in response to receiving an active request from the user, the prompt information may be sent to the user by means of, for example, a pop-up window, and the prompt information may be presented as text in the pop-up window. In addition, the pop-up window may also carry a selection control for the user to choose "agree" or "disagree" to provide personal information to the electronic device.
It can be understood that the above process of notifying the user and obtaining the user's authorization is only illustrative and does not limit the implementation of the present disclosure; other methods that satisfy relevant laws and regulations may also be applied to the implementation of the present disclosure.
Meanwhile, it can be understood that the data involved in this technical solution (including but not limited to the data itself and the acquisition or use of the data) shall comply with the requirements of corresponding laws, regulations and relevant provisions.
在进退镜过程中如何准确确定出进镜区间以及退镜区间的时长，对提升进镜效率以及检查质量至关重要。而识别进镜区间以及退镜区间依赖于对图像的识别，即对图像进行分类，根据分类结果来确定电子肠镜所处的位置。而在真实的结肠镜检查过程中，肠道环境非常复杂，充斥着粪便、气泡等杂物，电子肠镜在扭曲的肠道内行进，由于电子肠镜的摄像头的不稳定性，会产生非常多的模糊、过曝等低质图像；不同的电子肠镜采集的图像分辨率差异较大；回盲瓣图像在整个肠镜视频中占比非常小，标注数据较为匮乏，且标注数据中或多或少存在一定的噪声，导致深度模型很容易在训练数据上过拟合；此外，回盲瓣由于自身结构原因，在整张图像中的占比较小，且受摄像头晃动、拍摄角度等不同程度的影响，回盲瓣图像特征不明显，使得回盲瓣结构无法在图像中很好呈现。这些因素对图像识别算法的鲁棒性提出了较高要求，如此，才能确保基于分类结果准确地确定出内窥镜镜体到达各目标位置的时刻。
有鉴于此，本公开提供一种基于图像识别的内窥镜进退镜时间确定方法，将用于表征获取当前内窥镜图像时内窥镜镜体的位置的融合结果，与用于表征在获取当前内窥镜图像前内窥镜镜体的位置的位置状态进行比较，以此来确定内窥镜镜体到达体内、回盲或体外的时刻，根据内窥镜镜体到达体内、回盲和体外的时刻，即可反映出进镜区间以及退镜区间的时长，如此便于提升进镜效率以及检查质量；且融合结果是根据当前内窥镜图像的识别结果和位于当前内窥镜图像前预设帧数的内窥镜图像的识别结果确定的，如此可以实现内窥镜镜体的当前位置的准确预估。
图1是根据本公开一示例性实施例示出的一种基于图像识别的内窥镜进退镜时间确定方法的流程图。该内窥镜进退镜时间确定方法可以应用于内窥镜检测设备,参照图1,该内窥镜进退镜时间确定方法可以包括:
步骤S101,获取当前内窥镜图像和位置状态,位置状态用于表征在获取当前内窥镜图像前内窥镜镜体的位置。
步骤S102,根据预训练好的内窥镜图像识别模型对当前内窥镜图像进行处理,得到识别结果。
步骤S103,根据当前内窥镜图像的识别结果和位于当前内窥镜图像前预设帧数的内窥镜图像的识别结果,确定融合结果,融合结果用于表征获取当前内窥镜图像时内窥镜镜体的位置。
步骤S104,根据融合结果、位置状态和当前内窥镜图像对应的时刻,确定内窥镜镜体到达目标位置的时刻,目标位置包括体内、回盲或体外。
在对上述步骤进行解释说明之前,首先对内窥镜镜体(例如结肠镜镜体)在结肠镜检查过程中涉及的位置进行示例性说明。参照图2,在检查过程中,结肠镜镜体从体外(肠道外)进入肠道,开始进镜过程直到肠道末端,然后开始退镜。其中,进镜区间至退镜区间均位于体内(肠道内),且进镜区间至退镜区间的分界称为回盲区间,因此,回盲也可以看做体内的一种特殊位置。结合前文,内窥镜镜体的位置可以分为体外、体内(图2所示的肠镜检查区间)和回盲(图2所示的回盲区间)。图2中从左至右所示箭头,第一个箭头表示为结肠镜镜体从体外到达体内的时刻,第二个箭头表示为结肠镜镜体到达回盲的时刻,第三个箭头表示为结肠镜镜体从体内到达体外的时刻,根据第一个箭头和第二个箭头所示时刻可以确定进镜时长,根据第二个箭头和第三个箭头所示时刻可以确定退镜时长。
以下以内窥镜镜体为结肠镜镜体为例对上述步骤进行解释说明。另外,在内窥镜镜体为结肠镜镜体时,下文所述内窥镜图像可以为结肠镜图像。
在步骤S101之前,可以对内窥镜图像识别模型进行训练,其中,内窥镜图像识别模型是一种分类模型,用于对内窥镜图像进行分类。在一些实施例中,内窥镜图像识别模型可以通过以下方式训练得到:获取内窥镜图像样本;对内窥镜图像样本进行数据增强,得到内窥镜图像增强样本;根据内窥镜图像增强样本对内窥镜图像识别模型进行训练,以得到训练好的内窥镜图像识别模型。
在本实施例中，为保证数据多样性，可以从不同结肠镜设备中进行抽样得到不同的内窥镜图像。示例地，抽帧频率可以是5帧/次，表征每间隔5帧进行抽样。
在本实施例中,可以对得到的内窥镜图像进行预处理,得到预处理后的内窥镜图像。示例地,预处理可以是过滤模糊图像,以便于模型的训练。
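结合上述抽帧与过滤模糊图像的描述，下面给出一段示意性的Python代码（仅为便于理解的草案，并非本公开限定的实现）：其中假设使用OpenCV按固定间隔抽帧，并以拉普拉斯方差作为清晰度度量过滤模糊帧，interval与blur_threshold等参数均为假设值。

```python
import cv2

def sample_and_filter_frames(video_path, interval=5, blur_threshold=100.0):
    """按固定间隔抽帧，并用拉普拉斯方差过滤模糊图像（示意性实现，阈值为假设值）。"""
    cap = cv2.VideoCapture(video_path)
    frames, index = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % interval == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            # 拉普拉斯方差越小表示图像越模糊，低于阈值的帧被过滤掉
            if cv2.Laplacian(gray, cv2.CV_64F).var() >= blur_threshold:
                frames.append(frame)
        index += 1
    cap.release()
    return frames
```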
在得到内窥镜图像后,可以对内窥镜图像进行人工标注,进而得到内窥镜图像样本,内窥镜图像样本是携带样本标签的,该样本标签用于指示内窥镜图像样本的类别。人工标注流程可以是:首先,人工筛选出具有回盲区间的内窥镜图像,从筛选出的内窥镜图像中抽帧,再由人工从抽帧得到的内窥镜图像中标注出含有回盲瓣的内窥镜图像(回盲瓣:回肠末端朝向盲肠的上下两片半月形的皱襞)。结合上述标注流程,内窥镜图像可以标注为三个类别,三个类别分别是回盲瓣图像(内窥镜位于回盲部位区间且图像中含有回盲瓣的图像)、体内图像(内窥镜位于肠道内且没有回盲瓣的图像)和体外图像(内窥镜位于肠道外的图像)。
在本实施例中，在将内窥镜图像样本输入至内窥镜图像识别模型之前，可以对内窥镜图像样本进行数据增强。通过数据增强后的内窥镜图像样本对内窥镜图像识别模型进行训练，如此可以缓解因回盲瓣图像在所有内窥镜图像样本中占比非常小、且标注数据中存在一定噪声而导致内窥镜图像识别模型容易在训练数据上过拟合的问题，从而提升内窥镜图像识别模型的鲁棒性。
示例地,数据增强可以包括增加随机高斯噪声、增加运动模糊、增加颜色变化、图像多尺度缩放和图像随机翻转等等。
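下面给出与上述几类增强方式相对应的一段示意性代码（假设使用albumentations库，各概率与参数均为示例性取值，并非本公开限定的实现）：

```python
import albumentations as A

# 与文中列举的增强方式对应的一种组合（参数均为假设值，需按实际数据调整）
augment = A.Compose([
    A.GaussNoise(var_limit=(10.0, 50.0), p=0.5),    # 随机高斯噪声
    A.MotionBlur(blur_limit=7, p=0.3),              # 运动模糊
    A.ColorJitter(brightness=0.2, contrast=0.2,
                  saturation=0.2, hue=0.05, p=0.5), # 颜色变化
    A.RandomScale(scale_limit=0.2, p=0.5),          # 图像多尺度缩放
    A.HorizontalFlip(p=0.5),                        # 图像随机翻转
    A.VerticalFlip(p=0.5),
])

# 用法示例：augmented = augment(image=frame)["image"]，其中 frame 为 HWC 格式的 numpy 数组
```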
在一些实施例中，参照图3，内窥镜图像识别模型可以包括卷积神经网络（Convolutional Neural Network，CNN）、特征聚合层和全连接层。在此情况下，根据内窥镜图像增强样本对内窥镜图像识别模型进行训练的步骤可以包括：将内窥镜图像增强样本输入至CNN网络进行特征提取处理，得到CNN网络输出的特征信息；将特征信息输入至特征聚合层进行广义均值池化，得到目标特征信息；将目标特征信息输入至全连接层，得到预测识别结果；根据预测识别结果和内窥镜图像增强样本对应的样本标签，确定损失函数；根据损失函数，调整内窥镜图像识别模型的参数。
在实际应用中，如前文所述，由于回盲瓣图像的标注数据少且存在噪声，CNN网络容易在训练数据上过拟合。为防止训练得到的内窥镜图像识别模型过拟合，可以选取具有多条输入路径（可以理解为特征采样路径）的CNN网络，例如Se-ResNet50网络，如此，便可以将正则化方法作用于Se-ResNet50网络，以防止模型的过拟合，提高模型鲁棒性。
示例地,正则化方法例如可以是无效路径(droppath),droppath可以将Se-ResNet50网络中的多条输入路径随机“失效”,以使得Se-ResNet50网络可以选取不同输入路径来实现对内窥镜图像增强样本的特征信息的提取,由于不同输入路径采样得到的特征信息不同,进而避免模型的过拟合。
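作为一种可能的理解，droppath可以按类似stochastic depth的方式实现，即训练时以一定概率随机“失效”（置零）某条路径的输出并按保留概率缩放。以下为一段示意性的PyTorch实现草案（DropPath类名与drop_prob参数均为假设，并非本公开限定的实现）：

```python
import torch
import torch.nn as nn

class DropPath(nn.Module):
    """训练时以概率 drop_prob 随机使一条输入路径的输出失效（示意性实现，假设输入为 NCHW 的特征图）。"""

    def __init__(self, drop_prob: float = 0.1):
        super().__init__()
        self.drop_prob = drop_prob

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.training or self.drop_prob == 0.0:
            return x
        keep_prob = 1.0 - self.drop_prob
        # 对 batch 内每个样本独立决定该路径是否保留，并按保留概率缩放以保持期望不变
        mask = (torch.rand(x.shape[0], 1, 1, 1, device=x.device) < keep_prob).float()
        return x * mask / keep_prob
```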
在实际应用中，如前文所述，回盲瓣由于自身结构原因，在整张内窥镜图像中的占比较小，这与物体通常位于图像中心的ImageNet图像库（用于视觉对象识别软件研究的大型可视化数据库）中的图像存在较大差异，且由于镜头晃动、拍摄角度等因素，回盲瓣结构无法在内窥镜图像中很好呈现，进而导致CNN网络提取的回盲瓣图像的特征信息不够明显。为解决回盲瓣图像的特征信息不明显的问题，利用特征聚合层对CNN网络输出的特征信息进行广义均值池化，得到目标特征信息，如此可以使目标特征信息包含更多回盲瓣结构的图像特征信息。
示例地，CNN网络输出的特征信息可以表征为 $f \in \mathbb{R}^{W \times H \times K}$，其中，$K$ 为特征信息的通道数，第 $k$ 个通道的特征信息 $f_k$ 拥有 $W \times H$ 个激活值。
特征聚合层输出的目标特征信息可以表征为：
$$f_g = \left[ f_g^{(1)}, f_g^{(2)}, \dots, f_g^{(K)} \right]^{T}, \qquad f_g^{(k)} = \left( \frac{1}{W \times H} \sum_{x \in f_k} x^{q_k} \right)^{\frac{1}{q_k}}$$
其中，$f_g$ 为特征聚合层输出的目标特征信息，$T$ 表示矩阵的转置，$f_g^{(k)}$ 为对CNN网络输出的第 $k$ 个通道对应的特征信息进行广义均值化得到的信息，$q_k$ 为池化参数。经实验表明，当 $q_k$ 为3时，模型性能最佳。
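按照上式，广义均值池化可以用如下示意性的PyTorch模块实现（仅为理解用的草案：$q$ 的初始值取文中的3，eps为避免数值问题而假设引入的小常数）：

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeMPooling(nn.Module):
    """广义均值池化：q=1 时退化为平均池化，q 越大越接近最大池化（示意性实现，q 设为可学习参数）。"""

    def __init__(self, q: float = 3.0, eps: float = 1e-6):
        super().__init__()
        self.q = nn.Parameter(torch.ones(1) * q)
        self.eps = eps

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        # f: (N, K, H, W) -> (N, K)，对每个通道的 W×H 个激活值做广义均值
        f = f.clamp(min=self.eps).pow(self.q)
        return F.avg_pool2d(f, kernel_size=f.shape[-2:]).pow(1.0 / self.q).flatten(1)
```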
全连接层是一个分类头,用于根据目标特征信息输出内窥镜图像增强样本属于各个类别的概率,这里的类别包括体外图像、体内图像和回盲瓣图像,对应的,体外图像对应体外概率,体内图像对应体内概率,回盲瓣图像对应回盲概率。
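将CNN网络、特征聚合层与全连接层组装起来，大致可以得到如下示意性结构（假设使用timm库提供的seresnet50骨干，并沿用上文示意的GeMPooling模块；模型名与接口均为假设，并非本公开限定的实现）：

```python
import torch
import torch.nn as nn
import timm  # 假设使用 timm 提供的 SE-ResNet50 预训练骨干网络

class EndoscopeClassifier(nn.Module):
    """CNN 骨干 + 广义均值池化 + 全连接分类头的示意性组装（3 类：体外、体内、回盲瓣）。"""

    def __init__(self, num_classes: int = 3):
        super().__init__()
        # num_classes=0、global_pool="" 时 timm 返回未池化的特征图 (N, K, H, W)
        self.backbone = timm.create_model("seresnet50", pretrained=True,
                                          num_classes=0, global_pool="")
        self.pool = GeMPooling(q=3.0)            # 假设已定义上文示意的 GeMPooling
        self.fc = nn.Linear(self.backbone.num_features, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = self.backbone(x)              # (N, K, H, W)
        pooled = self.pool(features)             # (N, K)
        return self.fc(pooled)                   # (N, 3) 的 logits，softmax 后即为各类概率
```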
损失函数可以表征为：
$$L_{cls} = -\sum_{i=0}^{2} y_i \log\left(p_i\right)$$
其中，$L_{cls}$ 为损失函数的值，$i$ 取0、1、2，分别表示体外图像、体内图像和回盲瓣图像这三个类别；当内窥镜图像增强样本的样本标签的类别为0时，在上述损失函数中，$y_0=1$，$y_1=y_2=0$；$p_i$ 表征对应类别所对应的概率。
在本实施例中，根据损失函数的值调整内窥镜图像识别模型的参数。例如可以采用反向传播的方式，依次对全连接层、特征聚合层、CNN网络涉及的参数进行调整。
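结合上述损失函数与反向传播的描述，单次迭代的参数调整过程大致如下（示意性草案，假设沿用上文组装的模型，优化器类型与学习率均为假设值）：

```python
import torch
import torch.nn as nn

model = EndoscopeClassifier()                     # 假设沿用上文示意的模型
criterion = nn.CrossEntropyLoss()                 # 即文中的多分类交叉熵损失 Lcls
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)  # 优化器与学习率为假设值

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """单次迭代：前向得到预测识别结果，计算损失，反向传播并依次更新全连接层、特征聚合层、CNN 的参数。"""
    model.train()
    optimizer.zero_grad()
    logits = model(images)            # (N, 3)
    loss = criterion(logits, labels)  # labels 取值 0/1/2，分别对应体外、体内、回盲瓣
    loss.backward()                   # 反向传播，梯度依次流经全连接层、特征聚合层、CNN 网络
    optimizer.step()
    return loss.item()
```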
在一些实施例中，为了确保训练过程中三个类别的内窥镜图像样本数据的样本平衡，可以从每个类别的内窥镜图像中分别采样128张，组合成一个批次进行训练，每次迭代的损失值为该批次中所有内窥镜图像的损失之和的均值，这样可以保证每次训练过程中三个类别的贡献大致均衡。在此情况下，CNN网络可以采用正则化方法作用于一个批次中的部分样本。
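上述按类别均衡采样组成批次的做法，可以用如下示意性代码表达（per_class=128对应文中的采样数量，字典键0、1、2分别对应体外、体内、回盲瓣三个类别，均为示例性约定）：

```python
import random

def sample_balanced_batch(samples_by_class, per_class: int = 128):
    """从每个类别中各随机采样 per_class 张图像组成一个批次，保证三个类别贡献大致均衡（示意性实现）。

    samples_by_class 形如 {0: [体外样本...], 1: [体内样本...], 2: [回盲瓣样本...]}。
    """
    batch = []
    for samples in samples_by_class.values():
        # 某一类别样本不足 per_class 时允许重复采样
        batch.extend(random.choices(samples, k=per_class))
    random.shuffle(batch)
    return batch
```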
在得到训练好的内窥镜图像识别模型后，则可以利用训练好的内窥镜图像识别模型对内窥镜图像进行处理。可以理解的是，在开启结肠镜检查时，执行获取当前内窥镜图像和位置状态的步骤。
在一些实施例中，当前内窥镜图像可以是内窥镜检测设备中本地存储的图像，也可以是从其他设备上获取的图像，本实施例在此不作限定。
在一些实施例中,当前内窥镜图像可以是在结肠镜检查过程中实时获取的图像,以此可以根据当前内窥镜图像实时确定结肠镜检查过程中结肠镜的进退镜时间。
需要说明的是，当结肠镜镜体在拍摄当前内窥镜图像时所处的位置，相较于当前内窥镜图像之前的内窥镜图像所表征的位置发生变化时，表明结肠镜镜体当前到达了新的位置，如此，便可以通过检查结肠镜镜体的位置变化情况和当前内窥镜图像对应的时刻来确定内窥镜镜体到达目标位置（包括前文所述体内、回盲和体外）的时刻，进而根据确定的内窥镜镜体到达各目标位置的时刻，确定内窥镜进退镜时间。因此，上述步骤S104可以包括：通过用于表征在获取当前内窥镜图像前内窥镜镜体的位置的位置状态，以及用于表征获取当前内窥镜图像时内窥镜镜体的位置的融合结果，来确定内窥镜镜体是否满足到达目标位置的预设条件（可以理解为前后图像表征的位置发生变化的条件），在确定内窥镜镜体满足到达目标位置对应的预设条件的情况下，将当前内窥镜图像对应的时刻确定为内窥镜镜体到达目标位置的时刻。
需要说明的是,由于涉及多个位置,且在确定内窥镜镜体的当前位置是否发生变化时,是将当前位置的上一位置与当前位置进行比较,因此,在确定内窥镜镜体满足到达目标位置对应的预设条件的情况下,还需要根据目标位置更新位置状态,更新后的位置状态表征内窥镜镜体已位于目标位置。
位置状态用于表征获取当前内窥镜图像前内窥镜镜体的位置。在一些实施例中，示例地，可以设置三种状态信息并分别进行标记，标记的结果用于确定位置状态，即用于确定是否到达体内、是否到达回盲以及是否到达体外。基于此，读取各状态信息的标记结果，即可确定位置状态。
示例地,分别用体内(inbody)、回盲(inileo)和体外(outbody)这三种状态信息来表示,用假(false)和真(true)来进行标记。例如,inbody=false表征未到达体内,inileo=false表征未到达回盲,outbody=false表征未到达体外;inbody=true表征到达体内,inileo=true表征到达回盲,outbody=true表征到达体外。
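上述三种状态信息及其标记方式，可以用如下示意性的数据结构表示（字段名沿用文中的inbody、inileo、outbody，三个时刻字段对应后文初始化说明中的inbodytime、inileotime、outbodytime，均为示意性写法）：

```python
from dataclasses import dataclass

@dataclass
class PositionState:
    """位置状态：三种状态信息的标记，以及到达各目标位置的时刻（初始值即文中的初始化结果）。"""
    inbody: bool = False      # 是否到达体内
    inileo: bool = False      # 是否到达回盲
    outbody: bool = False     # 是否到达体外
    inbodytime: float = 0.0   # 到达体内的时刻
    inileotime: float = 0.0   # 到达回盲的时刻
    outbodytime: float = 0.0  # 到达体外（离开体内）的时刻
```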
在一些实施例中,当前内窥镜图像的识别结果包括当前内窥镜图像分别属于体内图像、体外图像和回盲图像的概率,对应的,根据当前内窥镜图像的识别结果和位于当前内窥镜图像前预设帧数的内窥镜图像的识别结果,确定的融合结果可以包括体外融合概率和回盲融合概率。需要说明的是,位于当前内窥镜图像前预设帧数的内窥镜图像是位于当前内窥镜图像前连续的预设帧数的内窥镜图像。
示例地，体外融合概率可以是包括当前内窥镜图像在内连续5帧（其中4帧为位于当前内窥镜图像前的内窥镜图像）内窥镜图像的体外概率之和的均值。示例地，回盲融合概率可以是包括当前内窥镜图像在内连续250帧（其中249帧为位于当前内窥镜图像前的内窥镜图像）内窥镜图像的回盲概率之和。需要说明的是，上述示例对预设帧数并不造成限定。
在一些实施例中,对于当前内窥镜图像前的历史帧内窥镜图像而言,历史帧内窥镜图像的识别结果可以存储于结肠镜设备的内存中,便于融合结果的计算。
在一些实施例中,由于是根据当前内窥镜图像前预设帧数的内窥镜图像进行融合结果的计算,因此,对于内存中不参与针对当前内窥镜图像对应的融合结果计算的历史帧内窥镜图像的识别结果而言,可以自动进行删除,以节省内存空间。
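结合上述两段，融合结果的计算以及历史帧识别结果的缓存与自动删除，可以用固定长度的队列来示意（窗口长度5与250对应前文示例，均为假设性取值，并非本公开限定的实现）：

```python
from collections import deque

# maxlen 使超出窗口的历史帧识别结果被自动丢弃，无需手动删除即可节省内存空间
out_probs = deque(maxlen=5)     # 最近5帧的体外概率
ileo_probs = deque(maxlen=250)  # 最近250帧的回盲概率

def update_fusion(out_prob: float, ileo_prob: float):
    """传入当前帧识别结果中的体外概率与回盲概率，返回（体外融合概率, 回盲融合概率）。"""
    out_probs.append(out_prob)
    ileo_probs.append(ileo_prob)
    outprob = sum(out_probs) / len(out_probs)  # 体外融合概率：窗口内体外概率的均值
    ileoprob = sum(ileo_probs)                 # 回盲融合概率：窗口内回盲概率之和
    return outprob, ileoprob
```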
结合前文,以目标位置分别为体内、回盲和体外为例,对本公开确定进退镜时间进行说明。首先,在开启结肠镜检查时,可以初始化位置状态的三种状态信息,初始化的结果为inbody=false,inileo=false,outbody=false;并初始化内窥镜镜体到达各目标位置的时刻,初始化的结果为inbodytime(表征到达体内的时刻)=0,inileotime(表征到达回盲的时刻)=0,outbodytime(表征到达体外的时刻,等同于离开体内的时刻)=0。在此情况下,参照图4,在图4右侧虚线框内示意了结肠镜从体外到体内,再到回盲,再从体内退出到体外这个过程中更新位置状态和依次确定到达体内、回盲和体外的时刻。
具体来讲，首先，在位置状态（初始化得到的位置状态）表征内窥镜镜体位于体外且体外融合概率（outprob）小于等于第一预设概率阈值（图4所示H1）的情况下，确定内窥镜镜体满足到达体内的预设条件（即inbody=false且outprob≤H1），则可以将当前内窥镜图像对应的时刻T确定为内窥镜镜体到达体内的时刻，并根据本次到达的目标位置（体内），将初始化得到的位置状态中的inbody=false更新为inbody=true，得到新的位置状态。
接着,会获取当前内窥镜图像的下一内窥镜图像作为新的当前内窥镜图像,并获取新的位置状态,且在该位置状态表征内窥镜镜体位于体内、未位于回盲且回盲融合概率(ileoprob)大于等于第二预设概率阈值(图4所示H2)的情况下,确定内窥镜镜体满足到达回盲对应的预设条件(inbody=true、inileo=false且ileoprob≥H2),则可以将当前内窥镜图像对应的时刻T确定为内窥镜镜体到达回盲的时刻,并根据本次到达的目标位置(回盲),将本次获取的位置状态中的inileo=false更新为inileo=true。
再接着,会获取当前内窥镜图像的下一内窥镜图像作为新的当前内窥镜图像,并获取新的位置状态,且在该位置状态表征内窥镜镜体位于回盲且体外融合概率大于等于第三预设概率阈值(图4所示H3)的情况下,确定内窥镜镜体满足到达体外对应的预设条件(inbody=true、outbody=false且outprob≥H3),则可以将当前内窥镜图像对应的时刻T确定为内窥镜镜体到达体外的时刻,并根据本次到达的目标位置(体外),将本次获取的位置状态中的outbody=false更新为outbody=true。
其中,H1、H2以及H3可以根据实际情况进行设定,本实施例在此不作限定。
上述图4所示过程为一个结肠镜从体外到体内,再到回盲,再从体内退出到体外这个过程中更新位置状态和依次确定到达体内、回盲和体外的时刻的示例性说明。
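图4所示的状态更新逻辑，可以概括为如下示意性的判断流程（state为前文示意的PositionState对象，H1、H2、H3按实际情况设定，时间判断条件T>inbodytime对应后文的补充说明；整体仅为一种可能的写法）：

```python
def update_state(state, outprob: float, ileoprob: float, t: float,
                 h1: float, h2: float, h3: float):
    """根据融合结果与当前位置状态，判断是否满足到达各目标位置的预设条件，并记录对应时刻。"""
    if (not state.inbody) and outprob <= h1:
        # 到达体内：此前位于体外，且体外融合概率降至 H1 以下
        state.inbody, state.inbodytime = True, t
    elif state.inbody and (not state.inileo) and ileoprob >= h2 and t > state.inbodytime:
        # 到达回盲：已位于体内、尚未到达回盲，且回盲融合概率升至 H2 以上
        state.inileo, state.inileotime = True, t
    elif state.inbody and (not state.outbody) and outprob >= h3:
        # 到达体外：已进入体内、尚未退出，且体外融合概率升至 H3 以上
        state.outbody, state.outbodytime = True, t
    return state
```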
其中，当前内窥镜图像对应的时刻T可以通过以下方式确定：确定当前内窥镜图像的帧序号，将当前内窥镜图像的帧序号除以当前内窥镜图像对应视频的帧率，所得结果即为当前内窥镜图像对应的时刻T。例如，帧率为25帧/秒，当前内窥镜图像的帧序号为25，则可以确定当前内窥镜图像对应的时刻T为1秒。
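时刻T的换算可以写成如下的小函数（fps默认取文中示例的25帧/秒，仅为示意）：

```python
def frame_to_time(frame_index: int, fps: float = 25.0) -> float:
    """时刻 T = 帧序号 / 帧率（秒）。例如帧序号为25、帧率为25帧/秒时，T = 1 秒。"""
    return frame_index / fps
```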
此外，为提高确定到达各目标位置的时刻的准确性，在确定到达回盲和体外的时刻时，对应的预设条件还可以包括一个时间判断条件。示例地，在判断是否到达回盲时，设定的预设条件可以是满足inbody=true、inileo=false、ileoprob≥H2且T>inbodytime。
在一些实施例中,可以将inbody、inileo、outbody、inbodytime、inileotime、outbodytime对应的值实时同步显示在结肠镜设备上,便于医生查看。
通过上述方式，能够实时确定帧率为25帧/秒的肠镜视频中内窥镜镜体到达各目标位置的时刻，以及用于指示是否到达各个目标位置的指示信息，并能够将结肠镜在人体内的上述指示信息实时同步更新在结肠镜设备上。
基于同一发明构思,本公开实施例还提供一种基于图像识别的内窥镜进退镜时间确定装置,参照图5,所述内窥镜进退镜时间确定装置500包括:
获取模块501,用于获取当前内窥镜图像和位置状态,所述位置状态用于表征在获取所述当前内窥镜图像前内窥镜镜体的位置;
识别模块502,用于根据预训练好的内窥镜图像识别模型对所述当前内窥镜图像进行处理,得到识别结果;
融合模块503,用于根据所述当前内窥镜图像的识别结果和位于所述当前内窥镜图像前预设帧数的内窥镜图像的识别结果,确定融合结果,所述融合结果用于表征获取所述当前内窥镜图像时所述内窥镜镜体的位置;
确定模块504,用于根据所述融合结果、所述位置状态和所述当前内窥镜图像对应的时刻,确定所述内窥镜镜体到达目标位置的时刻,所述目标位置包括体内、回盲或体外。
可选的,所述确定模块504包括:
第一确定子模块,用于根据所述融合结果和所述位置状态,确定所述内窥镜镜体是否满足到达所述目标位置对应的预设条件;
第二确定子模块,用于在确定所述内窥镜镜体满足到达所述目标位置对应的预设条件的情况下,将所述当前内窥镜图像对应的时刻确定为所述内窥镜镜体到达所述目标位置的时刻,并根据所述目标位置更新所述位置状态。
可选的,所述当前内窥镜图像的所述识别结果包括所述当前内窥镜图像分别属于体内图像、体外图像和回盲图像的概率,所述目标位置为所述体内,所述融合结果包括体外融合概率,所述第一确定子模块具体用于在所述位置状态表征所述内窥镜镜体位于所述体外且所述体外融合概率小于等于第一预设概率阈值的情况下,确定所述内窥镜镜体满足到达所述体内对应的预设条件。
可选的,所述目标位置为所述回盲,所述融合结果包括回盲融合概率,所述第一确定子模块具体用于在所述位置状态表征所述内窥镜镜体位于所述体内、未位于所述回盲且所述回盲融合概率大于等于第二预设概率阈值的情况下,确定所述内窥镜镜体满足到达所述回盲对应的预设条件。
可选的,所述目标位置为所述体外,所述第一确定子模块具体用于在所述位置状态表征所述内窥镜镜体位于所述回盲且所述体外融合概率大于等于第三预设概率阈值的情况下,确定所述内窥镜镜体满足到达所述体外对应的预设条件。
可选的,所述内窥镜进退镜时间确定装置500还包括:
样本获取模块,用于获取内窥镜图像样本;
数据增强模块,用于对所述内窥镜图像样本进行数据增强,得到内窥镜图像增强样本;
训练模块,用于根据所述内窥镜图像增强样本对所述内窥镜图像识别模型进行训练,以得到训练好的内窥镜图像识别模型。
可选的,所述内窥镜图像识别模型包括CNN网络、特征聚合层和全连接层,所述训练模块包括:
提取子模块,用于将所述内窥镜图像增强样本输入至所述CNN网络进行特征提取处理,得到所述CNN网络输出的特征信息;
池化子模块,用于将所述特征信息输入至所述特征聚合层进行广义均值池化,得到目标特征信息;
预测子模块,用于将所述目标特征信息输入至所述全连接层,得到预测识别结果;
第三确定子模块,用于根据所述预测识别结果和所述内窥镜图像增强样本对应的样本标签,确定损失函数;
调整子模块,用于根据所述损失函数,调整所述内窥镜图像识别模型的参数。
可选的,所述提取子模块具体用于将所述内窥镜图像增强样本输入至所述CNN网络,在所述CNN网络中采用正则化方法进行特征提取处理,得到所述CNN网络输出的特征信息。
基于同一发明构思,本公开实施例还提供一种计算机可读介质,其上存储有计算机程序,其特征在于,该程序被处理装置执行时实现上述内窥镜进退镜时间确定方法的步骤。
基于同一发明构思,本公开实施例还提供一种电子设备,包括:
存储装置,其上存储有计算机程序;
处理装置,用于执行所述存储装置中的所述计算机程序,以实现上述内窥镜进退镜时间确定方法的步骤。
下面参考图6,其示出了适于用来实现本公开实施例的电子设备600的结构示意图。本公开实施例中的终端设备可以包括但不限于诸如移动电话、笔记本电脑、数字广播接收器、个人数字助理(Personal Digital Assistant,PDA)、平板电脑(Portable Android Device,PAD)、便携式多媒体播放器(Portable Media Player,PMP)、车载终端(例如车载导航终端)等等的移动终端以及诸如数字电视(Television,TV)、台式计算机、结肠镜设备等等的固定终端。图6示出的电子设备仅仅是一个示例,不应对本公开实施例的功能和使用范围带来任何限制。
如图6所示,电子设备600可以包括处理装置(例如中央处理器、图形处理器等)601,其可以根据存储在只读存储器(Read Only Memory,ROM)602中的程序或者从存储装置608加载到随机访问存储器(Random Access Memory,RAM)603中的程序而执行各种适当的动作和处理。在RAM 603中,还存储有电子设备600操作所需的各种程序和数据。处理装置601、ROM 602以及RAM 603通过总线604彼此相连。输入/输出(Input/Output,I/O)接口605也连接至总线604。
通常,以下装置可以连接至I/O接口605:包括例如触摸屏、触摸板、键盘、鼠标、摄像头、麦克风、加速度计、陀螺仪等的输入装置606;包括例如液晶显示器(Liquid Crystal Display,LCD)、扬声器、振动器等的输出装置607;包括例如磁带、硬盘等的存储装置608;以及通信装置609。通信装置609可以允许电子设备600与其他设备进行无线或有线通信以交换数据。虽然图6示出了具有各种装置的电子设备600,但是应理解的是,并不要求实施或具备所有示出的装置。可以替代地实施或具备更多或更少的装置。
特别地,根据本公开的实施例,上文参考流程图描述的过程可以被实现为计算机软件程序。例如,本公开的实施例包括一种计算机程序产品,其包括承载在非暂态计算机可读介质上的计算机程序,该计算机程序包含用于执行流程图所示的方法的程序代码。在这样的实施例中,该计算机程序可以通过通信装置609从网络上被下载和安装,或者从存储装置608被安装,或者从ROM 602被安装。在该计算机程序被处理装置601执行时,执行本公开实施例的方法中限定的上述功能。
需要说明的是，本公开上述的计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质或者是上述两者的任意组合。计算机可读存储介质例如可以是——但不限于——电、磁、光、电磁、红外线、或半导体的系统、装置或器件，或者任意以上的组合。计算机可读存储介质的更具体的例子可以包括但不限于：具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机访问存储器（RAM）、只读存储器（ROM）、可擦除可编程只读存储器（Erasable Programmable Read Only Memory，EPROM）、光纤、便携式紧凑磁盘只读存储器（Compact Disc Read Only Memory，CD-ROM）、光存储器件、磁存储器件、或者上述的任意合适的组合。在本公开中，计算机可读存储介质可以是任何包含或存储程序的有形介质，该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。而在本公开中，计算机可读信号介质可以包括在基带中或者作为载波一部分传播的数据信号，其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式，包括但不限于电磁信号、光信号或上述的任意合适的组合。计算机可读信号介质还可以是计算机可读存储介质以外的任何计算机可读介质，该计算机可读信号介质可以发送、传播或者传输用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。计算机可读介质上包含的程序代码可以用任何适当的介质传输，包括但不限于：电线、光缆、射频（Radio Frequency，RF）等等，或者上述的任意合适的组合。
在一些实施方式中,电子设备可以利用诸如超文本传输协议(HyperText Transfer Protocol,HTTP)之类的任何当前已知或未来研发的网络协议进行通信,并且可以与任意形式或介质的数字数据通信(例如,通信网络)互连。通信网络的示例包括局域网(Local Area Network,LAN),广域网(Wide Area Network,WAN),网际网(例如,互联网)以及端对端网络(例如,ad hoc端对端网络),以及任何当前已知或未来研发的网络。
上述计算机可读介质可以是上述电子设备中所包含的;也可以是单独存在,而未装配入该电子设备中。
上述计算机可读介质承载有一个或者多个程序,当上述一个或者多个程序被该电子设备执行时,使得该电子设备:获取当前内窥镜图像和位置状态,所述位置状态用于表征在获取所述当前内窥镜图像前内窥镜镜体的位置;根据预训练好的内窥镜图像识别模型对所述当前内窥镜图像进行处理,得到识别结果;根据所述当前内窥镜图像的识别结果和位于所述当前内窥镜图像前预设帧数的内窥镜图像的识别结果,确定融合结果,所述融合结果用于表征获取所述当前内窥镜图像时所述内窥镜镜体的位置;根据所述融合结果、所述位置状态和所述当前内窥镜图像对应的时刻,确定所述内窥镜镜体到达目标位置的时刻,所述目标位置包括体内、回盲或体外。
可以以一种或多种程序设计语言或其组合来编写用于执行本公开的操作的计算机程序代码,上述程序设计语言包括但不限于面向对象的程序设计语言—诸如Java、Smalltalk、C++,还包括常规的过程式程序设计语言——诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或服务器上执行。在涉及远程计算机的情形中,远程计算机可以通过任意种类的网络——包括局域网(LAN)或广域网(WAN)——连接到用户计算机,或者,可以连接到外部计算机(例如利用因特网服务提供商来通过因特网连接)。
附图中的流程图和框图,图示了按照本公开各种实施例的系统、方法和计算机程序产品的可能实现的体系架构、功能和操作。在这点上,流程图或框图中的每个方框可以代表一个模块、程序段、或代码的一部分,该模块、程序段、或代码的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。也应当注意,在有些作为替换的实现中,方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如,两个接连地表示的方框实际上可以基本并行地执行,它们有时也可以按相反的顺序执行,这依所涉及的功能而定。也要注意的是,框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合,可以用执行规定的功能或操作的专用的基于硬件的系统来实现,或者可以用专用硬件与计算机指令的组合来实现。
描述于本公开实施例中所涉及到的模块可以通过软件的方式实现,也可以通过硬件的方式来实现。其中,模块的名称在某种情况下并不构成对该模块本身的限定,例如,获取模块还可以被描述为“获取当前内窥镜图像和位置状态的模块”。
本文中以上描述的功能可以至少部分地由一个或多个硬件逻辑部件来执行。例如，非限制性地，可以使用的示范类型的硬件逻辑部件包括：现场可编程门阵列（Field-Programmable Gate Array，FPGA）、专用集成电路（Application Specific Integrated Circuit，ASIC）、专用标准产品（Application Specific Standard Parts，ASSP）、片上系统（System On Chip，SOC）、复杂可编程逻辑设备（Complex Programmable Logic Device，CPLD）等等。
在本公开的上下文中,机器可读介质可以是有形的介质,其可以包含或存储以供指令执行系统、装置或设备使用或与指令执行系统、装置或设备结合地使用的程序。机器可读介质可以是机器可读信号介质或机器可读储存介质。机器可读介质可以包括但不限于电子的、磁性的、光学的、电磁的、红外的、或半导体系统、装置或设备,或者上述内容的任何合适组合。机器可读存储介质的更具体示例可以包括基于一个或多个线的电气连接、便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦除可编程只读存储器(EPROM或快闪存储器)、光纤、便捷式紧凑盘只读存储器(CD-ROM)、光学储存设备、磁储存设备、或上述内容的任何合适组合。
根据本公开的一个或多个实施例,示例1提供了一种基于图像识别的内窥镜进退镜时间确定方法,所述内窥镜进退镜时间确定方法包括:
获取当前内窥镜图像和位置状态,所述位置状态用于表征在获取所述当前内窥镜图像前内窥镜镜体的位置;
根据预训练好的内窥镜图像识别模型对所述当前内窥镜图像进行处理,得到识别结果;
根据所述当前内窥镜图像的所述识别结果和位于所述当前内窥镜图像前预设帧数的内窥镜图像的识别结果,确定融合结果,所述融合结果用于表征获取所述当前内窥镜图像时所述内窥镜镜体的位置;
根据所述融合结果、所述位置状态和所述当前内窥镜图像对应的时刻,确定所述内窥镜镜体到达目标位置的时刻,所述目标位置包括体内、回盲或体外。
根据本公开的一个或多个实施例,示例2提供了示例1的方法,所述根据所述融合结果、所述位置状态和所述当前内窥镜图像对应的时刻,确定所述内窥镜镜体到达目标位置的时刻,包括:
根据所述融合结果和所述位置状态,确定所述内窥镜镜体是否满足到达所述目标位置对应的预设条件;
在确定所述内窥镜镜体满足到达所述目标位置对应的预设条件的情况下,将所述当前内窥镜图像对应的时刻确定为所述内窥镜镜体到达所述目标位置的时刻,并根据所述目标位置更新所述位置状态。
根据本公开的一个或多个实施例,示例3提供了示例2的方法,所述当前内窥镜图像的所述识别结果包括所述当前内窥镜图像分别属于体内图像、体外图像和回盲图像的概率,所述目标位置为所述体内,所述融合结果包括体外融合概率,所述根据所述融合结果和所述位置状态,确定所述内窥镜镜体是否满足到达所述目标位置对应的预设条件,包括:
在所述位置状态表征所述内窥镜镜体位于所述体外且所述体外融合概率小于等于第一预设概率阈值的情况下,确定所述内窥镜镜体满足到达所述体内对应的预设条件。
根据本公开的一个或多个实施例,示例4提供了示例3的方法,所述目标位置为所述回盲,所述融合结果包括回盲融合概率,所述根据所述融合结果和所述位置状态,确定所述内窥镜镜体是否满足到达所述目标位置对应的预设条件,包括:
在所述位置状态表征所述内窥镜镜体位于所述体内、未位于所述回盲且所述回盲融合概率大于等于第二预设概率阈值的情况下，确定所述内窥镜镜体满足到达所述回盲对应的预设条件。
根据本公开的一个或多个实施例,示例5提供了示例4的方法,所述目标位置为所述体外,所述根据所述融合结果和所述位置状态,确定所述内窥镜镜体是否满足到达所述目标位置对应的预设条件,包括:
在所述位置状态表征所述内窥镜镜体位于所述回盲且所述体外融合概率大于等于第三预设概率阈值的情况下,确定所述内窥镜镜体满足到达所述体外对应的预设条件。
根据本公开的一个或多个实施例,示例6提供了示例1-5任一项的方法,其特征在于,所述内窥镜图像识别模型通过以下方式训练得到:
获取内窥镜图像样本;
对所述内窥镜图像样本进行数据增强,得到内窥镜图像增强样本;
根据所述内窥镜图像增强样本对所述内窥镜图像识别模型进行训练,以得到训练好的内窥镜图像识别模型。
根据本公开的一个或多个实施例,示例7提供了示例6的方法,所述内窥镜图像识别模型包括CNN网络、特征聚合层和全连接层,所述根据所述内窥镜图像增强样本对所述内窥镜图像识别模型进行训练包括:
将所述内窥镜图像增强样本输入至所述CNN网络进行特征提取处理,得到所述CNN网络输出的特征信息;
将所述特征信息输入至所述特征聚合层进行广义均值池化,得到目标特征信息;
将所述目标特征信息输入至所述全连接层,得到预测识别结果;
根据所述预测识别结果和所述内窥镜图像增强样本对应的样本标签,确定损失函数;
根据所述损失函数,调整所述内窥镜图像识别模型的参数。
根据本公开的一个或多个实施例,示例8提供了示例7的方法,所述将所述内窥镜图像增强样本输入至所述CNN网络进行特征提取处理,得到所述CNN网络输出的特征信息,包括:
将所述内窥镜图像增强样本输入至所述CNN网络,在所述CNN网络中采用正则化方法进行特征提取处理,得到所述CNN网络输出的特征信息。
根据本公开的一个或多个实施例,示例9提供了一种基于图像识别的内窥镜进退镜时间确定装置,所述内窥镜进退镜时间确定装置包括:
获取模块,用于获取当前内窥镜图像和位置状态,所述位置状态用于表征在获取所述当前内窥镜图像前内窥镜镜体的位置;
识别模块,用于根据预训练好的内窥镜图像识别模型对所述当前内窥镜图像进行处理,得到识别结果;
融合模块,用于根据所述当前内窥镜图像的识别结果和位于所述当前内窥镜图像前预设帧数的内窥镜图像的识别结果,确定融合结果,所述融合结果用于表征获取所述当前内窥镜图像时所述内窥镜镜体的位置;
确定模块,用于根据所述融合结果、所述位置状态和所述当前内窥镜图像对应的时刻,确定所述内窥镜镜体到达目标位置的时刻,所述目标位置包括体内、回盲或体外。
根据本公开的一个或多个实施例,示例10提供了一种计算机可读介质,其上存储有计算 机程序,所述计算机程序被处理装置执行时实现示例1-8中任一项所述内窥镜进退镜时间确定方法的步骤。
根据本公开的一个或多个实施例,示例11提供了一种电子设备,包括:
存储装置,其上存储有计算机程序;
处理装置,用于执行所述存储装置中的所述计算机程序,以实现示例1-8中任一项所述内窥镜进退镜时间确定方法的步骤。
根据本公开的一个或多个实施例,示例12提供了一种计算机程序,所述计算机程序被处理器执行时,实现示例1-8中任一项所述内窥镜进退镜时间确定方法的步骤。
根据本公开的一个或多个实施例,示例13提供了一种计算机程序产品,包括计算机程序,所述计算机程序被处理器执行时,实现示例1-8中任一项所述内窥镜进退镜时间确定方法的步骤。
以上描述仅为本公开的较佳实施例以及对所运用技术原理的说明。本领域技术人员应当理解,本公开中所涉及的公开范围,并不限于上述技术特征的特定组合而成的技术方案,同时也应涵盖在不脱离上述公开构思的情况下,由上述技术特征或其等同特征进行任意组合而形成的其它技术方案。例如上述特征与本公开中公开的(但不限于)具有类似功能的技术特征进行互相替换而形成的技术方案。
此外,虽然采用特定次序描绘了各操作,但是这不应当理解为要求这些操作以所示出的特定次序或以顺序次序来执行。在一定环境下,多任务和并行处理可能是有利的。同样地,虽然在上面论述中包含了若干具体实现细节,但是这些不应当被解释为对本公开的范围的限制。在单独的实施例的上下文中描述的某些特征还可以组合地实现在单个实施例中。相反地,在单个实施例的上下文中描述的各种特征也可以单独地或以任何合适的子组合的方式实现在多个实施例中。
尽管已经采用特定于结构特征和/或方法逻辑动作的语言描述了本主题,但是应当理解所附权利要求书中所限定的主题未必局限于上面描述的特定特征或动作。相反,上面所描述的特定特征和动作仅仅是实现权利要求书的示例形式。关于上述实施例中的装置,其中各个模块执行操作的具体方式已经在有关该方法的实施例中进行了详细描述,此处将不做详细阐述说明。

Claims (13)

  1. 一种基于图像识别的内窥镜进退镜时间确定方法,所述内窥镜进退镜时间确定方法包括:
    获取当前内窥镜图像和位置状态,所述位置状态用于表征在获取所述当前内窥镜图像前内窥镜镜体的位置;
    根据预训练好的内窥镜图像识别模型对所述当前内窥镜图像进行处理,得到识别结果;
    根据所述当前内窥镜图像的所述识别结果和位于所述当前内窥镜图像前预设帧数的内窥镜图像的识别结果,确定融合结果,所述融合结果用于表征获取所述当前内窥镜图像时所述内窥镜镜体的位置;
    根据所述融合结果、所述位置状态和所述当前内窥镜图像对应的时刻,确定所述内窥镜镜体到达目标位置的时刻,所述目标位置包括体内、回盲或体外。
  2. 根据权利要求1所述的内窥镜进退镜时间确定方法,其中,所述根据所述融合结果、所述位置状态和所述当前内窥镜图像对应的时刻,确定所述内窥镜镜体到达目标位置的时刻,包括:
    根据所述融合结果和所述位置状态,确定所述内窥镜镜体是否满足到达所述目标位置对应的预设条件;
    在确定所述内窥镜镜体满足到达所述目标位置对应的预设条件的情况下,将所述当前内窥镜图像对应的时刻确定为所述内窥镜镜体到达所述目标位置的时刻,并根据所述目标位置更新所述位置状态。
  3. 根据权利要求2所述的内窥镜进退镜时间确定方法,其中,所述当前内窥镜图像的所述识别结果包括所述当前内窥镜图像分别属于体内图像、体外图像和回盲图像的概率,所述目标位置为所述体内,所述融合结果包括体外融合概率,所述根据所述融合结果和所述位置状态,确定所述内窥镜镜体是否满足到达所述目标位置对应的预设条件,包括:
    在所述位置状态表征所述内窥镜镜体位于所述体外且所述体外融合概率小于等于第一预设概率阈值的情况下,确定所述内窥镜镜体满足到达所述体内对应的预设条件。
  4. 根据权利要求3所述的内窥镜进退镜时间确定方法,其中,所述目标位置为所述回盲,所述融合结果包括回盲融合概率,所述根据所述融合结果和所述位置状态,确定所述内窥镜镜体是否满足到达所述目标位置对应的预设条件,包括:
    在所述位置状态表征所述内窥镜镜体位于所述体内、未位于所述回盲且所述回盲融合概率大于等于第二预设概率阈值的情况下,确定所述内窥镜镜体满足到达所述回盲对应的预设条件。
  5. 根据权利要求4所述的内窥镜进退镜时间确定方法,其中,所述目标位置为所述体外,所述根据所述融合结果和所述位置状态,确定所述内窥镜镜体是否满足到达所述目标位置对应的预设条件,包括:
    在所述位置状态表征所述内窥镜镜体位于所述回盲且所述体外融合概率大于等于第三预设概率阈值的情况下,确定所述内窥镜镜体满足到达所述体外对应的预设条件。
  6. 根据权利要求1-5中任一所述的内窥镜进退镜时间确定方法，其中，所述内窥镜图像识别模型通过以下方式训练得到：
    获取内窥镜图像样本;
    对所述内窥镜图像样本进行数据增强,得到内窥镜图像增强样本;
    根据所述内窥镜图像增强样本对所述内窥镜图像识别模型进行训练,以得到训练好的内窥镜图像识别模型。
  7. 根据权利要求6所述的内窥镜进退镜时间确定方法,其中,所述内窥镜图像识别模型包括卷积神经网络CNN网络、特征聚合层和全连接层,所述根据所述内窥镜图像增强样本对所述内窥镜图像识别模型进行训练包括:
    将所述内窥镜图像增强样本输入至所述CNN网络进行特征提取处理,得到所述CNN网络输出的特征信息;
    将所述特征信息输入至所述特征聚合层进行广义均值池化,得到目标特征信息;
    将所述目标特征信息输入至所述全连接层,得到预测识别结果;
    根据所述预测识别结果和所述内窥镜图像增强样本对应的样本标签,确定损失函数;
    根据所述损失函数,调整所述内窥镜图像识别模型的参数。
  8. 根据权利要求7所述的内窥镜进退镜时间确定方法,其中,所述将所述内窥镜图像增强样本输入至所述CNN网络进行特征提取处理,得到所述CNN网络输出的特征信息,包括:
    将所述内窥镜图像增强样本输入至所述CNN网络,在所述CNN网络中采用正则化方法进行特征提取处理,得到所述CNN网络输出的特征信息。
  9. 一种基于图像识别的内窥镜进退镜时间确定装置,所述内窥镜进退镜时间确定装置包括:
    获取模块,用于获取当前内窥镜图像和位置状态,所述位置状态用于表征在获取所述当前内窥镜图像前内窥镜镜体的位置;
    识别模块,用于根据预训练好的内窥镜图像识别模型对所述当前内窥镜图像进行处理,得到识别结果;
    融合模块,用于根据所述当前内窥镜图像的识别结果和位于所述当前内窥镜图像前预设帧数的内窥镜图像的识别结果,确定融合结果,所述融合结果用于表征获取所述当前内窥镜图像时所述内窥镜镜体的位置;
    确定模块,用于根据所述融合结果、所述位置状态和所述当前内窥镜图像对应的时刻,确定所述内窥镜镜体到达目标位置的时刻,所述目标位置包括体内、回盲或体外。
  10. 一种计算机可读介质,其上存储有计算机程序,所述计算机程序被处理装置执行时实现权利要求1-8中任一项所述内窥镜进退镜时间确定方法的步骤。
  11. 一种电子设备,包括:
    存储装置,其上存储有计算机程序;
    处理装置,用于执行所述存储装置中的所述计算机程序,以实现权利要求1-8中任一项所述内窥镜进退镜时间确定方法的步骤。
  12. 一种计算机程序,所述计算机程序被处理器执行时,实现权利要求1-8中任一项所述内窥镜进退镜时间确定方法的步骤。
  13. 一种计算机程序产品，包括计算机程序，所述计算机程序被处理器执行时，实现权利要求1-8中任一项所述内窥镜进退镜时间确定方法的步骤。
PCT/CN2023/087314 2022-04-29 2023-04-10 基于图像识别的内窥镜进退镜时间确定方法及装置 WO2023207564A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210472934.X 2022-04-29
CN202210472934.XA CN114782388A (zh) 2022-04-29 2022-04-29 基于图像识别的内窥镜进退镜时间确定方法及装置

Publications (1)

Publication Number Publication Date
WO2023207564A1 true WO2023207564A1 (zh) 2023-11-02

Family

ID=82434973

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/087314 WO2023207564A1 (zh) 2022-04-29 2023-04-10 基于图像识别的内窥镜进退镜时间确定方法及装置

Country Status (2)

Country Link
CN (1) CN114782388A (zh)
WO (1) WO2023207564A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114782388A (zh) * 2022-04-29 2022-07-22 小荷医疗器械(海南)有限公司 基于图像识别的内窥镜进退镜时间确定方法及装置
CN115553685B (zh) * 2022-10-24 2024-05-07 南京索图科技有限公司 一种判断内窥镜进出的方法
CN116051486A (zh) * 2022-12-29 2023-05-02 抖音视界有限公司 内窥镜图像识别模型的训练方法、图像识别方法及装置
CN116188466B (zh) * 2023-04-26 2023-07-21 广州思德医疗科技有限公司 医疗器械体内停留时长确定方法和装置

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111012285A (zh) * 2019-12-06 2020-04-17 腾讯科技(深圳)有限公司 内窥镜移动时间确定方法、装置和计算机设备
CN110991561A (zh) * 2019-12-20 2020-04-10 山东大学齐鲁医院 一种下消化道内窥镜图像识别方法及系统
CN111000633A (zh) * 2019-12-20 2020-04-14 山东大学齐鲁医院 一种内镜诊疗操作过程的监控方法及系统
JP2021037356A (ja) * 2020-12-01 2021-03-11 Hoya株式会社 内視鏡用プロセッサ、情報処理装置、内視鏡システム、プログラム及び情報処理方法
CN114782388A (zh) * 2022-04-29 2022-07-22 小荷医疗器械(海南)有限公司 基于图像识别的内窥镜进退镜时间确定方法及装置

Also Published As

Publication number Publication date
CN114782388A (zh) 2022-07-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23794997

Country of ref document: EP

Kind code of ref document: A1