WO2021261727A1 - Capsule endoscopy image reading system and method
- Publication number: WO2021261727A1 (PCT/KR2021/004735)
- Authority: WO — WIPO (PCT)
- Prior art keywords: image, capsule endoscope, capsule, endoscope image, images
Classifications
- A61B1/00 — Instruments for performing medical examinations of the interior of cavities or tubes of the body by visual or photographical inspection, e.g. endoscopes; illuminating arrangements therefor
- A61B1/00009 — Operational features of endoscopes characterised by electronic signal processing of image signals during a use of endoscope
- A61B1/00045 — Display arrangement
- A61B1/00108 — Constructional details of the endoscope body characterised by self-sufficient functionality for stand-alone use
- A61B1/04 — Endoscopes combined with photographic or television appliances
- A61B1/041 — Capsule endoscopes for imaging
- A61B2576/00 — Medical imaging apparatus involving image processing or analysis
- G06T5/70 — Denoising; smoothing
- G06T7/0012 — Biomedical image inspection
- G06T2207/10068 — Endoscopic image
- G06T2207/20081 — Training; learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/30096 — Tumor; lesion
- G16H30/40 — ICT specially adapted for processing medical images, e.g. editing
- G16H50/20 — ICT specially adapted for computer-aided diagnosis, e.g. based on medical expert systems
Definitions
- The present invention relates to a capsule endoscope image reading system and method.
- A capsule endoscope is a pill-shaped capsule that is swallowed and photographs the condition of the esophagus, stomach, and small intestine; the resulting images are analyzed and read to diagnose digestive diseases.
- The small intestine lies between the stomach and the large intestine and is about 6 meters long. Imaging the small intestine with a capsule endoscope takes more than 10 hours and records more than 50,000 images, so there are limits to the time and accuracy with which a doctor can read them directly.
- Embodiments provide a capsule endoscope image reading system and method that can reduce a doctor's reading time and increase reading accuracy for the large volume of endoscopic images taken with a capsule endoscope.
- Embodiments also provide a capsule endoscopy image reading system and method that visualize the location of a lesion in a capsule endoscopy image without separate labeling, so that a doctor can intuitively check where the lesion is.
- However, the technical tasks to be achieved by the present embodiments are not limited to those described above, and other technical tasks may exist.
- One embodiment provides a capsule endoscope image reading system comprising: a pre-processing unit for pre-processing a capsule endoscope image taken by a capsule endoscope; a convolutional neural network (CNN) that takes the preprocessed capsule endoscope image as input and determines whether a lesion exists in the image; and a grad-CAM acquisition unit for acquiring a gradient class activation map (grad-CAM) for the capsule endoscope image. The convolutional neural network includes an input layer that receives the preprocessed capsule endoscope image; one or more convolution layers that extract features from the preprocessed image input through the input layer; one or more max pooling layers that subsample those features; and an output layer that outputs a probability value indicating the presence or absence of a lesion in the capsule endoscope image. The grad-CAM acquisition unit acquires the grad-CAM from the layer determined to have the highest lesion location detection ability among the convolution layers and max pooling layers.
- Another embodiment provides a capsule endoscope image reading method using a capsule endoscope image reading system, comprising: a pre-processing step of pre-processing a capsule endoscope image taken by a capsule endoscope; an input step of receiving the pre-processed capsule endoscope image; a processing step of repeatedly executing a processing operation that extracts features from the pre-processed capsule endoscope image received in the input step and subsamples the extracted features; a grad-CAM acquisition step of acquiring a gradient class activation map (grad-CAM) based on the result of the processing step; and an output step of outputting a probability value indicating whether a lesion exists in the capsule endoscope image.
- According to the capsule endoscope image reading system and method of the embodiments, it is possible to reduce a doctor's reading time and increase accuracy for the large volume of endoscopic images taken with a capsule endoscope.
- In addition, the position of a lesion in the capsule endoscope image is visualized without separate labeling, so the doctor can intuitively check where the lesion is.
- FIG. 1 is a block diagram showing an example of a capsule endoscope image reading system according to the present invention.
- FIG. 2 is a block diagram illustrating an example of a preprocessor of a capsule endoscope image reading system according to the present invention.
- FIG. 3 is a diagram illustrating an example of removing noise from an image through the preprocessor of FIG. 2 .
- FIG. 4 is a diagram illustrating an example of augmenting an image through the preprocessor of FIG. 2 .
- FIG. 5 is a diagram illustrating a convolutional neural network of a capsule endoscope image reading system according to the present invention.
- FIG. 6 is a view showing an example of grad-CAM generated by the capsule endoscopy image reading system according to the present invention.
- FIG. 7 is a diagram illustrating an example in which the capsule endoscopy image reading system according to the present invention determines a layer from which grad-CAM is acquired.
- FIG. 8 is a view showing an example of the structure of a video clip generated by the capsule endoscope image reading system according to the present invention.
- FIG. 9 is a diagram illustrating an example of a frame included in a video clip generated by the capsule endoscope image reading system according to the present invention.
- FIG. 10 is a diagram illustrating an example of a frame included in two video clips generated by the capsule endoscope image reading system according to the present invention.
- FIG. 11 is a diagram illustrating a new video clip in which the two video clips of FIG. 10 are merged.
- FIG. 12 is a view showing a capsule endoscope image set applied to the capsule endoscope image reading system according to the present invention.
- FIG. 13 is a flowchart of a capsule endoscopy image reading method according to the present invention.
- FIG. 14 is a flowchart illustrating details of a pre-processing step of a capsule endoscopy image reading method according to the present invention.
- a "part" includes a unit realized by hardware, a unit realized by software, and a unit realized using both.
- one unit may be implemented using two or more hardware, and two or more units may be implemented by one hardware.
- Some of the operations or functions described as being performed by the terminal, apparatus, or device in the present specification may be performed instead of by a server connected to the terminal, apparatus, or device. Similarly, some of the operations or functions described as being performed by the server may also be performed in a terminal, apparatus, or device connected to the server.
- mapping or matching with the terminal means mapping or matching the terminal's unique number or personal identification information, which is the identification data of the terminal. can be interpreted as
- FIG. 1 is a block diagram showing an example of a capsule endoscope image reading system according to the present invention.
- Referring to FIG. 1, the capsule endoscopy image reading system 100 may include a preprocessor 110, a convolutional neural network 120, and a gradient class activation map (grad-CAM) acquisition unit 130.
- The preprocessor 110 may preprocess the capsule endoscope image captured by the capsule endoscope.
- The convolutional neural network 120 may receive the capsule endoscope image preprocessed by the preprocessor 110 as input and determine whether a lesion exists in the image.
- By processing a large volume of capsule endoscopy images through the convolutional neural network 120, the capsule endoscopy image reading system makes it possible to read the presence of lesions faster than a doctor judging the images one by one with the naked eye.
- The convolutional neural network 120 may include an input layer 121 that receives the preprocessed capsule endoscope image from the preprocessor 110, one or more convolution layers 122 that extract features from the preprocessed image input through the input layer 121, one or more max pooling layers 123 that subsample those features, and an output layer 124 that outputs a probability value indicating whether a lesion exists in the capsule endoscope image.
- The grad-CAM acquisition unit 130 may acquire the grad-CAM for the capsule endoscopy image.
- In this case, the grad-CAM acquisition unit 130 may acquire the grad-CAM from the layer determined to have the highest lesion location detection capability among the convolution layers 122 and max pooling layers 123 described above.
- The capsule endoscopy image reading system 100 may further include a video clip generation unit 140 in addition to the preprocessor 110, the convolutional neural network 120, and the grad-CAM acquisition unit 130 described above.
- The video clip generator 140 may generate a video clip corresponding to the capsule endoscope image based on the capsule endoscope image.
- FIG. 2 is a block diagram illustrating an example of a preprocessor of a capsule endoscope image reading system according to the present invention.
- The pre-processing unit 110 may include a noise removing unit 111 and an image augmentation unit 112.
- The noise removing unit 111 may remove noise from the capsule endoscope image input to the pre-processing unit 110.
- The image augmentation unit 112 may generate a plurality of augmented images by performing at least one of rotation and vertical inversion on the noise-removed capsule endoscope image.
- FIG. 3 is a diagram illustrating an example of removing noise from an image through the preprocessor of FIG. 2 .
- The noise removing unit 111 may remove, from the capsule endoscope image, noise consisting of the letters, numbers, and symbols recorded in the image. Letters, numbers, symbols, and the like are irrelevant to whether a lesion exists, and they may interfere with the convolutional neural network 120 when it determines whether a lesion exists in the capsule endoscopy image.
- For example, the noise removing unit 111 may delete the letters, numbers, and symbols indicating the time, date, photographing equipment, and the like recorded along the edge of a 576 × 576 × 3 capsule endoscopy image, producing a 512 × 512 × 3 capsule endoscopy image.
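As a rough illustration of this step, the sketch below center-crops a 576 × 576 × 3 frame to 512 × 512 × 3. The function name and the use of a plain center crop are assumptions for illustration; the patent does not specify the exact cropping logic.

```python
import numpy as np

def remove_edge_annotations(image: np.ndarray, target: int = 512) -> np.ndarray:
    """Center-crop a capsule endoscopy frame to drop the border region where
    time/date/equipment annotations are typically recorded (assumed logic).

    image: H x W x 3 array (e.g., 576 x 576 x 3).
    Returns a target x target x 3 array (e.g., 512 x 512 x 3).
    """
    h, w, _ = image.shape
    top = (h - target) // 2
    left = (w - target) // 2
    return image[top:top + target, left:left + target, :]

# Example: a 576 x 576 x 3 frame becomes 512 x 512 x 3.
frame = np.zeros((576, 576, 3), dtype=np.uint8)
assert remove_edge_annotations(frame).shape == (512, 512, 3)
```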
- FIG. 4 is a diagram illustrating an example of augmenting an image through the preprocessor of FIG. 2 .
- The image augmentation unit 112 may generate eight augmented images by performing at least one of rotation (by 90, 180, or 270 degrees) and vertical inversion on the noise-removed capsule endoscope image. FIG. 4 illustrates the case in which both the rotations (90/180/270 degrees) and vertical inversion are performed; however, the image augmentation unit 112 may also generate fewer than eight augmented images by performing only some of the rotations and up/down inversions.
- The image augmentation unit 112 uses only rotation and vertical inversion so that the characteristics of a lesion present in the original capsule endoscopy image are not damaged in the augmented images during image processing.
- Capsule endoscopy images generally show the small intestine as a circular image on a black background.
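The augmentation described above can be illustrated with a short sketch. Here the eight views are taken as the four rotations (0/90/180/270 degrees), each with and without vertical inversion; the patent states the count of eight but not the exact combination, so this pairing is an assumption.

```python
import numpy as np

def augment(image: np.ndarray) -> list[np.ndarray]:
    """Generate 8 views: the image rotated by 0/90/180/270 degrees, each
    with and without a vertical (up-down) flip. Rotations and flips do not
    distort pixel values, so lesion characteristics are preserved."""
    views = []
    for k in range(4):                    # 0, 90, 180, 270 degrees
        rotated = np.rot90(image, k)
        views.append(rotated)
        views.append(np.flipud(rotated))  # vertical inversion
    return views

assert len(augment(np.zeros((512, 512, 3)))) == 8
```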
- FIG. 5 is a diagram illustrating a convolutional neural network of a capsule endoscope image reading system according to the present invention.
- The convolutional neural network 120 may use the convolution layers 122 to extract features from the preprocessed capsule endoscope image received through the input layer 121.
- The convolutional neural network 120 may subsample the features extracted by the convolution layers 122 through the max pooling layers 123.
- The convolutional neural network 120 may repeat the process of subsampling the features extracted by one convolution layer through one max pooling layer and then feeding the result into another convolution layer.
- Using the result produced by the convolution layers 122 and max pooling layers 123, the convolutional neural network 120 may output, through the output layer 124, a probability value indicating whether a lesion exists in the capsule endoscopy image.
- The output layer 124 may analyze the result produced by the convolution layers 122 and max pooling layers 123 using one or more fully connected layers and convert it into a probability value indicating the presence or absence of a lesion by applying a transform function such as softmax.
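A minimal sketch of such a network follows, written in PyTorch with four convolution layers and three max pooling layers to mirror the CONV_1 to CONV_4 and MAXP_1 to MAXP_3 layers referenced in FIG. 7. All channel counts, kernel sizes, and the use of global average pooling before the fully connected layer are illustrative assumptions; the patent does not fix these details.

```python
import torch
import torch.nn as nn

class LesionCNN(nn.Module):
    """Sketch of the described network: stacked convolution and max pooling
    layers, fully connected classification, and a softmax output."""
    def __init__(self) -> None:
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # CONV_1 / MAXP_1
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # CONV_2 / MAXP_2
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # CONV_3 / MAXP_3
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),                  # CONV_4
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),  # keeps the sketch small; not fixed by the patent
            nn.Flatten(),
            nn.Linear(128, 2),        # fully connected layer
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Softmax converts the result into probability values for
        # [no lesion, lesion] with respect to the input image.
        return torch.softmax(self.classifier(self.features(x)), dim=1)

probs = LesionCNN()(torch.randn(1, 3, 512, 512))  # shape (1, 2)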
- The grad-CAM acquisition unit 130 may acquire the grad-CAM from the layer determined to have the highest lesion location detection capability among the convolution layers 122 and max pooling layers 123.
- Through the grad-CAM, the doctor can check with high accuracy which part of a capsule endoscopy image judged to contain a lesion influenced the judgment that the lesion exists.
- A CAM (Class Activation Map) is a map that visualizes the weighted sum of the feature maps, computed using the weights immediately before the layer that predicts the probability value, so the regions that drove the prediction can be found. If the CAM is superimposed on the capsule endoscopy image, the area where the lesion has occurred can easily be identified.
- The grad-CAM acquisition unit 130 acquires a grad-CAM, which yields a CAM-like result without depending on a global average pooling (GAP) layer and without modifying the structure of the convolutional neural network. Because grad-CAM imposes no constraints on the network structure, both the ability to detect the presence of lesions and the ability to track their location can be improved.
- The grad-CAM may obtain an importance weight from the gradient of the above-described convolution layer or max pooling layer and the feature maps output by that layer, by the following equation:

  $$\alpha_k^c = \frac{1}{Z}\sum_i \sum_j \frac{\partial y^c}{\partial A_{ij}^k}$$

- Here, $y^c$ is the score for class $c$, $A^k$ is the $k$-th feature map output by the layer, $A_{ij}^k$ is the value of that feature map at the observed position $(i, j)$, $Z$ is the number of spatial positions, and $\partial y^c / \partial A_{ij}^k$ denotes the gradient, i.e., the effect of a change in $A$ on $y^c$.
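The computation can be sketched as follows, assuming a PyTorch model such as the one above. The hooks capture the chosen layer's feature maps and their gradients; the per-channel weights are the spatially averaged gradients, matching the equation above, and the ReLU of the weighted feature-map sum gives the map (as in the original grad-CAM formulation). The helper name and the final normalization are illustrative.

```python
import torch

def grad_cam(model, layer, image, target_class):
    """Compute a grad-CAM heat map for `image` (shape 3 x H x W) from the
    feature maps and gradients of `layer` inside `model`."""
    feats, grads = {}, {}
    h1 = layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
    h2 = layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))
    score = model(image.unsqueeze(0))[0, target_class]   # y^c
    model.zero_grad()
    score.backward()
    h1.remove(); h2.remove()
    # alpha_k^c: spatial mean of dy^c / dA^k_ij, one weight per feature map k
    alpha = grads["a"].mean(dim=(2, 3), keepdim=True)
    # Weighted sum over feature maps, then ReLU to keep positive influence
    cam = torch.relu((alpha * feats["a"]).sum(dim=1))
    return (cam / (cam.max() + 1e-8)).detach()           # normalized for overlay
```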
- FIG. 6 is a view showing an example of grad-CAM generated by the capsule endoscopy image reading system according to the present invention.
- The grad-CAM for the capsule endoscopy image on the left is displayed on the right.
- The middle region, marked in a different color from the outer region, had a large influence on the determination that a lesion is present, and it can be seen that the probability of a lesion in this capsule endoscopy image is about 71.64%.
- A viewer (e.g., a doctor) reading the capsule endoscope image can thus check the location of the lesion without separate labeling.
- FIG. 7 is a diagram illustrating an example in which the capsule endoscopy image reading system according to the present invention determines a layer from which grad-CAM is acquired.
- The grad-CAM acquisition unit 130 may acquire the grad-CAM from the layer determined to have the highest lesion location detection ability among the one or more convolution layers 122 and one or more max pooling layers 123.
- Having the highest lesion location detection ability means that the region the grad-CAM identifies as most important for determining the lesion existence probability best matches the region where the lesion actually exists in the capsule endoscopy image.
- For a capsule endoscopy image, the grad-CAM acquisition unit 130 may determine the layer with the highest lesion localization ability based on the grad-CAMs of the first to fourth convolution layers (CONV_1 to CONV_4) included in the convolution layers 122 and the first to third max pooling layers (MAXP_1 to MAXP_3) included in the max pooling layers 123.
- To determine the layer with the highest lesion location detection capability, the grad-CAM acquisition unit 130 may acquire grad-CAMs for one or more test capsule endoscopy images. If there are multiple test images, the grad-CAM acquisition unit 130 may, for example, average the lesion location detection capability of the grad-CAM over the test images separately for each layer, and take that average as the layer's capability.
- For example, the grad-CAM acquisition unit 130 may determine that the grad-CAM acquired from the first max pooling layer MAXP_1 has the highest lesion location detection capability on the test images. In this case, the grad-CAM acquisition unit 130 may thereafter acquire the grad-CAM for capsule endoscopy images from the first max pooling layer MAXP_1 among the convolution layers 122 and max pooling layers 123.
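One way to realize this layer selection is sketched below, reusing the grad_cam helper above. The patent does not define the matching metric, so the IoU between the thresholded grad-CAM and a ground-truth lesion mask is an assumed stand-in for "lesion location detection ability", and target class 1 is assumed to mean "lesion".

```python
import torch.nn.functional as F

def select_best_layer(model, layers, test_images, lesion_masks, thresh=0.5):
    """Pick the layer whose grad-CAM, averaged over the test images, best
    matches the true lesion regions (boolean masks of shape H x W)."""
    best_layer, best_score = None, -1.0
    for layer in layers:                   # e.g., CONV_1..CONV_4, MAXP_1..MAXP_3
        scores = []
        for image, mask in zip(test_images, lesion_masks):
            cam = grad_cam(model, layer, image, target_class=1)
            # Upsample the CAM to the mask's resolution before comparing.
            cam = F.interpolate(cam.unsqueeze(0), size=mask.shape,
                                mode="bilinear")[0, 0]
            pred = cam > thresh
            inter = (pred & mask).sum().item()
            union = (pred | mask).sum().item() or 1
            scores.append(inter / union)   # IoU as the matching score
        avg = sum(scores) / len(scores)
        if avg > best_score:
            best_layer, best_score = layer, avg
    return best_layer
```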
- FIG. 8 is a view showing an example of the structure of a video clip generated by the capsule endoscope image reading system according to the present invention.
- The video clip generating unit 140 may generate a video clip based on the capsule endoscope images in which a lesion exists.
- Since the presence or absence of a lesion is expressed through a probability value, the video clip generator 140 computes, for each capsule endoscopy image, the probability value indicating the presence of a lesion using the convolutional neural network 120 described above.
- A video clip corresponding to a capsule endoscope image may then be generated based on the images whose lesion probability value is equal to or greater than a threshold value (e.g., 0.8).
- FIG. 9 is a diagram illustrating an example of a frame included in a video clip generated by the capsule endoscope image reading system according to the present invention.
- The video clip generator 140 may create a video clip corresponding to a capsule endoscopy image in which a lesion is determined to exist (that is, whose probability value is greater than or equal to the threshold) by adding up to a maximum reference value (e.g., 5) of frames before and after that image.
- If fewer than five frames are available on one side (e.g., the image is the 4th frame after the start of the recording), all frames that can be appended on that side are added to the video clip.
- In general, the video clip generator 140 may generate a video clip by combining not only the N-th frame but also the A frames before and the A frames after the N-th frame.
- The video clip therefore includes not only the capsule endoscopy image determined to have a lesion but also the images before and after it, so that the viewer can observe through the clip the continuous change from normal images to the lesion image, and vice versa.
- A video clip consisting only of images predicted to have a lesion is unsuitable for reading because its first and last frames appear abruptly cut off to the viewer.
- The aforementioned reference value may be set arbitrarily at a level that does not inconvenience the viewer.
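A sketch of the per-frame window logic described above, under the assumption of 0-based frame indices:

```python
def clip_window(lesion_idx: int, num_frames: int, a: int = 5) -> tuple[int, int]:
    """Frame window for one lesion frame: up to `a` frames on each side,
    clamped to the bounds of the recording (fewer frames are used near the
    start or end, as described above)."""
    return max(0, lesion_idx - a), min(num_frames - 1, lesion_idx + a)

# A lesion in frame 3 of a long video yields frames 0..8 (only 3 before it).
assert clip_window(3, 50000) == (0, 8)
```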
- FIG. 10 is a diagram illustrating an example of a frame included in two video clips generated by the capsule endoscope image reading system according to the present invention.
- The video clip generating unit 140 may generate video clip 1 based on the capsule endoscope image of frame M, and video clip 2 based on the capsule endoscope image of frame N.
- Frame (M+A), the last frame of video clip 1, comes after frame (N−A), the start frame of video clip 2; that is, video clip 1 and video clip 2 have frames that overlap each other.
- In this case, instead of generating video clip 1 and video clip 2 separately, the video clip generator 140 may merge them to create a single video clip.
- FIG. 11 is a diagram illustrating a new video clip in which the two video clips of FIG. 10 are merged;
- The video clip generator 140 may generate video clip 3 by merging the aforementioned video clip 1 and video clip 2.
- Video clip 3 may include every frame from frame (M−A), the earliest frame in video clip 1 or video clip 2, through frame (N+A), the latest frame.
- In this way, the video clip generating unit 140 can express the change between frame M and frame N, the capsule endoscopy images in which a lesion exists, through one video clip.
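This merging behaves like the classic merge of overlapping intervals; a sketch follows, with frame windows as (start, end) pairs from the clip_window helper above.

```python
def merge_clips(windows: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Merge overlapping clip windows so that two lesion frames whose
    surrounding frames overlap yield a single video clip."""
    merged: list[tuple[int, int]] = []
    for start, end in sorted(windows):
        if merged and start <= merged[-1][1]:            # overlaps the previous clip
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# With M=100, N=107, A=5: clips (95, 105) and (102, 112) share frames,
# so they merge into one clip covering (M-A) .. (N+A).
assert merge_clips([(95, 105), (102, 112)]) == [(95, 112)]
```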
- FIG. 12 is a view showing a capsule endoscope image set applied to the capsule endoscope image reading system according to the present invention.
- The ratio between the number of images in the training image set (the set of capsule endoscopy images used to train the convolutional neural network 120 of the capsule endoscope image reading system) and the number of images in the test image set (used to test the convolutional neural network 120) may be determined as a preset ratio value.
- For example, (number of images in the training image set) : (number of images in the test image set) may be 7:3 or 8:2.
- The ratio value may be selected as the value that most increases the accuracy of the lesion existence probability, and may be set as a fixed value in the capsule endoscopy image reading system.
- The input layer 121 of the convolutional neural network 120 may receive the same number of images with lesions and images without lesions. This is because, during training of the convolutional neural network 120, if images of one class (with or without a lesion) are input excessively, the network is likely to become biased toward that class.
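A sketch of preparing such a split is shown below. The 8:2 ratio and the strategy of down-sampling the larger class to equalize lesion and no-lesion counts are illustrative choices consistent with the description, not details fixed by the patent.

```python
import random

def balanced_split(lesion_imgs, normal_imgs, train_ratio=0.8, seed=0):
    """Split into training and test sets at a preset ratio (e.g., 8:2),
    keeping the number of lesion and no-lesion images equal so the
    network is not biased toward one class."""
    rng = random.Random(seed)
    n = min(len(lesion_imgs), len(normal_imgs))  # equalize class counts
    lesion = rng.sample(lesion_imgs, n)
    normal = rng.sample(normal_imgs, n)
    cut = int(n * train_ratio)
    train = lesion[:cut] + normal[:cut]
    test = lesion[cut:] + normal[cut:]
    rng.shuffle(train)
    return train, test
```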
- FIG. 13 is a flowchart of a capsule endoscopy image reading method according to the present invention.
- The capsule endoscope image reading method 1300 may include a pre-processing step (S1310) of pre-processing the capsule endoscope image captured by the capsule endoscope.
- The method may include an input step (S1320) of receiving the capsule endoscope image preprocessed in step S1310.
- The method may include a processing step (S1330) of repeatedly executing a processing operation that extracts features from the preprocessed capsule endoscope image received in step S1320 and subsamples the extracted features.
- The method may include a grad-CAM acquisition step (S1340) of acquiring a gradient class activation map (grad-CAM) based on the result of step S1330.
- The method may include an output step (S1350) of outputting a probability value indicating whether a lesion exists in the capsule endoscope image.
- The capsule endoscope image reading method 1300 may be performed through the capsule endoscope image reading system 100 described above.
- FIG. 14 is a flowchart illustrating details of a pre-processing step of a capsule endoscopy image reading method according to the present invention.
- The pre-processing step (S1310) may include a noise removing step (S1410) of removing noise from the capsule endoscopy image.
- The pre-processing step (S1310) may also include an image augmentation step (S1420) of generating a plurality of augmented images by performing at least one of rotation and vertical inversion on the capsule endoscope image from which noise was removed in step S1410.
- The pre-processing step (S1310) may be performed by the pre-processing unit 110 of the capsule endoscope image reading system described above.
- The capsule endoscopy image reading system and method described in the embodiments of the present invention can reduce a doctor's reading time and increase accuracy for the large volume of endoscopic images taken with a capsule endoscope.
- The deep learning model described in the embodiments of the present invention may be a model in which artificial neural network layers are stacked in multiple levels. That is, a deep learning model automatically learns the features of input values by training a deep neural network composed of multi-layered networks on a large amount of data, and through this the network is trained to minimize the error of an objective function, i.e., to maximize prediction accuracy.
- For example, the deep learning model may be a convolutional neural network (CNN).
- A deep learning model can be implemented through a deep learning framework.
- A deep learning framework provides, in library form, the functions commonly used when developing deep learning models, and supports making good use of the system software or hardware platform.
- The deep learning model may be implemented using any deep learning framework that is currently public or will be released in the future.
- The capsule endoscope image reading system described above may be implemented by a computing device that includes at least some of a processor, a memory, a user input device, and a presentation device.
- A memory is a medium that stores computer-readable software, applications, program modules, routines, instructions, and/or data coded to perform specific tasks when executed by a processor.
- The processor may read and execute the computer-readable software, applications, program modules, routines, instructions, and/or data stored in the memory.
- The user input device may be a means for allowing the user to input a command instructing the processor to execute a specific task, or to input the data required for executing that task.
- The user input device may include a physical or virtual keyboard or keypad, key buttons, a mouse, a joystick, a trackball, a touch-sensitive input means, or a microphone.
- The presentation device may include a display, a printer, a speaker, or a vibrator.
- Computing devices may include various devices such as smartphones, tablets, laptops, desktops, servers, clients, and the like.
- The computing device may be a single stand-alone device, or may comprise a plurality of computing devices cooperating with each other through a communication network in a distributed environment.
- The capsule endoscopy image reading method described above may be executed by a computing device that includes a processor and a memory storing computer-readable software, applications, program modules, routines, instructions, and/or data structures coded to perform an image diagnosis method using a deep learning model when executed by the processor.
- The above-described embodiments may be implemented through various means. For example, the present embodiments may be implemented by hardware, firmware, software, or a combination thereof.
- For implementation by hardware, the image diagnosis method using the deep learning model may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, or the like.
- The capsule endoscopy image reading method may be implemented using an artificial intelligence semiconductor device in which the neurons and synapses of a deep neural network are implemented with semiconductor elements.
- The semiconductor elements may be semiconductor devices in current use, such as SRAM, DRAM, or NAND, next-generation semiconductor devices such as RRAM, STT-MRAM, or PRAM, or a combination thereof.
- When the capsule endoscope image reading method according to the embodiments is implemented using an artificial intelligence semiconductor device, the results (weights) of training a deep learning model in software may be transcribed into synapse-mimicking devices arranged in an array, or the learning may proceed on the artificial intelligence semiconductor device itself.
- For implementation by firmware or software, the capsule endoscope image reading method may be implemented in the form of an apparatus, procedure, or function that performs the functions or operations described above.
- The software code may be stored in a memory unit and driven by a processor.
- The memory unit may be located inside or outside the processor and may exchange data with the processor by various known means.
- Terms such as "system", "processor", "controller", "component", "module", "interface", "model", or "unit" generally mean computer-related entities: hardware, a combination of hardware and software, software, or software in execution.
- The aforementioned components may be, but are not limited to, a process run by a processor, a processor, a controller, a controlling processor, an object, a thread of execution, a program, and/or a computer.
- Both an application running on a controller or processor, and the controller or processor itself, can be components.
- One or more components may reside within a process and/or thread of execution, and components may be located on one device (e.g., a system or computing device) or distributed across two or more devices.
- Another embodiment provides a computer program stored on a computer recording medium for performing the above-described capsule endoscope image reading method.
- Yet another embodiment provides a computer-readable recording medium on which a program for realizing the above-described capsule endoscope image reading method is recorded.
- The program recorded on the recording medium can be read by a computer, installed, and executed so as to carry out the steps described above.
- The above-described program may include code coded in a computer language that the computer's processor (CPU) can read through the computer's device interface, such as C, C++, JAVA, or machine language.
- Such code may include functional code related to the functions defining the operations described above, and control code related to the execution procedure the computer's processor needs in order to execute those functions in a predetermined order.
- The code may further include memory-reference code indicating where (at which address) in the computer's internal or external memory the additional information or media needed for the processor to execute the functions should be referenced.
- When the computer's processor needs to communicate with a remote computer or server to execute the functions, the code may further include communication-related code specifying how the processor should use the computer's communication module to communicate with the other computer or server, and what information or media should be transmitted and received during communication.
- The computer-readable recording medium on which the above-described program is recorded may be, for example, a ROM, RAM, CD-ROM, magnetic tape, floppy disk, or optical media storage device, and may also be implemented in the form of a carrier wave (e.g., transmission through the Internet).
- The computer-readable recording medium may be distributed over network-connected computer systems so that computer-readable code is stored and executed in a distributed manner.
- The capsule endoscopy image reading method described with reference to FIG. 10 may also be implemented in the form of a recording medium containing instructions executable by a computer, such as an application or program module executed by a computer.
- Computer-readable media can be any available media that can be accessed by a computer, and include volatile and nonvolatile media as well as removable and non-removable media. Computer-readable media may also include all computer storage media, i.e., volatile and nonvolatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data.
- The above-described capsule endoscope image reading method may be executed by an application installed on the terminal by default (which may include a program included in a platform or operating system mounted on the terminal by default), or by an application (i.e., a program) that the user installs directly on the terminal through an application providing server such as an application store server or a web server related to the corresponding service.
- In this sense, the above-described capsule endoscopy image reading method may be implemented as an application (i.e., a program) installed by default on a terminal or installed directly by a user, and may be recorded on a computer-readable recording medium such as one in the terminal.
Abstract
Disclosed in the present specification are a capsule endoscopy image reading system and method, the system comprising: a preprocessing unit for preprocessing a capsule endoscopy image captured by means of capsule endoscopy; a convolution neural network (CNN) using the preprocessed capsule endoscopy image as an input to determine whether a lesion is present in the capsule endoscopy image; and a gradient class activation map (grad-CAM) acquisition unit for acquiring a grad-CAM for the capsule endoscopy image.
Description
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can easily implement them. However, the present invention may be embodied in many different forms and is not limited to the embodiments described herein. In the drawings, parts irrelevant to the description are omitted in order to explain the present invention clearly, and similar reference numerals denote similar parts throughout the specification.
Throughout the specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "electrically connected" with another element interposed between them. When a part is said to "include" a certain component, this means that it may further include other components, rather than excluding them, unless specifically stated otherwise, and does not preclude in advance the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.
The terms of degree "about", "substantially", and the like used throughout the specification are used to mean at, or close to, a stated numerical value when manufacturing and material tolerances inherent in the stated meaning are presented, and are used to prevent unconscionable infringers from unfairly exploiting a disclosure in which exact or absolute figures are mentioned to aid understanding of the present invention. The term "step of (doing)" or "step of" as used throughout this specification does not mean "step for".
본 명세서에 있어서 '부(部)'란, 하드웨어에 의해 실현되는 유닛(unit), 소프트웨어에 의해 실현되는 유닛, 양방을 이용하여 실현되는 유닛을 포함한다. 또한, 1개의 유닛이 2개 이상의 하드웨어를 이용하여 실현되어도 되고, 2개 이상의 유닛이 1개의 하드웨어에 의해 실현되어도 된다.In this specification, a "part" includes a unit realized by hardware, a unit realized by software, and a unit realized using both. In addition, one unit may be implemented using two or more hardware, and two or more units may be implemented by one hardware.
본 명세서에 있어서 단말, 장치 또는 디바이스가 수행하는 것으로 기술된 동작이나 기능 중 일부는 해당 단말, 장치 또는 디바이스와 연결된 서버에서 대신 수행될 수도 있다. 이와 마찬가지로, 서버가 수행하는 것으로 기술된 동작이나 기능 중 일부도 해당 서버와 연결된 단말, 장치 또는 디바이스에서 수행될 수도 있다.Some of the operations or functions described as being performed by the terminal, apparatus, or device in the present specification may be performed instead of by a server connected to the terminal, apparatus, or device. Similarly, some of the operations or functions described as being performed by the server may also be performed in a terminal, apparatus, or device connected to the server.
본 명세서에서 있어서, 단말과 매핑(Mapping) 또는 매칭(Matching)으로 기술된 동작이나 기능 중 일부는, 단말의 식별 정보(Identifying Data)인 단말기의 고유번호나 개인의 식별정보를 매핑 또는 매칭한다는 의미로 해석될 수 있다.In this specification, some of the operations or functions described as mapping or matching with the terminal means mapping or matching the terminal's unique number or personal identification information, which is the identification data of the terminal. can be interpreted as
Hereinafter, the present invention will be described in detail with reference to the accompanying drawings.
FIG. 1 is a block diagram showing an example of a capsule endoscopy image reading system according to the present invention.
Referring to FIG. 1, the capsule endoscopy image reading system 100 may include a preprocessor 110, a convolutional neural network 120, and a gradient class activation map (grad-CAM) acquisition unit 130.
The preprocessor 110 may preprocess a capsule endoscopy image captured by a capsule endoscope.
The convolutional neural network 120 may receive the capsule endoscopy image preprocessed by the preprocessor 110 as an input and determine whether a lesion is present in the capsule endoscopy image. By processing a large volume of capsule endoscopy images through the convolutional neural network 120, the capsule endoscopy image reading system makes it possible to read for the presence of lesions faster than a doctor could by visually examining the capsule endoscopy images one by one.
Here, the convolutional neural network 120 may include an input layer 121 that receives the preprocessed capsule endoscopy image from the preprocessor 110, one or more convolution layers 122 that extract features from the preprocessed capsule endoscopy image input through the input layer 121, one or more max pooling layers 123 that subsample the features of the capsule endoscopy image, and an output layer 124 that outputs a probability value indicating whether a lesion is present in the capsule endoscopy image.
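The structure just described can be sketched as follows. This is a minimal illustration assuming a Keras-style implementation; the layer names follow FIG. 7, while the filter counts, kernel sizes, and the 512 × 512 × 3 input (matching the preprocessing example below) are assumptions for illustration, not the patented architecture.
```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_lesion_classifier(input_shape=(512, 512, 3)):
    """Input layer 121, convolution layers 122, max pooling layers 123, output layer 124."""
    return models.Sequential([
        layers.Input(shape=input_shape),                                        # input layer 121
        layers.Conv2D(32, 3, activation="relu", padding="same", name="CONV_1"),
        layers.MaxPooling2D(2, name="MAXP_1"),
        layers.Conv2D(64, 3, activation="relu", padding="same", name="CONV_2"),
        layers.MaxPooling2D(2, name="MAXP_2"),
        layers.Conv2D(128, 3, activation="relu", padding="same", name="CONV_3"),
        layers.MaxPooling2D(2, name="MAXP_3"),
        layers.Conv2D(256, 3, activation="relu", padding="same", name="CONV_4"),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),             # fully connected layer
        layers.Dense(2, activation="softmax"),            # output layer 124: [no lesion, lesion]
    ])
```
Note that the sketch deliberately uses no global average pooling layer, which matters for the grad-CAM discussion below.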
The grad-CAM acquisition unit 130 may acquire a grad-CAM for the capsule endoscopy image.
Here, the grad-CAM acquisition unit 130 may acquire the grad-CAM from whichever of the above-described convolution layers 122 and max pooling layers 123 is determined to have the highest lesion location detection capability.
In addition to the preprocessor 110, the convolutional neural network 120, and the grad-CAM acquisition unit 130 described above, the capsule endoscopy image reading system 100 may further include a video clip generator 140. The video clip generator 140 may generate a video clip corresponding to the capsule endoscopy images based on the capsule endoscopy images.
FIG. 2 is a block diagram showing an example of the preprocessor of the capsule endoscopy image reading system according to the present invention.
Referring to FIG. 2, the preprocessor 110 may include a noise remover 111 and an image augmenter 112.
The noise remover 111 may remove noise from the capsule endoscopy image input to the preprocessor 110.
The image augmenter 112 may generate a plurality of augmented images by performing at least one of rotation and vertical flipping on the noise-removed capsule endoscopy image.
Hereinafter, the operations of the noise remover 111 and the image augmenter 112 are described in detail with reference to FIGS. 3 and 4.
FIG. 3 is a diagram showing an example of removing noise from an image through the preprocessor of FIG. 2.
The noise remover 111 may remove, from the capsule endoscopy image, noise including the letters, numbers, and symbols recorded on the image. Letters, numbers, symbols, and the like are irrelevant to determining whether a lesion is present, and they can interfere with the convolutional neural network 120 when it judges whether a lesion exists in the capsule endoscopy image.
Referring to FIG. 3, the noise remover 111 may delete the letters, numbers, and symbols indicating the time, the date, the imaging equipment, and the like from the border of a 576 × 576 × 3 capsule endoscopy image to produce a 512 × 512 × 3 capsule endoscopy image.
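For illustration, this border removal can be as simple as a crop. The following is a minimal sketch assuming the textual overlay occupies a uniform 32-pixel border on each side (576 − 2 × 32 = 512); the uniform margin is an assumption rather than a detail given in the text.
```python
import numpy as np

def remove_border_noise(image: np.ndarray, margin: int = 32) -> np.ndarray:
    """Crop a 576x576x3 frame to 512x512x3, discarding the annotated border."""
    return image[margin:-margin, margin:-margin, :]

frame = np.zeros((576, 576, 3), dtype=np.uint8)  # placeholder frame
cropped = remove_border_noise(frame)
assert cropped.shape == (512, 512, 3)
```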
FIG. 4 is a diagram showing an example of augmenting an image through the preprocessor of FIG. 2.
Referring to FIG. 4, the image augmenter 112 may generate eight augmented images by performing at least one of rotation (90/180/270 degrees) and vertical flipping on the noise-removed capsule endoscopy image. FIG. 4 illustrates the case where both the rotations (90/180/270 degrees) and the vertical flip are performed. In the present invention, however, the image augmenter 112 may perform all of the rotations and the vertical flip, or may perform only some of them and thereby generate fewer than eight augmented images.
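A minimal sketch of this eight-image augmentation, assuming frames are NumPy arrays: the identity and the three rotations, each with and without a vertical flip.
```python
import numpy as np

def augment(image: np.ndarray) -> list[np.ndarray]:
    """Return 8 views: 4 rotations (0/90/180/270) x {original, vertically flipped}."""
    views = []
    for k in range(4):                      # number of 90-degree rotations
        rotated = np.rot90(image, k)
        views.append(rotated)
        views.append(np.flipud(rotated))    # vertical flip
    return views

assert len(augment(np.zeros((512, 512, 3)))) == 8
```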
The reason the image augmenter 112 generates the augmented images by performing at least one of rotation and vertical flipping is to keep the lesion features present in the original capsule endoscopy image from being damaged while the augmented images are processed.
A capsule endoscopy image generally takes the form of a circular view of the small intestine shown against a black background.
If the pixels of the part representing a lesion in the original capsule endoscopy image were distorted or damaged, the distorted pixels could affect the convolutional neural network's judgment of whether a lesion is present. Rotations by multiples of 90 degrees and vertical flips merely rearrange pixels without interpolation, so the lesion pixels are preserved exactly.
FIG. 5 is a diagram showing the convolutional neural network of the capsule endoscopy image reading system according to the present invention.
Referring to FIG. 5, the convolutional neural network 120 may extract features from the preprocessed capsule endoscopy image received through the input layer 121 by means of the convolution layers 122, and may subsample the features extracted by a convolution layer 122 by means of a max pooling layer 123. The convolutional neural network 120 may repeat the process of subsampling the features extracted by one convolution layer through one max pooling layer and then feeding the result into another convolution layer.
Using the results produced by the convolution layers 122 and the max pooling layers 123, the convolutional neural network 120 may then output, through the output layer 124, a probability value indicating whether a lesion is present in the capsule endoscopy image.
The output layer 124 may analyze the result produced by the convolution layers 122 and the max pooling layers 123 using one or more fully connected layers, and may convert it into a probability value indicating the presence or absence of a lesion using any of various transformation functions (e.g. softmax).
Here, the grad-CAM acquisition unit 130 may acquire the grad-CAM from whichever of the convolution layers 122 and max pooling layers 123 is determined to have the highest lesion location detection capability. Through the grad-CAM, a doctor can check with high accuracy which part of a capsule endoscopy image judged to contain a lesion influenced that judgment.
In general, a class activation map (CAM) is used to analyze which part of a capsule endoscopy image most strongly influences the convolutional neural network's lesion judgment. A CAM is a map that visualizes the weighted sum of the feature maps computed with the weights immediately preceding the layer that predicts the probability value; through the CAM one can see how each class was judged and find the approximate location of that class. Overlaying the CAM on the capsule endoscopy image makes it easy to identify the region of the image in which the lesion occurred.
However, obtaining a CAM requires a tuning process that uses a global average pooling (GAP) layer. The structure of the convolutional neural network therefore has to be modified to accommodate the global average pooling layer, which can reduce the network's ability to detect lesions.
Accordingly, in the present invention, the grad-CAM acquisition unit 130 acquires a grad-CAM in order to obtain a CAM result without depending on a global average pooling (GAP) layer and without modifying the structure of the convolutional neural network. Because using grad-CAM places no constraints on the structure of the convolutional neural network, both the ability to detect the presence of a lesion and the ability to track the location of a lesion can be improved.
As an example, grad-CAM may obtain an importance weight from the gradient of the aforementioned convolution layer or max pooling layer and from the output that has passed through that layer, using the following equation:

$$\alpha_k^c = \frac{1}{Z}\sum_{i}\sum_{j}\frac{\partial y^c}{\partial A_{ij}^k} \qquad \text{(Equation 1)}$$

Here, $c$ is the class, $A^k$ is the $k$-th feature map output by the layer, $A_{ij}^k$ is the value of that feature map at the position $(i, j)$ to be observed, $\partial y^c / \partial A_{ij}^k$ is the gradient of $y^c$ with respect to $A_{ij}^k$, and $Z$ is the number of positions $(i, j)$ in the feature map, so the weight is a global average of the gradients.
Using the $\alpha_k^c$ and $A^k$ described above, the grad-CAM may be calculated with the following equation:

$$L_{\text{grad-CAM}}^c = \mathrm{ReLU}\!\left(\sum_{k}\alpha_k^c A^k\right) \qquad \text{(Equation 2)}$$

Here, ReLU is used in Equation 2 in order to reflect only the positive influences on the class.
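A minimal sketch of Equations 1 and 2 follows, assuming the Keras-style model sketched above; the helper name, the layer name argument, and the model itself are illustrative assumptions, not the patented implementation.
```python
import numpy as np
import tensorflow as tf

def grad_cam(model, image, layer_name, class_index):
    """Compute a grad-CAM heat map for one class at one named layer."""
    grad_model = tf.keras.Model(model.inputs,
                                [model.get_layer(layer_name).output, model.output])
    with tf.GradientTape() as tape:
        feature_maps, preds = grad_model(image[np.newaxis, ...])
        class_score = preds[:, class_index]              # y^c
    grads = tape.gradient(class_score, feature_maps)     # dy^c / dA^k_ij
    alpha = tf.reduce_mean(grads, axis=(1, 2))           # Equation 1: average the gradients
    cam = tf.nn.relu(tf.reduce_sum(alpha[:, None, None, :] * feature_maps, axis=-1))  # Equation 2
    return cam.numpy()[0]
```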
FIG. 6 is a diagram showing an example of a grad-CAM generated by the capsule endoscopy image reading system according to the present invention.
Referring to FIG. 6, the grad-CAM for the capsule endoscopy image on the left is displayed on the right. Examining the grad-CAM, it can be seen that the central region, shown in a different color from the outer region, strongly influenced the judgment of lesion presence, and that the probability of a lesion being present in the capsule endoscopy image is about 71.64%. Because the location of the lesion is displayed intuitively on the grad-CAM of the capsule endoscopy image, a viewer (e.g. a doctor) reading the image can identify the location of the lesion without separate labeling.
FIG. 7 is a diagram showing an example in which the capsule endoscopy image reading system according to the present invention determines the layer from which the grad-CAM is acquired.
As described above, the grad-CAM acquisition unit 130 may acquire the grad-CAM from whichever of the one or more convolution layers 122 and one or more max pooling layers 123 is determined to have the highest lesion location detection capability. Here, having the highest lesion location detection capability means that the region that most strongly influenced the lesion presence probability on the grad-CAM coincides most closely with the region in which the lesion actually exists in the capsule endoscopy image.
Referring to FIG. 7, as an example, the grad-CAM acquisition unit 130 may determine the layer with the highest lesion location detection capability based on the grad-CAMs acquired, for a capsule endoscopy image, from the first through fourth convolution layers (CONV_1, CONV_2, CONV_3, CONV_4) included in the convolution layers 122 and from the first through third max pooling layers (MAXP_1, MAXP_2, MAXP_3) included in the max pooling layers 123.
Here, the grad-CAM acquisition unit 130 may acquire grad-CAMs for one or more test capsule endoscopy images in order to determine the layer with the highest lesion location detection capability. If there are multiple test capsule endoscopy images, the grad-CAM acquisition unit 130 may, for example, judge the lesion location detection capability of each layer by computing, per layer, the average of the lesion location detection capability of the grad-CAMs acquired for the individual test images.
In FIG. 7, the grad-CAM acquisition unit 130 may determine that the grad-CAM acquired from the first max pooling layer (MAXP_1) has the highest lesion location detection capability for the test capsule endoscopy images. In that case, the grad-CAM acquisition unit 130 may thereafter acquire the grad-CAM for a capsule endoscopy image from the first max pooling layer (MAXP_1) among the convolution layers 122 and max pooling layers 123.
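One way to realize this per-layer selection is sketched below, reusing the grad_cam sketch above. The IoU overlap score and the availability of ground-truth lesion masks for the test images are assumptions, since the text does not fix a concrete agreement metric.
```python
import numpy as np

def iou(cam: np.ndarray, mask: np.ndarray, thresh: float = 0.5) -> float:
    """Overlap between the thresholded grad-CAM and the lesion mask."""
    pred = cam >= thresh * cam.max()
    inter = np.logical_and(pred, mask).sum()
    union = np.logical_or(pred, mask).sum()
    return inter / union if union else 0.0

def select_best_layer(model, layer_names, test_images, test_masks, class_index=1):
    """Average the per-image overlap for each candidate layer; keep the best."""
    scores = {
        name: np.mean([iou(grad_cam(model, img, name, class_index), mask)
                       for img, mask in zip(test_images, test_masks)])
        for name in layer_names
    }
    return max(scores, key=scores.get)   # e.g. "MAXP_1" in the FIG. 7 example
```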
FIG. 8 is a diagram showing an example of the structure of a video clip generated by the capsule endoscopy image reading system according to the present invention.
Referring to FIG. 8, when generating a video clip corresponding to the capsule endoscopy images, the video clip generator 140 may generate the video clip based on the images in which a lesion is present.
Since in the present invention the presence of a lesion is expressed as a probability value, the video clip generator 140 may compute the probability value indicating lesion presence for the capsule endoscopy images using the convolutional neural network 120 described above, and then generate the video clip corresponding to the capsule endoscopy images based on those images whose lesion presence probability is equal to or greater than a threshold (e.g. 0.8).
FIG. 9 is a diagram showing an example of the frames included in a video clip generated by the capsule endoscopy image reading system according to the present invention.
Referring to FIG. 9, the video clip generator 140 may generate the video clip corresponding to a capsule endoscopy image by adding, before and after the image judged to contain a lesion (i.e., whose probability value is at or above the threshold), up to a reference number of frames (e.g. 5). For example, when the reference value is 5, the video clip generator 140 may add up to five frames before and up to five frames after the lesion image to the video clip; if fewer than five frames exist before or after the image (e.g., the image is the fourth frame after the start), it may add all the frames that are available.
In FIG. 9, take as an example the case where a lesion is judged to exist in the capsule endoscopy image of frame N. Assuming the reference value is A, the video clip generator 140 may generate the video clip by combining not only frame N but also the A frames before frame N and the A frames after frame N.
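A minimal sketch of this windowing rule; the function name and the zero-based frame indexing are illustrative assumptions.
```python
def clip_range(n: int, a: int, num_frames: int) -> range:
    """Frames [n - a, n + a] around lesion frame n, clipped to the valid range."""
    start = max(0, n - a)
    end = min(num_frames - 1, n + a)
    return range(start, end + 1)

# Frame 3 with a = 5 in a 100-frame video: only 3 earlier frames exist,
# so the clip covers frames 0..8.
assert list(clip_range(3, 5, 100)) == list(range(0, 9))
```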
The reason the video clip includes not only the capsule endoscopy image judged to contain a lesion but also the capsule endoscopy images before and after it is so that a viewer of the clip (e.g. a doctor) can observe, through the clip, the continuous change from images without a lesion to images with a lesion, or vice versa. A video clip consisting only of the images predicted to contain a lesion appears to the viewer as if the preceding and following frames were cut off, which makes it unsuitable for reading. The aforementioned reference value may be set arbitrarily at a level that causes the viewer no discomfort.
FIG. 10 is a diagram showing an example of the frames included in two video clips generated by the capsule endoscopy image reading system according to the present invention.
In FIG. 10, assume that lesions have occurred in the capsule endoscopy images of frame M and frame N. Following the method described with reference to FIG. 9, the video clip generator 140 may then generate video clip 1, a video clip based on the capsule endoscopy image of frame M, and video clip 2, a video clip based on the capsule endoscopy image of frame N.
In FIG. 10, frame (N−A), the first frame of video clip 2, comes earlier than frame (M+A), the last frame of video clip 1. That is, video clip 1 and video clip 2 contain frames that overlap each other.
When video clip 1 and video clip 2 contain overlapping frames, the video clip generator 140 may merge video clip 1 and video clip 2 into a single video clip instead of generating them separately.
FIG. 11 is a diagram showing a new video clip into which the two video clips of FIG. 10 have been merged.
Referring to FIG. 11, the video clip generator 140 may merge the aforementioned video clip 1 and video clip 2 to generate video clip 3. Video clip 3 may contain the frames from frame (M−A), the earliest frame included in video clip 1 or video clip 2, through frame (N+A), the latest. In this way, the video clip generator 140 can express the change between frame M and frame N, the capsule endoscopy images in which lesions exist, through a single video clip.
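Taken together, the windowing and merging rules amount to interval merging; a minimal sketch follows, in which the names and the concrete numbers in the example are illustrative assumptions.
```python
def build_clips(lesion_frames: list[int], a: int, num_frames: int) -> list[tuple[int, int]]:
    """Build [n - a, n + a] windows per lesion frame and merge any that overlap."""
    windows = sorted((max(0, n - a), min(num_frames - 1, n + a))
                     for n in lesion_frames)
    merged: list[tuple[int, int]] = []
    for start, end in windows:
        if merged and start <= merged[-1][1]:           # overlaps the previous clip
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# M = 20, N = 26, A = 5: the two windows overlap and merge into one clip,
# as video clips 1 and 2 merge into video clip 3 in FIG. 11.
assert build_clips([20, 26], 5, 1000) == [(15, 31)]
```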
FIG. 12 is a diagram showing a capsule endoscopy image set applied to the capsule endoscopy image reading system according to the present invention.
Referring to FIG. 12, the ratio of the number of images included in the training image set, the set of capsule endoscopy images used to train the convolutional neural network 120 of the capsule endoscopy image reading system, to the number of images included in the test image set, the set of capsule endoscopy images used to test the convolutional neural network 120, may be determined as a preset ratio value. As an example, (number of images in the training image set):(number of images in the test image set) may be 7:3 or 8:2.
The aforementioned ratio value may be chosen as the value that best raises the accuracy of the lesion presence probability, and may be set as a fixed value within the capsule endoscopy image reading system.
Here, as the total number of capsule endoscopy images increases, the ratio of the number of images in the test image set to the number of images in the training image set may decrease. For example, if (number of images in the training image set):(number of images in the test image set) = 7:3 when there are 10,000 capsule endoscopy images in total, then it may become 8:2 when there are 20,000 capsule endoscopy images in total.
This is because, as the total number of capsule endoscopy images grows, the number of test images still satisfies a sufficient level even if its proportion shrinks, whereas the accuracy of the convolutional neural network 120 increases as the number of training images grows, so it is desirable to increase the number of images in the training image set as much as possible.
Meanwhile, the aforementioned ratio value may be applied identically to the capsule endoscopy images in which a lesion exists and to those in which no lesion exists. For example, if (training image set):(test image set) = 8:2 for the capsule endoscopy images with a lesion, then (training image set):(test image set) = 8:2 for the capsule endoscopy images without a lesion as well.
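A minimal sketch of this class-wise split; the placeholder file lists and the fixed seed are illustrative assumptions.
```python
import random

def split(images: list, train_ratio: float = 0.8, seed: int = 0) -> tuple[list, list]:
    """Shuffle and split one class at the preset ratio (8:2 here, per the example)."""
    shuffled = images[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Placeholder file names; the same ratio is applied to each class separately.
lesion_images = [f"lesion_{i:05d}.png" for i in range(1000)]
normal_images = [f"normal_{i:05d}.png" for i in range(1000)]
lesion_train, lesion_test = split(lesion_images)
normal_train, normal_test = split(normal_images)
```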
Meanwhile, when the input layer 121 of the convolutional neural network 120 receives the training image set, it may receive the same number of images with a lesion and images without a lesion. This is because, if images of only one kind, with or without lesions, are input excessively during the training of the convolutional neural network 120, a bias toward the over-represented side is likely to arise.
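A minimal sketch of this balancing step, downsampling the larger class; downsampling itself is an assumption, since the text only requires equal counts at the input layer.
```python
import random

def balance(pos: list, neg: list, seed: int = 0) -> tuple[list, list]:
    """Downsample the larger class so both classes contribute the same count."""
    n = min(len(pos), len(neg))
    rng = random.Random(seed)
    return rng.sample(pos, n), rng.sample(neg, n)

pos, neg = balance(list(range(300)), list(range(1200)))  # toy inputs
assert len(pos) == len(neg) == 300
```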
FIG. 13 is a flowchart of a capsule endoscopy image reading method according to the present invention.
Referring to FIG. 13, the capsule endoscopy image reading method 1300 may include a preprocessing step (S1310) of preprocessing a capsule endoscopy image captured by a capsule endoscope.
The capsule endoscopy image reading method may also include an input step (S1320) of receiving the capsule endoscopy image preprocessed in step S1310.
The capsule endoscopy image reading method may also include a processing step (S1330) of repeatedly executing a processing operation of extracting features from the preprocessed capsule endoscopy image input in step S1320 and subsampling the extracted features.
The capsule endoscopy image reading method may also include a grad-CAM acquisition step (S1340) of acquiring a gradient class activation map (grad-CAM) based on the result of step S1330.
The capsule endoscopy image reading method may also include an output step (S1350) of outputting a probability value indicating whether a lesion is present in the capsule endoscopy image.
Meanwhile, the capsule endoscopy image reading method 1300 may be performed by the capsule endoscopy image reading system 100 described above.
FIG. 14 is a flowchart of the details of the preprocessing step of the capsule endoscopy image reading method according to the present invention.
Referring to FIG. 14, the preprocessing step (S1310) may include a noise removal step (S1410) of removing noise from the capsule endoscopy image.
The preprocessing step (S1310) may also include an image augmentation step (S1420) of generating a plurality of augmented images by performing at least one of rotation and vertical flipping on the capsule endoscopy image from which noise was removed in step S1410.
Meanwhile, the preprocessing step (S1310) may be performed by the preprocessor 110 of the capsule endoscopy image reading system described above.
The capsule endoscopy image reading system and method described in the embodiments of the present invention can reduce a doctor's reading time and increase reading accuracy for the large volumes of endoscopy images captured with a capsule endoscope.
The deep learning model described in the embodiments of the present invention may be a model in which artificial neural networks are stacked in multiple layers. That is, a deep learning model automatically learns the features of its input values by training on a large amount of data in a deep neural network composed of multiple layers, and the network is thereby trained so as to minimize the error of the objective function, that is, of the prediction accuracy.
Although the present invention has been described using the example in which the deep learning model is a convolutional neural network (CNN), the present invention is not limited thereto, and various deep learning models usable now or in the future may be employed.
The deep learning model may be implemented through a deep learning framework. A deep learning framework provides, in the form of a library, the functions commonly used when developing deep learning models, and serves to support proper use of the system software and the hardware platform. In the present embodiment, the deep learning model may be implemented using any deep learning framework that has been or will be made public.
The capsule endoscopy image reading system described above may be implemented by a computing device including at least some of a processor, a memory, a user input device, and a presentation device. The memory is a medium that stores computer-readable software, applications, program modules, routines, instructions, and/or data coded to perform specific tasks when executed by the processor. The processor may read and execute the computer-readable software, applications, program modules, routines, instructions, and/or data stored in the memory. The user input device may be a means for allowing a user to input a command instructing the processor to execute a specific task or to input the data required for executing that task, and may include a physical or virtual keyboard, keypad, key button, mouse, joystick, trackball, touch-sensitive input means, or microphone. The presentation device may include a display, a printer, a speaker, or a vibration device.
The computing device may include various devices such as a smartphone, a tablet, a laptop, a desktop, a server, and a client. The computing device may be a single stand-alone device, or may include multiple computing devices operating in a distributed environment in which they cooperate with one another over a communication network.
The capsule endoscopy image reading method described above may likewise be executed by a computing device that includes a processor and a memory storing computer-readable software, applications, program modules, routines, instructions, and/or data structures coded so that, when executed by the processor, they perform an image diagnosis method utilizing a deep learning model.
The embodiments described above may be implemented through various means, for example by hardware, firmware, software, or a combination thereof.
In the case of implementation by hardware, the image diagnosis method utilizing a deep learning model according to the present embodiments may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, microcontrollers, or microprocessors.
For example, the capsule endoscopy image reading method according to the embodiments may be implemented using an artificial intelligence semiconductor device in which the neurons and synapses of a deep neural network are implemented with semiconductor elements. The semiconductor elements may be semiconductor elements currently in use, such as SRAM, DRAM, or NAND, may be next-generation semiconductor elements such as RRAM, STT-MRAM, or PRAM, or may be a combination thereof.
When the capsule endoscopy image reading method according to the embodiments is implemented using such an artificial intelligence semiconductor device, the result (the weights) of training the deep learning model in software may be transferred to synapse-mimicking elements arranged in an array, or the training may be carried out in the artificial intelligence semiconductor device itself.
In the case of implementation by firmware or software, the capsule endoscopy image reading method according to the present embodiments may be implemented in the form of an apparatus, procedure, or function that performs the functions or operations described above. The software code may be stored in a memory unit and driven by a processor. The memory unit may be located inside or outside the processor and may exchange data with the processor by various known means.
Also, terms such as "system", "processor", "controller", "component", "module", "interface", "model", and "unit" used above may generally mean computer-related entity hardware, a combination of hardware and software, software, or software in execution. For example, such a component may be, but is not limited to, a process running on a processor, a processor, a controller, a control processor, an object, a thread of execution, a program, and/or a computer. For example, both an application running on a controller or processor and the controller or processor itself may be components. One or more components may reside within a process and/or thread of execution, and the components may be located on one device (e.g., a system or computing device) or distributed across two or more devices.
Meanwhile, another embodiment provides a computer program, stored in a computer recording medium, that performs the capsule endoscopy image reading method described above. Yet another embodiment provides a computer-readable recording medium on which a program for realizing the capsule endoscopy image reading method described above is recorded.
The program recorded on the recording medium can execute the steps described above by being read, installed, and executed by a computer.
In order for the computer to read the program recorded on the recording medium and execute the functions implemented by the program, the program may include code written in a computer language, such as C, C++, JAVA, or machine language, that the computer's processor (CPU) can read through the computer's device interface.
Such code may include function code related to the functions that define the functions described above, and may include control code related to the execution procedures needed for the computer's processor to execute those functions in a prescribed sequence.
Such code may further include memory-reference code indicating at which location (address) in the computer's internal or external memory the additional information or media needed for the computer's processor to execute the functions should be referenced.
Also, when the computer's processor needs to communicate with any other remote computer or server in order to execute the functions, the code may further include communication-related code indicating how the computer's processor should communicate with the other remote computer or server using the computer's communication module, and what information or media should be transmitted and received during such communication.
Computer-readable recording media on which the program described above is recorded include, for example, ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical media storage devices, and may also include implementations in the form of a carrier wave (e.g., transmission over the Internet).
In addition, the computer-readable recording medium may be distributed over network-connected computer systems so that computer-readable code is stored and executed in a distributed manner.
Functional programs for implementing the present invention, and the code and code segments related thereto, may easily be inferred or modified by programmers in the technical field to which the present invention belongs, taking into account the system environment of the computer that reads the recording medium and executes the program.
The capsule endoscopy image reading method described with reference to FIG. 10 may also be implemented in the form of a recording medium containing instructions executable by a computer, such as an application or a program module executed by a computer. A computer-readable medium may be any available medium that can be accessed by a computer, and includes volatile and nonvolatile media and removable and non-removable media. A computer-readable medium may also include computer storage media, which include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer-readable instructions, data structures, program modules, or other data.
The capsule endoscopy image reading method described above may be executed by an application installed by default on a terminal (which may include a program included in a platform or operating system loaded on the terminal by default), or may be executed by an application (i.e., a program) that the user installs directly on a master terminal through an application providing server such as an application store server or a web server related to the application or the corresponding service. In this sense, the capsule endoscopy image reading method described above may be implemented as an application (i.e., a program) installed by default on a terminal or installed directly by a user, and may be recorded on a computer-readable recording medium such as that of the terminal.
The foregoing description of the present invention is for illustration, and those of ordinary skill in the art to which the present invention pertains will understand that it can easily be modified into other specific forms without changing the technical idea or essential features of the present invention. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single type may be implemented in a distributed manner, and likewise components described as distributed may be implemented in a combined form.
The scope of the present invention is defined by the claims that follow rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.
CROSS-REFERENCE TO RELATED APPLICATION
This patent application claims priority under 35 U.S.C. § 119(a) to Patent Application No. 10-2020-0077757, filed in Korea on June 25, 2020, the entire contents of which are incorporated into this patent application by reference. In addition, if this patent application claims priority in countries other than the United States for the same reason, all of its contents are likewise incorporated into this patent application by reference.
Claims (16)
- 1. A capsule endoscopy image reading system comprising:
a preprocessor configured to preprocess a capsule endoscopy image captured by a capsule endoscope;
a convolutional neural network (CNN) configured to receive the preprocessed capsule endoscopy image as an input and determine whether a lesion is present in the capsule endoscopy image; and
a grad-CAM acquisition unit configured to acquire a gradient class activation map (grad-CAM) for the capsule endoscopy image,
wherein the convolutional neural network comprises:
an input layer that receives the preprocessed capsule endoscopy image;
one or more convolution layers that extract features from the preprocessed capsule endoscopy image input through the input layer;
one or more max pooling layers that subsample the features of the capsule endoscopy image; and
an output layer that outputs a probability value indicating whether a lesion is present in the capsule endoscopy image,
and wherein the grad-CAM acquisition unit acquires the grad-CAM from whichever of the convolution layers and the max pooling layers is determined to have the highest lesion location detection capability.
- 2. The capsule endoscopy image reading system of claim 1, wherein the preprocessor comprises: a noise remover configured to remove noise from the capsule endoscopy image; and an image augmenter configured to generate a plurality of augmented images by performing at least one of rotation and vertical flipping on the noise-removed capsule endoscopy image.
- 3. The capsule endoscopy image reading system of claim 2, wherein the noise remover removes noise including letters, numbers, and symbols recorded in the capsule endoscopy image.
- 4. The capsule endoscopy image reading system of claim 1, further comprising a video clip generator configured to generate a video clip corresponding to the capsule endoscopy images based on those capsule endoscopy images whose probability value indicating lesion presence is equal to or greater than a threshold.
- 5. The capsule endoscopy image reading system of claim 4, wherein the video clip generator generates the video clip corresponding to the capsule endoscopy image by adding up to a reference number of frames before and after the capsule endoscopy image.
- 6. The capsule endoscopy image reading system of claim 5, wherein, when two different video clips containing overlapping frames exist, the video clip generator merges the two video clips.
- 7. The capsule endoscopy image reading system of claim 1, wherein, for each of the capsule endoscopy images in which a lesion is present and the capsule endoscopy images in which no lesion is present, the ratio of the number of images in the training image set used to train the convolutional neural network to the number of images in the test image set used to test the convolutional neural network is determined as a preset ratio value.
- 8. The capsule endoscopy image reading system of claim 7, wherein the input layer receives, for the training image set, an equal number of images with a lesion and images without a lesion.
- 9. A capsule endoscopy image reading method using a capsule endoscopy image reading system, the method comprising:
a preprocessing step of preprocessing a capsule endoscopy image captured by a capsule endoscope;
an input step of receiving the preprocessed capsule endoscopy image;
a processing step of repeatedly executing a processing operation of extracting features from the preprocessed capsule endoscopy image received in the input step and subsampling the extracted features;
a grad-CAM acquisition step of acquiring a gradient class activation map (grad-CAM) based on the result of the processing step; and
an output step of outputting a probability value indicating whether a lesion is present in the capsule endoscopy image.
- 10. The capsule endoscopy image reading method of claim 9, wherein the preprocessing step comprises: a noise removal step of removing noise from the capsule endoscopy image; and an image augmentation step of generating a plurality of augmented images by performing at least one of rotation and vertical flipping on the noise-removed capsule endoscopy image.
- 11. The capsule endoscopy image reading method of claim 10, wherein the noise removal step removes noise including letters, numbers, and symbols recorded in the capsule endoscopy image.
- 12. The capsule endoscopy image reading method of claim 9, further comprising a video clip generation step of generating a video clip corresponding to the capsule endoscopy images based on those capsule endoscopy images whose probability value indicating lesion presence is equal to or greater than a threshold.
- 13. The capsule endoscopy image reading method of claim 12, wherein the video clip generation step generates the video clip corresponding to the capsule endoscopy image by adding up to a reference number of frames before and after the capsule endoscopy image.
- 14. The capsule endoscopy image reading method of claim 13, wherein, when two different video clips containing overlapping frames exist, the video clip generation step merges the two video clips.
- 15. The capsule endoscopy image reading method of claim 9, wherein, for each of the capsule endoscopy images in which a lesion is present and the capsule endoscopy images in which no lesion is present, the ratio of the number of images in the training image set used to train the convolutional neural network to the number of images in the test image set used to test the convolutional neural network is determined as a preset ratio value.
- 16. The capsule endoscopy image reading method of claim 15, wherein the input step receives, for the training image set, an equal number of images with a lesion and images without a lesion.
Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
KR10-2020-0077757 | 2020-06-25 | |
KR1020200077757A (KR102359984B1) | 2020-06-25 | 2020-06-25 | System and Method for reading capsule endoscopy image
Publications (1)

Publication Number | Publication Date
---|---
WO2021261727A1 (en) | 2021-12-30
Family ID: 79281497
Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
PCT/KR2021/004735 (WO2021261727A1) | Capsule endoscopy image reading system and method | 2020-06-25 | 2021-04-15
Country Status (2)
Country | Link |
---|---|
KR (1) | KR102359984B1 (en) |
WO (1) | WO2021261727A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20240071188A (en) | 2022-11-15 | 2024-05-22 | 동의대학교 산학협력단 | Apparatus and method for supporting capsule endoscopy lesion examination using artificial neural network |
- 2020-06-25: KR application KR1020200077757A, granted as KR102359984B1 (en); status: active (IP right grant)
- 2021-04-15: PCT application PCT/KR2021/004735, published as WO2021261727A1 (en); status: active (application filing)
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20180136857A (en) * | 2017-06-14 | 2018-12-26 | 한국전자통신연구원 | Capsule endoscope to determine lesion area and receiving device |
KR20190090150A (en) * | 2018-01-24 | 2019-08-01 | 주식회사 인트로메딕 | An apparatus for creating description of capsule endoscopy and method thereof, a method for searching capsule endoscopy image based on decsription, an apparatus for monitoring capsule endoscopy |
KR20190128292A (en) * | 2018-05-08 | 2019-11-18 | 서울대학교산학협력단 | Method and System for Early Diagnosis of Glaucoma and Displaying suspicious Area |
KR101889725B1 (en) * | 2018-07-04 | 2018-08-20 | 주식회사 루닛 | Method and Apparatus for Diagnosing Malignant Tumor |
KR20200070062A (en) * | 2018-12-07 | 2020-06-17 | 주식회사 포인바이오닉스 | System and method for detecting lesion in capsule endoscopic image using artificial neural network |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115035985A (en) * | 2022-06-16 | 2022-09-09 | 安翰科技(武汉)股份有限公司 | Data processing method and device of capsule endoscope system |
CN116542883A (en) * | 2023-07-07 | 2023-08-04 | 四川大学华西医院 | Magnetic control capsule gastroscope image focus mucosa enhancement system |
CN116542883B (en) * | 2023-07-07 | 2023-09-05 | 四川大学华西医院 | Magnetic control capsule gastroscope image focus mucosa enhancement system |
Also Published As
Publication number | Publication date |
---|---|
KR20220000437A (en) | 2022-01-04 |
KR102359984B1 (en) | 2022-02-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021261727A1 (en) | Capsule endoscopy image reading system and method | |
WO2017213398A1 (en) | Learning model for salient facial region detection | |
WO2018212494A1 (en) | Method and device for identifying object | |
WO2019132168A1 (en) | System for learning surgical image data | |
WO2020122432A1 (en) | Electronic device, and method for displaying three-dimensional image thereof | |
WO2019235828A1 (en) | Two-face disease diagnosis system and method thereof | |
WO2022131642A1 (en) | Apparatus and method for determining disease severity on basis of medical images | |
WO2021230534A1 (en) | Orbital and periorbital lesion prediction apparatus and prediction method therefor | |
WO2021246770A1 (en) | Method and system for automatically reading x-ray image in real time on basis of artificial intelligence | |
WO2022146050A1 (en) | Federated artificial intelligence training method and system for depression diagnosis | |
EP4004872A1 (en) | Electronic apparatus and method for controlling thereof | |
WO2022265197A1 (en) | Method and device for analyzing endoscopic image on basis of artificial intelligence | |
WO2022231200A1 (en) | Training method for training artificial neural network for determining breast cancer lesion area, and computing system performing same | |
KR20220078495A (en) | Method, apparatus and program to read lesion of small intestine based on capsule endoscopy image | |
WO2023075508A1 (en) | Electronic device and control method therefor | |
WO2024034923A1 (en) | Method and system for object recognition and behavior pattern analysis based on video surveillance using artificial intelligence | |
WO2024019337A1 (en) | Video enhancement method and apparatus | |
WO2022119364A1 (en) | Capsule endoscopy image-based small intestine lesion deciphering method, device, and program | |
WO2022004970A1 (en) | Neural network-based key point training apparatus and method | |
WO2021096136A1 (en) | Electronic device and control method therefor | |
WO2021015490A2 (en) | Method and device for analyzing specific area of image | |
WO2022158694A1 (en) | Method for processing pathological tissue image and apparatus therefor | |
WO2019168280A1 (en) | Method and device for deciphering lesion from capsule endoscope image by using neural network | |
WO2023149649A1 (en) | Electronic device and method for improving image quality | |
WO2023182796A1 (en) | Artificial intelligence device for sensing defective products on basis of product images and method therefor |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21829218; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 21829218; Country of ref document: EP; Kind code of ref document: A1 |